shared array in dict layout #6305
base: develop
Signed-off-by: Onur Satici <onur@spiraldb.com>
🚨🚨🚨❌❌❌ SQL BENCHMARK FAILED ❌❌❌🚨🚨🚨
    pub(super) source: ArrayRef,
    pub(super) cache: Arc<OnceLock<Canonical>>,
Have this be an Arc<RwLock<Shared>>, where enum Shared = Array(ArrayRef) | Cached(Canonical)
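A minimal sketch of what this suggestion could look like, assuming the enum is wrapped in the lock. `ArrayRef` and `Canonical` here are hypothetical stand-ins built on `Vec<i32>`, and the `canonical` method with its `execute` callback is invented for illustration; none of this is the PR's actual code.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins for the real `ArrayRef` and `Canonical` types.
#[derive(Clone, Debug, PartialEq)]
struct ArrayRef(Vec<i32>);

#[derive(Clone, Debug, PartialEq)]
struct Canonical(Vec<i32>);

/// Either the original (not yet executed) array or its cached canonical form.
enum Shared {
    Array(ArrayRef),
    Cached(Canonical),
}

#[derive(Clone)]
struct SharedArray {
    state: Arc<RwLock<Shared>>,
}

impl SharedArray {
    fn new(source: ArrayRef) -> Self {
        Self {
            state: Arc::new(RwLock::new(Shared::Array(source))),
        }
    }

    /// Returns the canonical form, running `execute` at most once across all clones.
    fn canonical(&self, execute: impl FnOnce(&ArrayRef) -> Canonical) -> Canonical {
        // Fast path: a shared read lock is enough once the value is cached.
        {
            let guard = self.state.read().unwrap();
            if let Shared::Cached(c) = &*guard {
                return c.clone();
            }
        }
        // Slow path: take the write lock and re-check, since another clone
        // may have filled the cache between the two lock acquisitions.
        let mut guard = self.state.write().unwrap();
        let canonical = match &*guard {
            Shared::Cached(c) => return c.clone(),
            Shared::Array(a) => execute(a),
        };
        // Replacing the state drops the source array.
        *guard = Shared::Cached(canonical.clone());
        canonical
    }
}

fn main() {
    let shared = SharedArray::new(ArrayRef(vec![1, 2, 3]));
    let canonical = shared.canonical(|a| Canonical(a.0.clone()));
    println!("{:?}", canonical);
}
```

One difference from the `source` + `OnceLock` layout: replacing `Array` with `Cached` releases the source once it has been executed, at the cost of taking a lock on every read.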
impl Clone for SharedArray {
    fn clone(&self) -> Self {
        Self {
            source: self.source.clone(),
            cache: Arc::clone(&self.cache),
            stats: self.stats.clone(),
        }
    }
}
Can this use the default (derived) Clone impl?
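A small sketch of why a derived impl should behave the same as the manual one, assuming every field is itself `Clone`. The struct below is a simplified stand-in with hypothetical field types, and the `stats` field is left out because its type is not shown in the diff.

```rust
use std::sync::{Arc, OnceLock};

// Simplified stand-in for the PR's struct; the field types are assumptions.
#[derive(Clone)]
struct SharedArray {
    source: Arc<Vec<i32>>,          // stands in for `ArrayRef`
    cache: Arc<OnceLock<Vec<i32>>>, // stands in for `Arc<OnceLock<Canonical>>`
}

// Hypothetical constructor, for illustration only.
fn new_shared(values: Vec<i32>) -> SharedArray {
    SharedArray {
        source: Arc::new(values),
        cache: Arc::new(OnceLock::new()),
    }
}

fn main() {
    let a = new_shared(vec![1, 2]);
    let b = a.clone();
    // The derived `clone` calls `Clone::clone` on each field, so both `Arc`s
    // only bump their reference counts and the clones share one cache cell.
    println!("source shared: {}", Arc::ptr_eq(&a.source, &b.source));
    println!("cache shared: {}", Arc::ptr_eq(&a.cache, &b.cache));
}
```

Because `Arc::clone(&self.cache)` and `self.cache.clone()` are the same operation, the derive only changes how the impl is written.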
Merging this PR will not alter performance
    pub fn source(&self) -> &ArrayRef {
        &self.source
    }
When do you think you would ever want this?
Yeah, not much; maybe in tests.
    }

    fn deserialize(_bytes: &[u8]) -> VortexResult<Self::Metadata> {
        vortex_error::vortex_bail!("Shared array is not serializable")
nice
self.values_evals
    .entry(expr.clone())
    .or_insert_with(|| {
        self.values_array()
            .map(move |array| {
                let array = array?.apply(&expr)?;
                // We execute the array to avoid re-evaluating for every split.
                let mut ctx = ExecutionCtx::new(session);
                Ok(array.execute::<Canonical>(&mut ctx)?.into_array())
                Ok(SharedArray::new(array).into_array())
this is for a filter and it kind of breaks things?
Benchmarks: TPC-H SF=1 on NVME
Benchmarks: TPC-H SF=1 on S3
Benchmarks: TPC-DS SF=1 on NVME
Benchmarks: TPC-H SF=10 on NVME
Benchmarks: FineWeb S3
Benchmarks: Statistical and Population Genetics
Benchmarks: TPC-H SF=10 on S3
Benchmarks: Clickbench on NVME
With this we can read from a DictLayout without executing anything.
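A rough sketch of the idea, using only the standard library: nothing is executed when the shared wrapper is created, and the first reader that needs the canonical form pays for it once on behalf of every clone. The types, the `executions` counter, and `sum_in_splits` are hypothetical and exist only for illustration. The real execution is fallible, which `OnceLock::get_or_init` does not model.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};
use std::thread;

// Stand-in types: `Vec<u8>` plays the encoded source, `Vec<u32>` the canonical form.
#[derive(Clone)]
struct SharedArray {
    source: Arc<Vec<u8>>,
    cache: Arc<OnceLock<Vec<u32>>>,
    executions: Arc<AtomicUsize>, // counts how often the source was executed
}

impl SharedArray {
    /// Wraps the source without executing it.
    fn new(source: Vec<u8>) -> Self {
        Self {
            source: Arc::new(source),
            cache: Arc::new(OnceLock::new()),
            executions: Arc::new(AtomicUsize::new(0)),
        }
    }

    /// Executes the source on first use; later callers reuse the cached result.
    fn canonical(&self) -> &Vec<u32> {
        self.cache.get_or_init(|| {
            self.executions.fetch_add(1, Ordering::SeqCst);
            self.source.iter().map(|&b| u32::from(b)).collect()
        })
    }

    fn executions(&self) -> usize {
        self.executions.load(Ordering::SeqCst)
    }
}

/// Simulates several splits reading the same shared values concurrently.
fn sum_in_splits(shared: &SharedArray, splits: usize) -> Vec<u32> {
    let handles: Vec<_> = (0..splits)
        .map(|_| {
            let split = shared.clone();
            thread::spawn(move || split.canonical().iter().sum::<u32>())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let shared = SharedArray::new(vec![1, 2, 3]);
    let sums = sum_in_splits(&shared, 4);
    println!("sums = {:?}, executions = {}", sums, shared.executions());
}
```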