List layout: offset index for parallel elements fetch #8639
Signed-off-by: "Matthew Katz" <katz@spiraldb.com> Signed-off-by: "Matt Katz" <mhkatz97@gmail.com> Signed-off-by: Matt Katz <mhkatz97@gmail.com>
…hange Signed-off-by: "Matt Katz" <mhkatz97@gmail.com> Signed-off-by: Matt Katz <mhkatz97@gmail.com>
Merging this PR will improve performance by 10.32%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | encode_varbin[(1000, 2)] | 153.1 µs | 138.7 µs | +10.32% |
Comparing mk/list-layout-deser-io (65c2d1f) with develop (890704f)
Footnote: 4 benchmarks were skipped, so the baseline results were used instead.
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.110x ❌
datafusion / vortex-file-compressed (1.110x ❌, 0↑ 2↓)
File Size Changes (1 files changed, +0.2% overall, 1↑ 0↓)

Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
datafusion / parquet (0.989x ➖, 0↑ 0↓)
datafusion / arrow (0.993x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.992x ➖, 0↑ 0↓)
duckdb / parquet (0.986x ➖, 1↑ 1↓)
File Size Changes (17 files changed, -44.6% overall, 3↑ 14↓)

Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.158x ❌, 0↑ 6↓)
datafusion / parquet (1.202x ❌, 0↑ 9↓)
duckdb / vortex-file-compressed (1.401x ❌, 0↑ 9↓)
duckdb / parquet (1.187x ❌, 0↑ 9↓)
File Size Changes (3 files changed, -46.3% overall, 1↑ 2↓)

Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.997x ➖, 0↑ 1↓)
datafusion / parquet (1.017x ➖, 2↑ 10↓)
duckdb / vortex-file-compressed (0.994x ➖, 0↑ 0↓)
duckdb / parquet (0.998x ➖, 2↑ 1↓)
File Size Changes (30 files changed, -43.4% overall, 4↑ 26↓)

Benchmarks: Clickbench Sorted on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.973x ➖, 1↑ 1↓)
datafusion / parquet (0.977x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.966x ➖, 2↑ 1↓)
duckdb / parquet (0.992x ➖, 0↑ 0↓)
File Size Changes (201 files changed, -42.6% overall, 53↑ 148↓)

Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.009x ➖, 0↑ 1↓)
datafusion / parquet (0.991x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.913x ➖, 1↑ 0↓)
duckdb / parquet (0.962x ➖, 0↑ 0↓)

Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.935x ➖, 2↑ 0↓)
duckdb / parquet (0.989x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -32.1% overall, 1↑ 2↓)

Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.936x ➖, 5↑ 0↓)
datafusion / parquet (1.006x ➖, 2↑ 6↓)
duckdb / vortex-file-compressed (0.918x ➖, 14↑ 0↓)
duckdb / parquet (0.969x ➖, 0↑ 1↓)
File Size Changes (201 files changed, -39.1% overall, 48↑ 153↓)

Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.113x ❌, 0↑ 15↓)
datafusion / parquet (1.074x ➖, 0↑ 7↓)
datafusion / arrow (0.976x ➖, 2↑ 1↓)
duckdb / vortex-file-compressed (1.086x ➖, 0↑ 7↓)
duckdb / parquet (1.050x ➖, 0↑ 3↓)
File Size Changes (47 files changed, -44.5% overall, 14↑ 33↓)

Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.933x ➖, 3↑ 3↓)
datafusion / parquet (0.943x ➖, 4↑ 3↓)
duckdb / vortex-file-compressed (0.936x ➖, 0↑ 0↓)
duckdb / parquet (0.983x ➖, 0↑ 0↓)

First pass at removing the offsets→elements serial round-trip on list random access. Adds a resident, coarse offset-sample index to `ListLayout` so a partial-range read can start the elements fetch in parallel with the offsets fetch instead of waiting on it. `ListLayoutStrategy` samples offsets at write time.
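
To make the idea concrete, here is a minimal sketch of how a coarse offset-sample index could bound the element range for a row range without reading the full offsets array. It is not the code in this PR: `OffsetSamples`, `build`, `element_bounds`, and the fixed `stride` are hypothetical names chosen for illustration, and the sketch assumes offsets are monotonically non-decreasing with `n_rows + 1` entries.

```rust
use std::ops::Range;

/// Hypothetical coarse index (illustration only, not the PR's type).
/// samples[i] == offsets[min(i * stride, n_rows)].
struct OffsetSamples {
    stride: usize,
    n_rows: usize,
    samples: Vec<u64>,
}

impl OffsetSamples {
    /// Sample every `stride`-th offset at write time, always keeping the last one.
    fn build(offsets: &[u64], stride: usize) -> Self {
        assert!(stride > 0 && !offsets.is_empty());
        let n_rows = offsets.len() - 1;
        let mut samples: Vec<u64> = offsets.iter().step_by(stride).copied().collect();
        if n_rows % stride != 0 {
            // The final offset was not hit by the stride; add it so every
            // row range has an upper bound.
            samples.push(offsets[n_rows]);
        }
        Self { stride, n_rows, samples }
    }

    /// Conservative element range covering `rows`: it may be wider than the
    /// exact range, but never narrower, so an elements fetch for it can be
    /// issued before the precise offsets arrive.
    fn element_bounds(&self, rows: Range<usize>) -> Range<u64> {
        assert!(rows.start <= rows.end && rows.end <= self.n_rows);
        let lo = rows.start / self.stride; // round down to a sample
        let hi = (rows.end + self.stride - 1) / self.stride; // round up to a sample
        self.samples[lo]..self.samples[hi]
    }
}
```

With offsets `[0, 3, 3, 7, 12, 20, 21, 30]` and a stride of 3, the samples are `[0, 7, 21, 30]`, and rows `4..5` map to elements `7..21`, a superset of the exact range `12..20`. The over-read is the price of skipping the round-trip; a smaller stride tightens the bound at the cost of a larger resident index.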