implement FilterFn for ChunkedArray #794

a10y · 2024-09-11T21:28:29Z

No description provided.

a10y · 2024-09-11T21:35:37Z

Heh, turns out the naive implementation is 3x slower than develop, pretty much entirely due to find_chunk_idx slowness

a10y · 2024-09-11T22:00:04Z

Alright so currently what I'm seeing is for Q15 vortex-compressed-file:

develop: ~112ms
This PR, using ChunkedArray::find_chunk_idx: ~700ms, most of that in find_chunk_idx
The latest commit that uses a binary search on pre-materialized chunk indices: ~106ms

The new flame graph is showing ~6% less time spent in filter, which is what we'd expect roughly just going off the numbers

robert3005 · 2024-09-11T22:35:25Z

Right, this can go lower if all the children also implement filter. Good to know that we need to unpack values

a10y · 2024-09-12T11:37:08Z

~~Most of the children do implement FilterFn, the exception is extension array~~

EDIT: about half do. The missing ones here are FoR and Extension

robert3005 · 2024-09-12T11:55:58Z

FoR has it. Notable omission is Bitpacked which could repack itself and it likely would be faster. Also sparse is missing which likely leads to some extra allocation

robert3005 · 2024-09-12T11:56:11Z

fwiw we should make them in separate prs

github-actions

Vortex benchmarks

Benchmark suite	Current: `d5ecb80`	Previous: `750b9c8`	Ratio
`tpch_q1/vortex-in-memory-no-pushdown`	`467629937` ns/iter (`± 3750524`)	`482302101` ns/iter (`± 5129904`)	`0.97`
`tpch_q1/vortex-in-memory-pushdown`	`515046187` ns/iter (`± 1745901`)	`527534072` ns/iter (`± 3463267`)	`0.98`
`tpch_q1/arrow`	`454002116` ns/iter (`± 2291129`)	`451801735` ns/iter (`± 3758415`)	`1.00`
`tpch_q1/parquet`	`665259754` ns/iter (`± 3313569`)	`661718171` ns/iter (`± 2674937`)	`1.01`
`tpch_q1/vortex-file-compressed`	`610874701` ns/iter (`± 2392209`)	`621200118` ns/iter (`± 6749288`)	`0.98`
`tpch_q1/vortex-file-uncompressed`	`633865011` ns/iter (`± 2741863`)	`592370294` ns/iter (`± 8018622`)	`1.07`
`tpch_q2/vortex-in-memory-no-pushdown`	`144574917` ns/iter (`± 990792`)	`160802521` ns/iter (`± 4678476`)	`0.90`
`tpch_q2/vortex-in-memory-pushdown`	`145267883` ns/iter (`± 876640`)	`159124621` ns/iter (`± 3511202`)	`0.91`
`tpch_q2/arrow`	`122184510` ns/iter (`± 572539`)	`127992668` ns/iter (`± 614847`)	`0.95`
`tpch_q2/parquet`	`158407129` ns/iter (`± 845449`)	`167523912` ns/iter (`± 976972`)	`0.95`
`tpch_q2/vortex-file-compressed`	`154158990` ns/iter (`± 731391`)	`170749370` ns/iter (`± 1095064`)	`0.90`
`tpch_q2/vortex-file-uncompressed`	`164798709` ns/iter (`± 1537024`)	`176352506` ns/iter (`± 1861263`)	`0.93`
`tpch_q3/vortex-in-memory-no-pushdown`	`154082645` ns/iter (`± 945209`)	`164894661` ns/iter (`± 1688574`)	`0.93`
`tpch_q3/vortex-in-memory-pushdown`	`182599172` ns/iter (`± 2140747`)	`204285819` ns/iter (`± 1733504`)	`0.89`
`tpch_q3/arrow`	`148805278` ns/iter (`± 1268268`)	`163147591` ns/iter (`± 2200349`)	`0.91`
`tpch_q3/parquet`	`339122776` ns/iter (`± 2983826`)	`364056632` ns/iter (`± 3161647`)	`0.93`
`tpch_q3/vortex-file-compressed`	`600109967` ns/iter (`± 2475613`)	`238624885` ns/iter (`± 3477685`)	`2.51`
`tpch_q3/vortex-file-uncompressed`	`346678665` ns/iter (`± 4154641`)	`264574306` ns/iter (`± 4886537`)	`1.31`
`tpch_q4/vortex-in-memory-no-pushdown`	`116713076` ns/iter (`± 680431`)	`119477803` ns/iter (`± 1823670`)	`0.98`
`tpch_q4/vortex-in-memory-pushdown`	`140401608` ns/iter (`± 1067465`)	`148794718` ns/iter (`± 4839612`)	`0.94`
`tpch_q4/arrow`	`112121350` ns/iter (`± 969314`)	`126580600` ns/iter (`± 3893440`)	`0.89`
`tpch_q4/parquet`	`223556527` ns/iter (`± 2134392`)	`233726536` ns/iter (`± 2398398`)	`0.96`
`tpch_q4/vortex-file-compressed`	`608122542` ns/iter (`± 1411424`)	`214291423` ns/iter (`± 4481945`)	`2.84`
`tpch_q4/vortex-file-uncompressed`	`301939005` ns/iter (`± 3284469`)	`223841277` ns/iter (`± 4788043`)	`1.35`
`tpch_q5/vortex-in-memory-no-pushdown`	`300395241` ns/iter (`± 2339398`)	`331680262` ns/iter (`± 3094149`)	`0.91`
`tpch_q5/vortex-in-memory-pushdown`	`314484406` ns/iter (`± 2458080`)	`352998734` ns/iter (`± 3928054`)	`0.89`
`tpch_q5/arrow`	`296284099` ns/iter (`± 4279535`)	`322701287` ns/iter (`± 4014088`)	`0.92`
`tpch_q5/parquet`	`456248488` ns/iter (`± 3726847`)	`491143044` ns/iter (`± 5029280`)	`0.93`
`tpch_q5/vortex-file-compressed`	`346751667` ns/iter (`± 4316017`)	`373255236` ns/iter (`± 8149263`)	`0.93`
`tpch_q5/vortex-file-uncompressed`	`354653588` ns/iter (`± 6344729`)	`370011144` ns/iter (`± 7597998`)	`0.96`
`tpch_q6/vortex-in-memory-no-pushdown`	`41564617` ns/iter (`± 702737`)	`45597608` ns/iter (`± 395126`)	`0.91`
`tpch_q6/vortex-in-memory-pushdown`	`91211372` ns/iter (`± 453285`)	`94817690` ns/iter (`± 1631882`)	`0.96`
`tpch_q6/arrow`	`37046230` ns/iter (`± 687907`)	`39935601` ns/iter (`± 153591`)	`0.93`
`tpch_q6/parquet`	`154464040` ns/iter (`± 1078498`)	`158515552` ns/iter (`± 1076605`)	`0.97`
`tpch_q6/vortex-file-compressed`	`77106628` ns/iter (`± 349019`)	`82131752` ns/iter (`± 447633`)	`0.94`
`tpch_q6/vortex-file-uncompressed`	`166363477` ns/iter (`± 1183907`)	`164012224` ns/iter (`± 1392620`)	`1.01`
`tpch_q7/vortex-in-memory-no-pushdown`	`570770557` ns/iter (`± 5628418`)	`628570919` ns/iter (`± 13993824`)	`0.91`
`tpch_q7/vortex-in-memory-pushdown`	`618735527` ns/iter (`± 6725917`)	`685014994` ns/iter (`± 4076828`)	`0.90`
`tpch_q7/arrow`	`555973167` ns/iter (`± 6869244`)	`591575674` ns/iter (`± 9570461`)	`0.94`
`tpch_q7/parquet`	`733884143` ns/iter (`± 7867692`)	`766331481` ns/iter (`± 7275189`)	`0.96`
`tpch_q7/vortex-file-compressed`	`856993945` ns/iter (`± 4929339`)	`664591921` ns/iter (`± 7273788`)	`1.29`
`tpch_q7/vortex-file-uncompressed`	`740677757` ns/iter (`± 5450486`)	`736981022` ns/iter (`± 11454205`)	`1.01`
`tpch_q8/vortex-in-memory-no-pushdown`	`210734148` ns/iter (`± 1364876`)	`216322787` ns/iter (`± 1450325`)	`0.97`
`tpch_q8/vortex-in-memory-pushdown`	`222583044` ns/iter (`± 896335`)	`225993479` ns/iter (`± 1636984`)	`0.98`
`tpch_q8/arrow`	`210921272` ns/iter (`± 691187`)	`213002888` ns/iter (`± 798766`)	`0.99`
`tpch_q8/parquet`	`481543044` ns/iter (`± 2300449`)	`480810962` ns/iter (`± 5440907`)	`1.00`
`tpch_q8/vortex-file-compressed`	`269483461` ns/iter (`± 4783378`)	`250163486` ns/iter (`± 533009`)	`1.08`
`tpch_q8/vortex-file-uncompressed`	`292301158` ns/iter (`± 9043288`)	`289075003` ns/iter (`± 2810625`)	`1.01`
`tpch_q9/vortex-in-memory-no-pushdown`	`405813821` ns/iter (`± 2341096`)	`411252124` ns/iter (`± 5191902`)	`0.99`
`tpch_q9/vortex-in-memory-pushdown`	`406254784` ns/iter (`± 2090194`)	`402404864` ns/iter (`± 10300688`)	`1.01`
`tpch_q9/arrow`	`401193441` ns/iter (`± 3486194`)	`407465021` ns/iter (`± 4936156`)	`0.98`
`tpch_q9/parquet`	`702975198` ns/iter (`± 3508374`)	`690287931` ns/iter (`± 2228853`)	`1.02`
`tpch_q9/vortex-file-compressed`	`454991563` ns/iter (`± 4574292`)	`448813755` ns/iter (`± 3147241`)	`1.01`
`tpch_q9/vortex-file-uncompressed`	`487958941` ns/iter (`± 3139067`)	`493338488` ns/iter (`± 3770312`)	`0.99`
`tpch_q10/vortex-in-memory-no-pushdown`	`240534797` ns/iter (`± 1106459`)	`237008689` ns/iter (`± 999451`)	`1.01`
`tpch_q10/vortex-in-memory-pushdown`	`263164189` ns/iter (`± 1948364`)	`269284156` ns/iter (`± 2766466`)	`0.98`
`tpch_q10/arrow`	`232776848` ns/iter (`± 1296696`)	`232630226` ns/iter (`± 1125035`)	`1.00`
`tpch_q10/parquet`	`491261591` ns/iter (`± 3634032`)	`491838979` ns/iter (`± 2172643`)	`1.00`
`tpch_q10/vortex-file-compressed`	`599147762` ns/iter (`± 2913480`)	`438008551` ns/iter (`± 1560277`)	`1.37`
`tpch_q10/vortex-file-uncompressed`	`397099274` ns/iter (`± 3897509`)	`342085663` ns/iter (`± 1049070`)	`1.16`
`tpch_q11/vortex-in-memory-no-pushdown`	`220454180` ns/iter (`± 2158905`)	`228213202` ns/iter (`± 1929879`)	`0.97`
`tpch_q11/vortex-in-memory-pushdown`	`218691415` ns/iter (`± 2299972`)	`227285629` ns/iter (`± 1260022`)	`0.96`
`tpch_q11/arrow`	`176616687` ns/iter (`± 1047656`)	`181955973` ns/iter (`± 481860`)	`0.97`
`tpch_q11/parquet`	`190341853` ns/iter (`± 1991299`)	`190665473` ns/iter (`± 1014961`)	`1.00`
`tpch_q11/vortex-file-compressed`	`232418289` ns/iter (`± 1398293`)	`234354109` ns/iter (`± 2669234`)	`0.99`
`tpch_q11/vortex-file-uncompressed`	`238199675` ns/iter (`± 2755331`)	`236021473` ns/iter (`± 2076840`)	`1.01`
`tpch_q12/vortex-in-memory-no-pushdown`	`178888983` ns/iter (`± 564729`)	`181725183` ns/iter (`± 665937`)	`0.98`
`tpch_q12/vortex-in-memory-pushdown`	`257220480` ns/iter (`± 1246588`)	`245639824` ns/iter (`± 1117857`)	`1.05`
`tpch_q12/arrow`	`168685105` ns/iter (`± 435972`)	`167848416` ns/iter (`± 951860`)	`1.00`
`tpch_q12/parquet`	`358077917` ns/iter (`± 2317907`)	`355743248` ns/iter (`± 652283`)	`1.01`
`tpch_q12/vortex-file-compressed`	`577060066` ns/iter (`± 2253366`)	`583693695` ns/iter (`± 3524490`)	`0.99`
`tpch_q12/vortex-file-uncompressed`	`353753839` ns/iter (`± 1158424`)	`345717904` ns/iter (`± 982840`)	`1.02`
`tpch_q13/vortex-in-memory-no-pushdown`	`233378060` ns/iter (`± 6792292`)	`212026022` ns/iter (`± 2139385`)	`1.10`
`tpch_q13/vortex-in-memory-pushdown`	`225897062` ns/iter (`± 2687703`)	`215199549` ns/iter (`± 2801269`)	`1.05`
`tpch_q13/arrow`	`220008054` ns/iter (`± 3476872`)	`216288457` ns/iter (`± 7582246`)	`1.02`
`tpch_q13/parquet`	`365108165` ns/iter (`± 8761788`)	`339023424` ns/iter (`± 4982544`)	`1.08`
`tpch_q13/vortex-file-compressed`	`262370823` ns/iter (`± 5771851`)	`242532188` ns/iter (`± 3105220`)	`1.08`
`tpch_q13/vortex-file-uncompressed`	`253289040` ns/iter (`± 5878713`)	`230206578` ns/iter (`± 4731127`)	`1.10`
`tpch_q14/vortex-in-memory-no-pushdown`	`39976992` ns/iter (`± 402331`)	`40066809` ns/iter (`± 123691`)	`1.00`
`tpch_q14/vortex-in-memory-pushdown`	`87308864` ns/iter (`± 536472`)	`88371331` ns/iter (`± 993349`)	`0.99`
`tpch_q14/arrow`	`41047857` ns/iter (`± 171059`)	`39916154` ns/iter (`± 413480`)	`1.03`
`tpch_q14/parquet`	`222397328` ns/iter (`± 984220`)	`219644722` ns/iter (`± 1478739`)	`1.01`
`tpch_q14/vortex-file-compressed`	`90954221` ns/iter (`± 1205267`)	`84406219` ns/iter (`± 549154`)	`1.08`
`tpch_q14/vortex-file-uncompressed`	`145316171` ns/iter (`± 3272870`)	`132056672` ns/iter (`± 2526723`)	`1.10`
`tpch_q15/vortex-in-memory-no-pushdown`	`71163237` ns/iter (`± 568725`)	`70240367` ns/iter (`± 1211358`)	`1.01`
`tpch_q15/vortex-in-memory-pushdown`	`125755999` ns/iter (`± 1421365`)	`120877627` ns/iter (`± 662063`)	`1.04`
`tpch_q15/arrow`	`67457581` ns/iter (`± 563588`)	`65814738` ns/iter (`± 745743`)	`1.02`
`tpch_q15/parquet`	`295049250` ns/iter (`± 2104210`)	`292964357` ns/iter (`± 4176947`)	`1.01`
`tpch_q15/vortex-file-compressed`	`160178907` ns/iter (`± 1035165`)	`149166262` ns/iter (`± 3739034`)	`1.07`
`tpch_q15/vortex-file-uncompressed`	`280301533` ns/iter (`± 3911470`)	`245576987` ns/iter (`± 4228385`)	`1.14`
`tpch_q16/vortex-in-memory-no-pushdown`	`120676960` ns/iter (`± 607001`)	`121362328` ns/iter (`± 501981`)	`0.99`
`tpch_q16/vortex-in-memory-pushdown`	`125649150` ns/iter (`± 652981`)	`125642905` ns/iter (`± 739682`)	`1.00`
`tpch_q16/arrow`	`105605122` ns/iter (`± 1175945`)	`106264398` ns/iter (`± 1435624`)	`0.99`
`tpch_q16/parquet`	`122493799` ns/iter (`± 590708`)	`120515396` ns/iter (`± 330045`)	`1.02`
`tpch_q16/vortex-file-compressed`	`137353235` ns/iter (`± 669637`)	`135440640` ns/iter (`± 2489773`)	`1.01`
`tpch_q16/vortex-file-uncompressed`	`136512392` ns/iter (`± 797010`)	`133236531` ns/iter (`± 804638`)	`1.02`
`tpch_q17/vortex-in-memory-no-pushdown`	`680071206` ns/iter (`± 5626853`)	`633750402` ns/iter (`± 21999972`)	`1.07`
`tpch_q17/vortex-in-memory-pushdown`	`699765651` ns/iter (`± 16552381`)	`644886261` ns/iter (`± 28816494`)	`1.09`
`tpch_q17/arrow`	`614163957` ns/iter (`± 15290848`)	`604794412` ns/iter (`± 19325686`)	`1.02`
`tpch_q17/parquet`	`600504141` ns/iter (`± 3375464`)	`591494702` ns/iter (`± 5886351`)	`1.02`
`tpch_q17/vortex-file-compressed`	`616490641` ns/iter (`± 6109550`)	`618751453` ns/iter (`± 5094786`)	`1.00`
`tpch_q17/vortex-file-uncompressed`	`678439856` ns/iter (`± 5153733`)	`662045625` ns/iter (`± 5462241`)	`1.02`
`tpch_q18/vortex-in-memory-no-pushdown`	`1102704467` ns/iter (`± 10488831`)	`1107670912` ns/iter (`± 28124407`)	`1.00`
`tpch_q18/vortex-in-memory-pushdown`	`1102693920` ns/iter (`± 18019955`)	`1146350113` ns/iter (`± 25935429`)	`0.96`
`tpch_q18/arrow`	`1091548880` ns/iter (`± 11287668`)	`1088592199` ns/iter (`± 13463439`)	`1.00`
`tpch_q18/parquet`	`1269701350` ns/iter (`± 9244969`)	`1295921800` ns/iter (`± 29563581`)	`0.98`
`tpch_q18/vortex-file-compressed`	`1137231576` ns/iter (`± 14509774`)	`1104461804` ns/iter (`± 12804403`)	`1.03`
`tpch_q18/vortex-file-uncompressed`	`1199433179` ns/iter (`± 13902541`)	`1142531705` ns/iter (`± 9323161`)	`1.05`
`tpch_q19/vortex-in-memory-no-pushdown`	`172419237` ns/iter (`± 558277`)	`173585771` ns/iter (`± 474023`)	`0.99`
`tpch_q19/vortex-in-memory-pushdown`	`255865266` ns/iter (`± 777800`)	`263934184` ns/iter (`± 1443569`)	`0.97`
`tpch_q19/arrow`	`158947649` ns/iter (`± 563373`)	`158169603` ns/iter (`± 605941`)	`1.00`
`tpch_q19/parquet`	`482869625` ns/iter (`± 2182275`)	`485808293` ns/iter (`± 3098680`)	`0.99`
`tpch_q19/vortex-file-compressed`	`739251560` ns/iter (`± 3891957`)	`748242129` ns/iter (`± 1861891`)	`0.99`
`tpch_q19/vortex-file-uncompressed`	`361335402` ns/iter (`± 1566759`)	`370691529` ns/iter (`± 2848285`)	`0.97`
`tpch_q20/vortex-in-memory-no-pushdown`	`276953400` ns/iter (`± 4328394`)	`271361202` ns/iter (`± 2598718`)	`1.02`
`tpch_q20/vortex-in-memory-pushdown`	`296899338` ns/iter (`± 2968668`)	`298142045` ns/iter (`± 3138390`)	`1.00`
`tpch_q20/arrow`	`253871397` ns/iter (`± 4922124`)	`268988087` ns/iter (`± 5230411`)	`0.94`
`tpch_q20/parquet`	`368063779` ns/iter (`± 5920070`)	`376959670` ns/iter (`± 2535342`)	`0.98`
`tpch_q20/vortex-file-compressed`	`337625170` ns/iter (`± 3942064`)	`331887989` ns/iter (`± 5057692`)	`1.02`
`tpch_q20/vortex-file-uncompressed`	`419819929` ns/iter (`± 2448535`)	`387430930` ns/iter (`± 3744062`)	`1.08`
`tpch_q21/vortex-in-memory-no-pushdown`	`881470571` ns/iter (`± 5570260`)	`860034433` ns/iter (`± 6315368`)	`1.02`
`tpch_q21/vortex-in-memory-pushdown`	`912642083` ns/iter (`± 12158266`)	`894832538` ns/iter (`± 2774021`)	`1.02`
`tpch_q21/arrow`	`873175820` ns/iter (`± 4749117`)	`860412313` ns/iter (`± 4835728`)	`1.01`
`tpch_q21/parquet`	`1033653879` ns/iter (`± 10326375`)	`1031986215` ns/iter (`± 15406311`)	`1.00`
`tpch_q21/vortex-file-compressed`	`1930001242` ns/iter (`± 9281258`)	`1055005493` ns/iter (`± 8739734`)	`1.83`
`tpch_q21/vortex-file-uncompressed`	`1316949047` ns/iter (`± 3918408`)	`1091486875` ns/iter (`± 12444901`)	`1.21`
`tpch_q22/vortex-in-memory-no-pushdown`	`96238393` ns/iter (`± 774157`)	`96963365` ns/iter (`± 1134408`)	`0.99`
`tpch_q22/vortex-in-memory-pushdown`	`96618668` ns/iter (`± 1183404`)	`97059184` ns/iter (`± 835576`)	`1.00`
`tpch_q22/arrow`	`66826555` ns/iter (`± 2434487`)	`66658773` ns/iter (`± 251660`)	`1.00`
`tpch_q22/parquet`	`96074157` ns/iter (`± 5283274`)	`95223939` ns/iter (`± 748371`)	`1.01`
`tpch_q22/vortex-file-compressed`	`103282488` ns/iter (`± 460786`)	`101715042` ns/iter (`± 409685`)	`1.02`
`tpch_q22/vortex-file-uncompressed`	`110932476` ns/iter (`± 603983`)	`109602568` ns/iter (`± 584164`)	`1.01`

This comment was automatically generated by workflow using github-action-benchmark.

robert3005 · 2024-09-12T13:02:02Z

@a10y you have a bug somewhere

Failed to get min_index from SparseArray: index 0 out of bounds from 0 to 0
Backtrace:
disabled backtrace

a10y · 2024-09-12T14:36:49Z

encodings/fastlanes/src/bitpacking/compute/slice.rs

+
+        assert!(array.patches().is_some());
+
+        let patch_indices = SparseArray::try_from(array.patches().unwrap())


perhaps we should just return SparseArray directly from BPA::patches. I know we always return Array, but turning it into a SparseArray is pretty much the first thing we always do with it

eh, I think we should handle non sparse patches. I will work on that

a10y · 2024-09-12T14:43:25Z

encodings/fastlanes/src/bitpacking/mod.rs

+            if SparseArray::try_from(parray)?.indices().is_empty() {
+                vortex_bail!("cannot construct BitPackedArray using patches without indices");
+            }


We depend pretty deeply on this being a SparseArray so I figured it should be checked in the constructor, along with the non-empty indices when present thing

I think this is fine for now, we will want to change it eventually but then we can find all those places

vortex-array/src/array/sparse/mod.rs

a10y · 2024-09-13T02:29:02Z

Alright benchmark comment is updated. Most queries stayed roughly same, the following got slower, some by a considerable margin:

Q1
Q3
Q4
Q7
Q10
Q15
Q16
Q21
Q22

Will do some profiling investigation in the AM

a10y · 2024-09-13T19:53:48Z

Alright, the issue was with filter_slices, it was creating so many slices that the overhead of slicing was just eating everything else up, especially when it came time to re-canonicalize before passing the RecordBatches back to DF.

Now, filter_slices will instead look at all of the slices and determine which chunks must be sliced, passed through, or dropped completely. I'll kick off a full bench run but for Q16 this saves us ~20% VS what we had on develop.

a10y · 2024-09-13T19:58:06Z

vortex-array/src/array/chunked/canonical.rs

@@ -34,10 +34,6 @@ pub(crate) fn try_canonicalize_chunks(
    validity: Validity,
    dtype: &DType,
 ) -> VortexResult<Canonical> {
-    if chunks.is_empty() {


this was a leftover from a time before dtype was able to be passed into this method, we had needed to examine the first chunk to get the DType but that's no longer the case

a10y · 2024-09-13T20:47:07Z

Ok benchmark run just completed, looks like the 20ms speed up i was seeing locally for Q16 was maybe just test jitter, in the PR run it's roughly same as develop. Most queries are 0-10% faster. A number are considerably slower, this time due to slow take implementation. May try and implement the accelerated search_sorted here as well

robert3005 · 2024-09-13T20:48:06Z

@a10y we should just merge this and make follow up prs. There's a lot of noise in the runs so comparing to latest develop runs is a bit difficult. Anyway I will review this and we can make another pr?

a10y · 2024-09-13T20:49:20Z

Alright I'm ok with that if you are. We have the issues for tracking already. I just hate to see graphs go up :P

robert3005 · 2024-09-15T13:24:51Z

vortex-array/src/array/chunked/compute/filter.rs

+fn find_chunk_idx(idx: usize, chunk_ends: &[u64]) -> (usize, usize) {
+    let chunk_id = chunk_ends
+        .binary_search(&(idx as u64))
+        .map(|i| if i == chunk_ends.len() { i - 1 } else { i })
+        .unwrap_or_else(|end| end - 1);
+    let chunk_offset = idx - (chunk_ends[chunk_id] as usize);
+
+    (chunk_id, chunk_offset)
+}


our search_sorted method is defined on primitive arrays and last I checked it wasn't any different (could be different now since 1.81 changed binary_search)

It's that the chunk_ends are usually compressed, so decompressing them for every call to find_chunk_idx adds up, along with the scalar/dtype overhead.

I meant that you can do this https://github.com/spiraldb/vortex/blob/develop/vortex-array/src/compute/search_sorted.rs#L276-L279

i.e. you can do chunk_ends.search_sorted(&idx, SearchSortedSide::Right)

https://github.com/spiraldb/vortex/blob/develop/vortex-array/src/array/chunked/mod.rs#L98-L101 is the right thing to do

Ah yea, that is probably better than what I am doing. Lemme change

robert3005 · 2024-09-15T13:28:07Z

vortex-array/src/array/chunked/compute/filter.rs

+            #[allow(clippy::needless_range_loop)]
+            for chunk in (start_chunk + 1)..end_chunk {
+                chunk_filters[chunk] = ChunkFilter::All;
+            }


one thing that the clippy suggestion does is avoid bounds checking but it would only show up on large arrays

I forgot that you can do &mut vector[start...end], which I think is both clear and avoids bounds check

robert3005

Since I spent some much time looking at weird arrays I know this has a weird behaviour with empty chunks but we can handle it in a followup. Needs adjustment to find_chunk_idx like I did in a previous pr

robert3005 · 2024-09-15T15:23:49Z

I think I meant something else, you can avoid the scalar/array overhead but still use the search sorted logic, i.e. keep find_chunk_idx in chunked array but you can create a mirror of that function over &[u64]. In the current variant you end up with scalar overhead.

The code I linked was to show that our searching logic is implemented for primitive array slices as well as arrays and you need it to handle empty chunks.

This reverts commit ffaf0c7.

a10y · 2024-09-15T15:58:32Z

like that?

implement FilterFn for ChunkedArray

3511581

direct find_chunk_idx

39d4092

format

a48fa8b

a10y added the benchmark Run benchmarks on this branch label Sep 12, 2024

a10y marked this pull request as ready for review September 12, 2024 12:30

github-actions bot removed the benchmark Run benchmarks on this branch label Sep 12, 2024

github-actions bot reviewed Sep 12, 2024

View reviewed changes

a10y added 2 commits September 12, 2024 09:57

change logic for SparseArray::min_index

4bc19fe

do not propagate patches for BPA if zero indices, remove prints

95b9a65

a10y commented Sep 12, 2024

View reviewed changes

a10y added benchmark Run benchmarks on this branch labels Sep 12, 2024

github-actions bot removed the benchmark Run benchmarks on this branch label Sep 12, 2024

a10y commented Sep 12, 2024

View reviewed changes

robert3005 reviewed Sep 12, 2024

View reviewed changes

vortex-array/src/array/sparse/mod.rs Show resolved Hide resolved

a10y added 2 commits September 12, 2024 17:21

fix take-after-slice for BitPackedArray

a8750fa

Merge remote-tracking branch 'origin/develop' into aduffy/more-filterfn

55b74d3

a10y added the benchmark Run benchmarks on this branch label Sep 12, 2024

github-actions bot removed the benchmark Run benchmarks on this branch label Sep 12, 2024

wow was that really it

c411ff5

a10y added the benchmark Run benchmarks on this branch label Sep 13, 2024

github-actions bot removed the benchmark Run benchmarks on this branch label Sep 13, 2024

improve implementation of filter_slices

dc15df5

a10y added the benchmark Run benchmarks on this branch label Sep 13, 2024

Merge remote-tracking branch 'origin/develop' into aduffy/more-filterfn

d5ecb80

a10y added benchmark Run benchmarks on this branch and removed benchmark Run benchmarks on this branch labels Sep 13, 2024

github-actions bot removed the benchmark Run benchmarks on this branch label Sep 13, 2024

a10y commented Sep 13, 2024

View reviewed changes

robert3005 reviewed Sep 15, 2024

View reviewed changes

robert3005 approved these changes Sep 15, 2024

View reviewed changes

lift find_chunk_idx to function, call from filterfn

ffaf0c7

a10y added 3 commits September 15, 2024 11:29

iter mut

ac438f8

Revert "lift find_chunk_idx to function, call from filterfn"

17f5f7b

This reverts commit ffaf0c7.

reuse search_sorted logic

fa3b343

robert3005 merged commit 03e03ce into develop Sep 15, 2024
5 checks passed

robert3005 deleted the aduffy/more-filterfn branch September 15, 2024 18:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement FilterFn for ChunkedArray #794

implement FilterFn for ChunkedArray #794

a10y commented Sep 11, 2024

a10y commented Sep 11, 2024

a10y commented Sep 11, 2024

robert3005 commented Sep 11, 2024

a10y commented Sep 12, 2024 •

edited

Loading

robert3005 commented Sep 12, 2024

robert3005 commented Sep 12, 2024

github-actions bot left a comment •

edited

Loading

robert3005 commented Sep 12, 2024

a10y Sep 12, 2024

robert3005 Sep 12, 2024

a10y Sep 12, 2024

robert3005 Sep 12, 2024

a10y commented Sep 13, 2024

a10y commented Sep 13, 2024 •

edited

Loading

a10y Sep 13, 2024

a10y commented Sep 13, 2024

robert3005 commented Sep 13, 2024

a10y commented Sep 13, 2024

robert3005 Sep 15, 2024

a10y Sep 15, 2024

robert3005 Sep 15, 2024

robert3005 Sep 15, 2024

robert3005 Sep 15, 2024

a10y Sep 15, 2024

robert3005 Sep 15, 2024

a10y Sep 15, 2024

robert3005 left a comment

robert3005 commented Sep 15, 2024 •

edited

Loading

a10y commented Sep 15, 2024


		assert!(array.patches().is_some());

		let patch_indices = SparseArray::try_from(array.patches().unwrap())

implement FilterFn for ChunkedArray #794

implement FilterFn for ChunkedArray #794

Conversation

a10y commented Sep 11, 2024

a10y commented Sep 11, 2024

a10y commented Sep 11, 2024

robert3005 commented Sep 11, 2024

a10y commented Sep 12, 2024 • edited Loading

robert3005 commented Sep 12, 2024

robert3005 commented Sep 12, 2024

github-actions bot left a comment • edited Loading

Choose a reason for hiding this comment

Vortex benchmarks

robert3005 commented Sep 12, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a10y commented Sep 13, 2024

a10y commented Sep 13, 2024 • edited Loading

Choose a reason for hiding this comment

a10y commented Sep 13, 2024

robert3005 commented Sep 13, 2024

a10y commented Sep 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robert3005 left a comment

Choose a reason for hiding this comment

robert3005 commented Sep 15, 2024 • edited Loading

a10y commented Sep 15, 2024

a10y commented Sep 12, 2024 •

edited

Loading

github-actions bot left a comment •

edited

Loading

a10y commented Sep 13, 2024 •

edited

Loading

robert3005 commented Sep 15, 2024 •

edited

Loading