Add native Parquet load/save to item list collections #551

Merged (22 commits) on Dec 14, 2024

@mdekstrand mdekstrand commented Dec 14, 2024

This adds support for loading and saving item list collections in a native Parquet format that preserves empty lists and is noticeably faster to load.
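
To illustrate why the layout matters for empty lists, here is a minimal sketch in plain Python. It is not LensKit's code and does not use Parquet; the helper names (`to_flat_rows`, `from_flat_rows`, `to_nested_rows`, `from_nested_rows`) and the sample keys are hypothetical. The idea is that a flat one-row-per-item table has no row to write for a list with no items, so that list vanishes on a round trip, while a one-row-per-list layout keeps it.

```python
from collections import defaultdict

# Hypothetical collection: user key -> list of recommended item IDs.
lists = {"u1": [10, 20], "u2": [], "u3": [30]}


def to_flat_rows(collection):
    """One row per (key, item) pair, as in a long-format table."""
    return [(key, item) for key, items in collection.items() for item in items]


def from_flat_rows(rows):
    """Rebuild the collection from long-format rows."""
    out = defaultdict(list)
    for key, item in rows:
        out[key].append(item)
    return dict(out)


def to_nested_rows(collection):
    """One row per list, with the items stored as a nested list value."""
    return [(key, list(items)) for key, items in collection.items()]


def from_nested_rows(rows):
    """Rebuild the collection from one-row-per-list rows."""
    return {key: list(items) for key, items in rows}


flat_round_trip = from_flat_rows(to_flat_rows(lists))
nested_round_trip = from_nested_rows(to_nested_rows(lists))

print("u2" in flat_round_trip)    # False: the empty list was dropped
print("u2" in nested_round_trip)  # True: the empty list survived
```

Keeping empty lists matters when a missing list and an empty list mean different things, for example "no recommendations were produced" versus "this user was never scored".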

It also forces ItemList scores to be single-precision floating-point (float32).
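
As a rough illustration of what single precision implies for stored scores, the sketch below uses only the standard-library `struct` module. It is not LensKit's code, and the helper `to_float32` is a hypothetical name. A single-precision value takes 4 bytes instead of 8 and keeps about 7 significant decimal digits, so a score may change slightly when it is stored.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


score = 0.1
stored = to_float32(score)

print(struct.calcsize("<f"), struct.calcsize("<d"))  # 4 8
print(stored == score)              # False: 0.1 is rounded differently in float32
print(abs(stored - score) < 1e-7)   # True: the change is small
print(to_float32(0.5) == 0.5)       # True: 0.5 is exact in both precisions
```

Code that compares scores for exact equality against double-precision values may therefore need a tolerance instead.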

@mdekstrand mdekstrand added the enhancement (New feature or request) and data (Data management support.) labels Dec 14, 2024
@mdekstrand mdekstrand added this to the 2025.1 milestone Dec 14, 2024

codecov bot commented Dec 14, 2024

Codecov Report

Attention: Patch coverage is 87.15596% with 28 lines in your changes missing coverage. Please review.

Project coverage is 90.18%. Comparing base (5c830c6) to head (694b312).
Report is 22 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lenskit/lenskit/data/items.py | 83.47% | 20 Missing ⚠️ |
| lenskit/lenskit/data/collection.py | 92.00% | 4 Missing ⚠️ |
| lenskit/lenskit/data/mtarray.py | 92.50% | 3 Missing ⚠️ |
| lenskit/lenskit/data/checks.py | 85.71% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #551      +/-   ##
==========================================
- Coverage   90.31%   90.18%   -0.14%     
==========================================
  Files         100      100              
  Lines        5950     6145     +195     
==========================================
+ Hits         5374     5542     +168     
- Misses        576      603      +27     


@mdekstrand mdekstrand merged commit 694b312 into lenskit:main Dec 14, 2024
37 of 39 checks passed