Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support remapping for IVF_FLAT, IVF_PQ and IVF_SQ #2708

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

BubbleCal
Copy link
Contributor

@BubbleCal BubbleCal commented Aug 8, 2024

not support IVF_HNSW_* index yet

prepare for supporting remap for new vector index format,
HNSW remap not supported because simply mapping the row ids could break the connectivity of graph

Signed-off-by: BubbleCal <[email protected]>
@github-actions github-actions bot added the enhancement New feature or request label Aug 8, 2024
Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
@codecov-commenter
Copy link

codecov-commenter commented Aug 8, 2024

Codecov Report

Attention: Patch coverage is 73.99267% with 71 lines in your changes missing coverage. Please review.

Project coverage is 78.52%. Comparing base (ef9d0c2) to head (7f5fae7).

Files with missing lines Patch % Lines
rust/lance/src/index/vector/builder.rs 71.73% 0 Missing and 26 partials ⚠️
rust/lance/src/index/vector/ivf.rs 57.44% 12 Missing and 8 partials ⚠️
rust/lance-index/src/vector/storage.rs 64.70% 7 Missing and 5 partials ⚠️
rust/lance/src/index/vector/utils.rs 64.28% 4 Missing and 1 partial ⚠️
rust/lance-index/src/vector/hnsw/builder.rs 0.00% 2 Missing ⚠️
rust/lance-index/src/vector/v3/shuffler.rs 90.90% 0 Missing and 2 partials ⚠️
rust/lance-index/src/vector/hnsw/index.rs 0.00% 1 Missing ⚠️
rust/lance/src/index/vector/fixture_test.rs 0.00% 1 Missing ⚠️
rust/lance/src/index/vector/ivf/v2.rs 98.18% 1 Missing ⚠️
rust/lance/src/session/index_extension.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2708      +/-   ##
==========================================
+ Coverage   78.45%   78.52%   +0.07%     
==========================================
  Files         244      244              
  Lines       84554    84770     +216     
  Branches    84554    84770     +216     
==========================================
+ Hits        66333    66568     +235     
+ Misses      15418    15363      -55     
- Partials     2803     2839      +36     
Flag Coverage Δ
unittests 78.52% <73.99%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wjones127 wjones127 marked this pull request as draft August 13, 2024 18:16
@wjones127
Copy link
Contributor

@BubbleCal I've marked this as draft, since I'm assuming it is not ready for review. (There are no unit tests.) Mark it as ready for review when it is ready.

Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
@github-actions github-actions bot added the python label Dec 5, 2024
Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
Signed-off-by: BubbleCal <[email protected]>
lance_io::ReadBatchParams::RangeFull,
4096,
16,
projection,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need the part_id in batch, just don't read it to save resources

Signed-off-by: BubbleCal <[email protected]>
@BubbleCal BubbleCal changed the title feat: support remapping vector storage and flat index feat: support remapping for IVF_FLAT, IVF_PQ and IVF_SQ Dec 10, 2024
@BubbleCal BubbleCal marked this pull request as ready for review December 10, 2024 11:24
@@ -134,6 +135,10 @@ impl IvfSubIndex for FlatIndex {
Ok(Self {})
}

fn remap(&self, _: &HashMap<u64, Option<u64>>) -> Result<Self> {
Ok(self.clone())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's add a warning log here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh w8, we should remap sub index here no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for v3 we need to remap the subindex & vector storage. flat index doesn't contain anything so it simply returns itself here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants