Initialize metadata and tag indexing encoding for RediSearch #2066

Merged: 2 commits, Jan 30, 2024


PragmaTwice
Member

@PragmaTwice PragmaTwice commented Jan 28, 2024

It closes #2065.

The current encoding design:

search data type metadata:

key (index name) -> flag | expire | version | size | on_data_type (HASH or JSON)

prefixes encoding:

key (index name) | PREFIXES -> prefix1 prefix2 ...

tag field metadata encoding:

key (index name) | TAG_FIELD_META | field name -> separator | case sensitive

tag field index encoding:

key (index name) | TAG_FIELD | field name | tag | key -> (nil)
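As a rough illustration of how the subkeys above could be composed, here is a minimal Python sketch. It is not the code in this PR: the `SubkeyType` numeric values, the 4-byte length prefix, and all function names are assumptions made for the example. The only point it tries to show is that every index entry for one tag shares a common prefix, so a prefix scan can enumerate the matching keys.

```python
import struct
from enum import IntEnum


class SubkeyType(IntEnum):
    # Hypothetical numeric values; the design above only names the subkey kinds.
    PREFIXES = 1
    TAG_FIELD_META = 2
    TAG_FIELD = 3


def put_sized(part: bytes) -> bytes:
    """Length-prefix a component so component boundaries stay unambiguous (assumed scheme)."""
    return struct.pack(">I", len(part)) + part


def prefixes_key(index: bytes) -> bytes:
    # key (index name) | PREFIXES
    return put_sized(index) + bytes([SubkeyType.PREFIXES])


def tag_field_meta_key(index: bytes, field: bytes) -> bytes:
    # key (index name) | TAG_FIELD_META | field name
    return put_sized(index) + bytes([SubkeyType.TAG_FIELD_META]) + put_sized(field)


def tag_field_prefix(index: bytes, field: bytes, tag: bytes) -> bytes:
    # key (index name) | TAG_FIELD | field name | tag
    return (
        put_sized(index)
        + bytes([SubkeyType.TAG_FIELD])
        + put_sized(field)
        + put_sized(tag)
    )


def tag_field_key(index: bytes, field: bytes, tag: bytes, user_key: bytes) -> bytes:
    # The user key is the last component, so it is appended as-is; the value is empty.
    return tag_field_prefix(index, field, tag) + user_key
```

With this layout, looking up all keys tagged `red` in field `color` amounts to scanning the keys that start with `tag_field_prefix(b"idx", b"color", b"red")`.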

Refs:
https://redis.io/docs/interact/search-and-query/
https://redis.io/docs/interact/search-and-query/advanced-concepts/tags/

@mapleFU
Member

mapleFU commented Jan 28, 2024

How can we list all indexes efficiently? 🤔
RediSearch seems to track record inserts and maintain its indexes based on them. Should we support some global tracking structure?

@PragmaTwice
Member Author

> How can we list all indexes efficiently? 🤔 RediSearch seems to track record inserts and maintain its indexes based on them. Should we support some global tracking structure?

Like RediSearch, we will track all HASH and JSON commands and adjust our indexes accordingly.
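To make "adjust our indexes accordingly" a bit more concrete, here is a hedged Python sketch of what a hook on a HASH field write might do for a tag field. None of these names come from the PR; `TagFieldMeta`, `split_tags`, and `on_hash_field_write` are hypothetical, the index is modelled as an in-memory set of `(index, field, tag, key)` tuples rather than real storage keys, and lowercasing case-insensitive tags is an assumed normalization.

```python
from dataclasses import dataclass


@dataclass
class TagFieldMeta:
    # Mirrors the "separator | case sensitive" value of the tag field metadata.
    separator: str = ","
    case_sensitive: bool = False


def split_tags(value: str, meta: TagFieldMeta) -> set:
    """Split a field value into normalized tags, dropping empty pieces."""
    tags = set()
    for raw in value.split(meta.separator):
        tag = raw.strip()
        if not tag:
            continue
        # Assumption: case-insensitive fields are normalized by lowercasing.
        tags.add(tag if meta.case_sensitive else tag.lower())
    return tags


def on_hash_field_write(entries, index, field, key, old_value, new_value, meta):
    """Update index entries after `field` of hash `key` changes from old_value to new_value."""
    old_tags = split_tags(old_value, meta) if old_value is not None else set()
    new_tags = split_tags(new_value, meta)
    # Remove entries for tags that disappeared, add entries for new tags.
    for tag in old_tags - new_tags:
        entries.discard((index, field, tag, key))
    for tag in new_tags - old_tags:
        entries.add((index, field, tag, key))
```

Only the difference between the old and new tag sets is touched, so rewriting a field with the same tags would not modify the index at all.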

@mapleFU
Member

mapleFU commented Jan 29, 2024

> Like RediSearch, we will track all HASH and JSON commands and adjust our indexes accordingly.

I mean that when the process starts, we should "LIST" all indexes.

@PragmaTwice
Member Author

> > Like RediSearch, we will track all HASH and JSON commands and adjust our indexes accordingly.
>
> I mean that when the process starts, we should "LIST" all indexes.

Ahh I got your point.

I think we can add a common prefix to all index keys, or put them in a new column family (CF). Currently I prefer the latter.
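The two options can be compared with a small Python sketch. It models the keyspace as a sorted list of byte strings and a column family as a plain dict, which is only a stand-in for the real storage engine; `INDEX_PREFIX`, the `"search"` CF name, and both function names are hypothetical.

```python
import bisect

# Hypothetical common prefix shared by all index metadata keys (option 1).
INDEX_PREFIX = b"\x00search_index:"


def list_by_prefix(sorted_keys, prefix=INDEX_PREFIX):
    """Option 1: seek to the prefix in a sorted keyspace and read until it no longer matches."""
    start = bisect.bisect_left(sorted_keys, prefix)
    names = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys sharing a prefix are contiguous in sorted order
        names.append(key[len(prefix):])
    return names


def list_from_cf(column_families, cf_name="search"):
    """Option 2: index metadata lives in its own column family, so listing is a full scan of it."""
    return sorted(column_families.get(cf_name, {}))
```

In both cases the startup cost is proportional to the number of indexes rather than to the size of the whole dataset; the separate column family simply avoids reserving a prefix inside an existing keyspace.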

@git-hulk
Member

> > > Like RediSearch, we will track all HASH and JSON commands and adjust our indexes accordingly.
> >
> > I mean that when the process starts, we should "LIST" all indexes.
>
> Ahh I got your point.
>
> I think we can add a common prefix to all index keys, or put them in a new column family (CF). Currently I prefer the latter.

This skeleton code looks good to me. I also prefer the latter if we want to keep the common prefix, since the current column families are not meant for this purpose. But we will need to do a little backward-compatibility work when adding a new column family.


sonarcloud bot commented Jan 29, 2024

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

3 New issues
0 Security Hotspots
60.5% Coverage on New Code
1.0% Duplication on New Code

See analysis details on SonarCloud

@PragmaTwice PragmaTwice requested a review from mapleFU January 30, 2024 10:42
@PragmaTwice PragmaTwice merged commit 9d618e0 into apache:unstable Jan 30, 2024
30 checks passed
@mapleFU
Member

mapleFU commented Jan 30, 2024

So only "tag" is supported currently?

@PragmaTwice
Member Author

> So only "tag" is supported currently?

Yeah. I also plan to add support for numeric indexes.

Text and vector fields, however, are currently not in the plan.

JoverZhang pushed a commit to JoverZhang/kvrocks that referenced this pull request Feb 24, 2024
…2066)

Successfully merging this pull request may close these issues.

Add search metadata and tag field metadata encoding for RediSearch