This repository has been archived by the owner on Oct 30, 2024. It is now read-only.

add: -w/--keyword flag for retrieve/askdir + allow selecting multiple datasets #68

Merged: 3 commits merged into main from feat/hybrid-keyword on Aug 14, 2024


iwilltry42 (Collaborator) commented Aug 13, 2024

Upstream: philippgille/chromem-go#96

The knowledge retrieve and askdir commands get new/updated flags:

  • --keyword / -w:
    • enables hybrid search by pre-filtering the dataset for documents that contain (or do not contain) the given keywords
    • prefix a keyword with - to add it to the "not-include" list
    • included keywords are OR'd, excluded keywords are AND'd: a document has to contain at least one of the included keywords and none of the excluded ones
    • Example: knowledge retrieve -w "foo" -w "bar" -w "-spam" -w "-eggs" "some query" will retrieve documents that contain either "foo" or "bar" but neither "spam" nor "eggs"
      • Same via env: KNOW_RETRIEVE_KEYWORDS="foo,bar,-spam,-eggs" knowledge retrieve "some query"
    • Pitfall: knowledge retrieve "what is foo?" -w "-foo" will likely return nothing, because the documents most relevant to the query are exactly the ones being filtered out
  • --dataset / -d can now be used multiple times
    • whether multiple selected datasets are actually used for retrieval depends on the retriever (e.g. the default BasicRetriever does not support this)
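
For illustration, the include/exclude semantics described above could be sketched roughly as follows. This is not the code from this PR or from chromem-go: the keywordFilter type and the parseKeywords and matches helpers are hypothetical names, and case-insensitive substring matching is an assumption made only for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// keywordFilter is a hypothetical type for this sketch; it is not
// the type used by the knowledge tool or by chromem-go.
type keywordFilter struct {
	include []string // a document must contain at least one of these (OR)
	exclude []string // a document must contain none of these (AND of negations)
}

// parseKeywords splits raw -w values into include and exclude lists.
// A leading "-" puts the keyword on the exclude list.
func parseKeywords(raw []string) keywordFilter {
	var f keywordFilter
	for _, kw := range raw {
		kw = strings.ToLower(strings.TrimSpace(kw))
		if kw == "" || kw == "-" {
			continue // nothing usable to filter on
		}
		if strings.HasPrefix(kw, "-") {
			f.exclude = append(f.exclude, kw[1:])
		} else {
			f.include = append(f.include, kw)
		}
	}
	return f
}

// matches reports whether a document passes the pre-filter.
// Assumption: matching is a case-insensitive substring check.
func (f keywordFilter) matches(doc string) bool {
	doc = strings.ToLower(doc)
	for _, kw := range f.exclude {
		if strings.Contains(doc, kw) {
			return false // any excluded keyword rejects the document
		}
	}
	if len(f.include) == 0 {
		return true // only exclusions were given
	}
	for _, kw := range f.include {
		if strings.Contains(doc, kw) {
			return true // one included keyword is enough
		}
	}
	return false
}

func main() {
	// Same keyword list as the KNOW_RETRIEVE_KEYWORDS example above.
	f := parseKeywords(strings.Split("foo,bar,-spam,-eggs", ","))
	docs := []string{"foo with ham", "bar and spam", "only eggs", "nothing relevant"}
	for _, d := range docs {
		fmt.Printf("%q keep=%v\n", d, f.matches(d))
	}
}
```

With these inputs only "foo with ham" passes the pre-filter: "bar and spam" and "only eggs" each contain an excluded keyword, and "nothing relevant" contains no included keyword. The pitfall above falls out of the same logic: excluding "foo" rejects every document that mentions it.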

Fixes #55

iwilltry42 force-pushed the feat/hybrid-keyword branch 2 times, most recently from 9ec5b6d to 2ccd537, on August 14, 2024 13:54
iwilltry42 force-pushed the feat/hybrid-keyword branch from 2ccd537 to 28f66bb on August 14, 2024 14:00
iwilltry42 merged commit 3ebf9b7 into main on Aug 14, 2024
1 check passed
iwilltry42 deleted the feat/hybrid-keyword branch on August 14, 2024 14:02
Linked issue: feat: Keyword-Based Pre-Filtering