update to include the KenLMModel example in README
0.1.4 / 2023-11-14
version bump; release
Merge branch 'benlipkin/main' into main
add simple surprisal compute test that should pass as long as full surprise() method executes. no assertions are made.
implement attention mask for the use_bos_token case
Merge branch 'main' of github.com:benlipkin/surprisal into benlipkin/main
Merge branch 'main' of github.com:aalok-sathe/surprisal into main
Merge pull request #14 from aalok-sathe/feature-support-kenlm
OK, we have an MWE! still TODO: decide whether to show the surprisal value: maybe add an option to show it but leave it disabled by default? do we also want bos?
bugfix in ids handling in CustomEncoding
bugfixes; bump numpy version for typing
make KenLMModel visible at the module level
add a KenLM and NGramSurprisal implementation
move repr() to SurprisalArray rather than huggingfacesurprisal. complete CustomEncoding implementation.
actually no point subclassing from tokenizers.Encoding
flesh out interface towards supporting CustomEncoding for custom-tokenized text, e.g. whitespace for kenlm