Implement the MUNI/33/0769/2022 R&D project #8
* added evaluation metrics module
* added bpref metric
* added unit tests for evaluation_metrics module
* fixed some style and type errors
* fixed some type errors
* fixed type error
* made the requested changes to EvaluationBase and added to it (and its child classes) an option to choose evaluation depth
* fixed some style errors
* removed unused method from EvaluationBase
* removed unit tests for no longer existing functions
* added checks to avoid division by zero
* added ensembles module
* fixed some typos and style checks
* added rbc ensemble, and updated setup.py to include scikit-learn
* Create and push Docker tags
  Install htop
  Load questions in script.download_datasets
  Only download text collection for ARQMath in scripts.download_datasets
  Create root directory with mode 777 in scripts.download_datasets
  Add "notebooks" extra requirements to setup.py
  Add text+tangentl math format for ARQMath
  Increment patch version of pv211-utils
  Fix syntax error in setup.py
  Fix style error in script.download_datasets
  Fix test.arqmath.test_loader.TestLoadQueriesTangentL
  Update gdown in Jupyter notebooks
  Fetch tags when building Docker image in CI
  Replace Google Drive with HTTP
  Do not redownload TREC and ARQMath datasets if they exist
  Add md5_also_ok to manifest files to allow for two different versions of a dataset
  Do not download gdown in Jupyter notebooks
  Support creating fat Docker images out-of-box
  Update links to collection processing notebooks
  added arqmath3 judgements, created datasets module, and added arqmath class to datasets module
  changed and expanded arqmath class, added cranfield and trec classes, added docstring, fixed code convention issues
  added beir dataset interface, shuffled queries for arqmath and cranfield (added file with the order), and fixed some bugs and naming inconsistencies
  added two blank lines to imports to fix stylecheck
  deleted miscommitted file
  small style change
  sync with main
  return type change in cranfield loader
  fixed some bugs in TrecDataset
  Initial commit, migrated files from gitlab
  Corrected selected style errors
  Replaced google_drive_download with http_download to match master
  Added a rudimentary example ipynb notebook for the CQADupStack datasets of the BEIR collection.
  Added some basic tests
  Fixed some basic style errors and added more tests
  Resolved a bug with id collisions on dataset combination, added a Google Sheet leaderboard, added a proper train/dev/test split, plus some minor changes to example beir notebook.
  Added sorting to desired dataset input to prevent any unwanted randomness
  Minor corrections to beir loader and added proper permissions to leaderboard service account
  Added the ability to prevent unnecessary download of data when data is already present.
  Resolved some type errors
  Added the ability to prevent repetitive download of data when data is already present. ver2
  Added the default download location.
  Updated actions to use Node.js 16
  fixed some typecheck errors
  put a file back
  Don't specify type of `query` in `irsystem.IRSystemBase`
  Update setup.py
  fixed some style checks
  fixed some type errors
  reverted changes to main.yml
  fixed nq train set loading
  reverting some mistakes
  fixed some type errors
  reverted changes to main.yml
  fixed nq train set loading
  reverting some mistakes
  reverted some changes
* fixed bugs in combine beir datasets and beir eval and removed some redundant modules and updated beir notebook
* extended beir.loader tests to cover datasets combining and splitting
* fixed typechecks in test/beir/testloader
---------
Co-authored-by: Vít Novotný <[email protected]>
* add full doc preprocessing
* systems preprocessing
* fix code style
* add math preprocessing
---------
Co-authored-by: MarekToma <[email protected]>
Also rename `preprocessing.preprocessing` to `preprocessing.test_preprocessing`.
This reverts commit 1fd757b.
…processing`" This reverts commit 34a4bb2ece523e9a944d3a1c804a5602b1c2753d.
Looks serviceable. Many thanks to @MarekToma and @VojtechKalivoda for fixing the biggest issues. I opened tickets for non-critical issues: #9, #10, #11, #12
@Witiko I just made the change to make the eval metrics faster (according to your suggestion); maybe we should merge that one too :D (Sorry for being late, I hadn't noticed you merged it in the meantime.)
@MarekToma Thanks, but we'll have to do that one later. Please feel free to file it as a pull request that closes #12. I expect to livepatch some of the smaller issues such as this one after the release of the second term project assignment. There is no big hurry as long as …
This pull request merges all changes proposed in #1, #5, #6, and #7 into the `main` branch.