[MRG] add sqlite3 implementations for `Index`, `CollectionManifest`, and `LCA_Database` (#1808)

* switch to get_matching_sketches
* change default cache size
* count overlaps in SQL?
* initial addition of 'sig fileinfo'
* finish first-draft implementation of fileinfo and get_manifest
* cleanup and move over to sourmash_args
* add manifest and length support to LCA_Database
* add rebuild/no-rebuild args
* use BitArray to convert uint to int
* cleanup
* fix the things?
* cleanup
* more cleanup
* flag when scores are diff
* fix __len__ for zipfiles, __bool__ interpretation
* add more index, etc
* more cleanup
* correct for rust panic a la zip
* commit every so often...
* add some comments
* get basic manifest-generating machinery working
* update manifest stuff
* add bitstring in support of SqliteIndex
* more cleanup
* add more tests
* add conditions to _get_matching_sketches
* remove conditions
* remove errant raise
* update structure
* some commentary
* switch over to debug_literal
* switch to debug_literal; test tricky ordering
* add LCA database test for tricky ordering
* add test for jaccard ordering to SBTs
* add LCA database test for tricky ordering
* add test for jaccard ordering to SBTs
* add bitstring to setup
* factor out CollectionManifest_Sqlite
* some basic manifests
* add sqlite manifest rows interface
* minor refactor
* support sig manifest / test it
* move row insert into manifest class
* test creation of sqlite mf
* switch to explicit moltype
* cleanup and refactoring
* cleanup
* SQLite manifests are now first class
* pip cache should be looking at setup.cfg I think?
* and tox cache should be looking at setup.cfg, too
* try again/invalidate cache
* try again
* remove print
* fix some stuff
* even more
* add 'sourmash_versions' table
* test direct sqlmf creation & loading
* improve version checking
* test various insertion errors
* fix num support in sqlite manifests (but not index)
* add explicit validation code, to be removed later
* explicit check of 'num'
* add more docs/notes/annotations for work
* rename CollectionManifest_Sqlite to SqliteCollectionManifest
* preliminary victory over rankinfo
* provide generic LCA Database functionality via sqlite
* refactor and comment
* refactor and document
* add sqlite_utils
* cleanup
* parse out SqliteIndex.create
* rm comment
* add database_format to lca index
* get sql database output working for LCA index
* get all lca tests working on SQL version of LCA_Database
* add test_index_protocol
* add tests of indices after save/load
* match Index definition of __len__ in sbt
* more index tests
* add some generic manifest tests
* define abstract base class for CollectionManifest
* fix GTDB example, sigh
* test hashval_to_idx
* add actual test for min num in rankinfo
* update 'get_lineage_assignments' in lca_db
* update comment
* make lid_to_idx and idx_to_ident private
* moar comment
* add sqlite classes to protocol tests
* adjust protocol
* update to match protocol
* add, then hide, RevIndex test
* update the LCA_Database protocol
* SqliteCollectionManifest now passes all the tests
* update row check to ignore _ prefixes
* implement remaining lca_db protocol for sqlite
* fix up rankinfo for sqlite LCA_Database
* finish testing the rest of the Index classes
* cleanup
* upd
* cleanup LCA_Database creation
* backport 08ac110
* add sqlite loading to CollectionManifest
* update manifest writing to support SQL, too
* switch to using generic manifest.write_to_filename
* catch pre-existing sqlite DBs
* remove test for now-implemented func
* work through various merge implications
* switch away from a row tuple in CollectionManifest
* more clearly separate internals of LCA_Database from public API
* add saved/loaded manifest
* add test coverage for exceptions in LazyLoadedIndex
* add docstrings to manifest code
* add docstrings / comments
* fix sig check reliance on internal manifest mechanism
* fix picklist stuff when using Sqlite manifests
* add lots of debug stmts
* remove SQLite pickset as impractical
* remove some expensive debugs
* remove sql picklist code as too slow
* comments and cleanup
* much cleanup
* re-add debug_literal
* more cleanup
* comment
* fix 'num' select
* test and document locations()
* use names in namedtuple; add containment test
* add numerical values to jaccard order tests
* cleanup
* remove redundant tests
* test scaled=1 stuff pretty explicitly
* rename 'create_from_manifest' method
* cleanup
* add required_keys check
* check manifest equality only on required keys
* add required_keys check
* add index tests for LCA_SqliteDatabase
* constructor/etc refactoring
* add scaled/downsample test
* add downsample_scaled etc
* remove unused code
* cleanup
* update comment
* rename tables to have prefix sourmash_
* update with many a test
* fix diagnostic output during sourmash index #1949
* handle bad versions of stuff
* update/simplify version checking
* add append test
* add notes about further tests
* minor comment update
* fix after merge
* update table name for lineage db
* more docs
* implement loading of LCA_SqliteDatabases at command line
* cleanup and testing
* start adding some documentation
* add location and manifest properties to LCA_SqliteDatabase
* update
* update index protocol tests to check location, manifest
* add tests for fileinfo on all sql db variants
* add test for signatures_with_location
* upd
* add test of new-style lineage db file
* upd/cleanup
* try out inheritance instead of composition
* comment
* more cleanup
* clean up LCA_SqliteDatabase
* create some more tests...
* update checklist
* refactor and cleanup
* round out the tests a bit
* allow append
* cleanup, doc
* cleanup/simplify
* support picklists in LCA_Database.signatures
* fix up @ctb in LCA tests
* cleanup @ctb in test_cmd_signature
* add tests for picklist support in LCA_database.signatures()
* many minor updates
* more tests
* add more manifest tests
* add some final? tests
* one final test
* fix typo via @mr-eyes
* remove unnecessary PARSE_DECLTYPES
* add docs for creating sqldb
* do not allow overwrite/append to existing lca database
* Update src/sourmash/lca/lca_db.py
  Co-authored-by: Mohamed Abuelanin <[email protected]>
* fix bug with duplicate lineages in LCA_SqliteDatabase
* fix test broken by duplicate lineage fix

Co-authored-by: Mohamed Abuelanin <[email protected]>