MongoDB 3 Compression Options #1099
Comments
Compression can be configured in MongoDB's startup options. MongoEngine does not need to do anything.
I'm not familiar with this, so I may be wrong, but these compression options can be used at the collection level, as described in the article linked to by @robodude666. To use them, one needs to pass specific options through kwargs (see also this SO question) in pymongo's create_collection. It would make sense to expose these options in MongoEngine and from a quick glance the
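To make the kwargs approach above concrete, here is a minimal sketch of passing a WiredTiger block compressor to pymongo's create_collection. The helper names `wiredtiger_collection_options` and `create_compressed_collection` are hypothetical and not part of pymongo or MongoEngine; the `storageEngine` document follows the shape MongoDB documents for per-collection WiredTiger settings, and it only takes effect when the server runs WiredTiger.

```python
from typing import Any, Dict

# Block compressors available in MongoDB 3.0 with WiredTiger.
ALLOWED_COMPRESSORS = {"none", "snappy", "zlib"}


def wiredtiger_collection_options(block_compressor: str = "zlib") -> Dict[str, Any]:
    """Build the keyword arguments for a per-collection compressor (hypothetical helper)."""
    if block_compressor not in ALLOWED_COMPRESSORS:
        raise ValueError(f"unsupported block compressor: {block_compressor!r}")
    return {
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={block_compressor}"}
        }
    }


def create_compressed_collection(db: Any, name: str, block_compressor: str = "zlib") -> Any:
    """Create a collection on a pymongo Database with the given compressor.

    `db` is assumed to be a pymongo Database; extra keyword arguments to
    create_collection are forwarded to the server's create command.
    """
    return db.create_collection(name, **wiredtiger_collection_options(block_compressor))
```

With a live server this would be called as `create_compressed_collection(MongoClient().mydb, "articles")`. It only works for collections that do not exist yet, since the compressor is fixed at creation time.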
You don't have control over the construction of GridFS collections, neither the file-tracking one nor the one containing the actual chunks. That leaves such configuration to manual effort or server-wide configuration, as was previously pointed out.

Additionally, MongoDB's in-database compression defaults to Snappy for performance reasons, or lets you use fast zlib, neither of which offers worthwhile compression ratios. (Zlib is a typical dictionary-based Huffman coder; Snappy uses no entropy coding at all, relying instead on repetitions described by relative references in the output stream, so at worst it's literally 100% worse than gzip. More akin to RLE. ;)

On Lewis Carroll's "Through the Looking Glass" (Project Gutenberg

Compare that to something a bit more… modern… like

Conclusion: compress material before archiving it into GridFS; WiredTiger compression is intended for absolute speed and data mutability, not efficient archival. This is doubly important if you store mixed content in GridFS, such as images, audio, or video alongside the text content. Any form of in-database compression would actually increase the size of stored data that is already extremely tightly entropy coded, as sound and video are.
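The conclusion above, compressing before archiving, can be sketched with Python's standard library. This is only an illustration under assumptions: LZMA (xz) stands in for the "more modern" compressor, which the comment does not name here, and `fs` is assumed to be a `gridfs.GridFS` instance. The function names and the `compression` metadata field are hypothetical.

```python
import lzma
from typing import Any


def compress_for_archive(data: bytes) -> bytes:
    """Compress bytes with LZMA at the highest preset (slow, but compact)."""
    return lzma.compress(data, preset=9)


def restore_from_archive(blob: bytes) -> bytes:
    """Reverse compress_for_archive."""
    return lzma.decompress(blob)


def store_compressed(fs: Any, data: bytes, filename: str) -> Any:
    """Store pre-compressed data in GridFS.

    `fs` is assumed to be a gridfs.GridFS instance. Extra keyword arguments to
    put() are saved as fields on the file document, so the hypothetical
    `compression` field records how to decode the content later.
    """
    return fs.put(compress_for_archive(data), filename=filename, compression="xz")
```

Already-compressed media (images, audio, video) is better stored as-is: running it through another compressor tends to add container overhead without shrinking the payload.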
One of the new features added in MongoDB 3.0 is compression. Could these compression options please be supported, especially combined with GridFS for storing text files?
From the article on MongoDB's blog, it appears the WiredTiger Storage Engine would be required to support this functionality; not sure if this is yet supported or not.