This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Nested buckets vs key namespacing? #422

Closed
joyrexus opened this issue Sep 15, 2015 · 5 comments

Comments

@joyrexus
Contributor

Are there any performance advantages to using nested buckets instead of a suitable key namespace for prefix/range scans?

Context: I'm working on a simple backend web service designed to store/retrieve JSON metadata associated with a set of hierarchically related resources (studies > trials > files). To streamline storing/retrieving this stuff in a bucket, I've started working on a little wrapper for bolt, mentioned in #421. The wrapper provides a PrefixScanner, which makes it relatively straightforward to work with a subset of prefixed keys. Anyway, wondering now if I would have been any better off utilizing nested buckets instead.
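For readers unfamiliar with the key-namespacing approach described above, here is a minimal sketch of a prefix scan. It is not the wrapper from #421. The `seeker` interface lists only the two cursor methods the loop needs; Bolt's `*bolt.Cursor` provides `Seek` and `Next` with these signatures. The `memCursor` type and the `study1/...` keys are hypothetical stand-ins so the example runs without a database file.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seeker is the subset of a cursor that a prefix scan needs.
// Bolt's *bolt.Cursor has Seek and Next methods with these signatures.
type seeker interface {
	Seek(seek []byte) (key, value []byte)
	Next() (key, value []byte)
}

// scanPrefix calls fn for every key that starts with prefix, in key order.
func scanPrefix(c seeker, prefix []byte, fn func(k, v []byte)) {
	for k, v := c.Seek(prefix); k != nil && bytes.HasPrefix(k, prefix); k, v = c.Next() {
		fn(k, v)
	}
}

// memCursor is a hypothetical in-memory stand-in for a bucket cursor.
type memCursor struct {
	keys []string // kept in byte-sorted order
	vals map[string]string
	pos  int
}

func newMemCursor(data map[string]string) *memCursor {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return &memCursor{keys: keys, vals: data}
}

// at returns the entry under the cursor, or nil when past the end.
func (m *memCursor) at() (key, value []byte) {
	if m.pos >= len(m.keys) {
		return nil, nil
	}
	k := m.keys[m.pos]
	return []byte(k), []byte(m.vals[k])
}

// Seek moves to the first key greater than or equal to seek.
func (m *memCursor) Seek(seek []byte) (key, value []byte) {
	m.pos = sort.SearchStrings(m.keys, string(seek))
	return m.at()
}

// Next advances to the following key.
func (m *memCursor) Next() (key, value []byte) {
	m.pos++
	return m.at()
}

func main() {
	c := newMemCursor(map[string]string{
		"study1/trial1/a.json": "{}",
		"study1/trial2/b.json": "{}",
		"study2/trial1/c.json": "{}",
	})
	// Visits only the keys under study1/.
	scanPrefix(c, []byte("study1/"), func(k, v []byte) {
		fmt.Printf("%s = %s\n", k, v)
	})
}
```

With a real database, `scanPrefix` would presumably be called inside a `db.View` transaction with the cursor returned by `Bucket.Cursor()`.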

@hryx

hryx commented Sep 18, 2015

I'm also interested to know. Not so much for performance reasons, but I don't yet understand what the intended purpose of nesting buckets is. Maybe a one-liner in the readme would help clear that up?

In my case, I'm working on a CLI application that stores relatively little data (metadata caching, latest server response messages, etc.).

@benbjohnson
Member

@joyrexus If you have a large number of keys in the sub-bucket then you can save space by not having to prefix each one. Bolt doesn't do key compression like LSMs do. However, if you have a small number of keys then you'll likely have better performance with simple prefixing since Bolt won't have to traverse down another level into the sub-bucket's b-tree.
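As a rough illustration of the space argument, the sketch below estimates the key bytes spent on a repeated prefix. The function name and the numbers are hypothetical, and the estimate ignores page layout and whatever fixed cost a sub-bucket adds, so it is only a back-of-the-envelope figure.

```go
package main

import "fmt"

// prefixOverhead estimates the extra key bytes paid by key namespacing:
// each of the n keys repeats the same prefix. It does not model Bolt's
// on-disk page format or the cost of the sub-bucket itself.
func prefixOverhead(n int, prefix string) int {
	return n * len(prefix)
}

func main() {
	// Hypothetical workload: one million keys under a 14-byte prefix.
	fmt.Println(prefixOverhead(1000000, "study1/trial1/"))
}
```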

@hryx The original use was essentially to allow sorting on two fields. We had an application that stored user events so we needed users grouped together (bucket level 1) and then their events sorted by timestamp (bucket level 2). The same could be accomplished with a single bucket and prefixing but nested buckets made the code a lot easier to read/write.
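To make the single-bucket alternative concrete, here is a minimal sketch of a composite key that sorts by user and then by timestamp under byte-wise key ordering. The `eventKey` helper, the `0x00` separator, and the sample users are assumptions for illustration, not code from the application mentioned above. With nested buckets, the user name would instead name the sub-bucket and the key would be just the 8-byte timestamp.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// eventKey builds a single-bucket key that sorts by user first and then by
// timestamp. The 0x00 separator assumes user IDs never contain a zero byte.
// The timestamp is big-endian so that byte order matches numeric order.
func eventKey(user string, unixNano uint64) []byte {
	key := make([]byte, 0, len(user)+1+8)
	key = append(key, user...)
	key = append(key, 0x00)
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], unixNano)
	return append(key, ts[:]...)
}

func main() {
	keys := [][]byte{
		eventKey("bob", 5),
		eventKey("alice", 300),
		eventKey("alice", 7),
	}
	// Sort byte-wise, the same ordering a B+tree keyed on raw bytes would use.
	sort.Slice(keys, func(i, j int) bool { return bytes.Compare(keys[i], keys[j]) < 0 })
	for _, k := range keys {
		user := k[:len(k)-9]                        // everything before the separator
		ts := binary.BigEndian.Uint64(k[len(k)-8:]) // trailing 8 bytes
		fmt.Printf("%s %d\n", user, ts)
	}
}
```

The decoding in `main` hints at the readability cost: every reader of the bucket has to know the key layout, which the nested-bucket version avoids.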

Another common use case for nested buckets is for multi-tenancy. For example, if you want to help ensure that one customer doesn't accidentally see another customer's data, you can store each customer's data in a different sub-bucket.

@hryx

hryx commented Sep 20, 2015

That clears it up for me 120%. Megathanks @benbjohnson!

@benbjohnson
Member

👍

@joyrexus
Contributor Author

Got it. Thanks @benbjohnson!
