-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bcachefs: document btree iter flags #287
Open
dlrobertson
wants to merge
809
commits into
koverstreet:master
Choose a base branch
from
dlrobertson:add-docs
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…dvance The way btree iterators work internally has been changing, particularly with the iter->real_pos changes, and bch2_btree_iter_next() is no longer hyper optimized - it's just advance followed by peek, so it's more efficient to just call advance where we're not using the return value of bch2_btree_iter_next(). Signed-off-by: Kent Overstreet <[email protected]>
btree node iterators need to obey the regular btree node invarionts w.r.t. iter->real_pos; once they do, bch2_btree_iter_traverse will have less that it needs to check. Signed-off-by: Kent Overstreet <[email protected]>
This means bch2_btree_iter_traverse_one() can be made more efficient. Signed-off-by: Kent Overstreet <[email protected]>
Since we're no longer doing next() immediately followed by peek(), this optimization isn't doing anything anymore. Signed-off-by: Kent Overstreet <[email protected]>
This just gives some internal helpers some better names. Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>
Ideally we'll be getting rid of peek_with_updates(), but the callers will need to be checked. Signed-off-by: Kent Overstreet <[email protected]>
peek() has to update iter->real_pos - there's no need for bch2_btree_iter_set_pos() to update it as well. Signed-off-by: Kent Overstreet <[email protected]>
More prep work for snapshots. Signed-off-by: Kent Overstreet <[email protected]>
It was using the method for btree_ptr_v1, but that wasn't checking all the fields. Signed-off-by: Kent Overstreet <[email protected]>
It had some silly redundancies. Signed-off-by: Kent Overstreet <[email protected]>
External (to the btree iterator code) users of bch2_btree_iter_traverse expect that on success the iterator will be pointed at iter->pos and have that position locked - but since we split iter->pos and iter->real_pos, that means it has to update iter->real_pos if necessary. Internal users don't expect it to modify iter->real_pos, so we need two separate functions. Signed-off-by: Kent Overstreet <[email protected]>
This adds a mode to six locks where readers use percpu counters - avoiding writing to shared cachelines. The algorithm is the same as the existing percpu-rwsemaphore's slowpath algorithm: taking a read lock still requires a memory barrier to check if we raced with another thread taking a write lock, but this means that taking a write lock doesn't incur the cost of an RCU barrier. Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>
The default was 1/256th of the device and capped at 512MB, which is fairly tiny these days. Signed-off-by: Kent Overstreet <[email protected]>
Bkey noops were introduced to deal with trimming inline data extents in place in the btree: if the u64s field of a bkey was 0, that u64 was a noop and we'd start looking for the next bkey immediately after it. But extent handling has been lifted above the btree - we no longer modify existing extents in place in the btree, and the compatibilty code for old style extent btree nodes is gone, so we can completely drop this code. Signed-off-by: Kent Overstreet <[email protected]>
On btree node split, we weren't ensuring the min_key of the new larger node packs in the new format for this node. This triggers some painful slowpaths in the bset.c aux search tree code - this patch fixes that by calculating a new format for the new node with the new min_key. Signed-off-by: Kent Overstreet <[email protected]>
We weren't packing the min/max keys, which was a major oversight and completely disabled generating bkey_floats for adjacent nodes. Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>
…d to When we pass BTREE_INSERT_NOUNLOCK bch2_trans_commit isn't supposed to unlock after a successful commit, but it was calling bch2_trans_cond_resched() - oops. Signed-off-by: Kent Overstreet <[email protected]>
Since we now make sure to always generate packed bkey formats that can pack the min_key of a btree node, this path should actually never happen. Signed-off-by: Kent Overstreet <[email protected]>
The btree key cache mutex was becoming a significant bottleneck - it was mainly used to protect the lists of dirty, clean and freed cached keys. This patch eliminates the dirty and clean lists - instead, when we need to scan for keys to drop from the cache we iterate over the rhashtable, and thus we're able to remove most uses of that lock. Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>
With snapshots, we're going to need to differentiate between comparisons that should and shouldn't include the snapshot field. bpos_cmp is now the comparison function that does include the snapshot field, used by core btree code. Upper level filesystem code generally does _not_ want to compare against the snapshot field - that code wants keys to compare as equal even when one of them is in an ancestor snapshot. Signed-off-by: Kent Overstreet <[email protected]>
This patch starts treating the bpos.snapshot field like part of the key in the btree code: * bpos_successor() and bpos_predecessor() now include the snapshot field * Keys in btrees that will be using snapshots (extents, inodes, dirents and xattrs) now always have their snapshot field set to U32_MAX The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that determines whether we're iterating over keys in all snapshots or not - internally, this controlls whether bkey_(successor|predecessor) increment/decrement the snapshot field, or only the higher bits of the key. We add a new member to struct btree_iter, iter->snapshot: when BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always equal iter->snapshot, which will be 0 for btrees that don't use snapshots, and alsways U32_MAX for btrees that will use snapshots (until we enable snapshot creation). This patch also introduces a new metadata version number, and compat code for reading from/writing to older versions - this isn't a forced upgrade (yet). Signed-off-by: Kent Overstreet <[email protected]>
This patch adds two new inode fields, bi_dir and bi_dir_offset, that point back to the inode's dirent. Since we're only adding fields for a single backpointer, files that have been hardlinked won't necessarily have valid backpointers: we also add a new inode flag, BCH_INODE_BACKPTR_UNTRUSTED, that's set if an inode has ever had multiple links to it. That's ok, because we only really need this functionality for directories, which can never have multiple hardlinks - when we add subvolumes, we'll need a way to enemurate and print subvolumes, and this will let us reconstruct a path to a subvolume root given a subvolume root inode. Signed-off-by: Kent Overstreet <[email protected]>
For snapshots, when we allocate a new inode we want to allocate an inode number that isn't in use in any other subvolume. We won't be able to use ITER_SLOTS for this, inode allocation needs to change to use BTREE_ITER_ALL_SNAPSHOTS. Signed-off-by: Kent Overstreet <[email protected]>
Since move.c isn't aware of what subvolume we're in, we can't use the standard inode lookup code - fortunately, we're just using it for reading IO options. Signed-off-by: Kent Overstreet <[email protected]>
This adds a new watermark for the journal reclaim when flushing btree key cache entries - it should try and stay ahead of where foreground threads doing transaction commits will enter direct journal reclaim. Signed-off-by: Kent Overstreet <[email protected]>
This is specifically to speed up bch2_inode_rm(), so that we're not traversing iterators we're done with. Signed-off-by: Kent Overstreet <[email protected]>
koverstreet
force-pushed
the
master
branch
5 times, most recently
from
November 4, 2024 07:06
e4b827e
to
b3a7824
Compare
koverstreet
force-pushed
the
master
branch
3 times, most recently
from
November 15, 2024 01:20
48c34ec
to
c7e3dd8
Compare
koverstreet
force-pushed
the
master
branch
7 times, most recently
from
November 30, 2024 02:27
8903f5f
to
bc01863
Compare
koverstreet
force-pushed
the
master
branch
8 times, most recently
from
December 8, 2024 05:20
f1d736e
to
f22a971
Compare
koverstreet
force-pushed
the
master
branch
6 times, most recently
from
December 12, 2024 07:29
79c1539
to
f8ed207
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add documentation commets to the btree iter flag definitions.
CC: #269
Signed-off-by: Dan Robertson [email protected]