Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/compact: defer retrieval of external values #4198

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Dec 12, 2024

During compaction iteration, defer the retrieval of values stored in value blocks until either:

a) we need to copy the value before stepping the iterator, or b) the compaction iterator yields the value to the main compaction loop.

This refactor ensures that when a KV is elided by a tombstone, we avoid unnecessarily loading its value from the external value block. Additionally, refactoring the compaction iterator interfaces to propagate LazyValues will be used by value separation (#112) in which we will sometimes propagate external value references to output sstables without ever retrieving the value.

Close #4197.
Informs #112.

@jbowens jbowens requested a review from a team as a code owner December 12, 2024 16:43
@jbowens jbowens requested a review from itsbilal December 12, 2024 16:43
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jbowens jbowens requested a review from sumeerbhola December 12, 2024 16:43
@jbowens
Copy link
Collaborator Author

jbowens commented Dec 12, 2024

Hrm, two cache entry was not freed errors in CI: one on 32-bit linux and another on macOS. I've been unable to repro

@RaduBerinde
Copy link
Member

RaduBerinde commented Dec 12, 2024

I saw that error in CI also: https://github.com/cockroachdb/pebble/actions/runs/12304080308/job/34340708418?pr=4199
Possibly related to #4157

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions (waiting on @itsbilal, @jbowens, and @sumeerbhola)


internal/compact/iterator.go line 1291 at r1 (raw file):

		i.value = base.LazyValue{}
		return
	} else if !callerOwned {

[nit] remove the else or the return


internal/compact/iterator.go line 1408 at r1 (raw file):

	}
	lv = base.MakeInPlaceValue(value)
	return

[nit] return base.MakeInPlaceValue(value), needDelete, closer, err and _ base.LazyValue

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @itsbilal and @sumeerbhola)


internal/compact/iterator.go line 1291 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] remove the else or the return

Done


internal/compact/iterator.go line 1408 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] return base.MakeInPlaceValue(value), needDelete, closer, err and _ base.LazyValue

Done

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 5 files at r1, 1 of 2 files at r3, all commit messages.
Reviewable status: 5 of 6 files reviewed, 11 unresolved discussions (waiting on @itsbilal and @jbowens)


internal/compact/run.go line 218 at r3 (raw file):

			continue
		}
		v, _, err := value.Value(nil)

We used to call LazyValue.Value(nil) to initialize compact.Iter.iterValue for the current iterator position. That made sense since it would never be caller owned. And then any instability was handled by reusing valueBuf.

Now we have two kinds of LazyValues here: those that have not been cloned, and those that have been. For the latter, the valueBlockFetcher may be closed, in which case it will need to allocate. So I think we need another scratch buffer in this Runner to avoid those allocations.


internal/compact/iterator_test.go line 204 at r3 (raw file):

						d.Fatalf(t, "unknown iter command: %s", parts[0])
					}
					value := lv.InPlaceValue()

why is the test assuming in-place instead of using the more general LazyValue.Value?


internal/compact/iterator.go line 560 at r3 (raw file):

					return &i.key, i.value
				} else if i.err != nil {
					return nil, base.LazyValue{}

nit: consider repeating the comment from above // SINGLEDELs are value-less.


internal/compact/iterator.go line 586 at r3 (raw file):

			origSnapshotIdx := i.curSnapshotIdx
			var valueMerger base.ValueMerger
			// MERGE values are always stored in-place.

Can you add an invariants.Enabled assertion.


internal/compact/iterator.go line 668 at r3 (raw file):

func (i *Iter) iterNext() bool {
	i.iterKV = i.iter.Next()
	if i.err != nil {

It seems that we should no longer need to look at i.err given we are not changing it, and it will be nil when this function is called.


internal/compact/iterator.go line 893 at r3 (raw file):

			// continue looping.
			//
			// MERGE values are always stored in-place.

some comment about adding an assertion


internal/compact/iterator.go line 919 at r3 (raw file):

	// Save the current key.
	i.saveKey()
	i.value = i.iterKV.V

This i.value = ... seems dubious. We are going to immediately call i.nextInStripe(), and we haven't cloned the LazyValue, so this value is no longer backed by anything.
I think it worked because i.iterValue was always the empty slice in this case. And now with the LazyValue being in-place, the LazyFetcher will be nil, so it will work. But is in principle incorrect. I think we should explicitly assert that this is in-place, and the empty slice, and then set i.value = LazyValue{}.


internal/compact/iterator.go line 1114 at r3 (raw file):

	// In this case, we still peek forward in case there's another DELSIZED key
	// with a lower sequence number, in which case we'll adopt its value.
	// If the DELSIZED does have a value, it must be in-place.

assertion please


internal/compact/iterator.go line 1160 at r3 (raw file):

			// If the tombstone has a value, it must be in-place. To save it, we
			// can just copy the in-place value directly.
			i.valueBuf = append(i.valueBuf[:0], i.iterKV.InPlaceValue()...)

assertion please


internal/compact/iterator.go line 1286 at r3 (raw file):

// result into i.valueBuf).
func (i *Iter) saveValue() {
	v, callerOwned, err := i.iterKV.V.Value(i.valueBuf[:0])

I don't think we should be fetching. And should be doing i.value, i.valueBuf = i.iterKV.V.Clone(i.valueBuf[:0], &i.scratchFetcher)


internal/compact/iterator.go line 192 at r3 (raw file):

	valueBuf         []byte
	iterKV           *base.InternalKV
	iterValue        []byte

I'm glad iterValue is no more. It seemed to be redundant with iterKV but if iterKV was nil we wouldn't set it to nil, which seemed error prone.

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 6 files reviewed, 10 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


internal/compact/iterator.go line 560 at r3 (raw file):

Previously, sumeerbhola wrote…

nit: consider repeating the comment from above // SINGLEDELs are value-less.

Done.


internal/compact/iterator.go line 586 at r3 (raw file):

Previously, sumeerbhola wrote…

Can you add an invariants.Enabled assertion.

InPlaceValue() already asserts.


internal/compact/iterator.go line 668 at r3 (raw file):

Previously, sumeerbhola wrote…

It seems that we should no longer need to look at i.err given we are not changing it, and it will be nil when this function is called.

You're right, although I'm confused as to why we're not checking i.iter.Error() here?


internal/compact/iterator.go line 893 at r3 (raw file):

Previously, sumeerbhola wrote…

some comment about adding an assertion

InPlaceValue() already asserts.


internal/compact/iterator.go line 919 at r3 (raw file):

Previously, sumeerbhola wrote…

This i.value = ... seems dubious. We are going to immediately call i.nextInStripe(), and we haven't cloned the LazyValue, so this value is no longer backed by anything.
I think it worked because i.iterValue was always the empty slice in this case. And now with the LazyValue being in-place, the LazyFetcher will be nil, so it will work. But is in principle incorrect. I think we should explicitly assert that this is in-place, and the empty slice, and then set i.value = LazyValue{}.

Agreed.


internal/compact/iterator.go line 1114 at r3 (raw file):

Previously, sumeerbhola wrote…

assertion please

InPlaceValue() already asserts.


internal/compact/iterator.go line 1160 at r3 (raw file):

Previously, sumeerbhola wrote…

assertion please

InPlaceValue() already asserts.


internal/compact/iterator.go line 1286 at r3 (raw file):

Previously, sumeerbhola wrote…

I don't think we should be fetching. And should be doing i.value, i.valueBuf = i.iterKV.V.Clone(i.valueBuf[:0], &i.scratchFetcher)

I had that initially but decided I did not want introduce the possibility of needing to load the value from an external block reader in this PR. Fetching the value keeps the behavior closer to what was here originally, constraining the surface area of this PR.


internal/compact/run.go line 218 at r3 (raw file):

Previously, sumeerbhola wrote…

We used to call LazyValue.Value(nil) to initialize compact.Iter.iterValue for the current iterator position. That made sense since it would never be caller owned. And then any instability was handled by reusing valueBuf.

Now we have two kinds of LazyValues here: those that have not been cloned, and those that have been. For the latter, the valueBlockFetcher may be closed, in which case it will need to allocate. So I think we need another scratch buffer in this Runner to avoid those allocations.

Since we're not currently Clone-ing, this value is necessarily in-place. My preference is to defer any use of Clone until a subsequent PR. I'm unsure it's ever worthwhile for compactions to Clone value block references. I don't think there's ever an instance where we call saveValue and then don't end up reading the value, because we only need to iterate forward like this to determine SET vs SETWITHDEL


internal/compact/iterator_test.go line 204 at r3 (raw file):

Previously, sumeerbhola wrote…

why is the test assuming in-place instead of using the more general LazyValue.Value?

This particular unit test just does not support anything other than in-place values. It's trivial to use Value. Done, so there's one less thing to edit if we extend the test to construct non in-place values.

During compaction iteration, defer the retrieval of values stored in value
blocks until either:

a) we need to copy the value before stepping the iterator, or
b) the compaction iterator yields the value to the main compaction loop.

This refactor ensures that when a KV is elided by a tombstone, we avoid
unnecessarily loading its value from the external value block. Additionally,
refactoring the compaction iterator interfaces to propagate LazyValues will be
used by value separation (cockroachdb#112) in which we will sometimes propagate external
value references to output sstables without ever retrieving the value.

Close cockroachdb#4197.
Informs cockroachdb#112.
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 6 files reviewed, 10 unresolved discussions (waiting on @itsbilal and @sumeerbhola)


internal/compact/run.go line 218 at r3 (raw file):

Since we're not currently Clone-ing, this value is necessarily in-place.

I mean, this value is necessarily in-place or for the current iterator position.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r3, 1 of 2 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @itsbilal and @jbowens)


internal/compact/iterator.go line 668 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

You're right, although I'm confused as to why we're not checking i.iter.Error() here?

It's probably because what is returned in i.iter.Error() will also be returned in i.iter.Close().


internal/compact/iterator.go line 1286 at r3 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I had that initially but decided I did not want introduce the possibility of needing to load the value from an external block reader in this PR. Fetching the value keeps the behavior closer to what was here originally, constraining the surface area of this PR.

I'm slightly worried we will forget it. Could you add a TODO, and also include in that scope that run.go should use a scratch buffer?


internal/compact/run.go line 218 at r3 (raw file):

I don't think there's ever an instance where we call saveValue and then don't end up reading the value, because we only need to iterate forward like this to determine SET vs SETWITHDEL

But we will in the future, yes? When a compaction is not rewriting its references to blob files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

db: refactor compaction iteration interfaces for deferring value retrieval
4 participants