-
Notifications
You must be signed in to change notification settings - Fork 3.9k
backfill: handle transaction retry for vector index backfill #144328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball)
pkg/sql/rowexec/indexbackfiller.go
line 197 at r1 (raw file):
} // Initialize the tmpEntry. This will store the input entry that we are encoding
nit: mention here that this allows us to preserve the initial "template" indexEntry across txn retries
pkg/sql/rowexec/indexbackfiller.go
line 207 at r1 (raw file):
tmpEntry.Value.RawBytes = tmpEntry.Value.RawBytes[:0] tmpEntry.Key = append(tmpEntry.Key, indexEntry.Key...) tmpEntry.Value.RawBytes = append(tmpEntry.Value.RawBytes, indexEntry.Value.RawBytes...)
super nit:
tmpEntry.Key = append(tmpEntry.Key[:0], indexEntry.Key...)
tmpEntry.Value.RawBytes = append(tmpEntry.Value.RawBytes[0:], indexEntry.Value.RawBytes...)
pkg/sql/backfill/backfill.go
line 523 at r1 (raw file):
outputEntry.Key = outputEntry.Key[:0] outputEntry.Key = append(outputEntry.Key, vih.indexPrefix...)
super nit: similar to below:
outputEntry.Key = append(outputEntry.Key[:0], vih.indexPrefix...)
Is it possible to add tests that detect and regress this issue? |
80bfbcb
to
4fc6cce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mw5h)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly looks good, i just had some minor nits
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mw5h)
pkg/sql/rowexec/indexbackfiller.go
line 215 at r3 (raw file):
} if ib.flowCtx.Cfg.TestingKnobs.VectorIndexBackfillTxnError != nil {
super nit: this testing hook could have a more general name, like RunDuringReencodeVectorIndexEntry
, but i have no strong opinion on changing the name if you think we're unlikely to use this for any other purpose.
pkg/sql/backfill/backfill_test.go
line 27 at r3 (raw file):
) func TestMain(m *testing.M) {
nit: let's stick with the convention of keeping TestMain
inside of a file called main_test.go
we also usually add a line like this to main_test.go
, which updates a testing linter
//go:generate ../../util/leaktest/add-leaktest.sh *_test.go
pkg/sql/backfill/backfill_test.go
line 64 at r3 (raw file):
if !errorState.hasErrored { errorState.hasErrored = true return &testRetryError{}
in other tests that simulate retry errors, we use txn.GenerateForcedRetryableErr
cockroach/pkg/sql/mvcc_backfiller_test.go
Line 844 in 14ea61b
return txn.GenerateForcedRetryableErr(ctx, "forcing a retry error") |
could we use that here to avoid needing to create a mocked retry error? it would require passing in the kv.Txn
object into the testing hook, which seems doable.
RFAL |
Previously, when backfilling a vector index, a transaction retry would cause backfill failure because the vector index backfill writer would overwrite the values read by the backfill reader with what it wanted to push down to KV so as to avoid new allocations. This worked well so long as there were no transaction retries but would fail to re-encode the index entry on a retry because the writer lost access to the unquantized vector. The quantized vector cannot be reused on a transaction retry because fixups may cause the target partition to change. This patch creates a scratch rowenc.IndexEntry in the backfill helper to store the input vector entry. Before attempting to write the entry, we copy the input IndexEntry to the scratch entry and use that to re-encode the vector, which is still modified in place to limit new allocations. Additionally, this patch switches the writer from using CPut() to CPutAllowingIfNotExists() so that if the backfill job restarts, we don't see the partially written index and fail due to duplicate keys. Informs: cockroachdb#143107 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mw5h)
bors r+ |
Build succeeded: |
blathers backport 25.2 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 02738e6 to blathers/backport-release-25.2-144328: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 25.2 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
blathers backport 25.2 |
Previously, when backfilling a vector index, a transaction retry would cause backfill failure because the vector index backfill writer would overwrite the values read by the backfill reader with what it wanted to push down to KV so as to avoid new allocations. This worked well so long as there were no transaction retries but would fail to re-encode the index entry on a retry because the writer lost access to the unquantized vector. The quantized vector cannot be reused on a transaction retry because fixups may cause the target partition to change.
This patch creates a scratch rowenc.IndexEntry in the backfill helper to store the input vector entry. Before attempting to write the entry, we copy the input IndexEntry to the scratch entry and use that to re-encode the vector, which is still modified in place to limit new allocations.
Additionally, this patch switches the writer from using CPut() to CPutAllowingIfNotExists() so that if the backfill job restarts, we don't see the partially written index and fail due to duplicate keys.
Informs: #143107
Release note: None