backfill: handle transaction retry for vector index backfill #144328

Open
wants to merge 1 commit into
base: master
from

Conversation

mw5h
Contributor

@mw5h mw5h commented Apr 11, 2025

Previously, a transaction retry during a vector index backfill would cause the backfill to fail. To avoid new allocations, the vector index backfill writer overwrote the values read by the backfill reader with what it wanted to push down to KV. This worked as long as there were no transaction retries, but on a retry the writer could not re-encode the index entry because it had lost access to the unquantized vector. The quantized vector cannot be reused on a transaction retry because fixups may cause the target partition to change.

This patch creates a scratch rowenc.IndexEntry in the backfill helper to store the input vector entry. Before attempting to write the entry, we copy the input IndexEntry to the scratch entry and use that to re-encode the vector, which is still modified in place to limit new allocations.
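
A minimal sketch of the scratch-entry pattern, under simplifying assumptions: `entry` is a hypothetical stand-in for `rowenc.IndexEntry`, and `quantizeInPlace` is an invented lossy encoder rather than the real vector quantizer. It only illustrates why each attempt should start from an untouched copy of the input.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for rowenc.IndexEntry.
type entry struct {
	Key   []byte
	Value []byte
}

// quantizeInPlace is an invented lossy encoder (an assumption for this
// sketch). It appends the target partition to the key and halves every
// value byte, so the original value cannot be recovered afterwards.
func quantizeInPlace(e *entry, partition byte) {
	e.Key = append(e.Key, partition)
	for i := range e.Value {
		e.Value[i] /= 2
	}
}

// attempt models one transaction attempt: copy the input into the reusable
// scratch entry, then encode the scratch entry in place. The input itself is
// never modified, so a retry can start again from the unquantized data.
func attempt(scratch, input *entry, partition byte) {
	scratch.Key = append(scratch.Key[:0], input.Key...)
	scratch.Value = append(scratch.Value[:0], input.Value...)
	quantizeInPlace(scratch, partition)
}

func main() {
	input := &entry{Key: []byte("k"), Value: []byte{8, 4}}
	var scratch entry // allocated once, reused across attempts

	attempt(&scratch, input, 1) // first attempt targets partition 1
	attempt(&scratch, input, 2) // retry: the target partition has changed

	fmt.Println(scratch.Key, scratch.Value, input.Value)
}
```

Because the scratch buffers are truncated and refilled rather than reallocated, retries do not add allocations once the buffers have grown to the size of the input.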

Additionally, this patch switches the writer from using CPut() to CPutAllowingIfNotExists() so that if the backfill job restarts, we don't see the partially written index and fail due to duplicate keys.
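
The difference between the two conditional puts can be sketched with an in-memory model. The names `store`, `cput`, and `cputAllowingIfNotExists` are hypothetical and only mimic the semantics described here; they are not the KV client API. Treating a nil expected value as "key must be absent" and passing the new value as the expected value are both assumptions of this sketch.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var errConditionFailed = errors.New("condition failed")

// store is a toy in-memory key-value map used only for illustration.
type store map[string][]byte

// cput writes val only if the current value equals exp. In this model a nil
// exp means the key must not exist yet.
func (s store) cput(key string, val, exp []byte) error {
	cur, ok := s[key]
	if exp == nil {
		if ok {
			return errConditionFailed
		}
	} else if !ok || !bytes.Equal(cur, exp) {
		return errConditionFailed
	}
	s[key] = val
	return nil
}

// cputAllowingIfNotExists writes val if the key is absent or if the current
// value equals exp.
func (s store) cputAllowingIfNotExists(key string, val, exp []byte) error {
	cur, ok := s[key]
	if ok && !bytes.Equal(cur, exp) {
		return errConditionFailed
	}
	s[key] = val
	return nil
}

func main() {
	// A previous run of the job wrote key "a" before it was interrupted.
	s := store{"a": []byte("v1")}

	// The restarted job rewrites "a" and then continues with "b".
	fmt.Println(s.cput("a", []byte("v1"), nil))                             // fails: key already exists
	fmt.Println(s.cputAllowingIfNotExists("a", []byte("v1"), []byte("v1"))) // succeeds: same value
	fmt.Println(s.cputAllowingIfNotExists("b", []byte("v2"), []byte("v2"))) // succeeds: key absent
}
```

In this model a genuinely conflicting value under an existing key is still rejected, so the relaxed condition tolerates a replayed write without hiding real duplicates.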

Informs: #143107
Release note: None

@mw5h mw5h requested review from andy-kimball, DrewKimball and a team April 11, 2025 22:37
@mw5h mw5h requested a review from a team as a code owner April 11, 2025 22:37

blathers-crl bot commented Apr 11, 2025

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@DrewKimball DrewKimball left a comment

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball)


pkg/sql/rowexec/indexbackfiller.go line 197 at r1 (raw file):

	}

	// Initialize the tmpEntry. This will store the input entry that we are encoding

nit: mention here that this allows us to preserve the initial "template" indexEntry across txn retries


pkg/sql/rowexec/indexbackfiller.go line 207 at r1 (raw file):

	tmpEntry.Value.RawBytes = tmpEntry.Value.RawBytes[:0]
	tmpEntry.Key = append(tmpEntry.Key, indexEntry.Key...)
	tmpEntry.Value.RawBytes = append(tmpEntry.Value.RawBytes, indexEntry.Value.RawBytes...)

super nit:

	tmpEntry.Key = append(tmpEntry.Key[:0], indexEntry.Key...)
	tmpEntry.Value.RawBytes = append(tmpEntry.Value.RawBytes[:0], indexEntry.Value.RawBytes...)

pkg/sql/backfill/backfill.go line 523 at r1 (raw file):

	outputEntry.Key = outputEntry.Key[:0]
	outputEntry.Key = append(outputEntry.Key, vih.indexPrefix...)

super nit: similar to below:

outputEntry.Key = append(outputEntry.Key[:0], vih.indexPrefix...)
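
For readers unfamiliar with the idiom in these nits, here is a small standalone sketch (the helper names `resetAndCopy` and `keepAndAppend` are made up for illustration). `s[:0]` truncates the slice to length zero while keeping its backing array, so `append` overwrites the old contents; `s[0:]` is the whole slice, so `append` adds to the end instead.

```go
package main

import "fmt"

// resetAndCopy reuses dst's backing array and replaces its contents with src.
func resetAndCopy(dst, src []byte) []byte {
	return append(dst[:0], src...)
}

// keepAndAppend shows the easy-to-miss variant: dst[0:] is all of dst, so the
// old contents are kept and src is appended after them.
func keepAndAppend(dst, src []byte) []byte {
	return append(dst[0:], src...)
}

func main() {
	fmt.Printf("%s\n", resetAndCopy([]byte("old"), []byte("new!")))
	fmt.Printf("%s\n", keepAndAppend([]byte("old"), []byte("new!")))
}
```

The one-line form is equivalent to truncating first and appending on the next line; it just folds the reset into the `append` call.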

@andy-kimball
Contributor

Is it possible to add tests that detect and regress this issue?

@mw5h mw5h force-pushed the vec-backfill-fix branch 2 times, most recently from 80bfbcb to 4fc6cce Compare April 14, 2025 20:56
Contributor

@andy-kimball andy-kimball left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mw5h)

@mw5h mw5h force-pushed the vec-backfill-fix branch from 5b1e375 to 87492b9 Compare April 14, 2025 22:23
4 participants