Fix batch split bug by heppu · Pull Request #257 · microsoft/go-mssqldb

heppu · 2025-05-01T20:23:29Z

The batch parser breaks on the word GOTO because it treats the leading GO as a batch separator. I added a lookahead with a length check to fix this. Without this fix, running DB migration scripts that contain GOTO isn't possible. A PR is waiting on this here.

heppu · 2025-05-01T20:26:59Z

@microsoft-github-policy-service agree

codecov-commenter · 2025-05-01T20:34:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.60%. Comparing base (b935441) to head (53ec202).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #257       +/-   ##
===========================================
+ Coverage   80.70%   96.60%   +15.89%     
===========================================
  Files          35       92       +57     
  Lines        6910    74355    +67445     
===========================================
+ Hits         5577    71828    +66251     
- Misses       1063     2191     +1128     
- Partials      270      336       +66

| Flag | Coverage Δ |
|---|---|
| unittests | `96.53% <100.00%> (?)` |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| batch/batch.go | `90.00% <100.00%> (ø)` |

... and 58 files with indirect coverage changes


heppu · 2025-05-20T04:28:33Z

@shueybubbles could you take a look at this?

Copilot

Pull Request Overview

This PR fixes a bug in the batch splitter where the word GOTO was incorrectly recognized as a GO batch delimiter by adding a forward lookup to ensure the separator isn’t part of a larger word, and adds a corresponding test case.

Added a check in hasPrefixFold to reject matches where the next character is a letter.
Introduced a unit test to verify that GOTO isn’t split like a GO batch command.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| batch/batch.go | Added a forward lookup in `hasPrefixFold` to ensure the next character isn’t a letter when matching. |
| batch/batch_test.go | Added a test item verifying that `GOTO` doesn’t trigger a batch split at `GO`. |

Comments suppressed due to low confidence (1)

batch/batch_test.go:70

[nitpick] Add a test case covering lowercase go (e.g., go vs. gotoflag) to ensure the splitting logic is case-insensitive and handles lowercase separators correctly.

		testItem{

Copilot · 2025-07-02T04:04:29Z

+	if len(s) > len(sep) && unicode.IsLetter(rune(s[len(sep)])) {
+		return false


When checking the character after the prefix, use utf8.DecodeRuneInString to properly handle multi-byte runes instead of casting a single byte to a rune.

Suggested change

-	if len(s) > len(sep) && unicode.IsLetter(rune(s[len(sep)])) {
-		return false
+	if len(s) > len(sep) {
+		r, _ := utf8.DecodeRuneInString(s[len(sep):])
+		if unicode.IsLetter(r) {
+			return false
+		}

shueybubbles · Jul 2, 2025

@heppu, please add some double-byte character test cases to cover this.
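The difference between the two checks can be sketched in isolation. The helper names `letterByByteCast` and `letterByDecode` below are hypothetical and exist only for this comparison; the standard-library calls (`unicode.IsLetter`, `utf8.DecodeRuneInString`) are the ones named in the suggestion.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// letterByByteCast mirrors the originally submitted check: it converts only
// the first byte of s to a rune before classifying it.
func letterByByteCast(s string) bool {
	return len(s) > 0 && unicode.IsLetter(rune(s[0]))
}

// letterByDecode decodes the full leading rune of s before classifying it.
func letterByDecode(s string) bool {
	if s == "" {
		return false
	}
	r, _ := utf8.DecodeRuneInString(s)
	return unicode.IsLetter(r)
}

func main() {
	// Followers: ASCII letter, digit, underscore, Latin-1 e-acute, Hebrew aleph.
	for _, follower := range []string{"T", "1", "_", "\u00e9", "\u05d0"} {
		fmt.Printf("%q byteCast=%v decode=%v\n",
			follower, letterByByteCast(follower), letterByDecode(follower))
	}
}
```

Of these five followers, only the Hebrew aleph separates the two checks: its leading UTF-8 byte is 0xD7, which as a code point is not a letter, so the byte cast reports false while the decoded rune is a letter. The Latin-1 case agrees for both checks, but only because the leading byte 0xC3 happens to map to a letter code point.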

Addresses review feedback on the GO/GOTO word-boundary fix:

- Use utf8.DecodeRuneInString so the follower-char letter check sees the full rune, not the leading byte of a multi-byte UTF-8 sequence. Casting rune(byte) misclassifies multi-byte runes (for example Hebrew aleph U+05D0 has leading byte 0xD7 which is the MULTIPLICATION SIGN, not a letter).
- Extend TestHasPrefixFold to cover word-boundary cases (GOTO, gotoflag, GO1, GO_FOO), Latin-1 follower, and the Hebrew-aleph case that distinguishes the two implementations.

Co-authored-by: Henri Koski <[email protected]>

dlevy-msft-sql · 2026-05-12T17:00:47Z

@heppu — pushed a follow-up commit to your branch addressing the review feedback so we can move this forward (you had maintainer_can_modify enabled). Happy to revert if you'd rather take it yourself.

What changed:

hasPrefixFold now uses utf8.DecodeRuneInString for the follower-char letter check. The rune(s[i]) cast misclassifies multi-byte UTF-8 sequences; e.g. Hebrew aleph (U+05D0) has leading byte 0xD7 which decodes as × (MULTIPLICATION SIGN, not a letter), so the old check would have allowed GO to match inside GOא....
Expanded TestHasPrefixFold with word-boundary cases (GOTO, gotoflag, GO1, GO_FOO), a Latin-1 follower, and the Hebrew-aleph case that distinguishes the byte-cast and rune-decode implementations.

You're credited via Co-authored-by on the new commit. @shueybubbles this should also pick up your "add some double byte char test cases" ask.
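A table-driven check over the follower cases listed above might look like the sketch below. `matchesSeparator` is a hypothetical stand-in for `hasPrefixFold`, whose real signature and body are not shown in this thread. The expected values follow from the letter-only rule of this sketch (so `GO1` and `GO_FOO` still match) and are an assumption, not the expectations in the actual `TestHasPrefixFold`.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// matchesSeparator reports whether s starts with sep, ignoring case, and sep
// is not immediately followed by a letter. Hypothetical stand-in for the
// predicate discussed in this PR.
func matchesSeparator(s, sep string) bool {
	if len(s) < len(sep) || !strings.EqualFold(s[:len(sep)], sep) {
		return false
	}
	if len(s) > len(sep) {
		r, _ := utf8.DecodeRuneInString(s[len(sep):])
		if unicode.IsLetter(r) {
			return false
		}
	}
	return true
}

func main() {
	cases := []struct {
		in   string
		want bool
	}{
		{"GO", true},
		{"GOTO", false},
		{"gotoflag", false},
		{"GO1", true},         // digit follower: not a letter under this rule
		{"GO_FOO", true},      // underscore follower: not a letter under this rule
		{"GO\u00e9", false},   // Latin-1 follower (e-acute)
		{"GO\u05d0", false},   // Hebrew aleph follower
	}
	failed := 0
	for _, c := range cases {
		if got := matchesSeparator(c.in, "GO"); got != c.want {
			failed++
			fmt.Printf("FAIL %q: got %v, want %v\n", c.in, got, c.want)
		}
	}
	fmt.Printf("%d cases, %d failed\n", len(cases), failed)
}
```

In the repository this would naturally live in a `_test.go` file using `testing.T`; a plain `main` is used here only so the sketch runs on its own.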

Keep both new TestBatchSplit cases: the GOTO/Bookmark case from this branch and the create-table 'gone_ts' case from microsoft#248 on main. Both exercise the word-boundary protection from different angles.

dlevy-msft-sql · 2026-05-12T17:05:12Z

Resolved the merge conflict with main (commit 53ec202). The conflict was test-table positional — both this branch and #248 added new TestBatchSplit cases at the same spot; kept both.

Heads-up on a redundancy that the merge surfaced: #248 (already merged to main) added its own word-boundary check in stateSep that returns stateText when the byte after the separator is a letter. This branch's check sits in hasPrefixFold and returns false so stateSep is never entered. Both reach the same outcome for the GOTO case, so we now have two defenses for the same scenario.

Two notable differences between the two defenses:

Location. hasPrefixFold is the predicate; rejecting there is arguably the cleaner architecture. stateSep's check is a workaround at the consumer.
Multi-byte UTF-8. stateSep uses unicode.IsLetter(rune(s[0])), which misclassifies the leading byte of a multi-byte sequence. The updated hasPrefixFold here uses utf8.DecodeRuneInString, so it correctly rejects GO followed by, say, Hebrew aleph. Without this PR's fix, stateSep's defense would let multi-byte letter followers through.

Options for getting this to land:

1. Keep both defenses (current state). Defense-in-depth, no behavior change to stateSep. Slightly redundant but harmless. Easiest path to merge.
2. Drop the hasPrefixFold check and fix stateSep to use utf8.DecodeRuneInString instead. Less code overall and a single point of truth, but it rewrites code that just landed in #248 ("Fix handling of stateSep in batch package. Addresses bug #247").
3. Drop the stateSep check and keep hasPrefixFold (this PR). Single point of truth at the predicate, but it reverts part of #248.

@shueybubbles, any preference? I'm fine with any of the three. The default-easiest path is leaving it as-is now that the merge is clean.

heppu added 2 commits May 1, 2025 22:56

Add test for batch.Split bug

8cf1bb2

Fix GOTO batch parsing bug

760489e

heppu mentioned this pull request May 1, 2025

Add tsql/batch support for sqlserver golang-migrate/migrate#1270

Open

5 tasks

apoorvdeshmukh requested a review from Copilot July 2, 2025 04:03

Copilot AI reviewed Jul 2, 2025


dlevy-msft-sql assigned shueybubbles and heppu and unassigned shueybubbles Jan 31, 2026

dlevy-msft-sql added the bug, Size: S, Area - parameters, and needs-work labels and removed the needs-work label Jan 31, 2026

dlevy-msft-sql removed the bug, Size: S, and Area - parameters labels Apr 20, 2026

merge: resolve upstream/main into fix/batch-split-bug

53ec202


heppu wants to merge 4 commits into microsoft:main from heppu:fix/batch-split-bug

5 participants
