fix: prevent data race in tdsBuffer from concurrent processSingleResponse goroutines #357

Open
dlevy-msft-sql wants to merge 6 commits into microsoft:main from dlevy-msft-sql:fix/data-race-tds-buffer
@dlevy-msft-sql

Summary

Prevents the data race in tdsBuffer caused by concurrent processSingleResponse goroutines reading from the same session buffer.

Problem

When an INSERT...OUTPUT or similar query produces results, startReading() spawns a processSingleResponse goroutine that reads from sess.buf. If the caller doesn't fully drain the rows before executing the next statement, startReading() spawns a second goroutine that reads from the same sess.buf concurrently. Both goroutines write to tdsBuffer.rpos, tdsBuffer.rsize, etc. without synchronization, causing a data race detected by go test -race.

Fix

Add a readDone chan struct{} field to tdsSession. Each processSingleResponse goroutine closes its readDone channel on exit (via defer close(readDone)). The next startReading() call waits on the previous readDone channel before spawning a new goroutine.

The same gate is applied to the cancellation retry path in nextToken() where a second processSingleResponse goroutine is spawned to read the cancellation confirmation.

Why this is safe

  • readDone is nil on the first call, so no blocking occurs
  • Each subsequent call blocks until the previous goroutine exits
  • In normal operation (rows fully consumed before next query), the channel is already closed and the receive returns immediately with zero added latency
  • startReading is only called from the connection's serial execution path, so no mutex is needed
  • The goroutine wrapper uses defer close(readDone) to guarantee the channel is always closed, even on panic recovery
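
The channel behavior these points rely on can be checked with a small standalone program. It is a sketch with hypothetical helper names (`runReader`, `waitPrevious`), and it assumes the wait is guarded by a nil check, since a receive on a nil channel in Go blocks forever. The `recover` call is also an assumption added so the demo process survives the panic; whether the driver recovers panics in this goroutine is not stated above.

```go
package main

import "fmt"

// runReader runs fn in a goroutine and returns a channel that is closed when
// the goroutine exits, whether fn returns normally or panics.
func runReader(fn func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done) // deferred first, so it runs last (LIFO order)
		defer func() {
			_ = recover() // assumption for this demo: swallow the panic
		}()
		fn()
	}()
	return done
}

// waitPrevious is the nil-guarded wait: a nil channel means there is no
// previous reader, so it must not be received from.
func waitPrevious(prev <-chan struct{}) {
	if prev == nil {
		return
	}
	<-prev
}

func main() {
	waitPrevious(nil) // first call: nothing to wait for

	done := runReader(func() { panic("boom") })
	waitPrevious(done) // unblocks once the deferred close has run
	waitPrevious(done) // a closed channel can be received from repeatedly

	_, open := <-done
	fmt.Println("channel open:", open) // channel open: false
}
```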

Risk

Low. The wait only adds latency in the exact scenario that causes the data race today.

Fixes #171

@codecov-commenter

codecov-commenter commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.60%. Comparing base (65e137f) to head (6ecb7e4).
⚠️ Report is 2 commits behind head on main.

@@             Coverage Diff             @@
##             main     #357       +/-   ##
===========================================
+ Coverage   80.62%   96.60%   +15.98%     
===========================================
  Files          35       92       +57     
  Lines        6910    74351    +67441     
===========================================
+ Hits         5571    71829    +66258     
- Misses       1068     2187     +1119     
- Partials      271      335       +64     
Flag | Coverage Δ
unittests | 96.53% <ø> (?)

see 59 files with indirect coverage changes

dlevy-msft-sql added this to the v1.11.0 milestone on Apr 17, 2026
dlevy-msft-sql force-pushed the fix/data-race-tds-buffer branch 3 times, most recently from b8420f2 to e1b94e3, on April 24, 2026 22:38
dlevy-msft-sql requested a review from Copilot on April 26, 2026 16:35

Copilot AI left a comment

Pull request overview

This PR addresses a go test -race data race in the TDS response-reading path by ensuring only one processSingleResponse goroutine can read from a session’s shared tdsBuffer at a time (fixes #171).

Changes:

  • Add readDone chan struct{} to tdsSession to track completion of the active response-reader goroutine.
  • Introduce (*tdsSession).startResponseReader(...) to wait for the previous reader to exit before starting the next one, and use it from both startReading() and the cancellation retry path in nextToken().
  • Add a unit test intended to validate the new serialization behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
tds.go | Adds tdsSession.readDone used as a completion gate for response readers.
token.go | Centralizes spawning of processSingleResponse via startResponseReader, serializing readers to prevent concurrent buffer reads.
token_test.go | Adds a test for response-reader serialization behavior.

Comment thread token_test.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread token_test.go Outdated
Comment thread token_test.go Outdated
Comment thread token.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread token.go
Comment thread token_test.go
Use a blockingTransport that holds the first reader goroutine in-flight, then assert that the second startResponseReader call blocks until the first completes. This catches the actual race condition instead of testing the already-finished path.

Replace time.Sleep with readEntered channel for deterministic assertion that the first reader is blocked. Fix comment to accurately describe EOF error path (not panic recovery).
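
The test strategy in these commit messages can be illustrated with a self-contained sketch. Everything here is hypothetical and simplified: `blockingReader` stands in for the blockingTransport, `gate` models only the readDone hand-off, and the timeouts are arbitrary. It shows the idea of signalling on a readEntered channel rather than sleeping, not the actual test in token_test.go.

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// blockingReader is a hypothetical stand-in for the blockingTransport named
// above. Read signals that it was entered, then blocks until release is closed.
type blockingReader struct {
	readEntered chan struct{} // buffered (cap 1) so the signal never blocks Read
	release     chan struct{}
}

func (b *blockingReader) Read(p []byte) (int, error) {
	select {
	case b.readEntered <- struct{}{}:
	default: // already signalled
	}
	<-b.release
	return 0, io.EOF // the reader exits through the EOF error path
}

// gate models only the readDone hand-off (illustrative, not the driver's code).
type gate struct{ readDone chan struct{} }

func (g *gate) start(read func()) {
	if g.readDone != nil {
		<-g.readDone
	}
	done := make(chan struct{})
	g.readDone = done
	go func() {
		defer close(done)
		read()
	}()
}

// secondReaderWaits reports whether a second start call stays blocked while
// the first reader is in flight and proceeds once that reader is released.
func secondReaderWaits() bool {
	br := &blockingReader{readEntered: make(chan struct{}, 1), release: make(chan struct{})}
	g := &gate{}
	g.start(func() { _, _ = br.Read(nil) })
	<-br.readEntered // no sleep needed: the first reader is now inside Read

	secondReturned := make(chan struct{})
	go func() {
		g.start(func() {})
		close(secondReturned)
	}()

	select {
	case <-secondReturned:
		return false // the gate let a second reader start too early
	case <-time.After(50 * time.Millisecond):
	}

	close(br.release) // let the first reader finish with io.EOF
	select {
	case <-secondReturned:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("second reader waited:", secondReaderWaits())
}
```

Receiving from `readEntered` proves the first reader is inside `Read` before the second call is attempted, which a fixed sleep can only approximate.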

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread token_test.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread token_test.go
Comment thread token_test.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread token_test.go
…Serializes

Addresses reviewer feedback: the assertion that secondStarted is still 0 could be a false positive if the goroutine wasn't scheduled yet. Now we wait for a goroutine-started channel plus a 100ms grace period before checking.
dlevy-msft-sql requested a review from Copilot on April 27, 2026 02:26

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread token_test.go Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Development

Successfully merging this pull request may close these issues.

tdsBuffer data race in transaction query with OUTPUT

3 participants