fix: propagate context deadline to readPrelogin to prevent hangs#360
Open
dlevy-msft-sql wants to merge 22 commits into
Open
fix: propagate context deadline to readPrelogin to prevent hangs#360dlevy-msft-sql wants to merge 22 commits into
dlevy-msft-sql wants to merge 22 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #360 +/- ##
===========================================
+ Coverage 80.66% 96.64% +15.97%
===========================================
Files 35 92 +57
Lines 6842 74405 +67563
===========================================
+ Hits 5519 71907 +66388
- Misses 1055 2156 +1101
- Partials 268 342 +74
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
dba4ed8 to
0d5021f
Compare
TestLoginTimeout uses a very tight deadline (latency+200ms) which can cause the cancel-confirmation path to fire, producing a ServerError instead of a timeout. Apply the same flexible error matching used in TestQueryTimeout so the test passes regardless of which timing-dependent error surfaces.
There was a problem hiding this comment.
Pull request overview
This PR addresses a connect-time hang in the TDS prelogin phase by ensuring the prelogin read is bounded by the caller’s context deadline (even when connection timeout=0), and tightens timeout-related tests to match the driver’s real cancellation behaviors.
Changes:
- Add
preloginTimeout()and use it to temporarily reducetimeoutConn.timeoutaroundreadPrelogin()based on the context deadline (including immediate failure for already-expired deadlines). - Tighten
TestQueryTimeout/TestLoginTimeoutassertions to accept only specific, expected timeout/cancel outcomes. - Add new unit tests covering
preloginTimeout()behavior and a regression test ensuring prelogin honors context deadlines.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tds.go | Adds preloginTimeout() and applies context-derived timeout to prevent readPrelogin() hangs. |
| queries_test.go | Narrows accepted timeout errors to specific expected driver/server/network outcomes. |
| prelogin_deadline_test.go | Adds direct unit/regression coverage for prelogin timeout computation and deadline enforcement. |
…ove coverage - close toconn on preloginTimeout and readPrelogin error paths to prevent socket leaks (addresses Copilot review comments) - extract isAcceptableTimeoutErr to deduplicate TestQueryTimeout and TestLoginTimeout error checking - simplify preloginTimeout expired-deadline path: return ctx.Err() directly instead of branching on nil (the dead branch was unreachable) - add test case for connTimeout==0 to reach 100% coverage on preloginTimeout
Addresses Copilot review: if the context is already canceled (e.g. context.WithCancel) but has no deadline, preloginTimeout previously returned connTimeout with nil error, allowing readPrelogin to block indefinitely when connTimeout==0. Now checks ctx.Err() first regardless of deadline presence.
- Add goroutine to watch ctx.Done() and close the connection to unblock readPrelogin when context is canceled without a deadline - Check ctx.Err() after readPrelogin to avoid using a conn that may have been closed by the cancel watcher (race safety) - Add TestPreloginRespectsContextCancel integration test - Align test comment with actual 5s bound per review feedback
Address Copilot review comments: - Only override read error with context.DeadlineExceeded when the error is actually a net timeout, not EOF/connection reset - Assert net.Error with Timeout()=true in socket timeout test - Verify server-side close detection in routing redirect test - Remove TestPreloginTimeoutErrorPath (context expires before dial, does not exercise preloginTimeout; covered by unit tests)
Replaces goodPreloginSequence (which calls t.Fatal) with inline code using t.Errorf+return. Adds t.Errorf on Accept failure.
- Add preloginTimeout test for ctxTimeout <= 0 race condition using custom context (pastDeadlineContext) to reach the defensive branch where Err() returns nil but time.Until(deadline) is negative - Add TestConnectSuccessfulPreloginAndLogin with mock TDS server that completes the full prelogin+login handshake, covering the success path (toconn = nil) without requiring a real SQL Server instance - Add TestPreloginDeadlineAndSocketTimeoutRace with matched timeouts to exercise the code path where socket timeout and context deadline fire nearly simultaneously - Extract sendLoginResponse helper for reuse in mock server tests preloginTimeout coverage: 90.9% -> 100% connect() success path now covered by unit tests
…Login Inline the prelogin/login handshake logic instead of calling goodPreloginSequence, which uses t.Fatal. Calling t.Fatal from a non-test goroutine only terminates that goroutine via runtime.Goexit, which could leave the test hanging on <-serverErr.
preloginTimeout checks ctx.Err() first, then time.Until(deadline), not the other way around.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Propagate the context deadline through prelogin without relying on SetReadDeadline, close the expired-deadline hole before
readPrelogin(), and tighten the timeout regression test to match the driver's real cancellation outcomes.Problem
The current branch reduced
toconn.timeoutonly whentime.Until(deadline) > 0. That still leaves a narrow hang window if the context deadline expires after the prelogin write but before the blocking prelogin read starts. Withconnection timeout=0,readPrelogin()can still block indefinitely.The timeout regression test was also widened too far and could accept unrelated failures by matching any error string containing
timeout.Fix
preloginTimeout()to compute the effective prelogin read timeoutreadPrelogin()TestQueryTimeoutto accept onlycontext.DeadlineExceeded, timeout-capablenet.Error, SQL error 3980, or the driver's explicit cancel-confirmation failureTesting
go test -run '^TestPrelogin' -count=1 .go test -run '^TestQueryTimeout$' -count=1 .go test -run '^(TestPreloginTimeout|TestPreloginRespectsContextDeadline|TestQueryTimeout)$' -count=1 .go build .Coverage
Two defensive branches in
connect()remain uncovered by unit tests because they guard nanosecond-window race conditions that cannot be reliably triggered without mocking time or context internals:return nil, errfrompreloginTimeoutinconnect()(tds.go line ~1249): This requires the context deadline to expire in the microsecond gap betweenwritePrelogincompleting andpreloginTimeoutbeing called. ThepreloginTimeoutfunction itself has 100% coverage via direct unit tests; only the error-return site inconnect()is not hit.return nil, context.DeadlineExceededin the net.Error timeout guard (tds.go line ~1290): This fires when the socket read returns a timeout error and the context deadline has passed, butctx.Err()has not yet propagated (checked a few lines earlier). Triggering this requires the context deadline and socket timeout to expire at the exact same instant withctx.Err()returning nil on one check and the deadline being past on the next, a sub-microsecond race.Both are intentional safety nets that convert ambiguous I/O timeouts into clear context errors. They are tested indirectly through the
preloginTimeoutunit tests (100% covered) and theTestPreloginDeadlineAndSocketTimeoutRaceintegration test.Fixes #254