Skip to content

fix: consume device code atomically to prevent race in RFC 8628 token exchange#125

Open
antojoseph wants to merge 2 commits into
masterfrom
security/f008-consume-device-code
Open

fix: consume device code atomically to prevent race in RFC 8628 token exchange#125
antojoseph wants to merge 2 commits into
masterfrom
security/f008-consume-device-code

Conversation

@antojoseph
Copy link
Copy Markdown

Summary

  • Adds ConsumeDeviceCode method to store.Store interface, with implementations in both memory and postgres stores
  • Calls ConsumeDeviceCode before issuing the access token in the device auth approved branch — returns 410 Gone if the race is lost (code already consumed by a concurrent request)

Security impact

Without this fix, two concurrent polling requests for the same approved device code both pass the approved state check and both receive valid access tokens for the same account. The consume-before-issue pattern makes token issuance idempotent and closes the TOCTOU window.

Test plan

  • go test ./internal/store/...
  • go test ./internal/api/...
  • Simulate concurrent polling against an approved device code and verify only one token is issued (second request gets 410)

The device code stayed in 'approved' state indefinitely, allowing any
number of repeated polls to each mint a fresh long-lived provider token
for a single browser approval.

Add ConsumeDeviceCode to the store interface (memory + postgres) that
atomically transitions approved → consumed. The token handler calls it
before generating the token; a concurrent or replay poll that loses the
race receives 410 Gone.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
d-inference Ready Ready Preview May 4, 2026 3:41am
d-inference-console-ui-dev Ready Ready Preview May 4, 2026 3:41am
d-inference-landing Ready Ready Preview May 4, 2026 3:41am

Request Review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8455982d7d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread coordinator/internal/api/device_auth.go Outdated
// Atomically consume the device code before issuing the token.
// This prevents a second poll (or a race) from minting a second
// long-lived token for the same browser approval.
if err := s.store.ConsumeDeviceCode(dc.DeviceCode); err != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Defer consuming code until token issuance is guaranteed

Consuming the device code before generating and persisting the provider token makes the flow non-recoverable on downstream failures: if rand.Read or CreateProviderToken fails, the code is already marked consumed, so subsequent polls cannot succeed and the user must restart login. This introduces a new failure mode under transient entropy/DB issues and should be handled atomically (consume + token create in one transaction) or with rollback semantics.

Useful? React with 👍 / 👎.

Comment thread coordinator/internal/api/device_auth.go Outdated
Comment on lines +132 to +135
if err := s.store.ConsumeDeviceCode(dc.DeviceCode); err != nil {
// Another concurrent request already consumed it.
writeJSON(w, http.StatusGone, errorResponse("expired_token", "device code is no longer valid"))
return
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Distinguish store failures from already-consumed device codes

The handler maps every ConsumeDeviceCode error to expired_token, but the Postgres implementation can return operational errors (for example query timeout/connection errors). In those cases this path returns a terminal client error instead of a retriable server error, causing clients to abort login even though the code may still be valid.

Useful? React with 👍 / 👎.

Previously ConsumeDeviceCode was called first; a transient rand.Read or
CreateProviderToken failure left the code permanently consumed with no token
issued. Now the token is created first, then the code is consumed. On consume
failure the token is revoked and the appropriate status is returned:
410 if another request won the race, 500 for transient DB errors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant