-
-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: make token refresh robust to network errors #610
Conversation
Hey, thanks for this, and I like the idea of making this more robust. But I think there are some minor security issues with your implementation. You're probably aware of why I'm using refresh token rotation, but here is a short explanation. The idea is that old tokens which might have been leaked cannot be reused. In your implementation:
|
Thanks for the feedback. The goal was to have the following happen in your scenario: |
Yes, I think I get what you're trying to do. I've added some comments on what I would change. Let me know/ping me when you want another review. |
Before working on fixing the implementation, I thought it would be helpful to properly document what I want to implement, so we can separate design and implementation discussions. I took the sequence diagrams of the linked documentation for refresh tokens, and extended them with my proposal: First the normal case without an attacker and any network issues: sequenceDiagram
participant LC as Legitimate Client
participant A as Auth0
Note over LC,A: Initial state: Client has RT1
LC->>A: Exchange RT1 for new tokens
Note over A: Generate AT2, RT2
Note over A: Keep RT1 valid
A-->>LC: Return AT2, RT2
LC->>A: Use AT2 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT1
Note over LC,A: Later: AT2 expires
LC->>A: Exchange RT2 for new tokens
Note over A: Generate AT3, RT3
Note over A: Keep RT2 valid
A-->>LC: Return AT3, RT3
LC->>A: Use AT3 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT2
Now with a network issue: sequenceDiagram
participant LC as Legitimate Client
participant A as Auth0
Note over LC,A: Initial state: Client has RT1
LC->>A: Exchange RT1 for new tokens
Note over A: Generate AT2, RT2
Note over A: Keep RT1 valid
A-->>LC: Return AT2, RT2 (network error)
Note over LC: Never receives tokens
LC->>A: Retry: Exchange RT1 for new tokens
Note over A: RT1 being reused - indicates<br/>possible network issues
Note over A: Invalidate AT2, RT2<br/>since RT1 is being reused
Note over A: Generate AT3, RT3
A-->>LC: Return AT3, RT3
LC->>A: Use AT3 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT1
Now with an attack: sequenceDiagram
participant MC as Malicious Client
participant LC as Legitimate Client
participant A as Auth0
Note over LC,A: Initial state: Both have RT1
LC->>A: Exchange RT1 for new tokens
Note over A: Generate AT2, RT2
Note over A: Keep RT1 valid
A-->>LC: Return AT2, RT2
LC->>A: Use AT2 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT1
MC->>A: Try to use RT1
Note over A: RT1 used after invalidation<br/>indicates compromise
Note over A: Invalidate entire token family<br/>(RT1, AT2, RT2)
A-->>MC: Access Denied
LC->>A: Try to use RT2
A-->>LC: Access Denied
Note over LC: Must re-authenticate
The case where the attacker acts first: sequenceDiagram
participant MC as Malicious Client
participant LC as Legitimate Client
participant A as Auth0
Note over MC,A: Initial state: Both have RT1
MC->>A: Exchange RT1 for new tokens
Note over A: Generate AT2, RT2
Note over A: Keep RT1 valid
A-->>MC: Return AT2, RT2
MC->>A: Use AT2 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT1
LC->>A: Try to use RT1
Note over A: RT1 used after invalidation<br/>indicates compromise
Note over A: Invalidate entire token family<br/>(RT1, AT2, RT2)
A-->>LC: Access Denied
MC->>A: Try to use RT2
A-->>MC: Access Denied
Note over LC, MC: Both must re-authenticate
I tried thinking of race conditions and how to handle them, this is what I came up with: sequenceDiagram
participant LC as Legitimate Client
participant A as Auth0
participant MC as Malicious Client
Note right of LC: Has RT1
Note left of MC: Also has RT1
LC->>A: Exchange RT1 for new tokens
Note over A: Generate AT2, RT2
Note over A: Keep RT1 valid
A-->>LC: Return AT2, RT2 (network delay)
MC->>A: Exchange RT1 for new tokens
Note over A: RT1 still valid (no AT2 use yet)
Note over A: Generate AT3, RT3
Note over A: Invalidate AT2, RT2
A-->>MC: Return AT3, RT3
MC->>A: Use AT3 for API call
Note over A: First use of new access token
Note over A: Now invalidate RT1
Note over LC: Finally receives AT2, RT2
LC->>A: Try to use AT2
Note over A: AT2 was already invalidated
A-->>LC: Access Denied
LC->>A: Try to use RT1
Note over A: RT1 use after invalidation<br/>indicates compromise
Note over A: Invalidate all tokens<br/>(RT1, AT3, RT3)
A-->>LC: Access Denied
Note right of LC: Must re-authenticate
Note left of MC: Must re-authenticate
What do you think, should I try implementing it? |
Your sequence diagrams look good. In addition, if you look at the tree structure of tokens, it should be possible to create as many refresh tokens as possible (multiple network failures) and only once one subtree has been used no more refresh tokens can be created with the original refresh token and all other refresh token subtrees should be invalidated. Furthermore, access tokens don't need to be used. The following structure would be valid and RT2 is the "current" refresh token: stateDiagram-v2
RT1 : RT1 (Used)
AT1 : AT1 (Unused)
RT2 : RT2 (Used)
AT2 : AT2 (Used)
RT3 : RT3 (Unused)
AT3 : AT3 (Unused)
RT4 : RT4 (Unused)
AT4 : AT4 (Unused)
RT5 : RT5 (Unused)
AT5 : AT5 (Unused)
RT1 --> AT1
RT1 --> RT2
RT2 --> AT2
RT2 --> RT3
RT3 --> AT3
RT2 --> RT4
RT4 --> AT4
RT2 --> RT5
RT5 --> AT5
|
f628691
to
c721b93
Compare
Thanks for the additional input. I implemented my sequence diagrams and a sequence derived from your diagram as testcases, and then threw away my first approach (I certainly underestimated the complexity of this). Now my tests are passing, can you please look at my changes again? I think you can discard the previous conversations. |
216c8f1
to
2d488eb
Compare
Sorry for all the force pushs, it took me a while to clean up the commit history. Also: Don't merge this just yet, I wrote some more test cases and discovered another bug. I'll ping you here when it's fixed. |
Ok, one more testcase, one bug fixed. Please have a look, @TomBursch |
First thanks again for all the work, I still would like to change some things (Sorry about that 😅). I want to keep the database requests as minimal as possible inside the
What do you think of this? This should allow us to move most of the logic from My last diagram wasn't correct in that regard (even with the current implementation, the tokens shouldn't exist). stateDiagram-v2
RT1 : RT1 (Used)
RT2 : RT2 (Used)
RT3 : RT3 (Invalidated)
RT4 : RT4 (Invalidated)
RT5 : RT5 (Unused)
AT5 : AT5 (Unused)
RT1 --> RT2
RT2 --> RT3
RT2 --> RT4
RT2 --> RT5
RT5 --> AT5
|
Minimizing database requests is a valid goal. I looked at check_if_token_revoked again. And you are right, half the changes are not needed at all, and one DB access is only needed if you want to ensure that not expired ATs are not accepted anymore, after the use of a new AT activates a new branch. I removed it as well. Let me know if that IS a security goal, then I can add that code again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you so much for working on this!
910a979
to
03978dd
Compare
I finally found the reason for the logouts (first described in #79 ). When a token is refreshed, but the new token newer makes it to the client due to network issues, the client tries to refresh with an old token and is logged out. This change keeps the old token until the new token has been used once. Should fix #414