-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HMAC validation pseudocode. #1634
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting the token throws me off, can I get other folks to review this code?
If it helps I can remove the destructuring to make it that step a bit more straightforward. E.g.:
Instead of the one-liner. |
Anything you can do to simplify this so it's more generic would help me approve this now. :) |
I pushed that change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, this looks pretty solid, but I think there's a bit of clean-up needed because every hmac function that I've ever used has returned a byte array, not a String. And ditto for any call to a CSRNG returning a random value.
Lastly, rather than call a call to cryptograpic.randomValue()
on line 99, I think you want to use something like:
cryptograpic.randomValue(64) // return 64-byte random value
Otherwise, people will be asking how long of a random value do we need here.
cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.md
Outdated
Show resolved
Hide resolved
|
||
```code | ||
// Get the CSRF token from the request | ||
csrfToken = request.getParameter("csrf_token") // From form field or header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're getting this from a cookie, as implied on line 107, then this would probably be 'request.getHeader("Cookie")' and then you'd have to parse out the "csrf_token" cookie from any other cookies returned.
Also, related, despite the comment on line 107, I really don't see why you would want to create the cookie without the HttpOnly flag since I can never see a legitimate need to every have client-side JavaScript access it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand the major web frameworks (Django, Next.js etc.) recommend that client side JavaScript takes the cookie from the httpOnly=false cookie and send it back as a header during a fetch POST. I think this is what @psibean discovered as well.
https://docs.djangoproject.com/en/5.1/howto/csrf/#using-csrf-protection-with-ajax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that getting the token from the cookie is also Angulars default interceptor behaviour for CSRF protection:
When performing HTTP requests, an interceptor reads a token from a cookie, by default XSRF-TOKEN, and sets it as an HTTP header, X-XSRF-TOKEN.
From: https://v17.angular.io/guide/http-security-xsrf-protection
@kwwall for a long time I have completely agreed with you. The cookie should be httpOnly
, then the token should be provided to the frontend via a response payload (HTML, JSON), to then be included in a custom header. I was surprised to see this update to the cheat sheet as I believe I had previously contributed to it myself to clarify this.
However not only do the existing OWASP docs suggest otherwise (that line 107), but many existing implementations already work this way out of the box.
I get the impression the general consensus is, if something can access your cookie, that's a different vulnerability and it isn't the responsibility of CSRF protection.
It'd be great if we could get the cheat sheet cleaned up and filled with accurate information, because now I'm just not sure any more.
The recommended approach here is doing a few things that make frontend tampering impossible:
- A backend secret is being used to generate a hmac for the CSRF token. A valid CSRF token cannot be constructed without this secret, at which point CSRF attacks are the least of your worries.
- The message used to generate the hmac contains the users session ID and a pseudorandom cryptographicallybgenerate nonce, a valid CSRF token cannot be generated without the users session ID, which is already
httpOnly
- Validation is ensuring that the CSRF token received can reconstruct the hmac using the backend token, the users session ID, and the backend secret
All of these characteristics of this approach combined should make it okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the comment to add "cookie":
// From form field, cookie, or header
I think request.getParameter("csrf_token")
is abstract enough to convey the idea in pseudo-code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed the part that we were using cookies for compatibility with some JS frameworks and Django. Of course, I could just as easily argue that none of those frameworks should be relying on a Set-Cookie header to set a Cookie for a CSRF token. Seems to me it would have been better for them to settle on some standard header name such as "X-CSRF" or whatever. At least then if a framework botches the implementation, you don't need to worry about the browser automatically returning that cookie. AJAX works, except when it doesn't. But that's water under the bridge I guess.
As to why I was nothing things like request.getHeader("csrf_token")'
being better pseudo-code, it because in the not-too-distant future, shortly after we publish it, someone somewhere is going to point their favorite generative AI LLM at this pseudo code and ask it to generate code in their favorite programming language. If the humans would actually scrutinize and thoroughly test the generated code, the problem of trying to read this a a query parameter on a URL would be noticed, but someone will inevitably miss it. Many of them overly trust AI-generated code and are largely ignorant of AppSec. So I just figured to try to make the inevitable a bit less likely. YMMV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const randomValue = tokenParts[1]; | ||
|
||
// Recreate the HMAC with the current session and the randomValue from the request | ||
secret = readEnvironmentVariable("CSRF_SECRET") // HMAC secret key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While passing secrets as environment variables is less bad than certain other scenarios, it is still not truly secure. E.g., see:
- https://www.trendmicro.com/en_us/research/22/h/analyzing-hidden-danger-of-environment-variables-for-keeping-secrets.html
- https://blog.diogomonica.com/2017/03/27/why-you-shouldnt-use-env-variables-for-secret-data/
We generally to not allow this except as a last resort. Rather, retrieve the secret CSRF_SECRET in something like AWS KMS or HashiCorp Vault, etc. (If we change this here, a corresponding change is needed on line 97 as well.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what the pseudocode in the existing section says. I kept it consistent.
Complicating pseudocode with details of proprietary services like AWS & HashiCorp doesn't seem like it will age well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just getSecret("CSRF_SECRET")
?
Environment variable, secret, or otherwise (future proof).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just
getSecret("CSRF_SECRET")
? Environment variable, secret, or otherwise (future proof).
Make it getSecretSecurely
or similar, and I could stand behind that.
message = sessionID.length + "!" + sessionID + "!" + randomValue.length + "!" + randomValue | ||
|
||
// Generate the expected HMAC | ||
expectedHmac = hmac("SHA256", secret, message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HMac-SHA256 is going to be expected to return a byte array, not a string, so for here and on line 103, I would recommend changing the call from "hmac" to something like (say) "hmacAsHexString". If you don't do that, then someone trying to implement the pseudo-code on line 104, where it has:
csrfToken = hmac + "." + randomValue;
is likely to come out with a mangled string. Alternate;y, you could do something like:
csrfToken = hmac.asHex() + "." + randomValue.asHex();
here. That would save from having to base64-encoding the cookie value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those types of implementation details don't belong in pseudocode. The existing pseudocode gives return values the same treatment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends. May be true if you do think that no one will feed the pseudo-code to a generative AI. But I don't believe that for a minute.
// Compare the HMAC from the request with the expected HMAC | ||
if (!constantTimeEquals(hmacFromRequest, expectedHmac)) { | ||
// HMAC validation failed, reject the request | ||
response.sendError(403, "Invalid CSRF token") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also recommend logging this and bogus csrfToken that is being passed in here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. That seems like something implementors should know to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Developer's yes. But AIs, not at this juncture, unless you include it in the prompt, which most will forget. Can't hurt to at least note it in a comment.
I thought the "Pseudo-Code For Implementing HMAC CSRF Tokens" section could use an example of how to validate the CSRF token once returned by the client to the server. This PR adds the pseudocode for that.
Checklist:
[TEXT](URL)