Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HMAC validation pseudocode. #1634

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

chr15m
Copy link

@chr15m chr15m commented Mar 21, 2025

I thought the "Pseudo-Code For Implementing HMAC CSRF Tokens" section could use an example of how to validate the CSRF token once returned by the client to the server. This PR adds the pseudocode for that.

Checklist:

  • In case of a new Cheat Sheet, you have used the Cheat Sheet template.
  • All the markdown files do not raise any validation policy violation, see the policy.
  • All the markdown files follow these format rules.
  • All your assets are stored in the assets folder.
  • All the images used are in the PNG format.
  • Any references to websites have been formatted as [TEXT](URL)
  • You verified/tested the effectiveness of your contribution (e.g., the defensive code proposed is really an effective remediation? Please verify it works!).
  • The CI build of your PR pass, see the build status here.

Copy link
Member

@jmanico jmanico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting the token throws me off, can I get other folks to review this code?

@chr15m
Copy link
Author

chr15m commented Mar 22, 2025

If it helps I can remove the destructuring to make it that step a bit more straightforward. E.g.:

const tokenParts = csrfToken.split(".");
const hmacFromRequest = tokenParts[0];
const randomValue = tokenParts[1];

Instead of the one-liner.

@jmanico
Copy link
Member

jmanico commented Mar 22, 2025

Anything you can do to simplify this so it's more generic would help me approve this now. :)

@chr15m
Copy link
Author

chr15m commented Mar 22, 2025

I pushed that change.

jmanico
jmanico previously approved these changes Mar 22, 2025
Copy link
Collaborator

@kwwall kwwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, this looks pretty solid, but I think there's a bit of clean-up needed because every hmac function that I've ever used has returned a byte array, not a String. And ditto for any call to a CSRNG returning a random value.

Lastly, rather than call a call to cryptograpic.randomValue() on line 99, I think you want to use something like:

    cryptograpic.randomValue(64)    // return 64-byte random value

Otherwise, people will be asking how long of a random value do we need here.


```code
// Get the CSRF token from the request
csrfToken = request.getParameter("csrf_token") // From form field or header
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're getting this from a cookie, as implied on line 107, then this would probably be 'request.getHeader("Cookie")' and then you'd have to parse out the "csrf_token" cookie from any other cookies returned.

Also, related, despite the comment on line 107, I really don't see why you would want to create the cookie without the HttpOnly flag since I can never see a legitimate need to every have client-side JavaScript access it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand the major web frameworks (Django, Next.js etc.) recommend that client side JavaScript takes the cookie from the httpOnly=false cookie and send it back as a header during a fetch POST. I think this is what @psibean discovered as well.

https://docs.djangoproject.com/en/5.1/howto/csrf/#using-csrf-protection-with-ajax

Copy link
Contributor

@psibean psibean Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that getting the token from the cookie is also Angulars default interceptor behaviour for CSRF protection:

When performing HTTP requests, an interceptor reads a token from a cookie, by default XSRF-TOKEN, and sets it as an HTTP header, X-XSRF-TOKEN.

From: https://v17.angular.io/guide/http-security-xsrf-protection

@kwwall for a long time I have completely agreed with you. The cookie should be httpOnly, then the token should be provided to the frontend via a response payload (HTML, JSON), to then be included in a custom header. I was surprised to see this update to the cheat sheet as I believe I had previously contributed to it myself to clarify this.

However not only do the existing OWASP docs suggest otherwise (that line 107), but many existing implementations already work this way out of the box.

I get the impression the general consensus is, if something can access your cookie, that's a different vulnerability and it isn't the responsibility of CSRF protection.

It'd be great if we could get the cheat sheet cleaned up and filled with accurate information, because now I'm just not sure any more.

The recommended approach here is doing a few things that make frontend tampering impossible:

  • A backend secret is being used to generate a hmac for the CSRF token. A valid CSRF token cannot be constructed without this secret, at which point CSRF attacks are the least of your worries.
  • The message used to generate the hmac contains the users session ID and a pseudorandom cryptographicallybgenerate nonce, a valid CSRF token cannot be generated without the users session ID, which is already httpOnly
  • Validation is ensuring that the CSRF token received can reconstruct the hmac using the backend token, the users session ID, and the backend secret

All of these characteristics of this approach combined should make it okay.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the comment to add "cookie":
// From form field, cookie, or header
I think request.getParameter("csrf_token") is abstract enough to convey the idea in pseudo-code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the part that we were using cookies for compatibility with some JS frameworks and Django. Of course, I could just as easily argue that none of those frameworks should be relying on a Set-Cookie header to set a Cookie for a CSRF token. Seems to me it would have been better for them to settle on some standard header name such as "X-CSRF" or whatever. At least then if a framework botches the implementation, you don't need to worry about the browser automatically returning that cookie. AJAX works, except when it doesn't. But that's water under the bridge I guess.

As to why I was nothing things like request.getHeader("csrf_token")' being better pseudo-code, it because in the not-too-distant future, shortly after we publish it, someone somewhere is going to point their favorite generative AI LLM at this pseudo code and ask it to generate code in their favorite programming language. If the humans would actually scrutinize and thoroughly test the generated code, the problem of trying to read this a a query parameter on a URL would be noticed, but someone will inevitably miss it. Many of them overly trust AI-generated code and are largely ignorant of AppSec. So I just figured to try to make the inevitable a bit less likely. YMMV.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psibean wrote:

@kwwall for a long time I have completely agreed with you.

I hope not (on the "completely" part). If that's the case, one of us is redundant. 😄

const randomValue = tokenParts[1];

// Recreate the HMAC with the current session and the randomValue from the request
secret = readEnvironmentVariable("CSRF_SECRET") // HMAC secret key
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While passing secrets as environment variables is less bad than certain other scenarios, it is still not truly secure. E.g., see:

We generally to not allow this except as a last resort. Rather, retrieve the secret CSRF_SECRET in something like AWS KMS or HashiCorp Vault, etc. (If we change this here, a corresponding change is needed on line 97 as well.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the pseudocode in the existing section says. I kept it consistent.

Complicating pseudocode with details of proprietary services like AWS & HashiCorp doesn't seem like it will age well.

Copy link
Contributor

@psibean psibean Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just getSecret("CSRF_SECRET") ?
Environment variable, secret, or otherwise (future proof).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just getSecret("CSRF_SECRET") ? Environment variable, secret, or otherwise (future proof).

Make it getSecretSecurely or similar, and I could stand behind that.

message = sessionID.length + "!" + sessionID + "!" + randomValue.length + "!" + randomValue

// Generate the expected HMAC
expectedHmac = hmac("SHA256", secret, message)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HMac-SHA256 is going to be expected to return a byte array, not a string, so for here and on line 103, I would recommend changing the call from "hmac" to something like (say) "hmacAsHexString". If you don't do that, then someone trying to implement the pseudo-code on line 104, where it has:

   csrfToken = hmac + "." + randomValue;

is likely to come out with a mangled string. Alternate;y, you could do something like:

   csrfToken = hmac.asHex() + "." + randomValue.asHex();

here. That would save from having to base64-encoding the cookie value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those types of implementation details don't belong in pseudocode. The existing pseudocode gives return values the same treatment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends. May be true if you do think that no one will feed the pseudo-code to a generative AI. But I don't believe that for a minute.

// Compare the HMAC from the request with the expected HMAC
if (!constantTimeEquals(hmacFromRequest, expectedHmac)) {
// HMAC validation failed, reject the request
response.sendError(403, "Invalid CSRF token")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also recommend logging this and bogus csrfToken that is being passed in here as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. That seems like something implementors should know to do.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Developer's yes. But AIs, not at this juncture, unless you include it in the prompt, which most will forget. Can't hurt to at least note it in a comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants