Conversation
Hey, could someone please kick off CI for this? Also, FWIW, I have this working with monty, where it avoids stack overflows in both AST parsing for the bytecode compiler and type checking, in pydantic/monty#391. |
Thank you. This is an improvement, but I'm not convinced it is the proper fix; it only shifts the threshold at which the parser aborts. It isn't sufficient, for example, to protect against allocation failures when the program is too large. I also checked, and neither TypeScript nor Rust implements the same treatment. Instead, the common approach across parsers is to:
In the end, protecting against denial-of-service attacks isn't specific to stack overflows. The same protection must be in place to handle the exploitation of bugs (in the parser or elsewhere), which is why I wouldn't consider this a security bug (the change certainly adds a few more guardrails, but it doesn't prevent such attacks). |
I get where you're coming from, but the fact is that if you limit the code length, stack overflow is one of the only DoS risks in the parser. @zanieb suggested you don't have the bandwidth to rewrite the recursion to a loop, and I certainly don't, so the choice is between adding this improvement and not adding it. I'd therefore really appreciate it if you accepted this improvement. But I get it if you're not willing to merge; I'll just use the ruff crates from my branch and attempt to keep it up to date. (If you are considering rewriting the parser to a loop, please consider making it available as an iterator so we can avoid the overhead of allocating before the first IR.) |
I think there was some misunderstanding of what "rewriting" to a loop means. I'm not suggesting that we rewrite the parser to a loop. Instead, the idea is to unroll the recursion by using a loop, similar to what we do in |
I'm fine going ahead with this if we address the following issues:
Great, I'll get those things fixed as soon as I have time. |
Summary
Fixes #22930.
Without this, malicious or machine-generated code could cause a stack overflow with something as simple as
'(' * 5000 + '1' + ')' * 5000.
I decided to do the simplest thing and have a limit that's always applied with a reasonable default. Since:
Test Plan
PR includes tests.
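
For readers unfamiliar with the failure mode, here is a minimal Python sketch of the idea described in the Summary: a recursive-descent parser that tracks nesting depth and returns an error once a limit is exceeded, instead of overflowing the stack. This is not the PR's Rust implementation. The names `MAX_NESTING_DEPTH`, `NestingTooDeepError`, and `parse_expr`, as well as the limit of 200, are hypothetical and chosen only for illustration.

```python
# Toy illustration only: a recursive-descent parser for a parenthesized
# integer, with an always-on nesting limit. All names and the default
# value below are assumptions, not ruff's actual API.

MAX_NESTING_DEPTH = 200  # hypothetical default limit


class NestingTooDeepError(Exception):
    """Raised when the input nests deeper than MAX_NESTING_DEPTH."""


def parse_expr(source: str, pos: int = 0, depth: int = 0) -> tuple[int, int]:
    """Parse an integer wrapped in any number of parentheses.

    Returns (value, position after the parsed expression).
    """
    # Check the depth before recursing further, so deeply nested input
    # fails with a recoverable error rather than exhausting the stack.
    if depth > MAX_NESTING_DEPTH:
        raise NestingTooDeepError(
            f"expression nested deeper than {MAX_NESTING_DEPTH} levels"
        )

    if pos < len(source) and source[pos] == "(":
        # One recursive call per opening parenthesis.
        value, pos = parse_expr(source, pos + 1, depth + 1)
        if pos >= len(source) or source[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1

    # Base case: a run of decimal digits.
    start = pos
    while pos < len(source) and source[pos].isdigit():
        pos += 1
    if start == pos:
        raise SyntaxError("expected an integer")
    return int(source[start:pos]), pos
```

With this sketch, `parse_expr("((1))")` parses normally, while the input from the Summary, `'(' * 5000 + '1' + ')' * 5000`, raises `NestingTooDeepError` after about 200 recursive calls. As noted in the discussion above, a limit like this only bounds recursion depth; it does not address other resource exhaustion, such as allocation failures on very large inputs.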