Change to get_value_str() to escape regexes broke capa2yara.py #1909
Comments
Here's the regex in the raw rule YAML:
- string: /^(\\\\\?\\)?([\w]\:|\\)(\\((?![\<\>\"\/\|\*\?\:\\])[\x20-\x5B\x5D-\x7E])+)+\:\$?[a-zA-Z0-9_]+/
and as logged above, from [regex(string =~ /^(\\\\\?\\)?([\w]\:|\\)(\\((?![\<\>\"\/\|\*\?\:\\])[\x20-\x5B\x5D-\x7E])+)+\:\$?[a-zA-Z0-9_]+/)]
and they look the same to me. So can you explain the problem in a little more detail?
The line after that is the problem. That's what ends up in the YARA rule, and it's escaped when it shouldn't be:
Is this possibly because the logging statement uses
I took a look at this a few days ago, but I couldn't figure out what was going on. Here's my output after reverting 58e94a3:
It seems like YARA doesn't like spaces in meta field names, so I had to add underscores to
What did the regex look like when the script was working?
Hi @ruppde, I wanted to follow up on the issue you raised regarding the regex escaping problem in capa2yara.py. I have attempted to address this by introducing a new function that returns the regex unescaped, specifically for use in capa2yara.py. I have modified the code to include this new function, which should resolve the issue of over-escaped regex patterns in the generated YARA rules.
Hi @Dronesh77, that looks great, could you please make a pull request?
Hi @ruppde, I've raised the pull request as requested. Could you please review it and let me know your feedback?
Hi @Dronesh77,
Hi @ruppde, I've made the following changes to address the regex escaping issue in capa2yara.py:

Introduced a new function: I added get_unescaped_regex() to handle unescaped regex patterns specifically for YARA compatibility. This function removes unnecessary escaping of special characters and adjusts the regex for proper use in YARA rules.

Updated regex handling: In the convert_rule() function, I replaced the old logic with this new function to ensure that the regex is correctly formatted. This includes:
- Removing redundant escape characters (e.g., \\ → \).
- Adjusting patterns like /reg(|.exe)/ to /reg(.exe)?/, which aligns with YARA's regex engine.
- Modifying beginning-of-line markers (e.g., ^open → \x00open) for compatibility.

Improved performance: To prevent potential YARA warnings about poor performance, I limited patterns like .* to .{,1000}.
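For readers following along, the rewrites listed in this comment can be sketched roughly as below. This is only an illustration under stated assumptions: relax_regex_for_yara() is a hypothetical name, it is not the get_unescaped_regex() from the pull request (which is not shown in this thread), and it operates on the regex body without the surrounding slashes.

```python
import re


def relax_regex_for_yara(pattern: str) -> str:
    """Hypothetical helper: apply the textual rewrites described above."""
    # Collapse doubled backslashes left behind by over-escaping: \\ -> \
    # (assumes the input really was escaped once too often)
    pattern = pattern.replace("\\\\", "\\")
    # Turn an empty first alternative into an optional group: (|.exe) -> (.exe)?
    pattern = re.sub(r"\(\|([^()|]+)\)", r"(\1)?", pattern)
    # Replace a leading start-of-line anchor with a null byte: ^open -> \x00open
    if pattern.startswith("^"):
        pattern = "\\x00" + pattern[1:]
    # Bound greedy wildcards to avoid YARA performance warnings: .* -> .{,1000}
    pattern = pattern.replace(".*", ".{,1000}")
    return pattern


if __name__ == "__main__":
    for example in ("reg(|.exe)", "^open", "a.*b"):
        print(example, "->", relax_regex_for_yara(example))
```

These are blunt string substitutions. For example, the `.*` replacement would also rewrite an escaped dot followed by a star (`\.*`), and collapsing `\\` is wrong for a pattern that legitimately matches a literal backslash, so a real fix would need to track escape state or avoid the extra escaping in the first place.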
@Dronesh77 sounds great, but the PR does not reflect it; it only changes the file permissions:
Description
With 58e94a3, the regexes returned by get_value_str() are escaped, which breaks e.g. capa/scripts/capa2yara.py (line 262 in 3f449f3).
Steps to Reproduce
Run
The 2nd line shows the regex escaped, which is of no use in yara:
Expected behavior:
No escaping
Actual behavior:
See above
Versions
Most recent GitHub version
Additional Information
How should we fix this? Introduce another function which returns the regex unescaped?
(capa2yara.py is the only script in scripts/ that uses the function, so nothing else should have broken.)
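One possible shape for the "another function" idea is sketched below. Everything here is an assumption for illustration: RegexFeature and get_raw_value() are hypothetical names, not capa's actual classes or API, and the unicode_escape call only stands in for whatever escaping 58e94a3 introduced.

```python
class RegexFeature:
    """Hypothetical stand-in for a regex feature; not capa's real class."""

    def __init__(self, value: str):
        # Pattern exactly as written in the rule, e.g. /a\.b/
        self.value = value

    def get_value_str(self) -> str:
        # Display form: backslashes and non-printable characters are escaped.
        # (Assumed behaviour, approximated here with the unicode_escape codec.)
        return self.value.encode("unicode_escape").decode("ascii")

    def get_raw_value(self) -> str:
        # Hypothetical accessor: the untouched pattern, suitable for a YARA rule.
        return self.value


if __name__ == "__main__":
    feature = RegexFeature(r"/a\.b/")
    print("display:", feature.get_value_str())  # escaped for logs and output
    print("yara:   ", feature.get_raw_value())  # unescaped for rule generation
```

With a split like this, logging and rendering keep the escaped form while capa2yara.py would read the raw one, so the display change would not leak into generated rules.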