-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Ignore ES/CS/SS/DS segment overrides in x64 mode #2819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Conversation
|
Seems to not break anything. |
|
This is going to need a fair bit of testing. In particular, for multiple segment overrides in 32-bit mode all of the disassemblers (including Capstone) use first-seen. We'll want to make sure that is unchanged. I'm happy to help write a few tests cases. |
Thank you! This is my first time working with the capstone codebase, so I appreciate all help/advice. I am reading up on Capstone's testing set up and I will try to write a few tests myself as well. |
|
Mark this as draft for now. Please change it back once you think the testing is enough. |
|
I have added 21 test cases that cover various prefix combinations for 16, 32 and 64-bit modes and ensure notrack (reuses the DS segment override) is still decoded correctly. I was not sure where to place the tests, so I have put them in a separate file for now. @hainest if you have time, would you mind taking a look? Are there any cases that I missed? |
|
I would recommend explicitly checking that the correct prefix was found; e.g., -
input:
name: "x86-16: rightmost segment override should take priority"
bytes: [ 0x26, 0x65, 0x64, 0x3E, 0x65, 0x2E, 0x00, 0x00 ]
arch: "CS_ARCH_X86"
options: [ CS_MODE_16 ]
expected:
insns:
-
asm_text: "add byte ptr cs:[bx + si], al"
details:
x86:
prefix: [ X86_PREFIX_0, X86_PREFIX_CS, X86_PREFIX_0, X86_PREFIX_0 ]However, this raises an error: I've confirmed I built with |
|
Is there a way to directly check the derived segment override rather than the raw prefix? Due to the fact that the DS segment override prefix, which is normally ignored on 64-bit, is also overloaded as if (insn->mode != MODE_64BIT) {
insn->segmentOverride = SEG_OVERRIDE_CS;
}
insn->prefix1 = byte;So for an instruction with an ignored segment override this would still set It is not entirely clear to me what the best behavior would be here. Capstone expects there to be one relevant prefix per prefix group, which does not work well for |
Currently not. The segment overwrite is not even exposed in the API. If you do a reference search on the You can expose the segment override in the API as well. I think the best way to achieve this is:
That said, I can't really say how useful this information is for the end user. Besides that, one more question (I am really not that much into x86. So please correct me if I miss-understand something). But the ISA says in
Doesn't this mean it is fine to overwrite the prefixes? Because it is the semantically correct, right? Another point, |
|
I have pushed some changes. The patch now does the following:
This keeps I have also updated the tests to check the
Unfortunately just because the manual forbids it, does not mean you will not run into code using it. For example, malware might intentionally use undefined behavior for obfuscation. I will admit that this is a niche use-case, and I understand if you do not want to merge because of this. However, given that this is a fairly small change and that other disassemblers also do this, I think it would be worth merging.
As far as I can tell my code does not ignore the FS/GS prefixes, but if I am overlooking something please let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me.
But I'd like to have another review from one maintainer with more x86 knowledge. And of course from @hainest
Also, please add documentation about this in https://github.com/capstone-engine/capstone/blob/next/docs/cs_v6_release_guide.md
|
Apologies for the delay; I've added this change to the documentation. Please let me know if you'd like me to make any changes to the wording. |
|
@hainest If you find time, I would appreciate your review! |
hainest
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A suggestion for a refactor, but otherwise looks good. I think the only additional tests that might be useful are for the overlap with branch hint bytes. I'm not sure that's necessary here, but could be useful.
| case 0x2e: | ||
| insn->segmentOverride = SEG_OVERRIDE_CS; | ||
| insn->prefix1 = byte; | ||
| if (insn->mode != MODE_64BIT || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend refactoring to make these checks more explicit and provide some comments as to why certain values are being ignored. Maybe something like
/*
* setSegmentPrefixOverride - Overrides an instruction's prefix1 based on addressing rules
*
* @param insn - The instruction to be overridden.
* @param prefix - The segment override to use.
* @param byte - The current decoded prefix byte.
*/
static void setSegmentPrefix(struct InternalInstruction *insn,
SegmentOverride prefix, uint8_t byte)
{
if (insn->mode != MODE_64BIT) {
insn->segmentOverride = prefix;
insn->prefix1 = byte;
return;
}
// In the case there are multiple segment overrides, do not override
// an existing FS or GS segment prefix.
switch (insn->prefix1) {
case 0x64: // FS
case 0x65: // GS
return;
}
// If the proposed override is for FS or GS, mark it overridden
// All other segment prefixes are ignored.
switch (byte) {
case 0x64: // FS
case 0x65: // GS
insn->segmentOverride = prefix;
break;
}
// Always set the prefix, even if it's not marked overridden
insn->prefix1 = byte;
}
...
switch (byte) {
case 0x2e:
setSegmentPrefix(insn, SEG_OVERRIDE_CS, byte);
break;
case 0x36:
setSegmentPrefix(insn, SEG_OVERRIDE_SS, byte);
break;
case 0x3e:
setSegmentPrefix(insn, SEG_OVERRIDE_DS, byte);
break;
case 0x26:
setSegmentPrefix(insn, SEG_OVERRIDE_ES, byte);
break;
case 0x64:
setSegmentPrefix(insn, SEG_OVERRIDE_FS, byte);
break;
case 0x65:
setSegmentPrefix(insn, SEG_OVERRIDE_GS, byte);
break;
default:
// debug("Unhandled override");
return -1;
}
Your checklist for this pull request
Detailed description
This PR attempts to fix the decoding of x64 instructions with ignored segment overrides. The typical behavior of CPUs, which is copied by most disassemblers, is to completely ignore ES/CS/SS/DS segment overrides and use the last FS/GS override, if any.
...
Test plan
I have added 21 test cases that cover various prefix combinations for 16, 32 and 64-bit modes and ensure notrack (reuses the DS segment override) is still decoded correctly.
...
closes #2818
...