Ignore ES/CS/SS/DS segment overrides in x64 mode#2819
Ignore ES/CS/SS/DS segment overrides in x64 mode#2819Rot127 merged 10 commits intocapstone-engine:nextfrom
Conversation
|
Seems to not break anything. |
|
This is going to need a fair bit of testing. In particular, for multiple segment overrides in 32-bit mode all of the disassemblers (including Capstone) use first-seen. We'll want to make sure that is unchanged. I'm happy to help write a few tests cases. |
Thank you! This is my first time working with the capstone codebase, so I appreciate all help/advice. I am reading up on Capstone's testing set up and I will try to write a few tests myself as well. |
|
Mark this as draft for now. Please change it back once you think the testing is enough. |
|
I have added 21 test cases that cover various prefix combinations for 16, 32 and 64-bit modes and ensure notrack (reuses the DS segment override) is still decoded correctly. I was not sure where to place the tests, so I have put them in a separate file for now. @hainest if you have time, would you mind taking a look? Are there any cases that I missed? |
|
I would recommend explicitly checking that the correct prefix was found; e.g., -
input:
name: "x86-16: rightmost segment override should take priority"
bytes: [ 0x26, 0x65, 0x64, 0x3E, 0x65, 0x2E, 0x00, 0x00 ]
arch: "CS_ARCH_X86"
options: [ CS_MODE_16 ]
expected:
insns:
-
asm_text: "add byte ptr cs:[bx + si], al"
details:
x86:
prefix: [ X86_PREFIX_0, X86_PREFIX_CS, X86_PREFIX_0, X86_PREFIX_0 ]However, this raises an error: I've confirmed I built with |
|
Is there a way to directly check the derived segment override rather than the raw prefix? Due to the fact that the DS segment override prefix, which is normally ignored on 64-bit, is also overloaded as if (insn->mode != MODE_64BIT) {
insn->segmentOverride = SEG_OVERRIDE_CS;
}
insn->prefix1 = byte;So for an instruction with an ignored segment override this would still set It is not entirely clear to me what the best behavior would be here. Capstone expects there to be one relevant prefix per prefix group, which does not work well for |
Currently not. The segment overwrite is not even exposed in the API. If you do a reference search on the You can expose the segment override in the API as well. I think the best way to achieve this is:
That said, I can't really say how useful this information is for the end user. Besides that, one more question (I am really not that much into x86. So please correct me if I miss-understand something). But the ISA says in
Doesn't this mean it is fine to overwrite the prefixes? Because it is the semantically correct, right? Another point, |
|
I have pushed some changes. The patch now does the following:
This keeps I have also updated the tests to check the
Unfortunately just because the manual forbids it, does not mean you will not run into code using it. For example, malware might intentionally use undefined behavior for obfuscation. I will admit that this is a niche use-case, and I understand if you do not want to merge because of this. However, given that this is a fairly small change and that other disassemblers also do this, I think it would be worth merging.
As far as I can tell my code does not ignore the FS/GS prefixes, but if I am overlooking something please let me know. |
There was a problem hiding this comment.
It looks good to me.
But I'd like to have another review from one maintainer with more x86 knowledge. And of course from @hainest
Also, please add documentation about this in https://github.com/capstone-engine/capstone/blob/next/docs/cs_v6_release_guide.md
|
Apologies for the delay; I've added this change to the documentation. Please let me know if you'd like me to make any changes to the wording. |
|
@hainest If you find time, I would appreciate your review! |
hainest
left a comment
There was a problem hiding this comment.
A suggestion for a refactor, but otherwise looks good. I think the only additional tests that might be useful are for the overlap with branch hint bytes. I'm not sure that's necessary here, but could be useful.
As far as I could tell, Capstone currently does not decode branch hints, so I thought it would not be useful to add tests for them. But I am happy to add some if you disagree. |
If they aren't supported, then these tests are plenty. |
hainest
left a comment
There was a problem hiding this comment.
Thanks for doing the refactor!
Rot127
left a comment
There was a problem hiding this comment.
Good idea with the refactor!
Your checklist for this pull request
Detailed description
This PR attempts to fix the decoding of x64 instructions with ignored segment overrides. The typical behavior of CPUs, which is copied by most disassemblers, is to completely ignore ES/CS/SS/DS segment overrides and use the last FS/GS override, if any.
...
Test plan
I have added 21 test cases that cover various prefix combinations for 16, 32 and 64-bit modes and ensure notrack (reuses the DS segment override) is still decoded correctly.
...
closes #2818
...