MZ/PE: Parse and extract more information #2479
JeromeMartinez merged 5 commits into MediaArea:master
Some parts of the parsing are kind of messy, since the format is not simple, but so far it has worked fine with the files I tested.
Oh, it seems there were some issues parsing the MediaInfo installer version info. Fixed. It looks like the flag can be either binary or text.
I have not found an easy way to detect ARM64EC/ARM64X or the CETCOMPAT flag.
a5f7850 to 98fc52d
JeromeMartinez left a comment
Thank you for the hard work.
But a bit more work needed, especially for:
Have not found easy way to detect ARM64EC/ARM64X
Aren't they in the Machine Type?
IMAGE_FILE_MACHINE_ARM64EC | 0xA641 | ABI that enables interoperability between native ARM64 and emulated x64 code.
IMAGE_FILE_MACHINE_ARM64X | 0xA64E | Binary format that allows both native ARM64 and ARM64EC code to coexist in the same file.
(but I am a bit lost there, so not sure)
or the CETCOMPAT flag.
DllCharacteristics 0x4000 → either interpreted as Guard CF or CET_COMPAT, depending on OS version and SDK.
Load Config GuardFlags → actual runtime enforcement:
IMAGE_GUARD_CF_INSTRUMENTED → software CFG
IMAGE_GUARD_SHADOW_STACK → CET
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
| static const char* Mz_Windows_Subsystem(int16u Subsystem) | ||
| { | ||
| switch (Subsystem) { | ||
| case 0: return "Unknown"; |
nitpick: I prefer just a static const char* [] + test on the size of the array, like Mz_Directories.
The values that have strings are not continuous though.
I'll change to array and put empty strings for the missing ones.
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
| }; | ||
| mz_dllcharacteristics_data Mz_DLLCharacteristics_Data[] = | ||
| { | ||
| { 0x0020, "High Entropy VA" }, |
nitpick: I prefer just a static const char* [] and the line number of the list is the bit offset. From doc it is "reserved" but still has a name (bit 4 is just "") and this function iterates over the bitfield.
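One possible shape for the bit-indexed variant, as a sketch only. The array name, the display strings other than "High Entropy VA", and the " / " separator are assumptions; the bit assignments follow the DllCharacteristics table in the PE format documentation:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Index = bit position in DllCharacteristics; "" = reserved or unnamed bit.
// Display strings are illustrative, not taken from the PR.
static const char* const Mz_DLLCharacteristics_Names[16] =
{
    "", "", "", "",             // bits 0-3 (0x0001-0x0008): reserved
    "",                         // bit  4 (0x0010): no name
    "High Entropy VA",          // bit  5 (0x0020)
    "Dynamic Base",             // bit  6 (0x0040)
    "Force Integrity",          // bit  7 (0x0080)
    "NX Compatible",            // bit  8 (0x0100)
    "No Isolation",             // bit  9 (0x0200)
    "No SEH",                   // bit 10 (0x0400)
    "No Bind",                  // bit 11 (0x0800)
    "AppContainer",             // bit 12 (0x1000)
    "WDM Driver",               // bit 13 (0x2000)
    "Control Flow Guard",       // bit 14 (0x4000)
    "Terminal Server Aware",    // bit 15 (0x8000)
};

// Builds a " / "-separated list of the named bits that are set.
static std::string Mz_DLL_Characteristics(std::uint16_t Flags)
{
    std::string Result;
    for (std::size_t Bit = 0; Bit < 16; ++Bit)
    {
        if (!(Flags & (1u << Bit)) || !*Mz_DLLCharacteristics_Names[Bit])
            continue; // bit not set, or bit has no name
        if (!Result.empty())
            Result += " / ";
        Result += Mz_DLLCharacteristics_Names[Bit];
    }
    return Result;
}
```

With this layout the struct of mask/name pairs is no longer needed, because the position in the list carries the mask.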
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
| Fill(Stream_General, 0, "Windows_Subsystem", Mz_Windows_Subsystem(Subsystem)); | ||
| if (MajorSubsystemVersion) | ||
| Fill(Stream_General, 0, "Subsystem_Version", std::to_string(MajorSubsystemVersion) + "." + std::to_string(MinorSubsystemVersion)); | ||
| Fill(Stream_General, 0, "Dll_Characteristics", Mz_DLL_Characteristics(DllCharacteristics)); |
| Fill(Stream_General, 0, "Dll_Characteristics", Mz_DLL_Characteristics(DllCharacteristics)); | |
| Fill(Stream_General, 0, "Format_Settings", Mz_DLL_Characteristics(DllCharacteristics)); |
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
| } | ||
| if (MajorLinkerVersion) | ||
| Fill(Stream_General, 0, "Linker_Version", std::to_string(MajorLinkerVersion) + "." + std::to_string(MinorLinkerVersion)); | ||
| Fill(Stream_General, 0, "Windows_Subsystem", Mz_Windows_Subsystem(Subsystem)); |
| Fill(Stream_General, 0, "Windows_Subsystem", Mz_Windows_Subsystem(Subsystem)); | |
| Fill(Stream_General, 0, "Subsystem_Name", Mz_Windows_Subsystem(Subsystem)); |
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
| Get_L2(majorver, "MajorVersion"); | ||
|
|
||
| FILLING_BEGIN(); | ||
| Fill(Stream_General, 0, "BootMgr_SVN", std::to_string(majorver) + "." + std::to_string(minorver)); |
| Fill(Stream_General, 0, "BootMgr_SVN", std::to_string(majorver) + "." + std::to_string(minorver)); | |
| Fill(Stream_General, 0, "BootMgrSecurity_Version", std::to_string(majorver) + "." + std::to_string(minorver)); |
Source/MediaInfo/Archive/File_Mz.cpp
Outdated
|
|
||
| FILLING_BEGIN(); | ||
| if (level == 2) { | ||
| Fill(Stream_General, 0, szKey.To_UTF8().c_str(), Value); |
Try to add a mapping of "known names" with e.g.
LegalCopyright --> Copyright
CompanyName --> Software_CompanyName
FileVersion --> Software_Version
FileName --> Software_Name
(I try to have something like what we have with EXIF "Encoded_Application_*" stuff so I'll do something similar with one line default display)
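A small sketch of such a mapping, using only the four pairs listed above. The struct and function names are hypothetical, and falling back to the original key for unknown names is an assumption about the desired behavior:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper type: version-info key -> MediaInfo field name.
struct mz_versioninfo_name
{
    const char* Key;
    const char* Field;
};

static const mz_versioninfo_name Mz_VersionInfo_Names[] =
{
    { "LegalCopyright", "Copyright" },
    { "CompanyName",    "Software_CompanyName" },
    { "FileVersion",    "Software_Version" },
    { "FileName",       "Software_Name" },
};

// Returns the mapped field name, or the key itself when it is not a known name
// (assumption: unknown keys keep being filled under their original name).
static const char* Mz_VersionInfo_Field(const char* Key)
{
    const std::size_t Count = sizeof(Mz_VersionInfo_Names) / sizeof(*Mz_VersionInfo_Names);
    for (std::size_t i = 0; i < Count; ++i)
        if (!std::strcmp(Key, Mz_VersionInfo_Names[i].Key))
            return Mz_VersionInfo_Names[i].Field;
    return Key;
}
```

The `Fill()` call in the diff could then pass `Mz_VersionInfo_Field(szKey.To_UTF8().c_str())` instead of the raw key.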
A rebase first...
CETCOMPAT is in Extended DLL Characteristics. Not sure how to get to it or parse it. It looks like it is reached through the directory where Mz_Directories == "Debug".
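A rough sketch of how that lookup might go, working on a whole file held in memory rather than on the MediaInfo parser API. The function and constant names are hypothetical. It assumes the debug directory has already been located as a file offset and size, and relies on the documented layout: each debug directory entry is 28 bytes, type 20 is the extended DLL characteristics entry, and bit 0x0001 of its data is CET compatibility:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value; the caller guarantees Offset + 4 <= Data.size().
static std::uint32_t ReadL4(const std::vector<std::uint8_t>& Data, std::size_t Offset)
{
    return  std::uint32_t(Data[Offset])
         | (std::uint32_t(Data[Offset + 1]) << 8)
         | (std::uint32_t(Data[Offset + 2]) << 16)
         | (std::uint32_t(Data[Offset + 3]) << 24);
}

static const std::uint32_t Mz_Debug_Type_ExDllCharacteristics = 20;  // IMAGE_DEBUG_TYPE_EX_DLLCHARACTERISTICS
static const std::uint32_t Mz_ExDllCharacteristics_CetCompat = 0x0001; // IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT

// File: whole file content. DirOffset/DirSize: file offset and size of the debug directory.
static bool Mz_IsCetCompat(const std::vector<std::uint8_t>& File, std::size_t DirOffset, std::size_t DirSize)
{
    const std::size_t EntrySize = 28; // size of one debug directory entry
    if (DirOffset > File.size() || DirSize > File.size() - DirOffset)
        return false; // directory does not fit in the file
    for (std::size_t Pos = DirOffset; Pos + EntrySize <= DirOffset + DirSize; Pos += EntrySize)
    {
        const std::uint32_t Type             = ReadL4(File, Pos + 12);
        const std::uint32_t SizeOfData       = ReadL4(File, Pos + 16);
        const std::uint32_t PointerToRawData = ReadL4(File, Pos + 24);
        if (Type != Mz_Debug_Type_ExDllCharacteristics || SizeOfData < 4)
            continue;
        if (PointerToRawData > File.size() || File.size() - PointerToRawData < 4)
            continue; // data lies outside the file
        return (ReadL4(File, PointerToRawData) & Mz_ExDllCharacteristics_CetCompat) != 0;
    }
    return false;
}
```

`PointerToRawData` is already a file offset, so no RVA translation is needed for the entry data itself.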
Not so simple. Look at MediaInfo.dll ARM64X:
but peview can identify that it is ARM64X, not ARM64 or ARM64EC. It also seems it may be able to determine whether the ARM64X defaults to ARM64 or ARM64EC.
or an ARM64EC file:
It is not simple because, for compatibility, the ARM64X binary has to appear like a usual ARM64 or x64 binary, and ARM64EC like x64, if I remember correctly.
Looks like we have to get to IMAGE_LOAD_CONFIG_DIRECTORY from Mz_Directories[10] == "Load Config Table", then parse it (if 64-bit): https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_load_config_directory64.
DynamicValueRelocTable and CHPEMetadataPointer should supposedly give us what is needed to determine whether it is ARM64EC or ARM64X. More info: https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/
I also see a section header named .a64xrm, which may be a hint of ARM64X/ARM64EC.
CET compat detection done. The way I jump around the file for the various elements looks like a hack and does not seem scalable. We may need a good "infrastructure" for navigating the file if we are to parse more stuff.
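One common building block for that kind of navigation is a helper that translates an RVA from a data directory into a file offset using the section table. The sketch below is an assumption about what such infrastructure could start from, not what the PR does; the struct and function names are hypothetical:

```cpp
#include <cstdint>
#include <vector>

// Subset of the section header fields needed for address translation.
struct mz_section
{
    std::uint32_t VirtualAddress;   // RVA of the section when loaded
    std::uint32_t VirtualSize;      // size in memory
    std::uint32_t PointerToRawData; // file offset of the section data
    std::uint32_t SizeOfRawData;    // size of the data present in the file
};

// Translates an RVA to a file offset.
// Returns false when no section holds the RVA or the RVA is not backed by file data.
static bool Mz_RvaToOffset(const std::vector<mz_section>& Sections, std::uint32_t Rva, std::uint64_t& Offset)
{
    for (const mz_section& Section : Sections)
    {
        if (Rva < Section.VirtualAddress)
            continue;
        const std::uint32_t Delta = Rva - Section.VirtualAddress;
        if (Delta >= Section.SizeOfRawData)
            continue; // beyond the part of the section stored in the file
        Offset = std::uint64_t(Section.PointerToRawData) + Delta;
        return true;
    }
    return false;
}
```

Collecting the section headers once and routing every directory lookup through a helper like this would keep the jumps in one place.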
JeromeMartinez left a comment
Let's go with that for now and we will see what can be improved step by step.
Example (MediaInfo here is built in debug mode so no CFG flag in Dll_Characteristics):