The MinnowBoard Chronicles Episode 18: Reverse-Engineering Code Execution

In my last article, I used Last Branch Record (LBR) Trace to manually capture UEFI program flow source and destination addresses. This week, I look at the associated instruction opcodes and mnemonics and try to figure out what is going on.

In last week’s MinnowBoard Chronicles, Episode 17: Using LBR Trace without Source Code, we stopped somewhere in DXE and dumped all of the branch-from and branch-to instruction address pairs, up to the maximum of eight that the Intel Silvermont architecture records.

Why is this interesting? There may be an event you want to debug on an Intel platform where the only “breadcrumbs” are the addresses of the last branches executed immediately beforehand. As we learned in Episode 16, these are captured in model-specific registers (MSRs) dedicated to this purpose. On the MinnowBoard, which is based on an Intel Bay Trail-I processor with Silvermont cores, the branch-from addresses are held in MSRs x’40’ through x’47’ and the branch-to addresses in MSRs x’60’ through x’67’.
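As a rough sketch of that register layout, the following Python enumerates the eight from/to MSR addresses and dumps them through a caller-supplied reader. The `read_msr` callable is hypothetical: it stands in for whatever access path you have (a debugger macro, an embedded diagnostic, and so on) and is not a SourcePoint or ScanWorks API. The constant and function names are mine as well.

```python
from typing import Callable, List, Tuple

# Silvermont LBR stack layout as described in the text above.
LBR_FROM_BASE = 0x40   # MSR_LASTBRANCH_0_FROM_IP
LBR_TO_BASE = 0x60     # MSR_LASTBRANCH_0_TO_IP
LBR_DEPTH = 8          # eight from/to pairs


def lbr_msr_addresses() -> List[Tuple[int, int]]:
    """Return (from_msr, to_msr) address pairs for LBR entries 0..7."""
    return [(LBR_FROM_BASE + i, LBR_TO_BASE + i) for i in range(LBR_DEPTH)]


def dump_lbr(read_msr: Callable[[int], int]) -> List[Tuple[int, int]]:
    """Read every from/to pair using a caller-supplied MSR reader.

    read_msr is a hypothetical callable (MSR address -> value).
    """
    return [(read_msr(f), read_msr(t)) for f, t in lbr_msr_addresses()]


if __name__ == "__main__":
    for i, (f, t) in enumerate(lbr_msr_addresses()):
        print(f"LBR {i}: from MSR 0x{f:02X}, to MSR 0x{t:02X}")
```

Keeping the reader as a parameter means the same dump logic can be exercised against a fake register file before it is pointed at real hardware.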

Recall that the LBR recording mechanism tracks not only branch instructions (like JMP, Jcc, LOOP and CALL), but also other operations that cause a change in the instruction pointer (like external interrupts, traps and faults). It has the advantage of being available soon after reset if needed, whereas other tracing mechanisms, such as Branch Trace Store (BTS) and Intel Processor Trace, require system memory to be initialized. LBR is the most “low-level” of the tracing features on Intel silicon, so to speak.

To follow up on Episode 17, this week I again halted the system and put it into probe mode within DXE. Then I ran my LBR MSR dump macro to see the branch-from and branch-to address pairs. The address traceback looked like this:

From:          To:

7785b0e4 7785b0c4

7785b0d1 77855c10

77855c29 7785b0d6

7785b0e4 7785b0c4

7785b0d1 77855c10

7785b0e4 7785b0c4

7785b0d1 77855c10

77855c29 7785b0d6

We also know from the Intel 64 and IA-32 Architectures Software Developer’s Manuals that the TOS Pointer MSR (MSR_LASTBRANCH_TOS, address x’1C9’) points to the entry in the LBR stack that holds the most recent branch, interrupt, or exception recorded. In this case, the SourcePoint msr(1C9) command returned x’04’, so the last “from_address” is x’7785b0d1’ and the last “to_address” is x’77855c10’, which is the fifth pair (entry 4, counting from zero) in the list above. I could also see from the SourcePoint Code window that the instruction pointer is at x’77855c1f’. The branch traceback goes backwards from there, wrapping from entry 0 around to entry 7.
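The ordering rule can be sketched in a few lines of Python. This is only an illustration using the values from the dump above; the function names are mine, and it assumes the stack has already wrapped so that all eight entries are valid. It rotates the circular stack so the oldest entry comes first, then pairs each branch destination with the next branch source to get the straight-line spans of code that ran in between.

```python
from typing import List, Tuple

# (from, to) pairs in MSR index order 0..7, copied from the dump above.
LBR_STACK: List[Tuple[int, int]] = [
    (0x7785B0E4, 0x7785B0C4),
    (0x7785B0D1, 0x77855C10),
    (0x77855C29, 0x7785B0D6),
    (0x7785B0E4, 0x7785B0C4),
    (0x7785B0D1, 0x77855C10),
    (0x7785B0E4, 0x7785B0C4),
    (0x7785B0D1, 0x77855C10),
    (0x77855C29, 0x7785B0D6),
]
TOS = 0x04                # value read from MSR_LASTBRANCH_TOS (0x1C9)
CURRENT_IP = 0x77855C1F   # instruction pointer when the target was halted


def chronological(stack, tos):
    """Rotate the circular stack so the oldest entry comes first.

    Assumes every entry is valid (the stack has wrapped at least once).
    """
    n = len(stack)
    return [stack[(tos + 1 + k) % n] for k in range(n)]


def executed_ranges(stack, tos, current_ip):
    """Return (start, end) address spans that ran after each branch.

    Each span starts at a branch destination and ends at the address of
    the next branch instruction; the final span ends at the current
    instruction pointer, whose instruction has not executed yet.
    """
    ordered = chronological(stack, tos)
    ends = [frm for frm, _ in ordered[1:]] + [current_ip]
    return [(to, end) for (_, to), end in zip(ordered, ends)]


if __name__ == "__main__":
    for start, end in executed_ranges(LBR_STACK, TOS, CURRENT_IP):
        print(f"{start:08x} .. {end:08x}")
```

With TOS equal to 4, the oldest entry is entry 5 and the newest is entry 4, so the spans come out in the order 5, 6, 7, 0, 1, 2, 3, 4.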

Going into the SourcePoint Code window, with its built-in disassembler, we can easily see the assembly language code flow as we go backwards in time. There’s a lot of code here. Let’s look at the individual “chunks” of code sorted by the above “from” and “to” addresses:

MSR_LASTBRANCH_0_FROM_IP

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

MSR_LASTBRANCH_0_TO_IP

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_1_FROM_IP

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_1_TO_IP

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL

0000000077855C22L 8A0424           MOV         AL,byte ptr [RSP]

0000000077855C25L 4883C418         ADD         RSP,00000018

0000000077855C29L C3               RETN

MSR_LASTBRANCH_2_FROM_IP

0000000077855C29L C3               RETN

MSR_LASTBRANCH_2_TO_IP

000000007785B0D6L 88442428         MOV         byte ptr [RSP]+28,AL

000000007785B0DAL 0FB6442428       MOVZX       EAX,byte ptr [RSP]+28

000000007785B0DFL 83E020           AND         EAX,00000020

000000007785B0E2L 85C0             TEST        EAX,EAX

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

MSR_LASTBRANCH_3_FROM_IP

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

MSR_LASTBRANCH_3_TO_IP

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_4_FROM_IP

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_4_TO_IP

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL    // Instruction pointer!

0000000077855C22L 8A0424           MOV         AL,byte ptr [RSP]

0000000077855C25L 4883C418         ADD         RSP,00000018

0000000077855C29L C3               RETN    

MSR_LASTBRANCH_5_FROM_IP

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

MSR_LASTBRANCH_5_TO_IP

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_6_FROM_IP

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

MSR_LASTBRANCH_6_TO_IP

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL

0000000077855C22L 8A0424           MOV         AL,byte ptr [RSP]

0000000077855C25L 4883C418         ADD         RSP,00000018

0000000077855C29L C3               RETN       

MSR_LASTBRANCH_7_FROM_IP

0000000077855C29L C3               RETN

MSR_LASTBRANCH_7_TO_IP

000000007785B0D6L 88442428         MOV         byte ptr [RSP]+28,AL

000000007785B0DAL 0FB6442428       MOVZX       EAX,byte ptr [RSP]+28

000000007785B0DFL 83E020           AND         EAX,00000020

000000007785B0E2L 85C0             TEST        EAX,EAX

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L
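The opcode bytes in the listing can be cross-checked against the LBR addresses. The sketch below is not a general disassembler: it handles only the two relative encodings that appear here (opcode 74 for JE with an 8-bit displacement, opcode E8 for CALL with a 32-bit displacement), and the function name is my own. In both cases the displacement is signed, little-endian, and relative to the address of the following instruction.

```python
def rel_branch_target(address: int, opcode_bytes: bytes) -> int:
    """Compute the target of a relative JE (rel8) or CALL (rel32).

    Handles only the two encodings seen in the listing: 0x74 (JE rel8)
    and 0xE8 (CALL rel32).
    """
    opcode = opcode_bytes[0]
    if opcode == 0x74 and len(opcode_bytes) == 2:
        disp = int.from_bytes(opcode_bytes[1:2], "little", signed=True)
    elif opcode == 0xE8 and len(opcode_bytes) == 5:
        disp = int.from_bytes(opcode_bytes[1:5], "little", signed=True)
    else:
        raise ValueError("unsupported encoding for this sketch")
    # The displacement is applied to the address of the next instruction.
    next_ip = address + len(opcode_bytes)
    return (next_ip + disp) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    je = rel_branch_target(0x7785B0E4, bytes.fromhex("74DE"))
    call = rel_branch_target(0x7785B0D1, bytes.fromhex("E83AABFFFF"))
    print(f"JE   -> {je:016x}")
    print(f"CALL -> {call:016x}")
```

Both computed targets agree with the to-addresses captured in the LBR MSRs, which is a useful sanity check when all you have is a raw memory dump rather than a disassembler view.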

Seeing the actual flow of the code without any source code or automated tools is challenging, but it is doable. I wanted to simulate a debugging scenario in which you can extract MSR data (as with a ScanWorks Embedded Diagnostics (SED) On-Target Diagnostic (OTD)) but do not have a benchtop debugger (such as SourcePoint). Starting from the instruction pointer and working backwards, I reconstructed the actual code flow below:

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

 

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

 

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL

0000000077855C22L 8A0424           MOV         AL,byte ptr [RSP]

0000000077855C25L 4883C418         ADD         RSP,00000018

0000000077855C29L C3               RETN

 

000000007785B0D6L 88442428         MOV         byte ptr [RSP]+28,AL

000000007785B0DAL 0FB6442428       MOVZX       EAX,byte ptr [RSP]+28

000000007785B0DFL 83E020           AND         EAX,00000020

000000007785B0E2L 85C0             TEST        EAX,EAX

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

 

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

 

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL

0000000077855C22L 8A0424           MOV         AL,byte ptr [RSP]

0000000077855C25L 4883C418         ADD         RSP,00000018

0000000077855C29L C3               RETN

 

000000007785B0D6L 88442428         MOV         byte ptr [RSP]+28,AL

000000007785B0DAL 0FB6442428       MOVZX       EAX,byte ptr [RSP]+28

000000007785B0DFL 83E020           AND         EAX,00000020

000000007785B0E2L 85C0             TEST        EAX,EAX

000000007785B0E4L 74DE             JE          short ptr 000000007785b0c4L

 

000000007785B0C4L 0FB7057D320000   MOVZX       EAX,word ptr [000000007785e348]

000000007785B0CBL 83C005           ADD         EAX,00000005

000000007785B0CEL 4863C8           MOVSXD      RCX,EAX

000000007785B0D1L E83AABFFFF       CALL        0000000077855c10L

 

0000000077855C10L 48894C2408       MOV         qword ptr [RSP]+08,RCX

0000000077855C15L 4883EC18         SUB         RSP,00000018

0000000077855C19L 0FB7542420       MOVZX       EDX,word ptr [RSP]+20

0000000077855C1EL EC               IN          AL,DX

0000000077855C1FL 880424           MOV         byte ptr [RSP],AL    // Instruction pointer!

This is a little better. The instruction pointer is on the last line of the listing, at x’77855c1f’. I’ve put blank lines between the “cycles” associated with branches to make the code more readable. You can see how powerful trace is, because it looks backwards in time, as opposed to pure run-control, which stops at an event and lets you single-step forward in time. The dynamic flow of the code is visible, and the direction taken by the conditional branches gives you a sense of the program logic; for example, the two instructions:

TEST EAX, EAX

JE short ptr 000000007785b0c4L

yield a jump to address x’7785b0c4’ if the TEST instruction sets the zero flag (ZF = 1) in the EFLAGS register. TEST EAX, EAX sets the zero flag only if the EAX register contains zero. Because the LBR recorded this branch as taken, you know EAX was zero at that point without having knowledge of, or access to, the register contents. And since the preceding AND EAX,00000020 keeps only bit 5, it follows that bit 5 of the byte returned by IN AL,DX was clear. This kind of inference is often all you have when using SED to backtrace program flow prior to a catastrophic event, such as a CATERR or IERR.
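To make the flag logic concrete, here is a small Python model of the MOVZX / AND / TEST / JE sequence at x’7785b0da’ through x’7785b0e4’. It is only a sketch: the function and parameter names are mine, and `port_byte` simply stands for whatever value IN AL,DX returned, which the trace itself does not record.

```python
def je_taken_after_test(port_byte: int, mask: int = 0x20) -> bool:
    """Model MOVZX / AND EAX,mask / TEST EAX,EAX / JE from the listing.

    Returns True when JE would be taken, i.e. when ZF = 1.
    """
    eax = port_byte & 0xFF          # MOVZX EAX, byte ptr [RSP]+28
    eax &= mask                     # AND EAX, 00000020
    zero_flag = (eax & eax) == 0    # TEST EAX, EAX sets ZF when the result is 0
    return zero_flag                # JE branches when ZF = 1


if __name__ == "__main__":
    for value in (0x00, 0x20, 0xDF, 0xFF):
        taken = je_taken_after_test(value)
        print(f"port byte 0x{value:02X}: JE taken = {taken}")
```

Every JE in the captured trace was taken, so under this model bit 5 was clear on each pass. That pattern looks like a loop that keeps re-reading the port until the bit becomes set, though without the source that remains an inference.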

Going back to the MinnowBoard source build, I found that the code I’m in is within PchInitDxe. I don’t have the source code for that module; it’s part of one of the binary blobs within the build. All I have are the associated files with suffixes .efi, .depex, .inf and .pdb. What should I do next? Maybe acquire a copy of IDA Pro to help me decompile the code? So much to learn, so little time…

Of course, I wouldn’t even have gotten this far without easy access to SourcePoint. Register for our UEFI Framework Debugging eBook to learn more about JTAG-assisted debug and trace.

Alan Sguigna