Exploiting Intel’s Management Engine – Part 3: USB hijacking (INTEL-SA-00086)

In mid-November, a little over two... four... nine months ago, I wrote Part 1 and Part 2 of my series of articles about exploiting the Intel ME. I also said I’d write Part 3 by the end of the week. Oops.

Unfortunately, a lot of stuff happened and life caught up with me: I became busy with another hobby, and then extremely busy with starting a new business of mine (a D&D game hosting service for Foundry Virtual Tabletop). I’ve wanted to finish my third article for a while now, and I’m taking the time to do it today, though it might not be as verbose as usual because my time is very limited lately (also, it’s been so long that my memory of everything I did isn’t as fresh as I’d want it to be). Finding a new hobby around D&D gaming and now working on my new business venture is what prompted me to start these articles in the first place: I knew I wouldn’t have any more time to actually finish this project I started so long ago, so here’s everything I did so far. Maybe someone else will finish it.

The big question when it comes to getting code execution on the Intel ME might be “Why?” Well, I thought it would be fun to try and create a keylogger that runs directly on the ME. Not many people know this, but one of the first things I did as a script kiddie was play around with keyloggers. While they have the “steal your friend’s passwords” use, I had one installed on my own PC and it was extremely useful for recovering content I’d write (before text editors or email apps had auto-save features) or using it as a log of what I did and when I did it.

How to keylog? The basics.

How can we use the ME to keylog a machine? The answer is simple, we hijack the USB controller; hence the title of this article!

We need to go back to the PT research Black Hat slides, but not the same ones I mentioned in my previous articles, not the Black Hat Europe 2017 slides, but the Black Hat Asia 2019 ones about Intel VISA (Visualization of Internal Signals Architecture).

In it, Mark Ermolov and Maxim Goryachy talk about the IOSF, the Intel On-chip System Fabric (I briefly mentioned it in Part 2), and explain that it’s used to interconnect all the IP units of the PCH (USB controller, graphics card, SATA controller, audio controller, Ethernet adapter, etc.) to each other.

I’m fairly certain (though not 100% sure) that the way it works is that everything is connected to the IOSF, and the IOSF decides which IP unit actually appears to be connected to what, based on their access permissions. So when you talk over the PCI bus to a device, the PCI bridge will simply proxy everything to the IOSF Primary by specifying the Dest ID based on the PCI BDF (Bus/Device/Function), so it uses the IOSF bus internally instead of an actual PCI bus. The whole thing is managed with a permission/identification system called SAI which allows the IOSF to filter the requests and redirect them accordingly.

See the following interesting tidbits from PT’s slides (IP in this context means an Intellectual Property unit):

Page 33 of Intel VISA: Through the Rabbit Hole
Page 34 of Intel VISA: Through the Rabbit Hole

That’s for the IOSF Primary, but there’s also the IOSF Sideband. My understanding of it is that everything is connected to the IOSF via the sideband too and it can be used to bypass the entire bus routes by specifying what device you want to communicate with. I’m not entirely sure yet on the exact differences between the IOSF primary and the sideband, but that’s what we used before in order to enable Red JTAG Unlock on the Dfx-Aggregator device and enable DCI on the DCI device.

Page 35 of Intel VISA: Through the Rabbit Hole

Knowing all that, we can start to poke at the USB device using its Sideband channel.

Woah there, slow down!

Ok! Maybe we can’t start right away, because we have no idea how to do any of that just yet. I don’t want to explain how I got to that point, however, because it’s long and boring. The summary is that I dumped all the addresses from the MMIO ranges that the BUP module had access to, then tried to find some patterns there. I did find a table that contained addresses, and after a while, with a lot of trial and error and some poking at the BUP code, I figured out that at address 0xF00A9000 was the “ATT Bridge” that PT mentioned, which has tables for mapping local addresses to a sideband channel. I have also figured out (with help from others, as always) that the format of those channel configuration entries is:

Offset  Size  Bits    Description
0x00    4             MMIO address used to access the SB channel
0x04    4             Size of the SB window
0x08    4             Control flags (bit 0 = enabled, bit 1 = locked)
0x0C    12            Unused as far as I can see
0x18    4             Sideband channel
              0-7     Sideband Port ID (0x84 = Dfx-Agg device)
              8-15    Read opcode (6 = Read Private Configuration Space)
              16-23   Write opcode (7 = Write Private Configuration Space)
              24-27   BAR
              28      Posted
              29      Not sure
0x1C    4             Sideband access type
              0-2     Function ID
              3-7     Device ID
              8-15    Root space (0 = Main CPU, 1 = CSME)

I’ve already explained the format of the ‘Sideband channel’ in my Part 2 article, that was where the 0x70684 magic value was which was used for accessing the Dfx-Agg device (port 0x84, read opcode 6, write opcode 7).
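To make the bit layout from the table concrete, here is a minimal sketch that packs the ‘Sideband channel’ and ‘Sideband access type’ values from their fields. The function names are mine and purely illustrative (they are not taken from ipclib), and bit 29 is left out since its meaning is unknown.

```python
def encode_sideband_channel(port_id, read_opcode, write_opcode, bar=0, posted=False):
    """Pack the fields of the 'Sideband channel' dword (offset 0x18 in the table)."""
    fields = (("port_id", port_id, 8), ("read_opcode", read_opcode, 8),
              ("write_opcode", write_opcode, 8), ("bar", bar, 4))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (port_id                        # bits 0-7
            | (read_opcode << 8)           # bits 8-15
            | (write_opcode << 16)         # bits 16-23
            | (bar << 24)                  # bits 24-27
            | (int(bool(posted)) << 28))   # bit 28


def encode_access_type(device, function, root_space=0):
    """Pack the 'Sideband access type' dword (offset 0x1C in the table)."""
    if not (0 <= function < 8 and 0 <= device < 32 and 0 <= root_space < 256):
        raise ValueError("field out of range")
    return function | (device << 3) | (root_space << 8)


# The Dfx-Agg channel from Part 2: port 0x84, read opcode 6, write opcode 7.
dfx_agg = encode_sideband_channel(0x84, read_opcode=6, write_opcode=7)
print(hex(dfx_agg))  # 0x70684
```

Packing the fields this way reproduces the 0x70684 magic value, which is a quick sanity check that the table is being read correctly.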

We are also going to need to know what the Port ID for the USB controller (called the XHCI controller) is. For Skylake, it’s easy, it was in the datasheet I mentioned before, but for Apollolake, I had to brute force all the ports and try to figure out what their content was for. Eventually, I had port 0xa2 for APL, and 0xe6 for SKL.

Another thing to note, the main CPU has to be booted for this to work, otherwise the XHCI controller is not powered and doesn’t respond to any requests. I had also spent a lot of time understanding and poking at the memory segments/paging to understand how the CSME memory space is organized. I won’t bore you with that either, but I wrote scripts to dump the segment list and paging information, you will find all of that in my ipclib.

IPCLib?

That’s a little Python library I wrote which uses OpenIPC’s ipccli Python module. Any time I needed to do something more than once, I wrote it there and used it. There are all sorts of useful functions, but zero documentation. You can call ‘execute_asm’ and give it a list of instructions and it will run them, you can use ‘print_registers’ to have it dump all of your registers, and you can use ‘step’, ‘stepOver’ or ‘goUntil’ to do step by step debugging of the CPU (because the native stepping functions in ipccli had some quirks that made them not work as expected). You should also be using the ‘asm’ function to disassemble code, because if you use OpenIPC’s asm function directly, it will corrupt the registers and things will break when you step back into the code, so I had to back them up before disassembling and restore them before continuing. It also has all the bruteforcing methods I wrote as well as the XHCI Controller implementation, but more on that later.
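The "back up the registers, do the risky thing, restore them" pattern described above can be sketched as a context manager. This is only an illustration of the idea: `FakeThread`, `read_reg` and `write_reg` are hypothetical stand-ins so the example runs on its own, and they are neither the ipccli nor the ipclib API.

```python
from contextlib import contextmanager

REGISTERS = ("eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp", "eip", "eflags")


class FakeThread:
    """Hypothetical stand-in for a debugger thread object (not the ipccli API)."""

    def __init__(self):
        self.regs = {name: 0 for name in REGISTERS}

    def read_reg(self, name):
        return self.regs[name]

    def write_reg(self, name, value):
        self.regs[name] = value


@contextmanager
def preserved_registers(thread, names=REGISTERS):
    """Snapshot the registers on entry and write them back on exit."""
    saved = {name: thread.read_reg(name) for name in names}
    try:
        yield saved
    finally:
        # Restore even if the body raised, so the target state stays consistent.
        for name, value in saved.items():
            thread.write_reg(name, value)


thread = FakeThread()
thread.write_reg("eax", 0x1234)
with preserved_registers(thread):
    thread.write_reg("eax", 0xDEADBEEF)  # simulate an operation that clobbers eax
print(hex(thread.read_reg("eax")))  # 0x1234
```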

I am releasing the code to ipclib under the GPL license and it’s available on my github. Releasing this is a big part of today’s article as well.

The best thing to do with it though is to actually read the various functions available, they can be self explanatory (like ‘print_registers‘ or ‘escalate_to_ring0‘, or ‘malloc‘ for example), or can be more obscure. Some of the functions won’t make any sense (like ‘v3_resume‘ for my 3rd attempt at resuming ME execution after the exploit runs) because I used them just to test something at some point and they’ll be completely useless to everyone but I’m not bothering on cleaning up the code or documenting it (lack of time, remember?)… I guess that just makes things more fun for reverse engineers 🙂

Also note that while I’ve explained above how to access the IOSF Sideband mapping in the ATT gasket, there’s also a DRAM mapping region which I won’t bother to explain but in ipclib, you will see it be used, so we can map any DRAM region to access it from the ME, which is pretty cool.

The XHCI Controller

Back to the meat of the subject, the XHCI Controller. Can I hijack it? It turns out the answer is yes… and no… It depends on a few things.

On Apollolake, it’s easy, I can use the sideband channel to read from the USB controller, and write to its registers. I had to read the XHCI specification and write my own XHCI controller driver in python, basing it off the coreboot and seabios implementations. I can do things such as :

  • Reset/Initialize the USB controller
  • Enumerate the devices
  • Send commands to and receive events from the USB controller
  • Assign a device ID to the USB devices

I didn’t get further than that due to lack of time, but it’s a simple matter of finishing my XHCI Controller (basically rewriting the coreboot implementation in python), then I would have been able to even transfer files from/to the USB flash drives I had connected to the Gigabyte Brix for testing. It’s not much, but at least I was able to see which ones were USB 2 or USB 3 and communicate with them.

On Skylake unfortunately, I was unable to access the XHCI controller via the sideband. You can see my discussion about this on https://github.com/ptresearch/IntelTXE-PoC/issues/12

As you could read from the issue linked to above, I originally thought that it was the EPMASK that was preventing access to the XHCI controller by entirely disabling the endpoint :

EPMASK7 definition

Eventually, I figured that it was the CP, RAC and WAC access policies that were preventing the ME from communicating with the XHCI controller via the Sideband channel.

Description of the Security CP register on Apollolake

The CP, RAC and WAC are registers that define which SAI agents (which other IP units connected to the IOSF fabric) can access the registers of the sideband. CP is “Control Policy” and defines which SAI agents can read and write the CP, RAC and WAC registers themselves. RAC is “Read Access Control” and defines which SAI agents can read the registers of the unit, while WAC is “Write Access Control” and defines which agents can write to its registers.
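Here is a toy model of how I picture these three policies working together. It is a sketch under a loud assumption: that each policy is a bitmask with one bit per SAI agent. The SAI index values below are made up for illustration, as is the example policy, which is only chosen to match the behaviour I describe for Skylake.

```python
from dataclasses import dataclass


@dataclass
class AccessPolicy:
    """Toy model: each policy is ASSUMED to be a bitmask with one bit per SAI index."""
    cp: int   # who may read/write the CP, RAC and WAC registers themselves
    rac: int  # who may read the unit's registers
    wac: int  # who may write the unit's registers

    @staticmethod
    def _allowed(mask, sai_index):
        return bool((mask >> sai_index) & 1)

    def can_read(self, sai_index):
        return self._allowed(self.rac, sai_index)

    def can_write(self, sai_index):
        return self._allowed(self.wac, sai_index)

    def can_change_policy(self, sai_index):
        return self._allowed(self.cp, sai_index)


# Hypothetical SAI indices, invented for this example only.
POLICY_OWNER_SAI, CSME_SAI, TAP2IOSF_SAI = 0, 3, 5

# Made-up policy mirroring the observed Skylake behaviour: the CSME can write
# but not read, while Tap2IOSF can read and write but has no CP access.
skl_xhci = AccessPolicy(
    cp=1 << POLICY_OWNER_SAI,
    rac=(1 << POLICY_OWNER_SAI) | (1 << TAP2IOSF_SAI),
    wac=(1 << POLICY_OWNER_SAI) | (1 << TAP2IOSF_SAI) | (1 << CSME_SAI),
)

print(skl_xhci.can_read(CSME_SAI), skl_xhci.can_write(CSME_SAI))  # False True
```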

And the reason I know that the access control is based on the CP, RAC and WAC rather than EPMASK is because when I try to read the registers of the controller using the JTAG interface (the Tap2IOSF stateport device), it works, but more than that, when trying to read the private configuration space, I’m able to read most of it, but a small region is simply not responding to my reads, indicating that access is blocked for a specific portion of the registers, and that means those are the CP, RAC and WAC registers. It also means that while the Tap2IOSF device has read/write access, it does not have CP access which is why it can’t read those access control registers.

Since we now know that the CSME doesn’t have CP or RAC access to the XHCI Controller registers, here’s a question, does it have write access? And the answer is YES! It’s funny, while we can’t read any of the registers, we can write to them without problem (which re-confirms the assumption about CP/RAC/WAC). It’s unfortunate though that PT were mistaken when they said the CSME has a full privileged SAI to access all the devices, it’s just that its SAI agent value has broader permissions, but as we can see for Skylake, the USB controller doesn’t let the ME read any of its registers, so we can’t send commands and receive events from it. To be fair though, I think the ME does have full privileged access but I think the ME removes some of its own privileges during boot, so PT were probably right anyway. It’s unfortunately the same situation with the SATA controller too, and it might be affecting other devices as well.

How to keylog, really.

Now that we can talk to the USB controller, how can we create our keylogger? That’s of course, for Apollolake, since Skylake is challenging in a different way.

First of all, when the main CPU sets the BARs and configures the XHCI Controller’s PCI configuration space, the ME can store those values, change them so the ME is actually the one communicating with the controller and it can proxy all the commands and data back and forth between the main CPU and the controller. Basically, a Man-in-the-middle attack on the USB controller, allowing it to both monitor everything that passes through USB (including keyboard and mouse of course) and store it elsewhere (perhaps on the hard drive itself since it should be relatively easy to do the same thing with the SATA controller and control it from the ME… or it could save it in RAM and periodically send it over the network since the ME also has access to the ethernet adapter). It could also be used to emulate keystrokes, which could be fun.

To explain that further: once the OS has booted, it configures the XHCI Controller on the PCI bus so that it can send commands to it through a command ring, by writing the commands to an area of the RAM. When an event happens (data becomes available) from USB, the event is written to the RAM by the XHCI controller so the main CPU can read it through the event ring, and then it can read the data transferred to/from USB through the transfer ring.

See this image (from page 57 of Intel’s xHCI document) which shows the general architecture of the communication with the XHCI controller:

General Architecture of the XHCI Interface

As you can see the Cmd Ring, Event Ring, Transfer Ring and all the Data Buffers are in the Host Memory, and the configuration of where they are is stored in the MMIO Space that is set in the PCI configuration space.

To hijack the USB, the CSME just needs to wait for the OS to boot, then change the configuration in the MMIO registers that the OS has set, in such a way that the OS sending commands does not do anything anymore (since the xHCI controller isn’t looking at that Cmd Ring anymore). Instead, the CSME would be the one reading the RAM, deciding if it wants to forward the command, modify it, save information from it, etc., then writing the same (or modified) command in the actual Cmd Ring that the xHCI controller is listening to. When an event or data is received, the CSME would copy that data over to the Event Ring or Transfer Ring that the OS has set up. And there you go, we have full control over the USB interface with our CSME-in-the-middle attack!
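The forwarding idea can be sketched with a pure in-memory simulation. Nothing here touches hardware: both "rings" are plain bytearrays, the `RingProxy` class is a name I made up, and wrap-around is simplified (a real xHCI ring wraps through a Link TRB, this toy one just wraps at the end of the buffer). The TRB layout (16 bytes, cycle bit in bit 0 and TRB type in bits 10-15 of the last dword) is taken from the xHCI specification.

```python
import struct

TRB_SIZE = 16  # every xHCI TRB is 16 bytes: 8-byte parameter, 4-byte status, 4-byte control


def make_trb(parameter, status, trb_type, cycle):
    control = (trb_type << 10) | (cycle & 1)
    return struct.pack("<QII", parameter, status, control)


def trb_control(trb):
    return struct.unpack_from("<I", trb, 12)[0]


class RingProxy:
    """Copies new TRBs from the ring the OS writes to into the ring the controller reads."""

    def __init__(self, os_ring, hw_ring):
        self.os_ring, self.hw_ring = os_ring, hw_ring
        self.index = 0   # next slot to inspect
        self.cycle = 1   # a TRB is "new" when its cycle bit matches this state
        self.log = []    # TRB types seen, in order

    def poll(self):
        slots = len(self.os_ring) // TRB_SIZE
        forwarded = 0
        while True:
            offset = self.index * TRB_SIZE
            trb = bytes(self.os_ring[offset:offset + TRB_SIZE])
            control = trb_control(trb)
            if (control & 1) != self.cycle:
                return forwarded  # nothing new in this slot
            self.log.append((control >> 10) & 0x3F)
            self.hw_ring[offset:offset + TRB_SIZE] = trb  # forward unmodified
            forwarded += 1
            self.index += 1
            if self.index == slots:  # simplified wrap (no Link TRB)
                self.index = 0
                self.cycle ^= 1


os_ring, hw_ring = bytearray(4 * TRB_SIZE), bytearray(4 * TRB_SIZE)
os_ring[0:16] = make_trb(0, 0, trb_type=9, cycle=1)        # Enable Slot Command
os_ring[16:32] = make_trb(0x1000, 0, trb_type=11, cycle=1)  # Address Device Command
proxy = RingProxy(os_ring, hw_ring)
forwarded = proxy.poll()
print(forwarded, proxy.log)  # 2 [9, 11]
```

A real implementation would also have to ring the doorbell on behalf of the OS and mirror events in the other direction, which this sketch leaves out entirely.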

Here’s a badly drawn (my art skills are amazing!) diagram showing how the Man in the Middle attack works :

CSME in the Middle

An alternative solution would be not to change any of the PCI configuration, but simply look for the command, event and transfer rings of the XHCI controller as set up by the main OS and monitor their contents directly since we can map any DRAM region to access it from the ME. By doing that, we would have less of an impact on the main CPU, with no risk of potentially decreasing the performance (especially when it comes to USB flash drive transfers), but we’d need to be able to receive interrupts from the controller. I’m not so good when it comes to interrupts, so I don’t know how that would work.
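For the passive approach, the work boils down to reading a copy of the ring memory and decoding the TRBs in it. The sketch below decodes the three most common event TRB types from a byte snapshot; the field positions come from the xHCI specification, while `decode_event` and `scan_snapshot` are names I invented and the snapshot itself is fabricated test data, not something read from a machine.

```python
import struct

EVENT_TYPES = {
    32: "Transfer Event",
    33: "Command Completion Event",
    34: "Port Status Change Event",
}


def decode_event(trb):
    """Decode one 16-byte event TRB into a dict of its main fields."""
    parameter, status, control = struct.unpack("<QII", trb)
    type_id = (control >> 10) & 0x3F
    event = {
        "type": EVENT_TYPES.get(type_id, "other"),
        "cycle": control & 1,
        "completion_code": status >> 24,
    }
    if type_id in (32, 33):
        event["slot_id"] = control >> 24
    if type_id == 32:
        event["endpoint_id"] = (control >> 16) & 0x1F
    if type_id == 34:
        event["port_id"] = (parameter >> 24) & 0xFF
    return event


def scan_snapshot(ring_bytes, cycle=1):
    """Yield events from a ring snapshot until the cycle bit stops matching."""
    for offset in range(0, len(ring_bytes) - TRB_LAST, 16):
        event = decode_event(ring_bytes[offset:offset + 16])
        if event["cycle"] != cycle:
            break
        yield event


TRB_LAST = 15  # a TRB starting at `offset` needs bytes offset..offset+15

# Fabricated snapshot: one Port Status Change Event for port 3, then an empty slot.
sample = struct.pack("<QII", 3 << 24, 1 << 24, (34 << 10) | 1) + bytes(16)
events = list(scan_snapshot(sample))
print(events[0]["type"], events[0]["port_id"])  # Port Status Change Event 3
```

Polling a snapshot like this sidesteps the interrupt question, at the cost of possibly missing events if the ring wraps between two polls.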

How about Skylake ?

Skylake (and Kabylake as well actually) is different because we can’t read the registers from the XHCI controller, so we can’t save the actual RAM addresses that the Host is sending commands to, but since we can write to the MMIO registers, we can still do a lot of mischief. I actually had a little fun resetting the controller while the OS is booted and it caused all USB devices to stop working. I also did that on the SATA controller and all hard drives stopped responding… a lot of fun, and more of a malicious virus behavior than my original keylogger plan.

Note that I never got around to actually writing the code as a binary to run on the ME, which was ultimately my plan, which is why I’m mentioning the issue with SKL/KBL. But if I were to do the keylogger entirely using IPC commands over JTAG, then it wouldn’t be a problem since we can read from the XHCI controller via Tap2IOSF. The code I wrote in ipclib for it was mostly just a playground to test and see how it could be done from within the ME itself and I’ve confirmed that anything I’ve done so far would work exactly the same if it were executed from the ME firmware itself.

But all hope isn’t lost for getting it to work on Skylake, there are a few options still available :

  • We could try to find who sets the initial permissions on the USB controller as it boots (is it hardcoded in the silicon? is the ME itself setting it when it turns the power on? Or perhaps it’s configured in the PMC (Power Management Controller) or some other device) and it can be configured to give full access to the CSME’s SAI.
  • Instead of using the sideband channel, we could use the IOSF Primary bus directly; there’s a possibility that we could have full access to it that way.
  • We can perhaps read the registers by using the Tap2IOSF device if the ME is able to communicate with it directly and ask it to read the IOSF for us, emulating the commands sent via JTAG, which do actually work for reading even on SKL and KBL.
  • We can maybe scan the RAM to find the command/event/transfer ring buffers that the OS allocated and use them without having to read their values from the PCI configuration space.
  • Maybe map those registers to another address by using another SAI, perhaps a subsystem of the ME or another device could have access to it (isn’t DMA doable from one PCI device to another for example?).

I’m sure that other ideas can be found and this problem can be circumvented. Unfortunately, as I said before, I don’t have time of my own to pursue this further, so perhaps someone else will.

Update from five months later: Well, I’m glad I delayed posting this since I actually see a different solution now, thanks to Intel’s own CVE-2019-0090 whitepaper available here: https://www.intel.com/content/dam/www/public/us/en/security-advisory/documents/cve-2019-0090-whitepaper.pdf

On page 5, you’ll find this diagram that shows how the CSME sits between the CPU and USB and can communicate with it via the Sideband Fabric.

They also explain how the CSME takes care of loading FW onto the various IPs and it controls the IOMMU unit which checks for the SAI authorization. The CVE is about an exploit where someone could write to the ME’s SRAM before IOMMU is enabled, because SAI is not checked while IOMMU is disabled.

Explanation about CSME, IP, SAI and IOMMU

So, my new idea would be : Can we disable IOMMU? Since the ME controls it and is the one to enable it, then we should be able to disable it, then we would get to ignore SAI and get full access to the xHCI on Skylake without having to use a different method than what I have already built.

This also tells me that at some point, the ME does have access to it, then writes the firmware to it, then locks things up, so we could also try to find where it does that and patch that code to prevent it from happening. This is kind of like my alternate solution #1 above, but this new information seems to indicate that it’s the ME that handles it, and in the RBE module as well.

Let’s do this!

Alright, enough rambling. Let’s go ahead and get you all to reproduce what I did with step by step instructions.

Step 1 – Hardware setup

That’s easy, you first need to enable the exploit and get code execution with OpenIPC working for your machine. Either using the IntelTXE PoC code from PT if you’re using Apollolake, or using my instructions from my previous articles if on Skylake or Kabylake.

Step 2 – Software Setup

This should also be easy, you mostly need to have Intel System Studio installed and patched as explained in the IntelTXE PoC repository or my previous articles. Then get my ipclib code and import it into your ipython console with from ipclib import *

Step 3 – USB Listing (Apollolake)

If you’re on Apollolake, we can read/write to all the registers, and the xhci implementation I wrote is in the xhci object. You can poke at it in different ways, but to disrupt USB while the system is booted, in ipython ipc console, simply call :
xhci.reset()

To setup USB, reset ports and set addresses to the USB devices :
xhci.setup()

It should list the ports that are in use and show if they are USB 2 or USB 3 devices, though that’s the extent of it as the implementation was never finished.

Step 3 – USB Reset (Skylake/Kabylake)

If you are on Skylake or Kabylake, we can only write, but not read any of the registers, so using the xhci object’s methods will not work here as most of them try to read the status registers.

We can however do this simple write command :
xhci.bar_write32(0x80, 0x2)

which will set the reset bit in the command register. This will reset the USB controller and prevent the OS from accessing devices. This means that your keyboard and mouse on the target machine will stop working immediately after running that command, proving it works.
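To see why the value 0x2 means "reset", here is a small sketch that names the low bits of the xHCI USBCMD register as defined in the xHCI specification. It assumes, as the command above implies, that offset 0x80 is where the operational registers (and therefore USBCMD) start on these controllers; `describe_usbcmd` is just a helper name I picked.

```python
# Low bits of the xHCI USBCMD (USB Command) register, per the xHCI specification.
USBCMD_FLAGS = (
    ("RUN_STOP", 1 << 0),  # R/S: controller runs while set
    ("HCRST", 1 << 1),     # Host Controller Reset
    ("INTE", 1 << 2),      # Interrupter Enable
    ("HSEE", 1 << 3),      # Host System Error Enable
)


def describe_usbcmd(value):
    """Return the names of the known flags that are set in a USBCMD value."""
    return [name for name, bit in USBCMD_FLAGS if value & bit]


print(describe_usbcmd(0x2))  # ['HCRST']
```

Note that writing 0x2 also clears the Run/Stop bit, so the controller is both stopped and reset by that single write.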

A little extra mischief

As I’ve poked at all sorts of PCI devices during my testing, I also did manage to disrupt the SATA controller when I wanted to. It’s the same principle as with USB though there is no SATA object to use.

The following commands should disable the SATA controller on Apollolake :

addr, _ = setup_sideband_channel(0x150100b5, rs=0, fid=0x90)
t.mem(phys(addr + 4), 4, 1)

And here is the command to use on Skylake and Kabylake, using the correct Sideband channel :

addr, _ = setup_sideband_channel(0x150100d9, rs=0, fid=0xb8)
t.mem(phys(addr + 4), 4, 1)

You can test it by booting the machine into a Linux system and running dd if=/dev/sda bs=256 count=1 | hexdump -C to verify that the SATA device can be read (assuming /dev/sda is a SATA drive), then running it again after sending that command through JTAG to confirm that you get an Input/Output error.

If you’re looking for an explanation, then here’s what the command does :

In the case of Kabylake, fid=0xb8 is the function ID field, where bits [7:3] are the PCI device number and bits [2:0] are the function number, so 0xb8 means PCI device 23.0, which is the SATA device. The root space is 0, meaning the main CPU root space, and the sideband channel 0x150100d9 means:
bit 28 : posted (for writes)
bits 27-24: bar 5 (AHCI)
bits 23-16: write opcode 1 : WriteBar
bits 15-8: read opcode 0 : ReadBar
bits 7-0: Port ID 0xD9

Address + 4 is the AHCI Global HBA Control (GHC) register, and setting its bit 0 to 1 means reset the controller.
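If you want to double check the breakdown above, here is a small decoder sketch for the two values passed to setup_sideband_channel. The helper names are mine (not part of ipclib), and the AHCI constants reflect the GHC register at offset 0x04 with its HBA Reset bit at bit 0.

```python
def decode_sideband_channel(value):
    """Split a 'Sideband channel' dword into its fields."""
    return {
        "port_id": value & 0xFF,
        "read_opcode": (value >> 8) & 0xFF,
        "write_opcode": (value >> 16) & 0xFF,
        "bar": (value >> 24) & 0xF,
        "posted": bool((value >> 28) & 1),
    }


def split_fid(fid):
    """Split the fid byte into (PCI device number, function number)."""
    return fid >> 3, fid & 0x7


AHCI_GHC_OFFSET = 0x04  # Global HBA Control register
AHCI_GHC_HR = 1 << 0    # HBA Reset bit

print(decode_sideband_channel(0x150100D9))
print(split_fid(0xB8))  # (23, 0) on Skylake/Kabylake
print(split_fid(0x90))  # (18, 0) on Apollolake
```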

As “simple” as that.

Conclusion

That’s it! That was the last of my series of articles. I’m sure there’s a lot more stuff I could write about on how I got to that point, but it’s all very tedious and boring (the proof that it’s boring is that I don’t remember most of it). You can probably figure some of the stuff out from the ipclib code that is in there.

I unfortunately never got around to doing my ‘antivirus-immune keylogger’, but it’s not impossible. It turned out that it needed a lot more work than I originally thought it would, and I blame it all on Intel for not providing documentation. I’m sure it could have been done in one week if I didn’t have to spend a year trying to understand how any of the underlying architecture works.

I’m sure there are a lot of other useful applications that can be done by running your own code in the ME. I’ve only done it using IPC scripts, but it should be possible to run binary executables (I know that PT did in their original exploit).

This whole experience was fun. More complex than I ever thought it would be, but still, a lot of fun. It also shows how dangerous the ME can be if someone hijacks it to run their own malicious firmware on it. The possibilities of having ME-viruses are huge, and it shows the importance of having good security and locking access to your flash chips if possible.

With this last article, I close the saga of the Intel ME reverse engineering and security research I’ve done in the last couple of years. I might still poke at it at some point, but I’m concentrating on other stuff for now, with my new business building a hosting service for D&D games being the focus of all my work. I hope the ipclib release I’m doing as well as these articles will be interesting to others and you’ll find something useful in it and perhaps it will inspire others to poke at things that weren’t really meant to be poked at.

It was fun, thanks for reading, and considering everything that’s happening in the world, stay safe!

Intel FSP reverse engineering: finding the real entry point!

DISCLAIMER: This post was originally posted on Puri.sm‘s blog but then taken down after they received a letter from Intel requesting the article be removed as it contained information about reverse engineering the FSP which was against their License. I am putting this article back up again on my personal blog for the following reasons :

  • Their current license only prohibits the reverse engineering with regards to ‘Redistribution’, and since I am not working for Purism anymore, I am not involved with redistribution of any of their binaries and therefore it does not affect me.
  • The files I had originally worked on were cloned from this specific commit on their repository which had a BSD style license which did not prevent any reverse engineering (but I do know that a more restrictive license was added in a subsequent commit 30 minutes later, but it wouldn’t change the fact that the FSP in that specific branch is using the BSD license and the ‘license change’ wouldn’t be considered retroactive).
  • Since I live in Canada, Reverse Engineering is allowed when it comes to security or interoperability, which is the case here. I know that this is more of a license issue than a copyright violation issue (where Canadian law would apply), but I don’t see why someone could revoke my right to do security research by invoking a license breach.
  • The reverse engineering and security research that has been done in recent years by other companies or individuals (most notably PT Research or Peter Bosch) has far surpassed what I have written in this article, and this article is a lot more educational and along the lines of my previous Introduction to Reverse Engineering article than one about secrets hidden in the assembly code. I think that whatever damage Intel might think it does is extremely minimal compared to other existing projects.
  • The article is and has always been available on the web archive, so it wasn’t ever really taken down from the internet, whether to link to my blog or to the web archive when people mention this article would make no actual difference. I think the important part is that it is not hosted on purism’s website since they are a laptop manufacturer and therefore a distributor of the FSP within their products.

For the above listed reasons, among others, I am releasing this article to the public again. I have also gone through it to remove a particularly long code snippet which was not required for understanding and made sure that any other screenshots I’ve had would fall well within the fair use clause.


After attending 34C3 in Leipzig at the end of December 2017, where we (Zlatan and I) met with some of you and had a lot of fun, I took some time off to travel Europe and fell victim to the horrible Influenza virus that so many people caught this year. After a couple more weeks of bed rest, I continued my saga of trying to find the real entry point of the Intel FSP-S module.

WARNING: This post will be very technical, and even if you are a technical person, you will probably need to have read my previous “Primer guide” blog post in order to be able to follow most of it. If however, you’re not a technical person, don’t worry, here’s the non-technical executive summary:

  • I made some good progress in reverse engineering both the FSP-S and FSP-M and I’m very happy with it so far
  • Unfortunately, all the code I’ve seen so far has been about setting up the FSP itself, so I haven’t actually been able to start reverse engineering the actual Silicon initialization code.
  • This blog post is about finding the “real entry point”, the real silicon initialization code and I’ve been jumping through a lot of hoops in how the FSP initializes itself in an attempt to find where it actually does start the initialization code and I believe I’m very close to finding it.
  • Progress is good and still ongoing, and the task will be done at some point, so stay patient as you have been so far.
  • This post is mostly about going step by step over the process of reverse engineering that I’ve done so far. It helps you follow along on the progress, helps some of you learn how it’s done and what happens behind the scenes.

Diving back into the depths

If you remember, in my primer to reverse engineering the FSP, I said the following :

“I’ve finished reverse engineering the FSP-S entry code—from the entry point (FspSiliconInit) all the way to the end of the function and all the subfunctions that it calls. This only represents 9 functions however, and about 115 lines of C code; I haven’t yet fully figured out where exactly it’s going in order to execute the rest of the code. What happens is that the last function it calls (it actually jumps into it) grabs a variable from some area in memory, and within that variable, it will copy a value into the ESP, thus replacing our stack pointer, and then it does a ‘RETN’… which means that it’s not actually returning to the function that called it (coreboot), it’s returning… somewhere, depending on what the new stack contains, but I don’t know where (or how) this new stack is created, so I need to track it down in order to find what the return address is, find where the RETN is returning us into, so I can unlock plenty of new functions and continue reverse engineering this.”

Diving Deeper

Today, we will examine what happens in more detail. Get ready for the technical part now, because we’re going to dive right back in, and we’re going to go pretty deep as I walk you through the steps I took to reverse engineer that portion of the code to figure out what happens. I’ll go pretty fast over things like “look at this ASM function, this is what it does” because you don’t need the details; I’ll mostly explain the weird/unusual/non-straightforward things.

First, a little preface: there are two FSP files, the FSP-M and FSP-S. The FSP-M contains the functions for the memory initialization and the FSP-S contains the functions for the silicon initialization. Coreboot will run the MemoryInit from FSP-M during its romstage, then once the RAM is initialized, it will start its ramstage in which it will run the SiliconInit function from the FSP-S file.

The FSP-S file is loaded into memory by coreboot, then the address of the ‘SiliconInit‘ function is retrieved from the FSP-S file header and coreboot calls that function. That function is pretty simple, it just calls the ‘fsp_init_entry‘ function (that’s how I called it). Actually, all of the FSP entry point functions will call this same fsp_init_entry() but will set %eax to a different value each time, to represent which FSP entry point function was called. See for yourselves:

Note that in the FSP-S file, the ‘jmp fsp_memory_init‘ (in the lower-right corner) is replaced with ‘jmp infinite_loop‘ instead. This screenshot was actually taken from the FSP-M file, which is why it shows “jmp fsp_memory_init“.

So, each of the entry points in the various FSP images (on the left, I showed entry points for both FSP-S and FSP-M files) will call fsp_init_entry, which will call validate_parameters(), and then if the %eax register is 3 (you’ll notice that’s the value set by memory_init_entry), it will call fsp_memory_init, otherwise it will jump into switch_stack_and_run (after calling get_fsp_info_header, you’ll see why below). All that the switch_stack_and_run() function does is replace the stack pointer (first storing all of the registers onto the old stack, then loading all the register values from the ones saved on the new stack), then finally return. See for yourselves:

It might look complicated, but it’s not that much:

  1. it does a bunch of ‘push‘, the first is to push %eax, which is the return value from the previous “call get_fsp_info_header” call in the fsp_init_entry function above,
  2. then it calls ‘pushf‘ which pushes the EFLAGS register,
  3. then “cli” will disable interrupts (this is to avoid having some interrupt triggered and change things from under our noses),
  4. then ‘pusha‘ which will push all of the registers into the stack,
  5. then we subtract 8 bytes from the stack, basically allocating 8 bytes,
  6. then it executes ‘sidt’ (“Store Interrupt Descriptor Table”), which writes the IDT register into those 8 bytes,
  7. Finally it calls ‘save_fspd_stack‘ and it gives it the %esp (stack pointer) as argument. That function will store that argument into offset 8 of the address stored in 0xFED00148… but since I already reversed that, let’s make it easier for you and just say that it stored the argument in the StackPointer field (offset 0x08) of the FSPD data structure,
  8. then return in %eax the previous value that was stored there.
  9. switch_stack_and_run will store the returned address into %esp, effectively replacing the entire stack,
  10. then it will proceed to pop back all the registers, flags, IDT back into their respective places,
  11. then return which will make us return not into the fsp_init_entry function (nor to coreboot since fsp_init_entry actually did a ‘jmp‘, not a ‘call‘), but rather it returns to whatever was the return address of the calling function from the new stack pointer.
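The exchange in steps 7 to 9 boils down to "park my stack pointer, get the previously parked one back". Here is a minimal sketch of that idea, assuming a partial `fspd_model` structure in which only the StackPointer field at offset 0x08 comes from the text above; the two placeholder fields and the structure name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, partial view of the FSPD structure: only the StackPointer
 * field at offset 0x08 comes from the article. The first two fields are
 * placeholders for bytes that are not described here. */
typedef struct {
    uint32_t unknown_00;
    uint32_t unknown_04;
    uint32_t StackPointer;   /* offset 0x08: the stack that is currently parked */
} fspd_model;

/* Park new_sp in the FSPD and hand back whatever stack was parked before. */
uint32_t save_fspd_stack(fspd_model *fspd, uint32_t new_sp)
{
    assert(fspd != NULL);
    uint32_t previous = fspd->StackPointer;
    fspd->StackPointer = new_sp;
    return previous;
}
```

Calling it twice in a row with the value it just returned swaps the two stacks back, which is the ping-pong between coreboot's stack and the FSP's stack described above.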

This is what I explained in my previous blog post (which I quoted at the beginning of this post).

To make things easier to visualize for you, here’s a description of the stack contents (as an IDA structure):

In the picture above, you’ll notice that the top of the stack contains the last thing that was pushed onto it. ‘dd’ means ‘data double word’ (4 bytes) and ‘dw’ means ‘data word’ (2 bytes), so the ‘idt_’ values at the top of the stack take up 8 bytes (2 + 4 + 2): as the ‘sidt’ instruction describes, the IDT register is made up of 6 bytes, the limit (2 bytes) and the base address (4 bytes), and the last 2 bytes are what is left of the 8 bytes we allocated. You may also notice the ‘first_argument_on_stack’: that’s because silicon_init was called with an argument (the UPD configuration structure), which was on the stack initially and is still there when the stack exchange occurs.
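For readers without IDA at hand, the same layout can be sketched as a packed C structure. The offsets of the saved %eax (0x24), the info header (0x2C) and the return address (0x30) are the ones used elsewhere in this article; the field names other than `first_argument_on_stack` are my own guesses, and the order of the general registers simply follows what ‘pusha’ does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
typedef struct {
    uint16_t idt_limit;               /* 0x00: written by sidt */
    uint32_t idt_base;                /* 0x02: written by sidt */
    uint16_t idt_unused;              /* 0x06: rest of the 8 allocated bytes */
    uint32_t edi, esi, ebp, esp;      /* 0x08..0x17: pusha, lowest addresses */
    uint32_t ebx, edx, ecx, eax;      /* 0x18..0x27: pusha, eax pushed first */
    uint32_t eflags;                  /* 0x28: pushf */
    uint32_t info_header;             /* 0x2C: the initial push of %eax */
    uint32_t return_address;          /* 0x30 */
    uint32_t first_argument_on_stack; /* 0x34: UPD configuration pointer */
} stack_contents;
#pragma pack(pop)

/* Returns 1 when the structure has the offsets quoted in the article. */
int stack_contents_layout_ok(void)
{
    return sizeof(stack_contents) == 0x38
        && offsetof(stack_contents, eax) == 0x24
        && offsetof(stack_contents, info_header) == 0x2C
        && offsetof(stack_contents, return_address) == 0x30;
}
```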

If you want to see the C code equivalent that I wrote when reverse engineering these functions, head over to the new git repository I created for this project. This code is common to both FSP-S and FSP-M and so it’s available in the fsp_common.c file.


I’m FED00148 up

So now, the big question! I had no idea what’s in this “0xFED00148” address (the one you saw as ‘ds:FSPD’ above) or who sets its content, or what it contains. I eventually figured out it’s the “FSP DATA” structure and I know what some of its fields are (such as the Stored StackPointer at offset 8), but at first, I had no idea, so here’s what I did: I dumped the content of the 0xFED00148 address from coreboot prior to calling SiliconInit, that gave me the address of the FSPD structure and at offset 8, I found the new stack pointer that the FSP-S will use, and from there, I manually popped the values until I found the new return address.

Thanks to my previous StackContents structure, we already know that the return address is at offset 0x30 in the saved stack, so in the above coreboot console output, we see the return address value is 0xffcd7681 (what you see as “81 76 cd ff” above, because x86 stores data in Little-Endian, that means the bytes are read right to left), and that doesn’t match anything in the FSP-S since we can see that the silicon_init function is at 0x6f9091da and address 0xffcd7681 is way beyond the boundaries of the FSP-S file. However, I thought of also printing the load address of the FSP-M file when MemoryInit was being called and the result was: 0xffc82000. That makes it a lot more likely that the return lands in a function of the FSP-M file instead, more specifically 349,825 bytes inside the FSP-M file (0xffcd7681 – 0xffc82000 = 0x55681 = 349825).
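The two bits of arithmetic in that paragraph (reading the dumped bytes as a little-endian value, then subtracting the load address) can be double-checked with a few lines of C. The helper names are mine; the numbers are the ones from the console output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian value from four dumped bytes. */
uint32_t read_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Offset of an address inside an image loaded at image_base. */
uint32_t offset_in_image(uint32_t address, uint32_t image_base)
{
    assert(address >= image_base);
    return address - image_base;
}
```

Feeding it the bytes 81 76 cd ff gives 0xffcd7681, and subtracting 0xffc82000 gives 0x55681.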

This also makes more sense because since we just loaded the FSP-S into RAM, and we haven’t called silicon_init yet, that means this FSPD data structure at 0xFED00148 must have been set up by something else, and since coreboot doesn’t know anything about it, it’s obvious that the FSP-M is the one that actually creates and initializes that FSPD data structure. The only ‘safe’ return value that FSP-M knows has to be a function within itself since it doesn’t know yet where FSP-S is loaded into memory.

Jumping through our first hoop

If I go to that return address in IDA, I find an ‘uncharted territory’, meaning that IDA did not think this contained code because no function called into this place, but by pressing ‘c’, I transform it into code, then I go back up and do it again and convert another portion of data into code until I found the “function signature” of most functions (called the function prologue which amounts to “push ebp; mov ebp, esp“) telling me it’s the start of the function, then I pressed the ‘p’ key to tell IDA to transform this into an actual function and success, I got a function disassembled by IDA which contains our return value. Since the FSP-M is supposed to be loaded at 0xFFF6E000, with the 0x55681 offset, that means that we return into address 0xFFFC3681 and I made a label there and called it “RETURN_FROM_ESP” as you can see below, and the interesting thing is that the assembly line right above it is a “call switch_stack_and_run_2” which is actually another function that contains the exact same code as the ‘switch_stack_and_run‘ we saw before (it happens often that functions are duplicated in the code).
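Moving a runtime address into the address space IDA uses is the same subtraction followed by an addition. A small sketch, with a helper name of my own choosing:

```c
#include <assert.h>
#include <stdint.h>

/* Translate an address observed at runtime into the address the same byte
 * has when the image sits at its preferred base (the one IDA uses). */
uint32_t rebase_address(uint32_t runtime_address, uint32_t runtime_base,
                        uint32_t preferred_base)
{
    assert(runtime_address >= runtime_base);
    return preferred_base + (runtime_address - runtime_base);
}
```

With the values above, `rebase_address(0xffcd7681, 0xffc82000, 0xFFF6E000)` lands on 0xFFFC3681.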

This makes sense because this means that this is the last function of the FSP-M. After the Memory Initialization is done, it calls switch_stack_and_run, which causes it to store its current state (registers, stack, return address) in the FSPD data structure and then return into coreboot. When we call silicon_init and it also calls switch_stack_and_run, it reverts the stack and registers to what they were and the execution continues in this function. It’s pretty weird and convoluted, I know…

So yay, I found where the FSP-S returns into, it’s in this function in FSP-M, now I need to figure out what this does and how it knows where to find the real entry point from FSP-S and how it calls it. So I reverse engineered it (starting at that offset, I don’t care about what happens before) and it was a fairly big/complicated function which translates roughly into the following C code:

// This starts at the middle of the exit function of FSP-M. This is what gets called (returned into)
// when TempRamExit or SiliconInit get called.
EFI_STATUS into_new_stack_retvalue() {
  FSP_DATA *fsp_data = *FSP_DATA_ADDR;
  char last_tsc_byte;
  uint32_t fixed_mtrrs[0xB] = {0x250, 0x258, 0x259, 0x268, 0x269, 0x26A, 0x26B, 0x26C,
        0x26D, 0x26E, 0x26F};

  if (fsp_data->Action == FSP_ACTION_TEMP_RAM_EXIT) {
    fsp_data->PostCode = 0xB000; // TempRamExit POST Code
    last_tsc_byte = 0xF4;
  } else {
    fsp_data->PostCode = 0x9000; // SiliconInit POST Code
    last_tsc_byte = 0xF6;
  }

  store_and_return_tsc(last_tsc_byte);

  if (fsp_data->Action == FSP_ACTION_TEMP_RAM_EXIT) {
    post_code(fsp_data->PostCode | 0x800); // 0xB800 TempRamExit API Entry
    sub_C4362();
    sub_C345F();
    store_and_return_tsc(0xF5);
    fsp_data->StackPointer[0x24] = 0; // Set eax in the old stack
    swap_esp_and_fsp_stack();
    fsp_data->PostCode = 0x9000; // SiliconInit POST Code
    store_and_return_tsc(0xF6);
  }
  post_code(fsp_data->PostCode | 0x800); // 0x9800 SiliconInit API Entry

  int mtrr_index = 0;
  while (rdmsr(fixed_mtrrs[mtrr_index]) == 0) {
    mtrr_index++;
    if (mtrr_index >= 0xB) {
      int mtrrcap = rdmsr(IA32_MTRRCAP); // 0xFE;
      int num_mtrr = (mtrrcap & 0xFF) * 2;

      if (num_mtrr) {
        mtrr_index = 0;
        do {
          if (rdmsr(0x200 + mtrr_index) == 0)
            break;
          mtrr_index++;
          if (mtrr_index >= num_mtrr) {
            sub_C345F();
          }
        } while (mtrr_index < num_mtrr);
      } else {
        sub_C345F();
      }
    }
  }

  info_header = fsp_data->StackPointer[0x2C];
  if (info_header.Signature != 'FSPH')
    info_header = fsp_data->InfoHeaderPtr;

  void *ptr = info_header.ImageBase;
  upper_limit = info_header.ImageBase + info_header.ImageSize - 1;

  while (ptr < upper_limit && ptr[0x28] == '_FVH') {
    uint32_t guid[] = {0x1B5C27FE, 0x4FBCF01C, 0x1B34AEAE, 0x172A992E};

    if (*(uint16_t *)&ptr[0x34] != 0 && compare_guid(ptr+*(uint16_t *)&ptr[0x34], guid) != 0) {
      weird_function(ptr, ptr[0x20]);
    }
    ptr += ptr[0x20];
  }
  return 0;
}

It’s pretty long code but relatively easy to understand. Step by step:

  1. It will check if the action value stored in the FSPD data structure at 0xFED00148 is 4 or 5 (remember the “mov %eax, 5” in silicon_init and “mov %eax, 4” in temp_ram_exit before fsp_init_entry gets called). Since all the registers/stack/etc. get restored, that explains why all the data we need to keep across stack exchanges needs to be stored in this FSPD data structure, and yes, that %eax value from fsp_init_entry gets stored in the FSPD (during validate_parameters).
  2. It then sets the PostCode variable in FSPD to either 0xB000 or 0x9000 (which matches the first nibble of the TempRamExit and SiliconInit POST codes),
  3. It checks if the action is TempRamExit, then it does a post_code(0xB800) and does a bunch of stuff that I didn’t bother to reverse because I’m not interested in that, then it calls again the switch_stack_and_run_2 (which I renamed “swap_esp_and_fsp_stack” in the C code). This means that TempRamExit will exit back into the old saved stack, thus it returns into coreboot, and right after that, if we call back into the FSP, it will continue its process from this spot, expecting it to be a SiliconInit that called it.
  4. It sends the Post code 0x9800 (SiliconInit API Entry),
  5. then it will loop looking for an available MTRR, it will check the MTRRs 0x250, 0x258, 0x259, 0x268, etc.. basically, the first available MTRR from IA32_MTRR_FIX64K_00000 to IA32_MTRR_FIX4K_F8000.
  6. If none are available, then it will look for the number of available MTRR using the IA32_MTRRCAP and loop for them until it finds an available one.
  7. If it can’t find one, it calls a function that I didn’t bother to reverse yet.
  8. It checks the image’s base address and looks for the ‘_FVH’ signature (EFI Firmware Volume Header) and the GUID of the FSP-S file
  9. Finally, it then calls a “weird function”.
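The last two steps are a walk over firmware volumes. The sketch below runs the same loop on a byte buffer instead of live memory. The three offsets are the ones used in the C code above; as far as I can tell they line up with FvLength, Signature and ExtHeaderOffset in the PI specification's EFI_FIRMWARE_VOLUME_HEADER, but treat that mapping, the constant names and the `count_matching_volumes` helper as my assumptions. It only counts the matches where the original calls the “weird function”.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    FV_LENGTH_OFFSET     = 0x20, /* volume length (low 32 bits read here) */
    FV_SIGNATURE_OFFSET  = 0x28, /* "_FVH" */
    FV_EXT_HEADER_OFFSET = 0x34, /* 16-bit offset of the extended header */
    FV_MIN_HEADER_SIZE   = 0x38
};

/* Count the volumes whose extended header starts with wanted_guid. */
size_t count_matching_volumes(const uint8_t *image, size_t image_size,
                              const uint8_t wanted_guid[16])
{
    size_t pos = 0, matches = 0;
    assert(image != NULL && wanted_guid != NULL);
    while (pos < image_size && image_size - pos >= FV_MIN_HEADER_SIZE) {
        const uint8_t *fv = image + pos;
        if (memcmp(fv + FV_SIGNATURE_OFFSET, "_FVH", 4) != 0)
            break;
        uint32_t fv_length = (uint32_t)fv[FV_LENGTH_OFFSET]
                           | ((uint32_t)fv[FV_LENGTH_OFFSET + 1] << 8)
                           | ((uint32_t)fv[FV_LENGTH_OFFSET + 2] << 16)
                           | ((uint32_t)fv[FV_LENGTH_OFFSET + 3] << 24);
        uint32_t ext = (uint32_t)fv[FV_EXT_HEADER_OFFSET]
                     | ((uint32_t)fv[FV_EXT_HEADER_OFFSET + 1] << 8);
        /* Refuse lengths that would loop forever or run past the image. */
        if (fv_length < FV_MIN_HEADER_SIZE || fv_length > image_size - pos)
            break;
        if (ext != 0 && ext + 16 <= fv_length &&
            memcmp(fv + ext, wanted_guid, 16) == 0)
            matches++;
        pos += fv_length;
    }
    return matches;
}
```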

What is this weirdness you speak of?

The ‘weird_function’ itself isn’t so weird: it does a bunch of rather simple stuff, but along the way it calls a couple of small and actually weird functions, which makes the entire function impossible to understand. What are these small weird functions? Let’s start with the code itself, and we’ll let it speak for itself:

For those of you who paid attention, this function is calling into an offset of a register (%edx+0x18). So far, that’s not too bad, we often see that (function pointers in a structure are common), the problem is… “Where does this %edx register come from? Oh, it’s the content of the %eax register (the line above). Where does %eax come from? It comes from the content of the [%eax-4] pointer… and where does this %eax come from? Well it comes from var_A, which itself is not modified anywhere in the code…” However, if we look at the code in its entirety, we see that there is a ‘sidt’ instruction there, which stores the IDT (Interrupt Descriptor Table) register into the location pointed to by %eax, which itself comes from var_4, which itself contains the value of %eax, which itself is the address of var_C.

So… to simplify, the IDT register is stored in var_C, then %eax is taken from var_A (2 bytes into var_C, right after the 2-byte limit). This means that at this point %eax contains the base address of the IDT, then the function subtracts 4 from that address and grabs the pointer stored there… then it takes the value pointed to by that pointer and adds 0x18 to it and that’s where your function pointer lives. Maybe the function with comments will make it a little less confusing:

So the really weird thing here is that our “function pointer stored in a structure” actually comes from a pointer to a structure that is stored 4 bytes before the Interrupt descriptor table for some magical (or stupid?) reason.
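To make the chain of dereferences concrete, here is a small model of it that runs on a flat byte array standing in for 32-bit memory. Everything about it (the helper names, the fake addresses in the usage) is made up for illustration; only the sequence "IDT base, minus 4, dereference, dereference, plus 0x18" comes from the disassembly described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value at a 32-bit "address" of a flat image. */
uint32_t mem_read32(const uint8_t *mem, size_t mem_size, uint32_t addr)
{
    assert(mem != NULL && (size_t)addr + 4 <= mem_size);
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) | ((uint32_t)mem[addr + 3] << 24);
}

/* Store a 32-bit little-endian value (only used to build test images). */
void mem_write32(uint8_t *mem, size_t mem_size, uint32_t addr, uint32_t value)
{
    assert(mem != NULL && (size_t)addr + 4 <= mem_size);
    mem[addr]     = (uint8_t)(value & 0xFF);
    mem[addr + 1] = (uint8_t)((value >> 8) & 0xFF);
    mem[addr + 2] = (uint8_t)((value >> 16) & 0xFF);
    mem[addr + 3] = (uint8_t)((value >> 24) & 0xFF);
}

/* Follow the chain: [IDT base - 4] -> pointer -> table -> [table + slot]. */
uint32_t resolve_idt_service(const uint8_t *mem, size_t mem_size,
                             uint32_t idt_base, uint32_t slot_offset)
{
    assert(idt_base >= 4);
    uint32_t ptr_to_table = mem_read32(mem, mem_size, idt_base - 4);
    uint32_t table = mem_read32(mem, mem_size, ptr_to_table);
    return mem_read32(mem, mem_size, table + slot_offset);
}
```

With an IDT at 0x80, a pointer at 0x7C, and a table whose slot 0x18 holds a function address, `resolve_idt_service(mem, size, 0x80, 0x18)` returns that function address.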

Now that I got there, I felt stuck because I had absolutely no idea what that function is, and while I could have used my previous dump of the stack to figure it out (remember, the IDT was also stored on the stack when the stacks get swapped), I would just get some pointer to a function but I needed to actually understand why it used the [IDT-4] and how the FSP DATA was setup, etc. so I decided to temporarily give up on the Silicon Init and actually start reverse engineering the setup part of the MemoryInit function instead.

Starting from scratch

So, I started again from scratch and I reverse engineered the FSP-M setup code. It was very similar to the FSP-S code, the only difference is that if the action == 3 (MemoryInit), instead of calling the ‘infinite_loop‘ function, it was calling the fsp_memory_init function.

The fsp_memory_init function is a rather simple function that does one small thing: it creates a new stack! Ha, that explains so much. It turns out the MemoryInit function’s UPD configuration has a FspmArchUpd.StackBase and FspmArchUpd.StackSize configuration options that define the address and size of the stack to setup. So the entire FSP-M will run into its own stack and so it leaves the coreboot/BIOS’s stack intact. The FSP-S also needs to run from this stack, which is why when it swaps into it, we end up in FSP-M, because that’s where it last was when it swapped out of it. Great, what next?

The next thing the fsp_memory_init does is to call a function I named setup_fspd_and_run_entrypoint. What that function does is to setup the FSPD structure (the one at 0xFED00148), and I thought that by understanding how that gets setup, I would understand all I needed, but that’s not the case, it just does a bunch of complicated things, such as:

  1. get the ExtendedFeature information of the CPU using the cpuid instruction, but then it ignores the result,
  2. it then loops a bunch of times calling the rdrand instruction to generate random data until it actually generates data (so, I assume it initializes the random number generator by poking it until it gives it something),
  3. then it initializes the FPU,
  4. sets some unused variable on the stack to 0,
  5. then creates an IDT entry using the values 0x8FFE4 and 0xFFFF8E00 (which means an IDT entry pointing to offset 0xFFFFFFE4 (0x100000000 – 0x1C) with GDT selector 8 and type attributes 0x8E, meaning it’s a 32 bit interrupt gate that is present), then it replaces the interrupt offset with the address 0x1C bytes before the end of the FSP-M file (which is all just full of 0xFF bytes, so it’s not a valid function address).
  6. It will then copy that IDT entry 34 times, then it sets the IDT to that pointer with the ‘lidt‘ instruction.
  7. It then calls another function that actually sets up the FSPD by giving it a pointer to its own stack,
  8. then it creates a structure that it fills with a bunch of arguments and calls this ‘entrypoint’ with that structure as argument.
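The decoding of the IDT entry in step 5 can be checked mechanically. A 32-bit gate descriptor is two dwords: the low one holds the selector and the low half of the offset, the high one holds the high half of the offset and the type/attribute byte. The structure and function names below are mine.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t offset;     /* handler address */
    uint16_t selector;   /* code segment selector */
    uint8_t  type_attr;  /* present bit, DPL and gate type */
} idt_gate;

/* Split the two dwords of a 32-bit IDT gate descriptor into its fields. */
idt_gate decode_idt_gate(uint32_t low, uint32_t high)
{
    idt_gate gate;
    gate.offset    = (high & 0xFFFF0000u) | (low & 0x0000FFFFu);
    gate.selector  = (uint16_t)(low >> 16);
    gate.type_attr = (uint8_t)((high >> 8) & 0xFF);
    return gate;
}

/* Non-zero when the present bit (bit 7 of the attribute byte) is set. */
int idt_gate_is_present(idt_gate gate)
{
    return (gate.type_attr & 0x80) != 0;
}
```

For 0x8FFE4 and 0xFFFF8E00 this gives offset 0xFFFFFFE4, selector 8 and attributes 0x8E, with the low nibble 0xE being the 32-bit interrupt gate type.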

So, the stack of this setup_fspd_and_run_entrypoint is pretty big, it’s about 0x300 bytes. Inside it, we find all of the local variables of this function, such as the FSP DATA structure itself, and the IDT table as well. Thankfully, IDA has a neat feature where you can look at the stack of a function by showing you where in the stack its arguments would be and where its local variables are. Here’s what it looks like for our function:

You can see the idt_table at -0x298, and you can see 4 bytes before it, at -0x29C, there is only undefined data, which means that area of the stack was not modified anywhere in this function. Well that’s not very helpful… So I continued reverse engineering the other sub functions that it calls, which actually initialize the FSPD structure and fill it out. I understood what it’s used for, but still: no idea about this [IDT-4] issue. I didn’t want to enter the entrypoint function, or what I assumed was the MemoryInit real entry point, since its function pointer was given as argument to the function I called setup_fspd_and_run_entrypoint. After I was done reversing all of the setup code, I had no choice but to enter the function I called the ‘entrypoint’ and after looking at it rather quickly I found this little gem:

The structure is found!

I had now finally found the function that calls the sidt instruction to retrieve the IDT address and then writes the pointer we’re looking for in [IDT-4]; it is indeed a pointer to a pointer as you can see, we store the address of var_2A4 which itself contains the address of var_250, and we can see just above that var_250 gets 0x88 bytes copied into it from a string “PEI SERV(“. If I go to that address, I realize that it’s a structure of size 0x88 and that “PEI SERV” looks like an 8 byte signature at the start of the structure. Searching for what “PEI SERV” means, I found that it’s indeed the signature of a 0x88 sized structure from the UEFI PEI Specification. The bytes that follow specify the major and minor revision of the spec it follows, which is 1.40 in our case, and that turns out to be the specification from the UEFI Platform Initialization Specification Version 1.4 (Errata A). Once I knew that, I was able to follow the specification, define the structure, and rename these unknown functions into their actual function names, and I got this:

And thus, the previous “what is this” function that we saw, with its [edx+0x18] access, became a very simple function that calls the InstallPpi UEFI API function. So yeah, the FSP-M is simply going to do an InstallPpi on the entire FSP-S image, then return back into whoever called that function that the FSP-S jumped back into…
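As a sanity check on the header, here is a sketch that parses the first bytes of such a table. It assumes the standard UEFI table header layout (8-byte signature, then 32-bit Revision, HeaderSize, CRC32 and Reserved fields, 0x18 bytes in total) with the revision stored as major in the upper 16 bits and minor in the lower 16 bits. If that holds, revision 1.40 is stored as the bytes 28 00 01 00, which would explain why IDA showed the string as “PEI SERV(“ (0x28 is the ASCII code of ‘(‘), and the first function pointer after the header sits at offset 0x18. The names `table_header_info` and `parse_table_header` are mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the table header; the first service pointer follows it. */
enum { TABLE_HEADER_SIZE = 0x18 };

typedef struct {
    char     signature[9];  /* 8 signature bytes plus a terminator */
    uint16_t major;         /* upper half of Revision */
    uint16_t minor;         /* lower half of Revision */
    uint32_t header_size;   /* size of the whole table in bytes */
} table_header_info;

/* Decode the signature, revision and size fields from raw table bytes. */
table_header_info parse_table_header(const uint8_t *raw)
{
    table_header_info info;
    assert(raw != NULL);
    memcpy(info.signature, raw, 8);
    info.signature[8] = '\0';
    uint32_t revision = (uint32_t)raw[8] | ((uint32_t)raw[9] << 8) |
                        ((uint32_t)raw[10] << 16) | ((uint32_t)raw[11] << 24);
    info.major = (uint16_t)(revision >> 16);
    info.minor = (uint16_t)(revision & 0xFFFF);
    info.header_size = (uint32_t)raw[12] | ((uint32_t)raw[13] << 8) |
                       ((uint32_t)raw[14] << 16) | ((uint32_t)raw[15] << 24);
    return info;
}
```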

The ‘weird_function‘ translates into this :

EFI_STATUS install_silicon_init_ppi(void *image_base, int image_size) {
  uint32_t *Ppi = AllocatePool_and_memset_0(0x20);
  uint32_t *PpiDescriptor;
  uint8_t SiliconPpi_Guid[16] = {0xC1, 0xB1, 0xED, 0x49,
                                 0x21, 0xBF, 0x61, 0x47,
                                 0xBB, 0x12, 0xEB, 0x00,
                                 0x31, 0xAA, 0xBB, 0x39};

  // The PPI starts with a second GUID, written as four little-endian dwords
  Ppi[0] = 0x8C8CE578;
  Ppi[1] = 0x4F1C8A3D;
  Ppi[2] = 0x61893599;
  Ppi[3] = 0xD32DC385;
  Ppi[4] = (uint32_t) image_base; // 32-bit code, so a pointer fits in a dword
  Ppi[5] = image_size;
  PpiDescriptor = AllocatePool(0xC);
  PpiDescriptor[0] = 0x80000010; // Flags
  PpiDescriptor[1] = (uint32_t) SiliconPpi_Guid; // Pointer to the GUID
  PpiDescriptor[2] = (uint32_t) Ppi; // Pointer to the PPI itself
  return InstallPpi(PpiDescriptor);
}

You can also see the use here of AllocatePool which is another one of the PEI_Services API calls (which itself just calls the API CreateHob), and I’m glad I didn’t have to reverse engineer the entire memory allocation code to figure out that function simply allocates memory for us.
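The two GUIDs in that function are easier to look up once they are printed in the usual text form. The sketch below does the byte shuffling (the first three fields are little-endian in memory, the last eight bytes are kept in order); the helper names are mine. Printed this way, the descriptor GUID comes out as 49EDB1C1-BF21-4761-BB12-EB0031AABB39 and the one at the start of the PPI as 8C8CE578-8A3D-4F1C-9935-896185C32DD3. These look like EFI_PEI_FIRMWARE_VOLUME_INFO_PPI_GUID and EFI_FIRMWARE_FILE_SYSTEM2_GUID from the PI specification, which would fit a PPI that carries an image base and size, but that identification is my reading and worth double-checking against the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Spill four dwords into 16 bytes the way little-endian stores would. */
void dwords_to_bytes(const uint32_t dwords[4], uint8_t out[16])
{
    assert(dwords != NULL && out != NULL);
    for (int i = 0; i < 4; i++) {
        out[4 * i]     = (uint8_t)(dwords[i] & 0xFF);
        out[4 * i + 1] = (uint8_t)((dwords[i] >> 8) & 0xFF);
        out[4 * i + 2] = (uint8_t)((dwords[i] >> 16) & 0xFF);
        out[4 * i + 3] = (uint8_t)((dwords[i] >> 24) & 0xFF);
    }
}

/* Format 16 in-memory GUID bytes as XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
 * out must have room for 37 characters including the terminator. */
void format_guid(const uint8_t raw[16], char out[37])
{
    static const int order[16] = {3, 2, 1, 0, 5, 4, 7, 6,
                                  8, 9, 10, 11, 12, 13, 14, 15};
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;
    assert(raw != NULL && out != NULL);
    for (int i = 0; i < 16; i++) {
        if (i == 4 || i == 6 || i == 8 || i == 10)
            out[o++] = '-';
        out[o++] = hex[raw[order[i]] >> 4];
        out[o++] = hex[raw[order[i]] & 0x0F];
    }
    out[o] = '\0';
}
```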

So that’s it, I’ve reverse engineered the entire FSP-S entry code, most of the FSP-M initialization code, and I then jumped back into the end/exit function of the FSP-M code (which itself does some small MTRR initialization, then installs the FSP-S as a UEFI PPI, then returns “somewhere”).

By the way, a “PPI” is a “PEIM-to-PEIM Interface” and “PEIM” means “PRE-EFI Initialization Module”. So now, I have to figure out how the PPI gets installed, and more specifically, how it gets used later by the FSP-M code, and who calls that function that exits the MemoryInit and handles the FSP-S return-from-new-stack behavior.

To try to explain “what’s going on in there” in a simple manner, here is my attempt at a flowchart to summarize things:

The big remaining unknown is the question mark boxes at the bottom of the flow chart. More specifically, we need to figure out who called memory_init_exit_to_bios and how the PEIM gets installed and executed.

You can see the full reverse engineering of that section of the code in the fsp_m.c and fsp_m_init.c files in my FSP code repository.

Next steps

At this point, I’m sort of stuck because I need to find who called memory_init_exit_to_bios, and to do that, I think I’m going to dump the entire stack from within coreboot, both before and after SiliconInit is executed, then use the saved register value of ebp, to figure out the entire call stack. See, most functions do this when they are entered:

push    ebp
mov     ebp, esp
sub     esp, xxx

This stores the %ebp into the stack (right after the return address), then copies the %esp into the %ebp register. This means that throughout the entire function, the %ebp register will always point to the beginning of the stack at the start of the function, and can be used to access variables in an easy way. But also, the end of the function will look like this:

mov     esp, ebp
pop     ebp
retn

This will restore the stack pointer to what it was, then pop %ebp before returning. This is very practical if you don’t want to keep track of how many variables you pushed and popped, or how many bytes you allocated on the stack for local variables (and it’s also faster/more optimized of course than an ‘add’ to %esp).

Here’s a real example in the memory_init_exit_to_bios function:

On the left, you see the beginning of the function (the prologue), and on the right, the end of the function (the epilogue). You can see how it stores %ebp, then puts %esp into it, then stores the registers it will need to modify (%ebx, %ebp, %esi and %edi) within this function, then at the end, it restores the registers, then the stack. You can see this same pattern in our previous ‘weird_function’ screenshot as well.

This means that the stack will usually look like this:

data …
previous ebp
return address
data …
previous ebp
return address
etc.

The only thing is that every ‘previous ebp’ will point to the beginning of the stack of the calling function, which will itself be the address in the stack of the ‘previous ebp’. So in theory, I could follow that up all the way to the top, finding the return address of each function that called me, thus building a stack trace like what gdb gives you when you crash your program (that’s essentially what gdb does when it can follow frame pointers). Hopefully with that, I’ll get the full trace of who called the memory_init_exit_to_bios function, but also, if I do it after the execution of SiliconInit, I would get the entire trace of the SiliconInit entrypoint all the way to its own version of the silicon_init_exit_to_bios, and hopefully that will help me get exactly what I need.
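Here is roughly what that walk could look like when run over a stack dump captured from coreboot. It is only a sketch under the assumption that every function on the path uses the standard prologue, so that each saved %ebp is immediately followed by a return address. The function names and the way the dump is passed in (a byte buffer plus the address its first byte had at runtime) are my own choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a raw stack dump. */
uint32_t dump_read32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Follow the chain of saved %ebp values and collect return addresses.
 * dump_base is the runtime address of dump[0]. Returns the frame count. */
size_t walk_frames(const uint8_t *dump, size_t dump_size, uint32_t dump_base,
                   uint32_t ebp, uint32_t *return_addresses, size_t max_frames)
{
    size_t count = 0;
    assert(dump != NULL && return_addresses != NULL);
    while (count < max_frames) {
        if (ebp < dump_base)
            break;
        size_t offset = (size_t)(ebp - dump_base);
        if (offset > dump_size || dump_size - offset < 8)
            break;                      /* frame is outside the dump */
        uint32_t previous_ebp = dump_read32(dump + offset);
        return_addresses[count++] = dump_read32(dump + offset + 4);
        if (previous_ebp <= ebp)
            break;                      /* callers live at higher addresses */
        ebp = previous_ebp;
    }
    return count;
}
```

The walk stops as soon as a saved %ebp points outside the dump or fails to move toward higher addresses, which also guards against looping on garbage data.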

The other nice thing is that now it’s all probably going to be done via API calls to a UEFI Module and using API interfaces for the PEIM, and using PPI and whatnot, so I will also need to start learning about UEFI and how it works internally, but the nice thing is that it will probably help me reverse engineer more easily, since the API names and function signatures will be known.

Then, once I know what I need to know, I can finally start reverse engineering the actual silicon initialization code. Talk about jumping through hoops to find the front door!

Repairing Windows boot..

This little adventure of the past few days definitely deserves someone to tell its story, so I decided to post about it on my blog, which hasn’t seen much love in a long while. To summarize it: my machine wouldn’t boot, I tried to fix the Windows bootloader, and it was much harder than it should have been.

Background

A few months ago, my wife was due for a new PC, so instead of buying one, and since I have a dozen at home from Purism, I lent her the Librem 15 v2 that I had sitting around unused. Unfortunately, that particular unit had some issues which made using it a bit annoying (trying to suspend will cause a reboot, and you can’t shut it down, it will turn itself back on on its own) but it did the job and it was much better than her 10 years old (and extremely loud rattling/noisy) Thinkpad X200.

Every few weeks, I would “borrow” the Librem 15 v2 and attempt to finish porting coreboot to it. In the past week, I’ve finally finished the coreboot port and released it. Unfortunately, her Windows would refuse to boot once coreboot gets installed. I assume it’s because Windows was installed with EFI and coreboot+SeaBIOS only supports legacy BIOS mode (I could install TianoCore as the payload to get EFI support, but I didn’t want to do that), so I figured I’ll just fix the Windows machine so it can boot from Legacy.. how hard can it be, right?

First attempt

So, Windows doesn’t want to boot, so let’s go into the Windows 10 installation drive and do a “Startup repair”, that didn’t work, then I followed the various tutorials online and I tried the “bootrec /rebuildbcd” and “bootrec /fixboot” and “bootrec /fixmbr” and still nothing, I even found the “bootsect /nt60 C:” trick, but it still didn’t help, then I figured, maybe since the system was installed in EFI, it fixes the boot but it would only work if I still booted as EFI (regardless of how the installation drive was booted), so I found/used EaseUS Partition Manager to transform the GPT partition table into an MBR partition table, which shouldn’t really make a difference, but technically GPT is required for EFI so by using MBR, I would effectively force it to be bootable by a legacy BIOS. Again, I did all the steps to fix the MBR/BCD/boot, etc.. Still nothing… Then I decided to delete the EFI partition entirely and retry, still no luck.. Then I thought “ok, maybe the problem is this specific HDD model that doesn’t work with SeaBIOS for some reason?” so I used CloneZilla to clone the HDD into an NVMe drive I had in the machine, then I tried to boot into it. I still got the same result, it prints “Booting from Hard Disk…” and nothing else, none of the “No OS found” or “partition error” or whatever those standard error messages the Windows bootloader should print.

So, the problem is not the HDD, but at least now I have a backup of the HDD in the NVMe drive and I can try to mess with it without risk of data loss, so I spent another few hours tweaking and playing around with settings and Windows tools, etc.. and still I got nothing. Then I had an idea. What if I install grub and use grub to boot into Windows! That’s a great idea, now to find how to install grub without having a linux system installed. For some reason, from the CloneZilla live USB, I couldn’t install grub, so I switched to my PureOS live USB, and I managed to install grub on the NVMe drive, but it had no config. I created a partition for it, but “update-grub” wouldn’t work, that’s because the “/” path is mounted as “overlay” and the grub-probe command doesn’t know how to handle that, so I had to edit /usr/sbin/grub-mkconfig to make it use “/mnt” instead of “/” or “/boot” when calling the grub-probe command, so, that partially worked.. unfortunately, grub-probe is also used in the various files in /etc/grub.d/ and even though I gave grub-mkconfig the ‘-o /mnt/grub/grub.cfg’ path, the files in /etc/grub.d/ had some /boot hard coded in them, so I just mounted the partition into the /boot directory and that fixed everything!

Now I boot and I see the grub menu, it shows me the Windows installation from the NVMe and from the HDD, but booting them both gives me the same weird error : “δRÉNTFS” and that’s it.. this weird delta, R, accented E then NTFS being printed in the screen and nothing else… I decided to see if I could restore things now with the windows installation disk, so I did the ‘startup repair’ and all the “bootrec” commands, and I can confirm that grub was removed and replaced by (I assume) the windows bootloader, but unfortunately, it still didn’t help, because now it was giving me this same “δRÉNTFS”  again. I assume the NTFS partition was corrupted somehow or something like that.

I’ve now spent way too many hours (during 2 days) trying to get this to work, so I decided to just ask my wife “I kind of broke your HDD from your old laptop, how about a clean windows install, you’d still have all your files, but you’d need to re-configure it, and reinstall all your apps, etc..” and she freaked out at first because she thought I said that all the data was lost, and when I said “your old laptop”, she thought I was talking about the Thinkpad X200. I had actually decided to just upgrade her to a non-broken librem (Librem 15 v3) because one that doesn’t shut down or suspend isn’t really great for every day use, so when I said “old laptop”, I meant the librem 15 v2.. So, it turns out, the Librem HDD itself was nearly empty (I think 20 or 30GB used, so discounting the windows install itself, not much personal data on it) and she still hadn’t copied any data from the thinkpad to the Librem.

Second attempt

So, I realized “oh wait, what if I used the HDD from the thinkpad instead, that one probably doesn’t use an EFI bootloader anyway”.

Unfortunately, I remembered why I hadn’t put the thinkpad’s hdd into the old Librem 15 v2, it was because that HDD was too thick (I guess 9 mm but the librem only supports 7mm drives) so it wouldn’t fit.

Thankfully, I had a SATA to USB adapter that I used, and when I tried to boot into it.. SUCCESS! It boots, oh wait, Blue Screen of Death… and it reboots… damn it, now it enters into “startup repair” mode which doesn’t let me do anything because it can’t figure out what’s wrong. The BSOD was going so fast that I had to film it then go image by image to be able to see the STOP code (which was 0x0000007b) and it didn’t help me, other than “use startup repair” or “remove bootloader virus” as the usual advice on the internet…

Before I messed anything up, I went ahead and made another clone with CloneZilla from the HDD into the NVMe, but I would get the same BSOD when running from the NVMe.. Now I decided I’ll start messing with the disk in the startup repair, and I noticed, it can’t find the hard drive, no matter what, the hard drive isn’t visible to the startup repair mode of Windows… I realized, maybe it’s because this is Windows 7, not Windows 10 and it has no NVMe drivers! So I looked for the NVMe drivers and put them on a USB stick, and clicking the “Load drivers” in Windows 7 startup repair showed me that there was no USB stick… apparently, there also are no USB drivers.. Also when I looked for solutions to the BSOD, one solution I saw was “go to BIOS settings and change the SATA settings from AHCI to IDE”.. (but there is no such setting in coreboot of course and I didn’t plan on recompiling coreboot just for that, it had to work without changes), so that makes sense, if the Windows installation can’t access its files, that’s why it does the BSOD, and if it can’t talk to either the original HDD connected via USB (no USB drivers) or to the clone on the NVMe drive (no NVMe drivers), then that’s the problem.. so I swapped out the HDD for another one of my test HDDs that I wasn’t using and I cloned the drive from the NVMe back into a slim HDD that could fit into the Librem.

Now, finally, it boots!!! Now all that I need to do, is install drivers on this Windows 7 instance, I figured I’ll also upgrade to Windows 10 while we’re at it, then I’ll clone (again) from this HDD back into the NVMe drive, and it should all work then.

Getting Windows to work

Now, this wasn’t the hardest part, but it was probably the most annoying… I had to install drivers on this machine, but USB doesn’t work, Wifi doesn’t work, and there is no Ethernet either, so I had to find drivers on my other machine, put them on a USB stick, boot into a Live USB of PureOS, mount the Windows HDD, copy the files from the USB stick to the HDD, reboot into Windows, then try to get it to work. Unfortunately, the first “USB 3.0 xHCI controller” driver that I installed was apparently not the right one, and the drivers for the Wifi card weren’t recognized by Windows.. Eventually, I found the “Chipset driver” on Intel’s website and that worked: suddenly I got the speakers working and I could change the resolution to something other than 800×600, but the PC was still **extremely** slow.. I’m going to blame it on the HDD or something, because CPU usage was only at 1%, yet everything was incredibly slow. It doesn’t matter, I’ll just go through it as slow as it is.. but spending 30 minutes just to get to the device manager and realize you got the wrong driver sucks, especially since you need to reboot into a live USB of Linux in order to copy a different one onto the HDD again.

Anyways, eventually, I found the correct USB 3.0 xHCI controller driver (from Intel’s website), which made USB work in theory, but it didn’t want to actually work.. the Windows 10 USB installer wouldn’t appear.. I had also installed the NVMe drivers, but the partition on the NVMe drive didn’t appear either… Eventually, I noticed that the Windows 10 USB stick that was appearing in the device manager didn’t have a driver, so I had to re-install the USB mass storage driver.. I did find the proper Wifi driver, which I installed, then I let Windows find the driver for the USB stick online and install it. Then finally, it said “Your device is ready to use”, but the drive still didn’t appear in “My PC”.. so I opened the disk management tool, and from there, I saw that it was mounted on “D:”.

I also found that the NVMe drive was indeed recognized by Windows, but it had a little warning next to it that said the drive is offline because one of its partitions conflicts with one that is already mounted… Of course! Since the two drives are clones of one another, they have the same partition GUID or whatever, so Windows couldn’t mount them both.

Ok, but why is it saying that the Windows 10 installer is mounted as “D:” when I can clearly see that “D:” is my NAS, which is appearing as ‘disconnected’? Well, it turns out this is a bug in Windows.. the NAS was disconnected, so the “D:” drive letter was free, so Windows mounted the USB stick on “D:” but didn’t show it in “My PC”: it was still showing “D: NAS \\NASNAME (Disconnected)” instead of “D: Windows 10”.. but if I clicked on the D: drive, instead of getting the “this drive is not accessible” error, it opened the USB stick.. :facepalm:

So I run the Windows 10 installer, it wants me to do a Windows update first, which takes a couple of hours just “checking for updates”, and when I go to Windows Update manually, it says there are no new updates, so I just cancelled that. It froze, I rebooted, and 30 minutes later, I could finally start the Windows 10 installer and let it run… It reboots a few times, it even says “this is taking longer than usual”, but eventually, the system is updated, all the files, apps, settings, etc. are still there, and all the drivers are installed.

Finally, I can go back into CloneZilla and clone the newly-upgraded HDD back into the NVMe drive and call it a day!

What an adventure, and all of it because that stupid Windows refused to boot in Legacy BIOS. It should not have been this hard. I’ve seen tutorials where people say to just run “bootrec /fixmbr” and that should do it, but I think that somewhere along the way I corrupted the partition or something, so it wasn’t able to boot into it.

So, I cloned the HDD back into the NVMe and it worked! End of the adventure, great, thanks, bye bye…

Not so fast!

Humm.. yeah, actually, that didn’t work: when I boot Windows on the NVMe, I get this error:

I checked and the C:\Windows\System32\winload.exe file does exist in the drive, so I’m not sure what’s wrong..

I think that I did boot successfully into the NVMe, but that was when the HDD was still in the machine, so I think it was running the bootloader from the NVMe then loading the winload.exe from the HDD then booting halfway from the HDD and halfway from the NVMe.

I tried again the usual startup repair and usual custom commands, but in the end, it didn’t work, and I realized “why am I bothering with NVMe?” and decided to just give up and try to clone the drive in a regular SATA SSD instead…

So the following day, while I was backing up the data from my SATA SSD onto another drive so I could clone the HDD into it, I decided to put the NVMe drive in another machine (since the Librem 15 v3 was being used to copy data from the SSD) to boot the NVMe and take a photo of this error for this blog post.. and magically, I got a different menu, one that tells me “Choose which OS you want to boot” and gives me three choices: “Windows 10”, “Windows 10 Pro from volume 2” and “Windows 10 Pro” again.. If I select “Windows 10” or “Windows 10 Pro”, it reboots and I get the above error message, which now says “press F9 to use a different operating system”, until I press F9… hey, I didn’t have that F9 option when I tried this yesterday!

If I choose the “Windows 10 Pro from Volume 2” option, it boots right away into Windows, so.. it works! yes! finally! And this was on a machine with no SSD or HDD, just the NVMe drive, so there can’t be some other drive helping the boot process along. Unfortunately, it asks me to make that choice on every boot.. I think a little “bootrec /rebuildbcd” will probably help fix that though.

So, I reboot into the recovery drive, I try the ‘bootrec /rebuildbcd’, it of course doesn’t find the Windows installation in “C:\Windows”, so I first had to run “attrib C:\Boot\BCD -h -r -s” to remove the “hidden/system/read-only” attributes on the C:\Boot\BCD file, then I could delete that file and when I re-run “bootrec /rebuildbcd”, it finds “C:\Windows”, and now, it boots right away into the NVMe drive. The problem is that the rebuildbcd doesn’t always work if it can’t overwrite the BCD file which is by default read-only.

I think what happened before was that since I thought everything was finally done, I removed the HDD and replaced it with the non-bootable one (from my first attempt), so I think when I was trying to repair the boot device with the windows installation drive, it was not fixing the NVMe drive, but the HDD one instead, or when it did fix it, it was telling it to boot Windows from the HDD which is non-bootable.. or something weird like that.

Either way, it doesn’t matter, because, now, finally, for real this time, it works, and I’m done!

 

Upgrading from Fedora 15 i686 to Fedora 16 x86_64

A couple of months ago I bought a new laptop with 8GB of RAM, but I realized I was running a 32-bit system, which meant I couldn’t use all my RAM. I had to switch to 64 bits. It takes so much time for me to restore my system that I didn’t have the courage to go through it again (I did it last year when I switched from Debian to Fedora, and it took me a week), so I stayed with 32 bits. Yesterday I had to upgrade to Fedora 16 and decided to do the switch to 64 bits at the same time… I’d like to share my experience with you!

First of all, I had to download the 64-bit version of the Fedora CD, which is not the default download on the website. I had to click on the small “more download options” link to get the choice, and I realized that’s how I got the 32-bit install in the first place (the Fedora download page should definitely list both links). Then I made a backup of the list of all the installed packages on my system so I could restore them on the new system:

 yum -C info $(rpm -qa) | grep "^Name *:" | cut -c 15- > packages-list.log

This will list all of the packages installed, and ask yum for the exact name of the package (instead of “git-1.7.6.5-1.fc15.i686”, it becomes “git”).. if you have a better method of doing that, let me know, but this did the trick for me.

Update: A better method was given to me by Hansen and Richard Godbee in the comments: rpm -qa --qf "%{name}\n" > packages-list.log
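To illustrate what that name extraction boils down to, here is a minimal pure-shell sketch. It assumes the package string follows the usual name-version-release.arch layout, and the helper name rpm_name_from_nvra is made up for this example; asking rpm itself with --qf remains the reliable way to do it.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm or yum): reduce a full package
# string such as "git-1.7.6.5-1.fc15.i686" to the bare package name.
# Assumes the name-version-release.arch layout.
rpm_name_from_nvra() {
    nvra=$1
    nvr=${nvra%.*}            # drop the trailing ".arch"
    nv=${nvr%-*}              # drop the trailing "-release"
    printf '%s\n' "${nv%-*}"  # drop the trailing "-version"
}

# Sample strings only, nothing is queried from the real package database
rpm_name_from_nvra "git-1.7.6.5-1.fc15.i686"
rpm_name_from_nvra "gstreamer-plugins-base-0.10.35-1.fc15.i686"
```

Stripping from the right keeps package names that contain dashes intact, since only the last two dash-separated fields are removed.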

I obviously had a separate partition for the /home directory, which made things easier, so I backed up the important directories into it: /opt, /root, /etc, /usr/local and my scratchbox home dir. Then the moment of truth: reboot into the live CD, install it, make sure not to format the /home partition, and reboot into the new 64-bit system.

First of all, as soon as I tried to log in, gnome 3 would completely crash and would not let me in, so I had to create a new user, log in to gnome 3 with it, then “ls -la” the files in the new user’s home dir, then delete (move away) those same files/directories from my own home dir so that gnome doesn’t crash anymore… apparently, my settings suddenly became incompatible or something… It’s important to note that I had some further problems later and I had to copy back .gnome2/keyrings, otherwise the gnome-keyring daemon would freeze.

To restore all the packages that I had before, I first had to re-install (manually) the rpmfusion repository (free and nonfree), then I just did a simple :

yum install $(cat packages-list.log)

And after 1.2GB of downloads and 1020 package installs, my system was technically “restored” to how it was before the format. I looked at the “No package foobar” lines given by yum at that point, which told me what I needed to install manually (opera, skype, dropbox), which I did, along with a few libs that apparently don’t exist anymore in Fedora 16. Now I just had to restore /opt for some apps I had in there (and recompile the EFL/E17), copy the Enlightenment.desktop file to /usr/share/xsessions, restore my /etc/hosts (which had some custom entries), restore some custom scripts I wrote into /usr/local/bin and recompile the libraries I was working on and had installed in /usr/local (gstreamer, libnice, farstream). I also had to install a few 32-bit libraries so I could install skype (which only comes in a 32-bit flavor).
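Picking those lines out by eye works, but it can also be scripted. The following is only a sketch: it assumes the yum output was saved to a file (the name yum-restore.log is made up here) and that each miss is reported on a line starting with “No package ” followed by the package name; the function name missing_packages is hypothetical too.

```shell
#!/bin/sh
# Hypothetical helper: read yum output on stdin and print the name from
# every line that starts with "No package ". Whatever follows the name
# on that line is ignored.
missing_packages() {
    sed -n 's/^No package \([^ ]*\).*$/\1/p'
}

# Sample input only; in practice something like:
#   missing_packages < yum-restore.log
printf '%s\n' \
    'No package opera available.' \
    'Installing: git' \
    'No package skype available.' | missing_packages
```

The resulting list is what is left to install by hand or to look up under a new name.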

It took me about a day of work/compilation, but now I feel back home, and I don’t notice any difference in my system other than the fact that I will now be writing 32-bit bugs instead of 64-bit bugs 🙂

 

 

New blog!

Hi all,

Welcome to my new blog!

This is my first blog.. I usually hate blogs, but I thought it would be nice to start sharing some information about what I do.. not what I ate today, but more like, what I’m working on lately…

I just hope I won’t forget about it.. so hopefully, I’ll keep it up to date with the latest developments on the projects I work on. Mainly, you’ll be seeing stuff about aMSN, Zeitgeist/Teamgeist, Libnice, Farsight and Telepathy!

Click that RSS link now!

KaKaRoTo