ioctlvhax

Introduction

plutoo and I found this bug back in September of last year. It grants ROP in an IOS usermode process, which can then be used to attack the kernel. The vulnerability itself is a TOCTTOU (time-of-check to time-of-use) race condition.

The Bug

Initially the ioctlv handling of the IOS kernel contained a major design flaw: if we supply more than 8 vectors, they are not copied in, and their buffer addresses are verified in place, in memory the caller can still write to. This lets the PPC side change a buffer address after it has been verified. Due to the nature of the race, exploitation requires passing in a relatively large number of vectors.

int _ioctlv(int fd, int cmd, u32 num_in, u32 num_io, ioctlv_vec *vec, ...)
{
    /* ... */
    dev_ctxt *dev = get_dev(...);
    u32 num_total = num_in + num_io;
    /* ... */
    /* Copy in vectors if num_total <= 8 else use external vectors. */
    /* Check vector buffer addrs. */
    /* ... */
}
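To make the race concrete, here is a minimal C model of the two patterns. It is only a sketch and not IOS code: `vec_t`, `addr_ok`, the address window, `check_in_place` and `copy_then_check` are all names and values made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ioctlv vector that lives in memory shared with the PPC. */
typedef struct {
    uint32_t addr;
    uint32_t len;
} vec_t;

/* Hypothetical policy: only [0x10000000, 0x20000000) belongs to the caller. */
bool addr_ok(uint32_t addr, uint32_t len)
{
    const uint32_t lo = 0x10000000u, hi = 0x20000000u;
    return addr >= lo && addr < hi && len <= hi - addr;
}

/* Flawed pattern: validate the caller-owned array in place.
   Whoever uses shared[] afterwards reads it a second time. */
bool check_in_place(const volatile vec_t *shared, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!addr_ok(shared[i].addr, shared[i].len))
            return false;
    return true;
}

/* Safer pattern: snapshot into private memory first, then validate
   and use only the snapshot. */
bool copy_then_check(const volatile vec_t *shared, vec_t *priv, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        priv[i].addr = shared[i].addr;
        priv[i].len = shared[i].len;
    }
    for (size_t i = 0; i < n; i++)
        if (!addr_ok(priv[i].addr, priv[i].len))
            return false;
    return true;
}
```

With `check_in_place`, a write to `shared[0].addr` that lands after the function returns goes unnoticed by the consumer. With `copy_then_check`, the same write only changes the shared copy, and `priv[]` keeps the verified values.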

The Fix

The bug was fixed with version 5.2.0 by adding a new field to the device context that limits the number of vectors, which is set to 8 by default and may be changed using syscall 0x2E if required.

int _ioctlv(int fd, int cmd, u32 num_in, u32 num_io, ioctlv_vec *vec, ...)
{
    /* ... */
    dev_ctxt *dev = get_dev(...);
    u32 num_total = num_in + num_io;
    /* ... */
    if(num_total <= dev->max_vecs)
    {
        /* Copy in vectors if num_total <= 8 else use external vectors. */
        /* Check vector buffer addrs. */
    }
    else
        res = -11;
    /* ... */
}

int _dev_register(char *path, int mqid, int pid)
{
    /* ... */
    dev->max_vecs = 8;
    /* ... */
}

int syscall_2E_set_ioctlv_max_vecs(char *name, u16 num)
{
    /* ... */
    dev->max_vecs = num;
    /* ... */
}

Exploitation

The goal was to gain ROP under an IOS usermode process. For this we had to look for a device that did not check the number of vectors itself. It turns out that “/dev/im” provides us with some very handy ioctlv handlers, namely:

static int dev_im_mq;
static int hb_param_idx;
static int hb_param_type;

int SetDeviceState(fd_ctxt *fd, int *buf)
{
    /* ... */
    switch(buf[0])
    {
        /* ... */
        case 3:
            hb_param_idx = buf[2];
            hb_param_type = buf[1];
            /* ... */
    }
    /* ... */
}

int GetHomeButtonParams(int *buf)
{
    buf[0] = hb_param_type;
    buf[1] = hb_param_idx;
    return 0;
}

int dev_im_handler(...)
{
    while(!recv_msg(dev_im_mq, &req, 0))
    {
        switch(req->cmd)
        {
            /* ... */
            case 4:
                /* ... */
                SetDeviceState(fd_ctxt, (int *)req->vecs[0].phys);
                /* ... */
            case 7:
                /* ... */
                GetHomeButtonParams((int *)req->vecs[0].phys);
                /* ... */
            /* ... */
        }
        /* ... */
    }
    /* ... */
}

Together these let us write 8 bytes of data to an address we control: SetDeviceState stores two words of our choosing, and GetHomeButtonParams writes them back out to vecs[0], whose address we swap after verification. With this arbitrary write we can carefully set up a ROP stack inside the AUXIL process, overwrite the return address of one of the devices’ handler threads, and get the handler thread to return by overwriting the corresponding message queue handle.
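The way the two handlers combine into a write primitive can be modelled in a few lines of C. This is a sketch only: `set_hb_params` and `get_hb_params` mirror the listing above, while `write8` is a hypothetical helper, and the race that redirects the destination pointer is replaced here by simply passing `dst` directly.

```c
/* Models of the two /dev/im statics from the listing above. */
static int hb_param_idx;
static int hb_param_type;

/* Model of SetDeviceState, case 3: stash two caller-chosen words. */
void set_hb_params(const int *buf)
{
    hb_param_idx = buf[2];
    hb_param_type = buf[1];
}

/* Model of GetHomeButtonParams: write the two words to wherever buf points. */
void get_hb_params(int *buf)
{
    buf[0] = hb_param_type;
    buf[1] = hb_param_idx;
}

/* Hypothetical helper: chain both calls into "store w0, w1 at dst". */
void write8(int *dst, int w0, int w1)
{
    int req[3] = { 3, w0, w1 };  /* buf[0] selects case 3 */
    set_hb_params(req);
    get_hb_params(dst);          /* in the real attack, dst is swapped in after the check */
}
```

Each `write8` moves two 32-bit words, so a longer payload such as a ROP stack would be laid down 8 bytes at a time.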

Note that this is by no means the only way to exploit this flaw – interested readers are encouraged to let us know about any alternative strategies they might come up with.

Reading BCA – The Hard Way

If you ever wanted to know what’s stored in a DVD’s BCA, you’ve probably read it out using your favourite device. Here I present a different approach: reading it out by scanning the area and running “pattern recognition” on the image.

Here is a part of a Wii game that I’ve scanned:

Image

After filtering and thresholding it looks like this:

Image

The black and white image is then fed into a tool I wrote that estimates the midpoint and the major and minor axes of the inner ellipse, detects all the BCA cuts around it, marks them and spits out the timing between the cuts. This timing can then be decoded into the actual data that is stored in the area. Note that the thick cuts are actually two normal ones that my scanner’s resolution could no longer separate; the tool detects that too. Here is a part of a generated image (the blue lines are the detected cuts):

Image

And here is a complete output too:

Image
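For the curious, the timing step could look roughly like the C sketch below. It is not the code of the actual tool. It assumes the cuts have already been detected, mapped from the ellipse onto angles in degrees, and sorted. `cut_t`, `split_thick_cuts`, `cut_gaps` and the 1.5x width threshold are all assumptions made for illustration.

```c
#include <stddef.h>

/* A detected cut: angular position of its centre and its angular width, in degrees. */
typedef struct {
    double center;
    double width;
} cut_t;

/* Replace every cut wider than 1.5x the nominal width by two half-width cuts.
   out must have room for 2 * n entries; returns the number of cuts written. */
size_t split_thick_cuts(const cut_t *in, size_t n, double nominal, cut_t *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].width > 1.5 * nominal) {
            double off = in[i].width / 4.0;
            out[m++] = (cut_t){ in[i].center - off, in[i].width / 2.0 };
            out[m++] = (cut_t){ in[i].center + off, in[i].width / 2.0 };
        } else {
            out[m++] = in[i];
        }
    }
    return m;
}

/* Angular gap from each cut to the next one; the last gap wraps around 360 degrees.
   cuts must be sorted by centre; gaps must have room for n entries. */
void cut_gaps(const cut_t *cuts, size_t n, double *gaps)
{
    for (size_t i = 0; i < n; i++) {
        double d = cuts[(i + 1) % n].center - cuts[i].center;
        if (d < 0.0)
            d += 360.0;
        gaps[i] = d;
    }
}
```

The gaps are what gets decoded afterwards. Turning them into bits depends on the BCA channel coding and is left out here.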

a simple raytracer

Raytracers are fun, so maybe you want to check out my quick experiment: https://gist.github.com/944687. It’s called fray (fast ray, although it’s not that fast) and is capable of rendering planes and spheres. The raytracer currently calculates lighting (diffuse and specular shading) and reflections and will be extended to do refractions. It could be optimized by implementing a kd-tree to subdivide the scene, so that rays are only tested against the sub-volumes that contain objects. But meh, maybe at some point.

This is the output calculated from the test scene in main.cpp:

test scene
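As a rough idea of what the lighting and reflection steps involve, here is a small C sketch of the textbook formulas. It is not taken from fray (which is C++). `vec3`, `reflect`, `diffuse` and `specular` are names chosen for this example, all vectors are assumed to be normalised, and the specular exponent is restricted to an integer to keep the code free of library calls.

```c
typedef struct { double x, y, z; } vec3;

double dot(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Mirror direction d about the unit normal n: r = d - 2 (d . n) n. */
vec3 reflect(vec3 d, vec3 n)
{
    double k = 2.0 * dot(d, n);
    vec3 r = { d.x - k * n.x, d.y - k * n.y, d.z - k * n.z };
    return r;
}

/* Lambert diffuse term: max(0, n . l), with l pointing from the surface to the light. */
double diffuse(vec3 n, vec3 to_light)
{
    double d = dot(n, to_light);
    return d > 0.0 ? d : 0.0;
}

/* Phong specular term: max(0, r . v)^shininess, where r is the incoming light
   direction mirrored about n and v points from the surface to the viewer. */
double specular(vec3 n, vec3 to_light, vec3 to_viewer, unsigned shininess)
{
    vec3 incoming = { -to_light.x, -to_light.y, -to_light.z };
    double c = dot(reflect(incoming, n), to_viewer);
    double s = 1.0;
    if (c <= 0.0)
        return 0.0;
    while (shininess--)
        s *= c;
    return s;
}
```

The same `reflect` function gives the direction of the secondary ray for mirror reflections: feed it the incoming ray direction and the surface normal at the hit point.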

Why a blog?

[en]

I don’t really know why; I’d just like to have a place to publish some of my thoughts. This blog should be about all the stuff that I think is interesting, like electronics, science, technology or the environment. Some of my posts will be in English and some of them in German, as I’m a native German speaker. So be prepared to get some interesting reading here in the future.

Let’s start with something that I’m currently into: an SPU decompiler. If you have never heard of an SPU, you should probably look here and here. It’s one of the eight co-processor cores of the Cell/B.E. platform, which is also used in the Sony PS3, for example. One of the remarkable properties of such a core is that it can be isolated, which means that all of its memory (the 256 KB local store) is accessible only from the SPU itself and no longer from the main processor core (the PPU). The program images for such isolated SPUs on the PS3 are encrypted. With the rvk-list exploit, published by fail0verflow at 27C3 (which was actually told to them by mathieulh in the first place), certain SPU binaries could be dumped. Later geohot published the keys used by metldr (the loader that loads isoldr, which in turn loads isolated binaries), so every isolated binary could be decrypted easily.

But there is the problem that SPU assembly is just a pain in the ass to read and reverse engineer if you have never looked at it before. So I got the idea to write a decompiler for it, as it is a RISC processor. You can get my current source code for it at the github project page. Currently it disassembles the binary, finds all subroutines and constructs the control flow graph for each of them. The next task is to determine the control structures and register usage of each block.
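The first step of building a control flow graph is usually to split the instruction stream into basic blocks. The C sketch below shows the classic "leader" approach on a toy instruction format. It is not the decompiler’s actual code and knows nothing about real SPU encodings: `insn_t`, `insn_kind` and `find_leaders` are hypothetical names for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction classes; a real disassembler would derive these from opcodes. */
typedef enum { INSN_OTHER, INSN_JUMP, INSN_COND_JUMP, INSN_RETURN } insn_kind;

typedef struct {
    insn_kind kind;
    size_t target;  /* index of the jump target; ignored for non-jumps */
} insn_t;

/* Marks leader[i] = true for every instruction that starts a basic block:
   the first instruction, every jump target, and every instruction that
   follows a jump or return. Returns the number of basic blocks. */
size_t find_leaders(const insn_t *code, size_t n, bool *leader)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        leader[i] = false;
    if (n > 0)
        leader[0] = true;

    for (size_t i = 0; i < n; i++) {
        bool is_jump = code[i].kind == INSN_JUMP || code[i].kind == INSN_COND_JUMP;
        if (is_jump && code[i].target < n)
            leader[code[i].target] = true;
        if ((is_jump || code[i].kind == INSN_RETURN) && i + 1 < n)
            leader[i + 1] = true;
    }

    for (size_t i = 0; i < n; i++)
        if (leader[i])
            count++;
    return count;
}
```

Each block then runs from one leader up to the instruction before the next leader, and the graph edges follow from the last instruction of each block: a jump adds an edge to its target, a conditional jump adds an additional fall-through edge, and a return adds none.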

That’s all for today, folks.