Reversing TB – Part 1: The VM

Thanks to oct0xor we could get our hands on the decrypted TB payload (stage 2). Of course the first thing to do is to fire it up in IDA, our favourite tool of the trade. The entry code of the payload looks like this:

1337C0DE00000000 _start:
1337C0DE00000000
1337C0DE00000000 .set var_58, -0x58
1337C0DE00000000 .set arg_10,  0x10
1337C0DE00000000
1337C0DE00000000         mflr      r0
1337C0DE00000004         bl        loc_1337C0DE00000008
1337C0DE00000008 1337C0DE00000008 loc_1337C0DE00000008:
1337C0DE00000008         mflr      r3
1337C0DE0000000C         lis       r4, 0 # 8
1337C0DE00000010         addi      r4, r4, 8 # 8
1337C0DE00000014         subf.     r3, r4, r3
1337C0DE00000018         beq       skip_reloc
1337C0DE0000001C         li        r6, 0
1337C0DE00000020         oris      r6, r6, 0x1337
1337C0DE00000024         ori       r6, r6, 0xC0DE
1337C0DE00000028         lis       r4, 1 # 0xA848
1337C0DE0000002C         addi      r4, r4, -0x57B8 # 0xA848
1337C0DE00000030         lis       r5, 1 # 0x10D18
1337C0DE00000034         addi      r5, r5, 0xD18 # 0x10D18
1337C0DE00000038         subf.     r5, r4, r5
1337C0DE0000003C         beq       skip_reloc
1337C0DE00000040         srdi.     r5, r5, 3
1337C0DE00000044         mtctr     r5
1337C0DE00000048         add       r4, r4, r3
1337C0DE0000004C
1337C0DE0000004C reloc_loop:
1337C0DE0000004C         ld        r5, 0(r4)
1337C0DE00000050         srdi      r7, r5, 32
1337C0DE00000054         cmpw      r7, r6
1337C0DE00000058         bne       skip_rewrite
1337C0DE0000005C         clrldi    r5, r5, 32
1337C0DE00000060         add       r5, r5, r3
1337C0DE00000064         std       r5, 0(r4)
1337C0DE00000068
1337C0DE00000068 skip_rewrite:
1337C0DE00000068         addi      r4, r4, 8
1337C0DE0000006C         bdnz      reloc_loop
1337C0DE00000070
1337C0DE00000070 skip_reloc:
1337C0DE00000070         std       r0, arg_10(r1)
1337C0DE00000074         stdu      r1, -0x80(r1)
1337C0DE00000078         std       r2, 0x80+var_58(r1)
1337C0DE0000007C         lis       r4, 1 # 0x17E40
1337C0DE00000080         addi      r4, r4, 0x7E40 # 0x17E40
1337C0DE00000084         add       r2, r4, r3
1337C0DE00000088         bl        payload_main

In the first loop it will relocate itself using 0x1337C0DE as an identifier for the upper 32 bits and rewrite that to the actual base. The disassembly above was already loaded using 0x1337C0DE00000000 as base. While scrolling through the data section at the end of the payload one quickly figures out that the RTOC is 0x1337C0DE00017E40.

As I was analyzing the code I found a sub that was basically just a really big switch with random looking case values. Once I reversed the sub at 0x1337C0DE00002578 and some of the following ones and analyzed their usage in the switch sub, I knew that I was looking at a fricking virtual machine.

1337C0DE00002578 vm_push_word_0:
1337C0DE00002578         ld        r11, off_1337C0DE00010128 # stack_ptr
1337C0DE0000257C         ld        r9, 0(r11)
1337C0DE00002580         addi      r0, r9, 4
1337C0DE00002584         std       r0, 0(r11)
1337C0DE00002588         stw       r3, 4(r9)
1337C0DE0000258C         blr

Paranoid TB developers even used XOR-tables to obfuscate the VM instructions and data. The virtual machine is mostly stack based but the instructions let you work using registers too. The next thing to do is to reverse all the instructions and write a disassembler and emulator. Here is some code to unscramble the embeded vm binary for further investigation. I’m going to write more about this topic in the future.

Advertisements

Individual Infos

One of the PS3’s console specific cryptography works as follows:

At factory time there is a console specific key generated, probably from a private constant value and a console specific seed. Maybe that’s the key used for encrypting bootldr and metldr. Fact is, that metldr stores another console specific keyset (key/iv) to LS offset 0x00000. That keyset is probably calculated from the first one. At factory time the isolated root keyset (how I call it) is used to encrypt the console’s “Individual Infos”, like eEID. But not the whole eEID is encrypted the same way, special seeds are used to calculate key/iv pairs for the different sections. And not even that is true for every eEID section, because for e.g. EID0 another step is needed to generate the final section key(set). Each of the isolated modules using such an “Individual Info” has a special section that isoldr uses to generate the derived key(set)s. But the generation works in a way, that the section data is encrypted with aes-cbc using the isolated root keyset, so it is not possible to calculate the isolated root keyset back from the derived key(set)s, because aes shouldn’t allow a known plaintext attack. So far I can decrypt some of EID0’s sections, EID1, EID2 and EID4. EID5 encryption should be similar to EID0’s but I lack the generation keys for that one.

About SPU channels 64, 72 and 73

If you are reversing the PS3’s isolated SPU modules, you will eventually notice channels 64, 72 and 73. Here are some C functions, that roughly describe how they work:

void read_ch73(u32 skip, u32 *buf, u32 len)
{
	u32 i;
	spu_wrch(64, 0x10000);
	for(i = 0; i < skip; i++)
		spu_rdch(73);
	for(i = 0; i < len; i++)
		buf[i] = spu_rdch(73);
}

void write_ch72(u32 skip, u32 *buf, u32 len)
{
	u32 i:
	spu_wrch(64, 0x10000);
	for(i = 0; i < skip; i++)
		spu_wrch(72, spu_rdch(73));
	for(i = 0; i < len; i++)
		spu_wrch(72, buf[i]);
}

It seems that lv1ldr is storing it’s version into a special storage area.

s64 lv1ldr_main(...)
{
	//...
	u64 ldr_ver = 0x0003004100000000;
	write_ch72(0, &ldr_ver, 2);
	//...
}

And e.g. isoldr reads the version from the storage area and compares it to it’s own version.
If the check fails, isoldr will just stop execution.

s64 check_version(u64 ldr_ver)
{
	u64 stored_ver;
	read_ch73(0, &stored_ver, 2);
	//...
}

s64 load_isoself(...)
{
	ldr_ver = 0x0003004100000000;
	if(check_version(ldr_ver) != 0)
		return 0x30;
	//...
}

I wonder what else is stored in the area and how long the data in it persists, so my next idea is to code an isolated elf, that allows me to specify the value written to channel 64 and then dumps the data from channel 73.

a simple raytracer

Raytracers are fun, maybe you want to check out my quick experiment: https://gist.github.com/944687. It’s called fray (fast ray – although it’s not that fast) and is capable of rendering planes and spheres. The raytracer currently calculates lighting (diffuse and specular shading) and reflections and will be extended to do refractions. It could be optimized by implementing a kd-tree to subdevide the scene and only render the sub-parts with objects in them. But meh, maybe at some point.

This is the output calculated from the test scene in main.cpp:

test scene

spud control flow graph

Today i did some more work on my spu decompiler and now it correctly builds control flow graphs from spu binaries. It’s a very simple control flow graph by now, but the current implementation is enough to make the execution path of the binary easier to understand already. I also implemented graphviz output to visualize the generated control flow graph. The generated image along with the disassembly of the binary already makes reverse engineering a lot easier.

Take this nonsense sourcecode for example:

void __attribute__((__noinline__)) g(char *dst, char *src, int cnt)
{
    while(cnt > 0)
    {
        *dst = *src;
        dst++;
        src++;
        cnt--;
    }
}

void _start()
{
    char dst[11], src[11] = "abcd1dcba2";
    g(dst, src, 11);

    int i, j = 0;
    for(i = 0; i < 10; i++)
        j += i * i;
    j++;

    asm("stop");
}

If you then compile this code without optimizations and let objdump generate the disassembly you will get something like this for the g function:

000:    24 fe c0 81     stqd    $1,-80($1)
004:    1c ec 00 81     ai    $1,$1,-80
008:    34 01 00 82     lqd    $2,64($1)    # 40
00c:    3e c0 00 86     cwd    $6,0($1)
010:    b0 40 81 86     shufb    $2,$3,$2,$6
014:    24 01 00 82     stqd    $2,64($1)    # 40
018:    34 00 c0 82     lqd    $2,48($1)    # 30
01c:    3e c0 00 83     cwd    $3,0($1)
020:    b0 40 82 03     shufb    $2,$4,$2,$3
024:    24 00 c0 82     stqd    $2,48($1)    # 30
028:    34 00 80 82     lqd    $2,32($1)    # 20
02c:    3e c0 00 83     cwd    $3,0($1)
030:    b0 40 82 83     shufb    $2,$5,$2,$3
034:    24 00 80 82     stqd    $2,32($1)    # 20
038:    32 00 0e 80     br    0xac    # ac
03c:    34 00 c0 82     lqd    $2,48($1)    # 30
040:    1c 03 41 03     ai    $3,$2,13
044:    34 00 01 02     lqd    $2,0($2)
048:    3b 80 c1 02     rotqby    $2,$2,$3
04c:    04 00 01 04     ori    $4,$2,0
050:    34 01 00 82     lqd    $2,64($1)    # 40
054:    34 00 01 03     lqd    $3,0($2)
058:    3e 80 01 05     cbd    $5,0($2)
05c:    b0 60 c2 05     shufb    $3,$4,$3,$5
060:    24 00 01 03     stqd    $3,0($2)
064:    34 01 00 82     lqd    $2,64($1)    # 40
068:    1c 00 41 03     ai    $3,$2,1
06c:    34 01 00 82     lqd    $2,64($1)    # 40
070:    3e c0 00 84     cwd    $4,0($1)
074:    b0 40 81 84     shufb    $2,$3,$2,$4
078:    24 01 00 82     stqd    $2,64($1)    # 40
07c:    34 00 c0 82     lqd    $2,48($1)    # 30
080:    1c 00 41 03     ai    $3,$2,1
084:    34 00 c0 82     lqd    $2,48($1)    # 30
088:    3e c0 00 84     cwd    $4,0($1)
08c:    b0 40 81 84     shufb    $2,$3,$2,$4
090:    24 00 c0 82     stqd    $2,48($1)    # 30
094:    34 00 80 82     lqd    $2,32($1)    # 20
098:    1c ff c1 03     ai    $3,$2,-1
09c:    34 00 80 82     lqd    $2,32($1)    # 20
0a0:    3e c0 00 84     cwd    $4,0($1)
0a4:    b0 40 81 84     shufb    $2,$3,$2,$4
0a8:    24 00 80 82     stqd    $2,32($1)    # 20
0ac:    34 00 80 82     lqd    $2,32($1)    # 20
0b0:    4c 00 01 02     cgti    $2,$2,0
0b4:    21 7f f1 02     brnz    $2,0x3c    # 3c
0b8:    1c 14 00 81     ai    $1,$1,80    # 50
0bc:    35 00 00 00     bi    $0

And this is the disassembly of the _start function:

0c0:    24 00 40 80     stqd    $0,16($1)
0c4:    24 fe 80 81     stqd    $1,-96($1)
0c8:    1c e8 00 81     ai    $1,$1,-96
0cc:    33 80 20 83     lqr    $3,0x1d0    # 1d0
0d0:    34 00 80 82     lqd    $2,32($1)    # 20
0d4:    3e e0 00 84     cdd    $4,0($1)
0d8:    b0 40 81 84     shufb    $2,$3,$2,$4
0dc:    24 00 80 82     stqd    $2,32($1)    # 20
0e0:    40 b0 99 03     il    $3,24882    # 6132
0e4:    34 00 80 82     lqd    $2,32($1)    # 20
0e8:    3e a2 00 84     chd    $4,8($1)
0ec:    b0 40 81 84     shufb    $2,$3,$2,$4
0f0:    24 00 80 82     stqd    $2,32($1)    # 20
0f4:    40 80 00 03     il    $3,0
0f8:    34 00 80 82     lqd    $2,32($1)    # 20
0fc:    3e 82 80 84     cbd    $4,10($1)
100:    b0 40 81 84     shufb    $2,$3,$2,$4
104:    24 00 80 82     stqd    $2,32($1)    # 20
108:    1c 0c 00 83     ai    $3,$1,48    # 30
10c:    1c 08 00 82     ai    $2,$1,32    # 20
110:    04 00 01 04     ori    $4,$2,0
114:    40 80 05 85     il    $5,11
118:    33 7f dd 00     brsl    $0,0x0
11c:    40 80 00 03     il    $3,0
120:    34 01 00 82     lqd    $2,64($1)    # 40
124:    3e c0 00 84     cwd    $4,0($1)
128:    b0 40 81 84     shufb    $2,$3,$2,$4
12c:    24 01 00 82     stqd    $2,64($1)    # 40
130:    40 80 00 03     il    $3,0
134:    34 01 40 82     lqd    $2,80($1)    # 50
138:    3e c0 00 84     cwd    $4,0($1)
13c:    b0 40 81 84     shufb    $2,$3,$2,$4
140:    24 01 40 82     stqd    $2,80($1)    # 50
144:    32 00 0a 80     br    0x198    # 198
148:    34 01 40 82     lqd    $2,80($1)    # 50
14c:    04 00 01 03     ori    $3,$2,0
150:    34 01 40 82     lqd    $2,80($1)    # 50
154:    78 a0 81 85     mpyh    $5,$3,$2
158:    78 a0 c1 04     mpyh    $4,$2,$3
15c:    79 80 81 82     mpyu    $2,$3,$2
160:    18 01 02 83     a    $3,$5,$4
164:    18 00 81 82     a    $2,$3,$2
168:    34 01 00 83     lqd    $3,64($1)    # 40
16c:    18 00 81 83     a    $3,$3,$2
170:    34 01 00 82     lqd    $2,64($1)    # 40
174:    3e c0 00 84     cwd    $4,0($1)
178:    b0 40 81 84     shufb    $2,$3,$2,$4
17c:    24 01 00 82     stqd    $2,64($1)    # 40
180:    34 01 40 82     lqd    $2,80($1)    # 50
184:    1c 00 41 03     ai    $3,$2,1
188:    34 01 40 82     lqd    $2,80($1)    # 50
18c:    3e c0 00 84     cwd    $4,0($1)
190:    b0 40 81 84     shufb    $2,$3,$2,$4
194:    24 01 40 82     stqd    $2,80($1)    # 50
198:    34 01 40 82     lqd    $2,80($1)    # 50
19c:    4c 02 41 02     cgti    $2,$2,9
1a0:    20 7f f5 02     brz    $2,0x148    # 148
1a4:    34 01 00 82     lqd    $2,64($1)    # 40
1a8:    1c 00 41 03     ai    $3,$2,1
1ac:    34 01 00 82     lqd    $2,64($1)    # 40
1b0:    3e c0 00 84     cwd    $4,0($1)
1b4:    b0 40 81 84     shufb    $2,$3,$2,$4
1b8:    24 01 00 82     stqd    $2,64($1)    # 40
1bc:    00 00 00 00     stop
1c0:    1c 18 00 81     ai    $1,$1,96    # 60
1c4:    34 00 40 80     lqd    $0,16($1)
1c8:    35 00 00 00     bi    $0
1cc:    00 20 00 00     lnop

You can see that the functions end with a bi $0 instruction, which means branch indirect to the address stored in the link register ($0). If a function contains calls to other functions the compiler will have to save the link register somewhere. This is done by storing the link register on the stack. If you look at the disassembly of the _start function, there is a stqd $0, 16($1). This is the instruction that does the saving ($1 is the stack pointer). If you look at the end of the function _start there is a lqd $0, 16($1), this instruction restores the link register from the stack. But if you have a look at the g function there is no saving of the link register, as this function does not call any other functions.

And this is the control flow graph generated from the compiled assembly (image generated with graphviz):

control flow graph

The left graph is the g function and the right one the _start function, it is now easy to identify the conditional jumps and even the loops. The edges have different meanings, NEXT means a simple translation to the next block, BRANCH is an unconditional branch and BRANCH_COND is a conditional one. For example block 000ac and block 0003c form a loop, because there exists an edge with the type NEXT from 0003c to 000ac and a conditional branch from 000ac to 0003c and 000b8. Note that any conditional branch produces two edges, one edge to the block if the branch is taken and one to the other if it is not taken.

So the next big step will be to automatically generate the big structures like if and loop from the control flow graph. But more of this later.

alpha vfs design

I am currently writing a microkernel based operating system (called alpha) as my project work for school. To keep things simple I decided to choose the x86 as the only target processor. When I was at the point to implement the virtual file system, I just coded right away. But when it was half finished I started feeling unhappy with my implementation. So I started to rewrite the whole thing with a different approach.

Let me explain my new implementation idea a bit. As my operating system is microkernel based, all servers, drivers, etc. are own user space processes. So the whole virtual filesystem is split into many different parts that communicate with each other by using ipc mechanisms. My idea is to have a central module where all device and filesystem servers can register themselves. This central module is also the owner of the main filesystem tree. You can then request this module to mount a specific device with a specific filesystem somewhere in its filesystem tree. All file operations are routed through this module so device mounts are easy to handle.

At the moment I am coding a framework that will allow to easily write a new filesystem or device driver and a library to communicate with the central module to handle file access.