The CTF challenge
The Capture The Flag challenge offered in the book consists of finding a hidden flag (a string) in a binary, without access to its source code, by using reverse engineering techniques.
Once discovered, the flag unlocks the next level, and so on.
Only basic tools such as a hex editor, gdb, objdump, nm, readelf and strings are used, rather than more complex tools like IDA, Ghidra or Binary Ninja, to make sure the basics are understood first.
Everything has been completed on a Kali Linux VM or on the Linux VM provided by the book author.
This post is my solution for the final level 8.
Level 8
Lorem ipsum
Coming from the previous level, we don't start with a binary executable this time, but with a 2.6 MB text file:
$ ls -lh lvl8
-rw-rw-r-- 1 binary binary 2.6M May 8 21:23 lvl8
$ file lvl8
lvl8: ASCII text, with very long lines
Here are the first few lines:
$ head -n 10 lvl8
lOrem iPsuM doLOr SIT AmEt, conseCTETur adipIscing elit. maecenas eget augue sed leo suscipit ultrICiES sed blandit urna. sed ut risus vitAe Ligula semper scelerisque. fusce et UlTRices telluS, non commodo elit. nullA fACilisi. inteGer pharetra eu massa et ultrIces. nunc dignISsim nisl eu nulla ultricies venenatis. fusce tincidUnT NibH risuS, IN VulputaTe libero congue A. cuRabituR eST diam, lacinia vel placeRat Eget, seMpER nec enim.
proin turpis metus, finibus in porttitor sed, tempor nec ante. nulla ornare volutpat mi, siT AMET VOlUTPAT NUNC. AENEAN VITAE JUSTO IN EX SUSCIPIT PHARETRA. NAM NULLA ERAT, CONGUE eGET QUAM NON, PLACErAT CONSEqUAT RISUs. PROIN QUIS ULTRICeS ODIO. SED AT PHAREtRA QUAM, INTERDUM CoNSECTETuR EROS. SED DOLOR LACUS, VEHICuLA ALIQUeT IACULIs SIT AMET, PRETIUM ViTAE NISI. pHASELLUs AT EX SAGITTIS, ACCUMSAN IPSuM AT, BLANdIT QUAM. MaURIS FERmENTUM VEsTIBULUM vARIUS. PElLENTESQUE CONVALlIS VITAE mI SIT AMEt EUISMOD. dONEC CONsECTETUR tELLUS FAuCIBUS ACcUMSAN EUISMOD. DONeC ULTRICiES RHONCuS ARCU, UT mOLESTIE sEM CONGUe EGET. NULlAM POSUERE LECTUS aC VENENAtIS SEMPEr. PROIN AT nIBH DOLOr.
NULLA PReTIUM MOLeSTIE CONgUE. VIVAMuS EGET LIgULA QUIS tURPIS SAgITTIS RHoNCUS NON uT LECTUS. pHASELLUs id purus ac quam aliquam viverra vitae in massa. donec quis dictum ex. suspendisse ut interdum sem. quisque cursus viverra nisi ac ultrices. in hac habitasse platea dictumst. pellentesque feugiat sodales turpis, nec pretium leo efficitur a.
(...)
It is easy to recognize the Lorem ipsum text, but the mixed-case pattern throughout the text looks unusual: it is not consistent and does not follow any logic, such as capitalizing the start of sentences.
Given the size of the file and the purpose of the book (binary analysis), I think an executable is somehow hidden in that text; the only catch is to find out how.
As a first try, I will assume that the binary is encoded in the case of the characters, ignoring all the other characters such as punctuation, spaces and newlines:
- 0 for lowercased characters - abcde…z
- 1 for uppercased characters - ABCDE…Z
For instance lOrem iPs will be read as 01000010
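To make the mapping concrete, here is a minimal sketch of that rule applied to the short example above. The helper name case_bits is my own choice for illustration, not something taken from the challenge.

```python
def case_bits(text):
    """Map ASCII letters to bits: lowercase -> '0', uppercase -> '1'.

    Every other character (spaces, punctuation, newlines) is skipped.
    """
    bits = []
    for c in text:
        if 'a' <= c <= 'z':
            bits.append('0')
        elif 'A' <= c <= 'Z':
            bits.append('1')
    return ''.join(bits)


sample = case_bits('lOrem iPs')
print(sample)               # 01000010
print(hex(int(sample, 2)))  # 0x42
```

0x42 is the ASCII code of the letter B, which will matter a little further down.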
We know how to recognize an ELF executable by the magic bytes at the start of its header, 7f 45 4c 46, so before spending too much time I'll just work on the first 4 bytes of the resulting 'encoding' to confirm this hypothesis.
This is a quick and dirty Python script that reads the whole file, applies the aforementioned logic to create new binary data and displays the first 4 bytes:
#!/usr/bin/python3
import string

with open('lvl8') as fh:
    buf = ''
    for line in fh.readlines():
        for c in line:
            if c in string.ascii_lowercase:
                buf += '0'
            elif c in string.ascii_uppercase:
                buf += '1'

header = buf[0:32]
n = int(header, 2)
print(n.to_bytes(4, byteorder='big'))
Unfortunately this is not the ELF header:
$ ./l8.py
b'BM\xe8\x1e'
But this BM magic number may already look familiar to you: it is (one of) the BMP file signatures! A lot of websites list file signatures; here I went to Wikipedia to confirm my hunch.
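Checking a header against known signatures is easy to script. The sketch below uses a tiny, hand-picked table (ELF, BMP, PNG) that is only illustrative; the names SIGNATURES and guess_type are hypothetical.

```python
# A few well-known file signatures; illustrative, far from exhaustive.
SIGNATURES = {
    b'\x7fELF': 'ELF executable',
    b'BM': 'BMP image',
    b'\x89PNG': 'PNG image',
}


def guess_type(header):
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return 'unknown'


print(guess_type(b'BM\xe8\x1e'))         # BMP image
print(guess_type(b'\x7f\x45\x4c\x46'))   # ELF executable
```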
Bitmap file
The next step is to write the whole decoded text to a file that can be opened as a BMP image:
#!/usr/bin/python3
import string

with open('lvl8') as fh:
    buf = ''
    for line in fh.readlines():
        for c in line:
            if c in string.ascii_lowercase:
                buf += '0'
            elif c in string.ascii_uppercase:
                buf += '1'

n = int(buf, 2)
length = (n.bit_length() + 7) // 8
with open('level8.bmp', 'wb') as out:
    out.write(n.to_bytes(length, byteorder='big'))
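One caveat with going through a single big integer: leading zero bytes disappear, because bit_length() ignores leading zero bits. It works here since the data starts with 0x42. As an alternative sketch (the function name bits_to_bytes is mine), the bit string can be packed 8 bits at a time instead:

```python
def bits_to_bytes(bits):
    """Pack a string of '0'/'1' characters into bytes, 8 bits at a time.

    Trailing bits that do not fill a whole byte are dropped.
    """
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


print(bits_to_bytes('0100001001001101'))   # b'BM'
print(bits_to_bytes('0000000001000010'))   # b'\x00B' (leading zero byte kept)
```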
This is the resulting image:
% file level8.bmp
level8.bmp: PC bitmap, Windows 3.x format, 300 x 300 x 24
I got stuck here for a while, after trying to find text within the file and in its metadata, looking up the image online for clues (what is this elf?), manipulating the image in Gimp, and all sorts of similar dead ends.
It was only while writing this post, and converting the bitmap to a PNG file, that I noticed the size of the file:
$ ls -lh level8.bmp
-rw-rw-r-- 1 binary binary 264K May 21 13:12 level8.bmp
It seemed a bit large for a 300x300 BMP file, right? After a new series of attempts at patching the BMP header in different ways (width/height, offset to data, …), I was still stuck.
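As a sanity check on that impression, here is a back-of-the-envelope estimate of what an uncompressed bitmap of these dimensions should weigh. It assumes the common layout of a 14-byte file header followed by a 40-byte BITMAPINFOHEADER, no palette and no compression; bmp_expected_size is a hypothetical helper.

```python
def bmp_expected_size(width, height, bits_per_pixel, header_size=54):
    """Expected size in bytes of an uncompressed BMP.

    Assumes 54 bytes of headers (14 + 40) and no colour palette.
    Each pixel row is padded to a multiple of 4 bytes.
    """
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return header_size + row_bytes * height


size = bmp_expected_size(300, 300, 24)
print(size, round(size / 1024))   # 270054 264
```

That is close to the 264K reported by ls, which suggests the file size on its own is roughly what a 24-bit 300x300 bitmap weighs, and that anything hidden would have to sit inside the pixel data rather than be appended to it.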
Then I decided to use ImageMagick to extract as much metadata as possible from this file to hopefully find something odd, with the identify -verbose level8.bmp command:
$ identify -verbose level8.bmp
Image: level8.bmp
Format: BMP3 (Microsoft Windows bitmap image (V3))
Class: DirectClass
Geometry: 300x300+0+0
Resolution: 28.34x28.34
Print size: 10.5857x10.5857
Units: PixelsPerCentimeter
Type: Palette
Endianess: Undefined
Colorspace: sRGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Pixels: 90000
Red:
min: 0 (0)
max: 255 (1)
mean: 173.646 (0.680964)
standard deviation: 109.84 (0.430743)
kurtosis: -1.35726
skewness: -0.702737
Green:
min: 0 (0)
max: 255 (1)
mean: 179.209 (0.702781)
standard deviation: 104.566 (0.410062)
kurtosis: -0.880643
skewness: -0.950063
Blue:
min: 0 (0)
max: 255 (1)
mean: 143.475 (0.562649)
standard deviation: 121.291 (0.475651)
kurtosis: -1.85896
skewness: -0.26817
Image statistics:
Overall:
min: 0 (0)
max: 255 (1)
mean: 165.444 (0.648798)
standard deviation: 112.116 (0.439672)
kurtosis: -1.41505
skewness: -0.644421
Colors: 42
Histogram:
17643: ( 0, 0, 0) #000000 black
389: ( 0, 0, 1) #000001 srgb(0,0,1)
341: ( 0, 1, 0) #000100 srgb(0,1,0)
269: ( 0, 1, 1) #000101 srgb(0,1,1)
379: ( 1, 0, 0) #010000 srgb(1,0,0)
227: ( 1, 0, 1) #010001 srgb(1,0,1)
259: ( 1, 1, 0) #010100 srgb(1,1,0)
365: ( 1, 1, 1) #010101 srgb(1,1,1)
535: ( 58, 58, 58) #3A3A3A srgb(58,58,58)
12: ( 58, 58, 59) #3A3A3B srgb(58,58,59)
10: ( 58, 59, 58) #3A3B3A srgb(58,59,58)
4: ( 58, 59, 59) #3A3B3B srgb(58,59,59)
15: ( 59, 58, 58) #3B3A3A srgb(59,58,58)
1: ( 59, 58, 59) #3B3A3B srgb(59,58,59)
6: ( 59, 59, 58) #3B3B3A srgb(59,59,58)
1433: ( 59, 59, 59) #3B3B3B grey23
720: ( 84,170, 0) #54AA00 srgb(84,170,0)
27: ( 84,170, 1) #54AA01 srgb(84,170,1)
19: ( 84,171, 0) #54AB00 srgb(84,171,0)
20: ( 84,171, 1) #54AB01 srgb(84,171,1)
8662: ( 85,170, 0) #55AA00 srgb(85,170,0)
22: ( 85,170, 1) #55AA01 srgb(85,170,1)
20: ( 85,171, 0) #55AB00 srgb(85,171,0)
14: ( 85,171, 1) #55AB01 srgb(85,171,1)
1440: ( 94, 65, 6) #5E4106 srgb(94,65,6)
3093: (254,254, 0) #FEFE00 srgb(254,254,0)
169: (254,254, 1) #FEFE01 srgb(254,254,1)
7157: (254,254,254) #FEFEFE srgb(254,254,254)
381: (254,254,255) #FEFEFF srgb(254,254,255)
147: (254,255, 0) #FEFF00 srgb(254,255,0)
143: (254,255, 1) #FEFF01 srgb(254,255,1)
357: (254,255,254) #FEFFFE srgb(254,255,254)
287: (254,255,255) #FEFFFF srgb(254,255,255)
6336: (255,213,176) #FFD5B0 srgb(255,213,176)
199: (255,254, 0) #FFFE00 srgb(255,254,0)
138: (255,254, 1) #FFFE01 srgb(255,254,1)
441: (255,254,254) #FFFEFE srgb(255,254,254)
286: (255,254,255) #FFFEFF srgb(255,254,255)
1023: (255,255, 0) #FFFF00 yellow
128: (255,255, 1) #FFFF01 srgb(255,255,1)
281: (255,255,254) #FFFFFE srgb(255,255,254)
36602: (255,255,255) #FFFFFF white
Rendering intent: Perceptual
Gamma: 0.454545
Chromaticity:
red primary: (0.64,0.33)
green primary: (0.3,0.6)
blue primary: (0.15,0.06)
white point: (0.3127,0.329)
Background color: white
Border color: srgb(223,223,223)
Matte color: grey74
Transparent color: black
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 300x300+0+0
Dispose: Undefined
Iterations: 0
Compression: Undefined
Orientation: Undefined
Properties:
date:create: 2020-05-25T21:20:38+02:00
date:modify: 2020-05-25T21:20:38+02:00
signature: 554206c801b93ce2d7bb94da43bbf49c9c13ce3f1dfcf408a92c32c946fa7cd8
Artifacts:
filename: level8.bmp
verbose: true
Tainted: False
Filesize: 270KB
Number pixels: 90K
Pixels per second: 0B
User time: 0.000u
Elapsed time: 0:01.000
Version: ImageMagick 6.8.9-9 Q16 x86_64 2017-07-31 http://www.imagemagick.org
The first parameter that looks off for this picture is the number of colors reported by the histogram: 42 is way more than the few visible to the naked eye.
I counted fewer than a dozen when trying to tell them apart with the Gimp color picker tool.
Using Gimp and setting the contrast to the maximum and the brightness to the minimum lets us see that something is weird around the bottom region of the picture compared to the untouched image, as if it were totally jammed:
Which points us to… steganography!
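Going back to the histogram: many of its entries differ from a neighbour by just 1 in one or more channels. As a rough check (my own sketch, with the colors copied by hand from the identify output above and a hypothetical clear_lsb helper), clearing the lowest bit of every channel shows how many distinct colors remain:

```python
# Colours copied from the identify histogram (pixel counts left out).
colors = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1),
    (58, 58, 58), (58, 58, 59), (58, 59, 58), (58, 59, 59),
    (59, 58, 58), (59, 58, 59), (59, 59, 58), (59, 59, 59),
    (84, 170, 0), (84, 170, 1), (84, 171, 0), (84, 171, 1),
    (85, 170, 0), (85, 170, 1), (85, 171, 0), (85, 171, 1),
    (94, 65, 6),
    (254, 254, 0), (254, 254, 1), (254, 254, 254), (254, 254, 255),
    (254, 255, 0), (254, 255, 1), (254, 255, 254), (254, 255, 255),
    (255, 213, 176),
    (255, 254, 0), (255, 254, 1), (255, 254, 254), (255, 254, 255),
    (255, 255, 0), (255, 255, 1), (255, 255, 254), (255, 255, 255),
]


def clear_lsb(color):
    """Drop the least significant bit of each channel."""
    return tuple(channel & 0xFE for channel in color)


base_colors = sorted({clear_lsb(c) for c in colors})
print(len(colors), len(base_colors))   # 42 7
```

Seven base colors is in line with the handful visible by eye; the other entries only differ in the lowest bit of some channel.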
Steganography
Hiding data unsuspiciously in a file is called steganography.
Here we have a simple BMP file that seems to be normal at first glance, however its size seems to indicate that something is off.
A common and simple steganography technique to hide data within an image file is to slightly modify the RGB values of the image pixels: the Least Significant Bit (LSB) of the Red, Green and Blue channels is overwritten with a bit of the data we want to hide.
Extra data has been stored, but the image looks the same. Extracting that data is as simple as reading the LSB of every RGB channel of every pixel.
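Here is a minimal round-trip sketch of the idea on a handful of pixels, in pure Python. The function names and the R, G, B bit order are assumptions made for illustration; a real image may use another channel or row order.

```python
def embed_bits(pixels, bits):
    """Return a copy of pixels with bits written into the channel LSBs.

    Bits are consumed in R, G, B order, pixel after pixel (illustrative choice).
    """
    flat = [channel for pixel in pixels for channel in pixel]
    if len(bits) > len(flat):
        raise ValueError('not enough pixels to hold the payload')
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | bit
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def extract_bits(pixels, count):
    """Read back the first count LSBs in the same order."""
    flat = [channel for pixel in pixels for channel in pixel]
    return [channel & 1 for channel in flat[:count]]


cover = [(255, 255, 255), (84, 170, 0), (58, 58, 58)]
payload = [0, 1, 0, 0, 0, 0, 1, 0]        # 0x42
stego = embed_bits(cover, payload)
print(stego)   # [(254, 255, 254), (84, 170, 0), (59, 58, 58)]
print(extract_bits(stego, len(payload)) == payload)   # True
```

No channel moves by more than 1, which is why the picture looks unchanged to the eye.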
This is already well explained in many articles on the Internet, so I won't write my own version here; one that I found particularly clear can be found here.
Besides the quick and dirty check with Gimp, there are tools to analyze a file and possibly spot whether such techniques have been used to hide data in it. Here is a neat list of useful stego tools and resources; I chose zsteg for a quick check:
[zsteg output for level8.bmp; listing not preserved]
Check out line 5: looks like we have an ELF executable hidden!
Extract the payload to a file:
$ zsteg -E b1,lsb,bY level8.bmp > new_binary
$ file new_binary
new_binary: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=3f43c1bc1bc2d1dccc12d2fbb1cb83347e8cb3b4, not stripped
$ ./new_binary
$
The extracted ELF binary is valid and can be executed; however, it does nothing visible, so it will also need to be reverse engineered to get the flag and continue the challenge.
But before that, to fully understand the LSB trick, I'll extract the same payload myself with a Python script, using the Python Imaging Library (PIL, through Pillow) to avoid wasting time on the row padding of the BMP pixel data.
I tried to make it very explicit, simple and well commented:
import sys
# Python Pillow library https://python-pillow.org/
from PIL import Image

input_bmp = sys.argv[1]
output_file = sys.argv[2]
print(f"Reading from BMP file '{input_bmp}'")
bmp = Image.open(input_bmp)
width, height = bmp.size
print(f"BMP height:{height} width:{width}")

# LSB bits retrieved from the RGB channels
lsb_bits = []

# Based on https://wiki.python.org/moin/BitManipulation
def setBit(i, offset):
    mask = 1 << (7 - offset)
    return (i | mask)

# Iterates over all the pixels of the image the way the pixels of a BMP image are stored:
# https://en.wikipedia.org/wiki/BMP_file_format#Pixel_storage
# "Usually pixels are stored "bottom-up", starting in the lower left corner, going from left to right,
# and then row by row from the bottom to the top of the image"
# Note: range(height-1, 0, -1) stops at y=1, so the top row (y=0) is not read;
# this is consistent with the byte count printed in the run below.
for y in range(height-1, 0, -1):
    for x in range(width):
        r, g, b = bmp.getpixel((x, y))
        lsb_r = (r & 1)
        lsb_g = (g & 1)
        lsb_b = (b & 1)
        # Bits are collected in B, G, R order
        lsb_bits.append(lsb_b)
        lsb_bits.append(lsb_g)
        lsb_bits.append(lsb_r)

# The byte being built from the LSB bits
byte = 0
bit_index = 0
# The new bytes assembled from the LSB bits - these are the 'hidden' data
new_bytes = []
with open(output_file, 'wb') as out:
    for i in lsb_bits:
        # No need to deal with the bits set to zero as the
        # 'byte' variable is initialized/reset to zero
        if i == 1:
            byte = setBit(byte, bit_index)
        bit_index += 1
        # We have set all the bits of a byte, append to the array to be written to disk
        if bit_index == 8:
            bit_index = 0
            new_bytes.append(byte)
            byte = 0
    # Finally, write to disk all the 'hidden' data bytes
    print(f"Written {len(new_bytes)} bytes to '{output_file}'")
    out.write(bytearray(new_bytes))
Let’s ensure we get out of it the same binary as with zsteg:
$ python3 lvl8.py level8.bmp new_binary_python
Reading from BMP file 'level8.bmp'
BMP height:300 width:300
Written 33637 bytes to 'new_binary_python'
$ sha1sum new_binary
d53ea88677bb59b1f4186fc635ca29ab96216cdc new_binary
$ sha1sum new_binary_python
d53ea88677bb59b1f4186fc635ca29ab96216cdc new_binary_python
Phew! The binaries extracted by zsteg and by our Python script are the same, so let's move on to reversing that new binary.
Note: While there are a lot of steganography tools like zsteg to extract/analyze hidden data in files, I strongly recommend getting your hands dirty with the low-level stuff at least once. I personally understood a lot of things while dealing with Python and a hex editor.
Another ELF binary
Overview and static analysis
Back to reverse engineering the extracted binary. From the previous file invocation, we know it's a dynamically linked executable.
strace doesn't reveal anything of interest:
$ strace ./new_binary
execve("./new_binary", ["./new_binary"], 0x7ffc4341d030 /* 30 vars */) = 0
brk(NULL) = 0x968000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=87790, ...}) = 0
mmap(NULL, 87790, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3c77988000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0n\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1839792, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3c77986000
mmap(NULL, 1852680, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3c777c1000
mprotect(0x7f3c777e6000, 1662976, PROT_NONE) = 0
mmap(0x7f3c777e6000, 1355776, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f3c777e6000
mmap(0x7f3c77931000, 303104, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x170000) = 0x7f3c77931000
mmap(0x7f3c7797c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ba000) = 0x7f3c7797c000
mmap(0x7f3c77982000, 13576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3c77982000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f3c77987540) = 0
mprotect(0x7f3c7797c000, 12288, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ) = 0
mprotect(0x7f3c779c8000, 4096, PROT_READ) = 0
munmap(0x7f3c77988000, 87790) = 0
brk(NULL) = 0x968000
brk(0x98b000) = 0x98b000
mprotect(0x969000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0x969000, 4096, PROT_EXEC) = 0
exit_group(0) = ?
+++ exited with 0 +++
With ltrace :
$ ltrace ./new_binary
__libc_start_main(0x4005d6, 1, 0x7ffd1c95a0b8, 0x4006b0 <unfinished ...>
memalign(4096, 4096, 0x7ffd1c95a0c8, 0x7fb34ecb3718) = 0x1089000
mprotect(0x1089000, 4096, 3, 0) = 0
memcpy(0x1089000, "\234\235\236\232\233t\315\314\314\314s\315\314\314\314\204A\371\324\314\314\314\204G\331\376\314\314\314\303\311\223"..., 83) = 0x1089000
mprotect(0x1089000, 4096, 4, 82) = 0
+++ exited (status 0) +++
Besides the usual library calls made when an ELF binary is executed, two stand out in this context:
- memalign The obsolete function memalign() allocates size bytes and returns a pointer to the allocated memory
- memcpy The memcpy() function copies n bytes from memory area src to memory area dest
Confirmed by the file relocation section, lines 10 and 11:
[relocation section listing not preserved]
No low-hanging fruit (i.e. a string or any other clue) in the .rodata section of the binary:
$ objdump -sj .rodata new_binary
new_binary: file format elf64-x86-64
Contents of section .rodata:
400730 01000200 ....
Before moving on to dynamic analysis with gdb, we'll try to quickly get the gist of the program with static analysis of the disassembly obtained from objdump.
All the assembly snippets below come from the output of objdump -d -Mintel ./new_binary
Let's start following the code flow from the entry point of the program, obtained with readelf:
$ readelf -h ./new_binary_steg |grep Entry
Entry point address: 0x4004e0
The entry point is 0x4004e0, which points into the .text section, to the _start function on line 3:
[objdump disassembly of _start; listing not preserved]
On line 13, the first parameter of __libc_start_main(), which lies in the rdi register (see the x86 calling convention), is a pointer to the main function of the program; here it is 0x4005d6.
Note: There is a very well documented page explaining all the low-level details of how an ELF binary gets loaded on Linux.
Below is the main function. Some interesting things can be noticed:
- there is an interesting symbol named flag_bin_len, see # 6010b4 <flag_bin_len>
  - it is addressed relative to rip, so the operand does not show the address directly; objdump's comment resolves it to 0x6010b4, which we will check at runtime
- lines 9 to 27
  - allocate 4096 bytes of memory (edi register, 0x1000) with memalign()
  - make the allocated memory readable/writable with mprotect(), the 0x3 parameter stored in edx being PROT_READ | PROT_WRITE
  - then copy something into that memory chunk with memcpy()
- lines 29 to 44
  - there is a xor-based loop starting at 0x400642 and branching back at 0x400673 (jb 400642) as long as the cmp edx,eax comparison finds edx below eax. This suggests that something was obfuscated with a XOR cipher and is now being decrypted, 4 bytes at a time.
- line 45 and beyond
  - another mprotect() call with the PROT_EXEC flag to make the previously allocated memory range executable
  - the call rdx suggests that what was decrypted previously is code that is about to be executed, but that needs to be verified.
[objdump disassembly of main; listing not preserved]
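The numeric protection values seen in the ltrace output (3 for the first mprotect call, 4 for the second) map onto the PROT_* flags mentioned in the list above. A small sketch, assuming the usual Linux values from <sys/mman.h> (PROT_READ = 1, PROT_WRITE = 2, PROT_EXEC = 4); describe_prot is a hypothetical helper:

```python
# Usual Linux values from <sys/mman.h>.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def describe_prot(value):
    """Turn an mprotect() prot argument into its symbolic form."""
    flags = (
        (PROT_READ, 'PROT_READ'),
        (PROT_WRITE, 'PROT_WRITE'),
        (PROT_EXEC, 'PROT_EXEC'),
    )
    names = [name for flag, name in flags if value & flag]
    return ' | '.join(names) or 'PROT_NONE'


print(describe_prot(3))   # PROT_READ | PROT_WRITE
print(describe_prot(4))   # PROT_EXEC
```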
Time to switch to dynamic analysis with gdb to verify these assumptions!
Dynamic analysis with gdb
$ gdb -q --args ./new_binary
Reading symbols from ./new_binary...
(No debugging symbols found in ./new_binary)
(gdb) set disassembly-flavor intel
(gdb) b main
Breakpoint 1 at 0x4005da
(gdb) r
Starting program: /home/loic/shared/new_binary
Breakpoint 1, 0x00000000004005da in main ()
(gdb)
What has been done above:
- Fire up gdb on the binary: gdb -q --args ./new_binary
- Set personal preferences for the way the assembly is displayed: set disassembly-flavor intel
- Set a breakpoint on main(): b main
- Start the program: r
I won't dump the contents of the main function again here; see above for reference, or type x/50i main in gdb to print the first 50 instructions of main.
Just a quick word about what happens at runtime to flag_bin_len, which we noticed before: its memory address is 0x6010b4 and we can retrieve its value, 0x53, stored on 4 bytes as shown by DWORD PTR:
(gdb) x/3i 0x400609
0x400609 <main+51>: call 0x4004c0 <mprotect@plt>
0x40060e <main+56>: mov eax,DWORD PTR [rip+0x200aa0] # 0x6010b4 <flag_bin_len>
0x400614 <main+62>: mov edx,eax
(gdb) x/4bx 0x6010b4
0x6010b4 <flag_bin_len>: 0x53 0x00 0x00 0x00
How did gdb compute the memory address 0x6010b4?
The operand is [rip+0x200aa0], and rip points to the next instruction to be executed, which is at 0x400614, so that's 0x200aa0 + 0x400614 = 0x6010b4.
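The same arithmetic, plus the little-endian decoding of the four bytes dumped by x/4bx, can be double-checked in a few lines of Python. The function name is mine; the numbers all come from the gdb session above.

```python
def rip_relative_target(next_instruction, displacement):
    """Address referenced by a [rip+disp] operand.

    rip holds the address of the instruction following the current one.
    """
    return next_instruction + displacement


addr = rip_relative_target(0x400614, 0x200aa0)
print(hex(addr))   # 0x6010b4

# x86 is little-endian: the least significant byte is stored first.
raw = bytes([0x53, 0x00, 0x00, 0x00])
flag_bin_len = int.from_bytes(raw, byteorder='little')
print(hex(flag_bin_len), flag_bin_len)   # 0x53 83
```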
The memcpy call at 0x400622 copies something into the newly allocated chunk of memory, as you can see by following what happens to rax right after the memalign call.
From its man page and the x86 calling convention, we know its parameters and the registers holding them:
- rdi: void *dest
- rsi: void *src
- rdx: size_t n
Put a breakpoint just before the call to examine these registers:
(gdb) b *0x400622
Breakpoint 4 at 0x400622
(gdb) c
Continuing.
Breakpoint 4, 0x0000000000400622 in main ()
(gdb) p/x $rdi
$10 = 0x603000
(gdb) p/x $rsi
$11 = 0x601060
(gdb) p/x $rdx
$12 = 0x53
memcpy will copy 0x53 bytes (rdx) from 0x601060 (rsi) to 0x603000 (rdi). We can examine this data, which is 0x53 (83) bytes long; note that the count in x/53x below is decimal, so only the first 53 bytes are shown:
(gdb) x/53x 0x601060
0x601060 <flag_bin>: 0x9c 0x9d 0x9e 0x9a 0x9b 0x74 0xcd 0xcc
0x601068 <flag_bin+8>: 0xcc 0xcc 0x73 0xcd 0xcc 0xcc 0xcc 0x84
0x601070 <flag_bin+16>: 0x41 0xf9 0xd4 0xcc 0xcc 0xcc 0x84 0x47
0x601078 <flag_bin+24>: 0xd9 0xfe 0xcc 0xcc 0xcc 0xc3 0xc9 0x93
0x601080 <flag_bin+32>: 0x92 0x96 0x95 0x94 0x74 0xf0 0xcc 0xcc
0x601088 <flag_bin+40>: 0xcc 0x84 0xfd 0x33 0xc3 0xc9 0xfe 0xfe
0x601090 <flag_bin+48>: 0xff 0xf9 0xad 0xfa 0xae
There is also a symbol aptly named flag_bin at this address :) It looks like we are getting closer to the goal.
Now the rough xor loop outlined before starts to make sense:
[disassembly of the xor loop; listing not preserved]
The loop ends once all 0x53 bytes (flag_bin_len, tested on lines 14-15) have been xor'ed with 0xffffffcc on line 10: the jb 400642 <main+0x6c> jump is then no longer taken and execution continues below.
Let's quickly show what happens to the first 4 bytes of data after a few iterations of this xor loop, by:
- setting a breakpoint right before the exit condition test of the loop
- dumping memory where the result of that operation is stored - remember we got the allocated memory address earlier
[gdb session showing the first iterations of the xor loop; listing not preserved]
You can see the initial values 0x9c 0x9d 0x9e 0x9a being xor'ed into 0x50 0x51 0x52 0x56 iteration after iteration; the same applies to the rest of the 0x53 bytes.
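The transformation is easy to reproduce outside gdb. This sketch assumes that each byte ends up xor'ed with 0xcc, the low byte of the 0xffffffcc constant, which is consistent with the before/after values quoted above; xor_decode is a hypothetical helper.

```python
KEY = 0xCC   # low byte of the 0xffffffcc constant used by the loop (assumption)


def xor_decode(data, key=KEY):
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


first = bytes([0x9c, 0x9d, 0x9e, 0x9a])   # first 4 bytes of flag_bin
decoded = xor_decode(first)
print(' '.join(f'0x{b:02x}' for b in decoded))   # 0x50 0x51 0x52 0x56
print(decoded)                                   # b'PQRV'
```

Feeding it the rest of the flag_bin dump works the same way.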
Since these values seem to fit within the printable ASCII range, let's try to print them:
(gdb) x/4bc 0x603000
Definitely printable characters!
However, this way of dumping memory is not very practical, so I will first create this handy gdb macro to dump that memory area the way xxd does:
[gdb macro definition; listing not preserved]
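Outside gdb, a similar xxd-like view can be produced in Python from any bytes object. This is a sketch of mine, not the gdb macro mentioned above; the function name and the exact column layout are assumptions.

```python
def hexdump(data, base=0, width=16):
    """Format bytes loosely like xxd: address, hex bytes, printable ASCII."""
    pad = width * 3 - 1   # width of the hex column for a full row
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = ' '.join(f'{b:02x}' for b in chunk)
        text = ''.join(chr(b) if 32 <= b < 127 else '.' for b in chunk)
        lines.append(f'{base + offset:08x}: {hex_part:<{pad}}  {text}')
    return '\n'.join(lines)


dump = hexdump(bytes([0x50, 0x51, 0x52, 0x56]), base=0x603000)
print(dump)
```

Non-printable bytes are shown as dots in the right-hand column, which makes ASCII strings stand out in a block of otherwise binary data.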
Set the breakpoint right after the jb conditional jump, where the xor loop should be complete, and dump the relevant memory range:
[gdb session dumping the decrypted memory; listing not preserved]
Looks like we got something looking like a flag on lines 19-20! Trying it out:
[flag submission; listing not preserved]
That’s it, level completed! I won’t spoil the reward though :)
Even though we have found the flag, I still wonder what the instructions after the jb are doing:
[disassembly of main after the xor loop; listing not preserved]
It looks like code to be executed because:
- lines 5 and 8: mprotect is called with PROT_EXEC to make executable the memory range where the flag and the preceding bytes are stored
- line 16: the code pointed to by the rdx register is called
To investigate, from the previous running gdb session, I'll put a breakpoint on this instruction to inspect the value of the rdx register and disassemble 10 instructions from this address.
[gdb session inspecting rdx; listing not preserved]
The rdx register has the value 0x603053, and a disassembly of the instructions at this address shows ret as the first instruction, so basically it does nothing: it just returns to the main function, which sets eax to 0 as an exit code.
As this is still a mystery, I’ll reach out to the book author and post updates here if any.
Feedback
Constructive criticism is always welcome (comments here, contact in About), as I would be more than happy to learn more!