Simple Functions
Empty Function
We start off with a function that takes no parameters, does nothing, and returns nothing. In this instance we'll pass -O0 on the command line to disable optimisations.
void void_function(void) { }
This generates the following assembly:
void_function:
push rbp
mov rbp, rsp
nop
pop rbp
ret
The value of the base pointer register rbp, which holds the calling function's frame pointer, is pushed on to the stack. rbp is then set to the current value of the stack pointer rsp. These two instructions form the function prologue, which is part of every function we define. Later on we'll see space allocation for stack variables in this prologue as well.
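A nop (no operation) follows. It has no effect on the program's state and appears to be an artifact of unoptimised code generation rather than something the function needs. The previous base pointer is then popped off the stack and assigned to the rbp register (the function epilogue).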
The ret (return from procedure) instruction pops the address at the top of the stack and transfers program control to that address. We'll see how that address is placed on the stack in the next section.
Calling a Function
Let's add a main() function which will call our void function (and the other functions we'll define in this chapter). This main() isn't the program's real entry point, which is part of the C runtime library.
The C code is now this:
void void_function(void) { }
int main(void) {
void_function();
}
This generates the following assembly:
void_function:
push rbp
mov rbp, rsp
nop
pop rbp
ret
main:
push rbp
mov rbp, rsp
call void_function
mov eax, 0
pop rbp
ret
Let's start in the main function; after the function prologue (push / mov), we call our void_function. There are four different types of call, but we assume this is a 'near' call. The processor pushes the value of the instruction pointer register rip on to the stack. At that moment rip holds the address of the instruction after the call instruction. The processor then branches to the location of void_function.
This should make the ret instruction clearer: it simply reverses the call instruction by popping the saved value of the instruction pointer rip off the stack and branching to that location.
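To make the pairing of call and ret concrete, here is a minimal sketch in C of a toy machine that does nothing except track an instruction pointer and a stack. Everything in it (toy_cpu, toy_init, toy_call, toy_ret, the addresses used) is hypothetical and only models the behaviour described above; a real processor works on bytes and does far more.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy machine: an instruction pointer, a stack pointer and a
   small stack made of 64-bit slots. */
struct toy_cpu {
    uint64_t ip;                  /* address of the instruction being executed */
    int sp;                       /* index of the current top-of-stack slot */
    uint64_t stack[STACK_WORDS];  /* grows down: sp decreases on push */
};

static void toy_init(struct toy_cpu *cpu, uint64_t entry) {
    cpu->ip = entry;
    cpu->sp = STACK_WORDS;        /* empty stack: one past the last slot */
}

/* call: push the address of the instruction after the call, then branch.
   call_len is the encoded length of the call instruction itself. */
static void toy_call(struct toy_cpu *cpu, uint64_t target, uint64_t call_len) {
    assert(cpu->sp > 0);                       /* room for one more slot */
    cpu->sp -= 1;
    cpu->stack[cpu->sp] = cpu->ip + call_len;  /* return address */
    cpu->ip = target;
}

/* ret: pop the return address and branch to it. */
static void toy_ret(struct toy_cpu *cpu) {
    assert(cpu->sp < STACK_WORDS);             /* something to pop */
    cpu->ip = cpu->stack[cpu->sp];
    cpu->sp += 1;
}
```

If a call at the made-up address 0x1000 is 5 bytes long, toy_call() leaves 0x1005 on the stack, and toy_ret() resumes execution there with the stack back where it started.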
Returning a Value
How does a function return a value? Let's return an integer.
int int_function(void) { return 64; }
int main(void) {
int_function();
}
int_function:
push rbp
mov rbp, rsp
mov eax, 64
pop rbp
ret
main:
push rbp
mov rbp, rsp
call int_function
mov eax, 0
pop rbp
ret
We see that int_function(), after the function prologue, places the immediate value 64 into the eax register. This is due to the calling convention of the platform that gcc is compiling for. In this case it's x86_64 on Linux, so the System V AMD64 ABI applies. Under this convention an integer return value is returned in the rax register; a 32-bit int occupies its lower half, eax.
The main() function doesn't do anything with the return value, so eax gets overwritten by main() returning 0. Since C99, reaching the closing brace of main() without a return statement is equivalent to returning 0 (section 5.1.2.2.3, "Program termination").
What happens if we switch to the -O1 optimisation level? The assembly generated is:
int_function:
mov eax, 64
ret
main:
mov eax, 0
ret
We see that int_function() doesn't perform any of the function prologue: it simply moves 64 into the eax register and returns. In main(), int_function() isn't even called; main() just moves 0 into eax and returns.
Stack Variables
How are stack variables allocated? In the stack_vars() function below, we define three integers on the stack, assign integer values to two of them, perform an addition and assign the result to the third.
void stack_vars(void) {
int a, b, c;
a = 1;
b = 2;
c = a + b;
}
int main(void) {
stack_vars();
}
stack_vars:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-12], 1
mov DWORD PTR [rbp-8], 2
mov edx, DWORD PTR [rbp-12]
mov eax, DWORD PTR [rbp-8]
add eax, edx
mov DWORD PTR [rbp-4], eax
nop
pop rbp
ret
main:
push rbp
mov rbp, rsp
call stack_vars
mov eax, 0
pop rbp
ret
In the assembly, we see the instruction mov DWORD PTR [rbp-12], 1, which corresponds to a = 1. Let's break this down:
- We know that rbp currently holds the address of the base of the stack frame, thus rbp - 12 is a memory location 12 bytes below the frame base.
  - In x86, the stack grows down from higher addresses to lower addresses, hence the minus.
- Square brackets perform a similar function to the * in C, so [rbp - 12] refers to the value in the memory location rbp - 12.
- DWORD PTR is a size directive, as we cannot determine how many bytes to move purely from the value '1'.
  - The DWORD is a double word, which is 32 bits on x86.
  - The PTR denotes that we are referring to a memory location.
- Thus we move the immediate value '1', treated as a 4 byte number, to the memory location starting 12 bytes below the frame base.
The same assignment happens for b, except that the memory location is 8 bytes lower on the stack.
The values in these memory locations are then moved to general purpose registers, with '1' in edx and '2' in eax. Why these registers? Under the System V AMD64 ABI the registers rax, rcx, rdx, rsi, rdi and r8 to r11 are considered 'volatile' (caller-saved); eax and edx are the lower 32 bits of rax and rdx. A function may overwrite volatile registers freely, so if the caller still needs their values after the call, the caller must save them itself.
The add eax, edx instruction adds the value in the second register (source) to the value in the first register (destination) and stores the result in the destination register. The result, now in eax, is then moved to the memory location at rbp - 4, which holds c.
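The same sequence of loads and stores can be re-enacted in C on an ordinary byte buffer. The sketch below is only a model: the buffer stands in for the stack frame, the rbp pointer is a plain C pointer to the end of that buffer, and the helper names (store_dword, load_dword, model_stack_vars) are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value at base + offset, the C analogue of
   mov DWORD PTR [rbp+offset], value. */
static void store_dword(uint8_t *base, int offset, int32_t value) {
    memcpy(base + offset, &value, sizeof value);
}

/* Load a 32-bit value from base + offset, the C analogue of
   mov reg, DWORD PTR [rbp+offset]. */
static int32_t load_dword(const uint8_t *base, int offset) {
    int32_t value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Hypothetical re-enactment of stack_vars() on a 16-byte buffer. */
static int32_t model_stack_vars(void) {
    uint8_t frame[16];
    uint8_t *rbp = frame + sizeof frame;  /* frame base: one past the buffer */

    store_dword(rbp, -12, 1);             /* a = 1 */
    store_dword(rbp, -8, 2);              /* b = 2 */
    int32_t edx = load_dword(rbp, -12);   /* mov edx, DWORD PTR [rbp-12] */
    int32_t eax = load_dword(rbp, -8);    /* mov eax, DWORD PTR [rbp-8] */
    eax += edx;                           /* add eax, edx */
    store_dword(rbp, -4, eax);            /* c = a + b */
    return load_dword(rbp, -4);           /* read c back for inspection */
}
```

Calling model_stack_vars() returns the value that ends up in c. The negative offsets work because the buffer, like the stack, is addressed downwards from its highest address.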
Parameters
Up until now our functions haven't taken any parameters. Let's create a function add() which takes two integers and adds them together, then returns the result.
int add(int a, int b) {
return a + b;
}
void main(void) {
add(1, 2);
}
add:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add eax, edx
pop rbp
ret
main:
push rbp
mov rbp, rsp
mov esi, 2
mov edi, 1
call add
nop
pop rbp
ret
We start off by looking at the main() function. After its prologue, the immediate values '2' and '1' are moved into esi and edi respectively. The choice of these registers is again tied to the ABI. Under the System V AMD64 ABI, the first six integer or pointer arguments are passed in the registers rdi, rsi, rdx, rcx, r8 and r9, in that order; esi and edi are the lower 32 bits of rsi and rdi, which is all an int needs. Any additional arguments are passed on the stack.
Within the add() function, the values are moved from the registers on to the stack, with the memory location starting at rbp - 4 holding '1' and the location starting at rbp - 8 holding '2'. They are then moved back into the edx and eax registers and the add instruction is executed on them. The caller's base pointer is popped off the stack into rbp, and the function returns with the result of the addition in eax.
The previous example raises the question: what if the function has more than six parameters? In this case, the first six arguments are passed in the aforementioned registers, and the rest are placed on the stack by the caller. Under the System V AMD64 ABI, these arguments are pushed 'right to left'. Thus the leftmost argument not passed in a register is pushed last and sits at the lowest address, nearest the top of the stack.
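As a rough illustration of that rule, the following C sketch maps the position of an integer or pointer argument to the place it is passed. The function name int_arg_location is hypothetical, and the model deliberately ignores floating-point and structure arguments, which follow different rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* strcmp, handy when checking the returned names */

/* Where the n-th (0-based) integer or pointer argument is passed under the
   System V AMD64 ABI. Simplified: other argument classes are not modelled. */
static const char *int_arg_location(int index) {
    static const char *const regs[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    assert(index >= 0);
    if ((size_t)index < sizeof regs / sizeof regs[0])
        return regs[index];   /* first six arguments: registers */
    return "stack";           /* seventh argument onwards: pushed by the caller */
}
```

For a ten-argument function, indices 0 to 5 map to registers and indices 6 to 9 map to the stack.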
Let's create an add_10() function which adds 10 integers together:
int add_10(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j) {
return a + b + c + d + e + f + g + h + i + j;
}
void main(void) {
add_10(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}
add_10:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov DWORD PTR [rbp-12], edx
mov DWORD PTR [rbp-16], ecx
mov DWORD PTR [rbp-20], r8d
mov DWORD PTR [rbp-24], r9d
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add edx, eax
mov eax, DWORD PTR [rbp-12]
add edx, eax
mov eax, DWORD PTR [rbp-16]
add edx, eax
mov eax, DWORD PTR [rbp-20]
add edx, eax
mov eax, DWORD PTR [rbp-24]
add edx, eax
mov eax, DWORD PTR [rbp+16]
add edx, eax
mov eax, DWORD PTR [rbp+24]
add edx, eax
mov eax, DWORD PTR [rbp+32]
add edx, eax
mov eax, DWORD PTR [rbp+40]
add eax, edx
pop rbp
ret
main:
push rbp
mov rbp, rsp
push 10
push 9
push 8
push 7
mov r9d, 6
mov r8d, 5
mov ecx, 4
mov edx, 3
mov esi, 2
mov edi, 1
call add_10
add rsp, 32
nop
leave
ret
Starting from the main() function, arguments 'j', 'i', 'h' and then 'g' are pushed on to the stack. The rest of the arguments are placed into registers, then the add_10() function is called. Inside add_10(), the six arguments passed in registers are moved on to the stack at rbp - 4 through rbp - 24. The addition process is then:
- The first two parameters are placed into edx and eax.
- These are added together with the result stored in edx.
- For each subsequent addition, the value is moved from the stack to the eax register and added to edx, with the result stored in edx.
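Once the sixth parameter is added we see the mov instruction reaching to the memory locations rbp + 16, rbp + 24, etc., which are the locations on the stack where the four parameters were pushed by the main() function. They start at rbp + 16 because [rbp] holds the saved base pointer and [rbp + 8] holds the return address pushed by call.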
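The offsets in the listing follow a simple pattern, sketched below in C. The helper stack_arg_offset and its constants are hypothetical names for this illustration, and the formula assumes integer arguments and a callee that starts with the push rbp / mov rbp, rsp prologue shown throughout this chapter.

```c
#include <assert.h>

enum { REG_ARGS = 6, SLOT_BYTES = 8 };

/* Offset from rbp of the n-th (0-based) integer argument when it is passed
   on the stack. Layout assumed: [rbp] saved rbp, [rbp+8] return address,
   [rbp+16] first stack argument, then one 8-byte slot per argument. */
static int stack_arg_offset(int index) {
    assert(index >= REG_ARGS);  /* earlier arguments live in registers */
    return 16 + SLOT_BYTES * (index - REG_ARGS);
}
```

With 0-based indices, 'g' is argument 6 and 'j' is argument 9, so the function reproduces the rbp + 16 and rbp + 40 operands seen in add_10.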
The last addition places the result in the return register eax rather than edx, the function epilogue runs, and the function returns.
Back in the main() function the return value is ignored, then 32 bytes are added to the stack pointer with the add rsp, 32 instruction. This restores the stack pointer to where it was before the parameters were pushed (4 parameters pushed * 8 bytes per parameter). We see a new leave instruction used. This instruction is equivalent to mov rsp, rbp followed by pop rbp. This is the reverse of the function prologue, and returns the base pointer and stack pointer to the values they held before this function was invoked.
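The bookkeeping can be checked with a small simulation. The C sketch below replays only the stack traffic of main() on a toy register set; the struct and function names (toy_regs, toy_push, toy_pop, toy_leave, replay_main) and the starting values are invented for this illustration, and the call itself is left out because call and ret cancel each other.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS 32

/* Hypothetical toy state: rsp and rbp are byte offsets into a small stack
   of 8-byte slots. */
struct toy_regs {
    uint64_t rsp, rbp;
    uint64_t mem[SLOTS];
};

static void toy_push(struct toy_regs *r, uint64_t v) {
    assert(r->rsp >= 8);              /* room for one more slot */
    r->rsp -= 8;                      /* the stack grows down */
    r->mem[r->rsp / 8] = v;
}

static uint64_t toy_pop(struct toy_regs *r) {
    assert(r->rsp + 8 <= SLOTS * 8);  /* something to pop */
    uint64_t v = r->mem[r->rsp / 8];
    r->rsp += 8;
    return v;
}

/* leave: mov rsp, rbp followed by pop rbp */
static void toy_leave(struct toy_regs *r) {
    r->rsp = r->rbp;
    r->rbp = toy_pop(r);
}

/* Returns 1 if rsp and rbp end up with the values they started with. */
static int replay_main(void) {
    struct toy_regs r = { .rsp = SLOTS * 8, .rbp = 0x1234 };
    uint64_t rsp0 = r.rsp, rbp0 = r.rbp;

    toy_push(&r, r.rbp);        /* push rbp */
    r.rbp = r.rsp;              /* mov rbp, rsp */
    toy_push(&r, 10);           /* j */
    toy_push(&r, 9);            /* i */
    toy_push(&r, 8);            /* h */
    toy_push(&r, 7);            /* g */
    r.rsp += 32;                /* add rsp, 32: 4 pushes * 8 bytes */
    assert(r.rsp == r.rbp);     /* back at the frame base */
    toy_leave(&r);
    return r.rsp == rsp0 && r.rbp == rbp0;
}
```

replay_main() returns 1, showing that add rsp, 32 undoes the four pushes and leave undoes the prologue.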
Putting It Together
We'll finish off by using our add() function, but we'll use the return value and assign it to a stack variable in the main() function.
int add(int a, int b) {
return a + b;
}
int main(void) {
int x, y = 2;
x = add(1, 2);
return x + y;
}
add:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add eax, edx
pop rbp
ret
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 2
mov esi, 2
mov edi, 1
call add
mov DWORD PTR [rbp-8], eax
mov edx, DWORD PTR [rbp-8]
mov eax, DWORD PTR [rbp-4]
add eax, edx
leave
ret
Starting in the main function, following the prologue that we've seen is the sub rsp, 16 instruction, which expands the stack (remember, it grows down) by 16 bytes. This makes room for our two integer stack variables, which total 8 bytes. The stack grows by 16 rather than 8 because, once again, of the ABI: it requires the stack pointer to be 16-byte aligned at each call instruction, so the frame size is rounded up to a multiple of 16.
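The rounding itself is easy to express in C. This is a simplified picture under the assumption that the only requirement is a 16-byte multiple; the helper name align_up is hypothetical, and a real compiler weighs more than this when it sizes a frame.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment, which must be a power
   of two. */
static size_t align_up(size_t size, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

For the 8 bytes of locals in main(), align_up(8, 16) gives the 16 seen in sub rsp, 16.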