Simple Functions

Empty Function

We start off with a function that takes no parameters, does nothing, and returns nothing. In this instance we'll use -O0 on the command line to disable optimisations.

void void_function(void) { }

This generates the following assembly:

void_function:
    push    rbp
    mov    rbp, rsp
    nop
    pop    rbp
    ret

The value of the base pointer register rbp - which holds the caller's frame pointer - is pushed on to the stack. rbp is then set to the current value of the stack pointer rsp. These two instructions form the function prologue, which appears in all the functions we define. Later on we'll see space allocated for stack variables in this prologue as well.

The nop (no operation) does nothing; it is an artifact of compiling without optimisation rather than something the function needs. The previous base pointer is then popped off the stack and assigned to the rbp register (the function epilogue).

The ret (return from procedure) instruction pops the value at the top of the stack and transfers program control to that address. We'll see how this return address is placed on to the stack in the next section.

Calling a Function

Let's add a main() function which will call our void function (and the other functions we'll define in this chapter). This main() isn't the program's real entry point; that lives in the C runtime library, which calls main() once it has set things up.

The C code is now this:

void void_function(void) { }

int main(void) {
    void_function();
}

This generates the following assembly:

void_function:
    push    rbp
    mov    rbp, rsp
    nop
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    call    void_function
    mov    eax, 0
    pop    rbp
    ret

Let's start in the main function; after the function prologue (push / mov), we call our void_function. There are four different types of call but we assume this is a 'near' call. The processor pushes the value of the instruction pointer register rip on to the stack. At that moment rip holds the address of the instruction after the call instruction. The processor then branches to the location of void_function.

This should make the ret instruction clearer - it simply reverses the call instruction by popping the saved value of the instruction pointer rip off the stack and branching to that location.
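
The push-then-branch behaviour of call, and the pop-then-branch behaviour of ret, can be sketched with a toy model in C. This is only an illustration under simplifying assumptions: the Machine type, the function names and the addresses used with it are all made up for this sketch, and the stack is modelled as an array of 8-byte slots.

```c
#include <stdint.h>

/* Toy model of the machine state that call and ret touch.
   Machine, machine_init, do_call and do_ret are illustrative names only. */
typedef struct {
    uint64_t rip;        /* instruction pointer */
    uint64_t stack[16];  /* simulated stack memory, one entry per 8 bytes */
    int      sp;         /* index of the top slot; the stack grows down */
} Machine;

/* Start with an empty stack: sp sits one past the highest slot. */
void machine_init(Machine *m, uint64_t start) {
    m->rip = start;
    m->sp  = 16;
}

/* call target: push the address of the instruction after the call,
   then branch. `len` is the encoded length of the call instruction. */
void do_call(Machine *m, uint64_t target, uint64_t len) {
    m->stack[--m->sp] = m->rip + len;  /* return address */
    m->rip = target;
}

/* ret: pop the return address off the stack into rip. */
void do_ret(Machine *m) {
    m->rip = m->stack[m->sp++];
}
```

Calling do_call() and then do_ret() leaves both rip and the stack exactly as they would have been had execution simply continued past the call instruction.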

Returning a Value

How does a function return a value? Let's return an integer.

int int_function(void) { return 64; }

int main(void) {
    int_function();
}

int_function:
    push    rbp
    mov    rbp, rsp
    mov    eax, 64
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    call    int_function
    mov    eax, 0
    pop    rbp
    ret

We see that the int_function() function, after the function prologue, places the immediate value 64 into the eax register. This is due to the calling convention of the platform that gcc is compiling for. In this case it's x86_64 on Linux, so the System V AMD64 ABI applies. Under this convention, an integer return value is returned in rax; a 32-bit int uses its lower half, eax.

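The main() function doesn't do anything with the return value, so eax gets overwritten by main() returning 0. main() has no return statement, but since C99 the standard (section 5.1.2.2.3, 'Program termination') specifies that reaching the closing } of main() returns 0.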

What happens if we switch to the -O1 optimisation level? The assembly generated is:

int_function:
    mov    eax, 64
    ret
main:
    mov    eax, 0
    ret

We see that int_function() skips the function prologue and epilogue entirely - it simply moves 64 into the eax register and returns. In main(), int_function() isn't even called - main() just moves 0 into eax and returns.

Stack Variables

How are stack variables allocated? In the stack_vars() function below, we define three integers on the stack, assign integer values to two of them, perform an addition and assign the result to the third.

void stack_vars(void) { 
    int a, b, c;
    a = 1;
    b = 2;
    c = a + b;
}

int main(void) {
    stack_vars();  
}

stack_vars:
    push    rbp
    mov    rbp, rsp
    mov    DWORD PTR [rbp-12], 1
    mov    DWORD PTR [rbp-8], 2
    mov    edx, DWORD PTR [rbp-12]
    mov    eax, DWORD PTR [rbp-8]
    add    eax, edx
    mov    DWORD PTR [rbp-4], eax
    nop
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    call    stack_vars
    mov    eax, 0
    pop    rbp
    ret

In assembly, we see the instruction mov DWORD PTR [rbp-12], 1, which corresponds to a = 1. Let's break this down:

We know that rbp currently holds the address of the base of the current stack frame, thus rbp - 12 is a memory location 12 bytes below that base.
- In x86, the stack grows down from higher addresses to lower addresses, hence the minus.
Square brackets perform a similar function to the * in C, so [rbp - 12] refers to the value in the memory location rbp - 12.
DWORD PTR is a size directive as we cannot determine how many bytes to move purely from the value '1'.
- The DWORD is a double word, which is 32 bits on x86.
- The PTR denotes that we are referring to a memory location.
Thus we move the immediate value '1', considered to be a 4 byte number, to a memory location starting 12 bytes lower on the stack.

The same assignment happens for b, except that the memory location is 8 bytes lower on the stack.

The values in these memory locations are then moved to general purpose registers, with '1' in edx and '2' in eax. Why these registers? Under the System V AMD64 ABI the general purpose registers rax, rcx, rdx, rsi, rdi and r8 - r11 are considered 'volatile', or caller-saved (eax and edx are the lower 32 bits of rax and rdx). A callee may overwrite them freely; if the caller needs their values after the call returns, the caller must save them itself.

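The add eax, edx instruction adds the value in the second register (source) to the value in the first register (destination) and stores the result in the destination register. The result, now in eax, is then moved to the memory location at rbp - 4, which holds c. Note that the prologue never adjusts rsp to reserve this space: stack_vars() calls no other function, so it can use the 128-byte 'red zone' that the ABI guarantees below rsp.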
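
The walkthrough above can be mirrored in C as a rough sketch. It assumes a byte array standing in for stack memory and an integer offset standing in for rbp; the names store_dword, load_dword and simulate_stack_vars are invented for this illustration and are not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Size of the simulated frame. `frame` stands in for stack memory and
   `rbp` is an offset into it; both are illustrative assumptions. */
enum { FRAME_BYTES = 32 };

/* Equivalent of: mov DWORD PTR [rbp+offset], value
   Writes exactly 4 bytes at rbp + offset (offset is negative for locals). */
void store_dword(unsigned char *frame, int rbp, int offset, int32_t value) {
    memcpy(frame + rbp + offset, &value, sizeof value);
}

/* Equivalent of: mov reg, DWORD PTR [rbp+offset]
   Reads exactly 4 bytes from rbp + offset. */
int32_t load_dword(const unsigned char *frame, int rbp, int offset) {
    int32_t value;
    memcpy(&value, frame + rbp + offset, sizeof value);
    return value;
}

/* Mirrors stack_vars(): a at rbp-12, b at rbp-8, c at rbp-4. */
int32_t simulate_stack_vars(void) {
    unsigned char frame[FRAME_BYTES] = {0};
    int rbp = FRAME_BYTES;                      /* frame base at the high end */
    store_dword(frame, rbp, -12, 1);            /* a = 1 */
    store_dword(frame, rbp, -8, 2);             /* b = 2 */
    int32_t edx = load_dword(frame, rbp, -12);  /* mov edx, [rbp-12] */
    int32_t eax = load_dword(frame, rbp, -8);   /* mov eax, [rbp-8] */
    eax += edx;                                 /* add eax, edx */
    store_dword(frame, rbp, -4, eax);           /* c = a + b */
    return load_dword(frame, rbp, -4);
}
```

The fixed 4-byte memcpy plays the role of the DWORD size directive: the value alone does not say how many bytes to write, so the width has to be stated separately.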

Parameters

Up until now our functions haven't taken any parameters. Let's create a function add() which takes two integers and adds them together, then returns the result.

int add(int a, int b) {
    return a + b;
}

void main(void) {
    add(1, 2);
}

add:
    push    rbp
    mov    rbp, rsp
    mov    DWORD PTR [rbp-4], edi
    mov    DWORD PTR [rbp-8], esi
    mov    edx, DWORD PTR [rbp-4]
    mov    eax, DWORD PTR [rbp-8]
    add    eax, edx
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    mov    esi, 2
    mov    edi, 1
    call    add
    nop
    pop    rbp
    ret

We start off by looking in the main() function. After its prologue, the immediate values '2' and '1' are moved into esi and edi respectively. The choice of these registers is again tied to the ABI. Under the System V AMD64 ABI, integer or pointer arguments are passed in registers rdi, rsi, rdx, rcx, r8 and r9, in that order (edi and esi are the lower 32 bits of rdi and rsi, which is all an int needs). Any additional arguments are passed on the stack.

Within the add() function, the values are moved from the registers on to the stack, with the memory location starting at rbp - 4 holding '1' and the location starting at rbp - 8 holding '2'. They are then moved back into the edx and eax registers and the add instruction is executed on them. The caller's base pointer is popped off the stack into rbp, and the function returns with the result of the addition in eax.

The previous example raises the question: what if the function has more than six parameters? In this case, the first six arguments are passed in the aforementioned registers, and the rest are placed by the caller on the stack. Under the System V AMD64 ABI, these arguments are pushed on to the stack 'right to left'. Thus the value at the top of the stack will be the leftmost argument not placed in a register.
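
As a rough sketch of the right-to-left rule, the following C models the caller's pushes on a simulated stack. It assumes every argument is an integer occupying one 8-byte slot; ArgStack and push_stack_args are hypothetical names used only for this illustration.

```c
enum { REG_ARGS = 6, SLOTS = 16 };

/* Simulated stack of 8-byte slots. `top` is the index of the lowest
   used slot; pushing decrements it, as on x86-64. Illustrative only. */
typedef struct {
    long slot[SLOTS];
    int  top;
} ArgStack;

/* Push the arguments that do not fit in registers, rightmost first.
   Returns how many were pushed, or -1 if they would not fit in SLOTS. */
int push_stack_args(ArgStack *s, const long *args, int nargs) {
    int pushed = 0;
    s->top = SLOTS;                     /* start with an empty stack */
    if (nargs - REG_ARGS > SLOTS)
        return -1;
    for (int i = nargs - 1; i >= REG_ARGS; i--) {
        s->slot[--s->top] = args[i];    /* push args[i] */
        pushed++;
    }
    return pushed;
}
```

With ten arguments numbered 1 to 10, four values are pushed and the slot at s->top holds 7, the leftmost argument that did not fit in a register.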

Let's create an add_10() function which adds 10 integers together:

int add_10(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j) {
    return a + b + c + d + e + f + g + h + i + j;
}

void main(void) {
    add_10(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}

add_10:
    push    rbp
    mov    rbp, rsp
    mov    DWORD PTR [rbp-4], edi
    mov    DWORD PTR [rbp-8], esi
    mov    DWORD PTR [rbp-12], edx
    mov    DWORD PTR [rbp-16], ecx
    mov    DWORD PTR [rbp-20], r8d
    mov    DWORD PTR [rbp-24], r9d
    mov    edx, DWORD PTR [rbp-4]
    mov    eax, DWORD PTR [rbp-8]
    add    edx, eax
    mov    eax, DWORD PTR [rbp-12]
    add    edx, eax
    mov    eax, DWORD PTR [rbp-16]
    add    edx, eax
    mov    eax, DWORD PTR [rbp-20]
    add    edx, eax
    mov    eax, DWORD PTR [rbp-24]
    add    edx, eax
    mov    eax, DWORD PTR [rbp+16]
    add    edx, eax
    mov    eax, DWORD PTR [rbp+24]
    add    edx, eax
    mov    eax, DWORD PTR [rbp+32]
    add    edx, eax
    mov    eax, DWORD PTR [rbp+40]
    add    eax, edx
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    push    10
    push    9
    push    8
    push    7
    mov    r9d, 6
    mov    r8d, 5
    mov    ecx, 4
    mov    edx, 3
    mov    esi, 2
    mov    edi, 1
    call    add_10
    add    rsp, 32
    nop
    leave
    ret

Starting from the main() function, arguments 'j', 'i', 'h' and then 'g' are pushed on to the stack. The rest of the arguments are placed into the registers, then the add_10() function is called. Inside add_10(), the six arguments passed in registers are moved on to the stack from rbp - 4 to rbp - 24. The addition process is then:

The first two parameters are placed into edx and eax.
These are added together with the result stored in edx.
For each subsequent addition, the value is moved from the stack to the eax register and added with edx, with the result stored in edx.

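Once the sixth parameter is added we see the mov instruction reaching to the memory locations rbp + 16, rbp + 24, etc., which are the locations on the stack where the four parameters were pushed by the main() function. They start at rbp + 16 because rbp + 0 holds the saved base pointer and rbp + 8 holds the return address pushed by call.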
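
The register and stack locations described above follow a simple pattern, sketched here in C. This is a simplification that assumes every argument is an integer or pointer and that the callee has run the standard push rbp / mov rbp, rsp prologue; arg_register and stack_arg_offset are hypothetical helper names.

```c
#include <stddef.h>

enum { REG_ARG_COUNT = 6 };

/* Name of the register carrying integer argument `index` (0-based),
   or NULL if that argument is passed on the stack instead. */
const char *arg_register(int index) {
    static const char *const regs[REG_ARG_COUNT] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    return (index >= 0 && index < REG_ARG_COUNT) ? regs[index] : NULL;
}

/* Offset from rbp at which the callee finds argument `index` (0-based),
   or -1 if the argument arrives in a register. [rbp] holds the saved rbp
   and [rbp+8] the return address, so the first stack argument sits at
   [rbp+16]; each further one is 8 bytes higher. */
int stack_arg_offset(int index) {
    if (index < REG_ARG_COUNT)
        return -1;
    return 16 + 8 * (index - REG_ARG_COUNT);
}
```

For add_10(), the seventh argument 'g' has index 6 and offset 16, and the tenth argument 'j' has index 9 and offset 40, matching the [rbp+16] to [rbp+40] operands in the listing.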

The last addition places the result in the return register eax rather than edx, the function epilogue runs, and the function returns.

Back in the main() function the return value is ignored, then 32 bytes are added to the stack pointer with the add rsp, 32 instruction. This restores the stack pointer to where it was before the parameters were pushed (4 parameters pushed * 8 bytes per parameter). We also see a new instruction, leave. It is equivalent to mov rsp, rbp followed by pop rbp. This is the reverse of the function prologue, and returns the base and stack pointers to the values they held before this function's prologue ran.
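
To see that leave really undoes the prologue, here is a minimal simulation in C. It is a sketch only: the Cpu type, the helper names and the starting register values are assumptions made for this illustration, and memory is modelled as a small array of 8-byte slots.

```c
#include <stdint.h>

/* Toy register file plus a simulated stack addressed in bytes.
   All names here are illustrative. */
enum { STACK_BYTES = 256 };

typedef struct {
    uint64_t rsp, rbp;
    uint64_t mem[STACK_BYTES / 8];  /* 8-byte slots; address = index * 8 */
} Cpu;

/* push: move rsp down 8 bytes, then store the value there. */
static void cpu_push(Cpu *c, uint64_t v) {
    c->rsp -= 8;
    c->mem[c->rsp / 8] = v;
}

/* pop: load the value at rsp, then move rsp up 8 bytes. */
static uint64_t cpu_pop(Cpu *c) {
    uint64_t v = c->mem[c->rsp / 8];
    c->rsp += 8;
    return v;
}

/* push rbp; mov rbp, rsp; sub rsp, locals */
void prologue(Cpu *c, uint64_t locals) {
    cpu_push(c, c->rbp);
    c->rbp = c->rsp;
    c->rsp -= locals;
}

/* leave: mov rsp, rbp; pop rbp */
void leave(Cpu *c) {
    c->rsp = c->rbp;
    c->rbp = cpu_pop(c);
}
```

Whatever amount the function subtracted from rsp, leave discards it in one step by reloading rsp from rbp, which is why it can replace an explicit add rsp followed by pop rbp.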

Putting It Together

We'll finish off by using our add() function, but we'll use the return value and assign it to a stack variable in the main() function.

int add(int a, int b) {
    return a + b;
}

int main(void) {
    int x, y = 2;
    x = add(1, 2);
    return x + y;
}

add:
    push    rbp
    mov    rbp, rsp
    mov    DWORD PTR [rbp-4], edi
    mov    DWORD PTR [rbp-8], esi
    mov    edx, DWORD PTR [rbp-4]
    mov    eax, DWORD PTR [rbp-8]
    add    eax, edx
    pop    rbp
    ret
main:
    push    rbp
    mov    rbp, rsp
    sub    rsp, 16
    mov    DWORD PTR [rbp-4], 2
    mov    esi, 2
    mov    edi, 1
    call    add
    mov    DWORD PTR [rbp-8], eax
    mov    edx, DWORD PTR [rbp-8]
    mov    eax, DWORD PTR [rbp-4]
    add    eax, edx
    leave
    ret

Starting in the main function, following the prologue that we've seen is the sub rsp, 16 instruction, expanding the stack (remember, it grows down) by 16 bytes. This makes room for our two integer stack variables, which total 8 bytes. The reason the stack grows by 16 rather than 8 is, once again, the ABI: it requires rsp to be 16-byte aligned at the point of a call instruction.
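
The rounding involved can be sketched as a one-line helper in C. This assumes the compiler simply rounds the space needed for locals up to the next multiple of 16; real compilers may reserve more or less than this in practice, and align16 is a name invented for this illustration.

```c
#include <stddef.h>

/* Round a byte count up to the next multiple of 16.
   Adding 15 and clearing the low four bits does the rounding. */
size_t align16(size_t bytes) {
    return (bytes + 15) & ~(size_t)15;
}
```

For the 8 bytes of locals in main(), align16(8) gives 16, which matches the sub rsp, 16 in the listing.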

Functions