# x86 Review | Intro (choose your own adventure)

Martin Kellogg

- the PA3c1 (codegen testing) deadline has already passed
  - if you forgot about it, we will still accept submissions (with a penalty)

- the PA3c1 (codegen testing) deadline has already passed
  - if you forgot about it, we will still accept submissions (with a penalty)
- PA3c2 (TAC) is due later this week ("before spring break")
  - you should already have started, or you are behind
  - if there is demand from the class, I will consider a short extension on this assignment to e.g., Monday

- the PA3c1 (codegen testing) deadline has already passed
  - if you forgot about it, we will still accept submissions (with a penalty)
- PA3c2 (TAC) is due later this week ("before spring break")
  - you should already have started, or you are behind
  - if there is demand from the class, I will consider a short extension on this assignment to e.g., Monday
- I have become aware of a **bug in the reference compiler**'s x86-64 module; a fix will be forthcoming. For now, don't trust it.

- the PA3c1 (codegen testing) deadline has already passed
  - if you forgot about it, we will still accept submissions (with a penalty)
- PA3c2 (TAC) is due later this week ("before spring break")
  - you should already have started, or you are behind
  - if there is demand from the class, I will consider a short extension on this assignment to e.g., Monday
- I have become aware of a **bug in the reference compiler**'s x86-64 module; a fix will be forthcoming. For now, don't trust it.
- Don't forget there is a midterm in this class the week after spring break!

• Overview of x86-64 architecture

- Overview of x86-64 architecture
  - this might be a review of some things you learned in 350

- Overview of x86-64 architecture
  - this might be a review of some things you learned in 350
  - it might also be new
    - either is fine! (but we're going to go quickly...)

- Overview of x86-64 architecture
  - this might be a review of some things you learned in 350
  - it might also be new
    - either is fine! (but we're going to go quickly...)
  - my goal today: make sure you're aware of the "usual traps" in the x86-64 standard, and make sure you know where to go to learn more
    - I am not trying to give you a detailed understanding of each construct today (you'll get that from doing PA3...)

- When you're doing PA3, you're going to want to refer to some resources
  - $\circ$  there are many on the web
    - and I encourage you to explore

- When you're doing PA3, you're going to want to refer to some resources
  - $\circ$  there are many on the web
    - and I encourage you to explore
    - I don't suggest trusting ChatGPT or similar tools on this assembly programming requires getting all the details right, and LLMs are very bad at being detail-oriented

- When you're doing PA3, you're going to want to refer to some resources
  - $\circ$  there are many on the web
    - and I encourage you to explore
    - I don't suggest trusting ChatGPT or similar tools on this assembly programming requires getting all the details right, and LLMs are very bad at being detail-oriented
  - I have curated a few resources here: <u>https://kelloggm.github.io/martinjkellogg.com/teaching/cs48</u> <u>5-sp25/languages/#x86-64</u>

- When you're doing PA3, you're going to want to refer to some resources
  - there are many on the web
    - and I encourage you to explore
    - I don't suggest trusting ChatGPT or similar tools on this assembly programming requires getting all the details right, and LLMs are very bad at being detail-oriented
  - I have curated a few resources here: <u>https://kelloggm.github.io/martinjkellogg.com/teaching/cs48</u> <u>5-sp25/languages/#x86-64</u>

- x86 is a very old assembly language
  - 8086 processor for which it was originally designed was released in 1976...

- x86 is a very old assembly language
  - 8086 processor for which it was originally designed was released in 1976...
  - microarchitecture has changed a lot since then: pipelining, super-scalar, out-of-order, caching, multicore, ...

- x86 is a very old assembly language
  - 8086 processor for which it was originally designed was released in 1976...
  - microarchitecture has changed a lot since then: pipelining, super-scalar, out-of-order, caching, multicore, ...
- Modern x86 is **still backward-compatible** with 8086 code
  - You can get VisiCalc 1.0 on the web & run it!

- x86 is a very old assembly language
  - 8086 processor for which it was originally designed was released in 1976...
  - microarchitecture has changed a lot since then: pipelining, super-scalar, out-of-order, caching, multicore, ...
- Modern x86 is still backward-compatible with 8086 code
  You can get VisiCalc 1.0 on the web & run it!
- Intel's descriptions of the architecture are engulfed with modes and flags; the modern processor is fairly straightforward
  - Load/Store from memory
  - Register-register operations

• x86 is technically a CISC (complex instruction set computer) architecture

- x86 is technically a CISC (complex instruction set computer) architecture
  - key definitional feature of a CISC: there are instructions that take more than one clock cycle to execute

- x86 is technically a CISC (complex instruction set computer) architecture
  - key definitional feature of a CISC: there are instructions that take more than one clock cycle to execute
- However, the parts of x86 that you should be using in your compiler are actually closer to a traditional RISC architecture

- x86 is technically a CISC (complex instruction set computer) architecture
  - key definitional feature of a CISC: there are instructions that take more than one clock cycle to execute
- However, the parts of x86 that you should be using in your compiler are actually closer to a traditional RISC architecture
  *RISC* = "reduced instruction set computer"

- x86 is technically a **CISC** (*complex instruction set computer*) architecture
  - key definitional feature of a CISC: there are instructions that take more than one clock cycle to execute
- However, the parts of x86 that you should be using in your compiler are actually closer to a traditional RISC architecture
  - **RISC** = "reduced instruction set computer"
  - most complex instructions exist for backward-compatibility and can be slow

- x86 is technically a **CISC** (*complex instruction set computer*) architecture
  - key definitional feature of a CISC: there are instructions that take more than one clock cycle to execute
- However, the parts of x86 that you should be using in your compiler are actually closer to a traditional RISC architecture
  - **RISC** = "reduced instruction set computer"
  - most complex instructions exist for backward-compatibility and can be slow
  - other complex instructions exist to take advantage of peculiar hardware

- 16 64-bit general registers; 64-bit integers
  - but int is 32 bits usually; long is 64 bits

- 16 64-bit general registers; 64-bit integers
  - but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes

- 16 64-bit general registers; 64-bit integers
  but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes
- 16 SSE registers for floating point, SIMD

- 16 64-bit general registers; 64-bit integers
  - but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes
- 16 SSE registers for floating point, SIMD
- Register-based function call conventions

- 16 64-bit general registers; 64-bit integers
  - $\circ$  but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes
- 16 SSE registers for floating point, SIMD
- Register-based function call conventions
- Additional addressing modes (pc relative)

- 16 64-bit general registers; 64-bit integers
  - but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes
- 16 SSE registers for floating point, SIMD
- Register-based function call conventions
- Additional addressing modes (pc relative)
- 32-bit legacy mode

- 16 64-bit general registers; 64-bit integers
  - $\circ$  but int is 32 bits usually; long is 64 bits
- 64-bit address space; pointers are 8 bytes
- 16 SSE registers for floating point, SIMD
- Register-based function call conventions
- Additional addressing modes (pc relative)
- 32-bit legacy mode
- Some pruning of old features

#### x86-64 Syntax

• Two main assembler languages for x86-64:

#### x86-64 Syntax

- Two main assembler languages for x86-64:
  - Intel/Microsoft syntax: what's in the Intel docs

### x86-64 Syntax

- Two main assembler languages for x86-64:
  - Intel/Microsoft syntax: what's in the Intel docs
  - **AT&T/GNU syntax**: what we're generating and what's in the linked handouts, course webpage, etc.
#### x86-64 Syntax

- Two main assembler languages for x86-64:
  - Intel/Microsoft syntax: what's in the Intel docs
  - **AT&T/GNU syntax**: what we're generating and what's in the linked handouts, course webpage, etc.
    - You can use gcc -S to generate AT&T-style assembly code from C/C++ code for more examples

#### x86-64 Syntax

- Two main assembler languages for x86-64:
  - Intel/Microsoft syntax: what's in the Intel docs
  - **AT&T/GNU syntax**: what we're generating and what's in the linked handouts, course webpage, etc.
    - You can use gcc -S to generate AT&T-style assembly code from C/C++ code for more examples
  - I will always use AT&T/GNU syntax

| Intel/Microsoft | AT&T/GNU |
|-----------------|----------|
|                 |          |
|                 |          |
|                 |          |
|                 |          |
|                 |          |
|                 |          |
|                 |          |

|               | Intel/Microsoft        | AT&T/GNU              |
|---------------|------------------------|-----------------------|
| Operand order | a = a op b (dst first) | b = a op b (dst last) |
|               |                        |                       |
|               |                        |                       |
|               |                        |                       |
|               |                        |                       |
|               |                        |                       |
|               |                        |                       |

|                  | Intel/Microsoft        | AT&T/GNU              |
|------------------|------------------------|-----------------------|
| Operand order    | a = a op b (dst first) | b = a op b (dst last) |
| Memory addresses | [register+offset]      | offset(register)      |
|                  |                        |                       |
|                  |                        |                       |
|                  |                        |                       |
|                  |                        |                       |
|                  |                        |                       |

|                       | Intel/Microsoft        | AT&T/GNU                                              |
|-----------------------|------------------------|-------------------------------------------------------|
| Operand order         | a = a op b (dst first) | b = a op b (dst last)                                 |
| Memory addresses      | [register+offset]      | offset(register)                                      |
| Instruction mnemonics | mov, add, push,        | movq, addq, pushq (explicit<br>operand size after op) |
|                       |                        |                                                       |
|                       |                        |                                                       |
|                       |                        |                                                       |

|                       | Intel/Microsoft        | AT&T/GNU                                              |
|-----------------------|------------------------|-------------------------------------------------------|
| Operand order         | a = a op b (dst first) | b = a op b (dst last)                                 |
| Memory addresses      | [register+offset]      | offset(register)                                      |
| Instruction mnemonics | mov, add, push,        | movq, addq, pushq (explicit<br>operand size after op) |
| Register names        | rax, rbx, rbp, rsp,    | %rax, %rbx, %rbp, %rsp,                               |
|                       |                        |                                                       |
|                       |                        |                                                       |

|                       | Intel/Microsoft        | AT&T/GNU                                              |
|-----------------------|------------------------|-------------------------------------------------------|
| Operand order         | a = a op b (dst first) | b = a op b (dst last)                                 |
| Memory addresses      | [register+offset]      | offset(register)                                      |
| Instruction mnemonics | mov, add, push,        | movq, addq, pushq (explicit<br>operand size after op) |
| Register names        | rax, rbx, rbp, rsp,    | %rax, %rbx, %rbp, %rsp,                               |
| Constants             | 17, 42                 | \$17, \$42                                            |
|                       |                        |                                                       |

|                       | Intel/Microsoft        | AT&T/GNU                                              |
|-----------------------|------------------------|-------------------------------------------------------|
| Operand order         | a = a op b (dst first) | b = a op b (dst last)                                 |
| Memory addresses      | [register+offset]      | offset(register)                                      |
| Instruction mnemonics | mov, add, push,        | movq, addq, pushq (explicit<br>operand size after op) |
| Register names        | rax, rbx, rbp, rsp,    | %rax, %rbx, %rbp, %rsp,                               |
| Constants             | 17, 42                 | \$17, \$42                                            |
| Comments              | ; to end of line       | # to end of line or /* */                             |

| Intel vs AT&T Syntax  |                      | Intel docs include many complex,                 |                                                       |
|-----------------------|----------------------|--------------------------------------------------|-------------------------------------------------------|
|                       | Intel/Microsoft      | that aren't commonly used by<br>modern compilers |                                                       |
| Operand order         | a = a op b (dst firs |                                                  |                                                       |
| Memory addresses      | [register+offset]    |                                                  | offset(register)                                      |
| Instruction mnemonics | mov, add, push,      | •                                                | movq, addq, pushq (explicit<br>operand size after op) |
| Register names        | rax, rbx, rbp, rsp,  |                                                  | %rax, %rbx, %rbp, %rsp,                               |
| Constants             | 17,42                |                                                  | \$17, \$42                                            |
| Comments              | ; to end of line     |                                                  | # to end of line or /* */                             |

• 8-bit bytes, byte-addressable

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)
  - That's why the 'q' in 64-bit instructions like movq, addq, etc.

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)
  - That's why the 'q' in 64-bit instructions like movq, addq, etc.
- Data should normally be *aligned* on "natural" boundaries

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)
  - That's why the 'q' in 64-bit instructions like movq, addq, etc.
- Data should normally be *aligned* on "natural" boundaries
  - unaligned accesses are generally supported, but with a big performance penalty on modern machines

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)
  - That's why the 'q' in 64-bit instructions like movq, addq, etc.
- Data should normally be *aligned* on "natural" boundaries
  - unaligned accesses are generally supported, but with a big performance penalty on modern machines
  - though function calls into e.g., glibc need to have the stack
     16-bit aligned (ask me how I know)

- 8-bit bytes, byte-addressable
- 16-, 32-, 64-bit *words*, **double words** and **quad words** (Intel terminology)
  - That's why the 'q' in 64-bit instructions like movq, addq, etc.
- Data should normally be *aligned* on "natural" boundaries
  - unaligned accesses are generally supported, but with a big performance penalty on modern machines
  - though function calls into e.g., glibc need to have the stack
     16-bit aligned (ask me how I know)
- Little-endian: address of a multi-byte integer is address of low-order byte

- 16 64-bit general registers
  - %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8-%r15

- 16 64-bit general registers
  - %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8-%r15
- Registers can be used as 64-bit integers or pointers, or as 32-bit ints

- 16 64-bit general registers
  - %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8-%r15
- Registers can be used as 64-bit integers or pointers, or as 32-bit ints
  - Also possible to reference low-order 16- and 8-bit chunks
    - (for the most part you shouldn't need to)

- 16 64-bit general registers
  - %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8-%r15
- Registers can be used as 64-bit integers or pointers, or as 32-bit ints
  - Also possible to reference low-order 16- and 8-bit chunks

■ (for the most part you shouldn't need to)

• To simplify your project, all Cool types have the same size (ints, pointers, even booleans!)

- 16 64-bit general registers
  - %rax, %rbx, %rcx, %rdx, %rsi, %rdi, %rbp, %rsp, %r8-%r15
- Registers can be used as 64-bit integers or pointers, or as 32-bit ints
  - Also possible to reference low-order 16- and 8-bit chunks

■ (for the most part you shouldn't need to)

- To simplify your project, **all Cool types have the same size** (ints, pointers, even booleans!)
  - I suggest you use 64 bits to store them to make your life easy
    - 32 bits is okay too, but it's your funeral

• Basic cycle (same as every processor you've ever seen):

• Basic cycle (same as every processor you've ever seen):

while (running) {

• Basic cycle (same as every processor you've ever seen):

while (running) {
 fetch instruction beginning at rip address

• Basic cycle (same as every processor you've ever seen):

```
while (running) {
   fetch instruction beginning at rip address
   rip <- rip + instruction length</pre>
```

• Basic cycle (same as every processor you've ever seen):

```
while (running) {
   fetch instruction beginning at rip address
   rip <- rip + instruction length
   execute instruction
}</pre>
```

• Basic cycle (same as every processor you've ever seen):

```
while (running) {
   fetch instruction beginning at rip address
   rip <- rip + instruction length
   execute instruction
}</pre>
```

• Sequential execution unless a jump stores a new "next instruction" address in %rip

• Basic cycle (same as every processor you've ever seen):

```
while (running) {
   fetch instruction beginning at rip address
   rip <- rip + instruction length
   execute instruction</pre>
```

- Sequential execution unless a jump stores a new "next instruction" address in %rip
  - %rip is a hidden register; cannot access directly as a register from assembly code, change by sequential instruction execution and jumps (including call and return)

• Typical data manipulation instruction:

opcode src, dst # comment

• Typical data manipulation instruction:

opcode src, dst # comment

• Meaning is:

dst <- dst op src

• Typical data manipulation instruction:

opcode src, dst # comment

• Meaning is:

dst <- dst op src

• Normally, one operand is a register; the other is a register, memory location, or integer constant

• Typical data manipulation instruction:

opcode src, dst # comment

• Meaning is:

dst <- dst op src

- Normally, one operand is a register; the other is a register, memory location, or integer constant
  - Can't have both operands in memory can't encode two memory addresses in a single instruction (e.g., cmp, mov)
### Instruction Format

• Typical data manipulation instruction:

opcode src, dst # comment

• Meaning is:

dst <- dst op src

- Normally, one operand is a register; the other is a register, memory location, or integer constant
  - Can't have both operands in memory can't encode two memory addresses in a single instruction (e.g., cmp, mov)
- Language is free-form, comments and labels may appear on lines by themselves (and can have multiple labels per line of code)

• Register %rsp points to the "top" of stack

- Register %rsp points to the "top" of stack
  - Dedicated for this use; **don't use for anything else**!

- Register %rsp points to the "top" of stack
  - Dedicated for this use; **don't use for anything else**!
- Points to the last 64-bit quadword pushed onto the stack (not next "free" quadword)

- Register %rsp points to the "top" of stack
  - Dedicated for this use; **don't use for anything else**!
- Points to the last 64-bit quadword pushed onto the stack (not next "free" quadword)
  - Should always be quadword (8-byte) aligned
    - It will start out this way, and will stay aligned unless your code does something bad

- Register %rsp points to the "top" of stack
  - Dedicated for this use; **don't use for anything else**!
- Points to the last 64-bit quadword pushed onto the stack (not next "free" quadword)
  - Should always be quadword (8-byte) aligned
    - It will start out this way, and will stay aligned unless your code does something bad
- Should be **16-byte aligned on function calls**

- Register %rsp points to the "top" of stack
  - Dedicated for this use; **don't use for anything else**!
- Points to the last 64-bit quadword pushed onto the stack (not next "free" quadword)
  - Should always be quadword (8-byte) aligned
    - It will start out this way, and will stay aligned unless your code does something bad
- Should be **16-byte aligned on function calls**
- Stack grows down (towards lower addresses)

#### **Stack Instructions**

### **Stack Instructions**

- pushq src
  - %rsp <- %rsp 8; memory[%rsp] <- src (e.g., push src onto the stack)</li>

## **Stack Instructions**

- pushq src
  - %rsp <- %rsp 8; memory[%rsp] <- src (e.g., push src onto the stack)</li>
- popq dst
  - dst <- memory[%rsp]; %rsp <- %rsp + 8</li>
    - (e.g., pop top of stack into dst and logically remove it from the stack)

• When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables

- When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables
  - Stack actually grows down towards lower memory addresses when a new stack frame is pushed (allocated)

- When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables
  - Stack actually grows down towards lower memory addresses when a new stack frame is pushed (allocated)
- Frame is popped on method return

- When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables
  - Stack actually grows down towards lower memory addresses when a new stack frame is pushed (allocated)
- Frame is popped on method return
- By convention, %rbp (base pointer) points to a known offset into the current active stack frame

- When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables
  - Stack actually grows down towards lower memory addresses when a new stack frame is pushed (allocated)
- Frame is popped on method return
- By convention, %rbp (base pointer) points to a known offset into the current active stack frame
  - Local variables referenced relative to %rbp

- When a method is called, a stack frame is normally allocated on the logical "top" of the stack to hold its local variables
  - Stack actually grows down towards lower memory addresses when a new stack frame is pushed (allocated)
- Frame is popped on method return
- By convention, %rbp (base pointer) points to a known offset into the current active stack frame
  - Local variables referenced relative to %rbp
  - Base pointer common in 32-bit x86 code; less so in x86-64 code where push/pop used less & stack frame normally has fixed size so locals can be referenced from %rsp easily

```
movq $17,%rax  # store 17 in %rax
```

| movq \$17,%rax | <pre># store 17 in %rax</pre>  |
|----------------|--------------------------------|
| movq %rcx,%rax | <pre># copy %rcx to %rax</pre> |

• These should cover most of what you'll need:

movq \$17,%rax# store 17 in %raxmovq %rcx,%rax# copy %rcx to %raxmovq 16(%rbp),%rax# copy memory to %rax

• These should cover most of what you'll need:

 movq \$17,%rax
 # store 17 in %rax

 movq %rcx,%rax
 # copy %rcx to %rax

 movq 16(%rbp),%rax
 # copy memory to %rax

 movq %rax,-24(%rbp)
 # copy %rax to memory

• These should cover most of what you'll need:

 movq \$17,%rax
 # store 17 in %rax

 movq %rcx,%rax
 # copy %rcx to %rax

 movq 16(%rbp),%rax
 # copy memory to %rax

 movq %rax,-24(%rbp)
 # copy %rax to memory

• References to object fields work similarly – put the object's memory address in a register and use that address plus an offset

| movq | \$17,% <b>rax</b> | <pre># store 17 in %rax</pre>    |
|------|-------------------|----------------------------------|
| movq | %rcx,%rax         | <pre># copy %rcx to %rax</pre>   |
| movq | 16(%rbp),%rax     | <pre># copy memory to %rax</pre> |
| movq | %rax,-24(%rbp)    | <pre># copy %rax to memory</pre> |

- References to object fields work similarly put the object's memory address in a register and use that address plus an offset
- Remember: can't have two memory addresses in a single instruction

 A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant

- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever

- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.

- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.
- Code to store %rbx in A[i]:

```
movq %rbx, 0(%rcx, %rax, 8)
```

- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.
- Code to store %rbx in A[i]:

```
movq %rbx, 0(%rcx, %rax, 8)
```



- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.
- Code to store %rbx in A[i]:



- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.
- Code to store %rbx in A[i]:

```
movq %rbx, 0(%rcx, %rax, 8)
```

- A memory address can combine the contents of two registers (with one optionally multiplied by 2, 4, or 8) plus a constant: base address + (index register \* scale) + constant
- Main use of general form is for array subscripting or small computations if the compiler is clever
- Example: suppose we have an array A of 8-byte ints with the address of the array in %rcx and the index i in %rax.
- Code to store %rbx in A[i]:



#### Basic Data Movement/Arithmetic Instructions

#### **Basic Data Movement/Arithmetic Instructions**

movq src,dst
 dst <- src</pre>

#### **Basic Data Movement/Arithmetic Instructions**

movq src,dst
 dst <- src</pre>

addq src,dst
 dst <- dst + src</pre>
### **Basic Data Movement/Arithmetic Instructions**

movq src,dst
 dst <- src</pre>

addq src,dst
 dst <- dst + src</pre>

subq src,dst
 dst <- dst-src</pre>

#### **Basic Data Movement/Arithmetic Instructions**

movq src,dst
 dst <- src</pre>

incq dst
 dst <- dst + 1</pre>

addq src,dst
 dst <- dst + src</pre>

decq dst dst <- dst - 1

subq src,dst
 dst <- dst-src</pre>

#### **Basic Data Movement/Arithmetic Instructions**

movq src,dst
 dst <- src</pre>

addq src,dst
 dst <- dst + src</pre>

subq src,dst
 dst <- dst-src</pre>

incq dst
 dst <- dst + 1</pre>

decq dst dst <- dst - 1

negq dst
 dst <- -dst
 (2's complement arithmetic negation)</pre>

imulq src,dst
 dst ← dst \* src
 dst must be a register

imulq src,dst
dst ← dst \* src
dst must be a register

#### idivq src Divide %rdx:%rax by src (%rdx:%rax holds signextended 128-bit value; cannot use other registers for division!) %rax <- quotient %rdx <- remainder

imulq src,dst
 dst ← dst \* src
 dst must be a register

#### cqto

%rdx:%rax <- 128-bit sign extended copy of %rax (why? To prep numerator for idivq!) idivq src Divide %rdx:%rax by src (%rdx:%rax holds signextended 128-bit value; cannot use other registers for division!) %rax <- quotient %rdx <- remainder

andq src,dst
 dst <- src & dst</pre>

andq src,dst
 dst <- src & dst</pre>

orq src,dst
 dst <- dst | src</pre>

xorq src,dst
 dst <- dst ^ src</pre>

andq src,dst
 dst <- src & dst</pre>

orq src,dst
 dst <- dst | src</pre>

xorq src,dst
 dst <- dst ^ src</pre>

notq dst
 dst <- ~dst
 (1's complement logical negation)</pre>

andq src,dst
 dst <- src & dst</pre>

orq src,dst
 dst <- dst | src</pre>

xorq src,dst
 dst <- dst ^ src</pre>

notq dst
 dst <- ~dst
 (1's complement logical negation)</pre>

Note similarity between notq and negq (a few slides back). Difference is 1's vs 2's complement negation.

shlq dst,count
 dst <- dst shifted left count bits</pre>

shlq dst,count
 dst <- dst shifted left count bits</pre>

shrq dst,count
 dst <- dst shifted right count
 bits (0 fill)</pre>

shlq dst,count
 dst <- dst shifted left count bits</pre>

shrq dst,count
 dst <- dst shifted right count
 bits (0 fill)</pre>

sarq dst,count
 dst <- dst shifted right count
 bits (sign bit fill)</pre>

```
shlq dst,count
 dst <- dst shifted left count bits</pre>
```

```
shrq dst,count
 dst <- dst shifted right count
 bits (0 fill)</pre>
```

sarq dst,count
dst <- dst shifted right count
bits (sign bit fill)</pre>

rolq dst,count
dst <- dst rotated left
count bits</pre>

rorq dst,count
 dst <- dst rotated right
 count bits</pre>

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren
- Lots of very cool bit fiddling and other algorithms
  - But **be careful** in the project: be sure semantics are OK

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren
- Lots of very cool bit fiddling and other algorithms
  - But **be careful** in the project: be sure semantics are OK
- Example: right shift is not the same as Java/C/C++/etc. integer divide for negative numbers (why?)

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren
- Lots of very cool bit fiddling and other algorithms
  - But **be careful** in the project: be sure semantics are OK
- Example: right shift is not the same as Java/C/C++/etc. integer divide for negative numbers (why?)
- There are additional instructions that shift and rotate double words, use a calculated shift amount instead of a constant, etc.

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren
- Lots of very cool bit fiddling and other algorithms
  - But be careful in the project: Should you use any of
- Example: right shift is not the sar divide for negative numbers (wh
- There are additional instructions words, use a calculated shift amount instead of a constant, etc.

- Very fast and can often be used to optimize multiplication and division by small constants
  - If you're interested, look at "Hacker's Delight" by Henry Warren
- Lots of very cool bit fiddling and other algorithms
  - But be careful in the project:

Þr

- Example: right shift is not the sar divide for negative numbers (wh There are additional instructions (wh)
   Should you use any of these tricks in PA3? NO! (encouraged in PA4!)
- There are additional instructions (encouraged in PA4!) words, use a calculated shift amount instead of a constant, etc.

• The unary & operator in C/C++

The unary & operator in C/C++
 leag src,dst # dst <- address of src</li>

- The unary & operator in C/C++
   leag src,dst # dst <- address of src</li>
- Things to note:

- The unary & operator in C/C++
   leag src,dst # dst <- address of src</li>
- Things to note:
  - dst must be a register

• The unary & operator in C/C++

leaq src,dst # dst <- address of src</pre>

- Things to note:
  - dst must be a register
  - Address of **src** includes any address arithmetic or indexing

• The unary & operator in C/C++

leaq src,dst # dst <- address of src</pre>

- Things to note:
  - dst must be a register
  - Address of **src** includes any address arithmetic or indexing
  - Useful to capture addresses for pointers, reference parameters, etc.

• The unary & operator in C/C++

leaq src,dst # dst <- address of src</pre>

- Things to note:
  - dst must be a register
  - Address of **src** includes any address arithmetic or indexing
  - Useful to capture addresses for pointers, reference parameters, etc.
  - Also useful for computing arithmetic expressions of the form const + r1 + scale \* r2

# Trivia Break: Computer Science

This Dutch computer scientist won the 1972 Turing Award for fundamental contributions to developing structured programming languages. Some of his other important contributions include formulating and solving the shortest-path problem in graph theory and co-developing the first Algol 60 compiler. He also famously authored a long, eponymous series of technical reports on various topics both within and without computer science. If you took 490 with me, you might remember his famous aphorism: "tests can only show the presence of bugs, never their absence".

#### Edgar Dijkstra: Go To Statement Considered Harmful

#### Go To Statement Considered Harmful

Key Words and Phrases: go to statement, jump instruction, branch instruction, conditional clause, alternative clause, repetitive clause, program intelligibility, program sequencing CR Categories: 4.22, 5.23, 5.24

#### EDITOR:

For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except, perhaps, plain machine code). At that time I did not attach too much importance to this discovery; I now submit my considerations for publication because in very recent discussions in which the subject turned up, I have been urged to do so. dynamic progress is only call of the procedure we we can characterize the textual indices, the len dynamic depth of proce

Let us now consider r or repeat A until B). I superfluous, because we recursive procedures. F clude them: on the one mented quite comfortab the other hand, the re makes us well equipped processes generated by the repetition clauses t describe the dynamic pro a repetition clause, how namic index," inexorat

• At the assembly level, all we have is goto and conditional goto

#### Edgar Dijkstra: Go To Statement Considered Harmful

#### Go To Statement Considered Harmful

Key Words and Phrases: go to statement, jump instruction, branch instruction, conditional clause, alternative clause, repetitive clause, program intelligibility, program sequencing CR Categories: 4.22, 5.23, 5.24

#### EDITOR:

For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except, perhaps, plain machine code). At 'that time I did not attach too much importance to this discovery; I now submit my considerations for publication because in very recent discussions in which the subject turned up, I have been urged to do so. dynamic progress is only call of the procedure we we can characterize the textual indices, the len dynamic depth of proce

Let us now consider r or repeat A until B). I superfluous, because we recursive procedures. F clude them: on the one mented quite comfortab the other hand, the re makes us well equipped processes generated by the repetition clauses t describe the dynamic pro a repetition clause, how namic index," inexorat

- At the assembly level, all we have is goto and conditional goto
- Loops and conditional statements are built from these

#### Edgar Dijkstra: Go To Statement Considered Harmful

#### Go To Statement Considered Harmful

Key Words and Phrases: go to statement, jump instruction, branch instruction, conditional clause, alternative clause, repetitive clause, program intelligibility, program sequencing CR Categories: 4.22, 5.23, 5.24

#### EDITOR:

For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except, perhaps, plain machine code). At 'that time I did not attach too much importance to this discovery; I now submit my considerations for publication because in very recent discussions in which the subject turned up, I have been urged to do so. dynamic progress is only call of the procedure we we can characterize the textual indices, the len dynamic depth of proce-Let us now consider p

or repeat A until B). I superfluous, because we recursive procedures. Fclude them: on the one mented quite comfortab the other hand, the re makes us well equipped processes generated by the repetition clauses t describe the dynamic pro a repetition clause, how namic index," inexorat
# **Control Flow: GOTO**

- At the assembly level, all we have is goto and conditional goto
- Loops and conditional statements are built from these
- Note: random jumps play havoc with pipeline efficiency; much work is done in modern compilers and processors to minimize this impact

#### Edgar Dijkstra: Go To Statement Considered Harmful

#### Go To Statement Considered Harmful

Key Words and Phrases: go to statement, jump instruction, branch instruction, conditional clause, alternative clause, repetitive clause, program intelligibility, program sequencing CR Categories: 4.22, 5.23, 5.24

#### EDITOR:

For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except, perhaps, plain machine code). At that time I did not attach too much importance to this discovery; I now submit my considerations for publication because in very recent discussions in which the subject turned up, I have been urged to do so. dynamic progress is only call of the procedure we we can characterize the textual indices, the len dynamic depth of proce

Let us now consider r or repeat A until B). I superfluous, because we recursive procedures. F clude them: on the one mented quite comfortab the other hand, the re makes us well equipped processes generated by the repetition clauses t describe the dynamic pro a repetition clause, how namic index," inexorat

jmp dst
%rip <- address of dst</pre>

jmp dst
%rip <- address of dst</pre>

• dst is usually a *label* in the code (which can be on a line by itself)

jmp dst %rip <- address of dst

dst is usually a *label* in the code (which can be on a line by itself)
 "labels" are just arbitrary named instructions chosen by the compiler (i.e., you)

jmp dst %rip <- address of dst

- dst is usually a *label* in the code (which can be on a line by itself)
  - "labels" are just arbitrary named instructions chosen by the compiler (i.e., you)
- dst address can also be indirect using the address in a register or memory location ( \*reg or \*(reg) )
  - useful for method calls, case, etc.

• Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. - read the docs for an instruction before you rely on this behavior)

- Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. read the docs for an instruction before you rely on this behavior)
  - true of: addq, subq, andq, orq

- Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. read the docs for an instruction before you rely on this behavior)
  - $\circ$  true of: addq, subq, andq, orq
  - but not: imulq, idivq, leaq

- Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. read the docs for an instruction before you rely on this behavior)
  - true of: addq, subq, andq, orq
  - but not: imulq, idivq, leaq
- Other instructions can set condition codes. E.g.,:

- Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. read the docs for an instruction before you rely on this behavior)
  - true of: addq, subq, andq, orq
  - but not: imulq, idivq, leaq
- Other instructions can set condition codes. E.g.,:
  - **cmpq src**, **dst** # compare dst to src (e.g., dst-src)
  - testq src,dst # calculate dst & src (logical and)

- Most arithmetic instructions set "*condition code*" bits to record information about the result (zero, non-zero, >0, etc. read the docs for an instruction before you rely on this behavior)
  - true of: addq, subq, andq, orq
  - but not: imulq, idivq, leaq
- Other instructions can set condition codes. E.g.,:
  - **cmpq src,dst** # compare dst to src (e.g., dst-src)
  - testq src,dst # calculate dst & src (logical and)
    - These do not alter src or dst, but then do impact jump targets of conditional jumps

jz label #jump if result == 0

jz label # jump if result == 0
jnz label # jump if result != 0

| jz label |       | # jump if result == 0 |
|----------|-------|-----------------------|
| jnz      | label | # jump if result != 0 |
| jg       | label | # jump if result > 0  |

jz label jnz label jg label jnglabel jgelabel jnge label j1 label jnllabel jlelabel jnle label

# jump if result == 0 # jump if result != 0 # jump if result > 0 # jump if result  $\leq 0$ # jump if result >= 0 # jump if result < 0 # jump if result < 0 # jump if result >= 0 # jump if result <= 0 # jump if result > 0

jz label label jnz jq label jnglabel jqelabel label jnge label j1 jnllabel jlelabel jnle label # jump if result == 0# jump if result != 0 # jump if result > 0 # jump if result <= 0</pre> # jump if result >= 0 # jump if result < 0 # jump if result < 0 # jump if result >= 0 # jump if result <= 0</pre> # jump if result > 0

note: the assembler is mapping multiple mnemonics to a single actual instruction

• Common desire: compare two operands and jump if a relationship holds between them

- Common desire: compare two operands and jump if a relationship holds between them
  - in other words, we *want* an instruction like this:

 $jump_{cond}$  op1, op2, label

- Common desire: compare two operands and jump if a relationship holds between them
  - in other words, we *want* an instruction like this:

 $jump_{cond}$  op1, op2, label

- However, we can't have this instruction: x86-64 cannot support
   3-operand instructions
  - (also true of most other practical machines)

- Common desire: compare two operands and jump if a relationship holds between them
  - in other words, we *want* an instruction like this:

 $jump_{cond}$  op1, op2, label

- However, we can't have this instruction: x86-64 cannot support
   3-operand instructions
  - (also true of most other practical machines)
- What should we do instead?

- Common desire: compare two operands and jump if a relationship holds between them
  - in other words, we *want* an instruction like this:

 $jump_{cond}$  op1, op2, label

- However, we can't have this instruction: x86-64 cannot support
   3-operand instructions
  - (also true of most other practical machines)
- What should we do instead?

cmpq op1, op2 # compute op2 - op1, set condition code
j\_cc label

- Common desire: compare two operands and jump if a relationship holds between them
  - in other words, we *want* an instruction like this:

jump\_cond op1, op2, label

- However, we can't have this instruction: x86-64 cannot support
   3-operand instructions
  - (also true of most other practical machines)
- What should we do instead j<sub>cc</sub> is a conditional jump that's taken if
   cmpq op1, op2 # co
   j<sub>cc</sub> label

 Common desire: compare tv relationship holds between
 in other words, we want
 instructions like je, jne are useful mnemonics here; you can also use the ones on the last slide

Special conditional jump

jump\_cond op1, op2, label

- However, we can't have this instruction: x86-64 cannot support
   3-operand instructions
  - (also true of most other practical machines)
- What should we do instead j<sub>cc</sub> is a conditional jump that's taken if
   cmpq op1, op2 # co
   j<sub>cc</sub> label

• The x86-64 instruction set itself only provides for transfer of control (via jump) and return

- The x86-64 instruction set itself only provides for transfer of control (via jump) and return
- Stack is used to capture return address and recover it

- The x86-64 instruction set itself only provides for transfer of control (via jump) and return
- Stack is used to capture return address and recover it
- Everything else-parameter passing, stack frame organization, register usage is a matter of convention

- The x86-64 instruction set itself only provides for transfer of control (via jump) and return
- Stack is used to capture return address and recover it
- Everything else-parameter passing, stack frame organization, register usage is a matter of convention
  - Follow the conventions even if you write all the code!

- The x86-64 instruction set itself only provides for transfer of control (via jump) and return
- Stack is used to capture return address and recover it
- Everything else-parameter passing, stack frame organization, register usage is a matter of convention
  - Follow the conventions even if you write all the code!
    - Helps anyone reading your code figure out what's happening
    - Lets standard tools like gdb work successfully with your code (in the *unlikely* event that you have to debug something...)

call label

• Push address of next instruction and jump

call label

- Push address of next instruction and jump
- %rsp <- %rsp 8; memory[%rsp] <- %rip; %rip <- address of label

call label

- Push address of next instruction and jump
- %rsp <- %rsp 8; memory[%rsp] <- %rip; %rip <- address of label
- Address can also be in a register or memory as with jmp we'll use these for dynamic dispatch of method calls (more later)

call label

- Push address of next instruction and jump
- %rsp <- %rsp 8; memory[%rsp] <- %rip; %rip <- address of label
- Address can also be in a register or memory as with jmp we'll use these for dynamic dispatch of method calls (more later)

ret

• Pop address from top of stack and jump
# call and ret Instructions

call label

- Push address of next instruction and jump
- %rsp <- %rsp 8; memory[%rsp] <- %rip; %rip <- address of label
- Address can also be in a register or memory as with jmp we'll use these for dynamic dispatch of method calls (more later)

ret

- Pop address from top of stack and jump
- %rip <- memory[%rsp]; %rsp <- %rsp + 8

# call and ret Instructions

call label

- Push address of next instruction and jump
- %rsp <- %rsp 8; memory[%rsp] <- %rip; %rip <- address of label
- Address can also be in a register or memory as with jmp we'll use these for dynamic dispatch of method calls (more later)

ret

- Pop address from top of stack and jump
- %rip <- memory[%rsp]; %rsp <- %rsp + 8
- WARNING! The word on the top of the stack had better be the address we want and not some leftover data

• %rax – function result

- %rax function result
- First six arguments passed in these registers, in this order:
  - %rdi, %rsi, %rdx, %rcx, %r8, %r9

- %rax function result
- First six arguments passed in these registers, in this order:
  - %rdi, %rsi, %rdx, %rcx, %r8, %r9
  - For Java/C++ "this" pointer is first argument, in %rdi
    - you may want to do the same for your Cool programs

- %rax function result
- First six arguments passed in these registers, in this order:
  - %rdi, %rsi, %rdx, %rcx, %r8, %r9
  - For Java/C++ "this" pointer is first argument, in %rdi
    - you may want to do the same for your Cool programs
- %rsp stack pointer;value must be 8-byte aligned always and 16-byte aligned when calling a function

- %rax function result
- First six arguments passed in these registers, in this order:
  - %rdi, %rsi, %rdx, %rcx, %r8, %r9
  - For Java/C++ "this" pointer is first argument, in %rdi
    - you may want to do the same for your Cool programs
- %rsp stack pointer;value must be 8-byte aligned always and 16-byte aligned when calling a function
- **%rbp** frame pointer (optional use)

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15
  - %rsp isn't officially on this "callee save list", but needs to be properly restored for return

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15
  - %rsp isn't officially on this "callee save list", but needs to be properly restored for return
- All other registers can change across a function call

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15
  - %rsp isn't officially on this "callee save list", but needs to be properly restored for return
- All other registers can change across a function call
  - Debugging/correctness note: always assume every called function will change all registers it is allowed to

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15
  - %rsp isn't officially on this "callee save list", but needs to be properly restored for return
- All other registers can change across a function call
  - Debugging/correctness note: always assume every called function will change all registers it is allowed to
    - including registers containing function parameters!

- A called function **must preserve** these registers(or save/restore them if it wants to use them):
  - %rbx, %rbp, %r12-%r15
  - %rsp isn't officially on this "callee save list", but needs to be properly restored for return
- All other registers can change across a function call
  - Debugging/correctness note: always assume every called function will change all registers it is allowed to
    - including registers containing function parameters!
      - for debugging, you may want to **deliberately** clobber them!

• Caller places up to 6 arguments in registers, rest on stack, then executes call instruction (which pushes 8- byte return address)

- Caller places up to 6 arguments in registers, rest on stack, then executes call instruction (which pushes 8- byte return address)
- On entry, called function prologue sets up the **stack frame**:

- Caller places up to 6 arguments in registers, rest on stack, then executes call instruction (which pushes 8- byte return address)
- On entry, called function prologue sets up the **stack frame**:

pushq %rbp

# save old frame ptr

- Caller places up to 6 arguments in registers, rest on stack, then executes call instruction (which pushes 8- byte return address)
- On entry, called function prologue sets up the **stack frame**:

pushq %rbp
movq %rsp,%rbp

# save old frame ptr
# new frame ptr is top of
# stack after ret addr and
# old rbp pushed

- Caller places up to 6 arguments in registers, rest on stack, then executes call instruction (which pushes 8- byte return address)
- On entry, called function prologue sets up the **stack frame**:

| pushq %rbp                       | <pre># save old frame ptr</pre>       |
|----------------------------------|---------------------------------------|
| movq %rsp,%rbp                   | <pre># new frame ptr is top of</pre>  |
|                                  | <pre># stack after ret addr and</pre> |
|                                  | <pre># old rbp pushed</pre>           |
| <pre>subq \$framesize,%rsp</pre> | <pre># allocate stack frame</pre>     |
|                                  | <pre># (size should be multiple</pre> |
|                                  | <pre># of 16 normally)</pre>          |

### Stack Frame Layout



• Called function puts result (if any) in %rax and restores any callee-save registers if needed

- Called function puts result (if any) in %rax and restores any callee-save registers if needed
- Called function returns with:

movq %rbp,%rsp # or can use "leave"
popq %rbp # instead of movq/popq

- Called function puts result (if any) in %rax and restores any callee-save registers if needed
- Called function returns with:

movq %rbp,%rsp # or can use ``leave"
popq %rbp # instead of movq/popq

• If caller allocated space for arguments (beyond the 6 in regs) it deallocates as needed

n = sumOf(17, 42)

n = sumOf(17, 42)

movq \$42,%rsi

# load arguments in

n = sumOf(17, 42)

movq \$42,%rsi
movq \$17,%rdi

# load arguments in
# either order, but use
# correct registers

n = sumOf(17, 42)

movq \$42,%rsi
movq \$17,%rdi

call sumOf

# load arguments in # either order, but use # correct registers # jump & push ret addr

n = sumOf(17, 42)

movq \$42,%rsi
movq \$17,%rdi

call sumOf
movq %rax,offset\_(%rbp)

# load arguments in
# either order, but use
# correct registers
# jump & push ret addr
# store result

```
int sumOf(int x, int y) {
    int a, int b;
    a = x;
    b = a + y;
    return b;
```

int sumOf(int x, int y) {
 int a, int b;
 a = x;
 b = a + y;
 return b;
}

sumOf:

<u>pushq %rbp</u> # prologue <u>movq %rsp,%rbp</u>

int sumOf(int x, int y) {
 <u>int a, int b;</u>
 a = x;
 b = a + y;
 return b;
}

int sumOf(int x, int y) {
 int a, int b;
 <u>a = x;</u>
 b = a + y;
 return b;

int sumOf(int x, int y) {
 int a, int b;
 a = x;
 <u>b = a + y;
 return b;
}</u>

sumOf:

pushq %rbp # prologue movq %rsp,%rbp subq \$16,%rsp movq %rdi,-8(%rbp) movq -8(%rbp),%rax addq %rsi,%rax movq %rax,-16(%rbp)
#### **Example Function Body**

```
int sumOf(int x, int y) {
    int a, int b;
    a = x;
    b = a + y;
    <u>return b;</u>
}
```

sumOf: pushq %rbp # proloque movq %rsp,%rbp subg \$16,%rsp movq %rdi,-8(%rbp) movq -8(%rbp),%rax addq %rsi,%rax movq %rax,-16(%rbp) movg -16(%rbp),%rax movg %rbp,%rsp popq %rbp

int sumOf(int x, int y) {
 int a, int b;
 a = x;
 b = a + y;
 return b;
}





























```
int sumOf(int x, int y) {
  int a, int b;
  a = x;
  b = a + y;
  return b;
sumOf:
# prologue
 pushq %rbp
 movq %rsp,%rbp
 subg $16,%rsp
# a = x;
 movq %rdi,-8(%rbp)
\# b = a + y;
 movq -8(%rbp),%rax
 addq %rsi,%rax
 movq %rax,-16(%rbp)
# return b;
 movq -16(%rbp),%rax
 movq %rbp,%rsp
       %rbp
 popq
 ret
#}
```

#### int sumOf(int x, int y) { int a, int b; a = x; Example Stack Frame for sumOf b = a + y;return b; registers: %rax <u>59</u> %rdi <u>17</u> %rsi <u>42</u> sumOf: (caller sets # prologue n to %rax) pushq %rbp %rbp→ previous rbp movq %rsp,%rbp n: <u>59</u> subg \$16,%rsp %rsp→ caller stack frame # a = x; movq %rdi,-8(%rbp) # b = a + y;movq -8(%rbp),%rax addq %rsi,%rax junk movq %rax,-16(%rbp) # return b; movq -16(%rbp),%rax movq %rbp,%rsp %rbp popq ret #}

• The convention I've just shown is the System V/AMD64 ABI convention (used by Linux, MacOS X)

- The convention I've just shown is the System V/AMD64 ABI convention (used by Linux, MacOS X)
- Microsoft's calling conventions are slightly different (of course)

- The convention I've just shown is the System V/AMD64 ABI convention (used by Linux, MacOS X)
- Microsoft's calling conventions are slightly different (of course)
   First four parameters in registers %rcx, %rdx, %r8, %r9; rest on the stack

- The convention I've just shown is the System V/AMD64 ABI convention (used by Linux, MacOS X)
- Microsoft's calling conventions are slightly different (of course)
   First four parameters in registers %rcx, %rdx, %r8, %r9; rest on the stack
  - Called function stack frame must include empty space for called function to save values passed in parameter registers if desired

- The convention I've just shown is the System V/AMD64 ABI convention (used by Linux, MacOS X)
- Microsoft's calling conventions are slightly different (of course)
   First four parameters in registers %rcx, %rdx, %r8, %r9; rest on the stack
  - Called function stack frame must include empty space for called function to save values passed in parameter registers if desired
- Not relevant for us, but worth being aware of it
  - (except that providing space in each stack frame to save parameter registers will be handy for our simple code gen)

# **Course Announcements**

- the PA3c1 (codegen testing) deadline has already passed
  - if you forgot about it, we will still accept submissions (with a penalty)
- PA3c2 (TAC) is due later this week ("before spring break")
  - you should already have started, or you are behind
  - if there is demand from the class, I will consider a short extension on this assignment to e.g., Monday
- I have become aware of a **bug in the reference compiler**'s x86-64 module; a fix will be forthcoming. For now, don't trust it.
- Don't forget there is a midterm in this class the week after spring break!