
Lab 1 Unix Terminal part 2 Solution




"Nearing Machine Code Representation"




Introduction/Motivation




Not so long ago (in a galaxy not so far away), programmers wrote most of their code in assembly. While programmers today primarily use higher-level languages (Python, C, etc.), it is not uncommon to debug the assembly of your code. After all, these higher-level languages typically translate down to an assembly or assembly-like language.




If you are interested in cybersecurity and reverse engineering, you will write and analyze assembly code more frequently. For high-performance applications like [games](https://www.gamasutra.com/view/news/169946/CC_low_level_curriculum_Looking_at_optimized_assembly.php), programmers may write highly optimized code in assembly to get things *just* right. If you are working in hardware or on an embedded device, you might also do some assembly programming, as the environments of other languages are too bulky to support on a small device. Even web developers are using something called 'WebAssembly'. Hmm, the list is getting long here. I think the point is that learning assembly is quite relevant! Let's dig in and get some practice.




In today's lab you are going to get some practice looking at assembly.




Part 0 - Godbolt




I **strongly** recommend using the godbolt tool (https://godbolt.org/) to write and experiment with your C programs for this exercise. The color mappings will help you see what is going on with the generated assembly. You **should** try using both godbolt and your compiler to generate assembly.




Here is an example of the Godbolt tool (it also shows part 4 of this lab):

<img src="./assembly.PNG">




Part 1- Compiler Generated Assembly




Let us get some experience reading assembly code generated by the compiler (or godbolt)! It is actually kind of fun, you may learn some new instructions, and at the very least gain some intuition for what code the compiler is generating.




Compiler generated assembly 1 - Swap




- Write a C program that swaps two integers (in the main body of code).

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as swap_int.s).

- Use: `gcc -O0 -fno-builtin swap_int.c -S -o swap_int.s`

- Now modify your program to swap two `long`s.

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as swap_long.s).

- Use: `gcc -O0 -fno-builtin swap_long.c -S -o swap_long.s`

- Compare each of the two assembly files using diff. See what changed.

- diff syntax

- Use: `diff -y swap_int.s swap_long.s`

 

Response/Observations




subq $16, %rsp | subq $32, %rsp

movl $2, -4(%rbp) | movq $2, -8(%rbp)

movl $3, -8(%rbp) | movq $3, -16(%rbp)

movl -8(%rbp), %edx | movq -16(%rbp), %rdx

movl -4(%rbp), %eax | movq -8(%rbp), %rax

movl %eax, %esi | movq %rax, %rsi

movl $.LC0, %edi movl $.LC0, %edi

movl $0, %eax movl $0, %eax

call printf call printf

movl -4(%rbp), %eax | movq -8(%rbp), %rax

movl %eax, -12(%rbp) | movq %rax, -24(%rbp)

movl -8(%rbp), %eax | movq -16(%rbp), %rax

movl %eax, -4(%rbp) | movq %rax, -8(%rbp)

movl -12(%rbp), %eax | movq -24(%rbp), %rax

movl %eax, -8(%rbp) | movq %rax, -16(%rbp)

movl -8(%rbp), %edx | movq -16(%rbp), %rdx

movl -4(%rbp), %eax | movq -8(%rbp), %rax

movl %eax, %esi | movq %rax, %rsi




Compiler generated assembly 2 - Functions




- Write a C program that swaps two integers in a **function** (You may use today's slide as a reference)

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as swap.s).

- Use: `gcc -O0 -fno-builtin swap.c -S -o swap.s`

- Do the instructions use memory/registers in a different way?




Response/Observations

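Yes. The swap function sets up its own stack frame, and its two pointer arguments arrive in registers (`%rdi` and `%rsi`) and are dereferenced, whereas the version in `main` works directly on `main`'s own stack slots.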

movq %rsp, %rbp




Compiler generated assembly 3 - Static Array

- Write a C program called array.c that declares an array of 400 integers inside `main`.

- Initialize some of the values to something (do not use a loop) (e.g. myArray[0]=72; myArray[70]=56; etc)

- Note that it is helpful to use 'weird' numbers so you can see where they jump out.

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as array.s).

- Use: `gcc -O0 -fno-builtin array.c -S -o array.s`

- How much are the offsets from the address?




Response/Observations

.cfi_def_cfa_offset 16

.cfi_offset 6, -16




Compiler generated assembly 4 - Dynamic Array




- Write a C program called array2.c that dynamically allocates an array of 400 integers inside `main`.

- Initialize some of the values to something (do not use a loop) (e.g. myArray[66]=712; myArray[70]=536; etc)

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as array2.s).

- Use: `gcc -O0 -fno-builtin array2.c -S -o array2.s`

- Study the assembly and think about what is different from the static array.




Response/Observations

In array.c the array is created on the stack and has automatic storage duration, so I don't need to manage its memory manually; it is destroyed when the function it is declared in returns. Its size is fixed at compile time, as in myArray[400]. In array2.c, myArray has dynamic storage duration and is stored on the heap. I can size it at run time as 400 * sizeof(int) bytes, but I need to free it myself.




subq $1600, %rsp | subq $16, %rsp

movl $45, -1588(%rbp) | movl $1600, %edi

movl $111, -1484(%rbp) | call malloc

movl $0, -1480(%rbp) | movq %rax, -8(%rbp)




movl (%rax), %eax

movl %eax, %esi movl %eax, %esi

movl $.LC0, %edi movl $.LC0, %edi

movl $0, %eax movl $0, %eax

call printf call printf

movq -8(%rbp), %rax

movq %rax, %rdi

call free

movl $0, %eax movl $0, %eax

leave leave

.cfi_def_cfa 7, 8 .cfi_def_cfa 7, 8




Compiler generated assembly 5 - Goto

The C programming language has a `goto` statement; look up how to use it if you have not used it before.




- Write a C program using the goto command and a label.

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as goto.s).

- Use: `gcc -O0 -fno-builtin goto.c -S -o goto.s`

- Observe what kind of jmp statement is inserted.




Response/Observations

jg .L3

jmp .L2




Compiler generated assembly 6 - For-loops

- Write a C program using a for-loop that counts to 5.

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as for.s).

- Use: `gcc -O0 -fno-builtin for.c -S -o for.s`

- Observe where the code goes for the condition statement (at the start or at the end?).




Response/Observations




At the end: the loop first jumps to the condition check, which the compiler places after the loop body.

Compiler generated assembly 7 - Switch Statements




- Write a C program using a switch statement ([Sample here](https://www.tutorialspoint.com/cprogramming/switch_statement_in_c.htm)).

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as switch.s).

- Use: `gcc -O0 -fno-builtin switch.c -S -o switch.s`

- See what code a switch statement generates. Is it optimal?




Response/Observations




movq %rsp, %rbp

.cfi_def_cfa_register 6

subq $16, %rsp

movl $1, -4(%rbp)

movl -4(%rbp), %eax

cmpl $1, %eax

je .L3

cmpl $2, %eax

je .L4

jmp .L7

Compiler generated assembly 8 - Add Function




- Write a C program that calls an add function (`long add(long a, long b)`).

- Save, Compile, and Run the program to verify it works.

- Output the assembly from that program (Save it as add.s).

- Use: `gcc -O0 -fno-builtin add.c -S -o add.s`

- Observe the outputs

- Observe arguments put into registers

- Observe where 'popq' is called.




Response/Observations




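`popq %rbp` is called at the end of `add`, just before `ret`; `main` restores its frame with `leave` instead. The listing starts at the call to `add` in `main`, continues through `printf`, and then shows `add` itself: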

call add

movq %rax, -24(%rbp)

movq -24(%rbp), %rax

movq %rax, %rsi

movl $.LC0, %edi

movl $0, %eax

call printf

movl $0, %eax

leave

.cfi_def_cfa 7, 8

ret

.cfi_endproc

.LFE0:

.size main, .-main

.globl add

.type add, @function

add:

.LFB1:

.cfi_startproc

pushq %rbp

.cfi_def_cfa_offset 16

.cfi_offset 6, -16

movq %rsp, %rbp

.cfi_def_cfa_register 6

movq %rdi, -24(%rbp)

movq %rsi, -32(%rbp)

movq -32(%rbp), %rax

movq -24(%rbp), %rdx

addq %rdx, %rax

movq %rax, -8(%rbp)

movq -8(%rbp), %rax

popq %rbp

.cfi_def_cfa 7, 8

ret

.cfi_endproc

More resources to help




- Matt Godbolt has written a great tool to help understand assembly generated from the compiler.

- https://godbolt.org/

- An assembly cheat sheet from Brown

- https://cs.brown.edu/courses/cs033/docs/guides/x64_cheatsheet.pdf

- MIT Cheat sheet

- http://6.035.scripts.mit.edu/sp17/x86-64-architecture-guide.html




Deliverable




- For part 1, add the .s files that you have generated to this repository.

- Note this submission will be auto graded for completion (i.e. save the files under the names shown).

- Add your observations in the appropriate response/observations section for each code.




Going Further




- (Optional) Try the objdump example to read the disassembly of your programs' executables. Observe how close the output is to the compiler-generated output.
