$24
In this homework, you have to implement a simple instruction-level debugger that allows a user to debug a program interactively at the assembly instruction level. You can implement the debugger by using the ptrace interface.
To simplify your program, your debugger only has to handle static-nopie programs.
We use hello64 (https://up23.zoolab.org/up23/hw2/hello64), hello (https://up23.zoolab.org/up23/hw2/hello), guess (https://up23.zoolab.org/up23/hw2/guess) to demonstrate the usage of the debugger.
Launch the program
Unlike gdb and lldb , your debugger launches the target program when the debugger starts.
The program should stop at the entry point, waiting for the user’s cont or si commands.
usage: ./sdb [program]
./sdb ./hello64
When the program is launched, the debugger should print the name of the executable and the entry point address. Before waiting for the user’s input, the debugger should disassemble 5 instructions starting from the current program counter (rip). The detail requirement is described in the following paragraph.
** program './hello64'
loaded. entry point 0x4000b0
4000b0: b8 04 00
00
00
mov
eax, 4
4000b5: bb 01 00
00
00
mov
ebx, 1
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
(sdb)
Disassemble
When returning from execution, the debugger should disassemble 5 instruction starting from the current program counter. The address of the 5 instructions should be within the range of the text section specified in the ELF file. We do not care about the format, but in each line, there should be
address, eg. 40000b0
raw instructions in grouping of 1 byte, eg. b8 04 00 00 00
mnemonic, eg. mov
operands of the instruction, eg. eax, 4
And make sure the output is aligned with the columns.
Hint: You can link against the capstone library for disassembling.
After typing an invalid command or using a command which is not si , cont , timetravel , the debugger should not disassemble the program.
Patched instructions like 0xcc (int3) should not appear in the output.
(sdb) si
4000c4: cd 80
int
0x80
4000c6: b8 01 00 00 00
mov
eax, 1
4000cb: bb 00 00 00 00
mov
ebx, 0
4000d0: cd 80
int
0x80
4000d2: c3
ret
(sdb) si
hello, world!
4000c6: b8 01 00 00 00
mov
eax, 1
4000cb: bb 00 00 00 00
mov
ebx, 0
4000d0: cd 80
int
0x80
4000d2: c3
ret
the address is out of the range of the text section. (sdb)
(sdb) si
4000cb: bb 00
00 00 00
mov
ebx,
0
4000d0: cd
80
int
0x80
4000d2: c3
ret
** the address is out of the range of the text section.
Step Instruction
When the user use si command, the target program should execute a single instruction.
(sdb) si
4000c4: cd 80
int
0x80
4000c6: b8 01 00 00 00
mov
eax, 1
4000cb: bb 00 00 00 00
mov
ebx, 0
4000d0: cd 80
int
0x80
4000d2: c3
ret
(sdb) si
hello, world!
4000c6: b8 01 00 00 00
mov
eax, 1
4000cb: bb 00 00 00 00
mov
ebx, 0
4000d0: cd 80
int
0x80
4000d2: c3
ret
the address is out of the range of the text section. (sdb)
Continue
The cont command continues the execution of the target program. The program should keep running until it terminates or hits a breakpoint.
You can only use two ptrace(PTRACE_SINGLE_STEP) and two int3 at most in the implementation of cont , or you will get 0 points.
break <address in hexdecimal>
** program './hello64'
loaded. entry point 0x4000b0
4000b0: b8 04 00
00
00
mov
eax, 4
4000b5: bb 01
00
00
00
mov
ebx, 1
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
(sdb) break 0x4000ba
set a breakpoint at 0x4000ba. (sdb) cont
hit a breakpoint at 0x4000ba.
4000ba:
b9
d4
00
60
00
mov
ecx, 0x6000d4
4000bf:
ba 0e 00 00 00
mov
edx, 0xe
4000c4:
cd 80
int
0x80
4000c6:
b8
01
00
00
00
mov
eax, 1
4000cb: bb 00
00
00
00
mov
ebx, 0
(sdb) cont
hello, world!
** the target
program terminated.
Breakpoint
A user can use to set a breakpoint. The target program should
stop before the instruction at the specified address is executed. Then it should print a message
about the program. If the user resumes the program with si
instead of cont , the program
should not stop at the breakpoint twice. The debugger still needs to print the message.
** program './hello64'
loaded. entry point 0x4000b0
4000b0: b8 04 00
00
00
mov
eax, 4
4000b5: bb 01
00
00
00
mov
ebx, 1
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
(sdb) break 0x4000ba
set a breakpoint at 0x4000ba. (sdb) si
4000b5: bb 01
00
00
00
mov
ebx, 1
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
4000c6: b8 01
00
00
00
mov
eax, 1
(sdb) si
** hit a breakpoint
0x4000ba.
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
4000c6: b8 01
00
00
00
mov
eax, 1
4000cb: bb 00
00
00
00
mov
ebx, 0
Time Travel
Sometimes you might see some bugs that are hard to replicate. Use the anchor command set a checkpoint and use the timetravel command to restore the process status.
Hint:
There are two ways to implement this feature.
Snapshot the process memory and general purpose registers.
Patch fork into the target process and stop the parent or child as the checkpoint.
This functionality is inspired by the Checkpoint/Restore In Userspace(CRIU) (https://criu.org/Main_Page). gdb also has a similar feature checkpoint which is implemented in a different way.
** program './hello64'
loaded. entry point 0x4000b0
4000b0: b8 04 00
00
00
mov
eax, 4
4000b5: bb 01 00
00
00
mov
ebx, 1
4000ba: b9 d4
00
60
00
mov
ecx, 0x6000d4
4000bf: ba 0e
00
00
00
mov
edx, 0xe
4000c4: cd 80
int
0x80
(sdb) anchor
dropped an anchor (sdb) break 0x4000cb
set a breakpoint at 0x4000cb (sdb) cont
hello, world!
hit a breakpoint at 0x4000cb
4000cb: bb 00
00 00 00
mov
ebx,
0
4000d0: cd
80
int
0x80
4000d2: c3
ret
the address is out of the range of the text section. (sdb) timetravel
go back to the anchor point
4000b0: b8 04 00 00 00
mov
eax, 4
4000b5: bb 01 00 00 00
mov
ebx, 1
4000ba: b9 d4 00 60 00
mov
ecx, 0x6000d4
4000bf: ba 0e 00 00 00
mov
edx, 0xe
4000c4: cd 80
int
0x80
(sdb) cont
hello, world!
** hit a breakpoint at 0x4000cb
4000cb: bb 00 00 00 00
mov
ebx, 0
4000d0: cd 80
int
0x80
4000d2: c3
ret
** the address is out of the range of the text section.
Examples
Example 1 (10pt)
Command: ./sdb ./hello
Inputs: cont
** program './hello' loaded. entry point 0x401000
401000:
f3
0f
1e fa
endbr64
401004:
55
push
rbp
401005:
48
89
e5
mov
rbp, rsp
401008:
ba 0e
00 00 00
mov
edx, 0xe
40100d:
48
8d
05 ec 0f
00 00
lea
rax, [rip + 0xfec]
(sdb) cont
hello world!
** the target
program terminated.
Example 2 (10pt)
Command: ./sdb ./hello
Inputs:
break 0x401030
break 0x40103b
cont
cont
si
si
** program './hello' loaded. entry point 0x401000
401000: f3 0f 1e fa
endbr64
401004: 55
push
rbp
401005: 48 89 e5
mov
rbp, rsp
401008: ba 0e 00 00 00
mov
edx, 0xe
40100d: 48 8d 05 ec 0f 00 00
lea
rax, [rip + 0xfec]
(sdb) break 0x401030
** set a breakpoint at 0x401030
(sdb) break 0x40103b
** set a breakpoint at 0x40103b
(sdb) cont
** hit a breakpoint at 0x401030
401030: 0f 05
syscall
401032: c3
ret
401033: b8 00 00 00 00
mov
eax, 0
401038: 0f 05
syscall
40103a: c3
ret
(sdb) cont
hello world!
** hit a breakpoint at 0x40103b
40103b: b8 3c 00 00 00
mov
eax, 0x3c
401040: 0f 05
syscall
the address is out of the range of the text section. (sdb) si
401040: 0f 05 syscall
the address is out of the range of the text section. (sdb) si
the target program terminated.
Example 3 (10pt)
Command: ./sdb ./guess
Inputs:
break 0x4010bf
break 0x40111e
cont
anchor
cont
haha
timetravel
cont
42
cont
** program './guess' loaded. entry point 0x40108b
40108b: f3 0f 1e
fa
endbr64
40108f: 55
push
rbp
401090: 48
89
e5
mov
rbp, rsp
401093: 48
83
ec 10
sub
rsp, 0x10
401097: ba 12
00
00
00
mov
edx, 0x12
(sdb) break 0x4010bf
set a breakpoint at 0x4010bf (sdb) break 0x40111e
set a breakpoint at 0x40111e (sdb) cont
guess a number > ** hit a breakpoint at 0x4010bf
4010bf: bf 00 00 00
00
mov
edi, 0
4010c4: e8
67
00
00
00
call
0x401130
4010c9: 48
89
45
f8
mov
qword ptr
[rbp - 8], rax
4010cd:
48
8d
05
3e
0f
00 00
lea
rax,
[rip
+ 0xf3e]
4010d4:
48
89
c6
mov
rsi,
rax
(sdb) anchor
dropped an anchor (sdb) cont
haha
no no
no
** hit a breakpoint at 0x40111e
40111e: bf 00 00 00 00
mov
edi, 0
401123:
e8
10
00
00
00
call
0x401138
401128:
b8
01
00
00
00
mov
eax, 1
40112d:
0f
05
syscall
40112f:
c3
ret
(sdb)
timetravel
** go
back to
the anchor point
4010bf:
bf 00 00 00 00
mov
edi, 0
4010c4:
e8
67
00
00
00
call
0x401130
4010c9:
48
89
45
f8
mov
qword ptr [rbp - 8], rax
4010cd:
48
8d
05
3e
0f 00 00
lea
rax, [rip + 0xf3e]
4010d4:
48
89
c6
mov
rsi, rax
(sdb)
cont
42
yes
** hit a breakpoint at 0x40111e
40111e: bf 00 00 00 00
mov
edi, 0
401123:
e8
10
00
00
00
call
0x401138
401128:
b8
01
00
00
00
mov
eax, 1
40112d:
0f
05
syscall
40112f:
c3
ret
(sdb)
cont
** the target
program terminated.
Grading
2. [70%] We use N additional test cases to evaluate your implementation. You get 70 points for
each correct test case.
N