Writing a Linux Debugger Part 3: Registers and memory
In the last post we added simple address breakpoints to our debugger. This time we’ll be adding the ability to read and write registers and memory, which will allow us to screw around with our program counter, observe state and change the behaviour of our program.
These links will go live as the rest of the posts are released.
- Registers and memory
- Elves and dwarves
- Source and signals
- Source-level stepping
- Source-level breakpoints
- Stack unwinding
- Handling variables
- Next steps
Registering our registers
Before we actually read any registers, we need to teach our debugger a bit about our target, which is x86_64. Alongside sets of general and special purpose registers, x86_64 has floating point and vector registers available. I’ll be omitting the latter two for simplicity, but you can choose to support them if you like. x86_64 also allows you to access some 64-bit registers as 32, 16, or 8-bit registers, but I’ll just be sticking to 64. Due to these simplifications, for each register we just need its name, its DWARF register number, and where it is stored in the structure returned by ptrace. I chose to have a scoped enum for referring to the registers, then I laid out a global register descriptor array with the elements in the same order as in the ptrace register structure.
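As a rough sketch, the enum and descriptor table described above could look something like this. The field order follows user_regs_struct on x86_64 Linux and the numbers follow the System V x86_64 ABI; the names reg, reg_descriptor and n_registers are choices made for this sketch, and -1 is used here as an assumed marker for registers we don't map to a DWARF number.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <sys/user.h>

enum class reg {
    rax, rbx, rcx, rdx,
    rdi, rsi, rbp, rsp,
    r8,  r9,  r10, r11,
    r12, r13, r14, r15,
    rip, rflags, cs,
    orig_rax, fs_base, gs_base,
    fs, gs, ss, ds, es
};

constexpr std::size_t n_registers = 27;

struct reg_descriptor {
    reg r;
    int dwarf_r;       // -1: no DWARF number used in this sketch (assumption)
    std::string name;
};

// The table only works as an index into user_regs_struct if the struct is
// exactly one 64-bit slot per descriptor.
static_assert(sizeof(user_regs_struct) == n_registers * sizeof(std::uint64_t),
              "descriptor table must cover every field of user_regs_struct");

// Same order as the fields of user_regs_struct in sys/user.h (x86_64).
const std::array<reg_descriptor, n_registers> g_register_descriptors {{
    { reg::r15, 15, "r15" },
    { reg::r14, 14, "r14" },
    { reg::r13, 13, "r13" },
    { reg::r12, 12, "r12" },
    { reg::rbp, 6, "rbp" },
    { reg::rbx, 3, "rbx" },
    { reg::r11, 11, "r11" },
    { reg::r10, 10, "r10" },
    { reg::r9, 9, "r9" },
    { reg::r8, 8, "r8" },
    { reg::rax, 0, "rax" },
    { reg::rcx, 2, "rcx" },
    { reg::rdx, 1, "rdx" },
    { reg::rsi, 4, "rsi" },
    { reg::rdi, 5, "rdi" },
    { reg::orig_rax, -1, "orig_rax" },
    { reg::rip, -1, "rip" },
    { reg::cs, 51, "cs" },
    { reg::rflags, 49, "eflags" },
    { reg::rsp, 7, "rsp" },
    { reg::ss, 52, "ss" },
    { reg::fs_base, 58, "fs_base" },
    { reg::gs_base, 59, "gs_base" },
    { reg::ds, 53, "ds" },
    { reg::es, 50, "es" },
    { reg::fs, 54, "fs" },
    { reg::gs, 55, "gs" },
}};
```

The static_assert is an extra safety net rather than something the approach strictly needs: it fails the build if the table and the struct ever get out of step in size.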
You can typically find the register data structure in /usr/include/sys/user.h if you’d like to look at it yourself, and the DWARF register numbers are taken from the System V x86_64 ABI.
Now we can write a bunch of functions to interact with registers. We’d like to be able to read registers, write to them, retrieve a value from a DWARF register number, and look up registers by name and vice versa. Let’s start with implementing get_register_value. ptrace gives us easy access to the data we want. We just construct an instance of user_regs_struct and give that to ptrace alongside the PTRACE_GETREGS request.
Now we want to read regs depending on which register was requested. We could write a big switch statement, but since we’ve laid out our g_register_descriptors table in the same order as user_regs_struct, we can just search for the index of the register descriptor, and access user_regs_struct as an array of uint64_ts.
The cast to uint64_t is safe because user_regs_struct is a standard layout type, but I think the pointer arithmetic is technically UB. No current compilers even warn about this and I’m lazy, but if you want to maintain utmost correctness, write a big switch statement.
set_register_value is much the same; we just write to the location and write the registers back at the end:
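Here is a minimal sketch of the read and write paths, assuming x86_64 Linux. To keep it short it uses a bare array of register ids (g_register_order, a name made up for this sketch) instead of the full descriptor table, and it copies the slot with memcpy rather than casting and indexing, which sidesteps the UB question at the cost of a little verbosity. slot_index, read_slot and write_slot are hypothetical helpers, and error handling for the ptrace calls is left out.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iterator>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

enum class reg {
    rax, rbx, rcx, rdx, rdi, rsi, rbp, rsp,
    r8, r9, r10, r11, r12, r13, r14, r15,
    rip, rflags, cs, orig_rax, fs_base, gs_base,
    fs, gs, ss, ds, es
};

// Register ids in the same order as the fields of user_regs_struct (x86_64).
constexpr std::array<reg, 27> g_register_order {
    reg::r15, reg::r14, reg::r13, reg::r12, reg::rbp, reg::rbx,
    reg::r11, reg::r10, reg::r9, reg::r8, reg::rax, reg::rcx,
    reg::rdx, reg::rsi, reg::rdi, reg::orig_rax, reg::rip, reg::cs,
    reg::rflags, reg::rsp, reg::ss, reg::fs_base, reg::gs_base,
    reg::ds, reg::es, reg::fs, reg::gs
};

// Index of r's 64-bit slot inside user_regs_struct.
std::size_t slot_index(reg r) {
    auto it = std::find(g_register_order.begin(), g_register_order.end(), r);
    return static_cast<std::size_t>(std::distance(g_register_order.begin(), it));
}

// Copy one 64-bit slot out of the register struct.
std::uint64_t read_slot(const user_regs_struct& regs, reg r) {
    std::uint64_t value;
    auto base = reinterpret_cast<const unsigned char*>(&regs);
    std::memcpy(&value, base + slot_index(r) * sizeof value, sizeof value);
    return value;
}

// Copy one 64-bit value into the register struct.
void write_slot(user_regs_struct& regs, reg r, std::uint64_t value) {
    auto base = reinterpret_cast<unsigned char*>(&regs);
    std::memcpy(base + slot_index(r) * sizeof value, &value, sizeof value);
}

std::uint64_t get_register_value(pid_t pid, reg r) {
    user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, nullptr, &regs);   // fetch all registers
    return read_slot(regs, r);
}

void set_register_value(pid_t pid, reg r, std::uint64_t value) {
    user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, nullptr, &regs);   // fetch, modify one slot...
    write_slot(regs, r, value);
    ptrace(PTRACE_SETREGS, pid, nullptr, &regs);   // ...and write them all back
}
```

Splitting the slot access from the ptrace calls is only there so the indexing can be exercised on a plain user_regs_struct without a traced process.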
Next is lookup by DWARF register number. This time I’ll actually check for an error condition just in case we get some weird DWARF information:
Nearly finished, now we have register name lookups:
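The DWARF and name lookups are all searches over the descriptor table, so they can be sketched together. The table here is deliberately abbreviated to a few rows so the example stands alone; the function names are assumptions for this sketch, and it returns the register id rather than the value so that it runs without a traced process.

```cpp
#include <algorithm>
#include <array>
#include <stdexcept>
#include <string>

enum class reg { rax, rbx, rcx, rdx, rip };   // abbreviated for the sketch

struct reg_descriptor {
    reg r;
    int dwarf_r;
    std::string name;
};

// Abbreviated stand-in for the full 27-entry table.
const std::array<reg_descriptor, 5> g_register_descriptors {{
    { reg::rbx, 3, "rbx" },
    { reg::rax, 0, "rax" },
    { reg::rcx, 2, "rcx" },
    { reg::rdx, 1, "rdx" },
    { reg::rip, -1, "rip" },
}};

// DWARF number -> register id, rejecting numbers we don't know about.
reg get_register_from_dwarf(int regnum) {
    auto it = std::find_if(g_register_descriptors.begin(), g_register_descriptors.end(),
                           [regnum](const reg_descriptor& rd) { return rd.dwarf_r == regnum; });
    // Negative numbers are this sketch's "no DWARF number" marker, never valid input.
    if (regnum < 0 || it == g_register_descriptors.end()) {
        throw std::out_of_range{"Unknown dwarf register"};
    }
    return it->r;
}

// Register id -> name. Every enumerator has a row, so the search always succeeds.
std::string get_register_name(reg r) {
    auto it = std::find_if(g_register_descriptors.begin(), g_register_descriptors.end(),
                           [r](const reg_descriptor& rd) { return rd.r == r; });
    return it->name;
}

// Name -> register id, rejecting names that aren't in the table.
reg get_register_from_name(const std::string& name) {
    auto it = std::find_if(g_register_descriptors.begin(), g_register_descriptors.end(),
                           [&name](const reg_descriptor& rd) { return rd.name == name; });
    if (it == g_register_descriptors.end()) {
        throw std::out_of_range{"Unknown register name"};
    }
    return it->r;
}
```

Checking the name lookup as well is a small extra, since the name will eventually come straight from whatever the user typed.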
And finally we’ll add a simple helper to dump the contents of all registers:
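The interesting part of the dump is the formatting, so here is a sketch of just that: one register per line, name first, then the value as 16 zero-padded hex digits. format_register is a hypothetical helper; in the debugger you would call it for each descriptor with the value read from the tracee.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Formats a register as e.g. "rax 0x0000000000000042".
std::string format_register(const std::string& name, std::uint64_t value) {
    std::ostringstream out;
    out << name << " 0x"
        << std::setfill('0') << std::setw(16) << std::hex << value;
    return out.str();
}
```

Building the string in a local stream also keeps the sticky std::hex and fill settings from leaking into std::cout.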
As you can see, iostreams has a very concise interface for outputting hex data nicely. Feel free to make an I/O manipulator to get rid of this mess if you like.
This gives us enough support to handle registers easily in the rest of the debugger, so we can now add this to our UI.
Exposing our registers
All we need to do here is add a new command to the handle_command function. With the following code, users will be able to type register read rax, register write rax 0x42 and so on.
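To show the shape of that command handling without needing a traced process, the sketch below runs the commands against a plain map standing in for the tracee's registers. register_file and handle_register_command are made up for this illustration, and split and is_prefix are assumed to be simple string helpers like the ones used for the earlier commands. Values are parsed with std::stoull so that full 64-bit values fit.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the tracee's registers, so the sketch runs anywhere.
using register_file = std::map<std::string, std::uint64_t>;

std::vector<std::string> split(const std::string& s, char delimiter) {
    std::vector<std::string> out;
    std::stringstream ss{s};
    std::string item;
    while (std::getline(ss, item, delimiter)) {
        out.push_back(item);
    }
    return out;
}

// True when s is a non-empty prefix of of, so "reg" matches "register".
bool is_prefix(const std::string& s, const std::string& of) {
    if (s.empty() || s.size() > of.size()) return false;
    return std::equal(s.begin(), s.end(), of.begin());
}

// Returns the text the debugger would print for the command.
std::string handle_register_command(register_file& regs, const std::string& line) {
    auto args = split(line, ' ');
    if (args.size() < 3 || !is_prefix(args[0], "register")) {
        return "unknown command";
    }
    if (is_prefix(args[1], "read") && args.size() == 3) {
        std::ostringstream out;
        out << "0x" << std::hex << regs.at(args[2]);   // at() throws on unknown names
        return out.str();
    }
    if (is_prefix(args[1], "write") && args.size() == 4) {
        // Base 16 accepts an optional 0x prefix; stoull covers the full 64-bit range.
        regs.at(args[2]) = std::stoull(args[3], nullptr, 16);
        return "ok";
    }
    return "unknown command";
}
```

In the real debugger the two map accesses would be calls to get_register_value and set_register_value with the register looked up by name.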
Where is my mind?
We’ve already read from and written to memory when setting our breakpoints, so we just need to add a couple of functions to hide the ptrace call a bit.
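A minimal sketch of such wrappers follows, written as free functions taking the pid. The error checks are an addition of this sketch: PTRACE_PEEKDATA returns the word itself, so -1 can be a legitimate value and errno has to be cleared first and checked afterwards to tell the two apart.

```cpp
#include <cerrno>
#include <cstdint>
#include <system_error>
#include <sys/ptrace.h>
#include <sys/types.h>

// Read one word from the tracee's memory.
std::uint64_t read_memory(pid_t pid, std::uint64_t address) {
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, reinterpret_cast<void*>(address), nullptr);
    if (word == -1 && errno != 0) {
        throw std::system_error{errno, std::generic_category(), "PTRACE_PEEKDATA"};
    }
    return static_cast<std::uint64_t>(word);
}

// Write one word into the tracee's memory.
void write_memory(pid_t pid, std::uint64_t address, std::uint64_t value) {
    long result = ptrace(PTRACE_POKEDATA, pid, reinterpret_cast<void*>(address),
                         reinterpret_cast<void*>(value));
    if (result == -1) {
        throw std::system_error{errno, std::generic_category(), "PTRACE_POKEDATA"};
    }
}
```

Throwing is just one way to surface the failure; returning an optional or an error code would work equally well for a debugger that wants to keep running after a bad address.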
You might want to add support for reading and writing more than a word at a time, which you can do by just incrementing the address each time you want to read another word. You could also use /proc/<pid>/mem instead of ptrace if you like.
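The word-at-a-time loop could be sketched like this. read_block is a hypothetical helper that takes the word reader as a parameter, so in the debugger you would pass something that wraps PTRACE_PEEKDATA, while a test can pass a fake. The /proc/<pid>/mem route is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// ReadWord: any callable taking an address and returning the 64-bit word stored there.
template <typename ReadWord>
std::vector<std::uint8_t> read_block(ReadWord read_word, std::uint64_t address,
                                     std::size_t n_bytes) {
    std::vector<std::uint8_t> bytes(n_bytes);
    std::size_t done = 0;
    while (done < n_bytes) {
        // Read the next word, advancing the address by what we've already copied.
        std::uint64_t word = read_word(address + done);
        // The last word may only be partly wanted.
        std::size_t chunk = std::min(sizeof word, n_bytes - done);
        // The word's object representation is exactly the bytes found at that address.
        std::memcpy(bytes.data() + done, &word, chunk);
        done += chunk;
    }
    return bytes;
}
```

Note that the final read may fetch a few bytes past the requested range, which can fail if the range ends right at the edge of a mapped page.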
Now we’ll add commands for our UI:
Before we test out our changes, we’re now in a position to implement a more sane version of continue_execution. Since we can get the program counter, we can check our breakpoint map to see if we’re at a breakpoint. If so, we can disable the breakpoint and step over it before continuing.
First we’ll add a couple of helper functions for clarity and brevity:
Then we can write a function to step over a breakpoint:
First we check to see if there’s a breakpoint set for the value of the current PC. If there is, we first put execution back to before the breakpoint, disable it, step over the original instruction, and re-enable the breakpoint.
wait_for_signal will encapsulate our usual waitpid pattern:
Finally we rewrite continue_execution like this:
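Putting the pieces of this section together, the flow might be sketched as follows, assuming x86_64 Linux. It is written as free functions over a small debugger_state struct rather than as members, the breakpoint class is a cut-down assumption modelled on an int3-style software breakpoint, breakpoint_site is a hypothetical helper, and ptrace errors are not checked.

```cpp
#include <cstdint>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unordered_map>

inline void* as_ptr(std::uint64_t v) { return reinterpret_cast<void*>(v); }

// Cut-down software breakpoint: swaps the low byte at m_addr for int3 (0xcc).
class breakpoint {
public:
    breakpoint() = default;
    breakpoint(pid_t pid, std::uint64_t addr) : m_pid{pid}, m_addr{addr} {}
    void enable() {
        long data = ptrace(PTRACE_PEEKDATA, m_pid, as_ptr(m_addr), nullptr);
        m_saved = static_cast<std::uint8_t>(data & 0xff);
        ptrace(PTRACE_POKEDATA, m_pid, as_ptr(m_addr), as_ptr((data & ~0xffL) | 0xcc));
        m_enabled = true;
    }
    void disable() {
        long data = ptrace(PTRACE_PEEKDATA, m_pid, as_ptr(m_addr), nullptr);
        ptrace(PTRACE_POKEDATA, m_pid, as_ptr(m_addr), as_ptr((data & ~0xffL) | m_saved));
        m_enabled = false;
    }
    bool is_enabled() const { return m_enabled; }
private:
    pid_t m_pid = 0;
    std::uint64_t m_addr = 0;
    std::uint8_t m_saved = 0;
    bool m_enabled = false;
};

struct debugger_state {
    pid_t pid;
    std::unordered_map<std::uint64_t, breakpoint> breakpoints;
};

std::uint64_t get_pc(pid_t pid) {
    user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, nullptr, &regs);
    return regs.rip;
}

void set_pc(pid_t pid, std::uint64_t pc) {
    user_regs_struct regs;
    ptrace(PTRACE_GETREGS, pid, nullptr, &regs);
    regs.rip = pc;
    ptrace(PTRACE_SETREGS, pid, nullptr, &regs);
}

// int3 is one byte and has already executed, so the PC sits one past the breakpoint.
constexpr std::uint64_t breakpoint_site(std::uint64_t pc) { return pc - 1; }

void wait_for_signal(pid_t pid) {
    int wait_status = 0;
    waitpid(pid, &wait_status, 0);
}

void step_over_breakpoint(debugger_state& dbg) {
    auto site = breakpoint_site(get_pc(dbg.pid));
    auto it = dbg.breakpoints.find(site);
    if (it == dbg.breakpoints.end() || !it->second.is_enabled()) return;
    set_pc(dbg.pid, site);                 // rewind to the original instruction
    it->second.disable();                  // restore the original byte
    ptrace(PTRACE_SINGLESTEP, dbg.pid, nullptr, nullptr);
    wait_for_signal(dbg.pid);
    it->second.enable();                   // put the int3 back for next time
}

void continue_execution(debugger_state& dbg) {
    step_over_breakpoint(dbg);
    ptrace(PTRACE_CONT, dbg.pid, nullptr, nullptr);
    wait_for_signal(dbg.pid);
}
```

Using find instead of indexing the map avoids accidentally inserting an empty breakpoint for every address the program happens to stop at.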
Testing it out
Now that we can read and modify registers, we can have a bit of fun with our hello world program. As a first test, try setting a breakpoint on the call instruction again and continue from it. You should see Hello world being printed out. For the fun part, set a breakpoint just after the output call, continue, then write the address of the call argument setup code to the program counter (rip) and continue. You should see Hello world being printed a second time due to this program counter manipulation. Just in case you aren’t sure where to set the breakpoint, here’s my objdump output from the last post again:
0000000000400936 <main>:
  400936: 55                    push   rbp
  400937: 48 89 e5              mov    rbp,rsp
  40093a: be 35 0a 40 00        mov    esi,0x400a35
  40093f: bf 60 10 60 00        mov    edi,0x601060
  400944: e8 d7 fe ff ff        call   400820 <[email protected]>
  400949: b8 00 00 00 00        mov    eax,0x0
  40094e: 5d                    pop    rbp
  40094f: c3                    ret
You’ll want to move the program counter back to 0x40093a so that the esi and edi registers are set up properly.
In the next post, we’ll take our first look at DWARF information and add various kinds of single stepping to our debugger. After that, we’ll have a mostly functioning tool which can step through code, set breakpoints wherever we like, modify data and so forth. As always, drop a comment below if you have any questions!
You can find the code for this post here.
Let me know what you think of this article on twitter @TartanLlama or leave a comment below!