Using GDB
K42 has pretty good support for remote GDB.
Getting the right version of GDB
As with the rest of the toolchain, you need powerpc64-linux-gdb, with some special patches applied. FIXME: where do you get them?
(I have the single-step-mode patch at http://ozlabs.org/~jk/projects/k42/patches/single-step-mode.patch. This was enough to get my recent gdb to work)
To get an early breakpoint, use `K42_EARLY_BREAKPOINT=1'.
Debugging against hardware will not work unless you do `set single-step-mode 0' in gdb.
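As a rough sketch of how the hardware setting might be applied at startup rather than typed by hand each time, the helper below prints a gdb command line that runs `set single-step-mode 0' before connecting. The function name k42_hw_gdb_cmd, the symbol file argument and the host:port argument are placeholders invented for this example; only the single-step-mode setting comes from this page, and it assumes a gdb built with the patch mentioned above.

```shell
# Hypothetical helper: print a gdb command line for debugging against hardware.
#   $1 = symbol file to load (placeholder, choose the image you are debugging)
#   $2 = host:port of the remote GDB stub (placeholder, site specific)
k42_hw_gdb_cmd() {
    # -ex runs each gdb command in order once the symbol file is loaded,
    # so single-step-mode is already off when the remote connection is made.
    printf '%s\n' "powerpc64-linux-gdb -ex 'set single-step-mode 0' -ex 'target remote $2' $1"
}
```

Print the command to check it, then paste it into your shell once the placeholders are replaced with real values for your site.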
Debugging user processes
A few handy environment variables:
- TRACEPROC=process: print the syscalls (with arguments and return values) invoked by any process matching process
- DEBUGPROC=process: break in any process matching process immediately after exec()
Send a breakpoint to a process by entering 'I' at the test> prompt.
To debug shared libraries, load the symbol file lib/exec.so.dbg in the debugger.
Connecting to a blocked thread
1. Use the P -dump to identify the thread id of the blocked thread.
2. Interrupt the blocked process.
3. In gdb do: "set UserThreadToUnblock=<QQQQQQQ>", where "<QQQQQQQ>" is the numeric id of the blocked thread.
4. Set a breakpoint at the return point of the "Scheduler::Block" call in which the thread is blocked.
5. "Continue" in gdb. The app will resume execution and will hit the breakpoint you set above, in the thread that was blocked. You should now be able to examine local variables on the stack of that thread.
You will be able to identify things such as:
- the FileLinux object that caused the block
- the type of this object (e.g. pipe? socket?)
- the file descriptor
- possibly even the point in the application that was the source of the blocking system call
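Steps 3 to 5 above can be collected into a small gdb command script, sketched below. The helper name blocked_thread_script is made up for this example, and both arguments have to be found by hand first: the thread id from step 1 and the address of the return point of the Scheduler::Block call. It uses gdb's `set var' form of the assignment, which is assumed here to behave the same as the plain `set' shown in step 3, and adds `backtrace' and `info locals' to start the inspection.

```shell
# Hypothetical helper: print a gdb command script covering steps 3-5.
#   $1 = numeric id of the blocked thread (from the dump in step 1)
#   $2 = address of the return point of the Scheduler::Block call
blocked_thread_script() {
    # Step 3: tell K42 which thread to unblock.
    printf 'set var UserThreadToUnblock = %s\n' "$1"
    # Step 4: break where the blocked thread returns from Scheduler::Block.
    printf 'break *%s\n' "$2"
    # Step 5: resume, then look at the stack of the thread that was blocked.
    printf 'continue\n'
    printf 'backtrace\n'
    printf 'info locals\n'
}
```

After interrupting the blocked process (step 2), redirect the output to a file, for example `blocked_thread_script <id> <address> > unblock.gdb', and load it in gdb with `source unblock.gdb'.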
Debugging the kernel
Miscellaneous
These might come in handy when debugging:
- An example .gdbinit file with some useful commands for K42
- kgdb: a script for starting GDB with predefined commands for attaching to all the victims at your site. To use, run kgdb <filename> and then use target-victim-kern or target-victim-user to attach to the kernel or the first debugged user process on victim.
Debugging Memory Leaks
We support a facility called LeakProof for debugging memory leaks in applications and in the kernel. It keeps track of every memory allocation, together with the stack backtrace at the point where the allocation happened. On a free, the entry is removed. After running experiments, you can dump the list of allocations that had no corresponding free for any process in the system.
In /os/kernel/defines/mem_debug.H, enable LeakProof by defining DEBUG_LEAK.
On the console, type "0" to get to control, then "K pid" to clear the information already collected about a pid. To dump the leak information about a particular pid, type "0" to get to control, then "L pid".
To display the dump in a more useful form, save it to a file (gorp here), delete the first word on each line (the allocated address), then:
cat gorp | sort | uniq -c > gorp2
Edit gorp2 so that each address starts on its own line, then run:
dezig boot_image.dbg < gorp2
You will get something like:
Using binary image: boot_image.dbg
1 10
.ProcessServer::ClientData::operator new(unsi ProcessServer.C 44
.ProcessServer::giveAccessSetClientData(Objec ProcessServer.C 148
.ProcessReplicated::giveAccessByServer(Object ProcessReplicated.H 112
.ProcessSetKern::RegisterPIDGetOH(ObjectHandl ProcessSetKern.C 78
1 10
.ProcessServer::ClientData::operator new(unsi ProcessServer.C 44
.ProcessServer::giveAccessSetClientData(Objec ProcessServer.C 148
.ProcessReplicated::giveAccessByServer(Object ProcessReplicated.H 112
.ProcessServer::_Create(ObjectHandle&, unsign ProcessServer.C 137
Or, for adventurous people, here is a sequence that does it all. Just cut and paste the console output into this script:
cut -c "23-" | sort | uniq -c | awk '{printf "instances %s size %s\n %s %s %s %s\n", $1, $2, $3, $4, $5, $6}' | dezig boot_image.dbg
Then hit "^D" to signal end of file.
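The sequence above removes the address with a fixed column cut (cut -c "23-"). As a hedged alternative for dumps where the address field is not exactly that wide, the sketch below drops the first whitespace-delimited word instead and then counts identical records. The function name summarize_leaks is invented for this example, and the assumption that each dump line begins with the allocated address comes from the description earlier in this section.

```shell
# Hypothetical helper: read LeakProof dump lines on stdin, drop the first
# word of each line (the allocated address), and count identical records.
summarize_leaks() {
    # The sed expression removes leading blanks, the first word, and the
    # blanks that follow it, so the remaining fields line up for uniq -c.
    sed -e 's/^[[:space:]]*[^[:space:]]\{1,\}[[:space:]]*//' | sort | uniq -c
}
```

Run it as `summarize_leaks < gorp > gorp2'. The output has the same count-prefixed form as gorp2 above, so it can be passed on to awk and dezig in the same way.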
Remote/Shared Debugging
The Watson folk tend to use VNC to share a console, but the latency is a pain when connecting from Australia. SharedConsoles describes how to use screen to share a text console.
Early boot debugging
When you need to debug a problem that occurs before err_printf is supported, on G5 machines you can use the tick and printUval functions. These can help track down exactly where the problem is occurring.
Fixing Problems
K42 broke my G5!
An attempted boot of K42 on a G5 failed, and now the machine won't boot any OS.
You've probably tried to boot the IBM (RS/6000) boot image rather than the Mac boot image. The former sets real-mode? to true in the machine's Open Firmware. To fix:
- Perform a prom-reset by booting with apple+option+p+r held down. This will set all of the open firmware settings to their defaults.
- Boot into OF by holding down apple+option+o+f
- Set real-mode? back to false:
setenv real-mode? false
You may have to restore other OF settings - for example, at OzLabs we also need to:
setenv load-base 0x4000000
setenv boot-device enet:dhcp-server-ip,filename
Next time you try K42, boot the chrpboot.mac image rather than chrpboot.tok (if you're using k42console, run it with -f chrpboot.mac).
