How to debug the source
From Emergent
GDB Tricks & Tips
Saving a recover file
A recover file usually isn't saved when the program crashes under gdb. Here's how to make it happen:
- bt -- shows full stack of function calls
- up x -- where x is the frame number next to the pdpMisc::Main call (should be the second-to-last item in the bt output)
- p SaveRecoverFile(1) -- calls this static function on pdpMisc (no ; at the end!)
- You will probably get a warning about unwindonsignal. Ignore it: the recover file has already been created.
Watchpoints for finding when something changes
# in emergent/bin:
./gdb_run
(gdb) b ta_type.cpp:2223    // set breakpoint at start of MemberDef::GetPathName() function
(gdb) run    // run program

# in css console:
emergent> p .LeabraTrain.init_code[1];    // this is the NetCounterInit (causes breakpoint to trigger when printing out counter variable)

# in gdb:
(gdb) p &owner
> (class MemberSpace **) 0x9af7ef0

# this is the key expression:
# have to watch the contents of this memory address
# (can't just watch the address -- need the *)
(gdb) watch *((class MemberSpace **) 0x9af7ef0)
> Hardware watchpoint 3: *(MemberSpace **) 162496240
(gdb) c    // continue

# in css console:
emergent> .LeabraTrain.Init();    // run Init, which I had identified as causing the problem

# in gdb:
> Hardware watchpoint 3: *(MemberSpace **) 162496240
(gdb) bt 10    // give me a backtrace of the last 10 calls on the stack
#0 0x0111911e in MemberSpace::El_SetOwner_ (this=0xa03a7d4, it=0x9af7ea8) at ta_type.cpp:1603
#1 0x008d68ac in taPtrList_impl::El_Own_ (this=0xa03a7d4, it=0x9af7ea8) at ./../ta/ta_list.h:214
#2 0x0110c439 in taPtrList_impl::Add_ (this=0xa03a7d4, it=0x9af7ea8, no_notify=false) at ta_list.cpp:227
#3 0x00908570 in taPtrList<MemberDef>::Add (this=0xa03a7d4, item=0x9af7ea8) at ./../ta/ta_list.h:675
#4 0x008f831f in NetMonItem::ScanObject_InObject (this=0xa03a770, obj=0x9faa680, var=@0xbf9e4b44, mk_col=true, own=0x9fa30f4) at netstru.cpp:6882
....

# aha!!
(gdb) up 4    // go up 4 stack levels
(gdb) l    // list code at that point
# it is calling members.Add!!!
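gdb prints the address in hex for p &owner but in decimal in the "Hardware watchpoint" line, so it is easy to lose track of whether they are the same location. A quick check from any shell, using the hex value from the session above (the variable names are just for illustration):

```shell
# Hex address printed by "p &owner" in the gdb session above.
addr_hex=0x9af7ef0
# Shell arithmetic accepts hex constants and yields the decimal value.
addr_dec=$(($addr_hex))
# This should match the number in gdb's "Hardware watchpoint" line.
echo "$addr_dec"
```

If the two numbers differ, the watchpoint is on a different address than the one you meant to watch.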
Debugging under dmem
Single processor dmem version
Use this when the problem is something basic in the dmem startup code, or something that isn't actually a problem with dmem per se. It just runs the dmem version on a single processor:
mpirun -machinefile machines -np 1 /usr/bin/gdb /usr/local/bin/emergent_mpi++
(make the machines file have just dream in it, then pass args to run as usual)
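The machines file is plain text with one host per line. A minimal sketch of creating it, assuming the host is named dream as in the note above and that you are in the directory you will launch mpirun from (the host variable is only for illustration; substitute your own hostname):

```shell
# Host to run the single dmem process on; "dream" is the host named in the text.
host="dream"
# Write a one-line machinefile containing only that host.
printf '%s\n' "$host" > machines
# Show the file so you can confirm its contents before running mpirun.
cat machines
```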
Multiple processor gdb!
1. Create a job .sh file with PBS directives (just copy from a JOB.xxx.sh file)
gdb_job.sh:
-----------
#PBS -l nodes=9:ppn=1
#PBS -j oe
#PBS -l walltime=120:00:00
2. Run qsub -I on this file:
qsub -I gdb_job.sh
3. Type (or paste) in the commands that you want to run (commands in the .sh file are not run):
cd /media/sdb1/home/dream/oreilly/svn_sims/perception/objrec
mpiexec -allstdin gdb /usr/local/bin/emergent_mpi
(gdb) run -nogui -ni -p hv.ct256.proj log_trials=true tag=_136_9g
The -allstdin option for mpiexec is key: it sends stdin to all of the jobs. Then, if one of them crashes, or you set a breakpoint, all of the jobs will get the commands you type.
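If there is no JOB.xxx.sh file handy to copy from, the job file from step 1 can also be written straight from the shell. This is only a sketch: it reproduces the three #PBS lines shown in step 1, and the node count and walltime should be adjusted for your cluster.

```shell
# Write the job file from step 1; the quoted 'EOF' keeps the lines literal.
cat > gdb_job.sh <<'EOF'
#PBS -l nodes=9:ppn=1
#PBS -j oe
#PBS -l walltime=120:00:00
EOF
# Show the result before handing it to qsub -I.
cat gdb_job.sh
```

Only the #PBS directives matter here, since commands in the .sh file are not run in an interactive job.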
