How to debug the source

From Emergent

GDB Tricks & Tips

Saving a recover file

When Emergent crashes under gdb, a recover file usually isn't saved automatically. Here's how to make it happen:

  • bt -- shows full stack of function calls
  • up x -- where x = the # next to the pdpMisc::Main call (should be the 2nd to last item in bt)
  • p SaveRecoverFile(1) -- call this static function on pdpMisc (no ; at end!)
  • you will probably get a warning about unwindonsignal -- ignore this -- the recover file has been created already.
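
The frame number for the up command can also be read off a saved backtrace by a script. The following is a minimal Python sketch, not part of Emergent: frame_number, recover_commands, and the sample backtrace are made up for illustration (only pdpMisc::Main and SaveRecoverFile(1) come from the steps above), and the regular expression assumes the usual "#N  0xADDR in function (args)" layout of gdb frame lines.

```python
import re

# Matches gdb frame lines such as "#1  0x008d68ac in Foo::Bar (this=0x0) at x.cpp:1".
# The "0xADDR in" part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def frame_number(backtrace, function):
    """Return the frame number of `function` in gdb `bt` output, or None."""
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(2) == function:
            return int(match.group(1))
    return None


def recover_commands(backtrace):
    """Return the gdb commands from the steps above, with the frame filled in."""
    frame = frame_number(backtrace, "pdpMisc::Main")
    if frame is None:
        raise ValueError("pdpMisc::Main not found in backtrace")
    # "up N" lands on frame N only when gdb is still at frame #0,
    # which is the case right after a crash.
    return [f"up {frame}", "p SaveRecoverFile(1)"]


# Hypothetical backtrace, invented for illustration only.
SAMPLE_BT = """\
#0  0x00000000 in crashing_function () at example.cpp:1
#1  0x00000000 in pdpMisc::Main (argc=1, argv=0x0) at example.cpp:2
#2  0x00000000 in main (argc=1, argv=0x0) at example.cpp:3
"""

if __name__ == "__main__":
    for command in recover_commands(SAMPLE_BT):
        print(command)
```

The printed commands are meant to be pasted at the (gdb) prompt; the script does not talk to gdb itself.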

Watchpoints for finding when something changes

# in emergent/bin: 
./gdb_run
(gdb) b ta_type.cpp:2223 // (set breakpoint at the start of the MemberDef::GetPathName() function)
(gdb) run (run program)

# in css console:
emergent> p .LeabraTrain.init_code[1];  // this is the NetCounterInit (caused breakpoint to trigger when printing out counter variable)

# in gdb:
(gdb) p &owner
> (class MemberSpace **) 0x9af7ef0
# this is key expression: 
# have to watch the contents of this memory address 
# (can't just watch the address -- need the *)
(gdb) watch *((class MemberSpace **) 0x9af7ef0) 
> Hardware watchpoint 3: *(MemberSpace **) 162496240
(gdb) c // continue

# in css console:
emergent> .LeabraTrain.Init(); // run Init, which I had identified as causing problem

# in gdb:
> Hardware watchpoint 3: *(MemberSpace **) 162496240
(gdb) bt 10  // give me a backtrace of last 10 calls on stack
#0  0x0111911e in MemberSpace::El_SetOwner_ (this=0xa03a7d4, it=0x9af7ea8) at ta_type.cpp:1603
#1  0x008d68ac in taPtrList_impl::El_Own_ (this=0xa03a7d4, it=0x9af7ea8) at ./../ta/ta_list.h:214
#2  0x0110c439 in taPtrList_impl::Add_ (this=0xa03a7d4, it=0x9af7ea8, no_notify=false) at ta_list.cpp:227
#3  0x00908570 in taPtrList<MemberDef>::Add (this=0xa03a7d4, item=0x9af7ea8) at ./../ta/ta_list.h:675
#4  0x008f831f in NetMonItem::ScanObject_InObject (this=0xa03a770, obj=0x9faa680, var=@0xbf9e4b44, mk_col=true, own=0x9fa30f4) at netstru.cpp:6882 ....

# aha!! 
(gdb) up 4 // go up 4 stack levels
(gdb) l // list code at that point

# it is calling members.Add!!!
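
The step from p &owner to the watch command is mechanical: wrap the printed type and address in a cast and dereference it. Below is a small Python sketch of that transformation, using the values from the session above. watch_command is a hypothetical helper name, not a gdb or Emergent function. The last line also checks that the decimal number in gdb's watchpoint message is the same address as the hex value.

```python
import re


def watch_command(print_output):
    """Build the gdb watch command from the output of `p &member`."""
    match = re.search(r"\((.+?)\)\s+(0x[0-9a-fA-F]+)", print_output)
    if match is None:
        raise ValueError("no '(type) 0xADDR' found in: " + print_output)
    pointer_type, address = match.groups()
    # The leading * makes gdb watch the memory at the address, not the constant.
    return f"watch *(({pointer_type}) {address})"


if __name__ == "__main__":
    print(watch_command("(class MemberSpace **) 0x9af7ef0"))
    # -> watch *((class MemberSpace **) 0x9af7ef0)
    print(int("0x9af7ef0", 16))
    # -> 162496240
```

Recent versions of gdb also accept watch -l owner (short for -location), which watches the address that the expression evaluates to and so may save the manual cast. Whether the gdb used in the session above supports it is not shown there.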

Debugging under dmem

Single processor dmem version

Use this when the problem is in the basic dmem startup code, or isn't really a problem with dmem itself. It simply runs the dmem build on a single processor:

mpirun -machinefile machines -np 1 /usr/bin/gdb /usr/local/bin/emergent_mpi++

(Make the machines file contain just dream, then pass args during run as usual.)
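
As a concrete sketch of the note about the machines file: it is a plain text file with one host name per line, so for a single-processor run it contains only dream. The Python below just builds the file contents and the command line shown above. The helper names are hypothetical, and nothing here is run by Emergent itself.

```python
import shlex


def machinefile_text(hosts):
    """Return machinefile contents: one host name per line."""
    return "".join(host + "\n" for host in hosts)


def gdb_under_mpirun(machinefile, program, nprocs=1, gdb="/usr/bin/gdb"):
    """Return the argv that starts `program` under gdb through mpirun."""
    return ["mpirun", "-machinefile", machinefile, "-np", str(nprocs), gdb, program]


if __name__ == "__main__":
    # Contents to save as the file named "machines".
    print(machinefile_text(["dream"]), end="")
    # Command line to run from the directory that holds that file.
    print(shlex.join(gdb_under_mpirun("machines", "/usr/local/bin/emergent_mpi++")))
```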

Multiple processor gdb!

1. Create a job .sh file with PBS directives (just copy from a JOB.xxx.sh file)

gdb_job.sh:
-----------
#PBS -l nodes=9:ppn=1
#PBS -j oe
#PBS -l walltime=120:00:00

2. Run qsub -I on this file:

qsub -I gdb_job.sh

3. Type (paste) in the commands that you want to run (commands in the .sh file are not run)

cd /media/sdb1/home/dream/oreilly/svn_sims/perception/objrec
mpiexec -allstdin gdb /usr/local/bin/emergent_mpi
(gdb) run -nogui -ni -p hv.ct256.proj log_trials=true tag=_136_9g

The -allstdin option for mpiexec is key: it sends stdin to all of the jobs. Then, if one of them crashes or you set a breakpoint, all of the jobs will get the commands you type.
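
For reference, the three PBS directives in the job file from step 1 request 9 nodes with 1 processor per node, merge stderr into stdout, and set a wall-clock limit of 120 hours. The following short Python sketch generates the same header for a different node count. pbs_header is a hypothetical helper, and its defaults are simply the values from gdb_job.sh.

```python
def pbs_header(nodes=9, ppn=1, walltime="120:00:00"):
    """Return the PBS directive lines used in gdb_job.sh."""
    return "\n".join([
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # node count, processors per node
        "#PBS -j oe",                        # join stderr into stdout
        f"#PBS -l walltime={walltime}",      # wall-clock limit, HH:MM:SS
    ]) + "\n"


if __name__ == "__main__":
    # Example: the same job file, but asking for 4 nodes.
    print(pbs_header(nodes=4), end="")
```

Because commands in the file are not run under qsub -I, the directives are the only part of the job file that matters for this workflow.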
