Using Python to debug C and C++ code (using gdb)

David Malcolm, Red Hat

PyCon US 2011

These slides can also be seen via



I'm going to assume basic familiarity with Python, and with either C or C++

Hopefully you've used gdb at least once.

You need gdb 7.0 or later, built with Python embedding enabled.

Why I love this technology

As it happens, the crashing program was itself in Python

Python saves the day

Interactive Python within gdb

The gdb module is built in

Use help if you get lost:

    (gdb) python help(gdb)
    Help on package gdb:


Here's the C code I had to debug:

            static PyObject *interned; /* actually a PyDictObject */

            typedef struct _dictobject PyDictObject;

            struct _dictobject {

               /* (Fields snipped for simplicity) */

               /* Something within here was being
                  corrupted: */
               PyDictEntry *ma_table;

Looking up data

What's the type of the data?

Getting at an underlying pointer value

Use long to extract a pointer value from a gdb.Value:

            (gdb) python print hex(long(val))

Don't confuse this with the address of the gdb.Value wrapper object, which lives inside the gdb process itself:

            (gdb) python print repr(val)
            <gdb.Value object at 0x7f52bf44bdc0>

Casts and pointers


In Python terms, first we need the type:

            (gdb) python \
            type_dict_ptr = \

            (gdb) python print type_dict_ptr
            PyDictObject *

Note how we used gdb.Type.pointer


Now we can cast val:

            (gdb) python val2 = val.cast(type_dict_ptr)
            (gdb) python print val2
            (gdb) python print val2.type
            PyDictObject *

So val2 is equivalent to ((PyDictObject*)interned)
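The lookup-then-cast steps above can be wrapped in a small helper. This is only a sketch: the helper name `cast_to_struct_pointer` and its `gdb_mod` parameter are illustrative, not part of gdb. It assumes the type was obtained with `gdb.lookup_type`, which is the usual way to reach `gdb.Type.pointer`, and it takes the `gdb` module as an argument so the file can also be imported outside gdb.

```python
def cast_to_struct_pointer(gdb_mod, value, type_name):
    """Cast `value` to a pointer to the type called `type_name`.

    `gdb_mod` is the gdb module, passed in explicitly (an assumption of
    this sketch) so the code can be imported where `import gdb` would fail.
    """
    # gdb.lookup_type() returns a gdb.Type; .pointer() yields "T *".
    pointer_type = gdb_mod.lookup_type(type_name).pointer()
    # gdb.Value.cast() returns a new gdb.Value with the requested type.
    return value.cast(pointer_type)
```

Inside gdb this would be called as `cast_to_struct_pointer(gdb, val, 'PyDictObject')`.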

Looking up fields of a structure

Treat a gdb.Value as a dictionary to get at the fields of the underlying data:

            (gdb) python val3 = val2['ma_table']
            (gdb) python print val3
            (gdb) python print val3.type
            PyDictEntry *

So we now have val3, equivalent to:


The easier way

I just showed you the difficult way to do this

The easy way is to use gdb.parse_and_eval directly on a gdb expression:

            (gdb) python \
            val3 = gdb.parse_and_eval(

Looking at C arrays

Pointer and array gdb.Value instances support the Python indexing syntax:

            (gdb) python print val3[2]
            {me_hash = 0, me_key = 0x0, me_value = 0x0}

This is equivalent to the underlying C pointer/array syntax:


Iterating through a data structure

We can then use Python to find all entries in the table satisfying a criterion:

            (gdb) python
            print [i for i in range(8192)
                     if long(val3[i]['me_value']) == 0]
            end
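The same search can be kept as a reusable function. A minimal sketch, assuming a gdb built against Python 3 (so `int` replaces `long`); the name `find_null_value_slots` is made up for illustration. It only relies on indexing and `int()`, so any indexable table of mapping-like entries will do.

```python
def find_null_value_slots(table, size):
    """Return every index i in range(size) whose 'me_value' field is NULL.

    `table` is expected to behave like the val3 pointer from the slides:
    table[i] gives an entry, and entry['me_value'] converts to an integer.
    """
    return [i for i in range(size) if int(table[i]['me_value']) == 0]
```

Inside gdb this would be called as `find_null_value_slots(val3, 8192)`.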

What was the point of all the above?

Pretty-printers for custom data types

Example: LibreOffice's string types


Without a pretty-printer:

            (gdb) print pWndContents
            (String *) 0x7f842941fcf0

With a pretty-printer:

            (gdb) print pWndContents
            String(u'Hello world')

This will show up everywhere in GDB, including backtraces.

Example: LibreOffice's string types (pt2)

            typedef struct _UniStringData {
                sal_Int32    mnRefCount;
                sal_Int32    mnLen;
                sal_Unicode  maStr[1];
            } UniStringData;

            class String {
                UniStringData*  mpData;

How to write a pretty-printer (1)

Get the program to some known state

Go hunting for instances of the type:

            (gdb) p pSVData->maAppData->mpAppName
            $14 = (String *) 0x7f842941fcf0

            (gdb) p $14->mpData
            $15 = (UniStringData *) 0x7f264fb6fda0

            (gdb) p *$15
            $16 = {mnRefCount = 1, mnLen = 7, maStr = {115}}

How to write a pretty-printer (2)

Now capture it as a Python variable, to make it easy to go peeking inside it:

            (gdb) python
            appName = gdb.parse_and_eval(

            (gdb) python print appName

Poke at it till it works

Here's the fragment of Python code I came up with for printing (String*) values:

            (gdb) python mpData = appName['mpData']
            (gdb) python
                  for i in range(mpData['mnLen'])]

Giving this output:


Wire up the hack into gdb (1)

A pretty-printer is a class:

            class StringPrinter(object):
                def __init__(self, val):
                    # "val" is a gdb.Value
                    # representing a (String *)
                    # in the inferior process
                    self.val = val

Wire up the hack into gdb (2)

with a to_string method:

            def to_string(self):
                mpData = self.val['mpData']
                length = int(mpData['mnLen'])
                maStr = mpData['maStr']
                chars = [unichr(int(maStr[i]))
                         for i in xrange(length)]
                result = u"".join(chars)
                return "String(%r)" % result

Wire up the hack into gdb (3)

def pp_lookup(gdbval):
    # Only for types that are "String *"
    type = gdbval.type.unqualified()
    if type.code == gdb.TYPE_CODE_PTR:
        type =
        t = str(type)
        if t in ("String",):  # one-element tuple; ("String") is a plain str
            return StringPrinter(gdbval)

Wire up the hack into gdb (4)

def register (obj):
    if obj is None:
        obj = gdb

    # Wire up the pretty-printer

register (gdb.current_objfile ())
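For reference, here is one way the lookup and registration steps might fit together. It is a sketch under stated assumptions, not the code from the slides: the helper names `make_lookup` and `register` and their parameters are illustrative, the pointer's target type is taken with `gdb.Type.target()`, and the lookup function is appended to the `pretty_printers` list that both the `gdb` module and each objfile provide. The `gdb` module is passed in as `gdb_mod` so the code can be imported outside gdb.

```python
def make_lookup(gdb_mod, printer_cls, type_names=("String",)):
    """Build a lookup function that picks `printer_cls` for pointers to
    any of `type_names`, and returns None for everything else."""
    def pp_lookup(gdbval):
        typ = gdbval.type.unqualified()
        if typ.code == gdb_mod.TYPE_CODE_PTR:
            # Look at what the pointer points to, ignoring const/volatile.
            target = typ.target().unqualified()
            if str(target) in type_names:
                return printer_cls(gdbval)
        return None  # tells gdb to try the next registered printer
    return pp_lookup


def register(gdb_mod, lookup, objfile=None):
    """Append `lookup` to the objfile's printers, or to the global list
    on the gdb module when no objfile is given."""
    owner = gdb_mod if objfile is None else objfile
    owner.pretty_printers.append(lookup)
```

Inside gdb this would be wired up as `register(gdb, make_lookup(gdb, StringPrinter), gdb.current_objfile())`.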

See the documentation for more details:


Checking for a NULL pointer:

            if 0 == long(self.val):
                return 'NULL'

Safety limit:

            # Don't send gdb into a long loop if it
            # encounters corrupt data:
            length = min(length, 1024)
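Putting the pieces together, a complete printer with both safeguards might look like the sketch below. Assumptions: the NULL check goes first in `to_string`, the constant name `MAX_CHARS` is made up for illustration, and the code uses Python 3 spellings (`int`, `chr`, `range`) rather than the Python 2 ones on the slides, so the result has no `u` prefix.

```python
MAX_CHARS = 1024  # safety limit, as on the slide above


class StringPrinter(object):
    """Sketch of a pretty-printer for (String *) values."""

    def __init__(self, val):
        # "val" is a gdb.Value representing a (String *) in the inferior.
        self.val = val

    def to_string(self):
        # A NULL pointer has nothing to dereference.
        if int(self.val) == 0:
            return 'NULL'
        mp_data = self.val['mpData']
        # Clamp the length so corrupt data cannot cause a huge loop.
        length = min(int(mp_data['mnLen']), MAX_CHARS)
        ma_str = mp_data['maStr']
        chars = [chr(int(ma_str[i])) for i in range(length)]
        return "String(%r)" % "".join(chars)
```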

The Edit/test cycle

Locate some data of the type in question:

            $ PYTHONPATH=$(pwd) gdb --args PROGRAM
            (gdb) python import YOUR_DEBUG_CODE
            (gdb) print SOME_DATA

You don't need to restart the program each time. Edit your code, reload it, and repeat:

            (gdb) python reload(YOUR_DEBUG_CODE)
            (gdb) print SOME_DATA
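The `reload` builtin shown above is Python 2 only. If your gdb is built against Python 3, the equivalent lives in `importlib`. A minimal sketch, with `reimport` as a made-up helper name:

```python
import importlib


def reimport(module_name):
    """Import `module_name` (a no-op if it is already loaded), then
    re-execute its source so that recent edits take effect."""
    module = importlib.import_module(module_name)
    return importlib.reload(module)
```

Inside gdb this would be used as `(gdb) python reimport('YOUR_DEBUG_CODE')`. Note that lookup functions already appended to a `pretty_printers` list stay registered after a reload, so stale entries may need to be removed by hand.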

Write an automated test suite

See Lib/test/ in CPython's source code for examples of this

Hints and tips

Custom gdb commands

Create a subclass of gdb.Command, and write its invoke method.

I've done this for CPython:

See Tools/gdb/ in CPython's source code
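As a rough illustration of the subclass-and-invoke pattern (this is not the CPython code), the sketch below defines a toy command. The command name `count-args` and the factory function `make_count_args_command` are invented for this example. The `gdb` module is passed in as `gdb_mod` so the file can be imported outside gdb.

```python
def make_count_args_command(gdb_mod):
    """Define and instantiate a toy command; creating the instance is
    what registers the command with gdb."""

    class CountArgs(gdb_mod.Command):
        """Print how many whitespace-separated arguments were given."""

        def __init__(self):
            # Arguments: command name, then the command's help category.
            super().__init__("count-args", gdb_mod.COMMAND_DATA)

        def invoke(self, argument, from_tty):
            # gdb passes everything after the command name as one string.
            words = argument.split()
            gdb_mod.write("%d argument(s)\n" % len(words))

    return CountArgs()
```

After `(gdb) python make_count_args_command(gdb)`, typing `count-args a b c` at the gdb prompt would call `invoke` with the string `"a b c"`.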

What have we covered?

Where to go from here

Lots of other Python/gdb functionality

More information

gdb documentation:

Tom Tromey's blog:

The LibreOffice string pretty-printer I wrote:

Python code that groks GNU libc's malloc/free implementation:

Other examples

Q & A