Python Object Graphs

Build Status Build Status (Windows) Test Coverage Documentation Status

objgraph is a module that lets you visually explore Python object graphs.

You’ll need graphviz if you want to draw the pretty graphs.

I recommend xdot for interactive use. pip install xdot should suffice; objgraph will automatically look for it in your PATH.

Installation and Documentation

pip install objgraph or download it from PyPI.

Documentation lives at https://mg.pov.lt/objgraph.

Quick start

Try this in a Python shell:

>>> x = []
>>> y = [x, [x], dict(x=x)]
>>> import objgraph
>>> objgraph.show_refs([y], filename='sample-graph.png')
Graph written to ....dot (... nodes)
Image generated as sample-graph.png

(If you’ve installed xdot, omit the filename argument to get the interactive viewer.)

You should see a graph like this:

[graph of objects reachable from y]

If you prefer to handle your own file output, you can provide a file object to the output parameter of show_refs and show_backrefs instead of a filename. The contents of this file will contain the graph source in DOT format.

Backreferences

Now try

>>> objgraph.show_backrefs([x], filename='sample-backref-graph.png')
... # doctest: +NODES_VARY
Graph written to ....dot (8 nodes)
Image generated as sample-backref-graph.png

and you’ll see

[graph of objects from which y is reachable]

Memory leak example

The original purpose of objgraph was to help me find memory leaks. The idea was to pick an object in memory that shouldn’t be there and then see what references are keeping it alive.

To get a quick overview of the objects in memory, use the imaginatively-named show_most_common_types():

>>> objgraph.show_most_common_types() # doctest: +RANDOM_OUTPUT
tuple                      5224
function                   1329
wrapper_descriptor         967
dict                       790
builtin_function_or_method 658
method_descriptor          340
weakref                    322
list                       168
member_descriptor          167
type                       163

But that’s looking for a small needle in a large haystack. Can we limit our haystack to objects that were created recently? Perhaps.

Let’s define a function that “leaks” memory

>>> class MyBigFatObject(object):
...     pass
...
>>> def computate_something(_cache={}):
...     _cache[42] = dict(foo=MyBigFatObject(),
...                       bar=MyBigFatObject())
...     # a very explicit and easy-to-find "leak" but oh well
...     x = MyBigFatObject() # this one doesn't leak

We take a snapshot of all the objects counts that are alive before we call our function

>>> objgraph.show_growth(limit=3) # doctest: +RANDOM_OUTPUT
tuple                  5228     +5228
function               1330     +1330
wrapper_descriptor      967      +967

and see what changes after we call it

>>> computate_something()
>>> objgraph.show_growth() # doctest: +RANDOM_OUTPUT
MyBigFatObject        2        +2
dict                797        +1

It’s easy to see MyBigFatObject instances that appeared and were not freed. I can pick one of them at random and trace the reference chain back to one of the garbage collector’s roots.

For simplicity’s sake let’s assume all of the roots are modules. objgraph provides a function, is_proper_module(), to check this. If you’ve any examples where that isn’t true, I’d love to hear about them (although see Reference counting bugs).

>>> import random
>>> objgraph.show_chain(
...     objgraph.find_backref_chain(
...         random.choice(objgraph.by_type('MyBigFatObject')),
...         objgraph.is_proper_module),
...     filename='chain.png') # doctest: +NODES_VARY
Graph written to ...dot (13 nodes)
Image generated as chain.png
[chain of references from a module to a MyBigFatObject instance]

It is perhaps surprising to find linecache at the end of that chain (apparently doctest monkey-patches it), but the important things – computate_something and its cache dictionary – are in there.

There are other tools, perhaps better suited for memory leak hunting: heapy, Dozer.

Reference counting bugs

Bugs in C-level reference counting may leave objects in memory that do not have any other objects pointing at them. You can find these by calling get_leaking_objects(), but you’ll have to filter out legitimate GC roots from them, and there are a lot of those:

>>> roots = objgraph.get_leaking_objects()
>>> len(roots) # doctest: +RANDOM_OUTPUT
4621
>>> objgraph.show_most_common_types(objects=roots)
... # doctest: +RANDOM_OUTPUT
tuple          4333
dict           171
list           74
instancemethod 4
listiterator   2
MemoryError    1
Sub            1
RuntimeError   1
Param          1
Add            1
>>> objgraph.show_refs(roots[:3], refcounts=True, filename='roots.png')
... # doctest: +NODES_VARY
Graph written to ...dot (19 nodes)
Image generated as roots.png
[GC roots and potentially leaked objects]

Support and Development

The source code can be found in this Git repository: https://github.com/mgedmin/objgraph.

To check it out, use git clone https://github.com/mgedmin/objgraph.

Report bugs at https://github.com/mgedmin/objgraph/issues.

For more information, see Hacking on objgraph.