Python
Run time profiling with cProfile
Python is distributed with profiling modules. They describe the run time operation of a pure python program, providing a variety of statistics.
The cProfile module is the recommended one. A simple way to execute your program under its control is:
$ python -m cProfile -s cumulative mypythonscript.py

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       16    0.055    0.003   15.801    0.988 __init__.py:1(<module>)
        1    0.000    0.000   11.113   11.113 __init__.py:35(extract)
      135    7.351    0.054   11.078    0.082 __init__.py:25(iter_extract)
 10350736    3.628    0.000    3.628    0.000 {method 'startswith' of 'str' objects}
        1    0.000    0.000    2.422    2.422 pyplot.py:123(show)
        1    0.000    0.000    2.422    2.422 backend_bases.py:69(__call__)
    ...
Each column gives timing information about each function call: ncalls is the number of calls, tottime the time spent in the function itself, cumtime the time spent including sub-calls, and percall the corresponding per-call averages. The -s cumulative option sorts the results by descending cumulative time.
Note: you can also profile a particular python function such as main():

>>> import profile
>>> profile.run('main()')
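The cProfile module can be driven the same way; a minimal sketch that collects statistics only around a chosen call (main() being a hypothetical function under study) and prints them sorted by cumulative time:

>>> import cProfile
>>> profiler = cProfile.Profile()
>>> profiler.enable()                        # start collecting statistics
>>> main()                                   # hypothetical code under study
>>> profiler.disable()                       # stop collecting
>>> profiler.print_stats(sort='cumulative')  # same columns as the command line report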
Graphical tools to show profiling results
Even though reporting tools are included in the cProfile module, it can be interesting to use graphical tools. Most of them work with a stat file that cProfile can generate through its -o filepath option.
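Such a stat file can also be inspected directly from Python with the standard pstats module; for instance, with a stat file named output.pstats:

>>> import pstats
>>> stats = pstats.Stats('output.pstats')
>>> stats.strip_dirs().sort_stats('cumulative').print_stats(10)  # 10 most expensive entries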
Below are some of the graphical tools that we tested.
Gprof2Dot
is a python based tool that transforms the profiling output into a picture of the call tree graph (using graphviz). A typical profiling session then looks like this:
$ python -m cProfile -o output.pstats mypythonscript.py
$ gprof2dot.py -f pstats output.pstats | dot -Tpng -o profiling_results.png
Each node of the output graph represents a function and has the following layout:
+----------------------------------+
| function name : module name      |
| total time including sub-calls % |  total time including sub-calls %
| (self execution time %)          |------------------------------------>
| total number of self calls       |
+----------------------------------+
Nodes and edges are colored according to the "total time" spent in the functions.
Note: the following small patch lets the node color correspond to the execution time and the edge color to the "total time":
diff -r da2b31597c5f gprof2dot.py
--- a/gprof2dot.py Fri Aug 31 16:38:37 2012 +0200
+++ b/gprof2dot.py Fri Aug 31 16:40:56 2012 +0200
@@ -2628,6 +2628,7 @@
weight = function.weight
else:
weight = 0.0
+ weight = function[TIME_RATIO]
label = '\n'.join(labels)
self.node(function.id,
PyProf2CallTree
is a script that helps visualize profiling data with the KCacheGrind graphical calltree analyzer. This is a more interactive solution than Gprof2Dot but it requires installing KCacheGrind. Typical usage:
$ python -m cProfile -o stat.prof mypythonscript.py
$ python pyprof2calltree.py -i stat.prof -k
The profiling data file is converted by the pyprof2calltree module, whose -k switch automatically opens the result in KCacheGrind.
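pyprof2calltree can also be used from Python code rather than from the command line; a sketch under the assumption that the convert() and visualize() helpers described in its documentation are available (main() is a hypothetical function to profile):

>>> from cProfile import Profile
>>> from pyprof2calltree import convert, visualize
>>> profiler = Profile()
>>> profiler.runctx('main()', globals(), locals())            # collect profiling data
>>> visualize(profiler.getstats())                            # open the data in KCacheGrind
>>> convert(profiler.getstats(), 'profiling_results.kgrind')  # or save a calltree file for later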
There are other tools that are worth testing:
RunSnakeRun is an interactive GUI tool that visualizes profile files using square maps:
$ python -m cProfile -o stat.prof mypythonscript.py
$ runsnake stat.prof
pycallgraph generates PNG images of a call tree with the total number of calls:
$ pycallgraph mypythonscript.py
lsprofcalltree also uses KCacheGrind to display profiling data:
$ python lsprofcalltree.py -o output.log yourprogram.py
$ kcachegrind output.log
C/C++ extension profiling
For optimization purposes one may have python extensions written in C/C++. cProfile will not dig into the call tree of such modules, so dedicated tools (which are not part of the standard Python distribution) must be used to profile a C/C++ extension from python.
yep
is a python module dedicated to the profiling of compiled python extensions. It uses the google CPU profiler:
$ python -m yep --callgrind mypythonscript.py
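If only a portion of the program needs profiling, yep can presumably also be used from within the code; this is only a sketch, assuming yep exposes start() and stop() helpers (check its documentation) to delimit the profiled region around a hypothetical main():

>>> import yep
>>> yep.start('mypythonscript.prof')  # start the google CPU profiler, writing to this file
>>> main()                            # hypothetical code to profile
>>> yep.stop()                        # stop profiling and flush the output file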
Memory Profiler
You may want to keep track of the amount of memory used by a python program. There is an interesting module that fits this need: memory_profiler.
You can fetch the memory consumption of a program over time using:

>>> from memory_profiler import memory_usage
>>> memory_usage((main, (), {}))
memory_profiler can also spot the lines of a program that consume the most memory, and it integrates with pdb and IPython.
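For line-by-line reporting, functions can be decorated with memory_profiler's profile decorator; a minimal sketch, where example.py and allocate() are hypothetical names:

# example.py
from memory_profiler import profile

@profile                    # print a line-by-line memory report for this function
def allocate():
    data = [0] * (10 ** 6)  # allocate a large list
    del data                # release it again

if __name__ == '__main__':
    allocate()

Running python example.py then prints, for each line of allocate(), the memory usage and its increment with respect to the previous line.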
General purpose Profiling
The Linux perf tool gives access to a wide variety of performance counter subsystems. Using perf, any execution configuration (pure python programs, compiled extensions, subprocesses, etc.) may be profiled.
Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache-misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots.
You can get information about execution times with:
$ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py
You can get RAM access information using:
$ perf stat -e cache-misses python mypythonscript.py
Be careful: perf gives the raw values of the hardware counters, so you need to know exactly what you are looking for and how to interpret those values in the context of your program.
Note that you can use Gprof2Dot to get a more user-friendly output:
$ perf record -g python mypythonscript.py
$ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png