Optimising Raspberry Pi code

In any low-resource system, you need to make maximum use of what is available. Profiling helps you to figure out where to focus your efforts.


When you develop programs to run on a Raspberry Pi, one of the main challenges is making them as efficient as possible. This is true for any system, but even more so for systems with limited resources. One of the key tools that you can use to help you optimise your code is a profiler. Once your program grows beyond a certain size, it becomes essentially impossible to tell by inspection where you should spend your time optimising it. Using a profiler, you can get a clear idea of which parts of your program get used the most. These sections are where you should focus your efforts so that you get the biggest impact. You should be aware that profiling is different from benchmarking. Your program will take longer to run under a profiler because of the additional overhead of collecting the data. The important information is not the absolute time used in different sections of code, but the percentage of the total time spent in each section.

With compiled languages, like C or C++, profilers are external programs that need to link in to your running code and poke around to see what is happening. With Python, profilers are modules that you load just before you run your code. This is one of the great advantages of coding in an interpreted language. You can squeeze other code in to handle monitoring activities. This month, we will take a look at two of the most popular profiling modules and see how they can help optimise your code.

The first profiling module we will look at is cProfile. This module is actually a C extension. Being written in C, it is optimised to minimise the amount of overhead involved, making it very good for any programs that have a longer run time. This is actually the recommended option for most people. You load the cProfile module with the command:

import cProfile

You can now run Python code through the profiler. The call requires the same format you would use if you were to ‘exec’ some code. The easiest thing to do here is to have your code packaged as a function. As a quick and dirty example, let’s write a function that calculates the cube of all of the integers up to some limit:

def looper(size):
   for a in range(size):
      b = a ** 3

You can then run this function in the profiler with the command:

cProfile.run('looper(1000000)')

The output you get looks like this:

4 function calls in 0.878 seconds
Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1      0.668   0.668   0.878   0.878 :1()
1      0.000   0.000   0.000   0.000 {method ‘disable’ of....
1      0.210   0.210   0.210   0.210 {range}

You get six columns of output from a profile run. The first column is the number of times that particular function is called. The second column is the total amount of time used by that function, minus any time spent in calls to subfunctions. You should notice that the time values are given to a precision of 0.001 seconds. This is because the profiler only records time to within the nearest clock tick. On most machines today, the clock tick is set to 10ms. In most cases, this should be good enough. If you are worrying about functions that take less than 10ms to run, you probably need to be working closer to the metal than you can get with Python.

The third column is the amount of time used per call. The fourth and fifth columns give the same times, except they include the amount of time spent in calls to subfunctions. The last column is the location of the relevant function call. Some of these can get quite long, so in our example the longer lines have been truncated a bit for space. You may also see two numbers for a given line in the first column. These cases are when a particular function is recursive. In the code listing, you can see a classic factorial function and the output from a profiling run.
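As a rough sketch of how the recursive case shows up in the first column, the following assumes Python 3 and uses the Profile class from cProfile with its runcall method, rather than cProfile.run, so that the report can be captured as a string instead of printed directly. The helper name profile_report is invented for this illustration.

```python
import cProfile
import io
import pstats


def myfact(a):
    """Recursive factorial, matching the one in the full code listing."""
    if a == 0:
        return 1
    return a * myfact(a - 1)


def profile_report(func, *args):
    """Profile a single call of func and return the report as text.

    profile_report is a hypothetical helper written for this example.
    """
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)

    # Send the report to a string buffer instead of standard output.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.strip_dirs().sort_stats("cumulative").print_stats()
    return buf.getvalue()


report = profile_report(myfact, 50)
print(report)
```

In the printed report, the ncalls entry for myfact should read 51/1. The first number is the total number of calls, including the recursive ones, and the second is the number of primitive (non-recursive) calls.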

While this is great when playing around in an interactive session, what can you do if you want to keep the profiling data for later analysis? The run function of cProfile will accept a second parameter containing the name of a file where all of the profiling data will be saved. So you could call:

cProfile.run('myfunction()', 'prof_data')

The profiler will then save all of the collected data into the file ‘prof_data’. Now you can have a permanent record of your progress in trying to optimise your code. You can use the pstats module to load the data and play around with it. You could load the data, strip any extraneous path information from the module names and sort it with:

import pstats
p = pstats.Stats('prof_data')
p.strip_dirs().sort_stats(-1).print_stats()
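The save-and-reload workflow can be sketched end to end as follows. This is only an illustration, assuming Python 3: the helper names save_profile and top_entries are made up for the example, the data file is written to a temporary directory so nothing is left behind, and the report is sorted by the tottime column and limited to a couple of entries.

```python
import cProfile
import io
import os
import pstats
import tempfile


def looper(size):
    """Cube a range of integers to use up some CPU time."""
    for a in range(size):
        b = a ** 3


def save_profile(path, func, *args):
    """Profile one call of func and dump the raw data to path (hypothetical helper)."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    profiler.dump_stats(path)


def top_entries(path, count):
    """Load saved profiling data and return the top entries by tottime as text."""
    buf = io.StringIO()
    stats = pstats.Stats(path, stream=buf)
    # An integer passed to print_stats limits the number of lines shown.
    stats.strip_dirs().sort_stats("tottime").print_stats(count)
    return buf.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    data_file = os.path.join(tmp, "prof_data")
    save_profile(data_file, looper, 100000)
    summary = top_entries(data_file, 2)

print(summary)
```

Sorting by tottime puts the functions that burn the most time in their own code at the top, which is usually the first place to look when deciding what to optimise.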

These techniques work fine when you are still in the development stage, but what can you do if you inherit some modules? How do you decide where to start? You can run full scripts through cProfile from the command line with:

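python -m cProfile -o prof_data your_script.py

This will run the profiler on the script named at the end of the command (your_script.py here is only a placeholder for your own file) and save the profiling data to the file ‘prof_data’.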

The second popular profiling module is profile. This module provides the same interface as cProfile, but is written purely in Python. This means that there is a good deal more overhead involved, so the run time for a profiling run may be significant. But, since it is pure Python, it is open to being modified easily. This is a great help if you want to build more functionality on top of that provided by profile. Also, profile will work on systems where cProfile is not implemented. The one that you decide to use will depend on your specific needs.
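Because the two modules share an interface, one common pattern is to try cProfile first and fall back to profile only if the import fails. The sketch below assumes Python 3; the alias profiler_module and the helper run_profiled are names chosen just for this example.

```python
import io
import pstats

# Prefer the C implementation; fall back to the pure Python one.
try:
    import cProfile as profiler_module
except ImportError:
    import profile as profiler_module


def looper(size):
    """Cube a range of integers to use up some CPU time."""
    for a in range(size):
        b = a ** 3


def run_profiled(func, *args):
    """Profile one call of func with whichever module was imported (hypothetical helper)."""
    profiler = profiler_module.Profile()
    profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.strip_dirs().sort_stats("cumulative").print_stats()
    return buf.getvalue()


output = run_profiled(looper, 10000)
print("Profiled with:", profiler_module.__name__)
print(output)
```

The rest of the code does not need to know which module was loaded, since both provide a Profile class whose results pstats can read.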

With the information that we have covered this issue, you should have the tools you need to be able to figure out where to apply your optimisation skills. So now, you should have no excuses for having slow and bloated code. So, go forth and optimise.

Full code listing

# Here we will run through some
# examples for cProfile

# We will define a looping function
# that calculates the cube of a range
# of numbers to chew up CPU cycles
def looper(size):
   for a in range(size):
      b = a ** 3

# Start by importing cProfile
import cProfile

# Run looper for a large size
# and profile its behaviour
cProfile.run('looper(10000)')

# We'll set up a "bad" factorial
# function to look at recursion
def myfact(a):
   if a == 0:
      return 1
   return a * myfact(a-1)

# We'll run this and save it off
# to a file for later
cProfile.run('myfact(50)', 'prof_data')

# Now we will load it in to pstats
import pstats
p = pstats.Stats('prof_data')