License : Creative Commons Attribution 4.0 International (CC BY-NC-SA 4.0)
Copyright : Jeremy Fix, CentraleSupelec
Last modified : March 28, 2024 01:04
Link to the source : index.md

Table of contents

Introduction

This tutorial assumes that you are already familiar with the basic syntax of python, you know about conditionals (if, elif), loops (for, while), defining functions (def, maybe even lambda (?)). If not, you should start with introductory textbooks on python.

We may also suggest few links :
- python documentation
- Dive into python3
- Scipy lectures

For coding in Python, you basically need a text editor and a python interpreter. For the text editor, you’d better use an Integrated Development Environment (IDE) such as Visual Studio Code, Atom, Vim, Emacs. For learning, it is discouraged to use editors such as Spyder or jupyter notebooks which keeps a running kernel with a permanent state, which does not guarantee that your code is going to behave the same when ran a brand new python interpreter instance and for which the result may depend on the evaluation order of the cells (for notebooks).

Some advanced python

Preliminary in case you (have to) use python2

This tutorial is using python3. We do not make use of all the bells and whistles of python3 so that you can run the codes in these tutorials even within a python2 interpreter provided you import some future statement definitions at the beginning of your scripts :

from __future__ import nested_scopes 
from __future__ import generators 
from __future__ import division 
from __future__ import absolute_import
from __future__ import with_statement
from __future__ import print_function
from __future__ import unicode_literals

If using ROS1, which is still based on python2, you do not have the choice, you have to use python2. But otherwise, you should move to python3 as python2 is deprecated.

The shebang

A program written in bash, awk, python (basically any langage that is interpreted) is intepreted by a program. When you want to interpret it (you may be tempted to say to run it), you give your code to the interpreter. In Unix OS, there exists an alternative which consists in specifying the interpreter at the first line of your code. Let’s see this an action with an example. Suppose you have the following code in helloworld.py:

print("Let's rock with learning python cool stuff")

It can be interpreted by either the python3 interpreter:

mylogin@mymachine:~$ python3 helloworld.py
Let's rock with learning python cool stuff

But as we said earlier, you can add the so-called shebang which looks like #!/usr/bin/env interpreter. Let’s modify the helloworld.py script to include the appropriate interpreter :

#!/usr/bin/env python3

print("Let's rock with learning python cool stuff")

Now you can make the script runnable (by adding the execute permission to the user) and run it:

mylogin@mymachine:~$ chmod u+x helloworld.py
mylogin@mymachine:~$ ./helloworld.py
Let's rock with learning python cool stuff

Did you really say if __name__ == '__main__' ?? Running a script or importing a module

There are two ways in which you can make use of your python code in a file toto.py. You can either run it as a script :

mylogin@mymachine:~$ python3 toto.py

or import it as a module. For example, from within an interactive console

mylogin@mymachine:~$ python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import toto

Usually, there are some functions you want to be executed when you execute the code as script that you do not want to get executed if you import it. The typical situation is for usage examples. Suppose you write a piece of code which has a bunch of functions, you probably want to show someone how to use these functions. And you want this example code to be executed when someone is running your code as a script but not when someone is importing your code as a module for using it. To know in which situation you are, you can test for the value of the special variable main which equals either “main” if your code is ran as a script or the name of the module if your code is imported as a module.

Let us give a try with toto.py. If we run it as a script :

mylogin@mymachine:~$ python3 toto.py
The __name__ variable equals : __main__
mylogin@mymachine:~$ python3 -c "import toto"
The __name__ variable equals : toto

This leads to the frequently used code :

def an_interesting_function():
    [...]

def example_usage():
    # Some code to illustrate how to use your functions
    [...]

if __name__ == '__main__':
    # Execute your example code only if the code is ran as a script
    # and not if imported as a module
    example_usage()

Classes (basic syntax reminder)

In this section, we say few words about classes and its syntax. You can also dig into the documentation. Every object-oriented programming (OOP) language relies on objects. An object is defined by a class which specifies what the object contains (member variables) and what we can do with it (member functions or class functions); in that respect, an object is a coherent set of pieces of memory (which can themselves be objects) and methods to manipulate them.

To construct an object, we call its constructor which returns an instance; that’s like when you give a plan to a factory (that’s the class) and they give you back a real object (that’s the instance). A basic syntax for defining a class is given below :

class A:
    ''' 
    The docstring of a dummy class
    '''

    def __init__(self, a_value):
        '''
        The constructor

        Args:
            a_value : a value of some type
        '''
        self.my_variable = a_value

    def do_something_with_it(self):
        '''
        A member function
        '''
        # You can access my_variable with the 'self'
        print('Hey, my variable value is {}'.format(self.my_variable))


def main():
    # We build an instance "a" of the class A
    a = A(10)
    a.do_something_with_it()

The class we defined has a constructor which takes an argument. To call the constructor, you call the class name like a function, passing in the arguments if any. Then you can call member functions with the syntax “instance.member_function_name()” as we did for “a.do_something_with_it()”. The self keyword is a reference to the instance on which the function gets called; this allows you to access to the member variables or to call member functions.

Object oriented programming allows for inheritance, allowing objects to inherit from base class: if class B inherits from class A, an instance of B will at least contain the member variables/functions of A plus some extra things that are introduced in the definition of B. That way, you can factor some code rather than rewriting everything you wrote for A into your definition of B. Clearly, if you build a B, you first need to build A and then to customize with the extra things that make a B be a B. As an example, inheriting from the class A above :

class B(A):
    '''
    A dummy class for showing the inheritance syntax
    '''

    def __init__(self, value):
        '''
        The constructor
        '''
        # We call the base class constructor
        A.__init__(self, value)

    def do_another_thing(self):
        # We can call the inherited member functions
        self.do_something_with_it()
        # And also the member variables
        self.my_variable += 1
        print('Hey, my variable changed to {}'.format(self.my_variable))


def main():
    b = B(10)
    b.do_another_thing()

That’s really the basics. You can also override methods in derived classes, you can define special methods like __add__, __getitem__, … so that you can add instances together or allow for the operator [] and there is much more.

Thread-based parallelism

Defining, starting and joining a thread

We all now have multi-core machines that can perform multiple computation at the same time. Threading is the way to allow a process to run multiple possible concurrent computation, i.e. multiple computation “at the same time” that may potentially try to access the same ressources (e.g. memory locations). To write a thread, you can subclass the Thread class and override the run method.

Let us see an example with a thread computing, say, the factorial of \(n\) and the code thread1.py. So, what is going on in this script ? First we import the requested module

import threading

Then we define our Factorial class, inheriting from threading.Thread

class Factorial(threading.Thread):

As usual, the constructor, if defined, needs to call the base class constructor :

    def __init__(self, n):
        '''
        Constructor
        '''
        threading.Thread.__init__(self)
        self.n = n
        self.res = 1

Then, the computation done within the thread is given in the run method. In order the computation to take a bit more time, we added a dummy sleep.

    def run(self):
        '''
        The compute loop
        '''
        res = 1
        for i in range(1, self.n):
            res *= i
            time.sleep(0.1)
        self.res = res

Then, in the main() function, we build and run the thread:

def main():
    '''
    Query a number and trigger the computation of factorial
    '''
    n = int(input("Which value do you want to compute the factorial ?"))
    [...]
    fact = Factorial(n)
    fact.start()
    [...]

We can wait for the thread to finish its work by calling join. That method, without argument, is blocking, i.e. the thread calling it will wait there until the worker thread finishes. However, we can also give it an optional timeout …

    while True:
        # Display a progress
        [...]
        # Wait except if the thread has finished its job
        fact.join(timeout=0.01)

        # If the thread is no more running, i.e. its computation is done
        # just leave the loop
        if not fact.is_alive():
            break
    print("\nThe result of !{} is {}".format(n, fact.res))

Locking critical sections

In a multi-threaded program, you have to be carefull when the same memory location is either written or read and written by multiple threads at the same time; this is a so-called data race. To prevent this, you must protect so-called critical sections. One of such protection mechanisms are Locks (also called mutex). Locks are acquired with the acquire method and released with the release method. To protect a piece of memory from concurrent access, we do something like:

lock = threading.Lock()
some_data = SomeData()
[...]
lock.acquire()
some_data.do_something()
lock.release()

A lock can be acquired only once at a time. If two threads try acquire the lock, the latest thread will be blocked at the acquire waiting for the lock to be released and then will acquire it. You can also acquire/release a lock with a shorter syntax:

lock = threading.Lock()
some_data = SomeData()
[...]
with lock:
    some_data.do_something()

Python has its eye candy and the with statement allows to avoid the heavy syntax of calling explicitely “acquire” and “release” (actually, it is also less cumbersome than ensuring ourselves that the lock is correctly release in cases where exception can be thrown). For this to work, the lock object has the two methods __enter__ and __exit__ which are doing the job of acquiring and releasing the lock. If the code block executed after the with lock:, namely some_data.do_something() in our case, raises an exception, the __exit__ is called and therefore the lock is released (more on this in the with statement documentation).

Let us illustrate this with a producer/consumer problem. Say we have \(N_p\) producers and \(N_c\) consumers that all work with the same FIFO. The producer adds in elements in the FIFO while the consumers get elements out of it. We will use one thread per producer and one thread per consumer. Finally, for displaying purpose, we also want a thread showing the content of the FIFO. The FIFO is therefore shared by multiple threads and its reading and writing operations must be protected by a lock. A possible implementation is given in thread2.py. Every access to shared memories are protected by a Lock. For example, for adding to or popping from the FIFO :

class SharedObject(object):
    """
    A thread safe FIFO
    """

    def __init__(self):
        """
        Constructor
        """
        self.__lock = threading.Lock()
        self.__fifo = []

    def append(self, value):
        """
        Insert the element at the end of the FIFO
        """
        with self.__lock:
            self.__fifo.append(value)

    def pop(self):
        """
        Pop the front element
        Raises IndexError if the list is empty
        """
        with self.__lock:
            value = self.__fifo.pop(0)
        return value

The producer and consumers can now safely, from their own thread, access the FIFO :

class Producer(threading.Thread):

    def __init__(self, shared_object):
        threading.Thread.__init__(self)
        self.shared_object = shared_object
        [...]

    def run(self):
        min_value = 0
        max_value = 100
        num_values = 10
        for _ in range(num_values):
            value = random.randint(min_value, max_value)
            self.shared_object.append(value)
            time.sleep(1. * random.random())

class Consumer(threading.Thread):

    def __init__(self, shared_object):
        threading.Thread.__init__(self)
        self.shared_object = shared_object
        [...]

    def run(self):
        while self.running():
            try:
                value = self.shared_object.pop()
            except IndexError:
                pass
            time.sleep(2. * random.random())

One issue you may encounter is that if a thread has locked a lock, it may need to call a method which itself takes the lock. With a threading.Lock as we have seen it before, your program will be blocked. In that situation, you’d better use a lock that can be relocked by the same thread, a RLock. Even if the same thread can re-lock a RLock it has already locked, it still needs to release the lock the same amount of time if has been locked.

Packaging your code

There are various ways to package you work and the community has not settled for the ultimate solution. We shall see one which is based on setuptools. Also we really just give a quick glance at packaging but there is more to say and you can into the subject on the python packaging user guide.

Installing the dummy project

Our dummy project is this examplePackage. The directory structure is the following :

package
├── README.md
├── setup.py
└── examplePackage
    ├── __init__.py
    ├── thread1.py
    └── thread2.py

To install the package using pip:

mylogin@mymachine:~$ tar -zxvf examplePackage.tar.gz
mylogin@mymachine:~$ cd examplePackage
mylogin@mymachine:~/examplePackage$ python3 -m pip install . --user

This will install the package in the user local directory ~/.local/lib/python3.6/site-packages/ if you use python3.6 (you have the permissions to write in that directory as a normal user while you may not have the permissions to write in the system directories /usr/lib/python3.6/site-packages). Once installed, you can test the installed package :

mylogin@mymachine:~$ python3 -c "import examplePackage; examplePackage.thread2.main()"

Some last comments in random order. You can install the package in developer mode which is usefull if you want to test your package outside the source directory while still developing it. To install in developer mode, you need to python3 -m pip install -e . (and add the --user if required).

Setting up the module

Our module lies in the examplePackage directory. Here it is built from the thread1.py and thread2.py and an additional file, the __init__.py file. That init script is actually what makes the examplePackage directory a python module. This is the script that is executed when you call import examplePackage. In our case, we just import the two modules thread1 and thread2:

from . import thread1
from . import thread2

So that when we import examplePackage, we have access to examplePackage.thread1 and examplePackage.thread2.

Writting the setup.py script

To make a setuptools package, we define a setup.py script which contains at least a call to setuptools.setup():

import setuptools

setuptools.setup(
    name = "...",
    version = "...",
    [...]
    packages = setuptools.find_packages(),
    [...]
)

The magic function setuptools.find_packages is browsing the current directory looking for __init__.py files to determine which packages will be installed. You can do much more, for example providing dependencies that are installed with your packages if missing on the system, compiling C++ extensions, providing entrypoints, etc… We will not cover these and you can refer to the python packaging guides to know more.

An example project: Simulating the Ising model

In order to demonstrate the application of the elements shown in the previous section, we will go through an example python package which features the following:
- a python package preprared with setuptools
- some classes
- a GUI for displaying the state of the simulator
- a multi-threaded program with mutex to handle concurrent access

We will also start with a pretty vague idea of what we expect to find at the end, then perform a design step and finally provide the code that makes it.

Problem statement

The Ising model is a mathematical model describing the evolution of the magnetisation of a material. For a general introduction to the Ising model and the history of its development, have a look in (Hayes, 2000). For a more in depth introduction from the statistical mechanisms side, you can check (Sethna, 2006). It considers the magnetic moment of atoms arranged on a regular lattice. The spin of an atom \(i\) is denoted \(\sigma_i \in \{-1, 1\}\). On a discrete finite lattice, there is a finite set of configurations \(\{\sigma_i, \forall i\}\). Yet, as we shall see, the Ising model brings quite intriguing variation of configurations. In the following, we will consider a square lattice.

A 2D regular lattice with a possible spin configuration, in purple for +1 and black for -1
A 2D regular lattice with a possible spin configuration, in purple for +1 and black for -1

There can also an external field \(H\) that will try enforce the spins to be aligned with it. The Hamiltonian of the system is given by: \[ E(\sigma) = - \sum_{i<j} J_{i,j} \sigma_i \sigma_j - \sum_i H_i \sigma_i \]

The term \(J_{i,j}\) is the coupling between the spins and we will suppose interaction to be uniform over the \(4\) nearest neighbours of the lattice, i.e.  \[ \begin{align} \forall i, j \in \{i\rightarrow, i\uparrow, i\leftarrow, i\downarrow\}, J_{i,j} &= J\\ \forall i, j \notin \{i\rightarrow, i\uparrow, i\leftarrow, i\downarrow\}, J_{i,j} &= 0 \end{align} \]

In the equation above, we denote \(i\rightarrow, i\uparrow, i\leftarrow, i\downarrow\) respectively the index of the unit “on the right of \(i\)”, “on the top of \(i\)”, “on the left of \(i\)” and “on the bottom of \(i\)”. If \(J>0\), the interaction favors the alignment of neighbour spins which tends to make the material ferromagnatic. As you see from the equation and from the illustration above, the coupling uses cyclic boundary conditions (the leftmost spins are interacting with the rightmost spins). If there were no coupling \(J\) between the atoms, the spins would align with the external field \(H\) (to minimize the energy). If there is no external field, minimizing the energy is achieved by getting neighbour spins aligned which finally leads to either \(\sigma_i = +1, \forall i\), or \(\sigma_i = -1, \forall i\). We will suppose in the rest that there is no external field \(H=0\), which still allows to observe interesting behaviors. And there is something else: there is an external source of energy which regulates the temperature of the system \(T\) and which tends to randomize the spins. In ths Ising model, the spin configurations can be occupied with the following probability:

\[\begin{align} P(\sigma) = \frac{1}{Z}\exp(-\beta E(\sigma)), \beta = \frac{1}{k_B T} \end{align}\]

with \(k_B=1.38.10^{-23}\) the Boltzmann’s constant, \(T\) the temperature of the system and \(Z\) a normalization constant. Varying the temperature \(T\) leads to interesting phase transitions in the system, especially around the so-called critical temperature \(T_c\) :

\[\begin{align} T_c &= \frac{2}{k_B \ln(1+\sqrt{2})}\\ k_B T_c &\approx 2.269 \end{align}\]

To simulate the system, ensuring that the states are occupied with the probability given above, we can use the Metropolis algorithm which, applied to the Ising model, is given below :

  1. select randomly an atom \(i\) and compute \(\Delta E = 2 \sigma_i (\sum_j J_{i,j} \sigma_j + H_i)\) which is the variation of energy \(E\) of the system in case of a flip of the spin of \(i\)
  2. if flipping the spin \(\sigma_i\) lowers the energy, i.e. \(\Delta E < 0\), then flip the spin
  3. otherwise, flip the spin with probability \(\exp(-\beta \Delta E)\)

The video below illustrates the behavior of the system for various temperatures :

This is the system we want to simulate and play with. We want to change the temperature and see what happens, we may want to plot some quantities of interest (magnetization \(M(\sigma) = \sum_i \sigma_i\), energy \(E(\sigma)\), correlation length, …). In some extesnions, we may want to interact with the system by enforcing some external field \(H\). From this formal description, we now need to specify what we’ll have to program.

API design

We will need at least three classes :
- a Simulator which will be running in its own thread, performs the update algorithm we mentionned before and also provide functions to set/get the temperature, the spins, etc.. To build the simulator, we need to provide the size of the lattice \(N\) and the temperature \(T\) or maybe better the product \(k_B T\) which is around \(2.69\) at the critical temperature
- a command line interface Console to the Simulator which we develop in a first time for testing the simulation and interacting with it almost without any display
- a graphical interface GUI to the Simulator which we develop in a second time and that offers a confortable display for playing with the simulator with some sliders to vary the temperature, a drawing area on which we can visualize the system, etc..

For the graphical user interface, there are many backends (Gtk, Tkinter, Qt, …) and we made the choice to use Qt5 with its python bindings PyQt5. To get information about it, check the official website, the documented API, PyQt5 tutorials here or here.

Let us say few words about this design. For the Simulator, we have the attributes required by the model, namely the spins, lattice_size, J, temperature and the method step to update the spins and implementing the algorithm given in the problem statement. We also have a time attribute to keep track of how many updates have been performed. The set_kT, get_kT, get_critical_kT, get_spins and get_time are setters and getters on the variables of our problem that we want to be accessible from the outside of the simulator instance. There are also methods and attributes for handling the thread performing the computation and these probably deserve few comments :
- keeps_alive : a boolean variable stating if the thread should keeps on running. As soon as the thread is started, it will repeatidly perform updates by calling step and we need a way to tell it to stop. Setting this boolean to False will be this way, - running : a boolean variable stating if the model should keeps on being updated. We may want to pause the simulation, still keeping the thread running; that’s the purpose of this variable
- lock : a reentrant lock protecting critical sections. The simulator thread will read/write pieces of memory that will also be accessed by the Console or the MainWindow/IsingGUI
- run(): the method overloaded from threading.Thread and stating what the thread will be doing
- stop(): indicates the thread it should finish (will simply flip the keeps_alive variable)
- toggle() : pause/unpause the simulation (will simply flip the running variable)
- is_running(), is_kept_alive(): getters on the variables running and keeps_alive

Now, for the Console, it needs a Simulator with which it will interact and the following methods :
- ask_for_interaction() : to ask to the user what it wants to do (e.g. plotting the system, pausing it, changing the \(k T\) value)
- on_choice(str) : to perform the action asked by the user
- start() : to start the interaction loop

Starting the simulator and a console to interact with it will look like this:

def main():
    """
    Create and start the simulator and the interaction loop
    """
    # Build the simulator
    lattice_size = 128
    kT = 2.69
    simu = simulator.Simulator(lattice_size, kT)

    # Build the interaction console
    console = Console(simu)

    # Start them all
    simu.start()
    console.start()


if __name__ == '__main__':
    main()

Note When programming, we often start by writting that piece of code which is calling what we are going to implement, it helps fixing the ideas about how the code to be written will be used.

And finally, the MainWindow/IsingGUI which are the PyQT5 GUI classes. In any GUI toolkit, the elements of the user interface are called widgets. Anytime you have a label, a button, a slider, a drawable area, a window, etc.. these are all widgets. Here we designed our own widget MainWindow as inheriting from QMainWindow which is a window widget possbily hosting a menu, status bar, toolbars, etc… . In this widget we will place all the UI elements such as labels for displaying information about the simulation, a slider for modifying the temperature of the Ising model, and an area for drawing an image representing the state of the system. We will dig into the various methods but before doing this, let us introduce the second component which is the IsingGUI. The IsingGUI class creates a QApplication, required for running a Qt application, and then creates and show our MainWindow widget. That class provides the same interface than the Console with a constructor requesting a Simulator and a start method.

Starting a simulator and a GUI to interact with it will look like this :

def main():
    """
    Create the simulator and GUI
    and run them all
    """
    # Build the simulator
    lattice_size = 128
    kT = 2.269
    simu = simulator.Simulator(lattice_size, kT)

    # Build the UI
    gui = IsingGUI(simu)

    # Start them all
    simu.start()
    gui.start()


if __name__ == '__main__':
    main()

When you have a GUI, there are some signals that are expected to be raised (e.g. button click, window resize, window close, redraw of the window) and your application has to respond to these signals : this is what the so-called main event loop is doing. If you look in the code of qtgui.py, the IsingGUI.start() method is doing the following :

    def start(self):
        """
        Main loop
        """
        # Start the simulator
        self.simulator.start()

        # Run the main loop
        self.app.exec_()

        # Stop the simulator
        self.simulator.stop()
        self.simulator.join()

The call self.app.exec_(), with self.app being a QApplication starts the main loop and is a blocking function, i.e. nothing after that call gets executed until the main loop is stopped. Everything related to the GUI (refreshing the display of the windows, responding to buttons, etc..) is executed in the main loop by the so-called main thread. That event loop must be running as fast as possible, otherwise, your user interface is not going to respond to user interactions. That’s the reason why, in our application, most of the drawing is done in a dedicated thread, outside the main event loop and this thread is running the update_drawing function. In this method, we get the current state of the spins and draw on a QPixmap which is the Qt structure on which we can draw.

For responding to interface events (button click, window draw, ..), we write so-called callbacks : these are just normal methods but with a signature specified by the backend. The callbacks we provide are the following :
- on_refresh_ui(..): periodically triggered by a QTimer, it updates the text of the labels (e.g. the current time) and draw the QPixmap representing the spins on the window
- on_kT(…) : called when the slider for setting the temperature of the system is changed
- on_key_press(…): called when a key of the keyboard is pressed. This method is actually called by the eventFilter method which is the Qt way of handling events such as key strokes and configured in the constructor
- resizeEvent(…) : overloaded method called when the widget is resized. We use it here to store the current size of the drawing area for resizing the pixmap representing the spins

This short introduction to PyQt is not a tutorial at all. This is just a quick glance into programming UI and you still have to work on dedicated tutorials if you want to master it. Anyway, I remind you the objective of this section was to introduce you to the full pipeline of 1) stating a problem, 2) designing the software, 3) implementing it, all of these making use of threads, classes, etc..

Example package

You can have a look to a possible implementation ising_package.zip. To install it, check the README.md file in the archive.

To test it with the console

mylogin@mymachine:~$ python3 -m ising.console

To test it with the GUI:

mylogin@mymachine:~$ python3 -m ising.qtgui

Below are some examples of what we can obtain for different temperatures:

The Ising model at a low temperature The Ising model at a low temperature The Ising model at a low temperature

References

Hayes, B. (2000). Computing science: The world in a spin. American Scientist, 88(5), 384–388. Retrieved from http://www.jstor.org/stable/27858079

Sethna, J. P. (2006). Statistical mechanics: Entropy, order parameters, and complexity, 2nd edition, chap 8. In Oxford master series in physic. Univ. Press.

Jeremy Fix,