Game of Life


This tutorial will walk you through an investigation of parallel computing using the Game of Life. For instructions on how to compile and run MPI programs, please see Compiling and Running. This tutorial assumes that you have booted the BCCD and have X running (to start it, enter startx). It uses mpirun, which means you need an implementation of MPI installed; the BCCD comes with both OpenMPI and MPICH, with OpenMPI running by default. For more information, see Running MPICH and Running Open MPI.

Cellular Automata: The Game of Life

The Game of Life is an iterative process set up on a square grid. Cells on the grid are either "alive" or "dead" and evolve according to the following rules:

- A live cell with fewer than two live neighbors dies (underpopulation).
- A live cell with two or three live neighbors survives to the next round.
- A live cell with more than three live neighbors dies (overpopulation).
- A dead cell with exactly three live neighbors becomes alive (reproduction).
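To make the rules concrete, here is a minimal sketch in C of the update for a single cell (a hypothetical helper for illustration; the Life source code organizes its grid its own way):

/* Apply the Game of Life rules to one cell.
   alive is the cell's current state (1 = alive, 0 = dead);
   neighbors is the number of live cells among its eight neighbors. */
int next_state(int alive, int neighbors)
{
    if (alive)
        return (neighbors == 2 || neighbors == 3); /* survive, else die */
    return (neighbors == 3);                       /* birth in an empty cell */
}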

A typical progression might look like this (figures Life round1.jpeg, Life round2.jpeg, and Life round3.jpeg, showing Rounds 1 through 3).


These simple rules can lead to many complicated phenomena, some of which seem quite stable, and some of which seem almost chaotic. For more information, see the Game of Life entry on Wikipedia.

Running the Game of Life on a large scale can require a lot of memory: the amount of storage scales as the square of the grid's side length. This problem is ripe for parallel programming. You can break a large grid into smaller subgrids, and since each cell only needs information about its nearest neighbors, the subgrids only have to communicate along their shared edges, as the sketch below illustrates.
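Here is a minimal sketch in C with MPI of how that edge communication might look for subgrids laid out side by side (the function name and storage layout are assumptions for illustration, not the actual Life.c-mpi code). Each process sends its boundary columns to its neighbors and receives theirs into extra "ghost" columns before computing the next generation:

#include <mpi.h>

/* Assumed layout: each rank holds a rows x (cols+2) subgrid stored
   column-major, so each column is contiguous in memory. Columns 0 and
   cols+1 are ghost columns holding copies of the neighbors' edges. */
void exchange_edges(int *grid, int rows, int cols, int rank, int nprocs)
{
    int left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
    int right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

    /* Send our leftmost real column left; receive the right
       neighbor's edge into our right ghost column. */
    MPI_Sendrecv(&grid[1 * rows],          rows, MPI_INT, left,  0,
                 &grid[(cols + 1) * rows], rows, MPI_INT, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Send our rightmost real column right; receive the left
       neighbor's edge into our left ghost column. */
    MPI_Sendrecv(&grid[cols * rows],       rows, MPI_INT, right, 1,
                 &grid[0],                 rows, MPI_INT, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

MPI_PROC_NULL turns the sends and receives at the outer edges into no-ops, so the boundary ranks need no special casing.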

Life on the Live-BCCD

The MPI Life example is set up to run as "side by side" subgrids. You enter the number of rows and columns of each subgrid and the number of generations to compute.

To run the program, first move into the Life directory by executing cd ~/Life. Next, the executables need to be built by running make. This will create the executables Life.serial, Life.mpi, Life.openmp, and Life.hybrid.

Next we need to copy the MPI executable to all the nodes that will be running it. (If you have not yet set up your nodes for remote access, make sure you have logged into each machine as bccd, started the heartbeat program (pkbcast), run bccd-allowall, and run bccd-snarfhosts. You must do this before continuing.) An automated script, bccd-syncdir, copies executables across BCCD nodes without compromising other users' runs. For Life, it is run with the following command:

bccd-syncdir ~/Life ~/machines-openmpi

where ~/Life is the directory which holds the executables, and ~/machines-openmpi is the machinefile created previously with bccd-snarfhosts, containing a list of all the nodes in your cluster. This creates a directory in /tmp which holds a copy of your executable directory on all nodes. The name of this directory is unique and follows the pattern /tmp/hostname-user (so yours could be /tmp/node009-bccd). Make note of this directory and move into it with cd <your directory>.
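For example, assuming the directory name from above (yours will differ), the two steps together look like:

bccd-syncdir ~/Life ~/machines-openmpi
cd /tmp/node009-bccd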

A full list of Life's options can be seen by giving --help to any of the Life binaries, but for now we'll just be specifying the rows, columns, and generations.

First try running Life on just one processor:

time mpirun -np 1 -machinefile ~/machines-openmpi ./Life.mpi --rows 100 --columns 100 --gens 10000

Compare it to two processors (notice we've decreased the number of columns to 50, since each process is now responsible for only half of the total columns):

time mpirun -np 2 -machinefile ~/machines-openmpi ./Life.mpi --rows 100 --columns 50 --gens 10000

Does using more CPUs allow you to solve the same problem faster?

Life on the Liberated-BCCD

To run the program, first move into the Life directory by executing cd ~/Life. Next, the executables need to be built by running make. This will create the executables Life.serial, Life.mpi, Life.openmp, and Life.hybrid.

Next we need to copy the MPI executable to all the nodes that will be running it. To do so, first run bccd-allowall in your terminal, and then run bccd-snarfhosts.

First try running Life on just one processor:

time mpirun -np 1 -machinefile ~/machines-openmpi ./Life.mpi --rows 100 --columns 100 --gens 10000

Compare it to two processors (notice we've decreased the number of columns to 50, since each process is now responsible for only half of the total columns):

time mpirun -np 2 -machinefile ~/machines-openmpi ./Life.mpi --rows 100 --columns 50 --gens 10000

Does using more CPUs allow you to solve the same problem faster?

To run it on more processors, make the corresponding changes to the number of processes (-np) and the per-process column count, as in the example below.
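For example, following the same pattern as above, a run on four processes would give each process a quarter of the 100 total columns:

time mpirun -np 4 -machinefile ~/machines-openmpi ./Life.mpi --rows 100 --columns 25 --gens 10000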


What is the efficiency of this implementation on your cluster? Efficiency can be measured in many ways, but it is typically expressed by taking the running time on 1 processor and dividing it by P times the running time on P processors:

efficiency = time(1) / (time(P) * P)
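For example, with made-up numbers: if the one-processor run took 100 seconds and the two-processor run took 60 seconds, then efficiency = 100 / (60 * 2) = 0.83, meaning the two CPUs are each doing useful work about 83% of the time. A perfectly parallel solution would score 1.0.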

What happens if you do the same thing with a 50x50 world? With a 1000x1000 world? What is meant by the efficiency of a parallel solution? If you calculate it once, does it apply to any parallel code running on that machine?
