Compiling and Running
The tutorial assumes that you have an environment ready to run MPI in parallel. This means that you have booted multiple machines, logged into each machine as bccd, started the heartbeat (pkbcast) program, run bccd-allowall, and run bccd-snarfhosts. You must do this in order to run MPI across multiple machines.
The BCCD ships with multiple implementations of MPI. You can switch between the installations using Modules.
Typically, running an MPI program will consist of three steps:
- Compiling the code
- Copying the code across all the machines you would like to run the code on
- Executing the code
If you have source code to compile, you first need to create an executable (if you have only binary executables, skip to step 2, copying). This involves compiling your code with the appropriate compiler and linking it against the MPI libraries.
We'll use the example hello.c. You can use your preferred text editor to create this file. After saving the file, you can compile the program using the mpicc command.
mpicc -o hello hello.c
The "-o" option sets the output file name; without it, your executable would be saved as "a.out". If you use the "-o" option, make sure you give it an executable name and not the name of your source file. Many programmers have deleted part of their source code by accidentally giving their output file the same name as their source file.
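One way to guard against that mistake is a small wrapper around mpicc. The sketch below is only an illustration: safe_mpicc is a hypothetical helper name, not part of the BCCD or of any MPI distribution, and it compares file names literally, so it would not notice that hello.c and ./hello.c are the same file.

```shell
# safe_mpicc OUTPUT SOURCE...
# Hypothetical helper (not shipped with BCCD or MPI): refuses to compile
# when the requested output name is identical to one of the source names.
safe_mpicc() {
    out=$1
    shift
    for src in "$@"; do
        if [ "$out" = "$src" ]; then
            echo "refusing: output '$out' would overwrite source '$src'" >&2
            return 1
        fi
    done
    # Names differ, so hand everything to the real compiler wrapper.
    mpicc -o "$out" "$@"
}
```

With this defined, safe_mpicc hello hello.c behaves like the mpicc command above, while safe_mpicc hello.c hello.c stops with an error before the compiler is invoked.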
If you have typed the file correctly, the code should compile successfully, and an ls command should show that you have created the file hello.
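The contents of hello.c are not listed in this section. As a rough sketch, assuming a conventional MPI "hello world" (the C source below is an assumption, not the tutorial's own file), the file could be created and compiled from the shell like this:

```shell
# Create hello.c only if it does not exist, so existing work is never overwritten.
# The C program is an assumed, conventional MPI hello world.
if [ -e hello.c ]; then
    echo "hello.c already exists; leaving it untouched"
else
    cat > hello.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut the MPI runtime down */
    return 0;
}
EOF
fi

# Compile only if an MPI compiler wrapper is on the PATH.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello hello.c && ls -l hello
else
    echo "mpicc not found; load an MPI environment first"
fi
```

Each process started by mpirun would print one line with its own rank, so the order of the lines can differ from run to run.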
In order for your program to run on each node, the executable must exist on each node. A new automated script now exists to copy executables across BCCD nodes without compromising other users' runs. It is called bccd-syncdir, and is run with the following command:
bccd-syncdir <directory> ~/machines
<directory> is the directory which holds the executable, and ~/machines is the machinefile created previously with bccd-snarfhosts, which contains a list of all the nodes in your cluster. Running bccd-syncdir creates a unique directory in /tmp that holds a copy of your executable directory on every node. The name of this directory is derived from the first 8 characters of your current host's public key.
Once you have compiled the code and copied it to all of the nodes, you can run the code using the mpirun command. The mpirun command takes different arguments, depending on whether your environment is set up to use MPICH or OpenMPI. OpenMPI is currently the default environment on the BCCD. For more information about the two environments, see Running MPICH or Running OpenMPI. In order to run the program on all computers, you must change to the common directory path created using bccd-syncdir.
Two of the more common arguments to the mpirun command are the -np argument, which lets you specify how many processes to use, and the -machinefile argument, which lets you specify exactly which nodes are available for use.
Change to the directory created earlier by bccd-syncdir, where the executable is located, and run your hello program using 4 processes:
mpirun -np 4 -machinefile ~/machines ./hello
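Instead of hard-coding 4, the process count can be taken from the machinefile. The sketch below assumes that ~/machines lists one node per line, which may not hold for every machinefile format, and count_nodes is a hypothetical helper name introduced only for this example.

```shell
# count_nodes FILE
# Hypothetical helper: counts lines that are neither blank nor comments.
# Assumes one node per line in the machinefile.
count_nodes() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | wc -l | tr -d ' '
}

machinefile="$HOME/machines"
if [ -f "$machinefile" ] && command -v mpirun >/dev/null 2>&1; then
    # Start one process per listed node.
    np=$(count_nodes "$machinefile")
    mpirun -np "$np" -machinefile "$machinefile" ./hello
else
    echo "need $machinefile and mpirun to launch ./hello"
fi
```

Starting one process per node is only one possible choice; -np can be larger than the number of nodes if you want several processes on each machine.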