To coordinate the execution of code across specific nodes, MPI (Message Passing Interface) is commonly used. Open MPI is a free, open-source implementation of the standard and is used by many high-ranking supercomputers; MPICH is another popular implementation.
MPI itself is defined as a communication protocol for programming parallel computers. It is currently at its third major revision (MPI-3).
In summary, MPI offers the following basic concepts:
- Communicators: A communicator object connects a group of processes within an MPI session. It both assigns a unique identifier (rank) to each process and arranges the processes in an ordered topology.
- Point-to-point operations: These allow direct communication between two specific processes, typically via matched send and receive calls.
- Collective functions: These functions involve communication among all processes in a process group. A typical example is a broadcast, which sends data from one node to every process in the group. The reverse pattern, a reduction, takes results from all processes in a group and combines them, for example summing them on a single node. More selective variants, such as scatter and gather, move a distinct data item to or from each node.
- Derived datatypes: Since not every node in an MPI cluster is guaranteed to use the same definition, byte order, and interpretation of data types, MPI requires the type of each data segment to be specified, so that it can perform data conversion between representations.
- One-sided communications: These operations allow a process to write to or read from remote memory, or to perform a reduction across a number of tasks, without explicit synchronization between the tasks. This can be useful for certain algorithms, such as distributed matrix multiplication.
- Dynamic process management: This is a feature which allows MPI processes to create new MPI processes, or establish communication with a newly created MPI process.
- Parallel I/O: Also called MPI-IO, this is an abstraction for I/O management on distributed systems, including file access, for easy use with MPI.
Of these, MPI-IO, dynamic process management, and one-sided communication are MPI-2 features. The effort of migrating MPI-1-based code, the incompatibility of dynamic process management with some setups, and the fact that many applications do not require MPI-2 features have meant that uptake of MPI-2 has been relatively slow.