- Parallel: Running multiple computations (often on the same computer) at the same time
- Distributed: Running a calculation across multiple, networked computers
- Often used together, or even interchangeably, especially in HPC
- Instruction pipelining
- Completely transparent (invisible) parallelism
- Interleaves the steps of independent operations; may execute "speculatively"
- Data parallelism: SIMD (Single Instruction, Multiple Data)
- Instructions operate on multiple values simultaneously
- Vectorization: MMX, SSE, AVX
- Sometimes inferred by the compiler from loops
- Or exposed explicitly: hand-written assembly, intrinsics/special functions, libraries (see the sketch below)
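As a concrete illustration, here is a minimal sketch using NumPy (a library choice assumed here, not named above): the whole-array expression dispatches to compiled kernels that can use SSE/AVX, while the equivalent Python loop processes one element at a time.

```python
# Hypothetical illustration: NumPy is one library that exposes SIMD-backed,
# whole-array operations; the explicit loop below cannot be vectorized.
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Scalar loop: one element per iteration, interpreted Python.
c_loop = np.empty(n)
for i in range(n):
    c_loop[i] = a[i] * b[i] + 1.0

# Vectorized: one call over all elements, typically compiled to SSE/AVX code.
c_vec = a * b + 1.0

assert np.allclose(c_loop, c_vec)
```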
- Everything runs in a process
- Defines global state: memory contents, open files, network connections, etc.
- Only makes use of one core by default
- Parallel execution sharing resources in a single process
- (Global) variables, open files, global state: all shared
- Easy to read the same data
- Hard to write to the same data safely (race conditions)
- Synchronization primitives (locks/mutexes) coordinate writes (see the sketch below)
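A minimal sketch with Python's threading module of why shared writes are hard: four threads increment one shared counter, and the lock (mutex) is what makes the result deterministic; without it, updates can race and be lost.

```python
import threading

counter = 0                # shared global state, visible to every thread
lock = threading.Lock()    # mutex protecting the shared counter

def add_many(n):
    global counter
    for _ in range(n):
        with lock:         # synchronize: one thread writes at a time
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)             # 400000 with the lock; without it, often less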
- Some libraries turn single function calls into multi-threaded calculations
- Don't require any explicit code changes
- Consider the interaction with explicit parallelism (thread counts multiply!); see the sketch below
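One common way to keep such hidden thread pools in check is to set the relevant environment variables before the library is imported; which variables apply depends on the installed backend, so treat this as an assumption about your setup.

```python
# These variables are read by common BLAS/OpenMP backends behind NumPy
# (OpenMP, OpenBLAS, Intel MKL); which ones matter depends on your install.
import os
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import numpy as np

# Without the caps above, this single call may fan out across every core;
# run it from several processes at once and the thread count multiplies.
a = np.random.rand(2000, 2000)
b = a @ a
```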
- Parallel execution in separate resource spaces
- Separate copies of all data
- Need to explicitly communicate (send messages) to coordinate
- On one machine: same filesystem, fast IPC, "shared" memory segments
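A minimal sketch of process-level parallelism using Python's multiprocessing module: each worker gets its own copy of its chunk of the data and sends its partial result back as a message on a queue.

```python
from multiprocessing import Process, Queue

def worker(chunk, out):
    out.put(sum(x * x for x in chunk))       # communicate result as a message

if __name__ == "__main__":
    data = list(range(1_000_000))
    out = Queue()
    chunks = [data[i::4] for i in range(4)]  # divide the work into 4 pieces
    procs = [Process(target=worker, args=(c, out)) for c in chunks]
    for p in procs:
        p.start()
    total = sum(out.get() for _ in procs)    # collect the messages
    for p in procs:
        p.join()
    print(total)
```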
- Various machines working together
- Must send messages to communicate, coordinate
- Homogeneous cluster of machines
- Low-latency network allows fast communication
- Shared network filesystems
- Low-latency networks (10x slower than local memory, 100x faster than SSD)
- Makes distributed computing similar to process-level parallelism
- Processes running on separate hardware
- No shared memory
- Tightly-coupled execution, often running the same code
- Run more at once than fits on a single machine (more memory, more computation)
- Share intermediate results throughout computation
- Running many independent (though often parallel) computations
- Collect and store results across many inputs or parameter values
- A number of ranks work in parallel, usually running the same code
- Divide up work among themselves
- Coordinate/communicate to share intermediate values
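A minimal sketch of this rank-based style using mpi4py (one possible library, assumed here): every rank runs the same script, selects its share of the work from its rank number, and a reduction combines the partial results. Launched with something like `mpirun -n 4 python script.py`.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's rank (ID)
size = comm.Get_size()   # total number of ranks

n = 1_000_000
# Divide up the work: each rank sums a strided slice of the range.
local = sum(i * i for i in range(rank, n, size))

# Coordinate/communicate: combine partial sums on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(total)
```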
- Run one main process
- Hand off pieces of work to a pool of workers
- Coordination happens in main process
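A minimal sketch of the main-process/worker-pool pattern with Python's concurrent.futures: the main process owns the inputs and the collected results and hands parameter sets to a pool of worker processes (the `simulate` function is a made-up stand-in).

```python
from concurrent.futures import ProcessPoolExecutor

def simulate(params):
    # Stand-in for one independent computation on a parameter set.
    return sum(i * params for i in range(1_000))

if __name__ == "__main__":
    parameter_sets = range(100)
    with ProcessPoolExecutor(max_workers=4) as pool:
        # Main process hands off work; coordination stays here.
        results = list(pool.map(simulate, parameter_sets))
    print(len(results), "results collected by the main process")
```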