Embarrassingly Parallel

January 19, 2024

Performing the same type of calculation, as part of a single computational job, over a broad set of data in parallel on HPC resources. Embarrassingly parallel computing is one of the two main ways that parallelism is implemented in HPC. For example, suppose you have thousands of input files, each containing data in the same standard format but with individually different contents. If the same calculation must be run on each of those input files, and no file's result depends on any other file's result, then the work can be spread across dozens, hundreds, thousands, or in the largest cases, millions of CPU cores at once. Processing many files simultaneously yields a performance gain over a non-parallelized configuration, which would handle only a single file at a time.

Embarrassingly parallel workloads are ubiquitous across many fields. While MPI is used very widely for parallel computing across many disciplines, the embarrassingly parallel paradigm is often applied more generally, to a wider set of problems than MPI is.