HPC - Sieve of Eratosthenes

Moreno Marzolla moreno.marzolla@unibo.it

Last updated: 2022-10-14

The sieve of Erathostenes is an algorithm for identifying the prime numbers falling within a given range which usually is the set \(\{2, \ldots, n\}\) . A natural number \(p \geq 2\) is prime if and only if the only divisors are 1 and \(p\) itself (2 is prime).

To illustrate how the sieve of Eratosthenes works, let us consider the case \(n=20\). We start by listing all integers \(2, \ldots n\):

The first value in the list (2) is prime; we mark all its multiples and get:

The next unmarked value (3) is again prime. We mark all its multiples starting from \(3 \times 3\) (indeed, \(3 \times 2\) has been marked at the previous step because it is a multiple of 2). We get:

The next unmarked value (5) is prime. The smaller unmarked multiple of 5 is \(5 \times 5\), because \(5 \times 2\), \(5 \times 3\) and \(5 \times 4\) have all been marked since they are multiples of 2 and 3. However, since \(5 \times 5 > 20\) is outside the upper bound of the interval, the algorithm terminates and all unmarked numbers are prime:

The file omp-sieve.c contains a serial program that, given an integer \(n \geq 2\), computes the number \(\pi(n)\) of primes in the set \(\{2, \ldots n\}\) using the sieve of Eratosthenes1. Although the serial program could be made more efficient both in time and space, here it is best to sacrifice efficiency for readability. The set of unmarked numbers in \(\{2, \ldots, n\}\) is represented by the isprime[] array of length \(n+1\); during execution, isprime[k] is 0 if and only if \(k\) has been marked, i.e., has been determined to be composite (\(2 \leq k \leq n\)); isprime[0] and isprime[1] are not used.

The program contains a function int mark(char *isprime, int k, int from, int to) that marks all multiples of \(k\) belonging to the set \(\{\texttt{from}, \ldots \texttt{to}-1\}\). The function returns the number of values that have been marked for the first time.

The goal is to write a parallel version of the sieve of Erathostenes; to this aim, you might want to use the following hints.

The main program contains the following instructions:

count = n - 1;
for (i=2; i*i <= n; i++) {
    if (isprime[i]) {
        count -= mark(isprime, i, i*i, n+1);
    }
}

To compute \(\pi(n)\) we start by initializing count as the number of elements in the set \(\{2, \ldots n\}\); every time we mark a value for the first time, we decrement count so that, at the end, we have that \(\pi(n) = \texttt{count}\).

It is not possible to parallelize the for loop above, because the content of isprime[] is possibly modified by function mark(), and this represents a loop-carried dependency. However, it is possible to parallelize the body of function mark(). The idea is to partition the set \(\{\texttt{from}, \ldots \texttt{to}-1\}\) among \(P\) OpenMP threads so that every thread will mark all multiples of \(k\) that belong to its partition.

I suggest that you start using the omp parallel construct (not omp parallel for) and compute the bounds of each partition by hand. It is not trivial to do so correctly, but this is quite instructive since during the lectures we only considered the simple case of partitioning a range \(0, \ldots, n-1\), while here the range does not start at zero.

Once you have a working parallel version, you can take the easier route to use the omp parallel for directive and let the compiler partition the iteration range for you.

To compile:

    gcc -std=c99 -Wall -Wpedantic -fopenmp omp-sieve.c -o omp-sieve

To execute:

    ./omp-sieve [n]

Example:

    OMP_NUM_THREADS=2 ./omp-sieve 1000

Table 1 shows the values of the prime-counting function \(\pi(n)\) for some \(n\). Use the table to check the correctness of your implementation.

Table 1
\(n\) \(\pi(n)\)
1 0
10 4
100 25
1000 168
10000 1229
100000 9592
1000000 78498
10000000 664579
100000000 5761455
1000000000 50847534

Files


  1. \(\pi(n)\) is also called prime-counting function