I was playing a bit with O

You know… Numbers that are only dividable by one and itself and that are greater than one. So, 2 is prime. 2 is also the only even prime number because all other even numbers are dividable by two by definition.

There is only one way to know if a number is prime. You need to check a number against every smaller prime number. If it turns out that it is divisible by any prime, then the number is not prime.

Lets start coding

#include <iostream> #define NUMBER_OF_PRIMES 100000 int main() { uint64_t primes[NUMBER_OF_PRIMES]; uint32_t count = 0; for(uint64_t i = 2; i < UINT64_MAX && count < NUMBER_OF_PRIMES; i++) { bool isPrime = true; #pragma omp parallel for for (uint32_t x = 0; x < count; x++) { if(!isPrime){ x = count; continue; } if(i % primes[x] == 0) { #pragma omp critical isPrime = false; } } if(isPrime) { primes[count++] = i; std::cout << "Found " << count << "th prime: " << i << std::endl; } } return 0; }

As you can see `#pragma omp`

`#pragma omp parallel for`

tells the compiler to generate code that executes the following for loop in multiple cores where it divides the work up in chunks.

`#pragma omp critical`

tells the compiler that the following statement should not be executed by multiple threads at the same time.

Now it is compile time!!!

$ g++ -Ofast main.cpp

Note that this command does not enable OpenMP. But I would like to benchmark. So I run it with:

$ time ./a.out ... ... ... real 0m31,409s user 0m31,324s sys 0m0,080s

Now it uses one core to the max but it takes about 30 seconds to complete on my system.

Lets try again with openmp enabled

$ g++ -Ofast -fopenmp main.cpp $ time ./a.out ... ... ... real 0m9,687s user 1m50,757s sys 0m1,008s

That’s quite an improvement. It went from 30 seconds to about 10 seconds. But my computer has 12 cores and they were all 100% busy at the time. So it’s not running efficiently in this example.

I am sure there are some optimizations possible, and that there are probably way better use cases for OpenMP.