I was playing a bit with O
You know… Numbers that are only dividable by one and itself and that are greater than one. So, 2 is prime. 2 is also the only even prime number because all other even numbers are dividable by two by definition.
There is only one way to know if a number is prime. You need to check a number against every smaller prime number. If it turns out that it is divisible by any prime, then the number is not prime.
Lets start coding
#include <iostream>
#define NUMBER_OF_PRIMES 100000
int main()
{
uint64_t primes[NUMBER_OF_PRIMES];
uint32_t count = 0;
for(uint64_t i = 2; i < UINT64_MAX && count < NUMBER_OF_PRIMES; i++)
{
bool isPrime = true;
#pragma omp parallel for
for (uint32_t x = 0; x < count; x++) {
if(!isPrime){
x = count;
continue;
}
if(i % primes[x] == 0) {
#pragma omp critical
isPrime = false;
}
}
if(isPrime) {
primes[count++] = i;
std::cout << "Found " << count << "th prime: " << i << std::endl;
}
}
return 0;
}
As you can see #pragma omp
#pragma omp parallel for
tells the compiler to generate code that executes the following for loop in multiple cores where it divides the work up in chunks.
#pragma omp critical
tells the compiler that the following statement should not be executed by multiple threads at the same time.
Now it is compile time!!!
$ g++ -Ofast main.cpp
Note that this command does not enable OpenMP. But I would like to benchmark. So I run it with:
$ time ./a.out
...
...
...
real 0m31,409s
user 0m31,324s
sys 0m0,080s
Now it uses one core to the max but it takes about 30 seconds to complete on my system.
Lets try again with openmp enabled
$ g++ -Ofast -fopenmp main.cpp
$ time ./a.out
...
...
...
real 0m9,687s
user 1m50,757s
sys 0m1,008s
That’s quite an improvement. It went from 30 seconds to about 10 seconds. But my computer has 12 cores and they were all 100% busy at the time. So it’s not running efficiently in this example.
I am sure there are some optimizations possible, and that there are probably way better use cases for OpenMP.