⬆️ ⬇️

We eliminate the prime of a billion numbers faster than Wikipedia

( Source of the picture )



It is well known that Sieve Eratosthenes (RE) is one of the oldest algorithms that appeared long before the invention of computers. Therefore, one might think that over the centuries this algorithm has been studied far and wide and nothing can be added to it. If you look at Wikipedia, there is a sea of ​​links to reputable sources in which you can easily drown. Therefore, I was surprised when I accidentally discovered the other day that the option that is presented as the optimal one in Wikipedia can be significantly optimized.



Here is how it was. In the discussion of the article on functional programming (FP) I asked the question : how to write the OM in this paradigm. Possessing more than scant knowledge of FP, I do not presume to judge the answers, but other participants in the discussion rejected some of the proposed solutions at once, indicating that instead of theoretical complexity



O(n log logn) (one)

')

the proposed implementation will have computational complexity



O(n2/ logn) (2)



and that with such complexity not to wait, when, for example, 10 million numbers are sifted. It became interesting to me and I tried to implement the optimal version according to Wikipedia, using my usual procedural programming. In Delphi-7, I got the following code:



Listing 1
program EratosthenesSieve; // Sieve of Eratosthenes {$APPTYPE CONSOLE} uses SysUtils, DateUtils,Math; const version ='1.0.1d1'; N = 1000000000; // number of primes var sieve : array [2..N] of boolean; // false if prime t0, t1,dt : TDateTime; O,C : Extended; procedure init; var i : integer; begin for i:=2 to n do sieve [i] := false; end; //init procedure calc (start : integer); var prime, i : integer; breakLoop, exitProc : Boolean; begin prime := start; exitProc := false; repeat // find next prime prime := prime+1; while (prime<N) and sieve[prime] do inc (prime); i:= sqr(prime); // delete if i<= N then begin breakLoop := false; repeat if i<= N then begin sieve [i] := true; inc (i,prime); end else // if i<= N breakLoop := true; until breakLoop; end else // if prime+prime<= N exitProc := true; until exitProc; end; //calc procedure print; var i :integer; found : integer; begin found := 0; for i:=2 to N do if not sieve [i] then begin // write (i,', '); inc(found); end; writeln; writeln ('Found ',found,' primes.'); end; // begin // program body writeln ('Sieve of Eratosthenes version ', version); writeln('N= ',N); init; t0 := now; writeln('Program started ',DateTimeToStr(t0)); t0 := now; calc (1); t1 := now; writeln('Program finished ',DateTimeToStr(t1)); dt := SecondSpan(t1,t0); writeln ('Time is ',FloatToStr(dt),' sec.'); O := N* ln(ln(N)); C := dt/O; writeln ('O(N ln ln N)= ',O,' C=',C); O := sqr(N)/ln(N); C := dt/O; writeln ('O(N*N/ln N)= ',O,' C=',C); print; writeln ('Press Enter to stop...'); readln; end. 




RE is represented by a boolean array sieve with inverse values ​​— if the number is simple, it is denoted as false, which allows reducing the number of negative operations (s) during sifting. The program has 3 procedures: initialization of the ER - init, calculations (sifting and crossing out the numbers in the ER) - calc and outputting the primes found as a result of - print, while counting the number of the numbers found. I will especially pay attention to the commented out output of prime numbers in the print: procedure for testing if N = 1000, the comment is removed.



Here in the calc procedure I use the Wikipedia recommendation: for the next prime number i, delete the numbers from the OM



i2,i2+i,i2+2i,...





This program sifted a billion numbers in 17.6 seconds. on my PC (CPU Intel Core i7 3.4 GHz).

Having made this program, I suddenly remembered the widely known properties of even and odd numbers .



Lemma 1. 1) odd + odd = even; 2) odd + even = odd; 3) even + even = even.



Evidence



one) (2n+1)+(2m+1)=2n+2m+2 divided by 2. CTD.

2) (2n+1)+(2m)=2n+2m+1 not divisible by 2 without remainder. CTD.

3) (2n)+(2m)=2n+2m divided by 2. CTD.



Lemma 2. The square of an odd number is an odd number.

Evidence. (2n+1)2=4n2+4n+1 not divisible by 2 without remainder. CTD.



Comment. A prime number greater than 2 is odd.



Therefore, it is possible to cross out only odd numbers:



i2,i2+2i,i2+4i,... (3)



But first you need to cross out all the even numbers. This is done by the modified init init procedure.



Listing 2
 program EratosthenesSieve; // Sieve of Eratosthenes {$APPTYPE CONSOLE} uses SysUtils, DateUtils,Math; const version ='1.0.1d1'; N = 1000000000; // number of primes var sieve : array [2..N] of boolean; // false if prime t0, t1,dt : TDateTime; O,C : Extended; procedure init; var i : integer; begin for i:=2 to n do sieve [i] := not odd(i); end; //init procedure calc (start : integer); var prime,prime2, i : integer; breakLoop, exitProc : Boolean; begin prime := start; exitProc := false; repeat // find next prime prime := prime+1; while (prime<N) and sieve[prime] do inc (prime); // i:= prime*prime; i:= sqr(prime); prime2 := prime+prime; // delete if i<= N then begin breakLoop := false; repeat if i<= N then begin sieve [i] := true; inc (i,prime2); end else // if i<= N breakLoop := true; until breakLoop; end else // if prime+prime<= N exitProc := true; until exitProc; sieve [2] := false; end; //calc procedure print; var i :integer; found : integer; begin found := 0; for i:=2 to N do if not sieve [i] then begin // write (i,', '); inc(found); end; writeln; writeln ('Found ',found,' primes.'); end; // begin // program body writeln ('Sieve of Eratosthenes version ', version); writeln('N= ',N); init; t0 := now; writeln('Program started ',DateTimeToStr(t0)); t0 := now; calc (2); t1 := now; writeln('Program finished ',DateTimeToStr(t1)); dt := SecondSpan(t1,t0); writeln ('Time is ',FloatToStr(dt),' sec.'); O := N* ln(ln(N)); C := dt/O; writeln ('O(N ln ln N)= ',O,' C=',C); O := sqr(N)/ln(N); C := dt/O; writeln ('O(N*N/ln N)= ',O,' C=',C); print; writeln ('Press Enter to stop...'); readln; end. 




This program worked in 9.9 seconds. - almost twice as fast.



To assess the compliance of the real-time program with a theoretical one, I assumed that



dt=CO,





Where dt - measured operating time;

C - constant with the dimension of time;

O - theoretical evaluation.



Calculated from here C for evaluation (1) and (2). For N=106 and less dt close to zero. Therefore, I cite data on the first program for large N.



N(one)(2)
1071.69 cdot1092.74 cdot109
1085.14 cdot1091.47 cdot108
1095.80 cdot1091.29 cdot107


As we can see, the estimate (1) is much closer to the real results. For the second program, a similar pattern is observed. I strongly doubt that I discovered America using the sequence (3) and I will be very grateful for the link to the work where this approach was applied. Most likely, the Wikipedia authors themselves drowned in a sea of ​​information on the OM and missed this work.

Source: https://habr.com/ru/post/450604/



All Articles