Sieve of Eratosthenes : A basic implementation

The Sieve of Eratosthenes is a process to find all primes not exceeding a specific positive integer. The process is simple. First all the numbers within the range is populated in a list.The process strikes out all the numbers in the list which are multiple of some previous number. This strike-out process starts with the integer 2, thus first all the multiple of 2 upto the limit n are striked out. Note that the number 2 itself is not striked out, because it is not a multiple of any integer lesser than it (except 1). Next the just following unstriked integer is selected, which is in this case 3, and then all the multiples of 3 are striked out from the like, and again the next unstriked out integer is selected and the process continues until no such integer remains in the list which is unstriked and a multiple of some integer lesser that it. Note that the next integer to strike out after the i^th unstriked integer is i * i , as all the multiples of i less than i are striked in the previous passes. At the end if the process only those numbers will be still unstriked in the list which are not a multiple of some other integer lesser that it, thus only the prime numbers will remain unstriked in the list. At the end of the process the list can be rescanned and the unstriked integers could be found.
Read more about Sieve of Eratosthenes here: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

Implementation

The simple implementation is to allocate an array and then populate the array with integers sequentially. But this process will take a lot of memory as each integer will take up 4 bytes of space. Instead we can use each bit position of byte to represent an integer, like the n^th bit in the bit string list will represent the integer n in the list. The bit can have two values 1 or 0 depending upon if that (number) bit is striked out or not. If the n^th bit is 0 then n is striked out else if it is 1 then it is not striked out. Thus with this bit implementation each bit of the memory is used to represent a single integer, and the state of the number can be found from the bit value of the memory.
To create a bit-field as described above, a character array would be taken. Each to access the bit positions of each byte of the separately we need to mask out that particular bit from the byte. We do this with the bit operators. We define the state of an integer as follows:

If n^th bit position is 0 it the integer n is NOT striked out
If n^th bit position is 1 it the integer n IS striked out

There are two operations that we would do on each bit:

Check if the bit is set or not (if striked out or not)
Set a bit (strike out the position)

The above two operations on the bits are implemented as macros which will take an array and an integer value representing the bit position to operate upon.

The bit testing and setting macros are defined as below

#define IS_SET(x) (arr[(x)>>3] & (0x01 << ((x) & 0x07)))
#define SET(x) (arr[(x)>>3] |= (0x01 << ((x) & 0x07)))

This works as follows: Because each element of an array is of one bytes (8 bits) of length, the high order 29 bits (as we consider x as integer) will be used to locate the array position in which the x^th bit would be present. The remaining lower order 3 bits of x will locate the bit within the byte selected with the high order 29 bits. For example if we want to locate the bit position 12541 then the higher order bits construct the integer 1567 and selects the 1567^th entry of the char array. And the lower order bits of 12541 which is the value 5 locates the bit positin within the 1567^th byte. It is similar to dividing and remaindering the integer. The division result 12541 / 8 gives us the char array location where the bit is in, and 12541 % 8 gives is the exact bit within that bit within that byte which overall represents the 12541^th bit.

In the IS_SET macro arr[x>>3] selects the location of the char array (the 12541/8 effect), and then the bit position within it described by the lower 3 bits are checked. First the lower order 3 bits are extracted with (x & 0x07) then the value 0x01 is shifted the amount of positions described by the 3 bit value. Thus we get a 1 in the desired bit location. ANDing this value with the byte position would be true if and only if the byte in the char array also has a 1 in that same but position. If it has a 1 , then the ANDing would result in true, that is the value is set, else it is not set.

The thing is same as with the SET macro. in this case instead of ANDing the value we first generate a value with a 1 in the desired location and then OR this value with the existing value of the byte location. This would set the corresponding bit position of the char array selected location.

After these functions are created the code is pretty straight forward. We need multiple passes through th bit field. Each pass it crosses out integer multiple of some integer. When there are no more integers to cross (no bit positions to SET), the primes would lie un-SET or uncrossed in the bit array, which we can test with another loop and find.

Making it 50% ligher on memory

The above soluition already saves a great deal of memory by utilizing each bit position. But we can see that the even positions are always marked, or ignored on (on the above code), and thus wastes a lot of space. There are 50% of even numbers within any range input. So if interpreat the bitfield as of which each bit position represents only the odd numbers then we can save 50% memory. For example the 1^st location of the bitfield represents 3 , the bit position 2 represents 5, bit position 3 represents 7, and so on. Thus we construct such a way such that the (i+1)^th bit position represents the next odd number represented by the i^th bit position.
In this version we have implemented this by simply ignoring the LSB (Least Significant Bit) of the bit field index number. This is done by shifting the integer one location to the right. The bit set and test macros are unchanged. Thus an even integer and its next odd integer wil map into the same location on the bit field, like 4 (100) and, 5 (101) will map into the same location which is 2 (10) . So we have to make sure that we do not use any even number to map into this field, because no bit position represent any even number. When setting the bit in the first for loop, we have tested if it is an odd number, and then only have marked it. When scanning the map for unmarked positions the process is similar, ignoring the LSB and counting if the position is unmarked. Thus, if range is the range within which we need to find prime numbers, then the amount of memory needed is range / (8 * 2) . This makes the memory usage for 7% of the input value of range. For example if the input range is 1000 then only 63 bytes are allocated to do the marking work.

The total process described above is implemented in C Language below.

Sourcecode

/* A basic implementation of Sieve of Eratosthenes */
/* This code is a part of https://phoxis.org/2010/09/18/sieve-of-eratosthenes-a-basic-implementation/ */

#include <stdio.h>
#include <stdlib.h>

#define SET(x) (arr[(x)>>3] |= (0x01 << ((x) & 0x07)))
#define IS_SET(x) (arr[(x)>>3] & (0x01 << ((x) & 0x07)))

int
main (void)
{
  unsigned long long int primes, i, k, range, n;
  unsigned char *arr;

  /* Enter Range */
  printf ("\nEnter Range: ");
  scanf ("%Lu", &range);

  /* Set max array index, one byte represents 8 integers and
   * the even integers are not stored. that makes range / (8 * 2)
   * bytes of memory needed.
   */
  n = range / (8 * 2);

  /* Allocate bit array */
  arr = calloc (sizeof (unsigned char), n);
  if (arr == NULL)
    {
      perror ("Exiting");
      exit (1);
    }

  /* Default primes to 1 , for integer 2
   */
  primes = 1;

  /* start marking process from 3 */
  i = 3;

  /* Marking process starts from 3 and marks starting at the 
   * multiple (greater than 1) of the current value of i
   */
  while (i < range)
    {
      /* Mark multiples of k if and only if k is an odd integer. We are
       * ignoring marking and also storing the even integers, by right 
       * shifting the integer index by one location. The first integer 
       * to mark after the first unmarked location is i * i, as the other 
       * multiple of i lessthan i are already marked in the previous passes.
       */
      for (k = i * i; k <= range; k += i)
      {
        /* If k is odd, then the LSB is 1 */
	if (k & 0x01)
	  SET (k >> 1);
      }
      
      /* Find the next not-crossed integer and start
       * crossing from it in the next iteration
       */
      i += 2;
      while ((i <= range) && (IS_SET (i >> 1)))
        i += 2;
    }

  /* Scan the bit array to look for uncrossed area to find the 
   * number of primes.
   */
  for (i = 3; i <= range; i += 2)
    {
      if (!IS_SET (i >> 1))
        {
          /* Print the prime number */
          printf ("%Ld,",i);
          primes++;
        }
    }

  printf ("\n\nprimes = %Ld\n", primes);
  
  return 0;
}

In the above code we have avoided storing and also marking multiple of 2’s, and thus was ale to save save 50% of computation and also storage. Thus we only need memory about 7% of the input range. An input range 5000000000 finished computation in 230 seconds and takes about 298MB of memory, in an AMD Athlon X2 4400+ running @ 2.3 GHz . Binary was compiled without optimization. Timed with time -p

Further Development

To make even better we can skip the multiples of two and three and save more markings.
For even faster processing we can make two (or more than two) bit fields and dedicate one single thread on each bit-field to work on, each thread would mark a certain region within the total input range. If n CPUs are available then the processing speed would decrease in almost a factor of n as each memory location and thread is independent of themselves and thus no sharing and memory protection is needed.

Update Information

18:09:2010 : Code Updated, new section introduced, “Making it 50% ligher on memory”

7 thoughts on “Sieve of Eratosthenes : A basic implementation”

Saikat Mukherjee says:

September 18, 2010 at 3:27 pm

পড়ে দেখতে হবে।

Isn’t this a better way to find primes on a computer? Copied verbatim from wikipedia:

def euler_sieve(n):
    # build a table of odd numbers.
    num_tab = range(1,n,2)
    # our table looks like 1,3,5,7,... we change the first
    # to the first prime, 2
    num_tab[0] = 2
    i = 1
    # biggest number in our table
    highestval = num_tab[-1]
    while True:
        # find first operator in the sieve
        cx = num_tab[i]
        # non working value so move to the next
        if cx == False:
            i += 1
            continue
        # first value to be sieved is always going to be
        # the current number * itself. all the next numbers in
        # that sieve will be larger.
        if cx**2 &gt; n:
            break
        # strike off - our sieve
        tostrike = []
        for j in xrange(i,len(num_tab)):
            # find the second operator in the sieve
            cy = num_tab[j]
            # non-working value... skip over
            if cy == False:
                continue
            cut = cx*cy
            # outside the sieve's bounds
            if cut &gt; highestval:
                break
            # add to our sieve
            tostrike.append(cut)
        # sieve the values from our number table
        for d in tostrike:
            ind = (d - 1)/2
            num_tab[ind] = False
        # find the highest value in our number table that
        # hasn't been sieved
        hiind = -1
        while num_tab[hiind] == False:
            hiind -= 1
        highestval = num_tab[hiind]
 
        i += 1
    return num_tab
 
...
 
    print [x for x in euler_sieve(100) if x != False]

Although a lot more processor intensive, it’s also easier to comprehend and the algorithm is cleaner. The code is Python.

phoxis says:

September 18, 2010 at 4:14 pm

The algorithm is the same, just different implementations. Both of the versions strike out the multiples in passes through the array and then scan for the unstriked values. Although the method above presented will use far more less memory as each integer is represented with one bit, the value depends on the bit index. I agree that the marking process could be advanced by avoiding already marked area, and i hope to update the code and the documentation with the update soon.
—- comment update —
Code updated, now it computes a bit more faster but uses 50% less memory. For me it is a bit different, because i am not used to python, so it was not easier for me to get into it, and i had to read it in detail. Although i have not run the python code, but i expect it to take more time and memory than the C solution presented above.

st0le says:

February 19, 2011 at 12:57 pm

for (k = i * i; k <= range; k += i)

since i is odd, so i*i is odd… so the first value of k is odd, but when you add i (odd number) to it, you get a even number (sum of 2 odds = even)….so instead add twice of i (odd + even = odd)
for (k = i * i; k <= range; k += 2*i)

You won’t have to check for k to be odd, it will always be odd. so eliminate the if condition. cheers!

1. phoxis says:
  
  February 19, 2011 at 1:35 pm
  
  Yes, that is correct, thanks for making this very important point out.
  
st0le says:

February 19, 2011 at 12:59 pm

Another memory Optimization, that can be done is eliminate factors of 5, but its tricky…

1. phoxis says:
  
  February 19, 2011 at 1:43 pm
  eliminating marking of 2 is implemented, but elimination of storing of 2’s and 3’s is not too difficult to implement. It can be done with say
```
int i=5, dx = 4;
while (i < range)
{
  dx = 6 - dx;
  i = i + dx;
}
```
  This would skip all multiple of 2’s and 3’s thus would save 17% more computation and storage marking 3’s which is wasted in this implementation.
  
  Skipping factors of 5 is definitely tricky, the series is a tough one to generate.