Codility problem — Count Semiprimes

Codility problem — Count Semiprimes

Problem-solving is an important aspect of our daily work as software developers.

Solving algorithmic problems keeps mental nerves active and strong. This in turn results in our work. In this post, let me discuss a Codility problem, Count Semiprimes.

Problem Statement


A prime is a positive integer X that has exactly two distinct divisors: 1 and X. The first few prime integers are 2, 3, 5, 7, 11 and 13.

A semiprime is a natural number that is the product of two (not necessarily distinct) prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26.

You are given two non-empty arrays P and Q, each consisting of M integers. These arrays represent queries about the number of semiprimes within specified ranges.

Query K requires you to find the number of semiprimes within the range (P[K], Q[K]), where 1 ≤ P[K] ≤ Q[K] ≤ N.

For example, consider an integer N = 26 and arrays P, Q such that:
    P[0] = 1    Q[0] = 26
    P[1] = 4    Q[1] = 10
    P[2] = 16   Q[2] = 20

The number of semiprimes within each of these ranges is as follows:

        (1, 26) is 10,
        (4, 10) is 4,
        (16, 20) is 0.

Write a function:

    class Solution { public int[] solution(int N, int[] P, int[] Q); }

that, given an integer N and two non-empty arrays P and Q consisting of M integers, returns an array consisting of M elements specifying the consecutive answers to all the queries.

For example, given an integer N = 26 and arrays P, Q such that:
    P[0] = 1    Q[0] = 26
    P[1] = 4    Q[1] = 10
    P[2] = 16   Q[2] = 20

the function should return the values [10, 4, 0], as explained above.

Write an efficient algorithm for the following assumptions:

        N is an integer within the range [1..50,000];
        M is an integer within the range [1..30,000];
        each element of arrays P and Q is an integer within the range [1..N];
        P[i] ≤ Q[i].

Before jumping into the solution, let us recap basic Mathematics.

A prime number is something which can be divisible only by itself and 1.

For example: 2, 3, 5, 7, 11, …

A semiprime number is something which is a product of exactly two prime numbers.

For example: 4, 6, 9, 10, 14, …

I find it motivating to learn algorithms if I know their real-world applications. Semiprimes are widely used in cryptography and data compression.

Now, to find a count of semiprimes in the given range, we need to identify whether the given number has prime factors or not.

A given number n can be semiprime only if n = p1\p2*

Here, p1 and p2 are prime numbers and are less than n.

If we find either p1 or p2, we should be able to find the other factor easily.

That is, p2 = n/p1

Now the big question is how to find p1 efficiently?

Main Idea

For this, we need to store prime numbers in a such a way that we should be able to say for the given number n does it have a factor of prime number?

Here, we use the Sieve of Eratosthenes with slight changes.

For the sake of discussion, let the range be from 1 to 10. That is, a=1, b=10

Consider an array of length b+1 to store prime factors.

Initial state of the array

Now, according to Sieve of Eratosthenes, for each number we will compute its square v1. Any value from square v1 which is divisible by the number will be marked.

For example, for 2 its square is 4. After 6, 8, and 10 are divisible by 2. Hence, 4, 6, 8, and 10 are marked.

The slight change here is instead of just marking the index, we will store the original number itself. In this case, it's 2.

Number 2 marked

Similarly, will mark number 3.

Number 3 marked

In this array, elements with 0 values will be treated as prime numbers (except the first two items for 0 and 1). Other elements with a value > 0 are multiples of prime numbers.

In this array, for a cell value v at the index i

v=0 ← Indicates that i is a prime number

v>0 ← Indicates the prime factor for the i

Now, we have required data ready to compute whether the given number is semiprime or not.

Let’s say, we want to determine 6 is semiprime or not.

Then, at the index 6, we have value as 2. That means, p1=2. Then, p2=6/2=3. But, is 3 a prime number? Index 3 holds the value of 0. That means it's a prime number.

6 is a product of two prime numbers 2 and 3. Hence, its semiprime number.

Now, since we know how to decide whether a given number is semiprime or not instantly, it's time to connect the dots to write a program for the final solution.

public class CountSemiprimes {

    public int[] solution(int maxNumber, int[] firstQuery, int[] secondQuery) {
        int[] factors = new int[maxNumber + 1];

        // Compute minimum factors for each in 2..sqrt(maxNumber)
        // Iteration upto sqrt(maxNumber) will cover numbers from sqrt(maxNumber)+1..maxNumber
        for (int i = 2; i <= Math.sqrt(maxNumber); i++) {
            if (factors[i] != 0) {
                continue;
            }
            int multiple = i * i;
            while (multiple <= maxNumber) {
                if (factors[multiple] == 0) {
                    // If value is not set
                    factors[multiple] = i;
                }
                multiple += i;
            }
        }

        // Compute count of semiprimes for each in 4..maxNumber
        int[] count = new int[maxNumber + 1];
        for (int i = 4; i <= maxNumber; i++) {
            int fact1 = factors[i];
            if (fact1 > 0) {
                // First factor is a prime number
                // Check whether its second factor is also a prime number
                int fact2 = factors[i / fact1];
                if (fact2 == 0) {
                    // Second factor is also a prime number. Hence, it is a semiprime
                    count[i] = count[i - 1] + 1;
                } else {
                    // Second factor is not a prime number. Hence, it is not a semiprime
                    count[i] = count[i - 1];
                }
            } else {
                count[i] = count[i - 1];
            }
        }

        int[] result = new int[firstQuery.length];

        for (int i = 0; i < firstQuery.length; i++) {
            result[i] = count[secondQuery[i]] - count[firstQuery[i] - 1];
        }
        return result;
    }

    public static void main(String[] args) {

        int maxNumber = 10;
        int[] firstQuery = new int[]{1};
        int[] secondQuery = new int[]{10};

        CountSemiprimes countSemiprimes = new CountSemiprimes();

        int[] result = countSemiprimes.solution(maxNumber, firstQuery, secondQuery);
        for (int i : result) {
            System.out.println(i);
        }
    }
}

Happy to have your feedback/suggestions.

See you in the next post!