Thursday, March 15, 2012

Prime Numbers

Mathematics that includes prime numbers are quite tough for me. They have unique kind of property that we all are aware of (i.e, the number is divisible by only two distinct numbers being one and itself).

To test a number to be prime or not, has a very basic approach:
1. Find all divisors of the number.
2. If it is 1 and number itself (excluding 1) then its a prime number.

So a nested task is how to get all the divisors. And a very basic approach is:
1. 1 is a divisor
2. for i in 2 to Number if i divides Number, i is a divisor.

A little better approach could be
1. 1 is a divisor
2. Number is a divisor
3. for i in 2 to Number/2 if i divides Number, i is a divisor.
(This approach works since a divisor(apart from number) will not be greater than half of the number)

Another good approach to me is
1. 1 is a divisor
2. Number is a divisor
3. from i = 2 to Sqrt(Number) if i divides Number, i is a divisor and (Number/i) is also a divisor.

Back to prime Numbers!! What if we need to keep generating prime numbers (mind it!... there is no such wonder in maths/computer that has generated all the primes yet!.. obviously they are infinite and everyday a new prime is discovered). Ok!...Lets have an upper bound. Its like generate all prime numbers upto 100
Oh! thats easy... any naive approach could get it.
Naive approach:
1. if number is less than 2 => NOT PRIME
2. if number is 2 => PRIME
3. if number is divisible by 2 => NOT PRIME (special case already handled)
4. for i in 3 to UPPER_BOUND (100 here), if number is divisible by i => NOT PRIME
5. loop in step 4 completed, ok! its a prime number!!

Step 4 in naive approach can be optimized when increment of i is set by 2 instead of 1 (since number is definately not even)

How can we further optimize Step 4, no need to go upto UPPER_BOUND.
We can go till UPPER_BOUND/2
And in similar way we can further optimize by iterating upto Sqrt(UPPER_BOUND)
(Similar explaination with divisor algo)

Are we ok!... Not satisfied.. true, even this approach fails when UPPER_BOUND gets into category of 'large numbers' something like 10^7

How about memoization?... doesn't help much, you cannot pre compute anything that helps to check if a number is prime or not.

There google googles down another fantastic approach, a totally different perception and so far the quickest approach to generate list of primes and it is called
*drum beats*... tada...
"Sieve of Eratosthenes"
directly pasted from wiki:

"To find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:
Create a list of consecutive integers from 2 to n: (2, 3, 4, ..., n).
Initially, let p equal 2, the first prime number.
Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These numbers will be 2p, 3p, 4p, etc.; note that some of them may have already been marked.
Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this number (which is the next prime), and repeat from step 3.
When the algorithm terminates, all the numbers in the list that are not marked are prime"

This is fast...
Here is a sample private method in java to do the same

private static void generateSieve(){
System.out.println("generating sieve");
int MAX = 1000000;
boolean[] primeSieve = new boolean[MAX+2];
int MAXplus2 = MAX+2;
for (int i=0; i
for (int i=2; i<=MAXplus2/2; i++){
if (primeSieve[i]){
for (int j=(2*i); j

time for prime time..

Thursday, February 23, 2012

Java Unique Exception Extractor Utility

During development of a java application, the greatest problem is runtime Exceptions that occur in background. These background RuntimeExceptions are generally important to fix the code. Some of them are SQLException, NullPointerException, RuntimeUserRuntimeException, ClassCastException, IndexOutOfBoundException etc. These exception are hard to be caught during construction and are generally logged in background (It may even get unnoticed, if something went wrong in a thread which is not main thread).
Java logging frameworks used in standard java application development, somehow makes the log quite verbose. Each and every step/flow fired in application gets logged (For instance, current value of x = someDynamicAppValue.
A developer is interested in only exceptions that is useful for them.

Please find link to one such utility that I created recently which helps an analyst to just collect unique stack traces from the logs and rest of the logs need not be scanned. The utility was tested with >500MB of txt java logs, and it did it job within 3 minutes.

Suggestion / Bugs / Feedback will be highly appreciated.

Is my blog being read?