Question:
Possible Duplicate:
Why should hash functions use a prime number modulus?
Why is it necessary for a hash table's (the data structure) size to be a prime?
From what I understand, it assures a more even distribution but is there any other reason?
Answer:
The only reason is to avoid clustering of values into a small number of buckets (yes, distribution). A more even distributed hashtable will perform more consistently.
from http://srinvis.blogspot.com/2006/07/hash-table-lengths-and-prime-numbers.html
If suppose your hashCode function results in the following hashCodes among others {x , 2x, 3x, 4x, 5x, 6x...}, then all these are going to be clustered in just m number of buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial
to verify/derive this). Now you can do one of the following to avoid clustering
Make sure that you don't generate too many hashCodes that are multiples of another hashCode like in {x, 2x, 3x, 4x, 5x, 6x...}.But this may be kind of difficult if your hashTable is supposed to have millions of entries.
Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e by making table_length coprime with x. And if x can be just about any number then make sure that table_length is a prime number.