Ruby, 6473 collisions, 129 bytes
h=->(w){@p=@p||(2..999).select{|i|(2..i**0.5).select{|j|i%j==0}==[]};c=w.chars.reduce(1){|a,s|(a*@p[s.ord%92]+179)%((1<<24)-3)}}
The @p variable is lazily filled with all the primes below 1000 on the first call and reused on subsequent calls.
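For reference, the same sieve written out readably (the `Math.sqrt` bound is equivalent to the golfed `i**0.5`):

```ruby
# Trial division up to sqrt(i), as in the golfed select/select chain.
primes = (2..999).select { |i| (2..Math.sqrt(i)).none? { |j| i % j == 0 } }

primes.first(5)  # => [2, 3, 5, 7, 11]
primes.size      # => 168 (there are 168 primes below 1000)
```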
This maps ASCII values to prime numbers and takes the product of those primes modulo a large prime. The fudge factor of 179 deals with the fact that the original algorithm was designed for finding anagrams, where all words that are rearrangements of the same letters get the same hash. Adding the factor inside the loop makes the result order-dependent, so anagrams get distinct codes.
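A de-golfed sketch of the same hash, with the fudge factor as a parameter so the anagram behaviour is easy to see (the names here are mine, not from the golfed code):

```ruby
PRIMES = (2..999).select { |i| (2..Math.sqrt(i)).none? { |j| i % j == 0 } }
MOD = (1 << 24) - 3  # 16777213, a large prime

def word_hash(word, fudge = 179)
  word.chars.reduce(1) do |acc, ch|
    # Map the character to a prime via its ASCII value, multiply it in,
    # then add the fudge factor so the result depends on character order.
    (acc * PRIMES[ch.ord % 92] + fudge) % MOD
  end
end

word_hash("cat", 0) == word_hash("act", 0)  # true: a pure product is order-independent
word_hash("cat")    == word_hash("act")     # false: the fudge term breaks the symmetry
```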
I could remove the **0.5 (the square-root bound in the primality test) to shorten the code at the expense of performance. I could even move the prime-number finder inside the loop to remove nine more characters, leaving 115 bytes.
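One possible shortened variant along those lines (my sketch; I have not verified the exact byte count): the square-root bound becomes an exclusive range `(2...i)`, and the prime list is rebuilt inside the loop, which is much slower but produces identical hashes.

```ruby
# Same hash values as the 129-byte version, traded for speed.
h=->(w){w.chars.reduce(1){|a,s|(a*(2..999).select{|i|(2...i).select{|j|i%j==0}==[]}[s.ord%92]+179)%((1<<24)-3)}}
```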
To test, the following tries to find the best value for the fudge factor in the range 1 to 300. It assumes that the word file is in the /tmp directory:
h=->(w,y){
@p=@p||(2..999).
select{|i|(2..i**0.5).
select{|j|i%j==0}==[]};
c=w.chars.reduce(1){|a,s|(a*@p[s.ord%92]+y)%((1<<24)-3)}
}
american_dictionary = "/usr/share/dict/words"
british_dictionary = "/tmp/british-english-huge.txt"
words = (IO.readlines british_dictionary).map{|word| word.chomp}.uniq
wordcount = words.size
fewest_collisions = 9999
(1..300).each do |y|
  whash = Hash.new(0)
  words.each do |w|
    code = h.call(w,y)
    whash[code] += 1
  end
  hashcount = whash.size
  collisions = whash.values.select{|count| count > 1}.inject(0,:+)
  if (collisions < fewest_collisions)
    puts "y = #{y}. #{collisions} Collisions. #{wordcount} Unique words. #{hashcount} Unique hash values"
    fewest_collisions = collisions
  end
end