
Tweetable hash function challenge

In this challenge you will write a hash function in 140 bytes¹ or fewer of source code. The hash function must take an ASCII string as input and return a 24-bit unsigned integer (in the range [0, 2^24 - 1]) as output.
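To make the required interface concrete, here is a minimal sketch of a function with the right shape: one ASCII string in, one integer in [0, 2^24 - 1] out. The name `h` and the multiplier 31 are arbitrary assumptions chosen for illustration; this is an uncompetitive baseline, not a suggested entry.

```python
def h(s: str) -> int:
    """Hypothetical baseline: polynomial rolling hash kept to 24 bits."""
    v = 0
    for ch in s:
        # Multiply-and-add, then mask to the low 24 bits so the result
        # always stays within [0, 2**24 - 1].
        v = (v * 31 + ord(ch)) & 0xFFFFFF
    return v
```

Masking with `0xFFFFFF` on every step is what guarantees the 24-bit range regardless of input length.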

Your hash function will be evaluated for every word in this large British English dictionary². Your score is the number of words that share a hash value with at least one other word (a collision).

The lowest score wins; ties are broken by the earliest post.

Test case

Before submitting, please test your scoring script on the following input:

duplicate
duplicate
duplicate
duplicate

If it gives any score other than 4, it is buggy.
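A scoring script along these lines should pass the test above. This is only a sketch under the stated scoring rule (count every word whose hash value is shared with another word); the names `score` and `toy_hash` are hypothetical, and `toy_hash` is a deliberately weak stand-in for an actual entry.

```python
from collections import Counter


def toy_hash(s: str) -> int:
    # Placeholder hash (assumption, for demonstration only):
    # sum of character codes, masked to 24 bits.
    return sum(ord(ch) for ch in s) & 0xFFFFFF


def score(words, hash_fn):
    # Count how many words land on each hash value.
    counts = Counter(hash_fn(w) for w in words)
    # Every word in a bucket holding more than one word counts as a collision.
    return sum(c for c in counts.values() if c > 1)


print(score(["duplicate"] * 4, toy_hash))  # prints 4
```

Note that the score counts words, not colliding pairs or buckets, which is why four identical lines give 4 rather than 1 or 3.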


Clarifying rules:

  1. Your hash function must run on a single string, not a whole array. Also, your hash function may not perform any I/O other than taking the input string and returning the output integer.
  2. Built-in hash functions or similar functionality (e.g. encryption to scramble bytes) are disallowed.
  3. Your hash function must be deterministic.
  4. Contrary to most other contests, optimizing specifically for the scoring input is allowed.

¹ I am aware Twitter limits characters instead of bytes, but for simplicity we will use bytes as a limit for this challenge.
² Modified from Debian's wbritish-huge, removing any non-ASCII words.

