Quantcast
Viewing all articles
Browse latest Browse all 25

Answer by Anders Kaseorg for Tweetable hash function challenge

Alright fine I’ll go learn a golfing language.

CJam, 140 bytes, 3314 colliding words

00000000: 7b5f 3162 225e d466 4a55 a05e 9f47 fc51  {_1b"^.fJU.^.G.Q
00000010: c45b 4965 3073 72dd e1b4 d887 a4ac bcbd  .[Ie0sr.........
00000020: 9c8f 70ca 2981 b2df 745a 10d0 dfca 6cff  ..p.)...tZ....l.
00000030: 7a3b 64df e730 54b4 b068 8584 5f6c 9f6b  z;d..0T..h.._l.k
00000040: b7f8 7a1f a2d3 b2b8 bcf5 cfa6 1ef7 a55c  ..z............\
00000050: dca8 795c 2492 dc32 1fb6 f449 f9ca f6b7  ..y\$..2...I....
00000060: a2cf 4772 266e ad4f d90c d236 b51d c5d5  ..Gr&n.O...6....
00000070: 5c46 3f9b 7cb4 f195 4efc fe4a ce8d 9aee  \F?.|...N..J....
00000080: 9dbc 223d 6962 3443 2329 257d            .."=ib4C#)%}

Defines a block (anonymous function). To test, you could add qN%%N*N to take the newline-separated list of words on stdin and write a newline-separated list of hashes on stdout. Equivalent Python code:

b=lambda s,a:reduce(lambda n,c:n*a+ord(c),s,0)
f=lambda s:b(s,ord('^\xd4fJU\xa0^\x9fG\xfcQ\xc4[Ie0sr\xdd\xe1\xb4\xd8\x87\xa4\xac\xbc\xbd\x9c\x8fp\xca)\x81\xb2\xdftZ\x10\xd0\xdf\xcal\xffz;d\xdf\xe70T\xb4\xb0h\x85\x84_l\x9fk\xb7\xf8z\x1f\xa2\xd3\xb2\xb8\xbc\xf5\xcf\xa6\x1e\xf7\xa5\\\xdc\xa8y\\$\x92\xdc2\x1f\xb6\xf4I\xf9\xca\xf6\xb7\xa2\xcfGr&n\xadO\xd9\x0c\xd26\xb5\x1d\xc5\xd5\\F?\x9b|\xb4\xf1\x95N\xfc\xfeJ\xce\x8d\x9a\xee\x9d\xbc'[b(s,1)%125]))%(8**8+1)

Pyth, 140 bytes, 3535 3396 colliding words

00000000: 4c25 4362 2d68 5e38 2038 2a36 3643 4022  L%Cb-h^8 8*66C@"
00000010: aa07 f29a 27a7 133a 3901 484d 3f9b 1982  ....'..:9.HM?...
00000020: d261 79ab adab 9d92 888c 3012 a280 76cf  .ay.......0...v.
00000030: a2e5 8f81 7039 acee c42e bc18 28d8 efbf  ....p9......(...
00000040: 0ebe 2910 9c90 158e 3742 71b4 bdf5 59c2  ..).....7Bq...Y.
00000050: f90b e291 8673 ea59 6975 10be e750 84c8  .....s.Yiu...P..
00000060: 0b0f e7e8 f591 f628 cefa 1ab3 2e3c 72a3  .......(.....<r.
00000070: 7f09 6190 dbd2 d54e d6d0 d391 a780 ebb6  ..a....N........
00000080: ae86 2d1e 49b0 552e 7522 4362            ..-.I.U.u"Cb

Defines a function named y. To test, you could add jmyd.z to take the newline-separated list of words on stdin and write a newline-separated list of hashes on stdout. Equivalent Python code:

b=lambda s,a:reduce(lambda n,c:n*a+ord(c),s,0)
f=lambda s:b(s,256)%(8**8+1-66*ord("\xaa\x07\xf2\x9a'\xa7\x13:9\x01HM?\x9b\x19\x82\xd2ay\xab\xad\xab\x9d\x92\x88\x8c0\x12\xa2\x80v\xcf\xa2\xe5\x8f\x81p9\xac\xee\xc4.\xbc\x18(\xd8\xef\xbf\x0e\xbe)\x10\x9c\x90\x15\x8e7Bq\xb4\xbd\xf5Y\xc2\xf9\x0b\xe2\x91\x86s\xeaYiu\x10\xbe\xe7P\x84\xc8\x0b\x0f\xe7\xe8\xf5\x91\xf6(\xce\xfa\x1a\xb3.<r\xa3\x7f\ta\x90\xdb\xd2\xd5N\xd6\xd0\xd3\x91\xa7\x80\xeb\xb6\xae\x86-\x1eI\xb0U.u"[b(s,256)%121]))

Theoretical limits

How well can we expect to do? Here is a plot of x, the number of colliding words, vs. y, the entropy in bytes required to get at most x colliding words. For example, the point (2835, 140) tells us that a random function gets at most 2835 colliding words with probability 1/256**140, so it’s exceedingly unlikely that we’ll ever be able to do much better than that with 140 bytes of code.

Image may be NSFW.
Clik here to view.
graph


Viewing all articles
Browse latest Browse all 25

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>