Python 2, 140 bytes, 4266 colliding words
I didn’t really want to start with non-printable bytes, given their unclear tweetability, but, well, I didn’t start it. :-P
00000000: efbb bf64 6566 2066 2873 293a 6e3d 696e ...def f(s):n=in
00000010: 7428 732e 656e 636f 6465 2827 6865 7827 t(s.encode('hex'
00000020: 292c 3336 293b 7265 7475 726e 206e 2528 ),36);return n%(
00000030: 382a 2a38 2b31 2d32 3130 2a6f 7264 2827 8**8+1-210*ord('
00000040: 6f8e 474c 9f5a b49a 01ad c47f cf84 7b53 o.GL.Z........{S
00000050: 49ea c71b 29cb 929a a53b fc62 3afb e38e I...)....;.b:...
00000060: e533 7360 982a 50a0 2a82 1f7d 768c 7877 .3s`.*P.*..}v.xw
00000070: d78a cb4f c5ef 9bdb 57b4 7745 3a07 8cb0 ...O....W.wE:...
00000080: 868f a927 5b6e 2536 375d 2929 ...'[n%67]))
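For anyone who wants to check the dump, the hex columns can be turned back into bytes. This is only a rough sketch in Python 3: the names `HEX_DUMP`, `source`, and `table` are hypothetical, and it only rebuilds and inspects the bytes; it does not run the Python 2 code itself.

```python
# Hex columns copied from the dump above (offsets and ASCII column dropped).
HEX_DUMP = """
efbb bf64 6566 2066 2873 293a 6e3d 696e
7428 732e 656e 636f 6465 2827 6865 7827
292c 3336 293b 7265 7475 726e 206e 2528
382a 2a38 2b31 2d32 3130 2a6f 7264 2827
6f8e 474c 9f5a b49a 01ad c47f cf84 7b53
49ea c71b 29cb 929a a53b fc62 3afb e38e
e533 7360 982a 50a0 2a82 1f7d 768c 7877
d78a cb4f c5ef 9bdb 57b4 7745 3a07 8cb0
868f a927 5b6e 2536 375d 2929
"""

# Strip all whitespace explicitly before decoding the hex digits.
source = bytes.fromhex("".join(HEX_DUMP.split()))

# The lookup table is the string literal between "ord('" and "'[n%67]".
start = source.index(b"ord('") + len(b"ord('")
end = source.rindex(b"'[n%67]")
table = source[start:end]

print(len(source))  # total size of the submission in bytes
print(len(table))   # number of bytes in the lookup table
```

The two printed values should be 140 and 67, matching the byte count in the header and the `n%67` index at the end of the code.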
Python 2, 140 printable bytes, 4362 colliding words (previously 4662, then 4471)
def f(s):n=int(s.encode('hex'),16);return n%(8**8+3-60*ord('4BZp%(jTvy"WTf.[Lbjk6,-[LVbSvF[Vtw2e,NsR?:VxC0h5%m}F5,%d7Kt5@SxSYX-=$N>'[n%71]))
Inspired by the form of kasperd’s solution, obviously—but with the important addition of an affine transformation on the modulus space, and entirely different parameters.
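To make the structure easier to see, here is an ungolfed sketch of the printable version. It is written for Python 3, so `int.from_bytes` stands in for `int(s.encode('hex'),16)` (assumed equivalent for non-empty byte strings), and the names `TABLE`, `modulus_for`, and `hash_word` are hypothetical. The affine map is `8**8 + 3 - 60*c`, where `c` is the code of the table character selected by `n % 71`.

```python
# Lookup table copied from the golfed one-liner (71 printable characters).
TABLE = b'4BZp%(jTvy"WTf.[Lbjk6,-[LVbSvF[Vtw2e,NsR?:VxC0h5%m}F5,%d7Kt5@SxSYX-=$N>'


def modulus_for(n):
    """Pick a table byte by n % 71, then map it affinely to a modulus."""
    c = TABLE[n % len(TABLE)]   # indexing bytes yields an int in Python 3
    return 8**8 + 3 - 60 * c    # affine transformation of the character code


def hash_word(word):
    """Read the word's bytes as a big-endian integer and reduce it."""
    n = int.from_bytes(word, "big")
    return n % modulus_for(n)


# Sample words chosen only for illustration.
for w in (b"hash", b"function", b"tweetable"):
    print(w.decode(), hash_word(w))
```

Because every table byte is a printable character (code at least 32), each modulus is below `8**8`, so every hash value is too. The table does not store moduli directly; it stores small offsets that the affine map spreads over a range just under `8**8`.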