Why the 6 in relu6?

From this Reddit thread:

This is useful in making the networks ready for fixed-point inference.
If you unbound the upper limit, you lose too many bits to the Q part
of a Q.f number. Keeping the ReLUs bounded by 6 will let them take a
max of 3 bits (up to 8), leaving 4/5 bits for .f

It seems, then, that 6 is a somewhat arbitrary value, chosen according to the number of bits you want to be able to squeeze your network's activations into once it runs in fixed point.
As for why only the version with value 6 is implemented, I assume it's because that value fits best in 8 bits, which is probably the most common use case.
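
To make the bit-budget argument concrete, here is a minimal sketch in plain Python. It assumes an unsigned 8-bit Q3.5 format (3 integer bits, 5 fractional bits), which is one reading of the "4/5 bits" in the quote; with a sign bit you would have 4 fractional bits instead. The helper names `to_fixed` and `from_fixed` are made up for illustration and do not come from any library.

```python
def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def to_fixed(x: float, int_bits: int = 3, frac_bits: int = 5) -> int:
    """Quantize a non-negative value to an unsigned Q(int_bits).(frac_bits) code.

    Assumption: unsigned format, round-to-nearest, saturating on overflow.
    """
    scale = 1 << frac_bits                        # 2**frac_bits steps per unit
    max_code = (1 << (int_bits + frac_bits)) - 1  # largest representable code
    code = round(x * scale)
    return min(max(code, 0), max_code)


def from_fixed(code: int, frac_bits: int = 5) -> float:
    """Convert a fixed-point code back to a float."""
    return code / (1 << frac_bits)


# The clipped activation never exceeds 6, so 3 integer bits (range 0..7) suffice
# and the remaining 5 bits give a resolution of 1/32.
for raw in (-2.0, 3.25, 100.0):
    act = relu6(raw)
    code = to_fixed(act)
    print(raw, act, code, from_fixed(code))
```

Without the clip, an activation such as 100.0 would need 7 integer bits, leaving only 1 fractional bit in the same 8-bit budget.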
