Why is operating on Float64 faster than Float16?

Question

As you can see, the effect you are expecting is present for Float32:

julia> using BenchmarkTools

julia> rnd64 = rand(Float64, 1000);

julia> rnd32 = rand(Float32, 1000);

julia> rnd16 = rand(Float16, 1000);

julia> @btime $rnd64.^2;
  616.495 ns (1 allocation: 7.94 KiB)

julia> @btime $rnd32.^2;
  330.769 ns (1 allocation: 4.06 KiB)  # faster!!

julia> @btime $rnd16.^2;
  2.067 μs (1 allocation: 2.06 KiB)  # slower!!

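Float64 and Float32 arithmetic is implemented in hardware on most platforms, but Float16 arithmetic usually is not, so it must be emulated in software. In Julia this typically means that each Float16 operation converts its operands to Float32, computes the result there, and rounds it back to Float16. Those extra conversions cost more than the smaller element size saves.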
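To make the software path concrete, here is a minimal sketch that squares Float16 values by explicitly widening to Float32 and rounding back. The helper name square_via_f32 is hypothetical and only for illustration; it is not how Base is written, just a model of the extra conversion work done per element on hardware without native Float16 arithmetic.

```julia
# Hypothetical helper (not part of Base): square a Float16 by widening to
# Float32, multiplying there, and rounding the result back to Float16.
square_via_f32(x::Float16) = Float16(Float32(x)^2)

xs = rand(Float16, 1000)

widened = square_via_f32.(xs)   # explicit widen -> compute -> round
direct  = xs .^ 2               # what the benchmark above measures

# The product of two Float16 values is exactly representable in Float32,
# so both paths round the same exact value and should agree.
@assert eltype(widened) == Float16
@assert widened == direct
```

If the storage savings of Float16 matter but speed does too, one option is to keep the data as Float16 and convert to Float32 once before a long computation, rather than paying for the conversion in every operation.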

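Note also that you should interpolate variables with $ when micro-benchmarking with @btime. Without it, rnd32 is accessed as a non-constant global whose type is not known in advance, which adds overhead that has nothing to do with the operation being measured. The difference is significant here, not least in terms of allocations: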

julia> @btime $rnd32.^2;
  336.187 ns (1 allocation: 4.06 KiB)

julia> @btime rnd32.^2;
  930.000 ns (5 allocations: 4.14 KiB)
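The same comparison can be written as a small script. This is only a sketch: it assumes the third-party BenchmarkTools package is installed, and square_all is a hypothetical helper introduced here for illustration. Timings are left out because they depend on the machine and the Julia version.

```julia
using BenchmarkTools  # third-party package, assumed to be installed

xs = rand(Float32, 1000)   # non-constant global, like rnd32 in the session above

# Hypothetical helper: wraps the broadcast so the work happens inside a function.
square_all(v) = v .^ 2

# Interpolated: the value of xs is spliced into the benchmark, so only the
# call itself is measured.
@btime square_all($xs);

# Not interpolated: xs is looked up as an untyped global on every evaluation,
# so the measurement also includes that lookup and dynamic dispatch.
@btime square_all(xs);
```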
