Eigen uses lazy evaluation. From "How does Eigen compare to BLAS/LAPACK?":
For operations involving complex expressions, Eigen is inherently
faster than any BLAS implementation because it can handle and optimize
a whole operation globally — while BLAS forces the programmer to
split complex operations into small steps that match the BLAS
fixed-function API, which incurs inefficiency due to introduction of
temporaries. See for instance the benchmark result of a Y = aX + bY
operation which involves two calls to BLAS level1 routines while Eigen
automatically generates a single vectorized loop.
The second chart in the benchmarks is Y = a*X + b*Y, which Eigen was specially designed to handle. It should be no wonder that a library wins at a benchmark it was created for. You’ll notice that the more generic benchmarks, like matrix-matrix multiplication, don’t show any advantage for Eigen.
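To make the "single loop, no temporaries" point concrete, here is a minimal sketch of the expression-template idea behind lazy evaluation. It is a hand-rolled toy, not Eigen's actual implementation: the `toy` namespace and the names `Vec`, `Scaled`, `Sum` and `axpby` are all made up for illustration. The operators only build lightweight nodes, and the arithmetic happens in the one loop inside the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression templates. Hypothetical names, NOT Eigen's real classes.
namespace toy {

// Lazy node for scalar * expression; the product is computed per element on demand.
template <typename E>
struct Scaled {
    double s;
    const E& e;
    double operator[](std::size_t i) const { return s * e[i]; }
    std::size_t size() const { return e.size(); }
};

// Lazy node for an elementwise sum of two expressions.
template <typename L, typename R>
struct Sum {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
    std::size_t size() const { return l.size(); }
};

struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression runs ONE loop over the elements.
    // No intermediate vector is allocated for a*X or b*Y.
    template <typename E>
    Vec& operator=(const E& expr) {
        assert(expr.size() == data.size());
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
        return *this;
    }
};

// These operators do no arithmetic; they only record what to compute.
template <typename E>
Scaled<E> operator*(double s, const E& e) { return Scaled<E>{s, e}; }

template <typename L, typename R>
Sum<L, R> operator+(const L& l, const R& r) { return Sum<L, R>{l, r}; }

// Y = a*X + b*Y in place. Each Y[i] is read before it is overwritten,
// so the elementwise aliasing of Y is safe here.
inline void axpby(double a, const Vec& X, double b, Vec& Y) {
    Y = a * X + b * Y;
}

}  // namespace toy
```

With Eigen itself the same operation is written directly as `y = a * x + b * y;` on, say, two `Eigen::VectorXd` objects. The BLAS level 1 route the quote refers to needs two separate calls (a scale followed by an axpy), so the data is traversed twice.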