The thing that stands out initially is that your code is highly polymorphic. You should specialize your floating point type uniformly to Double
, so GHC (and LLVM) have a chance of applying more aggressive optimizations.
Note, for those trying to reproduce, this code imports:
import qualified Data.Vector as V
import Data.Bits
import Data.Time.Clock
import System.Random
import System.Random.Shuffle
type Permutation = V.Vector Int
Ok. There’s lots of things you can try to improve this code.
Improvements
Data representation
- Specialize to a concrete floating point type, instead of polymorphic floating point functions
- Replace tuple
(a,a,a)
with unboxed tripleT !Double !Double !Double
- Switch from
Data.Array
toData.Array.Unboxed
forPermutations
- Replace use of boxed array of triples with multidimensional unboxed array from
repa
package
Compiler flags
- Compile with
-O2 -fvia-C -optc-O3 -fexcess-precision -optc-march=native
(or equivalent with-fllvm
) - Increase spec constr threshold —
-fspec-constr-count=16
More efficient library functions
- Use mersenne-random instead of StdGen to generate randoms
- Replace
mod
withrem
- Replace
V.!
indexing with unchecked indexingVU.unsafeIndex
(after moving toData.Vector.Unboxed
Runtime settings
- Increase the default allocation area:
-A20M
or-H
Also, check your algorithm is identical to the C# one, and you’re using the same data structures.