6

I have a problem with Haskell module optimization.

There is Main module.

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.DeepSeq
import Formatting
import Formatting.Clock
import System.Clock

import Data.Array

size :: Int
size = 200 :: Int

stdMult     :: (Ix a, Ix b, Ix c, Num d) =>
               Array (a,b) d -> Array (b,c) d -> Array (a,c) d
stdMult x y =  array resultBounds
                 [((i,j), sum [ x!(i,k) * y!(k,j) | k <- range (lj,uj)])
                                   | i <- range (li,ui),
                                     j <- range (lj',uj') ]
    where ((li,lj),(ui,uj))     = bounds x
          ((li',lj'),(ui',uj')) = bounds y
          resultBounds
            | (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
            | otherwise = error "error"


main :: IO ()
main = do
  let a = array ((1,1),(size, size)) [((i,j), 2*i-j) |
                                  i <- range (1,size),
                                  j <- range (1,size)]
  let b = array ((1,1),(size, size)) [((i,j), 2*i+3*j) |
                                  i <- range (1,size)`,
                                  j <- range (1,size)]

  start <- getTime ProcessCPUTime
  let
    c = stdMult a b
  end <- c `deepseq` getTime ProcessCPUTime
  fprint (timeSpecs % "\n") start end
  return()

When stdMult in Main module, everything works ok. I replace stdMult to another module. When I don't use ghc optimization, execution time is the same. When I use ghc options -O3, when stdMult in Main module time execution decreases, but when stdMult in another module, execution time is almost unchanged! For example, when stdMult in Main I have time 3 seconds, and when stdMult not in Main I have time 30 seconds, for matrix 500x500.

It is very strange!

5
  • 3
    What is stdMult3? Commented Oct 15, 2017 at 22:41
  • It is stdMult. I corrected, thanks. Commented Oct 15, 2017 at 23:21
  • 2
    What happens when you add {-# INLINABLE stdMult #-}? Commented Oct 15, 2017 at 23:53
  • 2
    Separately, you should generally use Criterion for microbenchmarks, rather than trying to roll your own. There are lots of ways to accidentally measure something other than what you think you're measuring. Commented Oct 15, 2017 at 23:56
  • @dfeuer thanks for good advice! Commented Oct 16, 2017 at 7:11

1 Answer 1

10

(You need the clock and formatting packages from Hackage to compile the code.)

I can reproduce the 10x slowdown when stdMult is in a different module. Luckily a fix is easy: in the module where stdMult is defined, add an INLINABLE pragma:

{-# INLINABLE stdMult #-}

It adds the definition to the interface file (.hi) which allows inlining in the modules that uses it, which in turn allows it to be specialized to fast machine Int instead of slow abstract Ix and Num polymorphic code. (If it's in the same module GHC can inline and specialize at will, and things aren't INLINABLE by default because it can cause executable code bloat and slower compilation.)

Alternatively to INLINABLE, you can manually SPECIALIZE to the types you want optimized implementations for. This is a bit more verbose, but should be faster to compile in big projects (it will be specialized once per export, instead of once per import, at a rough guess).

{-# SPECIALIZE stdMult :: Array (Int, Int) Int -> Array (Int, Int) Int -> Array (Int, Int) Int #-}
Sign up to request clarification or add additional context in comments.

4 Comments

Perhaps "instead of once per import" could be made more precise as "instead of once per imported use" or something like that. If stdMult is used 10 times, it will be inlined 10 times, I think, so it can be more that "once per import" (which is still a reasonable rough guess, as you state above).
@chi, that's not likely true. The specializer works across modules too, and should create just one specialization per module per type.
@dfeuer Interesting. I was perhaps thinking too much about inlining. Thanks.
@chi, of course, as usual, the best way to be sure is to actually inspect the compiled core. I don't have time to play with that today.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.