Competitive programming using Haskell

Question

I am currently trying to refresh my Haskell knowledge by solving some Hackerrank problems.

For example:

https://www.hackerrank.com/challenges/maximum-palindromes/problem

I've already implemented an imperative solution in C++ which got accepted for all test cases. Now I am trying to come up with a pure functional solution in (reasonably idiomatic) Haskell.

My current code is

module Main where

import           Control.Monad
import qualified Data.ByteString.Char8 as C
import           Data.Bits
import           Data.List
import qualified Data.Map.Strict       as Map
import qualified Data.IntMap.Strict    as IntMap
import           Debug.Trace

-- precompute factorials
compFactorials :: Int -> Int -> IntMap.IntMap Int
compFactorials n m = go 0 1 IntMap.empty
  where
    go a acc map
      | a < 0     = map
      | a < n     = go a' acc' map'
      | otherwise = map'
      where
        map' = IntMap.insert a acc map
        a'   = a + 1
        acc' = (acc * a') `mod` m

-- precompute invs
compInvs :: Int -> Int -> IntMap.IntMap Int -> IntMap.IntMap Int
compInvs n m facts = go 0 IntMap.empty
  where
    go a map
      | a < 0     = map
      | a < n     = go a' map'
      | otherwise = map'
      where
        map' = IntMap.insert a v map
        a' = a + 1
        v = (modExp b (m-2) m) `mod` m
        b = (IntMap.!) facts a


modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
  where
    go b e r
      | (.&.) e 1 == 1 = go b' e' r'
      | e > 0 = go b' e' r
      | otherwise = r
        where
          r' = (r * b) `mod` m
          b' = (b * b) `mod` m
          e' = shift e (-1)

-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (IntMap.IntMap Int)
initFreqMap inp = go 1 map1 map2 inp
  where
    map1 = Map.fromList $ zip ['a'..'z'] $ repeat 0
    map2 = Map.fromList $ zip ['a'..'z'] $ repeat IntMap.empty

    go idx m1 m2 inp
      | C.null inp = m2
      | otherwise  = go (idx+1) m1' m2' $ C.tail inp
      where
        m1' = Map.update (\v -> Just $ v+1) (C.head inp) m1
        m2' = foldl' (\m w -> Map.update (\v -> liftM (\c -> IntMap.insert idx c v) $ Map.lookup w m1') w m)
              m2 ['a'..'z']


query :: Int -> Int -> Int -> Map.Map Char (IntMap.IntMap Int)
         -> IntMap.IntMap Int -> IntMap.IntMap Int -> Int
query l r m freqMap facts invs
  | x > 1     = (x * y) `mod` m
  | otherwise = y
  where
    calcCnt cs = cr - cl
      where
         cl = IntMap.findWithDefault 0 (l-1) cs
         cr = IntMap.findWithDefault 0 r cs

    f1 acc cs
      | even cnt = acc
      | otherwise = acc + 1
      where
        cnt = calcCnt cs

    f2 (acc1,acc2) cs
      | cnt < 2   = (acc1 ,acc2)
      | otherwise = (acc1',acc2')
      where
        cnt = calcCnt cs

        n = cnt `div` 2

        acc1' = acc1 + n
        r = choose acc1' n
        acc2' = (acc2 * r) `mod` m


    -- calc binomial coefficient using Fermat's little theorem
    choose n k
      | n < k = 0
      | otherwise = (f1 * t) `mod` m
      where
        f1 = (IntMap.!) facts n
        i1 = (IntMap.!) invs k
        i2 = (IntMap.!) invs (n-k)

        t = (i1 * i2) `mod` m


    x = Map.foldl' f1 0 freqMap
    y = snd $ Map.foldl' f2 (0,1) freqMap


main :: IO()
main = do
    inp <- C.getLine
    q   <- readLn :: IO Int

    let modulo  = 1000000007
    let facts   = compFactorials (C.length inp) modulo
    let invs    = compInvs (C.length inp) modulo facts
    let freqMap = initFreqMap inp

    forM_ [1..q] $ \_ -> do

      line <- getLine

      let [s1, s2] = words line
      let l = (read s1) :: Int
      let r = (read s2) :: Int

      let result = query l r modulo freqMap facts invs

      putStrLn $ show result

It passes all small and medium test cases but I am getting timeout with large test cases. The key to solve this problem is to precompute some stuff once at the beginning and use them to answer the individual queries efficiently.

Now, my main problem where I need help is:

The initital profiling shows that the lookup operation of the IntMap seems to be the main bottleneck. Is there better alternative to IntMap for memoization? Or should I look at Vector or Array, which I believe will lead to more "ugly" code. Even in current state, the code doesn't look nice (by functional standards) and as verbose as my C++ solution. Any tips to make it more idiomatic? Other than IntMap usage for memoization, do you spot any other obvious problems which can lead to performance problems?

And is there any good sources, where I can learn how to use Haskell more effectively for competitive programming?

A sample large testcase, where the current code gets timeout:

input.txt output.txt

For comparison my C++ solution:

#include <vector>
#include <iostream>

#define MOD 1000000007L

long mod_exp(long b, long e) {
    long r = 1;

    while (e > 0) {
        if ((e & 1) == 1) {
            r = (r * b) % MOD;
        }

        b = (b * b) % MOD;
        e >>= 1;
    }

    return r;
}

long n_choose_k(int n, int k, const std::vector<long> &fact_map, const std::vector<long> &inv_map) {
    if (n < k) {
        return 0;
    }

    long l1 = fact_map[n];
    long l2 = (inv_map[k] * inv_map[n-k]) % MOD;

    return (l1 * l2) % MOD;
}

int main() {
    std::string s;
    int q;

    std::cin >> s >> q;

    std::vector<std::vector<long>> freq_map;
    std::vector<long> fact_map(s.size()+1);
    std::vector<long> inv_map(s.size()+1);

    for (int i = 0; i < 26; i++) {
        freq_map.emplace_back(std::vector<long>(s.size(), 0));
    }

    std::vector<long> acc_map(26, 0);
    for (int i = 0; i < s.size(); i++) {
        acc_map[s[i]-'a']++;

        for (int j = 0; j < 26; j++) {
            freq_map[j][i] = acc_map[j];
        }
    }

    fact_map[0] = 1;
    inv_map[0] = 1;
    for (int i = 1; i <= s.size(); i++) {
        fact_map[i] = (i * fact_map[i-1]) % MOD;
        inv_map[i] = mod_exp(fact_map[i], MOD-2) % MOD;
    }

    while (q--) {
        int l, r;

        std::cin >> l >> r;
        std::vector<long> x(26, 0);

        long t = 0;
        long acc = 0;
        long result = 1;

        for (int i = 0; i < 26; i++) {
            auto cnt = freq_map[i][r-1] - (l > 1 ? freq_map[i][l-2] : 0);

            if (cnt % 2 != 0) {
                t++;
            }

            long n = cnt / 2;

            if (n > 0) {
                acc += n;
                result *= n_choose_k(acc, n, fact_map, inv_map);
                result = result % MOD;
            }
        }

        if (t > 0) {
            result *= t;
            result = result % MOD;
        }

        std::cout << result << std::endl;
    }
}

UPDATE:

DanielWagner's answer has confirmed my suspicion that the main problem in my code was the usage of IntMap for memoization. Replacing IntMap with Array made my code perform similar to DanielWagner's solution.

module Main where

import           Control.Monad
import           Data.Array            (Array)
import qualified Data.Array            as A
import qualified Data.ByteString.Char8 as C
import           Data.Bits
import           Data.List
import           Debug.Trace


-- precompute factorials
compFactorials :: Int -> Int -> Array Int Int
compFactorials n m = A.listArray (0,n) $ scanl' f 1 [1..n]
  where
    f acc a = (acc * a) `mod` m

-- precompute invs
compInvs :: Int -> Int -> Array Int Int -> Array Int Int
compInvs n m facts = A.listArray (0,n) $ map f [0..n]
  where
    f a = (modExp ((A.!) facts a) (m-2) m) `mod` m

modExp :: Int -> Int -> Int -> Int
modExp b e m = go b e 1
  where
    go b e r
      | (.&.) e 1 == 1 = go b' e' r'
      | e > 0 = go b' e' r
      | otherwise = r
        where
          r' = (r * b) `mod` m
          b' = (b * b) `mod` m
          e' = shift e (-1)

-- precompute frequency table
initFreqMap :: C.ByteString -> Map.Map Char (Array Int Int)
initFreqMap inp = Map.fromList $ map f ['a'..'z']
  where
    n = C.length inp
    f c = (c, A.listArray (0,n) $ scanl' g 0 [0..n-1])
      where
        g x j
          | C.index inp j == c = x+1
          | otherwise = x

query :: Int -> Int -> Int -> Map.Map Char (Array Int Int)
         -> Array Int Int -> Array Int Int -> Int
query l r m freqMap facts invs
  | x > 1     = (x * y) `mod` m
  | otherwise = y
  where
    calcCnt freqMap = cr - cl
      where
         cl = (A.!) freqMap (l-1)
         cr = (A.!) freqMap r

    f1 acc cs
      | even cnt = acc
      | otherwise = acc + 1
      where
        cnt = calcCnt cs

    f2 (acc1,acc2) cs
      | cnt < 2   = (acc1 ,acc2)
      | otherwise = (acc1',acc2')
      where
        cnt = calcCnt cs

        n = cnt `div` 2

        acc1' = acc1 + n
        r = choose acc1' n
        acc2' = (acc2 * r) `mod` m


    -- calc binomial coefficient using Fermat's little theorem
    choose n k
      | n < k = 0
      | otherwise = (f1 * t) `mod` m
      where
        f1 = (A.!) facts n
        i1 = (A.!) invs k
        i2 = (A.!) invs (n-k)

        t = (i1 * i2) `mod` m


    x = Map.foldl' f1 0 freqMap
    y = snd $ Map.foldl' f2 (0,1) freqMap


main :: IO()
main = do
    inp <- C.getLine
    q   <- readLn :: IO Int

    let modulo  = 1000000007
    let facts   = compFactorials (C.length inp) modulo
    let invs    = compInvs (C.length inp) modulo facts
    let freqMap = initFreqMap inp

    replicateM_ q $ do

      line <- getLine

      let [s1, s2] = words line
      let l = (read s1) :: Int
      let r = (read s2) :: Int

      let result = query l r modulo freqMap facts invs

      putStrLn $ show result

Please narrow this question down to something that can be fully explained in a couple of sentences, without requiring a link to something like HackerRank. As it stands, it's too open-ended. Also, why don't you show your C++ code and what exactly works faster in that version? — leftaroundabout
– leftaroundabout, Commented Sep 10, 2018 at 18:02
You might want to split this up into a several different questions. — Kwarrtz
– Kwarrtz, Commented Sep 10, 2018 at 18:03
Isn't this simply even numbered of letters counts plus one (if there are odd numbered letters)? week -> e:2 w:1 k:1 -> 2+1=3 — karakfa
– karakfa, Commented Sep 10, 2018 at 18:32
@karafka No; e.g. for week there's only two palindromes of maximum length, namely ewe and eke. And no, it's not the number of odd letter counts either. Think of e.g. bookmark, where long palindromes begin ok or ko, then are followed by b, m, a, or r, so there are 2*4=8 maximum-length palindromes. — Daniel Wagner
– Daniel Wagner, Commented Sep 10, 2018 at 19:43
Oh, the number of maximum length palindromes. Not the maximum length! — karakfa
– karakfa, Commented Sep 11, 2018 at 16:57

Daniel Wagner · Accepted Answer · 2018-09-10 22:39:12Z

6

I think you've shot yourself in the foot by trying to be too clever. Below I'll show a straightforward implementation of a slightly different algorithm that is about 5x faster than your Haskell code.

Here's the core combinatoric computation. Given a character frequency count for a substring, we can compute the number of maximum-length palindromes this way:

Divide all the frequencies by two, rounding down; call this the div2-frequencies. We'll also want the mod2-frequencies, which is the set of letters for which we had to round down.
Sum the div2-frequencies to get the total length of the palindrome prefix; its factorial gives an overcount of the number of possible prefixes for the palindrome.
Take the product of the factorials of the div2-frequencies. This tells the factor by which we overcounted above.
Take the size of the mod2-frequencies, or choose 1 if there are none. We can extend any of the palindrome prefixes by one of the values in this set, if there are any, so we have to multiply by this size.

For the overcounting step, it's not super obvious to me whether it would be faster to store precomputed inverses for factorials, and take their product, or whether it's faster to just take the product of all the factorials and do one inverse operation at the very end. I'll do the latter, because it just intuitively seems faster to do one inversion per query than one lookup per repeated letter, but what do I know? Should be easy to test if you want to try to adapt the code yourself.

There's only one other quick insight I had vs. your code, which is that we can cache the frequency counts for prefixes of the input; then computing the frequency count for a substring is just pointwise subtraction of two cached counts. Your precomputation on the input I find to be a bit excessive in comparison.

Without further ado, let's see some code. As usual there's some preamble.

module Main where

import           Control.Monad
import           Data.Array (Array)
import qualified Data.Array as A
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as M
import           Data.Monoid

Like you, I want to do all my computations on cheap Ints and bake in the modular operations where possible. I'll make a newtype to make sure this happens for me.

newtype Mod1000000007 = Mod Int deriving (Eq, Ord)

instance Num Mod1000000007 where
    fromInteger = Mod . (`mod` 1000000007) . fromInteger
    Mod l + Mod r = Mod ((l+r) `rem` 1000000007)
    Mod l * Mod r = Mod ((l*r) `rem` 1000000007)
    negate (Mod v) = Mod ((1000000007 - v) `rem` 1000000007)
    abs = id
    signum = id

instance Integral Mod1000000007 where
    toInteger (Mod n) = toInteger n
    quotRem a b = (a * b^1000000005, 0)

I baked in the base of 1000000007 in several places, but it's easy to generalize by giving Mod a phantom parameter and making a HasBase class to pick the base. Ask a fresh question if you're not sure how and are interested; I'll be happy to do a more thorough writeup. There's a few more instances for Mod that are basically uninteresting and primarily needed because of Haskell's wacko numeric class hierarchy:

instance Show Mod1000000007 where show (Mod n) = show n
instance Real Mod1000000007 where toRational (Mod n) = toRational n
instance Enum Mod1000000007 where
    toEnum = Mod . (`mod` 1000000007)
    fromEnum (Mod n) = n

Here's the precomputation we want to do for factorials...

type FactMap = Array Int Mod1000000007

factMap :: Int -> FactMap
factMap n = A.listArray (0,n) (scanl (*) 1 [1..])

...and for precomputing frequency maps for each prefix, plus getting a frequency map given a start and end point.

type FreqMap = Map Char Int

freqMaps :: String -> Array Int FreqMap
freqMaps s = go where
    go = A.listArray (0, length s)
        (M.empty : [M.insertWith (+) c 1 (go A.! i) | (i, c) <- zip [0..] s])

substringFreqMap :: Array Int FreqMap -> Int -> Int -> FreqMap
substringFreqMap maps l r = M.unionWith (-) (maps A.! r) (maps A.! (l-1))

Implementing the core computation described above is just a few lines of code, now that we have suitable Num and Integral instances for Mod1000000007:

palindromeCount :: FactMap -> FreqMap -> Mod1000000007
palindromeCount facts freqs
    =     toEnum (max 1 mod2Freqs)
    *     (facts A.! sum div2Freqs)
    `div` product (map (facts A.!) div2Freqs)
    where
    (div2Freqs, Sum mod2Freqs) = foldMap (\n -> ([n `quot` 2], Sum (n `rem` 2))) freqs

Now we just need a short driver to read stuff and pass it around to the appropriate functions.

main :: IO ()
main = do
    inp <- getLine
    q   <- readLn

    let freqs = freqMaps inp
        facts = factMap (length inp)

    replicateM_ q $ do
        [l,r] <- map read . words <$> getLine
        print . palindromeCount facts $ substringFreqMap freqs l r

That's it. Notably I made no attempt to be fancy about bitwise operations and didn't do anything fancy with accumulators; everything is in what I would consider idiomatic purely-functional style. The final count is about half as much code that runs about 5x faster.

P.S. Just for fun, I replaced the last line with print (l+r :: Int)... and discovered that about half the time is spent in read. Ouch! Seems there's still plenty of low-hanging fruit if this isn't fast enough yet.

edited Sep 10, 2018 at 22:39

answered Sep 10, 2018 at 22:08

Daniel Wagner

156k10 gold badges231 silver badges393 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

bmk Over a year ago

Thank you so much for your detailed answer! As I suspected (and your answer confirmed it), my main problem was that I used IntMap for the memoization... I replaced in my code IntMap with Array and voila, now it has similar performance to your solution. And my frequency map is sure not as compact as your transposed version but in my opinion it doesn't make big difference especially for large test instances. The main lesson I learned is, that Array is in this case much better choice than IntMap for memoization.

bmk Over a year ago

But still, our Haskell solutions perform about 10x worse than the equivalent C++ solution. As you suspect in your last paragraph, maybe the IO is here the culprit... Do you think we can optimize it further to reach C++'s performance without making the code "too ugly"?

Daniel Wagner Over a year ago

@bmk It is usually possible to come within a factor of 2-3x of C and C++ solutions without becoming ugly, and to be comparable if you're willing to deal with a few scattered eyesores. If you want to go faster, it is probably time to start profiling.

Bartek Banachewicz Over a year ago

Is the instance Integral correct? The 5 looks like a typo.

Daniel Wagner Over a year ago

@BartekBanachewicz It is correct. Try it yourself and see: try (x `div` y) * y for a couple different choices of x and y.

Collectives™ on Stack Overflow

Competitive programming using Haskell

1 Answer 1

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related