1

I'm rewriting the Huffman coding algorithm as a Haskell rookie exercise and I'm struggling a bit reforming the tree I serialized using the following technique:
- Walk the tree from root in depth-first preorder
- Encounter a Node put 0 followed by recurse on left then right childs
- Encounter a Leaf put 1 followed by the symbol byte

My serialization code:

serializeHtree :: (Ord a) => CodeDict a -> HTree a -> [Bit]
serializeHtree dict (Leaf a) = I : (myLpad $ dict M.! a)
serializeHtree dict (Branch l r) = O : (serializeHtree dict r ++ serializeHtree dict l)

Where:
- CodeDict is a Map from a to [Bit]
- myLpad fill the variable-length huffman code with 0 to make it fixed-length
- and Bit is my own datatype constructed by O and I

For example,
Huffman tree

The tree above will be represented as:
0100000000001000001001000001010100000110100000111

Now to deserialize it, I know I have to read the stream bit per bit and:
- Encounter a 1 make a Leaf with the next byte
- Encounter a 0 make a branch recursing on left then right subtrees

But I'm failing in my approach, here is my deserialization code (non functional this time):

deserializeHtree :: [Bit] -> HTree a
deserializeHtree (x:xs) = case x of
    'O' -> Branch (deserializeHtree xs) (deserializeHtree xs)
    'I' -> Leaf (head xs)  

Thanks for support

4
  • Your algorithm description doesn't match annotations in the picture, which doesn't match the bit representation under it. Something is amiss. Commented Apr 21, 2018 at 1:16
  • 1
    minimal reproducible example ________ Commented Apr 21, 2018 at 9:13
  • Sorry I took the picture from the net. I should have erased the 0 and 1 on the branches (they are used for encoding phase). I admit it's confusing Commented Apr 21, 2018 at 11:34
  • This seems quite similar to this question but with bits instead of characters. Commented Apr 22, 2018 at 16:27

1 Answer 1

2

You need to write a parser, using a type suitable for recursion.

At the top level you indeed want [Bit] -> HTree a (probably restricting a to some type class, but I'll neglect this). However, for enabling recursion you need

parser :: [Bit] -> (HTree a, [Bit])
-- or, if you need to handle failure
parser :: [Bit] -> Maybe (HTree a, [Bit])

The idea is that parser is fed with the bits to be parsed. It then tries to parse the first tree which is represented in the prefix of those bits. If it succeeds, it returns an HTree a and a list of the exceeding (non consumed) bits.

Returning the non consumed bits is crucial for Branch where you need to parse two subtrees. You parse the first one, take the non consumed bits, and then use those to start the parser of the right subtree.

This topic is too broad for a SO answer, but if you google for "Haskell parser monad" you should find plenty of examples.

Sign up to request clarification or add additional context in comments.

2 Comments

Oh thanks that's exactly what I needed: to understand the methodology behind. It's easier for me to have a function returning the bits left. I realize it is the functional equivalent of maintaining a pointer to the last bit read we would have done in an imperative paradigm. Have a nice day
@Ben Yes, that's a close analogy. More in general, an imperative function which returns type A, while modifying (through assignment / mutation) an external variable B, is often rendered using a state monad, i.e. using a type like B -> (A,B). This, in a sense, reads the old B and produces a new one, along with the result A. The state monad is only a small library around this to make it more convenient for the programmer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.