# Crafting Interpreters in Haskell - Parsing Expressions

In this post, we will continue from where we left off on the Scanner. Today, we implement a parser close to the one described in Parsing Expressions

We follow the same approach as earlier. That is, we will continue to use `Parsec`

and try to stay close to the implementation of the book as possible. The module presented in this post can be found at ExprParser.hs

Let’s first get some `import`

statements out of the way.

```
--- Scanner.hs
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NoImplicitPrelude #-}
module ExprParser where
import Data.Text as T
import Import hiding (many, try, (<|>))
import Scanner
import Text.Parsec
```

# Layout

# Expression Grammar

As outlined in the book Chapter 6, the expression grammar we want our parser to understand is given by:

```
expression → literal
| unary
| binary
| grouping ;
literal → NUMBER | STRING | "true" | "false" | "nil" ;
grouping → "(" expression ")" ;
unary → ( "-" | "!" ) expression ;
binary → expression operator expression ;
operator → "==" | "!=" | "<" | "<=" | ">" | ">="
| "+" | "-" | "*" | "/" ;
```

This grammar will represent the AST we will build after the parser parses the `LoxTokInfo`

tokens that were created previously in the `Scanning`

step. Therefore, we need to represent the above grammar in Haskell as follows:

```
data BinOp = NotEqual | EqualEqual | Gt | Gte | Lt | Lte | Plus | Minus | Star | Slash
deriving (Show, Eq)
data UnaryOp = UnaryMinus | UnaryBang deriving (Show, Eq)
data Expr
= Number Double -- NUMBER
| Literal T.Text -- STRING
| LoxBool Bool -- "true", "false"
| LoxNil -- 'nil'
| Paren Expr -- grouping
| Unary UnaryOp Expr -- unary
| Binary Expr BinOp Expr -- binary
deriving (Show, Eq)
```

The comments provided besides each data constructor ties them back to the productions in the expression grammar that is presented above. This `data Expr`

will be the AST that our parser will produce using the same top-down approach we took in the previous chapter.

# Making Parsec Work on LoxTokInfo

Similar to `scanning`

step we want to use the `Parsec`

library to parse the `LoxTokInfo`

tokens we created in the `lex`

step. So far we have used `Parsec`

to parse either `String`

or `Char`

data. Now, we need a way to use `Parsec`

to scan input which are a list of `LoxTokInfo`

tokens. We can do that by implementing our own `satisfy`

version as shown below.

```
satisfyT :: (LoxTokInfo -> Maybe a) -> Parser a
= tokenPrim showTok updateTokPos match
satisfyT f where
= show $ tokinfo_type ti
showTok ti : _) = tok_position s
updateTokPos _ _ (s = pos
updateTokPos pos _ [] = f t match t
```

The `satisfyT`

is implemented over the `tokenPrim`

parser. The `tokenPrim`

combinator is the most primitive combinator that all other combinators can be built out of. Since, we want all are combinators to work with `LoxTokInfo`

, we can build the `satisfyT`

function over this combinator and then use the `satisfyT`

function is all other other combinators you will build.

More detail description of we this works can be found here. For now, is is sufficient to understand the `tokenPrim`

return the `Parser a`

that satisfies a condition.

Here the `a`

is generic in `satisfyT :: (LoxTokInfo -> Maybe a) -> Parser a`

. In most cases the `a`

will be the `Expr`

type itself. This will become clear as we look at other combinators.

# Implementing Parser

Now, that we have the `satisfyT`

function, we are ready to build other parser combinators. Let first define a type alias that will represent the `parser`

each of our combinators will be returning.

`type Parser a = ParsecT [LoxTokInfo] () Identity a`

In Chapter 6, the shape of recursive-descent parser we want to build is given by the following grammar that is suggested. The grammar suggested takes care of required precedence and associativity relationships.

```
expression → equality ;
equality → comparison ( ( "!=" | "==" ) comparison )* ;
comparison → term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
term → factor ( ( "-" | "+" ) factor )* ;
factor → unary ( ( "/" | "*" ) unary )* ;
unary → ( "!" | "-" ) unary
| primary ;
primary → NUMBER | STRING | "true" | "false" | "nil"
| "(" expression ")" ;
```

This grammar will guide us through the combinators we have to build. First let’s build some combinators for the `primary`

productions. You will notice that these combinator uses the `satisfyT`

function to check the `LoxTokInfo`

they need to match and accordingly return an `Expr`

object.

We have `number`

, `literal`

, `loxBool`

, `loxNil`

and `loxParenExpr`

all lined up with a `choice`

operator in `loxPrimary`

.

As you will notice, each of the `Parsers`

returned by the combinators, take a `LoxTokInfo`

as input, and if that token matches (via `satisfyT`

), then we will return a `Expr`

.

```
number :: Parser Expr
= satisfyT f
number where
LoxTokInfo (NUMBER x) _ _ _) = Just (Number x)
f (= Nothing
f _
literal :: Parser Expr
= satisfyT f
literal where
LoxTokInfo (STRING x) _ _ _) = Just (Literal $ T.pack x)
f (= Nothing
f _
loxBool :: Parser Expr
= satisfyT f
loxBool where
LoxTokInfo TRUE _ _ _) = Just (LoxBool True)
f (LoxTokInfo FALSE _ _ _) = Just (LoxBool False)
f (= Nothing
f _
loxNil :: Parser Expr
= satisfyT f
loxNil where
LoxTokInfo NIL _ _ _) = Just LoxNil
f (= Nothing
f _
loxParenExpr :: Parser Expr
= do
loxParenExpr *> loxExpr <* satisfyT parenClose
satisfyT parenOpen where
-- use LoxNil as placeholder, since we do not have an equilivalent Expr for Paren
LoxTokInfo LEFT_PAREN _ _ _) = Just ()
parenOpen (= Nothing
parenOpen _
LoxTokInfo RIGHT_PAREN _ _ _) = Just ()
parenClose (= Nothing
parenClose _
loxPrimary :: Parser Expr
= number <|> literal <|> loxBool <|> loxNil <|> loxParenExpr loxPrimary
```

Once we have the primitives, it is just a matter off expanding out each production(line) of the grammar into its own function. But, before that you will notice a common pattern across all expressions. For example,we need to match `comparison`

to one or more of its instances. But, we also want to make sure these are all `left-associative`

.

To help with that, we have a helper function called `leftChain`

, which closely simulates the `chainl`

function in `Parsec`

. We have to implement our own function, since we want a different type for both `p`

and `op`

that will be passed to the `leftChain`

function. The existing definition of `chainl`

does not work for us.

```
-- this is similar to chainl in `Text.Parsec` but works on `BinOp`
-- adopted from https://jakewheat.github.io/intro_to_parsing/
leftChain :: Parser Expr -> Parser BinOp -> Parser Expr
= do
leftChain p op <- p
expr
maybeAddSuffix exprwhere
= do
addSuffix e0 <- op
op' <- p
e1 Binary e0 op' e1)
maybeAddSuffix (
= addSuffix e <|> return e maybeAddSuffix e
```

Once we have the `leftChain`

function which just need to add a function for each production. For example, for

```
unary → ( "!" | "-" ) unary
| primary ;
```

we have,

```
unary' :: Parser Expr
= Unary <$> satisfyT f <*> unary
unary' where
LoxTokInfo BANG _ _ _) = Just UnaryBang
f (LoxTokInfo MINUS _ _ _) = Just UnaryMinus
f (= Nothing
f _
unary :: Parser Expr
= unary' <|> loxPrimary unary
```

We follow the same pattern for `factor`

, `term`

and `comparison`

. You will notice that similar to the grammar, the dependency goes from `equality -> comparison -> term -> factor -> unary`

. The only difference in each of these functions are the `BinOp`

we match.

```
factor :: Parser Expr
= leftChain unary (satisfyT f)
factor where
= case tokinfo_type x of
f x STAR -> Just Star
SLASH -> Just Slash
-> Nothing
_
term :: Parser Expr
= leftChain factor (satisfyT f)
term where
= case tokinfo_type x of
f x MINUS -> Just Minus
PLUS -> Just Plus
-> Nothing
_
comparison :: Parser Expr
= leftChain term (satisfyT f)
comparison where
= case tokinfo_type x of
f x GREATER -> Just Gt
GREATER_EQUAL -> Just Gte
LESS -> Just Lt
LESS_EQUAL -> Just Lte
-> Nothing _
```

Finally, we can add our helper scanner function:

```
scannerLoxTokens :: [LoxTokInfo] -> LoxParserResult
= parse loxExpr "" scannerLoxTokens
```

The concludes the implementation of the `Parser`

.

# Examples

```
-- term
>>P.parse equality "" $ fromRight [] (scanner "1+2;")
λ Right (Binary (Number 1.0) Plus (Number 2.0))
-- factor
>>P.parse equality "" $ fromRight [] (scanner "1*2/3;")
λ Right (Binary (Binary (Number 1.0) Star (Number 2.0)) Slash (Number 3.0))
-- comparisons
>>P.parse equality "" $ fromRight [] (scanner "1>2<=3;")
λ Right (Binary (Binary (Number 1.0) Gt (Number 2.0)) Lte (Number 3.0))
-- equality
>>P.parse equality "" $ fromRight [] (scanner "1>2==3;")
λ Right (Binary (Binary (Number 1.0) Gt (Number 2.0)) EqualEqual (Number 3.0))
```

# Conclusion

That concludes the implementation of the expression parser. With this we have laid of the groundwork to extend the suite of combinators we have and support more language features in later chapters.

I would like to thank bss03 for providing good ideas here to refine the implemenation of `satistyT`

.