CSE399: Advanced ProgrammingHandout 13Functional ParsersWhat is a Parser?A parser is a program that analyzes a piece of text todetermine its structure (and, typically, returns a treerepresenting this structure).The World is Full of Parsers!Almost every real-life program involves some kind ofparsing...•Hugs and GHC parse Haskell programs•Unix shells (bash, sh, etc.) parse shell scripts•Explorer, Mozilla, etc., parse HTML•Command-line utilities parse command lines•etc., etc.Functional ParsersIn Haskell , a parser is naturally viewed as a function:type Parser = String -> TreeFunctional ParsersHowever, a parser might not actually use up the wholestring, so we also return the unused portion of the input:type Parser = String -> (Tree,String)Functional ParsersAlso, a given string might be parseable in many ways(including zero!), so we generalize to a list of results:type Parser = String -> [(Tree,String)]Functional ParsersThe result returned by a parser might not always be a tree,so we generalize once more to make the Par ser typepolymorphic:type Parser a = String -> [(a,Strin g)]Functional ParsersFinally, for the sake of readability, let’s change the typedeclaration into a newtype and add a constructor on theright-hand side. The convenience functi on parse takes aparser and appl ies it to a given string.newtype Parser a = Parser (String -> [(a,String)])parse :: Parser a -> String -> [(a,String )]parse (Parser p) = pPrimitive ParsersParsing an Arbitrary Characte rThe parser item fails if the input is empty, and consumesthe first character otherwise:item :: Parser Charitem = Parser (\cs -> case cs of"" -> [](c:cs) -> [(c,cs)])parse item "hello"⇒ [(’h’,"ello")]parse item ""⇒ []Parsing Nothing (Successfully)The parser return a always succeeds, returning the value awithout consuming any input:returnP :: a -> Parser areturnP a = Parser (\cs -> [(a,cs)])parse (returnP 5) "hello"⇒ [(5,"hello")]Putting Parsers i n S e quencep ‘seqP‘ q is a parser that first applies p and then applies qto e ach result from p.seqP :: Parser a -> (a -> Parser b) -> Parser bp ‘seqP‘ q =Parser(\cs -> concat [parse (q a) cs ’| (a,cs’) <- parse p cs])ExampleparseTwo :: Parser (Char,Char)parseTwo = item‘seqP‘ \x -> item‘seqP‘ \y -> return (x,y)parse parseTwo "hello"⇒ [((’h’,’e’),"llo" )]parse parseTwo "h"⇒ []Note that, if any parser in a sequence fails, then the wholesequence fails.Parsers Are a MonadThe Parser MonadThe defi nitions of returnP and seqP have the right types(and obey the required laws) to make Parser into a monad.instance Monad Parser wherereturn = returnP(>>=) = seqPThe Parser MonadHaving made this instance declaration, we can use do sy ntaxto simplify the presentation of the parseTwo function:parseTwo2 :: Pa rser (Char,Char)parseTwo2 = do x <- itemy <- itemreturn (x,y)More PrimitivesParsing Nothing (Unsuccessfully)The parser zeroP al ways fails:zeroP :: Parser azeroP = Parser (\cs -> [])Parsing a Character If It Satisfies Some TestThe parser sat p behaves like item if the firs t character onthe input string satisfies the predicate p ; othe rwi se it fails.sat :: (Char -> Bool) -> Parser Charsat p = do c <- itemif p c then return c else zeroPparse (sat (==’h’)) "hello"⇒ [(’h’,"ello")]parse (sat (==’x’)) "hello"⇒ []Exampleschar :: Char -> Parser Charchar c = sat (c ==)alphachar :: Parser Charalphachar = sa t isAlphanumchar :: Parser Charnumchar = sat isDigitdigit :: Parser Intdigit = do {c <- numchar; return (ord c - ord ’0’)}(isAlpha and isDigi t come from the Char module in thestandard library.)Nondetermin istic Choicep ‘choose P‘ q yields all the results of applying either p or qto the whol e input string.chooseP :: Parser a -> Parser a -> Parser ap ‘choose P‘ q = Parser(\cs -> parse p cs ++ parse q cs)alphanum :: Parser Charalphanum = alphachar ‘chooseP‘ numcharAnother Examplep = do { x <- item; return ("Got "++[x]) }‘chooseP‘do { x <- item; return ("Parsed "++ [x]) }parse p "xyz"⇒ [("Got x","yz"),( "Parsed x","yz")]Yet Another ExampleThis parser yields a function:addop :: Parser (Int -> Int -> Int)addop = do {char ’+’; return (+)}‘chooseP‘do {char ’-’; return (-)}For example:calc = do x <- digit; op <- addop; y <- d igitreturn (x ‘op‘ y)parse calc "1+2"⇒ [(3,"")]Recursive ParsersRegognizing a St ringstring s i s a parser that recognizes (and returns) exactlythe string s:string :: String -> Parser Stringstring "" = return ""string (c:cs) = do {char c; string cs; return (c:cs)}Parsing a Sequencemany :: Parser a -> Parser [a]many p = many1 p ‘chooseP‘ return []many1 :: Parser a -> Parser [a]many1 p = do {a <- p; as <- many p; return (a:as)}parse (many numchar) "123ab"⇒ [("123","ab"),("1 2","3ab"),("1","23ab"),("","123ab")]A Parser for Ari thmetic Expressionscalc1 = do x <- digitop <- addopy <- calc1return (x ‘op‘ y)‘chooseP‘digitparse calc1 "3+4-1"⇒ [(6,""),(7,"-1"), (3,"+4-1")]Note that, for simplicity, we’re taking + and - to beright-associative for the moment.Query: What happe ns if we exchange the arguments tochooseP?A Complete ParserMultiplication-like OperatorsAs before...mulop :: Parser (Int -> Int -> Int)mulop = do {char ’*’; return (*)}‘chooseP‘do {char ’/’; return (div)}Complete Parserexpr = do x <- term; op <- a ddop; y <- exprreturn (x ‘op‘ y)‘chooseP‘termterm = do x <- factor; op <- mulop; y <- termreturn (x ‘op‘ y)‘chooseP‘factorfactor = digit‘chooseP‘do {char ’(’; n <- expr; char ’)’; return n}parse expr "(3+4)*5"⇒ [(35,""),(7,"*5") ]A Little More Abstractio nNote the similarity in the definitions of expr and term.expr = do x <- term; op <- a ddop; y <- exprreturn (x ‘op‘ y)‘chooseP‘termterm = do x <- factor; op <- mulop; y <- termreturn (x ‘op‘ y)‘chooseP‘factorCan we express them both as instances of a commonabstraction?A Little More Abstractio nThe parser chainl p op consumes a non-empty sequence ofps from the front of the input and combines them together(in the style of foldl) using op.chainl1 :: Parser a -> Parser (a -> a -> a)-> Parser ap ‘chainl 1‘ op =do {a <- p; rest a}whererest a = do {f <- o p; b <- p; rest (f a b)}‘chooseP‘ return aA similar chaining function al so works for empty sequences:chainl :: Parser a -> Parser ( a -> a -> a) -> a-> Parser achainl p op a =(p ‘chainl1‘ op) ‘chooseP‘ return aA Better Arithmetic Expression Parserexpr2,term2,factor2 :: Parser Intexpr2 =
View Full Document