Unformatted text preview:

CSE399: Advanced ProgrammingHandout 13Functional ParsersWhat is a Parser?A parser is a program that analyzes a piece of text todetermine its structure (and, typically, returns a treerepresenting this structure).The World is Full of Parsers!Almost every real-life program involves some kind ofparsing...•Hugs and GHC parse Haskell programs•Unix shells (bash, sh, etc.) parse shell scripts•Explorer, Mozilla, etc., parse HTML•Command-line utilities parse command lines•etc., etc.Functional ParsersIn Haskell , a parser is naturally viewed as a function:type Parser = String -> TreeFunctional ParsersHowever, a parser might not actually use up the wholestring, so we also return the unused portion of the input:type Parser = String -> (Tree,String)Functional ParsersAlso, a given string might be parseable in many ways(including zero!), so we generalize to a list of results:type Parser = String -> [(Tree,String)]Functional ParsersThe result returned by a parser might not always be a tree,so we generalize once more to make the Par ser typepolymorphic:type Parser a = String -> [(a,Strin g)]Functional ParsersFinally, for the sake of readability, let’s change the typedeclaration into a newtype and add a constructor on theright-hand side. The convenience functi on parse takes aparser and appl ies it to a given string.newtype Parser a = Parser (String -> [(a,String)])parse :: Parser a -> String -> [(a,String )]parse (Parser p) = pPrimitive ParsersParsing an Arbitrary Characte rThe parser item fails if the input is empty, and consumesthe first character otherwise:item :: Parser Charitem = Parser (\cs -> case cs of"" -> [](c:cs) -> [(c,cs)])parse item "hello"⇒ [(’h’,"ello")]parse item ""⇒ []Parsing Nothing (Successfully)The parser return a always succeeds, returning the value awithout consuming any input:returnP :: a -> Parser areturnP a = Parser (\cs -> [(a,cs)])parse (returnP 5) "hello"⇒ [(5,"hello")]Putting Parsers i n S e quencep ‘seqP‘ q is a parser that first applies p and then applies qto e ach result from p.seqP :: Parser a -> (a -> Parser b) -> Parser bp ‘seqP‘ q =Parser(\cs -> concat [parse (q a) cs ’| (a,cs’) <- parse p cs])ExampleparseTwo :: Parser (Char,Char)parseTwo = item‘seqP‘ \x -> item‘seqP‘ \y -> return (x,y)parse parseTwo "hello"⇒ [((’h’,’e’),"llo" )]parse parseTwo "h"⇒ []Note that, if any parser in a sequence fails, then the wholesequence fails.Parsers Are a MonadThe Parser MonadThe defi nitions of returnP and seqP have the right types(and obey the required laws) to make Parser into a monad.instance Monad Parser wherereturn = returnP(>>=) = seqPThe Parser MonadHaving made this instance declaration, we can use do sy ntaxto simplify the presentation of the parseTwo function:parseTwo2 :: Pa rser (Char,Char)parseTwo2 = do x <- itemy <- itemreturn (x,y)More PrimitivesParsing Nothing (Unsuccessfully)The parser zeroP al ways fails:zeroP :: Parser azeroP = Parser (\cs -> [])Parsing a Character If It Satisfies Some TestThe parser sat p behaves like item if the firs t character onthe input string satisfies the predicate p ; othe rwi se it fails.sat :: (Char -> Bool) -> Parser Charsat p = do c <- itemif p c then return c else zeroPparse (sat (==’h’)) "hello"⇒ [(’h’,"ello")]parse (sat (==’x’)) "hello"⇒ []Exampleschar :: Char -> Parser Charchar c = sat (c ==)alphachar :: Parser Charalphachar = sa t isAlphanumchar :: Parser Charnumchar = sat isDigitdigit :: Parser Intdigit = do {c <- numchar; return (ord c - ord ’0’)}(isAlpha and isDigi t come from the Char module in thestandard library.)Nondetermin istic Choicep ‘choose P‘ q yields all the results of applying either p or qto the whol e input string.chooseP :: Parser a -> Parser a -> Parser ap ‘choose P‘ q = Parser(\cs -> parse p cs ++ parse q cs)alphanum :: Parser Charalphanum = alphachar ‘chooseP‘ numcharAnother Examplep = do { x <- item; return ("Got "++[x]) }‘chooseP‘do { x <- item; return ("Parsed "++ [x]) }parse p "xyz"⇒ [("Got x","yz"),( "Parsed x","yz")]Yet Another ExampleThis parser yields a function:addop :: Parser (Int -> Int -> Int)addop = do {char ’+’; return (+)}‘chooseP‘do {char ’-’; return (-)}For example:calc = do x <- digit; op <- addop; y <- d igitreturn (x ‘op‘ y)parse calc "1+2"⇒ [(3,"")]Recursive ParsersRegognizing a St ringstring s i s a parser that recognizes (and returns) exactlythe string s:string :: String -> Parser Stringstring "" = return ""string (c:cs) = do {char c; string cs; return (c:cs)}Parsing a Sequencemany :: Parser a -> Parser [a]many p = many1 p ‘chooseP‘ return []many1 :: Parser a -> Parser [a]many1 p = do {a <- p; as <- many p; return (a:as)}parse (many numchar) "123ab"⇒ [("123","ab"),("1 2","3ab"),("1","23ab"),("","123ab")]A Parser for Ari thmetic Expressionscalc1 = do x <- digitop <- addopy <- calc1return (x ‘op‘ y)‘chooseP‘digitparse calc1 "3+4-1"⇒ [(6,""),(7,"-1"), (3,"+4-1")]Note that, for simplicity, we’re taking + and - to beright-associative for the moment.Query: What happe ns if we exchange the arguments tochooseP?A Complete ParserMultiplication-like OperatorsAs before...mulop :: Parser (Int -> Int -> Int)mulop = do {char ’*’; return (*)}‘chooseP‘do {char ’/’; return (div)}Complete Parserexpr = do x <- term; op <- a ddop; y <- exprreturn (x ‘op‘ y)‘chooseP‘termterm = do x <- factor; op <- mulop; y <- termreturn (x ‘op‘ y)‘chooseP‘factorfactor = digit‘chooseP‘do {char ’(’; n <- expr; char ’)’; return n}parse expr "(3+4)*5"⇒ [(35,""),(7,"*5") ]A Little More Abstractio nNote the similarity in the definitions of expr and term.expr = do x <- term; op <- a ddop; y <- exprreturn (x ‘op‘ y)‘chooseP‘termterm = do x <- factor; op <- mulop; y <- termreturn (x ‘op‘ y)‘chooseP‘factorCan we express them both as instances of a commonabstraction?A Little More Abstractio nThe parser chainl p op consumes a non-empty sequence ofps from the front of the input and combines them together(in the style of foldl) using op.chainl1 :: Parser a -> Parser (a -> a -> a)-> Parser ap ‘chainl 1‘ op =do {a <- p; rest a}whererest a = do {f <- o p; b <- p; rest (f a b)}‘chooseP‘ return aA similar chaining function al so works for empty sequences:chainl :: Parser a -> Parser ( a -> a -> a) -> a-> Parser achainl p op a =(p ‘chainl1‘ op) ‘chooseP‘ return aA Better Arithmetic Expression Parserexpr2,term2,factor2 :: Parser Intexpr2 =


View Full Document

Penn CIS 399 - Functional Parsers

Download Functional Parsers
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Functional Parsers and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Functional Parsers 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?