Rewinding the stack for parsing and pretty printing
Mathieu Boespflug
McGill University
26 July 2011
1 / 31
A little primer on Haskell
◮ The polymorphic type of lists is written [a].
◮ "head has type [a] → a" is written head :: [a] → a.
◮ The unit value and the unit type are both written (): () :: ()
◮ New datatype: data Maybe a = Nothing | Just a
◮ Type synonym: type Env a = [(Int, a)]
2 / 31
The Problem 3 / 31
What is a parser?
    newtype P a = P (String → Either Error (a, String))

    fail    :: Error → Either Error (a, String)
    success :: (a, String) → Either Error (a, String)

    lit :: String → P ()
    lit x = P (λ s → case stripPrefix x s of
                       Nothing → fail "Parse error."
                       Just s′ → success ((), s′))

    true :: P Bool
    true = P (λ s → case stripPrefix "true" s of
                      Nothing → fail "Parse error."
                      Just s′ → success (True, s′))

    false :: P Bool
    false = P (λ s → case stripPrefix "false" s of
                       Nothing → fail "Parse error."
                       Just s′ → success (False, s′))
4 / 31
Combining parsers
◮ Parsers are monadic actions.
◮ They can be built compositionally from existing parser combinators, which are also monadic actions.

Simplified definitions, with the Either Error wrapper elided:
    pure f = P (λ s → (f, s))
    P m ⊗ P k = P (λ s → case m s of
                           (f, s′) → case k s′ of
                                       (x, s′′) → (f x, s′′))
5 / 31
Example parser
    pure :: a → P a
    (⊗)  :: P (a → b) → P a → P b
    (⊕)  :: P a → P a → P a

    E ::= true | false | if E then E else E

    data Tm = Boolean Bool | If Tm Tm Tm

    tm :: P Tm
    tm = pure Boolean ⊗ true
       ⊕ pure Boolean ⊗ false
       ⊕ pure (λ x y z → If x y z) ⊗ lit "if" ⊗ tm ⊗ lit "then" ⊗ tm ⊗ lit "else" ⊗ tm
6 / 31
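The combinators on this slide correspond closely to the standard Applicative and Alternative classes. The following is a minimal sketch under stated assumptions, not the author's implementation: `P` is a newtype, `Error` is assumed to be `String`, `<*>` and `<|>` stand in for ⊗ and ⊕, choice simply retries the second parser on the original input, and the unit results of `lit` are discarded with `<$` and `<*`. Whitespace is handled by baking spaces into the literals.

```haskell
import Control.Applicative (Alternative (..))
import Data.List (stripPrefix)

-- Assumption: errors are plain strings.
type Error = String

newtype P a = P { runP :: String -> Either Error (a, String) }

instance Functor P where
  fmap f (P m) = P $ \s -> case m s of
    Left e        -> Left e
    Right (x, s') -> Right (f x, s')

instance Applicative P where
  pure x = P $ \s -> Right (x, s)
  -- Run the function parser, then the argument parser on the remaining input.
  P m <*> P k = P $ \s -> case m s of
    Left e        -> Left e
    Right (f, s') -> case k s' of
      Left e         -> Left e
      Right (x, s'') -> Right (f x, s'')

instance Alternative P where
  empty = P $ \_ -> Left "Parse error."
  -- Try the left parser; on failure, run the right one on the same input.
  P m <|> P k = P $ \s -> case m s of
    Left _ -> k s
    r      -> r

-- Match a literal prefix of the input.
lit :: String -> P ()
lit x = P $ \s -> case stripPrefix x s of
  Nothing -> Left "Parse error."
  Just s' -> Right ((), s')

data Tm = Boolean Bool | If Tm Tm Tm
  deriving (Eq, Show)

-- E ::= true | false | if E then E else E
tm :: P Tm
tm =  Boolean True  <$ lit "true"
  <|> Boolean False <$ lit "false"
  <|> If <$ lit "if " <*> tm <* lit " then " <*> tm <* lit " else " <*> tm
```

With these definitions, `runP tm "if true then false else true"` evaluates to `Right (If (Boolean True) (Boolean False) (Boolean True), "")`, and input that matches no alternative yields `Left "Parse error."`.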
Example Pretty Printer
    E ::= true | false | if E then E else E

    data Tm = Boolean Bool | If Tm Tm Tm

    tm t = case t of
      Boolean True  → "true"
      Boolean False → "false"
      If x y z      → "if " ++ tm x ++ " then " ++ tm y ++ " else " ++ tm z
7 / 31
Objective:
◮ Write the parser once, get the pretty printer for free.
◮ Write the pretty printer once, get the parser for free.
Why?
◮ Synchrony!
◮ Synchrony means easier to maintain.
◮ Synchrony means less code.
◮ Less code means fewer bugs.
◮ Pollack consistency.
How?
◮ Write both at the same time.
8 / 31
The Solution 9 / 31
A Cassette 10 / 31
A Kassette in Haskell
    data K7 a b = K7 { sideA :: a, sideB :: b }

    (♦) :: K7 (b → c) (c → b) → K7 (a → b) (b → a) → K7 (a → c) (c → a)
    ~(K7 f f′) ♦ ~(K7 g g′) = K7 (f ◦ g) (g′ ◦ f′)
11 / 31
The category of cassettes
Can overload (◦) with (♦):
    class Category κ where
      id  :: κ a a
      (◦) :: κ b c → κ a b → κ a c

    instance Category K7 where
      id  = K7 id id
      (◦) = (♦)
12 / 31
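Read literally, `K7 id id` has a type of the form `K7 (a → a) (b → b)` rather than `K7 a a`, so the instance on the slide is best taken as shorthand. One way to make it compile is to wrap the cassette of two opposite functions in a newtype. The sketch below assumes a hypothetical wrapper named `Sym` (not taken from the slides) and omits the lazy patterns for simplicity.

```haskell
import Prelude hiding (id, (.))
import Control.Category

data K7 a b = K7 { sideA :: a, sideB :: b }

-- Hypothetical wrapper: a cassette holding a function and one going the other way.
newtype Sym a b = Sym { unSym :: K7 (a -> b) (b -> a) }

instance Category Sym where
  id = Sym (K7 id id)
  -- Side A composes forwards, side B composes in the opposite order.
  Sym (K7 f f') . Sym (K7 g g') = Sym (K7 (f . g) (g' . f'))

increment :: Sym Int Int
increment = Sym (K7 (+ 1) (subtract 1))

double :: Sym Int Int
double = Sym (K7 (* 2) (`div` 2))

-- Forwards: increment, then double. Backwards: halve, then decrement.
both :: Sym Int Int
both = double . increment

forward :: Sym a b -> a -> b
forward = sideA . unSym

backward :: Sym a b -> b -> a
backward = sideB . unSym
```

Here `forward both 3` gives `8` and `backward both 8` gives `3`, which illustrates how composing cassettes keeps the two sides in step.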
Sequencing 13 / 31
A tentative parsing and pretty printing cassette
    type PP a = K7 (String → Either Error (a, String)) (??)

Candidate types for the printing side (??), in slide order:
    (a, String) → String
    Either Error (a, String) → String
    (a, String) → String
    a → String → String

With the last candidate:
    type PP a = K7 (String → Either Error (a, String)) (a → String → String)

    pure (λ x y z → If x y z)      :: P (Tm → Tm → Tm → Tm)
    pure (λ x y z → If x y z) ⊗ tm :: P (Tm → Tm → Tm)

Curried:
    K7 (pure (λ x y z → If x y z)) (??)
      :: K7 (String → (Tm → Tm → Tm → Tm, String))
            ((Tm → Tm → Tm → Tm) → String → String)

Uncurried:
    K7 (pure (λ (x, y, z) → If x y z)) (??)
      :: K7 (String → ((Tm, Tm, Tm) → Tm, String))
            ((Tm → (Tm, Tm, Tm)) → String → String)
14 / 31
The problem
To summarize:
◮ Need uncurried functions so that the type to parse and the type to pretty print match.
◮ Can inductively construct the curried function type a₁ → (a₂ → (... → aₙ)).
◮ The uncurried function type (a₁, a₂, ..., aₙ₋₁) → aₙ cannot be inductively constructed.
◮ Cannot feed arguments to an uncurried function incrementally.
◮ Taking tuples as arguments and returning tuples breaks composability.
15 / 31
Recovering symmetry with continuation passing style
Type of consumer in CPS:
    (a₁ → ... → aₙ → r) → r
Type of producer in CPS:
    r → a₁ → ... → aₙ → r

Type of parser in CPS:
    (String → a₁ → ... → aₙ → r) → String → r
Type of pretty printer in CPS:
    (String → r) → String → a₁ → ... → aₙ → r

The same types with explicit parentheses:
    parser:         (String → a₁ → ... → aₙ → r) → (String → r)
    pretty printer: (String → r) → (String → a₁ → ... → aₙ → r)

◮ Both producer and consumer can be curried!
◮ Complete symmetry.
16 / 31
Composing parsers in CPS
    f     :: (String → b → r₁) → (String → r₁)
    g     :: (String → a → r₂) → (String → r₂)
    f ◦ g :: (String → a → b → r₁) → (String → r₁)
Unification constraint: r₂ = b → r₁.
17 / 31
Composing pretty printers in CPS (Danvy, 1998)
    f′      :: (String → r₁) → (String → b → r₁)
    g′      :: (String → r₂) → (String → a → r₂)
    g′ ◦ f′ :: (String → r₁) → (String → a → b → r₁)
Unification constraint: r₂ = b → r₁.
18 / 31
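The two composition rules can be checked with ordinary function composition. The sketch below is illustrative only: the names `charP`, `digitP`, `charPr`, and `digitPr` are hypothetical, parse failure is modelled with `error` because the slide types have no failure case, and `digitPr` assumes a single digit between 0 and 9. It instantiates f with a character parser and g with a digit parser, so that a = Int and b = Char.

```haskell
import Data.Char (digitToInt, intToDigit, isDigit)

-- CPS parser: consume input, then pass the rest and one value to k.
type Parser a r = (String -> a -> r) -> (String -> r)

-- CPS printer: take the output so far and one value, then call k.
type Printer a r = (String -> r) -> (String -> a -> r)

charP :: Parser Char r
charP k (c : cs) = k cs c
charP _ []       = error "charP: unexpected end of input"

charPr :: Printer Char r
charPr k s c = k (c : s)

digitP :: Parser Int r
digitP k (c : cs) | isDigit c = k cs (digitToInt c)
digitP _ _ = error "digitP: expected a digit"

digitPr :: Printer Int r
digitPr k s n = k (intToDigit n : s)

-- f ◦ g with f = charP, g = digitP; here r2 is unified with Char -> r.
charThenDigit :: (String -> Int -> Char -> r) -> (String -> r)
charThenDigit = charP . digitP

-- g' ◦ f' with f' = charPr, g' = digitPr; the same unification applies.
printCharDigit :: (String -> r) -> (String -> Int -> Char -> r)
printCharDigit = digitPr . charPr

-- The continuation receives the digit (parsed second) before the character.
parsePair :: String -> (Int, Char)
parsePair = charThenDigit (\_rest n c -> (n, c))

printPair :: Int -> Char -> String
printPair = printCharDigit id ""
```

With these definitions `parsePair "x7"` is `(7, 'x')` and `printPair 7 'x'` is `"x7"`: the parser and the printer take their values in the same argument order, which is the symmetry the two slides describe.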
Putting it all together
    K7 f f′ ◦ K7 g g′
      :: K7 ((String → a → b → r) → (String → r))
            ((String → r) → (String → a → b → r))
19 / 31
{0,1}-parsers and {0,1}-printers
Quantify over the answer type:
    type PPP a = ∀ r. K7 ((String → a → r) → (String → r))
                         ((String → r) → (String → a → r))
    type PPP0  = ∀ r. K7 ((String → r) → (String → r))
                         ((String → r) → (String → r))
◮ Not closed under composition!
◮ Compose an n-parser with a (pure) n-consumer to get a 1-parser.
◮ Compose an n-printer with a (pure) n-producer to get a 1-printer.
◮ Parser-consumer and printer-producer composition is written using (−→), an alias for (♦) with lower precedence.
20 / 31
Example: parsing and printing pairs
    lit :: String → PPP0
    lit x = K7 (λ k s → case stripPrefix x s of Just s′ → k s′)
               (λ k s → k (x ++ s))

    anyChar :: PPP Char
    anyChar = K7 (λ k s → k (tail s) (head s))
                 (λ k s x → k ([x] ++ s))

    kpair :: K7 ((String → (a, b) → r) → (String → b → a → r))
                ((String → b → a → r) → (String → (a, b) → r))
    kpair = K7 (λ k s y x → k s (x, y))
               (λ k s (x, y) → k s y x)
21 / 31
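These pieces can be assembled into a single cassette that both parses and prints a pair of characters written as `(x,y)`. The following is a sketch under stated assumptions rather than the talk's own code: the ASCII operator `<.>` is a stand-in for (♦), the answer type `r` is exposed as a type parameter instead of being ∀-quantified (to avoid language extensions), lazy patterns are dropped, and parse failure is modelled with `error`.

```haskell
import Data.List (stripPrefix)

data K7 a b = K7 { sideA :: a, sideB :: b }

-- Stand-in for (♦): side A composes forwards, side B backwards.
infixr 9 <.>
(<.>) :: K7 (b -> c) (c -> b) -> K7 (a -> b) (b -> a) -> K7 (a -> c) (c -> a)
K7 f f' <.> K7 g g' = K7 (f . g) (g' . f')

-- Answer type r kept as an explicit parameter (assumption of this sketch).
type PP0 r  = K7 ((String -> r) -> (String -> r)) ((String -> r) -> (String -> r))
type PP a r = K7 ((String -> a -> r) -> (String -> r)) ((String -> r) -> (String -> a -> r))

lit :: String -> PP0 r
lit x = K7
  (\k s -> case stripPrefix x s of
      Just s' -> k s'
      Nothing -> error ("lit: expected " ++ show x))
  (\k s -> k (x ++ s))

anyChar :: PP Char r
anyChar = K7 parseSide printSide
  where
    parseSide k (c : cs) = k cs c
    parseSide _ []       = error "anyChar: unexpected end of input"
    printSide k s x      = k (x : s)

kpair :: K7 ((String -> (a, b) -> r) -> (String -> b -> a -> r))
            ((String -> b -> a -> r) -> (String -> (a, b) -> r))
kpair = K7 (\k s y x -> k s (x, y)) (\k s (x, y) -> k s y x)

-- A 2-parser/2-printer composed with the pure pair consumer/producer.
charPair :: PP (Char, Char) r
charPair = lit "(" <.> anyChar <.> lit "," <.> anyChar <.> lit ")" <.> kpair

parsePair :: String -> (Char, Char)
parsePair = sideA charPair (\_rest p -> p)

printPair :: (Char, Char) -> String
printPair = sideB charPair id ""
```

In this sketch `parsePair "(a,b)"` yields `('a','b')` and `printPair ('a','b')` yields `"(a,b)"`, both obtained from the single definition `charPair`.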