Secure and Efficient Parsing via Programming Language Theory
Neel Krishnaswami & Jeremy Yallop
Outline: parsing & security, types & algebras, staging & speed, speed & correctness
Parsing and security
Parser combinators: appeal

simplicity: parsers are functions

    star :: Parser a → Parser [a]
    star p = ps ⊕ empty
      where empty = return []
            ps = do x ← p
                    xs ← star p
                    return (x : xs)

declarative: parsers resemble BNF

    sexp ::= LPAREN sexp* RPAREN
           | ATOM

    sexp = (lparen >> star sexp >> rparen)
         ⊕ atom
Parser combinators: pitfalls

declarative? not in practice: p ⊕ q ≢ q ⊕ p
exponential (or worse) complexity: (demonstration)
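The order-dependence of choice can be sketched with a deliberately naive combinator library. Everything in this block (the `parser` type, `str`, `explode`, and the ASCII operator `<|>` standing in for ⊕) is a hypothetical illustration, not code from the talk; it assumes a left-biased choice that commits to the first alternative that succeeds. It shows only the non-commutativity, not the exponential behaviour.

```ocaml
(* A deliberately naive parser type: consume a prefix of the input and
   return the parsed value together with the remaining characters. *)
type 'a parser = char list -> ('a * char list) option

(* Turn a string into a list of characters. *)
let explode s = List.init (String.length s) (String.get s)

(* Match a literal string at the start of the input. *)
let str (lit : string) : string parser = fun input ->
  let rec go cs rest =
    match cs, rest with
    | [], _ -> Some (lit, rest)
    | c :: cs', r :: rest' when c = r -> go cs' rest'
    | _ -> None
  in
  go (explode lit) input

(* Left-biased choice (assumed here): try p, fall back to q only if p fails. *)
let ( <|> ) (p : 'a parser) (q : 'a parser) : 'a parser = fun input ->
  match p input with
  | Some _ as r -> r
  | None -> q input

let input = explode "ab"

(* Same two alternatives, opposite order, different results. *)
let r1 = (str "a" <|> str "ab") input   (* stops after "a", leaving "b" *)
let r2 = (str "ab" <|> str "a") input   (* consumes all of "ab" *)
```

Swapping the operands changes both the value returned and the input left over, so the grammar no longer reads as an order-independent BNF alternative.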
ASP and its aims

asp: a combinator library with an unusual combination of features

    Conventional Interface
    Unsurprising Semantics
    Guaranteed Determinism
    Competitive Performance
ASP: interface

Abstract grammar interface (context-free expressions)

    type α t
    val eps: unit t
    val chr: char → char t
    val seq: α t → β t → (α * β) t
    val bot: α t
    val alt: α t → α t → α t
    val fix: (α t → α t) → α t
    val map: (α → β) → α t → β t

    Real interface: arbitrary tokens

User-defined functions (also star, plus, infix, &c.)

    let option r = alt (map (fun _ → None) eps)
                       (map (fun x → Some x) r)

Parsers from grammars

    val parser: α t → (char Stream.t → α)

    Imperative stream
    May fail!
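To make the "user-defined functions" point concrete, here is a minimal sketch that restates the grammar interface as an OCaml module type and derives `option` and `star` from it inside a functor. The names `ASP`, `Derived` and `Show`, and this particular definition of `star`, are assumptions made for illustration; `parser` is left out so the sketch does not depend on any particular stream type. `Show` is a toy instance that merely renders a grammar as a string, so the block can be run without the real library.

```ocaml
(* The grammar interface from the slide, in plain ASCII OCaml. *)
module type ASP = sig
  type 'a t
  val eps : unit t
  val chr : char -> char t
  val seq : 'a t -> 'b t -> ('a * 'b) t
  val bot : 'a t
  val alt : 'a t -> 'a t -> 'a t
  val fix : ('a t -> 'a t) -> 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* Derived combinators, written only against the interface. *)
module Derived (P : ASP) = struct
  open P

  (* As on the slide: either nothing, or one r. *)
  let option r = alt (map (fun _ -> None) eps) (map (fun x -> Some x) r)

  (* Assumed definition: zero or more p, via the fixed-point combinator. *)
  let star p =
    fix (fun ps ->
        alt (map (fun () -> []) eps)
            (map (fun (x, xs) -> x :: xs) (seq p ps)))
end

(* Toy instance (hypothetical): a grammar is just its printed form.
   Nested uses of fix would all reuse the name "x". *)
module Show : ASP with type 'a t = string = struct
  type 'a t = string
  let eps = "eps"
  let chr c = String.make 1 c
  let seq a b = "(" ^ a ^ " . " ^ b ^ ")"
  let bot = "bot"
  let alt a b = "(" ^ a ^ " | " ^ b ^ ")"
  let fix f = "mu x. " ^ f "x"
  let map _ a = a
end

module D = Derived (Show)

let shown_star = D.star (Show.chr 'a')
let shown_option = D.option (Show.chr 'b')
```

Because `Derived` sees only the signature, the same `option` and `star` would work unchanged with any implementation of the interface.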
accepted or rejected?

    seq (chr 'a') (option (chr 'b'))            ✓

    seq (option (chr 'a')) (option (chr 'a'))   ✗  sequential non-determinism

    alt (map (fun _ → 1) (chr 'a'))
        (map (fun _ → 2) (chr 'b'))             ✓

    alt (map (fun _ → 1) (chr 'a'))
        (map (fun _ → 2) (chr 'a'))             ✗  disjunctive non-determinism

(also reject: left recursion, non-left-factored)

Plan: use a type system to decide
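One way such a check could work is sketched below. It is a simplified stand-in, not the type system from the talk: the rules are assumptions based on three properties of a grammar (whether it accepts the empty string, which characters can start a match, and which characters can follow the end of a match). `fix` and `map` are omitted (`map` does not change what is accepted), and all names (`g`, `ty`, `check`, `verdict`, `Rejected`) are hypothetical.

```ocaml
module CS = Set.Make (Char)

(* Grammars without fix; map is dropped because it does not affect the check. *)
type g = Bot | Eps | Chr of char | Seq of g * g | Alt of g * g

(* null: accepts ""; first: possible first characters;
   flast: characters that may follow the end of a match. *)
type ty = { null : bool; first : CS.t; flast : CS.t }

exception Rejected of string

let disjoint a b = CS.is_empty (CS.inter a b)

let rec check : g -> ty = function
  | Bot -> { null = false; first = CS.empty; flast = CS.empty }
  | Eps -> { null = true; first = CS.empty; flast = CS.empty }
  | Chr c -> { null = false; first = CS.singleton c; flast = CS.empty }
  | Seq (g1, g2) ->
    let t1 = check g1 in
    let t2 = check g2 in
    (* Assumed rule: the left part must consume input, and it must be
       clear from the next character where it ends. *)
    if t1.null || not (disjoint t1.flast t2.first)
    then raise (Rejected "sequential non-determinism");
    { null = false;
      first = t1.first;
      flast = CS.union t2.flast
          (if t2.null then CS.union t2.first t1.flast else CS.empty) }
  | Alt (g1, g2) ->
    let t1 = check g1 in
    let t2 = check g2 in
    (* Assumed rule: one character of lookahead must pick the branch. *)
    if (t1.null && t2.null) || not (disjoint t1.first t2.first)
    then raise (Rejected "disjunctive non-determinism");
    { null = t1.null || t2.null;
      first = CS.union t1.first t2.first;
      flast = CS.union t1.flast t2.flast }

let option g = Alt (Eps, g)

let verdict g =
  match check g with
  | _ -> "accepted"
  | exception Rejected why -> "rejected: " ^ why

(* The four examples from the slide, with map erased. *)
let v1 = verdict (Seq (Chr 'a', option (Chr 'b')))
let v2 = verdict (Seq (option (Chr 'a'), option (Chr 'a')))
let v3 = verdict (Alt (Chr 'a', Chr 'b'))
let v4 = verdict (Alt (Chr 'a', Chr 'a'))
```

Under these assumed rules the four slide examples come out as accepted, rejected, accepted, rejected, matching the verdicts shown on the slide.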
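Context-free expressions

Context-free expressions (CFEs)

\[
g ::= \bot \mid g \vee g' \mid \epsilon \mid c \mid g \cdot g' \mid x \mid \mu x.\, g
\]

Semantics of CFEs

\begin{align*}
\llbracket \bot \rrbracket\,\gamma &= \emptyset \\
\llbracket g \vee g' \rrbracket\,\gamma &= \llbracket g \rrbracket\,\gamma \cup \llbracket g' \rrbracket\,\gamma \\
\llbracket \epsilon \rrbracket\,\gamma &= \{ \varepsilon \} \\
\llbracket c \rrbracket\,\gamma &= \{ c \} \\
\llbracket g \cdot g' \rrbracket\,\gamma &= \{ w \cdot w' \mid w \in \llbracket g \rrbracket\,\gamma \wedge w' \in \llbracket g' \rrbracket\,\gamma \} \\
\llbracket x \rrbracket\,\gamma &= \gamma(x) \\
\llbracket \mu x.\, g \rrbracket\,\gamma &= \mathrm{fix}(\lambda X.\, \llbracket g \rrbracket\,(\gamma, X/x))
\end{align*}

\[
\mathrm{fix}(f) = \bigcup_{i \in \mathbb{N}} L_i
\quad\text{where}\quad
L_0 = \emptyset, \qquad L_{n+1} = f(L_n)
\]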
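The semantics above can be executed directly if languages are cut off at a maximum word length, which makes every set finite and lets the fixed point be reached by iterating from the empty language. The following is a sketch under that assumption; the constructor names, the `bound` parameter (assumed to be at least 1) and the association-list environment are all hypothetical choices, not part of the talk.

```ocaml
module S = Set.Make (String)

type cfe =
  | Bot | Eps | Chr of char
  | Alt of cfe * cfe | Seq of cfe * cfe
  | Var of string | Mu of string * cfe

(* denote bound env g: the words of [[g]] env of length <= bound
   (bound is assumed to be >= 1). *)
let rec denote bound env g =
  match g with
  | Bot -> S.empty
  | Eps -> S.singleton ""
  | Chr c -> S.singleton (String.make 1 c)
  | Alt (g1, g2) -> S.union (denote bound env g1) (denote bound env g2)
  | Seq (g1, g2) ->
    let l1 = denote bound env g1 in
    let l2 = denote bound env g2 in
    (* All concatenations w ^ w' that still fit within the bound. *)
    S.fold
      (fun w acc ->
         S.fold
           (fun w' acc ->
              let ww = w ^ w' in
              if String.length ww <= bound then S.add ww acc else acc)
           l2 acc)
      l1 S.empty
  | Var x -> List.assoc x env
  | Mu (x, body) ->
    (* L_0 = empty, L_{n+1} = f (L_n); stop when nothing changes. *)
    let rec iterate l =
      let l' = denote bound ((x, l) :: env) body in
      if S.equal l l' then l else iterate l'
    in
    iterate S.empty

(* mu x. eps \/ (a . x), i.e. zero or more 'a's. *)
let a_star = Mu ("x", Alt (Eps, Seq (Chr 'a', Var "x")))

let words = S.elements (denote 3 [] a_star)
```

With a bound of 3 the iteration stabilises after a few rounds on the words "", "a", "aa" and "aaa".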
ASP: equations

CFEs form an idempotent semiring

\begin{align*}
g_1 \vee (g_2 \vee g_3) &= (g_1 \vee g_2) \vee g_3 \\
g \vee g' &= g' \vee g \\
g \vee \bot &= g \\
g \vee g &= g \\
g_1 \cdot (g_2 \cdot g_3) &= (g_1 \cdot g_2) \cdot g_3 \\
g \cdot \epsilon &= g \\
(g_1 \vee g_2) \cdot g &= (g_1 \cdot g) \vee (g_2 \cdot g) \\
g \cdot (g_1 \vee g_2) &= (g \cdot g_1) \vee (g \cdot g_2) \\
g \cdot \bot &= \bot \\
\bot \cdot g &= \bot
\end{align*}

(along with some equations for µ)
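One possible use of these laws, offered as an illustration rather than as something the slides say ASP does, is to simplify grammars as they are built. The sketch below uses hypothetical smart constructors `alt` and `seq` over a small syntax tree and applies only the unit, idempotence and annihilation equations listed above (plus commutativity, to cover ⊥ on the left of ∨).

```ocaml
type cfe = Bot | Eps | Chr of char | Alt of cfe * cfe | Seq of cfe * cfe

(* Build g1 \/ g2, simplifying with the semiring laws where they apply. *)
let alt g1 g2 =
  match g1, g2 with
  | Bot, g | g, Bot -> g        (* g \/ bot = g, and commutativity *)
  | _ when g1 = g2 -> g1        (* g \/ g = g *)
  | _ -> Alt (g1, g2)

(* Build g1 . g2, simplifying with the semiring laws where they apply. *)
let seq g1 g2 =
  match g1, g2 with
  | Bot, _ | _, Bot -> Bot      (* bot . g = bot and g . bot = bot *)
  | g, Eps -> g                 (* g . eps = g *)
  | _ -> Seq (g1, g2)

let e1 = alt (Chr 'a') Bot
let e2 = alt (Chr 'a') (Chr 'a')
let e3 = seq (Chr 'a') Eps
let e4 = seq Bot (Chr 'a')
let e5 = seq (Chr 'a') (Chr 'b')
```

Each rewrite replaces a grammar by one with the same denotation, so the simplified tree describes the same language as the unsimplified one.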