
IEEE S&P 2020 LangSec workshop. The geometry of syntax and semantics for directed file transformations. Steve Huntsman (FAST Labs / Cyber Technology) and Michael Robinson (American University). 21 May 2020.


  1. IEEE S&P 2020 LangSec workshop
     The geometry of syntax and semantics for directed file transformations
     Steve Huntsman (FAST Labs / Cyber Technology) and Michael Robinson (American University)
     21 May 2020

  2. string.h must be used carefully to prevent buffer overflows
     • X = strings of ASCII NULLs and printable characters
     • G = cyclic shifts on individual characters
     • Goal: remove NULLs and punctuation; make lowercase
     • This example is discussed in the paper
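The group action on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alphabet (NUL plus the 95 printable ASCII characters), the function names, and the `lowercase_shifts` helper are hypothetical and not taken from the paper. It shows the lowercasing part of the goal only; how the paper removes NULLs and punctuation is not reproduced here.

```python
# Assumed alphabet: NUL followed by the 95 printable ASCII characters (space through '~').
ALPHABET = "\x00" + "".join(chr(c) for c in range(0x20, 0x7F))
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
N = len(ALPHABET)  # 96 symbols


def act(shifts, text):
    """Apply one cyclic shift per character; a group element is the tuple of shifts."""
    if len(shifts) != len(text):
        raise ValueError("need exactly one shift per character")
    return "".join(ALPHABET[(INDEX[ch] + k) % N] for ch, k in zip(text, shifts))


def inverse(shifts):
    """Inverse group element: negate every shift modulo the alphabet size."""
    return tuple((-k) % N for k in shifts)


def lowercase_shifts(text):
    """Hypothetical helper: choose, per character, the shift that lowercases it."""
    return tuple((INDEX[ch.lower()] - INDEX[ch]) % N for ch in text)
```

Because every shift has an inverse, `act(inverse(g), act(g, s))` returns `s` for any string `s` over the assumed alphabet, which is the invertibility the later slides rely on.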

  3. Transform files to achieve language-theoretical security
     • X = space of files in some fixed format (e.g., PDF)
     • G = various invertible transformations
     • Goal: eliminate nondeterministic syntax
     • Input ambiguity = vulnerability

  4. Patch binary code to secure critical legacy systems
     • X = space of disassembled binary code
     • G = “sugar-neutral” lifts, translations, etc.
     • Goal: parsimoniously patch a known vulnerability
     • Compiler/build options and dependencies make this hard

  5. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations

  6. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)

  7. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)
     • Locally looks like X × G
     • G acts on P nicely

  8. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)
     • Locally looks like X × G
     • G acts on P nicely
     • E.g., $X = S^1$ (time of day); $G = \mathbb{Z}$ (epoch); $P = \mathbb{R}$ (as a helix above X)
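The time-of-day example can be made concrete with a small sketch. It assumes, purely for illustration, that a point of P is a real number of days since some epoch; the function names are hypothetical.

```python
import math


def project(t):
    """Bundle projection P = R -> X = S^1: time of day as a fraction of a day in [0, 1)."""
    return t - math.floor(t)


def trivialize(t):
    """Product coordinates (x, g) in X x G: (time of day, whole-day count)."""
    g = math.floor(t)
    return (t - g, g)


def act(t, n):
    """Action of n in G = Z: move n whole days along the fiber above project(t)."""
    return t + n
```

Acting by an integer changes the day count but not the time of day, so each fiber of `project` is a copy of the integers: one turn of the helix per day.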

  9. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)
     • Locally looks like X × G
     • G acts on P nicely
     • E.g., $X = S^1$; $G = (0, 1)$ with $x \boxplus y := f(f^{-1}(x) + f^{-1}(y))$ for invertible $f : \mathbb{R} \to (0, 1)$
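The transported group law on (0, 1) can be checked numerically. The slide only requires f to be invertible; the sketch below assumes the logistic sigmoid as one convenient choice, so the specific identity and inverse elements it exhibits depend on that assumption.

```python
import math


def f(u):
    """Assumed choice of f: the logistic sigmoid, a bijection from R onto (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def f_inv(x):
    """Inverse of the sigmoid (the logit), defined for 0 < x < 1."""
    return math.log(x / (1.0 - x))


def boxplus(x, y):
    """Group operation on (0, 1) obtained by transporting addition on R through f."""
    return f(f_inv(x) + f_inv(y))
```

With this f, the identity element is f(0) = 0.5 and the inverse of x is 1 - x, since the logit of 1 - x is the negative of the logit of x.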

  10. Principal bundles model syntax and semantics
      • X = space of documents
      • G = group of invertible transformations
      • Think of X like a manifold and get something akin to a principal bundle P(X, G)
      • Locally looks like X × G
      • G acts on P nicely
      • E.g., Hopf fibration $S^1 \to S^3 \to S^2$
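For readers who want to see the Hopf fibration behave like a principal bundle, here is a minimal numerical sketch using the standard formula for the Hopf map on unit vectors in C^2. The function names are illustrative only.

```python
import cmath


def hopf(z1, z2):
    """Hopf map from S^3 (unit vectors (z1, z2) in C^2) to S^2 (unit vectors in R^3)."""
    w = z1 * z2.conjugate()
    return (2 * w.real, 2 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)


def rotate(theta, z1, z2):
    """Action of the group element e^{i*theta} in S^1 on the total space S^3."""
    u = cmath.exp(1j * theta)
    return (u * z1, u * z2)
```

Multiplying both coordinates by the same unit complex number leaves `z1 * conj(z2)` and both moduli unchanged, so `hopf(*rotate(theta, z1, z2))` agrees with `hopf(z1, z2)` up to rounding: the circle acts along the fibers.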

  11. Connections model geometry directing transformations
      • Principal bundles are a natural arena for geometry realized through a connection

  12. Connections model geometry directing transformations
      • Principal bundles are a natural arena for geometry realized through a connection
      • I.e., a “vertical” and “horizontal” direct sum decomposition of tangent spaces . . .

  13. Connections model geometry directing transformations
      • Principal bundles are a natural arena for geometry realized through a connection
      • I.e., a “vertical” and “horizontal” direct sum decomposition of tangent spaces . . .
      • . . . that is equivariant under group action

  14. Connections model geometry directing transformations
      • Principal bundles are a natural arena for geometry realized through a connection
      • I.e., a “vertical” and “horizontal” direct sum decomposition of tangent spaces . . .
      • . . . that is equivariant under group action
      • Connects local product geometries via parallel transport
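In the standard smooth setting, the decomposition and equivariance condition described in these bullets read as follows. The notation is the textbook one and is assumed here rather than taken from the slides: $\pi : P \to X$ is the bundle projection and $R_g$ is the right action of $g \in G$ on $P$.

```latex
T_p P = V_p \oplus H_p,
\qquad V_p = \ker\left(d\pi_p\right),
\qquad H_{p \cdot g} = (R_g)_{*} H_p \quad \text{for all } g \in G .
```

The vertical space $V_p$ is fixed by the bundle itself; choosing the horizontal spaces $H_p$ is what it means to choose a connection, and lifting a path in X so that its tangent stays horizontal is parallel transport.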

  15. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations

  16. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae

  17. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae
            objend

  18. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae
            objend
          ⇒ objend % objend -> endobj

  19. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae
            objend
          ⇒ objend % objend -> endobj
          ⇒ endobj % objend -> endobj

  20. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae
            objend
          ⇒ objend % objend -> endobj
          ⇒ endobj % objend -> endobj
      • Sugar-neutral: transformations should handle sugar, but not introduce or eliminate it

  21. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae
            objend
          ⇒ objend % objend -> endobj
          ⇒ endobj % objend -> endobj
      • Sugar-neutral: transformations should handle sugar, but not introduce or eliminate it
      • Suggests using normal forms
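The two-step `objend` repair shown on these slides can be sketched as a pair of mutually inverse line transformations. This is a toy illustration, not the authors' tool: it works on whole lines only, and the function names are hypothetical. The ancilla is the `%` comment from the slide, which records what was changed so that the change can be undone.

```python
BAD, GOOD = "objend", "endobj"
ANCILLA = f" % {BAD} -> {GOOD}"  # comment recording the atomic transformation


def record(line):
    """Step 1: append the ancilla to a line consisting of the misspelled keyword."""
    return line + ANCILLA if line == BAD else line


def repair(line):
    """Step 2: fix the keyword on a line that already carries the ancilla."""
    return GOOD + ANCILLA if line == BAD + ANCILLA else line


def unrepair(line):
    """Reverse of step 2: restore the misspelled keyword, keeping the ancilla."""
    return BAD + ANCILLA if line == GOOD + ANCILLA else line


def erase(line):
    """Reverse of step 1: delete the ancilla."""
    return BAD if line == BAD + ANCILLA else line


def forward(lines):
    return [repair(record(line)) for line in lines]


def backward(lines):
    return [erase(unrepair(line)) for line in lines]
```

For example, `forward(["1 0 obj", "objend"])` yields `["1 0 obj", "endobj % objend -> endobj"]`, and `backward` restores the input. The round trip is only guaranteed when the input does not already contain a line that looks like an ancilla-carrying one, which is a reminder that the ancilla encoding itself has to be unambiguous.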

  22. Normal forms simplify and disambiguate
      [Figure: three side-by-side code listings, separated below. Credits on the slide: (From Lacomis et al.) (From Zhang and D’Hollander)]

      C source:
          int i;
          for (i=0; i<10; i++)
          {
            z+=i;
          }
          int n=0;
          while (n<10) {
            x+=n;
            n++;
          }

      Structured form:
          START;
          S
          do while b
            S
            do while b
              if b
                S
                do while b
                  S
                enddo
              endif
              S
            enddo
            S
          enddo;
          HALT

      Jump form:
          jmp @5
          @4:
          jmp @9
          @8:
          jne @19
          jmp @10
          @19:
          jmp @14
          @13:@14:
          jg @13
          @9:@10:
          jge @20
          jmp @8
          @5:@20:
          jge @21
          jmp @4
          @21:

  23. Concrete syntax trees parameterize a principal bundle
      • G corresponds to semantics-preserving CST transformations
      • Equivalence class of CSTs corresponding to a given AST has group-theoretical and language security significance and indicates format redundancy
      • E.g., xref table in PDF (which nobody trusts)
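The idea that many concrete syntaxes sit above one abstract syntax tree can be demonstrated with any language that exposes its parser. The sketch below uses Python source and the standard `ast` module as a stand-in for a file format; it is an analogy chosen for convenience, not the PDF setting of the talk, and the function names are hypothetical.

```python
import ast


def abstract(source):
    """Project concrete syntax onto a canonical text dump of its AST."""
    # ast.dump omits line and column attributes by default, so layout is forgotten.
    return ast.dump(ast.parse(source))


def same_fiber(a, b):
    """True when two concrete spellings map to the same AST."""
    return abstract(a) == abstract(b)
```

Whitespace, redundant parentheses, and comments change the concrete text but not the AST, so `same_fiber("x = (1 + 2)", "x=1+2  # note")` holds, while swapping the operands gives a different AST. The set of spellings sharing one AST plays the role of the equivalence class mentioned on the slide.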

  24. Dynamic concretization semantically enriches an AST
      “[Files] can be considered as an abstraction of their semantics. For example the syntax of [files] records the existence of [objects] and maybe their type but not [the trace of a parser or renderer], as defined by the semantics.” [1]
      • Annotating (with, e.g., types) and cross-linking an AST gives a semantically rich derived graph
      • To understand a file, parse it . . .
      [1] [Cousot and Cousot], replacing “program” and “variable” with “file” and “object,” respectively.

  25. Dynamic concretization semantically enriches an AST
      “[Files] can be considered as an abstraction of their semantics. For example the syntax of [files] records the existence of [objects] and maybe their type but not [the trace of a parser or renderer], as defined by the semantics.” [1]
      • Annotating (with, e.g., types) and cross-linking an AST gives a semantically rich derived graph
      • To understand a file, parse it . . .
      • . . . to understand it more, render/compile it
      [1] [Cousot and Cousot], replacing “program” and “variable” with “file” and “object,” respectively.
