Signature Inference for Functional Property Discovery or: How never to come up with tests manually anymore(*)
Tom Sydney Kerckhove, ETH Zurich
https://cs-syd.eu/
https://github.com/NorfairKing
27 July 2017
Long term vision: A future in which


  1. Definitions: Scope
      Scope: Functions in scope
      Size of scope: Number of functions in scope

  2. Definitions: Scope
      Scope: Functions in scope
      Size of scope: Number of functions in scope
      Size of signature: Number of functions in signature

  3. Automated, but still slow
      [Plot: runtime (seconds, log scale) against scope-size (functions)]

  4. Why is this slow?
      (1) Maximum size of the discovered properties

  5. Why is this slow?
      (1) Maximum size of the discovered properties
      (2) Size of the signature

  6. Idea

  7. Critical insight We are not interested in the entire codebase. We are interested in a relatively small amount of code.

  8. Reducing the size of the signature
      inferSignature
        :: [Function] -- Focus functions
        -> [Function] -- Functions in scope
        -> [Function] -- Chosen functions

  9. Full background and empty background
      inferFullBackground _ scope = scope
      inferEmptyBackground focus _ = focus

  10. Full background and empty background
      [Plot: runtime (seconds) against scope-size (# functions); strategies: empty-background, full-background]

  11. Full background and empty background
      [Boxplot: relevant-equations (# equations), more is better; strategies: full-background, empty-background]
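
The two baseline strategies from the slides above can be written out as a small self-contained sketch. The `Function` record below is a hypothetical stand-in (a name and a type kept as strings), and `Strategy` is an illustrative alias; neither is taken from the talk's code.

```haskell
-- Hypothetical stand-in for the talk's Function type.
data Function = Function
  { name :: String
  , funType :: String
  } deriving (Show, Eq)

-- Illustrative alias: focus functions and scope in, chosen functions out.
type Strategy = [Function] -> [Function] -> [Function]

-- Full background: hand the entire scope to the discovery tool.
inferFullBackground :: Strategy
inferFullBackground _ scope = scope

-- Empty background: hand over only the focus functions.
inferEmptyBackground :: Strategy
inferEmptyBackground focus _ = focus

-- Example data, assumed for illustration only.
exampleScope :: [Function]
exampleScope =
  [ Function "sort" "Ord a => [a] -> [a]"
  , Function "reverse" "[a] -> [a]"
  , Function "id" "a -> a"
  ]

exampleFocus :: [Function]
exampleFocus = take 1 exampleScope
```

With this data, `inferFullBackground exampleFocus exampleScope` returns all three functions, while `inferEmptyBackground exampleFocus exampleScope` returns only `sort`.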

  12. Pause slide with a joke
      safeCoerce :: a ~ b => a -> b
      safeCoerce x = x

  13. Syntactic similarity: Name
      inferSyntacticSimilarityName [focus] scope
        = take 5 $ sortOn
          (\sf -> hammingDistance (name focus) (name sf))
          scope

  14. Syntactic similarity: Name
      [Plot: runtime (seconds) against scope-size (# functions); strategies: full-background, syntactical-similarity-name-5]

  15. Syntactic similarity: Name
      [Boxplot: relevant-equations (# equations), more is better; strategies: syntactical-similarity-name-5, full-background]
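
A runnable sketch of the name-based strategy follows. The slides do not show how `hammingDistance` treats names of different lengths, so this version assumes that every position beyond the shorter name counts as a mismatch. The `Function` record and the explicit `Int` parameter are also assumptions made for illustration.

```haskell
import Data.List (sortOn)

-- Hypothetical stand-in for the talk's Function type.
newtype Function = Function { name :: String }
  deriving (Show, Eq)

-- Assumption: positions beyond the shorter list count as mismatches.
hammingDistance :: Eq a => [a] -> [a] -> Int
hammingDistance (x : xs) (y : ys) =
  (if x == y then 0 else 1) + hammingDistance xs ys
hammingDistance xs ys = length xs + length ys -- at least one list is empty here

-- Keep the i functions in scope whose names are closest to the focus name.
inferSyntacticSimilarityName :: Int -> Function -> [Function] -> [Function]
inferSyntacticSimilarityName i focus scope =
  take i $ sortOn (\sf -> hammingDistance (name focus) (name sf)) scope

-- Example scope, assumed for illustration only.
exampleScope :: [Function]
exampleScope = map Function ["reverse", "sortBy", "id", "sortOn", "map"]
```

For the focus `sort`, the distances are 7 (`reverse`), 2 (`sortBy`), 4 (`id`), 2 (`sortOn`) and 4 (`map`), so `inferSyntacticSimilarityName 2 (Function "sort") exampleScope` picks `sortBy` and `sortOn`.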

  16. Syntactic similarity: Implementation
      inferSyntacticSimilaritySymbols i [focus] scope
        = take i $ sortOn
          (\sf -> hammingDistance (symbols focus) (symbols sf))
          scope

  17. Syntactic similarity: Implementation
      [Plot: runtime (seconds) against scope-size (# functions); strategies: full-background, syntactical-similarity-symbols-5]

  18. Syntactic similarity: Implementation
      [Boxplot: relevant-equations (# equations), more is better; strategies: syntactical-similarity-symbols-5, full-background]

  19. Syntactic similarity: Type
      inferSyntacticSimilarityType i [focus] scope
        = take i $ sortOn
          (\sf -> hammingDistance (getTypeParts focus) (getTypeParts sf))
          scope

  20. Syntactic similarity: Type
      [Plot: runtime (seconds) against scope-size (# functions); strategies: full-background, syntactical-similarity-type-5]

  21. Syntactic similarity: Type
      [Boxplot: relevant-equations (# equations), more is better; strategies: syntactical-similarity-type-5, full-background]
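
The type-based variant differs from the other similarity strategies only in the feature it compares. The sketch below assumes that `getTypeParts` splits a type, kept as a string, on whitespace; the talk does not show its definition, so this splitting rule, the `Function` record, and the length-mismatch handling in `hammingDistance` are all illustrative assumptions.

```haskell
import Data.List (sortOn)

-- Hypothetical stand-in for the talk's Function type.
data Function = Function
  { name :: String
  , typeString :: String
  } deriving (Show, Eq)

-- Assumption: the parts of a type are its whitespace-separated tokens.
getTypeParts :: Function -> [String]
getTypeParts = words . typeString

-- Assumption: positions beyond the shorter list count as mismatches.
hammingDistance :: Eq a => [a] -> [a] -> Int
hammingDistance (x : xs) (y : ys) =
  (if x == y then 0 else 1) + hammingDistance xs ys
hammingDistance xs ys = length xs + length ys

-- Keep the i functions in scope whose type parts are closest to the focus.
inferSyntacticSimilarityType :: Int -> Function -> [Function] -> [Function]
inferSyntacticSimilarityType i focus scope =
  take i $
    sortOn (\sf -> hammingDistance (getTypeParts focus) (getTypeParts sf)) scope

-- Example data, assumed for illustration only.
sortF :: Function
sortF = Function "sort" "Ord a => [a] -> [a]"

exampleScope :: [Function]
exampleScope =
  [ Function "reverse" "[a] -> [a]"
  , Function "id" "a -> a"
  , Function "nub" "Eq a => [a] -> [a]"
  ]
```

Under these assumptions `nub` is at distance 1 from `sort`, while `reverse` and `id` are both at distance 6, because a positional comparison penalizes the missing constraint at the front of the type.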

  22. Other things we tried
      (1) Similarity using a different metric: edit distance
      (2) Unions of the previous strategies
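
Both ideas on this slide can be sketched briefly. The edit distance below is the standard Levenshtein distance (the slide does not say which variant was used), and `unionStrategies` is a hypothetical helper that merges the choices of several strategies and removes duplicates.

```haskell
import Data.List (nub)

-- Levenshtein edit distance, computed one row of the table at a time.
editDistance :: Eq a => [a] -> [a] -> Int
editDistance xs ys = last (foldl nextRow [0 .. length ys] xs)
  where
    nextRow [] _ = [] -- unreachable: every row has at least one entry
    nextRow prev@(p : ps) x = scanl step (p + 1) (zip3 ys prev ps)
      where
        -- left: deletion from ys side, up: insertion, diag: substitution
        step left (y, diag, up) =
          minimum [left + 1, up + 1, diag + (if x == y then 0 else 1)]

-- Illustrative alias: focus and scope in, chosen functions out.
type Strategy f = [f] -> [f] -> [f]

-- Hypothetical helper: union of the functions chosen by each strategy.
unionStrategies :: Eq f => [Strategy f] -> Strategy f
unionStrategies strategies focus scope =
  nub (concatMap (\strategy -> strategy focus scope) strategies)
```

Unlike a positional comparison, edit distance tolerates shifts: `editDistance "sort" "resort"` is 2, since two insertions suffice.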

  23. Breakthrough
      [Histogram: number of different functions in an equation (0 to 5) against relative # of cases]

  24. Idea

  25. We can run QuickSpec more than once!

  26. Inferred Signature
      type SignatureInferenceStrategy
        = [Function] -> [Function] -> InferredSignature

  27. Inferred Signature
      Combine the results of multiple runs:
      type InferredSignature = [Signature]

  28. Inferred Signature
      Use previous results as background properties:
      type InferredSignature = Forest Signature

  29. Inferred Signature
      Share previous runs:
      type InferredSignature = DAG Signature

  30. Chunks
      chunks :: SignatureInferenceStrategy

      > chunks
      >   [sort :: Ord a => [a] -> [a]]
      >   [reverse :: [a] -> [a], id :: a -> a]

      [sort, reverse]
            |
            v
         [sort]
            ^
            |
      [sort, id]
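
One way to read the diagram above is sketched here: a first run on the focus alone, then one run per scope function paired with the focus, each depending on the focus-only run. The `Node` type and the use of plain names for functions are assumptions for illustration and are not the talk's `DAG Signature` representation.

```haskell
-- Hypothetical simplification: functions are identified by name only.
type Function = String
type Signature = [Function]

-- A DAG node: a signature plus the signatures it depends on.
data Node = Node
  { signature :: Signature
  , dependsOn :: [Signature]
  } deriving (Show, Eq)

-- Sketch of chunks: the focus alone, then the focus with each scope function.
chunks :: [Function] -> [Function] -> [Node]
chunks focus scope =
  Node focus []
    : [Node (focus ++ [f]) [focus] | f <- scope, f `notElem` focus]
```

For the slide's example, `chunks ["sort"] ["reverse", "id"]` yields the nodes `[sort]`, `[sort, reverse]` and `[sort, id]`, with the last two depending on `[sort]`.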

  31. The runtime of chunks
      [Plot: runtime (seconds) against scope-size (# functions); strategies: chunks, full-background]

  32. The outcome of chunks: Relevant equations
      [Boxplot: relevant-equations (# equations), more is better; strategies: full-background, chunks]

  33. Why does chunks find more relevant equations?
      [Boxplot: equations (# equations), more is better; strategies: full-background, chunks]

  34. Why does chunks find more relevant equations?
      Scope:
        i = (+ 1)
        j = (+ 2)
        k = (+ 3)
        l = (+ 4)
        m = (+ 5)
        n = (+ 6)
        o = (+ 7)
        p = (+ 8)
        q = (+ 9)
        r = (+ 10)

  35. Why does chunks find more relevant equations?
      Full background:
        i (i x) = j x
        i (j x) = k x
        i (k x) = l x
        i (l x) = m x
        i (m x) = n x
        i (n x) = o x
        i (o x) = p x
        i (p x) = q x
        i (q x) = r x
      Relevant to r:
        i (q x) = r x

  36. Why does chunks find more relevant equations?
      Full background, relevant to r:
        i (q x) = r x
      Chunks for r (all relevant):
        q (i x) = r x
        q (q x) = p (r x)
        q (q (q x)) = o (r (r x))
        q (q (q (q (q x)))) = m (r (r (r (r x))))
        q (q (q (q (q (q x))))) = l (r (r (r (r (r x)))))
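
The equations listed for this example can be checked directly. The sketch below defines the scope at type `Int -> Int` (the slides do not fix a type, so `Int` is an assumption) and tests both sides of each equation on a range of sample inputs; `holdsOn` is a hypothetical helper.

```haskell
-- The scope from the slides, specialised to Int for illustration.
i, j, k, l, m, n, o, p, q, r :: Int -> Int
i = (+ 1)
j = (+ 2)
k = (+ 3)
l = (+ 4)
m = (+ 5)
n = (+ 6)
o = (+ 7)
p = (+ 8)
q = (+ 9)
r = (+ 10)

-- An equation is a pair of functions expected to agree on every input.
type Equation = (Int -> Int, Int -> Int)

-- Equations found with the full background.
fullBackgroundEquations :: [Equation]
fullBackgroundEquations =
  [ (i . i, j), (i . j, k), (i . k, l), (i . l, m), (i . m, n)
  , (i . n, o), (i . o, p), (i . p, q), (i . q, r)
  ]

-- Equations found by chunks for the focus function r.
chunkEquations :: [Equation]
chunkEquations =
  [ (q . i, r)
  , (q . q, p . r)
  , (q . q . q, o . r . r)
  , (q . q . q . q . q, m . r . r . r . r)
  , (q . q . q . q . q . q, l . r . r . r . r . r)
  ]

-- Hypothetical helper: check an equation on the given sample inputs.
holdsOn :: [Int] -> Equation -> Bool
holdsOn inputs (lhs, rhs) = all (\x -> lhs x == rhs x) inputs
```

Every equation in both lists holds on the sample range `[-100 .. 100]`. Only one of the nine full-background equations mentions `r`, whereas all five chunk equations do.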

  37. Inferred Signature
      type SignatureInferenceStrategy
        = [Function] -> [Function] -> InferredSignature
      type InferredSignature
        = DAG ([(Signature, [Equation])] -> Signature)

  38. Inferred Signature
      type SignatureInferenceStrategy
        = [Function] -> [Function] -> InferM ()
      data InferM a where
        InferPure :: a -> InferM a
        InferFmap :: (a -> b) -> InferM a -> InferM b
        InferApp :: InferM (a -> b) -> InferM a -> InferM b
        InferBind :: InferM a -> (a -> InferM b) -> InferM b
        InferFrom
          :: [EasyNamedExp]
          -> [OptiToken]
          -> InferM (OptiToken, [EasyEq])
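
To show how a strategy could be written against `InferM`, here is a minimal sketch that gives the GADT its `Functor`, `Applicative` and `Monad` instances and adds a dry-run interpreter. `EasyNamedExp`, `EasyEq` and `OptiToken` are replaced by simple stand-ins, and `runDry` and `chunksM` are hypothetical: `runDry` only records which runs were requested and returns no equations, where a real interpreter would run the discovery tool.

```haskell
{-# LANGUAGE GADTs #-}

-- Stand-ins for the types named on the slide (assumptions).
type EasyNamedExp = String
type EasyEq = String
newtype OptiToken = OptiToken Int deriving (Show, Eq)

data InferM a where
  InferPure :: a -> InferM a
  InferFmap :: (a -> b) -> InferM a -> InferM b
  InferApp :: InferM (a -> b) -> InferM a -> InferM b
  InferBind :: InferM a -> (a -> InferM b) -> InferM b
  InferFrom :: [EasyNamedExp] -> [OptiToken] -> InferM (OptiToken, [EasyEq])

instance Functor InferM where
  fmap = InferFmap

instance Applicative InferM where
  pure = InferPure
  (<*>) = InferApp

instance Monad InferM where
  (>>=) = InferBind

-- A requested run: a signature and the tokens of the runs it builds on.
type Run = ([EasyNamedExp], [OptiToken])

-- Dry run: record each requested run, hand out a fresh token, find nothing.
runDry :: InferM a -> [Run] -> (a, [Run])
runDry (InferPure a) runs = (a, runs)
runDry (InferFmap f ma) runs =
  case runDry ma runs of (a, runs') -> (f a, runs')
runDry (InferApp mf ma) runs =
  case runDry mf runs of
    (f, runs') -> case runDry ma runs' of (a, runs'') -> (f a, runs'')
runDry (InferBind ma k) runs =
  case runDry ma runs of (a, runs') -> runDry (k a) runs'
runDry (InferFrom sig background) runs =
  ((OptiToken (length runs), []), runs ++ [(sig, background)])

-- Hypothetical chunks-like strategy written against InferM.
chunksM :: [EasyNamedExp] -> [EasyNamedExp] -> InferM ()
chunksM focus scope = do
  focusResult <- InferFrom focus []
  let focusToken = fst focusResult
  mapM_ (\f -> InferFrom (focus ++ [f]) [focusToken]) scope
```

Running `snd (runDry (chunksM ["sort"] ["reverse", "id"]) [])` lists three requested runs: `["sort"]` with no background, then `["sort", "reverse"]` and `["sort", "id"]`, both building on the token of the first run.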

  39. Chunks Plus
      chunksPlus :: SignatureInferenceStrategy

      > chunksPlus
      >   [sort :: Ord a => [a] -> [a]]
      >   [reverse :: [a] -> [a], id :: a -> a]

                          -> [sort, reverse]
                         /          |
                        /           v
      [sort, reverse, id]   ->   [sort]
                        \           ^
                         \          |
                          -> [sort, id]
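
A possible reading of this diagram extends the chunks idea with one more layer: every pair of scope functions is combined with the focus, and that run depends on the two corresponding smaller runs. The sketch below is an interpretation, not the talk's implementation. The `Node` type and name-only functions are assumptions, and only the dependencies on the two-function runs are recorded, since the focus-only run is reachable through them.

```haskell
import Data.List (tails)

-- Hypothetical simplification: functions are identified by name only.
type Function = String
type Signature = [Function]

-- A DAG node: a signature plus the signatures it depends on.
data Node = Node
  { signature :: Signature
  , dependsOn :: [Signature]
  } deriving (Show, Eq)

-- Sketch of chunks plus: focus alone, focus with each scope function,
-- then focus with each unordered pair of scope functions.
chunksPlus :: [Function] -> [Function] -> [Node]
chunksPlus focus scope = base : pairs ++ triples
  where
    others = filter (`notElem` focus) scope
    base = Node focus []
    pairs = [Node (focus ++ [f]) [focus] | f <- others]
    triples =
      [ Node (focus ++ [f, g]) [focus ++ [f], focus ++ [g]]
      | (f : rest) <- tails others
      , g <- rest
      ]
```

For the slide's example, `chunksPlus ["sort"] ["reverse", "id"]` produces four nodes, the last being `[sort, reverse, id]` depending on `[sort, reverse]` and `[sort, id]`.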

  40. The runtime of chunks plus
      [Plot: runtime (seconds) against scope-size (# functions); strategies: chunks-plus, full-background]

  41. The outcome of chunks plus: Relevant equations
      [Boxplot: relevant-equations (# equations), more is better; strategies: full-background, chunks-plus]

  42. All strategies
      [Boxplot: relevant-equations (# equations), more is better; strategies: syntactical-similarity-type-5, syntactical-similarity-symbols-5, syntactical-similarity-name-5, full-background, empty-background, chunks-plus, chunks]
