Synthesizing Sub-Range Queries on Tree Structures Shubhang Kulkarni & Pranav Annapindi Sup
Context
The user ("Lé user") has an input (the pre-image) and a desired output, for example file structure manipulations, but does not know the function f such that Output = f(Pre-image).
Chill. We got you for trees: we (us) synthesize f.
Motivating Example
Input (Grades):  51  63  74  82  90  95
Output:          B   B   B   A   A   A
First sub-range (51, 63, 74): f2(x) = B
Second sub-range (82, 90, 95): f1(x) = A
Observations
There are two sub-ranges: (0, 3) & (3, 6).
Each sub-range has a function that is applied element-wise to it.
The "user's intent" corresponds to applying a function to contiguous subsets of nodes at some level.
This could be really useful if the tree on which the example is performed actually represented a larger system.
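The observations above can be made concrete with a small sketch. It assumes the sub-ranges are half-open index intervals over the list of grades, and the helper name apply_subrange_functions is hypothetical, not part of the original slides.

```python
def apply_subrange_functions(values, assignments):
    """Apply each function element-wise to its half-open sub-range [start, end)."""
    out = list(values)
    for (start, end), fn in assignments:
        for k in range(start, end):
            out[k] = fn(values[k])
    return out


grades = [51, 63, 74, 82, 90, 95]
f2 = lambda x: "B"  # function applied to the first sub-range
f1 = lambda x: "A"  # function applied to the second sub-range

result = apply_subrange_functions(grades, [((0, 3), f2), ((3, 6), f1)])
```

The synthesis problem is the inverse of this call: given grades and result, recover both the sub-ranges and the functions.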
We Saw a Gap…
Data Structure Manipulations using PBE: cannot perform queries on sub-ranges of a level. Must be performed on the entire tree/level.
PBE Range Manipulations Exist: Good Ol' FlashFill!! Cool ideas. Works for linear structures like strings.
Combine ideas for queries on sub-ranges of tree levels!
A Few Formalizations
Tree T := (N, E, R)
NodeTree n := node ∈ N | T(N', E', node) s.t. N' ⊆ N, E' ⊆ E, node ∈ N. Basically a subtree.
Two NodeTrees are defined to be at the same Level if they are at equal distances from the root node. "Level" is used interchangeably with "Range".
A SubRange on a subtree is a contiguous ordered subset of the NodeTrees in the intersection of that subtree and a Level.
Restrictive user pruning! The aforementioned subtree has the LCA of the desired range as its root.
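A minimal sketch of these definitions, assuming a NodeTree is represented as a label plus an ordered list of children. The class layout and the helper names level and sub_range are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeTree:
    label: str
    children: List["NodeTree"] = field(default_factory=list)


def level(root, depth):
    """Left-to-right list of NodeTrees at the given distance from root."""
    current = [root]
    for _ in range(depth):
        current = [child for node in current for child in node.children]
    return current


def sub_range(root, depth, p, q):
    """Contiguous slice [p, q) of the level at `depth` below `root`."""
    return level(root, depth)[p:q]


# Hypothetical example tree: root -> (a -> a1, a2), (b -> b1)
root = NodeTree("root", [
    NodeTree("a", [NodeTree("a1"), NodeTree("a2")]),
    NodeTree("b", [NodeTree("b1")]),
])
labels = [n.label for n in sub_range(root, 2, 1, 3)]
```

Here the sub-range spans children of two different parents, which is why the enclosing subtree is rooted at their LCA (the root in this example).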
High-level idea of Approach
We use all the given examples at once.
We look at all possible ranges.
However, a range in one list might not be the same in another. Hence, we look at all equivalent ranges in the various examples given, with intelligent pruning for feasibility.
We construct new input-output examples using all the nodes in a range over all examples and use a slightly modified version of the enumerative search algorithm in [1].
This gives us a lot of examples and would almost always make the synthesis unambiguous.
We return the functions synthesized, the ranges, and the range transformation applied.
Range Transformation
Generalization of indices: each index of a list can be viewed in 3 ways:
Number of elements before it in the list
Number of elements after it in the list
Fraction of elements before it in the list
Hence, an index has 3 'equivalent' indices in a new list.
A range has 2 indices and so generalizes to 3 × 3 = 9 equivalent ranges.
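The nine equivalent ranges can be enumerated with a short sketch. It assumes a range is a half-open pair of boundary positions (p, q) in a list of length n, that the target list has length m, and that fractional positions are rounded down; the view names and the function equivalent_ranges are hypothetical.

```python
from itertools import product

# Three ways to carry a boundary position `pos` from a list of length n
# over to a list of length m (rounding down is an assumption).
VIEWS = {
    "before": lambda pos, n, m: pos,              # same number of elements before
    "after": lambda pos, n, m: m - (n - pos),     # same number of elements after
    "fraction": lambda pos, n, m: (pos * m) // n, # same fraction of elements before
}


def equivalent_ranges(r, n, m):
    """Map range r = (p, q) in a list of length n to its 9 candidates in length m."""
    p, q = r
    return {
        (a, b): (VIEWS[a](p, n, m), VIEWS[b](q, n, m))
        for a, b in product(VIEWS, repeat=2)
    }


candidates = equivalent_ranges((3, 6), 6, 10)
```

Some candidates can be empty or inverted (start after end) in the target list; such candidates would simply be discarded by a feasibility check.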
The Input-Output Example Format
Two parts:
T: the tree on which the operation is performed.
𝜁: a list of lists of the following form: [Example_1, Example_2, …, Example_n]
Each Example_i is of the form: [SRangeF_1, SRangeF_2, …, SRangeF_n]
Some sub-range in T is mapped to an SRangeF_i via application of a function. We want to learn that function and which range it applies to.
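As a sketch of this format, the following assumes each input tree is given as the left-to-right list of node values on the relevant level, and that every example lists its output sub-ranges in the same order. The second example's data and the helper check_format are made up for illustration.

```python
# One level per input tree, as a left-to-right list of node values.
T = [
    [51, 63, 74, 82, 90, 95],            # from the motivating example
    [40, 55, 60, 70, 85, 88, 91, 99],    # hypothetical second example
]

# zeta[i][j] is the j-th output sub-range (SRangeF_j) of example i.
zeta = [
    [["B", "B", "B"], ["A", "A", "A"]],
    [["B", "B", "B", "B"], ["A", "A", "A", "A"]],
]


def check_format(T, zeta):
    """Return the number of output sub-ranges per example, or raise if inconsistent."""
    if len(T) != len(zeta):
        raise ValueError("need exactly one output example per input tree")
    counts = {len(example) for example in zeta}
    if len(counts) != 1:
        raise ValueError("every example must list the same number of sub-ranges")
    return counts.pop()


num_ranges = check_format(T, zeta)
```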
Assumption
The user orders the ranges for each input-output example in the same way.
Wordy explanation: output j of example 1 and output j of example 2 correspond to equivalent ranges in their corresponding trees.
A reasonable assumption.
All the sub-ranges exist on a particular level.
Not too many! We define them in this manner.
The Procedure
T: set of all input examples, each in the form of a level-order traversal list
𝜁: list of all output examples
We assume that the first example has the least number of nodes on the level; nodes = n_s.

proc(T, 𝜁)
    SR = { length(e) | e ∈ 𝜁_0 }    // pruning technique: look only at ranges whose size equals the size of an output example
    for sRs in SR do    // O(n_s): all the various sizes of sub-ranges we need to consider
        for all sub-ranges (p, q) of size sRs do    // O(n_s): all the sub-ranges of a given size
            r <- (p, q)
            progs = {}
            for {e | e ∈ 𝜁_0 and e is of size sRs with index j} do    // O(𝜁) in total! For each output example of the same size; there might be many
                for t in rangeTransformations do    // O(1): these are the 9 possible range transformations
                    for i in range(len(𝜁)) do    // O(𝜁): for all the other output examples
                        r' = t(r, T_i)    // transform our range to the one pertaining to that example's input tree
                        if len(r') ≠ len(𝜁_i,j) then    // if the sub-range in the other tree could not map to the corresponding output example, we prune!
                            break
                        exams = exams ∪ (T_i(r), 𝜁_i,j)    // all the new examples we feed to our sub-synthesizer
                    prog <- synthesize(exams)    // most probably a small number!
                    if prog ≠ {⊥} then    // if we found a valid program, we don't need to check any more transformations!
                        progs = progs ∪ {prog, r, t}
                        break
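The procedure can be sketched end to end in Python under several loudly stated assumptions: each tree is given as the list of values on the relevant level, ranges are half-open, fractional positions round down, the transformed range r' is the one sliced from T_i, and synthesize is a toy stand-in that only tries three hypothetical candidate functions (the slides use a modified enumerative search from [1] instead). The sketch iterates over the output examples of 𝜁_0 directly rather than over the set of sizes, which visits the same (size, range) pairs.

```python
from itertools import product

VIEWS = {
    "before": lambda pos, n, m: pos,
    "after": lambda pos, n, m: m - (n - pos),
    "fraction": lambda pos, n, m: (pos * m) // n,
}
RANGE_TRANSFORMATIONS = list(product(VIEWS, repeat=2))  # the 9 transformations

# Hypothetical hypothesis space for the toy sub-synthesizer.
CANDIDATES = {
    "double": lambda x: 2 * x,
    "negate": lambda x: -x,
    "increment": lambda x: x + 1,
}


def synthesize(exams):
    """Toy stand-in: first candidate consistent with all (input, output) pairs."""
    for name, fn in CANDIDATES.items():
        if all(fn(x) == y for x, y in exams):
            return name
    return None  # plays the role of ⊥


def proc(T, zeta):
    n = len(T[0])  # the first example is assumed to have the fewest nodes
    progs = []
    for j, e in enumerate(zeta[0]):
        size = len(e)
        for p in range(n - size + 1):
            r = (p, p + size)
            for t in RANGE_TRANSFORMATIONS:
                exams, feasible = [], True
                for i in range(len(zeta)):
                    m = len(T[i])
                    s = VIEWS[t[0]](r[0], n, m)
                    q = VIEWS[t[1]](r[1], n, m)
                    # Prune: the transformed range must fit the j-th output.
                    if not (0 <= s <= q <= m) or q - s != len(zeta[i][j]):
                        feasible = False
                        break
                    exams.extend(zip(T[i][s:q], zeta[i][j]))
                if not feasible:
                    continue
                prog = synthesize(exams)
                if prog is not None:
                    progs.append((j, r, t, prog))
                    break  # no need to check more transformations
    return progs


# Hypothetical data: double the first half, negate the second half.
T = [[1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60, 70, 80]]
zeta = [
    [[2, 4, 6], [-4, -5, -6]],
    [[20, 40, 60, 80], [-50, -60, -70, -80]],
]
found = proc(T, zeta)
```

On this data the sketch reports one program per output index, together with the range in the first tree and the transformation that carries it to the second tree.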
Summary: Key Contributions
The Synthesis
Infer user-intended sub-ranges from input-output examples.
Synthesize different programs that are applied to the inferred sub-ranges of a tree.
Generalize the sub-range sizes if the input-output examples were given on a smaller representative tree.
And of course synthesize the corresponding program.
Future Work
Build the software: we want to see if it's practical and gives our expected performance. If Gulwani did it, then so can we!
Generalize: better generalization of file systems; multiple levels, subset queries, DAG structures instead of trees.
Ranking function: we can only guess relative importance at the moment; we need the software to check which features work better.
References
[1] FCD15: John K. Feser, Swarat Chaudhuri, Isil Dillig. Synthesizing Data Structure Transformations from Input-Output Examples. PLDI 2015.
[2] FlashFill (Microsoft Excel 2013 feature). http://research.microsoft.com/users/sumitg/flashfill.html