15-150 Fall 2020 Lecture 8 Stephen Brookes
trees vs. lists • Representing a collection as a tree may enable a parallel speed-up • Using a sorted tree may enable faster code, e.g. for searching • With lists, even sorted lists, there’s less potential for parallelism • But badly balanced trees are no better than lists, and balance may be hard to achieve!
the plan • First, a quick review • We’ll discuss how to search in lists and trees - under various assumptions (sorted, balanced) • Then we’ll implement an algorithm for sorting a tree - and prove its correctness - and analyze its work and span
but first…
• Someone asked about naming conventions
• I prefer T for trees, t for types (and tea to drink)
• I often use capitalized names for datatype constructors like Node, Empty, SOME, NONE
• Not required by ML, but you must be consistent
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree;
fun size empty = 0
  | size (Node(A, _, B)) = 1 + size A + size B
What happens? Here empty is a variable, not the constructor Empty, so the first clause matches every tree and the Node clause is never reached.
balanced trees
We can build a balanced tree from a list… and (if we do it right) get the same list back by in-order traversal
[Figure: list2tree [4,1,2] builds the tree with root 1, left child 4 and right child 2; inord of that tree gives back [4,1,2]]
recall
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree
fun size Empty = 0
  | size (Node(T1, x, T2)) = 1 + (size T1) + (size T2)
fun depth Empty = 0
  | depth (Node(T1, x, T2)) = 1 + Int.max(depth T1, depth T2)
• size T = number of nodes
• depth T = length of longest path from root to leaf
• A full binary tree of depth d has 2^d - 1 nodes
• depth T is O(log (size T)) for a balanced tree, depth T is O(size T) otherwise(!)
recall fun inord Empty = [ ] | inord (Node(T1, x, T2)) = (inord T1) @ x :: (inord T2) fun list2tree [ ] = Empty | list2tree [x] = Node(Empty, x, Empty) | list2tree L = let val n = length L val (A, x::B) = takedrop (n div 2, L) in Node(list2tree A, x, list2tree B) end • inord T = inorder traversal list of T • length(inord T) = size T
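The definitions recalled above can be tried out as one self-contained sketch. The slides use takedrop without showing it, so the version here is an assumption: takedrop (n, L) is taken to return the first n items of L paired with the rest. The case expression replaces the slide's val (A, x::B) binding only to avoid a non-exhaustive binding warning.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* Assumed helper, not shown in the slides:
   takedrop (n, L) = (first n items of L, remaining items),
   for 0 <= n <= length L. *)
fun takedrop (0, L) = ([], L)
  | takedrop (n, x :: R) =
      let val (A, B) = takedrop (n - 1, R) in (x :: A, B) end
  | takedrop (_, []) = raise Fail "takedrop: list too short"

(* Build a balanced tree whose in-order traversal is L. *)
fun list2tree [] = Empty
  | list2tree [x] = Node (Empty, x, Empty)
  | list2tree L =
      let
        val n = length L
        val (A, B) = takedrop (n div 2, L)
      in
        case B of
          x :: B' => Node (list2tree A, x, list2tree B')
        | [] => raise Fail "unreachable: L has at least two items"
      end

(* In-order traversal: left subtree, root, right subtree. *)
fun inord Empty = []
  | inord (Node (T1, x, T2)) = inord T1 @ x :: inord T2

fun size Empty = 0
  | size (Node (T1, _, T2)) = 1 + size T1 + size T2

fun depth Empty = 0
  | depth (Node (T1, _, T2)) = 1 + Int.max (depth T1, depth T2)

val T = list2tree [4, 1, 2]   (* root 1, left child 4, right child 2 *)
val L' = inord T              (* the original list [4, 1, 2] *)
```

Evaluating length (inord T) and size T on a few inputs is a quick sanity check of the claim length(inord T) = size T.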
question • Would it have been OK to omit the [x] clause? fun list2tree [ ] = Empty | list2tree L = let val n = length L val (A, x::B) = takedrop (n div 2, L) in Node(list2tree A, x, list2tree B) end list2tree [4] = ???
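answer • Would it have been OK to omit the [x] clause? fun list2tree [ ] = Empty | list2tree L = let val n = length L val (A, x::B) = takedrop (n div 2, L) in Node(list2tree A, x, list2tree B) end YES Correctness proof still works! For L = [4] we get n div 2 = 0, so (assuming takedrop (n, L) splits L after its first n items) A = [ ], x = 4, B = [ ], and list2tree [4] = Node(Empty, 4, Empty).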
precision (or lack thereof) • There may be MANY balanced trees with the same inorder traversal list • list2tree L builds a balanced tree with inorder traversal list L • We don’t need to (or care to) say which one! Go back and see if/where/why we used imprecise specs before!
balanced • Empty is balanced • Node(A, x, B) is balanced iff |size(A) - size(B)| ≤ 1 and A, B are balanced A structurally inductive definition • If T is balanced, every node of T is balanced (by definition + an easy structural induction) • If T is balanced, its children each have about half the data (how could you prove this?)
sorted lists nil is sorted x::R is sorted iff x is ≤ every integer in R and R is sorted also a structurally inductive definition
sorted trees Empty is sorted Node(A, x, B) is sorted iff every integer in A is ≤ x, every integer in B is ≥ x, and A and B are sorted
sorted trees Theorem: T is a sorted tree iff inord T is a sorted list (prove by structural induction)
sorted trees [Figure: example trees built from the values 3, 14, 42, 57, 81 and 99, illustrating the definition of a sorted tree]
all
all : (int -> bool) * int tree -> bool
fun all (p, Empty) = true
  | all (p, Node(A, x, B)) = (p x) andalso all (p, A) andalso all (p, B)
REQUIRES p x terminates, for all x in T (weaker than requiring p to be total)
ENSURES all (p, T) = true iff every integer in T satisfies p
sorted fun sorted (T : int tree) : bool = case T of Empty => true | Node(A, x, B) => all ( fn y => y <= x, A) andalso all ( fn y => y >= x, B) andalso sorted A andalso sorted B sorted T = true iff T is a sorted tree
sorted fun sorted Empty = true | sorted (Node(A, x, B)) = all ( fn y => y <= x, A) andalso all ( fn y => y >= x, B) andalso sorted A andalso sorted B sorted T = true iff T is a sorted tree Useful in specs, never used in code!
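As a small illustration of how all and sorted behave, here is a self-contained sketch. The helper leaf and the example trees good and bad are made up for this example and do not come from the slides.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* all (p, T) = true iff every item in T satisfies p *)
fun all (p, Empty) = true
  | all (p, Node (A, x, B)) = p x andalso all (p, A) andalso all (p, B)

(* sorted T = true iff T is a sorted tree *)
fun sorted Empty = true
  | sorted (Node (A, x, B)) =
      all (fn y => y <= x, A) andalso all (fn y => y >= x, B)
      andalso sorted A andalso sorted B

(* Hypothetical helper: a one-node tree. *)
fun leaf x = Node (Empty, x, Empty)

(* Sorted: in-order traversal is [3, 14, 42, 57, 99]. *)
val good = Node (leaf 3, 14, Node (leaf 42, 57, leaf 99))

(* Not sorted: the right subtree is itself sorted, but it contains 2,
   which is smaller than the root 14. *)
val bad = Node (leaf 3, 14, Node (leaf 2, 57, leaf 99))

val goodIsSorted = sorted good
val badIsSorted = sorted bad
```

The tree bad shows why the definition quantifies over every integer in a subtree: comparing each node only with its immediate children would wrongly accept it.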
motivation Sorted data may be easier to deal with… • That’s why dictionaries are in lexicographic order! Let’s look at functions for searching data contained in • lists (unsorted, sorted) • trees (unsorted, sorted) • We’ll contrast the work and span.
searching an unsorted list
mem : int * int list -> bool
fun mem (x, [ ]) = false
  | mem (x, y::L) = (x = y) orelse mem (x, L)
REQUIRES true
ENSURES mem (x, L) = true iff x is in L
W_mem(x, L) is O(length L)
S_mem(x, L) is also O(length L)
searching a sorted list mem : int * int list -> bool fun mem (x, [ ]) = false | mem (x, y::L) = case Int.compare(x, y) of LESS => false | EQUAL => true | GREATER => mem (x, L) REQUIRES L is a sorted list ENSURES mem (x, L) = true iff x is in L W mem (x, L) is O(length L) S mem (x, L) is also O(length L)
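The sorted-list version can be exercised directly. The sketch below repeats the slide's code; the example lists are made up for illustration. The last binding shows what can go wrong when the REQUIRES clause is violated.

```sml
(* REQUIRES: L is a sorted list
   ENSURES:  mem (x, L) = true iff x is in L *)
fun mem (x, []) = false
  | mem (x, y :: L) =
      (case Int.compare (x, y) of
         LESS => false            (* everything after y is >= y > x *)
       | EQUAL => true
       | GREATER => mem (x, L))

val found = mem (3, [1, 3, 5])
val stopsEarly = mem (2, [1, 3, 5])   (* gives up on reaching 3 *)

(* Precondition violated: [3, 1] is not sorted, so the result is
   not reliable. Here 1 is in the list but the search stops at 3. *)
val unreliable = mem (1, [3, 1])
```

Stopping early improves typical running time, but the worst case still walks the whole list, which is why both work and span remain O(length L).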
searching an unsorted tree
mem : int * int tree -> bool
fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) = (x = y) orelse mem (x, A) orelse mem (x, B)
(* not designed for parallel evaluation *)
REQUIRES T is a tree
ENSURES mem (x, T) = true iff x is in T
W_mem(x, T) is O(size T)
S_mem(x, T) is also O(size T)
searching an unsorted tree mem : int * int tree -> bool fun mem (x, Empty) = false | mem (x, Node(A, y, B)) = (x = y) orelse let val (a, b) = (mem (x, A), mem (x, B)) in (* designed for parallel evaluation *) a orelse b end W mem (x, T) is O(size T) S mem (x, T) is O(depth T) … let’s see why
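The two tree searches compute the same answers; only the evaluation structure differs. The sketch below puts them side by side under the made-up names memSeq and memPar (the slides call both mem). In a sequential SML implementation the pair in memPar is still evaluated one component after the other, so the span bound describes the opportunity for parallelism rather than what a sequential run does.

```sml
datatype 'a tree = Empty | Node of 'a tree * 'a * 'a tree

(* Sequential: orelse searches B only after A has failed. *)
fun memSeq (x : int, Empty) = false
  | memSeq (x, Node (A, y, B)) =
      x = y orelse memSeq (x, A) orelse memSeq (x, B)

(* Written for parallel evaluation: the two recursive calls are
   independent components of a pair, so they could run side by side. *)
fun memPar (x : int, Empty) = false
  | memPar (x, Node (A, y, B)) =
      x = y orelse
      let
        val (a, b) = (memPar (x, A), memPar (x, B))
      in
        a orelse b
      end

(* Example tree: root 1, left child 4, right child 2. *)
val T = Node (Node (Empty, 4, Empty), 1, Node (Empty, 2, Empty))

val seqFinds2 = memSeq (2, T)
val parFinds2 = memPar (2, T)
```

The price of the pair is that memPar always searches both subtrees when x differs from the root, whereas memSeq can skip B once x is found in A.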
searching an unsorted tree
fun mem (x, Empty) = false
  | mem (x, Node(A, y, B)) = (x = y) orelse let val (a, b) = (mem (x, A), mem (x, B)) in a orelse b end
S(mem(x, Empty)) = 1
S(mem(x, Node(A, y, B))) = 1 + max(S(mem(x, A)), S(mem(x, B)))
S(mem(x, T)) is O(depth T)
Let S_mem(d) be the span of mem(x, T) for T of depth d
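One way the argument might be completed is sketched below. It assumes S_mem(d) is read as the worst-case span over trees of depth at most d (which makes it non-decreasing in d) and that constant costs are normalised to 1 as in the recurrence above. The symbols d_A and d_B, the depths of the two subtrees, are introduced here for illustration.

```latex
\begin{align*}
S_{mem}(0) &= 1 \\
S_{mem}(d) &\le 1 + \max\bigl(S_{mem}(d_A),\, S_{mem}(d_B)\bigr)
            && \text{where } d_A, d_B \le d - 1 \\
           &\le 1 + S_{mem}(d - 1) && \text{for } d > 0 \\
\text{hence}\quad S_{mem}(d) &\le d + 1, && \text{which is } O(d)
\end{align*}
```

The bound on d_A and d_B follows from depth (Node(A, y, B)) = 1 + Int.max(depth A, depth B), and the final line follows by induction on d.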