Number Systems and Data Structures

RALF HINZE
Institut für Informatik III, Universität Bonn
Römerstraße 164, 53117 Bonn, Germany
Email: ralf@informatik.uni-bonn.de
Homepage: http://www.informatik.uni-bonn.de/~ralf
March, 2005

(Pick up the slides at .../~ralf/talks.html#T40.)

An analogy

Natural numbers (aka Peano numerals, unary numbers etc.):

    data Nat = Zero | Succ Nat

    plus :: Nat → Nat → Nat
    plus Zero      n2 = n2
    plus (Succ n1) n2 = Succ (plus n1 n2)

Lists (aka stacks, sequences etc.):

    data List α = Nil | Cons α (List α)

    append :: ∀α . List α → List α → List α
    append Nil         x2 = x2
    append (Cons a x1) x2 = Cons a (append x1 x2)

☞ There is a strong analogy between representations of numbers (n) and representations of container objects (of size n).

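The analogy can be stated as a checkable property: the size of an appended list is the sum of the sizes. The following is a minimal sketch in plain Haskell; the helpers `size` and `sizeIsHomomorphic` are hypothetical names introduced here and do not appear on the slides.

```haskell
-- Unary naturals and lists, transcribed from the slide into plain Haskell.
data Nat = Zero | Succ Nat
  deriving (Eq, Show)

data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

plus :: Nat -> Nat -> Nat
plus Zero      n2 = n2
plus (Succ n1) n2 = Succ (plus n1 n2)

append :: List a -> List a -> List a
append Nil         x2 = x2
append (Cons a x1) x2 = Cons a (append x1 x2)

-- Hypothetical helper: forget the elements and keep only the size.
size :: List a -> Nat
size Nil        = Zero
size (Cons _ x) = Succ (size x)

-- Hypothetical helper: append on lists corresponds to plus on sizes.
sizeIsHomomorphic :: List a -> List a -> Bool
sizeIsHomomorphic x1 x2 = size (append x1 x2) == plus (size x1) (size x2)
```

Note that `plus` and `append` have the same recursion pattern; `size` maps each clause of one onto the matching clause of the other.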
Numerical representations

☞ Data structures that are designed on the basis of this analogy are called numerical representations.

Idea: the data structure inherits the properties of the number system. The operations on the data structure are modelled after their numerical counterparts.

    increment    n + 1      insertion into a container
    decrement    n − 1      deletion from a container
    addition     n1 + n2    union or merge of two container objects

☞ This design technique is suitable for implementing arbitrary abstractions: sequences, priority queues, sets etc.

☞ The numerical representations we shall introduce are fully persistent: updates never destroy the original data structure.

Numerical representations (continued)

    arithmetic shift     n ∗ b, n / b    we shall see
    multiplication       n ∗ k
    division             n / k           split of a container object
    number conversion                    construction of a container object;
                                         conversion between different container types

History

◮ Clancy, Knuth, A programming and problem-solving seminar, 1977.
◮ Guibas, McCreight, Plass, Roberts, A new representation for linear lists, 1977.
◮ Vuillemin, A data structure for manipulating priority queues, 1978.
◮ Okasaki, Purely Functional Data Structures, 1998.

☞ Some material is taken from Okasaki’s book, which I highly recommend.

Outline of the talk

✖ Exploring the analogy
✖ A toolbox of number systems
✖ Analysis of data structures
✖ A worked-out example: 2-3 finger trees

Random-access lists

Lists are based on the unary number system; random-access lists are based on the binary number system.

    data Seq α = Nil | Zero (Seq (α, α)) | One (α, Seq (α, α))

☞ The type of elements changes from position to position: the top level possibly contains an element of type α, the next of type (α, α), the next of type ((α, α), (α, α)) and so on. In other words, n ∗ 2 corresponds to pairing.

☞ Seq is an example of a non-regular or nested data type.

Random-access lists: examples

    Nil
    One (11, Nil)
    Zero (One ((10, 11), Nil))
    One (9, One ((10, 11), Nil))
    Zero (Zero (One (((8, 9), (10, 11)), Nil)))
    One (7, Zero (One (((8, 9), (10, 11)), Nil)))
    Zero (One ((6, 7), One (((8, 9), (10, 11)), Nil)))
    One (5, One ((6, 7), One (((8, 9), (10, 11)), Nil)))
    Zero (Zero (Zero (One ((((4, 5), (6, 7)), ((8, 9), (10, 11))), Nil))))
    One (3, Zero (Zero (One ((((4, 5), (6, 7)), ((8, 9), (10, 11))), Nil))))
    Zero (One ((2, 3), Zero (One ((((4, 5), (6, 7)), ((8, 9), (10, 11))), Nil))))
    One (1, One ((2, 3), Zero (One ((((4, 5), (6, 7)), ((8, 9), (10, 11))), Nil))))

Random-access lists: insertion

Insertion corresponds to binary increment, except that the carry is explicit: the carry is witnessed by a container object of the appropriate size.

    cons :: ∀α . (α, Seq α) → Seq α
    cons (a,  Nil)         = One (a, Nil)
    cons (a,  Zero x)      = One (a, x)
    cons (a1, One (a2, x)) = Zero (cons ((a1, a2), x))

☞ cons requires a non-schematic form of recursion, called polymorphic recursion: the recursive call inserts a pair, not an element.

☞ cons runs in Θ(log n) worst-case time.

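The insertion above can be transcribed into compilable Haskell almost verbatim. The following is a sketch, not the author's code: `fromList` and `digits` are hypothetical helpers added here to make the correspondence with binary increment visible.

```haskell
-- Binary random-access lists; the nested type needs no language extensions.
data Seq a = Nil | Zero (Seq (a, a)) | One (a, Seq (a, a))

-- The signature is mandatory: the recursive call is at type Seq (a, a),
-- and Haskell does not infer polymorphically recursive types.
cons :: (a, Seq a) -> Seq a
cons (a,  Nil)         = One (a, Nil)
cons (a,  Zero x)      = One (a, x)
cons (a1, One (a2, x)) = Zero (cons ((a1, a2), x))  -- carry a pair upwards

-- Hypothetical helper: build a sequence by repeated insertion.
fromList :: [a] -> Seq a
fromList = foldr (\a x -> cons (a, x)) Nil

-- Hypothetical helper: the binary digits of the size,
-- least significant digit first.
digits :: Seq a -> String
digits Nil          = ""
digits (Zero x)     = '0' : digits x
digits (One (_, x)) = '1' : digits x
```

For an eleven-element sequence, `digits` yields the binary numeral of 11 written least significant digit first, which matches the constructor spine of the last example on the examples slide.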
Random-access lists: deletion

Deletion corresponds to binary decrement, except that the borrow is explicit.

    uncons :: ∀α . Seq α → (α, Seq α)
    uncons (One (a, Nil)) = (a, Nil)
    uncons (One (a, x))   = (a, Zero x)
    uncons (Zero x)       = let ((a1, a2), x′) = uncons x in (a1, One (a2, x′))

☞ uncons is the mirror image of cons.

    cons :: ∀α . (α, Seq α) → Seq α
    cons (a,  Nil)         = One (a, Nil)
    cons (a,  Zero x)      = One (a, x)
    cons (a1, One (a2, x)) = Zero (cons ((a1, a2), x))

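The slide's uncons is partial: it is undefined on Nil. The sketch below shows one way to make it total in Haskell by returning a `Maybe`; the name `unconsMaybe` and the helper `digits` are assumptions introduced for illustration, not part of the slides.

```haskell
-- Binary random-access lists, as on the slides.
data Seq a = Nil | Zero (Seq (a, a)) | One (a, Seq (a, a))

cons :: (a, Seq a) -> Seq a
cons (a,  Nil)         = One (a, Nil)
cons (a,  Zero x)      = One (a, x)
cons (a1, One (a2, x)) = Zero (cons ((a1, a2), x))

-- Assumption: a total variant of uncons; Nothing signals an empty sequence.
unconsMaybe :: Seq a -> Maybe (a, Seq a)
unconsMaybe Nil            = Nothing
unconsMaybe (One (a, Nil)) = Just (a, Nil)
unconsMaybe (One (a, x))   = Just (a, Zero x)
unconsMaybe (Zero x)       = do
  -- Borrow: take a pair from the next position and split it.
  ((a1, a2), x') <- unconsMaybe x
  Just (a1, One (a2, x'))

-- Hypothetical helper: the binary digits of the size,
-- least significant digit first.
digits :: Seq a -> String
digits Nil          = ""
digits (Zero x)     = '0' : digits x
digits (One (_, x)) = '1' : digits x
```

The fresh name `x'` in the borrow step matters in Haskell: `let` and `do` bindings that reuse `x` on both sides would either loop or shadow the argument.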
Random-access lists: indexing

Indexing corresponds to . . . (well, it’s a bit like ‘ � ’).

    lookup :: ∀α . Integer → Seq α → α
    lookup 0         (One (a, x)) = a
    lookup (n + 1)   (One (a, x)) = lookup n (Zero x)
    lookup (2∗n + 0) (Zero x)     = fst (lookup n x)
    lookup (2∗n + 1) (Zero x)     = snd (lookup n x)

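The patterns n + 1 and 2∗n + 1 in the definition above are not accepted by standard Haskell today. A minimal sketch of the same idea using guards and `divMod` follows; the name `lookupSeq`, the `Maybe` result for out-of-range indices, and the helper `fromList` are assumptions made here.

```haskell
-- Binary random-access lists, as on the slides.
data Seq a = Nil | Zero (Seq (a, a)) | One (a, Seq (a, a))

cons :: (a, Seq a) -> Seq a
cons (a,  Nil)         = One (a, Nil)
cons (a,  Zero x)      = One (a, x)
cons (a1, One (a2, x)) = Zero (cons ((a1, a2), x))

-- Hypothetical helper: build a sequence by repeated insertion.
fromList :: [a] -> Seq a
fromList = foldr (\a x -> cons (a, x)) Nil

-- Hypothetical name, chosen to avoid clashing with Prelude's lookup.
-- Assumption: Nothing is returned when the index is out of range.
lookupSeq :: Integer -> Seq a -> Maybe a
lookupSeq _ Nil = Nothing
lookupSeq n (One (a, x))
  | n == 0    = Just a
  | otherwise = lookupSeq (n - 1) (Zero x)   -- skip the leading element
lookupSeq n (Zero x) = pick <$> lookupSeq q x -- index q in the list of pairs
  where
    (q, r) = n `divMod` 2
    pick   = if r == 0 then fst else snd      -- then select half of the pair
```

Each step either strips a leading element or halves the index, so the index is consumed bit by bit as the structure is descended.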
Random-access lists: construction

Container objects can be constructed in at least two different ways:

◮ construct a container object containing n copies of a given element:

    replicate :: ∀α . Integer → α → Seq α

◮ construct a container object from a given list of elements:

    toSeq :: ∀α . [α] → Seq α

Often, the former operation can be implemented more efficiently.

☞ In both cases, construction corresponds to conversion of number representations: here from the unary to the binary number system.

Conversion of number representations

There are at least two ways to convert a number in one system to the equivalent number in another system:

◮ use the arithmetic of the target number system; this is sometimes called the expansion method; functions of this type are typically folds.

◮ use the arithmetic of the source number system; this is sometimes called the multiplication or division method; functions of this type are typically unfolds.

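The two methods can be illustrated on bare numbers, before any container is involved. The sketch below converts unary numerals to binary digit lists both ways; all names (`Binary`, `inc`, `expand`, `modDiv2`, `divide`, `toNat`) are hypothetical choices made for this illustration, and digits are stored least significant first.

```haskell
import Data.List (unfoldr)

data Nat = Zero | Succ Nat

-- Binary numerals as digit lists, least significant digit first.
type Binary = [Int]

-- Binary increment: the arithmetic of the target system.
inc :: Binary -> Binary
inc []       = [1]
inc (0 : ds) = 1 : ds
inc (_ : ds) = 0 : inc ds

-- Expansion method: a fold over the unary numeral using target arithmetic.
expand :: Nat -> Binary
expand Zero     = []
expand (Succ n) = inc (expand n)

-- Halving with remainder: the arithmetic of the source system.
-- Returns (remainder, quotient).
modDiv2 :: Nat -> (Int, Nat)
modDiv2 Zero     = (0, Zero)
modDiv2 (Succ n) = case modDiv2 n of
  (0, q) -> (1, q)
  (_, q) -> (0, Succ q)

-- Division method: an unfold that emits one digit per halving step.
divide :: Nat -> Binary
divide = unfoldr step
  where
    step Zero = Nothing
    step n    = Just (modDiv2 n)

-- Hypothetical helper for experiments: build a unary numeral.
toNat :: Int -> Nat
toNat k = if k <= 0 then Zero else Succ (toNat (k - 1))
```

Both functions produce the same digit list; they differ only in which number system does the arithmetic.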
Construction: replicate

Using the arithmetic of the target system (unary to binary):

    replicate :: ∀α . Integer → α → Seq α
    replicate 0       a = Nil
    replicate (n + 1) a = cons (a, replicate n a)

☞ replicate runs in Θ(n) worst-case time; it is not polymorphically recursive.

Construction: replicate (continued)

Using the arithmetic of the source system (unary to binary):

    replicate :: ∀α . Integer → α → Seq α
    replicate n a = if n == 0 then Nil
                    else case modDiv n 2 of
                           (0, q) → Zero (replicate q (a, a))
                           (1, q) → One (a, replicate q (a, a))

☞ replicate runs in Θ(log n) worst-case time; it is polymorphically recursive.

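A compilable sketch of the logarithmic version follows. It uses the standard `divMod`, which returns the quotient first, in place of the slide's modDiv; the name `replicateSeq`, the treatment of negative counts as zero, and the helper `digits` are assumptions made here.

```haskell
-- Binary random-access lists, as on the slides.
data Seq a = Nil | Zero (Seq (a, a)) | One (a, Seq (a, a))

-- Hypothetical name, chosen to avoid clashing with Prelude's replicate.
-- Assumption: a negative count is treated like zero.
replicateSeq :: Integer -> a -> Seq a
replicateSeq n a
  | n <= 0    = Nil
  | r == 0    = Zero rest
  | otherwise = One (a, rest)
  where
    (q, r) = n `divMod` 2
    -- Polymorphic recursion: the tail holds q copies of the pair (a, a).
    rest   = replicateSeq q (a, a)

-- Hypothetical helper: the binary digits of the size,
-- least significant digit first.
digits :: Seq a -> String
digits Nil          = ""
digits (Zero x)     = '0' : digits x
digits (One (_, x)) = '1' : digits x
```

The count is halved at every level, so only one constructor is built per binary digit of n.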
Construction: toSeq

Using the arithmetic of the target system (unary to binary):

    toSeq :: ∀α . [α] → Seq α
    toSeq []      = Nil
    toSeq (a : x) = cons (a, toSeq x)

☞ toSeq runs in Θ(n) worst-case time.

☞ [α] is the built-in list data type, which is isomorphic to List α (see the slide “An analogy”): [α] ≅ List α, [] ≅ Nil, and a : x ≅ Cons a x.

Random-access lists: conversion (continued)

Using the arithmetic of the source system (unary to binary):

    data Digit α = Zero′ | One′ α

    modDiv2 :: [α] → (Digit α, [(α, α)])
    modDiv2 []      = (Zero′, [])
    modDiv2 (a : x) = case modDiv2 x of
                        (Zero′,   q) → (One′ a, q)
                        (One′ a′, q) → (Zero′, (a, a′) : q)

    toSeq :: ∀α . [α] → Seq α
    toSeq x = if null x then Nil
              else case modDiv2 x of
                     (Zero′,  q) → Zero (toSeq q)
                     (One′ a, q) → One (a, toSeq q)

☞ toSeq runs in Θ(n) worst-case time.

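The definitions above translate to compilable Haskell with only cosmetic changes: ASCII primes replace the typeset ones, and pattern matching on the empty list replaces the `null` test. This is a sketch under those assumptions rather than the author's original source.

```haskell
-- Binary random-access lists, as on the slides.
data Seq a = Nil | Zero (Seq (a, a)) | One (a, Seq (a, a))

-- A digit that carries its witness: One' holds the leftover element.
data Digit a = Zero' | One' a

-- Halve a list: pair up elements from the right and report whether
-- one element is left over at the front.
modDiv2 :: [a] -> (Digit a, [(a, a)])
modDiv2 []      = (Zero', [])
modDiv2 (a : x) = case modDiv2 x of
  (Zero',   q) -> (One' a, q)
  (One' a', q) -> (Zero', (a, a') : q)

-- Division method: emit one digit per halving step.
toSeq :: [a] -> Seq a
toSeq [] = Nil
toSeq x  = case modDiv2 x of
  (Zero',  q) -> Zero (toSeq q)
  (One' a, q) -> One (a, toSeq q)
```

Each halving pass traverses a list half as long as the previous one, so the passes sum to a linear amount of work.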
Exercises

Exercise 1. Implement two versions of

    size    :: ∀α . Seq α → Integer
    fromSeq :: ∀α . Seq α → [α]

and determine the worst-case running times (binary to unary).

1-2 random-access lists

The container object that corresponds to ‘0’ contains no elements. This is wasteful!

☞ Interestingly, we can also use the digits 1 and 2 instead of 0 and 1 (the base is still 2).

    data Seq α = Nil | One (α, Seq (α, α)) | Two ((α, α), Seq (α, α))

☞ Each number has a unique representation in this system; this is a so-called zeroless number system.

1-2 random-access lists: examples

    Nil
    One (11, Nil)
    Two ((10, 11), Nil)
    One (9, One ((10, 11), Nil))
    Two ((8, 9), One ((10, 11), Nil))
    One (7, Two (((8, 9), (10, 11)), Nil))
    Two ((6, 7), Two (((8, 9), (10, 11)), Nil))
    One (5, One ((6, 7), One (((8, 9), (10, 11)), Nil)))
    Two ((4, 5), One ((6, 7), One (((8, 9), (10, 11)), Nil)))
    One (3, Two (((4, 5), (6, 7)), One (((8, 9), (10, 11)), Nil)))
    Two ((2, 3), Two (((4, 5), (6, 7)), One (((8, 9), (10, 11)), Nil)))
    One (1, One ((2, 3), Two ((((4, 5), (6, 7)), ((8, 9), (10, 11))), Nil)))

1-2 random-access lists: insertion

    cons :: ∀α . (α, Seq α) → Seq α
    cons (a,  Nil)               = One (a, Nil)
    cons (a1, One (a2, x))       = Two ((a1, a2), x)
    cons (a1, Two ((a2, a3), x)) = One (a1, cons ((a2, a3), x))

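The zeroless insertion can be checked against the examples in the same way as the binary one. The sketch below is a direct transcription; `fromList`, `digits` and `value` are hypothetical helpers added here to read off the 1-2 numeral of a sequence and to confirm that it denotes the number of elements.

```haskell
-- 1-2 random-access lists: digits 1 and 2, base 2, no digit 0.
data Seq a = Nil | One (a, Seq (a, a)) | Two ((a, a), Seq (a, a))

cons :: (a, Seq a) -> Seq a
cons (a,  Nil)               = One (a, Nil)
cons (a1, One (a2, x))       = Two ((a1, a2), x)
cons (a1, Two ((a2, a3), x)) = One (a1, cons ((a2, a3), x))  -- carry a pair

-- Hypothetical helper: build a sequence by repeated insertion.
fromList :: [a] -> Seq a
fromList = foldr (\a x -> cons (a, x)) Nil

-- Hypothetical helper: the digits of the size, least significant first.
digits :: Seq a -> [Integer]
digits Nil          = []
digits (One (_, x)) = 1 : digits x
digits (Two (_, x)) = 2 : digits x

-- Hypothetical helper: the number denoted by a base-2 digit list.
value :: [Integer] -> Integer
value = foldr (\d v -> d + 2 * v) 0
```

Because no position is ever empty, every constructor in the spine carries elements, which is the saving the zeroless system is meant to deliver.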