Parametric Type Inferencing for Helium

Bastiaan Heeren and Jurriaan Hage
Institute of Information and Computing Science, Utrecht University
e-mail: {bastiaan,jur}@cs.uu.nl
September 17, 2002

The Helium Compiler

◮ Helium is Haskell 98 with the following restrictions:
    - type classes are not allowed,
    - all type variables have kind ⋆.
◮ A major design criterion is the ability to give superb error messages. This is especially important for novice functional programmers.
◮ The compiler is structured as follows.

  [Figure: compiler pipeline. .hs → Scanning and Parsing (Parsec) → UHA → Static Analysis (UU_AG) → UHA → Desugaring → .core → coreToAsm, asmToLvm, lvmToBytes (the LVM assembler library)]

Types and Type Constraints (1)

◮ The syntax of types and type schemes is given by:

    (type)         τ := α | T τ₁ ... τₙ    where arity(T) = n
    (type scheme)  σ := ∀ᾱ.τ

◮ We introduce three forms of type constraint:

    (constraint)   C := τ₁ ≡ τ₂      (equality)
                      |  τ₁ ≤_M τ₂    (implicit instance)
                      |  τ ≼ σ        (explicit instance)

Types and Type Constraints (2)

◮ Satisfaction of a constraint by a substitution S is defined as follows.

    S satisfies (τ₁ ≡ τ₂)    =def  Sτ₁ = Sτ₂
    S satisfies (τ₁ ≤_M τ₂)  =def  Sτ₁ ≺ generalize(SM, Sτ₂)
    S satisfies (τ ≼ σ)      =def  Sτ ≺ Sσ

◮ Examples:

  (τ₃ ≤_∅ τ₁ → τ₂) is satisfied by S = [τ₁ := τ₂, τ₃ := Int → Int]:

    Sτ₃        ≺  generalize(S(∅), S(τ₁ → τ₂))
    Int → Int  ≺  generalize(∅, τ₂ → τ₂)
    Int → Int  ≺  ∀α. α → α

  (τ₃ ≤_{τ₄} τ₁ → τ₂) is not satisfied by S = [τ₁ := τ₄, τ₂ := τ₄, τ₃ := Int → Int]:

    Sτ₃        ⊀  generalize(S{τ₄}, S(τ₁ → τ₂))
    Int → Int  ⊀  generalize({τ₄}, τ₄ → τ₄)
    Int → Int  ⊀  τ₄ → τ₄
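
The two examples on this slide can be checked mechanically. The following is a minimal sketch, not Helium's implementation: the `Type` and `Scheme` representations, `instanceOf` (one-way matching for ≺), and `satisfiesImplicit` are hypothetical names chosen for illustration, and the substitution is assumed to be idempotent so that it is applied only once.

```haskell
import Data.List (nub)
import Data.Maybe (isJust)
import qualified Data.Map as Map

-- Assumed representation: numbered type variables and applied constructors.
data Type = TVar Int | TCon String [Type]
  deriving (Eq, Show)

-- A type scheme: quantified variables plus a body.
data Scheme = Forall [Int] Type
  deriving (Eq, Show)

type Subst = Map.Map Int Type

infixr 5 -->
(-->) :: Type -> Type -> Type
a --> b = TCon "->" [a, b]

int :: Type
int = TCon "Int" []

-- Apply a substitution once (assumed idempotent).
apply :: Subst -> Type -> Type
apply s t@(TVar v)  = Map.findWithDefault t v s
apply s (TCon c ts) = TCon c (map (apply s) ts)

ftv :: Type -> [Int]
ftv (TVar v)    = [v]
ftv (TCon _ ts) = nub (concatMap ftv ts)

-- Quantify the variables of t that do not occur free in the monomorphic set m.
generalize :: [Type] -> Type -> Scheme
generalize m t = Forall [v | v <- ftv t, v `notElem` concatMap ftv m] t

-- t `instanceOf` sigma: can the quantified variables be instantiated so that
-- the scheme body becomes exactly t? Implemented as one-way matching.
instanceOf :: Type -> Scheme -> Bool
instanceOf t (Forall qs body) = isJust (match body t Map.empty)
  where
    match (TVar v) u s
      | v `elem` qs = case Map.lookup v s of
                        Nothing -> Just (Map.insert v u s)
                        Just u' -> if u == u' then Just s else Nothing
      | otherwise   = if u == TVar v then Just s else Nothing
    match (TCon c ts) (TCon d us) s
      | c == d && length ts == length us = matchAll (zip ts us) s
    match _ _ _ = Nothing
    matchAll [] s              = Just s
    matchAll ((a, b) : rest) s = match a b s >>= matchAll rest

-- S satisfies (t1 ≡ t2)
satisfiesEq :: Subst -> Type -> Type -> Bool
satisfiesEq s t1 t2 = apply s t1 == apply s t2

-- S satisfies (t1 ≤_M t2)
satisfiesImplicit :: Subst -> [Type] -> Type -> Type -> Bool
satisfiesImplicit s m t1 t2 =
  apply s t1 `instanceOf` generalize (map (apply s) m) (apply s t2)
```

With τ₁ to τ₄ encoded as `TVar 1` to `TVar 4`, the first example corresponds to `satisfiesImplicit s1 [] (TVar 3) (TVar 1 --> TVar 2)` and the second to the same call with the monomorphic set `[TVar 4]` and the second substitution.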

Bottom-Up Typing Rules

A judgement M, A, C ⊢_BU^e e : τ for expression e consists of:

◮ a set of monomorphic type variables M,
◮ an assumption set A to record the type variables for the free variables,
◮ a set of type constraints C,
◮ a type τ for e.

[Var]

    ──────────────────────────────
    M, {x : β}, ∅ ⊢_BU^e x : β

[If]

    M, A₁, C₁ ⊢_BU^e e₁ : τ₁     M, A₂, C₂ ⊢_BU^e e₂ : τ₂     M, A₃, C₃ ⊢_BU^e e₃ : τ₃
    ──────────────────────────────────────────────────────────────────────────────────
    M, A₁ ∪ A₂ ∪ A₃, C₁ ∪ C₂ ∪ C₃ ∪ {τ₁ ≡ Bool, τ₂ ≡ β, τ₃ ≡ β} ⊢_BU^e if e₁ then e₂ else e₃ : β

[Abs]

    Bᵢ, Cᵢ ⊢_BU^p pᵢ : τᵢ  for 1 ≤ i ≤ n     B = ⋃ᵢ Bᵢ     M ∪ ran(B), A, C ⊢_BU^e e : τ
    ──────────────────────────────────────────────────────────────────────────────────
    M, A\dom(B), ⋃ᵢ Cᵢ ∪ C ∪ (B ≡ A) ∪ {β ≡ τ₁ → ... → τₙ → τ} ⊢_BU^e (λp₁ ... pₙ → e) : β
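
The rules above can be read as a function from an expression to an assumption set, a constraint set, and a type. The sketch below is an illustration under simplifying assumptions, not Helium's code: `Expr`, `Constraint`, and `infer` are hypothetical names, lambdas bind a single variable instead of patterns, and the monomorphic set M is left out because the fragment has no let-bindings. Fresh type variables come from an integer counter that is threaded through the traversal.

```haskell
-- Assumed representation of types and equality constraints.
data Type = TVar Int | TCon String [Type]
  deriving (Eq, Show)

data Constraint = Type :=: Type
  deriving (Eq, Show)

-- A small expression language (hypothetical, for illustration only).
data Expr
  = Var String
  | If Expr Expr Expr
  | Lam String Expr
  deriving Show

type Assumptions = [(String, Type)]

-- infer unique e = (next unique, assumptions, constraints, type of e)
infer :: Int -> Expr -> (Int, Assumptions, [Constraint], Type)
-- [Var]: a fresh variable, recorded as an assumption about x.
infer u (Var x) = (u + 1, [(x, beta)], [], beta)
  where beta = TVar u
-- [If]: union of assumptions and constraints, plus three new equalities.
infer u (If g t e) = (u3, a1 ++ a2 ++ a3, c1 ++ c2 ++ c3 ++ new, beta)
  where
    beta             = TVar u
    (u1, a1, c1, t1) = infer (u + 1) g
    (u2, a2, c2, t2) = infer u1 t
    (u3, a3, c3, t3) = infer u2 e
    new              = [t1 :=: TCon "Bool" [], t2 :=: beta, t3 :=: beta]
-- [Abs], restricted to one variable: B = {x : tx}, so B ≡ A equates tx with
-- every assumption about x, and those assumptions are removed from A.
infer u (Lam x body) =
  (u', rest, c ++ binds ++ [beta :=: TCon "->" [tx, t]], beta)
  where
    beta          = TVar u
    tx            = TVar (u + 1)
    (u', a, c, t) = infer (u + 2) body
    binds         = [tx :=: t' | (y, t') <- a, y == x]
    rest          = [(y, t') | (y, t') <- a, y /= x]
```

For instance, `infer 0 (Lam "x" (If (Var "x") (Var "x") (Var "x")))` leaves no assumptions, since every occurrence of `x` is bound by the lambda.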

Collecting the Constraints (1)

The attributes for an expression are given by:

    ATTR Expr [ mono   : Types             -- inherited
              | unique : Int               -- chained
              | aset   : Assumptions       -- synthesized
                ctree  : ConstraintTree
                beta   : Type
              ]

◮ The type constraints are stored in a tree. Later, these can be ordered.
◮ The unique counter provides fresh type variables when required.

Collecting the Constraints (2)

  [Figure: attribute flow for the If production. The parent (lhs) and the three Expr children (guard, then, else) each carry the attributes unique (u), mono (m), aset (a), ctree (c), and beta (b); the If node has a local attribute beta (loc b).]

    SEM Expr
      | If  loc   . beta   = TVar @lhs.unique
            guard . unique = @lhs.unique + 1
            lhs   . aset   = @guard.aset ++ @then.aset ++ @else.aset
                  . ctree  = Node [ [ @guard.beta ≡ boolType ] `add` @guard.ctree
                                  , [ @then.beta  ≡ @beta    ] `add` @then.ctree
                                  , [ @else.beta  ≡ @beta    ] `add` @else.ctree
                                  ]

Collecting the Constraints (3)

[Abs]

    Bᵢ, Cᵢ ⊢_BU^p pᵢ : τᵢ  for 1 ≤ i ≤ n     B = ⋃ᵢ Bᵢ     M ∪ ran(B), A, C ⊢_BU^e e : τ
    ──────────────────────────────────────────────────────────────────────────────────
    M, A\dom(B), ⋃ᵢ Cᵢ ∪ C ∪ (B ≡ A) ∪ {β ≡ τ₁ → ... → τₙ → τ} ⊢_BU^e (λp₁ ... pₙ → e) : β

    SEM Expression
      | Lambda  lhs  . aset   = removeKeys (map fst @pats.bset) @expr.aset
                     . ctree  = [ @beta ≡ foldr (→) @expr.beta @pats.betas ] `add`
                                Node [ @pats.ctree, @binds `spread` @expr.ctree ]
                pats . unique = @lhs.unique + 1
                expr . mono   = map snd @pats.bset ++ @lhs.mono
                loc  . beta   = TVar @lhs.unique
                     . binds  = [ τ₁ ≡ τ₂ | (x₁, τ₁) ← @pats.bset, (x₂, τ₂) ← @expr.aset, x₁ == x₂ ]

Flattening the Constraint Tree

◮ The location where an inconsistency is detected strongly depends on the order in which types are unified.
◮ By flattening the tree into a list of constraints to be considered in that order, we can imitate type inference algorithms such as W and M.
◮ Type constraints corresponding to the binding of a variable to a pattern variable can be “spread” to the location of the bound variable.

  [Figure: constraint tree for an abstraction (Abs, v7) that binds Var "f" (v0) and Var "x" (v1); its body is an application (App, v6) of Var "f" (v2) to an application (App, v5) of Var "f" (v3) to Var "x" (v4).]

  Constraints shown in the figure:

    v7 = v0 -> v1 -> v6
    v0 = v2
    v0 = v3
    v1 = v4
    v2 = v5 -> v6
    v3 = v4 -> v5

  Algorithm    Spreading    Tree walk
  W            spread       postorder
  M            spread       preorder
  Bottom-Up    no spread    postorder
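
The difference between the two tree walks can be sketched as follows. This is an illustration only: the `ConstraintTree` type with its `Node` and `Add` constructors is a hypothetical simplification of the tree used in the slides, constraints are plain strings, spreading is not modelled, and attaching the binding constraints directly below the abstraction is an assumption made for the example.

```haskell
-- A simplified constraint tree (assumed shape, not Helium's definition).
data ConstraintTree c
  = Node [ConstraintTree c]        -- combine subtrees, left to right
  | Add [c] (ConstraintTree c)     -- constraints attached to a subtree
  deriving Show

-- Pre-order: constraints attached to a node come before those of its subtree.
flattenPre :: ConstraintTree c -> [c]
flattenPre (Node ts)  = concatMap flattenPre ts
flattenPre (Add cs t) = cs ++ flattenPre t

-- Post-order: constraints attached to a node come after those of its subtree.
flattenPost :: ConstraintTree c -> [c]
flattenPost (Node ts)  = concatMap flattenPost ts
flattenPost (Add cs t) = flattenPost t ++ cs

-- The constraints of the example on this slide, placed in a tree by hand.
example :: ConstraintTree String
example =
  Add ["v7 = v0 -> v1 -> v6"]
    (Add ["v0 = v2", "v0 = v3", "v1 = v4"]          -- bindings, not spread
      (Add ["v2 = v5 -> v6"]                        -- outer application
        (Node [ Node []                             -- Var "f" (v2)
              , Add ["v3 = v4 -> v5"]               -- inner application
                  (Node [Node [], Node []]) ])))    -- Var "f" (v3), Var "x" (v4)
```

In this sketch `flattenPre example` starts with the constraint for the abstraction, whereas `flattenPost example` starts with the constraint for the innermost application, so an inconsistency would be reported at a different place in the program.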

A Type Class to Solve Constraints (1)

    class Solver solver info where
       initialize      :: State solver info ()
       makeConsistent  :: State solver info ()
       unifyTypes      :: info → Type → Type → State solver info ()
       newVariables    :: [Int] → State solver info ()
       findSubstForVar :: Int → State solver info Type

◮ The State monad contains a uniqueness counter, a list of reported inconsistencies, and a substitution or something equivalent.
◮ By default, initialize, makeConsistent, and newVariables do nothing.

    solve :: Solver solver info ⇒ Int → Constraints info → State solver info ()
    solve unique constraints =
       do setUnique unique
          initialize
          mapM solveOne constraints
          makeConsistent

A Type Class to Solve Constraints (2)

◮ The following code fragment shows how to solve a single constraint.

    solveOne :: Solver solver info ⇒ Constraint info → State solver info ()
    solveOne constraint =
       case constraint of

          t1 ≡ t2   → do unifyTypes info t1 t2

          tp ≼ ts   → do unique ← getUnique
                         let (unique', its) = instantiate unique ts
                         setUnique unique'
                         newVariables [unique..unique'-1]
                         solveOne (tp ≡ its)

          t1 ≤_m t2 → do makeConsistent
                         t2' ← applySubst t2
                         m'  ← mapM applySubst m
                         let scheme = generalize (ftv m') t2'
                         solveOne (t1 ≼ scheme)
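
The explicit instance case relies on `instantiate`, whose definition is not shown on the slide. A minimal sketch of what such a function could look like is given below; the `Type` and `Scheme` representations are assumptions, and the only property taken from the slide is that the fresh variables are exactly `[unique..unique'-1]`.

```haskell
import qualified Data.Map as Map

-- Assumed representation of types and type schemes.
data Type = TVar Int | TCon String [Type]
  deriving (Eq, Show)

data Scheme = Forall [Int] Type
  deriving (Eq, Show)

-- Replace each quantified variable by a fresh one, starting at `unique`.
-- Returns the next unused counter value together with the instantiated type.
instantiate :: Int -> Scheme -> (Int, Type)
instantiate unique (Forall qs t) = (unique + length qs, go t)
  where
    fresh = Map.fromList (zip qs (map TVar [unique ..]))
    go (TVar v)    = Map.findWithDefault (TVar v) v fresh
    go (TCon c ts) = TCon c (map go ts)
```

Variables that are not quantified are left untouched, which is what keeps monomorphic type variables shared between all instances of a scheme.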

Greedy Constraint Solving

The most obvious instance of Solver is a substitution, such that:

◮ unifyTypes incorporates the most general unifier of the two types into the substitution.
◮ findSubstForVar applies the current substitution to a type variable.
◮ If two types cannot be unified, we immediately deal with the inconsistency.
    - An appropriate error message is constructed that explains why the types cannot be unified. Information that is stored with the constraint can be used.
    - We can choose to ignore the erroneous constraint and continue solving the remaining constraints. This might lead to the detection of multiple errors.
◮ The default skip function is the obvious choice for makeConsistent.
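
The greedy strategy can be sketched outside the Solver class as a fold over equality constraints. This is an illustration under assumptions, not the Helium solver: `unify`, `solveGreedy`, and the triple representation of a constraint with its attached info are hypothetical, and only equality constraints are handled. A constraint that cannot be unified is reported through its info and then ignored, so later errors can still be found.

```haskell
import qualified Data.Map as Map

-- Assumed representation of types.
data Type = TVar Int | TCon String [Type]
  deriving (Eq, Show)

type Subst = Map.Map Int Type

-- Apply the substitution exhaustively; the occurs check in `unify`
-- keeps the substitution acyclic, so this terminates.
apply :: Subst -> Type -> Type
apply s (TVar v)    = maybe (TVar v) (apply s) (Map.lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

occurs :: Int -> Type -> Bool
occurs v (TVar w)    = v == w
occurs v (TCon _ ts) = any (occurs v) ts

-- Extend s with a most general unifier of t1 and t2, or fail with Nothing.
unify :: Subst -> Type -> Type -> Maybe Subst
unify s t1 t2 = go (apply s t1) (apply s t2)
  where
    go (TVar v) t
      | t == TVar v = Just s
      | occurs v t  = Nothing                 -- would create an infinite type
      | otherwise   = Just (Map.insert v t s)
    go t (TVar v) = go (TVar v) t
    go (TCon c ts) (TCon d us)
      | c == d && length ts == length us = unifyAll s (zip ts us)
      | otherwise                        = Nothing

unifyAll :: Subst -> [(Type, Type)] -> Maybe Subst
unifyAll s []            = Just s
unifyAll s ((a, b) : ps) = unify s a b >>= \s' -> unifyAll s' ps

-- Solve constraints in list order. On failure, record the constraint's info
-- and continue with the substitution built so far.
solveGreedy :: [(info, Type, Type)] -> (Subst, [info])
solveGreedy = foldl step (Map.empty, [])
  where
    step (s, errs) (i, t1, t2) =
      case unify s t1 t2 of
        Just s' -> (s', errs)
        Nothing -> (s, errs ++ [i])
```

Because the fold commits to each successful unification immediately, which constraint gets blamed depends on the order of the list.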

Constraint Solving with Type Graphs (1)

◮ A type graph considers a complete set of type constraints. Compared to the standard (greedy) algorithms, the advantages are the following.
    - There is no left-to-right bias in the algorithm.
    - Global analysis can be performed, since all deductive steps for a type variable are available.
    - It is straightforward to plug in your own heuristics.
◮ The relative order of the implicit instance constraints must be taken into account. Moreover, when converting an implicit instance constraint into an explicit instance constraint, the state must be consistent.
◮ Example:

    f 0 y = y
    f x y = if y then x else f (x - 1) y