Codept, a whole-project dependency analyzer for OCaml
Florian “octachron” Angeletti, INRIA
ICFP 2019, OCaml workshop, 23 August 2019
Discovering dependencies
$ ls
a.ml b.ml c.ml d.ml e.ml f.ml atlas.ml
◮ Discovering project structure
◮ Building project
Let’s call ocamldep
[Figure: graph over the modules A, B, C, D, E, F and Atlas]
The End
wait . . . a flower?
A . . . flower?
(* Atlas.ml *)
module X = A
module Y = B
module Z = C
module W = D
module R = E
module S = F
[Figure: graph over the modules A, B, C, D, E, F and Atlas]
$ ocamldep -map atlas.ml a.ml b.ml c.ml d.ml e.ml f.ml
And submodule collisions?
(* a.ml *)
open B
open A
(* b.ml *)
module A = struct end
[Figure: graph with nodes A and B]
Don’t forget first class modules
(* a.ml *)
module type S = sig module A : sig end end
let f () =
  let module M = struct module A = struct end end in
  (module M : S)
let () =
  let module M = (val f ()) in
  let open M in
  let open A in
  ()
[Figure: graph node A]
Abstract module types?
(* a.ml *)
module type S = sig
  module type T
  module M : T
end
module F (X : S) = struct
  module M = X.M
end
module X = struct
  module type T = sig module A : sig end end
  module M = struct module A = struct end end
end
open F(X)
open M
open A
[Figure: graph node A]
Modules and compilation units
Compilation unit: a concrete file or pair of files mapped to a module
◮ All compilation units are mapped to a module
◮ Not all modules come from compilation units
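As a minimal sketch of the first point, assuming the usual convention that a file name is capitalized into its module name. The helpers `module_name_of_file` and `units` are hypothetical names used for illustration, not part of codept:

```ocaml
(* Hypothetical helper: the module name conventionally derived from a
   source file name, e.g. "src/atlas.ml" gives "Atlas". *)
let module_name_of_file (path : string) : string =
  path
  |> Filename.basename
  |> Filename.remove_extension
  |> String.capitalize_ascii

(* An .ml/.mli pair forms a single compilation unit, hence a single name. *)
let units (files : string list) : string list =
  files |> List.map module_name_of_file |> List.sort_uniq String.compare
```

The converse does not hold: a submodule such as `module Sub = struct ... end` has a name but no file, which is why a name alone cannot tell the two apart.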
Context matters ... open A
Context matters module A = struct ... end open A
Context matters (* start of the file *) open A
Context matters
open B (* Does B define a submodule A? *)
open A
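The four situations above can be sketched as a small classifier. This is an illustrative simplification under stated assumptions: the types `item` and `status` and the function `classify` are hypothetical, and any earlier `open` is treated as possibly shadowing the name, since the contents of the opened module are not known here.

```ocaml
(* Hypothetical, simplified view of what precedes an [open A]. *)
type item =
  | Define of string   (* module A = struct ... end *)
  | Open of string     (* open B *)

type status =
  | Local              (* bound earlier in the same file *)
  | External           (* nothing binds it: must be a compilation unit *)
  | Unknown of string  (* depends on the signature of this opened module *)

(* Classify [name] from the items that precede its use. *)
let classify (preceding : item list) (name : string) : status =
  (* Walk backwards: the most recent binding wins. *)
  let rec go = function
    | [] -> External
    | Define m :: _ when m = name -> Local
    | Define _ :: rest -> go rest
    | Open m :: _ -> Unknown m
  in
  go (List.rev preceding)
```

With `open B` before `open A`, the answer is `Unknown "B"`: it cannot be settled without the signature of `B`.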
Dependency tracking in OCaml Recognizing compilation units from submodules.
Naive dependencies ◮ All actual direct dependencies are recorded ◮ So many false positives ◮ Aliases are not tracked
OCamldep’s way: local analysis
Local analysis
module Sub = struct ... end
include Sub
◮ Every module that is not a submodule of the current compilation unit is assumed to be a compilation unit
◮ Nearly an over-approximation
◮ False positive: post-processing phase
◮ Alias: manual map tracking with -map option
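A rough sketch in the spirit of such a local analysis, over a toy AST. This is not ocamldep's implementation: the types `item` and `tree` and the functions `analyse` and `local_deps` are assumptions made for illustration, and only `module` bindings and `open` are modelled.

```ocaml
module S = Set.Make (String)

(* Toy AST: module bindings and opens only. *)
type item =
  | Bind of string * item list  (* module M = struct ... end *)
  | Open of string              (* open M *)

(* What is known about a local module: its own submodules. *)
type tree = Node of (string * tree) list

(* Returns the modules defined by [items] and the names assumed to be
   compilation units. [env] holds the local modules currently in scope. *)
let rec analyse (env : (string * tree) list) (items : item list)
  : (string * tree) list * S.t =
  match items with
  | [] -> ([], S.empty)
  | Bind (name, body) :: rest ->
    let defs, inner = analyse env body in
    let binding = (name, Node defs) in
    let more, outer = analyse (binding :: env) rest in
    (binding :: more, S.union inner outer)
  | Open name :: rest ->
    (match List.assoc_opt name env with
     | Some (Node subs) ->
       (* A local module: its submodules enter the scope. *)
       analyse (subs @ env) rest
     | None ->
       (* Not local: assumed to be a compilation unit, contents ignored. *)
       let more, found = analyse env rest in
       (more, S.add name found))

let local_deps (items : item list) : string list =
  S.elements (snd (analyse [] items))
```

Because the contents of an opened external module are ignored, a name it shadows is still resolved locally, which is where the "nearly" in "nearly an over-approximation" comes from.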
Going further: whole-project analysis
How to get precise dependencies for a file A?
What do you need for precise dependencies
◮ A signature for the universe
◮ How to deal with first-class modules?
◮ A signature for all compilation units
◮ A signature for all dependencies
Codept core idea
◮ A dependency and signature analyzer
◮ ...that can stop on a missing signature and resume later
Codept specification
◮ No warning means exact dependencies
◮ With a warning, at worst an over-approximation of dependencies
◮ All analysis results are serializable in machine readable formats
Secondary goal
◮ Full compatibility with ocamldep
3 layers ◮ AST simplification ◮ Interruptible interpreter ◮ Dependency orchestration
Simplified M2l Ast
◮ expressions ◮ classes ◮ patterns ◮ modules ◮ types ◮ module types
type expression =
  | Open of module_expr
  | Include of module_expr
  | SigInclude of module_type
  | Bind of module_expr bind
  | Bind_sig of module_type bind
  | Bind_rec of module_expr bind list
  | Minor of annotation
  | Extension_node of extension
and ...
Full OCaml Parsetree: 960 LOC
Simplified (M2l) AST: 80 LOC
Source:
module M = struct
  module X = struct
  end
end
open M open X
open B
module C = struct end
M2l:
[ module M = [ module X = [] (l2.2-l3.5)] (l1.0-l4.3)
  open [M](l5.0-6)
  open [X](l5.7-13)
  open [B](l6.0-6)
  module C = [](l7.0-21) ]
Interruptible interpreter
◮ How to represent partial evaluation result: partial result, hole, partial AST
◮ Hole: a still unknown module name
Zipper
◮ Add holes to the AST data type
◮ Holes are to be filled by the environment
[ module M = [ module X = [] (l2.2-l3.5)] (l1.0-l4.3)
  open [M](l5.0-6)
  open [X](l5.7-13)
  open [B](l6.0-6)
  module C = [](l7.0-21) ]
Computation halted at:
...
open B ?
[ module C = [] (l7.0-21)]
Zipper example
...
and module_expr =
  | Ident of Paths.Simple.t
  | Apply of { f: module_expr; x: module_expr }
  ...
type 'hole me =
  | Ident : path_in_context me
  | Apply_left : M2l.module_expr -> M2l.module_expr me
  | Apply_right : module_expr -> M2l.module_expr me
  | ...
Evaluation ◮ Try to fill all holes ◮ Fail if there is a hole that the environment doesn’t know how to fill ◮ Return the signature and dependencies otherwise
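The stop-and-resume behaviour can be sketched with a miniature interpreter. This is a toy model under stated assumptions, not codept's zipper: the types `sign`, `item` and `outcome` and the functions `enter`, `eval` and `resume` are hypothetical, and the "hole" is simply the first `open` whose signature is not yet known.

```ocaml
module Env = Map.Make (String)

(* A signature, reduced to its named submodules. *)
type sign = Sig of sign Env.t

type item =
  | Bind of string * sign  (* a local module with a known signature *)
  | Open of string         (* open M *)

type outcome =
  | Done of string list    (* dependencies, in the order found *)
  | Halted of {
      missing : string;    (* the module whose signature is needed *)
      deps : string list;  (* dependencies so far, most recent first *)
      scope : sign Env.t;  (* local scope at the point of interruption *)
      rest : item list;    (* what remains to evaluate, hole included *)
    }

(* Opening a module brings its submodules into scope, shadowing older names. *)
let enter (Sig subs) scope = Env.union (fun _ fresh _ -> Some fresh) subs scope

(* [units] maps the compilation units analysed so far to their signatures. *)
let rec eval ~units ~scope ~deps items =
  match items with
  | [] -> Done (List.rev deps)
  | Bind (name, s) :: rest -> eval ~units ~scope:(Env.add name s scope) ~deps rest
  | Open name :: rest ->
    (match Env.find_opt name scope, Env.find_opt name units with
     | Some s, _ -> eval ~units ~scope:(enter s scope) ~deps rest
     | None, Some s -> eval ~units ~scope:(enter s scope) ~deps:(name :: deps) rest
     | None, None -> Halted { missing = name; deps; scope; rest = items })

(* Resume a halted computation once more unit signatures are available. *)
let resume ~units = function
  | Done _ as finished -> finished
  | Halted h -> eval ~units ~scope:h.scope ~deps:h.deps h.rest
```

On `open B` followed by `open A`, evaluation halts on `B`. Once a signature for `B` that defines a submodule `A` is supplied, resuming yields `B` as the only dependency.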
Orchestration ◮ Different strategies to compute whole-project signature and dependencies ◮ What to do with cycles?
Cycles
◮ Report them?
◮ Try to remove them and go on with the rest of the computation?
[Figure: graph with nodes A, B, C, D, E, F, G and H]
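One possible strategy, sketched under simplifying assumptions: the dependencies between local units are taken as already known (in practice they are discovered as the interpreter halts), and the function `schedule` is a hypothetical name. Units are resolved round by round, and whatever never becomes ready is in a cycle or waiting on one.

```ocaml
(* [units] pairs each local unit with the local units it waits for.
   Returns the resolution order and the units that could not be resolved. *)
let schedule (units : (string * string list) list) : string list * string list =
  let rec loop resolved pending =
    let ready, blocked =
      List.partition
        (fun (_, deps) -> List.for_all (fun d -> List.mem d resolved) deps)
        pending
    in
    match ready with
    | [] ->
      (* No progress: the remaining units are stuck on a cycle. *)
      (List.rev resolved, List.map fst blocked)
    | _ ->
      let resolved =
        List.fold_left (fun acc (name, _) -> name :: acc) resolved ready
      in
      loop resolved blocked
  in
  loop [] units
```

The second component of the result is what could be reported as a cycle, while the first shows that the rest of the computation can still go on.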
Codept in the real world How well does codept fare against its specification?
Alias tracking
(* Atlas.ml *)
module X = A
module Y = B
module Z = C
module W = D
module R = E
module S = F
[Figure: graph over the modules A, B, C, D, E, F and Atlas]
And submodule collisions?
(* a.ml *)
open B
open A
(* b.ml *)
module A = struct end
[Figure: graph with nodes A and B]
Don’t forget first class modules
(* a.ml *)
...
let () =
  let module M = (val f ()) in
  let open M in
  let open A in
  ()
[Figure: graph node A]
[Warning]: a.ml:l7.6-12, first-class module M was opened while its signature was unknown.
Local solution:
(* a.ml *)
...
let module M : S = (val f ()) in
...
Abstract module types? Work-in-progress.
Performances Slower than ocamldep Not the right question:
Library core, ocamldep compatible executable
◮ Codept executable, fully compatible with ocamldep
◮ Core library, to be published with version 1.0
◮ Too many options, a lighter executable planned
Machine readable output ◮ JSON and sexp format available ◮ for signature and dependencies ◮ for the M2l AST
{ "version" : [0, 10, 3],
  "dependencies" :
  [{ "file" : "a.ml", "deps" : [["C"], ["B"], ["Atlas"]] },
   { "file" : "atlas.ml" },
   { "file" : "b.ml", "deps" : [["C"
   { "file" : "c.ml", "deps" : [["Atlas"]] },
   { "file" : "d.ml", "deps" : [["Atlas"]] },
   { "file" : "e.ml", "deps" : [["Atlas"]] },
   { "file" : "f.ml", "deps" : [["Atlas"]] }],
  "local" :
  [{ "module" : ["A"], "ml" : "a.ml" },
   { "module" : ["Atlas"], "ml" : "atlas.ml" },
   { "module" : ["B"], "ml" : "b.ml" },
   { "module" : ["C"], "ml"
   { "module" : ["D"], "ml" : "d.ml" },
   { "module" : ["E"], "ml"
   { "module" : ["F"], "ml" : "f.ml" }]
}
Support incremental compilation
Perspectives
◮ Library publication
◮ Lightweight executable
◮ Multi-zipper
◮ Dune integration
Dune integration
◮ Not that straightforward: a full new layer of dependency computation
◮ But more opportunities for caching
Past features from the future
◮ Full support for decoupling module names from filenames
◮ Full support for nested namespaces with -nested
Thanks!
Nearly an over-approximation
(* a.ml *)
module Sub = struct
  module SubSub = struct end
end
open B
open Sub
open SubSub
(* b.ml *)
module Sub = struct end