A New Implementation of Formats based on GADTs
Benoît Vaugon (ENSTA-ParisTech)
OCaml 2013, September 24, 2013

Introduction

Formats in OCaml
◮ Used for printing and scanning.
◮ Stdlib modules: Printf, Scanf and Format.
◮ Advantage: separate structure from data.

Basic examples
◮ Printf.printf "%d/%d/%d" m d y
◮ Scanf.scanf "%d/%d/%d" (fun m d y -> (m, d, y))

Advanced examples
◮ Printf.sprintf "%#-0*.3X" 6 42 (→ "0x02A ")
◮ Printf.printf "today=%a%!" print_date (m, d, y)
◮ Printf.printf "version=%(%d%d%s%)" "%d.%d(%S)" 4 0 "alpha"
◮ Format.printf "@[<hov 2>%d@,%d@]" 42 43
◮ Scanf.sscanf "OCaml|2013" "%s@|%[0-9]%!" callback
◮ Scanf.sscanf "today=09/24/2013" "today=%r" scan_date callback
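
The basic examples above can be tried as a small self-contained sketch. The names date_string, parse_date and print_date are illustrative and do not come from the talk; sprintf and sscanf are used instead of printf and scanf so that nothing depends on the standard channels.

```ocaml
(* Printing: the format fixes the structure, the arguments supply the data. *)
let date_string m d y = Printf.sprintf "%d/%d/%d" m d y

(* Scanning: the same structure drives the parsing; the callback receives
   the three scanned integers. *)
let parse_date s = Scanf.sscanf s "%d/%d/%d" (fun m d y -> (m, d, y))

(* %a takes a user-defined printer followed by the value to print.
   With sprintf, the printer receives unit and returns a string. *)
let print_date () (m, d, y) = Printf.sprintf "%02d/%02d/%d" m d y
let today = Printf.sprintf "today=%a" print_date (9, 24, 2013)
(* today = "today=09/24/2013" *)
```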

Summary
1. Format Types
2. The Current Implementation
3. The New Implementation
4. Issues
5. Performances
6. Conclusion

Format Types

The OCaml type-checker:
  match expression, expected_type with
  | String_literal s, ty when equiv ty format6_ty -> [...]
  | [...]

Inferred type:
  type ('a, 'b, 'c, 'd, 'e, 'f) format6

  'a: the type of the parameters of the format
  'b: the type of the first argument given to [%a] and [%t] printing functions
  'c: the type of the result of the [%a] and [%t] functions
  'd: the result type for the scanf-style functions
  'e: the type of the receiver function for the scanf-style functions
  'f: the result type for the printf-style functions
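
A minimal sketch of the rule above: the same literal is typed as a string or as a format depending on the expected type. The binding names are illustrative; format and string_of_format are the standard library's.

```ocaml
(* Expected type string: the literal stays an ordinary string. *)
let as_string : string = "%d items"

(* Expected type format: the type-checker infers the format parameters.
   ('a, 'b, 'c) format is the three-parameter abbreviation of format6. *)
let as_format : (int -> string, unit, string) format = "%d items"

let n = String.length as_string          (* 8 *)
let s = Printf.sprintf as_format 3       (* "3 items" *)

(* A format can always be converted back to the string it was written as. *)
let back = string_of_format as_format    (* "%d items" *)
```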

Format Types (Examples)

Standard library functions:
  Printf.printf :
    ('a, out_channel, unit, unit, unit, unit) format6 -> 'a
  Scanf.scanf :
    ('a, in_channel, 'c, 'd, 'a -> 'f, 'f) format6 -> 'd

Inferred types of formats:
  format_of_string "%d" :
    (int -> 'f, 'b, 'c, 'e, 'e, 'f) format6
  format_of_string "%a" :
    (('b -> 'x -> 'c) -> 'x -> 'f, 'b, 'c, 'e, 'e, 'f) format6
  format_of_string "%r" :
    ('a -> 'f, 'b, 'c, ('b -> 'a) -> 'e, 'e, 'f) format6
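
To see how the six parameters get instantiated, the sketch below annotates two formats with the types they take when used with Printf.sprintf (where 'b is unit and the result types are string). The names int_fmt, pair_fmt and show_pair are illustrative only.

```ocaml
(* "%d": one int parameter, then the printf-style result (string here). *)
let int_fmt :
  (int -> string, unit, string, string, string, string) format6 = "%d"

(* "%a": a printer of type 'b -> 'x -> 'c followed by its argument of type 'x.
   Here 'b = unit, 'x = int * int and 'c = string. *)
let pair_fmt :
  ((unit -> int * int -> string) -> int * int -> string,
   unit, string, string, string, string) format6 = "%a"

let show_pair () (x, y) = Printf.sprintf "(%d, %d)" x y

let s1 = Printf.sprintf int_fmt 42                  (* "42" *)
let s2 = Printf.sprintf pair_fmt show_pair (1, 2)   (* "(1, 2)" *)
```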

The Current Implementation

Type-checking:
◮ Parsing of the literal string
◮ Manual inference of the format6 type parameters

Memory representation:
◮ At runtime, formats are represented by strings

Printing function steps:
1. Parse the format and count parameters
2. Accumulate parameters
3. Extract and patch sub-formats
4. Call the C sprintf function on each sub-format

Scanning function steps:
1. Count the number of "%r" in the format
2. Accumulate the readers and the callback function
3. Scan the channel and accumulate parameters
4. Call the callback function all at once
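
Steps 1 and 3 of the printing function can be sketched as follows. This is a hypothetical illustration of what a string-based printer has to do at run time, not the standard library's code: the chunk type and the functions split_format and count_params are invented for the example, and flags such as * (which consume an extra parameter) are ignored.

```ocaml
(* A format string is cut into literal text and conversion sub-formats. *)
type chunk = Lit of string | Conv of string

(* Simplifying assumption: a conversion ends at its first letter. *)
let is_conv_end c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

let split_format fmt =
  let len = String.length fmt in
  (* Emit the pending literal text, if any (one substring allocation). *)
  let flush start stop acc =
    if stop > start then Lit (String.sub fmt start (stop - start)) :: acc
    else acc in
  let rec literal start i acc =
    if i >= len then List.rev (flush start i acc)
    else if fmt.[i] = '%' then conversion i (i + 1) (flush start i acc)
    else literal start (i + 1) acc
  and conversion start j acc =
    if j >= len then invalid_arg "split_format: incomplete conversion"
    else if is_conv_end fmt.[j] || fmt.[j] = '%' then
      let sub = String.sub fmt start (j - start + 1) in
      literal (j + 1) (j + 1) (Conv sub :: acc)
    else conversion start (j + 1) acc
  in
  literal 0 0 []

(* Every conversion except the escaped percent sign takes one parameter. *)
let count_params chunks =
  List.length
    (List.filter (function Conv c -> c <> "%%" | Lit _ -> false) chunks)
```

Each call re-scans the string and allocates one substring per chunk, which is the cost the following slides point at.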

The Current Implementation: Problems

Safety
◮ Multiple format parsers (⇒ risk of incompatibilities)
  ex: Printf.printf "%1.1s" "hello"
  → Invalid_argument "Printf: bad conversion %s..."
◮ Weakness of the type-checker:
  ex: Printf.sprintf "%2.+f" 3.14
  → "%2.+0f"
◮ Use of Obj.magic in printing and scanning functions
  ex: Format.printf "@%d%s" 42 "hello"
  → Segmentation fault

Speed
◮ Parsing of the format at runtime
◮ Re-parsing by the (slow) C printing functions
◮ Lots of memory allocations

Memory allocations
◮ Sub-format extractions (substrings)
◮ Lots of partial calls ⇒ closure allocations
◮ Ex: Printf.printf "Hello world\n" → allocates 738 bytes
      Printf.printf "%s|%d\n" "OCaml" 2013 → allocates 1512 bytes
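
One possible way to obtain allocation figures like the ones above is to compare Gc.allocated_bytes before and after a call. This is only a sketch: the talk does not say how its numbers were measured, the helper allocated_by is hypothetical, Printf.bprintf into a buffer stands in for Printf.printf to keep the output off stdout, and the values obtained depend on the OCaml version and include a small measurement overhead.

```ocaml
(* Bytes allocated while running f, as reported by the GC counters. *)
let allocated_by (f : unit -> unit) : float =
  let before = Gc.allocated_bytes () in
  f ();
  let after = Gc.allocated_bytes () in
  after -. before

let buf = Buffer.create 64

(* A format with no conversion, then one with two parameters. *)
let bytes_literal =
  allocated_by (fun () -> Printf.bprintf buf "Hello world\n")
let bytes_args =
  allocated_by (fun () -> Printf.bprintf buf "%s|%d\n" "OCaml" 2013)

let () =
  Printf.eprintf "literal: %.0f bytes, with arguments: %.0f bytes\n"
    bytes_literal bytes_args
```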

The New Implementation

The idea:
◮ Implement the format6 type by a GADT
  ⇒ The format6 type is now concrete (not predefined)

Examples
◮ "Hello" → String_literal ("Hello", End_of_format)
◮ "n = %02d\n%!" →
    String_literal ("n = ",
      Int (Conv_d, Lit_pad (Zero_pad, 2), No_prec,
        Char_literal ('\n', Flush End_of_format)))

Remark:
◮ Formats are statically allocated (not dynamically allocated multiple times)
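
The idea can be illustrated with a deliberately reduced GADT. This is a sketch under strong simplifying assumptions, not the type from the talk: it keeps two type parameters instead of six, drops padding, precision and flush, and the names fmt, kprint and sprintf_gadt are invented for the example. Only the constructor names String_literal, Char_literal, Int and End_of_format echo the slide.

```ocaml
(* 'a is the type of the printer once the format is supplied,
   'r is the final result type. *)
type ('a, 'r) fmt =
  | End_of_format : ('r, 'r) fmt
  | String_literal : string * ('a, 'r) fmt -> ('a, 'r) fmt
  | Char_literal : char * ('a, 'r) fmt -> ('a, 'r) fmt
  | Int : ('a, 'r) fmt -> (int -> 'a, 'r) fmt
  | String : ('a, 'r) fmt -> (string -> 'a, 'r) fmt

(* The printer walks the tree: no string parsing, no Obj.magic.
   Each conversion constructor adds one parameter to the result type. *)
let rec kprint : type a r. (string -> r) -> string -> (a, r) fmt -> a =
  fun k acc fmt ->
    match fmt with
    | End_of_format -> k acc
    | String_literal (s, rest) -> kprint k (acc ^ s) rest
    | Char_literal (c, rest) -> kprint k (acc ^ String.make 1 c) rest
    | Int rest -> fun n -> kprint k (acc ^ string_of_int n) rest
    | String rest -> fun s -> kprint k (acc ^ s) rest

let sprintf_gadt fmt = kprint (fun s -> s) "" fmt

(* Hand-written tree for "%s|%d\n"; a compiler could build it statically. *)
let example =
  String (Char_literal ('|', Int (Char_literal ('\n', End_of_format))))

let result = sprintf_gadt example "OCaml" 2013
(* result = "OCaml|2013\n" *)
```

Because the type of each constructor records the parameter it consumes, an ill-typed call such as sprintf_gadt example 2013 "OCaml" is rejected at compile time.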