Blaise Source Code Editing System
Presenter: Danilo Gutierrez
Co-author: Sheila Deskins
Health and Retirement Study (HRS)
The 11th International Blaise Conference
Annapolis, Maryland, September 2007

Presentation Overview
• How Big is Big?
• What Does a Source Editor Do?
• The System and Updating a New Language
• Current Use & Future Plans
• Questions
Survey Research Center • Institute for Social Research • University of Michigan

How Big is Big?

HRS CAI Size
• Datamodel Source Code (.bla, .inc)
– 175,624 Program lines
– 61 Include files
– 518 Procedures
– 344 Blocks
– 10 Tables

HRS CAI Size
• Fields
– 5,818 Fields
– 1,773 Auxfields
– 1,691 Locals
– 5,754 Parameters

HRS CAI Size
• Type Definitions
– 8,962 USER-DEFINED
– 1,390 ENUMERATED
– 2,366 STRING
– 481 RANGE
– 366 OPEN
– 309 INTEGER
– 126 ARRAY
– 141 SET
– 11 REAL

What does a “Source Editor” do?

System Core
• Parses source code files (.bla and .inc)
• Merges in update information
• Writes updated source code files

Parsing - example
Statement:
Q1 (One) "Are you ready to answer questions?" : (y,n)
Parses into tokens of:
Q1   (One)   "Are you ready to answer questions?"   :   (   y   ,   n   )

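The token split on this slide can be approximated with a small regular-expression tokenizer. This is only a sketch for the example statement, not the HRS parser: the pattern `TOKEN_RE` and the function `tokenize` are hypothetical names, and treating any parenthesized single word as a tag is a simplification that would misread a one-item type such as `(y)`.

```python
import re

# Hypothetical pattern covering only the parts of the example statement.
# Alternatives are tried left to right:
#   "[^"]*"    quoted text (question text or descriptor)
#   \(\w+\)    a parenthesized single word, treated here as a tag like (One)
#   \w+        an identifier such as Q1, y, n
#   [:(),/]    single-character punctuation
TOKEN_RE = re.compile(r'"[^"]*"|\(\w+\)|\w+|[:(),/]')


def tokenize(statement):
    """Return the tokens of one field statement, dropping whitespace."""
    return TOKEN_RE.findall(statement)


tokens = tokenize('Q1 (One) "Are you ready to answer questions?" : (y,n)')
print(tokens)
```

Because `(y,n)` contains a comma, the tag alternative fails there and the parentheses fall through to the punctuation rule, which reproduces the split shown on the slide.
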
Definition of Token
A token is part of a program statement consisting of characters identified as meaningful syntax.

Merging
• There are two inputs in merging
– The user update request information
– The parsed tokens from original source code

User Request
User update request information:
fieldname: Q1
token type: descriptor
language: default
edit instruction: add (or delete)
update: "This is the new label"

Merge results - example
Source code in tokens:
Q1   (One)   "Are you ready to answer questions?"   "This is the new label"   :   (   y   ,   n   )
Problem: the descriptor looks like question text!

How to fix the problem?
• Blaise Data Object (BDO) contains all possible parts of a Blaise syntax statement.
Blaise Syntax for Fields
Q [ Q1, [ ... ] ] [ ( Tag ) ] [ [ Lid ] "Text" ] [ ... ] [ / [ Lid ] "Description" ] [ ... ] : T

Using the BDO
Syntax                          BDO with update
Q                               Q1
[ Q1, [ ... ] ]                 <blank>
[ ( Tag ) ]                     (One)
[ [ Lid ] "Text" ]              [ [ Lid ] "Are you ready to answer questions?" ]
[ ... ]                         [ ... ]
[ / [ Lid ] "Description" ]     [ / [ Lid ] "This is the new label" ]
[ ... ] : T                     [ ... ] : T

Edited Source Code
Q1 (One) "Are you ready to answer questions?" / "This is the new label" : (y, n)

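The idea of holding each syntax part in its own slot, so the writer knows to emit the "/" before a descriptor, can be sketched as follows. This is an illustration under assumptions, not the BDO itself: `FieldBDO`, `apply_request`, and their parameters are hypothetical names, language identifiers (Lid) and multi-name fields are left out, and the type is kept as an opaque string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldBDO:
    """Hypothetical stand-in for a field-level BDO: one slot per syntax part."""
    name: str
    type_def: str
    tag: Optional[str] = None
    text: Optional[str] = None
    description: Optional[str] = None

    def render(self) -> str:
        """Write the statement back out; empty slots produce no output."""
        parts = [self.name]
        if self.tag is not None:
            parts.append(f"({self.tag})")
        if self.text is not None:
            parts.append(f'"{self.text}"')
        if self.description is not None:
            # The slot, not the token order, decides that a "/" is required.
            parts.append(f'/ "{self.description}"')
        parts.append(f": {self.type_def}")
        return " ".join(parts)


def apply_request(bdo, token_type, instruction, update=None):
    """Apply one user request; only descriptor add/delete is sketched here."""
    if token_type != "descriptor":
        raise ValueError(f"unsupported token type: {token_type}")
    if instruction == "add":
        bdo.description = update
    elif instruction == "delete":
        bdo.description = None
    else:
        raise ValueError(f"unsupported edit instruction: {instruction}")
    return bdo


q1 = FieldBDO(name="Q1", tag="One",
              text="Are you ready to answer questions?", type_def="(y, n)")
apply_request(q1, "descriptor", "add", "This is the new label")
print(q1.render())
```

With the descriptor stored in its own slot, the merge step cannot confuse it with question text, which is the problem the flat token list had.
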
Writing
• Writing is simpler when the database is already organized by the parsing and merging processes
• Need to write out whitespace, comments, file names, etc.
• Write Spanish language diacriticals
• Write the same number of files as were parsed

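One way to meet the whitespace and comment requirements is to keep every character of the source as some piece during parsing, so that writing is just concatenation. The sketch below assumes comments are delimited by curly braces and do not nest; `LOSSLESS_RE`, `split_lossless`, and `write_source` are hypothetical names, and the Spanish sample text is invented for the example.

```python
import re

# Every character lands in exactly one piece:
#   \{[^}]*\}  a comment in curly braces (assumed not to nest)
#   "[^"]*"    quoted text, diacriticals included
#   \s+        a run of whitespace, kept verbatim
#   \w+        an identifier or number
#   .          any other single character
LOSSLESS_RE = re.compile(r'\{[^}]*\}|"[^"]*"|\s+|\w+|.', re.DOTALL)


def split_lossless(source):
    """Split source text into pieces without discarding anything."""
    return LOSSLESS_RE.findall(source)


def write_source(pieces):
    """Writing is concatenation once nothing was lost while parsing."""
    return "".join(pieces)


source = 'Q1 (One)  {ready check}\n  "¿Está listo?" : (y, n)\n'
pieces = split_lossless(source)
assert write_source(pieces) == source  # round trip is exact
```

An edit then replaces or inserts individual pieces, and everything the user did not ask to change is written back byte for byte.
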
System Considerations

System Considerations
Editing .BMI (BCP) requires
Parse Blaise source files (.bla, .inc) into tokens    Y    Y
Keep comments                                         Y    N
Keep include file structure                           Y    N
Keep whitespace                                       Y    N

Why would we want a Source Editor System?
• HRS is a longitudinal study.
• It’s a ‘big’ application.
• Most of the large-scale (bulk) changes have to do with fields. ‘Small’ changes usually involve a few hundred fields (10% = 581 changes).
• There are often several ‘small’ changes that take place during the CAI preparation for a field period.

What We Learned
• For the 2004 Descriptor update task
– The few descriptors mentioned turned out to be a change request for 2,700 descriptors
– The merge key that was provided with the descriptor update request information was the DEP field name

Translator Functions
• Convert DEP field names to defined block and field name
• Report duplicate requests for same defined block and field name

Translator
• Need to ‘translate’ DEP fieldname paths to defined block name.
Block Name (Def)     #     ind    DEP Path
BB                   41    1
BB_Born              42    1      SecB.Born
BB_ShowStateList     43    1      SecB.Born.B076_B003_
BB_ShowStateList     43    2      SecB.LivedArea.B078_B047_

Translator
• Several DEP field update requests can map to one defined field
Block Name    Field Name    Existing         User Descriptor Request       DEP
(Defined)     (Defined)     Descriptor
BB_Marry      B066_         MARR YEAR BEG    FIRST MARRIAGE YEAR BEGAN     SecB.Marry[1].B066_
BB_Marry      B066_         MARR YEAR BEG    SECOND MARRIAGE YEAR BEGAN    SecB.Marry[2].B066_
BB_Marry      B066_         MARR YEAR BEG    THIRD MARRIAGE YEAR BEGAN     SecB.Marry[3].B066_
BB_Marry      B066_         MARR YEAR BEG    MARRIAGE YEAR BEGAN -4        SecB.Marry[4].B066_

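The two translator functions, converting a DEP field name to its defined block and field name and reporting duplicate requests, might look like the sketch below. It uses only the BB_Marry rows above; the lookup table `DEP_BLOCK_TO_DEFINED` and the functions `translate` and `find_duplicates` are hypothetical, and a real system would presumably derive the lookup from the datamodel rather than hard-code it.

```python
import re
from collections import defaultdict

# Assumed lookup from DEP block path (array index removed) to defined block
# name, filled in from the example rows only.
DEP_BLOCK_TO_DEFINED = {"SecB.Marry": "BB_Marry"}

INDEX_RE = re.compile(r"\[\d+\]")


def translate(dep_field_name):
    """Map a DEP field name to (defined block name, field name)."""
    path = INDEX_RE.sub("", dep_field_name)      # SecB.Marry[1].B066_ -> SecB.Marry.B066_
    block_path, _, field = path.rpartition(".")  # split off the field name
    return DEP_BLOCK_TO_DEFINED[block_path], field


def find_duplicates(requests):
    """Group requests by defined field and keep those hit more than once."""
    grouped = defaultdict(list)
    for dep_name, descriptor in requests:
        grouped[translate(dep_name)].append((dep_name, descriptor))
    return {key: hits for key, hits in grouped.items() if len(hits) > 1}


requests = [
    ("SecB.Marry[1].B066_", "FIRST MARRIAGE YEAR BEGAN"),
    ("SecB.Marry[2].B066_", "SECOND MARRIAGE YEAR BEGAN"),
    ("SecB.Marry[3].B066_", "THIRD MARRIAGE YEAR BEGAN"),
    ("SecB.Marry[4].B066_", "MARRIAGE YEAR BEGAN -4"),
]
duplicates = find_duplicates(requests)
for (block, field), hits in duplicates.items():
    print(block, field, "requested", len(hits), "times")
```

All four requests resolve to the same defined field, so the source code has room for only one of the four descriptors; the report lets the user decide which one to keep.
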
The System and Updating a New Language

New Language Update
• The early system that handled updating descriptors needed to be expanded to handle the ‘update’ addition of a new language.

System Conceptualization
• Parser Application
• DEP Field Name Translator
• BDO Creation
• Merger Application
• Writer Application
• User Interface

System Design Considerations
• Encapsulated routines and procedures for each function
• Reusable code versus ad hoc routines
• Tokens described in more meaningful terms
• Language order option
• Parsing whitespace