CS 6340 Software Analysis and Testing
Mary Jean Harrold
Aristotle Research Group
SPARC/CERCS
College of Computing
Georgia Tech

Class 1
• Introductions; Student Information
• Details (syllabus, etc.)
  • Shown on T-Square (https://t-square.gatech.edu)
• Basic Analyses (1): intermediate representations, control-flow analysis
• Assign
  • Basic Analyses (1): Be familiar with concepts
  • Representation and Analysis of Software (Sections 1-5) (Schedule has link)
  • Problem Set 1 (Schedule has link): due 8/25/09
Course Overview, Syllabus
• Motivation for studying program analysis and testing
• Course objectives
  • Learn traditional, promising analyses
  • Learn traditional, new applications
  • Explore research areas in analysis, use of artifacts
  • Apply analyses and applications through homework, semester project
• Means for approaching course objectives
  • Class lectures, readings, homework, class presentations
  • Semester project (proposal, oral, written report)
  • Exams
• Your responsibilities
  • Arrive on time, attend all classes
  • Prepare (read papers before class), participate in class
  • Submit homework, projects, etc. at the beginning of class on the due date
• Course evaluation
  • Homework: 30%
  • Semester project (proposal, written, oral): 30%
  • Exams: 30%
  • Class participation: 10%
• Prerequisites
  • CS 4240, graduate-level standing, permission of instructor
Overview of Course
• Static analyses (computed without execution)
  • Intraprocedural (within a single procedure)
    • AST, control-flow, control-dependence, data-flow, etc.
  • Complicating factors
    • Interprocedural (across procedure boundaries), recursion
    • Pointers, references, polymorphism, dynamic binding, etc.
  • Slicing, analysis by reachability, demand analysis
  • Applications
• Dynamic analyses (computed by execution)
  • Instrumentation, profiling
  • Dynamic versions of control-flow, etc.
  • Applications such as testing, debugging
• Combinations of static and dynamic analyses
Intermediate Representations

Intermediate Representations (traditional)
Source program (stream of char.) -> Lexical analyzer -> Tokens -> Parser -> Intermediate representation -> Code generation, optimization -> Target code
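The first stage of the pipeline above, turning a stream of characters into tokens, can be sketched as follows. This is a minimal illustration, not the course's implementation: the token classes in `TOKEN_SPEC` and the function name `tokenize` are assumptions made for a tiny assignment language.

```python
import re

# Hypothetical token classes for a tiny assignment language (not from the slides).
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"\d+"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/]"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("a := b + c;"))
```

The parser would then consume this token list rather than the raw characters.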
Intermediate Representations (traditional)
• Intermediate representation
  • Syntax tree, other lower-level intermediate language
  • Little information on what the program does
• Further analysis, e.g.,
  • Control-flow analysis: flow of control within procedures
  • Data-flow analysis: global information on data manipulation
• Use for optimization and software engineering tasks

Where does Java Bytecode fit in this process?
Abstract Syntax Tree (AST)
• Concrete versus abstract syntax
  • Concrete shows structure and is language-specific
  • Abstract shows structure
• Representations
  • Parse tree represents concrete syntax
  • Abstract syntax tree represents abstract syntax

Example: Grammar
Examples
1. a := b + c
2. a = b + c;

Grammar for 1
stmtlist -> stmt | stmt stmtlist
stmt -> assign | if-then | …
assign -> ident ":=" ident binop ident
binop -> "+" | "-" | …

Grammar for 2
stmtlist -> stmt ";" | stmt ";" stmtlist
stmt -> assign | if-then | …
assign -> ident "=" ident binop ident
binop -> "+" | "-" | …
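As a rough sketch of how Grammar 2 relates concrete and abstract syntax, the recognizer below checks the concrete form (including the ";" and "=") and returns only the abstract structure as nested tuples. It handles only the assign alternative of stmt; the names `parse_stmtlist`, `parse_assign`, and `BINOPS`, and the tuple encoding of the tree, are assumptions for illustration.

```python
# Maps concrete operator symbols to abstract node names (hypothetical naming).
BINOPS = {"+": "add", "-": "sub"}


def parse_assign(tokens, pos):
    """assign -> ident "=" ident binop ident"""
    if len(tokens) - pos < 5:
        raise SyntaxError("incomplete assignment")
    target, eq, left, op, right = tokens[pos:pos + 5]
    if not all(name.isidentifier() for name in (target, left, right)):
        raise SyntaxError("expected identifier")
    if eq != "=":
        raise SyntaxError("expected '='")
    if op not in BINOPS:
        raise SyntaxError(f"unknown operator {op!r}")
    # The AST keeps the structure and drops punctuation such as "=".
    return ("assign", target, (BINOPS[op], left, right)), pos + 5


def parse_stmtlist(tokens):
    """stmtlist -> stmt ";" | stmt ";" stmtlist (assign statements only)."""
    asts = []
    pos = 0
    while pos < len(tokens):
        ast, pos = parse_assign(tokens, pos)
        if pos >= len(tokens) or tokens[pos] != ";":
            raise SyntaxError("expected ';'")
        pos += 1
        asts.append(ast)
    if not asts:
        raise SyntaxError("expected at least one statement")
    return asts


# Tokens are given pre-split by whitespace to keep the sketch short.
print(parse_stmtlist("a = b + c ;".split()))
```

Switching to Grammar 1 would change only the concrete details (":=" and no ";"); the returned tuples would stay the same.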
Example: Parse tree and AST
• Example: a := b + c;
• Grammar
  stmtlist -> stmt ";" | stmt ";" stmtlist
  stmt -> assign | if-then | …
  assign -> ident ":=" ident binop ident
  binop -> "+" | "-" | …

Parse tree
stmtlist
  stmt
    assign
      ident (a)
      ":="
      ident (b)
      binop ("+")
      ident (c)
  ";"

AST
assign
  a
  add
    b
    c

Three Address Code
• General form: x := y op z
• May include temporary variables (intermediate values)
• Types (examples; rest in handout)
  • Assignment
    • Binary: x := y op z
    • Unary: x := op y
    • Copy: x := y
  • Jumps
    • Unconditional: goto L
    • Conditional: if x relop y goto L
  • …
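The role of temporary variables in the assignment forms above can be illustrated with a small lowering from an AST to three-address code. This is a sketch under assumptions: the tuple encoding of the AST, the function name `lower`, and the temporary names `t1`, `t2`, … are hypothetical and not taken from the handout.

```python
import itertools


def lower(ast):
    """Lower ("assign", target, expr) to a list of three-address instructions."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))  # fresh temporary names

    def emit_expr(node):
        # Leaves are variable names; interior nodes are (op, left, right) or (op, operand).
        if isinstance(node, str):
            return node
        op, *operands = node
        names = [emit_expr(child) for child in operands]
        temp = next(temps)
        if len(names) == 2:
            code.append(f"{temp} := {names[0]} {op} {names[1]}")  # binary form
        else:
            code.append(f"{temp} := {op} {names[0]}")  # unary form
        return temp

    _, target, expr = ast
    code.append(f"{target} := {emit_expr(expr)}")  # copy form
    return code


# a := b + c * d needs one temporary per interior node of the expression.
for line in lower(("assign", "a", ("+", "b", ("*", "c", "d")))):
    print(line)
```

This naive version always ends with a copy; a real front end would typically write the last operation directly into the target.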
Example: Three Address Code
Source code
if a > 10 then
  x = y + z
else
  x = y - z
…

Corresponding 3-address code
1. if a > 10 goto 4
2. x = y - z
3. goto 5
4. x = y + z
5. …

Analysis Levels
• Local: within a single basic block or statement
• Global, Intraprocedural: within a single procedure, function, or method (sometimes, intramethod)
• Interprocedural: across procedure boundaries, procedure call, shared globals, etc.
• Intraclass: within a single class
• Interclass: across class boundaries
• Intramodule: within a single module
• …
Control-flow Analysis

Computing Control Flow
Procedure AVG
S1   count = 0
S2   fread(fptr, n)
S3   while (not EOF) do
S4     if (n < 0)
S5       return (error)
       else
S6       nums[count] = n
S7       count ++
       endif
S8     fread(fptr, n)
     endwhile
S9   avg = mean(nums,count)
S10  return(avg)
Computing Control Flow
Control flow is a relation (i.e., set of ordered pairs)
• that represents the possible flow of execution in a program
• (a, b) in the relation means that control can flow from node a to node b during execution.

What is the control-flow relation for Procedure AVG?
Computing Control Flow
Control-flow relation for Procedure AVG:
{(entry,S1), (S1,S2), (S2,S3), (S3,S4), (S3,S9), (S4,S5), (S5,exit), (S4,S6), (S6,S7), (S7,S8), (S8,S3), (S9,S10), (S10,exit)}

Computing Control Flow
Control-flow Graph (CFG) is a way to represent the control-flow relation:
• nodes represent elements in pairs (A,B)
• edges represent the relation between A and B
• labels represent the conditions that cause that branch to be executed
• entry and exit nodes added
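The step from the relation to a graph can be sketched directly: the set of ordered pairs for Procedure AVG becomes successor and predecessor maps. The names `CONTROL_FLOW` and `build_graph` are assumptions for illustration; the pairs themselves are the ones listed above.

```python
from collections import defaultdict

# The control-flow relation for Procedure AVG, as a set of ordered pairs.
CONTROL_FLOW = {
    ("entry", "S1"), ("S1", "S2"), ("S2", "S3"),
    ("S3", "S4"), ("S3", "S9"), ("S4", "S5"),
    ("S5", "exit"), ("S4", "S6"), ("S6", "S7"),
    ("S7", "S8"), ("S8", "S3"), ("S9", "S10"),
    ("S10", "exit"),
}


def build_graph(relation):
    """Return (successors, predecessors) maps for a set of (a, b) pairs."""
    succ = defaultdict(set)
    pred = defaultdict(set)
    for a, b in relation:
        succ[a].add(b)  # control can flow from a to b
        pred[b].add(a)
    return succ, pred


succ, pred = build_graph(CONTROL_FLOW)
print(sorted(succ["S3"]))    # ['S4', 'S9']: loop body or loop exit
print(sorted(pred["S3"]))    # ['S2', 'S8']: first entry or back edge
print(sorted(pred["exit"]))  # ['S10', 'S5']: the two return statements
```

Nodes with more than one successor, here S3 and S4, are the ones whose outgoing edges carry condition labels in the CFG.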
What is the control-flow graph for Procedure AVG?

Computing Control Flow
Control-flow graph for Procedure AVG (T/F labels mark branch outcomes):
entry -> S1
S1 -> S2
S2 -> S3
S3 -T-> S4
S3 -F-> S9
S4 -T-> S5
S4 -F-> S6
S5 -> exit
S6 -> S7
S7 -> S8
S8 -> S3
S9 -> S10
S10 -> exit
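One possible encoding of this labeled CFG, with two simple checks on it, is sketched below. The dictionary layout and the helper names `reachable` and `branch_nodes` are assumptions for illustration; unlabeled edges carry `None`.

```python
# Labeled CFG for Procedure AVG: node -> list of (successor, label) edges.
CFG = {
    "entry": [("S1", None)],
    "S1": [("S2", None)],
    "S2": [("S3", None)],
    "S3": [("S4", "T"), ("S9", "F")],
    "S4": [("S5", "T"), ("S6", "F")],
    "S5": [("exit", None)],
    "S6": [("S7", None)],
    "S7": [("S8", None)],
    "S8": [("S3", None)],
    "S9": [("S10", None)],
    "S10": [("exit", None)],
    "exit": [],
}


def reachable(cfg, start="entry"):
    """Return the set of nodes reachable from start by following CFG edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited, which also handles the S8 -> S3 loop
        seen.add(node)
        stack.extend(target for target, _ in cfg[node])
    return seen


def branch_nodes(cfg):
    """Return the nodes with more than one outgoing edge."""
    return sorted(node for node, edges in cfg.items() if len(edges) > 1)


print(branch_nodes(CFG))           # ['S3', 'S4']
print(reachable(CFG) == set(CFG))  # True: every node is reachable from entry
```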