Implementing Legacy Implementing Legacy Statistical Algorithms in a Statistical Algorithms in a Spreadsheet Environment Spreadsheet Environment Stephen W. Liddle Liddle John S. Lawson Stephen W. John S. Lawson Information Systems Faculty Department of Statistics Information Systems Faculty Department of Statistics Rollins eBusiness Center Rollins eBusiness Center Brigham Young University Brigham Young University Provo, UT 84602 Provo, UT 84602
Overview Overview � Introduction Introduction � � Fundamentals of VBA in Excel Fundamentals of VBA in Excel � � Retargeting traditional algorithms to a Retargeting traditional algorithms to a � spreadsheet environment spreadsheet environment � Converting FORTRAN to VBA Converting FORTRAN to VBA � � Conclusions Conclusions �
Why Convert FORTRAN Programs to Run Why Convert FORTRAN Programs to Run in a Spreadsheet Environment? in a Spreadsheet Environment? � Useful code available that is not Useful code available that is not � implemented in standard statistical implemented in standard statistical packages packages � FORTRAN compilers not usually available FORTRAN compilers not usually available � on normal Windows workstation on normal Windows workstation � Many textbooks refer to published Many textbooks refer to published � FORTRAN algorithms FORTRAN algorithms
Sources for Published FORTRAN Sources for Published FORTRAN Algorithms Algorithms � STATLIB ( STATLIB (http:// http://lib.stat.cmu.edu lib.stat.cmu.edu/ /) ) � � General Archive General Archive � � Applied Statistics Archive Applied Statistics Archive � � Journal of Quality Technology Archive Journal of Quality Technology Archive � � JASA Software Archive JASA Software Archive � � JCGS Archive JCGS Archive �
Advantages of Running Legacy Advantages of Running Legacy FORTRAN Code in Excel FORTRAN Code in Excel � Comfortable environment for practitioners Comfortable environment for practitioners � � More user friendly input from spreadsheet More user friendly input from spreadsheet � � Output to spreadsheet allows further Output to spreadsheet allows further � graphical and computational analysis of graphical and computational analysis of results with Excel functions results with Excel functions
VDG Inputs: Design X1 X2 X3 X4 X5 X6 Nickname: 1-hybrid 0 0 0 0 0 2.3094 Runs: 30 -1 -1 -1 -1 -1 0.57735 Factors: 6 1 1 -1 -1 -1 0.57735 Model Order(1/2): 2 1 -1 1 -1 -1 0.57735 Design Region(S/C): s -1 1 1 -1 -1 0.57735 Weight by N (Y/N): y 1 -1 -1 1 -1 0.57735 Number of Radii (20-99): 20 -1 1 -1 1 -1 0.57735 Scaling Unit (suggest 1): 1 -1 -1 1 1 -1 0.57735 Design Radius/Region Radius: 1 1 1 1 1 -1 0.57735 1 -1 -1 -1 1 0.57735 -1 1 -1 -1 1 0.57735 Run VDG -1 -1 1 -1 1 0.57735 1 1 1 -1 1 0.57735 -1 -1 -1 1 1 0.57735 1 1 -1 1 1 0.57735 1 -1 1 1 1 0.57735 -1 1 1 1 1 0.57735 2 0 0 0 0 -1.1547 -2 0 0 0 0 -1.1547 0 2 0 0 0 -1.1547
Proposed Methodology Proposed Methodology � Understand original FORTRAN program Understand original FORTRAN program � � Choose suitable I/O methods Choose suitable I/O methods � � Convert original FORTRAN code to VBA Convert original FORTRAN code to VBA � � Test and use resulting Excel code Test and use resulting Excel code �
Visual Basic For Applications Visual Basic For Applications � Built on ANSI BASIC Built on ANSI BASIC � � Language engine of Microsoft Office Language engine of Microsoft Office � � Modern structured programming language Modern structured programming language � � Has vast array of types, functions, Has vast array of types, functions, � programming helps programming helps � Powerful support environment (Office platform) Powerful support environment (Office platform) � � Popular in business contexts Popular in business contexts �
Excel Object Model Excel Object Model � Objects in Excel are Objects in Excel are � Application addressable in VBA addressable in VBA � Each object has: Each object has: Workbooks ( Workbook ) � � Properties Properties � Worksheets ( Worksheet ) � Methods Methods � Range Chart
Input/Output Methods Input/Output Methods � Non Non- -interactive interactive Output Region � � Files, databases Files, databases � Worksheet cells Worksheet cells Input Region � Interactive Interactive Clicking these buttons � runs the ORPS1 and ORPS2 algorithms. � Message boxes Message boxes � Input boxes Input boxes � Custom GUI forms Custom GUI forms
FORTRAN vs. VBA FORTRAN vs. VBA 2 − 2 − − ± − ± b b 4 ac b b 4 ac 2 a 2 a � VBA: VBA: ( (- -b+Sqr b+Sqr (b^ 2 (b^ 2- -4*a*c))/(2*a) 4*a*c))/(2*a) � � FORTRAN: FORTRAN: ( (- -b+SQRT(b b+SQRT(b**2 **2- -4*a*c))/(2*a) 4*a*c))/(2*a) �
More Operators More Operators � .EQ. .EQ. = � .AND. .AND. And And = � � � .NE. .NE. <> � .OR. .OR. Or <> Or � � � .LT. .LT. < � .NOT. .NOT. Not Not < � � � .LE. .LE. <= <= � � .GT. .GT. > � // // & > & � � � .GE. .GE. >= >= �
Data Types Data Types � INTEGER INTEGER Byte, Integer, Long Byte, Integer, Long � � REAL REAL Single Single � � DOUBLE PRECISION DOUBLE PRECISION Double Double � � COMPLEX COMPLEX Non- -primitive in VBA primitive in VBA Non � � LOGICAL LOGICAL Boolean Boolean � � CHARACTER CHARACTER String String � � CHARACTER* CHARACTER*length length String* String*length length � � Other notable VBA types: Other notable VBA types: � � Currency, Decimal, Date, Variant Currency, Decimal, Date, Variant
Worksheet Functions Worksheet Functions ChiDist(x,deg_freedom) � � Returns one-tailed probability of the ?2 distribution. Correl(array1,array2) � � Returns the correlation coefficient of two cell ranges. Fisher(x) � � Returns the Fisher transformation at a given x. Pearson(array1,array2) � � Returns the Pearson product moment correlation coefficient for two sets. Quartile(array,quart) � � Returns the requested quartile of a data set. StDev(array) � � Returns the standard deviation of a data set. ZTest(array,x,sigma) � � Returns the two-tailed P-value of a z-test.
Flow- -Control Statements Control Statements Flow FORTRAN VBA IF ( expr ) stmt If expr Then stmt Logical if IF ( expr 1 ) THEN If expr 1 Then Block if stmt 1 stmt 1 ELSE IF ( expr 2 ) THEN ElseIf expr 2 Then stmt 2 stmt 2 … … ELSE Else stmt n stmt n END IF EndIf
Subtle Differences (“Gotchas Gotchas”) ”) Subtle Differences (“ � Implicit conversion of real to integer values Implicit conversion of real to integer values � � FORTRAN: truncate FORTRAN: truncate � � VBA: round VBA: round � � Solution: use Solution: use VBA’s VBA’s Fix(), which truncates Fix(), which truncates � � Both languages allow implicit typing Both languages allow implicit typing � � This introduces ambiguity This introduces ambiguity � � Solution: supply explicit types everywhere Solution: supply explicit types everywhere �
Eliminating Goto Goto Statements Statements Eliminating � Computer science accepts the axiom that Computer science accepts the axiom that � goto is generally “considered harmful” goto is generally “considered harmful” � We advocate rewriting We advocate rewriting alogrithms alogrithms to use to use � structured programming techniques where structured programming techniques where feasible feasible � Sine qua non Sine qua non is “make it work” is “make it work” � � It’s a good idea for maintainability, It’s a good idea for maintainability, � understandability to move to structured form understandability to move to structured form
Eliminating Goto Goto Statements Statements Eliminating DO 8 J=1,3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 CONTINUE
Eliminating Goto Goto Statements Statements Eliminating For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 Next j
Eliminating Goto Goto Statements Statements Eliminating For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 If j <> 3 Then xk = bestk - step End If Next j
Eliminating Goto Goto Statements Statements Eliminating For j=1 To 3 ... Do Until objfn > bestfn ... Loop If j <> 3 Then xk = bestk - step End If Next j
Our Reasoning Our Reasoning � Digital assets are fragile Digital assets are fragile � � FORTRAN is not universally available FORTRAN is not universally available � � Excel is a ubiquitous, powerful platform Excel is a ubiquitous, powerful platform � � VBA is a full VBA is a full- -featured language capable of featured language capable of � handling sophisticated statistical handling sophisticated statistical computations computations
Recommend
More recommend