scala for business automation
play

Scala For Business Automation Solving Real Business Problems with - PowerPoint PPT Presentation

Scala For Business Automation Solving Real Business Problems with Streams, POI and a Dash of BDD Conor Svensson conor10@gmail.com The Spreadsheet The PCs first killer app The London Whale the model operated through a series of Excel


  1. Scala For Business Automation Solving Real Business Problems with Streams, POI and a Dash of BDD Conor Svensson conor10@gmail.com

  2. The Spreadsheet The PC’s first killer app

  3. The London Whale …the model operated through a series of Excel spreadsheets, which had to be completed manually, by a process of copying and pasting data from one spreadsheet to another… Report of JPMorgan Chase & Co. Management Task Force Regarding 2012 CIO Losses

  4. A brief history… • 1979 VisiCalc - The first killer app for PCs on Apple II • 1983 Lotus 1-2-3 • 1985 Excel 1.0 on Apple • 1987 Excel 2.0 on MS-DOS • 1992 Excel 3.0 on Windows • 1994 StarCalc (Star Office) • 2001 Calc (Open Office) • 2006 Google Spreadsheets • 2007 iWork Numbers

  5. Today

  6. Use cases • Traditional • Accounting/inventory • Project management • Simple data analysis • Reporting • Exotic • Business process workflows • Risk management • Transaction capture • Reconciliations

  7. Why? And then… Then… Initially… “I can solve this problem It becomes business Management/audit/ with a spreadsheet” critical regulators arrive… Avoid more • “W hy do we have • expensive build via these decentralised a proper solution business critical (bureaucratic spreadsheets?” change processes don’t help here) “We need a proper • solution” Lack of • development knowledge

  8. Complex spreadsheets • Multiple lookup tables with reference data • Data from multiple data sources • Series of complex calculations • Output reports • Centralised via a file share (if lucky), multiple copies tend to exist due to e-mail

  9. The result

  10. Automation

  11. 1. Apache POI • Poor Obfuscation Implementation • The Java API for Microsoft Documents • Work with Office files without a license (Excel, Word, PowerPoint, Outlook, Visio, Publisher) • Text extraction applications - web spiders, index builders, content management systems • Business automation…

  12. 2. Scala • Expressive and concise • Streams, for comprehensions, and case classes • Many organisations have not yet made the switch to Java 8 • Slots easily into existing Java dominant environments (e.g. financial institutions)

  13. Streams

  14. For comprehension Instead of: We can use:

  15. Case classes Scala provides: • field accessors • constructors • equals • hashCode • toString • copy

  16. 3. BDD in Scala Examples use ScalaTest’s FlatSpec with Matchers

  17. An example Q: Is there a correlation between Bitcoin and commodity returns?

  18. Source data • Bitcoin prices • ASX SPI 200 (Australian commodities index)

  19. Our model • Reorder data • Create named ranges • Merge price series by date • Calculate: 1. Returns of our price series 2. Mean 3. Standard deviation 4. Covariance 5. Correlation

  20. The Code ExcelReader • Use named ranges or sheet/row/col indices Stats • Statistical library Correlation • Incremental extraction of business logic with Streams/for-comprehensions

  21. ExcelReader

  22. Stats

  23. Correlation

  24. End to end validation Tests that all can understand!!!

  25. Clear test output

  26. Building on this • Rapid prototyping/MVP • Spreadsheet used as initial datasource • Micro-spreadsheet-services, etc… (sorry)

  27. Limitations • VBA evaluations are not supported, only formula evaluations are • Some Excel functions are not yet implemented (but you can contribute easily) • It’s not a silver bullet for avoiding spreadsheets, merely a tool to ease the pain of moving off them • Additional overhead of more granular stream processing

  28. In summary • Useful tools for rapid prototyping/MVP without risking data quality • Reduced implementation error • Spreadsheet can be modelled incrementally • BDD test cases verified against original source • Use FP techniques • Very well suited to streams • Promoting immutability of our data structures

  29. Further resources • https://github.com/conor10/spreadsheets - source code and example code supporting this talk • https://www.quandl.com - Quandl for data • https://poi.apache.org/ - Apache POI Project • http://scalatest.org/ - ScalaTest • Power, D. J., "A Brief History of Spreadsheets", DSSResources.COM, World Wide Web, http://dssresources.com/history/sshistory.html, version 3.6, 08/30/2004 • http://blogs.office.com/2012/09/13/introducing-spreadsheet-controls-in-the- new-office/ - Controls for critical spreadsheets • http://fsprojects.github.io/ExcelProvider/ - F# Excel Type Provider

  30. Questions?

Recommend


More recommend