The Modern Analytical Landscape

  1. The Modern Analytical Landscape

  2. Where We Are Today

  3. Data
     ▸ Explosive growth
       ▹ Data
       ▹ 2014 – 566 PB/day
       ▹ 2017 – est. 1.5 EB/day
       ▹ 2020 – est. 2.5 YB/day

  4. Data Types
     ▸ Telemetry – IoT
     ▸ Formats
       ▹ JSON/BSON
       ▹ Audio/Video
       ▹ CDISC
       ▹ F004
       ▹ ASDF
       ▹ DMS-MTX

  5. Usage
     ▸ Analytics recognition
     Finally.
     “I keep saying the sexy job in the next ten years will be statisticians.” Hal Varian – Google Chief Economist, 2009

  8. Used to be Easier
     ▸ Minimal Sources
     ▸ Minimal Data types
     ▸ Smallish Volume
     ▸ Couple of Analytic Packages

  11. Used to be Harder
      ▸ Minimal Sources
      ▸ Minimal Data types
      ▸ Smallish Volume
      ▸ Couple of Analytic Packages
      ▸ Finding data
      ▸ Gathering data
      ▸ Limited data types
      ▸ Limited Analytic Techniques
      ▸ Limited Compute Power

  12. Current “Trends”: Disruptive Technologies

  13. Hadoop
      ▸ “Released” in 2006, building on work done in 2003/4
      ▸ Developed for page rank
      ▸ Distributed File Storage
      ▸ Map-Reduce on top
      ▸ Apache Hadoop 1.0 released in 2012

  14. Hadoop
      ▸ Ideal for particular use cases
      ▸ Not always performant for Analytics
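
The Map-Reduce model named on the Hadoop slides can be illustrated with a single-process word count. This is only a sketch of the programming model, not Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and the "cluster" is simulated by a plain loop over input splits.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # On a real cluster each split would be mapped on the node that stores it;
    # here the "nodes" are just loop iterations (an assumption of this sketch).
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    splits = ["big data big volume", "data everywhere"]
    print(word_count(splits))
```

The shuffle step between map and reduce is one plausible reason such jobs are "not always performant" for iterative analytics, since every pass regroups the intermediate data.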

  15. Data Science
      ▸ Gen 1 Data Scientist
        ▹ Programmer first
      ▸ Gen 2 Data Scientist
        ▹ Taught at University
        ▹ Has stats knowledge
      ▸ Gen 3 Data Scientist
        ▹ Workplace experience

  16. Data Science
      ▸ Becoming the new “I.T. department”-style roadblock
      ▸ Rapidly becoming a four-letter word

  17. Machine Learning
      ▸ Trendy term
      ▸ Highly sought-after skills

  18. Congratulations, you can now put machine learning on your resumes.

  19. Deep Learning
      ▸ Simply a neural network with >1 hidden layer
      ▸ Frameworks
        ▹ TensorFlow
        ▹ Theano
        ▹ Keras
        ▹ Caffe
        ▹ DSSTNE

  20. Deep Learning
      ▸ May produce superior results
      ▸ Only way (currently) for some problems
      ▸ Programming skills
      ▸ Overcomplicating things
      ▸ Loss of explainability
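
To make "a neural network with >1 hidden layer" concrete, here is a minimal forward pass written with only the Python standard library. It is an illustrative sketch, not how any of the listed frameworks implement it: the helper names, the random weight initialisation, and the choice of ReLU and sigmoid activations are all assumptions, and no training is performed.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def build_network(layer_sizes, seed=0):
    """Create (weights, biases) per layer; sizes [3, 4, 4, 1] give two hidden layers."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def forward(network, inputs):
    """Pass the inputs through every layer: ReLU on hidden layers, sigmoid on output."""
    activations = inputs
    for index, (weights, biases) in enumerate(network):
        is_output = index == len(network) - 1
        activations = dense(activations, weights, biases, sigmoid if is_output else relu)
    return activations


if __name__ == "__main__":
    net = build_network([3, 4, 4, 1])  # input, two hidden layers, output
    print(forward(net, [0.5, -1.0, 2.0]))
```

With `[3, 4, 1]` the same code describes an ordinary single-hidden-layer network; the "deep" part is nothing more than the extra entries in `layer_sizes`.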

  21. In Database
      ▸ Move the work to the data
      ▸ Database to manipulate the data
      ▸ SAS and Teradata 10-year partnership
      ▸ Phenomenal reductions in processing
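
"Move the work to the data" can be sketched by comparing two ways of computing the same totals. SQLite (from the Python standard library) stands in for the database here purely for illustration; it is not the SAS or Teradata tooling the slide refers to, and the `sales` table and function names are hypothetical.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory table to aggregate (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 7.5)],
    )
    return conn


def totals_client_side(conn):
    """Pull every row out of the database, then aggregate in the client process."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Push the aggregation into the database; only the summary rows come back."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = make_db()
    print(totals_client_side(conn))
    print(totals_in_database(conn))
```

Both functions return the same answer, but the second transfers one row per region rather than one row per sale, which is where the reduction in processing and data movement would come from on large tables.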

  22. In Memory
      ▸ SAS leading the way
      ▸ Visual Analytics/Statistics
      ▸ HPA procs
      ▸ Teradata 750 appliance
      ▸ Viya

  23. THANKS! Any questions? You can find me at @pwsegal-ca
