Data Engineer Our Data Science with Significant Statistics, to Enrich Success by Enhancing Trust and Value of Society - PowerPoint PPT Presentation

Arnold Goodman, Cofounder of the Interface Symposia and of Statistical Analysis and Data Mining: The ASA Data Science Journal


  1. Data Engineer Our Data Science with Significant Statistics, to Enrich Success by Enhancing Trust and Value of Society
     Arnold Goodman, Cofounder: Interface Symposia plus Statistical Analysis and Data Mining: The ASA Data Science Journal
     • Did Not “Jack” Good Use His Statistics to Enrich Success of Alan Turing A. I. to Break the Germans’ Enigma Code, without High Commanders of Germany Learning Code Was Broken and Fixing It? 1
     • DATA ENGINEER TRUST AND VALUE: Why Not Enhance Processes by Our Products, Models by Our Meanings, Algorithms by Our Actions, Data by Our Descriptions and Possibilities by Our Probabilities of Consequences?
     • DATA ENGINEER OUR DATA SCIENCE: Why Not Balance Our Data Science Deduction by Inference, Efficiency by Effectiveness, Findings by Likely Conclusions, Precision by Accuracy, and Speed by Far Improved Quality?
     • Data Engineering with Statistics to Iteratively Associate Leo Breiman’s Algorithmic Model Prediction with a Likely-Good Data Model Relationship
     • STEIN SHRINKAGE, SUPERVISED LEARNING, OR AN APPROPRIATE TECHNIQUE MAY BE USED TO MOVE ALGORITHMIC MODEL PREDICTION AND DATA MODEL PREDICTION EVER CLOSER TO EACH OTHER.
     • OTHERS ARE CHALLENGED TO EMPLOY MY ASSOCIATION, EVALUATE ITS RESULTS AS WELL AS DOCUMENT THEIR SITUATIONS, PROBLEMS, IMPROVEMENTS, FINDINGS AND CONCLUSIONS.
     Sources: Communications of the ACM 58:6, 46, June 2015; InformationWeek, Before 62, February 8, 1993.

  2. Data Science Evidence for Data Engineering with Statistics
     • Why Not “Ask Watson or Siri: Artificial Intelligence Is as Elusive as Ever?” 2
     • To Improve “The State of A. I … We Need … More Predictive Models … That Can ‘Routinely Make Predictions’.” 3
     • A. I. Has “Thrill of Discovery” and “Science” - Needs “Data Analysis of Statistics” and “Power of Engineering”. 4
     • The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age - Evaluate A. I. Impacts. 5
     • “Discussed the Problem of Overfitting … in Deep Neural Networks (and) … Techniques to Prevent Overfitting”. 6
     • “95% of Tasks Do Not Require Deep Learning. … In 90% of Cases Generalized Linear Regression Will Do.” 7
     • Does Computing Research Association’s “Incentivizing Quality and Impact (Value)” Not Make Key Significant Point?
     • Does “The Code Issue” Not Explain the Essential Pros and Cons of Both Codes and Coders in Plain Language? 8
     • “Controlled Experimentation (Is the) … Only Way to Establish Cause and Effect (, Causation and Causality).” 9
     • “Turning Uncertainty into Breakthrough Opportunities” Is the Subtitle of This Insightful and Inspiring Book. 10
     • The “Great Thing … about Data Mining/Data Science Community (Is) … Machine Learners, Statisticians, … .” 11
     • “Statisticians David J. Hand (2nd) and John Elder (5th) within ‘Videolectures’, in Their Number of Viewings.” 12
     • “Hiring Data Scientists”: “Statistics, Machine Learning &/or Data Mining” 13
     • Drucker: “Innovation in Any … Area Tends to Originate Outside the Area.” (Drucker Institute MONDAY*, 7/6/15)
     • Big Data Is a “Major Engineering and Mathematical Challenge”, and So Is a Balanced Analysis in Data Science. 14
     • Stanford University Has Institute for Computational and Mathematical Engineering to Do Its Data Engineering.
     • “Do Not Know: How Algorithms Perform Quantitatively, Good Parameter Settings, Processes Underlying Data, Algorithm (Re)Evaluation/Comparison”, “Matching Patterns to Reality”, “Data Problem”, and “Data Science”. 15
     • PhD Topics: “Pattern(s) in Evolving Data”, “Version Control under Uncertainty” and “Ranking with ER Graphs” 16

  3. Data Engineering with Statistics to Enhance Trust and Value
     • Is Computing Not: Deduction, Efficiency, Learning Some Potential Findings, Precision and Speed?
     • Is Statistics Not: Inference, Effectiveness, Likely Conclusions from Findings, Accuracy and Quality?
     • Why Not Data Engineer Our Data Science to Balance Its Deduction by Inference, Efficiency by Effectiveness, Findings by Likely Conclusions, Precision by Accuracy, and Speed by Quality?
     • Did Symposia on the Interface of Computing and Statistics Not Define Data Science to Be the Interface, with Big Data and Data Science Conceived, Born and Even Matured in Them? 17 + Refs
     • Does Data Science Not Function with: Processes, Models, Algorithms, Data and Possibilities?
     • Does Data Engineering Not Enable: Products, Meanings, Actions, Descriptions and Probabilities?
     • Why Not Data Engineer Trust and Value by Enhancing Processes with Products, Models with Meanings, Algorithms with Actions, Data with Descriptions, and Possibilities with Probabilities?
     • Founding Fathers Engineered Improbable Improvisation of United States (7/3/15 “Charlie Rose”).
     • If Mathematics Yields No Solutions, Data Engineering Iterates and Simulates to Far Smarter Data.
     • My 1960s Data Engineering: Iterative Weighted Least Squares (1st Regression Mixed Model and 1st Hierarchical Model), (1st Text) Relationship Analytics, Experimental and Simulation Analysis, and Model to Guide Negotiators of $1B Incentive Prime Contract to Put Man on Moon. 18 + Refs
     • My Data Engineering: 2012 INFORMS Conference on Business Analytics and Operations Research.

  4. 2001 Is a Data Engineering and Statistics Year, yet Computing Now Rules Market
     “We Must Sit Loosely in the Saddle of the Data.” *
     “Vaguely Correct Is Better Than Precisely Wrong.” *
     * Heard by Me During 1958/59 Stanford Statistics Seminar
     John Tukey Was the Pioneer of Data Mining, Data Science, and Now Data Engineering:
     With Harry Press, “Power Spectral Methods”, NATO Flight Test, 1956
     “The Inevitable Collision Between Statistics and Computation”, 1963 IBM Scientific Computing Sym.
     Exploratory Data Analysis, 1977
     • Trevor Hastie, Rob Tibshirani and Jerry Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction 19
     • Leo Breiman, “Two Cultures of Statistical Modeling” 20
     • William Cleveland, “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics”. 21
     • Padhraic Smyth and I Enriched Computing, Statistics, and Bioinformatics (with Own Day) by Interface ’01. 22
     • Computer Scientists Who Data Engineer with Statistics Include Tom Dietterich, Pedro Domingos, Usama Fayyad, Chandrika Kamath, Vipin Kumar, Zoran Obradovic, Peter Norvig, Padhraic Smyth and Mohammed Zaki.
     • Most Statisticians Who Data Engineer with Statistics Include John Chambers, Dick De Veaux, Bill DuMouchel, Brad Efron, John Elder, Jim Goodnight, David Hand, Alan Izenman, Michael Jordan, Jon Kettenring, David Madigan, Richard Olshen, Art Owen, Daryl Pregibon, Stuart Russell, John Sall, Dan Steinberg, Hal Stern, Joe Verducci, Ed Wegman and Lee Wilkinson - I May Have Missed Some.
     Gene Wiley Created 1976 Cartoon for Me.

  5. Associate Algorithmic Prediction with Likely-Good Data Relationship
     • Leo Breiman Defined Both the Algorithmic Modeling (Computing) Culture and Data Modeling (Statistics) Culture, for Predicting a Process (System) Output Vector Y from Its Input Vector X.
     • An Algorithmic Model Predicts Y, while a Data Model Predicts Y Based on a Linear Relationship.
     • Charles Stein Observed to Me in 1960: “When There Are Two Things That You Know Are True and They Contradict Each Other, You Are About to Learn Something.” - Challenge Is to Learn!
     • If We Desire to Engineer the Value of Y by Varying the Value of X in Addition to Predicting Y, May Associating an Algorithmic Prediction with a Relationship of Y to X Not Be Significant?
     • Might Algorithmic and Data Predictions Becoming Sufficiently Close, to Essentially Associate an Algorithmic Prediction with a Likely-Good Data Relationship, Not Be Significant Progress?
     • My Complete System Analysis Iteratively Integrates Experimental and Simulation Analysis. 18
     • Why Not Data Engineer Predictions, Performing Specified Iterations to Associate Algorithmic Prediction with a Sufficiently-Close Data Relationship, Which Is Likely Good, by Modifying It?
     • May It Not Likely Yield Integration of Deduction with Inference, Efficiency with Effectiveness, Findings with Likely Conclusions, Precision with Accuracy, and Speed with Quality?
     • Others Are Challenged to Employ My Proposed Association and to Evaluate Their Results as Well as to Document Their Situations, Problems, Improvements, Findings and Conclusions.
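
The association proposed in slide 5 might be sketched as follows. This is only an illustration under stated assumptions, not the author's procedure: the data are synthetic, a k-nearest-neighbor average stands in for the algorithmic model, a least-squares line stands in for the data model, and the "moving closer" step is plain linear shrinkage of the algorithmic predictions toward the line (not the James-Stein estimator). All function names (`fit_line`, `knn_predict`, `associate`, `demo`) are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Data model (assumed): ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def knn_predict(xs, ys, x0, k=5):
    """Algorithmic model (assumed stand-in): mean of the k nearest responses."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_gap(p, q):
    """Average absolute difference between two prediction vectors."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def associate(algo_pred, data_pred, tol=0.05, step=0.1, max_iter=20):
    """Iteratively shrink algorithmic predictions toward the data-model
    predictions; stop once the gap is within tol or max_iter is reached."""
    blended = list(algo_pred)
    history = [mean_abs_gap(blended, data_pred)]
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        blended = [(1 - step) * a + step * d
                   for a, d in zip(blended, data_pred)]
        history.append(mean_abs_gap(blended, data_pred))
    return blended, history


def demo():
    random.seed(0)
    xs = [i / 10 for i in range(50)]
    # Synthetic process: mostly linear, mild curvature, Gaussian noise.
    ys = [1.0 + 2.0 * x + 0.1 * x * x + random.gauss(0, 0.2) for x in xs]
    a, b = fit_line(xs, ys)
    data_pred = [a + b * x for x in xs]
    algo_pred = [knn_predict(xs, ys, x) for x in xs]
    _, history = associate(algo_pred, data_pred)
    print(f"linear relationship: y = {a:.3f} + {b:.3f} x")
    print(f"gap before: {history[0]:.4f}, after: {history[-1]:.4f}, "
          f"iterations: {len(history) - 1}")


if __name__ == "__main__":
    demo()
```

The recorded gap history gives one way to document how close the two predictions became, and after how many iterations, in the spirit of the challenge to evaluate results and document findings. Whether the associated linear relationship is "likely good" would still need a separate check on held-out data.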
