Kaycee Lai, CEO & Founder Presto Summit NYC 2019
WHO WE ARE

EXEC TEAM
$400M+ from successful startup exits
Pedigree from GOOG, VMW, MSFT, ORCL

Dr. Shuo Yang, VP, Engineering
- Ph.D. CS from Purdue Univ.
- Key member of "Borg" @Google
- Built cloud native analytics @EA

Azary Smotrich, Principal Architect
- Office of CTO @Oracle
- Founding Eng. @Model N
- Prescriptive Analytics @NASA
- Founding Eng. @Waterline

Kaycee Lai, CEO & Founder
- GM $120M P&L @EMC
- President @Waterline Data
- VP Sales @Virsto (VMware), @Avamar (EMC), @Delphix

DOMAIN EXPERTISE
Data Ops / Big Data / Analytics / Cloud / Data Management / Cluster Management / Data Governance

TOP INVESTORS
Successful track record nurturing startups to success
- Jocelyn Goldfein, Board Member, Zetta Ventures
- Arnold Silverman, Board Member, Discovery Ventures
- Graham Brooks, Board Member, .406 Ventures
- Jeff Parks, Investor, Riverwood
GETTING ANSWERS FOR BI IS EVEN HARDER
1. Discover Data
2. Move Data
3. Prep Data
4. SQL Statement
5. Query Data
6. Visualize Data
7. Govern Data
Timeline labels across the steps: Months, Weeks, Days, Months, Days, Hours
Not sure if the data is right until step 6!
>4 months to answer 1 question
Resources Required: Business Analysts / IT / Data Scientists / DBAs / BI Developers / SIs
Resources Required: Data Governance / Office of CDO / Compliance
BI/ANALYTICS SHOULD BE ABOUT ANSWERING QUESTIONS… RIGHT?
OUR VISION TO SIMPLIFY BI & ANALYTICS
Reduce a 4 month process to minutes
1. Connect: Any Data Source
2. Reveal: Location / Relationships / Instructions
3. Rationalize: Intent of question / Assembly logic / SQL Statement
4. Execute: Federated Query / BI Integration
DATA AS A SERVICE WITH PROMETHIUM
CONNECT: Data Sources, Data Catalogs
REVEAL: Data Discovery
- Location (Data Explorer)
- Relationships (Data Map)
- Instructions (Directions)
RATIONALIZE: Data "Prep"
- NLP Search (Question Builder)
- Logical Guidance (Reasoner)
- Auto SQL Statement (SQL AI)
EXECUTE: Federated Query
VISUALIZE: Self-Service Analytics
FAST, SCALABLE, SAAS PLATFORM ON CLOUD (AWS)
ARCHITECTURE
SCALABLE ARCHITECTURE
[Diagram components: Front End, Query Execution, AI/NLP, Data Context Engine]
KEY COMPONENTS
[Diagram components: 3rd Party Data Catalog, Query Execution, Data Sources]
HOW IT WORKS - CONNECT
[Diagram: Smart Bots connect to Data Sources (Cloud, JDBC, HDFS) and to 3rd Party Data Catalogs]
INFO FROM SMARTBOTS:
1. API-Based (e.g. JDBC)
2. Name / Location / Schema
3. No heavy processing & data movement
4. Alt. Names: Tags / Synonyms
5. Data Quality
6. Lineage
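A minimal sketch of the metadata-only idea behind items 1 to 3 above. It is not Promethium's implementation: it assumes a SQLite database as a stand-in for a JDBC source, and `TableMetadata` and `harvest_metadata` are hypothetical names. Only names, location, and schema are collected; no rows are read or moved.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class TableMetadata:
    """Catalog entry: name, location, and schema only, no row data."""
    name: str
    location: str
    columns: dict  # column name -> declared type


def harvest_metadata(conn, location):
    """Collect table names and column types without reading any rows."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        columns = {row[1]: row[2] for row in info}
        entries.append(TableMetadata(table, location, columns))
    return entries


# Hypothetical source with two tables; the data itself is never copied.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")

catalog = harvest_metadata(source, location="sqlite://in-memory")
for entry in catalog:
    print(entry.name, entry.location, entry.columns)
```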
HOW IT WORKS - REVEAL
DATA CONTEXT ENGINE
DATA EXPLORER (FIND DATA)
1. Table/File/Column Name
2. Vendor Name / Data Type
3. Location (IP address / URL)
DIRECTIONS (ASSEMBLE)
1. Tables / Files
2. From what Vendor
3. Select / Join
DATA MAP (VISUALIZE)
1. Topology
2. Relationships
3. Alternate Versions
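One simple way to picture the "Relationships" entry of the Data Map is to link tables that share a column name. The sketch below assumes that heuristic purely for illustration; the slides do not say how relationships are actually detected, and the table, vendor, and column names are hypothetical.

```python
from itertools import combinations

# Hypothetical catalog metadata: table -> (vendor, column names).
catalog = {
    "customers": ("postgres", ["customer_id", "name", "region"]),
    "orders": ("oracle", ["order_id", "customer_id", "total"]),
    "shipments": ("hive", ["shipment_id", "order_id", "carrier"]),
}


def candidate_relationships(catalog):
    """Return (table_a, table_b, shared_columns) for tables sharing column names."""
    edges = []
    for a, b in combinations(sorted(catalog), 2):
        shared = sorted(set(catalog[a][1]) & set(catalog[b][1]))
        if shared:
            edges.append((a, b, shared))
    return edges


for a, b, cols in candidate_relationships(catalog):
    print(f"{a} ({catalog[a][0]}) <-> {b} ({catalog[b][0]}) on {cols}")
```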
HOW IT WORKS - RATIONALIZE
DATA CONTEXT ENGINE
DIRECTIONS (ASSEMBLE)
1. Change Join Types
2. Change Join Operators
3. Auto-Create SQL Statement for Presto
DATA MAP (VISUALIZE)
1. Delete / Change Tables
2. Find Missing Tables via Catalog
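The step from assembly directions to a SQL statement can be sketched as a small generator. This is an illustration under assumptions, not the product's SQL AI: `Join` and `build_sql` are hypothetical names, and the tables use Presto's `catalog.schema.table` naming with made-up catalogs. Changing `kind` mirrors "Change Join Types" above.

```python
from dataclasses import dataclass


@dataclass
class Join:
    """One step of the assembly directions: join `table` using `on`."""
    table: str            # fully qualified catalog.schema.table, with alias
    on: str               # join condition
    kind: str = "INNER"   # join type the user may change


def build_sql(columns, base_table, joins):
    """Render a SELECT statement from the base table and ordered joins."""
    allowed = {"INNER", "LEFT", "RIGHT", "FULL"}
    lines = ["SELECT " + ", ".join(columns), f"FROM {base_table}"]
    for j in joins:
        if j.kind not in allowed:
            raise ValueError(f"unsupported join type: {j.kind}")
        lines.append(f"{j.kind} JOIN {j.table} ON {j.on}")
    return "\n".join(lines)


# Hypothetical directions: customers in one catalog, orders in another.
sql = build_sql(
    columns=["c.name", "o.total"],
    base_table="postgresql.public.customers AS c",
    joins=[Join("hive.sales.orders AS o", "c.customer_id = o.customer_id", "LEFT")],
)
print(sql)
```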
HOW IT WORKS - EXECUTE
VIRTUAL VIEW
Query Initiated
Direct Access to Data – No ETL
THE NO-ETL APPROACH
HOW IT WAS DONE (~ 4 MONTHS)
1. Build complex data pipelines
2. Schedule long running ETL jobs
3. Copy/Move data to a data warehouse / lake
4. Manually subset, join, write SQL statements
5. Query against data warehouse / lake
HOW IT CAN BE TODAY (~ 4 MINUTES)
1. Select data discovered
2. Run queries directly from source
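To make "run queries directly from source" concrete, here is a toy stand-in for a federated query. It assumes two separate in-memory SQLite databases playing the role of two source systems (the deck itself targets Presto); the table names and rows are hypothetical. A single statement joins both in place, with no copy into a warehouse first.

```python
import sqlite3

# Two separate in-memory databases stand in for two independent source systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

conn.execute("CREATE TABLE main.customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")]
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)", [(1, 40.0), (1, 60.0), (2, 25.0)]
)

# One query joins both sources in place; nothing is staged in a warehouse.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.total) AS revenue
    FROM main.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```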
SUPPORT & INTEGRATION
[Logo grid; category labels: Data Sources, Data Lineage, Data Virtualization, Platform (public cloud or on-prem VPC), Data Catalog, Data Visualization, Superset]
Data source types: RDBMS: HDFS: * S3 based: * Cloud:
* On Roadmap
BUSINESS IMPACT
PAIN POINTS ADDRESSED

PROBLEM: FINDING DATA IS COMPLEX & TIME CONSUMING.
- Data is fractured across multiple systems, multiple vendors and multiple locations
- No single tool can search for data across the entire data estate
- Loading all of the data into a single repository is expensive & time consuming
SOLUTION: PROMETHIUM'S DATA EXPLORER™
- 1 SINGLE solution to find data without the need to move data
- Reveals context of data across all vendors and systems

PROBLEM: ANSWERING QUESTIONS STILL TAKES HUGE MANUAL EFFORT POST DATA DISCOVERY.
- Need to know data relationships to know what / how to join
- SQL statements can take up to 8 hours to create
- Few people in the organization can write a valid SQL statement
SOLUTION: PROMETHIUM'S QUESTION BUILDER™
- NLP driven method to transform questions into data
SOLUTION: PROMETHIUM'S DATA MAP & DIRECTIONS
- Instantly generates STEP BY STEP assembly directions + DATA MAP
SOLUTION: PROMETHIUM'S SQL AI
- Instantly generates a valid SQL statement

PROBLEM: QUERIES ACROSS DIFFERENT SOURCES / VENDORS = HARD TO EXECUTE
- Insights are limited as SQL statements can only reflect data from one system
- Highly manual two-step process of moving data from each separate vendor, then manually joining the data
SOLUTION: PROMETHIUM'S KALEIDOSCOPE™
- 1-STEP federated query execution across various sources, with integration for BI tools such as Tableau
TODAY: TIME & EFFORT

Cost per role
Data Analyst Cost                  $125,000
Data Engineer Cost                 $250,000
Business Analyst Cost              $90,000

                                   TODAY
Time to Find Data                  4 weeks
Time to Move Data                  5 days
Time to Subset/Model/Join          2 months
Time to Write 1 SQL Statement      8 hours
Time to Aggregate & Query Data     3 days
# Data Analysts                    4
# Data Engineers                   2
# Business Analysts                2

                                   TODAY
# of People Involved               8
Amount of Time (months)            3 month+
# of Questions answered in 1 year  < 4
Cost of Asking 4 Questions         $549,973
PROMETHIUM EFFICIENCY

Cost per role
Data Analyst Cost                  $125,000
Data Engineer Cost                 $250,000
Business Analyst Cost              $90,000

                                   TODAY       PROMETHIUM
Time to Find Data                  4 weeks     1 min
Time to Move Data                  5 days      0
Time to Subset/Model/Join          2 months    2 sec
Time to Write 1 SQL Statement      8 hours     1 min
Time to Aggregate & Query Data     3 days      1 min
# Data Analysts                    4           1
# Data Engineers                   2           0
# Business Analysts                2           0

                                   TODAY       PROMETHIUM   EFFICIENCY
# of People Involved               8           1            People: 7X less
Amount of Time                     3 month+    ~3 min       Time: ~10,000X less
# of Questions answered in 1 year  < 4         40,000
Cost of Asking 4 Questions         $549,973    $7.30        Cost: ~75,000X less

For 7X less resources & 75,000X less cost, Promethium can answer up to 10,000X more questions.
What can a business do if it has a 10,000X increase in efficiency to answer questions & gain insights?
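The ratios can be checked with a few lines of arithmetic. The sketch below uses only figures stated on the slide and treats "< 4" questions per year as 4, which is an assumption; the variable names are illustrative. It does not attempt to reproduce how the $549,973 and $7.30 figures were derived, since the slide does not say.

```python
# Figures copied from the slide; nothing here is measured independently.
today = {"people": 8, "questions_per_year": 4, "cost_4_questions": 549_973.00}
promethium = {"people": 1, "questions_per_year": 40_000, "cost_4_questions": 7.30}

headcount_ratio = today["people"] / promethium["people"]
question_ratio = promethium["questions_per_year"] / today["questions_per_year"]
cost_ratio = today["cost_4_questions"] / promethium["cost_4_questions"]

# Annual staffing cost implied by the listed salaries and headcounts.
annual_team_cost = 4 * 125_000 + 2 * 250_000 + 2 * 90_000

print(f"headcount ratio: {headcount_ratio:.0f}:1")
print(f"questions per year ratio: {question_ratio:,.0f}X")
print(f"cost ratio: {cost_ratio:,.0f}X")
print(f"annual team cost today: ${annual_team_cost:,}")
```

On these inputs the headcount ratio is 8:1 (7 fewer people), the questions ratio is 10,000X, and the cost ratio is a little over 75,000X.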
AI-DRIVEN APPROACH WITH PROMETHIUM
Discover / Ask a Question (NLP) / Prep / Execute
Thank you!