
Modernizing Data Estates with Presto. Ken Seier, Chief Architect



  1. Modernizing Data Estates with Presto
     Ken Seier, Chief Architect | Data & AI
     ken.seier@insight.com

  2. Insight fast facts
     • DEEP PORTFOLIO & RELATIONSHIPS: 3,500+ hardware, software and cloud partners
     • ENGAGED WORKFORCE: 11,000+ Insight teammates worldwide
     • GLOBAL REACH: 19 countries serving clients around the globe
     • FOUNDED IN 1988: Fortune 500 company with long legacy and knowledge
     • FINANCIAL STABILITY: $9B+ in revenue in 2018
     • BROAD EXPERTISE: 7,500+ sales and service delivery professionals

  3. Presto today
     • Targeted query federation for line-of-business applications or reporting
     • Ad hoc analytics enablement
     • Tech and Retail verticals, with some FinServ
     https://db-engines.com/en/ranking_trend/system/Presto

  4. Federated queries and data aggregation: Presto doing what we know it's good at, and a little more.

  5. Challenge
     • Global technical services company
     • 500,000+ customers
     • 300,000+ events/second
     • End-user investigation tool with cumbersome Java query tier

  6. Federated query solution
     • Custom insights UX
     • Simplified Java/SQL services
     • Starburst Presto query fabric
     • Detail events in Amazon S3
     • Event aggregates in Elasticsearch

  7. Pre-aggregation ETL solution Amazon Elasticsearch

  8. Outcomes
     • Rationalized Java query tier to single Presto SQL source
     • Implemented pre-aggregation ETL in same AWS/Java/Presto toolset
     • Elasticsearch queries through Presto over 1 million documents return in <2 seconds
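The pre-aggregation idea on slides 7 and 8 can be illustrated with a minimal sketch. The deck says the real ETL was built in the AWS/Java/Presto toolset; the Python below is only an illustration of the rollup step, and the event shape (`customer_id`, epoch seconds), the function name `pre_aggregate`, and the 60-second bucket are all assumptions not taken from the slides.

```python
from collections import Counter


def pre_aggregate(events, bucket_seconds=60):
    """Roll detail events up into counts per (customer, time bucket).

    `events` is an iterable of (customer_id, epoch_seconds) pairs; this
    shape is hypothetical and chosen only for illustration.
    """
    counts = Counter()
    for customer_id, epoch_seconds in events:
        # Align each event to the start of its time bucket.
        bucket_start = epoch_seconds - (epoch_seconds % bucket_seconds)
        counts[(customer_id, bucket_start)] += 1
    return dict(counts)


# Hypothetical detail events: three for customer c1, one for c2.
events = [("c1", 0), ("c1", 59), ("c1", 60), ("c2", 61)]
aggregates = pre_aggregate(events)
```

In the architecture described on the slides, compact aggregates of this kind would sit in Elasticsearch while the detail events stay in Amazon S3, with both reachable through Presto.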

  9. Big Data 2.0: Presto is a lighter replacement for aging big SQL tools.

  10. Challenge
      • Global software-as-a-service company
      • 15,000,000+ customers
      • Ad-hoc queries over 100 terabytes of cleansed data
      • Aging on-prem big-data-SQL implementation challenged to scale

  11. Hive to Presto
      • Before: HiveQL queries over the data lake
      • After: ANSI SQL queries through Starburst Presto over the data lake

  12. Outcomes
      • Data-in-place replacement for Hive
      • Migrate from HiveQL to ANSI SQL
      • Many-X concurrency improvement over Hive
      • 10X performance over Spark benchmarks
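One small part of a HiveQL to ANSI SQL migration can be sketched as follows. HiveQL quotes identifiers with backticks, while Presto follows the ANSI convention of double quotes. The helper name `backticks_to_ansi` is hypothetical, and this is only a sketch of a single rewrite rule, not the migration approach used in the engagement; it does not handle backticks inside string literals or other dialect differences.

```python
import re


def backticks_to_ansi(hiveql):
    """Rewrite Hive-style `identifier` quoting to ANSI "identifier" quoting.

    Assumption: the query contains no backticks inside string literals.
    """
    return re.sub(r"`([^`]*)`", r'"\1"', hiveql)


# Hypothetical query; the table and column names are illustrative only.
hive_query = "SELECT `user id` FROM `events` WHERE region = 'emea'"
ansi_query = backticks_to_ansi(hive_query)
```

A real migration would also need to review functions, type casts and other HiveQL constructs that have no identical ANSI form.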

  13. Unified Query Plane: Using Presto to simplify and de-risk legacy data management

  14. Challenge
      • Global manufacturer/retailer
      • $20,000,000,000+ globally
      • Rich operational ecosystem
      • Aggressively working toward comprehensive stack rationalization

  15. Legacy data estate

  16. Presto data fabric

  17. Outcomes
      • Many, many Presto sources and consumers
      • Supporting data science and line-of-business on isolated clusters
      • Presto abstraction over legacy systems enables table-by-table migrations
      • Row and column level RBAC enabled in Ranger
      • End-to-end automation for registering and managing data definitions: metadata, stats and security
      • Query-grain costbacks enabled with log listener
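The query-grain costback outcome above can be sketched as a simple aggregation over per-query log records. The slides do not describe the listener or its output, so the record fields (`query_id`, `user`, `cpu_seconds`), the function name `costback_by_user`, and the flat rate are all hypothetical.

```python
from collections import defaultdict


def costback_by_user(query_log, rate_per_cpu_second):
    """Sum a per-query charge for each user from query log records.

    Each record is a dict with hypothetical keys "user" and "cpu_seconds".
    """
    totals = defaultdict(float)
    for record in query_log:
        totals[record["user"]] += record["cpu_seconds"] * rate_per_cpu_second
    return dict(totals)


# Hypothetical records, as a log listener might emit one per completed query.
query_log = [
    {"query_id": "q1", "user": "data_science", "cpu_seconds": 120.0},
    {"query_id": "q2", "user": "finance", "cpu_seconds": 30.0},
    {"query_id": "q3", "user": "data_science", "cpu_seconds": 60.0},
]
charges = costback_by_user(query_log, rate_per_cpu_second=0.5)
```

Because each record is one query, the same data supports charging by team, cluster or data source by changing the grouping key.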

  18. Trends: Where Presto may be headed

  19. Presto today
      • Targeted query federation for applications or reporting
      • Ad hoc analytics enablement
      • Tech and Retail verticals, with some FinServ
      https://db-engines.com/en/ranking_trend/system/Presto

  20. Presto going forward
      • Adoption driven by data science value
      • Drafting with Kubernetes adoption
      • Awareness in new industries
      • Data estate rationalization
      • Blue/green migration abstraction
      • New data tier and estate patterns

  21. Presto going forward: Core data value cases
      • Historical Reporting: line of business reporting for defined historical period using defined metrics and performance indicators (Data warehouse, Data mart)
      • Operational Data Store: line of business reporting for making real-time course corrections in day to day operations (Small, disk-bound or in-memory store)
      • Analytics Discovery: line of business reporting for making real-time course corrections in day to day operations (Data lake, Lab environment)

  22. Presto going forward: Conceptual data architecture
      Three fully-decoupled, horizontally scalable, single-tool tiers
      Diagram labels:
      • ANSI SQL data & insight
      • Data lake of choice
      • Insight enrichment
      • Direct event and transactional data
      • Data staged from source systems
      • Data directly from source systems

  23. Questions?
      Ken Seier, Chief Architect | Data & AI
      ken.seier@insight.com
