
Big Data on Tap. Jonathan Gray, Founder & CEO. November 7, 2016



  1. Big Data on Tap: Unified Integration for Data-Driven Applications
     Jonathan Gray, Founder & CEO
     November 7, 2016
     cask.co

  2. Hadoop Enables New Applications and Architectures
     ENTERPRISE DATA LAKES
     • Batch and Realtime Data Ingestion: Any type of data from any type of source in any volume
     • Batch and Streaming ETL: Code-free self-service creation and management of pipelines
     • SQL Exploration and Data Science: All data is automatically accessible via SQL and client SDKs
     • Data as a Service: Easily expose generic or custom REST APIs on any data
     BIG DATA ANALYTICS
     • 360° Customer View: Integrate data from any source and expose through queries and APIs
     • Realtime Dashboards: Perform realtime OLAP aggregations and serve them through REST APIs
     • NRT Event Monitoring: Reliably monitor large streams of data and perform defined actions within a specified time
     • Realtime Log Analytics: Ingestion and processing of high-throughput streaming log events
     PRODUCTION DATA APPS
     • Recommendation Engines: Build models in batch using historical data and serve them in realtime
     • Anomaly Detection Systems: Process streaming events and predictably compare them in realtime to historical data
     • Time Series Analysis: Store, process and serve massive volumes of time-series data
     • Internet of Things: Ingestion, storage and processing of events that is highly available, scalable and consistent
     Data Applications Drive Meaningful Business Value

  3. But Getting Value from Big Data is Hard
     • Complexity of technologies and new user learning curve
     • Proliferation of projects, services and APIs
     • Divergence of distributions and technologies
     • Integration silos created by narrow point solutions
     Too much focus on infrastructure and integration, rather than applications and analytics

  4. And There Are Many Faces of Hadoop
     • Ops: Configuring & Monitoring. Focused on Infrastructure & SLAs
     • Developer: Architecture & Programming. Focused on Apps & Solutions
     • Data Scientist: Scripting & Machine Learning. Focused on Data & Algorithms
     • LOB / Product: Driving Revenue & Decision Making. Focused on Products & Insights
     Without a consistent set of tools, IT will not be an effective data enabler for the business

  5. Enter Cask
     • Founded in 2011: By early Hadoop engineers from Facebook and Yahoo!
     • Raised $37+ Million: Andreessen Horowitz, Safeguard, Battery Ventures and Ignition Partners
     • Strategic Investors: AT&T, Cloudera and Ericsson
     • Latest Release 3.6: Cask Data Application Platform, Cask Hydrator and Cask Tracker
     • Key Customers & Partners: AT&T, Ericsson, Lotame, Salesforce, Cloudera, Hortonworks, MapR, Microsoft, IBM, Tableau…
     • NEW: CDAP 4 Preview, featuring Cask Market, the “big data app store”
     Why “Cask”? A Container Architecture that puts Big Data on Tap

  6. The Evolution of the Cask Platform: Convergence of Big Data Apps and Data Integration
     CDAP v2: Big Data App Server (“WebLogic for Hadoop”)
     • Abstractions & integrations
     • Metrics & logs
     • Debugging environment
     CDAP v3: Big Data Apps + Data Integration (“WebLogic Meets Informatica”)
     • Data ingest
     • Data pipelines
     • Workflows and metadata
     CDAP v4: Unified Integration for Big Data (“Unified Big Data Integration”)
     • Security & governance
     • Self-service environment
     • Enterprise integrations

  7. Introducing Cask Data Application Platform (CDAP)
     First Unified Integration Platform for Big Data
     Use cases: Data Lake, Fraud Detection, Customer 360, Recommendation Engine, Sensor Data Analytics
     Components: Self-Service User Experience, Security & Governance, Distributed Application Framework, Modern Data Integration
     • Platform for distributed apps, bringing together enterprise-grade application management with data integration
     • 100% open source and built for extensibility
     • Supports all major Hadoop distributions and clouds
     • Integrates the latest open source big data technologies

  8. Modern Data Integration
     INGEST: any data from any source
     EXPLORE: for analytics and data science
     PROCESS: for ETL and machine learning
     SERVE: any data to any destination
     • Real-time and Batch
     • Reliable and Scalable
     • Simple and Self-Service

  9. Distributed Application Framework
     DEVELOP: rapidly build powerful applications
     TEST: test and run in any CI framework
     DEPLOY: apps in any environment
     SCALE: horizontally scale apps and data
     • Real-time and Batch
     • Memory, Local, Distributed
     • Analytics and Applications

  10. Security and Governance
     CAPTURE: store all metadata about your data
     DISCOVER: easily locate any of your data
     TRACK: every audit plus lineage graphs
     ANALYZE: understand usage patterns of data
     ENCRYPT, AUTHENTICATE, AUTHORIZE

  11. Self-Service User Experience
     A code-free framework to build and run data pipelines
     • Create, debug, deploy and manage
     • Separation of logic and execution environment
     • Native to Hadoop & Spark, scales out
     • Drag & drop graphical interface
     A data discovery tool to explore metadata and usage
     • Rich app-level metadata
     • Track lineage and audits
     • Analyze usage of datasets
     • MDM integration framework

  12. The CDAP Architecture
     Applications are composed of Datasets and Programs, with Metadata at every level
     Datasets: Table, Avro, Parquet, Timeseries, OLAP Cube, Geospatial, ObjectStore
     Programs: MapReduce, Spark, Tigon, Workflow, Service, Worker
     • Application Container Architecture
     • Reusable Programming Abstractions
     • Global User and Machine Metadata
     • Highly Extensible Plugin Architecture

  13. CDAP Enables the Full Big Data Application Lifecycle
     Rapid Development
     • Standardization, deep integrations, tools and docs
     • Separation of app logic from data logic and integration logic
     • Conceptual integrity within applications and consistency across environments
     Production Operations & Governance
     • Simplified packaging, deployment and monitoring of apps on Hadoop
     • Enhanced security and governance with centralized metrics and logs
     • Tracking and exploration of metadata, data provenance, audit trails and usage analytics
     Single framework for building and running data apps and data lakes on Hadoop and Spark
     • reduces time to develop and deploy big data apps by 80%
     • reduces time to insights and accelerates business value
     • removes barriers to innovation and future-proofs your apps

  14. Customer Success Stories
     Leading SaaS Platform: taking new real-time, massive scale products to market
     • Customer Situation: Small team and significant technical challenges limit pace of development and solution scale
     • Cask Solution: CDAP for real-time ingestion and consistent processing with production operations support
     • Results: Development in 1 month, Production in 3 months
     Large Telco Enterprise: building a centralized, secured, multi-tenant Data Lake
     • Customer Situation: Multiple teams and technologies with widely varied skillsets and incompatible design choices
     • Cask Solution: CDAP for data lake management and orchestration, tightly integrated into existing systems
     • Results: Hundreds of Users, Thousands of Pipelines
     Health Insurance Provider: offloading clinical / immunization reporting from Netezza
     • Customer Situation: Lack of existing Hadoop expertise and frustration with hand-coding and scripting tools
     • Cask Solution: Cask Hydrator for rapid creation of data pipelines and Cask Tracker for data discovery
     • Results: POC in 2 days, Production in 2 months

  15. Awards and Accolades
     • “CDAP is a big win for us … the amount of code we needed to write was minimal with CDAP, and it was much easier and faster than we ever expected …” (Jia-Long Wu, Data Architect, Lotame)
     • “Cask has tilted the playing field, earning a massive unfair advantage over proprietary point products for data integration and ingest …” (Nik Rouda, Senior Analyst, Enterprise Strategy Group)
     • “… for the rest of us who lack the technological chips or patience to make it all work, there’s good news: it will soon get easier, thanks to the work done by the big data pioneers, as well as vendors like Cask …” (Alex Woodie, Managing Editor, Datanami)
     • Cask was named a Gartner Cool Vendor 2016
     • Cask was certified a Great Place to Work 2016

  16. NEW: CDAP 4 - Big Data Apps on Tap!
     Release of CDAP 4 Preview: available for download now!
     • Cask Market: “Big Data App Store”
     • Cask Wrangler: Interactive Data Preparation
     • Reimagined CDAP UI: Rewrite based on React
     • Resource Center: Interactive Wizards for Common Tasks

  17. NEW: CDAP 4 - Big Data Apps on Tap!
     Cask Market: The “App Store for Big Data”
     • Goal: Time to value in minutes with no existing experience
     • Application and Library Ecosystem with pre-built Hadoop solutions, reusable templates, and third-party plugins
     • Available from anywhere inside the CDAP UI with a click
     • Initially, everything in the Cask Market has been bootstrapped by Cask based on ongoing work across our customers; it is 100% open source and available on GitHub
     • Eventually, developers and ISVs will be able to showcase and market their own applications and libraries (e.g., Graylog)
     Cask Market includes Interactive, Guided Wizards for Configuring Pre-Built Templates
