
Insights Big Data Analytics Processing on streaming, hot and historical data



  1. Simpler, Smarter and Faster Insights: Big Data Analytics Processing on streaming, hot and historical data. Rajiv Shah, Director of Solution Architect and Professional Services, 2019

  2. About GigaSpaces: We deliver the fastest big data analytics processing platform to run your analytics & machine learning in production, at scale
     • 300+ Direct customers
     • 50+ / 500+ Fortune / Organizations
     • 5,000+ Large installations in production (OEM)
     • 25+ ISVs

  3. GigaSpaces Select Customers OEMs / ISVs / Partners

  4. AnalyticsXtreme: Accelerating Your Data Lake by 100X for Real-time Analytics
     Your data is immediately searchable, queryable, and available for analytics
     • Single logical view for hot, warm and cold data
     • Hot data resides on in-memory data grid and historical data on HDFS/Object Store
     • Hot data is mutable and historical data is immutable (parquet)
     • Fast Access: fast access to frequently used historical data
     • Access any data through a unified layer: Analytics (Spark ML), Query (Spark SQL)
     • Automatic lifecycle management: automatically handles the underlying data movement, optimization and deletion
     [Diagram: STREAMING DATA INGEST, HOT & WARM DATA, COLD & ARCHIVED DATA]
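
The "single logical view" and "automatic lifecycle management" points on slide 4 can be illustrated with a minimal Python sketch. It is not the AnalyticsXtreme API: the class `TieredStore`, its methods and the record layout are hypothetical names invented for this example, and two in-process collections stand in for the in-memory grid and the HDFS/Object Store tier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass
class TieredStore:
    """Toy stand-in (hypothetical): a mutable hot tier plus an append-only cold tier."""
    hot_window: timedelta
    hot: Dict[int, dict] = field(default_factory=dict)   # mutable, keyed by record id
    cold: List[dict] = field(default_factory=list)       # only ever appended to

    def write(self, record: dict) -> None:
        # New and updated records always land in the hot tier.
        self.hot[record["id"]] = record

    def age_out(self, now: datetime) -> int:
        """Lifecycle step: move records older than the hot window to the cold tier."""
        cutoff = now - self.hot_window
        expired = [key for key, rec in self.hot.items() if rec["ts"] < cutoff]
        for key in expired:
            self.cold.append(dict(self.hot.pop(key)))
        return len(expired)

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        """Single logical view: the caller never names the tier it reads from."""
        rows = [rec for rec in self.hot.values() if predicate(rec)]
        rows += [rec for rec in self.cold if predicate(rec)]
        return sorted(rows, key=lambda rec: rec["ts"])


now = datetime(2019, 1, 10)
store = TieredStore(hot_window=timedelta(days=2))
store.write({"id": 1, "ts": datetime(2019, 1, 1), "amount": 40})
store.write({"id": 2, "ts": datetime(2019, 1, 9), "amount": 75})
moved = store.age_out(now)                    # record 1 is older than the hot window
big = store.query(lambda rec: rec["amount"] > 30)
```

After `age_out`, record 1 lives in the cold list and record 2 stays hot, yet the same `query` call returns both. That is the property the slide describes, reduced to a few lines.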

  5. GigaSpaces Coverage

  6. GigaSpaces Competitive Edge: SPEED, ANALYTICS, SCALE • Any Data: Live, Transactional & Historical Data • Deploy Anywhere

  7. Data Analytics: Undeniable Value to your Business
     • Dynamic Pricing: helps grow sales by 30% annually
     • Predictive Maintenance: reduces maintenance costs by up to 75% per mile (transportation example)
     • Optimized Operations: saves $100Ks annually (banking example)
     • Personalized Recommendation: increases conversions by up to 20X for brick & mortar stores via location-based promotions
     • Risk Analysis: reduces loan losses by 10-30%
     • Fraud Analytics: reduces losses by 3 to 5% in mature environments and by over 30% in evolving contexts
     • Call Center Automation: increases efficiency by over 90%

  8. The Velocity of Business
     • ECOMMERCE: "A typical e-commerce website will experience 40% bounce if it loads in more than 3 seconds, including personalization offers"
     • FINANCIAL SERVICES: "To prevent fraud, anomaly detection needs to happen against 500,000 txn/sec in less than 200 milliseconds"
     • TELCO: "A call center receives 450,000 calls/day, each call needs to be routed in less than 60 milliseconds"
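
A back-of-envelope check shows what the slide 8 figures imply in terms of load. The Python sketch below uses only the numbers quoted on the slide. It assumes calls are spread evenly over the day (real peaks will be higher) and that every transaction uses its full latency budget; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Telco quote: 450,000 calls/day, each routed in under 60 ms.
calls_per_day = 450_000
avg_calls_per_sec = calls_per_day / SECONDS_PER_DAY   # average rate if spread evenly

# Financial services quote: 500,000 txn/sec, scored in under 200 ms.
txn_per_sec = 500_000
latency_budget_ms = 200

# Little's law: items in flight = arrival rate x time each item spends in the system.
max_in_flight = txn_per_sec * latency_budget_ms // 1000

print(f"average call rate: {avg_calls_per_sec:.1f} calls/sec")
print(f"transactions in flight at the latency limit: {max_in_flight:,}")
```

The two cases differ in kind. The call-center case is a latency problem at a modest average rate of about 5 calls per second. The fraud case has to sustain up to 100,000 transactions in flight at once.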

  9. Use Cases Spanning Industries Benefit from Near Real-time AI Decision Support Systems Built on GigaSpaces
     • FINANCIAL SERVICES: Fraud, Credit risk scoring, Customer 360, Customer churn
     • INSURANCE: Usage based insurance, Customer 360, Customer churn, Claims management
     • RETAIL ECOMMERCE: Personal recommendations, Intelligent inventory mgmt., Customer 360, Locations-based promotions
     • TRANSPORTATION, INDUSTRIAL IOT, MEDIA/TELCO: Predictive maintenance, Inventory planning, Customer 360 (incl. churn), Intelligent call center routing, Fleet management, Data Center Infrastructure Monitoring (DCIM)

  10. InsightEdge: Unifying Real-Time Analytics, AI and Transactional Processing in One Platform
     • Rich ML & DL support
     • Extreme performance
     • Fully Transactional
     • ACID Compliance
     • Enterprise-grade (Security, High Availability)
     • Co-located Apps and Services
     • Seamless integration with Big Data ecosystem: Data sources (Kafka/Nifi/Talend/etc.), Data lakes (S3/Hadoop/etc.), BI tools (Tableau/Looker/etc.)
     [Diagram: Machine Learning & Deep Learning; In-Memory Multi Model Store (KEY-VALUE, GEO SPATIAL, DOCUMENT, TABLE, COLUMNAR, STREAMING); Intelligent Multi-tier Storage Management; STORAGE; ORCHESTRATION; CLOUD/HYBRID/ON-PREMISE]

  11. Traditional vs. Unified "Translytical" Processing
     • TRADITIONAL TRANSACTIONAL/ANALYTICAL PROCESSING: transactional processing and analytics are separate, linked by data replication; SLOW FEEDBACK LOOP
     • UNIFIED TRANSACTIONAL/ANALYTICAL PROCESSING: transactional processing and analytics share an IN-MEMORY DATA GRID; FAST FEEDBACK LOOP
     • IMPACTS: Real-time analytics, Greater situation awareness, Simplified architecture

  12. UNIFYING Analytics and Transactional Processing at SCALE & SPEED
     [Architecture diagram. Inputs and consumers: BI TOOLS, DATA LAKE, DATABASE & DATA WAREHOUSE, APPLICATIONS, MOBILE, WEB, IOT. Workloads: ANALYTICS, MACHINE & DEEP LEARNING; APPS & MICROSERVICES; BI & VISUALIZATION. Services: SECURITY AND AUDITING, MANAGEMENT AND MONITORING, REST ORCHESTRATION, EVENT PROCESSING, MICROSERVICES (REST), MACHINE LEARNING, DEEP LEARNING, SPARK JOBS, SQL/JDBC, NOTEBOOK, STREAMING, RPC & MAP/REDUCE, CDC Engine (CR8). CORE: MULTI MODEL STORE (OBJECTS, JSON, KEY-VALUE, TABLES, TEXT, GEO SPATIAL, GRAPH), IN-MEMORY DATA GRID, EVENT PROCESSING. MemoryXtend - MULTI-TIERED STORAGE: RAM, PERSISTENT MEMORY, SSD, DATA LAKE. WAN GATEWAY - MULTI SITE REPLICATION. CLUSTER MANAGEMENT & SERVICE DISCOVERY. Deployment: ON-PREMISE, HYBRID, CLOUD]

  13. Ultra-low latency and high throughput transactional processing
     • IMDG Partitioned In-Memory Grid: shared-nothing, linear scalability, elastic capacity
     • Co-Location of Data and Business Logic: co-located ops, event-driven, fast indexing
     • Event-Driven Processing and Map/Reduce
     • No Downtime: auto-healing, multi-data center replication, fault tolerance
     • Fast Indexing
     • Multi-Data Model: POJO, .NET, Document/JSON, Geospatial, Time-series
     • Seamless Integration with Java/Scala ecosystem
     • Cloud, Kubernetes, Docker Native
     [Architecture diagram: MOBILE, WEB, IOT; ANALYTICS & BIG DATA, APPS & MICROSERVICES, SEARCH, BI & QUERY; SECURITY AND AUDITING, MANAGEMENT AND MONITORING; EVENT STREAMING, MACHINE LEARNING, SPARK SQL, MICROSERVICES (REST), EVENT PROCESSING, JAVA, .NET, SQL/JDBC, SEARCH; RPC & MAP/REDUCE, DATA MODELS (SPATIAL, POJO, JSON), WEB CONTAINERS; IN-MEMORY DATA GRID; RAM, SSD, PERSISTENT MEMORY; DATA REPLICATION, STORAGE & PERSISTENCE; CLUSTER MANAGEMENT & SERVICE DISCOVERY; ON-PREMISE, HYBRID, CLOUD]

  14. Co-located Analytics and AI with Transactional Processing
     • Distributed SQL-99
     • Spark for ML and leading DL frameworks
     • Push-down predicate for ultra-low latency filter (30x faster)
     • Real-time integration with Tableau and Business Intelligence tools
     • Shared RDDs/DataFrames
     • Streaming with 99.999% availability
     • JDBC driver
     • Deep Learning with Intel BigDL
     • Graph processing, text mining, geospatial
     [Architecture diagram: same platform stack as slide 13, with STORAGE-CLASS MEMORY alongside RAM and SSD]
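
The push-down predicate bullet on slide 14 refers to a general technique: apply the filter where the data lives so that only matching rows travel to the analytics engine. The Python sketch below illustrates that idea only. `GridPartition` and both query functions are hypothetical stand-ins, not the Spark or GigaSpaces API, and the sketch counts rows moved rather than trying to reproduce the 30x figure from the slide.

```python
from typing import Callable, List, Optional, Tuple

Row = dict
Predicate = Callable[[Row], bool]


class GridPartition:
    """Hypothetical stand-in for one partition of an in-memory data grid."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)

    def scan(self, predicate: Optional[Predicate] = None) -> List[Row]:
        # With a predicate, filtering happens next to the data.
        if predicate is None:
            return list(self.rows)
        return [row for row in self.rows if predicate(row)]


def query_without_pushdown(parts: List[GridPartition], pred: Predicate) -> Tuple[List[Row], int]:
    # Every row is shipped to the caller, which then filters.
    transferred = [row for part in parts for row in part.scan()]
    return [row for row in transferred if pred(row)], len(transferred)


def query_with_pushdown(parts: List[GridPartition], pred: Predicate) -> Tuple[List[Row], int]:
    # Only rows that already passed the filter are shipped.
    transferred = [row for part in parts for row in part.scan(pred)]
    return transferred, len(transferred)


rows = [{"id": i, "amount": i} for i in range(1000)]
partitions = [GridPartition(rows[0::2]), GridPartition(rows[1::2])]
is_large = lambda row: row["amount"] >= 990

slow_rows, slow_moved = query_without_pushdown(partitions, is_large)
fast_rows, fast_moved = query_with_pushdown(partitions, is_large)
```

Both calls return the same ten rows. The first moves all 1,000 rows to the caller, the second moves only the 10 that match. The saving grows with how selective the filter is.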

  15. Benchmark (in IOPS) • Persistent Memory +249% than SSD • Persistent Memory +159% than SSD • RAM (off-heap) +180% than SSD • RAM (off-heap) +350% than SSD

  16. Costs Analysis for 5GB usable data • CAPEX reduction of up to 50% with RAM off-heap vs. on-heap • CAPEX reduction of up to 75% with AEP vs. RAM on-heap • OPEX reduction by 10X

  17. Tiered Storage Architecture: Higher Performance, Optimized TCO
     • 10X less expensive than only RAM, maintaining in-memory performance
     • Define which data resides on which layer per class and per field
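
Slide 17 says data placement can be defined per class and per field. A minimal Python sketch of that kind of policy lookup follows. It is not the MemoryXtend configuration format: `Tier`, `TierPolicy` and the example class and field names are hypothetical, and the tier names are taken from the RAM / persistent memory / SSD layers shown elsewhere in the deck.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Tier(Enum):
    RAM = "ram"
    PMEM = "persistent_memory"
    SSD = "ssd"


class TierPolicy:
    """Hypothetical placement policy: a field rule beats a class rule, which beats the default."""

    def __init__(self, default: Tier = Tier.SSD) -> None:
        self.default = default
        self.by_class: Dict[str, Tier] = {}
        self.by_field: Dict[Tuple[str, str], Tier] = {}

    def set_class(self, cls_name: str, tier: Tier) -> None:
        self.by_class[cls_name] = tier

    def set_field(self, cls_name: str, field_name: str, tier: Tier) -> None:
        self.by_field[(cls_name, field_name)] = tier

    def tier_for(self, cls_name: str, field_name: Optional[str] = None) -> Tier:
        # Most specific rule wins: field, then class, then the global default.
        if field_name is not None and (cls_name, field_name) in self.by_field:
            return self.by_field[(cls_name, field_name)]
        return self.by_class.get(cls_name, self.default)


policy = TierPolicy(default=Tier.SSD)
policy.set_class("Trade", Tier.PMEM)            # whole class on persistent memory
policy.set_field("Trade", "price", Tier.RAM)    # one hot field pinned to RAM

price_tier = policy.tier_for("Trade", "price")
notes_tier = policy.tier_for("Trade", "notes")
audit_tier = policy.tier_for("AuditLog", "message")
```

Here `price_tier` resolves to RAM, `notes_tier` falls back to the class rule (persistent memory) and `audit_tier` falls back to the default (SSD). Keeping only the hottest fields in RAM is the cost lever the slide points at.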

  18. Kubernetes and Docker

  19. LAMBDA ARCHITECTURE IS COMPLICATED
     [Architecture diagram. DATA SOURCES: Files, Databases, Events, Message Bus, Sensor Data, Social. DATA CAPTURE/MANAGEMENT LAYER. BATCH LAYER: batch analytics and storage on Public Cloud (GCP), Public Cloud (AWS) with EMR, Public Cloud (Azure), Private Cloud. SPEED LAYER: storage & cache and event-driven analytics, with serverless (e.g. AWS Lambda), Kafka consumers, CDC, Kinesis Enabled App, Event Hubs, Google Pub/Sub, Azure Cosmos DB. APPLICATIONS. CONTROL LAYER (Management, Orchestration, and Security)]

  20. LAMBDA ARCHITECTURE MADE SIMPLE
     Smart access to historical context
     • No ETL, reduced complexity
     • Built-in integration with external Hadoop/Data Lakes S3-like
     • Fast access to historical data
     • Automated life-cycle management
     [Architecture diagram: same layers as slide 19 (DATA SOURCES, DATA CAPTURE/MANAGEMENT LAYER, BATCH LAYER, SPEED LAYER, APPLICATIONS, CONTROL LAYER)]

  21. Leverage leading BI Platforms Tableau Looker Qlik Power BI
