Simple Steps for Building a Scalable BI Solution on MySQL 5.7, its JSON functionality and Elasticsearch
/ Intro Who is Pavel? What is Pepper.com?
/ What is this talk about? How do we get data from around the world in real time? How do we maintain and scale that system? How much does it cost us? What are our problems?
/ Where do we need to get data from?
/ How we started
Around 2013 – no reports; 10-15 people running 3 sites with around 2.5 million visitors a day
1. No dedicated people for BI
2. Very small reports, only for historical analysis
3. Everything was added to the product database and presented in the different products (NOQS slave)
Around 2015 – some limited visibility; 30-35 people running 7 sites with around 4 million visitors a day
1. Using external tools – RJMetrics and BIME
2. Weekly snapshots
3. Crazy expensive
Today – big visibility and many dashboards with lots of data; ~200 people running 12 sites with 5.5 million visitors a day
1. Everything is in-house, costs us almost nothing, and is real time
/ What was important before we started working
1. We need to be able to add or alter the events themselves at a later stage
2. We need to know: How to maintain it? How to monitor it? How to secure it? How to scale it? How to back it up? How to operate it daily?
3. Fault tolerant from a networking point of view
4. Easy to add more events without increasing the load on the product
5. Has to be real time or near real time – 60s delay when there is internet
6. Cheap – I love cheap
/ Solution – Ocular
(architecture diagram: country nodes – UK, DE, MX, KR, BR – send events over the Internet to a master/slave MySQL setup managed by MHA)
/ Ocular
To minimize the network traffic for responses we use HTTP status codes for acknowledgement:
201 – event acknowledged
400 – error, with status codes of the error in the body: { status: bad_request, errors: [], data: { … } }
401 – no authentication or wrong authentication
503 – pause / maintenance, with a Retry-After header of XX minutes (Retry-After: 15)
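A minimal sketch of how a client could react to these acknowledgement codes. The function name and return values are hypothetical, and PHP's cURL extension is only an assumption about the transport:

<?php
// Sketch only: deliver one event and map the HTTP status codes above
// to client-side actions.
function deliverEvent(string $url, string $jwt, array $event): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($event),
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'Authorization: Bearer ' . $jwt,
        ],
        CURLOPT_RETURNTRANSFER => true,
    ]);
    curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    switch ($status) {
        case 201: return 'acknowledged';  // stored, nothing more to do
        case 400: return 'rejected';      // inspect the error body; do not retry
        case 401: return 'auth_failed';   // check the JWT before retrying
        case 503: return 'paused';        // honour the Retry-After header
        default:  return 'retry_later';   // network trouble: re-queue the event
    }
}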
/ Ocular
Currently only the backend emits events, but we are planning to add them to the frontend interaction. It uses JSON Web Tokens (JWT) for authentication -> header.payload.signature
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM
(diagram on the next slide)
Events look like this:
{
  time: unix timestamp,
  consumer_code: varchar,
  user_id: integer,    // internal user id
  uuid: varchar,       // fingerprint
  is_api: bool,
  attr: JSON
}
/ Ocular
(diagram of the JWT authentication flow between the client and the Ocular server)
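For concreteness, a hypothetical event matching the schema above, as the PHP client might build it (all values are illustrative):

<?php
// Illustrative event only; the field values are made up.
$event = [
    'time'          => time(),                                  // unix timestamp
    'consumer_code' => 'pepper_de',                             // hypothetical site code
    'user_id'       => 123456,                                  // internal user id
    'uuid'          => 'b08f86af-35da-48f2-8fab-cef3904660bd',  // fingerprint
    'is_api'        => false,
    'attr'          => ['action' => 'deal_posted', 'deal_id' => 987],
];
echo json_encode($event);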
/ Ocular
Each event has a configured database connection, thus we are able to write one event into one database server and another event into another: we are manually partitioning the database.
Each data server has one slave, for high availability and for the possibility to blow up the master at any given moment. Also for the possibility to read from the slave with our ETL tool – we use Pentaho.
Each event has an assigned strategy for storing the event (a config sketch follows below):
minimum 1 step – MySQL
maximum 3 steps – MySQL → Elasticsearch → Power BI
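A sketch of what such per-event configuration could look like; the structure, event names, and connection names are hypothetical, not the actual Ocular config:

<?php
// Hypothetical per-event routing: each event name maps to a database
// connection (manual partitioning) and a storage strategy of 1 to 3 steps.
$eventConfig = [
    'deal_posted' => [
        'connection' => 'mysql-events-1',
        'strategy'   => ['mysql'],                              // minimum: 1 step
    ],
    'user_signup' => [
        'connection' => 'mysql-events-2',
        'strategy'   => ['mysql', 'elasticsearch', 'powerbi'],  // maximum: 3 steps
    ],
];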
/ Ocular client
The client has 4 purposes.
1. Easy distribution. We build the client as a PHP composer package, so it is easy to implement in any PHP project.
2. Report expected data. The client has rough validation – important when you have different codebases/teams.
3. Acknowledgement. The client expects to receive an acknowledgement response from the server. If it doesn't, it will repeat the event 15 minutes later. It does that 5 times, and if all 6 attempts fail to deliver to the Ocular server, it puts the event in the country's bucket for failed events. We designed it this way so that we can handle 90 minutes of lost internet between 2 continents, and even if the outage is longer we are not losing data (a sketch of this retry logic follows below).
4. It can accept a pause command from the server. When the server is paused, the client re-queues every event it wants to send for a time after the time the server announced it will be unavailable.
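A sketch of the retry behaviour described in point 3, assuming the queueing primitives are provided elsewhere (the names here are hypothetical):

<?php
// 6 attempts spaced 15 minutes apart covers ~90 minutes of lost internet;
// after that the event goes into the country's failed-events bucket.
const MAX_ATTEMPTS    = 6;
const RETRY_DELAY_SEC = 15 * 60;

function handleDeliveryFailure(array $event, callable $requeue, callable $toFailedBucket): void
{
    $attempts = ($event['attempts'] ?? 0) + 1;  // how many deliveries have failed so far
    if ($attempts < MAX_ATTEMPTS) {
        $event['attempts'] = $attempts;
        $requeue($event, time() + RETRY_DELAY_SEC);  // try again in 15 minutes
    } else {
        $toFailedBucket($event);  // give up retrying, but keep the data
    }
}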
/ What works pretty nicely
1. The ability to have all data even before a report is requested – the data is used for different purposes: back in the product, by sales, by marketing, for feeding data models for ML
2. Adding data to existing events – the NoSQL nature allows us to start building richer reports
3. A mixture of relational SQL database benefits and NoSQL benefits. We can define indexes on columns as we would in a normal MySQL InnoDB table, and we know how inserting into or reading from a table will be affected by the queries we are running
4. The ability to add indexes on top of the JSON columns (see the sketch below) – a huge help when we want to perform operations on all events of the previous month because a new report is requested or a new correlation in the data is found
5. It sorts the keys and validates the JSON data on insert
6. It is cheap to operate
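A sketch of point 4 using MySQL 5.7's generated columns; the table and field names are hypothetical. A value inside the JSON attr column is exposed as a generated column, which can then be indexed like any other column:

<?php
// Expose attr->'$.deal_id' as a regular column and index it
// (the ->> operator requires MySQL >= 5.7.13).
$pdo = new PDO('mysql:host=127.0.0.1;dbname=ocular', 'user', 'secret');

$pdo->exec(<<<'SQL'
ALTER TABLE events
    ADD COLUMN deal_id INT
        GENERATED ALWAYS AS (attr->>'$.deal_id') STORED,
    ADD INDEX idx_deal_id (deal_id)
SQL
);

// Queries on the extracted value can now use the index:
$stmt = $pdo->prepare('SELECT COUNT(*) FROM events WHERE deal_id = ?');
$stmt->execute([987]);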
/ Stuff that could be improved
1. Speed of storing data – MySQL is optimized for reads. Updates of partial documents (although we are not doing them) require a write of the whole document (being fixed in MySQL 8)
2. Not nice to work with array data. With 5.7 you cannot do much (most likely going to be fixed in 8)
3. You need to build facts and dimensions again, although you almost have them in the product. Sometimes that can be an issue if you are really in a hurry and need some number fast
/ Current load
We run the web application and the queues on only 2 nodes. We get on average around 15,000 to 20,000 requests per minute during the day in Europe and a bit less during the nights here. (As of 8 May 2017, GMT.)
/ What we achieved
/ Using MySQL 5.7 for the product
Storing configurations or, for example, session data in the database – especially combined with exposing data from the JSON as a column. You can also add a key on that.
To replace a typical anti-pattern in relational databases – multiplying rows when you have to store an unknown number of columns. Example: our storage for changes (a schema sketch follows below).
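As an illustration of that last point, a hypothetical schema for such a changes store: one JSON document per change instead of one row per changed field. The table and column names are assumptions, not our actual schema:

<?php
// Hypothetical "changes" table: the set of changed fields varies per entity,
// so it is stored as one JSON document instead of many key/value rows.
$pdo = new PDO('mysql:host=127.0.0.1;dbname=product', 'user', 'secret');

$pdo->exec(<<<'SQL'
CREATE TABLE IF NOT EXISTS changes (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    entity_id  INT UNSIGNED NOT NULL,
    changed_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    diff       JSON NOT NULL
)
SQL
);

$stmt = $pdo->prepare('INSERT INTO changes (entity_id, diff) VALUES (?, ?)');
$stmt->execute([42, json_encode(['title' => ['old' => 'A', 'new' => 'B']])]);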
/ Join us
1. Backend Developer
2. Frontend Developers
3. DWH/BI Developer
https://pepper.bamboohr.co.uk/jobs/