Migrating from Oracle to Espresso David Max Senior Software - PowerPoint PPT Presentation

Migrating from Oracle to Espresso David Max Senior Software Engineer LinkedIn

About LinkedIn New York Engineering • Located in the Empire State Building • Approximately 100 engineers and 1000 employees total • Multiple teams: front end, back end, and data science

About Me • Software Engineer at LinkedIn NYC since 2015 • Content Ingestion team • Office Hours – Thursday 11:30-12:00 David Max Senior Software Engineer LinkedIn www.linkedin.com/in/davidpmax/

What is Content Ingestion? (diagram: Babylonia, Content Ingestion)


url: https://www.youtube.com/watch?v=MS3c9hz0bRg title: "SATURN 2017 Keynote: Software is Details" image: https://i.ytimg.com/vi/MS3c9hz0bRg/hqdefault.jpg?sq poaymwEYCKgBEF5IVfKriqkDCwgBFQAAiEIYAXAB\\u0026rs=AOn4CLClwjQlBmMeoRCePtHaThN-qXRHqg


What is Content Ingestion? • Extracts metadata from web pages • Source of Truth for 3rd party content • Also contains metadata for some public 1st party content • Used by LinkedIn services for sharing, decorating, and embedding content • Data also feeds into content understanding and relevance models

Babylonia Datasets (diagram: Babylonia Content Ingestion, Database, ETL, HDFS, Data Change Events)

Downstream and Upstream Datasets (diagram: Babylonia Content Ingestion, Database, ETL, HDFS, Offline, Data Change Events, Near Line)

Babylonia use of Oracle (before migration) • RDBMS – Relational Database Management System • Schema – Metadata extracted from each URL stored in individual rows • Client – Babylonia is the main (but not only) client to directly execute queries on Oracle DB • Rest.li – Most online interaction with dataset in Oracle via Babylonia’s Rest.li API • Databus – Platform for streaming data change events to near line consumers • Offline – ETL to HDFS for offline consumers

What is Espresso? Espresso is LinkedIn’s strategic distributed, fault-tolerant NoSQL database that powers many of LinkedIn’s services • ~100 clusters in use* • ~420TB of SoT data* • ~2 million qps at peak load* * as of August 1, 2017

What is Espresso? • NoSQL – Non relational • Distributed – A single database can be distributed over a cluster of machines • Scalable – Able to scale clusters horizontally by adding more nodes • Document – A table is a container for documents of the same schema (defined in Avro) • Keys – Documents are indexed by key fields, which are defined in the table schema
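
To make the document and key ideas concrete, here is a minimal Python sketch of a table that holds same-schema documents, indexes them by key, and spreads them over partitions by hashing the key. It is an illustration only: the `Table` class, `required_fields`, `num_partitions`, and the hashing scheme are assumptions made for this example and do not describe Espresso's actual API or routing.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# Key fields, in the order a (hypothetical) table schema defines them.
Key = Tuple[str, ...]


class Table:
    """Toy document table: same-schema documents, indexed by key, hash-partitioned."""

    def __init__(self, required_fields: List[str], num_partitions: int = 4) -> None:
        self.required_fields = required_fields
        # One dict per partition; in a real cluster each would live on some node.
        self.partitions: List[Dict[Key, dict]] = [{} for _ in range(num_partitions)]

    def partition_for(self, key: Key) -> int:
        # Stable hash, so the same key always routes to the same partition.
        digest = hashlib.md5("/".join(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key: Key, document: dict) -> None:
        # Stand-in for schema validation: every required field must be present.
        missing = [f for f in self.required_fields if f not in document]
        if missing:
            raise ValueError(f"document is missing fields: {missing}")
        self.partitions[self.partition_for(key)][key] = document

    def get(self, key: Key) -> Optional[dict]:
        return self.partitions[self.partition_for(key)].get(key)
```

In this sketch, adding capacity means raising `num_partitions` and spreading the partitions over more machines; how a real system rebalances existing data is outside its scope.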

Why Migrate? • Maintenance – Babylonia’s Oracle tables required periodic jobs to be run that involved downtime for each server • Cost – Oracle more expensive to run • Strategy – Espresso is the preferred platform at LinkedIn for data of this type • Support – Espresso team part of LinkedIn • Integration – Support for Espresso integrated with other tools and systems at LinkedIn • Rest.li – Espresso’s API is based on Rest.li, which makes it easier to treat Espresso endpoints like other LinkedIn Rest.li endpoints • Schema Evolution – Supported with zero downtime and no coordination with DBA teams

Data Formats (Oracle) • Complex transformation between Oracle format and Pegasus format (diagram: Rest.li Endpoints, Pegasus Object, Babylonia Content Ingestion, Oracle Row, Oracle Database, ETL, HDFS, Offline, Oracle Databus Events, Near Line)

Pegasus and Avro • Pegasus and Avro schema definitions are very similar • Both can be used to generate Java objects with very similar interfaces • Pegasus schema can be used to auto-generate the Avro schema (diagram: Pegasus Schema, Avro Schema, Java Objects)

Data Formats (Espresso) • Simple transformation between Avro format and Pegasus format (diagram: Rest.li Endpoints, Pegasus Object, Babylonia Content Ingestion, Espresso Avro, Espresso Database, ETL, HDFS, Offline, Espresso Brooklin Events, Near Line)

Why Migrate? Schema Evolution • Oracle: • ALTER TABLE • Not tied to code deployment – need to coordinate with DBAs • Schema change involves server downtime • In practice, developers go to great lengths to avoid the hassle • Schema accumulates tech debt • Espresso: • Document schema auto-registration • Schema changes are registered automatically as part of the Babylonia deployment process • Backwards compatibility is enforced – existing data does not need to be transformed • Avro schema more natural fit with Rest.li Pegasus schema
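
The backwards-compatibility point can be illustrated with a deliberately simplified check in Python. It treats Avro record schemas as plain dictionaries and applies two rules: a field added in the new schema must carry a default, and an existing field must keep its type. Real Avro schema resolution is richer (it allows certain type promotions, for example), and this is not how Espresso performs its check; the function name, record name, and example fields are all hypothetical.

```python
from typing import Any, Dict, List


def backward_incompatibilities(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return reasons why data written with `old` could not be read with `new`."""
    old_fields = {f["name"]: f for f in old["fields"]}
    problems: List[str] = []
    for field in new["fields"]:
        name = field["name"]
        if name not in old_fields:
            # Old documents lack this field, so the reader needs a default for it.
            if "default" not in field:
                problems.append(f"new field '{name}' has no default")
        elif field["type"] != old_fields[name]["type"]:
            # Simplification: any type change is rejected.
            problems.append(f"field '{name}' changed type")
    return problems


# Hypothetical schemas for illustration only.
V1 = {"type": "record", "name": "ContentMetadata", "fields": [
    {"name": "url", "type": "string"},
    {"name": "title", "type": "string"},
]}
V2 = {"type": "record", "name": "ContentMetadata", "fields": V1["fields"] + [
    {"name": "description", "type": ["null", "string"], "default": None},
]}
V3_BAD = {"type": "record", "name": "ContentMetadata", "fields": V1["fields"] + [
    {"name": "language", "type": "string"},
]}
```

Under these rules `V2` is accepted because the added field is optional with a default, so existing documents need no transformation, while `V3_BAD` is rejected because old documents would have no value for `language`.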

Goals for Migration Process • Zero downtime • Transparent to Rest.li clients • Give offline and nearline consumers time to migrate • Validate each step • Mirroring in real time

Pre-Migration State of Babylonia (diagram: Babylonia Content Ingestion, Oracle Database, ETL, HDFS, Offline, Oracle Databus Events, Near Line)

Pre-Migration State of Babylonia (diagram: Rest.li Endpoints, Oracle Database, Rest.li Calls, Oracle Databus Events, Other Services)

Pre-Migration Cleanup • Identify code that is tightly coupled to the Oracle database • Decide which code should be reimplemented for Espresso, and which code should be decoupled or eliminated • Reduce number of code paths to migrate • The easiest lines of code to migrate are the lines of code that don’t exist (diagram: Rest.li Endpoints, Oracle Database, Rest.li Calls, Oracle Databus Events, Other Services)

Bootstrap Espresso Database (diagram: Oracle Database, ETL, HDFS, Offline Convert Job, Espresso Avro Data File, Bulk Loader, Espresso Database)

Bootstrap Espresso Database (diagram: Oracle Database, ETL, HDFS, Espresso Database)
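
As a rough illustration of what a convert step might do, the Python sketch below maps Oracle-style rows (dictionaries of column name to value) to keyed documents. The column names, field names, and the choice of URL as the key are hypothetical, and writing the result to an Avro data file for a bulk loader is left out because it would need a third-party Avro library.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Hypothetical mapping from Oracle column names to document field names.
COLUMN_TO_FIELD = {"URL": "url", "TITLE": "title", "IMAGE_URL": "image"}


def row_to_document(row: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Convert one row into a (key, document) pair; missing columns become None."""
    doc = {field: row.get(column) for column, field in COLUMN_TO_FIELD.items()}
    if not doc["url"]:
        raise ValueError("row has no URL, cannot derive a document key")
    return doc["url"], doc


def convert(rows: Iterable[Dict[str, Any]]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Lazily convert a stream of rows, as an offline job over an ETL snapshot might."""
    for row in rows:
        yield row_to_document(row)
```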

Databus Listener, Shadow Read Validation (diagram: Oracle Database, Shadow Read Validation, Oracle Databus Events, Databus Listener, Espresso Database)

Direct Writes to Espresso (diagram: Oracle Database, Shadow Read Validation, Oracle Databus Events, Databus Listener, Direct Write, Espresso Database)
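
Shadow read validation can be sketched as a wrapper that keeps serving reads from the current source of truth while also reading the new store and comparing the two. The Python below is a minimal sketch under assumptions: `ShadowReader`, its counters, and the two reader callables are hypothetical names, and the slides do not say how Babylonia implemented the comparison or reported mismatches.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("shadow_read")

Reader = Callable[[str], Optional[Any]]


class ShadowReader:
    """Serve reads from the primary store; compare against the shadow store on the side."""

    def __init__(self, read_primary: Reader, read_shadow: Reader) -> None:
        self.read_primary = read_primary
        self.read_shadow = read_shadow
        self.checked = 0
        self.mismatches = 0
        self.errors = 0

    def get(self, key: str) -> Optional[Any]:
        primary = self.read_primary(key)
        try:
            shadow = self.read_shadow(key)
            self.checked += 1
            if shadow != primary:
                self.mismatches += 1
                logger.warning("shadow mismatch for key %s", key)
        except Exception:
            # A failing shadow store must never affect the caller.
            self.errors += 1
            logger.exception("shadow read failed for key %s", key)
        return primary
```

Callers always receive the primary result, so validation stays invisible to clients; the mismatch rate is one possible signal for deciding when the new store is ready to take over.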

Resolving Write Conflicts • Dual Write Conflict – Databus Listener and Babylonia updating same record • Migration Control – optional field added to schema indicating which process wrote the record: Bulk Loader, Databus Listener, or Babylonia (diagram: Oracle Databus Events, Databus Listener, Direct Write, Espresso Database)
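
One way a migration control field could be used is sketched below in Python. The slide only says that the field records which process wrote the record; the precedence rule here (a direct Babylonia write outranks a Databus Listener write, which outranks a Bulk Loader write) is an assumption made for illustration, as are the names `Writer`, `apply_write`, and `migration_control`.

```python
from enum import IntEnum
from typing import Any, Dict


class Writer(IntEnum):
    """Which process wrote a record; a higher value wins under the assumed policy."""
    BULK_LOADER = 1
    DATABUS_LISTENER = 2
    BABYLONIA = 3


def apply_write(store: Dict[str, Dict[str, Any]], key: str,
                document: Dict[str, Any], writer: Writer) -> bool:
    """Apply the write unless a higher-precedence writer already owns the record."""
    existing = store.get(key)
    if existing is not None and existing["migration_control"] > writer:
        return False  # e.g. a mirrored Oracle event arriving after a direct write
    store[key] = {"data": document, "migration_control": writer}
    return True
```

A plain dictionary stands in for the database here; in a concurrent system the check and the write would have to happen atomically, for instance through a conditional update, which this sketch does not attempt.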

Espresso New SoT (diagram: Dual Writes, Oracle Database (Deprecated), Oracle Databus Events, Direct Read/Write, Espresso Database, Espresso Brooklin Events)

Oracle Turnoff (diagram: Direct Read/Write, Espresso Database, Espresso Brooklin Events)

Thank you
