Distributed Logging Architecture in the Container Era
LinuxCon Japan, June 13, 2016
Satoshi "Moris" Tagomori (@tagomoris)
Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
Topics
• Microservices and logging in various industries
• Difficulties of logging with containers
• Distributed logging architecture
• Patterns of distributed logging architecture
• Case study: Docker and Fluentd
• Why OSS is important for logging
Logging
Logging in Various Industries
• Web access logs
  • views/visitors on media
  • views/clicks on ads
  • commercial transactions (EC, games, ...)
• Data from devices
  • operation logs from phone apps
  • various sensor data
Microservices and Logging
[Diagram: users → service(s) → logs]
• Monolithic service: a single service produces all the data about a user's access
• Microservices: many services produce data about a user's access
  • logs must be collected from many services to know what is happening
Logging and Containers
Containers: "a must" for microservices • Dividing a service into services • a service requires less computing resources (VM -> containers) • Making services independent from each other • but it is very difficult :( • some dependency must be solved even in development environment (containers on desktop)
Redesign Logging: Why?
• No permanent storage
• No fixed physical/network addresses
• No fixed mapping between servers and roles
Containers: immutable & disposable
• No permanent storage
• Where should we write logs?
  • a file in the container → gone with the container instance 😟
  • a directory shared from the host → hosts are shared by many services ☹
• TODO: ship logs from containers to somewhere else ASAP
Containers: no fixed addresses
• No fixed physical/network addresses
• Where should we go to fetch logs?
  • service discovery (e.g., Consul) → one more component 😟
  • rsync? ssh+tail? or ...? Is it installed in the container? → one more tool to depend on ☹
• TODO: push logs from containers, rather than pulling them
Containers: many instances per role
• No fixed mapping between servers and roles
• How can we parse / store these logs?
  • a central repository of log formats → very hard to maintain 😟
  • labeling logs by source address → many containers/roles per host ☹
• TODO: label and parse logs at the source
Distributed Logging Architecture
Core Architecture
[Diagram: collector nodes (Docker containers + agent) → aggregator nodes → destination (storage, database, ...)]
• Collector nodes
• Aggregator nodes
• Destination (storage, database, ...)
Collecting and Storing Data
• Parse (collector)
  • raw logs are not good for processing
  • convert logs into structured data (key-value pairs); see the sketch below
• Sort/shuffle (aggregator)
  • mixed logs are not good for scanning
  • split the whole data stream into per-type streams
• Store (destination)
  • format logs (records) as the destination expects
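As a sketch of the parse step, a collector can turn raw access-log lines into key-value pairs. A minimal Fluentd example (v0.12 syntax; the file path and the Apache log format are assumptions):

# Tail a raw access log and parse each line into a structured record
<source>
  @type tail
  path /var/log/app/access.log          # assumed log file path
  pos_file /var/log/td-agent/app-access.pos
  tag app.access
  format apache2                        # built-in parser: raw line -> key-value pairs
</source>

With this, a raw Apache access-log line becomes a structured record like {"host": "...", "method": "GET", "path": "/", "code": 200} before it leaves the collector.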
Scaling Logging
• Network traffic
• CPU load to parse / format
  • parse logs on each collector (distributed)
  • format logs on aggregators (to be distributed)
• Capability
  • make aggregators redundant
• Controlling delay
Patterns
Aggregation Patterns
[Diagram: 2×2 matrix of patterns: source aggregation (no/yes) × destination aggregation (no/yes)]
Source-Side Aggregation Patterns
[Diagram: without source aggregation, each collector container sends directly to the aggregator; with source aggregation, an aggregate container on each host collects locally and forwards to the aggregator]
Without Source Aggregation
[Diagram: collectors → aggregator]
• Pros:
  • simple configuration
• Cons:
  • the aggregator (endpoint) address is fixed in every container
  • many network connections
  • high load on the aggregator
With Source Aggregation
[Diagram: collectors → aggregate container (per host) → aggregator]
• Pros:
  • fewer connections
  • lower load on the aggregator
  • less configuration in containers (just specify localhost)
  • highly flexible configuration (deploy changes only to the aggregate containers); see the sketch below
• Cons:
  • a bit more resource usage (+1 container per host)
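A minimal sketch of the per-host aggregate container's Fluentd configuration, assuming the standard forward port 24224 and an assumed upstream aggregator hostname:

# Receive logs from local containers and forward them upstream
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match **>
  @type forward
  <server>
    host aggregator.example.internal   # assumed aggregator address
    port 24224
  </server>
  flush_interval 5s
</match>

Containers on the host only need to know localhost:24224; the aggregator address lives in this one file per host.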
Destination-Side Aggregation Patterns
[Diagram: without destination aggregation, collectors write directly to the destination; with destination aggregation, collectors send to aggregators, which write to the destination]
Without Destination Aggregation
• Pros:
  • fewer nodes
  • simpler configuration
• Cons:
  • storage-side changes affect the collector side
  • worse performance: many small write requests hit the storage
With Destination Aggregation
• Pros:
  • collector-side configuration is free from storage-side changes
  • better performance, with fine tuning on the destination side
• Cons:
  • more nodes
  • a bit more complex configuration
Scaling Patterns
• Scaling up endpoints: an HTTP/TCP load balancer, or a huge queue + workers, in front of backend nodes
• Scaling out endpoints: round-robin clients on collector nodes, sending to aggregator nodes directly
[Diagram: collector nodes → load balancer → backend nodes, vs. collector nodes → aggregator nodes]
Scaling Up Endpoints
[Diagram: collector nodes → load balancer → backend nodes]
• Pros:
  • simple configuration on collector nodes
• Cons:
  • there is a limit to how far the endpoint can scale up
Scaling Out Endpoints
• Pros:
  • unlimited scaling by adding aggregator nodes
• Cons:
  • more complex configuration
  • clients need round-robin support
Choosing a Pattern
• Scaling up endpoints
  • without destination aggregation: systems in early stages, or using queues
  • with destination aggregation: collecting logs over the Internet
• Scaling out endpoints
  • without destination aggregation: impossible :( collector nodes must know all endpoints → uncontrollable
  • with destination aggregation: collecting logs in the datacenter
Case Studies
Case Study: Docker + Fluentd
• Destination aggregation + scaling up
  • Fluent logger + Fluentd
• Source aggregation + scaling up
  • Docker json logger + Fluentd + Elasticsearch
  • Docker fluentd logger + Fluentd + Kafka
• Source/destination aggregation + scaling out
  • Docker fluentd logger + Fluentd
Why Fluentd?
• Docker has a Fluentd logging driver
  • containers can send logs to Fluentd directly, with less overhead (example below)
• Pluggable architecture
  • various destination systems
• Small memory footprint
  • source aggregation requires +1 container per host
  • less additional resource usage (< 100MB)
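For instance, the Fluentd logging driver can be enabled per container; the image name and tag template below are illustrative:

# Send a container's STDOUT/STDERR to a local Fluentd
docker run --log-driver=fluentd \
           --log-opt fluentd-address=localhost:24224 \
           --log-opt tag="docker.{{.Name}}" \
           myapp:latest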
Destination aggregation + scaling up
[Diagram: application code with a Fluent logger library → Fluentd over TCP]
• Send logs directly over TCP with a Fluent logger library in the application code
• The same pattern that New Relic agents use
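A minimal sketch with the Ruby fluent-logger library (the host, tag, and record fields are assumptions):

require 'fluent-logger'   # gem install fluent-logger

# Connect to a Fluentd endpoint over TCP (address is an assumption)
log = Fluent::Logger::FluentLogger.new('myapp',
                                       host: 'fluentd.example.internal',
                                       port: 24224)

# Emit one structured event, tagged 'myapp.access'
log.post('access', method: 'GET', path: '/index.html', status: 200)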
Source aggregation + scaling up
• Kubernetes: JSON logger + Fluentd + Elasticsearch
[Diagram: application code → STDOUT → files (JSON) → Fluentd → Elasticsearch]
• Applications write logs to STDOUT
• Docker writes the logs as JSON to files
• Fluentd reads the files, parses the JSON objects, and writes the logs to Elasticsearch (sketch below)
http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/
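A sketch of that collector configuration, assuming Fluentd v0.12 syntax, the default Docker JSON log path, and the fluent-plugin-elasticsearch output:

# Tail Docker's JSON log files and ship parsed records to Elasticsearch
<source>
  @type tail
  path /var/lib/docker/containers/*/*-json.log
  pos_file /var/log/fluentd-docker.pos
  tag docker.*
  format json                           # each line is already a JSON object
</source>

<match docker.**>
  @type elasticsearch                   # fluent-plugin-elasticsearch
  host elasticsearch.example.internal   # assumed address
  port 9200
  logstash_format true                  # index as logstash-YYYY.MM.DD
</match>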
Source aggregation + scaling up
• Docker fluentd logging driver + Fluentd + Kafka
[Diagram: application code → STDOUT → Docker → localhost Fluentd → Kafka]
• Applications write logs to STDOUT
• Docker sends the logs to a Fluentd on localhost
• Fluentd receives the logs over TCP and pushes them into Kafka (sketch below)
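A sketch of the localhost Fluentd configuration, assuming the fluent-plugin-kafka output (broker addresses and the topic are assumptions):

# Receive logs from the Docker fluentd logging driver and push them to Kafka
<source>
  @type forward
  port 24224
</source>

<match docker.**>
  @type kafka_buffered                # fluent-plugin-kafka buffered output
  brokers kafka1:9092,kafka2:9092     # assumed broker list
  default_topic container-logs        # assumed topic
  output_data_type json
</match>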
Source/destination aggregation + scaling out
• Docker fluentd logging driver + Fluentd
[Diagram: application code → STDOUT → Docker → localhost Fluentd → aggregator Fluentd nodes]
• Applications write logs to STDOUT
• Docker sends the logs to a Fluentd on localhost
• Fluentd receives the logs over TCP and sends them to aggregator Fluentd nodes with round-robin load balancing (sketch below)
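A sketch of the forwarding rule on the localhost Fluentd; out_forward balances load across the listed servers and fails over if one dies (hostnames are assumptions):

# Forward to multiple aggregator Fluentd nodes with load balancing
<match docker.**>
  @type forward
  <server>
    host aggregator-1.example.internal   # assumed
    port 24224
  </server>
  <server>
    host aggregator-2.example.internal   # assumed
    port 24224
  </server>
  flush_interval 5s
</match>

Scaling out is then a matter of adding <server> entries on the collector side.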
What's the Best?
• Writing logs from containers: pick whichever way works for you
  • Docker logging drivers
  • write logs to files + read/parse them
  • send logs directly from apps
• Keep it scalable!
  • source aggregation: Fluentd on localhost
• For scalable storage (Kafka, external services, ...):
  • no destination aggregation + scaling up
• For non-scalable storage (filesystems, RDBMSs, ...):
  • destination aggregation + scaling out
Why OSS Is Important for Logging
Why OSS?
• The logging layer is an interface
  • transparency
  • interoperability
• Keep it scalable
  • in the number of nodes
  • in the number of source/destination types
Use OSS, Make Logging Scalable