Flying Faster with Heron KARTHIK RAMASAMY @KARTHIKZ #TwitterHeron
TALK OUTLINE
I. OVERVIEW
II. MOTIVATION
III. HERON
IV. OPERATIONAL EXPERIENCES
V. HERON PERFORMANCE
OVERVIEW
TWITTER IS REAL TIME
Real time trends: emerging break out trends in Twitter (in the form of #hashtags)
Real time conversations: real time sports conversations related to a topic (recent goal or touchdown)
Real time recommendations: real time product recommendations based on your behavior and profile
Real time search: real time search of tweets
ANALYZING BILLIONS OF EVENTS IN REAL TIME IS A CHALLENGE!
TWITTER STORM
Streaming platform for analyzing real-time data as it arrives, so you can react to data as it happens
GUARANTEED MESSAGE PROCESSING
HORIZONTAL SCALABILITY
ROBUST FAULT TOLERANCE
CONCISE CODE - FOCUS ON LOGIC
STORM TERMINOLOGY
TOPOLOGY: Directed acyclic graph. Vertices = computation, edges = streams of data tuples
SPOUTS: Sources of data tuples for the topology. Examples: Event Bus/Kafka/Kestrel/MySQL/Postgres
BOLTS: Process incoming tuples and emit outgoing tuples. Examples: filtering/aggregation/join/arbitrary function
STORM TOPOLOGY
[Diagram: example topology in which SPOUT 1 and SPOUT 2 feed BOLT 1 through BOLT 5]
WORD COUNT TOPOLOGY
LOGICAL PLAN: Live stream of Tweets -> TWEET SPOUT -> PARSE TWEET BOLT -> WORD COUNT BOLT
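The logical plan above can be sketched in plain Python. This is only an illustration of the spout -> bolt -> bolt data flow, not the Storm or Heron API; the function names and the sample tweets are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator


def tweet_spout(tweets: Iterable[str]) -> Iterator[str]:
    """Spout: source of tuples (here, one tuple per tweet)."""
    for tweet in tweets:
        yield tweet


def parse_tweet_bolt(tweets: Iterable[str]) -> Iterator[str]:
    """Bolt: split each incoming tweet into lower-cased word tuples."""
    for tweet in tweets:
        for word in tweet.lower().split():
            yield word


def word_count_bolt(words: Iterable[str]) -> Counter:
    """Bolt: aggregate a count per word."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
    return counts


def run_word_count(tweets: Iterable[str]) -> Counter:
    # Wire the stages in the order given by the logical plan.
    return word_count_bolt(parse_tweet_bolt(tweet_spout(tweets)))


if __name__ == "__main__":
    print(run_word_count(["heron flies fast", "Heron is fast"]))
```

In a real topology each stage runs as several parallel tasks rather than as one generator.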
WORD COUNT TOPOLOGY
[Diagram: TWEET SPOUT TASKS -> PARSE TWEET BOLT TASKS -> WORD COUNT BOLT TASKS]
When a parse tweet bolt task emits a tuple, which word count bolt task should it send it to?
STREAM GROUPINGS
SHUFFLE GROUPING: Random distribution of tuples
FIELDS GROUPING: Group tuples by a field or multiple fields
ALL GROUPING: Replicates tuples to all tasks
GLOBAL GROUPING: Sends the entire stream to one task
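A minimal sketch of how the four groupings could pick target tasks, assuming downstream tasks are numbered 0 to num_tasks - 1. The function names, the CRC32 hash and the choice of task 0 for global grouping are assumptions made for illustration, not how Storm or Heron implement routing.

```python
import random
import zlib
from typing import List, Sequence


def shuffle_grouping(num_tasks: int, rng: random.Random) -> List[int]:
    """Pick one downstream task at random."""
    return [rng.randrange(num_tasks)]


def fields_grouping(key_fields: Sequence[str], num_tasks: int) -> List[int]:
    """Hash the grouping field(s) so equal keys always reach the same task."""
    key = "\x00".join(key_fields).encode("utf-8")
    # crc32 is stable across runs, unlike Python's salted built-in hash().
    return [zlib.crc32(key) % num_tasks]


def all_grouping(num_tasks: int) -> List[int]:
    """Replicate the tuple to every downstream task."""
    return list(range(num_tasks))


def global_grouping(num_tasks: int) -> List[int]:
    """Send the entire stream to a single task (task 0 in this sketch)."""
    return [0]
```

For word count, a fields grouping on the word sends every occurrence of a given word to the same counting task, so each task can keep a correct local count.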
WORD COUNT TOPOLOGY
SHUFFLE GROUPING: TWEET SPOUT TASKS -> PARSE TWEET BOLT TASKS
FIELDS GROUPING: PARSE TWEET BOLT TASKS -> WORD COUNT BOLT TASKS
MOTIVATION
STORM ARCHITECTURE
[Diagram: topology submission to Nimbus on the master node, assignment maps, a ZK cluster, and slave nodes each running a supervisor with workers W1 to W4]
Multiple functionality: scheduling/monitoring
Single point of failure
No resource reservation and isolation
Storage contention
STORM WORKER
[Diagram: one JVM process containing EXECUTOR1 and EXECUTOR2, which run TASK1 through TASK5]
Complex hierarchy
Hard to debug
Difficult to tune
DATA FLOW IN STORM WORKERS
[Diagram: user logic threads with in queues, an out queue, a send thread, a global receive thread, a global send thread, an outgoing message buffer, and the kernel TCP receive and send buffers]
Queue contention
Multiple languages
OVERLOADED ZOOKEEPER
Scaled up
[Diagram: Storm supervisors S1 to S3 and workers (W) connected to zk nodes]
Handled up to 1200 workers per cluster
OVERLOADED ZOOKEEPER
Analyzing zookeeper traffic
KAFKA SPOUT, 67%: Offset/partition is written every 2 secs
STORM RUNTIME, 33%: Workers write heart beats every 3 secs
OVERLOADED ZOOKEEPER
Heart beat daemons
[Diagram: Storm supervisors S1 to S3, workers (W), zk nodes, heart beat daemons (H) and KV stores]
5000 workers per cluster
STORM - DEPLOYMENT
[Diagram: a storm cluster split into a shared pool and isolated pools; the isolated pools hold joe’s topology, jane’s topology and dave’s topology]
STORM ISSUES
LACK OF BACK PRESSURE: Drops tuples unpredictably
EFFICIENCY: Serialization program consumes 75 cores at 30% CPU. Topology consumes 600 cores at 20-30% CPU
NO BATCHING: Tuple oriented system - implicit batching by 0MQ
EVOLUTION OR REVOLUTION?
Fix Storm or develop a new system?
FUNDAMENTAL ISSUES - REQUIRE EXTENSIVE REWRITING: Several queues for moving data. Inflexible and requires a longer development cycle
USE EXISTING OPEN SOURCE SOLUTIONS: Issues working at scale/lacks required performance. Incompatible API and long migration process
HERON
HERON DESIGN GOALS
FULLY API COMPATIBLE WITH STORM: Directed acyclic graph. Topologies, spouts and bolts
TASK ISOLATION: Ease of debuggability/resource isolation/profiling
USE OF MAINSTREAM LANGUAGES: C++/Java/Python
HERON ARCHITECTURE
[Diagram: TOPOLOGY SUBMISSION goes to the Scheduler, which runs Topology 1, Topology 2, Topology 3, ... Topology N]
TOPOLOGY ARCHITECTURE
[Diagram: a Topology Master; a ZK CLUSTER holding the Logical Plan, Physical Plan and Execution State; "Sync Physical Plan" between the Topology Master and the containers; each CONTAINER holds a Stream Manager, a Metrics Manager and instances I1 to I4]
TOPOLOGY MASTER
Solely responsible for the entire topology
ASSIGNS ROLE
MONITORING
METRICS
TOPOLOGY MASTER
[Diagram: Topology Master and the ZK CLUSTER holding the Logical Plan, Physical Plan and Execution State]
PREVENTS MULTIPLE TMs FROM BECOMING MASTER
ALLOWS OTHER PROCESSES TO DISCOVER THE TM
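The two ZK roles above (a single master, and discovery of that master) can be sketched with an ephemeral-node style registry. The class below is an in-memory stand-in written for this illustration, not a real ZooKeeper client, and the path and addresses are hypothetical.

```python
from typing import Dict, Optional


class FakeCoordinationStore:
    """In-memory stand-in for a ZooKeeper-like store (hypothetical)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, str] = {}

    def create_ephemeral(self, path: str, owner: str) -> bool:
        """Create the node only if it does not exist; return True on success."""
        if path in self._nodes:
            return False
        self._nodes[path] = owner
        return True

    def get(self, path: str) -> Optional[str]:
        """Return the owner stored at path, or None if the node is absent."""
        return self._nodes.get(path)

    def session_closed(self, owner: str) -> None:
        """Ephemeral nodes disappear when their owner's session ends."""
        self._nodes = {p: o for p, o in self._nodes.items() if o != owner}


TM_PATH = "/topologies/wordcount/tmaster"  # hypothetical well-known path


def try_become_master(store: FakeCoordinationStore, address: str) -> bool:
    # Only the first Topology Master to create the node becomes master.
    return store.create_ephemeral(TM_PATH, address)


def discover_master(store: FakeCoordinationStore) -> Optional[str]:
    # Other processes read the node to find the current master.
    return store.get(TM_PATH)
```

If the master's session ends, the node vanishes and a standby can claim the path, which is what makes the single-master guarantee recoverable.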
STREAM MANAGER
Routing Engine
ROUTES TUPLES
BACK PRESSURE
ACK MGMT
STREAM MANAGER
[Diagram: instances S1, B2, B3 and B4]
STREAM MANAGER
[Diagram: four containers, each with a Stream Manager and instances S1, B2, B3, B4; connection counts labeled O(n^2) and O(k^2)]
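The O(n^2) versus O(k^2) labels can be made concrete with a small count. This sketch assumes n is the number of instances and k is the number of Stream Managers (one per container); the slide does not define the symbols, and the function names and sample sizes are hypothetical.

```python
def instance_mesh_connections(n: int) -> int:
    """Connections if all n instances connect to each other directly."""
    return n * (n - 1) // 2


def stream_manager_connections(n: int, k: int) -> int:
    """Each instance connects only to its local Stream Manager,
    and the k Stream Managers form a full mesh among themselves."""
    return n + k * (k - 1) // 2


if __name__ == "__main__":
    # Hypothetical sizes: 400 instances spread over 100 containers.
    print(instance_mesh_connections(400))        # 79800
    print(stream_manager_connections(400, 100))  # 5350
```

Because k is usually much smaller than n, routing through Stream Managers keeps the number of network connections manageable as a topology grows.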
STREAM MANAGER: TCP back pressure
[Diagram: four containers, each with a Stream Manager and instances S1, B2, B3, B4]
SLOWS UPSTREAM AND DOWNSTREAM INSTANCES
STREAM MANAGER: spout back pressure
[Diagram: four containers, each with a Stream Manager and instances S1, B2, B3, B4; the S1 spout instances are highlighted]
STREAM MANAGER: stage by stage back pressure
[Diagram: four containers, each with a Stream Manager and instances S1, B2, B3, B4; the S1 and B2 instances are highlighted]
STREAM MANAGER: back pressure advantages
PREDICTABILITY: Tuple failures are more deterministic
SELF ADJUSTS: Topology goes as fast as the slowest component
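The "goes as fast as the slowest component" behavior can be illustrated with a toy simulation of spout back pressure. The high and low water marks, the rates and the function name are assumptions of this sketch; it models one spout feeding one slow bolt through a single buffer, which is far simpler than a real Stream Manager.

```python
from typing import List


def simulate_spout_back_pressure(
    steps: int,
    spout_rate: int,
    bolt_rate: int,
    high_water: int,
    low_water: int,
) -> List[int]:
    """Return the buffer depth after each step of a one-spout, one-bolt pipeline."""
    depth = 0
    spout_paused = False
    history: List[int] = []
    for _ in range(steps):
        if not spout_paused:
            depth += spout_rate          # spout emits into the buffer
        depth -= min(depth, bolt_rate)   # the slow bolt drains what it can
        if depth >= high_water:
            spout_paused = True          # back pressure starts: clamp the spout
        elif depth <= low_water:
            spout_paused = False         # buffer drained: release the spout
        history.append(depth)
    return history


if __name__ == "__main__":
    # Spout emits 10 tuples per step, bolt handles 4 per step.
    print(max(simulate_spout_back_pressure(50, 10, 4, 20, 5)))
```

With back pressure the buffer stays bounded and no tuple is dropped; with the water marks effectively disabled, the same rates make the buffer grow without limit.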
HERON INSTANCE
Does the real work!
RUNS ONE TASK
EXPOSES API
COLLECTS METRICS
HERON INSTANCE
[Diagram: the Stream Manager and Metrics Manager talk to a Gateway Thread; the Gateway Thread and the Task Execution Thread exchange data through a data-in queue, a data-out queue and a metrics-out queue]
OPERATIONAL EXPERIENCES
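The two-thread layout can be sketched with Python's standard queue and threading modules. This is an illustration of the data-in and data-out queues only: the metrics-out queue is left out, the calling thread plays the gateway role, and the function name, queue size and sentinel are assumptions of the sketch.

```python
import queue
import threading
from typing import Callable, List

_STOP = object()  # sentinel that tells the task execution thread to exit


def run_instance(incoming: List[str], user_logic: Callable[[str], str]) -> List[str]:
    """Push tuples through a data-in queue, a task thread and a data-out queue."""
    data_in: queue.Queue = queue.Queue(maxsize=8)  # bounded, so a slow task blocks the gateway
    data_out: queue.Queue = queue.Queue()

    def task_execution_thread() -> None:
        # Runs the user's spout/bolt logic on each tuple, one at a time.
        while True:
            item = data_in.get()
            if item is _STOP:
                data_out.put(_STOP)
                return
            data_out.put(user_logic(item))

    worker = threading.Thread(target=task_execution_thread)
    worker.start()

    # Gateway role: hand incoming tuples to the task thread.
    for tup in incoming:
        data_in.put(tup)  # blocks while the data-in queue is full
    data_in.put(_STOP)

    # Gateway role: collect outgoing tuples until the task thread signals the end.
    results: List[str] = []
    while True:
        item = data_out.get()
        if item is _STOP:
            break
        results.append(item)
    worker.join()
    return results


if __name__ == "__main__":
    print(run_instance(["a", "b", "c"], str.upper))
```

Keeping user logic on its own thread means a slow or misbehaving task shows up as a full data-in queue rather than as stalled network I/O.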
HERON DEPLOYMENT
[Diagram: Aurora Scheduler and Aurora Services running Topology 1, Topology 2, Topology 3, ... Topology N; a ZK CLUSTER; Heron Tracker, Heron Web, Heron VIZ and Observability]
HERON SAMPLE TOPOLOGIES
SAMPLE TOPOLOGY DASHBOARD
HERON @TWITTER
STORM is decommissioned
Large amount of data produced every day
Large cluster
Several topologies deployed
Several billion messages every day
1 stage to 10 stages
3x reduction in cores and memory
HERON PERFORMANCE
HERON PERFORMANCE
Settings
COMPONENTS          EXPT #1  EXPT #2  EXPT #3  EXPT #4
Spout                    25      100      200      300
Bolt                     25      100      200      300
# Heron containers       25      100      200      300
# Storm workers          25      100      200      300
HERON PERFORMANCE
Word count topology - Acknowledgements enabled
[Chart: Throughput (million tuples/min) vs. Spout Parallelism (25, 100, 200, 500), Storm vs. Heron; annotated 10-14x]
[Chart: Latency (ms) vs. Spout Parallelism (25, 100, 200, 500), Storm vs. Heron; annotated 5-15x]
HERON PERFORMANCE
Word count topology - CPU usage
[Chart: # cores used vs. Spout Parallelism (25, 100, 200, 500), Storm vs. Heron; annotated 2-3x]
HERON PERFORMANCE
Throughput and CPU usage with no acknowledgements - Word count topology
[Chart: Throughput (million tuples/min) vs. Spout Parallelism (25, 100, 200, 500), Storm vs. Heron]
[Chart: # cores used vs. Spout Parallelism (25, 100, 200, 500), Storm vs. Heron]
HERON EXPERIMENT
RTAC topology
CLIENT EVENT SPOUT -> (SHUFFLE GROUPING) -> DISTRIBUTOR BOLT -> (FIELDS GROUPING) -> USER COUNT BOLT -> (FIELDS GROUPING) -> AGGREGATOR BOLT
HERON PERFORMANCE
CPU usage - RTAC Topology
[Chart: # cores used, Storm vs. Heron, no acknowledgements]
[Chart: # cores used, Storm vs. Heron, acknowledgements enabled]
HERON PERFORMANCE
Latency with acknowledgements enabled - RTAC Topology
[Chart: latency (ms), Storm vs. Heron]
CURIOUS TO LEARN MORE…
Twitter Heron: Stream Processing at Scale
Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel*, Karthik Ramasamy, Siddarth Taneja
@sanjeevrk, @challenger_nik, @Louis_Fumaosong, @vikkyrk, @cckellogg, @saileshmittal, @pateljm, @karthikz, @staneja
Twitter, Inc., *University of Wisconsin – Madison
Storm @Twitter
Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy
@ankitoshniwal, @staneja, @amits, @karthikz, @pateljm, @sanjeevrk, @jason_j, @krishnagade, @Louis_Fumaosong, @jakedonham, @challenger_nik, @saileshmittal, @squarecog
Twitter, Inc., *University of Wisconsin – Madison