
Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework



  1. Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework
     Junguk Cho, Hyunseok Chang, Sarit Mukherjee, T.V. Lakshman, and Jacobus Van der Merwe

  2. Big Data Era
     ● Big data analysis is increasingly common
       ○ Recommendation systems (e.g., Netflix, Amazon)
       ○ Targeted advertising (e.g., Google, Facebook)
       ○ Mobile network management (e.g., AT&T, Verizon)
     ● Real-time stream frameworks
       ○ Continuously process unbounded streams of data
       ○ High throughput and low latency through parallel computation
       ○ Fault tolerance
       ○ Many open-source stream frameworks (e.g., Storm, Heron, Flink, Spark)

  3. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy

  4. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy
       ○ Solutions
         ■ Shutdown -> modification -> restart
           ● Loses data

  5. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy
       ○ Solutions
         ■ Shutdown -> modification -> restart
           ● Loses data
         ■ Instant swapping
           ● Requires a stable front-end queue system (e.g., Kafka)
           ● Still resource inefficient

  6. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy
       ○ Solutions
         ■ Shutdown -> modification -> restart
           ● Loses data
         ■ Instant swapping
           ● Requires a stable front-end queue system (e.g., Kafka)
           ● Still resource inefficient
     ● Not optimal for one-to-many communication
       ○ No broadcast concept
       ○ The same computations are repeated in the application layer

  7. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy
       ○ Solutions
         ■ Shutdown -> modification -> restart
           ● Loses data
         ■ Instant swapping
           ● Requires a stable front-end queue system (e.g., Kafka)
           ● Still resource inefficient
     ● Not optimal for one-to-many communication
       ○ No broadcast concept
       ○ The same computations are repeated in the application layer
     The inflexibility and performance problems stem mainly from application-layer data routing.

  8. Limitations of Current Real-Time Stream Frameworks
     ● Lack of runtime flexibility
       ○ Scale-in/out, computation logic, and routing policy
       ○ Solutions
         ■ Shutdown -> modification -> restart
           ● Loses data
         ■ Instant swapping
           ● Requires a stable front-end queue system (e.g., Kafka)
           ● Still resource inefficient
     ● Not optimal for one-to-many communication
       ○ No broadcast concept
       ○ The same computations are repeated in the application layer
     ● Lack of management for deployed stream apps
       ○ Limited to application monitoring

  9. Typhoon: An SDN Enhanced Real-Time Stream Framework
     ● Offer flexible stream processing pipelines for dynamic reconfiguration
       ○ Scale-in/out, computation logic, routing policy
     ● Optimize one-to-many communication
       ○ Broadcast concept at the network layer
     ● Enhance the management capabilities of stream frameworks
       ○ SDN control plane applications
     ● Cross-layer design
       ○ Partially offload the application data routing layer into the network layer
       ○ Manage both layers from an SDN controller

  10. Why SDN in Stream Framework?
      ● SDN benefits
        ○ Centralized views and control
        ○ Programmability
      ● One-to-many communication
        ○ More efficient packet-level replication at the network layer
      ● Well-defined protocol and interface
        ○ OpenFlow
      ● Evolvable framework
        ○ Extensible via control plane applications

  11. Can We Apply SDN to Stream Framework?
      ● Conceptually similar to computer networks
      [Figure label: graph-based communication pattern]

  12. Can We Apply SDN to Stream Framework?
      ● Conceptually similar to computer networks
      [Figure labels: graph-based communication pattern; computing & routing components; round-robin and key-based routing]

  13. Can We Apply SDN to Stream Framework?
      ● All topology information is pre-defined in the submitted application
      [Figure: an app developer submits (1) a new application or (2) a reconfiguration request to the stream framework. Logical topology: Input, Split, Count, Aggregator. Physical topology: Input, two Split workers, two Count workers, Aggregator.]

  14. Can We Apply SDN to Stream Framework?
      ● All topology information is pre-defined in the submitted application
      Fits the proactive SDN model well.
      [Figure: an app developer submits (1) a new application or (2) a reconfiguration request to the stream framework. Logical topology: Input, Split, Count, Aggregator. Physical topology: Input, two Split workers, two Count workers, Aggregator.]
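To make the "proactive" point concrete, here is a minimal sketch of how flow rules could be derived up front from a known physical topology. The function name `build_flow_rules`, the worker IDs, the port numbers, and the single-switch setup are all assumptions made for illustration; only the match fields (`in_port`, `dl_dst`, `dl_src`) and the `output` action come from the flow rule shown on slide 26.

```python
from typing import Dict, Iterable, List, Tuple


def build_flow_rules(
    edges: Iterable[Tuple[int, int]],
    port_of: Dict[int, int],
) -> List[dict]:
    """Derive one forwarding rule per (source worker, destination worker) edge.

    edges:   worker-to-worker links of the physical topology (hypothetical IDs).
    port_of: switch port each worker is attached to (hypothetical numbering).
    """
    rules = []
    for src, dst in edges:
        rules.append(
            {
                "match": {"in_port": port_of[src], "dl_dst": dst, "dl_src": src},
                "action": {"output": port_of[dst]},
            }
        )
    return rules


# Hypothetical physical topology: Input(1) -> Split(2, 3) -> Count(4, 5) -> Aggregator(6)
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 6), (5, 6)]
port_of = {worker: worker + 100 for worker in range(1, 7)}
rules = build_flow_rules(edges, port_of)
```

Because every edge is known when the application is submitted, all rules can be installed before any data flows, which is what distinguishes a proactive model from installing rules reactively on the first packet.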

  15. Current Stream Frameworks
      [Figure: Streaming Manager (Topology Builder & Scheduler); Central Coordinator; Compute Cluster (Worker Agent; Worker ….. Worker)]

  16. Typhoon Architecture
      [Figure: Streaming Manager (Topology Builder & Scheduler, Dynamic Topology Manager); Central Coordinator; SDN Controller with SDN Control Plane Application; Compute Cluster (Worker Agent; Worker ….. Worker; Software SDN Switch)]

  17. Typhoon Design Challenges
      ● How can we integrate SDN into stream frameworks?
      ● How can we support dynamic reconfigurations?

  18. Typhoon Design Challenges
      ● How can we integrate SDN into stream frameworks?
      ● How can we support dynamic reconfigurations?

  19. Integration of SDN into Stream Framework
      [Figure: Worker Agent; Worker ….. Worker; Software SDN Switch]

  20. Integration of SDN into Stream Framework
      [Figure: Worker Agent; Worker ….. Worker; each worker consists of Data Computation, Data Routing, and an I/O Layer; Software SDN Switch]

  21. Integration of SDN into Stream Framework
      [Figure: Worker Agent; Worker ….. Worker; each worker consists of Data Computation, Data Routing, and an I/O Layer; each I/O Layer attaches to a port on the Software SDN Switch]

  22. Integration of SDN into Stream Framework
      [Figure: Worker Agent; Worker ….. Worker; each worker consists of Data Computation, Data Routing, and an I/O Layer; each I/O Layer attaches to a port on the Software SDN Switch]
      To use SDN for data forwarding, Typhoon needs to define the match fields in its flow rules and a packet format that those rules can match.

  23. Design Flow Rules and Packet Format in Typhoon
      Data tuple: the stream communication data model for worker-to-worker communication in existing stream frameworks
      ● Metadata tuple
        ○ Length of data tuple
        ○ Destination worker ID
        ○ Source worker ID
        ○ StreamID
      ● Data tuple
        ○ Output from Data Computation

  24. Design Flow Rules and Packet Format in Typhoon
      Data tuple: the stream communication data model for worker-to-worker communication in existing stream frameworks
      ● Metadata tuple
        ○ Length of data tuple
        ○ Destination worker ID (unique ID in a stream application)
        ○ Source worker ID (unique ID in a stream application)
        ○ StreamID
      ● Data tuple
        ○ Output from Data Computation

  25. Design Flow Rules and Packet Format in Typhoon
      Data tuple: the stream communication data model for worker-to-worker communication in existing stream frameworks
      ● Metadata tuple
        ○ Length of data tuple
        ○ Destination worker ID (unique ID in a stream application)
        ○ Source worker ID (unique ID in a stream application)
        ○ StreamID
      ● Data tuple
        ○ Output from Data Computation
      Custom transport protocol in Typhoon
      ● Ethernet header
        ○ Destination worker ID
        ○ Source worker ID
        ○ Ether type
      ● Payload
        ○ Metadata tuple: length of data tuple, StreamID
        ○ Data tuple: output from Data Computation
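The custom transport format above can be sketched as a byte layout. This is only an illustration under stated assumptions: the slides do not give field widths or an EtherType value, so the 6-byte worker ID fields (reusing the Ethernet address slots), the 4-byte length, the 2-byte StreamID, and the `TYPHOON_ETHERTYPE` constant are all hypothetical.

```python
import struct

# Assumption: an IEEE local-experimental EtherType; the slides do not name a value.
TYPHOON_ETHERTYPE = 0x88B5
HEADER_LEN = 14    # 6 (destination) + 6 (source) + 2 (EtherType)
METADATA_LEN = 6   # 4 (data tuple length) + 2 (StreamID), widths assumed


def worker_id_to_addr(worker_id: int) -> bytes:
    """Place a worker ID in a 6-byte Ethernet address field (assumed encoding)."""
    return worker_id.to_bytes(6, "big")


def encode_frame(dst_worker: int, src_worker: int, stream_id: int, data_tuple: bytes) -> bytes:
    """Build header (dst ID, src ID, EtherType) + metadata (length, StreamID) + data tuple."""
    header = (
        worker_id_to_addr(dst_worker)
        + worker_id_to_addr(src_worker)
        + struct.pack("!H", TYPHOON_ETHERTYPE)
    )
    metadata = struct.pack("!IH", len(data_tuple), stream_id)
    return header + metadata + data_tuple


def decode_frame(frame: bytes):
    """Inverse of encode_frame; returns (dst, src, ethertype, stream_id, data_tuple)."""
    dst = int.from_bytes(frame[0:6], "big")
    src = int.from_bytes(frame[6:12], "big")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    length, stream_id = struct.unpack("!IH", frame[14:HEADER_LEN + METADATA_LEN])
    start = HEADER_LEN + METADATA_LEN
    return dst, src, ethertype, stream_id, frame[start:start + length]
```

Moving the worker IDs out of the metadata tuple and into the header is what lets a switch forward on them without parsing the payload.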

  26. Offloading Data Forwarding from Application Layer
      Packet format created by the I/O Layer
      ● Ethernet header
        ○ Destination worker ID
        ○ Source worker ID
        ○ Ether type
      ● Payload
        ○ Set of data tuples
      Pre-installed flow rule
      ● Match: in_port=[src worker port], dl_dst=[dest worker ID], dl_src=[src worker ID]
      ● Action: output=[dest worker port]
      [Figure: the source worker and the destination worker each consist of Data Computation, Data Routing, and an I/O Layer, and each attaches to a port on the Software SDN Switch]
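The pre-installed rule above amounts to an exact-match lookup on three fields. The sketch below models that lookup in plain Python rather than with a real switch or OpenFlow library; `Match`, `FlowTable`, and all port and ID values are hypothetical. The multi-port rule at the end is one way packet replication at the switch could be expressed, and the slides do not say this is how Typhoon implements one-to-many delivery.

```python
from typing import Dict, Iterable, List, NamedTuple


class Match(NamedTuple):
    """Match fields named on the slide: ingress port plus destination and source worker IDs."""
    in_port: int
    dl_dst: int
    dl_src: int


class FlowTable:
    """Toy exact-match flow table; a real software switch does far more."""

    def __init__(self) -> None:
        self._rules: Dict[Match, List[int]] = {}

    def install(self, match: Match, out_ports: Iterable[int]) -> None:
        """Pre-install a rule mapping a match to one or more output ports."""
        self._rules[match] = list(out_ports)

    def forward(self, in_port: int, dl_dst: int, dl_src: int) -> List[int]:
        """Return the output ports for a frame, or an empty list on a table miss."""
        return self._rules.get(Match(in_port, dl_dst, dl_src), [])


table = FlowTable()
# One-to-one: source worker 10 on port 1 sends to destination worker 20 on port 2.
table.install(Match(in_port=1, dl_dst=20, dl_src=10), [2])
# One-to-many (assumed encoding): a reserved destination ID fans out to several ports.
table.install(Match(in_port=1, dl_dst=0xFFFF, dl_src=10), [2, 3, 4])
```

With rules like these in place, the sending worker writes one frame and the switch decides where copies go, instead of the application layer sending the same tuple once per destination.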

  27. Typhoon Design Challenges
      ● How can we integrate SDN into stream frameworks?
      ● How can we support dynamic reconfigurations?
        ○ What are the requirements for dynamic reconfigurations?

  28. Application Layer Data Routing in Typhoon
      ● Worker data routing
        ○ Policy-specific routing states
        ○ Policy-independent routing states
        ○ Routing computations
      Policy-specific routing states (routing type: state)
        ○ Round-robin: counter
        ○ Key-based: index for hash
        ○ One-to-many: (no state listed)
      Policy-independent routing states
        ○ Destination worker IDs: Destination 1 ... Destination N
      Routing computations
        /* Round robin */
        index = (counter++) % numDestWorkers;
        dstWorker = nextWorkers[index];
      [Figure: a worker consists of Data Computation, Data Routing, and an I/O Layer]
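The round-robin computation above, together with a key-based counterpart, can be sketched as two small router classes. The class names are hypothetical, and the key-based version assumes that "index for hash" means the position of the tuple field used as the key and that any stable hash will do (CRC32 is used here only because it is deterministic across runs).

```python
import zlib
from typing import List, Sequence


class RoundRobinRouter:
    """Policy-specific state: a counter. Policy-independent state: destination worker IDs."""

    def __init__(self, next_workers: Sequence[int]) -> None:
        self.next_workers: List[int] = list(next_workers)
        self.counter = 0

    def route(self, data_tuple: tuple) -> int:
        # Same computation as on the slide: index = (counter++) % numDestWorkers
        index = self.counter % len(self.next_workers)
        self.counter += 1
        return self.next_workers[index]


class KeyBasedRouter:
    """Policy-specific state: which tuple field to hash (assumed reading of 'index for hash')."""

    def __init__(self, next_workers: Sequence[int], key_index: int) -> None:
        self.next_workers: List[int] = list(next_workers)
        self.key_index = key_index

    def route(self, data_tuple: tuple) -> int:
        key = repr(data_tuple[self.key_index]).encode("utf-8")
        # A stable hash keeps the same key on the same destination worker.
        return self.next_workers[zlib.crc32(key) % len(self.next_workers)]
```

Both routers only pick a destination worker ID; delivering the tuple to that worker is left to the I/O layer and the switch.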

  29. Application Layer Data Routing in Typhoon
      ● Update routing states if needed
        ○ Policy-independent routing states
          ■ Scale-in/out
          ■ Computation logic update
        ○ Policy-specific routing states
          ■ Change routing type and states
      Policy-specific routing states (routing type: state)
        ○ Round-robin: counter
        ○ Key-based: index for hash
        ○ One-to-many: (no state listed)
      Policy-independent routing states
        ○ Destination worker IDs: Destination 1 ... Destination N
      [Figure: a worker consists of Data Computation, Data Routing, and an I/O Layer]
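The two kinds of update can be illustrated with a single per-worker state object. This is a sketch under assumptions, not Typhoon's implementation: `RoutingState` and its method names are hypothetical, only two routing types are modeled, and resetting the counter on every update is a simplification chosen here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RoutingState:
    destinations: List[int]             # policy-independent: destination worker IDs
    routing_type: str = "round-robin"   # policy-specific: routing type
    counter: int = 0                    # policy-specific: round-robin state

    def next_destinations(self) -> List[int]:
        """Pick the destination(s) for the next tuple under the current policy."""
        if self.routing_type == "one-to-many":
            return list(self.destinations)
        index = self.counter % len(self.destinations)
        self.counter += 1
        return [self.destinations[index]]

    def update_destinations(self, new_destinations: Sequence[int]) -> None:
        """Scale-in/out or logic update: replace the policy-independent state."""
        self.destinations = list(new_destinations)
        self.counter = 0

    def change_routing_type(self, routing_type: str) -> None:
        """Routing policy change: replace the policy-specific type and state."""
        if routing_type not in ("round-robin", "one-to-many"):
            raise ValueError(f"unsupported routing type: {routing_type}")
        self.routing_type = routing_type
        self.counter = 0
```

For example, scaling out from two to three downstream workers is a call to `update_destinations([1, 2, 3])`, while switching to broadcast is a call to `change_routing_type("one-to-many")`; neither requires restarting the worker.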

  30. Typhoon Design Challenges
      ● How can we integrate SDN into stream frameworks?
      ● How can we support dynamic reconfigurations?
        ○ What are the requirements for dynamic reconfigurations?
          ■ Updating per-worker routing states for reconfigurations
