

  1. University of Minnesota
     Scaling Up the Performance of Distributed Key-Value Stores Using Emerging Technologies for Big Data Applications
     Hebatalla Eldakiky
     Advisor: Prof. David H. C. Du
     Department of Computer Science and Engineering, University of Minnesota, USA
     January 22nd, 2020

  2. Talk Outline
     • Introduction
     • Background & Motivation
     • Completed Work
       ❑ TurboKV: Scaling Up the Performance of Distributed Key-Value Stores with In-Switch Coordination
       ❑ Key-Value Pairs Allocation Strategy for Kinetic Drives
     • Proposed Work
       ❑ TransKV: A Networking Support for Transaction Processing in Distributed Key-Value Stores (Proposed Project)
     • Conclusion
     • Future Plan

  3. The Big Data Era (1/2)
     We live in the digital era, where data is generated from everywhere: bridge monitoring, environment controls, elder care monitoring, forest management, soil monitoring, the Internet of Things, social media, and smart phones.
     • 4 PB of new data per day, and more
     • 6,000 tweets per second
     [Source: Effective Business Intelligence with QuickSight, © 2017]

  4. The Big Data Era (2/2)
     NoSQL databases have become a competitive alternative to relational databases for storing and processing this data.
     NoSQL DB categories:
     • Document DB
     • Key-Value Store (e.g., RAMCloud)
     • Graph DB
     • Column DB

  5. Big Data & Storage Challenges (1/2)
     • Storage infrastructure is vital for solving big data problems.
     • An enormous amount of data is distributed across several storage nodes, which are connected by network switches.
     • Network latency plays a critical role in the efficient access of data in this distributed environment.
     • Software-defined networks (SDN) provide efficient resource allocation and flexibility for maximum network performance.
     • Network switches are also becoming intelligent enough to perform some computational tasks in-network.
     → How can we use SDN to manage the distributed storage nodes intelligently?

  6. Big Data & Storage Challenges (2/2)
     [Figure: conventional architecture vs. in-storage computing architecture. In the conventional path, the host CPU reads the data from the storage device and executes the query itself; with in-storage computing, the query is sent to the device, executed on the device CPU (e.g., an ARM processor) with its own DRAM, and only the results are returned.]
     • Data movement problem: with data-intensive applications, the amount of data shipped from the storage drives to be processed by the host is very large.
     • In-storage computing reduces the amount of data shipped between storage and compute:
       → lower latency
       → less energy for data transfer

  7. Programmable Networks → In-Network Computing
     P4 is a high-level language for programming protocol-independent packet processors, designed to achieve 3 goals:
     • Protocol independence
     • Target independence
     • Reconfigurability in the field
     Think programming rather than protocols: "This is how I want the network to behave and how to switch packets…" (the user / controller makes the rules).
     [Figure: a P4 programmable device managed through a switch OS, driver, and run-time API, with network demands flowing down and feedback flowing back up.]

  8. What is PISA?
     In the Protocol Independent Switch Architecture, the programmer declares the headers that should be recognized and their order in the packet, defines the tables and the exact processing algorithm, and declares how the output packet will look on the wire.
     Programmable Parser → Programmable Match-Action Pipeline → Programmable Deparser
     • The packet is parsed into individual headers.
     • Headers and intermediate results are used for matching and actions.
     • Headers can be modified, added, or removed in match-action processing.
     • The packet is deparsed.
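A minimal Python sketch (illustrative only, not P4 and not from the slides) of the PISA flow: the programmer declares which headers are recognized and in what order, packets are parsed into those headers, and the deparser emits them back onto the wire. The header names and layouts below are assumptions.

```python
# Hypothetical header layouts: (field name, width in bytes), in wire order.
HEADER_FORMATS = {
    "ethernet": [("dst", 6), ("src", 6), ("ether_type", 2)],
    "kv_hdr":   [("op", 1), ("key", 16)],     # assumed custom key-value header
}
PARSE_ORDER = ["ethernet", "kv_hdr"]          # declared parsing order

def parse(packet: bytes):
    """Programmable parser: split raw bytes into the declared headers."""
    headers, offset = {}, 0
    for name in PARSE_ORDER:
        fields = {}
        for field, width in HEADER_FORMATS[name]:
            fields[field] = packet[offset:offset + width]
            offset += width
        headers[name] = fields
    return headers, packet[offset:]           # whatever is left is the payload

def deparse(headers: dict, payload: bytes) -> bytes:
    """Programmable deparser: emit the (possibly modified) headers back in order."""
    out = bytearray()
    for name in PARSE_ORDER:
        for field, _ in HEADER_FORMATS[name]:
            out += headers[name][field]
    return bytes(out) + payload
```

Between `parse` and `deparse`, the match-action pipeline of the next slide is where header fields get modified, added, or removed.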

  9. Match-Action Processing
     • Tables are the fundamental unit in the match-action pipeline.
     • Each table contains one or more entries.
     • An entry contains: a specific key to match on, a single action, and action data.
     [Hardware examples: switch bandwidth up to 6.5 Tbps; NetFPGA SUME bandwidth 4×10 Gbps; processing delay < 1 µs.]
     Systems using programmable switches:
     • NetCache [SOSP '17]: on-switch cache for load balancing (LB)
     • NetChain [NSDI '18]: on-switch KV store for small data
     • DistCache [FAST '19]: multi-rack on-switch cache for LB
     • iSwitch [ISCA '19]: on-switch aggregation for distributed RL
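A minimal Python sketch (illustrative only, not from the slides) of the entry structure described above: each table entry holds a match key, a single action, and the action data passed to that action. The keys, actions, and port numbers are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Entry:
    action: Callable[[dict, Tuple], None]   # the single action to run on a hit
    action_data: Tuple                      # parameters handed to that action

def set_egress(meta: dict, data: Tuple) -> None:
    meta["egress_port"] = data[0]           # forward out of the given port

def drop(meta: dict, data: Tuple) -> None:
    meta["drop"] = True

# One exact-match table (a real pipeline chains several such tables).
table: Dict[bytes, Entry] = {
    b"\x0a\x00\x00\x01": Entry(set_egress, (1,)),
    b"\x0a\x00\x00\x02": Entry(set_egress, (2,)),
}

def apply(table: Dict[bytes, Entry], key: bytes, meta: dict) -> None:
    """Match the key against the table; run the hit action, or drop on a miss."""
    entry = table.get(key)
    if entry is not None:
        entry.action(meta, entry.action_data)
    else:
        drop(meta, ())

meta: dict = {}
apply(table, b"\x0a\x00\x00\x01", meta)     # -> meta == {"egress_port": 1}
```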

  10. Kinetic Drive → In-Storage Computing
     • Active KV storage device developed by Seagate, accessible over an Ethernet connection.
     • Has a CPU and RAM with a built-in LevelDB (the Kinetic stack).
     • Handles device-to-device data migration through P2P copy commands.
     • Applications communicate with the drive using the Kinetic protocol over the TCP network.
     • Simple API (get, put, delete).
     Drive specifications:
     • Model No.: ST4000NK0001
     • Transfer rate: 60 Mbps
     • Capacity: 4 TB
     • Key size: up to 4 KB
     • Value size: up to 1 MB
     Kinetic drives research:
     • Kinetic Action [ICPADS '17]: performance evaluation of Kinetic drive characteristics.
     • Data Allocation [BigDataService '17]: 4 data allocation approaches for Kinetic drives.
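A minimal Python sketch of the put/get/delete API shape the slide attributes to Kinetic drives. The JSON-over-TCP framing below is purely hypothetical, not the real Kinetic protocol; it only illustrates the pattern of talking to an Ethernet-attached key-value drive through a simple API.

```python
import json
import socket
from typing import Optional

class KineticLikeClient:
    def __init__(self, host: str, port: int):
        self.sock = socket.create_connection((host, port))
        self.reader = self.sock.makefile("r")

    def _call(self, op: str, key: str, value: Optional[str] = None) -> dict:
        request = {"op": op, "key": key, "value": value}   # hypothetical framing
        self.sock.sendall((json.dumps(request) + "\n").encode())
        return json.loads(self.reader.readline())          # one JSON reply per line

    def put(self, key: str, value: str) -> dict:
        return self._call("put", key, value)

    def get(self, key: str) -> dict:
        return self._call("get", key)

    def delete(self, key: str) -> dict:
        return self._call("delete", key)

# Usage (made-up drive address, for illustration only):
# client = KineticLikeClient("192.168.1.10", 8123)
# client.put("sensor:42", "17.5")
# print(client.get("sensor:42"))
```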

  11. Our Mission
     • Improve data access performance for distributed KV stores when applications access storage through the network.
     • Reduce the amount of data shipped from storage devices to be processed by the host in data-intensive applications.
     Completed work:
     ❑ TurboKV: Scaling Up the Performance of Distributed Key-Value Stores with In-Switch Coordination
     ❑ Key-Value Pair Allocation Strategy for Kinetic Storage Drives
     Proposed work:
     ❑ TransKV: Networking Support for Transaction Processing in Distributed Key-Value Stores
     [Figure: the storage stack — Apps / KV Stores / Infrastructure.]

  12. Completed Work (1/2)
     TurboKV: Scaling Up the Performance of Distributed Key-Value Stores with In-Switch Coordination [1]

     [1] Hebatalla Eldakiky, David H. C. Du, and Eman Ramadan, "TurboKV: Scaling Up the Performance of Distributed Key-Value Stores with In-Switch Coordination", under submission to ACM Transactions on Storage (TOS).

  13. Problem Definition
     • In a distributed key-value store, data is partitioned between several nodes.
     • Partition management and query routing are handled in three different ways: server-driven coordination, client-driven coordination, and master-node coordination.

     Server-driven coordination: the request is sent to a random instance, which re-directs it to the right storage node; the reply is sent to the client.
     × Increases query response time.
     ✓ The client doesn't need to link any code to the KV store.

     Master-node coordination: the request is sent to a master node, which directs it to the right instance; the reply is sent to the client.
     × Increases query response time.
     × Single point of failure.
     ✓ The client doesn't need to link any code to the KV store.

     Client-driven coordination: the request is sent directly to the target storage node; the reply is sent to the client.
     × Periodic pulling of updated directory info.
     × The client needs to link code related to the used KV store.
     ✓ Decreases query response time.
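A minimal Python sketch (hypothetical, not from the slides) of where the directory lookup happens in these coordination styles. The `DIRECTORY` map, node names, and "table:key" naming are made up for illustration.

```python
from typing import List

DIRECTORY = {"users": "node-1", "orders": "node-2", "items": "node-3"}

def lookup(key: str) -> str:
    """Directory lookup: which storage node owns this key's partition."""
    return DIRECTORY[key.split(":")[0]]

def client_driven_get(key: str) -> List[str]:
    # The client holds (and must periodically refresh) the directory,
    # so the request goes straight to the owner: fewest hops.
    owner = lookup(key)
    return [f"client -> {owner}", f"{owner} -> client"]

def server_driven_get(key: str, random_node: str) -> List[str]:
    # The client contacts any instance; that instance re-directs the request
    # to the right node, adding a hop but keeping KV-store code off the client.
    owner = lookup(key)
    hops = [f"client -> {random_node}"]
    if random_node != owner:
        hops.append(f"{random_node} -> {owner}")
    hops.append(f"{owner} -> client")
    return hops

print(client_driven_get("users:1001"))             # 2 messages
print(server_driven_get("users:1001", "node-3"))   # 3 messages
```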

  14. Why Switch-driven Coordination?
     • Requests already pass through network switches to arrive at their target.
     • Switch-driven coordination can carry out partition management and query routing in the network switches.
     • As shown in the figure, the request path shrinks from 4 hops to 2 hops.
     → Higher throughput
     → Lower read/write latency

  15. Objectives
     • Design an in-switch indexing scheme to manage the directory information records.
     • Adapt the scheme to the match-action pipeline in the programmable switches.
     • Utilize switches as a monitoring system for data popularity and storage node load.
     • Scale the scheme up to multiple racks inside the data center network.
     Design issues:
     ❑ Data partitioning
     ❑ Data replication
     ❑ Index table design
     ❑ Network protocol
     ❑ Key-value operations processing
     ❑ Load balancing
     ❑ Failure handling
     ❑ Scaling up to data center networks

  16. TurboKV Overview
     Programmable switches
     • Match-action tables store the directory information.
     • Manage key-based routing.
     • Provide query statistics reports to the controller.
     System controller
     • Load balancing between the storage nodes.
     • Updating match-action tables with the new location of data.
     • Handling failures.
     Storage nodes
     • Server library to translate TurboKV packets to the used key-value store.
     System clients
     • Client library to construct TurboKV request packets.

  17. TurboKV Data Plane Design (1/3)
     [Figure: logical view of the TurboKV data plane pipeline, covering hash partitioning, range partitioning, and chain replication; a partitioning sketch follows below.]
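A minimal Python sketch (not TurboKV's code) of the two partitioning schemes named in the figure: hash partitioning spreads keys by a hash of the key, while range partitioning keeps key order inside contiguous sub-ranges. The split points and partition count are assumptions.

```python
import bisect
import zlib

NUM_PARTITIONS = 4

def hash_partition(key: bytes) -> int:
    """Hash partitioning: partition id = hash(key) mod number of partitions."""
    return zlib.crc32(key) % NUM_PARTITIONS

RANGE_SPLITS = [b"g", b"n", b"t"]     # 3 split points define 4 sub-ranges

def range_partition(key: bytes) -> int:
    """Range partitioning: partition id = index of the sub-range holding key."""
    return bisect.bisect_left(RANGE_SPLITS, key)

# "apple" always falls in ordered sub-range 0; its hash bucket may be any of 0..3.
print(hash_partition(b"apple"), range_partition(b"apple"))
```

Range partitioning is what makes the range queries on slide 19 cheap, since neighboring keys land on the same node.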

  18. TurboKV Data Plane Design (2/3)
     On-switch index table and network protocol:

     Sub-range     Storage nodes
     Sub-range 1   IP1, IP2, IP3
     Sub-range 2   IP2, IP3, IP4
     Sub-range 3   IP3, IP4, IP1
     Sub-range 4   IP4, IP1, IP2
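A minimal Python sketch (not the actual switch pipeline) of the on-switch index table above: each sub-range maps to its chain of replicas. Sending writes to the chain head and reads to the tail follows classic chain replication; the sub-range key bounds are assumed for illustration.

```python
from bisect import bisect_left

SUBRANGE_UPPER_BOUNDS = [30, 80, 120, 200]   # assumed upper key bound per sub-range

# Index table from the slide: sub-range id -> replica chain (head first).
INDEX_TABLE = {
    0: ["IP1", "IP2", "IP3"],
    1: ["IP2", "IP3", "IP4"],
    2: ["IP3", "IP4", "IP1"],
    3: ["IP4", "IP1", "IP2"],
}

def route(key: int, is_write: bool) -> str:
    """Match the key to its sub-range, then pick the chain head (write) or tail (read)."""
    chain = INDEX_TABLE[bisect_left(SUBRANGE_UPPER_BOUNDS, key)]
    return chain[0] if is_write else chain[-1]

assert route(25, is_write=True) == "IP1"    # a PUT in sub-range 1 goes to the head
assert route(25, is_write=False) == "IP3"   # a GET in sub-range 1 goes to the tail
```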

  19. TurboKV Data Plane Design (3/3)
     Key-value operations processing: PUT(K, value), GET(K), and RANGE(K10, K100).
     For the example range query RANGE(K10, K100), the switch checks the covered sub-ranges at the egress pipeline:
     • [K1 - K30]: packet out; K100 > K30, so the packet is recirculated.
     • [K31 - K80]: packet out; K100 > K80, so the packet is recirculated.
     • [K80 - K120]: packet out; K100 ≤ K120, so processing completes.
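A minimal Python sketch of the range-query walk described above: for each sub-range the query overlaps, the switch sends a packet toward that sub-range's owner, and recirculates the packet while the query's end key still lies beyond the current sub-range. The bounds are simplified, and the real logic runs in the egress pipeline rather than in software.

```python
from typing import List, Tuple

SUBRANGES: List[Tuple[int, int]] = [(1, 30), (31, 80), (81, 120)]  # assumed bounds

def process_range(start_key: int, end_key: int) -> List[str]:
    events = []
    for low, high in SUBRANGES:
        if high < start_key or low > end_key:
            continue                             # query does not touch this sub-range
        events.append(f"packet out to owner of [K{low}-K{high}]")
        if end_key <= high:
            break                                # end key covered: stop recirculating
        events.append("recirculate")
    return events

# RANGE(K10, K100) -> packet out for [K1-K30], recirculate,
# packet out for [K31-K80], recirculate, packet out for [K81-K120], done.
print(process_range(10, 100))
```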
