
PIRE ExoGENI ENVRI preparation for Big Data science



  1. System and Network Engineering, MSc Research project
     PIRE ExoGENI – ENVRI preparation for Big Data science
     Stavros Konstantaras, Ioannis Grafis
     February 5, 2014

  2. Background
     Big Data science
     • Huge amount of data
     • Many sources
     • Data Movement (DM) is very important
     • Described by “5V”s (Volume, Velocity, Variety, Variability and Value)
     Software Defined Networking (SDN)
     • Separate control plane from data plane
     • Single entity controls the network
     • Forwarding intelligence relies on programmers

  3. Research questions
     The main research question is the following:
     • To what degree can the performance of data movement protocols be optimized by using Software Defined Networking technology?
     The main research question includes the following sub-questions:
     • What network-level problems limit the performance of data movement protocols?
     • How can SDN eliminate these problems?

  4. Outline
     • Theory part
       • Problem analysis
       • Solution profiles
     • Experimental part
       • Prototyping HIDE (Hybrid Intelligent Data Enhancer)
       • Scenarios and results
     • Conclusion

  5. Data Movement Application problems
     GridFTP (Globus)
       Positives: open source; high scalability; high reliability; option to resume transfers that were stopped because of failures
       Negatives: difficult to deploy; network speed limit (13 Gbps for the TCP version)
     bbFTP (NASA)
       Positives: open source; high scalability; high reliability; multi-stream TCP; easy to deploy; resume file transfer session
       Negatives: transfers only files, not directories; little industry adoption; little documentation
     FDT (CERN)
       Positives: open source; runs on all major platforms (Java application); multi-stream TCP; resume file transfer session
       Negatives: little industry adoption; little documentation; network speed limit (4.5 Gbps)
     Network limits
     • The window size is decreased for every lost packet and the packet is resent
     • The application is not aware of the topology and the path that the data follows
     • Most of the time, the data transfer speed is limited by network traffic

  6. Performance problem?
     Does the application perform well?
     • YES: do nothing
     • NO: is it a network problem?
       • NO: examine / improve the Application
       • YES: is it a corrupted/broken link?
         • YES: fix the link
         • NO: is it a busy link?
           • YES: proceed to the decision tree to select a QoS solution
           • NO: examine the other entities of the network
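The flowchart on this slide can be read as a short chain of checks. The sketch below is one possible encoding, assuming that a well-performing application leads to "do nothing" and that the checks are evaluated top to bottom; the function name, the parameter names and the `Action` enum are illustrative and do not come from the slides.

```python
from enum import Enum


class Action(Enum):
    """Leaf outcomes of the flowchart (labels taken from the slide)."""
    DO_NOTHING = "Do nothing"
    IMPROVE_APPLICATION = "Examine / improve the Application"
    FIX_LINK = "Fix the link"
    SELECT_QOS_SOLUTION = "Proceed to the decision tree to select a QoS solution"
    EXAMINE_OTHER_ENTITIES = "Examine the other entities of the network"


def diagnose(performs_well, network_problem=False, broken_link=False, busy_link=False):
    """Walk the flowchart from the top and return the resulting action."""
    if performs_well:
        return Action.DO_NOTHING
    if not network_problem:
        return Action.IMPROVE_APPLICATION
    if broken_link:
        return Action.FIX_LINK
    if busy_link:
        return Action.SELECT_QOS_SOLUTION
    return Action.EXAMINE_OTHER_ENTITIES


if __name__ == "__main__":
    # A slow transfer caused by a congested (busy) link.
    print(diagnose(performs_well=False, network_problem=True, busy_link=True).value)
```

Only the busy-link outcome leads on to the QoS decision tree; every other leaf ends the diagnosis.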

  7. Available technologies
     • Traffic monitoring
       • Deep Packet Inspection (DPI)
       • Inspect client/server interfaces
       • Inspect flow counters
     • Flow management
       • Port level
       • Socket level (IP address and TCP port)
     • Network controllability
       • Commands to the controller (API)
       • Commands to the switches
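The difference between port-level and socket-level flow management can be illustrated with a small match record. This is a hypothetical sketch, not a Floodlight or OpenFlow data structure: the `FlowMatch` class, its field names and the example addresses are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowMatch:
    """Hypothetical match record; a field left as None acts as a wildcard."""
    in_port: Optional[int] = None   # switch port (port-level management)
    ip_dst: Optional[str] = None    # destination IP address (socket level)
    tcp_dst: Optional[int] = None   # destination TCP port (socket level)

    def level(self) -> str:
        """Report the granularity at which this record selects traffic."""
        if self.ip_dst is not None and self.tcp_dst is not None:
            return "socket"
        if self.in_port is not None:
            return "port"
        return "unspecified"

    def matches(self, in_port: int, ip_dst: str, tcp_dst: int) -> bool:
        """Return True if a packet with these attributes is selected."""
        return (
            (self.in_port is None or self.in_port == in_port)
            and (self.ip_dst is None or self.ip_dst == ip_dst)
            and (self.tcp_dst is None or self.tcp_dst == tcp_dst)
        )


if __name__ == "__main__":
    port_rule = FlowMatch(in_port=1)                           # everything entering port 1
    socket_rule = FlowMatch(ip_dst="10.0.0.2", tcp_dst=5001)   # one service on one host
    print(port_rule.level(), socket_rule.level())
```

A port-level rule moves all traffic on a switch port together, while a socket-level rule can single out one transfer and leave other traffic on the same port untouched.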

  8. Decision tree
     How can we improve the Application’s performance?
     [Figure: full decision tree with two top-level questions, “Can we use the Application for that?” and “Can we use the network for that?”. The application branch ends in application-level options: replace the Application, extend the source code (with or without network input), build a separate component, and traffic monitoring at the client/server interfaces. The network branch ends in Controller-based options: traffic monitoring (DPI, flow counters), flow management (ports, sockets), and network control (commands to the switches or to the Controller). Legend: excluded options, parallel options.]

  9. Decision tree
     How can we improve the Application’s performance?
     Can we use the Application for that? (not expanded here)
     Can we use the network for that?
     • NO: cannot provide a network-level solution
     • YES: can we control the network to boost performance?
       • NO: do nothing
       • YES: do we have access to the Controller?
         • NO: we need to grant some access rights; cannot use SDN to solve the problem
         • YES: how can we make use of the Controller?
           • Full access level: extend the source code
             • Traffic monitoring: DPI, flow counters
             • Flow management: sockets, ports
             • Network control: commands to the switches
           • Some access level: use an API to communicate
             • Traffic monitoring: inspect flow counters
             • Flow management: ports, sockets
             • Network control: commands to the Controller

  10. Decision tree
      How can we improve the Application’s performance?
      Can we use the network for that? (not expanded here)
      Can we use the Application for that?
      • NO: cannot provide an application-level solution
      • YES: do we have access to the Application?
        • No access: do nothing
        • Full access level: can we modify the source code?
          • NO: replace the Application
          • YES: can we monitor the traffic?
            • NO: extend the source code without network input
            • YES: extend the source code, with traffic monitoring (inspect client/server interfaces)
        • Some access level: use an API to communicate; build a separate component, with traffic monitoring at the interfaces

  11. Solution development profiles
      Requirements                  | Application level Programmer | Network Programmer (API) | Network Programmer (full) | Hybrid Programming
      Develop at Application level  | YES  | NO   | NO  | YES
      Develop at Network level      | NO   | YES  | YES | NO
      Make use of SDN Technology    | NO   | YES  | YES | YES
      Access to the Application     | YES  | NO   | NO  | SOME
      Access to the Controller      | NO   | SOME | YES | SOME
      Network topology knowledge    | NO   | YES  | YES | YES
      Network status knowledge      | SOME | YES  | YES | YES
      Traffic monitor using DPI     | NO   | NO   | YES | NO
      Traffic monitor on flow level | NO   | YES  | YES | YES
      Traffic monitor at Interfaces | YES  | NO   | NO  | NO
      Flow management               | NO   | YES  | YES | YES
      Network controllability       | NO   | SOME | YES | YES
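The two access rows of this table can be used to shortlist the profiles that are feasible for a given team. The sketch below is illustrative only: it assumes the ordering NO < SOME < YES and that having more access than a profile requires is acceptable, neither of which is stated on the slide. The dictionary keys and the function name are made up for the example.

```python
# Access requirements per profile, copied from the rows
# "Access to the Application" and "Access to the Controller".
PROFILES = {
    "Application level Programmer": {"application": "YES", "controller": "NO"},
    "Network Programmer (API)": {"application": "NO", "controller": "SOME"},
    "Network Programmer (full)": {"application": "NO", "controller": "YES"},
    "Hybrid Programming": {"application": "SOME", "controller": "SOME"},
}

# Assumed ordering of access levels (not stated on the slide).
LEVEL = {"NO": 0, "SOME": 1, "YES": 2}


def feasible_profiles(application_access, controller_access):
    """Return the profiles whose access requirements are covered."""
    available = {
        "application": LEVEL[application_access],
        "controller": LEVEL[controller_access],
    }
    return [
        name
        for name, required in PROFILES.items()
        if all(available[key] >= LEVEL[value] for key, value in required.items())
    ]


if __name__ == "__main__":
    # Partial access to both the application and the controller.
    print(feasible_profiles("SOME", "SOME"))
```

With partial access on both sides, only the API-based network profile and the hybrid profile remain, which matches the "SOME" entries in the Hybrid Programming column.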

  12. Solution tracks
      How can we improve the Application’s performance?
      [Figure: the decision tree from slide 8 with four solution tracks marked on it: Application approach, Network approach (Full), Network approach (API), and Our approach.]

  13. Controller-Application relationship
      [Figure: two-axis diagram. One axis runs from Controller Dependent to Controller Independent, the other from Application Independent to Application Dependent. Plotted approaches: Our approach, Network level, Network level (API), Application level.]

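  14. HIDE component
      [Figure: test topology. A Floodlight controller running HIDE manages four OpenFlow switches (SW1, SW2, SW3, SW4) that connect client1 and client2 to server1 and server2. FDT and Iperf run on the end hosts. Links are 1 Gbps, 100 Mbps or 10 Mbps, and two paths are marked: Path1 and Path2.]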


  16. HIDE overhead
      [Figure: two timelines starting from a new connection at t0.
      FDT timeline: 1st, 2nd, 3rd and 4th FDT output at t1, t2, t3 and t4, with intervals of 6 s, 5 s, 5 s and 5 s.
      Component timeline (t0, t1, t2, t2’, t3, t4; intervals of 5 s, 6 s, 8 s and 2 s): ignored output, discover QoS problem, send commands to change path, confirm that the problem is solved.]
      Δt = FDT + HIDE = 16 s

  17. Scenarios
      • Scenario 1
        • Transferring files via Path1 with and without interfering traffic, to obtain reference points
      • Scenario 2
        • Transferring files via Path1 with interfering traffic and the component enabled
      • Scenario 3
        • Interfering traffic changes path every 30 s in order to stress HIDE for a longer period

  18. Scenario results
      [Figure: “FDT performance on transferring different files”. X axis: file sizes in megabytes (1 to 9000). Y axis: time in seconds (0 to 1800). Series: FDT ideal (Scenario 1), Scenario 2 HIDE disabled, Scenario 2 HIDE enabled.]

  19. Total transfer time
      Total time for transferring three different files (time in seconds)
      File size | Scenario 1 | Scenario 2 HIDE disabled | Scenario 2 HIDE enabled | Scenario 3 HIDE disabled | Scenario 3 HIDE enabled
      125 MB    | 17         | 29                       | 23                      | 43                       | 23
      1.25 GB   | 117        | 225                      | 123                     | 222                      | 138
      8.75 GB   | 773        | 1569                     | 790                     | 1387                     | 892
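The effect of HIDE can be summarized as the fraction of transfer time saved when the component is enabled. The sketch below uses only the numbers reported on this slide; the metric itself (relative reduction against the HIDE-disabled run of the same scenario) and all names in the code are choices made for this example, not something the slides define.

```python
# Transfer times in seconds from slide 19: (HIDE disabled, HIDE enabled).
TIMES = {
    "125 MB": {"scenario2": (29, 23), "scenario3": (43, 23)},
    "1.25 GB": {"scenario2": (225, 123), "scenario3": (222, 138)},
    "8.75 GB": {"scenario2": (1569, 790), "scenario3": (1387, 892)},
}


def reduction(disabled, enabled):
    """Fraction of the transfer time saved when HIDE is enabled."""
    return (disabled - enabled) / disabled


def summarize(times=TIMES):
    """Return the saving in percent for each file size and scenario."""
    return {
        size: {
            scenario: round(100 * reduction(*pair), 1)
            for scenario, pair in row.items()
        }
        for size, row in times.items()
    }


if __name__ == "__main__":
    for size, row in summarize().items():
        print(size, row)
```

For the largest file in Scenario 2, enabling HIDE cuts the transfer time roughly in half (1569 s down to 790 s), close to the 773 s reference measured in Scenario 1.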
