Balancing TCP Buffer Size vs Parallel Streams in Application-Level Throughput Optimization Esma Yildirim, Dengpan Yin, Tevfik Kosar* Center for Computation & Technology Louisiana State University June 9, 2009 DADC’09 AT LOUISIANA STATE UNIVERSITY

Motivation  End-to-end data transfer performance is a major bottleneck for large-scale distributed applications  TCP-based solutions ◦ Fast TCP, Scalable TCP, etc.  UDP-based solutions ◦ RBUDP, UDT, etc.  Most of these solutions require kernel-level changes  Not preferred by most domain scientists

Application-Level Solution  Take an application-level transfer protocol (e.g. GridFTP) and tune it for optimal performance: ◦ Using multiple (parallel) streams ◦ Tuning buffer size
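As a rough illustration of what "application-level" means here, both knobs can be set from user space without touching the kernel. The sketch below is not GridFTP code; the helper name `open_tuned_sockets` and the values used are hypothetical. It only creates the sockets and requests a buffer size; the operating system may clamp or adjust the size it actually grants.

```python
import socket


def open_tuned_sockets(num_streams, buffer_bytes):
    """Create `num_streams` TCP sockets and request the given buffer size on each.

    Hypothetical helper: the sockets are left unconnected, and the OS is free
    to round or cap the requested send/receive buffer sizes.
    """
    socks = []
    for _ in range(num_streams):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Request send and receive buffer sizes before connecting.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
        socks.append(s)
    return socks


if __name__ == "__main__":
    streams = open_tuned_sockets(num_streams=4, buffer_bytes=256 * 1024)
    for s in streams:
        granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        print("granted send buffer:", granted)
        s.close()
```

Reading the value back with `getsockopt` is worthwhile in practice, since the granted size is what actually limits the window.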

Roadmap  Introduction  Parallel Stream Optimization  Buffer Size Optimization  Combined Optimization of Buffer Size and Parallel Stream Number  Conclusions

Parallel Stream Optimization  For a single stream, throughput can be calculated theoretically from MSS, RTT and the packet loss rate.  What about n streams?
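The formula itself did not survive on this slide. A reasonable assumption is that it is the well-known Mathis et al. bound, where c is a constant and p is the packet loss rate. The second line is the usual extension to n streams (each stream i seeing its own loss rate p_i), which appears to be what the Hacker et al. model on the next slide describes.

```latex
% Single stream (assumed: Mathis et al. bound)
Th \le \frac{MSS}{RTT} \cdot \frac{c}{\sqrt{p}}

% n parallel streams sharing the same MSS and RTT (assumed form)
Th_{n} \le \frac{MSS \cdot c}{RTT}
  \left( \frac{1}{\sqrt{p_1}} + \frac{1}{\sqrt{p_2}} + \cdots + \frac{1}{\sqrt{p_n}} \right)
```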

Previous Models  Hacker et al (2002): An application opening n streams gains as much throughput as the total of n individual streams can get.  Dinda et al (2005): A relation is established between RTT, p and the number of streams n. [Figure: throughput (Mbps) vs. number of parallel streams]

Kosar et al Models  Break Function Modeling  Logarithmic Modeling  Modeling Based on Newton’s Method  Modeling Based on Full Second Order: $p'_n = p_n \frac{RTT_n^2}{c^2 MSS^2} = a'n^2 + b'n + c'$

It is not a perfect World!  The selection of sample points should be made intelligently; otherwise it could result in mispredictions [Figure: throughput (Mbps) vs. number of parallel streams, GridFTP measurements against model predictions. Panels: a) Dinda et al model (Dinda et al_1_2), b) Newton’s Method model (Newton’s Method_4_14_16), c) Full second order model (Full Second Order_4_9_10), d) Model comparison]

Delimitation of Coefficients  Pre-calculating the coefficients a’, b’ and c’ and checking their ranges can eliminate erroneous predictions  Ex: Full second order, $p'_n = p_n \frac{RTT_n^2}{c^2 MSS^2} = a'n^2 + b'n + c'$ ◦ a’ > 0 ◦ b’ < 0 ◦ c’ > 0 ◦ 2c’ + b’ > 1
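To see why the signs of the coefficients matter, it helps to work the full second order model through. The derivation below assumes that aggregate throughput takes the form Th_n = n / sqrt(p'_n), which follows from the Mathis-style bound when all n streams see the same loss rate p_n; that form is an assumption, since the slides do not state it explicitly.

```latex
% Assumed throughput form for n streams
Th_n = \frac{n}{\sqrt{p'_n}} = \frac{n}{\sqrt{a'n^2 + b'n + c'}}

% Each sampled point (n_i, Th_i) gives one linear equation in a', b', c';
% three points determine the coefficients:
a'n_i^2 + b'n_i + c' = \frac{n_i^2}{Th_i^2}, \qquad i = 1, 2, 3

% Setting the derivative to zero gives the predicted optimal stream number:
\frac{dTh_n}{dn} = \frac{b'n + 2c'}{2\,(a'n^2 + b'n + c')^{3/2}} = 0
\quad\Longrightarrow\quad
n_{opt} = -\frac{2c'}{b'}
```

With b' < 0 and c' > 0 the predicted optimum is positive and the derivative changes sign from positive to negative there, so it is a maximum; a fit that violates these ranges does not describe a throughput curve with a usable peak.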

Selection Algorithm
selected set of stream number and through the minimum err is selected and returned.

ExpSelection(T)
Input: T
Output: O[i][j]
Begin
    accuracy ← α
    i ← 1
    streamno1 ← 1
    throughput1 ← T_streamno1
    O[i][1] ← streamno1
    O[i][2] ← throughput1
    do
        streamno2 ← 2 ∗ streamno1
        throughput2 ← T_streamno2
        slope ← (throughput2 − throughput1) / (streamno2 − streamno1)
        i ← i + 1
        O[i][1] ← streamno2
        O[i][2] ← throughput2
        streamno1 ← streamno2
        throughput1 ← throughput2
    while slope > accuracy
End

BestCmb(O, n, model)
Input: O, n
Output: a, b, c, optnum
Begin
    err_m ← init
    for i ← 1 to (n − 2) do
        for j ← (i + 1) to (n − 1) do
            for k ← (j + 1) to n do
                a’, b’, c’ ← CalCoe(O, i, j, k, model)
                if a’, b’, c’ are effective then
                    err ← (1/n) Σ_{t=1..n} |O[t][2] − Th_pre(O[t][1])|
                    if err_m = init || err < err_m then
                        err_m ← err
                        a ← a’
                        b ← b’
                        c ← c’
                    end if
                end if
            end for
        end for
    end for
    optnum ← CalOptStreamNo(a, b, c, model)
    return optnum
End
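The two routines can be sketched in Python for the full second order model only. This is an illustrative sketch under several assumptions, not the authors' implementation: `measure` is a hypothetical callback returning throughput for a given stream count, `max_streams` is an added safety cap, throughput is assumed to follow Th_n = n / sqrt(a'n^2 + b'n + c'), and the "effective" test checks the sign conditions plus a predicted optimum above one stream (2c' + b' > 0), whereas the slide writes the last condition as 2c' + b' > 1.

```python
from itertools import combinations
from math import sqrt


def exp_selection(measure, accuracy, max_streams=64):
    """Sample throughput at 1, 2, 4, ... streams until the slope flattens."""
    n1, t1 = 1, measure(1)
    samples = [(n1, t1)]
    while True:
        n2 = 2 * n1
        t2 = measure(n2)
        slope = (t2 - t1) / (n2 - n1)
        samples.append((n2, t2))
        n1, t1 = n2, t2
        # max_streams is an assumption added here to bound the loop.
        if slope <= accuracy or n2 >= max_streams:
            return samples


def cal_coe(p1, p2, p3):
    """Fit a', b', c' through three (streams, throughput) points.

    Uses y = n^2 / Th^2 = a'n^2 + b'n + c' and divided differences.
    """
    (n1, y1), (n2, y2), (n3, y3) = [(n, (n / th) ** 2) for n, th in (p1, p2, p3)]
    f12 = (y2 - y1) / (n2 - n1)
    f13 = (y3 - y1) / (n3 - n1)
    a = (f13 - f12) / (n3 - n2)
    b = f12 - a * (n1 + n2)
    c = y1 - a * n1 ** 2 - b * n1
    return a, b, c


def best_cmb(samples):
    """Try every 3-point combination; keep the fit with the lowest mean error."""
    best = None  # (error, a, b, c)
    for trio in combinations(samples, 3):
        a, b, c = cal_coe(*trio)
        q = [a * n * n + b * n + c for n, _ in samples]
        # Assumed effectiveness test (see lead-in).
        effective = a > 0 and b < 0 and c > 0 and 2 * c + b > 0 and min(q) > 0
        if not effective:
            continue
        err = sum(abs(th - n / sqrt(qn)) for (n, th), qn in zip(samples, q)) / len(samples)
        if best is None or err < best[0]:
            best = (err, a, b, c)
    if best is None:
        return None  # no combination produced usable coefficients
    _, a, b, c = best
    return a, b, c, -2 * c / b  # last value: predicted optimal stream number
```

Doubling the stream count keeps the number of sampling transfers logarithmic in the final stream number, and the exhaustive search over triples stays cheap because so few points are collected.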

Points Chosen by the Algorithm

Buffer Size Optimization  Buffer size affects the # of packets in flight before an ack is received  If undersized ◦ The network cannot be fully utilized  If oversized ◦ Throughput degrades due to packet losses, which cause window reductions  A common method is to set it to the Bandwidth Delay Product = Bandwidth x RTT  However, there are differences in how the bandwidth and delay are understood

Bandwidth Delay Product  BDP Types:  BDP1 = C x RTT_max  BDP2 = C x RTT_min  C -> Capacity  BDP3 = A x RTT_max  BDP4 = A x RTT_min  A -> Available bandwidth  BDP5 = BTC x RTT_ave  BTC -> Average throughput of a congestion limited transfer  BDP6 = B_inf  B_inf -> a large value that is always greater than window size
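The spread between these definitions can be large. The short sketch below computes BDP1 to BDP4 for one made-up path; every number in it (capacity, available bandwidth, RTT values) is a hypothetical example value, not a measurement from this work.

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth delay product in bytes for a bandwidth in bits per second."""
    return bandwidth_bps * rtt_seconds / 8


# Hypothetical path characteristics, chosen only for illustration.
capacity_bps = 1e9      # C: link capacity
available_bps = 4e8     # A: available bandwidth
rtt_min = 0.020         # seconds
rtt_max = 0.050         # seconds

bdp_variants = {
    "BDP1 = C x RTT_max": bdp_bytes(capacity_bps, rtt_max),
    "BDP2 = C x RTT_min": bdp_bytes(capacity_bps, rtt_min),
    "BDP3 = A x RTT_max": bdp_bytes(available_bps, rtt_max),
    "BDP4 = A x RTT_min": bdp_bytes(available_bps, rtt_min),
}

if __name__ == "__main__":
    for name, value in bdp_variants.items():
        print(f"{name}: {value / 1e6:.2f} MB")
```

For these example inputs the largest and smallest variants differ by roughly a factor of six, which is why the choice of "bandwidth" and "delay" matters when sizing the buffer.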

Existing Models  Disadvantages of existing optimization techniques ◦ Require modification to the kernel ◦ Rely on tools to take measurements of bandwidth and RTT ◦ Do not consider the effect of cross traffic or congestion created by large buffer sizes  Instead, we can perform sampling and fit a curve to the buffer size graph

Buffer Size Optimization  Throughput becomes stable around 1M buffer size

Combined Optimization

Balancing: Simulations  Simulator: NS-2  Range of different buffer sizes and parallel streams used  Test flows are from Sr1 to Ds1 where cross traffic is from Sr0 to Ds0

1 - No Cross Traffic ‣ Increasing the buffer size pulls back the parallel stream number to smaller values for peak throughput ‣ Further increasing the buffer size causes a drop in the peak throughput value

2 - Non-congesting Cross Traffic ‣ 5 streams of 64KB buffer size as traffic ‣ Similar behavior to the no-traffic case until the capacity is reached ‣ Once congestion starts, the parallel flows win the competition for bandwidth as their stream number keeps increasing

3 - Congesting Cross Traffic ‣ 12 streams of 64KB buffer size traffic ‣ No significant effect of buffer size ‣ As the number of parallel streams increases the throughput increases and cross traffic throughput decreases

Experiments on 10Gbps Network  Approach 1: Tune # of streams first, then buffer size ◦ Optimal stream number is 14 and an average peak of 1.7 Gbps is gained ◦ Optimal buffer size = 256

Experiments on 10Gbps Network  Approach 2: Tune buffer size first, then # of streams ◦ Tuned buffer size for single stream is 1M and a throughput of around 900 Mbps is gained ◦ Applying the parallel stream model, the optimal stream number is 4 and an average of around 2Gbps throughput is gained

Conclusions and Future Work  Tuning buffer size and using parallel streams allow improvement of TCP throughput at the application level  Two mathematical models (Newton’s Method & Full Second Order) give promising results in predicting the optimal number of parallel streams  Early results in combined optimization show that using parallel streams on tuned buffers results in a significant increase in throughput

This work has been sponsored by: NSF and LA BoR For more information Stork: http://www.storkproject.org PetaShare: http://www.petashare.org
