Learning Cloud Dynamics to Optimize Spot Instance Bidding Strategies


  1. Learning Cloud Dynamics to Optimize Spot Instance Bidding Strategies
     Misha Khodak
     Joint with Liang Zheng, Andrew Lan, Carlee Joe-Wong, and Mung Chiang

  2. Overview
     • The popularity of cloud computing services has led to the rise of dual-market pricing schemes:
       • Providers sell some instances at a fixed “on-demand” price.
       • Excess capacity is sold at a variable “spot price” determined via an auction.
     • We propose a nonlinear dynamical system model to understand spot-price behavior in this environment.
     • We verify our model using five months of Amazon EC2 data and demonstrate its potential to inform strategic bidding between heterogeneous cloud resources.

  3. The Amazon EC2 Spot Market
     1. Select the type of instance.
     2. Configure the instance and set a bid price.
     3. Amazon sets a spot price.
     4. The user receives the instance if the bid is above the spot price.

  4. Motivation
     • Spot price dynamics are poorly understood, with past economic modeling focusing mainly on global behavior:
       • Zheng et al. (SIGCOMM 2015) study the spot price distribution at equilibrium.
       • Hoy et al. (WINE 2016) explain the optimality of a two-market design as stemming from variable user risk aversion.
     • Understanding temporal dynamics can better inform strategic bidding.

  5. Spot Price Observations
     • The spot price $\pi_t$ tends to hover above a constant lower-bound price $\underline{\pi}$.
     • It sometimes goes above the on-demand price $\bar{\pi}$. This occurs when on-demand users take up too much capacity.
     [Figure: m3.medium spot price over 5 months in 2017]

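  6. Provider Profit Maximization
     The provider's profit at time $t$ is the sum of the on-demand profit and the spot-market profit:
     $$\underbrace{(\bar{\pi} - \underline{\pi})\, N_t^{(d)}}_{\text{on-demand profit}} + \underbrace{(\pi_t - \underline{\pi})\, N_t^{(s)}}_{\text{spot-market profit}}$$
     where $N_t^{(d)}$ is the number of on-demand instances and $N_t^{(s)}$ is the number of spot-market instances.
     We need a constraint on the number of instances ($N$):
     • Profit-maximizing: $N_t^{(d)} + N_t^{(s)} \le N$
     • Usage-maximizing: $N_t^{(d)} + N_t^{(s)} = N$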

  7. Maximize Profit or Usage?
     Proposition [KZLJC’18]: If $B_t$ bids are drawn independently from a distribution that weakly stochastically dominates the uniform distribution over $[\underline{\pi}, \bar{\pi}]$, then if the provider uses the profit-maximizing constraint we have
     $$\mathbb{P}\left(\pi_t \le \pi(\rho)\right) \le \exp\left(-2\left(\tfrac{1}{2} - 2\rho\right)^2 B_t\right)$$
     for $\rho \in [0, 1/4]$ and $\pi(\rho) = \rho\bar{\pi} + (1 - \rho)\underline{\pi}$.
     So a profit-maximizing provider will not set $\pi_t$ close to $\underline{\pi}$ very often, which contradicts the data and motivates the choice of a usage-maximizing constraint.
     [Figure: time-series plot, Mar. to Aug.; vertical axis from 0.00 to 0.10]
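
To get a feel for how quickly the proposition's bound shrinks, the sketch below evaluates it numerically. It assumes the bound reads exp(−2(1/2 − 2ρ)² B_t) and that π(ρ) interpolates between the lower-bound price and the on-demand price; the function names and the example numbers are illustrative, not taken from the slides.

```python
import math


def low_price_bound(rho: float, num_bids: int) -> float:
    """Upper bound on P(pi_t <= pi(rho)) under the profit-maximizing constraint."""
    if not 0.0 <= rho <= 0.25:
        raise ValueError("rho must lie in [0, 1/4]")
    return math.exp(-2.0 * (0.5 - 2.0 * rho) ** 2 * num_bids)


def price_level(rho: float, pi_low: float, pi_high: float) -> float:
    """pi(rho) = rho * pi_high + (1 - rho) * pi_low."""
    return rho * pi_high + (1.0 - rho) * pi_low


if __name__ == "__main__":
    # Hypothetical bid counts; the bound decays exponentially in B_t.
    for bids in (10, 100, 1000):
        print(bids, low_price_bound(0.1, bids))
```

Even for a moderate number of bids the bound is tiny, which is why prices near the lower bound would be rare under a profit-maximizing provider.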

  8. Observed Spot Price Model
     At time $t$ the cloud provider sees $B_t$ bids, which we model as i.i.d. draws from $U[\underline{\pi}, \bar{\pi}]$ (Zheng et al., 2015). Then in the limit $B_t \to \infty$ the spot price is distributed as
     $$\pi_t = \begin{cases} \underline{\pi} & n_t + b_t \le 1 \text{ (not enough users)} \\ \bar{\pi} - (\bar{\pi} - \underline{\pi})\,\dfrac{1 - n_t}{b_t} + \varepsilon_t & 0 < 1 - n_t < b_t \\ \bar{\pi} & n_t \ge 1 \text{ (too many on-demand users)} \end{cases}$$
     where we define:
     • $n_t = N_t^{(d)} / N$ (on-demand usage)
     • $b_t = B_t / N$ (spot usage)
     • $\varepsilon_t \sim \mathcal{N}\left(0, \sigma^2 / b_t\right)$ (observation noise)
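
The piecewise observation model can be written as a small function. This is a minimal sketch that assumes the noise variance is σ²/b_t and that the middle branch interpolates linearly in (1 − n_t)/b_t; the function name, argument names, and example prices are hypothetical.

```python
import math
import random


def spot_price(n_t, b_t, pi_low, pi_high, sigma=0.0, rng=None):
    """Sample the spot price given normalized on-demand usage n_t and spot bids b_t."""
    rng = rng or random
    if n_t >= 1.0:
        # Too many on-demand users: the price sits at the on-demand price.
        return pi_high
    if n_t + b_t <= 1.0:
        # Not enough users to fill capacity: the price sits at the lower bound.
        return pi_low
    # Here 0 < 1 - n_t < b_t, so b_t > 0 and the division is safe.
    mean = pi_high - (pi_high - pi_low) * (1.0 - n_t) / b_t
    noise = rng.gauss(0.0, sigma / math.sqrt(b_t)) if sigma > 0 else 0.0
    return mean + noise
```

With `sigma=0.0` the function returns the noiseless price, which is continuous at the boundary `n_t + b_t == 1`.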

  9. Job Arrival and Departure
     We model two hidden variables:
     1. $n_t$, the number of running on-demand jobs at time $t$, normalized by $N$
     2. $b_t$, the number of active spot bids at time $t$, normalized by $N$
     At each time step, $\Lambda_t^{(d)}$ on-demand jobs arrive, $\Lambda_t^{(s)}$ spot jobs arrive, $\tilde{\Lambda}_t^{(d)}$ on-demand jobs complete, and $\tilde{\Lambda}_t^{(s)}$ spot jobs complete. This yields the dynamical system
     $$n_{t+1} = n_t + \lambda_t^{(d)} - \tilde{\lambda}_t^{(d)}$$
     $$b_{t+1} = b_t + \lambda_t^{(s)} - \tilde{\lambda}_t^{(s)}$$
     where each $\lambda_t = \Lambda_t / N$ is modeled as i.i.d. draws from an exponential distribution.
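
The arrival/departure recursion is straightforward to simulate. The sketch below draws each normalized arrival and departure from an exponential distribution with its own scale (mean). Clipping the state at zero is an added assumption to keep usage non-negative; the slides do not spell out how the constraint is enforced. The dictionary keys are hypothetical names.

```python
import random


def simulate_hidden_states(n0, b0, scales, steps, seed=0):
    """Simulate (n_t, b_t) for `steps` time steps.

    scales: dict with exponential scale (mean) parameters under the
    hypothetical keys 'arrive_d', 'depart_d', 'arrive_s', 'depart_s'.
    """
    rng = random.Random(seed)

    def draw(key):
        # random.expovariate takes a rate, i.e. 1 / scale.
        return rng.expovariate(1.0 / scales[key])

    n, b = float(n0), float(b0)
    path = [(n, b)]
    for _ in range(steps):
        # Assumption: clip at zero so the normalized counts stay non-negative.
        n = max(0.0, n + draw("arrive_d") - draw("depart_d"))
        b = max(0.0, b + draw("arrive_s") - draw("depart_s"))
        path.append((n, b))
    return path
```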

  10. Combined Model
     Our spot price model is a hidden Markov model (HMM) with hidden state transition governed by the job arrival/departure model (hidden state):
     $$n_{t+1} = n_t + \lambda_t^{(d)} - \tilde{\lambda}_t^{(d)}$$
     $$b_{t+1} = b_t + \lambda_t^{(s)} - \tilde{\lambda}_t^{(s)}$$
     and the spot price distribution (observation):
     $$\pi_t = \begin{cases} \underline{\pi} & n_t + b_t \le 1 \\ \bar{\pi} - (\bar{\pi} - \underline{\pi})\,\dfrac{1 - n_t}{b_t} + \varepsilon_t & 0 < 1 - n_t < b_t \\ \bar{\pi} & n_t \ge 1 \end{cases}$$
     Five model parameters: a scale parameter for each exponentially distributed $\lambda_t$ for job arrival/departure, and the variance $\sigma^2$ of the Gaussian observation noise $\varepsilon_t$.

  11. Parameter Estimation
     • Model parameters and hidden states are jointly estimated using Expectation-Maximization (EM).
     • The E-step is conducted using a sequential Monte Carlo (“particle filter”) approach:
       • Better suited than Kalman-type filters for non-smooth, singular models.
       • Can handle hidden state constraints.
     [Figure: two columns of panels, m3.medium (spring 2017) and g2.2xlarge (spring 2017). Top panels: no. of requests (prop. of N) vs. date. Bottom panels: price in dollars vs. date, with prediction (95% conf.). Date axis: Feb. 19 to Apr. 16.]
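
As an illustration of the filtering part of such an E-step, here is a minimal bootstrap particle filter for the hidden state (n_t, b_t). It is a sketch under several assumptions that are not in the slides: the initial particle cloud is uniform, the state is clipped to stay non-negative, the noise variance is σ²/b_t, and the two flat regions of the observation model are given a small artificial standard deviation `flat_std` so that their likelihood is well defined. The M-step and any smoothing pass are omitted.

```python
import numpy as np


def particle_filter(prices, scales, sigma, pi_low, pi_high,
                    num_particles=500, flat_std=1e-3, seed=0):
    """Return filtered means of (n_t, b_t), shape (len(prices), 2).

    scales: (arrive_d, depart_d, arrive_s, depart_s) exponential scales.
    sigma must be positive.
    """
    rng = np.random.default_rng(seed)
    # Assumed initial particle cloud.
    n = rng.uniform(0.0, 1.0, num_particles)
    b = rng.uniform(0.1, 2.0, num_particles)
    means = np.empty((len(prices), 2))
    for t, price in enumerate(prices):
        # Propagate particles through the arrival/departure dynamics.
        n = n + rng.exponential(scales[0], num_particles) \
              - rng.exponential(scales[1], num_particles)
        b = b + rng.exponential(scales[2], num_particles) \
              - rng.exponential(scales[3], num_particles)
        # Enforce state constraints (assumption: simple clipping).
        n = np.maximum(n, 0.0)
        b = np.maximum(b, 1e-6)
        # Predicted price and noise level for each particle.
        interior = pi_high - (pi_high - pi_low) * (1.0 - n) / b
        in_middle = (n < 1.0) & (n + b > 1.0)
        mean = np.where(n >= 1.0, pi_high, np.where(in_middle, interior, pi_low))
        std = np.where(in_middle, sigma / np.sqrt(b), flat_std)
        # Gaussian log-likelihood weights, normalized stably.
        log_w = -0.5 * ((price - mean) / std) ** 2 - np.log(std)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[t] = (w @ n, w @ b)
        # Multinomial resampling.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        n, b = n[idx], b[idx]
    return means
```

Because the filter only needs to evaluate the observation density pointwise and can reject or clip invalid states, it does not require the smoothness that Kalman-type linearizations rely on.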

  12. Strategic Bidding
     • We consider the setting where we want to start a job immediately.
     • In the single-instance setting, the optimal strategy is to bid the on-demand price, if one can afford it.
     • Instead we can bid within a class of instances by assuming jobs can be easily parallelized.
     [Figure: a family of compute-optimized instances; price scales linearly with resources]

  13. Choosing Between Instance Families
     Given a price $\pi_\tau^{(i)}$ at time $\tau$ for each instance type $i$ in a family $I$ of instances, we wish to minimize the instance cost:
     $$P_\tau^{(i)} = \sum_{t=\tau+1}^{\tau+T_i} \pi_t^{(i)}$$
     for $T_i$ the amount of time it takes to finish a job on type $i$.
     • Solved by using a spot-price model to find $i_\tau = \arg\min_{i \in I} \mathbb{E}\left[P_\tau^{(i)} \mid \pi_\tau^{(i)}\right]$.
     • In experiments we assume jobs that can be run completely in parallel.
     • Can be extended to bidding on jobs that are not perfectly parallel and to strategic bidding across geographic regions.
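
The selection rule can be sketched as follows, assuming some spot-price model has already produced sampled future price paths for each instance type, each path covering the T_i time steps the job would need on that type. The instance names and prices in the usage example are made up for illustration.

```python
from statistics import mean


def instance_cost(price_path):
    """Cost of running one job: the sum of spot prices over its duration."""
    return sum(price_path)


def choose_instance(sampled_paths):
    """Pick the instance type with the lowest estimated expected cost.

    sampled_paths: dict mapping instance type -> list of sampled price
    paths, each of length T_i for that type.
    """
    expected = {
        name: mean(instance_cost(path) for path in paths)
        for name, paths in sampled_paths.items()
    }
    best = min(expected, key=expected.get)
    return best, expected


if __name__ == "__main__":
    # Hypothetical family: the larger type costs more per step but,
    # for a perfectly parallel job, finishes in half the time.
    samples = {
        "small": [[0.03] * 4, [0.04] * 4],
        "large": [[0.05] * 2, [0.06] * 2],
    }
    print(choose_instance(samples))
```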

  14. Expected Instance Cost: Leveraging Our Model
     • Simulation: use learned parameters to compute the expected instance cost by simulating multiple trajectories.
       • Requires a lot of computation for high accuracy.
       • Empirically useful on shorter job lengths.
     • Approximation: approximate the expected instance cost using a second-order Taylor expansion.
       • Cheap to compute.
       • Empirically useful on longer timescales.
       • Assumes the job arrival/departure rates are about the same in both the on-demand and spot market (empirically true).
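
The simulation approach can be sketched as a Monte Carlo estimate that is vectorized over trajectories. The code assumes the hidden-state recursion with exponential arrivals and departures, the piecewise price model with noise variance σ²/b_t, and clipping of the state at zero; all names and the ordering of `scales` are illustrative choices, and the Taylor approximation is not shown.

```python
import numpy as np


def expected_cost(n0, b0, scales, sigma, pi_low, pi_high, horizon,
                  num_paths=2000, seed=0):
    """Monte Carlo estimate of the expected cost over `horizon` time steps.

    scales: (arrive_d, depart_d, arrive_s, depart_s) exponential scales.
    Returns (mean cost, standard error of the mean).
    """
    rng = np.random.default_rng(seed)
    n = np.full(num_paths, float(n0))
    b = np.full(num_paths, float(b0))
    total = np.zeros(num_paths)
    for _ in range(horizon):
        # Advance every trajectory by one step (assumption: clip at zero).
        n = np.maximum(n + rng.exponential(scales[0], num_paths)
                         - rng.exponential(scales[1], num_paths), 0.0)
        b = np.maximum(b + rng.exponential(scales[2], num_paths)
                         - rng.exponential(scales[3], num_paths), 1e-6)
        noise = rng.normal(0.0, 1.0, num_paths) * sigma / np.sqrt(b)
        interior = pi_high - (pi_high - pi_low) * (1.0 - n) / b + noise
        price = np.where(n >= 1.0, pi_high,
                         np.where(n + b <= 1.0, pi_low, interior))
        total += price
    return total.mean(), total.std(ddof=1) / np.sqrt(num_paths)
```

The standard error shrinks only as the inverse square root of `num_paths`, which reflects the slide's point that high accuracy requires a lot of computation.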

  15. Expected Instance Cost: Linear Auto-Regression
     • Baseline AR(p) model: the price at each time step is some noisy linear combination of the price at the p previous time steps.
     • The data does not satisfy the standard assumptions of Gaussian errors and uncorrelated residuals.
     [Figure: residual diagnostic plots for AR(1) and AR(17)]
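
For reference, an AR(p) baseline of this kind can be fit by ordinary least squares. The sketch below includes an intercept term, which is an assumption since the slide does not say whether one was used; the function names are hypothetical.

```python
import numpy as np


def fit_ar(prices, p):
    """Least-squares fit of x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p} + noise.

    Returns (coef, residuals) with coef = [c, a_1, ..., a_p].
    """
    x = np.asarray(prices, dtype=float)
    # Column k holds x_{t-k} for t = p, ..., len(x) - 1.
    lags = np.column_stack([x[p - k: len(x) - k] for k in range(1, p + 1)])
    design = np.column_stack([np.ones(len(x) - p), lags])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    residuals = x[p:] - design @ coef
    return coef, residuals


def predict_next(prices, coef):
    """One-step-ahead prediction from the most recent p prices."""
    p = len(coef) - 1
    recent = np.asarray(prices[-p:], dtype=float)[::-1]  # x_{t-1}, ..., x_{t-p}
    return coef[0] + recent @ coef[1:]
```

The returned residuals can be inspected (for example with a histogram or an autocorrelation plot) to check the Gaussian and uncorrelated-error assumptions mentioned above.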

  16. Evaluating Bidding Strategies
     • The performance of each bidding strategy is evaluated using regret: the difference between the cost of the chosen action and that of the best action in hindsight.
     • Model-based methods succeed especially well on shorter-term (e.g. 16-hour) jobs.
     [Figure: regret (US cents) by instance type (m3, c3, r3, i3, g2) for Monte Carlo, Approximation, and Linear Auto-Regression. Panels: 16 Hours (g2 shown as /20) and 64 Hours (g2 shown as /100).]
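
Regret as defined here is simple to compute once the realized cost of every candidate action is known in hindsight. A minimal sketch, with made-up costs in the usage example:

```python
def regret(realized_costs, chosen):
    """Cost of the chosen action minus the cost of the best action in hindsight.

    realized_costs: dict mapping each candidate action (e.g. an instance
    type) to the cost it actually incurred.
    """
    return realized_costs[chosen] - min(realized_costs.values())


if __name__ == "__main__":
    # Hypothetical realized costs in US cents.
    costs = {"m3": 1.2, "c3": 0.9, "r3": 1.5}
    print(regret(costs, "m3"))
```

Regret is zero exactly when the strategy picked the cheapest action, and it is never negative.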

  17. Performance and Volatility
     • Model is consistently better than AR on more volatile instances:
       • More monetary gain to be had from strategic bidding.
       • More overlap between the realized cost distributions of different instances.
     [Figure: Payment Distribution for Different Instances. Panels: Non-Volatile Instances (m3, c3, r3) and Volatile Instances (g2, i3); vertical axis: probability density.]

  18. Summary
     • We model spot pricing in cloud computing as a nonlinear dynamical system.
     • Amazon EC2 data was used to analyze the problem and learn model parameters.
     • We describe strategies for strategic bidding between instances that can make use of the model.

  19. Open Questions
     • How do we examine the problem in the setting where dynamics can be influenced across instance families or different regions?
     • Can we provide a model for job departure that explicitly depends on the recent job arrival random variables?
     • Can we devise more sophisticated bidding strategies that require lighter assumptions concerning job parallelism?

  20. Thank you! Questions? Contact e-mail: mkhodak@princeton.edu
