Topics in Computational Sustainability CS 325 Spring 2016 Making Choices: Stochastic Optimization
Introduction • Stochastic programming is a modeling framework for optimization problems that involve uncertainty. • Real-world problems almost invariably include some unknown and uncertain parameters (e.g., how much energy the wind farm will actually produce, the actual costs of a project, etc.) • Probability distributions over the uncertain parameters are known or can be estimated from data (machine learning).
Introduction Example • A farmer can plant the land with corn, soy, or beans. • For simplicity, assume that the season will be either wet or dry. • If it is wet, corn is the most profitable. • If it is dry, soy is the most profitable.
Profit by season and planting choice:

          All Corn   All Soy   All Beans
  Wet        100        70        80
  Dry        -10        40        35

• If it is wet, corn is the most profitable → plant all corn • If it is dry, soy is the most profitable → plant all soy
Assume the probability of a wet season is $p$. The expected profit of planting each crop is:
  Corn:  $100p + (-10)(1-p) = -10 + 110p$
  Soy:   $70p + 40(1-p) = 40 + 30p$
  Beans: $80p + 35(1-p) = 35 + 45p$
What is the answer? Suppose p = 0.5, can anyone suggest a planting plan? – If it is wet, corn is the most profitable → plant all corn – If it is dry, soy is the most profitable → plant all soy – Plant 1/2 corn, 1/2 soy? Expected profit: 0.5 (-10 + 110(0.5)) + 0.5 (40 + 30(0.5)) = 50. Is this optimal?
Optimal strategy Suppose p = 0.5, can anyone suggest a planting plan? Plant all beans! Expected Profit: 35 + 45(0.5) = 57.5!
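The comparison between the half-corn, half-soy plan and the all-beans plan can be checked numerically. The following is a minimal sketch that uses the profit table from the slides; the names `PROFIT` and `expected_profit` are illustrative and do not come from the lecture.

```python
# Profit table from the slides: crop -> (profit if wet, profit if dry)
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def expected_profit(plan, p_wet):
    """Expected profit of a planting plan.

    plan maps crop name -> fraction of land planted with that crop.
    p_wet is the probability of a wet season.
    """
    total = 0.0
    for crop, fraction in plan.items():
        wet, dry = PROFIT[crop]
        total += fraction * (p_wet * wet + (1 - p_wet) * dry)
    return total


# "Average" of the two scenario-optimal decisions vs. the all-beans plan
half_half = expected_profit({"corn": 0.5, "soy": 0.5}, p_wet=0.5)
all_beans = expected_profit({"beans": 1.0}, p_wet=0.5)
print(half_half, all_beans)
```

At p = 0.5 the mixed plan evaluates to 50 and the all-beans plan to 57.5, matching the figures on the slides.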
What Did We Learn ? • Averaging Solutions Doesn’t Work! • The best decision for today, when faced with a number of different outcomes for the future, is not equal to the “average” of the decisions that would have been best for each specific future outcome.
Discrete random variables
A discrete random variable $Z$ is described by the probability masses of all elementary events: it takes the values $z_1, z_2, \ldots, z_K$ with probabilities $p_1, p_2, \ldots, p_K$ such that
$$p_1 + p_2 + \cdots + p_K = 1.$$
Discrete random variables
If the probability measure is discrete, the expected value of $Z$ is the sum
$$E[Z] = \sum_{i=1}^{K} z_i \, p_i.$$
Example: if $Z$ represents the outcome of a die,
$$E[Z] = 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6 = 3.5.$$
Similarly, given a function $f$,
$$E[f(Z)] = \sum_{i=1}^{K} f(z_i) \, p_i.$$
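The two sums above translate directly into code. This is a small sketch using exact fractions; the helper name `expectation` is an assumption made for illustration.

```python
from fractions import Fraction


def expectation(values, probs, f=lambda z: z):
    """Return E[f(Z)] = sum_i f(z_i) * p_i for a discrete random variable."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(f(z) * p for z, p in zip(values, probs))


faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6          # fair die: each face has probability 1/6

mean = expectation(faces, probs)                            # E[Z]
mean_square = expectation(faces, probs, f=lambda z: z * z)  # E[Z^2]
print(mean, mean_square)
```

Passing a different `f` gives the expected value of any function of the die outcome without changing the summation.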
Continuous random variables
The expected value of a random function $f(x, Z)$ is the integral
$$F(x) = E[f(x, Z)] = \int f(x, z) \, p(z) \, dz,$$
where $p(z)$ is the density function of the continuous random variable $Z$, so that
$$\int p(z) \, dz = 1.$$
Stochastic Programming • Unconstrained stochastic programming problem:
$$\min_{x \in X} F(x) = E[f(x, Z)] = \int f(x, z) \, p(z) \, dz$$
Here the set $X$ specifies which solutions are feasible (e.g., through constraints)
Example
Crop yield optimization: $Z$ is a binary random variable: wet season ($Z = 1$, with probability $p$) or dry season ($Z = 0$, with probability $1 - p$). With $x_1, x_2, x_3$ the fractions of land planted with corn, soy, and beans, the profit is
$$f(x_1, x_2, x_3, Z) = x_1 \big(100 Z - 10 (1 - Z)\big) + x_2 \big(70 Z + 40 (1 - Z)\big) + x_3 \big(80 Z + 35 (1 - Z)\big).$$
The expected value is
$$F(x_1, x_2, x_3) = E[f(x_1, x_2, x_3, Z)] = p \, (100 x_1 + 70 x_2 + 80 x_3) + (1 - p) \, (-10 x_1 + 40 x_2 + 35 x_3).$$
Stochastic programming example
Crop yield optimization problem. Given $p$,
$$\text{maximize} \quad F(x_1, x_2, x_3) = x_1 \big(100 p - 10 (1 - p)\big) + x_2 \big(70 p + 40 (1 - p)\big) + x_3 \big(80 p + 35 (1 - p)\big)$$
$$\text{subject to} \quad x_1 \ge 0, \; x_2 \ge 0, \; x_3 \ge 0, \; x_1 + x_2 + x_3 = 1.$$
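For a fixed p this problem is a linear program, so an optimal solution can be found at a vertex of the feasible set, that is, by planting a single crop. The sketch below relies on that observation instead of calling an LP solver, and assumes the land fractions are nonnegative and sum to 1. The function names are illustrative.

```python
# Profit table from the slides: crop -> (profit if wet, profit if dry)
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def expected_profit_per_crop(p_wet):
    """Objective coefficients: expected profit of planting all land with each crop."""
    return {crop: p_wet * wet + (1 - p_wet) * dry
            for crop, (wet, dry) in PROFIT.items()}


def solve_crop_lp(p_wet):
    """Maximize a linear objective over the simplex by picking the best vertex."""
    coeffs = expected_profit_per_crop(p_wet)
    best = max(coeffs, key=coeffs.get)
    plan = {crop: (1.0 if crop == best else 0.0) for crop in coeffs}
    return plan, coeffs[best]


for p in (0.2, 0.5, 0.9):
    plan, value = solve_crop_lp(p)
    print(p, plan, value)
```

Comparing the coefficients shows where the optimal crop switches: soy is best for p below 1/3, beans for p between 1/3 and 9/13, and corn above 9/13. Because the best coefficient is always positive here, relaxing the sum constraint to "at most 1" would give the same answer.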
Stochastic Programming • Unconstrained stochastic programming problem:
$$\min_{x \in X} F(x) = E[f(x, Z)] = \int f(x, z) \, p(z) \, dz$$
How to solve? – Might not be possible to evaluate the integral in closed form – Computationally hard to evaluate
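Sample average approximation
• Instead of solving
$$\min_{x \in S} f(x) = E[g(x, Z)] = \int g(x, z) \, p(z) \, dz$$
• Sample $z_1, z_2, \ldots, z_N$ according to $p(z)$ and solve
$$\min_{x \in S} \hat{f}_N(x) = \frac{1}{N} \sum_{j=1}^{N} g(x, z_j)$$
• $\hat{f}_N$ is the sample average function – Its expected value is $f(x)$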
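As an illustration, sample average approximation can be applied to the crop example by treating the negative profit as the cost g(x, z) and restricting x to the three single-crop plans (sufficient here because the objective is linear in x). This is a sketch under those assumptions; the sample size, seed, and function names are arbitrary choices.

```python
import random

# Profit table from the slides: crop -> (profit if wet, profit if dry)
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def sample_seasons(n, p_wet, rng):
    """Draw n seasons: 1 = wet (probability p_wet), 0 = dry."""
    return [1 if rng.random() < p_wet else 0 for _ in range(n)]


def sample_average_cost(crop, seasons):
    """Sample average of the cost g(x, z) = -profit for an all-`crop` plan."""
    wet, dry = PROFIT[crop]
    return sum(-(wet * z + dry * (1 - z)) for z in seasons) / len(seasons)


def solve_saa(seasons):
    """Minimize the sample average cost over the single-crop plans."""
    return min(PROFIT, key=lambda crop: sample_average_cost(crop, seasons))


rng = random.Random(0)                      # fixed seed for reproducibility
seasons = sample_seasons(10_000, p_wet=0.5, rng=rng)
best_crop = solve_saa(seasons)
estimated_profit = -sample_average_cost(best_crop, seasons)
print(best_crop, estimated_profit)
```

With this many samples the empirical wet-season frequency is close to 0.5, so the SAA solution agrees with the exact answer (all beans, expected profit near 57.5).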
Statistics break
SAA gives a lower bound
Proof
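One standard argument for the lower bound runs as follows; it is offered as a sketch and is not necessarily the derivation shown on the original slide. Here $\hat{f}_N$ denotes the sample average function built from $N$ samples, $S$ the feasible set, and $f(x) = E[\hat{f}_N(x)]$ the true objective.

```latex
% For any fixed feasible point x' in S, the minimum is no larger than the value at x':
\min_{x \in S} \hat{f}_N(x) \;\le\; \hat{f}_N(x')
% Taking expectations over the samples, and using E[\hat{f}_N(x')] = f(x'):
E\Big[\min_{x \in S} \hat{f}_N(x)\Big] \;\le\; E\big[\hat{f}_N(x')\big] \;=\; f(x')
% The left-hand side does not depend on x', so minimizing the right-hand side over x' gives
E\Big[\min_{x \in S} \hat{f}_N(x)\Big] \;\le\; \min_{x' \in S} f(x')
```

In words: on average, the optimal value of the sampled problem is no larger than the optimal value of the true minimization problem.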
Estimating the lower bound
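A common way to estimate the bound is to solve several independent SAA problems and average their optimal values. The sketch below does this for the crop example, written as minimization of negative profit over single-crop plans, and compares the estimate with the exact expectation, which can be computed here because the number of wet seasons is binomial. The sample size N = 5, the number of replications, and all function names are assumptions made for illustration.

```python
import random
from math import comb

# Profit table from the slides: crop -> (profit if wet, profit if dry)
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def saa_optimal_value(n_wet, n):
    """Optimal value of the SAA problem (min of average negative profit)
    when n_wet of the n sampled seasons are wet."""
    return min(-(wet * n_wet + dry * (n - n_wet)) / n
               for wet, dry in PROFIT.values())


def true_optimal_value(p_wet):
    """Optimal value of the true problem: min over crops of expected negative profit."""
    return min(-(p_wet * wet + (1 - p_wet) * dry) for wet, dry in PROFIT.values())


def expected_saa_value(n, p_wet):
    """Exact E[SAA optimal value], summing over the binomial number of wet seasons."""
    return sum(comb(n, k) * p_wet**k * (1 - p_wet)**(n - k) * saa_optimal_value(k, n)
               for k in range(n + 1))


def estimate_lower_bound(n, p_wet, replications, rng):
    """Monte Carlo estimate: average SAA optimal value over independent replications."""
    total = 0.0
    for _ in range(replications):
        n_wet = sum(rng.random() < p_wet for _ in range(n))
        total += saa_optimal_value(n_wet, n)
    return total / replications


rng = random.Random(0)
true_value = true_optimal_value(0.5)
exact_bound = expected_saa_value(5, 0.5)
estimated_bound = estimate_lower_bound(5, 0.5, replications=2000, rng=rng)
print(true_value, exact_bound, estimated_bound)
```

For N = 5 and p = 0.5 the exact expected SAA value is -59.6875, below the true optimum of -57.5; the gap shrinks as N grows.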
Application • Food security is a global issue – 7.4 Billion people to feed; 21 Million newborns in the past year • Ways to improve production of crops – Increase arable land – Improve yield with technology • 2016 Syngenta Yield Prediction challenge: select best (soy) varieties to improve yield: – Use knowledge of soil/regional data – Understand the uncertainty due to weather/climate – Around 34000 data points, 80 varieties
Our hierarchical model Li, Zhong, Lobell, Ermon: first prize out of 130 teams
Dealing with uncertainty • Two sources of uncertainty – Weather – Errors in variety yield prediction • It is hard to fit a parametric model • Solution: sample from historical data – historical weather distribution at the site of interest – Errors from our yield prediction (non-Gaussian)
Hedging risk • Which one would you pick? – 100 dollars with probability 0.5, nothing with probability 0.5 – 50 dollars with probability 1 • Different criteria. Choose a mix of varieties to – Maximize expected yield minus variance – Maximize expected yield, subject to small variance – Maximize the yield that you can achieve with probability at least 95%
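The three criteria can be compared on the two options from the slide. This is a minimal sketch: the risk-aversion weight `risk_weight` is a hypothetical parameter, and "the yield you can achieve with probability at least 95%" is read as the largest value v such that the payoff is at least v with probability at least 0.95.

```python
def mean(lottery):
    """Expected payoff of a lottery given as a list of (value, probability) pairs."""
    return sum(value * prob for value, prob in lottery)


def variance(lottery):
    """Variance of the payoff."""
    mu = mean(lottery)
    return sum(prob * (value - mu) ** 2 for value, prob in lottery)


def mean_minus_variance(lottery, risk_weight):
    """Criterion 1: expected payoff penalized by risk_weight times the variance."""
    return mean(lottery) - risk_weight * variance(lottery)


def guaranteed_level(lottery, confidence=0.95):
    """Criterion 3: largest v with P(payoff >= v) >= confidence."""
    candidates = sorted({value for value, _ in lottery}, reverse=True)
    for v in candidates:
        if sum(prob for value, prob in lottery if value >= v) >= confidence:
            return v
    raise ValueError("probabilities do not reach the requested confidence")


gamble = [(100, 0.5), (0, 0.5)]   # 100 dollars or nothing, equally likely
sure = [(50, 1.0)]                # 50 dollars for certain

print(mean(gamble), mean(sure))                          # same expected value
print(variance(gamble), variance(sure))                  # very different risk
print(guaranteed_level(gamble), guaranteed_level(sure))  # 95% guaranteed payoff
```

Both options have an expected payoff of 50, so a pure expected-value criterion cannot separate them; any positive variance penalty, a variance constraint, or the 95% criterion favors the sure 50 dollars.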
Results
Maximizing the Spread of Cascades Using Network Design with application to spatial conservation planning Daniel Sheldon, Bistra Dilkina, Adam Elmachtoub, Ryan Finseth, Ashish Sabharwal, Jon Conrad, Carla P. Gomes, David Shmoys Institute for Computational Sustainability Cornell University and Oregon State University Will Allen, Ole Amundsen, Buck Vaughan The Conservation Fund
Spatial Conservation Planning • What is the best land acquisition and management strategy to support the recovery of the Red-Cockaded Woodpecker (RCW)? • Federally listed rare and endangered species
RCW 101
• Cooperative breeders – small family groups – well-defined territories or patches – centered around cluster of cavity trees
• Cavities! – One for each bird – Live, old-growth pine (80+ years old) – 2-10 years to excavate – Extensively reused
• Habitat requirements in conflict with modern land-use – 30-year timber rotation – Development
• Management – Habitat restoration and preservation – Artificial cavities
Problem Setup [Figure: map showing available parcels, conserved parcels, current territories, and potential territories] Given limited budget, what parcels should I conserve to maximize the expected number of occupied territories in 50 years?
Metapopulation Model • Model for population dynamics in fragmented landscape – Territories are occupied or unoccupied in each time step – Two types of stochastic events: • Local extinction: occupied -> unoccupied • Colonization: unoccupied -> occupied (from neighbor) [Figure: territory occupancy at Time 1 and Time 2]
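The two event types can be simulated one time step at a time. The sketch below is a simplified illustration, not the model used in the cited work: the extinction probability, the per-neighbor colonization probability, and the small path-shaped landscape are all hypothetical.

```python
import random


def step(occupied, neighbors, p_extinct, p_colonize, rng):
    """Advance the metapopulation by one time step.

    occupied   : set of currently occupied territories
    neighbors  : dict mapping each territory to a list of neighboring territories
    p_extinct  : probability that an occupied territory becomes unoccupied
    p_colonize : probability that one occupied neighbor colonizes an empty territory
    """
    next_occupied = set()
    for territory, nearby in neighbors.items():
        if territory in occupied:
            # Local extinction: occupied -> unoccupied with probability p_extinct
            if rng.random() >= p_extinct:
                next_occupied.add(territory)
        else:
            # Colonization: each occupied neighbor makes an independent attempt
            if any(n in occupied and rng.random() < p_colonize for n in nearby):
                next_occupied.add(territory)
    return next_occupied


def simulate(initial, neighbors, p_extinct, p_colonize, steps, rng):
    """Run the dynamics for a number of steps and return the final occupied set."""
    occupied = set(initial)
    for _ in range(steps):
        occupied = step(occupied, neighbors, p_extinct, p_colonize, rng)
    return occupied


# Hypothetical landscape: three territories in a row, a - b - c
landscape = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
rng = random.Random(0)
final = simulate({"a"}, landscape, p_extinct=0.1, p_colonize=0.3, steps=50, rng=rng)
print(sorted(final))
```

Averaging the size of the final occupied set over many runs gives a Monte Carlo estimate of the expected number of occupied territories after the chosen horizon.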
Network Cascades • Models for diffusion in (social) networks – Spread of information, behavior, disease, etc. – E.g.: suppose each individual passes rumor to friends independently with probability ½
Note: “activated” nodes are those reachable by red edges
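The rumor example can be simulated directly: each newly activated individual gets one chance to pass the rumor to each friend, succeeding independently with the given probability. This is a minimal sketch; the friendship graph, the seed individual, and the function names are hypothetical.

```python
import random
from collections import deque


def simulate_cascade(friends, seeds, p_pass, rng):
    """Simulate one cascade and return the set of activated individuals.

    friends : dict mapping each individual to a list of friends
    seeds   : individuals who know the rumor initially
    p_pass  : probability that the rumor is passed along any one friendship link
    """
    active = set(seeds)
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        for friend in friends.get(person, []):
            # One independent attempt per link from a newly activated person
            if friend not in active and rng.random() < p_pass:
                active.add(friend)
                queue.append(friend)
    return active


def expected_spread(friends, seeds, p_pass, runs, rng):
    """Monte Carlo estimate of the expected number of activated individuals."""
    return sum(len(simulate_cascade(friends, seeds, p_pass, rng))
               for _ in range(runs)) / runs


# Hypothetical friendship network
network = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": ["eve"],
    "eve": [],
}
rng = random.Random(0)
print(expected_spread(network, ["ann"], p_pass=0.5, runs=5000, rng=rng))
```

Skipping the coin flip for friends who are already active does not change the outcome, so the activated set has the same distribution as the set of nodes reachable through independently sampled "live" links.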
Metapopulation = Cascade • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time [Figure: layered graph with one copy of each patch i, j, k, l, m at each time step 1 to 5; the initially occupied territories are marked in the first layer]