The 18th Summer Course of Behavior Modeling, Final Presentation
Parameter estimation of route choice behavior based on a Markov decision process: evaluation of car/bicycle traffic measures
Evaluation of car/bicycle traffic measures with a link choice model
University of Tokyo, Team A: Takuya Iizuka (M1), Kenta Ishii (M1), Shoma Dehara (M1), Miho Yonezawa (M1)
2 1. Background
◆ Area: Matsuyama City (松山市)
• Population: 512,479 (as of Jan. 1, 2018)
• Area: 429.06 km²
• Many people use private cars.
• City projects are underway to increase activity in the central city. http://udcm.jp/project/
3 2. Basic Analysis
◆ Mode Choice
• Data: Matsuyama PP data (Feb. 19 – Mar. 23, 2007)
• High rate of car & bicycle use
• Car & bicycle paths overlap.
→ By providing bicycle lanes, traffic accidents can be suppressed.
[Pie chart: Representative mode choice in Matsuyama (n = 7,107): Car 53%, Bicycle 17%, Walk 15%, Motorcycle 11%, Train 2%, Bus 1%, Other 1%, Taxi 0%]
※ Only trips for which route information was obtained are included.
4 2. Basic Analysis
◆ Traffic Volume in the Center of Matsuyama
[Maps of car trips and bicycle trips, with Dogo Onsen, City Hall, JR Station, and Center Station marked]
• In most of the central area of Matsuyama, car and bicycle trips are separated.
• On some roads, however, car and bicycle trips overlap.
5 2. Basic Analysis
◆ Car & Bicycle Traffic on Each Link
• The lighter the car traffic on a link, the heavier its bicycle traffic.
• On links with heavy car traffic, sidewalks are well maintained, which increases bicycle traffic.
6 3. Target
◆ For Simulation
• Characteristics of each link (length, width, etc.) affect travelers' behavior.
→ We adopt a link-based route choice model for the analysis.
◆ Our Goal
• To clarify which elements are important in the route choice behavior of car and bicycle users
• To simulate transport policies and verify the sensitivity of each parameter
7 4. Model
◆ Estimation
• Link-based route choice model; behavior model: Recursive Logit (RL) model
• Two different estimation methods are compared: RL estimation and Inverse Reinforcement Learning (IRL)
• Parameters: link length, number of lanes, right-turn dummy
8 4. Model
◆ Sequential Route Choice Model: Recursive Logit model (RL) (Fosgerau et al., 2013)
• Network: $G = (A, V)$, where $A$ is the set of links (including an absorbing destination link $d$) and $V$ is the set of nodes.
• Utility maximization: at each current link $k$, a traveler chooses an action $a \in A(k)$ (the next link). The instantaneous utility is $v(a \mid k) + \mu \varepsilon(a)$, where $\varepsilon(a)$ is an i.i.d. Gumbel error term, $\mu$ is the scale parameter, and $\gamma$ is the discount rate.
• The value function $V^{d}(k)$, the expected downstream utility from the current link to the destination link $d$, is defined by the Bellman equation (Bellman, 1957):
$$V^{d}(k) = \mathbb{E}\left[\max_{a \in A(k)} \left\{ v(a \mid k) + \gamma V^{d}(a) + \mu \varepsilon(a) \right\}\right], \quad \forall k \in A$$
• Link choice probability:
$$P^{d}(a \mid k) = \frac{\exp\!\left(\tfrac{1}{\mu}\left(v(a \mid k) + \gamma V^{d}(a)\right)\right)}{\sum_{a' \in A(k)} \exp\!\left(\tfrac{1}{\mu}\left(v(a' \mid k) + \gamma V^{d}(a')\right)\right)}$$
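The logsum form of this value function and the logit link choice probability can be computed by simple value iteration. Below is a minimal sketch in Python; the names v, out_links, and dest, and the flat integer indexing of links, are illustrative assumptions rather than the course implementation.

```python
import numpy as np

def solve_value_function(v, out_links, dest, gamma=0.47, mu=1.0,
                         tol=1e-8, max_iter=1000):
    """Value iteration for the recursive-logit value function V^d(k).

    v[k][a]      : instantaneous utility of choosing link a after link k
    out_links[k] : list of outgoing links a in A(k)
    dest         : absorbing destination link, with V[dest] = 0
    """
    n = len(out_links)
    V = np.zeros(n)
    for _ in range(max_iter):
        V_new = np.zeros(n)
        for k in range(n):
            if k == dest or not out_links[k]:
                continue
            z = [(v[k][a] + gamma * V[a]) / mu for a in out_links[k]]
            # logsum recursion (expected maximum under Gumbel errors, constant omitted)
            V_new[k] = mu * np.log(np.sum(np.exp(z)))
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

def link_choice_prob(k, a, v, out_links, V, gamma=0.47, mu=1.0):
    """Logit probability P^d(a|k) of choosing link a at current link k."""
    num = np.exp((v[k][a] + gamma * V[a]) / mu)
    den = sum(np.exp((v[k][b] + gamma * V[b]) / mu) for b in out_links[k])
    return num / den
```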
9 4. Comparing IRL with RL
◆ Bellman equation
$$
\begin{aligned}
V^{\pi}(s) &= \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s\right] \\
&= \mathbb{E}_{\pi}\!\left[r_{t+1} + \gamma \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+2} \,\middle|\, s_t = s\right] \\
&= \sum_{a} \pi(s, a) \sum_{s'} \mathcal{P}^{a}_{ss'} \left( \mathcal{R}^{a}_{ss'} + \gamma\, \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+2} \,\middle|\, s_{t+1} = s'\right] \right) \\
&= \sum_{a} \pi(s, a) \sum_{s'} \mathcal{P}^{a}_{ss'} \left( \mathcal{R}^{a}_{ss'} + \gamma V^{\pi}(s') \right)
\end{aligned}
$$
where
• $\gamma$: discount rate ($0 < \gamma \le 1$)
• $\mathcal{R}^{a}_{ss'} = \mathbb{E}\{ r_{t+1} \mid s_t = s, a_t = a, s_{t+1} = s' \}$: expected reward
• $\mathcal{P}^{a}_{ss'} = \Pr\{ s_{t+1} = s' \mid s_t = s, a_t = a \}$: transition probability
• $a \sim \pi(s, a)$
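As a concrete reading of the last line of the derivation, here is a short policy-evaluation sketch in the generic MDP form (not the route choice network itself); the array shapes assumed for pi, P, and R are illustrative.

```python
import numpy as np

def policy_evaluation(pi, P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterative policy evaluation of the Bellman equation above.

    pi[s, a]   : policy, probability of taking action a in state s
    P[s, a, t] : transition probability to state t
    R[s, a, t] : expected reward for that transition
    """
    S = P.shape[0]
    V = np.zeros(S)
    for _ in range(max_iter):
        # V_new[s] = sum_a pi[s,a] * sum_t P[s,a,t] * (R[s,a,t] + gamma * V[t])
        V_new = np.einsum('sa,sat,sat->s', pi, P, R + gamma * V[None, None, :])
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V
```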
10 4. Comparing IRL with RL
◆ Estimation method 1: Recursive Logit model (RL), nested estimation (NPL)
• Reward (instantaneous utility): $r_t = \boldsymbol{\theta}^{\mathsf T} \boldsymbol{x}$
• Algorithm: for the current parameter $\boldsymbol{\theta}$, compute the fixed point of the value function $V$; from it, compute the link choice probabilities and the likelihood; update $\boldsymbol{\theta}$; apply the convergence test $\lVert \boldsymbol{\theta}^{*} - \boldsymbol{\theta} \rVert < \varepsilon$ (together with convergence of $V$). If not converged, repeat; otherwise output the estimated parameter $\boldsymbol{\theta}^{*}$.
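A minimal sketch of this nested loop, reusing solve_value_function and link_choice_prob from the earlier sketch. The data layout (obs as (current link, chosen link) pairs, X as per-link-pair feature vectors) and the use of scipy's optimizer in place of an explicit update rule are assumptions, not the course implementation.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, obs, X, out_links, dest, gamma=0.47):
    """Inner step: for the current theta, build utilities v = theta'x, solve
    the value-function fixed point, and evaluate the choice log-likelihood.

    obs     : list of (current link k, chosen link a) pairs from the data
    X[k][a] : feature vector (link length, right-turn dummy, lanes)
    """
    theta = np.asarray(theta)
    v = [{a: float(theta @ X[k][a]) for a in out_links[k]}
         for k in range(len(out_links))]
    V = solve_value_function(v, out_links, dest, gamma=gamma)  # from the sketch above
    ll = sum(np.log(link_choice_prob(k, a, v, out_links, V, gamma=gamma))
             for k, a in obs)
    return -ll

# Outer step: the optimizer's stopping rule plays the role of the
# convergence test ||theta_new - theta|| < epsilon on the slide.
# theta_hat = minimize(neg_log_likelihood, np.zeros(3),
#                      args=(obs, X, out_links, dest)).x
```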
11 4. Comparing IRL with RL
◆ Estimation method 2: Maximum Entropy Inverse Reinforcement Learning (IRL)
• Reward: $r_t = \boldsymbol{\theta}^{\mathsf T} \boldsymbol{x}$
• Algorithm: reinforcement learning gives the policy ($Q$ value) and reward $r_t$ for the current $\boldsymbol{\theta}$; the log-likelihood $LL$ is evaluated and a convergence test is applied. If not converged, $\boldsymbol{\theta}$ is updated and the loop repeats; otherwise the estimated parameter $\boldsymbol{\theta}^{*}$ is output.
• Problem:
$$\max_{\boldsymbol{\theta}} \sum_{j} \log P(\tau_j \mid \boldsymbol{\theta}) \quad \text{s.t.} \quad R_t = r_t + \gamma R_{t+1}$$
where $\tau_j$ is the path of expert $j$ and $\boldsymbol{x}$ is the feature vector of the link.
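A sketch of the max-entropy IRL update implied by the problem above: the gradient of the log-likelihood is the gap between the expert feature counts and the feature counts expected under the current soft policy. The helper expected_features is hypothetical (it would run soft value iteration and a forward pass under reward r = θᵀx); the learning rate and iteration counts are arbitrary.

```python
import numpy as np

def maxent_irl(theta0, expert_feature_counts, expected_features,
               lr=0.05, n_iter=500, tol=1e-6):
    """Gradient ascent on max_theta sum_j log P(tau_j | theta).

    expert_feature_counts    : average feature counts along the observed expert paths
    expected_features(theta) : hypothetical helper returning the feature counts
                               expected under the soft (logsum) policy for the
                               reward r = theta' x
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        grad = expert_feature_counts - expected_features(theta)  # log-likelihood gradient
        theta += lr * grad
        if np.linalg.norm(grad) < tol:  # convergence test in the flow above
            break
    return theta
```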
12 5. Estimation Result
◆ RL estimation (car), γ = 0.47 (given)
  Variables            Parameters   t-Value
  Link Length             -0.03      -1.33
  Right-Turn              -0.80      -6.49**
  Lanes                    0.37       2.76**
  L(0)                 -1179.29
  LL                   -1147.00
  Rho-Square               0.03
  Adjusted Rho-Square      0.02

◆ IRL estimation (car), γ = 0.47 (given)
  Variables            Parameters   t-Value
  Link Length             -0.07      -9.72**
  Right-Turn              -1.02      -8.53**
  Lanes                   -0.37      -5.64**
  L(0)                 -2080.67
  LL                   -1117.10
  Rho-Square               0.46
  Adjusted Rho-Square      0.46
13 5. Estimation Result
◆ Recursive Logit estimation (bicycle)
  Variables            Parameters   t-Value
  Link Length             -0.00      -6.21**
  Right-Turn              -0.19      -3.67**
  Car Traffic            -14.37      -0.14
  β                        0.00      15.15**
  L(0)                 -4093.90
  LL                   -3861.56
  Rho-Square               0.06
  Adjusted Rho-Square      0.06
14 5. Simulation and Evaluation
◆ Simulation flow: Policy → Network $G = (\text{link}, \text{node}, \text{lane})$ → Car assignment → Car traffic → Bicycle assignment
• Car assignment: $v_{car} = \theta_1 \cdot Length + \theta_2 \cdot RightTurn + \theta_3 \cdot Lanes$
• Bicycle assignment: $v_{bicycle} = \theta_4 \cdot Length + \theta_5 \cdot RightTurn + \theta_6 \cdot CarTraffic$
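A sketch of how the two assignments above could be chained: load the OD demand with the car model first, attach the resulting link volumes as the CarTraffic feature, then load the bicycles. The demand and choice_prob structures are assumptions; choice_prob would come from the estimated RL/IRL model.

```python
def assign(demand, choice_prob, max_steps=200):
    """Load OD demand onto the network with a link choice model (Markov-chain loading).

    demand[(o, d)]    : number of trips from origin link o to destination link d
    choice_prob(k, d) : dict {next link a: probability} from the estimated model
    Returns the expected traffic volume on each link.
    """
    volume = {}
    for (o, d), q in demand.items():
        volume[o] = volume.get(o, 0.0) + q
        current = {o: float(q)}              # expected flow currently on each link
        for _ in range(max_steps):
            nxt = {}
            for k, flow in current.items():
                if k == d or flow < 1e-9:
                    continue
                for a, p in choice_prob(k, d).items():
                    nxt[a] = nxt.get(a, 0.0) + flow * p
                    volume[a] = volume.get(a, 0.0) + flow * p
            if not nxt:
                break
            current = nxt
    return volume

# Two-step policy simulation mirroring the slide:
#   1) car assignment with   v_car     = th1*Length + th2*RightTurn + th3*Lanes
#   2) attach the resulting car volumes as the CarTraffic feature and assign
#      bicycles with         v_bicycle = th4*Length + th5*RightTurn + th6*CarTraffic
```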
15 5. Simulation
◆ Policy: reduce the number of road lanes on links with heavy bicycle traffic
[Map: bicycle traffic by link]
◆ Private car / bicycle users' logsum values with and without the policy
                      With policy (road lanes reduced)   Without policy
  Private car user                -2639                      -2638
  Bicycle user                    -9297                      -1147
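The comparison in the table is a logsum (expected maximum utility) evaluation. A tiny sketch of how such a welfare comparison could be tallied, assuming the value functions V have already been solved for each scenario; the dictionary layout is an assumption.

```python
def total_logsum(V, od_demand):
    """Sum the origin value functions (logsums) over travellers, per scenario.

    V[scenario][(o, d)] : value function of origin link o for destination d,
                          solved on the network with / without the policy
    od_demand[(o, d)]   : number of travellers on that OD pair
    """
    return {scenario: sum(q * V[scenario][od] for od, q in od_demand.items())
            for scenario in V}

# Evaluating total_logsum separately for the car model and the bicycle model,
# with and without the lane-reduction policy, gives a table like the one above.
```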
16 6. Future Works
◆ Policies decided by two-stage optimization
• Decide the policy by calculating the fixed point of the demand of cars and bicycles.
• Flow: policy variables are changed → the policy changes → demand changes → consumer surplus is evaluated.
20 4. Frame & Model
◆ Estimation
• Link-based route choice model; behavior model: RL model
• Two different estimation methods compared: RL estimation and Inverse Reinforcement Learning (IRL)
• Parameters: link length, number of lanes, right-turn dummy
◆ Policy Simulation
• Upper problem (traffic network): reduction of vehicle lanes (pedestrian/bicycle only)
• Lower problem (route choice behavior): assign each OD volume for cars and bicycles → traffic volume of each link