
Estimation of Airline Itinerary Choice Models Using Disaggregate Ticket Data



  1. Estimation of Airline Itinerary Choice Models Using Disaggregate Ticket Data
     Laurie Garrow
     with Matthew Higgins, GA Tech; Virginie Lurkin* and Michael Schyns, University of Liege
     Northwestern University, Evanston, IL, October 2015

  2. My Research
     • Discrete choice / Aviation demand modeling
     • Urban travel
     • Big data analytics

  3. My Background
     Leadership:
     • President, AGIFORS
     • Former President, INFORMS Transportation Science and Logistics
     • Former Board Member, INFORMS Revenue Management and Pricing
     • Former Chair, INFORMS Aviation Applications Section, 2011-12
     • Former Co-Chair, Emerging Methods, TRB Travel Demand, 2007-12
     Teaching:
     • Discrete choice analysis, demand modeling (CEE graduate)
     • Advanced statistical programming (CEE graduate)
     • Revenue management and pricing (MBA)
     • Civil engineering systems, probability (CEE undergraduate)
     Ongoing Industry and Government Collaborations:
     • Boeing, American, Sabre, Airlines Reporting Corporation, …
     • Parsons Brinckerhoff, AirSage, Epsilon, Georgia DOT

  4. Research Philosophy

  5. Research Portfolio: http://garrowlab.ce.gatech.edu/

  6. Outline: (1) Review of network planning models and problem motivation; (2) Research objectives; (3) Data; (4) Methodology; (5) Results; (6) Future research

  7. Network Planning Models
     1. Are used to forecast schedule profitability
     2. Support many decisions, such as where to fly, when to fly, what equipment to use/purchase, which airlines/flights to codeshare with, etc.
     3. Contain multiple sub-modules

  8. Network Planning Sub-Models
     Figure (labels only): Forecasts; Sub-Models; callout "Our focus". Reference: Garrow 2010, Figure 7.1.

  9. Quality of Service Index (QSI)
     QSI models were developed in 1957 and can be thought of in terms of ratios:
     $$\mathrm{QSI}_i = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4, \quad \text{or} \quad \mathrm{QSI}_i = (\beta_1 X_1)(\beta_2 X_2)(\beta_3 X_3)(\beta_4 X_4).$$
     $$S_i = \frac{\mathrm{QSI}_i}{\sum_{j \in J} \mathrm{QSI}_j}$$
     where $\beta$ are preference weights, $X$ are quality measures (e.g., # stops, fare, carrier, equipment type), and $i, j$ are indices for itineraries.
     Limitations (… 2004, Management Science):
     • $\beta$ are usually not estimated
     • QSI models don't incorporate competitive factors
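The ratio idea on the QSI slide can be sketched in a few lines. This is a minimal illustration that assumes the additive form of the index; the weights, quality measures, and the names `additive_qsi` and `qsi_shares` are hypothetical and do not come from the deck.

```python
def additive_qsi(weights, measures):
    """Additive QSI score: the weighted sum of an itinerary's quality measures."""
    return sum(w * x for w, x in zip(weights, measures))


def qsi_shares(scores):
    """Share of each itinerary: its QSI score divided by the sum over all itineraries."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("QSI scores must sum to a positive value")
    return {itinerary: score / total for itinerary, score in scores.items()}


# Hypothetical preference weights and quality measures (not estimated values).
weights = [1.0, 0.5, 0.2, 0.1]
measures = {"A": [1.0, 2.0, 0.0, 1.0], "B": [0.5, 1.0, 1.0, 0.0]}

scores = {name: additive_qsi(weights, x) for name, x in measures.items()}
shares = qsi_shares(scores)
```

Because the shares are simple ratios, they always sum to one, and changing one itinerary's score shifts every other itinerary's share proportionally.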

  10. Itinerary Choice Model
     Outbound itineraries from ATL-ORD: AA 101, AA 946, DL 457, UA 147/UA 229
     $$U_i = V_i + \varepsilon_i, \qquad V_i = \beta_1\,\text{cost}_i + \beta_2\,\text{time}_i + \dots$$
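To make the utility specification concrete, the sketch below turns linear utilities into choice probabilities. It assumes a multinomial logit (MNL) form, which the deck mentions later when discussing parameter estimates; the coefficients, costs, and travel times are invented for illustration.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit: P_i = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {k: math.exp(v - v_max) for k, v in utilities.items()}
    denom = sum(exp_v.values())
    return {k: e / denom for k, e in exp_v.items()}


# Hypothetical coefficients on cost (dollars) and elapsed time (minutes).
B_COST = -0.01
B_TIME = -0.02

# Hypothetical (cost, time) pairs for the ATL-ORD itineraries named on the slide.
itineraries = {
    "AA 101": (300.0, 125.0),
    "AA 946": (280.0, 130.0),
    "DL 457": (320.0, 120.0),
    "UA 147/UA 229": (250.0, 260.0),
}

utilities = {k: B_COST * cost + B_TIME * time for k, (cost, time) in itineraries.items()}
probs = mnl_probabilities(utilities)
```

With these made-up numbers the connecting itinerary is cheapest but much slower, so it receives the smallest probability.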

  11. Factors Influencing Itinerary Choice

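  12. The Fundamental Problem
     Observed points on the demand and supply curves: 100 pax at $500, 120 pax at $700, 40 pax at $120
     $$\text{demand} = \beta \times \text{price} + \dots + \varepsilon, \qquad \beta = +0.14$$
     Regressing demand on price without accounting for supply gives a positive price coefficient.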
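The sign problem can be reproduced with ordinary least squares on the three points shown on the slide. Pairing the passenger counts and prices in the order they are listed is an assumption, and `ols_slope` is a hypothetical helper name.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Points from the slide, paired in listed order (an assumption).
prices = [500.0, 700.0, 120.0]
demand = [100.0, 120.0, 40.0]

slope = ols_slope(prices, demand)  # positive: higher prices appear to raise demand
```

The fitted slope is positive because high prices and high demand occur together in the observed data, which is the endogeneity issue the later slides address.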

  13. Outline: (1) Review of network planning models and problem motivation; (2) Research objectives; (3) Data; (4) Methodology; (5) Results; (6) Future research

  14. Research Objectives
     1. Use ticketing data from Airlines Reporting Corporation (ARC) to generate itineraries and estimate choice models
     2. Estimate models that account for price endogeneity

  15. Outline: (1) Review of network planning models and problem motivation; (2) Research objectives; (3) Data; (4) Methodology; (5) Results; (6) Future research

  16. Data
     1. ARC ticketing data for May 2013 departures
     2. Restrict analysis to Continental U.S. markets
     3. Include simple one-way and round-trip tickets with at most 2 connections
     4. Eliminate tickets with fares < $50 (employee and frequent flyers) or in the top 0.1% (charter flights)
     5. More than 9.6 million tickets meet these criteria
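The fare and connection screens listed on the data slide could look roughly like this. It is only a sketch: the record layout (`fare`, `connections`), the function name `screen_tickets`, and the way the top 0.1% cutoff is computed are assumptions, and the market and ticket-type restrictions are left out.

```python
def screen_tickets(tickets, min_fare=50.0, top_share=0.001, max_connections=2):
    """Keep tickets with fare >= min_fare, outside the top fare tail, and few connections."""
    if not tickets:
        return []
    fares = sorted(t["fare"] for t in tickets)
    n_drop = int(len(fares) * top_share)  # number of highest fares to drop
    cutoff = fares[-(n_drop + 1)]         # highest fare that is still kept
    return [
        t for t in tickets
        if min_fare <= t["fare"] <= cutoff and t["connections"] <= max_connections
    ]


# Synthetic tickets: fares 100..1099 with 0-3 connections, plus one $20 ticket.
tickets = [{"fare": 100.0 + i, "connections": i % 4} for i in range(1000)]
tickets.append({"fare": 20.0, "connections": 0})

kept = screen_tickets(tickets)
```

Ties at the cutoff fare are kept in this version; a production filter would need an explicit rule for them.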

  17. Explanatory Variables
     Carrier characteristics:
     • Carrier preferences
     • Marketing relationships
     • Airport share
     Itinerary characteristics:
     • Price
     • Departure time of day preferences
     • Elapsed time
     • Number of connections
     • Short connection (<60 minutes) indicator
     • Direct flight indicator

  18. Marketing Relationships
     Figure (labels only): Marketing carrier, Operating carrier; US 102, US 5992; SEA, PHX, DFW; AA 1840, US 102
     • Online = Same marketing and operating carrier on all legs
     • Codeshare = Same marketing carrier, different operating carrier
     • Interline = Different marketing carriers, different operating carriers
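The three definitions above map naturally onto a small classifier. The sketch assumes each leg is given as a (marketing carrier, operating carrier) pair and treats any itinerary with more than one marketing carrier as interline; the function name and input layout are hypothetical.

```python
def marketing_relationship(legs):
    """Classify an itinerary from its (marketing_carrier, operating_carrier) legs."""
    marketing = {m for m, _ in legs}
    operating = {o for _, o in legs}
    if len(marketing) > 1:
        return "interline"   # different marketing carriers across legs
    if operating == marketing:
        return "online"      # one carrier markets and operates every leg
    return "codeshare"       # one marketing carrier, another carrier operates a leg


online = marketing_relationship([("US", "US"), ("US", "US")])
codeshare = marketing_relationship([("US", "US"), ("US", "AA")])
interline = marketing_relationship([("US", "US"), ("AA", "AA")])
```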

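  19. Airport Share
     Figure (labels only): Outbound, DFW, SEA, Inbound
     $$\text{OB Share}_k = \frac{\#\,\text{weekly flights}_k^{ORG}}{\sum_{k'=1}^{K} \#\,\text{weekly flights}_{k'}^{ORG}}, \qquad k = \text{operating carrier}$$
     $$\text{IB Share}_k = \frac{\#\,\text{weekly flights}_k^{DST}}{\sum_{k'=1}^{K} \#\,\text{weekly flights}_{k'}^{DST}}$$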
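As a small worked example of the airport share variable, the sketch below divides one carrier's weekly flights at an airport by the total across carriers. The flight counts are invented, and `airport_shares` is a hypothetical helper.

```python
def airport_shares(weekly_flights_by_carrier):
    """Each operating carrier's share of weekly flights at one airport."""
    total = sum(weekly_flights_by_carrier.values())
    if total == 0:
        raise ValueError("airport has no weekly flights")
    return {carrier: n / total for carrier, n in weekly_flights_by_carrier.items()}


# Hypothetical weekly flight counts at the origin and destination airports.
origin_flights = {"AA": 600, "US": 150, "DL": 250}
destination_flights = {"AS": 500, "DL": 300, "AA": 200}

ob_share = airport_shares(origin_flights)       # outbound share uses the origin airport
ib_share = airport_shares(destination_flights)  # inbound share uses the destination airport
```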

  20. Price
     “Business” prices: average price for First, Business, and Unrestricted Coach fares
     “Leisure” prices: average price for Restricted Coach and Other fares
     • Average is taken by origin, destination, carrier, and level of service (NS, 1 CNX, 2 CNX)
     • Assume outbound (or inbound) price = total price/2
     • Exclude taxes
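The price variable described above amounts to a grouped average. The sketch below assumes a simple ticket record with hypothetical field names, halves round-trip totals to get a directional price, and leaves one-way totals unchanged (that last choice is an assumption, not something the slide states).

```python
from collections import defaultdict

# Fare classes counted as "business"; all others are treated as "leisure".
BUSINESS_CLASSES = {"First", "Business", "Unrestricted Coach"}


def average_directional_prices(tickets):
    """Average directional price by origin, destination, carrier, service level, segment."""
    sums = defaultdict(lambda: [0.0, 0])
    for t in tickets:
        segment = "business" if t["fare_class"] in BUSINESS_CLASSES else "leisure"
        key = (t["origin"], t["destination"], t["carrier"], t["level_of_service"], segment)
        # Round-trip tickets: outbound (or inbound) price = total price / 2.
        price = t["total_price"] / 2 if t["round_trip"] else t["total_price"]
        sums[key][0] += price
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


base = {"origin": "SEA", "destination": "DFW", "carrier": "AA", "level_of_service": "NS"}
tickets = [
    {**base, "fare_class": "First", "total_price": 1000.0, "round_trip": True},
    {**base, "fare_class": "Business", "total_price": 400.0, "round_trip": False},
    {**base, "fare_class": "Restricted Coach", "total_price": 300.0, "round_trip": True},
]
averages = average_directional_prices(tickets)
```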

  21. Departure Time of Day
     1. Departure time preferences vary by:
        • Length of haul
        • Direction of travel
        • Number of time zones
        • Day of week
        • Itinerary type (OW, OB, IB)
     2. A continuous time of day preference formulation is preferred over a discrete formulation to avoid counter-intuitive forecasts

  22. 10 Time of Day Classifications
     • Same time zone, ≥ 600 miles
     • Same time zone, < 600 miles
     • 1 time zone westbound, ≥ 600 miles
     • 1 time zone westbound, < 600 miles
     For each classification, estimate separate time of day preferences for outbound, inbound, and one-way itineraries and for each day of week

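  23. Descriptive Statistics

     | Segment       | Distance Min | Distance Avg | Distance Max | # OD | Min Alts | Mean Alts | Max Alts | # Pax     |
     |---------------|--------------|--------------|--------------|------|----------|-----------|----------|-----------|
     | Same TZ ≤ 600 | 67           | 419          | 600          | 3923 | 2        | 19        | 81       | 1,995,096 |
     | Same TZ > 600 | 601          | 855          | 1534         | 3034 | 2        | 25        | 107      | 1,599,528 |
     | 1 TZ EB ≤ 600 | 118          | 463          | 600          | 766  | 2        | 18        | 69       | 284,983   |
     | 1 TZ EB > 600 | 601          | 995          | 1925         | 3223 | 2        | 25        | 123      | 1,283,187 |
     | 1 TZ WB ≤ 600 | 118          | 463          | 600          | 755  | 2        | 18        | 66       | 286,818   |
     | 1 TZ WB > 600 | 601          | 994          | 1925         | 3251 | 2        | 24        | 132      | 1,296,951 |
     | 2 TZ EB       | 643          | 1596         | 2451         | 1573 | 2        | 30        | 115      | 641,831   |
     | 2 TZ WB       | 643          | 1597         | 2451         | 1541 | 2        | 28        | 109      | 642,802   |
     | 3 TZ EB       | 1578         | 2229         | 2774         | 1074 | 2        | 43        | 172      | 653,091   |
     | 3 TZ WB       | 1575         | 2227         | 2774         | 1059 | 2        | 41        | 164      | 650,062   |

     Alts = alternatives per choice set.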

  24. Continuous Time of Day
     $$\text{Continuous time}_{cmd} = \beta_{1cmd}\sin\frac{2\pi t}{1440} + \beta_{2cmd}\cos\frac{2\pi t}{1440} + \beta_{3cmd}\sin\frac{4\pi t}{1440} + \beta_{4cmd}\cos\frac{4\pi t}{1440} + \beta_{5cmd}\sin\frac{6\pi t}{1440} + \beta_{6cmd}\cos\frac{6\pi t}{1440}$$
     where
     • $c$ = time of day classification, 1, …, 10
     • $m$ = outbound, inbound, one-way
     • $d$ = day of week, 1, …, 7
     • $t$ = departure time in minutes past midnight
     • 1440 = number of minutes in a day
     Reference: Koppelman, Coldren, and Parker (2008).
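The continuous formulation is a three-harmonic sine and cosine series over the 1440 minutes of a day. A minimal sketch for one classification, itinerary type, and day of week follows; the six coefficient values are placeholders rather than estimates from the study.

```python
import math

MINUTES_PER_DAY = 1440


def time_of_day_utility(t_minutes, coefs):
    """Sum of sin/cos terms at 2*pi, 4*pi, and 6*pi times t/1440.

    coefs holds six weights ordered (sin2pi, cos2pi, sin4pi, cos4pi, sin6pi, cos6pi).
    """
    b1, b2, b3, b4, b5, b6 = coefs
    x = 2 * math.pi * t_minutes / MINUTES_PER_DAY
    return (
        b1 * math.sin(x) + b2 * math.cos(x)
        + b3 * math.sin(2 * x) + b4 * math.cos(2 * x)
        + b5 * math.sin(3 * x) + b6 * math.cos(3 * x)
    )


# Placeholder coefficients for a single (classification, itinerary type, day) cell.
coefs = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6)

midnight = time_of_day_utility(0, coefs)
six_am = time_of_day_utility(360, coefs)
next_midnight = time_of_day_utility(1440, coefs)
```

Because every term has a period that divides 24 hours, the preference curve joins smoothly at midnight, which a set of discrete time-of-day dummies cannot guarantee.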

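  25. Data Representativeness

     | Carrier | ARC Data | DB1B Market Data |
     |---------|----------|------------------|
     | DL      | 29.5%    | 23.4%            |
     | UA      | 22.9%    | 17.1%            |
     | US      | 18.4%    | 10.0%            |
     | AA      | 17.5%    | 19.0%            |
     | AS      | 3.3%     | 4.2%             |
     | B6      | 3.2%     | 3.0%             |
     | F9      | 2.2%     | 1.7%             |
     | FL      | 1.4%     | 2.8%             |
     | VX      | 1.3%     | 0.9%             |
     | SY      | 0.3%     | 0.2%             |
     | WN      | 0.0%     | 17.7%            |
     | Total   | 100%     | 100%             |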

  26. Outline: (1) Review of network planning models and problem motivation; (2) Research objectives; (3) Data; (4) Methodology; (5) Results; (6) Future research

  27. Network Planning Sub-Models
     Figure (labels only): Forecasts; Sub-Models; callout "Our focus". Reference: Garrow 2010, Figure 7.1.

  28. Define Choice Sets
     1. Construct choice sets for each OD city pair that departs on day of week d
     2. Create a representative weekly schedule as the week beginning the Monday after the 9th of the month [May 13 – May 19, 2013]
     3. Define a unique itinerary by org_l, dst_l, op carr_l, op flt num_l, dept dow_l for legs l = 1, 2, 3
     4. Map all demand to the representative schedule/unique itinerary
     5. Eliminate choice sets with demand < 30 pax/month
     The mapping process is 98% accurate for all variables, and the screening rule changes MNL parameter estimates by 4.4%
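The grouping and screening steps can be sketched as a dictionary keyed by origin, destination, and day of week. The record layout, the use of a plain string as the itinerary key, and the name `build_choice_sets` are assumptions; the deck defines a unique itinerary by leg-level fields instead.

```python
from collections import defaultdict


def build_choice_sets(bookings, min_demand=30):
    """Group demand by (origin, destination, day of week) and drop thin choice sets.

    bookings: iterable of (origin, destination, dow, itinerary_key, pax) tuples.
    """
    choice_sets = defaultdict(lambda: defaultdict(int))
    for origin, destination, dow, itinerary, pax in bookings:
        choice_sets[(origin, destination, dow)][itinerary] += pax
    return {
        key: dict(alternatives)
        for key, alternatives in choice_sets.items()
        if sum(alternatives.values()) >= min_demand  # eliminate demand < 30 pax/month
    }


# Hypothetical monthly demand already mapped to the representative week.
bookings = [
    ("ATL", "ORD", 1, "AA 101", 20),
    ("ATL", "ORD", 1, "DL 457", 15),
    ("ATL", "ORD", 1, "AA 101", 5),
    ("SEA", "PHX", 2, "US 102", 10),
]
choice_sets = build_choice_sets(bookings)
```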

  29. Itinerary Choice Model
     Outbound itineraries from ATL-ORD: AA 101, AA 946, DL 457, UA 147/UA 229
     $$U_i = V_i + \varepsilon_i, \qquad V_i = \beta_1\,\text{cost}_i + \beta_2\,\text{time}_i + \dots$$

  30. The Fundamental Problem
     Observed points on the demand and supply curves: 100 pax at $500, 120 pax at $700, 40 pax at $120
     $$\text{demand} = \beta \times \text{price} + \dots + \varepsilon, \qquad \beta = +0.14$$

  31. The Fundamental Solution
     1. Multiple approaches exist for correcting price endogeneity
     2. We will focus on the two-stage control function method, which uses instruments

  32. The Basic Idea of Control Function
     Figure (labels only): “True” impact of price on demand; Instrument
     Validity tests (“are instruments valid?”):
     1. Instruments should be correlated with price
     2. Instruments should not be correlated with choice

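  33. Two-Stage Control Function Method
     Stage 1: Linear Regression (price is the endogenous variable)
     $$\text{price} = \alpha_0 + \underbrace{\alpha_1\,\text{sin2pi\_MO\_OW\_S1} + \dots + \alpha_{1260}\,\text{cos6pi\_SU\_IB\_S10} + \dots + \alpha_{1276}\,\text{interline}}_{\text{exogenous variables}} + \underbrace{\alpha_{1277}\,\text{IV1} + \alpha_{1278}\,\text{IV2}}_{\text{instruments}} + \mu$$
     Save residuals:
     $$\gamma = \text{price} - \widehat{\text{price}}$$
     Stage 2: Discrete Choice Model
     $$V = \alpha_1\,\text{sin2pi\_MO\_OW\_S1} + \dots + \alpha_{1269}\,\text{price} + \dots + \alpha_{1278}\,\text{interline} + \alpha_{1279}\,\gamma + \varepsilon$$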
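The two stages can be illustrated on a toy problem. This sketch deliberately shrinks stage 1 to an intercept plus a single instrument (the deck's regression has well over a thousand terms), and it only shows how the saved residual enters the stage 2 utility; it does not estimate the choice model. All numbers and function names are hypothetical.

```python
def first_stage_residuals(price, instrument):
    """Stage 1: regress price on an intercept and one instrument; return residuals."""
    n = len(price)
    mean_z = sum(instrument) / n
    mean_p = sum(price) / n
    slope = (
        sum((z - mean_z) * (p - mean_p) for z, p in zip(instrument, price))
        / sum((z - mean_z) ** 2 for z in instrument)
    )
    intercept = mean_p - slope * mean_z
    # Residual = observed price minus fitted price.
    return [p - (intercept + slope * z) for p, z in zip(price, instrument)]


def second_stage_utility(price, residual, b_price, b_residual):
    """Stage 2: the utility includes price and the stage 1 residual as a control."""
    return b_price * price + b_residual * residual


# Hypothetical prices and instrument values for four itineraries.
price = [300.0, 450.0, 520.0, 610.0]
instrument = [280.0, 400.0, 560.0, 600.0]

residuals = first_stage_residuals(price, instrument)
# Placeholder stage 2 coefficients (these would be estimated in practice).
utilities = [second_stage_utility(p, r, -0.01, 0.005) for p, r in zip(price, residuals)]
```

By construction the least-squares residuals average to zero and are uncorrelated with the instrument, so they carry only the part of price that the instrument does not explain.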
