Constructing High Frequency Price Indexes Using Scanner Data
Daniel Melser
School of Finance & Economics, University of Technology, Sydney, Haymarket, NSW 2007, Australia
Email: daniel.melser@uts.edu.au
May 4, 2011
Prepared for the 12th Meeting of the Ottawa Group, Wellington, NZ
Introduction
The use of scanner data has been disappointing. Why?
Various ‘reasonable’ approaches to index construction can lead to wildly different results.
There is concern about bias but also about the high variance of the resulting price indexes.
Part of the problem stems from the large changes in prices and quantities caused by the sales cycle and stockpiling.
We consider some new ideas, within a hybrid economic-stochastic framework, that reflect the sales cycle and build indexes which allow for the possibility that purchase and consumption do not take place simultaneously.
We use a large publicly available US scanner data set from IRI to illustrate.
The Wacky World of Chained Indexes
Figure: Weekly Chained Indexes for Laundry Detergent from 2001–6. [Series plotted: Chained Geometric Laspeyres, Chained Geometric Paasche, Chained Tornqvist. Vertical axis: Log Index, from -20 to 20. Horizontal axis: Week, from 0 to 350.]
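The drift of weekly chained indexes can be reproduced in a toy setting. The sketch below is only illustrative: the prices and quantities are hypothetical, and the function names are not from the presentation. One item goes on sale for a week and then returns to its regular price, while its quantity spikes during the sale and dips afterwards. Each link is a share-weighted mean of log price relatives, using previous-week shares (geometric Laspeyres), current-week shares (geometric Paasche) or their average (Tornqvist).

```python
import math


def expenditure_shares(prices, quantities):
    total = sum(p * q for p, q in zip(prices, quantities))
    return [p * q / total for p, q in zip(prices, quantities)]


def log_link(p_prev, p_curr, weights):
    # Share-weighted mean of log price relatives between adjacent weeks.
    return sum(w * math.log(b / a) for w, a, b in zip(weights, p_prev, p_curr))


# Hypothetical data: item 0 goes on sale in week 1 and returns to its regular
# price in week 2; its quantity spikes during the sale and dips afterwards.
prices = [[2.0, 2.0], [1.0, 2.0], [2.0, 2.0]]
quantities = [[10, 10], [40, 5], [2, 10]]

chained = {"laspeyres": 0.0, "paasche": 0.0, "tornqvist": 0.0}
for t in range(1, len(prices)):
    w_prev = expenditure_shares(prices[t - 1], quantities[t - 1])
    w_curr = expenditure_shares(prices[t], quantities[t])
    w_avg = [(a + b) / 2 for a, b in zip(w_prev, w_curr)]
    chained["laspeyres"] += log_link(prices[t - 1], prices[t], w_prev)
    chained["paasche"] += log_link(prices[t - 1], prices[t], w_curr)
    chained["tornqvist"] += log_link(prices[t - 1], prices[t], w_avg)

for name, value in chained.items():
    print(f"{name}: chained log index = {value:+.4f}")
```

Week 2 prices equal week 0 prices, so a direct comparison would give a log index of zero. In this example the chained Laspeyres ends above zero and the chained Paasche below it, because the sale shifts the weights between the two links.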
Various Approaches to Constructing Price Indexes with Scanner Data
The rolling year GEKS approach of Diewert, Fox and Ivancic (2011) looks good.
Various Approaches to Constructing Price Indexes with Scanner Data (cont.)
Average basket methods may also be worth some investigation. Why?
The approach is consistent with standard agency practice, which compares prices to a reference period.
It can be configured to respect the fact that purchase and consumption decisions are made over a time-span (‘budget horizon’) rather than contained within a single period.
In some circumstances it may provide a sounder measure of price change.
Our approach to measuring prices in period $c$ is to ask a question like: what would have been the cost of purchasing the reference bundle across a ‘budget horizon’ $A_r$ facing period $c$’s price distribution?
An Example of an Average Basket-Type Approach
Let’s look at a Laspeyres/Lowe-type index.
Notation: $p_{ist}$ = price, $q_{ist}$ = quantity, $i$ = product, $s$ = store, $t$ = time (weeks).
Take the prices and quantities from a reference period $A_r$ as the base ($A_r$ is a span of periods such as a year, e.g. $A_r = \{1, 2, \ldots, 52\}$).
We want to compare the prices in the span of periods $A_r$ with the prices in an individual period $c$. How?
Estimate the distribution of prices in period $c$ and use this information to create a pseudo-sample spread over a span of dimension $|A_r|$. Call this $\tilde{p}^c$.
We could simply insert the actual prices $p_{isc}$ for each of the time periods or sample from a distribution.
\[ P^{EL}_{r,c|A_r} = \frac{\sum_{t \in A_r,\, s \in S_t,\, i \in I_{st}} \tilde{p}^{c}_{ist}\, q_{ist}}{\sum_{t \in A_r,\, s \in S_t,\, i \in I_{st}} p_{ist}\, q_{ist}} \]
A Mixture Model for Prices
In order to model the distribution of prices in a given period we propose a mixture model for log-prices:
\[ \log p_{ist} \sim z_{ist}\, N(\alpha_{ist} - \beta_i,\ \sigma_i^2) + (1 - z_{ist})\, N(\alpha_{ist},\ \sigma_i^2), \qquad \beta_i \gg 0 \]
\[ z_{ist} \sim B(\omega_{it}) \]
Estimating the Parameters of the Mixture Model
To estimate the parameters of the model a likelihood-based approach, the EM (Expectation Maximization) algorithm, is used.
This iterates between estimating the sales labels (the $z$’s) conditional on the parameters (the $\alpha$’s, $\beta$’s, $\omega$’s and $\sigma$’s), and estimating the parameters conditional on the sales labels.
In theory the likelihood improves at every step. In practice convergence can be slow, though we did not have this problem.
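The alternation between labels and parameters can be sketched for a stripped-down case. The code below is an assumption-laden illustration rather than the estimator used in the presentation: it treats a single product-store series with a constant regular log-price $\alpha$, a constant sale probability $\omega$ and a common variance, whereas the model on the slides lets $\alpha_{ist}$ and $\omega_{it}$ vary. The starting values, the function name `fit_sales_mixture` and the simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_sales_mixture(x, iterations=100):
    """EM for x ~ z N(alpha - beta, var) + (1 - z) N(alpha, var), z ~ Bernoulli(omega)."""
    n = len(x)
    mean_x = sum(x) / n
    # Crude starting values (an assumption): regular price near the maximum,
    # sale price near the minimum, pooled sample variance.
    alpha, sale_mean = max(x), min(x)
    var = max(sum((v - mean_x) ** 2 for v in x) / n, 1e-8)
    omega = 0.5
    loglik_path = []
    for _ in range(iterations):
        # E-step: posterior probability that each observation is a sale price.
        tau = []
        loglik = 0.0
        for v in x:
            sale = omega * normal_pdf(v, sale_mean, var)
            regular = (1.0 - omega) * normal_pdf(v, alpha, var)
            total = sale + regular + 1e-300  # guard against underflow
            tau.append(sale / total)
            loglik += math.log(total)
        loglik_path.append(loglik)
        # M-step: weighted parameter updates given the soft sale labels.
        n_sale = sum(tau)
        omega = n_sale / n
        sale_mean = sum(t * v for t, v in zip(tau, x)) / max(n_sale, 1e-12)
        alpha = sum((1.0 - t) * v for t, v in zip(tau, x)) / max(n - n_sale, 1e-12)
        var = sum(t * (v - sale_mean) ** 2 + (1.0 - t) * (v - alpha) ** 2
                  for t, v in zip(tau, x)) / n
        var = max(var, 1e-8)
    return {"alpha": alpha, "beta": alpha - sale_mean, "omega": omega,
            "sigma": math.sqrt(var), "loglik_path": loglik_path}


# Simulated log-prices with hypothetical parameter values.
random.seed(0)
true_alpha, true_beta, true_sigma, true_omega = math.log(4.0), 0.3, 0.02, 0.2
log_prices = [
    random.gauss(true_alpha - (true_beta if random.random() < true_omega else 0.0), true_sigma)
    for _ in range(500)
]
fit = fit_sales_mixture(log_prices)
print({k: round(v, 4) for k, v in fit.items() if k != "loglik_path"})
```

The returned `loglik_path` records the log-likelihood at each E-step, which gives a direct way to check that it does not fall from one iteration to the next.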
Some Results from Estimating the Model
Table: Summary Statistics for IRI Data, Boston

Product              Items  Stores  Obs.     Expenditures ($)  Expenditure on Sale (%)  Probability of Sale (%)  Avg. Sales Discount (%)
Beer                 558    33      175,658  10,773,410        26.29                    8.46                     12.87
Soft Drinks          863    9       355,071  14,401,327        38.93                    21.79                    25.51
Coffee               460    14      304,416  9,782,830         34.59                    21.99                    27.65
Deodorant            551    14      313,495  1,873,691         17.61                    15.70                    36.55
Diapers              260    37      248,830  8,624,339         24.70                    20.04                    24.32
Laundry Detergent    325    16      210,504  8,748,413         36.82                    17.24                    31.80
Milk                 282    17      321,594  48,326,059        17.16                    19.52                    19.56
Mustard and Ketch.   225    23      245,953  6,051,455         21.29                    17.71                    25.97
Paper Towels         219    38      234,434  21,202,529        28.65                    12.25                    28.98
Peanut Butter        116    35      270,964  9,242,605         26.52                    17.45                    25.23
Salty Snacks         963    9       307,746  8,788,534         27.01                    18.35                    27.55
Shampoo              551    14      299,967  2,326,703         19.74                    15.70                    30.27
Soup                 552    9       264,729  4,612,889         36.60                    22.41                    34.43
Spaghetti Sauce      422    12      288,320  5,465,383         36.57                    23.67                    28.22
Sugar Substitute     52     51      191,022  4,435,557         14.81                    13.31                    22.85
Toilet Tissue        163    32      241,574  22,487,510        28.11                    13.42                    27.97
Toothpaste           403    14      246,966  2,952,563         19.95                    15.70                    31.27
Yoghurt              550    6       272,707  10,179,578        22.92                    17.28                    28.59

NOTE: Results are for IRI data discussed in detail in Bronnenberg, Kruger and Mela (2008). The data covers 313 weeks (6 years) and records weekly average prices at the store-level by product barcode.
Some Examples for Deodorants in Chicago...
[Figure: four panels of weekly price series for individual deodorant items. Vertical axis: Price. Horizontal axis: Weeks.]
Checking for Stockpiling
If stockpiling does not occur then a classical model of demand, such as the CES cost function, should explain consumers’ expenditures.
We look at expenditure shares ($v_{ist}$) in the period immediately before and immediately after a sale.
Our expectation is that, controlling for price, expenditure shares will be lower after the sale than before as consumers will have built up inventories during the sale.
\[ \log\left(\frac{v_{isr}/\lambda_r}{v_{isu}/\lambda_u}\right) = \gamma_0 + (1 - \sigma) \log\left(\frac{p_{isr}}{p_{isu}} \Big/ P_{ur}\right) + \gamma_1\, \mathrm{display}_{isru} + \gamma_2\, \mathrm{feature}_{isru} + e_{isru} \]
The Results of the Stockpiling Regression
Table: Results of Stockpiling Regression

Product              Obs.     R^2      Intercept     Elasticity (σ)  Display     Feature
Beer                 250      0.0167   -0.1362 **    1.2652 ***      0.4737      -0.8335
Soft Drinks          10,019   0.0067   -0.0732 ***   1.4242 ***      -0.0398     -0.0044
Coffee               7,587    0.0056   -0.0502 ***   0.5494 ***      0.0458      -0.0201
Deodorant            8,718    0.0339   -0.0350 ***   0.4911 ***      -0.0055     0.0541
Diapers              9,471    0.0475   -0.0196 **    0.6477 ***      0.3432 **   0.0039
Laundry Detergent    4,770    0.0005   -0.1061 ***   0.9916 ***      0.1792      -0.0534
Milk                 8,005    0.0016   0.0016        1.4885 ***      0.1837      0.0015
Mustard and Ketch.   2,257    0.0116   0.0086        0.1245          0.2216      0.0469
Paper Towels         3,789    0.0617   -0.0356 ***   1.4993 ***      0.2553 **   -0.2069
Peanut Butter        5,644    0.0155   -0.0910 ***   0.3122 ***      -0.0528     0.0708
Salty Snacks         5,765    0.0128   -0.0485 ***   1.7570 ***      -0.0508     -0.0014
Shampoo              9,377    0.0261   -0.0536 ***   0.6271 ***      -0.0744     0.0126
Soup                 6,416    0.0087   -0.0837 ***   0.5901 ***      -0.0942     0.0061
Spaghetti Sauce      11,493   0.0019   -0.0599 ***   1.1128 ***      0.1101      0.1117
Sugar Substitute     935      0.0793   -0.0532       1.5789 ***      0.9133      0.23466
Toilet Tissue        5,559    0.0625   -0.0943 ***   1.4801 ***      0.0423      0.0655
Toothpaste           9,864    0.0072   -0.0400 ***   0.5713 ***      0.2121      0.0163
Yoghurt              14,983   0.0092   -0.0653 ***   1.6283 ***      0.0994      -0.0289

Note: * denotes significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Sales, Stockpiling and Index Bias
Using our model of prices we can decompose a link in the chained geometric Laspeyres index as,
\[ \log P^{GL}_{b,c} = \sum_{s \in S,\, i \in I_s} v_{isb} \log\left(\frac{p_{isc}}{p_{isb}}\right) \]
\[ = \sum_{s \in S,\, i \in I_s} v_{isb} (\alpha_{isc} - \alpha_{isb}) \;-\; \sum_{s \in S,\, i \in I_s} \beta_i\, v_{isb} (z_{isc} - z_{isb}) \;+\; \sum_{s \in S,\, i \in I_s} v_{isb} (e_{isc} - e_{isb}) \]
Where,
\[ v_{isb} = \frac{p_{isb}\, q_{isb}}{\sum_{s \in S,\, i \in I_s} p_{isb}\, q_{isb}} \]
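The three-way split follows from writing each log-price as $\alpha - \beta z + e$, and it can be checked numerically. The sketch below uses hypothetical parameter values and quantities for two items in one store; none of the numbers come from the presentation.

```python
import math

# Hypothetical parameters for two items, base week b and current week c.
items = [
    {"alpha_b": 1.00, "alpha_c": 1.02, "beta": 0.30, "z_b": 0, "z_c": 1,
     "e_b": 0.01, "e_c": -0.01, "q_b": 10.0},
    {"alpha_b": 1.50, "alpha_c": 1.50, "beta": 0.20, "z_b": 1, "z_c": 0,
     "e_b": 0.00, "e_c": 0.02, "q_b": 4.0},
]


def log_price(alpha, beta, z, e):
    # Log price implied by the mixture model: regular level, minus the sale
    # discount when z = 1, plus an idiosyncratic error.
    return alpha - beta * z + e


for it in items:
    it["lp_b"] = log_price(it["alpha_b"], it["beta"], it["z_b"], it["e_b"])
    it["lp_c"] = log_price(it["alpha_c"], it["beta"], it["z_c"], it["e_c"])

# Base-period expenditure shares v_isb.
base_expenditure = sum(math.exp(it["lp_b"]) * it["q_b"] for it in items)
for it in items:
    it["v_b"] = math.exp(it["lp_b"]) * it["q_b"] / base_expenditure

log_index = sum(it["v_b"] * (it["lp_c"] - it["lp_b"]) for it in items)
regular_term = sum(it["v_b"] * (it["alpha_c"] - it["alpha_b"]) for it in items)
sale_term = -sum(it["beta"] * it["v_b"] * (it["z_c"] - it["z_b"]) for it in items)
noise_term = sum(it["v_b"] * (it["e_c"] - it["e_b"]) for it in items)

print(f"log index    = {log_index:+.4f}")
print(f"regular term = {regular_term:+.4f}")
print(f"sale term    = {sale_term:+.4f}")
print(f"noise term   = {noise_term:+.4f}")
```

The sale term isolates the part of the link that is driven purely by items moving on or off sale, weighted by base-period shares.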