Results on the PASCAL PROMO challenge



  1. Results on the PASCAL PROMO challenge. Ivan Markovsky, University of Southampton

  2. The challenge

     Data: two (simulated) time series
     • promotions: $u_d(1), \ldots, u_d(1095) \in \{0, 1\}^{1000}$
     • product sales: $y_d(1), \ldots, y_d(1095) \in \mathbb{R}^{100}$

     Challenge: for each product, find the promotions (at most 50) that affect its sales.

     [Figure: two panels plotted against time, showing the 7th promotion (values between 0 and 1) and the 3rd product's sales.]

  3. Comments
     • The time-series nature of the data suggests a dynamic phenomenon: the current output may depend on past inputs and outputs.
     • It is natural to think of the promotions as inputs (causes) and the sales as outputs (effects).
     • The data are multivariable: $m = 1000$ inputs, $p = 100$ outputs.
     • There are only $T = 1095$ data points, very few relative to $m$ and $p$.
     • Even a static linear model $y = Au$ is unidentifiable ($A$ cannot be recovered uniquely from $(u_d, y_d)$) for $T < T_{\min} := 10^5$.
     • The prior knowledge that only a few ($\le 50$) inputs affect each output helps ($T_{\min} = 5000$) but does not restore identifiability.
     • This prior knowledge makes the problem combinatorial.
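As a rough illustration of the counts in these comments, the sketch below tallies the entries of $A$ with and without the sparsity prior and the number of candidate sparsity patterns for a single output. The figures $10^5$ and 5000 quoted above match $mp$ and $50p$; the variable names are illustrative and not taken from the slides.

```python
import math

m, p, T = 1000, 100, 1095   # inputs, outputs, samples (from the slides)
k = 50                      # maximum number of promotions per product

dense_params = m * p        # entries of a full p x m matrix A
sparse_params = k * p       # nonzero entries if each row has at most k
patterns_per_row = math.comb(m, k)  # supports with exactly k nonzeros

print(dense_params, sparse_params, T)
print(f"{len(str(patterns_per_row))} decimal digits of sparsity patterns per output")
```

Even with exactly 50 nonzeros per row, enumerating the supports is out of reach, which is why a heuristic is needed.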

  4. Proposed model

     Main assumptions:
     1. The input-output relation is static, $y_j(t) = a_j^\top u(t)$, where $a_j^\top$ is the $j$-th row of $A$ (this implies that one output cannot affect other outputs).
     2. There is an offset and a seasonal component, which is a sine. Baseline:
        $y_{\mathrm{bl},j}(t) := b_j + c_j \sin(\omega_j t + \phi_j)$

     The model is
        $y_j(t) = y_{\mathrm{bl},j}(t) + a_j^\top u(t)$
     or, with $Y := [\, y(1) \ \cdots \ y(T) \,]$, $U := [\, u(1) \ \cdots \ u(T) \,]$, etc.,
        $Y = Y_{\mathrm{bl}}(b, c, \omega, \phi) + AU$
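A minimal sketch of how data following this model could be generated, assuming NumPy. The sizes, parameter ranges and the choice of two active inputs per output are made up for illustration and are much smaller than the challenge data.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, T = 20, 3, 200                      # small sizes, for illustration only

U = (rng.random((m, T)) < 0.1).astype(float)   # binary promotions, m x T
A = np.zeros((p, m))
for j in range(p):                        # each output depends on 2 inputs
    idx = rng.choice(m, size=2, replace=False)
    A[j, idx] = rng.uniform(50, 100, size=2)

t = np.arange(1, T + 1)
b = rng.uniform(800, 1200, size=p)        # offsets
c = rng.uniform(100, 300, size=p)         # amplitudes
omega = np.full(p, 2 * np.pi / 100)       # frequencies (assumed value)
phi = rng.uniform(-np.pi, np.pi, size=p)  # phases

# Row j of Y_bl is b_j + c_j * sin(omega_j * t + phi_j).
Y_bl = b[:, None] + c[:, None] * np.sin(omega[:, None] * t[None, :] + phi[:, None])
Y = Y_bl + A @ U                          # Y = Y_bl(b, c, omega, phi) + A U
```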

  5. Identification problem

     Parameters:
     • $A \in \mathbb{R}^{p \times m}$: input/output (feedthrough) matrix
     • $b := (b_1, \ldots, b_p) \in \mathbb{R}^p$: vector of offsets
     • $c := (c_1, \ldots, c_p) \in \mathbb{R}^p$: vector of amplitudes
     • $\omega := (\omega_1, \ldots, \omega_p) \in \mathbb{R}^p$: vector of frequencies
     • $\phi := (\phi_1, \ldots, \phi_p) \in [-\pi, \pi]^p$: vector of phases

     Identification problem:
        minimize over the parameters  $\| Y_d - Y_{\mathrm{bl}}(b, c, \omega, \phi) - A U_d \|$
        subject to  each row of $A$ has at most 50 nonzero elements.

     This is a combinatorial, constrained, nonlinear least-squares problem.

  6. Solution approach

     Model: $y_j(t) = b_j + c_j \sin(\omega_j t + \phi_j) + a_j^\top u(t)$, where $a_j^\top$ is the $j$-th row of $A$.
     The model is linear in $A$, $b$, $c$; nonlinear in $\omega$, $\phi$; and combinatorial in $A$.

     Our approach: split the problem into two stages.
     1. Baseline estimation: minimize over $b, c, \omega, \phi$, assuming $A = 0$. This is a nonlinear least-squares problem; we use local optimization.
     2. I/O function estimation: minimize over $A, b, c$, with $\omega, \phi$ fixed. This is a combinatorial problem; we use the $\ell_1$ heuristic.

     This approach simplifies the solution but leads to suboptimality.

  7. Identification of the autonomous term

     The problem decouples into $p$ independent problems:
        minimize over $b_j, c_j, \omega_j \in \mathbb{R}$, $\phi_j \in [-\pi, \pi]$  $\| y_{d,j} - y_{\mathrm{bl},j}(b_j, c_j, \omega_j, \phi_j) \|_2$   (1)
     ($y_{d,j}$ is the $j$-th row of $Y_d$, $y_{\mathrm{bl},j}$ is the $j$-th row of $Y_{\mathrm{bl}}$.)

     This is a special case of the line spectral estimation problem, for which subspace and maximum likelihood (ML) solution methods exist.

     We use the ML approach, i.e., local optimization, assuming $\omega_j = 6\pi/T$ (period $T/3 = 365$ samples, one year) or $\omega_j = 12\pi/T$ (period $T/6$, half a year). Furthermore, we eliminate the "linear" parameters $b_j$, $c_j$ by projection (the variable projection, VARPRO, method).
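The elimination of $b_j$ and $c_j$ can be sketched as follows, assuming NumPy. For each candidate phase the linear parameters are obtained by ordinary least squares, so only $\phi_j$ remains to be searched. The grid search over $\phi_j$ is a simplification substituted here for the local optimization mentioned in the slides, and `fit_baseline` is a hypothetical helper name.

```python
import numpy as np

def fit_baseline(y, omega, n_grid=720):
    """Fit y(t) ~ b + c*sin(omega*t + phi) for a fixed omega.

    b and c are eliminated by linear least squares for each candidate phi
    (the variable-projection idea); phi is chosen by a grid search, which
    stands in for a local optimizer.
    """
    T = len(y)
    t = np.arange(1, T + 1)
    best = None
    for phi in np.linspace(-np.pi, np.pi, n_grid, endpoint=False):
        X = np.column_stack([np.ones(T), np.sin(omega * t + phi)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        cost = np.linalg.norm(y - X @ coef)
        if best is None or cost < best[0]:
            best = (cost, coef[0], coef[1], phi)
    return best  # (residual norm, b, c, phi)

# Synthetic check with made-up parameter values.
T = 1095
t = np.arange(1, T + 1)
y = 1000 + 200 * np.sin(6 * np.pi / T * t + 0.5)
cost, b, c, phi = fit_baseline(y, 6 * np.pi / T)
```

Note that $(c, \phi)$ and $(-c, \phi \pm \pi)$ describe the same sinusoid, so the returned amplitude may come out negative.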

  8. Baseline
     [Figure: $y_{d,3}$ and the fitted baseline $y^*_{\mathrm{bl},3}$ plotted against $t$.]

  9. Baseline
     [Figure: $y_{d,4}$ and the fitted baseline $y^*_{\mathrm{bl},4}$ plotted against $t$.]

  10. Identification of the term involving the inputs

      Problem:
         minimize over $b_j, c_j, a_j$  $\| y_{d,j} - y_{\mathrm{bl},j}(b_j, c_j, \omega_j^*, \phi_j^*) - a_j^\top U_d \|_2$
         subject to  $a_j$ has at most 50 nonzero elements   (2)

      Proposed heuristic:
         minimize over $b_j, c_j, a_j$  $\| y_{d,j} - y_{\mathrm{bl},j}(b_j, c_j, \omega_j^*, \phi_j^*) - a_j^\top U_d \|_2$
         subject to  $\| a_j \|_1 \le \gamma_j$   (3)

      $\gamma_j > 0$ is a parameter controlling the sparsity vs. accuracy trade-off.
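One way to solve a problem of the form (3) numerically is projected gradient descent with a Euclidean projection onto the $\ell_1$ ball. The slides do not say which solver was used, so the sketch below (assuming NumPy) is only an illustration. It also simplifies (3) by treating the baseline as already subtracted, so that only $a_j$ is optimized; `r` stands for $y_{d,j} - y_{\mathrm{bl},j}$, and the function names are hypothetical.

```python
import numpy as np

def project_l1_ball(v, gamma):
    """Euclidean projection of v onto {x : ||x||_1 <= gamma}."""
    if np.abs(v).sum() <= gamma:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]            # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * k > css - gamma)[0][-1]
    theta = (css[rho] - gamma) / (rho + 1)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def l1_constrained_ls(U, r, gamma, n_iter=2000):
    """Minimize ||r - a @ U||_2 subject to ||a||_1 <= gamma."""
    a = np.zeros(U.shape[0])
    step = 1.0 / np.linalg.norm(U, 2) ** 2  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = (a @ U - r) @ U.T            # gradient of 0.5*||r - a U||^2
        a = project_l1_ball(a - step * grad, gamma)
    return a

# Small synthetic example with made-up values.
rng = np.random.default_rng(1)
m, T = 30, 300
U = (rng.random((m, T)) < 0.2).astype(float)
a_true = np.zeros(m)
a_true[[2, 7, 11]] = [80.0, 60.0, 90.0]
r = a_true @ U
a_hat = l1_constrained_ls(U, r, gamma=np.abs(a_true).sum())
```

Smaller values of `gamma` push more entries of the solution to zero at the cost of a larger residual.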

  11. Choice of the regularization parameter $\gamma_j$

      If we fix the nonzero elements to be the first 10 elements, the optimal solution (with this choice of the nonzero elements) is
         $a_j^\top := [\, (y_{d,j} - y_{\mathrm{bl},j})\, U_d(1{:}10, :)^{+} \quad 0_{1 \times (m-10)} \,]$
      Let $a_j^*$ be the optimal solution over all choices of the nonzero elements. Since $\| a_j^* \|_1 = \gamma_j$, a heuristic choice for $\gamma_j$ is $\gamma_j := \| a_j \|_1$.
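The heuristic value of $\gamma_j$ amounts to one pseudoinverse and one norm. A minimal sketch, assuming NumPy, with `r` standing for the residual $y_{d,j} - y_{\mathrm{bl},j}$ and `gamma_heuristic` a hypothetical helper name:

```python
import numpy as np

def gamma_heuristic(U, r, n_fixed=10):
    """l1 norm of the least-squares fit that uses only the first n_fixed inputs.

    U: m x T input matrix, r: length-T residual after removing the baseline.
    Returns (gamma, a), where a has zeros outside the first n_fixed entries.
    """
    a_head = r @ np.linalg.pinv(U[:n_fixed, :])   # r U(1:n_fixed, :)^+
    a = np.concatenate([a_head, np.zeros(U.shape[0] - n_fixed)])
    return np.abs(a).sum(), a
```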

  12. Complete model (baseline and 24 inputs)
      [Figure: $y_{d,3}$ and the model output $y^*_3$ plotted against $t$.]

  13. Complete model (baseline and 25 inputs)
      [Figure: $y_{d,4}$ and the model output $y^*_4$ plotted against $t$.]

  14. Nonuniqueness of the solution

      For uniqueness of $A$, we need $U_d$ to have full row rank. Special cases that lead to rank deficiency of $U_d$:
      • Zero inputs cannot affect the output. Removing them leads to an equivalent reduced model. For maximum sparsity, assign zero weights in $A$ to those inputs.
      • Inputs that are multiples of other inputs lead to essential nonuniqueness that cannot be resolved by the sparsity constraint.

      Preprocessing step: remove redundant inputs.
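A possible form of this preprocessing step for binary inputs, assuming NumPy, is sketched below. It covers only the two special cases listed above (all-zero inputs and repeated inputs); an input that is a combination of several others would not be detected. The function name is hypothetical.

```python
import numpy as np

def remove_redundant_inputs(U):
    """Drop all-zero rows and repeated rows of a binary input matrix U (m x T).

    Returns the reduced matrix and the indices of the kept rows.
    For 0/1 inputs, a nonzero input that is a multiple of another is an exact copy.
    """
    nonzero = np.flatnonzero(U.any(axis=1))
    # Index (within the nonzero rows) of the first occurrence of each distinct row.
    _, first = np.unique(U[nonzero], axis=0, return_index=True)
    keep = nonzero[np.sort(first)]
    return U[keep], keep
```

Keeping the index array makes it possible to map the estimated weights back to the original input numbering afterwards.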

  15. Algorithm
      1. Input: $U_d \in \mathbb{R}^{m \times T}$ and $Y_d \in \mathbb{R}^{p \times T}$.
      2. Preprocessing: detect and remove redundant inputs.
      3. For $j = 1$ to $p$:
         3.1 Identify the baseline → $(\omega_j^*, \phi_j^*)$.
         3.2 Identify the I/O relation → $(b_j^*, c_j^*, a_j^*)$, sparsity pattern of $a_j^*$.
         3.3 Solve (2) with fixed sparsity pattern, $\phi_j = \phi_j^*$ and $\omega_j = \omega_j^*$ → $(b_j^*, c_j^*, a_j^*)$.
      4. Postprocessing: add zero columns in $A^*$ corresponding to the removed inputs ($A^*$ has one column per input).
      5. Output: $Y_{\mathrm{bl}}(b^*, c^*, \omega^*, \phi^*)$ and $A^*$.
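Once the sparsity pattern and $(\omega_j, \phi_j)$ are fixed, step 3.3 is an ordinary least-squares problem, and step 4 only reinserts zeros. A minimal sketch of both, assuming NumPy, with hypothetical function names:

```python
import numpy as np

def refit_on_support(U, y, t, omega, phi, support):
    """Least-squares estimate of (b, c, a) with the nonzeros of a restricted to `support`."""
    X = np.column_stack([np.ones(len(t)), np.sin(omega * t + phi), U[support].T])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    a = np.zeros(U.shape[0])
    a[support] = coef[2:]          # weights of the selected inputs
    return coef[0], coef[1], a     # offset b, amplitude c, sparse row a

def expand_columns(A_reduced, keep, m):
    """Place the columns of A_reduced at positions `keep` of a zero matrix with m columns."""
    A = np.zeros((A_reduced.shape[0], m))
    A[:, keep] = A_reduced
    return A
```

Here `keep` is assumed to hold the original indices of the inputs that survived preprocessing.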

  16. Identification of the baseline:
      1. Let $f'$, $\phi_j'$ be the minimum value and the minimizer of (1) with $\omega_j = 6\pi/T$.
      2. Let $f''$, $\phi_j''$ be the minimum value and the minimizer of (1) with $\omega_j = 12\pi/T$.
      3. If $f' < f''$, set $\omega_j^* := 6\pi/T$, $\phi_j^* := \phi_j'$; otherwise set $\omega_j^* := 12\pi/T$, $\phi_j^* := \phi_j''$.

      Identification of the I/O relation:
      1. Let $\gamma_j := \| (y_{d,j} - y_{\mathrm{bl},j})\, U_d(1{:}10, :)^{+} \|_1$.
      2. Let $a_j'$ be the solution to (3) with $\phi_j = \phi_j^*$, $\omega_j = \omega_j^*$.
      3. Determine the sparsity pattern of $a_j'$.
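The slides do not state how the sparsity pattern of $a_j'$ is read off. One plausible rule, shown here purely as an assumption, is to keep the largest-magnitude entries, at most 50 of them, and to discard entries that are negligible relative to the largest one. The function name and the tolerance are hypothetical.

```python
import numpy as np

def sparsity_pattern(a, max_nonzeros=50, rel_tol=1e-6):
    """Sorted indices of the largest-magnitude entries of a, at most max_nonzeros of them."""
    mag = np.abs(a)
    if mag.max() == 0:
        return np.array([], dtype=int)
    order = np.argsort(mag)[::-1][:max_nonzeros]   # largest magnitudes first
    return np.sort(order[mag[order] > rel_tol * mag.max()])
```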

  17. Results on the PROMO challenge

      [Figure: percentage of correctly identified inputs (0 to 100) for each of the 100 outputs.]

      Total: 2321 true inputs, 1796 identified inputs, of which 507 correct.

      Code: http://www.ecs.soton.ac.uk/~im/challenge.tar
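The totals can be turned into two ratios. The labels "precision" and "recall" are not used in the slides; they are introduced here only to name the two fractions.

```python
true_inputs = 2321        # true inputs over all outputs (from the slide)
identified = 1796         # inputs reported by the method
correct = 507             # reported inputs that are among the true ones

precision = correct / identified   # fraction of reported inputs that are right
recall = correct / true_inputs     # fraction of true inputs that were found
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints "precision = 28.2%, recall = 21.8%"
```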
