
Efficient Online Portfolio with Logarithmic Regret

Haipeng Luo (USC), Chen-Yu Wei (USC), Kai Zheng (Peking University)


  1. Efficient Online Portfolio with Logarithmic Regret. Haipeng Luo (USC), Chen-Yu Wei (USC), Kai Zheng (Peking University)

  2-9. Online Portfolio
      At round $t$ the player splits $\text{Wealth}_t$ across the stocks according to the decision $x_t$; each stock's price then changes by its price relative $r_t$.

      | Stock | Allocation (decision $x_t$) | Price relative $r_t$ | Value after the round |
      | --- | --- | --- | --- |
      | 1 | $0.5\,\text{Wealth}_t$ | $\times 1.4$ | $0.7\,\text{Wealth}_t$ |
      | 2 | $0.3\,\text{Wealth}_t$ | $\times 1.0$ | $0.3\,\text{Wealth}_t$ |
      | 3 | $0.2\,\text{Wealth}_t$ | $\times 0.5$ | $0.1\,\text{Wealth}_t$ |

      $\text{Wealth}_{t+1} = (0.7 + 0.3 + 0.1)\,\text{Wealth}_t = 1.1\,\text{Wealth}_t = \langle x_t, r_t \rangle\,\text{Wealth}_t$
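The one-round update on these slides can be checked with a few lines of Python. This is a minimal sketch: the function name `next_wealth` and the starting wealth of 1.0 are assumptions made for illustration; the allocation and price relatives are the numbers from the slide.

```python
def next_wealth(wealth, allocation, price_relatives):
    """Return the wealth after one round: <x_t, r_t> * Wealth_t."""
    # The decision x_t is a distribution over stocks, so it must sum to 1.
    if abs(sum(allocation) - 1.0) > 1e-9:
        raise ValueError("allocation must sum to 1")
    # Inner product <x_t, r_t>: the growth factor of the whole portfolio.
    growth = sum(x * r for x, r in zip(allocation, price_relatives))
    return growth * wealth


x_t = [0.5, 0.3, 0.2]  # decision from the slide
r_t = [1.4, 1.0, 0.5]  # price relatives from the slide
wealth_next = next_wealth(1.0, x_t, r_t)  # assumed starting wealth of 1.0
print(round(wealth_next, 10))  # 1.1
```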

  10-13. Online Portfolio ($T$ periods)
      $W_1$ (initial wealth)
      $W_2 = \langle x_1, r_1 \rangle\, W_1$
      $W_3 = \langle x_2, r_2 \rangle\, W_2$
      ...
      Final wealth: $W_{T+1} = \prod_{t=1}^{T} \langle x_t, r_t \rangle\, W_1$
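The final-wealth formula is just the one-round update applied $T$ times. A minimal sketch, assuming a hypothetical two-stock market over three rounds (the data and the name `final_wealth` are not from the slides):

```python
def final_wealth(initial_wealth, allocations, price_relatives):
    """Compute W_{T+1} = prod_t <x_t, r_t> * W_1 by iterating the one-round update."""
    wealth = initial_wealth
    for x_t, r_t in zip(allocations, price_relatives):
        wealth *= sum(x * r for x, r in zip(x_t, r_t))
    return wealth


# Hypothetical data: two stocks that alternately double and halve.
price_relatives = [[2.0, 0.5], [0.5, 2.0], [2.0, 0.5]]
# The player rebalances to an even split every round.
allocations = [[0.5, 0.5]] * 3

w_final = final_wealth(1.0, allocations, price_relatives)
print(w_final)  # 1.953125, i.e. 1.25 ** 3
```

Each round's growth factor here is $0.5 \cdot 2 + 0.5 \cdot 0.5 = 1.25$, so the product over three rounds is $1.25^3$.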

  14-18. Online Portfolio
      Gain:
      Benchmark:
      Minimize (Regret)
      This is an instance of Online Convex Optimization [Zinkevich’03], but with a possibly unbounded gradient:
      $\|\nabla \ell_t(x)\|_\infty \lesssim G \triangleq \max_{i,j} \frac{r_{t,i}}{r_{t,j}}$ (Maximum Relative Ratio)
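To make the regret and the gradient bound concrete, here is a small sketch. It assumes the standard log-wealth formulation of online portfolio, which the transcript does not spell out: loss $\ell_t(x) = -\ln\langle x, r_t\rangle$, gain $\sum_t \ln\langle x_t, r_t\rangle$, and regret measured against a fixed comparison portfolio. The benchmark is usually the best fixed portfolio in hindsight; to avoid an optimizer, the comparison portfolio below is picked by hand, and $G$ is taken as the largest ratio over all rounds. All function names and data are hypothetical.

```python
import math


def log_gain(portfolios, price_relatives):
    """Sum over rounds of ln<x_t, r_t>, i.e. ln(W_{T+1} / W_1)."""
    return sum(
        math.log(sum(x * r for x, r in zip(x_t, r_t)))
        for x_t, r_t in zip(portfolios, price_relatives)
    )


def regret_against(benchmark, portfolios, price_relatives):
    """Log-gain of a fixed benchmark portfolio minus the player's log-gain."""
    fixed = [benchmark] * len(price_relatives)
    return log_gain(fixed, price_relatives) - log_gain(portfolios, price_relatives)


def max_relative_ratio(price_relatives):
    """G: largest ratio r_{t,i} / r_{t,j}, here maximized over all rounds too."""
    return max(max(r_t) / min(r_t) for r_t in price_relatives)


def loss_gradient(x, r_t):
    """Gradient of the assumed loss -ln<x, r_t>, which is -r_t / <x, r_t>."""
    inner = sum(xi * ri for xi, ri in zip(x, r_t))
    return [-ri / inner for ri in r_t]


# Hypothetical market: two stocks that alternately double and halve.
price_relatives = [[2.0, 0.5], [0.5, 2.0]]
player = [[1.0, 0.0], [1.0, 0.0]]  # all wealth in stock 1

regret = regret_against([0.5, 0.5], player, price_relatives)
G = max_relative_ratio(price_relatives)
grad = loss_gradient([1.0, 0.0], [0.5, 2.0])
print(regret, G, grad)
```

In this example the gradient's largest entry has magnitude 4, which equals $G$: putting all weight on the stock that does worst in a round is exactly the case where the gradient is as large as the bound allows.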

  19-27. Previous Results and Our Results
      $N$: number of stocks; $T$: number of rounds; $G$: maximum relative ratio
      • Lower bound: $\Omega(N \log T)$
      • Upper bounds:

      | Algorithm | Regret | Time (/round) |
      | --- | --- | --- |
      | Universal Portfolio (Cover 1991, Kalai et al. 2002) | $N \log T$ | $T^{14} N^{4}$ |
      | ONS (Hazan et al. 2007) | $G N \log T$ | $N^{3.5}$ |
      | Soft-Bayes (Orseau et al. 2017) | $\sqrt{TN}$ | $N$ |
      | ? | $\approx N \log T$ | $\approx N$ |
      | BarrONS (this work) | $N^{2} (\log T)^{4}$ | $T N^{2.5}$ |

  28-33. Key Components of Our Algorithm
      Main challenge: a stock is bad, bad, then suddenly good, but the player puts little weight on it.
      Barrons (Barrier-Regularized-ONS) compared to ONS:
      1. Additional regularizer (to avoid a too extreme distribution over stocks)
      2. Increase the learning rate for worse stocks (faster recovery)
      3. Restarting (adapting to the maximum relative ratio)
      Poster #157
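The main challenge can be illustrated numerically. The sketch below is not the Barrons update: mixing with the uniform portfolio is a crude stand-in for regularization, used only to show why keeping every weight away from zero helps when a neglected stock suddenly does well. The loss $-\ln\langle x, r\rangle$, the function names, and all numbers are assumptions.

```python
import math


def round_loss(x, r):
    """Assumed per-round loss -ln<x, r> of portfolio x under price relatives r."""
    return -math.log(sum(xi * ri for xi, ri in zip(x, r)))


def mix_with_uniform(x, gamma):
    """Stand-in for regularization: every weight ends up at least gamma / N."""
    n = len(x)
    return [(1.0 - gamma) * xi + gamma / n for xi in x]


# Hypothetical scenario: stock 2 was bad for a while, so the player holds almost none of it.
extreme = [0.999, 0.001]
# Stock 2 is suddenly good while stock 1 drops.
r_sudden = [0.1, 10.0]

hedged = mix_with_uniform(extreme, gamma=0.2)
loss_extreme = round_loss(extreme, r_sudden)
loss_hedged = round_loss(hedged, r_sudden)
print(loss_extreme > loss_hedged)  # True
```

The extreme portfolio loses wealth in this round, while the mixed one gains, because it still holds about 10% of the stock that recovers.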
