Streaming k-Submodular Maximization under Noise subject to Size Constraint


1. Streaming k-Submodular Maximization under Noise subject to Size Constraint
   Lan N. Nguyen, My T. Thai (University of Florida)

2. k-submodular maximization s.t. size constraint
➢ A k-submodular function is a generalization of a submodular function:
  ❑ Submodular set function (input: a single subset of V): f(X) + f(Y) ≥ f(X ∪ Y) + f(X ∩ Y)
  ❑ k-submodular function (input: k disjoint subsets of V): f(𝐱) + f(𝐲) ≥ f(𝐱 ⊔ 𝐲) + f(𝐱 ⊓ 𝐲)
    ▪ 𝐱 = (X_1, …, X_k) and 𝐲 = (Y_1, …, Y_k)
    ▪ 𝐱 ⊔ 𝐲 has i-th component (X_i ∪ Y_i) \ ⋃_{j≠i} (X_j ∪ Y_j)
    ▪ 𝐱 ⊓ 𝐲 = (X_1 ∩ Y_1, …, X_k ∩ Y_k)
➢ k-submodular maximization s.t. size constraint (MkSC):
  ❑ V: a finite set of elements; B: a positive integer; f : (k+1)^V → ℝ+ a k-submodular function, where (k+1)^V denotes the families of k disjoint subsets of V
  ❑ Find 𝐬 = (S_1, …, S_k) ∈ (k+1)^V with |𝐬| = |⋃_{i≤k} S_i| ≤ B that maximizes f(𝐬)
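
To make the ⊔ and ⊓ operations concrete, below is a minimal Python sketch (not from the paper) that represents a k-tuple of disjoint subsets as a tuple of Python sets and implements the two operations exactly as defined above.

```python
from typing import Hashable, Set, Tuple

KTuple = Tuple[Set[Hashable], ...]  # k disjoint subsets of the ground set V

def join(x: KTuple, y: KTuple) -> KTuple:
    """x 'join' y: i-th part is (X_i | Y_i) minus everything placed in any other part."""
    k = len(x)
    union = [x[i] | y[i] for i in range(k)]
    return tuple(
        union[i] - set().union(*(union[j] for j in range(k) if j != i))
        for i in range(k)
    )

def meet(x: KTuple, y: KTuple) -> KTuple:
    """x 'meet' y: component-wise intersection (X_i & Y_i)."""
    return tuple(x[i] & y[i] for i in range(len(x)))

# Tiny example with k = 2
x = ({1, 2}, {3})
y = ({2, 4}, {1})
print(join(x, y))  # ({2, 4}, {3}): element 1 is dropped because x and y place it differently
print(meet(x, y))  # ({2}, set())
```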

3. k-submodular maximization s.t. size constraint
➢ Applications:
  ❑ Influence maximization with k topics/products
  ❑ Sensor placement with k kinds of sensors
  ❑ Coupled feature selection
➢ Existing solutions (*):
  ❑ Greedy: approximation ratio 2, O(knB) query complexity (a sketch follows this slide)
  ❑ "Lazy" Greedy: approximation ratio 2, O(k(n − B) log B log(1/ε)) query complexity with probability at least 1 − ε
(*) Ohsaka, Naoto, and Yuichi Yoshida. "Monotone k-submodular function maximization with size constraints." Advances in Neural Information Processing Systems, 2015.
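
For contrast with the streaming algorithms, here is a minimal sketch of the greedy baseline under the notation above: B rounds, each evaluating every (element, position) pair, which is where the O(knB) query count comes from. The oracle interface f(s) taking a tuple of k sets, and the toy per-position coverage objective (one standard way to obtain a monotone k-submodular function), are illustrative assumptions, not the paper's code.

```python
def greedy_mksc(V, k, B, f):
    """Greedy baseline: B rounds, each scanning every (element, position) pair."""
    s = tuple(set() for _ in range(k))
    chosen = set()
    for _ in range(B):
        base = f(s)
        best_gain, best_e, best_s = 0.0, None, None
        for e in V:
            if e in chosen:
                continue
            for i in range(k):
                candidate = tuple(part | {e} if j == i else part
                                  for j, part in enumerate(s))
                gain = f(candidate) - base
                if gain > best_gain:
                    best_gain, best_e, best_s = gain, e, candidate
        if best_e is None:      # no strictly positive gain remains
            break
        chosen.add(best_e)
        s = best_s
    return s

# Toy objective: each position i covers items with its own weight w[i].
coverage = {1: {"a"}, 2: {"a", "b"}, 3: {"c"}, 4: {"b", "c"}}
w = [2.0, 1.0]
f = lambda s: sum(w[i] * len(set().union(*(coverage[e] for e in s[i])))
                  for i in range(len(s)))
print(greedy_mksc(V={1, 2, 3, 4}, k=2, B=2, f=f))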

4. Practical challenges
➢ Noisy evaluation
  ❑ In many applications (e.g., influence maximization), obtaining the exact value of f(𝐬) is impractical.
  ❑ f can only be queried through a noisy version F: (1 − ϑ)·f(𝐬) ≤ F(𝐬) ≤ (1 + ϑ)·f(𝐬) for all 𝐬 ∈ (k+1)^V
➢ Streaming
  ❑ Algorithms are required to take only a single pass over V:
    ▪ produce solutions in a timely manner;
    ▪ avoid excessive storage in memory.
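
A small sketch of the noise model: wrapping an exact oracle f in a multiplicative perturbation so that (1 − ϑ)·f(𝐬) ≤ F(𝐬) ≤ (1 + ϑ)·f(𝐬) always holds. The uniform perturbation below is only one way to realize the model; the guarantees quoted in this deck only assume the two-sided bound.

```python
import random

def make_noisy_oracle(f, vartheta, seed=0):
    """Return F with (1 - vartheta) * f(s) <= F(s) <= (1 + vartheta) * f(s).

    The same query always returns the same value (cached), mimicking a fixed
    but inexact estimate such as a Monte Carlo influence estimate.
    """
    rng = random.Random(seed)
    cache = {}

    def F(s):
        key = tuple(frozenset(part) for part in s)   # hashable view of the k-tuple
        if key not in cache:
            cache[key] = f(s) * (1.0 + rng.uniform(-vartheta, vartheta))
        return cache[key]

    return F
```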

5. Our contribution
➢ Two streaming algorithms for MkSC: DStream and RStream
  ❑ Take only a single scan over V
  ❑ Access F instead of f
  ❑ Performance guarantees:
    ▪ Approximation ratio, relating f(𝐬) to f(𝐨), where 𝐨 is the optimal solution
    ▪ Query and memory complexity
➢ Experimental evaluation
  ❑ Influence maximization with k topics
  ❑ Sensor placement with k kinds of sensors

6. DStream
➢ Obtain o such that f(𝐨) ≥ o × B ≥ f(𝐨)/(1 + δ)
  ❑ Using lazy estimation (*)
  ❑ (Notation: bold 𝐨 is the optimal solution; plain o is a scalar estimate of its per-element value.)
➢ For a new element e, if |𝐬| < B:
  ❑ Find max_{i≤k} F(𝐬 ⊔ (e, i)), i.e., the best among the k candidate solutions obtained by putting e into one of the slots S_1, …, S_k
[figure: the arriving element e and the slots S_1, S_2, S_3]
(*) Badanidiyuru, Ashwinkumar, et al. "Streaming submodular maximization: Massive data summarization on the fly." Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.

7. DStream
➢ Obtain o such that f(𝐨) ≥ o × B ≥ f(𝐨)/(1 + δ)
  ❑ Using lazy estimation (*)
➢ For a new element e, if |𝐬| < B:
  ❑ Put e into S_i if F(𝐬 ⊔ (e, i)) / (1 − ϑ) ≥ (|𝐬| + 1) · o / M
(*) Badanidiyuru, Ashwinkumar, et al. "Streaming submodular maximization: Massive data summarization on the fly." Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.

8. DStream
➢ Obtain o such that f(𝐨) ≥ o × B ≥ f(𝐨)/(1 + δ)
  ❑ Using lazy estimation (*)
➢ For a new element e, if |𝐬| < B:
  ❑ Put e into S_i if F(𝐬 ⊔ (e, i)) / (1 − ϑ) ≥ (|𝐬| + 1) · o / M
  ❑ The left-hand side is the largest possible value of f(𝐬 ⊔ (e, i)) consistent with the noisy query (a code sketch of this pass follows this slide)
(*) Badanidiyuru, Ashwinkumar, et al. "Streaming submodular maximization: Massive data summarization on the fly." Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014.
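
Putting slides 6 through 8 together, here is a minimal Python sketch (under the reconstruction of the rule above) of one fixed-threshold DStream pass: for each arriving element e, take the position with the largest noisy value and accept only if F(𝐬 ⊔ (e, i))/(1 − ϑ) clears the (|𝐬| + 1)·o/M bar. The guess o and the parameter M are passed in; in the deck o is maintained by the lazy estimation of slide 14, so treat this as one instance rather than the full method.

```python
def dstream_pass(stream, k, B, F, o, M, vartheta):
    """One fixed-threshold DStream instance (sketch of slides 6-8).

    stream   : iterable of elements (single pass)
    F        : noisy oracle on k-tuples of disjoint sets
    o        : guess with f(optimal) >= o * B >= f(optimal) / (1 + delta)
    M        : parameter > 1 balancing the two cases of the analysis
    vartheta : noise level of F
    """
    s = tuple(set() for _ in range(k))
    size = 0
    for e in stream:
        if size >= B:
            break
        # Value of every way of placing e, queried through the noisy oracle.
        candidates = [
            tuple(part | {e} if j == i else part for j, part in enumerate(s))
            for i in range(k)
        ]
        values = [F(c) for c in candidates]
        i_best = max(range(k), key=lambda i: values[i])
        # Accept only if even the largest possible true value clears the bar.
        if values[i_best] / (1.0 - vartheta) >= (size + 1) * o / M:
            s = candidates[i_best]
            size += 1
    return s
```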

9. DStream's performance guarantee
➢ 𝐱 = (X_1, …, X_k) can also be viewed as a vector 𝐱 : V → {0, 1, …, k}:
  𝐱(e) = i if e ∈ X_i, and 𝐱(e) = 0 if e ∉ ⋃_i X_i
  Example:   element   e_1  e_2  e_3  …  e  …
             𝐱(·)       1    0    4   …  i  …
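
The vector view is also convenient in code; a two-function sketch (illustrative helpers, not from the paper) converting between the set-tuple and the map e ↦ i, with 0 for unassigned elements:

```python
def to_vector(x, V):
    """x = (X_1, ..., X_k) as a map e -> i, with 0 when e is in no part."""
    return {e: next((i + 1 for i, part in enumerate(x) if e in part), 0) for e in V}

def to_sets(vec, k):
    """Inverse direction: rebuild the k-tuple of disjoint subsets."""
    return tuple({e for e, i in vec.items() if i == j + 1} for j in range(k))
```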

10. DStream's performance guarantee
➢ 𝐬_0, 𝐬_1, …, 𝐬_d: the sequence of obtained solutions
  ❑ 𝐬_j is the obtained solution after j elements have been added (|𝐬_j| = j)
➢ Construct a sequence 𝐨_0, 𝐨_1, …, 𝐨_d with 𝐨_j = (𝐨 ⊔ 𝐬_j) ⊔ 𝐬_j
  ❑ i.e., 𝐨_j agrees with 𝐬_j on every element that 𝐬_j assigns, and with 𝐨 everywhere else
  Example (vector view):
    𝐬_j:  1 0 2 3 0 0 0 0
    𝐨:    2 1 2 0 0 3 0 1
    𝐨_j:  1 1 2 3 0 3 0 1

11. DStream's performance guarantee
➢ If in the end |𝐬| = B:
    f(𝐬) ≥ [(1 − ϑ) / ((1 + ϑ)(1 + δ)·M)] · f(𝐨)
[figure: the paired sequences 𝐨_0, 𝐬_0; 𝐨_1, 𝐬_1; …; 𝐨_4, 𝐬_4]
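
The bound follows in a few lines from the acceptance rule on slide 8 and the definition of o; a worked version, under the reconstruction above:

```latex
% Every accepted element passes the threshold test, so after the j-th insertion
%   F(\mathbf{s}_j)/(1-\vartheta) \ge j\, o / M, i.e. F(\mathbf{s}_j) \ge (1-\vartheta)\, j\, o / M.
% If the stream ends with |\mathbf{s}| = B:
f(\mathbf{s}) \;\ge\; \frac{F(\mathbf{s})}{1+\vartheta}
  \;\ge\; \frac{1-\vartheta}{1+\vartheta}\cdot\frac{B\,o}{M}
  \;\ge\; \frac{1-\vartheta}{(1+\vartheta)(1+\delta)\,M}\, f(\mathbf{o}),
\qquad \text{since } B\,o \ge f(\mathbf{o})/(1+\delta).
```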

12. DStream's performance guarantee
➢ If in the end |𝐬| = d < B, with f monotone:
  ❑ Establish a recursive relationship between 𝐨_j and 𝐬_j:
      f(𝐨_{j−1}) + f(𝐬_{j−1}) ≤ f(𝐨_j) + [(1 + ϑ)/(1 − ϑ)] · f(𝐬_j)
  ❑ Bound f(𝐨) − f(𝐨_d):   (*)
      f(𝐨) − f(𝐨_d) ≤ [(1 + ϑ + 2Bϑ)/(1 − ϑ)] · f(𝐬)
  ❑ Bound f(𝐨_d) − f(𝐬):   (**)
      f(𝐨_d) − f(𝐬) ≤ (1/M) · f(𝐨) + [2Bϑ/(1 − ϑ)] · f(𝐬)
  ❑ Discard f(𝐨_d) by combining (*) and (**):
      f(𝐨) ≤ [M/(M − 1)] · [(2 + 4Bϑ)/(1 − ϑ)] · f(𝐬)
[figure: the paired sequences 𝐨_j, 𝐬_j]
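
The final step (discarding f(𝐨_d)) is just the sum of (*) and (**); spelled out:

```latex
% Adding (*) and (**) and cancelling f(\mathbf{o}_d):
f(\mathbf{o}) - f(\mathbf{s})
  \;\le\; \frac{1}{M}\,f(\mathbf{o})
        + \frac{1+\vartheta+2B\vartheta}{1-\vartheta}\,f(\mathbf{s})
        + \frac{2B\vartheta}{1-\vartheta}\,f(\mathbf{s})
\;\Longrightarrow\;
\Bigl(1-\tfrac{1}{M}\Bigr) f(\mathbf{o})
  \;\le\; \frac{(1-\vartheta) + 1+\vartheta+4B\vartheta}{1-\vartheta}\,f(\mathbf{s})
  \;=\; \frac{2+4B\vartheta}{1-\vartheta}\,f(\mathbf{s}),
% hence f(o) <= [M/(M-1)] * [(2 + 4B*vartheta)/(1 - vartheta)] * f(s).
```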

13. DStream's performance guarantee
➢ If in the end |𝐬| = d < B, with f non-monotone:
  ❑ f is pairwise monotone: Δ_{e,i} f(𝐱) + Δ_{e,j} f(𝐱) ≥ 0 for i ≠ j
  ❑ Using the same framework as the monotone case, but with different "math":
      f(𝐨) ≤ [M/(M − 1)] · [(1 + ϑ)(3 + 3ϑ + 6Bϑ)/(1 − ϑ)²] · f(𝐬)
[figure: the paired sequences 𝐨_j, 𝐬_j]

14. DStream
➢ Lazy estimation to obtain o (a sketch of the candidate grid follows this slide):
  • f(𝐨) ∈ [Δ_l, B × Δ_u]
  • o can be taken as a value (1 + δ)^j ∈ [Δ_l / B, M(1 + ϑ)·Δ_u]
➢ Query complexity: O((nk/δ) · log((1 + ϑ)(1 + δ)BM / (1 − ϑ)))
➢ Memory complexity: O((B/δ) · log((1 + ϑ)(1 + δ)BM / (1 − ϑ)))
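
A sketch of the lazy-estimation idea (in the style of Badanidiyuru et al., cited on slide 6): rather than knowing o in advance, keep one DStream instance per candidate value (1 + δ)^j inside the current range, and drop instances whose candidate leaves the range as Δ_l and Δ_u evolve. The helper below only generates the candidate grid; wiring it to the per-instance pass is straightforward.

```python
import math

def threshold_grid(lower, upper, delta):
    """All values (1 + delta)**j lying in [lower, upper] (both assumed > 0)."""
    j_min = math.ceil(math.log(lower, 1.0 + delta))
    j_max = math.floor(math.log(upper, 1.0 + delta))
    return [(1.0 + delta) ** j for j in range(j_min, j_max + 1)]

# e.g. one guess o per grid point in [Delta_l / B, M * (1 + vartheta) * Delta_u]
```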

15. DStream
➢ Approximation ratio (bound on f(𝐨)/f(𝐬)): min_{x ∈ (1, M]} max(b(x), c(x))
➢ If f is monotone:
  • b(x) = x(1 + δ)(1 + ϑ) / (1 − ϑ)
  • c(x) = x(2 + 4Bϑ) / ((1 − ϑ)(x − 1))
➢ If f is non-monotone:
  • b(x) = x(1 + δ)(1 + ϑ) / (1 − ϑ)
  • c(x) = x(1 + ϑ)(3 + 3ϑ + 6Bϑ) / ((1 − ϑ)²(x − 1))
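
The ratio on this slide is easy to evaluate numerically; a small sketch for the monotone case, using b and c as reconstructed above:

```python
def dstream_ratio_monotone(B, vartheta, delta, M, steps=100000):
    """min over x in (1, M] of max(b(x), c(x)), monotone case of slide 15."""
    b = lambda x: x * (1 + delta) * (1 + vartheta) / (1 - vartheta)
    c = lambda x: x * (2 + 4 * B * vartheta) / ((1 - vartheta) * (x - 1))
    xs = (1 + (M - 1) * (t + 1) / steps for t in range(steps))
    return min(max(b(x), c(x)) for x in xs)

# Sanity check: with vartheta = 0 and delta -> 0 this tends to
# min_x max(x, 2x/(x-1)) = 3, attained at x = 3 (provided M >= 3).
print(round(dstream_ratio_monotone(B=10, vartheta=0.0, delta=1e-9, M=10.0), 3))
```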

16. DStream's weakness
➢ Recall the rule: put e into S_i if F(𝐬 ⊔ (e, i)) / (1 − ϑ) ≥ (|𝐬| + 1) · o / M
➢ What if f(𝐬) ≥ (|𝐬| + 1) · o / M already?
  • e may have no contribution to 𝐬, yet it can still be accepted.
  • Better to consider the marginal gain.

17. RStream
➢ For a new element e, if |𝐬| < B:
    d_i = F(𝐬 ⊔ (e, i)) / (1 − ϑ) − F(𝐬) / (1 + ϑ)
  • d_i is an upper bound on the marginal gain Δ_{e,i} f(𝐬)

18. RStream
➢ For a new element e, if |𝐬| < B:
    d_i = F(𝐬 ⊔ (e, i)) / (1 − ϑ) − F(𝐬) / (1 + ϑ)
  • d_i is an upper bound on Δ_{e,i} f(𝐬)
  • Filter out positions S_i with d_i ≤ o/M:
      d_i = 0 if d_i ≤ o/M; otherwise d_i keeps its value
  • Randomly put e into S_i with probability d_i^{T−1} / Σ_j d_j^{T−1},
      where T = |{j : d_j ≥ o/M}| is the number of surviving positions
  (a code sketch of this step follows this slide)
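
Here is a Python sketch of one RStream insertion decision (the caller checks |𝐬| < B): compute the marginal upper bounds d_i, zero out the positions that fail the o/M filter, and sample among the survivors with weight d_i^(T−1). The exponent T − 1 and the exact filter follow the slide fragments above, so treat the weighting as a best-effort reading rather than the paper's verbatim rule.

```python
import random

def rstream_step(s, e, k, F, o, M, vartheta, rng=random):
    """One RStream insertion decision for element e (sketch of slides 17-18)."""
    candidates = [
        tuple(part | {e} if j == i else part for j, part in enumerate(s))
        for i in range(k)
    ]
    base = F(s)
    # Upper bounds on the true marginal gains Delta_{e,i} f(s).
    d = [F(c) / (1 - vartheta) - base / (1 + vartheta) for c in candidates]
    # Positions whose bound does not clear o/M are treated as contributing nothing.
    d = [di if di > o / M else 0.0 for di in d]
    survivors = [i for i in range(k) if d[i] > 0.0]
    if not survivors:
        return s                       # e is skipped entirely
    T = len(survivors)
    weights = [d[i] ** (T - 1) for i in range(k)]   # zeroed positions keep weight 0 for T > 1
    if T == 1:                                      # d**0 would give weight 1 everywhere
        weights = [1.0 if i in survivors else 0.0 for i in range(k)]
    i = rng.choices(range(k), weights=weights, k=1)[0]
    return candidates[i]
```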

19. RStream
➢ d_i = F(𝐬 ⊔ (e, i)) / (1 − ϑ) − F(𝐬) / (1 + ϑ) is an upper bound on Δ_{e,i} f(𝐬)
➢ What if F(𝐬) ≈ f(𝐬) = f(𝐬 ⊔ (e, i)) ≈ F(𝐬 ⊔ (e, i))?
  • e has no contribution,
  • but d_i ≈ [2ϑ / (1 − ϑ²)] · f(𝐬), which may still be ≥ o/M.

20. RStream
➢ d_i = F(𝐬 ⊔ (e, i)) / (1 − ϑ) − F(𝐬) / (1 + ϑ) is an upper bound on Δ_{e,i} f(𝐬)
➢ (Denoise) Run multiple instances, each instance assuming F is less noisy than it is (see the sketch after this slide):
    d_{i,ϑ′} = F(𝐬 ⊔ (e, i)) / (1 − ϑ′) − F(𝐬) / (1 + ϑ′)
  where ϑ′ ∈ {0, ϑ/(θ − 1), 2ϑ/(θ − 1), …, ϑ}
  • θ: adjustable parameter, controlling the number of instances
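
The denoising trick replays the same computation at several assumed noise levels; a small sketch generating the grid ϑ′ ∈ {0, ϑ/(θ − 1), …, ϑ} and the corresponding bounds:

```python
def noise_grid(vartheta, theta):
    """theta assumed noise levels, evenly spaced from 0 to vartheta."""
    if theta == 1:
        return [0.0]
    return [j * vartheta / (theta - 1) for j in range(theta)]

def denoised_bounds(F_new, F_old, vartheta, theta):
    """d_{i, vartheta'} for one candidate placement, one value per assumed level."""
    return [F_new / (1 - vp) - F_old / (1 + vp) for vp in noise_grid(vartheta, theta)]
```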

21. (Denoise) Run multiple instances; each instance assumes F is less noisy than it actually is.

22. Lazy estimation: Δ_u is much larger than the one in DStream, in order to bound the values d_i.
➢ Query complexity: O((nkθ/δ) · log((1 + ϑ)(2 + 4Bϑ)(1 + δ)BM / (1 − ϑ)²))
➢ Memory complexity: O((θB/δ) · log((1 + ϑ)(2 + 4Bϑ)(1 + δ)BM / (1 − ϑ)²))
