The Lifecyle of a Youtube Video: Phases, Content and Popularity


  1. The Lifecyle of a Youtube Video: Phases, Content and Popularity
     Honglin Yu, Lexing Xie, Scott Sanner
     Australian National University, NICTA
     May 22, 2015

  2. Overview
     “The scarce, and therefore valuable, resource is now attention” (B. A. Huberman)
     ◮ Previous: Crane and Sornette’s model (PNAS 2008)
     ◮ But, in reality ...
     [Figure: two panels of daily viewcount over time (monthly date axis)]

  3. Generalized Power-law Phases
     [Ours]: $x[t] = a t^{b} + c$
     ◮ sufficiently expressive for monotonic curves
     ◮ model multiple phases
     ◮ account for different background processes
     [CS08]: $x[t] \sim t^{b}$
     ◮ result of epidemic branching processes
     Both are efficient to fit.

  4. The Phase-finding Algorithm
     $$\min \; \underbrace{\sum_{i=1}^{n} E_i\{x[t_i^s : t_i^e],\, a_i, b_i, c_i\}}_{\text{fitting error}} + \underbrace{\eta\,(n-1)}_{\text{regularizer}}$$
     Here $t_i^s : t_i^e$ marks the boundary of phase $i$, and $a_i, b_i, c_i$ are its parameters.
     ◮ Try all possible segmentations
     ◮ Dynamic programming with fitting in the loop
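
The objective on this slide can be sketched as a small dynamic program. This is only an illustration under stated assumptions, not the authors' implementation (that lives in the segfit repository linked in the summary). The names `fit_power_law`, `segment`, `b_grid` and `min_len` are hypothetical, the fitting error $E_i$ is assumed to be a sum of squared residuals, and a coarse grid search over $b$ stands in for a proper nonlinear fit.

```python
import math


def fit_power_law(x, b_grid=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Least-squares fit of x[t] = a * t**b + c for t = 1..len(x).

    Assumption: b is picked from a small grid; for each b, a and c follow
    from ordinary linear regression of x on t**b.
    Returns (squared_error, a, b, c).
    """
    n = len(x)
    mean_x = sum(x) / n
    best = None
    for b in b_grid:
        u = [float(t) ** b for t in range(1, n + 1)]
        mean_u = sum(u) / n
        var_u = sum((ui - mean_u) ** 2 for ui in u)
        cov = sum((ui - mean_u) * (xi - mean_x) for ui, xi in zip(u, x))
        a = cov / var_u if var_u > 0 else 0.0
        c = mean_x - a * mean_u
        err = sum((a * ui + c - xi) ** 2 for ui, xi in zip(u, x))
        if best is None or err < best[0]:
            best = (err, a, b, c)
    return best


def segment(x, eta, min_len=3):
    """Split x into phases minimising sum of fit errors + eta * (n_phases - 1).

    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(x)
    cost = [math.inf] * (n + 1)
    cost[0] = -eta  # the first phase is free, so n phases cost eta * (n - 1)
    back = [0] * (n + 1)
    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if cost[i] == math.inf:
                continue
            # "fitting in the loop": refit the candidate phase x[i:j]
            cand = cost[i] + fit_power_law(x[i:j])[0] + eta
            if cand < cost[j]:
                cost[j], back[j] = cand, i
    # Walk the back-pointers to recover the phase boundaries.
    bounds, j = [], n
    while j > 0:
        bounds.append((back[j], j))
        j = back[j]
    return bounds[::-1]
```

A larger `eta` penalises each extra phase more heavily, so it trades fitting error against the number of phases. Each phase restarts its local time index at 1, which is a simplification made for this sketch.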

  5. The “Tweeted Video” Dataset
     Category        #videos    Category     #videos
     Music             64096    Howto           4357
     Entertainment     26602    Travel          3379
     Comedy            14616    Games           3299
     People            12759    Nonprofit       2672
     News              10422    Autos           2398
     Film               8356    Animals         2375
     Sports             7872    Shows            407
     Tech               4626    Movies            15
     Education          4577    Trailers          13
     Total number: 172841
     ◮ Unique longitudinal popularity history for a large and diverse set of videos
     ◮ From a 20-30% sample of tweets, 2009.06-07

  6. Examples of Segmentation Result
     [Figure: four panels of daily viewcount vs. dates (mmm-yy): (a) ID: 3o3hfNmtxYg, (b) ID: IoNcZRkwbCA, (c) ID: Hi0cQ5ELdt4, (d) ID: LRDihKbdrwc]

  7. Four Types of Phases
     172K videos; #phase: 3.3/video; #views: 22K+/phase; duration: 233/phase
     [Figure: bar chart of the share of each phase type (convex.inc, convex.dec, concave.inc, concave.dec), axis 0% to 60%]
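
The four phase types can be read off the parameters of the power-law model $x[t] = a t^{b} + c$ from slide 3. The sketch below derives the type from the signs of the first and second derivatives for $t > 0$. The function name `phase_type` and the `"flat.or.linear"` label are hypothetical, and the rule comes from elementary calculus rather than from the slides.

```python
def phase_type(a: float, b: float) -> str:
    """Classify a phase x[t] = a * t**b + c (t > 0) by trend and curvature."""
    slope = a * b                # sign of dx/dt = a*b*t**(b-1)
    curvature = a * b * (b - 1)  # sign of d2x/dt2 = a*b*(b-1)*t**(b-2)
    if slope == 0 or curvature == 0:
        return "flat.or.linear"  # degenerate cases: a == 0, b == 0 or b == 1
    shape = "convex" if curvature > 0 else "concave"
    trend = "inc" if slope > 0 else "dec"
    return f"{shape}.{trend}"
```

For example, a decaying curve with `a > 0` and `b < 0` comes out as `convex.dec`.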

  8. #Phase vs. Video Popularity
     ◮ Popular videos have more phases.
     [Figure: %videos with various #phases (1, 2, 3, 4, 5, 6, ≥ 7) vs. popularity percentile (5 to 95)]

  9. Dominant Convex Decreasing Phases
     ◮ Novelty is the (only) most important factor
     ◮ Do not revive
     [Figure: daily viewcount vs. #days after uploading (0 to 700), annotated $T_{\text{phase}} \geq 0.9\,T$]
     [Figure: %videos with domVexDec phase, by video category]
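
The dominance criterion on this slide can be sketched as a simple check over a segmentation result. It assumes that $T$ is the full observed lifetime of the video and that each phase is given as a `(start, end, kind)` tuple with `end` exclusive. The function name and the tuple layout are hypothetical.

```python
def has_dominant_phase(phases, total_days, kind="convex.dec", ratio=0.9):
    """True if one phase of the given kind lasts at least ratio * total_days."""
    return any(
        k == kind and (end - start) >= ratio * total_days
        for start, end, k in phases
    )
```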

  10. How does popularity change?
     ◮ Many videos go through a jump in popularity.
     ◮ They have been in a continuously increasing phase, or have at least one new phase.
     [Figure: popularity percentile at 6 months (5 to 95) vs. popularity percentile (%) at 1 year (5 to 95)]
     [Figure: rank change vs. time range]

  11. Phase-aware Viewcount Prediction
     ◮ Target: $\chi^{*} = \sum_{\tau=1}^{\Delta t} x[t_p + \tau]$
     ◮ Prediction: $\hat{\chi} = \sum_{\tau=1}^{t_p} \alpha_\tau x[\tau]$
     ◮ Measure: normalized MSE, $\epsilon = \frac{1}{|\mathcal{V}|} \sum_{v \in \mathcal{V}} (\chi^{*} - \hat{\chi})^{2}$
     [Figure: daily viewcount vs. days after uploading (0 to 100), with feature and target windows on either side of the pivot date $t_p$]
     [Figure: prediction error vs. pivot date (30 to 120), panels include vex.inc and cav.dec]
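
The target and prediction sums on this slide translate into list slices once the 1-based day index is mapped to 0-based positions. The sketch below assumes the weights `alpha` are already known (how they are learned is not shown on the slide) and computes a plain mean squared error. The slide's normalization is not recoverable from the transcript, so it is left out. All function names are hypothetical.

```python
def target(x, t_p, dt):
    """Sum of daily views on days t_p+1 .. t_p+dt (x[0] is day 1)."""
    return sum(x[t_p:t_p + dt])


def predict(x, t_p, alpha):
    """Weighted sum of the first t_p daily views, one weight per day."""
    if len(alpha) != t_p:
        raise ValueError("alpha needs exactly t_p weights")
    return sum(a * v for a, v in zip(alpha, x[:t_p]))


def mean_squared_error(targets, predictions):
    """Average squared gap between targets and predictions over all videos."""
    pairs = list(zip(targets, predictions))
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)
```

For a video with daily views `[1, 2, 3, 4, 5, 6]`, pivot date 3 and a two-day horizon, the target is the views on days 4 and 5.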

  12. Summary
     ◮ Main contributions
       ◮ New representation: popularity phases.
       ◮ New method: phase extraction algorithm.
       ◮ A large-scale measurement study.
       ◮ Better viewcount prediction.
     ◮ Links
       ◮ Segmentation Algorithm: https://github.com/yuhonglin/segfit
       ◮ Dataset: https://github.com/yuhonglin/ytphasedata
       ◮ Data crawler: https://github.com/yuhonglin/YTCrawl
     ◮ Our on-going work: generative model of popularity
     Thank you!

  13. [CS08] Riley Crane and Didier Sornette. Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences, 105(41):15649–15653, 2008.
