Popularity over time: Analysis of Videos on YouTube - Tizian Sarre


  1. Fakultät für Informatik, Technische Universität München. Popularity over time: Analysis of Videos on YouTube. Tizian Sarre. Advisor(s): Dr. Heiko Niedermayer. Supervisor: Prof. Dr.-Ing. Georg Carle. Chair of Network Architectures and Services, Department of Informatics, Technical University of Munich (TUM)

  2. Outline • YouTube Platform • Motivation • Dataset • Modeling • Conclusion • Future work • References

  3. The Platform • 2nd biggest website worldwide • Over 1 billion users • More than 4 billion daily views • More than 300 hours of video uploaded every minute

  4. Motivation. Huge amounts of data: • Storage costs • Networking costs. Video popularity analysis: • Better understanding of user behavior • Modeling video views • (Network performance improvement)

  5. Dataset Overview. Measured for 6 months (October 16th 2015 - April 14th 2016). YouTube metrics: • Static information for 59328 videos • 8909 unavailable videos • 58594 measured videos. Facebook metrics: • About the YouTube videos from the dataset

  6. Dataset (YouTube). YouTube metrics. Static video information: • Title • Duration • Description • Published • … Unavailable videos: • Time. Dynamic video measurements: • Views • Likes • Dislikes • Comments

  7. YouTube Analysis -> Metrics Correlation • YouTube views correlate strongly and positively with video likes
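
A correlation like the one on this slide could be checked roughly as follows. This is a minimal sketch, not the presentation's actual analysis: the slide does not say which coefficient was used, so Pearson on log-scaled counts is an assumption, and the sample numbers are made up rather than taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up (views, likes) pairs for illustration only, not from the dataset.
views = [120, 950, 4_300, 18_000, 250_000]
likes = [3, 20, 95, 410, 6_100]

# Counts span several orders of magnitude, so compare them on a log scale.
r = pearson([math.log10(v) for v in views], [math.log10(l) for l in likes])
print(f"correlation of log-views and log-likes: {r:.3f}")
```

A value close to 1 would correspond to the strong positive correlation reported on the slide.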

  8. YouTube Analysis -> Search Ranking Benefits • Keywords, description, etc. show only a weak positive correlation with views

  9. YouTube Analysis -> Unavailable Videos • Most videos are taken down early • Still, the median lifetime is surprisingly high (23)

  10. Dataset (Facebook). Facebook metrics. Dynamic YouTube video measurements: • Likes • Comments • Shares • Totals

  11. Facebook Analysis: Influence on YouTube • Facebook shares correlate positively with YouTube video views • The linear regression line appears "curved" because of the logarithmic scale
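
Why a straight regression line looks curved on logarithmic axes can be seen with a small sketch. The intercept and slope below are hypothetical values chosen for illustration; they are not the fitted values from the presentation.

```python
import math

def line(shares):
    """Hypothetical linear regression line: views = 500 + 40 * shares."""
    return 500.0 + 40.0 * shares

def loglog_slope(f, x1, x2):
    """Slope of f between x1 and x2 as it would appear on log-log axes."""
    return (math.log10(f(x2)) - math.log10(f(x1))) / (math.log10(x2) - math.log10(x1))

# On linear axes the slope is 40 everywhere. On log-log axes it changes:
slope_low = loglog_slope(line, 1, 10)          # roughly 0.22
slope_high = loglog_slope(line, 1000, 10000)   # roughly 1.0
print(slope_low, slope_high)
```

Because of the non-zero intercept, the apparent slope on log-log axes grows from near 0 toward 1 as the share count increases, which shows up as a bend in the plotted line.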

  12. Modeling -> Top Days. Certain videos are more popular than others -> When and how often do the highest views gains (top days) occur?

  13. Modeling -> Event Days. When and how often do distinctly popular days (event days) occur? How exactly are event days defined? Desired properties: • Independence of future data • Outstanding popularity • Independence of video age • Independence of event day occurrences • Dependence on popularity only

  14. Modeling -> Event Days. Several event day definitions were attempted: • Absolute median • Popularity categories • Power law-based • Varying event models

  15. Modeling -> Event Days -> Absolute Median. Idea: Calculate the median of the daily views gains across all videos in the dataset for each day of their lifetimes -> Use the daily medians as event day decider

  16. Modeling -> Event Days -> Absolute Median. Why not use the average? It is too heavily influenced by high values -> It represents reality less appropriately. [Figure: daily views gains medians of the dataset]

  17. Modeling -> Event Days -> Absolute Median. How do we use the daily views gains medians to derive event days? We could use the exact medians as decider -> Better: add an arbitrary positive modifier to ensure outstanding popularity

  18. Modeling -> Event Days -> Absolute Median. Observation: Lots of event days

  19. Modeling -> Event Days -> Absolute Median. Problem: Daily views gains medians are not a good decider • Values too small -> Need a more fine-grained decider
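
The absolute median approach could be sketched as below. This is an illustration under assumptions, not the presentation's implementation: the data layout (a dict of per-video gain lists indexed by day of lifetime), the function names, the additive form of the modifier, and all numbers are hypothetical.

```python
from statistics import median

def daily_medians(gains_by_video):
    """Median views gain across all videos for each day of video lifetime."""
    longest = max(len(gains) for gains in gains_by_video.values())
    medians = []
    for day in range(longest):
        # Only videos that are at least `day + 1` days old contribute.
        todays = [g[day] for g in gains_by_video.values() if len(g) > day]
        medians.append(median(todays))
    return medians

def event_days(gains, medians, modifier):
    """Days on which a video's gain exceeds the daily median plus a modifier."""
    return [day for day, gain in enumerate(gains) if gain > medians[day] + modifier]

# Made-up daily views gains; video "b" is far more popular than the others.
gains_by_video = {
    "a": [10, 4, 3, 2],
    "b": [200, 90, 300, 20],
    "c": [12, 5, 4],
}
medians = daily_medians(gains_by_video)   # [12, 5, 4, 11.0]
events_a = event_days(gains_by_video["a"], medians, modifier=50)   # []
events_b = event_days(gains_by_video["b"], medians, modifier=50)   # [0, 1, 2]
print(medians, events_a, events_b)
```

On day 0 the mean gain would be 74 while the median is 12, which illustrates why the average is pulled toward the few very popular videos. The same effect makes the medians small, so a popular video such as "b" produces an event day almost every day.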

  20. Modeling -> Event Days -> Popularity Categories. Idea: Classify videos dynamically according to popularity categories -> Smaller intervals for lower views gains (due to the geometric views gains distribution) -> More realistic event day determination via the interval median as decider

  21. Modeling -> Event Days -> Popularity Categories. Problem: Event day determination depends on the interval choice -> Seemingly random event days -> Dependence on popularity only is violated
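
How the interval choice can change the outcome is shown in the following sketch. The slides do not specify the category boundaries, so the geometric intervals [base**k, base**(k+1)), the function names, and the sample gains are assumptions made for illustration.

```python
from statistics import median

def category(gain, base):
    """Index k of the geometric interval containing gain.

    Category 0 covers [0, base); category k >= 1 covers [base**k, base**(k+1)).
    Integer arithmetic avoids floating-point issues at interval boundaries.
    """
    k = 0
    upper = base
    while gain >= upper:
        upper *= base
        k += 1
    return k

def is_event(gain, all_gains, base):
    """Event if the gain exceeds the median of all gains in its own category."""
    own = category(gain, base)
    peers = [g for g in all_gains if category(g, base) == own]
    return gain > median(peers)

# Made-up views gains of several videos on the same day.
gains = [3, 5, 8, 12, 15, 40, 90, 120, 700]

event_base10 = is_event(40, gains, base=10)   # peers 12, 15, 40, 90 -> True
event_base2 = is_event(40, gains, base=2)     # only peer is 40 itself -> False
print(event_base10, event_base2)
```

The same gain of 40 is an event day with intervals of base 10 but not with base 2, which matches the problem stated on the slide: the result depends on the interval choice rather than on popularity alone.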

  22. Modeling -> Event Days -> Power Law. Idea: Calculate a power law-based model on the dataset's views gains -> Strong positive deviations are event days

  23. Modeling -> Event Days -> Power Law. Results: • Slightly fluctuating but overall decreasing event day occurrence • Event days less likely than with absolute medians

  24. Modeling -> Event Days -> Power Law. Positives: • Relatively reasonable model. Negatives: • Model changes are not considered • Multiple models are not supported
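
A power law-based decider could look roughly like the sketch below. The slides do not give the fitting procedure or the deviation threshold, so the log-log least-squares fit, the factor of 2, and the sample data are all assumptions.

```python
import math

def fit_power_law(gains):
    """Least-squares fit of gain ~ c * day**b in log-log space (days start at 1).

    All gains must be positive, since their logarithm is taken.
    """
    xs = [math.log(day) for day in range(1, len(gains) + 1)]
    ys = [math.log(g) for g in gains]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    c = math.exp(mean_y - b * mean_x)
    return c, b

def power_law_event_days(gains, factor=2.0):
    """Days whose gain exceeds `factor` times the fitted power law value."""
    c, b = fit_power_law(gains)
    return [day for day, g in enumerate(gains, start=1) if g > factor * c * day ** b]

# Made-up gains decaying like 1000 / day, with a spike on day 6.
gains = [1000.0 / day for day in range(1, 11)]
gains[5] *= 5
events = power_law_event_days(gains)   # [6]
print(events)
```

The model is fitted once over the whole series, so it neither adapts after the spike nor supports several models per video, which are the negatives listed on the slide.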

  25. Modeling -> Event Days -> Varying Event Models. Idea: Adjust the current power law model when strong deviations occur

  26. Modeling -> Event Days -> Varying Event Models. Deviation weight using least squares? -> No good: high deviations between views gains are weighted too severely

  27. Modeling -> Event Days -> Varying Event Models. Better with a relative measure -> Deviations are weighted linearly

  28. Modeling -> Event Days -> Varying Event Models. Decider for model adaption? More/less than 50% of the expected value -> Further model adaptions less likely (consecutive event days less likely)

  29. Modeling -> Event Days -> Varying Event Models • Similar results compared to the power law approach • Still no multiple event models supported due to uncertainty
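
The difference between the squared and the relative deviation measure can be sketched as follows. The exact formulas are not given on the slides; reading "more/less than 50% of the expected value" as a relative deviation above 0.5 is an assumption, and the function names and numbers are hypothetical.

```python
def squared_deviation(observed, expected):
    """Least-squares style weight: grows quadratically with the absolute gap."""
    return (observed - expected) ** 2

def relative_deviation(observed, expected):
    """Relative weight: grows linearly and is independent of the video's scale."""
    return abs(observed - expected) / expected

def needs_adaption(observed, expected, threshold=0.5):
    """Assumed decider: adapt the model if the gain deviates by more than 50%."""
    return relative_deviation(observed, expected) > threshold

# A small video doubling its gain vs. a large video fluctuating by 10%.
small = (20, 10)           # (observed, expected)
large = (11_000, 10_000)

print(squared_deviation(*small), squared_deviation(*large))     # 100 vs 1000000
print(relative_deviation(*small), relative_deviation(*large))   # 1.0 vs 0.1
print(needs_adaption(*small), needs_adaption(*large))           # True vs False
```

With squared deviations, the large video's mild fluctuation outweighs the small video's doubling by a factor of 10000, whereas the relative measure ranks the doubling as the stronger deviation.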

  30. Conclusion • We discussed various better and worse possible event day definitions • External popularity influence not considered due to uncertainty • Vague models for predictions

  31. Future Work • Consider video recommendations for popularity analysis • YouTube internal and external search engine rank/hits • Consider YouTube channel subscriber base for video popularity analysis • Twitter/Snapchat/Instagram social media influence analysis • Alternate event day definitions and multiple models detection support (e.g. with CUSUM)
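
For reference, a basic one-sided CUSUM detector, as mentioned in the last item, might look like the sketch below. It shows the general textbook technique only; how it would be applied to views gains is left open by the presentation, and the parameter values and input series are hypothetical.

```python
def cusum_first_alarm(values, target=0.0, slack=0.5, threshold=1.0):
    """Index of the first upward CUSUM alarm, or None if there is none."""
    s = 0.0
    for i, v in enumerate(values):
        # Accumulate only the part of the deviation that exceeds the slack;
        # the sum is reset to zero whenever it would become negative.
        s = max(0.0, s + (v - target - slack))
        if s > threshold:
            return i
    return None

# Made-up deviations from a model: close to zero, then a sustained upward shift.
residuals = [0.0, 0.1, -0.1, 0.0, 1.0, 1.0, 1.0, 1.0]
alarm = cusum_first_alarm(residuals)   # 6
print(alarm)
```

The alarm is raised only after the shift has persisted for a few samples, so a single outlier does not trigger it; this is what could make the method a candidate for detecting a change of model rather than a single event day.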

  32. Thank you

  33. References • http://www.alexa.com/topsites • https://www.youtube.com/yt/press/en-GB/statistics.html • https://cnet3.cbsistatic.com/hub/i/r/2013/11/11/c3fb1098-6de7-11e3-913e-14feb5ca9861/resize/570xauto/4ddbc82dd9df232db62c49b29192f268/sandvine-2H13-NA-top10-peak_.png • http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/complete-white-paper-c11-481360.html
