5/28/20 Tim Althoff, UW CS547: Machine Learning for Big Data, http://www.cs.washington.edu/cse547
• Learned about: LSH/similarity search & recommender systems
• Search: “jaguar”
• Uncertainty about the user’s information need
  - Don’t put all eggs in one basket!
• Relevance isn’t everything: we also need diversity!
Diversity also matters in:
• Recommendation
• Summarization (e.g., “Robert Downey Jr.”)
• News media
[ Althoff et al., KDD 2015 ]
[Figure: timeline for Robert Downey Jr. (born 1965) spanning 1985 to 2015; labels include Deborah Falconer, The Party's Over, Ben Stiller, Fiona Apple, Susan Downey, Robert Downey, Sr., Paramount Pictures, Chaplin, Ally McBeal, Gothika, Iron Man, Iron Man 2, Iron Man 3, and The Avengers]
• Goal: The timeline should express his relationships to other people through events (personal, collaboration, mentorship, etc.)
• Why timelines?
  - Easier: The Wikipedia article is 18 pages long
  - Context: Through relationships & event descriptions
  - Exploration: Can “jump” to other people
• Given:
  - Relevant relationships
  - Events that each cover some relationships
• Goal: Given a large set of events, pick a small subset that explains most known relationships (“the timeline”)
Demo available at: http://cs.stanford.edu/~althoff/timemachine/demo.html
• User studies: People hate redundancy!
[Figure: a redundant timeline (near-identical Iron Man US Release / EU Release events) vs. a more varied one (e.g., Chaplin Academy Award nomination, Rented Lips, Iron Man US Release)]
• Want to see a more diverse set of relationships
• Idea: Encode diversity as a coverage problem
• Example: Selecting events for a timeline
  - Try to cover all important relationships
• Q: What is being covered?
• A: Relationships (e.g., to Captain America, Anthony Hopkins, Gwyneth Paltrow, Susan Downey)
• Q: Who is doing the covering?
• A: Events (e.g., “Downey Jr. starred in Chaplin together with Anthony Hopkins”)
• Suppose we are given a set of events $E$
  - Each event $e$ covers a set $X_e \subseteq U$ of relationships
• For a set of events $S \subseteq E$ we define:
  $F(S) = \left| \bigcup_{e \in S} X_e \right|$
• Goal: We want to $\max_{|S| \le k} F(S)$ (cardinality constraint $|S| \le k$)
• Note: $F$ is a set function: $F : 2^E \to \mathbb{N}$
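The coverage objective can be sketched in a few lines of Python. This is only an illustration: the function name `coverage`, the dictionary `covers`, and the toy event-to-relationship mapping are assumptions made for the example, not data from the lecture.

```python
from typing import Dict, Iterable, Set


def coverage(selected: Iterable[str], covers: Dict[str, Set[str]]) -> int:
    """F(S): number of distinct relationships covered by the events in S."""
    covered: Set[str] = set()
    for event in selected:
        covered |= covers[event]  # union of X_e over all e in S
    return len(covered)


# Hypothetical toy instance: event -> set of relationships it covers (X_e).
covers = {
    "Chaplin": {"Anthony Hopkins"},
    "Iron Man": {"Gwyneth Paltrow"},
    "The Avengers": {"Captain America", "Gwyneth Paltrow"},
    "Wedding": {"Susan Downey"},
}

# Overlapping events do not count a relationship twice.
print(coverage(["Iron Man", "The Avengers"], covers))  # 2
```

Because F counts the union, adding "Iron Man" after "The Avengers" contributes nothing new in this toy instance, which is how redundancy is penalized.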
• Given a universe of elements $U = \{u_1, \dots, u_n\}$ and sets $X_1, \dots, X_m \subseteq U$
  - $U$: all relationships
  - $X_i$: relationships covered by event $i$
[Figure: universe U containing overlapping sets X_1, X_2, X_3, X_4]
• Goal: Find a set of k events covering most of $U$
  - More precisely: find k of the sets $X_i$ whose union is the largest
Simple heuristic: greedy algorithm
• Start with $S_0 = \{\}$
• For $i = 1, \dots, k$:
  - Take the event $e$ that maximizes $F(S_{i-1} \cup \{e\})$
  - Let $S_i = S_{i-1} \cup \{e\}$
  where $F(S) = \left| \bigcup_{e \in S} X_e \right|$
• Example:
  - Evaluate $F(\{e_1\}), \dots, F(\{e_m\})$, pick the best (say $e_1$)
  - Evaluate $F(\{e_1\} \cup \{e_2\}), \dots, F(\{e_1\} \cup \{e_m\})$, pick the best (say $e_2$)
  - Evaluate $F(\{e_1, e_2\} \cup \{e_3\}), \dots, F(\{e_1, e_2\} \cup \{e_m\})$, pick the best
  - And so on…
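A minimal sketch of the greedy procedure, assuming events are given as a dictionary from event name to covered set. The function name `greedy_max_coverage` and the toy sets `e1`…`e4` are made up for illustration. Maximizing $F(S_{i-1} \cup \{e\})$ is the same as maximizing the number of newly covered elements, which is what the inner loop does.

```python
from typing import Dict, Hashable, List, Set, Tuple


def greedy_max_coverage(
    covers: Dict[str, Set[Hashable]], k: int
) -> Tuple[List[str], Set[Hashable]]:
    """Pick up to k events, each time taking the one with the largest marginal gain."""
    selected: List[str] = []
    covered: Set[Hashable] = set()
    for _ in range(k):
        best_event, best_gain = None, 0
        for event, items in covers.items():
            if event in selected:
                continue
            gain = len(items - covered)  # elements this event would newly cover
            if gain > best_gain:
                best_event, best_gain = event, gain
        if best_event is None:  # nothing adds coverage any more
            break
        selected.append(best_event)
        covered |= covers[best_event]
    return selected, covered


# Hypothetical toy instance.
covers = {"e1": {1, 2, 3, 4}, "e2": {3, 4, 5}, "e3": {5, 6}, "e4": {1, 2}}
selected, covered = greedy_max_coverage(covers, k=2)
print(selected, len(covered))  # ['e1', 'e3'] 6
```

Ties are broken by dictionary order here; that choice is arbitrary and not prescribed by the slides.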
• Goal: Maximize the covered area
[Figure: three overlapping sets A, B, C]
• Goal: Maximize the size of the covered area with two sets
• Greedy first picks A and then C
• But the optimal way would be to pick B and C
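The same behavior can be reproduced on a small instance. The concrete sets below are an assumption chosen so that A is the largest single set while B and C together cover more; they are not the sets drawn on the slide.

```python
from itertools import combinations

# Hypothetical sets: A is largest, but B and C are disjoint and jointly cover more.
sets = {
    "A": set(range(2, 9)),   # {2, ..., 8}
    "B": set(range(1, 6)),   # {1, ..., 5}
    "C": set(range(6, 12)),  # {6, ..., 11}
}


def union_size(names):
    """Size of the union of the named sets."""
    return len(set().union(*(sets[n] for n in names)))


def greedy_pick(k):
    """Greedy: repeatedly add the set that maximizes the size of the union."""
    chosen = []
    for _ in range(k):
        remaining = [n for n in sets if n not in chosen]
        chosen.append(max(remaining, key=lambda n: union_size(chosen + [n])))
    return chosen


greedy_choice = greedy_pick(2)
optimal_choice = max(combinations(sets, 2), key=union_size)

print(greedy_choice, union_size(greedy_choice))    # ['A', 'C'] 10
print(optimal_choice, union_size(optimal_choice))  # ('B', 'C') 11
```

Greedy commits to A in the first step and cannot undo that choice, so it ends one element short of the optimum on this instance.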
• Bad news: Maximum Coverage is NP-hard
• Good news: Good approximations exist
  - The problem has structure that lets even a simple greedy algorithm perform reasonably well
  - Details in the 2nd half of the lecture
• Now: Generalize our objective for timeline generation
• Objective values all relationships equally:
  $F(S) = \left| \bigcup_{e \in S} X_e \right| = \sum_{r \in R} 1$ where $R = \bigcup_{e \in S} X_e$
• Unrealistic: Some relationships are more important than others
  - Use different weights (“weighted coverage function”):
    $F(S) = \sum_{r \in R} w(r)$ with $w : R \to \mathbb{R}^{+}$
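A short sketch of the weighted coverage function, with unit weights recovering the unweighted objective. The names `weighted_coverage`, `covers`, and `weight`, as well as the toy weights, are assumptions for illustration.

```python
from typing import Dict, Iterable, Set


def weighted_coverage(
    selected: Iterable[str],
    covers: Dict[str, Set[str]],
    weight: Dict[str, float],
) -> float:
    """F(S) = sum of w(r) over all relationships r covered by the events in S."""
    covered: Set[str] = set()
    for event in selected:
        covered |= covers[event]
    return sum(weight[r] for r in covered)


# Hypothetical toy instance: r2 is covered by both events but counted once.
covers = {"e1": {"r1", "r2"}, "e2": {"r2", "r3"}}
weight = {"r1": 3.0, "r2": 1.0, "r3": 0.5}

print(weighted_coverage(["e1"], covers, weight))        # 4.0
print(weighted_coverage(["e1", "e2"], covers, weight))  # 4.5

# With w(r) = 1 for every r, this is the plain coverage count.
unit = {r: 1.0 for r in weight}
print(weighted_coverage(["e1", "e2"], covers, unit))    # 3.0
```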
• Use global importance weights
  - How much interest is there?
  - Could be measured as:
    w(X) = # search queries for person X
    w(X) = # Wikipedia article views for X
    w(X) = # news article mentions for X
[Figure: relationships to Captain America, Anthony Hopkins, Gwyneth Paltrow, and Susan Downey]
[Figure: applying global importance weights to Captain America, Susan Downey, Justin Bieber, and Tim Althoff]
• Some relationships are not globally important but highly relevant to the timeline (and vice versa)
• Need weights that measure relevance to the timeline rather than global importance:
  w(Susan Downey | RDJr) > w(Justin Bieber | RDJr)
• Can use co-occurrence statistics:
  w(X | RDJr) = #(X and RDJr) / (#(RDJr) * #(X))
  - Similar: Pointwise mutual information (PMI)
  - How often do X and Y occur together compared to what you would expect if they were independent?
  - Accounts for popular entities (e.g., Justin Bieber)
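The normalization can be illustrated with made-up counts. All numbers below are hypothetical and chosen only to show the effect: a very popular entity with few joint mentions gets a lower weight than a less popular one with many joint mentions. The function names and the corpus size `n_total` are likewise assumptions; the PMI formula is the standard log-ratio of the joint probability to the product of the marginals.

```python
import math


def cooccurrence_weight(n_joint: int, n_target: int, n_x: int) -> float:
    """w(X | target) = #(X and target) / (#(target) * #(X)); counts assumed positive."""
    return n_joint / (n_target * n_x)


def pmi(n_joint: int, n_target: int, n_x: int, n_total: int) -> float:
    """Pointwise mutual information from raw counts over n_total documents."""
    return math.log(n_joint * n_total / (n_target * n_x))


# Hypothetical counts (not real statistics).
n_total = 1_000_000  # documents in the corpus
n_rdj = 1_000        # documents mentioning the timeline person
w_susan = cooccurrence_weight(n_joint=200, n_target=n_rdj, n_x=400)
w_bieber = cooccurrence_weight(n_joint=50, n_target=n_rdj, n_x=100_000)

print(w_susan > w_bieber)  # True: popularity alone does not win
# PMI is log(n_total * w), so it ranks entities in the same order.
print(pmi(200, n_rdj, 400, n_total) > pmi(50, n_rdj, 100_000, n_total))  # True
```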
• How to differentiate between two events that cover the same relationships?
• Example: Robert and Susan Downey
  - Event 1: Wedding, August 27, 2005
  - Event 2: Minor charity event, Nov 11, 2006
• We need to be able to distinguish these!
• Further improvement when we score not only relationships but also the event timestamps:
  $F(S) = \sum_{r \in R} w_R(r) + \sum_{e \in S} w_T(t_e)$ where $R = \bigcup_{e \in S} X_e$
  - First term: relationships (as before)
  - Second term: timestamps
• Again, use co-occurrences for the weights $w_T$
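A sketch of the combined objective, using the two events from the Robert and Susan Downey example. The weight values are hypothetical placeholders (in the lecture they come from co-occurrence statistics), and the function and variable names are assumptions.

```python
from typing import Dict, Iterable, Set


def timeline_objective(
    selected: Iterable[str],
    covers: Dict[str, Set[str]],
    timestamp: Dict[str, str],
    w_rel: Dict[str, float],
    w_time: Dict[str, float],
) -> float:
    """F(S) = sum_{r in R} w_R(r) + sum_{e in S} w_T(t_e), with R the union of X_e."""
    events = set(selected)
    covered: Set[str] = set()
    for event in events:
        covered |= covers[event]
    relationship_term = sum(w_rel[r] for r in covered)
    timestamp_term = sum(w_time[timestamp[e]] for e in events)
    return relationship_term + timestamp_term


# Both events cover the same relationship; only the timestamps differ.
covers = {"wedding": {"Susan Downey"}, "charity": {"Susan Downey"}}
timestamp = {"wedding": "2005-08-27", "charity": "2006-11-11"}
w_rel = {"Susan Downey": 2.0}                      # hypothetical weight
w_time = {"2005-08-27": 5.0, "2006-11-11": 0.5}    # hypothetical weights

print(timeline_objective(["wedding"], covers, timestamp, w_rel, w_time))  # 7.0
print(timeline_objective(["charity"], covers, timestamp, w_rel, w_time))  # 2.5
```

The relationship term is identical for both events, so the timestamp term alone decides which one is preferred.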
• Example (marvel.com): “Robert Downey Jr” and “May 4, 2012” occur together 173 times on 71 different webpages
• May 4, 2012 is the US release date of The Avengers
• Use MapReduce on 10B web pages (10k+ machines)
• Generalized the earlier coverage function to a linear combination of weighted coverage functions:
  $F(S) = \sum_{r \in R} w_R(r) + \sum_{e \in S} w_T(t_e)$ where $R = \bigcup_{e \in S} X_e$
• Goal: $\max_{|S| \le k} F(S)$
• Still NP-hard (because it generalizes an NP-hard problem)
• How can we actually optimize this function?
• What structure is there that will help us do this efficiently?
• Any questions so far?
• For this optimization problem, Greedy produces a solution $S$ s.t. $F(S) \ge (1 - 1/e) \cdot \mathrm{OPT}$, i.e., $F(S) \ge 0.63 \cdot \mathrm{OPT}$ [Nemhauser, Fisher, Wolsey ’78]
• Claim holds for functions $F(\cdot)$ which are:
  - Submodular, Monotone, Normal, Non-negative (discussed next)
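The bound can be checked empirically on small random instances of plain (unweighted) maximum coverage, where the optimum is still cheap to find by brute force. This is a sanity check under assumed instance sizes (8 sets over 20 elements, k = 3), not a proof; all names below are made up for the sketch.

```python
import math
import random
from itertools import combinations


def greedy_value(family, k):
    """Coverage achieved by greedy when picking k sets from the family."""
    chosen, covered = [], set()
    for _ in range(k):
        candidates = [i for i in range(len(family)) if i not in chosen]
        best = max(candidates, key=lambda i: len(family[i] - covered))
        chosen.append(best)
        covered |= family[best]
    return len(covered)


def optimal_value(family, k):
    """Best coverage over all k-subsets (brute force, small instances only)."""
    return max(len(set().union(*combo)) for combo in combinations(family, k))


bound = 1 - 1 / math.e  # about 0.632
random.seed(0)
worst_ratio = 1.0
for _ in range(200):
    family = [set(random.sample(range(20), random.randint(1, 6))) for _ in range(8)]
    g, opt = greedy_value(family, 3), optimal_value(family, 3)
    assert g >= bound * opt  # guaranteed by the approximation result
    worst_ratio = min(worst_ratio, g / opt)

print(f"bound = {bound:.3f}, worst observed ratio = {worst_ratio:.3f}")
```

On instances this small greedy is usually much closer to the optimum than the worst-case bound suggests.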