CS224W: Machine Learning with Graphs Jure Leskovec, Stanford University http://cs224w.stanford.edu
• We are more influenced by our friends than by strangers
  – 68% of consumers consult friends and family before purchasing home electronics
  – 50% do research online before purchasing electronics
• Identify influential customers
• Convince them to endorse the product: offer discounts or free samples
• These customers adopt the product among their friends
“Kate Middleton effect: the trend effect that Kate, Duchess of Cambridge, has on others, from cosmetic surgery for brides to sales of coral-colored jeans.”
• According to Newsweek, "The Kate Effect may be worth £1 billion to the UK fashion industry."
• Tony DiMasso, L. K. Bennett’s US president, stated in 2012: "...when she does wear something, it always seems to go on a waiting list."
• Influential persons often have many friends
• Kate is one of the persons with many friends in this social network
• But finding more Kates is not as easy as you might think!
• Given a directed graph and k > 0,
• Find k seeds (Kates) that maximize the number of influenced people (possibly in many steps)
• Linear Threshold Model
• Independent Cascade Model
Linear Threshold Model:
• A node v has a random threshold θ_v ~ U[0,1]
• A node v is influenced by each neighbor w according to a weight b_{v,w} such that
  Σ_{w neighbor of v} b_{v,w} ≤ 1
• A node v becomes active when at least a θ_v (weighted) fraction of its neighbors are active:
  Σ_{w active neighbor of v} b_{v,w} ≥ θ_v
[Figure: step-by-step example of Linear Threshold dynamics. Legend: inactive node, active node, threshold θ, active neighbors; edge weights between 0.1 and 0.6. Activation spreads from the seeds until no inactive node’s weighted active-neighbor sum reaches its threshold ("Stop!").]
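To make the process concrete, here is a minimal sketch of the Linear Threshold dynamics in Python. The toy graph, weights b_{v,w}, and thresholds θ_v are made-up values for illustration, not the ones from the figure above.

```python
import random

def linear_threshold(in_neighbors, b, seeds, theta=None):
    """Run the Linear Threshold process until no new node activates.

    in_neighbors: dict node -> list of neighbors w that influence it
    b:            dict (v, w) -> weight b_vw, with sum_w b_vw <= 1 per node v
    seeds:        initially active nodes
    theta:        dict node -> threshold; drawn from U[0,1] if omitted
    """
    if theta is None:
        theta = {v: random.random() for v in in_neighbors}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in in_neighbors:
            if v in active:
                continue
            # total weight of v's currently active neighbors
            influence = sum(b[(v, w)] for w in in_neighbors[v] if w in active)
            if influence >= theta[v]:     # threshold reached: v activates
                active.add(v)
                changed = True
    return active

# Toy run: 'w' activates first (0.5 >= 0.4), which then pushes 'v'
# over its threshold (0.4 + 0.3 >= 0.6); all three nodes end up active.
nbrs = {'u': [], 'w': ['u'], 'v': ['u', 'w']}
b = {('w', 'u'): 0.5, ('v', 'u'): 0.4, ('v', 'w'): 0.3}
print(linear_threshold(nbrs, b, seeds={'u'}, theta={'u': 0.0, 'w': 0.4, 'v': 0.6}))
```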
• Independent Cascade Model
  – Directed finite graph G = (V, E)
  – A set S of nodes starts out with the new behavior
  – Say nodes with this behavior are “active”
  – Each edge (v, w) has a probability p_vw
  – If node v is active, it gets one chance to make w active, with probability p_vw
  – Each edge fires at most once
• Does scheduling matter? No
  – If u and v are both active at the same time, it doesn’t matter which tries to activate w first
  – But time moves in discrete steps
• Initially some set S of nodes is active
• Each edge (v, w) has a probability (weight) p_vw
[Figure: example directed graph on nodes a–i with edge probabilities between 0.2 and 0.4]
• When node v becomes active, it activates each out-neighbor w with probability p_vw
• Activations spread through the network
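A minimal sketch of one Independent Cascade run under the setup above; the toy graph and probabilities p_vw are illustrative, not the ones from the figure.

```python
import random

def independent_cascade(out_edges, p, seeds, rng=random):
    """Simulate one cascade; return the final active set.

    out_edges: dict node -> list of out-neighbors
    p:         dict (v, w) -> activation probability p_vw
    seeds:     initially active set S
    """
    active = set(seeds)
    frontier = list(seeds)            # newly active nodes get their one chance next step
    while frontier:                   # time moves in discrete steps
        nxt = []
        for v in frontier:
            for w in out_edges.get(v, []):
                if w not in active and rng.random() < p[(v, w)]:
                    active.add(w)     # edge (v, w) fired; it never fires again
                    nxt.append(w)
        frontier = nxt
    return active

edges = {'a': ['b', 'c'], 'b': ['c'], 'c': []}
probs = {('a', 'b'): 0.4, ('a', 'c'): 0.2, ('b', 'c'): 0.3}
print(independent_cascade(edges, probs, seeds={'a'}))
```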
Problem (k is a user-specified parameter):
• Most influential set of size k: the set S of k nodes producing the largest expected cascade size f(S) if activated [Domingos-Richardson ’01]
• Optimization problem: max_{S of size k} f(S)
[Figure: example graph with edge probabilities; influence set X_a of node a and influence set X_d of node d highlighted]
• Why “expected cascade size”? Because X_a is the result of a random process. In practice we would compute X_a over many random realizations i and then maximize the average value f(S) = (1/|I|) Σ_{i∈I} f_i(S). For now, let’s ignore this nuisance and simply assume that each node u influences a fixed set of nodes X_u
• S: the initial active set
• f(S): the expected size of the final active set
  – f(S) is the size of the union of the X_u: f(S) = |⋃_{u∈S} X_u|
[Figure: graph G with the influence sets X_u of nodes a, b, c, d]
• Set S is more influential if f(S) is larger: f({a, b}) < f({a, c}) < f({a, d})
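Because the cascade is random, f(S) is an expectation. A rough sketch of the Monte Carlo estimate described above, reusing the independent_cascade function (and its toy edges and probs) from the previous block:

```python
def estimate_influence(out_edges, p, S, n_runs=1000):
    """Estimate f(S): average final cascade size over n_runs random realizations."""
    total = sum(len(independent_cascade(out_edges, p, S)) for _ in range(n_runs))
    return total / n_runs

# A larger estimate means a more influential seed set.
print(estimate_influence(edges, probs, {'a'}))
print(estimate_influence(edges, probs, {'a', 'b'}))
```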
• Problem: most influential set of k nodes: the set S of k nodes producing the largest expected cascade size f(S) if activated
• The optimization problem: max_{S of size k} f(S)
• How hard is this problem?
  – NP-COMPLETE!
  – Show that finding the most influential set is at least as hard as the Set Cover problem
• Extremely bad news:
  – Influence maximization is NP-complete
• Next, good news:
  – There exists an approximation algorithm!
  – For some inputs the algorithm won’t find the globally optimal solution/set OPT
  – But we will also prove that the algorithm never does too badly: more precisely, it will find a set S such that f(S) ≥ 0.63 · f(OPT), where OPT is the globally optimal set
• Consider a Greedy Hill Climbing algorithm to find S:
  – Input: the influence set X_u of each node u: X_u = {v_1, v_2, …}
  – That is, if we activate u, the nodes {v_1, v_2, …} will eventually become active
  – Algorithm: at each iteration i, activate the node u that gives the largest marginal gain:
    max_u f(S_{i−1} ∪ {u})
    (S_{i−1} … the seed set after i−1 iterations, i.e., the initially active set; f(S_i) … the size of the union of the X_u, u ∈ S_i)
Algorithm:
• Start with S_0 = { }
• For i = 1 … k:
  – Activate the node u that maximizes f(S_{i−1} ∪ {u})
  – Let S_i = S_{i−1} ∪ {u}
Example (nodes a, b, c, d, e), as implemented in the sketch below:
  – Evaluate f({a}), …, f({e}); pick the argmax (say d)
  – Evaluate f({d, a}), …, f({d, e}); pick the argmax (say b)
  – Evaluate f({d, b, a}), …, f({d, b, e}); pick the argmax
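A sketch of the greedy loop under the simplifying assumption above: each node u deterministically influences a fixed set X_u, so f(S) is just the size of the union. The sets X_u here are toy data, not derived from any particular graph.

```python
def greedy_hill_climbing(X, k):
    """Greedily pick k nodes maximizing f(S) = |union of X_u for u in S|."""
    S, covered = [], set()
    for _ in range(k):
        # marginal gain of u given S is the newly covered area |X_u \ covered|
        u = max((v for v in X if v not in S), key=lambda v: len(X[v] - covered))
        S.append(u)
        covered |= X[u]
    return S, len(covered)

X = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5, 6, 7}, 'd': {1, 7}}
print(greedy_hill_climbing(X, k=2))   # (['c', 'a'], 7): 'c' covers 4, then 'a' adds 3 more
```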
• Claim: Hill climbing produces a solution S where f(S) ≥ (1 − 1/e) · f(OPT), i.e., f(S) > 0.63 · f(OPT)
  [Nemhauser-Fisher-Wolsey ’78; Kempe-Kleinberg-Tardos ’03]
• The claim holds for functions f(·) with two properties:
  – f is monotone (activating more nodes doesn’t hurt): if S ⊆ T then f(S) ≤ f(T), and f({}) = 0
  – f is submodular (activating each additional node helps less): adding an element to a set gives less improvement than adding it to one of its subsets:
    ∀ S ⊆ T: f(S ∪ {u}) − f(S) ≥ f(T ∪ {u}) − f(T)
    (gain of adding a node to a small set ≥ gain of adding a node to a large set)
• Diminishing returns:
[Figure: f(·) plotted against set size; the curve flattens, so the gain f(S ∪ {u}) − f(S) at the smaller set S exceeds the gain f(T ∪ {u}) − f(T) at the larger set T: adding u to T helps less than adding it to S!]
  ∀ S ⊆ T: f(S ∪ {u}) − f(S) ≥ f(T ∪ {u}) − f(T)
  (gain of adding a node to a small set ≥ gain of adding a node to a large set)
Also see the handout posted on the course website.
• We must show that our f(·) is submodular:
  ∀ S ⊆ T: f(S ∪ {u}) − f(S) ≥ f(T ∪ {u}) − f(T)
  (gain of adding a node to a small set ≥ gain of adding a node to a large set)
• Basic fact 1:
  – If f_1(x), …, f_k(x) are submodular and c_1, …, c_k ≥ 0, then F(x) = Σ_i c_i · f_i(x) is also submodular
  – (A non-negative combination of submodular functions is a submodular function)
• ∀ S ⊆ T: f(S ∪ {u}) − f(S) ≥ f(T ∪ {u}) − f(T)
  (gain of adding u to a small set ≥ gain of adding u to a large set)
• Basic fact 2: a simple submodular function
  – Sets X_1, …, X_m
  – f(S) = |⋃_{k∈S} X_k| (the size of the union of the sets X_k, k ∈ S)
  – Claim: f(S) is submodular!
[Figure: Venn diagram with S ⊆ T: the more sets you already have, the less new area a given set u will cover]
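A small numeric check (not a proof) of Basic fact 2: for toy sets X_k, the coverage function f(S) = |⋃_{k∈S} X_k| satisfies the submodularity inequality for every S ⊆ T and every u outside T.

```python
from itertools import chain, combinations

X = {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c', 'd', 'e'}, 4: {'a', 'e'}}

def f(S):
    """Coverage function: size of the union of the X_k for k in S."""
    return len(set().union(*(X[k] for k in S)))

nodes = set(X)
subsets = list(chain.from_iterable(combinations(nodes, r) for r in range(len(nodes) + 1)))
print(all(
    f(set(S) | {u}) - f(set(S)) >= f(set(T) | {u}) - f(set(T))
    for S in subsets for T in subsets if set(S) <= set(T)
    for u in nodes - set(T)
))   # True: the gain of u only shrinks as the set grows
```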