GP regression on random graphs: Covariance functions and Bayes errors


  1. GP regression on random graphs: Covariance functions and Bayes errors
     Peter Sollich (1) and Camille Coti (1, 2)
     (1) King’s College London; (2) Laboratoire de Recherche en Informatique, Université Paris-Sud

  2. Outline
     1. Motivation
     2. Covariance functions on graphs
        - Definition from graph Laplacian
        - Analysis on regular graphs: tree approximation
        - Effect of loops
     3. Bayes errors and learning curves
        - Approximations
        - Effect of loops
        - Effect of kernel parameters
     4. Summary and outlook

  3. Motivation
     - GP regression over continuous spaces is relatively well understood [e.g. Opper & Malzahn].
     - Discrete spaces occur in many applications: sequences, strings, etc.
     - What can we say about GP learning on these?
     - Focus on random graphs with finite connectivity as a paradigmatic case.

  5. Graph Laplacian
     - Covariance functions on graphs are easiest to define from the graph Laplacian [Smola & Kondor 2003].
     - Adjacency matrix: $A_{ij} = 0$ or $1$ depending on whether nodes $i$ and $j$ are connected. For a graph with $V$ nodes, $A$ is a $V \times V$ matrix.
     - Consider undirected links ($A_{ij} = A_{ji}$) and no self-loops ($A_{ii} = 0$).
     - Degree of node $i$: $d_i = \sum_{j=1}^{V} A_{ij}$.
     - Set $D = \mathrm{diag}(d_1, \ldots, d_V)$; the graph Laplacian is then defined as $L = \mathbf{1} - D^{-1/2} A D^{-1/2}$.
     - Spectral graph theory: $L$ has eigenvalues in the range $[0, 2]$.
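
As a minimal sketch of the definition $L = \mathbf{1} - D^{-1/2} A D^{-1/2}$, the following builds the Laplacian from an adjacency matrix in plain Python. The function name `normalized_laplacian` and the three-node path graph are illustrative assumptions, not part of the slides.

```python
import math

def normalized_laplacian(adj):
    """Return L = 1 - D^{-1/2} A D^{-1/2} for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]              # d_i = sum_j A_ij
    if any(d == 0 for d in deg):
        raise ValueError("isolated node: D^{-1/2} is undefined")
    return [[(1.0 if i == j else 0.0) - adj[i][j] / math.sqrt(deg[i] * deg[j])
             for j in range(n)] for i in range(n)]

# Hypothetical example graph: the path 0 - 1 - 2 (undirected, no self-loops)
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L = normalized_laplacian(A)

# The vector D^{1/2} (1, ..., 1) is annihilated by L, i.e. it has eigenvalue 0
v = [math.sqrt(sum(row)) for row in A]
Lv = [sum(L[i][j] * v[j] for j in range(3)) for i in range(3)]
```

Here `Lv` vanishes up to rounding, which matches the lower end of the eigenvalue range quoted on the slide.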

  6. Graph covariance functions: Definition
     - From the graph Laplacian, one can define covariance “functions” (really $V \times V$ matrices).
     - Random walk kernel, $a \geq 2$: $C \propto (a\mathbf{1} - L)^p \propto \left[(a-1)\mathbf{1} + D^{-1/2} A D^{-1/2}\right]^p$
     - Diffusion kernel: $C \propto \exp\left(-\tfrac{\sigma^2}{2} L\right) \propto \exp\left(\tfrac{\sigma^2}{2} D^{-1/2} A D^{-1/2}\right)$
     - Useful to normalize so that $(1/V) \sum_i C_{ii} = 1$.
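
The random walk kernel can be sketched in plain Python as follows. This is an illustration under stated assumptions: the helper names `matmul` and `random_walk_kernel` and the triangle graph are made up for the example, and the step matrix is written as $(1 - 1/a)\mathbf{1} + (1/a) D^{-1/2} A D^{-1/2}$, which differs from $(a-1)\mathbf{1} + D^{-1/2} A D^{-1/2}$ only by the constant factor $1/a$ that the normalization removes.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_walk_kernel(adj, a, p):
    """C proportional to [(1 - 1/a) 1 + (1/a) D^{-1/2} A D^{-1/2}]^p,
    normalised so that (1/V) sum_i C_ii = 1."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    if any(d == 0 for d in deg):
        raise ValueError("isolated node: D^{-1/2} is undefined")
    step = [[(1 - 1 / a) * (i == j) + adj[i][j] / (a * math.sqrt(deg[i] * deg[j]))
             for j in range(n)] for i in range(n)]
    C = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):                       # p-th matrix power by repeated multiplication
        C = matmul(step, C)
    scale = sum(C[i][i] for i in range(n)) / n
    return [[c / scale for c in row] for row in C]

# Hypothetical example: a triangle (every node has degree 2)
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
C = random_walk_kernel(A, a=2, p=3)
```

Repeated multiplication costs $O(pV^3)$, which is fine for small illustrations; for large graphs an eigendecomposition of the symmetric step matrix would be the more usual route.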

  7. Graph covariance functions: Interpretation
     - A random walk on the graph has transition probability matrix $A_{ij} d_j^{-1}$ for the transition $j \to i$.
     - After $s$ steps, get $(A D^{-1})^s = D^{1/2} (D^{-1/2} A D^{-1/2})^s D^{-1/2}$.
     - Compare this with $C \propto \sum_{s=0}^{p} \binom{p}{s} (1/a)^s (1 - 1/a)^{p-s} (D^{-1/2} A D^{-1/2})^s$.
     - So $D^{1/2} C D^{-1/2}$ is a random walk transition matrix, averaged over the distribution of the number of steps: $s \sim \mathrm{Binomial}(p, 1/a)$ for the random walk kernel, or $s \sim \mathrm{Poisson}(\sigma^2/2)$ for the diffusion kernel.
     - The diffusion kernel is the limit $p, a \to \infty$ at constant $p/a = \sigma^2/2$.
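
For reference, the binomial form and the diffusion limit can be spelled out as below. This is a sketch of the standard steps; the shorthand $M = D^{-1/2} A D^{-1/2}$ is introduced here for brevity and is not notation from the slides.

```latex
% Shorthand (assumption of this sketch): M = D^{-1/2} A D^{-1/2}, so that L = 1 - M.
\begin{align*}
(a\mathbf{1} - L)^p
  &= \bigl[(a-1)\mathbf{1} + M\bigr]^p
   = a^p \Bigl[\bigl(1 - \tfrac{1}{a}\bigr)\mathbf{1} + \tfrac{1}{a} M\Bigr]^p \\
  &= a^p \sum_{s=0}^{p} \binom{p}{s}
     \Bigl(\tfrac{1}{a}\Bigr)^{s} \Bigl(1 - \tfrac{1}{a}\Bigr)^{p-s} M^{s}, \\
\binom{p}{s}\Bigl(\tfrac{1}{a}\Bigr)^{s}\Bigl(1 - \tfrac{1}{a}\Bigr)^{p-s}
  &\;\longrightarrow\; e^{-\sigma^2/2}\,\frac{(\sigma^2/2)^s}{s!}
  \qquad (p, a \to \infty,\ p/a = \sigma^2/2), \\
\sum_{s=0}^{\infty} e^{-\sigma^2/2}\,\frac{(\sigma^2/2)^s}{s!}\, M^{s}
  &= \exp\!\Bigl[-\tfrac{\sigma^2}{2}(\mathbf{1} - M)\Bigr]
   = \exp\!\Bigl(-\tfrac{\sigma^2}{2} L\Bigr).
\end{align*}
```

The first two lines give the Binomial$(p, 1/a)$ weighting of $s$-step walks; the last two show that the Poisson limit of those weights reproduces the diffusion kernel.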

  9. Random regular graphs
     - Regular graphs: every node has the same degree $d$.
     - Random graph ensemble: all graphs with given $V$ and $d$ are assigned the same probability.
     - Typical loops are then long ($\propto \ln V$) if $V$ is large, so locally these graphs are tree-like.
     - How do graph covariance functions then behave?
     - Expect that after many random walk steps ($p \to \infty$), the kernel becomes uniform: $C_{ij} = 1$, all nodes fully correlated.
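
One way to draw from such an ensemble is the pairing (configuration) model with rejection, sketched below. The slides do not say how the graphs were generated, so this choice, the function name `random_regular_graph`, and the parameter `max_tries` are assumptions made for illustration.

```python
import random

def random_regular_graph(V, d, rng=None, max_tries=10000):
    """Sample a simple d-regular graph on V nodes by the pairing model with rejection."""
    if (V * d) % 2 or d >= V:
        raise ValueError("need V*d even and d < V")
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Each node contributes d half-edges ("stubs"); pair them up at random
        stubs = [node for node in range(V) for _ in range(d)]
        rng.shuffle(stubs)
        edges = set()
        simple = True
        for i, j in zip(stubs[0::2], stubs[1::2]):
            edge = (min(i, j), max(i, j))
            if i == j or edge in edges:      # self-loop or repeated link: reject the pairing
                simple = False
                break
            edges.add(edge)
        if simple:
            A = [[0] * V for _ in range(V)]
            for i, j in edges:
                A[i][j] = A[j][i] = 1
            return A
    raise RuntimeError("no simple graph found; increase max_tries")

A = random_regular_graph(V=20, d=3, rng=random.Random(0))
```

Discarding a whole pairing rather than patching it up is what keeps every simple $d$-regular graph equally likely, since each such graph corresponds to the same number of pairings. The price is a rejection rate that grows quickly with $d$.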

  10. Covariance functions on regular trees
     - On regular trees, all nodes are equivalent (except for boundary effects).
     - So the kernel $C_{ij}$ is a function only of the distance $\ell$ measured along the graph (number of links between $i$ and $j$).
     - Can calculate recursively over $p$, starting from $C_{\ell,p=0} = \delta_{\ell,0}$:
       $C_{0,p+1} = \left(1 - \frac{1}{a}\right) C_{0,p} + \frac{d}{ad}\, C_{1,p}$
       $C_{\ell,p+1} = \frac{1}{ad}\, C_{\ell-1,p} + \left(1 - \frac{1}{a}\right) C_{\ell,p} + \frac{d-1}{ad}\, C_{\ell+1,p}$
     - Normalize afterwards for each $p$ so that $C_{0,p} = 1$.
     - Let’s see what happens for $d = 3$, $a = 2$ and increasing $p$.
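
A minimal sketch of this recursion in plain Python, assuming it has the form $C_{0,p+1} = (1 - 1/a)\, C_{0,p} + (d/ad)\, C_{1,p}$ and $C_{\ell,p+1} = (1/ad)\, C_{\ell-1,p} + (1 - 1/a)\, C_{\ell,p} + ((d-1)/ad)\, C_{\ell+1,p}$ for $\ell \geq 1$. The function name `tree_kernel` is illustrative.

```python
def tree_kernel(d, a, p):
    """Iterate C_{l,p} on a d-regular tree from C_{l,0} = delta_{l,0}.

    Returns [C_0, C_1, ...] normalised so that C_0 = 1.
    """
    # After p steps the kernel is nonzero only for l <= p, so p + 2 entries
    # are enough for the truncation at the far end to be exact.
    C = [1.0] + [0.0] * (p + 1)
    for _ in range(p):
        new = [0.0] * len(C)
        new[0] = (1 - 1 / a) * C[0] + (d / (a * d)) * C[1]
        for l in range(1, len(C) - 1):
            new[l] = (C[l - 1] / (a * d)
                      + (1 - 1 / a) * C[l]
                      + (d - 1) / (a * d) * C[l + 1])
        C = new
    return [c / C[0] for c in C]

# The parameter values used on the slide: d = 3, a = 2
for p in (1, 2, 5, 10):
    print(p, [round(c, 4) for c in tree_kernel(3, 2, p)[:4]])
```

For $p = 1$ the nearest-neighbour value is $(1/ad)/(1 - 1/a) = 1/3$ at $d = 3$, $a = 2$, which gives a quick sanity check on the implementation.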

  11. Effect of increasing p
     [Figure: kernel $K_\ell$ versus distance $\ell$ for $a = 2$, $d = 3$, with curves for $p$ = 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500 and $p \to \infty$.]
     - The kernel does not become uniform even for $p \to \infty$.

  12. What is going on? Mapping to biased random walk
     - Gather all the (equal) random walk probabilities over the shell of nodes at distance $\ell$: $S_{0,p} = C_{0,p}$, $S_{\ell,p} = d(d-1)^{\ell-1} C_{\ell,p}$ for $\ell \geq 1$.
     - Then the recursion $S_{\ell,p} \to S_{\ell,p+1}$ represents a biased random walk in one dimension (sites $\ell = 0, 1, 2, 3, \ldots$), with a reflecting barrier at the origin:
       - stay at $\ell$: probability $1 - \frac{1}{a}$
       - $0 \to 1$: probability $\frac{1}{a}$
       - $\ell \to \ell + 1$ for $\ell \geq 1$: probability $\frac{d-1}{ad}$
       - $\ell \to \ell - 1$ for $\ell \geq 1$: probability $\frac{1}{ad}$
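
The mapping can be checked by substituting the shell sums into the tree recursion. The lines below are a sketch of that step, assuming the recursion $C_{\ell,p+1} = \frac{1}{ad} C_{\ell-1,p} + (1 - \frac{1}{a}) C_{\ell,p} + \frac{d-1}{ad} C_{\ell+1,p}$ for $\ell \geq 1$ and $C_{0,p+1} = (1 - \frac{1}{a}) C_{0,p} + \frac{1}{a} C_{1,p}$.

```latex
% Multiply the recursion for C_{\ell,p+1} by the shell size d(d-1)^{\ell-1}
% and use S_{0,p} = C_{0,p}, S_{\ell,p} = d(d-1)^{\ell-1} C_{\ell,p}.
\begin{align*}
S_{0,p+1} &= \Bigl(1 - \tfrac{1}{a}\Bigr) S_{0,p} + \tfrac{1}{ad}\, S_{1,p}, \\
S_{1,p+1} &= \tfrac{1}{a}\, S_{0,p} + \Bigl(1 - \tfrac{1}{a}\Bigr) S_{1,p}
             + \tfrac{1}{ad}\, S_{2,p}, \\
S_{\ell,p+1} &= \tfrac{d-1}{ad}\, S_{\ell-1,p} + \Bigl(1 - \tfrac{1}{a}\Bigr) S_{\ell,p}
             + \tfrac{1}{ad}\, S_{\ell+1,p} \qquad (\ell \geq 2).
\end{align*}
% Outgoing probabilities sum to one at every site:
%   origin:      (1 - 1/a) + 1/a = 1
%   \ell \geq 1: (1 - 1/a) + (d-1)/(ad) + 1/(ad) = 1
```

Reading off the coefficients column by column gives the hopping probabilities of the one-dimensional walk, and the fact that they sum to one at each site confirms that total probability $\sum_\ell S_{\ell,p}$ is conserved.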

  13. Random walk propagation
     [Figure: $\ln S_{\ell,p}$ versus $\ell$ for $d = 3$, $a = 2$, with curves for $p$ = 500, 1000, 2000, 5000.]
     - $\ell \to \ell + 1$ with probability $(d-1)/(ad)$ and $\ell \to \ell - 1$ with probability $1/(ad)$, so $S_{\ell,p}$ has its peak at $\ell = (p/a)[(d-2)/d]$.
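
The peak position follows from the net drift per step, $(d-1)/(ad) - 1/(ad) = (d-2)/(ad)$, multiplied by $p$. A small sketch that propagates the one-dimensional walk and compares the location of the maximum with this estimate is given below; the function name `shell_walk` and the choice $p = 600$ are illustrative assumptions.

```python
def shell_walk(d, a, p):
    """Propagate S_{l,p}: a lazy biased walk on l = 0, 1, 2, ... with a
    reflecting barrier at l = 0, started from S_{l,0} = delta_{l,0}."""
    S = [1.0] + [0.0] * (p + 1)              # after p steps, S is nonzero only for l <= p
    stay, fwd, back = 1 - 1 / a, (d - 1) / (a * d), 1 / (a * d)
    for _ in range(p):
        new = [0.0] * len(S)
        for l, s in enumerate(S):
            if s == 0.0:
                continue
            new[l] += stay * s
            if l == 0:
                new[1] += s / a              # all d neighbours of the origin are at distance 1
            else:
                new[l + 1] += fwd * s
                new[l - 1] += back * s
        S = new
    return S

d, a, p = 3, 2, 600
S = shell_walk(d, a, p)
peak = max(range(len(S)), key=S.__getitem__)   # position of the maximum of S
predicted = (p / a) * (d - 2) / d
print(peak, predicted)
```

The measured peak sits slightly above the estimate because the walk leaves the origin with the larger probability $1/a$; this offset stays of order one while the peak position grows linearly in $p$.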

  14. Converting back to $C_{\ell,p} \propto S_{\ell,p} / (d-1)^{\ell-1}$
     [Figure: two panels, labelled $S_\ell$ and $K_\ell$, plotted against $\ell$ (0 to 400), with curves for $p$ = 100, 500, 2000, 5000; the $K_\ell$ panel has an inset covering $\ell$ = 0 to 10.]
     - The covariance function is determined by the tail of $S_{\ell,p}$ near the origin.
     - This can be used to calculate $C_{\ell,p\to\infty} = [1 + \ell(d-2)/d]\,(d-1)^{-\ell/2}$.
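
A quick consistency check on the limiting shape, taken here as $[1 + \ell(d-2)/d]\,(d-1)^{-\ell/2}$: a profile that survives as $p \to \infty$ (after normalization) should keep its shape under one more step of the tree recursion, so every entry should be rescaled by the same factor. The sketch below tests this numerically; the names `limit_shape` and `one_step` are illustrative, and the recursion is assumed in the form given in the comments.

```python
import math

def limit_shape(l, d):
    """Candidate p -> infinity kernel on a d-regular tree, with C_0 = 1."""
    return (1 + l * (d - 2) / d) * (d - 1) ** (-l / 2)

def one_step(C, d, a):
    """Apply one step of the tree recursion to C_0..C_n; returns C_0..C_{n-1}.

    Assumed recursion:
      C_0'  = (1 - 1/a) C_0 + (1/a) C_1
      C_l'  = C_{l-1}/(a d) + (1 - 1/a) C_l + (d - 1)/(a d) C_{l+1},  l >= 1
    """
    new = [(1 - 1 / a) * C[0] + C[1] / a]
    for l in range(1, len(C) - 1):
        new.append(C[l - 1] / (a * d)
                   + (1 - 1 / a) * C[l]
                   + (d - 1) / (a * d) * C[l + 1])
    return new

d, a = 3, 2
C = [limit_shape(l, d) for l in range(12)]
stepped = one_step(C, d, a)
ratios = [s / c for s, c in zip(stepped, C)]      # should all coincide
expected = (1 - 1 / a) + 2 * math.sqrt(d - 1) / (a * d)
```

All entries of `ratios` agree with `expected` up to rounding. The interior equations ($\ell \geq 1$) are satisfied for any linear prefactor $1 + c\,\ell$; it is the $\ell = 0$ equation that fixes $c = (d-2)/d$.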

  16. Effect of loops
     - Eventually, the approximation of ignoring loops must fail.
     - Estimate when this happens: a tree of depth $\ell$ has $V \approx 1 + d(d-1)^{\ell-1}$ nodes.
     - So a regular graph can be tree-like at most out to $\ell \approx \ln(V) / \ln(d-1)$.
     - A random walk on the graph typically takes $p/a$ steps, so expect loop effects to appear in the covariance function around $\frac{p}{a} \approx \frac{\ln(V)}{\ln(d-1)}$.
     - Check by measuring the average of $K_1 = C_{ij} / \sqrt{C_{ii} C_{jj}}$ ($i$, $j$ nearest neighbours) on randomly generated graphs.
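
To put a number on this estimate, a short sketch that evaluates $\ln(V)/\ln(d-1)$ is given below. The function name `loop_onset` is an assumption of this example, and the values $V = 500$, $d = 3$ are chosen only as an illustration.

```python
import math

def loop_onset(V, d):
    """Estimated p/a at which loop effects set in: ln(V) / ln(d - 1)."""
    if d < 3:
        raise ValueError("estimate needs d >= 3 so that ln(d - 1) > 0")
    return math.log(V) / math.log(d - 1)

print(round(loop_onset(500, 3), 2))   # 8.97
```

Because the dependence on $V$ is only logarithmic, even much larger graphs are tree-like over just a few more steps: $V = 5 \times 10^5$ at $d = 3$ moves the estimate from about 9 to about 19.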

  17. Covariance function for neighbouring nodes
     [Figure: $K_1$ versus $p/a$ (logarithmic axis, 1 to 1000) for $d = 3$, with curves for $a = 2$, $V = \infty$; $a = 2$, $V = 500$; $a = 4$, $V = \infty$; $a = 4$, $V = 500$; the value $\ln V / \ln(d-1)$ is marked.]
     - For finite $V$, $K_1$ starts to get larger than in the tree approximation ($V \to \infty$).
     - Results depend only on $p/a$ for large $p$, as expected.
