fNML Criterion for Learning Bayesian Network Structures

Tomi Silander, Teemu Roos, Petri Kontkanen, Petri Myllymäki. Helsinki Institute for Information Technology HIIT, Finland. PGM-08, Hirtshals, September 17-19, 2008.


  1. fNML Criterion for Learning Bayesian Network Structures. Tomi Silander, Teemu Roos, Petri Kontkanen, Petri Myllymäki. PGM-08, Hirtshals, September 17-19, 2008. Helsinki Institute for Information Technology HIIT, Finland.

  2. Outline: 1. Bayesian Networks; 2. Model Selection Scores; 3. New Stuff: fNML Score.

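  3. Bayesian Networks. Conditional independence assumptions. Factorization of a joint probability distribution: $P(X_1, \dots, X_m) = \prod_{i=1}^{m} P(X_i \mid X_{G_i})$, where $X_{G_i}$ denotes the parents of $X_i$ in the network $G$.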
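
To make the factorization concrete, here is a minimal Python sketch that multiplies one conditional probability per variable. The chain GENDER -> PROFESSION -> CHILDREN, the reduced value sets, and every probability in the tables are hypothetical illustration values, not taken from the slides.

```python
from itertools import product

# Hypothetical structure: each variable maps to the tuple of its parents.
parents = {
    "GENDER": (),
    "PROFESSION": ("GENDER",),
    "CHILDREN": ("PROFESSION",),
}

# cpt[variable][parent_values][value] = P(value | parent_values).
# All numbers are made up for illustration.
cpt = {
    "GENDER": {(): {"male": 0.5, "female": 0.5}},
    "PROFESSION": {
        ("male",): {"researcher": 0.6, "reporter": 0.4},
        ("female",): {"researcher": 0.7, "reporter": 0.3},
    },
    "CHILDREN": {
        ("researcher",): {0: 0.5, 2: 0.5},
        ("reporter",): {0: 0.8, 2: 0.2},
    },
}


def joint_probability(assignment):
    """Product over variables of P(value | values of its parents)."""
    p = 1.0
    for var, pa in parents.items():
        pa_values = tuple(assignment[q] for q in pa)
        p *= cpt[var][pa_values][assignment[var]]
    return p


def total_probability():
    """Sum of the joint over all assignments; should be 1 for valid tables."""
    values = {
        "GENDER": ["male", "female"],
        "PROFESSION": ["researcher", "reporter"],
        "CHILDREN": [0, 2],
    }
    names = list(values)
    return sum(
        joint_probability(dict(zip(names, combo)))
        for combo in product(*(values[v] for v in names))
    )
```

For example, `joint_probability({"GENDER": "male", "PROFESSION": "researcher", "CHILDREN": 2})` multiplies 0.5, 0.6, and 0.5.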

  4. Data

     | NAME | GENDER | PROFESSION | CHILDREN |
     |------|--------|------------|----------|
     | Teemu | male | researcher | 2 |
     | Clark | male | reporter | 0 |
     | Margrethe | female | queen | 2 |
     | : | : | : | : |

  6. Data: the same table, with the data set as a whole labeled $D$.

  7. Data: the same table, labeled $D$, with $D_i$ marked.

  8. Bayes (BDe); BIC & AIC; MDL.

  9. Bayesian Score. The state-of-the-art model selection criterion: the Bayesian Dirichlet equivalent (BDe) score. It assumes a Dirichlet prior on the model parameters θ, evaluates the marginal likelihood of the data given the model, and depends on the hyper-parameter α.
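
As a rough illustration of how such a score is evaluated and how it depends on α, the following Python sketch computes the log marginal likelihood of one column given its parents. It assumes the BDeu variant, in which the equivalent sample size `alpha` is spread uniformly over the Dirichlet hyper-parameters; the slides only say "BDe", so this choice and the function name are assumptions.

```python
from collections import Counter
from math import lgamma


def bdeu_local_score(child, parent_rows, r, q, alpha):
    """Log marginal likelihood of one column given its parents (BDeu prior).

    child:       list of child values, one per data row
    parent_rows: list of parent-value tuples, one per data row
                 (use () for every row when there are no parents)
    r:           number of states of the child variable
    q:           number of parent configurations (1 when there are no parents)
    alpha:       equivalent sample size (the hyper-parameter)
    """
    a_jk = alpha / (q * r)  # Dirichlet parameter per (parent config, state)
    a_j = alpha / q         # sum of a_jk over the child's states
    n_jk = Counter(zip(parent_rows, child))
    n_j = Counter(parent_rows)
    score = 0.0
    # Unobserved configurations and states contribute exactly zero.
    for count in n_j.values():
        score += lgamma(a_j) - lgamma(a_j + count)
    for count in n_jk.values():
        score += lgamma(a_jk + count) - lgamma(a_jk)
    return score
```

Calling the function on the same column with, say, `alpha=0.1`, `alpha=1`, and `alpha=10` gives different scores, which is the sensitivity the slide points at.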

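  10. BIC & AIC. BIC: asymptotic approximation of the marginal likelihood: $\mathrm{BIC}(G, D) = \log P(D \mid \hat\theta, G) - \frac{k}{2} \log n$. AIC: asymptotic approximation of the estimated prediction error: $\mathrm{AIC}(G, D) = \log P(D \mid \hat\theta, G) - k$. Here $\hat\theta$ is the maximum likelihood parameter estimate, $k$ the number of free parameters, and $n$ the sample size.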
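
A small Python sketch of both criteria for a single discrete column, assuming their usual penalized log-likelihood forms (maximized log-likelihood minus (k/2) log n for BIC, minus k for AIC). The function names are hypothetical, and a single multinomial with `num_states - 1` free parameters stands in for a full network.

```python
from collections import Counter
from math import log


def max_log_likelihood(column):
    """Maximized multinomial log-likelihood: sum of count * log(count / n)."""
    n = len(column)
    return sum(c * log(c / n) for c in Counter(column).values())


def bic(column, num_states):
    """Log-likelihood minus (k / 2) * log n, with k = num_states - 1."""
    k = num_states - 1
    return max_log_likelihood(column) - 0.5 * k * log(len(column))


def aic(column, num_states):
    """Log-likelihood minus k, with k = num_states - 1."""
    k = num_states - 1
    return max_log_likelihood(column) - k
```

Both scores are on the "higher is better" scale here; the only difference is how fast the penalty grows with the sample size.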

  11. MDL. Minimum Description Length (MDL) principle: choose the model that yields the shortest description of the data together with the model. Too simple a model: data long, model short. "Just right": data short, model short. Too complex a model: data short, model long.

  12. Flavours of MDL. 1. "Pedestrian": asymptotic two-part code-length; same as BIC.

  13. Flavours of MDL. 2. "Sophisticated": Bayesian marginal likelihood.

  14. Flavours of MDL. 3. "Champions League": modern (minimax regret optimal) code, normalized maximum likelihood (NML). Problem: NML is computationally very hard.

  15. Bayes vs. MDL (minimax regret). The Bayesian decision principle is minimization of expected loss: $\min_A E_X[\mathrm{loss}(A, X)]$. MDL (especially NML) is based on minimization of worst-case regret: $\min_A \max_X [\mathrm{loss}(A, X) - \min_{A'} \mathrm{loss}(A', X)]$, where the bracketed difference is the "regret".
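
The two principles can pick different actions. The Python sketch below shows this on a toy decision problem; the three actions, the two outcomes, the loss values, and the outcome probabilities are all invented for illustration.

```python
# losses[action][x] = loss(action, outcome x); hypothetical values.
losses = {
    "A1": [1.0, 10.0],
    "A2": [4.0, 4.0],
    "A3": [9.0, 0.0],
}
# Hypothetical outcome probabilities, needed only by the Bayesian rule.
probs = [0.9, 0.1]


def bayes_action(losses, probs):
    """Action minimizing the expected loss under the given probabilities."""
    return min(
        losses,
        key=lambda a: sum(p * l for p, l in zip(probs, losses[a])),
    )


def minimax_regret_action(losses):
    """Action minimizing the worst-case regret over outcomes."""
    n_outcomes = len(next(iter(losses.values())))
    # Best achievable loss for each outcome, chosen in hindsight.
    best = [min(l[x] for l in losses.values()) for x in range(n_outcomes)]
    return min(
        losses,
        key=lambda a: max(losses[a][x] - best[x] for x in range(n_outcomes)),
    )
```

With these numbers the expected losses are 1.9, 4.0, and 8.1, so the Bayesian rule picks `A1`, while the worst-case regrets are 10, 4, and 8, so the minimax regret rule picks `A2`. The second rule needs no probabilities at all.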

  16. fNML = "factorized NML"; computation; consistency.

  17. fNML Score. We propose a new MDL score, factorized NML, which is 1. easy to compute, 2. decomposable (allowing fast search), 3. robust (experimentally).

  18. fNML vs. NML: what's new? (Shown on the data table from slide 4.)

  19. fNML vs. NML: what's new? NML: the minimax code is applied to the whole data $D$ as one block.

  20. fNML vs. NML: what's new? fNML: the minimax code is applied column by column (illustrated on $D_2$).

  21. fNML vs. NML: what's new? fNML: a conditional minimax code is used when parent(s) exist (illustrated on $D_1$).

  22. fNML vs. NML: what's new? fNML: a conditional minimax code is used when parent(s) exist (illustrated on $D_3$).

  23. fNML vs. NML: what's new? fNML: a conditional minimax code is used when parent(s) exist (illustrated on $D_4$).

  24. fNML vs. NML: what's new? Each column is encoded using the minimax code for multinomials. Using fast NML algorithms, this takes $O(n \log n)$ per column.
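
The column-wise computation can be sketched in Python as follows. This is a sketch under assumptions, not the authors' implementation: the local score is taken to be the maximized conditional log-likelihood minus one multinomial NML log-normalizer per observed parent configuration, and the normalizer is computed with the simple recurrence $C(K+2, n) = C(K+1, n) + \frac{n}{K} C(K, n)$ rather than the $O(n \log n)$ algorithm mentioned on the slide. All function names are hypothetical.

```python
from collections import Counter
from math import exp, lgamma, log


def _xlogx_ratio(h, n):
    """h * log(h / n), with the convention 0 * log 0 = 0."""
    return h * log(h / n) if h > 0 else 0.0


def multinomial_nml_normalizer(num_states, n):
    """Normalizing sum C(K, n) of the NML code for a K-state multinomial."""
    if n == 0 or num_states == 1:
        return 1.0
    c_prev = 1.0  # C(1, n)
    # C(2, n): direct sum over the count h of the first state, in log space.
    c_curr = sum(
        exp(
            lgamma(n + 1) - lgamma(h + 1) - lgamma(n - h + 1)
            + _xlogx_ratio(h, n) + _xlogx_ratio(n - h, n)
        )
        for h in range(n + 1)
    )
    # Recurrence: C(K + 2, n) = C(K + 1, n) + (n / K) * C(K, n).
    for k in range(1, num_states - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


def fnml_local_score(child, parent_rows, num_states):
    """Conditional log-likelihood minus NML penalties, one per parent config.

    parent_rows holds one parent-value tuple per data row
    (use () for every row when the column has no parents).
    """
    n_jk = Counter(zip(parent_rows, child))
    n_j = Counter(parent_rows)
    score = sum(c * log(c / n_j[j]) for (j, _), c in n_jk.items())
    score -= sum(
        log(multinomial_nml_normalizer(num_states, c)) for c in n_j.values()
    )
    return score
```

For a column without parents the penalty is a single term `log C(num_states, n)`; with parents, the rows are split by parent configuration and each group pays its own penalty.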

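  25. fNML: Consistency. (Haughton, 1988): any penalized likelihood score of the form $\log P(D \mid \hat\theta, G) - \frac{k}{2} a_n$, where $\hat\theta$ is the maximum likelihood estimate, $k$ is the number of free parameters, and $a_n$ satisfies $a_n \to \infty$ and $a_n / n \to 0$, is consistent. Theorem: fNML behaves asymptotically like BIC, i.e., $a_n = \log n$. Hence, fNML is consistent.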
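
The BIC-like behaviour can be checked numerically in the simplest case. The Python sketch below assumes a single binary column (one free parameter) and compares the NML penalty $\log C(2, n)$ with $\frac{1}{2} \log n$; by the standard asymptotic expansion of the Bernoulli NML normalizer, the gap should approach the constant $\frac{1}{2} \log \frac{\pi}{2}$. The function names are hypothetical.

```python
from math import exp, lgamma, log, pi


def log_binary_nml_normalizer(n):
    """log C(2, n) for a binary column of length n, summed in log space."""

    def xlogx_ratio(h):
        # h * log(h / n), with the convention 0 * log 0 = 0.
        return h * log(h / n) if h > 0 else 0.0

    terms = [
        lgamma(n + 1) - lgamma(h + 1) - lgamma(n - h + 1)
        + xlogx_ratio(h) + xlogx_ratio(n - h)
        for h in range(n + 1)
    ]
    return log(sum(exp(t) for t in terms))


def penalty_gap(n):
    """NML penalty minus the BIC penalty (k / 2) * log n with k = 1."""
    return log_binary_nml_normalizer(n) - 0.5 * log(n)


# Limit of the gap according to the asymptotic expansion.
LIMIT = 0.5 * log(pi / 2)
```

Evaluating `penalty_gap` at growing `n` shows the gap settling near `LIMIT`, so the penalty grows like $\frac{1}{2} \log n$ up to a constant.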

  27. Robustness (plot with curves labeled BIC, BDe, and fNML): BDe is optimal when the prior is "correct"; fNML is almost as good.

  29. Robustness (plot with a curve labeled fNML): BDe is much worse when the prior is "incorrect"; fNML is more robust.

  30. Questions?

  31. Decomposable Scores. Problem: super-exponential search space. Solution: decomposable scores, $\mathrm{SCORE}(G, D) = \sum_{i=1}^{m} S(D_i, D_{G_i})$. For decomposable scores, exact search (global optimum) can be done for about $m \le 30$ nodes (Koivisto & Sood, 2004; Silander and Myllymäki, 2006).
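
Decomposability means the network score is a sum of per-variable terms, each depending only on a column and its parent columns. A minimal Python sketch follows; the maximized log-likelihood is used only as a placeholder local score (it has no complexity penalty and would favour dense graphs), and the function names and the dictionary layout of `data` and `graph` are assumptions.

```python
from collections import Counter
from math import log


def local_log_likelihood(child, parent_rows):
    """Placeholder local score S(D_i, D_Gi): maximized conditional log-likelihood."""
    n_jk = Counter(zip(parent_rows, child))
    n_j = Counter(parent_rows)
    return sum(c * log(c / n_j[j]) for (j, _), c in n_jk.items())


def decomposable_score(data, graph, local_score=local_log_likelihood):
    """Sum of local scores over all variables.

    data:  dict mapping column name to its list of values
    graph: dict mapping column name to a tuple of parent column names
    """
    total = 0.0
    for var, pa in graph.items():
        if pa:
            # One tuple of parent values per data row.
            parent_rows = list(zip(*(data[p] for p in pa)))
        else:
            parent_rows = [()] * len(data[var])
        total += local_score(data[var], parent_rows)
    return total
```

Because each term depends on one variable and its parents only, a search procedure can cache local scores and re-evaluate just the terms whose parent sets change.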
