Learning Polytrees with Constant Number of Roots from Data
Jan Manuch¹,², Javad Safaei¹, Ladislav Stacho²
1. University of British Columbia, Department of Computer Science
2. Simon Fraser University, Department of Mathematics
Introduction
• The goal is to learn a probabilistic graphical model (a Directed Acyclic Graph, or DAG) that optimizes an objective function, given a dataset.
• Types of objective functions:
▫ Bayesian Score
▫ Maximum Likelihood (ML) Score
• Chickering (1996) [1] showed that learning optimal Bayesian DAGs is NP-complete. Similarly, learning minimal ML DAGs is NP-complete.
• A minimal ML DAG is an ML DAG with the minimum number of edges.
Data Set
• Data is a set of m vectors ($D_j$, $1 \le j \le m$).
• Each vector has a fixed number n of features ($X_i$, $1 \le i \le n$).
• Each feature can take different values: $val(X_i) = \{v_1, v_2, \dots, v_{m_i}\}$.
• The value of the j-th vector at the i-th feature is denoted $d_{j,i}$.
• ML Score of D and DAG $\bar{G}$:
$$P_{\bar{G}}(D) = \prod_{j=1}^{m} P_{\bar{G}}(D_j)$$
• Example (m = 16, n = 4, $m_i$ = 2; encoded in the sketch below):

  Vector  X1 X2 X3 X4
   1       0  1  1  0
   2       0  1  1  0
   3       1  1  0  0
   4       1  1  0  0
   5       1  1  1  1
   6       0  0  1  1
   7       1  1  1  0
   8       1  0  1  1
   9       0  1  0  1
  10       0  1  1  1
  11       1  1  1  0
  12       1  0  0  1
  13       0  0  0  0
  14       0  1  1  0
  15       0  0  0  0
  16       0  1  0  1
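For concreteness, a minimal Python sketch (ours, not part of the poster) that encodes the example dataset above and computes an empirical marginal; the function name is our own.

```python
import numpy as np

# The 16 x 4 example dataset from the slide (m = 16 vectors, n = 4 binary features).
D = np.array([
    [0, 1, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0],
    [1, 1, 1, 1], [0, 0, 1, 1], [1, 1, 1, 0], [1, 0, 1, 1],
    [0, 1, 0, 1], [0, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 1],
    [0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 1, 0, 1],
])

def empirical_marginal(D, i, v):
    """P(X_i = v): the fraction of the m vectors whose i-th feature equals v."""
    return np.mean(D[:, i] == v)

print(empirical_marginal(D, 0, 1))  # P(X_1 = 1) = 7/16 = 0.4375
```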
Learning Tree Structures
• Chow and Liu (1968) [2] showed that ML trees can be learned in polynomial time, namely $O(n^2(m + \log n))$:
1. Compute the mutual information (MI) of every pair of vertices.
2. Find the MST (maximum-weight spanning tree) using MI as edge weights.
3. Pick any vertex as the root and orient the edges away from it (it can be shown that the choice of root does not affect the ML score). A sketch follows below.
• Definition. Polytrees are directed graphs with no undirected cycles.
• Dasgupta (1999) [3] showed that learning ML polytrees from data is NP-complete.
• We study finding ML polytrees with a constant number of roots.
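A compact Python sketch of the three Chow–Liu steps, assuming discrete data in a NumPy array; the helper names and the use of NetworkX are our choices, not the poster's.

```python
import numpy as np
import networkx as nx

def mutual_information(D, a, b):
    """Empirical mutual information I(X_a; X_b) from the data matrix D."""
    mi = 0.0
    for u in np.unique(D[:, a]):
        for v in np.unique(D[:, b]):
            p_uv = np.mean((D[:, a] == u) & (D[:, b] == v))
            p_u, p_v = np.mean(D[:, a] == u), np.mean(D[:, b] == v)
            if p_uv > 0:
                mi += p_uv * np.log(p_uv / (p_u * p_v))
    return mi

def chow_liu_tree(D):
    n = D.shape[1]
    G = nx.Graph()
    # Step 1: mutual information of every pair of vertices.
    for a in range(n):
        for b in range(a + 1, n):
            G.add_edge(a, b, weight=mutual_information(D, a, b))
    # Step 2: maximum-weight spanning tree with MI as edge weights.
    mst = nx.maximum_spanning_tree(G)
    # Step 3: pick any root and orient all edges away from it.
    return nx.bfs_tree(mst, source=0)
```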
Factorization
• Definition. The probability of every input vector $D_j$ given a DAG $\bar{G}$ is defined as:
$$P_{\bar{G}}(D_j) = \prod_{i=1}^{n} P(X_i = d_{j,i} \mid \Pi_i = \pi_{j,i}),$$
where $\Pi_i$ is the set of all parent nodes of $X_i$ in $\bar{G}$, and $\pi_{j,i}$ are their values in vector $D_j$.
• $P_{\bar{G}}$ is also called the factorized form of the distribution $P$ with respect to $\bar{G}$ (see the sketch below).
• $P$ itself is called the empirical distribution and is computed from the data:
$$P(X_i = v) = \frac{\sum_{j=1}^{m} \langle d_{j,i} = v \rangle}{m},$$
where $\langle \cdot \rangle$ denotes the indicator of a condition.
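A self-contained Python sketch of this factorization (our own helper, with empirical conditional probabilities estimated from the data itself); the function and parameter names are ours.

```python
import numpy as np

def avg_log_likelihood(D, parents):
    """Average log-likelihood of D under the factorization above, with each
    conditional P(X_i | Pi_i) estimated empirically from D itself.
    `parents` maps a feature index i to a tuple of parent indices (Pi_i)."""
    m, n = D.shape
    total = 0.0
    for j in range(m):
        for i in range(n):
            pa = list(parents.get(i, ()))
            # Rows of D that agree with D_j on the values of X_i's parents.
            rows = np.all(D[:, pa] == D[j, pa], axis=1) if pa else np.ones(m, bool)
            total += np.log(np.mean(D[rows, i] == D[j, i]))
    return total / m

# Toy usage on a 4-row dataset with the chain X1 -> X2 (indices 0 -> 1):
D = np.array([[0, 1], [0, 1], [1, 0], [1, 1]])
print(avg_log_likelihood(D, {1: (0,)}))
```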
Merging nodes and edges
• Definition. Vertices having more than one parent in a DAG are called merging nodes, and merging edges are all incoming edges of merging nodes.
• Proposition 1 (Verma and Pearl 1990 [4]). Two DAGs with the same skeleton and the same merging edges factorize a distribution identically, i.e., if $skel(\bar{G}) = skel(\bar{G}')$ and $ME(\bar{G}) = ME(\bar{G}')$, then $P_{\bar{G}} = P_{\bar{G}'}$, where $ME(\bar{G})$ is the collection of all merging edges of $\bar{G}$.
• Proposition 1 helps us avoid enumerating all orientations of edges (a small equivalence test is sketched below).
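A small sketch (our own helpers, not from the poster) that extracts a DAG's skeleton and merging edges so Proposition 1's equivalence condition can be checked directly.

```python
def skeleton_and_merging_edges(parents):
    """`parents` maps each node to the set of its parents.  Returns the
    skeleton as a set of unordered edges, plus all incoming edges of
    merging nodes (nodes with more than one parent) as ordered pairs."""
    skeleton = {frozenset((p, c)) for c, ps in parents.items() for p in ps}
    merging = {(p, c) for c, ps in parents.items() if len(ps) > 1 for p in ps}
    return skeleton, merging

def same_factorization(parents_a, parents_b):
    """Proposition 1: equal skeleton and equal merging edges imply the same
    factorized distribution for every dataset."""
    return (skeleton_and_merging_edges(parents_a)
            == skeleton_and_merging_edges(parents_b))

# X -> Z <- Y in both graphs; only the non-merging edge between W and X flips.
A = {"Z": {"X", "Y"}, "W": {"X"}}   # edges X->Z, Y->Z, X->W
B = {"Z": {"X", "Y"}, "X": {"W"}}   # edges X->Z, Y->Z, W->X
print(same_factorization(A, B))      # True
```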
Learning Polytrees Algorithm
• Proposition 2. In a polytree with $k > 1$ roots and $s$ merging nodes, the following properties hold:
$$2 \le |S_\ell| \le k, \qquad \sum_{\ell=1}^{s} |S_\ell| = k + s - 1,$$
where $|S_\ell| = d_\ell + 2$ is the number of parents of the $\ell$-th merging node.
• Algorithm for k-root polytrees (a sketch of step 2 follows below):
1. Generate a set of merging edges respecting Proposition 2.
2. For each selection of merging edges, run the MST algorithm, but do not allow components to contain more than one merging node.
3. Pick any orientation of the remaining undirected edges that does not introduce a new merging node (valid by Proposition 1).
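The following Python sketch illustrates step 2 under our reading of the constraint; all names are ours, and we assume the merging-edge selection keeps distinct merging nodes in distinct components.

```python
import itertools

def constrained_mst(weights, merging_edges, n):
    """Kruskal-style sketch of step 2: the chosen merging edges are fixed
    up front, then remaining vertex pairs are considered in decreasing
    mutual-information order, and an edge is rejected if it would join two
    components that both already contain a merging node.
    `weights[a][b]` is the MI of vertices a and b.
    Returns the undirected edge set of the resulting structure."""
    parent = list(range(n))
    merging_nodes = {c for (_, c) in merging_edges}
    has_merging = {v: v in merging_nodes for v in range(n)}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        parent[ra] = rb
        has_merging[rb] = has_merging[ra] or has_merging[rb]

    edges = {frozenset(e) for e in merging_edges}
    for (p, c) in merging_edges:          # merging edges are fixed first
        union(p, c)

    for (a, b) in sorted(itertools.combinations(range(n), 2),
                         key=lambda e: -weights[e[0]][e[1]]):
        ra, rb = find(a), find(b)
        if ra != rb and not (has_merging[ra] and has_merging[rb]):
            union(a, b)
            edges.add(frozenset((a, b)))
    return edges
```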
Example ◮ Pick a selection of merging edges ( n = 7, k = 3)
Example ◮ Run modified MST algorithm (figures: successive steps of the modified MST on the example)
Example ◮ Orient edges in components containing merging nodes (the merging node is the root)
Example ◮ Orient edges in other components (roots in each component can be picked arbitrarily)
Counting selections of merging edges
• Let $C(n, k)$ be the total number of selections of merging edges in polytrees with $n$ nodes and $k$ roots. Summing over the number of merging nodes $s$ and the degree profiles $(d_1, \dots, d_s)$ of Proposition 2:
$$C(n,k) \;\le\; \sum_{s=1}^{k-1} \; \sum_{\substack{d_1+d_2+\cdots+d_s = k-s-1 \\ d_\ell \ge 0}} \binom{n}{s} \prod_{\ell=1}^{s} \binom{n}{d_\ell+2} \;\le\; \sum_{s=1}^{k-1} \binom{k-2}{s-1}\, n^{2s+k-1} \;\le\; n^{k+1}\,(1+n^2)^{k-2} \;\in\; O(n^{3k-3}),$$
where $\binom{n}{s}\prod_\ell \binom{n}{d_\ell+2} \le n^{2s+k-1}$ bounds the choices of the $s$ merging nodes and their parent sets, and $\binom{k-2}{s-1}$ counts the degree profiles for a fixed $s$ (by Proposition 2; see the sanity check below).
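As a sanity check on the binomial count used above, this small Python snippet (ours) enumerates the degree profiles $(d_1, \dots, d_s)$ of Proposition 2 and compares their number with $\binom{k-2}{s-1}$.

```python
from math import comb

def compositions(total, parts):
    """All tuples (d_1, ..., d_parts) of nonnegative integers summing to total."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

# Proposition 2: with s merging nodes, the parent-set sizes d_l + 2 satisfy
# d_1 + ... + d_s = k - s - 1; the number of such degree profiles is the
# stars-and-bars count binom(k-2, s-1) used in the bound above.
k = 5
for s in range(1, k):
    profiles = list(compositions(k - s - 1, s))
    assert len(profiles) == comb(k - 2, s - 1)
    print(s, len(profiles))
```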
Algorithm's Complexity
• Our algorithm enumerates $C(n,k) \in O(n^{3k-3})$ merging edge selections, and for each spends $O(n^2 + n + nm)$ time on edge completion, orientation assignment, and likelihood computation. Hence, the total complexity of our algorithm is $O(mn^{3k-2})$ (for $m \ge n$).
• Gaspers et al. [5] introduced $k$-branchings: polytrees that can be turned into directed forests by removing $k$ arcs. They gave an algorithm for learning $k$-branchings that runs in time $O(mn^{3k+4})$.
• Proposition. Learning a $k$-branching is equivalent to learning a polytree with up to $k+1$ roots.
• Our algorithm is faster than the algorithm of Gaspers et al. [5] by a factor of $O(n^3)$.
Experiment: Identification of phosphorylation sites
• Peptides are short sequences of amino acids. We consider peptides of length 9 centered at a phosphorylation site (Serine, Threonine, or Tyrosine) which is phosphorylated by protein kinases.
• Two peptide datasets are used:
▫ 803 peptides that are phosphorylated by protein kinase PKC
▫ 1000 randomly selected peptides that are phosphorylated by some kinase
• We learn maximum likelihood polytrees with two and three roots.
Results

  Algorithm              |     Peptides of PKC          |   1000 Random Peptides
                         |  Score    Time   # Trees    |  Score    Time   # Trees
                         |                   tested    |                   tested
  -----------------------+-----------------------------+-----------------------------
  MST (1 root) = tree    | -19.15     0.15        1    | -21.47     0.04        1
  Heuristic:             |                             |
    MST+ (2 roots)       | -18.14     1.07        9    | -20.40     1.13        8
    MST+ (3 roots)       | -17.26     2.86       23    | -19.35     2.77       18
  Exact:                 |                             |
    2 roots              | -18.02    27.47      252    | -20.37    35.98      252
    3 roots              | -16.97  2551.50    23184    | -19.30  3235.38    23184

• PKC peptides have a higher average likelihood score than random peptides, as expected, since they are more convergent (homogeneous).
• The higher the number of roots, the better the likelihood score, as expected.
Application?: Predicting peptide structure
• If we assume that nodes connected in the learned polytree correspond to positions that are close in the 3D structure of the peptides, then the learned structure gives some information about that 3D structure.
(Figures: the tree structure learned by MST vs. the structure learned by the 3-root polytree algorithm.)
Conclusions
• We presented an $O(mn^{3k-2})$ algorithm for learning ML polytrees with $n$ nodes and $k$ roots, which improves on the algorithm of Gaspers et al. [5] by a factor of $O(n^3)$.
• We applied this algorithm to predicting whether peptides are phosphorylated (or phosphorylated by a particular kinase).
• Open question: is there an FPT algorithm for this problem?
References • [1] Chickering, D.M.: Learning Bayesian networks is NP-complete. In: Learning from data, pp. 121–130. Springer (1996) • [2] Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 14(3), 462– 467 (1968) • [3] Dasgupta, S.: Learning polytrees. In: Uncertainty in Artificial Intelligence, pp. 134–141 (1999) • [4] Verma, T.S., Pearl, J.: Equivalence and synthesis of causal models. In: Uncertainty in Artificial Intelligence (UAI). pp. 220–227 (1990) • [5] Gaspers, S., Koivisto, M., Liedloff, M., Ordyniak, S., Szeider, S.: On finding optimal polytrees. In: Twenty-Sixth AAAI Conference on Artificial Intelligence (2012)
Thank you • Any questions?