PROBABILISTIC MODELS FOR STRUCTURED DATA: Course Project

Instructor: Yizhou Sun (yzsun@cs.ucla.edu)
January 14, 2020

Overview
• Goal: design a probabilistic graphical model to solve a real-world problem, and write a report that could potentially be submitted to a venue for publication
• Teamwork
  • 3-4 people per group
• Milestones
  • Team formation due date: Week 2 (1 pt, as participation)
  • Proposal due date: Week 5 (5 pt)
  • Presentation due date: 3/12/2020, in class (20 pt)
  • Final report due date: 3/13/2020 (15 pt)
  • What to submit: project report and code

Report Guideline
• Format: no more than 8 pages, ACM SIG template: https://www.acm.org/publications/proceedings-template-16dec2016
  1. Title with group information (group # and name, group member names)
  2. Abstract
  3. Introduction of the overall goal and background
  4. Problem definition and formalization
  5. Methods description (detailed steps)
  6. Experiment design and evaluation
  7. Related work
  8. Conclusion
  9. References

Breakdown Points
1. Is the problem formalization reasonable?
2. Is the solution solid and reasonable?
3. Is there comparison with alternative approaches with reasonable evaluation?
4. Report writing quality

Problem 1: Paper Classification in Directed Citation Network
• Cora Dataset:
  • http://www.cs.umass.edu/~mccallum/code-data.html
  • Cora.zip
• Label: each paper is associated with a research topic
  • The dataset has a hierarchical structure; please use the top level of the hierarchy as labels
• Feature: each paper has words extracted from its title

• Task:
  • Design a probabilistic graphical model that leverages the citation links to classify papers into research topics
• Questions to address:
  • How should the asymmetry of the citation relation be reflected in the potential function design?
    • Design an asymmetric potential function and implement it correctly
  • Will considering the asymmetry improve the classification accuracy?
    • Compare with a solution that simply ignores the asymmetry
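One possible way to read the asymmetry question above is as a pairwise potential table whose entry for (citing label, cited label) differs from the entry for the reversed pair. The following is a minimal sketch under that assumption; the function names, the three-topic setup, and all numeric values are hypothetical and only illustrate the shape of the comparison, not a prescribed solution.

```python
import numpy as np

def symmetric_potential(num_labels, same=2.0, diff=1.0):
    """Potts-style table: the value depends only on whether the labels match."""
    psi = np.full((num_labels, num_labels), diff)
    np.fill_diagonal(psi, same)
    return psi

def edge_score(psi, citing_label, cited_label):
    """Potential value for one directed citation edge (citing -> cited)."""
    return psi[citing_label, cited_label]

def symmetrize(psi):
    """Baseline that ignores direction: average the table with its transpose."""
    return (psi + psi.T) / 2.0

def is_symmetric(psi):
    return bool(np.allclose(psi, psi.T))

# Hypothetical asymmetric table for 3 topics: rows index the citing paper's
# label, columns the cited paper's label. Values are made up for illustration.
psi_asym = np.array([
    [3.0, 2.0, 1.0],
    [0.5, 3.0, 1.0],
    [1.0, 1.0, 3.0],
])
psi_sym = symmetric_potential(3)
psi_baseline = symmetrize(psi_asym)
```

With the asymmetric table, a topic-0 paper citing a topic-1 paper scores differently from the reverse direction, while `psi_baseline` gives both directions the same value, which is one way to set up the "ignore the asymmetry" comparison.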

• Evaluation:
  • Hide p% of the labels as the test set; use the remaining labels for training
  • Vary p to see its impact on the classification accuracy
  • Evaluation metric for multi-label classification
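The evaluation protocol above could be sketched as follows. This is an illustrative assumption about how to hide p% of the labels and score predictions; the helper names (`split_labels`, `accuracy`, `macro_f1`) and the choice of macro-F1 as a second metric are not specified in the slides.

```python
import numpy as np

def split_labels(num_nodes, p, seed=0):
    """Hide p% of the node labels as the test set; the rest are for training."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_test = int(round(num_nodes * p / 100.0))
    test_idx = np.sort(perm[:n_test])
    train_idx = np.sort(perm[n_test:])
    return train_idx, test_idx

def accuracy(y_true, y_pred):
    """Fraction of hidden labels that were predicted correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def macro_f1(y_true, y_pred, num_labels):
    """Unweighted mean of per-class F1; a class with no support scores 0."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in range(num_labels):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

# Usage: build one split per value of p, then train on train_idx and score
# the model's predictions on test_idx for each split.
splits = {p: split_labels(100, p) for p in (10, 20, 50)}
```

Fixing the seed keeps the splits reproducible, so differences across values of p reflect the amount of hidden labels rather than a different random draw.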

Problem 2: Node Classification in Heterogeneous Bibliographic Network
• Dataset
  • four_area.zip
• Label: authors and venues are each associated with one of the four research areas, i.e., DB, DM, ML, IR
  • Label information can be found in DBLP_four_area.zip
• Feature: only papers are associated with text information

• Task:
  • Design a probabilistic graphical model to classify all the objects in the network into the four categories
• Questions to address:
  • How to leverage the different types of links in the network?
    • Design different potential functions for different link types by assuming different parameters
  • Will considering the type information of links improve the performance?
    • Compare with a solution that treats all links equally
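The idea of type-specific potentials above can be illustrated with one parameter per link type. The sketch below is only an assumption about one simple parameterization (a Potts-style log-potential with a per-type strength); the link-type names, node identifiers, and strength values are hypothetical.

```python
import numpy as np

NUM_AREAS = 4  # DB, DM, ML, IR

def potts_log_potential(strength, num_labels=NUM_AREAS):
    """Log-potential table: `strength` if the two labels agree, 0 otherwise."""
    return strength * np.eye(num_labels)

def log_score(edges, labels, log_psi_by_type):
    """Unnormalized log-score of a full labeling: sum of edge log-potentials."""
    total = 0.0
    for u, v, link_type in edges:
        total += log_psi_by_type[link_type][labels[u], labels[v]]
    return float(total)

# Toy network with two hypothetical link types.
edges = [
    ("p1", "a1", "paper-author"),
    ("p1", "v1", "paper-venue"),
    ("p2", "a1", "paper-author"),
]
labels = {"p1": 0, "a1": 0, "v1": 0, "p2": 1}

# Typed model: each link type has its own strength parameter.
typed = {
    "paper-author": potts_log_potential(1.0),
    "paper-venue": potts_log_potential(2.0),
}
# Baseline: every link type shares a single strength.
untyped = {link_type: potts_log_potential(1.0) for link_type in typed}

typed_score = log_score(edges, labels, typed)
untyped_score = log_score(edges, labels, untyped)
```

In a real model the strengths would be learned rather than fixed by hand; the point here is only that the typed and untyped variants differ in how many parameters they tie together.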

• Evaluation:
  • Hide p% of the labels as the test set; use the remaining labels for training
  • Vary p to see its impact on the classification accuracy
  • Evaluation metric for multi-label classification
  • Evaluation when multiple types of nodes exist
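For the last point above, one option (an assumption, not something the slides prescribe) is to report accuracy separately for each node type and then average across types, so that a numerous type does not dominate the result. The helper name and the toy data below are hypothetical.

```python
from collections import defaultdict

def accuracy_by_type(node_types, y_true, y_pred):
    """Per-node-type accuracy over the nodes that have a held-out true label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for node, node_type in node_types.items():
        if node not in y_true:
            continue  # no ground-truth label for this node
        total[node_type] += 1
        correct[node_type] += int(y_pred.get(node) == y_true[node])
    return {t: correct[t] / total[t] for t in total}

# Toy example: two authors, two venues, one unlabeled paper.
node_types = {"a1": "author", "a2": "author", "v1": "venue", "v2": "venue", "p1": "paper"}
y_true = {"a1": "DB", "a2": "ML", "v1": "DB", "v2": "IR"}
y_pred = {"a1": "DB", "a2": "DM", "v1": "DB", "v2": "IR", "p1": "DB"}

per_type = accuracy_by_type(node_types, y_true, y_pred)
macro_over_types = sum(per_type.values()) / len(per_type)
```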

Project 3: Polarity Detection for Twitter Users
• Dataset: crawl Twitter users who follow political figures, together with their following, retweet, and reply behaviors, as well as their tweets
• Task: design a probabilistic graphical model to classify all the users into two polarities

Project 4: Knowledge Completion for Knowledge Graphs via Higher-Order Dependency Modeling
• Datasets: knowledge graphs, such as YAGO, FreeBase, and NELL
• Task: design a probabilistic graphical model that can leverage higher-order dependencies to solve knowledge graph completion tasks
  • i.e., answer queries of the form <h, r, ?>
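A query of the form <h, r, ?> is commonly answered by scoring every candidate tail entity and ranking them. The sketch below assumes some scoring function is already available; `rank_tails`, `toy_score`, and the toy entities and score values are hypothetical placeholders and say nothing about how a higher-order model would compute the scores.

```python
def rank_tails(head, relation, entities, score, known_triples=()):
    """Rank candidate tails for (head, relation, ?) by descending score.

    Triples already known to be true are filtered out of the candidates.
    """
    known = set(known_triples)
    candidates = [t for t in entities if (head, relation, t) not in known]
    return sorted(candidates, key=lambda t: score(head, relation, t), reverse=True)

# Toy stand-in for a learned model: a lookup table of made-up scores.
TOY_SCORES = {
    ("UCLA", "locatedIn", "Los Angeles"): 0.9,
    ("UCLA", "locatedIn", "California"): 0.7,
    ("UCLA", "locatedIn", "Boston"): 0.1,
}

def toy_score(head, relation, tail):
    return TOY_SCORES.get((head, relation, tail), 0.0)

entities = ["Boston", "California", "Los Angeles"]
ranking = rank_tails("UCLA", "locatedIn", entities, toy_score)
filtered = rank_tails(
    "UCLA", "locatedIn", entities, toy_score,
    known_triples=[("UCLA", "locatedIn", "Los Angeles")],
)
```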

Project 5: Construct CS Taxonomy from Wiki
• Dataset: Wikipedia
• Task: construct a taxonomy for terms related to computer science
  • E.g., root node: "computer science"
  • https://www.researchgate.net/figure/Computer-Science-Taxonomy_fig1_260318181

Project 6: NER for Wiki Pages in CS
• Dataset: Wikipedia
• Task: conduct named entity recognition (NER) on the text of wiki pages
  • Categories: concept (e.g., machine learning, deep learning); algorithm (e.g., CNN); application (e.g., self-driving car); dataset (e.g., ImageNet), etc.
