Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks
Pedram Rooshenas, Dongxu Zhang, Gopal Sharma, Andrew McCallum
Structured Prediction
• We are interested in learning a function F: X → Y
• X: input variables
• Y: output variables
• We can define F(x) = argmin_y E(x, y)
• For a Gibbs distribution: P(y | x) = exp(-E(x, y)) / Z(x)
Structured Prediction Energy Networks (SPENs)
• If E(x, y) is parameterized by a differentiable model, such as a deep neural network, we can find a local minimum of E using gradient descent.
• Energy networks express the correlation among input and output variables.
• Traditionally, graphical models are used to represent the correlation among output variables.
• Inference is intractable for most expressive graphical models.
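For concreteness, here is a minimal sketch of gradient-descent inference over a relaxed, simplex-valued output; the softmax parameterization, step size, and number of steps are illustrative assumptions rather than the authors' exact settings.

```python
import torch

def gradient_descent_inference(energy_fn, x, num_labels, num_classes,
                               steps=30, lr=0.1, noise_std=0.0):
    """Relaxed inference: descend E(x, y) over soft outputs y on the simplex.

    `energy_fn(x, y)` is assumed to accept a soft output y of shape
    (num_labels, num_classes) whose rows sum to one, and return a scalar
    energy. Setting noise_std > 0 gives a noisy variant that can be used
    to draw samples during training.
    """
    logits = torch.zeros(num_labels, num_classes, requires_grad=True)
    optimizer = torch.optim.SGD([logits], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        y = torch.softmax(logits, dim=-1)        # rows stay on the probability simplex
        energy = energy_fn(x, y)
        energy.backward()
        if noise_std > 0:                        # optional noise for sampling
            logits.grad.add_(noise_std * torch.randn_like(logits))
        optimizer.step()
    return torch.softmax(logits, dim=-1).detach()
```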
Energy Models
[Figures from Altinel (2018) and Belanger (2016)]
Training SPENs
• Structural SVM (Belanger and McCallum, 2016)
• End-to-End (Belanger et al., 2017)
• Value-based training (Gygli et al., 2017)
• Inference Network (Tu and Gimpel, 2018)
• Rank-Based Training (Rooshenas et al., 2018)
Indirect Supervision
• Data annotation is expensive, especially for structured outputs.
• Domain knowledge can serve as a source of supervision.
• It can be written as a reward function R(x, y) that evaluates a pair of input and output configurations into a scalar value.
• For a given x, we are looking for the best y that maximizes the reward: y* = argmax_y R(x, y)
Search-Guided Training
We have a reward function that provides indirect supervision. We want to learn a smooth version of the reward function such that we can use gradient-descent inference at test time.
Search-Guided Training
[Figure: a noisy gradient-descent trajectory y0 → y1 → y2 → y3 → y4 → y5 over the energy surface]
We sample a point from the energy function using noisy gradient-descent inference.
Search-Guided Training
Then we project the sample onto the domain of the reward function (the sample is a point in the simplex, but the domain of the reward function is often discrete, i.e., the vertices of the simplex).
Search-Guided Training
Then the search procedure takes the projected sample as input and returns an output structure by searching the reward function.
Search-Guided Training
We expect the two points (the projected sample and the search output) to have the same ranking under the reward function and under the negative of the energy function; otherwise, we have a ranking violation.
Search-Guided Training
When we find a pair of points that violates the ranking constraint, we update the energy function to reduce the violation.
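Putting the preceding slides together, the following is a hedged sketch of one search-guided update. Here `infer_fn` and `reward_search` are assumed helpers (noisy gradient-descent inference and the reward-maximizing search operator), and the margin term is a simplification of the paper's rank-based objective, not its exact form.

```python
import torch

def search_guided_step(energy_net, optimizer, x, reward_fn, reward_search,
                       infer_fn, margin_scale=1.0):
    """One training step: sample, project, search, and fix ranking violations."""
    # 1) Sample a soft output from the current energy with noisy inference.
    y_soft = infer_fn(energy_net, x)
    # 2) Project onto the domain of the reward function (simplex vertices).
    y_hard = torch.nn.functional.one_hot(y_soft.argmax(-1), y_soft.shape[-1]).float()
    # 3) Let the search procedure improve the projected sample w.r.t. the reward.
    y_search = reward_search(x, y_hard, reward_fn)

    reward_gap = reward_fn(x, y_search) - reward_fn(x, y_hard)
    if reward_gap <= 0:
        return 0.0                               # no better point found: nothing to rank
    # 4) Ranking constraint: higher reward should mean lower energy.
    e_search = energy_net(x, y_search)
    e_sample = energy_net(x, y_hard)
    violation = torch.relu(margin_scale * reward_gap - (e_sample - e_search))
    optimizer.zero_grad()
    violation.backward()
    optimizer.step()
    return violation.item()
```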
Task-Loss as Reward Function for Multi-Label Classification
• The simplest form of indirect supervision is to use the task loss as the reward function (an example sketch follows).
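As an illustration (the exact task loss may differ), the F1 score between a candidate label set and the ground truth can play the role of the reward:

```python
def f1_reward(y_pred, y_true):
    """Reward a binary multi-label prediction by its F1 against the ground truth.

    y_pred, y_true: iterables of 0/1 indicators, one per label.
    """
    tp = sum(p and t for p, t in zip(y_pred, y_true))
    pred_pos = sum(y_pred)
    true_pos = sum(y_true)
    if pred_pos == 0 or true_pos == 0:
        return 0.0
    precision = tp / pred_pos
    recall = tp / true_pos
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```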
Domain Knowledge as Reward Function for Citation Field Extraction
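The rules themselves are not visible in the extracted slides; the sketch below is a hypothetical illustration of how domain knowledge could be encoded as a reward over a candidate tag sequence, not the paper's actual rule set.

```python
import re

def citation_reward(tokens, tags):
    """Score a candidate tag sequence for a citation string with simple rules.

    tokens: list of strings; tags: list of field names, one per token.
    Each satisfied rule adds to the reward; the rules are illustrative only.
    """
    reward = 0.0
    for token, tag in zip(tokens, tags):
        # A four-digit number is very likely the publication year.
        if re.fullmatch(r"(19|20)\d\d", token) and tag == "date":
            reward += 1.0
        # Numeric ranges like 123-130 usually denote pages.
        if re.fullmatch(r"\d+-\d+", token) and tag == "pages":
            reward += 1.0
    # Author names usually appear before the title in a citation.
    if "author" in tags and "title" in tags:
        if tags.index("author") < tags.index("title"):
            reward += 1.0
    return reward
```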
Energy Model
[Figure: the energy network for citation field extraction — each token embedding is concatenated with its tag distribution and fed to a convolutional layer with multiple filters and different window sizes, followed by max pooling and a multi-layer perceptron that outputs the energy. The example shows tokens such as "Wei", "Li", "Deep", "Learning", "for" with their probabilities under the author and title tags.]
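A minimal PyTorch sketch of such an energy network is given below; the embedding size, filter sizes, and hidden dimensions are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class CitationEnergyNet(nn.Module):
    """Hypothetical sketch of the CNN energy model described on the slide."""

    def __init__(self, vocab_size, num_tags, emb_dim=50,
                 filter_sizes=(2, 3, 4), num_filters=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        in_channels = emb_dim + num_tags  # token embedding concatenated with tag distribution
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels, num_filters, kernel_size=k, padding=k // 2)
            for k in filter_sizes
        )
        self.mlp = nn.Sequential(
            nn.Linear(num_filters * len(filter_sizes), 100),
            nn.ReLU(),
            nn.Linear(100, 1),
        )

    def forward(self, token_ids, tag_dist):
        # token_ids: (batch, seq_len); tag_dist: (batch, seq_len, num_tags), rows on the simplex
        x = torch.cat([self.embed(token_ids), tag_dist], dim=-1)   # (batch, seq_len, emb+tags)
        x = x.transpose(1, 2)                                      # (batch, channels, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.mlp(torch.cat(pooled, dim=1)).squeeze(-1)      # scalar energy per example
```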
Performance on Citation Field Extraction
Semi-Supervised Setting
• Alternately use the output of the search procedure (on unlabeled data) and the ground-truth labels (on labeled data) for training.
Shape Parser
[Figure: an input image I is parsed into a postfix shape program, e.g. c(32,32,28) c(32,32,24) - t(32,32,20) +; the predicted program is executed by a graphics engine to render an output image O, which can be compared with the input.]
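A hedged sketch of the corresponding reward, with a toy rasterizer standing in for the real graphics engine and IoU as the comparison, could look like this (the primitive shapes and rendering details are assumptions):

```python
import numpy as np

def render(program, size=64):
    """Execute a postfix shape program such as
    ['c(32,32,28)', 'c(32,32,24)', '-', 't(32,32,20)', '+']
    into a binary image. Only circles and a square stand-in for other
    primitives are supported in this toy engine."""
    yy, xx = np.mgrid[:size, :size]
    stack = []
    for tok in program:
        if tok == '+':
            b, a = stack.pop(), stack.pop()
            stack.append(a | b)                      # union
        elif tok == '-':
            b, a = stack.pop(), stack.pop()
            stack.append(a & ~b)                     # difference
        else:
            kind = tok[0]
            cx, cy, r = map(int, tok[2:-1].split(','))
            if kind == 'c':
                stack.append((xx - cx) ** 2 + (yy - cy) ** 2 <= (r // 2) ** 2)
            else:                                    # toy stand-in for other primitives
                stack.append((abs(xx - cx) <= r // 2) & (abs(yy - cy) <= r // 2))
    return stack.pop()

def shape_reward(program, target_image):
    """Reward = IoU between the rendered program and the input image."""
    pred = render(program, size=target_image.shape[0])
    inter = np.logical_and(pred, target_image).sum()
    union = np.logical_or(pred, target_image).sum()
    return inter / union if union > 0 else 0.0
```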
Shape Parser Energy Model
[Figure: the energy network for the shape parser — a CNN encodes the input image, which is combined with the distribution over program tokens (e.g., circle(16,16,12), triangle(32,48,16), circle(16,24,12), +) and passed through convolutional layers and a multi-layer perceptron to produce the energy.]
Search Budget vs. Constraints
Performance on Shape Parser
Conclusion and Future Directions
• If a reward function exists that evaluates every structured output into a scalar value, we can use unlabeled data for training structured prediction energy networks.
• Domain knowledge or non-differentiable pipelines can be used to define the reward functions.
• The main ingredient for learning from the reward function is the search operator.
• Here we only use simple search operators, but more complex search functions derived from domain knowledge can be used for complicated problems.