Learning Over-Parameterized Neural Networks on Structured Data
Yingyu Liang (UW-Madison)
Joint work with Yuanzhi Li (Princeton → Stanford)
Empirical Success of Deep Learning
• Machine translation
• Computer vision
• Game playing
• Robotics
Fundamental Questions
• Optimization: Why can we find a network with good accuracy on the training data?
• Generalization: Why is the network also accurate on new test instances?
• Key challenge: the optimization problem is non-convex. Theoretically hard, but not difficult in practice!
Mystery I: Over-Parameterization Helps Optimization
• Empirical observation: it is easier to train wider networks.
• Setup: generate synthetic data from a small ground-truth network, then train a larger network on it.
On the Computational Efficiency of Training Neural Networks. Roi Livni, Shai Shalev-Shwartz, Ohad Shamir. NeurIPS 2014.
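Below is a minimal PyTorch sketch of this teacher-student experiment. The widths, sample size, optimizer, and learning rate are illustrative choices, not the setup from the cited paper: a small "teacher" network generates the labels, and a much wider "student" is trained to fit them.

```python
import torch

torch.manual_seed(0)
d, n = 10, 1000                 # input dimension, number of samples
k_teacher, k_student = 5, 200   # the student is much wider than the teacher

# Ground-truth ("teacher") two-layer ReLU network generates the labels.
teacher = torch.nn.Sequential(
    torch.nn.Linear(d, k_teacher), torch.nn.ReLU(), torch.nn.Linear(k_teacher, 1)
)
X = torch.randn(n, d)
with torch.no_grad():
    y = teacher(X)

# Over-parameterized "student" network, trained with a standard optimizer.
student = torch.nn.Sequential(
    torch.nn.Linear(d, k_student), torch.nn.ReLU(), torch.nn.Linear(k_student, 1)
)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for step in range(2000):
    loss = torch.nn.functional.mse_loss(student(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The wide student typically drives the training loss near zero, whereas a
# student of the same width as the teacher often gets stuck at a worse loss.
print("final training loss:", loss.item())
```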
Mystery II: Practical DNNs Easily Fit Random Labels
• Empirical observation: practical DNNs easily fit random labels.
Understanding deep learning requires rethinking generalization. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. ICLR 2017.
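The following sketch reproduces this phenomenon in miniature. Zhang et al. shuffled the labels of real image datasets such as CIFAR-10; here, as a stand-in, random Gaussian targets with no relation to the inputs are used, and the network and data sizes are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
d, n, k = 10, 200, 1000   # few samples, very wide two-layer network

X = torch.randn(n, d)
y = torch.randn(n, 1)     # "random labels": targets unrelated to the inputs

net = torch.nn.Sequential(
    torch.nn.Linear(d, k), torch.nn.ReLU(), torch.nn.Linear(k, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(3000):
    loss = torch.nn.functional.mse_loss(net(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The network memorizes pure noise, so it cannot generalize here; the
# mystery is why the same architecture generalizes well on real data.
print("final training loss:", loss.item())
```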
Our Work
• Is there a simple theoretical explanation?
• Our answer: yes, for two-layer NNs on clustered data!
Poster: Tue Poster Session A #143
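A hypothetical instance of the clustered-data setting is sketched below: each class consists of well-separated clusters carrying a fixed label, and an over-parameterized two-layer ReLU network is trained on it. The cluster construction and all hyperparameters are illustrative assumptions, not the paper's exact conditions.

```python
import torch

torch.manual_seed(0)
d, n_per, k = 10, 100, 500   # input dim, points per cluster, hidden width

# Illustrative clustered data: 4 well-separated Gaussian clusters, each
# with a fixed binary label (assumed setup, not the paper's exact one).
centers = 3.0 * torch.randn(4, d)
cluster_labels = torch.tensor([0.0, 1.0, 0.0, 1.0])
X = torch.cat([c + 0.1 * torch.randn(n_per, d) for c in centers])
y = cluster_labels.repeat_interleave(n_per).unsqueeze(1)

# Over-parameterized two-layer ReLU network for binary classification.
net = torch.nn.Sequential(
    torch.nn.Linear(d, k), torch.nn.ReLU(), torch.nn.Linear(k, 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = torch.nn.BCEWithLogitsLoss()
for step in range(1000):
    loss = loss_fn(net(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final training loss:", loss.item())
```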