E9 205 Machine Learning for Signal Processing: Deep Learning for Audio and Vision (20-11-2019)
Speech Recognition: the noisy-channel view of automatic speech recognition systems. [Figure courtesy: Google Images]
Signal Modeling
▪ Short-term spectra integrated in mel frequency bands, followed by log compression + DCT: mel frequency cepstral coefficients (MFCC) [Davis and Mermelstein, 1979].
[Block diagram: 25ms short-term spectrum → mel integration + log → DCT]
Mel Frequency Cepstral Coefficients
▪ MFCC processing is repeated for every short-term frame, yielding a sequence of features. Typically 25ms frames with a 10ms hop in time.
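The MFCC pipeline above (framing, short-term spectrum, mel integration, log compression, DCT) can be sketched in a few lines of NumPy. The frame length, hop, FFT size, filterbank size, and cepstral count below are typical choices, not values fixed by the slides:

```python
import numpy as np

def mfcc(signal, sr=16000, frame_ms=25, hop_ms=10, n_mels=26, n_ceps=13):
    """Minimal MFCC sketch: framing -> |FFT|^2 -> mel filterbank -> log -> DCT-II."""
    frame_len = int(sr * frame_ms / 1000)          # 400 samples at 16 kHz
    hop = int(sr * hop_ms / 1000)                  # 160 samples
    n_fft = 512
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Short-term power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # Log mel energies, then DCT-II for cepstral decorrelation.
    logmel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return logmel @ dct.T   # shape: (n_frames, n_ceps)
```

One second of 16 kHz audio yields 98 frames of 13 coefficients each, i.e. the "sequence of features" the slide refers to.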
Speech Recognition
• Map the features to phone classes, using phone-labelled data.
[Diagram: triphone classes, e.g. /w/ /^/ /n/ for the context w-^-n]
• Classical machine learning: train a classifier on speech training data that maps each frame to the target phoneme class.
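As a toy version of this classical setup, the sketch below trains a softmax (multinomial logistic) classifier on synthetic MFCC-like frames; the three "phone classes" and their feature distributions are invented stand-ins for real phone-labelled data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for phone-labelled speech: 13-dim MFCC-like frames,
# 3 hypothetical phone classes with different means (made up for illustration).
X = np.vstack([rng.normal(m, 1.0, (200, 13)) for m in (-2.0, 0.0, 2.0)])
y = np.repeat(np.arange(3), 200)

# Softmax classifier trained by plain gradient descent on cross-entropy.
W, b = np.zeros((13, 3)), np.zeros(3)
onehot = np.eye(3)[y]
for _ in range(300):
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)          # class posteriors per frame
    grad = (p - onehot) / len(X)               # gradient of mean cross-entropy
    W -= 0.5 * X.T @ grad
    b -= 0.5 * grad.sum(axis=0)

# Frame-level accuracy on the (easily separable) training data.
acc = (np.argmax(X @ W + b, axis=1) == y).mean()
```

A real system would use many more classes (e.g. context-dependent triphones) and held-out evaluation, but the frame-to-phone mapping is the same idea.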
Back to Speech Recognition: Mapping Speech Features to Phonemes
Back to Speech Recognition: Mapping Speech Features to Phonemes to Words
[Block diagram: features → Pronunciation Model (dictionary of word pronunciations) → Language Model (word syntax) → Decoded Text]
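A toy illustration of how acoustic scores, a pronunciation model, and a language model combine into decoded text. The phones, lexicon, and probabilities below are invented for the example, and the one-frame-per-phone alignment is a deliberate simplification of real decoding:

```python
import numpy as np

# Hypothetical phone inventory, pronunciation dictionary, and unigram LM.
phones = ["w", "ah", "n", "t", "uw"]
lexicon = {"one": ["w", "ah", "n"], "two": ["t", "uw"], "won": ["w", "ah", "n"]}
lm = {"one": 0.5, "two": 0.4, "won": 0.1}   # made-up unigram priors

def decode(frame_logprobs):
    """Return the word maximizing acoustic + language-model log score.
    frame_logprobs: one array of phone log-probabilities per frame
    (crudely assuming one frame per phone)."""
    best, best_score = None, -np.inf
    for word, pron in lexicon.items():
        if len(pron) != len(frame_logprobs):
            continue
        acoustic = sum(frame_logprobs[i][phones.index(p)]
                       for i, p in enumerate(pron))
        score = acoustic + np.log(lm[word])
        if score > best_score:
            best, best_score = word, score
    return best
```

Note that "one" and "won" share a pronunciation, so the acoustic model alone cannot separate them; the language-model prior breaks the tie, which is exactly the role of the language model in the diagram.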
State of Progress: by 2018, error rates of 5.3%, with claims of human parity using BLSTM-based models!
Moving to End-to-End
[Diagram: Audio Features → Text Output]
Image Processing
Visual Geometry Group (VGG) Network
ImageNet Task: 1000 categories with roughly 1000 images each; in all, roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images. ImageNet consists of variable-resolution images, so the images are down-sampled to a fixed resolution of 224 × 224.
Can we go deeper?
Residual Blocks
Deep Networks with Residual Blocks
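The residual idea can be sketched as y = ReLU(x + F(x)), where F is a small stack of weight layers and x passes through an identity shortcut. A minimal NumPy version, with fully connected layers standing in for the convolutions of an actual ResNet:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = ReLU(x + F(x)): two weight layers plus an identity skip connection.
    The block only has to learn the residual F(x), which keeps very deep
    stacks trainable where plain networks degrade."""
    h = relu(x @ W1)          # first weight layer + nonlinearity
    f = h @ W2                # second weight layer (residual branch)
    return relu(x + f)        # identity shortcut added before the final ReLU

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(1, d))
# Near-zero weights: the block starts out close to an identity mapping.
W1 = rng.normal(size=(d, d)) * 0.01
W2 = rng.normal(size=(d, d)) * 0.01
y = residual_block(x, W1, W2)
```

With small initial weights the residual branch contributes almost nothing, so the block behaves like an identity map; this is one intuition for why adding more residual blocks does not hurt optimization the way adding plain layers can.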
Results with ResNet
Image Segmentation
The Problem of Segmentation
SegNet Architecture
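SegNet's characteristic operation is upsampling in the decoder using the max-pooling indices saved by the corresponding encoder layer, rather than learned deconvolutions. A minimal single-channel NumPy sketch of that pool/unpool pair:

```python
import numpy as np

def max_pool_with_indices(x, k=2):
    """k x k max pooling that also records argmax positions (SegNet encoder)."""
    H, W = x.shape
    pooled = np.zeros((H // k, W // k))
    idx = np.zeros((H // k, W // k), dtype=int)   # flat index into x
    for i in range(H // k):
        for j in range(W // k):
            patch = x[i*k:(i+1)*k, j*k:(j+1)*k]
            r, c = np.unravel_index(np.argmax(patch), patch.shape)
            pooled[i, j] = patch[r, c]
            idx[i, j] = (i*k + r) * W + (j*k + c)
    return pooled, idx

def max_unpool(pooled, idx, shape):
    """SegNet-style decoder upsampling: place each pooled value back at its
    recorded argmax position; all other locations stay zero."""
    out = np.zeros(shape).ravel()
    out[idx.ravel()] = pooled.ravel()
    return out.reshape(shape)
```

Reusing the encoder's indices preserves boundary locations through the decoder, which is one reason SegNet produces sharp segmentation edges without storing full encoder feature maps.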
Results from SegNet
U-Net
Summary of the Course
[Pie chart: course content distribution. Generative Modeling and Dimensionality Reduction: 45%; Discriminative Modeling: 55%]
Generative Modeling and Dimensionality Reduction
[Pie chart: Feature Processing, PCA/LDA, Gaussian and GMM, NMF, Linear and Logistic Regression, Kernel Methods; shares of 31%, 15%, 15%, 15%, 15%, and 8%, assigned per the chart]
Discriminative Modeling
[Pie chart: SVM, Neural Networks, Improving Learning, Improving Generalization, Deep Networks, Conv. Networks, RNNs, Understanding DNNs, Deep Generative Modeling, Applications; shares of 17%, 17%, 11%, 11%, 11%, 11%, 6%, 6%, 6%, and 6%, assigned per the chart]
When we started …
Dates of Various Rituals
❖ 5 assignments spread over 3 months (roughly one assignment every two weeks).
❖ September, 1st week: project topic announcements.
❖ September, 3rd week: 1st midterm.
❖ September, 4th week: project topic and team finalization and proposal submission [1- and 2-person teams].
❖ October, 1st week: project proposal.
❖ October, 3rd week: 2nd midterm.
❖ November, 1st week: project midterm presentations.
❖ December, 1st week: final exams.
❖ December, 2nd week: project final presentations.
Content Delivery
In Class: Theory and Mathematical Foundation; Intuition and Analysis.
Beyond Class: Implementation and Understanding.