Unsupervised neural network based feature extraction using weak top-down constraints

Herman Kamper (1, 2), Micha Elsner (3), Aren Jansen (4), Sharon Goldwater (2)

(1) CSTR and (2) ILCC, School of Informatics, University of Edinburgh, UK
(3) Department of Linguistics, The Ohio State University, USA
(4) HLTCOE and CLSP, Johns Hopkins University, USA

ICASSP 2015

Introduction

- Huge amounts of speech audio data are becoming available online.
- Even for severely under-resourced and endangered languages (e.g. unwritten), data is being collected.
- Generally this data is unlabelled.
- We want to build speech technology on available unlabelled data.
- Need unsupervised speech processing techniques.

Example application: query-by-example search

[Figure: spoken query]

What features should we use to represent the speech for such unsupervised tasks?

Supervised neural network feature extraction

Network, from top to bottom:

- Output: predict phone states (e.g. ay, ey, k, v)
- Phone classifier (learned jointly)
- Feature extractor (learned from data)
- Input: speech frame(s), e.g. MFCCs, filterbanks

But what if we do not have phone class targets to train our network?

Weak supervision: unsupervised term discovery

Can we use these discovered word pairs to provide us with weak supervision?

Weak supervision: align the discovered word pairs

Use the correspondence idea from [Jansen et al., 2013].

Autoencoder (AE) neural network

- Input: speech frame
- Output: same as input

A normal autoencoder neural network is trained to reconstruct its input. This reconstruction criterion can be used to pretrain a deep neural network.

The correspondence autoencoder (cAE)

- Input: frame from one word
- Output: frame from other word in pair

The correspondence autoencoder (cAE) takes a frame from one word and tries to reconstruct the corresponding frame from the other word in the pair. In this way we learn an unsupervised feature extractor using the weak word-pair supervision.
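The difference between the two objectives can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not on the slides: a single tanh hidden layer, a linear output layer, a mean squared error loss, plain gradient descent, and synthetic data standing in for aligned speech frames. The names `train_cae` and `extract_features` are hypothetical.

```python
import numpy as np

def train_cae(inputs, targets, hidden_dim=8, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer network mapping each input frame to its target.

    With targets equal to inputs this is an ordinary autoencoder; with targets
    taken from the other word in a pair it is a correspondence autoencoder.
    Returns the parameters and the mean squared error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    dim = inputs.shape[1]
    w1 = rng.normal(0.0, 0.1, (dim, hidden_dim))
    b1 = np.zeros(hidden_dim)
    w2 = rng.normal(0.0, 0.1, (hidden_dim, dim))
    b2 = np.zeros(dim)
    losses = []
    for _ in range(epochs):
        hidden = np.tanh(inputs @ w1 + b1)   # feature layer
        output = hidden @ w2 + b2            # linear reconstruction
        error = output - targets
        losses.append(float(np.mean(error ** 2)))
        # Backpropagate the mean squared error through both layers.
        grad_out = 2.0 * error / error.size
        grad_hidden = (grad_out @ w2.T) * (1.0 - hidden ** 2)
        w2 -= lr * (hidden.T @ grad_out)
        b2 -= lr * grad_out.sum(axis=0)
        w1 -= lr * (inputs.T @ grad_hidden)
        b1 -= lr * grad_hidden.sum(axis=0)
    return (w1, b1, w2, b2), losses

def extract_features(frames, params):
    """Use the hidden layer as the feature extractor (assumed choice of layer)."""
    w1, b1, _, _ = params
    return np.tanh(frames @ w1 + b1)

# Synthetic stand-ins for aligned frames from the two words in each pair.
rng = np.random.default_rng(1)
frames_a = rng.normal(size=(50, 5))
frames_b = frames_a + 0.1 * rng.normal(size=(50, 5))

params, losses = train_cae(frames_a, frames_b)   # cAE: target is the other frame
features = extract_features(frames_a, params)
```

Calling `train_cae(frames_a, frames_a)` instead would give the ordinary autoencoder objective, which is the only change between the two criteria in this sketch.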

Complete unsupervised cAE training algorithm

Starting from the speech corpus:

(1) Train stacked autoencoder (pretraining).
(2) Unsupervised term discovery.
(3) Align word pair frames.
(4) Train correspondence autoencoder, with weights initialized from the pretrained stacked autoencoder in (1).

The result is the unsupervised feature extractor.
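The frame alignment step can be illustrated as follows. The slide does not say how the frames are aligned, so this sketch assumes dynamic time warping (DTW) with a Euclidean frame distance; `dtw_path` and `make_training_pairs` are hypothetical names, and the toy one-dimensional "frames" stand in for real acoustic features.

```python
import numpy as np

def dtw_path(seq_a, seq_b):
    """Return the DTW alignment between two (frames x dims) arrays as a list
    of (index_in_a, index_in_b) pairs, using Euclidean frame distances."""
    n, m = len(seq_a), len(seq_b)
    dist = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the final cell to recover the aligned frame indices.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1, j - 1], i - 1, j - 1),
                 (cost[i - 1, j], i - 1, j),
                 (cost[i, j - 1], i, j - 1)]
        _, i, j = min(steps)
    return path[::-1]

def make_training_pairs(word_pairs):
    """Turn discovered word pairs into aligned (input, target) frame arrays."""
    inputs, targets = [], []
    for word_a, word_b in word_pairs:
        for i, j in dtw_path(word_a, word_b):
            inputs.append(word_a[i])
            targets.append(word_b[j])
    return np.array(inputs), np.array(targets)

# Toy pair: the second "word" repeats its first frame.
word_a = np.array([[0.0], [1.0], [2.0]])
word_b = np.array([[0.0], [0.0], [1.0], [2.0]])
path = dtw_path(word_a, word_b)
inputs, targets = make_training_pairs([(word_a, word_b)])
```

The resulting `inputs` and `targets` arrays are the kind of frame-level pairs that step (4) would train on.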

Evaluation of features: the same-different task

Word tokens: "apple", "pie", "grape", "apple", "apple", "like"

Treat one token as the query ("apple") and the remaining tokens as the terms to search. For each term i, compute the DTW distance d_i to the query and predict "same" if d_i < threshold, otherwise "different".

Term      DTW distance   Prediction   Correct?
"pie"     d_1            different    yes
"grape"   d_2            same         no
"apple"   d_3            same         yes
"apple"   d_4            different    no
"like"    d_N            different    yes
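The decision rule can be sketched as below. The frame distance (Euclidean), the length normalisation of the DTW cost, and the threshold value are assumptions made for this example, and `dtw_distance` and `same_different` are hypothetical names; the toy sequences stand in for feature sequences of real word tokens.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """DTW alignment cost between two (frames x dims) arrays, divided by the
    combined length so that long and short words are comparable (assumed)."""
    n, m = len(seq_a), len(seq_b)
    dist = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    return float(cost[n, m] / (n + m))

def same_different(query, terms, threshold):
    """Predict "same" for every term whose DTW distance to the query is
    below the threshold, and "different" otherwise."""
    return ["same" if dtw_distance(query, term) < threshold else "different"
            for term in terms]

query = np.array([[0.0], [1.0], [2.0]])
terms = [
    np.array([[0.0], [0.0], [1.0], [2.0]]),   # time-warped copy of the query
    np.array([[5.0], [6.0], [7.0]]),          # unrelated sequence
]
predictions = same_different(query, terms, threshold=0.5)
```

Sweeping the threshold over a range of values, rather than fixing it, would trade off the two kinds of error marked in the table above.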
