
Microphone Array Processing: A Quick Update
Iain McCowan
Guillaume Lathoud, Darren Moore, Olivier Masson
TAM, May 2003

Outline
• Speech enhancement
• Speaker segmentation
• Files available online

Speech enhancement
• Improving enhancement in overlapping speech by post-filtering beamformer outputs
• Beamformer outputs: $y_n(f)$ for each speaker location $n = 1, \dots, N$
• Post-filter A (Wiener-like, $|S|^2 / |N|^2$):

$$\hat{y}_n(f) = \frac{|y_n(f)|^2}{\frac{1}{N-1} \sum_{m \neq n} |y_m(f)|^2} \, y_n(f) \tag{1}$$

• Post-filter B (binary mask):

$$\hat{y}_n(f) = \begin{cases} y_n(f) & n = \arg\max_m y_m(f) \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
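The two post-filters can be sketched in a few lines of plain Python. This is only an illustration of equations (1) and (2), not the authors' implementation: the function names, the list-of-lists layout (one list of complex frequency bins per speaker location), and the small `eps` guard against division by zero are all assumptions. For the binary mask, the sketch also assumes that the maximum in equation (2) is taken over magnitudes, since the beamformer outputs are complex.

```python
def post_filter_a(y, eps=1e-12):
    """Wiener-like post-filter, following equation (1).

    y: list of N beamformer outputs, each a list of complex frequency bins.
    eps: assumed small constant to avoid division by zero (not in the slides).
    """
    n_locations = len(y)
    if n_locations < 2:
        raise ValueError("post-filter A needs at least two beamformer outputs")
    n_bins = len(y[0])
    out = []
    for n in range(n_locations):
        row = []
        for f in range(n_bins):
            # Power of the target beamformer in this bin.
            target_power = abs(y[n][f]) ** 2
            # Mean power of all the other beamformers in this bin.
            other_power = sum(
                abs(y[m][f]) ** 2 for m in range(n_locations) if m != n
            ) / (n_locations - 1)
            gain = target_power / (other_power + eps)
            row.append(gain * y[n][f])
        out.append(row)
    return out


def post_filter_b(y):
    """Binary-mask post-filter, following equation (2).

    In each bin, only the beamformer with the largest magnitude keeps its
    value (assumption: the arg max is over |y_m(f)|); all others are zeroed.
    """
    n_locations = len(y)
    n_bins = len(y[0])
    out = [[0j] * n_bins for _ in range(n_locations)]
    for f in range(n_bins):
        winner = max(range(n_locations), key=lambda m: abs(y[m][f]))
        out[winner][f] = y[winner][f]
    return out
```

Calling `post_filter_b([[3+0j, 1+0j], [1+0j, 2+0j]])` keeps the first bin only for location 0 and the second bin only for location 1, which is the behaviour that removes cross-talk at the cost of possible distortion.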

Speech enhancement
• Subjectively, post-filter B leads to a significant reduction in cross-talk level.
• To verify, initial recognition experiments
  • MONC (Multi-channel Overlapping Numbers Corpus, a re-recording of Numbers 95). Note: baseline lapel with no conflicting speech is 7.0% WER.
• With one overlapping speaker (word error rates, %):

    Lapel   Previous Array Best   Post-filter B
    26.7    19.3                  12.2

• With two overlapping speakers (word error rates, %):

    Lapel   Previous Array Best   Post-filter B
    35.3    26.6                  15.8
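To put the word error rates above in perspective, the relative reduction obtained by post-filter B over the previous best array result can be computed directly from the reported figures. The snippet below is only a convenience calculation; the function and dictionary key names are illustrative.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction of the baseline word error rate removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# Word error rates (%) as reported on the slide.
wer = {
    "one overlapping speaker": {
        "lapel": 26.7, "previous_array_best": 19.3, "post_filter_b": 12.2,
    },
    "two overlapping speakers": {
        "lapel": 35.3, "previous_array_best": 26.6, "post_filter_b": 15.8,
    },
}

reductions = {}
for condition, scores in wer.items():
    reductions[condition] = relative_wer_reduction(
        scores["previous_array_best"], scores["post_filter_b"]
    )
    print(f"{condition}: {100 * reductions[condition]:.1f}% relative WER reduction")
```

In both conditions post-filter B removes more than a third of the errors made by the previous best array configuration, although it remains above the 7.0% WER of the lapel baseline without conflicting speech.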

Speaker Segmentation
• Previously, presented work on segmenting using location features.
• Since then...
  • Now doing clustering and segmentation using both location features and standard acoustic features across meetings.
  • Segment in terms of location and identity (cluster index) concurrently.
  • Using multi-stream HMM to cluster in each space independently, but enforce same temporal segmentation.
  • Automatically converges to correct number of locations and identities.
  • Initial results show high segmentation accuracy (≈ 95% frame accuracy).

Files available online
• Now appearing on mmm.idiap.ch
• Beamformer outputs for Post-filter A and B for each seated speaker location (1-4) (Scripted Meeting set only).
  • Beamformer-B files have lower noise, though perhaps more distortion than Beamformer-A.
• Beamformer outputs for whiteboard and presentation not yet available.
  • Current beamformers are too precise for the typical movement in these regions; investigating a minimum beam-width constraint or adaptive techniques.
• Beamformer-B mix file available (BeamB-mix): simple sum of the 4 speaker beamformers.
  • Remember, this does not yet cater for whiteboard or presentation speech.
  • Currently, a low-level buzz is apparent in this mix file; to be fixed.
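The "simple sum" used for the BeamB-mix file can be pictured as a sample-by-sample addition of the four seated-speaker beamformer signals. The sketch below assumes the signals are already time-aligned, equal-length lists of samples; the function name is hypothetical, and no normalisation or clipping protection is applied because none is mentioned in the slides.

```python
def mix_beamformers(signals):
    """Sum equal-length beamformer signals sample by sample.

    signals: list of per-location sample sequences (assumed time-aligned).
    """
    if not signals:
        raise ValueError("at least one signal is required")
    if len({len(s) for s in signals}) != 1:
        raise ValueError("all signals must have the same length")
    # zip(*signals) groups the samples that share the same time index.
    return [sum(samples) for samples in zip(*signals)]
```

Because the mix is a plain sum, any residual noise present in each beamformer output adds up as well, so artefacts in the individual files carry over into the mix.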

