DCU at the NTCIR-11 SpokenQuery&Doc Task
David N. Racca, Gareth J.F. Jones
CNGL Centre for Global Intelligent Content, School of Computing, Dublin City University, Dublin, Ireland

Overview
- We participated in the slide-group segment retrieval subtask of SQ-SCR.
- General idea:
  ● Augment text-retrieval methods with prosodic features: pitch (F0), loudness, and duration.
  ● Compute an acoustic score for each term.
  ● Promote the rank of segments containing acoustically prominent terms.

Motivation
- Prosody:
  ● Rhythm, stress, intonation, duration, loudness.
- Shown to be useful in many speech processing tasks:
  ● Emotions, discourse structure, speech acts, speaker ID, topic segmentation.
- Prominent speech units stand out from their context.
- Information status: old vs. new information.

Related Work
- Crestani [1]: possible correlation between acoustic stress and TF-IDF scores (English).
- Chen et al. [2]: signal amplitude and duration in a spoken document retrieval (SDR) task (Mandarin).
- Guinaudeau [3]: F0 and RMS energy in a topic tracking task (French).
- Racca et al. [4]: F0, loudness, and duration in SCR (English).

Data Pre-processing
- 1-best WORD match, unmatchAMLM, and manual transcripts.
- Lecture normalisation of F0 and loudness: $v_{norm} = \frac{v_{raw} - \min v}{\max v - \min v}$
[Figure: pre-processing pipeline. Labels recoverable from the diagram: lectures; queries; WAV files; manual and ASR transcripts provided by the organisers; Julius LVCSR; 10-best ASR hypotheses per IPU; ChaSen; capitalisation; annotation removal; forced alignment; VAD; IPUs; enriched manual and ASR transcripts; OpenSMILE extracting F0 and loudness every 10 ms; lecture normalisation.]

Prosodic Features
- Raw duration, lecture-normalised F0 and loudness.
- Lecture normalisation: $v_{norm} = \frac{v_{raw} - \min v}{\max v - \min v}$
- Example (one term occurrence, spoken from ~1.02 s to ~2.36 s; figure label: tf-idf):
  ● Duration: $d_{i,j}^{k} = 2.36\,\mathrm{s} - 1.02\,\mathrm{s} = 1.34\,\mathrm{s}$
  ● Loudness: raw $\max(l_{i,j}^{k}) = 1.16$, normalised $\max(l_{i,j}^{k}) = 0.37$
  ● Pitch (F0): raw $\max(f0_{i,j}^{k}) = 280.44$ Hz, normalised $\max(f0_{i,j}^{k}) = 0.58$
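The duration and lecture normalisation steps on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample loudness values are hypothetical, and returning 0.0 for a constant-valued lecture is an assumption the slides do not address.

```python
from typing import List, Sequence


def lecture_normalise(values: Sequence[float]) -> List[float]:
    """Min-max normalise the frame-level values of one lecture to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant signal carries no prominence information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def duration(start: float, end: float) -> float:
    """Raw duration of a term occurrence in seconds."""
    return end - start


# Hypothetical loudness frames for one lecture (not taken from the slides).
lecture_loudness = [0.2, 0.5, 1.0, 1.8]
print(lecture_normalise(lecture_loudness))

# The duration example from the slide: 2.36 s - 1.02 s.
print(round(duration(1.02, 2.36), 2))  # prints 1.34
```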

Prosodic Features
- F0, loudness, and duration for term "i" in segment "j", where k ranges over the occurrences of the term in the segment:
  $f0(i,j) = \max_k \{ \max(f0_{i,j}^{k}) \}$
  $l(i,j) = \max_k \{ \max(l_{i,j}^{k}) \}$
  $d(i,j) = \max_k \{ d_{i,j}^{k} \}$
  $f0_{range}(i,j) = \max_k \{ \max(f0_{i,j}^{k}) \} - \min_k \{ \min(f0_{i,j}^{k}) \}$
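As a rough illustration of the four term-level features, the sketch below takes the maximum (or range) over all occurrences of a term in a segment. The `Occurrence` container and the `term_features` helper are hypothetical names introduced here, and the frame values are assumed to be lecture-normalised already.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Occurrence:
    """One occurrence k of a term in a segment (hypothetical container)."""
    f0: List[float]        # normalised F0 frames spanning the occurrence
    loudness: List[float]  # normalised loudness frames
    start: float           # start time in seconds
    end: float             # end time in seconds


def term_features(occurrences: List[Occurrence]) -> Dict[str, float]:
    """Aggregate per-occurrence prosody into term-level features."""
    f0_max = max(max(o.f0) for o in occurrences)
    f0_min = min(min(o.f0) for o in occurrences)
    return {
        "f0": f0_max,
        "loudness": max(max(o.loudness) for o in occurrences),
        "duration": max(o.end - o.start for o in occurrences),
        "f0_range": f0_max - f0_min,
    }


# Two made-up occurrences of the same term in one segment.
occs = [
    Occurrence(f0=[0.20, 0.58], loudness=[0.10, 0.37], start=1.02, end=2.36),
    Occurrence(f0=[0.30, 0.40], loudness=[0.50, 0.20], start=5.00, end=5.50),
]
print(term_features(occs))
```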

Acoustic Score
- We experimented with six definitions for the acoustic score of term "i" in segment "j":
  $ac(i,j) = f0(i,j)$  Pitch [P]
  $ac(i,j) = l(i,j)$  Loudness [L]
  $ac(i,j) = d(i,j)$  Duration [Dur]
  $ac(i,j) = f0_{range}(i,j)$  Pitch Range [Pr]
  $ac(i,j) = l(i,j) \cdot f0(i,j)$  [LP]
  $ac(i,j) = l(i,j) \cdot f0_{range}(i,j)$  [LPr]
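The six definitions can be read as a lookup from a variant label to a function of the term-level features. The sketch below assumes the features arrive as a dictionary with the hypothetical keys `f0`, `loudness`, `duration`, and `f0_range`; only the labels P, L, Dur, Pr, LP, and LPr come from the slide.

```python
from typing import Callable, Dict

Features = Dict[str, float]

# Map each variant label from the slide to its scoring function.
ACOUSTIC_SCORES: Dict[str, Callable[[Features], float]] = {
    "P": lambda f: f["f0"],
    "L": lambda f: f["loudness"],
    "Dur": lambda f: f["duration"],
    "Pr": lambda f: f["f0_range"],
    "LP": lambda f: f["loudness"] * f["f0"],
    "LPr": lambda f: f["loudness"] * f["f0_range"],
}


def acoustic_score(variant: str, features: Features) -> float:
    """Return ac(i, j) for the chosen variant; raises KeyError if unknown."""
    return ACOUSTIC_SCORES[variant](features)


# Made-up feature values for one term in one segment.
example = {"f0": 0.5, "loudness": 0.4, "duration": 1.34, "f0_range": 0.25}
for name in ACOUSTIC_SCORES:
    print(name, acoustic_score(name, example))
```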

Indexing
[Figure: indexing pipeline. Labels recoverable from the diagram: enriched transcripts; IPUs with prosody; IPU grouping; slide-group segments with prosody; Terrier indexing; segment index.]
- Slide-group segments were indexed using the Terrier IR framework.
- The index stores F0, loudness, and duration for each term occurrence, along with text statistics.

Retrieval
- Probabilistic model with BM25 weighting:
  $rel(q, s_j) = \sum_{i}^{M} w(i,j)$
- Three definitions for $w(i,j)$ were explored:
  LI: $w(i,j) = idf(i,C) \, [\alpha \cdot tf(i,j) + (1 - \alpha) \, ac(i,j)]$
  G: $w(i,j) = \frac{\theta_{ir} \cdot tf(i,j) \cdot idf(i,C) + \theta_{ac} \cdot ac(i,j)}{\theta_{ir} + \theta_{ac}}$
  TF_IDF: $w(i,j) = tf(i,j) \cdot idf(i,C)$
- where
  $idf(i,C) = \log\left(\frac{N}{n_i} + 1\right)$
  $tf(i,j) = \frac{k_1 \cdot tf_{i,j}}{tf_{i,j} + k_1 \left(1 - b + b \, \frac{dl_j}{avdl}\right)}$
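A minimal sketch of the three weighting schemes follows. It is not the Terrier implementation: the function names are hypothetical, the defaults `k1 = 1.2` and `b = 0.75` are common BM25 choices assumed here rather than values from the slides, and the natural logarithm is assumed because the slide does not state the base.

```python
import math
from typing import Iterable


def bm25_tf(tf_raw: float, dl: float, avdl: float,
            k1: float = 1.2, b: float = 0.75) -> float:
    """Length-normalised term frequency tf(i, j) as written on the slide."""
    return k1 * tf_raw / (tf_raw + k1 * (1 - b + b * dl / avdl))


def idf(n_docs: int, doc_freq: int) -> float:
    """idf(i, C) = log(N / n_i + 1); natural log is an assumption."""
    return math.log(n_docs / doc_freq + 1)


def w_tf_idf(tf: float, idf_: float) -> float:
    """Text-only baseline weight [TF_IDF]."""
    return tf * idf_


def w_li(tf: float, idf_: float, ac: float, alpha: float) -> float:
    """Linear interpolation of tf and acoustic score, scaled by idf [LI]."""
    return idf_ * (alpha * tf + (1 - alpha) * ac)


def w_g(tf: float, idf_: float, ac: float,
        theta_ir: float, theta_ac: float) -> float:
    """Weighted average of the text weight and the acoustic score [G]."""
    return (theta_ir * tf * idf_ + theta_ac * ac) / (theta_ir + theta_ac)


def rel(weights: Iterable[float]) -> float:
    """Segment score: sum of the weights of the matching query terms."""
    return sum(weights)


# Made-up statistics for a single query term in one segment.
tf = bm25_tf(tf_raw=2, dl=120, avdl=100)
i = idf(n_docs=1000, doc_freq=50)
print(w_tf_idf(tf, i), w_li(tf, i, ac=0.6, alpha=0.7), w_g(tf, i, 0.6, 1, 1))
```

With `alpha = 1` the LI weight reduces to the TF_IDF weight, and with `theta_ac = 0` so does G, which makes the text-only run a natural baseline for both schemes.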

Parameter Tuning
- SpokenDoc-2 passage retrieval: 120 text queries.

Lecture transcript  w(i,j)  ac(i,j)  α    θ_ir  θ_ac  uMAP   pwMAP  fMAP
Manual              LI      LPr      0.7  -     -     .1369  .0976  .1005
Manual              LI      Pr       0.7  -     -     .1369  .0951  .0995
Manual              G       LP       -    1     1     .1326  .0960  .0989
Manual              TF-IDF  -        -    -     -     .1270  .0950  .0972
Match               LI      LPr      0.5  -     -     .0842  .0508  .0524
Match               LI      Dur      0.3  -     -     .0819  .0498  .0521
Match               G       Pr       -    1     1     .0786  .0473  .0499
Match               LI      Pr       0.7  -     -     .0778  .0490  .0501
Match               TF-IDF  -        -    -     -     .0682  .0477  .0486
UnmatchAMLM         G       P        -    3     1     .0288  .0208  .0131
UnmatchAMLM         LI      LP       0.5  -     -     .0278  .0210  .0135
UnmatchAMLM         LI      LPr      0.2  -     -     .0271  .0205  .0132
UnmatchAMLM         LI      P        0.9  -     -     .0227  .0206  .0129
UnmatchAMLM         TF-IDF  -        -    -     -     .0222  .0203  .0128

Results: SpokenQuery&Doc, Manual Transcripts
[Figure: bar chart of MAP (y-axis, 0 to 0.14) by spoken query type (Manual, Match, UnmatchAMLM) for the runs LI-Pr-0.7, LI-LPr-0.7, and TF_IDF.]

Results: SpokenQuery&Doc, Match Transcripts
[Figure: bar chart of MAP (y-axis, 0 to 0.14) by spoken query type (Manual, Match, UnmatchAMLM) for the runs LI-LPr-0.5, LI-Pr-0.7, LI-Dur-0.3, and TF_IDF.]

Results: SpokenQuery&Doc, UnmatchAMLM Transcripts
[Figure: bar chart of MAP (y-axis, 0 to 0.14) by spoken query type (Manual, Match, UnmatchAMLM) for the runs LI-LPr-0.2, LI-LPr-0.5, LI-P-0.9, and TF_IDF.]

Results: SpokenQuery&Doc, Query 1 (2 relevant segments): Prosodic-based vs TF_IDF
[Figure: horizontal bar chart of average precision (AveP, x-axis, 0 to 0.8) comparing TF_IDF and prosodic-based runs, grouped by transcript type and spoken query type (Manual, Match, Unmatch).]

Conclusions & Further Work
- We continued exploring whether prosodic prominence can be used to improve retrieval effectiveness.
- No significant differences were found between prosodic and text-based runs (Student's t-test, 95% confidence level).
- Transcript quality affects retrieval effectiveness.
- Prosodic-based models may be useful for some queries and target segments:
  ● Future work: predict when this happens.
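For readers unfamiliar with the significance test mentioned above, the following sketch computes a paired t statistic over per-query scores of two runs. Everything here is illustrative: the slides do not say whether a paired test was used, the average precision values are made up, and 2.776 is the standard two-sided 95% critical value for 4 degrees of freedom, which applies only to this five-query toy example.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_t_statistic(run_a: Sequence[float], run_b: Sequence[float]) -> float:
    """t statistic of the per-query score differences between two runs."""
    if len(run_a) != len(run_b):
        raise ValueError("both runs must score the same queries")
    diffs = [a - b for a, b in zip(run_a, run_b)]
    # stdev() is the sample standard deviation (n - 1 in the denominator);
    # identical differences across all queries would give a zero division.
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical per-query average precision for two runs (not from the slides).
prosodic = [0.10, 0.25, 0.05, 0.40, 0.15]
tf_idf = [0.12, 0.20, 0.05, 0.35, 0.18]

t = paired_t_statistic(prosodic, tf_idf)
T_CRIT_DF4 = 2.776  # two-sided 95% critical value for 4 degrees of freedom
print("significant" if abs(t) > T_CRIT_DF4 else "not significant")
```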

References
[1] Crestani. Towards the use of prosodic information for spoken document retrieval. SIGIR'01, 2001.
[2] Chen et al. Improved spoken document retrieval by exploring extra acoustic and linguistic cues. INTERSPEECH'01, 2001.
[3] Guinaudeau and Hirschberg. Accounting for prosodic information to improve ASR-based topic tracking for TV broadcast news. INTERSPEECH'11, 2011.
[4] Racca et al. DCU search runs at MediaEval 2014 Search and Hyperlinking. MediaEval 2014 Multimedia Benchmark Workshop, 2014.

Questions?
