Different Contributions to Cost-Effective Transcription and Translation of Video Lectures


  1. Different Contributions to Cost-Effective Transcription and Translation of Video Lectures
     Thesis presented by: Joan Albert Silvestre Cerdà
     Supervisors: Dr. Alfons Juan Císcar, Dr. Jorge Civera Saiz
     Machine Learning and Language Processing, Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València
     January 27, 2016

  2. Outline
     ◮ Introduction
     ◮ Explicit Length Modelling for SMT
     ◮ Efficient Audio Segmentation for Speech Detection
     ◮ The transLectures-UPV Platform
     ◮ Recommender Systems for Online Learning Platforms
     ◮ LM Adaptation Using External Resources for ASR
     ◮ Demos
     ◮ Conclusions

  4. Introduction
     ◮ The Internet has brought new opportunities to academic institutions.
     ◮ Multimedia repositories as fundamental knowledge assets.
     ◮ Subtitles are really needed in these repositories.
     ◮ Most repositories are neither transcribed nor translated.
     ◮ Cost-effective transcription and translation of video repositories.

  5. Scientific and Technological Goals
     ◮ To propose an approach to explicit length modelling for SMT.
     ◮ To develop efficient audio segmentation systems.
     ◮ To design a system architecture for ASR and SMT integration.
     ◮ To improve adaptation techniques for ASR and SMT.
     ◮ To design recommender systems using speech transcriptions.
     ◮ To evaluate these contributions in real-life scenarios.

  6. Outline. Current section: Explicit Length Modelling for SMT
     ◮ Introduction
     ◮ Log-linear modelling
     ◮ Experiments
     ◮ Conclusions

  7. Introduction
     ◮ Length modelling is a well-known problem.
     ◮ Focus on explicit length modelling for SMT.
     ◮ Comparative study on phrase length modelling for SMT.
     ◮ Two novel length models for phrase-based SMT are presented.

  8. SMT and Log-linear Modelling
     Search for the most likely translation $\hat{y}$:

     $$\hat{y} = \operatorname*{argmax}_{y} \; p(y \mid x) \approx \operatorname*{argmax}_{y} \; \frac{1}{Z(x)} \exp\Big( \sum_i \lambda_i f_i(x, y) \Big) = \operatorname*{argmax}_{y} \; \sum_i \lambda_i f_i(x, y)$$

     where the feature functions $f_i(x, y)$ are logs of:
     ◮ Phrase-based translation models: $p(y \mid x)$, $p(x \mid y)$.
     ◮ Lexical models: $l(y \mid x)$, $l(x \mid y)$.
     ◮ Language model: $p(y)$.
     ◮ Reordering models.
     ◮ Phrase-length models: std and spc (param/non-param).
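
     The last step of the derivation above holds because $Z(x)$ does not depend on $y$ and the exponential is monotonic, so neither changes which candidate wins. The following is a minimal sketch of that scoring rule, not the thesis implementation: the feature names (`tm`, `lm`, `len`), the weights, the candidate translations and their probabilities are all hypothetical, and `posterior` normalises only over the listed candidates rather than over all possible translations.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of log-domain features: sum_i lambda_i * f_i(x, y)."""
    return sum(weights[name] * value for name, value in features.items())


def best_translation(candidates, weights):
    """Pick the candidate y with the highest unnormalised log-linear score."""
    return max(candidates, key=lambda y: loglinear_score(candidates[y], weights))


def posterior(candidates, weights):
    """Normalised scores exp(score) / Z, with Z summed over the listed candidates only."""
    unnorm = {y: math.exp(loglinear_score(f, weights)) for y, f in candidates.items()}
    z = sum(unnorm.values())
    return {y: value / z for y, value in unnorm.items()}


# Hypothetical weights lambda_i, one per feature function.
weights = {"tm": 1.0, "lm": 0.5, "len": 0.2}

# Hypothetical candidates for one source sentence; each feature is a log-probability.
candidates = {
    "la casa verde": {"tm": math.log(0.4), "lm": math.log(0.3), "len": math.log(0.5)},
    "la verde casa": {"tm": math.log(0.4), "lm": math.log(0.05), "len": math.log(0.5)},
}

best = best_translation(candidates, weights)
post = posterior(candidates, weights)
print(best, post[best])
```

     Ranking by the plain weighted sum and ranking by the normalised posterior select the same candidate, which is why a decoder can skip computing $Z(x)$.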

  9. Experiments: Europarl En → Es (train 1M, test 2K)
     [Figure: BLEU (Viterbi) as a function of the maximum phrase length (3 to 7), with the BLEU axis ranging from 31.0 to 32.4. Curves: baseline, std non-param, std param, spc non-param, spc param.]

  10. Conclusions
     ◮ Two novel phrase-length models for phrase-based SMT.
     ◮ Statistically significant improvements on all language pairs.
     ◮ Length models behave differently depending on the task.
     ◮ Trade-off between model complexity and data sparseness.

  11. Outline. Current section: Efficient Audio Segmentation for Speech Detection
     ◮ Introduction
     ◮ System Description
     ◮ Experiments
     ◮ Conclusions

  12. Introduction
     ◮ The time cost of ASR depends on the input length.
     ◮ Only speech segments should be delivered to ASR systems.
     ◮ A prior segmentation can provide better transcription quality.
     ◮ A fast GMM-HMM audio segmentation system is proposed.
     ◮ Albayzin Audio Segmentation Evaluation 2012.

  13. System description
     ◮ Audio segmentation (AS) can be seen as a simplified case of ASR.
     ◮ Reduced set of acoustic classes $C$ (i.e. speech, noise, music).
     ◮ Search for a sequence of class labels $\hat{c}$ so that

     $$\hat{c} = \operatorname*{argmax}_{c \in C^{*}} \; p(c \mid x) = \operatorname*{argmax}_{c \in C^{*}} \; p(x \mid c) \, p(c)$$

     where:
     ◮ $p(x \mid c)$: GMM-HMM based acoustic model.
     ◮ $p(c)$: $n$-gram language model.
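
     The search over class sequences above can be illustrated with a small Viterbi decoder. This is a simplified sketch under stated assumptions, not the system described in the slides: each class is treated as a single state, the per-frame likelihoods $p(x_t \mid c)$ are given directly as hypothetical numbers instead of being computed by a GMM, and $p(c)$ is a bigram model with hypothetical "sticky" probabilities. The function names are illustrative only.

```python
import math

CLASSES = ["speech", "noise", "music"]


def viterbi_segment(frame_loglik, log_trans, log_init):
    """Most likely class label per frame.

    frame_loglik: list of dicts, class -> log p(x_t | c) for frame t.
    log_trans[prev][cur]: bigram log p(cur | prev); log_init[c]: log p(c) at frame 0.
    """
    # delta[c] = best log score of any label sequence ending in class c.
    delta = {c: log_init[c] + frame_loglik[0][c] for c in CLASSES}
    backptrs = []
    for obs in frame_loglik[1:]:
        new_delta, ptr = {}, {}
        for cur in CLASSES:
            prev = max(CLASSES, key=lambda p: delta[p] + log_trans[p][cur])
            new_delta[cur] = delta[prev] + log_trans[prev][cur] + obs[cur]
            ptr[cur] = prev
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final class.
    path = [max(CLASSES, key=lambda c: delta[c])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def to_segments(path):
    """Merge consecutive equal labels into (label, start_frame, end_frame) tuples."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((path[start], start, t))
            start = t
    return segments


def frame(speech, noise, music):
    """Hypothetical per-frame likelihoods, converted to log domain."""
    return {"speech": math.log(speech), "noise": math.log(noise), "music": math.log(music)}


# Hypothetical bigram model: staying in a class is far more likely than switching.
log_trans = {p: {c: math.log(0.9 if p == c else 0.05) for c in CLASSES} for p in CLASSES}
log_init = {c: math.log(1.0 / len(CLASSES)) for c in CLASSES}

# Frame 3 looks slightly more like noise in isolation; frames 6-8 are clearly music.
frames = ([frame(0.8, 0.1, 0.1)] * 3 + [frame(0.4, 0.5, 0.1)]
          + [frame(0.8, 0.1, 0.1)] * 2 + [frame(0.1, 0.1, 0.8)] * 3)

path = viterbi_segment(frames, log_trans, log_init)
segments = to_segments(path)
print(segments)
```

     In this toy input the class-sequence model smooths over the isolated noise-like frame, so the decoder returns one speech segment followed by one music segment; only the speech segment would then be passed on to an ASR system.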
