Machine Translation of Medical Text in the KConnect Project

Apr 07, 2024

Petra Galuščáková, Jan Hajič, Jindřich Libovický, Pavel Pecina, Aleš Tamchyna
Charles University in Prague, Institute of Formal and Applied Linguistics

Introduction
● KConnect is a follow-up project of Khresmoi
● goals: provide components developed in Khresmoi as commercialized cloud services
● role of MT: provide cross-lingual search and access to medical documents
  – search queries
  – document summaries

Training Data
● new languages:
  – Swedish, Spanish, Polish, Hungarian
● in-domain corpora collected and processed
  – UMLS, EMEA, MuchMore, Wikipedia, PatTR, COPPA, MeSH, subtitles, ...

Training Data: Statistics
        parallel        monolingual only
cs     21     665          1      93
de    126     310          4     699
es     74    1248          2     474
fr    193     896          2     589
hu     19     641          1      98
pl     17     606          1     205
sv     24     409         21     158
en      –       –       6087    2100
Training data sizes, all figures are in millions of words. Each of the two groups has one general-domain column and one in-domain column.

Domain Adaptation
● Data selection
  – divide data into „medical-like“ and „general“ parts (based on language model perplexity)
● Model interpolation
  – build separate models (phrase table, language model) for each part
  – use linear interpolation to combine them
● SRILM
● TMCombine

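The two adaptation steps above can be illustrated with a small, self-contained sketch. It is not the KConnect pipeline, which according to the slide relies on SRILM and TMCombine. As a simplifying assumption, it scores sentences with an add-one smoothed unigram language model trained on a tiny in-domain seed corpus, splits a pool of sentences at a hypothetical perplexity threshold, and shows linear interpolation as a weighted sum of two probabilities. All function names, the toy sentences, the threshold and the weight are illustrative.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count whitespace-separated, lower-cased tokens; return (counts, total)."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    return counts, sum(counts.values())


def unigram_prob(token, model, vocab_size):
    """Add-one smoothed unigram probability (a toy stand-in for a real LM)."""
    counts, total = model
    return (counts[token] + 1) / (total + vocab_size)


def perplexity(sentence, model, vocab_size):
    """Per-token perplexity of a sentence under the unigram model."""
    tokens = sentence.lower().split()
    if not tokens:
        return float("inf")
    log_sum = sum(math.log(unigram_prob(t, model, vocab_size)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def split_by_perplexity(sentences, model, vocab_size, threshold):
    """Low perplexity under the in-domain model means 'medical-like'."""
    medical_like, general = [], []
    for s in sentences:
        if perplexity(s, model, vocab_size) <= threshold:
            medical_like.append(s)
        else:
            general.append(s)
    return medical_like, general


def interpolate(p_medical, p_general, lam):
    """Linear interpolation of two model probabilities, 0 <= lam <= 1."""
    return lam * p_medical + (1.0 - lam) * p_general


# Toy data, invented for illustration only.
seed = [
    "the patient has acute renal failure",
    "chronic renal disease requires dialysis",
    "the patient received insulin therapy",
]
pool = [
    "the patient requires dialysis",
    "the football match ended in a draw",
]

in_domain_lm = train_unigram(seed)
vocab = {tok for s in seed + pool for tok in s.lower().split()}

# The threshold of 20 is a hypothetical value chosen for this toy corpus.
medical_like, general = split_by_perplexity(pool, in_domain_lm, len(vocab), 20.0)
```

In a real system the threshold and the interpolation weight would be tuned on held-out in-domain data, and the same weighting idea would be applied to phrase-table scores as well as language-model probabilities.
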
MT as a Web Service
● MTMonkey
● developed within Khresmoi, now actively extended and maintained
● runs in a cluster of 20 servers

Training Toolkit
● Eman Lite
● fully automated MT system training
● command-line application implemented
● goal: web-based interface, tight integration with MTMonkey

Thank you! Questions?
