Why You Should Care About Byte-Level Seq2Seq Models in NLP
South England Natural Language Processing Meetup, Alan Turing Institute, Monday March 4, 2019
Tom Kenter, TTS Research, Google UK, London
Based on an internship at Google Research in Mountain View.
Byte-level Machine Reading across Morphologically Varied Languages
Tom Kenter, Llion Jones, Daniel Hewlett
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018
https://ai.google/research/pubs/pub47437
Medium blog post: Why You Should Care About Byte-Level Sequence-to-Sequence Models in NLP
Is it advantageous, when processing morphologically rich languages, to use bytes rather than words as input and output to RNNs in a machine reading task?
Machine Reading
A computer reads text and has to answer questions about it.
WikiReading datasets:
● English WikiReading dataset (Hewlett et al., ACL, 2016)
● Two extra datasets, Russian and Turkish (Kenter et al., AAAI, 2018)
https://github.com/google-research-datasets/wiki-reading
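To make the task concrete, here is a minimal sketch of the setup: the model gets a Wikipedia article plus a Wikidata property name as the question, and must produce the answer string. The field names below are illustrative, not the exact WikiReading schema.

```python
# Illustrative (document, question, answer) triple; field names are
# assumptions, not the released WikiReading schema.
example = {
    "document": "Amsterdam is the capital and most populous city of the Netherlands. ...",
    "question": "country",        # a Wikidata property name
    "answer":   "Netherlands",
}

def to_byte_ids(text):
    """Encode text as a sequence of UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))

source_ids = to_byte_ids(example["question"] + " " + example["document"])
target_ids = to_byte_ids(example["answer"])
print(len(source_ids), len(target_ids))
```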
Byte-level Machine Reading
[Figure: the same input shown word-level and byte-level: one embedding vector per word versus one embedding vector per byte.]
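The difference between the two input representations is easy to demonstrate in code. A minimal sketch, with a toy word vocabulary standing in for a real one:

```python
sentence = "Where is Amsterdam?"

# Word-level: every token must be in a fixed vocabulary; anything
# unseen maps to <UNK>. This vocabulary is a toy stand-in.
vocab = {"<UNK>": 0, "where": 1, "is": 2, "amsterdam": 3}
word_ids = [vocab.get(w.strip("?").lower(), vocab["<UNK>"])
            for w in sentence.split()]

# Byte-level: the "vocabulary" is the 256 possible byte values, so any
# UTF-8 string can be represented without special OOV handling.
byte_ids = list(sentence.encode("utf-8"))

print(word_ids)   # [1, 2, 3]
print(byte_ids)   # [87, 104, 101, ...] one id per byte, 19 in total
```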
Morphologically rich languages
Russian:
В прошлом году Дмитрий переехал в Москву. → Last year Dmitry moved to Moscow.
Где теперь живет Дмитрий? В Москве. → Where does Dmitry live now? In Moscow.
(Note the different case endings on the same noun: Москву vs Москве.)
Turkish:
kolay → easy
kolaylaştırabiliriz → we can make it easier
kolaylaştıramıyoruz → we cannot make it easier
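A small illustration of why this matters for byte-level models: all of these Turkish surface forms are spelled from the same 256 byte values, whereas a word-level vocabulary would need a separate entry for each form.

```python
# Turkish derives many surface forms from one stem. A word-level
# vocabulary needs an entry per form; byte-level input covers them
# all with the same 256 ids.
forms = ["kolay", "kolaylaştırabiliriz", "kolaylaştıramıyoruz"]
for form in forms:
    encoded = form.encode("utf-8")
    print(form, "->", len(encoded), "bytes")  # 'ş' and 'ı' take 2 bytes each
```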
Why should you care about byte-level seq2seq models in NLP?
● Small input vocabulary → small model size (see the sketch after this list)
● No out-of-vocabulary problem
● Allows for apples-to-apples comparison between models
● Universal encoding scheme across languages
● Trade-off: byte sequences are longer, so either a longer unroll length for the RNN or reading less input
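A back-of-the-envelope sketch of the model-size point, counting only embedding-table parameters. The vocabulary size and embedding dimension below are illustrative assumptions, not numbers from the paper.

```python
# Embedding-table parameter counts; sizes are illustrative assumptions.
emb_dim = 256
word_vocab_size = 100_000      # a typical word-level vocabulary
byte_vocab_size = 256          # all possible byte values

print("word embedding params:", word_vocab_size * emb_dim)  # 25,600,000
print("byte embedding params:", byte_vocab_size * emb_dim)  #     65,536
```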
Models
Multi-level RNN
Bidirectional RNN
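The slide only names the architectures; as a rough illustration, here is a minimal PyTorch sketch of a byte-level bidirectional GRU encoder. All hyperparameters are assumptions for illustration, not the paper's settings.

```python
import torch
import torch.nn as nn

class ByteBiRNNEncoder(nn.Module):
    """Bidirectional GRU over byte ids; sizes are illustrative."""
    def __init__(self, emb_dim=64, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(256, emb_dim)  # one row per byte value
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True,
                          bidirectional=True)

    def forward(self, byte_ids):               # (batch, seq_len)
        x = self.emb(byte_ids)
        outputs, _ = self.rnn(x)               # (batch, seq_len, 2*hidden)
        return outputs

encoder = ByteBiRNNEncoder()
ids = torch.tensor([list("Where is Amsterdam?".encode("utf-8"))])
print(encoder(ids).shape)  # torch.Size([1, 19, 512])
```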
Models
Hybrid word-byte model
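One common way to hybridize word- and byte-level input is to look up in-vocabulary tokens as words and spell out-of-vocabulary tokens as bytes. The sketch below shows that general idea; the id-offset scheme and vocabulary are hypothetical, and the paper's model may differ in detail.

```python
WORD_OFFSET = 256  # hypothetical scheme: ids 0-255 are bytes,
                   # word ids live above them

def hybrid_ids(tokens, word_vocab):
    ids = []
    for tok in tokens:
        if tok in word_vocab:
            ids.append(WORD_OFFSET + word_vocab[tok])  # word id
        else:
            ids.extend(tok.encode("utf-8"))            # byte fallback
    return ids

vocab = {"where": 0, "is": 1}
print(hybrid_ids(["where", "is", "kolaylaştırabiliriz"], vocab))
```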
Models
Convolutional recurrent
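A convolutional recurrent model typically runs convolutions over byte embeddings and pools the result into a shorter sequence before a recurrent layer reads it, which keeps the unroll length manageable. A minimal PyTorch sketch with illustrative sizes, not necessarily the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ConvRecurrent(nn.Module):
    """Convolutions over byte embeddings, max-pooled into coarser
    units that a GRU then reads; all sizes are illustrative."""
    def __init__(self, emb_dim=64, channels=128, hidden=256, pool=4):
        super().__init__()
        self.emb = nn.Embedding(256, emb_dim)
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=5, padding=2)
        self.pool = nn.MaxPool1d(pool)          # shorten the sequence
        self.rnn = nn.GRU(channels, hidden, batch_first=True)

    def forward(self, byte_ids):                # (batch, seq_len)
        x = self.emb(byte_ids).transpose(1, 2)  # (batch, emb, seq)
        x = self.pool(torch.relu(self.conv(x)))
        out, _ = self.rnn(x.transpose(1, 2))    # (batch, seq/pool, hidden)
        return out

model = ConvRecurrent()
ids = torch.tensor([list("Where is Amsterdam?".encode("utf-8"))])
print(model(ids).shape)  # torch.Size([1, 4, 256]) for a 19-byte input
```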
Models
Memory network
Encoder-transformer-decoder
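Both variants ultimately attend over an encoded document with a question representation. Below is a generic, memory-network-style attention readout; the tensors are random stand-ins for real encodings, and this is a sketch of the mechanism rather than the paper's model.

```python
import torch

def attend(memory, question):
    """Score document positions against a question vector and return
    the attention-weighted sum of the memory."""
    scores = memory @ question              # (seq,)
    weights = torch.softmax(scores, dim=0)  # attention distribution
    return weights @ memory                 # (dim,) attended summary

memory = torch.randn(19, 512)   # stand-in for an encoded document
question = torch.randn(512)     # stand-in for an encoded question
print(attend(memory, question).shape)  # torch.Size([512])
```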
Results
[Results tables and figures: see the AAAI-18 paper.]
Conclusions
Reading and outputting bytes, instead of words, works.
Byte-level models provide an elegant way of dealing with the out-of-vocabulary problem.
Byte-level models perform on par with the state-of-the-art word-level model on English, and better on morphologically richer languages.
This is good news, as byte-level models have far fewer parameters.
Are you interested in machine reading/question answering/NLU, and looking for a new challenge?
Try your approach on 3 languages at once!
WikiReading: English & Russian & Turkish
https://github.com/google-research-datasets/wiki-reading
Thank you
Byte-level Machine Reading across Morphologically Varied Languages
Tom Kenter, Llion Jones, Daniel Hewlett
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018
https://ai.google/research/pubs/pub47437
Medium blog post: Why You Should Care About Byte-Level Sequence-to-Sequence Models in NLP