UTD HLTRI at TAC 2019: DDI Track
Ramon Maldonado, Maxwell Weinzierl, & Sanda M. Harabagiu
The University of Texas at Dallas, Human Language Technology Research Institute
http://www.hlt.utdallas.edu/~{ramon, max, sanda}
Outline 1. Introduction 2. The Approach 1. Pipeline Overview 2. Preprocessing 3. Multi-Task Transformer 4. Postprocessing 3. Results 4. Conclusion
Introduction Multi-task neural model for: • Task 1: entity identification • Task 2: relation identification • Task 3*: concept normalization • Task 4: normalized relation identification
Introduction
Problem
• Sentence-level, binary relation identification
Our Approach
• Multi-task learning
  – Sentence classification
  – Mention boundary detection
  – Relation extraction
  – PK effect classification
• Pre-trained Transformer for shared representation
The Approach: FDA Label Drug-Drug Interaction Pipeline
• Input: Structured Product Labels (SPLs)
• Preprocessing
  – Annotation propagation: mentions, relations, pseudo-triggers
  – Tokenization: spaCy, BERT word-piece
• Shared representation: Multi-task Transformer network for Identifying Drug-Drug Interactions
  – Sentence Classifier, Mention Boundary Labeler, Relation Extractor, PKE Classifier
• Postprocessing
  – Task 1 (Mentions): mention filtering, continuation linking
  – Task 2 (Relations): unused mention/relation filtering
  – Task 3 (Normalized Mentions): normalization of mentions and PK effects using UMLS, SNOMED-CT, MED-RT
  – Task 4 (Label Interactions)
Preprocessing
• Binary relations
  – (Trigger, Precipitant, Effect) →
    • (Trigger, Precipitant)
    • (Trigger, Effect)
  – Pseudo-triggers for SIs in some PDIs
  – PK effects as attributes
• Mention annotation propagation
  – Eases the learning problem
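The ternary-to-binary decomposition above can be sketched as follows; the relation names ("TriggerPrecipitant", "TriggerEffect") are illustrative stand-ins, not the labels used by the actual system.

```python
# Split a (Trigger, Precipitant, Effect) interaction into the two binary
# relations that share the trigger; interactions without an effect
# mention yield only the trigger-precipitant pair.
def decompose(trigger, precipitant, effect=None):
    pairs = [("TriggerPrecipitant", trigger, precipitant)]
    if effect is not None:
        pairs.append(("TriggerEffect", trigger, effect))
    return pairs
```

Because both pairs keep the same trigger, the original ternary interaction can be recovered in postprocessing by grouping on it.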
Preprocessing
• Tokenization
  – spaCy
  – WordPiece using the BERT vocabulary
• C-IOBES tagging
  – Continuation tags necessary for disjoint spans
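One plausible encoding of the C-IOBES scheme is standard IOBES tags for the first segment of a mention, with a "C-" prefix marking later segments of a disjoint mention; the helper below is a sketch under that assumption, not the exact tag set used in the paper.

```python
# Emit IOBES tags for one contiguous segment of a mention; set
# continuation=True when the segment continues an earlier, disjoint span.
def iobes_tags(length, label, continuation=False):
    prefix = "C-" if continuation else ""
    if length == 1:
        return [prefix + "S-" + label]
    return ([prefix + "B-" + label]
            + [prefix + "I-" + label] * (length - 2)
            + [prefix + "E-" + label])
```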
Multi-Task Transformer
Multi-Task Transformer network for Identifying Drug-Drug Interactions (MTTDDI)
• BERT Sentence Encoder: [CLS] t1 … tn [SEP] → sentence embedding s and contextual token embeddings c1 … cn
• Sentence Classifier: softmax layer over s → detects sentences containing interactions
• Mention Boundary Labeler: CRF over c1 … cn → mention type & boundary labels b1 … bn for all words in a sentence
• Relation Extractor: trigger, argument, and context embeddings → softmax layer → relation labels r for all mention pairs
• PKE Classifier: if r is a PKI, softmax layer → PK effect codes
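The data flow of the shared-encoder design can be sketched in plain Python; here `encode()` is a random stand-in for BERT and the heads are toy functions, so only the wiring mirrors the figure, not any real model behavior.

```python
import random

# Dimensionality of the stand-in embeddings (illustrative only).
DIM = 8

def encode(tokens, seed=0):
    # Returns a [CLS]-style sentence vector plus one contextual vector
    # per token (random stand-ins for BERT encoder outputs).
    rng = random.Random(seed)
    sent = [rng.random() for _ in range(DIM)]
    ctx = [[rng.random() for _ in range(DIM)] for _ in tokens]
    return sent, ctx

def sentence_head(sent):
    # Sentence Classifier: does the sentence mention an interaction?
    return sum(sent) / DIM > 0.5

def boundary_head(ctx):
    # Mention Boundary Labeler: one tag per token (placeholder "O").
    return ["O"] * len(ctx)

def relation_head(trigger_vec, arg_vec, sent):
    # Relation Extractor: reads trigger, argument, and sentence context.
    return "PKI" if sum(trigger_vec) > sum(arg_vec) else "None"
```

The point of the design is that all four heads consume the same encoder output, so the encoder is trained by every task's loss at once.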
Postprocessing
• Filtering
  – Invalid boundary tag sequences
  – Repeated mentions
  – Mentions not involved in an interaction
• C-spans linked to the closest mention
• Reconstruct ternary interactions from binary relations through their shared trigger
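Reconstructing ternary interactions from the binary predictions amounts to grouping on the shared trigger; a minimal sketch, using the same illustrative relation names as before:

```python
from collections import defaultdict

# Group binary predictions by trigger and cross precipitants with
# effects; triggers with no predicted effect yield an effect of None.
def reconstruct(binary_relations):
    precipitants = defaultdict(list)
    effects = defaultdict(list)
    for rel_type, trigger, arg in binary_relations:
        if rel_type == "TriggerPrecipitant":
            precipitants[trigger].append(arg)
        elif rel_type == "TriggerEffect":
            effects[trigger].append(arg)
    ternary = []
    for trigger, precs in precipitants.items():
        for p in precs:
            for e in effects.get(trigger, [None]):
                ternary.append((trigger, p, e))
    return ternary
```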
Postprocessing
• Normalization via string matching
  – SNOMED-CT: specific interactions
  – MED-RT: drug classes
  – UNII: precipitants
  – Terminologies augmented with atoms from UMLS
• Map precipitants first to MED-RT, then to UNII if no match was found
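The MED-RT-then-UNII fallback can be sketched as a two-stage dictionary lookup; the indexes below are toy stand-ins, whereas the real system matches strings against terminology entries augmented with atoms from UMLS.

```python
# Normalize a precipitant mention: try MED-RT first, fall back to UNII,
# and return None when neither terminology matches.
def normalize_precipitant(mention, medrt_index, unii_index):
    key = mention.strip().lower()
    code = medrt_index.get(key)
    if code is None:
        code = unii_index.get(key)
    return code
```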
Postprocessing Task 4 • inferred from unique interactions between normalized mentions • PK effect codes from MTTDDI
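Inferring the Task 4 output then reduces to deduplicating the normalized interactions, which can be sketched as:

```python
# Label-level interactions are the unique set of normalized
# (trigger code, precipitant code, effect code) triples.
def label_interactions(normalized_triples):
    return sorted(set(normalized_triples))
```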
Results
Evaluated MTTDDI against two alternate configurations:
• UTDHLTRI Run3: no sentence filtering / targeted training
• Run3 + Filtering: dedicated learners

System           | Task 1 | Task 2 | Task 3 | Task 4
Best Submission  | 65.38  | 49.03  | 62.39  | 17.56
Median           | 48.97  | 37.13  | 45.53  | 17.56
UTDHLTRI Run3    | 35.04  | 27.48  | 28.66  | 17.56
Run3 + Filtering | 56.03  | 42.29  | 45.73  | 24.07
MTTDDI           | 54.39  | 41.34  | 44.08  | 25.20

* Bold indicates the best score; italics indicates the best score among LDIIP systems.
Questions