Abbreviation and Acronym Disambiguation in Clinical Discourse

Serguei Pakhomov, PhD (1), Ted Pedersen, PhD (2) and Christopher G. Chute, MD, DrPH (1)
(1) Division of Biomedical Informatics, Mayo College of Medicine, Rochester, MN, USA
(2) Department of Computer Science, University of Minnesota

Use of abbreviations and acronyms is pervasive in clinical reports despite many efforts to limit the use of ambiguous and unsanctioned abbreviations and acronyms. Because many abbreviations and acronyms are ambiguous with respect to their sense, complete and accurate text analysis is impossible without identifying the sense that was intended for a given abbreviation or acronym. We present the results of an experiment in which we used contexts harvested from the Internet through the Google API to collect contextual data for a set of 8 acronyms found in clinical notes at the Mayo Clinic. We then used the contexts to disambiguate the sense of abbreviations in a manually annotated corpus.

INTRODUCTION

Many abbreviations and acronyms (i) are ambiguous with respect to their sense and constitute a significant part of the general problem of text normalization. Acronyms are used routinely throughout clinical texts, and knowing their sense is critical to understanding the document, whether we talk about automatic natural language understanding or simply human comprehension and interpretation. Acronym ambiguity is a growing problem, both in the number of new acronyms and in the number of new senses for existing acronyms. For example, according to the UMLS 2001AB [1], RA had the following 8 senses: “rheumatoid arthritis”, “renal artery”, “right atrium”, “right atrial”, “refractory anemia”, “radioactive”, “right arm”, “rheumatic arthritis.” The 2005AA version of the UMLS contains 17 additional senses: “ragweed antigen”, “refractory ascites”, “renin activity”, to name only a few. This is just an indication of the rate at which the ambiguity is proliferating. Liu et al. [2] show that 33% of acronyms listed in the UMLS in 2001 are ambiguous. In a later study, Liu et al. [3] demonstrated that 81% of acronyms found in MEDLINE abstracts are ambiguous and have on average 16 senses. In addition to problems with text interpretation, Friedman et al. [4] also point out that acronyms constitute a major source of errors in a system that automatically generates lexicons for medical Natural Language Processing (NLP) applications.

Ideally, when looking for documents containing “rheumatoid arthritis”, we want to retrieve everything that has a mention of RA in the sense of “rheumatoid arthritis” but not those documents where RA means “right atrial.” The acronym disambiguation problem is a special case of the word sense disambiguation (WSD) problem. Approaches to WSD include supervised machine learning techniques, where some amount of training data is marked up by hand and is used to train a decision tree classifier [5]. On the other side of the spectrum, fully unsupervised learning methods such as clustering have also been used successfully [6]. A hybrid class of machine learning techniques for WSD relies on a small set of hand-labeled data used to bootstrap a larger corpus of training data [7,8]. The cornerstone of all machine learning techniques for WSD is context [9], and this is also true for acronym disambiguation.

One way to take context into account is to consider the type of discourse in which the acronym occurs. If we see RA in a cardiology report, then it can be normalized to “right atrial”; if it occurs in the context of a rheumatology note, it is likely to mean “rheumatoid arthritis.” This method of using global context to resolve acronym ambiguity suffers from at least three major drawbacks. First, it requires a database of acronyms and their expansions linked with the possible contexts in which particular expansions can be used. Second, it requires a rule-based system for assigning correct expansions. Third, the distinctions made between various senses are bound to be very coarse. We may be able to distinguish correctly between “rheumatoid arthritis” and “right atrial” since the two are likely to occur in clearly separable contexts; however, distinguishing between “rheumatoid arthritis” and “right arm” becomes more of a challenge and may require introducing additional rules that further complicate the system.

Pakhomov [10] introduced a method for collecting training data for supervised machine learning approaches to disambiguating acronyms. The method is based on the assumption that the expansion (or the sense) of an acronym and the acronym itself tend to occur in similar contexts. For example, we would expect one to use the

(i) To save space and for ease of presentation, we will use the word “acronym” to mean both “abbreviation” and “acronym” since the two could be used interchangeably for the purposes described in this paper.
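The discourse-type approach discussed in the introduction (normalizing RA by the kind of note it appears in) can be illustrated with a minimal sketch. This is not the authors' system: the table `EXPANSIONS_BY_NOTE_TYPE`, the function `expand_by_note_type`, and the note-type labels are hypothetical names chosen for illustration, and the two rules simply restate the cardiology and rheumatology examples from the text.

```python
from typing import Optional

# Hypothetical lookup table: (acronym, note type) -> expansion.
# The entries restate the two examples in the text; the note-type
# labels themselves are illustrative assumptions.
EXPANSIONS_BY_NOTE_TYPE = {
    ("RA", "cardiology"): "right atrial",
    ("RA", "rheumatology"): "rheumatoid arthritis",
}


def expand_by_note_type(acronym: str, note_type: str) -> Optional[str]:
    """Return the expansion tied to this note type, or None if no rule applies."""
    key = (acronym.upper(), note_type.lower())
    return EXPANSIONS_BY_NOTE_TYPE.get(key)


if __name__ == "__main__":
    print(expand_by_note_type("RA", "cardiology"))
    print(expand_by_note_type("RA", "orthopedics"))
```

Because each note type maps to exactly one expansion, a rheumatology note that mentions a patient's right arm would still be expanded to “rheumatoid arthritis” under this sketch, which is the coarseness drawback the text describes; covering such cases would mean adding further rules.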



