Discovering similar Research Ideas using Semantic Vectors and Machine Learning Mads Rydahl, UNSILO
UNSILO Text Intelligence For Science Q4/2016
Mads Rydahl mads@unsilo.com https://linkedin.com/in/rydahl Mads has managed software dev teams for over 20 years. He has built games for Lego Mindstorms, interfaces for Bang & Olufsen, authored a portfolio of patents acquired by Apple, and created the world’s best casual game (before that was profitable ;-) Mads has lived 5 years in Silicon Valley, worked at Stanford University, and was head of Product and Design for Siri.com, a startup funded by SRI and DARPA and acquired by Apple in 2010. Mads is cofounder of UNSILO, a Danish startup building semantic discovery tools for science.
UNSILO Mission
To build discovery services that make it easy and fast to find relevant knowledge and discover new patterns across all of science
Automated
Because scientific language is constantly growing, evolving, and accelerating.
Omniscient
Because important findings may not be apparent. Even to the author.
Unbiased
Because existing solutions rank by popularity and cause filter bubbles.
UNSILO Core Technology
We extract key phrases without prior domain knowledge, and use Machine Learning to identify novel ideas as they emerge
Apache UIMA, Apache Ruta
Unstructured Information Management Framework
Stanford NLP tools, DKPro, et al.
Natural Language Processing suites
Python, Java, Hadoop, Spark, TensorFlow, Mahout, Vowpal Wabbit, GenSim, LevelDb, Elasticsearch, Docker, AWS, Cloudsigma, Open Languages, libraries, and frameworks
Key Challenges
Our Knowledge does Not Compute
The world moves too fast for data curators and ontology writers
▪ Most Scientific Disciplines have no ontologies (or even controlled vocabularies)
▪ Dictionaries and Reference Works are too small and often out-of-date
▪ New discoveries have no official names
People are too creative
There is a lot of variation in language
▪ Researchers often add descriptive detail that obscures facts
▪ There is no “right way” to describe most things
Some things seem obvious …but mostly to the author
The right Level-of-Detail depends both on the context and the reader
▪ The most obvious facts are often omitted because they are implicitly included
▪ Editors think in themes and topics, researchers in methods, properties, and facts
Full Text Search Pseudohyponatremia: Does It Matter in Current Clinical Practice? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3894530/ doi: 10.5049/EBP.2006.4.2.77 Serum consists of water (93% of serum volume) and nonaqueous components, mainly lipids and proteins (7% of serum volume). Sodium is restricted to serum water. In states of hyperproteinemia or hyperlipidemia, there is an increased mass of the nonaqueous components of serum and a concomitant decrease in the proportion of serum composed of water. Thus, pseudohyponatremia results because the flame photometry method measures sodium concentration in whole plasma. A sodium-selective electrode gives the true, physiologically pertinent sodium concentration because it measures sodium activity in serum water. Whereas the serum sample is diluted in indirect potentiometry, the sample is not diluted in direct potentiometry. Because only direct reading gives an accurate concentration, we suspect that indirect potentiometry which many hospital laboratories are now using may mislead us to confusion in interpreting the serum sodium data. However, it seems that indirect potentiometry very rarely gives us discernibly low serum sodium levels in cases with hyperproteinemia and hyperlipidemia. As long as small margins of errors are kept in mind of clinicians when serum sodium is measured from the patients with hyperproteinemia or hyperlipidemia, the present methods for measuring sodium concentration in serum by indirect sodium-selective electrode potentiometry could be maintained in the clinical practice.
Using Keywords and Ontologies Pseudohyponatremia: Does It Matter in Current Clinical Practice? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3894530/ doi: 10.5049/EBP.2006.4.2.77 Key: Chemical Technique Anatomy Disease Species [Abstract text repeated from the Full Text Search slide.]
UNSILO Exhaustive Concept Extraction Pseudohyponatremia: Does It Matter in Current Clinical Practice? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3894530/ doi: 10.5049/EBP.2006.4.2.77 Key: Chemical Technique Anatomy Disease Species [Abstract text repeated from the Full Text Search slide.]
UNSILO Complete Semantic Mapping Pseudohyponatremia: Does It Matter in Current Clinical Practice? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3894530/ doi: 10.5049/EBP.2006.4.2.77 Key: Action/Relation Chemical Technique Anatomy Disease Species [Abstract text repeated from the Full Text Search slide.]
Phrase Extraction
Natural Language Processing
■ Sentences are annotated with part-of-speech tags (noun, verb, adjective) and a dependency tree
methods for measuring sodium concentration in serum by indirect sodium-selective electrode potentiometry
[thing] methods
[action] measuring
[thing] sodium concentration
[thing] serum
[thing] indirect sodium-selective electrode potentiometry
Extract all “things”
■ Method
Sodium concentration
Serum
Indirect Sodium-Selective Electrode Potentiometry
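The "extract all things" step above can be illustrated with a minimal sketch. This is not UNSILO's pipeline (which the slides say is built on UIMA and Stanford NLP tools); it assumes part-of-speech tags are already available and simply groups runs of adjectives and nouns into phrases. The tag names, the hand-assigned tags in `TAGGED`, and the function name `extract_things` are all assumptions made for illustration. No dependency tree is used and no lemmatization is applied, so "methods" is not reduced to "Method" as on the slide.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hand-assigned tags for the slide's example phrase (an assumption;
# a real pipeline would obtain these from a POS tagger).
TAGGED: List[Token] = [
    ("methods", "NOUN"), ("for", "ADP"), ("measuring", "VERB"),
    ("sodium", "NOUN"), ("concentration", "NOUN"),
    ("in", "ADP"), ("serum", "NOUN"), ("by", "ADP"),
    ("indirect", "ADJ"), ("sodium-selective", "ADJ"),
    ("electrode", "NOUN"), ("potentiometry", "NOUN"),
]


def extract_things(tokens: List[Token]) -> List[str]:
    """Collect maximal runs of adjectives/nouns that end in a noun."""
    things: List[str] = []
    run: List[Token] = []
    for token in tokens + [("", "END")]:  # sentinel flushes the last run
        if token[1] in ("ADJ", "NOUN"):
            run.append(token)
            continue
        # Drop trailing adjectives so every phrase ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if run:
            things.append(" ".join(word for word, _ in run))
        run = []
    return things


if __name__ == "__main__":
    for thing in extract_things(TAGGED):
        print(thing)
```

Run on the example phrase, the sketch yields the same four spans the slide marks as "things"; the verb "measuring" and the prepositions act as phrase boundaries.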