Neural Embeddings for Populated GeoNames Locations Mayank Kejriwal, Pedro Szekely USC Information Sciences Institute
Motivation: feature extraction from locations • Essential for machine learning problems involving locations
Machine learning applications • Toponym resolution, e.g., "Boston" in England, UK vs. "Boston" in Massachusetts, USA • Much more likely to be Boston, MA if 'New York' and 'Martha's Vineyard' were also extracted in a similar context • Features are hybrid, i.e., must encode both location and 'context' (e.g., text) • Named entity disambiguation, e.g., was 'Charlotte' extracted as a name or a location?
Motivation: feature extraction from locations • Essential for machine learning problems involving locations • Why not use latitude-longitude directly?
What makes for a 'good' feature space? • Captures proximity semantics • Real-valued, not very high-dimensional • Not too sensitive (1.0 vs. 1.001) • Extensible • Does not necessarily require manual tuning • Generic, i.e., can be visualized in some way
Do lat-long points capture proximity semantics? • Only in a very dense, non-linear space
More formally... • dist(lat1, long1, lat2, long2) is well-approximated using the Haversine formula
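The Haversine approximation above can be sketched as follows; the function name and the mean-Earth-radius constant are illustrative choices, not from the slides.

```python
# Minimal sketch of the Haversine great-circle distance between two
# lat-long points given in degrees. Returns kilometers.
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (illustrative constant)

def haversine_km(lat1, long1, lat2, long2):
    """Approximate geodesic distance in km between two lat-long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(long2 - long1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For example, a quarter of the equator, `haversine_km(0, 0, 0, 90)`, comes out near 10,007 km.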
Do lat-long points capture proximity semantics? • Discontinuous (in linear space)!
Do lat-long points capture proximity semantics? • Sensitive (more than other features typically used in machine learning pipelines)
Idea: 'Embed' GeoNames as a weighted, directed network... • ...in a vector space! • Vector similarities (using dot product similarity) depend inversely on geodesic distances • Figure: 2-dimensional un-normalized embeddings (latitude-longitude) in a complex, sensitive space vs. 100-dimensional normalized embeddings in dot product space
Step 1: Determine set of nodes in network • Nodes in GeoNames identified by the following feature codes: 'PPL', 'PPLA', 'PPLA2', 'PPLA3', 'PPLA4', 'PPLC', 'PPLCH', 'PPLF', 'PPLG', 'PPLH', 'PPLL', 'PPLQ', 'PPLR', 'PPLS', 'PPLW', 'PPLX', 'STLMT' • ~4.4 million nodes
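The node-selection step above can be sketched as a filter over a GeoNames-style dump; this assumes the tab-separated layout of GeoNames' allCountries.txt, where the feature code is the 8th column, and the function name is an illustrative choice.

```python
# Sketch: keep only GeoNames rows whose feature code marks a populated place.
# Assumes tab-separated rows in the allCountries.txt column order
# (geonameid, name, asciiname, alternatenames, lat, long, feature class,
# feature code, ...); feature code is at index 7.
POPULATED_PLACE_CODES = {
    'PPL', 'PPLA', 'PPLA2', 'PPLA3', 'PPLA4', 'PPLC', 'PPLCH', 'PPLF',
    'PPLG', 'PPLH', 'PPLL', 'PPLQ', 'PPLR', 'PPLS', 'PPLW', 'PPLX', 'STLMT',
}

def select_nodes(rows):
    """Return only the rows that describe populated places."""
    return [row for row in rows
            if len(row) > 7 and row[7] in POPULATED_PLACE_CODES]
```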
Step 2: Determine edges and weights • Pairwise in the worst case • Slide a window over nodes sorted by latitude or longitude; only form edges between nodes in the same window • Postprocess by removing nodes with 0 population • Result: ~357,000 nodes, ~9 million edges
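The sliding-window trick above can be sketched as follows; the window size is an illustrative parameter, and edge weights (which the slides say depend on geodesic distance) are omitted for brevity.

```python
# Sketch: form edges only between nodes that fall in the same sliding window
# over the latitude-sorted order, avoiding the pairwise worst case.
# `nodes` is a list of (node_id, latitude) pairs; window size is illustrative.
def window_edges(nodes, window=2):
    """Connect each node to the next (window - 1) nodes in sorted order."""
    ordered = sorted(nodes, key=lambda n: n[1])  # sort by latitude
    edges = set()
    for i, (nid, _) in enumerate(ordered):
        for other_id, _ in ordered[i + 1:i + window]:
            edges.add((nid, other_id))
    return edges
```

With ~4.4 million nodes this yields far fewer edges than the ~10^13 of the pairwise worst case, at the cost of missing some long-range pairs.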
Step 3: Run DeepWalk on network • DeepWalk (Perozzi et al., 2014) is a neural algorithm for embedding the nodes of a graph in a vector space; has achieved strong results on network tasks • Very fast!
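DeepWalk's first phase, truncated random walks over the graph, can be sketched as below; the walks would then be fed as "sentences" to a skip-gram model (e.g., gensim's Word2Vec) to produce the node vectors. The graph, walk length, and function name here are illustrative, not the paper's implementation.

```python
# Sketch of DeepWalk phase 1: a truncated random walk over an
# adjacency-list graph. Each walk is later treated as a "sentence"
# for skip-gram training.
import random

def random_walk(graph, start, length, rng=random):
    """One truncated random walk of up to `length` nodes from `start`."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # dead end: truncate the walk early
        walk.append(rng.choice(neighbors))
    return walk
```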
Example in paper: North Dakota
Vectors, code and raw data all on GitHub (also, figshare) https://github.com/mayankkejriwal/Geonames-embeddings