Grounded Semantic Parsing of Claims and Questions
Pascual Martínez-Gómez
Artificial Intelligence Research Center, AIST
Tokyo, Japan
January 22, 2018
1 / 16
Objective
Convert a claim/question into a SPARQL query.

  Angelina Jolie's net worth is above 1.5 million USD.
    ↓
  ASK WHERE {
    dbr:Angelina_Jolie dbp:netWorth ?x .
    FILTER (?x > 1500000) .
  }
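As a minimal sketch of this target representation, the query above can be assembled from its three grounded pieces (entity, relation, threshold). The function name and the prefixed identifiers `dbr:Angelina_Jolie` / `dbp:netWorth` are illustrative assumptions, not part of the actual system:

```python
def build_ask_query(entity, prop, threshold):
    """Build a SPARQL ASK query checking that entity's prop exceeds threshold.

    entity and prop are assumed to be prefixed DBpedia identifiers,
    e.g. "dbr:Angelina_Jolie" and "dbp:netWorth" (hypothetical choices).
    """
    return (
        "ASK WHERE {\n"
        f"  {entity} {prop} ?x .\n"
        f"  FILTER (?x > {threshold}) .\n"
        "}"
    )

query = build_ask_query("dbr:Angelina_Jolie", "dbp:netWorth", 1_500_000)
print(query)
```

An ASK query returns a boolean, which is exactly what a claim-verification setting needs.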
Requirements:
1. Able to process claims and questions.
2. Must be independent of the language.
3. Easily extensible to different KBs and data stores.
4. Interpretable: journalists may interact with or inspect the process.

• Possibly extend to paragraphs.
• Most pressing languages: English, Japanese (FIJ) and French (?).
Approach
Modular pipeline (interpretable).
1. Identify mentions (e.g. "Angelina Jolie", "net worth").
2. Map mentions to KB nodes and relations.
3. Induce a grammar that describes the space of SPARQL queries.
4. Generate SPARQL queries in order of plausibility (with scores).
5. Execute (and evaluate).
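The modularity claim can be sketched as a chain of independently inspectable stages. Every name below is hypothetical scaffolding, not the actual gsemparse code; stages 3-5 are left as comments:

```python
# Hypothetical skeleton of the five-stage pipeline; the stubs only
# illustrate the interfaces between stages.

def identify_mentions(text):
    # Stage 1 (stub): return candidate mention strings.
    # Here: naively take capitalized tokens.
    return [w for w in text.split() if w[0].isupper()]

def link_mentions(mentions):
    # Stage 2 (stub): map each mention to a (made-up) KB identifier.
    return {m: f"dbr:{m}" for m in mentions}

def parse(text):
    mentions = identify_mentions(text)
    links = link_mentions(mentions)
    # Stages 3-5 (grammar induction, ranked query generation, execution)
    # would follow here. Because each stage returns a plain data structure,
    # a journalist can inspect or override any intermediate result.
    return links

print(parse("Angelina Jolie has a net worth"))
```

The interpretability requirement is what motivates this design over an end-to-end model: each intermediate output is a human-readable artifact.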
Identify mentions
Depending on the depth of linguistic annotation:
1. Character sequence (e.g. A, n, g, e, l, i, n, a, _, J, o, ...).
2. Part-of-speech tags (e.g. Angelina:NNP, net:ADJ, ...).
3. Syntax (e.g. "Angelina Jolie":NP, ...).
4. Semantics (e.g. ∃x : Person, angelina(x) ∧ jolie(x)).
Identify mentions
• Semantics. [Figure of the semantic analysis omitted.]
Identify mentions
• Semantics (e.g. ∃x : Person, angelina(x) ∧ jolie(x) ...).
• The semantics for "net worth is above 1.5M USD" is not accurate!
  ∃z. above(z) ∧ 1.5(z) ∧ million(z) ∧ usd(z).
• Errors tend to accumulate. Explore the use of fewer annotations.
Identify mentions
• Syntax. [Constituency parse tree of "Angelina Jolie's net worth is above 1.5 million USD." omitted.]
• Mentions: "Angelina Jolie's", "net worth", "Angelina Jolie's net worth", ...
• It overgenerates mentions.
• But it is simple and may have good coverage.
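The overgeneration behavior can be approximated, without a parser, by enumerating every contiguous token span up to a maximum width; this is a simplified stand-in for extracting all sub-phrases of the syntax tree (function and parameter names are illustrative):

```python
def candidate_mentions(tokens, max_len=4):
    """Overgenerate mention candidates: every contiguous span of up to
    max_len tokens. A crude proxy for enumerating NP sub-spans."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
            spans.append(" ".join(tokens[i:j]))
    return spans

cands = candidate_mentions(["Angelina", "Jolie", "'s", "net", "worth"])
print(len(cands), cands[:3])
```

Even this five-token input yields 14 candidates, which illustrates both the good coverage and the overgeneration that the linking step must later filter.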
Map mentions to KB nodes and relations
Problems with traditional IR approaches:
• Symbolic nature: tf-idf is sensitive to lexical variations.
• KB textual information is quite short (i.e. labels, aliases, names).
• Scoring functions are ad hoc.
Proposed solution:
• Learn a regressor f_θ : L × M → ℝ.
  • L are labels of entities, relations or types.
  • M are mentions (as identified earlier).
• It generalizes easily to other KBs ("simply" retrain!).
• Make f_θ robust against spelling variations.
• Flexibility and adaptability.
Map mentions to KB nodes and relations
Approach: metric learning.
1. Encode a label l ∈ L into a vector l⃗ ∈ ℝ^D with Enc_θ′ : L → ℝ^D.
2. Encode a mention m ∈ M into a vector m⃗ ∈ ℝ^D with Enc_θ″ : M → ℝ^D.
   • Encoding parameters θ′ and θ″ might be equal.
3. Use a vector similarity function between l⃗ and m⃗.
   • At the moment, only cosine(l⃗, m⃗) = l⃗ᵀm⃗ / (‖l⃗‖₂ · ‖m⃗‖₂) is used.
4. Linking results are: argmaxᵏ_l cosine(Enc_θ′(l), Enc_θ″(m)) (the top-k labels).
5. Estimation: argmax_{θ′,θ″} cosine(Enc_θ′(l₁), Enc_θ″(l₁)) − cosine(Enc_θ′(l₁), Enc_θ″(l₂)).
   • This is an autoencoder with noise contrastive estimation.
   • Uses positive and negative examples.
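The estimation objective above can be sketched with a toy encoder. Here a bag-of-characters count vector stands in for the learned Enc_θ (a deliberately crude assumption, only to make the positive-minus-negative cosine objective concrete):

```python
import math
from collections import Counter

def char_embed(s):
    """Toy encoder: bag of letter counts (a stand-in for the learned Enc)."""
    return Counter(c for c in s.lower() if c.isalpha())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

# A matching (label, mention) pair should score higher than a mismatched one;
# the estimation step maximizes exactly this difference over theta.
pos = cosine(char_embed("Angelina Jolie"), char_embed("angelina jolie"))
neg = cosine(char_embed("Angelina Jolie"), char_embed("Earl of Loudon"))
objective = pos - neg
```

With learned encoders the same contrast is optimized by gradient ascent over θ′ and θ″ instead of being fixed by the hand-written embedding.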
Map mentions to KB nodes and relations
Examples (see gsemparse github repo):
Example 1: no misspellings:
1. Mention: angelina jolie.
2. Labels: Angelina jolie, Angelina Jolie Trapdoor Spider, Angelina Jolie Voight, Angelina Jolie cancer treatment, Angelina Jolie Pitt, Angelina Joli, Angelina Jolie Filmography, Anjelina Jolie.
Example 2: misspellings (2 char substitutions, 1 char deletion):
1. Mention: angeline yoli.
2. Labels: Angeline Jolie, Uncle Willie, Parmelia (lichen), Uriele Vitolo, Ding Lieyun, Earl of Loudon, Angeline Myra Keen, Angel Negro, Angeline Malik, Angeline of Marsciano.
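Noisy mentions like the "2 substitutions, 1 deletion" query in Example 2 can be generated synthetically from clean labels, which is one plausible way to create training pairs for spelling robustness (the function is a hypothetical sketch, not the actual data-generation code):

```python
import random

def corrupt(label, n_subs=2, n_dels=1, rng=None):
    """Create a noisy mention from a clean label via random character
    substitutions and deletions, mimicking Example 2 above."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    chars = list(label)
    for _ in range(n_subs):
        i = rng.randrange(len(chars))
        chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
    for _ in range(n_dels):
        del chars[rng.randrange(len(chars))]
    return "".join(chars)

noisy = corrupt("angelina jolie")
```

Training the encoder to pull (label, corrupted-label) pairs together is what lets the linker still rank "Angeline Jolie" first for the misspelled query.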
Map mentions to KB nodes and relations
Show the CNN over characters (on a whiteboard).
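Since the character CNN only exists on the whiteboard here, a pure-Python sketch of the idea: embed each character, slide a width-3 filter over the sequence, and max-pool so strings of any length map to a fixed-size code. All sizes, the random embeddings, and the single filter are illustrative assumptions:

```python
# Hypothetical sketch of a character-level CNN encoder (one filter, width 3).
import random

rng = random.Random(0)
DIM = 8  # character embedding dimensionality (arbitrary)
EMB = {c: [rng.uniform(-1, 1) for _ in range(DIM)]
       for c in "abcdefghijklmnopqrstuvwxyz "}
FILT = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]

def encode(mention):
    """Map a string (length >= 3) to one scalar per filter."""
    emb = [EMB.get(c, [0.0] * DIM) for c in mention.lower()]
    # Convolution: dot product of the filter with each 3-character window.
    acts = [sum(FILT[k][d] * emb[i + k][d] for k in range(3) for d in range(DIM))
            for i in range(len(emb) - 2)]
    return max(acts)  # max-pooling makes the output length-independent

code = encode("angelina jolie")
```

A real encoder would use many filters of several widths and learned embeddings, but the character-window structure is what gives robustness to local misspellings.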
Map mentions to KB nodes and relations
Resources for KB: http://wiki.dbpedia.org/downloads-2016-10
1. Infobox relations.
2. Infobox property definitions.
3. Ontology (types/classes and relations).
4. Labels (NIF).
5. Contexts (NIF).
Resources for questions: HOBBIT data (Scalable QA challenge).
http://hobbitdata.informatik.uni-leipzig.de/SQAOC
1. Many questions with annotations on mentions and SPARQL queries.
2. Hopefully easy to evaluate.