Neural Text Generation from Structured Data with Application to the Biography Domain
Rémi Lebret, David Grangier, Michael Auli
From Structured Data to Sentences
• Why? Machines like to read structured data, people don't.
• User-friendly access to structured data:
  ➢ Question answering
  ➢ Virtual assistant
  ➢ Profile summary
Concept-to-Text Generation
• Weather forecast: Cloudy, with temperatures between 10 and 20 degrees. South wind around 20 mph.
• Flight query: Give me the flights leaving Denver August ninth coming back to Boston before 4pm.
Motivations for Going Large Scale
• Template-based approaches:
  PROS: Natural language; No training
  CONS: Repetitive; Scale poorly
  Small datasets with limited vocabularies
• Generating natural language from Wikipedia infoboxes
  ➢ 700K biographies
  ➢ 400K-word vocabulary
Generating Biography from Wikipedia Infobox
• Copy actions
• Conditioning on tables (fields + values)
Proposed Approach
From Structured Data to Sentences
• How? Neural language model for constrained sentence generation
• Success in:
  ➢ Caption generation (Vinyals et al., 2015)
  ➢ Machine translation (Bahdanau et al., 2014)
  ➢ Modeling conversations and dialogues (Shang et al., 2015)
Language Model with Conditioning
$P(w_t \mid c_t, z_{c_t}, g_f, g_w)$
• Local conditioning → already generated fields
• Global conditioning → fields and values
• Copy actions
• Table descriptors:
  ➢ Name of the field
  ➢ Position from the start
  ➢ Position from the end
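The three table descriptors listed above (field name, position from the start, position from the end) can be illustrated with a minimal sketch. The function name `table_descriptors`, the whitespace tokenization, the 1-based positions and the toy infobox are all assumptions made for illustration, not the authors' preprocessing code.

```python
from collections import defaultdict

def table_descriptors(infobox):
    """Map each table token to its (field, pos_from_start, pos_from_end) descriptors."""
    descriptors = defaultdict(list)
    for field, value in infobox.items():
        tokens = value.lower().split()  # assumed tokenization: lowercase, split on spaces
        n = len(tokens)
        for i, token in enumerate(tokens):
            # Assumed convention: 1-based positions counted from both ends of the value.
            descriptors[token].append((field, i + 1, n - i))
    return dict(descriptors)

# Hypothetical infobox, invented for illustration.
infobox = {"name": "john doe", "birthdate": "18 april 1352", "spouse": "jane doe"}
desc = table_descriptors(infobox)
print(desc["doe"])  # [('name', 2, 1), ('spouse', 2, 1)]
```

A token that occurs in several fields, such as "doe" here, ends up with several descriptors, so a model conditioned on them needs some way to combine them into one representation.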
Neural Language Model with Conditioning
$P(w_t \mid c_t, z_{c_t}, g_f, g_w)$
• Embeddings-based model
• Aggregating embeddings → component-wise max
Example context: john doe ( 18 april 1352 ) is a
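As a rough illustration of the component-wise max aggregation, the sketch below reduces several embedding vectors to one by keeping the largest value in each dimension. The helper name `componentwise_max` and the toy 3-dimensional embeddings are made up for this example; real embeddings would be learned parameters.

```python
def componentwise_max(vectors):
    """Aggregate equal-length embedding vectors by taking the max of each dimension."""
    if not vectors:
        raise ValueError("need at least one embedding")
    return [max(dims) for dims in zip(*vectors)]

# Toy embeddings (invented values), one per descriptor attached to the same token.
emb = {
    ("name", 2, 1): [0.2, -0.5, 0.1],
    ("spouse", 2, 1): [-0.3, 0.4, 0.0],
}
aggregated = componentwise_max(list(emb.values()))
print(aggregated)  # [0.2, 0.4, 0.1]
```

The result has the same dimensionality however many vectors go in, which is what lets a variable number of descriptors feed a fixed-size input.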
Neural Language Model with Conditioning
$P(w_t \mid c_t, z_{c_t}, g_f, g_w)$
• Input: $x = (c_t, z_{c_t}, g_f, g_w)$
  $\psi_{\alpha_1}(x) = [\psi(c_t); \psi(z_{c_t}); \psi(g_f); \psi(g_w)]$
• Non-linear transformation: $h(x)$
• Final score: $\phi_\alpha(x, w) = \phi_\alpha^{\mathcal{W}}(x, w) + \phi_\alpha^{\mathcal{Q}}(x, w)$
• Softmax function: $\log P(w \mid x) = \phi_\alpha(x, w) - \log \sum_{w' \in \mathcal{W} \cup \mathcal{Q}} \exp \phi_\alpha(x, w')$
• Training: maximize the likelihood of the training text
  $L_\alpha(s) = \sum_{t=1}^{T} \log P(w_t \mid c_t, z_{c_t}, g_f, g_w)$
Example context: john doe ( 18 april 1352 ) is a
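The softmax and the training objective on this slide can be sketched in a few lines of plain Python. This sketch assumes the final score of a candidate is the sum of a vocabulary score and a copy-action score, and that a candidate missing from one of the two sets contributes 0 for that term. The function names and the toy scores are hypothetical, and no neural network is involved.

```python
import math

def combine_scores(vocab_scores, copy_scores):
    """Sum vocabulary and copy-action scores over the union of candidates.

    Assumption for this sketch: a missing score counts as 0.
    """
    candidates = set(vocab_scores) | set(copy_scores)
    return {w: vocab_scores.get(w, 0.0) + copy_scores.get(w, 0.0) for w in candidates}

def log_softmax(scores):
    """Turn raw scores into log-probabilities (numerically stable log-sum-exp)."""
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {w: s - log_z for w, s in scores.items()}

def sentence_log_likelihood(step_scores, target_words):
    """Sum the log-probability of each target word over the time steps."""
    return sum(log_softmax(scores)[w] for scores, w in zip(step_scores, target_words))

# Toy scores, invented for illustration.
vocab_scores = {"is": 1.0, "was": 2.0, "a": 0.5}
copy_scores = {"doe": 3.0, "was": 0.5}
scores = combine_scores(vocab_scores, copy_scores)
log_probs = log_softmax(scores)
print(sorted(log_probs, key=log_probs.get, reverse=True))
```

Training then amounts to maximizing `sentence_log_likelihood` summed over the training sentences, with the scores produced by the model rather than fixed by hand as they are here.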
Experiments
WikiBio Dataset
• 728,321 Wikipedia biographies (80% - 10% - 10% split)
  ➢ Infobox
  ➢ Introduction section (only the first sentence is used for generation)
• Available at https://rlebret.github.io/wikipedia-biography-dataset/
Quantitative Results
[Table: results without copy actions and with copy actions]
• KN = Kneser-Ney language model (5-gram)
• NLM = neural language model (11-gram)
Attention Mechanism
• Adding a bias $\phi_\alpha^{\mathcal{Q}}$ to $\phi_\alpha^{\mathcal{W}}$
  ➢ Continuing an incomplete field
  ➢ Handling transitions between fields
Beam Size Impact
• Much faster than Kneser-Ney thanks to GPU
• Best BLEU with $K = 5$
[Figure: BLEU (y-axis) versus time in ms (x-axis, 100 to 2000) for Template KN and Table NLM, with points labeled by beam size; a 200 ms mark is annotated.]
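For readers unfamiliar with the beam size being varied here, the following is a minimal, generic beam search sketch. It is not the authors' decoder: `next_log_probs` is a hypothetical callback standing in for the language model, and the toy model is invented so that a beam of 1 (greedy search) misses the best sequence.

```python
import math

def beam_search(next_log_probs, beam_size, max_len, eos="</s>"):
    """Return the best (sequence, log-probability) found with the given beam size.

    `next_log_probs(prefix)` is a hypothetical callback returning a dict
    {token: log-probability} for the next token given the tuple `prefix`.
    """
    beams = [((), 0.0)]  # unfinished hypotheses: (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = [
            (prefix + (token,), score + lp)
            for prefix, score in beams
            for token, lp in next_log_probs(prefix).items()
        ]
        if not candidates:
            break
        # Keep only the `beam_size` highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len still compete
    return max(finished, key=lambda c: c[1])

def toy_model(prefix):
    """Invented next-token distribution where the greedy first choice is a trap."""
    table = {
        (): {"a": math.log(0.6), "b": math.log(0.4)},
        ("a",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("b",): {"z": math.log(0.9), "w": math.log(0.1)},
    }
    return table.get(prefix, {"</s>": 0.0})

greedy = beam_search(toy_model, beam_size=1, max_len=5)
best = beam_search(toy_model, beam_size=2, max_len=5)
print(greedy[0], best[0])
```

A larger beam explores more hypotheses per step, so each step costs more; this is the quality versus time trade-off the slide plots.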
Qualitative Results
MODEL | GENERATED SENTENCE
Template KN | frederick parker-rhodes ( born november 21 , 1914 – march 2 , 1987 ) was an english cricketer .
Table NLM + Local (field, start) | frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was an australian rules footballer who played with carlton in the victorian football league ( vfl ) during the XXXXs and XXXXs .
+ Global (field) | frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was an english mycology and plant pathology , mathematics at the university of uk .
+ Global (field, word) | frederick parker-rhodes ( 21 november 1914 – 2 march 1987 ) was a british computer scientist , best known for his contributions to computational linguistics .
Conclusion
• Sentence generation that:
  ➢ copies facts from the table.
  ➢ understands the type of fields.
  ➢ understands the relation between record tokens and table tokens.
  ➢ uses a low-capacity network → fast generation.
• WikiBio dataset available to download
  ➢ https://rlebret.github.io/wikipedia-biography-dataset/
Future Work
• Generating multiple sentences
• Loss / evaluation that assesses factual accuracy (≠ BLEU)
Thank you!