FANDA: A Novel Approach to Perform Follow-up Query Analysis
Qian Liu, Bei Chen, Jian-Guang Lou, Ge Jin, Dongmei Zhang
Introduction: Interaction

Brand   Sales   Profit   Year
BMW     31020   5000     2009
Ford    25220   3000     2009
Benz    47060   6000     2009

User [1] (Precedent Query): show the sales of BMW in 2009.
System (Precedent SQL): SELECT Sales WHERE Brand = BMW and Year = 2009
User [2] (Follow-up Query): what about profit?
Fused Query: show the profit of BMW in 2009.
System (Follow-up SQL): SELECT Profit WHERE Brand = BMW and Year = 2009
User [3]: of Benz?
Fused Query: show the profit of Benz in 2009.
System: SELECT profit WHERE Brand = Benz and Year = 2009
User [4]: Compare it to Ford.
Fused Query: Compare the profit of Benz in 2009 to Ford.
System: SELECT profit WHERE ( Brand = Benz or Brand = Ford ) and Year = 2009

74.58% of queries follow immediately after the question they are related to (Bertomeu et al., 2006).
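To make the interaction concrete, the fused queries can be run against the three-row table from the slide. This is only a minimal sketch: the table name `cars`, the `run` helper, the added `FROM` clause, and the extra `Brand` column in the last query are assumptions made so the SQL executes in SQLite; the slide shows the SQL without them.

```python
import sqlite3

# Rows copied from the slide's table; the table name "cars" is hypothetical.
ROWS = [
    ("BMW", 31020, 5000, 2009),
    ("Ford", 25220, 3000, 2009),
    ("Benz", 47060, 6000, 2009),
]

def run(sql, params=()):
    """Execute one query against a fresh in-memory copy of the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cars (Brand TEXT, Sales INTEGER, Profit INTEGER, Year INTEGER)"
    )
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows

# Turn 2, fused query: "show the profit of BMW in 2009."
turn2 = run("SELECT Profit FROM cars WHERE Brand = ? AND Year = ?", ("BMW", 2009))

# Turn 4, fused query: "Compare the profit of Benz in 2009 to Ford."
# Brand is selected as well (an addition) so the two profits can be told apart.
turn4 = run(
    "SELECT Brand, Profit FROM cars "
    "WHERE (Brand = ? OR Brand = ?) AND Year = ? ORDER BY Brand",
    ("Benz", "Ford", 2009),
)
print(turn2, turn4)
```

The follow-up "what about profit?" cannot be executed on its own; only the fused query carries the brand and year conditions inherited from the precedent query.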
Introduction

ATIS3
• Dahl D A, Bates M, Brown M, et al. Expanding the scope of the ATIS task: The ATIS-3 corpus[C]//HLT-ACL 1994
• Miller S, Stallard D, Bobrow R, et al. A fully statistical approach to natural language interfaces[C]//ACL 1996
• Zettlemoyer L S, Collins M. Learning context-dependent mappings from sentences to logical form[C]//ACL 2009
• Suhr A, Iyer S, Artzi Y. Learning to Map Context-Dependent Sentences to Executable Formal Queries[C]//NAACL 2018

SequentialQA
• Iyyer M, Yih W, Chang M W. Search-based neural structured learning for sequential question answering[C]//ACL 2017

Non-sentential Question Resolution
• Kumar V, Joshi S. Non-sentential Question Resolution using Sequence to Sequence Learning[C]//COLING 2016
• Kumar V, Joshi S. Incomplete Follow-up Question Resolution using Retrieval based Sequence to Sequence Learning[C]//SIGIR 2017
Introduction
• Prior work in context-dependent parsing focuses on a specific domain or on simple scenarios.
• Our goal: language understanding in complex scenarios covering diverse domains in NLIDB.
  o Dataset: a new dataset, FollowUp, is presented for research and evaluation.
  o Method: a novel approach is presented for incorporating interaction history into the current sentence.
Introduction: Dataset
• 1000 queries over 120 different tables inherited from WikiSQL
• Annotation with query triples: (Precedent, Follow-up, Fused)
• Train/Dev/Test: 640/160/200
Introduction: Method

Sequence to Sequence
Pipeline: Input → Encode → Vector → Decode → Output
• Learning to encode and decode
• Non-interpretable
• Requires lots of training data

FANDA (Follow-up ANalysis for DAtabase)
Pipeline: Input → Encode & Predict → Structure → Fusion → Output
• Learning to encode, fusion with semantic rules
• Reasoning during fusion
• Cold start with little data
Follow-up Analysis for Database
• Anonymization: generate the symbol sequence
• Generation: generate the segment sequence
• Fusion: query fusion at the structure level
Follow-up Analysis for Database: Anonymization
• Words in an utterance are split into two types: analysis-specific words and rhetorical words.
• All numbers and dates belong to Val.
• One analysis-specific word could belong to different symbols, generating several symbol sequences.

Symbol   Meaning               Knowledge
Col      Column Name           Table-Related
Val      Cell Value            Table-Related
Agg      Aggregation           Table-Related
Com      Comparison            Table-Related
Dir      Order Direction       Table-Related
Per      Personal Pronoun      Language-Related
Pos      Possessive Pronoun    Language-Related
Dem      Demonstrative         Language-Related
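The third bullet, that one ambiguous word yields several symbol sequences, can be sketched as a Cartesian product over per-word candidates. The lexicon below is hypothetical (in particular, treating "that" as either Dem or Per is an assumption for illustration), rhetorical words are assumed to pass through unchanged, and only plain numbers are mapped to Val (dates are not handled here).

```python
import re
from itertools import product

# Hypothetical lexicon: word -> candidate symbols.
# "that" is deliberately ambiguous to show how several sequences arise.
LEXICON = {
    "score": ["Col"],
    "above": ["Com"],
    "that": ["Dem", "Per"],
}

def symbol_sequences(tokens, lexicon=LEXICON):
    """Return every symbol sequence consistent with the per-word candidates."""
    options = []
    for token in tokens:
        if re.fullmatch(r"\d+(\.\d+)?", token):
            options.append(["Val"])                  # numbers belong to Val
        elif token.lower() in lexicon:
            options.append(lexicon[token.lower()])   # analysis-specific word
        else:
            options.append([token])                  # rhetorical word, kept as-is (assumption)
    return [list(seq) for seq in product(*options)]

sequences = symbol_sequences("show that score above 67".split())
for seq in sequences:
    print(" ".join(seq))
```

With this lexicon the single ambiguous word doubles the number of candidate sequences; later stages are then responsible for picking among them.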
Follow-up Analysis for Database: Generation
• A symbol does not take its surrounding context into account, so the segment structure is introduced.
• A segment is a combination of adjacent symbols, inspired by SQL parameters and common sense.

Segment   Compositional Deduction Rule
Select    [ Agg + [ Val ] ] + Col
Order     [ Dir ] + Col
W1        [ Col ] + [ Com ] + Val
W2        Col + Com + Col
Group     Col
P1        Per
P2        Pos
P3        Dem + Col

Precedent Query: Could you tell me the player whose score is larger than 67 (segments: Select, W1)
Follow Query (1): Who play the same position as him? (segments: P3, P1)
Follow Query (2): sort them using their score in ascending order. (segments: P1, Order)
Follow-up Analysis for Database: Generation
1. Symbols are combined to generate all possible segment sequences.
2. A ranking model is built to score these segment sequences and pick the best one as output.
3. Intent is introduced to distinguish two scenarios: Refine and Append.
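Step 1 can be illustrated with a small enumerator that covers a symbol sequence with adjacent segments. This is a sketch under assumptions: the rule table expands the optional parts of the compositional deduction rules (segment names read as Select, Order, W1, W2, Group, P1, P2, P3), every symbol is required to belong to some segment, and symbols must appear in the order the rule lists them. The ranking model of step 2 is learned and is not reproduced here.

```python
# Each segment type maps to the concrete symbol patterns its rule allows,
# with optional parts ("[...]") expanded by hand.
RULES = {
    "Select": [("Col",), ("Agg", "Col"), ("Agg", "Val", "Col")],
    "Order": [("Col",), ("Dir", "Col")],
    "W1": [("Val",), ("Col", "Val"), ("Com", "Val"), ("Col", "Com", "Val")],
    "W2": [("Col", "Com", "Col")],
    "Group": [("Col",)],
    "P1": [("Per",)],
    "P2": [("Pos",)],
    "P3": [("Dem", "Col")],
}

def segment_sequences(symbols):
    """Enumerate every way to split `symbols` into adjacent rule-matching segments."""
    symbols = tuple(symbols)
    if not symbols:
        return [[]]  # one way to segment nothing: the empty sequence
    results = []
    for name, patterns in RULES.items():
        for pattern in patterns:
            size = len(pattern)
            if symbols[:size] == pattern:
                for rest in segment_sequences(symbols[size:]):
                    results.append([(name, pattern)] + rest)
    return results

# "the player whose score is larger than 67" -> Col Col Com Val
candidates = segment_sequences(["Col", "Col", "Com", "Val"])
print(len(candidates))
for candidate in candidates:
    print([name for name, _ in candidate])
```

Even this four-symbol input produces a dozen candidates, because a lone Col can be read as Select, Order, or Group; that ambiguity is what the ranking model in step 2 has to resolve.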
Follow-up Analysis for Database: Fusion

Network   Year
TSN       1995
CBC       1995
CFL       1996

1. Conflicting segment pairs cannot occur at the same time.
2. Use one sentence to fill in what the other sentence lacks.
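The two fusion principles can be illustrated on the Network/Year table with a deliberately simplified sketch. It is not FANDA's fusion procedure: it assumes each value is looked up in the table to find its column, treats two values of the same column as a conflicting pair, and lets the follow-up value win. The helper names `column_of` and `fuse_values` are hypothetical.

```python
# Table from the slide.
TABLE = [
    {"Network": "TSN", "Year": "1995"},
    {"Network": "CBC", "Year": "1995"},
    {"Network": "CFL", "Year": "1996"},
]

def column_of(value, table):
    """Return the first column that contains `value`, or None if absent."""
    for row in table:
        for column, cell in row.items():
            if cell == value:
                return column
    return None

def fuse_values(precedent_values, followup_values, table):
    """Merge value conditions; a follow-up value replaces a conflicting one."""
    fused = {}
    for value in list(precedent_values) + list(followup_values):
        column = column_of(value, table)
        if column is not None:
            # Same column twice means a conflict: the later (follow-up) value wins.
            fused[column] = value
    return fused

# Precedent: "In 1995, is there any network named CBC?"  Follow-up: "Any TSN?"
fused = fuse_values(["1995", "CBC"], ["TSN"], TABLE)
print(fused)
```

Here TSN and CBC conflict because both are Network values, so only TSN survives, while 1995 is carried over from the precedent query to complete the follow-up.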
Follow-up Analysis for Database: Fusion
[Figure]
Follow-up Analysis for Database: Model Learning
[Figure: bidirectional LSTM with a CRF layer. Input tokens x_1, x_2, x_3, …, x_n ("show the sum average …") feed forward and backward LSTM layers; the LSTM outputs pass to the CRF layer, which uses transition scores such as A_{O,S^B} and A_{O,O}. Candidate tag sequences of the form "O O Select^B Select^I …" are each paired with an intent (Refine or Append) and grouped into the sets P and N.]
Experiments: Dataset

Metrics
• Symbol Acc: symbols consistent with the gold fused query
• BLEU: quality of the output fused query
• Execution Accuracy: execution correctness of the output query (parser: Coarse-to-Fine)

Baselines
o SEQ2SEQ: attention-based SEQ2SEQ
o COPYNET: SEQ2SEQ + copy mechanism
o S2S + ANON: SEQ2SEQ + anonymization
o COPY + ANON: COPYNET + anonymization
o CONCAT: concatenate the precedent query and the follow-up query
o E2ECR: end-to-end coreference resolution system

Anonymization
Origin: In 1995, is there any network named CBC? Any TSN?
Transform: In Val#1, is there any Col#1 named Val#2? Any Val#3?
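The Origin/Transform example can be reproduced with a short sketch of placeholder-based anonymization. The details are assumptions rather than the authors' implementation: matching is case-insensitive on whole words, placeholders are numbered in order of first appearance, and a repeated word reuses its placeholder. The function name `anonymize` is hypothetical.

```python
import re

def anonymize(utterance, columns, values):
    """Replace column names and cell values with Col#k / Val#k placeholders."""
    kinds = {column.lower(): "Col" for column in columns}
    kinds.update({str(value).lower(): "Val" for value in values})
    # Try longer entries first so a multi-word value wins over a shorter one.
    alternatives = sorted(kinds, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(item) for item in alternatives) + r")\b",
        re.IGNORECASE,
    )
    counters = {"Col": 0, "Val": 0}
    mapping = {}

    def replace(match):
        key = match.group(1).lower()
        if key not in mapping:
            kind = kinds[key]
            counters[kind] += 1
            mapping[key] = f"{kind}#{counters[kind]}"
        return mapping[key]

    return pattern.sub(replace, utterance), mapping

text, mapping = anonymize(
    "In 1995, is there any network named CBC? Any TSN?",
    columns=["Network", "Year"],
    values=["TSN", "CBC", "CFL", "1995", "1996"],
)
print(text)
```

The returned `mapping` is what allows a model's anonymized output to be translated back into concrete column names and values.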
Experiments: Ablation Results
[Figure: bar chart comparing FANDA + Pretrain, FANDA, FANDA - Intent, and FANDA - Ranking on Symbol Acc and BLEU. Bar labels in the chart: 59.87, 59.02, 55.01, 52.92, 48.2, 47.8, 35.3, 24.3.]
Experiments: Error Case Analysis
Systems compared: COPY+ANON, FANDA
Cases: Substantial Overlap, No Overlap, Ambiguous Overlap
Check-mark annotations: √ (Segment Type), √, √ (Table Structure), √ (Combination of Symbol)
Future Work
o Extending to multi-turn and multi-table scenarios.
o Using reinforcement learning.

Thank you!
Question & Answer: SEQ2SEQ
$S_1 = \mathrm{RNN}(H_n, E_{sos})$
$\alpha_i = S_1^{T} W H_i$
$a_i = \frac{e^{\alpha_i}}{\sum_k e^{\alpha_k}}$
$C = \sum_i a_i \cdot H_i$
$O = F(C, S_1)$
$S_2 = \mathrm{RNN}(S_1, E_o)$
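The attention step on this slide (score each encoder state against the decoder state with a bilinear form, normalize with a softmax, take the weighted sum) can be sketched in plain Python. The function name `attention_context` and the toy vectors are assumptions for illustration; the RNN updates and the output function are left out.

```python
import math

def attention_context(decoder_state, encoder_states, weight):
    """Bilinear attention: weights over encoder states and the context vector."""
    # Score each encoder state H_i as S^T W H_i.
    scores = []
    for hidden in encoder_states:
        projected = [sum(w * h for w, h in zip(row, hidden)) for row in weight]
        scores.append(sum(s * p for s, p in zip(decoder_state, projected)))
    # Softmax over the scores (the max is subtracted for numerical stability).
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    # Context vector: attention-weighted sum of the encoder states.
    size = len(encoder_states[0])
    context = [
        sum(a * hidden[d] for a, hidden in zip(weights, encoder_states))
        for d in range(size)
    ]
    return weights, context

identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_context(
    decoder_state=[1.0, 0.0],
    encoder_states=[[1.0, 0.0], [0.0, 1.0]],
    weight=identity,
)
print(weights, context)
```

In this toy case the first encoder state aligns with the decoder state, so it receives the larger weight and dominates the context vector.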
Question & Answer: COPYNET
[Figure: the decoder hidden state S and the attention output A attend over a memory of encoder hidden states for the input "show T by C1 how about by C2"; a single softmax spans the Generate scores (vocabulary: "… <PAD> show") and the Copy scores (source sentence: "show T by C1 how about by C2").]
$\alpha_i = V_a^{T} \tanh(W [H_n ; H_i])$
$a_i = \frac{e^{\alpha_i}}{\sum_k e^{\alpha_k}}$
$C = \sum_i a_i \cdot H_i$
$S_1 = \mathrm{RNN}(H_n, [E_s, C])$
$O_G = W_o S_1$
$\beta_j = \tanh(W_C H_j) \, S_1$
$O_C = [\beta_1, \dots, \beta_s]$
$P = \mathrm{Softmax}(O_G, O_C)$
$p(y_t) = p(y_t, g \mid \cdot) + p(y_t, c \mid \cdot)$
Question & Answer: Score Evolution
• In Iteration 1, different but similar scores are assigned to all candidates in P and N with random initialization.
• From Iteration 5 to 21, the score distribution becomes increasingly skewed.
• From Iteration 13 to 21, the candidate with the highest score remains unchanged, indicating the stability of weakly supervised learning.