Neural Information Retrieval
Wassila Lalouani
Plan
§ Neural network architectures
§ Neural IR tasks
§ Neural IR architecture
§ Feature representations
§ Neural IR query auto-completion
§ Neural IR query suggestion
§ Neural IR document ranking
§ Conclusion
Neural network architectures
§ Neural networks: linear decision boundary
§ Autoencoders: learn embeddings
§ Convolutional Neural Networks
§ Recurrent Neural Networks: learn from time series
Deep learning?
[Figure: a single perceptron (inputs x_1 … x_d, weights w_1 … w_d, bias b, output y) versus the LeNet convolutional network (convolution, subsampling, and fully connected layers from a 32x32 input to a 10-class output).]
Neural IR tasks
§ Query auto-completion
§ Next query suggestion
§ Document ranking
ü How can neural networks be used in information retrieval?
ü What should we learn using neural networks?
ü What are the neural IR tasks?
ü How can neural networks be incorporated into traditional IR architectures?
ü What are the inputs and outputs?
ü How are the inputs and outputs represented?
ü IR architecture versus neural information retrieval?
Examples extracted from [2]
Neural IR architecture
Neural networks for IR:
ü Learning a matching function using a traditional feature-based representation of the query and document.
ü Learning representations of text to deal with vocabulary mismatch.
ü Learning end-to-end IR.
Inputs and output:
§ Query or prefix representation
§ Candidate document, suffix, or query representation
§ Estimate of the relevance of the candidate to the input
Figure from [2]
Background: Feature Representations
q Vector representations
q Local representation: every term is represented by a distinct binary vector.
q Distributed representation: every term is represented by a real-valued vector reflecting the features of the term.
q Embedding:
q The vector dimensions are not hand-crafted features.
q A latent space that preserves the properties of, and the relationships between, the items.
q Similarity:
q Topical similarity (Seattle vs. Seahawks)
q Type similarity (Seattle vs. Sydney)
Examples extracted from [1]
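As an illustration of the difference between local and distributed representations, the sketch below (not from the slides; the toy vocabulary and made-up embedding values are assumptions) contrasts one-hot term vectors with dense embeddings and compares them by cosine similarity.

```python
# Illustrative sketch: local (one-hot) vs. distributed (dense) term representations.
import numpy as np

vocab = ["seattle", "seahawks", "sydney"]          # toy vocabulary (assumption)

# Local representation: each term is a distinct binary (one-hot) vector.
one_hot = {t: np.eye(len(vocab))[i] for i, t in enumerate(vocab)}

# Distributed representation: each term is a dense real-valued vector in a
# latent space; these 4-d values are made up purely for illustration.
embedding = {
    "seattle":  np.array([0.9, 0.1, 0.8, 0.2]),
    "seahawks": np.array([0.8, 0.2, 0.7, 0.1]),   # topically close to "seattle"
    "sydney":   np.array([0.1, 0.9, 0.2, 0.8]),   # same type (city), different topic
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot vectors of distinct terms are always orthogonal (similarity 0)...
print(cosine(one_hot["seattle"], one_hot["seahawks"]))      # 0.0
# ...while embeddings can encode graded topical and type similarity.
print(cosine(embedding["seattle"], embedding["seahawks"]))  # high
print(cosine(embedding["seattle"], embedding["sydney"]))    # lower
```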
Convolutional Latent Semantic Model (CLSM)
§ Query auto-completion: query prefix and suffix pairs, e.g. <“things to do in”, “seattle”>
§ Next query suggestion: consecutive queries in user sessions, e.g. <“things to do in seattle”, “space needle”>
§ CLSM is a deep neural network with convolutional and pooling layers. Paper [7]
§ CLSM projects a variable-length text into a fixed-length vector.
§ Each word in the input text is first hashed to a letter-trigram vector.
§ For each word, the convolutional layer extracts contextual features based on its immediate neighbours within the window size.
§ A max-pooling layer combines the output of the convolutional layer into a fixed-length feature vector.
§ The output layer produces the final vector representation of the entire query (see the sketch below).
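The projection pipeline above can be sketched in a few lines. This is a minimal illustration, not the trained model of [7]: the trigram hash-table size, layer dimensions, window size of 3, tanh activations, and random weights are all assumptions; the real CLSM learns its weights from search data.

```python
# Minimal sketch of a CLSM-style text projection (assumed sizes and random weights).
import numpy as np

def letter_trigrams(word, table_size=5000):
    """Hash a word into a sparse letter-trigram count vector."""
    w = f"#{word}#"                                  # word-boundary markers
    v = np.zeros(table_size)
    for i in range(len(w) - 2):
        v[hash(w[i:i + 3]) % table_size] += 1.0
    return v

rng = np.random.default_rng(0)
W_conv = rng.normal(scale=0.1, size=(300, 3 * 5000))   # convolution filter (hypothetical size)
W_out  = rng.normal(scale=0.1, size=(128, 300))        # output (semantic) layer (hypothetical size)

def clsm_project(text):
    """Project a variable-length text into a fixed-length semantic vector."""
    words = text.lower().split()
    tri = [letter_trigrams(w) for w in words]
    tri = [np.zeros(5000)] + tri + [np.zeros(5000)]     # padding for a window of 3 words
    # Convolutional layer: one contextual feature vector per word window.
    conv = [np.tanh(W_conv @ np.concatenate(tri[i:i + 3])) for i in range(len(words))]
    # Max pooling over word positions -> fixed-length feature vector.
    pooled = np.max(np.stack(conv), axis=0)
    # Output layer: final semantic vector of the whole query.
    return np.tanh(W_out @ pooled)

print(clsm_project("things to do in seattle").shape)    # (128,)
```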
Query Auto-Completion for Rare Prefixes
Paper [3]
Algorithm:
§ QAC suggests completions according to the queries observed in the logs.
§ Select the set of candidate completions given the partial user query.
§ Compute the frequency of query suffixes based on the search history.
§ Rank the candidate suffixes using n-gram features and CLSM (a sketch of this ranking step follows below).
The n-gram frequency features depend on the number of words in the suffix and on the frequency of those n-grams.
The CLSM model is trained on prefix-suffix pairs from the dataset; the prefix and suffix are normalized by removing the end term. The CLSM model projects both the prefix and the suffix into a common vector space and ranks the candidates by prefix-suffix cosine similarity, which yields the clsmsim feature.
ꭕ Recommends completions for rare query prefixes.
ꭕ Generic mining and efficient suffix ranking.
ꭕ The n-gram suffixes generate synthetic suggestion candidates that have never been observed in the historical logs.
ꭕ N-gram features outperform CLSM suggestions, but provide no semantic evaluation.
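The candidate-generation and ranking loop described above can be sketched as follows. The toy query log, the linear combination of the two feature families, and the embed() stand-in for the learned CLSM projection are assumptions for illustration, not the exact setup of [3] (which trains a ranker over these features).

```python
# Hedged sketch: rank synthetic completions (prefix + mined suffix) with
# an n-gram frequency feature and a CLSM-style similarity feature.
import numpy as np
from collections import Counter

# Offline step: mine end-of-query n-grams (suffixes) and their frequencies from the logs.
query_log = ["things to do in seattle", "things to do in london", "cheap flights to london"]
suffix_counts = Counter()
for q in query_log:
    words = q.split()
    for n in range(1, len(words) + 1):
        suffix_counts[" ".join(words[-n:])] += 1

def embed(text, dim=256):
    """Stand-in for the learned CLSM projection (hashed character trigrams)."""
    v = np.zeros(dim)
    t = f"#{text}#"
    for i in range(len(t) - 2):
        v[hash(t[i:i + 3]) % dim] += 1.0
    return v

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def complete(prefix, top_k=3, w_freq=0.5, w_clsm=0.5):
    """Score each synthetic candidate with both feature families (weights are assumptions)."""
    p_vec = embed(prefix)
    scored = []
    for suffix, freq in suffix_counts.items():
        ngram_feature = np.log1p(freq)              # n-gram frequency feature
        clsmsim = cosine(p_vec, embed(suffix))      # clsmsim: prefix-suffix similarity
        scored.append((w_freq * ngram_feature + w_clsm * clsmsim, f"{prefix} {suffix}"))
    return [c for _, c in sorted(scored, reverse=True)[:top_k]]

print(complete("cheap flights"))
```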
Query Auto-Completion for Rare Prefixes
ꭕ CLSM and n-gram features together perform better than either does separately.
ꭕ The approach performs better than the baseline for unseen and rare prefixes.
Representations of Queries and Reformulations
Consecutive queries in user sessions. Paper [4]
§ CLSM represents query reformulations as vectors that map semantically and syntactically similar queries closer together in the embedding space.
§ CLSM provides a semantic mapping of queries into the embedding space.
§ The inputs in this architecture are pairs of queries formulated during the same user session.
§ Non-contextual features: prefix length, suggestion length, vowels, …
§ Contextual features: n-gram similarity between the suggestion candidate and the previous queries from the same user session.
§ CLSM feature: the cosine similarity between the CLSM vectors of the suggestion candidate and up to 10 previous queries from the same session (a sketch of this feature follows below).
ꭕ Recommends suggestions based on semantic embeddings.
ꭕ Lack of a context-aware ranking model.
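The contextual CLSM feature can be sketched as below. The embed() function is only a stand-in for the learned CLSM query vectors, and taking the maximum similarity over the previous session queries is an assumption made for illustration.

```python
# Illustrative sketch of the session-context feature: similarity between a
# suggestion candidate and up to 10 previous queries in the same session.
import numpy as np

def embed(text, dim=256):
    """Stand-in for the learned CLSM query vector (hashed character trigrams)."""
    v = np.zeros(dim)
    t = f"#{text}#"
    for i in range(len(t) - 2):
        v[hash(t[i:i + 3]) % dim] += 1.0
    return v

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def session_clsm_feature(candidate, session_queries, window=10):
    """Contextual feature for a suggestion candidate given the session history."""
    c_vec = embed(candidate)
    history = session_queries[-window:]              # at most the 10 previous queries
    sims = [cosine(c_vec, embed(q)) for q in history]
    return max(sims, default=0.0)                    # max-similarity aggregation (assumption)

session = ["things to do in seattle", "seattle attractions"]
print(session_clsm_feature("space needle tickets", session))
```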
Representations of Queries and Reformulations
Evaluation metric: mean reciprocal rank (MRR).
ü The supervised ranking models using contextual (CLSM) features show large improvements on short prefixes and handle ambiguity efficiently.
ü Overall, the CLSM-based features outperform the baseline ranking model.
Document ranking
Paper [5]. Local model (feature representation and neural model):
ü The relevance of a document is based on exact matches of query terms in the body text.
ü The short vertical lines in the figure correspond to exact matches between pairs of query and document terms.
ü The query term matches in relevant documents are more localized and clustered.
ü Does not learn an embedding (a sketch of this exact-match input follows below).
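A minimal sketch of the local (exact-match) input described above: a binary interaction matrix whose non-zero entries are the "short vertical lines" between query and document terms. The document-length cap and the matrix layout are assumptions for illustration.

```python
# Sketch: binary exact-match interaction matrix between query and document terms.
import numpy as np

def exact_match_matrix(query, document, max_doc_terms=1000):
    q_terms = query.lower().split()
    d_terms = document.lower().split()[:max_doc_terms]   # cap document length (assumption)
    m = np.zeros((len(q_terms), max_doc_terms))
    for i, qt in enumerate(q_terms):
        for j, dt in enumerate(d_terms):
            if qt == dt:
                m[i, j] = 1.0            # exact term match; no embedding is learned here
    return m

doc = "the space needle is an observation tower in seattle washington"
print(exact_match_matrix("seattle tower", doc).sum())    # 2 exact matches
```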
Document ranking
Paper [5]
§ Distributed representation:
§ Latent semantic analysis learns distributed representations of text.
§ Match the query against the document in the latent semantic space.
§ A deep neural network matches the query and the document using learned distributed representations.
§ An n-graph based representation of each term in the query and document.
§ The matching is based on the Hadamard product between the embedded document matrix and the expanded query embedding (a sketch follows below).
§ Duet model:
§ Joint training provides the robustness of learning both representations, which are combined through a similarity-scoring layer.
ü Traditional IR based on statistical properties offers strong results for matching query terms against the document.
ü Distributed representations complement the model with semantic-equivalence matching.
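The distributed matching step can be sketched as follows. The term embeddings, dimensions, and the reduction of the interaction tensor to a single score are illustrative assumptions; in the Duet model of [5], further layers process this tensor and the local and distributed sub-network scores are combined and trained jointly.

```python
# Hedged sketch: Hadamard-product matching between an expanded query embedding
# and the embedded document matrix (toy embeddings and scoring).
import numpy as np

EMB_DIM, DOC_LEN = 300, 8                                  # hypothetical sizes

def embed_term(term):
    """Stand-in for a learned term embedding (deterministic pseudo-random vector)."""
    return np.random.default_rng(abs(hash(term)) % (2**32)).normal(size=EMB_DIM)

def distributed_match(query, document):
    q_vec = np.sum([embed_term(t) for t in query.split()], axis=0)          # query embedding
    d_mat = np.stack([embed_term(t) for t in document.split()[:DOC_LEN]])   # document matrix
    # Expand the query embedding across document positions, then take the
    # Hadamard (element-wise) product with the embedded document matrix.
    interaction = d_mat * q_vec[None, :]
    # Toy reduction to a scalar relevance score; the real model uses further layers.
    return float(interaction.sum())

doc = "the space needle is an observation tower in seattle"
print(distributed_match("seattle tower", doc))
```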
Document ranking
Evaluation metric: normalized discounted cumulative gain (NDCG).
ü All Duet runs outperformed the local-only and distributed-only models, as well as the non-neural and neural baselines.
Perspective
§ Neural networks complement traditional IR semantically.
§ Latent semantic analysis learns distributed representations from a global dataset.
§ There is a tendency to specialize this globally learned knowledge to meet specific domain needs.
§ Incorporate the categorization of documents using knowledge graphs.
§ Term clustering and localized term occurrence could produce a more adaptive relevance score.
Questions?
References
[1] Artificial neural network architecture figure. https://www.researchgate.net/figure/Artificial-neural-network-architecture-ANN-i-h-1-h-2-h-n-o_fig1_321259051
[2] Craswell, N., Croft, W.B., de Rijke, M., et al. 2018. Neural information retrieval: introduction to the special issue. Information Retrieval Journal 21, 107–110. https://doi.org/10.1007/s10791-017-9323-9
[3] Bhaskar Mitra and Nick Craswell. 2015. Query Auto-Completion for Rare Prefixes. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM ’15). Association for Computing Machinery, New York, NY, USA, 1755–1758.
[4] Bhaskar Mitra. 2015. Exploring Session Context using Distributed Representations of Queries and Reformulations. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’15). Association for Computing Machinery, New York, NY, USA, 3–12.
[5] Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to Match using Local and Distributed Representations of Text for Web Search. In Proceedings of the 26th International Conference on World Wide Web (WWW ’17). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 1291–1299.
[6] Bhaskar Mitra and Nick Craswell. 2017. Neural Models for Information Retrieval. CoRR abs/1705.01509. arXiv:1705.01509.
[7] Yelong Shen et al. 2014. A Convolutional Latent Semantic Model for Web Search.