Learned Index Structures
Paper by Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis
Bigtable Research Review Meeting
Presented by Deniz Altinbuken
go/learned-index-structures-presentation
January 29, 2018
Objectives
1. Show that all index structures can be replaced with deep learning models: learned indexes.
2. Analyze under which conditions learned indexes outperform traditional index structures, and describe the main challenges in designing learned index structures.
3. Show that the idea of replacing core components of a data management system with learned models can be very powerful.
Claims
● Traditional indexes assume a worst-case data distribution so that they can be general purpose.
  ○ They do not take advantage of patterns in the data.
● Knowing the exact data distribution enables highly optimizing any index the database system uses.
● ML opens up the opportunity to learn a model that reflects the patterns and correlations in the data, and thus enables the automatic synthesis of specialized index structures: learned indexes.
Main Idea
A model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records.
Outline
● Background
● Learned Index Structures
● Results
● Conclusion
Background
Neural Networks: An Example
Recognizing handwriting
● Very difficult to express our intuitions, such as "a 9 has a loop at the top and a vertical stroke in the bottom right".
● Very difficult to create precise rules and solve this algorithmically.
  ○ Too many exceptions and special cases.
Neural Networks: An Example
Neural networks approach the problem in a different way.
● Take a large number of handwritten digits: training data.
● Develop a system which can learn from the training data.
● Automatically infer rules for recognizing handwritten digits by going through examples!
● Create a network of neurons that can learn! :)
Neurons: Perceptron
A perceptron takes several binary inputs x1, x2, … and produces a single binary output:
[diagram: inputs x1, x2, x3 with weights w1, w2, w3 feeding a single output]
The output is computed as a function of the inputs, where the weights w1, w2, … express the importance of each input to the output.
Neurons: Perceptron
The output is determined by whether the weighted sum $\sum_j w_j x_j$ is less than or greater than some threshold value.
[diagram: inputs x1, x2, x3 weighted by w1, w2, w3 feed a neuron with threshold t that produces the output]

$$\text{output} = \begin{cases} 0 & \text{if } \sum_j w_j x_j \le \text{threshold} \\ 1 & \text{if } \sum_j w_j x_j > \text{threshold} \end{cases}$$

Just like the weights, the threshold is a number which is a parameter of the neuron. If the threshold is reached, the neuron fires.
Neurons: Perceptron
A more common way to describe a perceptron replaces the weighted sum $\sum_j w_j x_j$ with the dot product $w \cdot x$, and replaces $-\text{threshold}$ with a bias. The bias describes how easy it is to get the neuron to fire.

$$\text{output} = \begin{cases} 0 & \text{if } w \cdot x + \text{bias} \le 0 \\ 1 & \text{if } w \cdot x + \text{bias} > 0 \end{cases}$$
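As a hedged illustration (not from the paper), a minimal perceptron in Python; the weights and bias are made-up values:

```python
# A minimal perceptron: fires (outputs 1) exactly when the weighted
# sum of its inputs plus the bias is greater than zero.
def perceptron(x, w, bias):
    weighted_sum = sum(w_j * x_j for w_j, x_j in zip(w, x))
    return 1 if weighted_sum + bias > 0 else 0

# Three binary inputs with illustrative weights: 0.9 - 0.5 > 0, so it fires.
print(perceptron(x=[1, 0, 1], w=[0.6, 0.2, 0.3], bias=-0.5))  # -> 1
```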
Neurons: Perceptron
● By varying the weights and the threshold, we get different models of decision-making.
● A complex network of perceptrons that uses layers can make quite subtle decisions (see the sketch below).
[diagram, built up across three slides: a network of perceptrons mapping inputs to an output, annotated first with its 1st and 2nd layers, then with its input layer, hidden layers, and output layer]
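A minimal sketch of such a layered network, reusing the perceptron function above; all weights and biases are made-up values for illustration:

```python
# A tiny two-layer network: the outputs of the first layer of
# perceptrons become the inputs of the second layer.
def layer(x, weights, biases):
    return [perceptron(x, w, b) for w, b in zip(weights, biases)]

x = [1, 0, 1]
hidden = layer(x, weights=[[0.6, 0.2, 0.3], [-0.4, 0.9, 0.1]], biases=[-0.5, 0.2])
output = layer(hidden, weights=[[0.7, -0.3]], biases=[-0.2])
print(output)  # -> [1] with these made-up weights
```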
Neurons: Perceptron
Perceptrons are great for decision making.
Neurons: Perceptron
How about learning?
Neurons: Perceptron
Earlier: Automatically infer rules for recognizing handwritten digits by going through examples!
Learning
● A neural network goes through examples to learn weights and biases so that the output from the network correctly classifies a given digit.
● If a small change made in some weight or bias in the network causes a small corresponding change in the output from the network, the network can learn.
We are trying to create the right mapping for all cases.
Learning
The neural network is "trained" by adjusting weights and biases to find the perfect model that would generate the expected output for the "training data".
Learning
Through training you minimize the prediction error.
(But having perfect output is difficult.)
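As a toy illustration of training (gradient descent on a one-weight linear model; the data, learning rate, and step count are assumptions for the example):

```python
# Fit prediction = w * x + b to example data by repeatedly nudging
# w and b against the gradient of the squared prediction error.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # made-up (input, target) pairs
w, b = 0.0, 0.0
learning_rate = 0.05

for step in range(1000):
    for x, target in data:
        error = (w * x + b) - target
        w -= learning_rate * 2 * error * x  # d(error^2)/dw
        b -= learning_rate * 2 * error      # d(error^2)/db

print(w, b)  # converges toward w = 2, b = 1, minimizing the prediction error
```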
Neurons: Sigmoid
● Sigmoid neurons are similar to perceptrons, but modified so that small changes in their weights and bias cause only a small change in their output.
A small Δ in any weight or bias causes a small Δ in the output!
[diagram: perturbing a weight by Δw changes the output by a correspondingly small Δoutput]
Neurons: Sigmoid
● A sigmoid neuron takes several inputs x1, x2, … which can be any real number between 0 and 1 (e.g. 0.256) and produces a single output, which can also be any real number between 0 and 1.

$$\text{output} = \sigma(w \cdot x + \text{bias}), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \;\; \text{(the sigmoid function)}$$

Great for representing probabilities!
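A minimal sketch of a sigmoid neuron in Python, using the same illustrative inputs as the perceptron example above:

```python
import math

# Sigmoid neuron: squashes the weighted sum into (0, 1), so small
# changes in weights or bias cause small changes in the output.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(x, w, bias):
    weighted_sum = sum(w_j * x_j for w_j, x_j in zip(w, x))
    return sigmoid(weighted_sum + bias)

print(sigmoid_neuron(x=[1.0, 0.0, 1.0], w=[0.6, 0.2, 0.3], bias=-0.5))  # ~0.599
```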
Neurons: ReLU (Rectified Linear Unit)
● Better for deep learning because it preserves the information from earlier layers better as it goes through hidden layers.

$$\text{output} = \begin{cases} 0 & \text{if } x \le 0 \\ x & \text{if } x > 0 \end{cases}$$
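And the ReLU itself, for comparison with the sigmoid sketch above:

```python
# ReLU: passes positive inputs through unchanged and clips the rest to zero.
def relu(x):
    return x if x > 0 else 0.0

print(relu(0.4))   # -> 0.4 (positive values are preserved)
print(relu(-1.3))  # -> 0.0 (negative values are clipped)
```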
Activation Functions (Transfer Functions)
To get an intuition about the neurons, it helps to see the shape of the activation function.
Learned Index Structures
Index Structures as Neural Network Models
● Indexes are already, to a large extent, learned models like neural networks.
● Indexes predict the location of a value given a key.
  ○ A B-tree is a model that takes a key as an input and predicts the position of a data record.
  ○ A Bloom filter is a binary classifier which, given a key, predicts whether the key exists in a set.
B-tree
The B-tree provides a mapping from a lookup key to a position inside the sorted array of records.
For efficiency, the B-tree indexes at page granularity rather than pointing at every individual record, so a lookup lands within one page of the record.
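As a hedged sketch of this view (a simple linear model, not the paper's recursive model index), the key-to-position mapping of a sorted array can be learned, with a search window derived from the model's worst-case error playing the role of page granularity; the keys below are made up for illustration:

```python
import bisect

# Sorted lookup keys; positions 0..n-1 stand in for record offsets.
keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
n = len(keys)
positions = list(range(n))

# Fit position ~ slope * key + intercept by least squares.
mean_k = sum(keys) / n
mean_p = sum(positions) / n
slope = (sum((k - mean_k) * (p - mean_p) for k, p in zip(keys, positions))
         / sum((k - mean_k) ** 2 for k in keys))
intercept = mean_p - slope * mean_k

# Worst-case prediction error over the keys bounds the search window.
max_err = max(abs(round(slope * k + intercept) - p)
              for k, p in zip(keys, positions))

def lookup(key):
    guess = round(slope * key + intercept)
    lo, hi = max(0, guess - max_err), min(n, guess + max_err + 1)
    # Local search within the error bound, like searching inside one page.
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < n and keys[i] == key else None

print(lookup(13))  # -> 5 (the true position of key 13)
```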