Deep Learning for Semantic Search in E-commerce
Somnath Banerjee Head of Search Algorithms at Walmart Labs https://www.linkedin.com/in/somnath-banerjee/
March 19, 2019
Walmart E-commerce search problem
E-commerce Search
Store Associate
2
E-commerce search provides the functionality of a human store associate, but at scale
3
4
5
6
7
Learning book
Tide 100 oz, Tide 100 fl oz, Tide 100 ounce
Neck style? Fabric?
Ziploc
Ambiguity
Missing catalog values
Levi’s, Levi Strauss, Signature by Levi Strauss and Co.
Open vocabulary in query and catalog
8
$300!!!
9
Pump shoes
Position 1 Position 2
10
Lemon
Nivea 16oz
11
12
13
14
15
Text query
Product Type
16
nvidia gpu → Computer Video Cards: 0.85; Laptop Computers: 0.08; Desktop Computers: 0.06
ziploc bags → Food Storage Bags: 1.0
bedroom furniture → Large number of product types
17
Short text
tokens
Large scale classification
product types (classes)
Multi-class, multi-label problem
have multiple product types
Needs to respond in few milliseconds
at runtime
Unbalanced class distribution
types are much more popular
18
https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
Output Layer
word2vec
Softmax/sigmoid
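The choice between a softmax and a sigmoid output layer can be illustrated with a minimal sketch. It assumes the network has already produced one logit per product type; the logit values, the `threshold` parameter, and the function names are hypothetical and are not taken from the deck.

```python
import math

# Hypothetical label set, borrowed from the "nvidia gpu" example.
PRODUCT_TYPES = ["Computer Video Cards", "Laptop Computers", "Desktop Computers"]


def softmax(logits):
    """Scores compete with each other and sum to 1 (single-label view)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Each class is scored independently (multi-label view)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def predict_product_types(logits, multi_label=True, threshold=0.5):
    """Return (product type, score) pairs, best first.

    With multi_label=True every class whose sigmoid score reaches the
    (assumed) threshold is kept; otherwise only the softmax argmax is kept.
    """
    scores = sigmoid(logits) if multi_label else softmax(logits)
    ranked = sorted(zip(PRODUCT_TYPES, scores), key=lambda p: p[1], reverse=True)
    if multi_label:
        return [(name, s) for name, s in ranked if s >= threshold]
    return ranked[:1]


if __name__ == "__main__":
    example_logits = [2.0, 0.5, -1.0]  # made-up logits for illustration
    print(predict_product_types(example_logits, multi_label=True))
    print(predict_product_types(example_logits, multi_label=False))
```

With sigmoid scoring a query can keep several product types at once, which suits the multi-label setting; softmax forces the classes to share a single unit of probability mass.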
19
Without Query Classification
After we understand the query “lemon” as a fruit
20
6% higher accuracy
More accurate
6X faster
Equal
21
samsung 850 evo 250gb 2.5 inch
Television Stands: 0.32; Laptop Computers: 0.27; Hard Drives: 0.11
Hard Drives: 1.00
22
23
Different TensorFlow and NumPy seeds
24
variance, particularly on the low traffic queries
stable but less accurate
interdependent across classes and less stable
than click
25
26
27
Query: faded glory long sleeve shirts for women
Query tokens tagged with Attribute Names: Faded Glory | Long sleeve | shirts | women | for
28
blue women levis jeans
blue → Color; women → Gender; levis → Brand; jeans → Product Type
toys for girls 3 – 6 years
toys → Product Type; girls → Gender; 3 – 6 → Age Value; years → Age Unit
query, “canopy tents for outside”?
29
word2vec Features for CRF
https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
Linear Chain CRF query tokens
Char embedding
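How a linear-chain CRF picks one tag per query token can be sketched with Viterbi decoding. This is a minimal illustration under stated assumptions: the emission scores stand in for whatever the network produces from the word2vec and character features, and all numeric values, the tag list, and the function name are hypothetical.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(tags)
    best = list(emissions[0])  # best path score ending in each tag so far
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            candidates = [best[i] + transitions[i][j] for i in range(n_tags)]
            i_best = max(range(n_tags), key=lambda i: candidates[i])
            new_best.append(candidates[i_best] + emissions[t][j])
            pointers.append(i_best)
        best = new_best
        backpointers.append(pointers)

    # Trace the best path back from the best final tag.
    path = [max(range(n_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]


if __name__ == "__main__":
    tags = ["Brand", "Product Type", "NULL"]
    # Made-up scores for the two tokens of the query "samsung tv".
    emissions = [[2.0, 0.1, 0.5],
                 [0.2, 1.5, 0.3]]
    # Made-up transition scores: Brand followed by Product Type is favoured.
    transitions = [[0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]]
    print(viterbi_decode(emissions, transitions, tags))
```

The transition scores are what distinguish a CRF from tagging each token independently: a tag can win at one position because it fits well with its neighbours, even when its own emission score is not the highest.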
30
G P U
word2vec
word2vec type learnt on character sequence
31
sansung tv → Brand, Product Type
sansung tv → NULL, Product Type
32
Before
After understanding the Gender token
33
samsung tv 32 in
32 in vizio tv
sanyo flat screen tv
led tv sony 55”
samsung tv stand
sony tv remote
screen size) that customers look for in a product type query (e.g. TV)
34
35
Input → Embedding → Concatenation → Neural Transformation → Transformed feature → Relevance Score
36
Input → Embedding → Neural Transformation (shared weights) → Query, item embeddings → Relevance Score
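The shared-weights design can be sketched as a two-tower scorer in which the query and the item title pass through the same embedding table and the same transformation before being compared. Everything concrete here is an assumption for illustration: the toy 2-dimensional embeddings, the weight matrix `W`, the averaging of token vectors, and the use of cosine similarity as the relevance score.

```python
import math

# Hypothetical 2-dimensional token embeddings (real ones would be learnt).
EMBEDDINGS = {
    "brooks": [0.9, 0.1],
    "shoes": [0.2, 0.8],
    "running": [0.3, 0.7],
    "tv": [-0.8, 0.1],
    "stand": [-0.5, -0.4],
}

# One weight matrix shared by both towers (values are made up).
W = [[1.0, 0.5],
     [-0.5, 1.0]]


def encode(tokens):
    """Embed tokens, average them, then apply the shared transformation."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0]
    avg = [sum(col) / len(vectors) for col in zip(*vectors)]
    return [math.tanh(sum(w * x for w, x in zip(row, avg))) for row in W]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def relevance(query, item_title):
    """Score a query against an item title with the same encoder for both."""
    return cosine(encode(query.lower().split()), encode(item_title.lower().split()))


if __name__ == "__main__":
    print(relevance("Brooks shoes", "Brooks running shoes"))
    print(relevance("Brooks shoes", "TV stand"))
```

Because the item tower does not depend on the query, item embeddings in a design like this can be computed ahead of time, leaving only the query encoding and the similarity computation for runtime.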
37
Figure: input embedding of the query (token 1 ... token n) and the item title (token 1 ... token n)
38
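*Position bias correction for CTR of a query, item pair
$$\mathrm{ctr} = \frac{\sum_r \mathrm{clicks\_corrected}_r}{\sum_r \mathrm{impressions}_r}$$
$$\mathrm{clicks\_corrected}_r = \mathrm{clicks}_r + (\mathrm{impressions}_r - \mathrm{clicks}_r) \cdot P(\mathrm{click} \mid r)$$
$r$ = rank at which the item was displayed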
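A small sketch of this correction, reading the slide's formula as clicks_corrected_r = clicks_r + (impressions_r - clicks_r) * P(click | r) and ctr as the sum of corrected clicks over the sum of impressions. The per-rank click probabilities and the traffic numbers below are invented for illustration; the deck does not say how P(click | r) is estimated.

```python
def corrected_ctr(stats, p_click_given_rank):
    """Position-bias-corrected CTR for one (query, item) pair.

    stats              : list of (rank, impressions, clicks) tuples, one per
                         rank at which the item was displayed
    p_click_given_rank : dict mapping rank -> P(click | rank), assumed given
    """
    total_corrected = 0.0
    total_impressions = 0
    for rank, impressions, clicks in stats:
        p = p_click_given_rank[rank]
        # clicks_corrected_r = clicks_r + (impressions_r - clicks_r) * P(click | r)
        total_corrected += clicks + (impressions - clicks) * p
        total_impressions += impressions
    if total_impressions == 0:
        return 0.0
    return total_corrected / total_impressions


if __name__ == "__main__":
    # Hypothetical traffic: shown 100 times at rank 1 and 50 times at rank 5.
    stats = [(1, 100, 20), (5, 50, 2)]
    # Hypothetical per-rank click probabilities.
    p_click = {1: 0.3, 5: 0.05}
    print(corrected_ctr(stats, p_click))
```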
39
Brooks shoes
40
Figure: NDCG@10 lift against baseline, Design 1 vs. Design 2 (axis 0% to 70%)
Figure: Pair Accuracy lift against baseline, Design 1 vs. Design 2 (axis 30% to 36%)
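For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them. The deck does not give its exact definitions, so the linear-gain form of DCG, the handling of tied labels, and the function names are assumptions.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top k results (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal (sorted) ordering.

    relevances: graded relevance labels in the order the ranker returned them.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def pair_accuracy(scores, labels):
    """Fraction of item pairs with different labels that the model orders
    the same way as the labels do. Pairs with equal labels are skipped."""
    correct = total = 0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] == labels[j]:
                continue
            total += 1
            if (scores[i] - scores[j]) * (labels[i] - labels[j]) > 0:
                correct += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    print(ndcg_at_k([3, 2, 1]))   # ranker returned the ideal order
    print(ndcg_at_k([1, 2, 3]))   # ranker returned the reverse order
    print(pair_accuracy([0.9, 0.2, 0.5], [2, 0, 1]))
```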
41
42
Predicted Attributes
43
more accurate
harder than predicting product type
a well-established startup
token-based approach
44
45
Web Search
E-commerce Search
Conversational commerce
Seamless search and personalized results
V-Commerce
46