{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.39556286,
    "hits": [
      {
        "_index": "starwars",
        "_type": "quotes",
        "_id": "1",
        "_score": 0.39556286,
        "_source": {
          "quote": "These are <em>not</em> the droids you are looking for."
        }
      }
    ]
  }
}

POST /starwars/_search
{
  "query": {
    "match": {
      "quote": {
        "query": "van",
        "fuzziness": "AUTO"
      }
    }
  }
}

{
  "took": 14,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.18155496,
    "hits": [
      {
        "_index": "starwars",
        "_type": "quotes",
        "_id": "2",
        "_score": 0.18155496,
        "_source": {
          "quote": "Obi-Wan never told you what happened to your father."
        }
      }
    ]
  }
}
SELECT * FROM starwars WHERE quote LIKE '_an' OR quote LIKE 'V_n' OR quote LIKE 'Va_'
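The LIKE patterns above amount to "accept any term within one edit of van". A minimal Python sketch of that idea follows. It is not how Elasticsearch implements fuzzy matching internally, and the function names `levenshtein`, `auto_fuzziness` and `fuzzy_match` are made up for illustration. It assumes the default AUTO thresholds (terms of 0 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) and uses plain Levenshtein distance, without counting a transposition as a single edit.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Edits allowed for "fuzziness": "AUTO", assuming the default thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query_term: str, indexed_term: str) -> bool:
    """True if indexed_term is within the edit budget of query_term."""
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)


if __name__ == "__main__":
    # "van" is one substitution away from the indexed term "wan"
    print(fuzzy_match("van", "wan"))
```

With a three-letter query the budget is one edit, which is why "van" finds the Obi-Wan quote but no other term in that quote.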
Scoring
MongoDB
> db.starwars.find({ $text: { $search: "droid" }}, {score: {$meta: "textScore"}})
{
  "_id": ObjectId("57f2d54de814412463c3adef"),
  "quote": "These are not the droids you are looking for.",
  "score": 0.75
}
Fetched 1 record(s) in 14ms
One Term
https://github.com/mongodb/mongo/blob/v3.2/src/mongo/db/fts/fts_spec.cpp#L219
double coeff = (0.5 * data.count / numTokens) + 0.5;
data.count: matches
numTokens: stemmed words
Search for droid
"These are not the droids you are looking for."
droid look == 1 match, 2 tokens
coeff: 0.5 * 1 / 2 + 0.5 = 0.75

Search for father
"No. I am your father."
father == 1 match, 1 token
coeff: 0.5 * 1 / 1 + 0.5 = 1.0

Search for father
"Obi-Wan never told you what happened to your father."
obi wan never told happen father == 1 match, 6 tokens
coeff: 0.5 * 1 / 6 + 0.5 = 0.5833
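The three one-term examples can be checked with a few lines of Python. This is only a sketch that mirrors the `coeff` line quoted from fts_spec.cpp; the function name and the `examples` dictionary are illustrative, and stop-word removal and stemming are assumed to have happened already, so the token counts are passed in directly.

```python
def coeff(count: int, num_tokens: int) -> float:
    """Mirror of `(0.5 * data.count / numTokens) + 0.5` from fts_spec.cpp.

    count:      how often the search term matches in the field
    num_tokens: number of stemmed, non-stop-word tokens in the field
    """
    return (0.5 * count / num_tokens) + 0.5


# (matches, tokens) taken from the three quotes above
examples = {
    "droid in 'droid look'": (1, 2),
    "father in 'father'": (1, 1),
    "father in 'obi wan never told happen father'": (1, 6),
}

if __name__ == "__main__":
    for label, (count, num_tokens) in examples.items():
        print(f"{label}: {coeff(count, num_tokens):.4f}")
```

The coefficient always lies between 0.5 and 1.0: a term that makes up the whole field scores 1.0, and the value drops toward 0.5 as the field gets longer.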
> db.starwars.find({ $text: { $search: "obi-wan" }}, {score: {$meta: "textScore"}})
{
  "_id": ObjectId("57f2d56fe814412463c3adf0"),
  "quote": "Obi-Wan never told you what happened to your father.",
  "score": 1.1666666666666667
}
Fetched 1 record(s) in 6ms
Multiple Terms
https://github.com/mongodb/mongo/blob/v3.2/src/mongo/db/fts/fts_spec.cpp#L228
score += (weight * data.freq * coeff * adjustment);
weight: method parameter
data.freq, adjustment: 1
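Search for obi-wan (term obi)
obi wan never told happen father == 1 match, 6 tokens
coeff: 0.5 * 1 / 6 + 0.5 = 0.5833

Search for obi-wan (term wan)
obi wan never told happen father == 1 match, 6 tokens
coeff: 0.5 * 1 / 6 + 0.5 = 0.5833

Search for obi-wan
score: 1 * 1 * 0.5833 * 1 = 0.5833 per term
Sum: 0.5833 + 0.5833 = 1.1667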
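The multi-term case can be sketched the same way. The snippet below is an illustration, not MongoDB's code: `text_score` and its `matches` argument are hypothetical names, and it assumes a field weight of 1 together with `data.freq` and `adjustment` equal to 1, as stated on the slide.

```python
def coeff(count: int, num_tokens: int) -> float:
    """Mirror of `(0.5 * data.count / numTokens) + 0.5` from fts_spec.cpp."""
    return (0.5 * count / num_tokens) + 0.5


def text_score(matches, num_tokens, weight=1.0, adjustment=1.0):
    """Sum `weight * freq * coeff * adjustment` over all matched terms.

    matches maps each matched term to a (count, freq) pair.
    weight and adjustment default to 1 (an assumption taken from the slide).
    """
    return sum(
        weight * freq * coeff(count, num_tokens) * adjustment
        for count, freq in matches.values()
    )


# "obi-wan" is tokenized into two terms; each matches once in a 6-token field
obi_wan_score = text_score({"obi": (1, 1.0), "wan": (1, 1.0)}, num_tokens=6)

if __name__ == "__main__":
    print(obi_wan_score)
```

Each matched term contributes its own coefficient, so a query that hits two terms of the same quote scores roughly twice as high as a query that hits one.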
Elasticsearch
Term Frequency / Inverse Document Frequency (TF/IDF)
Search one term

BM25
https://speakerdeck.com/elastic/improved-text-scoring-with-bm25
Term Frequency
Inverse Document Frequency
Field-Length Norm
Putting it Together
score(q,d) = queryNorm(q) · coord(q,d) · ∑ (t in q) ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) )
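To make the formula concrete, here is a hedged Python sketch of Lucene's classic TF/IDF scoring. It assumes the classic similarity definitions: tf is the square root of the term frequency, idf is 1 + ln(numDocs / (docFreq + 1)), the field-length norm is 1 / sqrt(number of terms), queryNorm is 1 / sqrt(sum of squared term weights) and coord is the fraction of query terms that match. The function `practical_score` and its parameters are invented for this example. Real Lucene also stores the norm in a lossy one-byte encoding, so actual scores differ slightly.

```python
import math
from collections import Counter


def tf(freq: int) -> float:
    """Term frequency: more occurrences score higher, with diminishing returns."""
    return math.sqrt(freq)


def idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency: rare terms weigh more than common ones."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms: int) -> float:
    """Field-length norm: a match in a short field counts more."""
    return 1.0 / math.sqrt(num_terms)


def practical_score(query_terms, doc_tokens, doc_freqs, num_docs, boosts=None):
    """score(q,d) for one document, following the formula above."""
    boosts = boosts or {}
    if not query_terms or not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    norm = field_norm(len(doc_tokens))
    idfs = {t: idf(num_docs, doc_freqs.get(t, 0)) for t in query_terms}

    # queryNorm(q): 1 / sqrt(sum of squared (idf * boost) weights)
    squared = sum((idfs[t] * boosts.get(t, 1.0)) ** 2 for t in query_terms)
    if squared == 0:
        return 0.0
    query_norm = 1.0 / math.sqrt(squared)

    # coord(q,d): fraction of the query terms found in the document
    matched = [t for t in query_terms if counts[t] > 0]
    coord = len(matched) / len(query_terms)

    total = sum(
        tf(counts[t]) * idfs[t] ** 2 * boosts.get(t, 1.0) * norm
        for t in matched
    )
    return query_norm * coord * total


if __name__ == "__main__":
    quote = "obi wan never told you what happened to your father".split()
    print(practical_score(["father"], quote, {"father": 2}, num_docs=3))
```

For a one-term query queryNorm cancels one factor of idf and coord is 1, so the score reduces to tf · idf · norm.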