How Elasticsearch powers the Guardian's newsroom


  1. How Elasticsearch powers the Guardian’s newsroom. Shay Banon (@kimchy), creator, co-founder and CTO, Elasticsearch; Graham Tackley (@tackers), director of architecture, Guardian News and Media

  2. “created in 1936 ... to secure the financial and editorial independence of the Guardian in perpetuity”

  3. our in-house real-time traffic tool

  4. production apaches my desktop workstation something ? htmly

  5. ssh $SERVER "nice tail -f /apache2/logs/guardian-access_log"
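What the receiving end of that `tail` stream might do with each line can be sketched as below. This is an illustration only, not the original tool: it assumes the access log uses Apache's common or combined format, and `count_paths` is a hypothetical name.

```python
import re
from collections import Counter

# Matches the quoted request in a common/combined log line,
# e.g. "GET /world?x=1 HTTP/1.1", and captures the request target.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def count_paths(lines):
    """Count hits per path (query string dropped) from access-log lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts
```

Fed a finite batch of lines, `count_paths(lines).most_common(10)` gives the ten most requested paths; a live `tail -f` stream would need the counts reported periodically instead.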

  6. 2 x production apaches my desktop workstation ssh “tail” SEO zeromq publisher dashboard x

  7. x my desktop workstation

  8. Javascript in browser hidden pixel Tracker SNS SQS Dashboard

  9. Javascript in browser hidden pixel Tracker SNS SQS SQS Serf Dashboard elasticsearch Dashboard

  10. 12 * m3.xlarge instance store (SSD) in an autoscaling group (with manual scaling) https://github.com/guardian/status-app

  11. { ⇠ count per minute "dt": "2014-03-03T02:01:48.026Z", "url": "http://www.theguardian.com/film/2014/mar/03/oscars-2014-winners-list", "queryString": "", "host": "www.theguardian.com", ⇠ filter "path": "/film/2014/mar/03/oscars-2014-winners-list", "section": "film", "platform": "r2", "userAgent": { "type": "Browser", "family": "Safari 5.1.9", "os": "OS X 10.6.8", "device": "Personal computer" }, "documentReferrer": "http://www.theguardian.com/world", "browser": { "id": "gA6RUFLhWNQvWdt0rW4r78Fg", "isNew": false }, ⇠ filter "referringHost": "theguardian.com", "referringPath": "/world", "isContent": true, "contentPublicationDate": "2014-03-03", "countryCode": "US", "countryName": "United States", "location": { "lonlat": [-73.4409, 41.2094] } }
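The "count per minute" annotation on `dt` can be pictured as truncating each timestamp to the minute and counting the documents that land in each bucket. The sketch below does this in plain Python, assuming timestamps in the format shown on the slide; `count_per_minute` is a hypothetical name, and in the talk this bucketing is done by Elasticsearch rather than by client code.

```python
from collections import Counter
from datetime import datetime

# Timestamp layout taken from the "dt" field on the slide.
DT_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def count_per_minute(timestamps):
    """Return a Counter mapping each minute (as a datetime) to its hit count."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, DT_FORMAT)
        # Drop seconds and microseconds so all hits in a minute share a key.
        counts[dt.replace(second=0, microsecond=0)] += 1
    return counts
```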

  12. { "query" : { "filtered" : { "query" : { "match_all" : { } }, "filter" : { "term" : { "path" : "/film/2014/mar/03/oscars-2014-winners-list" } } } }, …

  13. … "facets": { "Reddit": { "date_histogram": { "field": "dt", "interval": "1m" }, "facet_filter": { "term": { "referringHost": "reddit.com" } } }, "Facebook": { "date_histogram": { "field": "dt", "interval": "1m" }, "facet_filter": { "term": { "referringHost": "facebook.com" } } }, "Google": { "date_histogram": { "field": "dt", "interval": "1m" }, "facet_filter": { "or": { "filters": [ { "prefix": { "referringHost": "www.google." } }, { "prefix": { "referringHost": "news.google." } } ] } } } } }
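The request body on slides 12 and 13 repeats the same per-minute histogram for each referrer, so it can be assembled from a small helper. The sketch below builds the same structure as a Python dictionary; `minute_facet` and `build_referrer_query` are hypothetical names, and no Elasticsearch client is involved. It uses the `filtered` query and facet syntax exactly as the slides show them; later Elasticsearch releases replaced facets with aggregations.

```python
def minute_facet(host_filter):
    """One facet: hits per minute on "dt", restricted by a referrer filter."""
    return {
        "date_histogram": {"field": "dt", "interval": "1m"},
        "facet_filter": host_filter,
    }


def build_referrer_query(path):
    """Build the slide's request body for a single content path."""
    google = {
        "or": {
            "filters": [
                {"prefix": {"referringHost": "www.google."}},
                {"prefix": {"referringHost": "news.google."}},
            ]
        }
    }
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {"path": path}},
            }
        },
        "facets": {
            "Reddit": minute_facet({"term": {"referringHost": "reddit.com"}}),
            "Facebook": minute_facet({"term": {"referringHost": "facebook.com"}}),
            "Google": minute_facet(google),
        },
    }
```

Passing the result through `json.dumps` gives a body of the same shape as the one on the slides.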

  14. /graph/breakdown?section=commentisfree

  15. ?section=commentisfree ophan.StandardFilters ophan.StandardFiltersToElasticsearch org.elasticsearch.index. query.FilterBuilder

  16. { "query" : { "filtered" : { "query" : { "match_all" : { } }, "filter" : { "term" : { "path" : "/film/2014/mar/03/oscars-2014-winners-list" } } } }, …

  17. "filter": { "and": { "filters": [ { "range": { "dt": { "from": "2014-03-03T00:00:00.000Z", "to": "2014-03-03T22:30:59.999Z", "include_lower": true, "include_upper": false } } }, { "not": { "filter": { "term": { "countryCode": "GNM" } } } }, { "not": { "filter": { "term": { "userAgent.type": "Robot" } } } }, { "filter": { "terms": { "section": [ "commentisfree" ] }} } ] } }
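Slides 14 to 17 describe a request's query string being turned into a set of standard filters and then into an Elasticsearch filter. A rough Python sketch of that translation follows. It is not the Ophan code (the slides name `ophan.StandardFilters` and `ophan.StandardFiltersToElasticsearch`); `standard_filters` is a hypothetical function, and the time range is passed in explicitly as an assumption. The section clause is written as a plain `terms` filter, without the extra wrapper object shown on the slide.

```python
from urllib.parse import parse_qs


def standard_filters(query_string, start, end):
    """Translate a dashboard query string into an "and" filter.

    Always applies the time range and the two exclusions from the slide
    (countryCode "GNM", userAgent.type "Robot"); adds a section clause
    only when the query string names one or more sections.
    """
    params = parse_qs(query_string.lstrip("?"))
    clauses = [
        {"range": {"dt": {"from": start, "to": end,
                          "include_lower": True, "include_upper": False}}},
        {"not": {"filter": {"term": {"countryCode": "GNM"}}}},
        {"not": {"filter": {"term": {"userAgent.type": "Robot"}}}},
    ]
    sections = params.get("section", [])
    if sections:
        clauses.append({"terms": {"section": sections}})
    return {"and": {"filters": clauses}}
```

For example, `standard_filters("?section=commentisfree", "2014-03-03T00:00:00.000Z", "2014-03-03T22:30:59.999Z")` yields four clauses, the last of which restricts `section` to `commentisfree`.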

  18. Thank you. Shay Banon (@kimchy), creator, co-founder and CTO, Elasticsearch; Graham Tackley (@tackers), director of architecture, Guardian News and Media
