Elastic Search - Aditi Choksi (EW18455)


  1. Elastic Search - Aditi Choksi (EW18455)

  2. Elastic Search • Search engine • Distributed search • Full text Search • Near real time search

  3. Evolution of Data • Size of data being generated and stored has grown exponentially over the past few decades.

  4. Need for Distributed Data Systems • Vertical scaling: increase machine size • Horizontal scaling: add more machines • Elastic Search sends a query to every node (machine), then collects and combines their results to return to the user.
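The query fan-out described on this slide can be sketched as a scatter-gather loop. This is a toy illustration, not Elasticsearch code: the three in-memory "nodes", the sample documents, and the names `search_node` and `scatter_gather` are all assumptions made for the example, and scoring is just a raw term count.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each "node" is a dict of doc id -> text (sample data, not from the deck).
NODES = [
    {1: "winter is coming"},
    {2: "ours is the fury"},
    {3: "the choice is yours"},
]

def search_node(node, term):
    """Return (score, doc_id) pairs for the documents on one node that contain term."""
    hits = []
    for doc_id, text in node.items():
        count = text.split().count(term)  # toy score: occurrences of the term
        if count:
            hits.append((count, doc_id))
    return hits

def scatter_gather(nodes, term, top_k=10):
    """Scatter the query to every node in parallel, then gather and merge the results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda node: search_node(node, term), nodes))
    merged = [hit for part in partials for hit in part]
    # Highest score first; ties broken by lower doc id.
    merged.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in merged[:top_k]]

print(scatter_gather(NODES, "is"))   # [1, 2, 3]
print(scatter_gather(NODES, "the"))  # [2, 3]
```

Adding a machine here only means appending another dict to `NODES`; the merge step stays the same, which is the appeal of horizontal scaling.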

  5. Elastic Search Cluster [Diagram: shards 1, 2 and 3 distributed across the cluster, each shard appearing twice]

  6. Lucene Index Segment

  7. Inverted Indexes
       Term      Frequency  Documents
       choice    1          3
       coming    1          1
       contours  2          2, 3
       fury      1          2
       is        3          1, 2, 3
       ours      1          2
       the       2          2, 3
       winter    1          1
       yours     1          3
     (The Term and Frequency columns form the dictionary; the Documents column holds the postings.)
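A minimal sketch of how a dictionary-plus-postings structure like the one on this slide could be built and queried. The three sample documents and the helper names `build_inverted_index` and `search_all` are assumptions for illustration; the deck does not show its source documents, and a real Lucene index stores far more than this.

```python
from collections import defaultdict

# Sample documents (an assumption; the deck does not list its documents).
DOCS = {
    1: "Winter is coming",
    2: "Ours is the fury",
    3: "The choice is yours",
}

def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Sorting the terms gives the "dictionary"; each value is that term's postings list.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

def search_all(index, *terms):
    """Return ids of documents containing every term, by intersecting postings lists."""
    sets = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []

index = build_inverted_index(DOCS)
for term, ids in index.items():
    # Columns: term, document frequency, postings.
    print(f"{term:<8} {len(ids)}  {ids}")

print(search_all(index, "the", "fury"))  # [2]
```

A lookup touches only the postings of the queried terms, never the documents themselves, which is why term queries stay fast as the collection grows.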

  8. Inverted Indexes (same table as slide 7, shown again with the postings 2 and 2, 3 highlighted)

  9. Wild Card Queries • Wild card searches are difficult • These are unindexed queries • So searching for something like *our* requires going through all the terms of the index.
       Term      Frequency
       choice    1
       coming    1
       contours  2
       fury      1
       is        3
       ours      1
       the       2
       winter    1
       yours     1

  10. Question • Can you think of a way to make queries like *ours efficient? What kind of index can we create?
       Term      Frequency
       choice    1
       coming    1
       contours  2
       fury      1
       is        3
       ours      1
       the       2
       winter    1
       yours     1

  11. Question • Can you think of a way to make queries like *ours efficient? What kind of index can we create? • Reverse Indexing: *ours → sruo* • search(our*) union search(sruo*)
       Term      Reversed word
       choice    eciohc
       coming    gnimoc
       contours  sruotnoc
       fury      yruf
       is        si
       ours      sruo
       the       eht
       winter    retniw
       yours     sruoy
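The reverse-indexing idea can be sketched with two sorted term lists and binary search: a leading-wildcard query such as *ours becomes a prefix query sruo* on the reversed terms. The function names below are hypothetical, and the `"\uffff"` upper bound is a shortcut that assumes no term contains that character; this is an illustration of the technique, not how Elasticsearch implements it.

```python
from bisect import bisect_left

# Terms taken from the slide's dictionary.
TERMS = ["choice", "coming", "contours", "fury", "is", "ours", "the", "winter", "yours"]

FORWARD = sorted(TERMS)                     # normal term dictionary
REVERSED = sorted(t[::-1] for t in TERMS)   # dictionary of reversed terms

def prefix_range(sorted_terms, prefix):
    """Return every entry starting with prefix, using binary search instead of a scan."""
    lo = bisect_left(sorted_terms, prefix)
    hi = bisect_left(sorted_terms, prefix + "\uffff")  # just past the last match
    return sorted_terms[lo:hi]

def prefix_search(prefix):
    """Trailing wildcard, e.g. our* -> terms that start with 'our'."""
    return prefix_range(FORWARD, prefix)

def suffix_search(suffix):
    """Leading wildcard, e.g. *ours -> prefix search for 'sruo' on the reversed terms."""
    matches = prefix_range(REVERSED, suffix[::-1])
    return sorted(m[::-1] for m in matches)

def infix_scan(fragment):
    """Wildcards on both sides, e.g. *our*: neither index helps, so scan every term."""
    return [t for t in FORWARD if fragment in t]

print(prefix_search("our"))    # ['ours']
print(suffix_search("ours"))   # ['contours', 'ours', 'yours']
print(sorted(set(prefix_search("our")) | set(suffix_search("ours"))))
```

The cost is a second dictionary roughly the size of the first, in exchange for turning a full scan of the terms into two binary searches.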

  12. Bottom up • Indexes are immutable: segments are written once, and obsolete entries are cleaned up only when segments are merged. [Diagram: shards 1, 2 and 3 distributed across the cluster]
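A rough sketch of why cleanup happens at merge time: existing segments are never modified, a delete is only recorded on the side, and a merge writes a new segment that leaves the deleted entries out. The segment layout (term mapped to a tuple of doc ids), the sample data, and the name `merge_segments` are assumptions for illustration, not Lucene's actual file format or API.

```python
def merge_segments(segments, deleted_ids):
    """Build one new segment from several immutable ones, skipping deleted documents."""
    merged = {}
    for segment in segments:
        for term, doc_ids in segment.items():
            live = [d for d in doc_ids if d not in deleted_ids]
            if live:
                merged.setdefault(term, set()).update(live)
    # Terms whose documents were all deleted disappear from the new segment.
    return {term: tuple(sorted(ids)) for term, ids in sorted(merged.items())}

# Two hypothetical segments; tuples stand in for data that is never edited in place.
seg_a = {"winter": (1,), "is": (1, 2), "fury": (2,)}
seg_b = {"is": (3,), "choice": (3,)}

deleted = {2}  # document 2 was deleted after seg_a was written

merged = merge_segments([seg_a, seg_b], deleted)
print(merged)  # {'choice': (3,), 'is': (1, 3), 'winter': (1,)}
print(seg_a)   # unchanged: the old segment is simply discarded after the merge
```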

  13. References • [1] Reaz Ahmed, R. Boutaba, 2011, “A Survey of Distributed Search Techniques in Large Scale Distributed Systems”, IEEE Communications Surveys and Tutorials • [2] Enrico Nardelli, Fabio Barillari, 2015, “Distributed Searching of Multi-dimensional Data” • [3] ShaoHua Liu, Xing Xue, 2016, “Distributed Database Query Based on Improved Genetic Algorithm”, 3rd International Conference on Information Science and Control Engineering • [4] Clinton Gormley, Zachary Tong, 2015, “Elasticsearch: The Definitive Guide” • https://www.youtube.com/watch?v=lWKEphKIG8U

  14. Thanks ☺
