There is no dichotomy between effectiveness and efficiency in keyword search over databases

Vahid Ghadakchi, Arash Termehchy
IDEA Lab, School of Electrical Engineering and Computer Science

Aug 30, 2022
Most users cannot express their intent over databases
• Most users are not familiar with SQL, the schema, or the exact content of the database
[Figure: a keyword query interface over the Movie (ID, Title, DID) and Director tables; the intent "Dark Knight Trilogy" is entered as the keyword query "Batman Dark Knight" and a list of results is returned]
Keyword queries are inherently vague
[Figure: the intent "Dark Knight Trilogy" is entered as the keyword query "Batman Dark Knight". Relevant answers: 1- Batman Begins, 2- Dark Knight, 3- Dark Knight Rises. Returned results (Title, Director): "Reviews: Batman Dark Knight" (Antwiller), "The Dark Knight Movie Review" (Rodriguez), "Dark Knight" (Nolan), "Dark Knight Parody" (Bane), "Dark Knight Aurora" (Lopez)]
• Precision = 1/5
• Recall = 1/3
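The precision and recall figures on this slide can be checked with a small sketch. The `returned` list below is a reading of the garbled result table on the slide and should be treated as an assumption; `precision_recall` is an illustrative helper, not part of the authors' system.

```python
from fractions import Fraction

def precision_recall(returned, relevant):
    """Return (precision, recall) as exact fractions."""
    returned_set, relevant_set = set(returned), set(relevant)
    hits = returned_set & relevant_set
    precision = Fraction(len(hits), len(returned_set)) if returned_set else Fraction(0)
    recall = Fraction(len(hits), len(relevant_set)) if relevant_set else Fraction(0)
    return precision, recall

# The three movies the user actually wants (from the slide).
relevant = ["Batman Begins", "Dark Knight", "Dark Knight Rises"]

# Assumed reading of the five returned titles; only "Dark Knight" is relevant.
returned = [
    "Reviews: Batman Dark Knight",
    "The Dark Knight Movie Review",
    "Dark Knight",
    "Dark Knight Parody",
    "Dark Knight Aurora",
]

precision, recall = precision_recall(returned, relevant)
print(precision, recall)  # prints: 1/5 1/3
```

One relevant title among five returned gives a precision of 1/5, and one of three relevant titles found gives a recall of 1/3.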
Keyword query interfaces have low efficiency
[Figure: to answer a keyword query such as "Batman Dark Knight", the interface searches joins of several tables, for example Movie (ID, Title, DID) ⋈ Plot (PID, Text) and Movie (ID, Title, DID) ⋈ Actor (AID, Name) ⋈ Characters (AID, CID, Character)]
Leveraging the query distribution
• The probability of a tuple being a relevant answer to a query follows a Zipfian distribution
• A small subset of the tuples contains most of the relevant answers
• Solution: build an effective subset from the tuples with high probability
[Figure: Wikipedia tuple probabilities and Wikipedia subset size]
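To see why a Zipfian distribution makes a small subset attractive, the sketch below computes how much probability mass the most probable tuples hold. The number of tuples (10,000) and the exponent (s = 1) are illustrative assumptions, not values measured on the Wikipedia data.

```python
def zipf_probabilities(n, s=1.0):
    """Probabilities proportional to 1 / rank**s for ranks 1..n."""
    weights = [1.0 / rank ** s for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def top_fraction_mass(probabilities, fraction):
    """Total probability held by the top `fraction` of tuples."""
    k = max(1, round(len(probabilities) * fraction))
    return sum(sorted(probabilities, reverse=True)[:k])

probs = zipf_probabilities(10_000)  # assumed size and exponent
for fraction in (0.01, 0.02, 0.03):
    mass = top_fraction_mass(probs, fraction)
    print(f"top {fraction:.0%} of tuples hold {mass:.2f} of the probability mass")
```

Under these assumptions the top 1% of tuples already hold roughly half of the total probability, which is the intuition behind answering most queries from a small subset.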
The algorithm to pick the effective subset
1. Compute the probability of each tuple based on past interactions
2. Sort the tuples by their probability
3. Build subsets of the database of different sizes from the tuples with high probability
4. Use a sample of the query workload to pick an effective subset
[Figure: nested subsets containing 1%, 2%, 3%, ..., 100% of the tuples]
• The effective subset is much smaller than the full database, so it makes query answering more efficient while also increasing the average precision
• The effective subset does not include all the tuples
  – This may decrease recall and cause problems with long-tail queries
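The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: past interactions are modeled as a list of clicked tuple ids, `search` is a toy term-overlap ranker standing in for the real search system, and subsets are scored by mean reciprocal rank (MRR) on the sample workload. All function names are hypothetical.

```python
from collections import Counter

def tuple_probabilities(interactions, tuple_ids):
    """Step 1: estimate each tuple's probability from past clicks (assumed signal)."""
    counts = Counter(interactions)
    total = sum(counts.values()) or 1
    return {t: counts[t] / total for t in tuple_ids}

def nested_subsets(probabilities, fractions):
    """Steps 2-3: sort by probability and cut nested prefixes of the ranking."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return {f: ranked[:max(1, round(f * len(ranked)))] for f in fractions}

def search(query, subset, texts):
    """Toy ranker: order tuples in `subset` by number of matching query terms."""
    terms = set(query.lower().split())
    scored = []
    for t in subset:
        overlap = len(terms & set(texts[t].lower().split()))
        if overlap:
            scored.append((-overlap, t))
    return [t for _, t in sorted(scored)]

def mean_reciprocal_rank(workload, subset, texts):
    """Average of 1/rank of the first relevant answer (0 if none is returned)."""
    total = 0.0
    for query, relevant in workload:
        ranking = search(query, subset, texts)
        rank = next((i for i, t in enumerate(ranking, 1) if t in relevant), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(workload)

def pick_effective_subset(interactions, texts, workload, fractions):
    """Step 4: keep the candidate subset with the best MRR on the sample workload."""
    probs = tuple_probabilities(interactions, texts)
    subsets = nested_subsets(probs, fractions)
    return max(subsets.items(),
               key=lambda kv: mean_reciprocal_rank(workload, kv[1], texts))

# Tiny made-up example: four tuples, six past clicks, two sample queries.
texts = {"a": "dark knight parody", "b": "the dark knight",
         "c": "batman begins", "d": "dark knight review"}
interactions = ["b", "b", "b", "c", "c", "a"]
workload = [("dark knight", {"b"}), ("batman", {"c"})]
fraction, subset = pick_effective_subset(interactions, texts, workload, [0.5, 1.0])
print(fraction, subset)
```

In this toy data the half-size subset wins because the rarely clicked distractors ("parody", "review") no longer compete with the relevant tuple. When two subsets tie, `max` keeps the first one, so listing fractions in ascending order prefers the smaller subset.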
How we handle recall and long-tail queries
• Recall: the effective subset can preserve recall while maintaining high precision
• Long-tail queries: our system uses a machine learning technique to send long-tail queries to the full database
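The slide does not say which machine learning technique routes the queries, so the sketch below is only a stand-in that shows where such a router sits. It replaces the learned model with a simple assumed rule: a query goes to the effective subset only if enough of its terms occur in the subset's vocabulary. `route_query`, `subset_vocabulary`, and `min_coverage` are hypothetical names.

```python
def route_query(query, subset_vocabulary, min_coverage=1.0):
    """Return "subset" or "full" for a keyword query.

    Stand-in rule (not the authors' learned router): send the query to the
    full database when too few of its terms appear in the subset.
    """
    terms = query.lower().split()
    if not terms:
        return "full"
    covered = sum(1 for term in terms if term in subset_vocabulary)
    return "subset" if covered / len(terms) >= min_coverage else "full"

# Assumed vocabulary of the effective subset.
vocabulary = {"batman", "dark", "knight", "begins"}
print(route_query("Batman Dark Knight", vocabulary))  # prints: subset
print(route_query("dark knight aurora", vocabulary))  # prints: full
```

A learned classifier would replace the coverage rule with a prediction of whether the subset is likely to contain the relevant answers for the query.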
Results on real-world data and query workload
• Dataset: snapshot of Wikipedia with 12 million documents
• Query Set #1: 7000 keyword queries sampled from the MSN search engine
• Query Set #2: 150 keyword queries from the INEX competition
• Search System: Lucene over a MySQL database

                       Effective Subset    Full Database
MRR of Query Set #1    0.62                0.25
MRR of Query Set #2    0.80                0.65
Average Query Time     27 ms               205 ms
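MRR (mean reciprocal rank) is the average, over all queries, of 1/rank of the first relevant answer, so higher is better. The short calculation below only restates the reported numbers as ratios; it uses nothing beyond the values in the table.

```python
# (effective subset, full database) pairs copied from the results table.
mrr_set_1 = (0.62, 0.25)
mrr_set_2 = (0.80, 0.65)
query_time_ms = (27, 205)

mrr_gain_1 = mrr_set_1[0] / mrr_set_1[1]      # relative MRR on Query Set #1
mrr_gain_2 = mrr_set_2[0] / mrr_set_2[1]      # relative MRR on Query Set #2
speedup = query_time_ms[1] / query_time_ms[0]  # full time / subset time

print(f"MRR, Query Set #1: {mrr_gain_1:.2f}x")
print(f"MRR, Query Set #2: {mrr_gain_2:.2f}x")
print(f"Average query time: {speedup:.1f}x faster")
```

On the reported numbers the effective subset is about 7.6 times faster on average while its MRR is about 2.5 times and 1.2 times that of the full database on the two query sets.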
