The Data Exploration Game



  1. The Data Exploration Game
     Ben McCamish, Arash Termehchy, Behrouz Touri
     Information & Data Management and Analytics Laboratory (IDEA)

  2. Most users cannot precisely express their intents
     Intent the user wishes to find: Kerry Smith in CS
     Query used to express the intent: "Smith"
     Grades:
       First_Name  Last_Name  Dept.  Grade
       …           …          …      …
       Sarah       Smith      CE     A
       John        Smith      EE     B
       Kerry       Smith      CS     D
       …           …          …      …
     Results:
       First_Name  Last_Name  Dept.  Grade
       Sarah       Smith      CE     A
       John        Smith      EE     B
     • Database system returns only a subset of matching tuples

  3. But they learn by interacting with database systems
     Intent: Kerry Smith in CS
     Query: "Smith CS"
     Grades:
       First_Name  Last_Name  Dept.  Grade
       …           …          …      …
       Sarah       Smith      CE     A
       John        Smith      EE     B
       Kerry       Smith      CS     D
       …           …          …      …
     Results:
       First_Name  Last_Name  Dept.  Grade
       Kerry       Smith      CS     D
       Sarah       Smith      CE     A
     • Learning and reformulating the query allowed the user to find the desired student

  4. Database system can also learn from interactions
     Intent: Kerry Smith in CS
     Query: "Smith"
     Grades:
       First_Name  Last_Name  Dept.  Grade
       …           …          …      …
       Sarah       Smith      CE     A
       John        Smith      EE     B
       Kerry       Smith      CS     D
       …           …          …      …
     Results:
       First_Name  Last_Name  Dept.  Grade
       Kerry       Smith      CS     D
       John        Smith      EE     B
     • Database system has learned to return Kerry Smith in the CS department

  5. Naturally, data interaction is a game between two rational agents
     • Two players: user and database system
     • Final goal: the user gets the desired information
       ‣ Database system understands the intent behind the user's queries.
       ‣ User expresses intent in a way that the DBMS understands.
     • Strategy of the user is how intents are expressed using queries.
     • Strategy of the database system is how to decode queries.
     • Reward: the desired information the user receives.

  6. User may use a single query for multiple intents
     Intents:
       Intent #  Intent
       e1        John Smith in EE
       e2        Sarah Smith in CE
       e3        Kerry Smith in CS
     Queries:
       Query #  Query
       q1       "Smith CE"
       q2       "Smith"
     User Strategy (U):
             q1   q2
       e1    0    1
       e2    1    0
       e3    0    1
     Grades:
       First_Name  Last_Name  Dept.  Grade
       …           …          …      …
       Sarah       Smith      CE     A
       John        Smith      EE     B
       Kerry       Smith      CS     D
       …           …          …      …
     • Row-stochastic mapping from intents to queries
     • Due to the lack of knowledge, saving time, …
     • Makes it hard to interpret the exact intent behind the query.
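
The user strategy on this slide can be sketched in a few lines of Python. This is only an illustration: the names `U`, `is_row_stochastic`, and `choose_query` are hypothetical, and the matrix values are copied from the slide. Each row of `U` is a probability distribution over queries, so picking a query for an intent is a weighted random draw.

```python
import random

# User strategy U from the slide: rows are intents e1..e3, columns are queries q1..q2.
QUERIES = ['q1: "Smith CE"', 'q2: "Smith"']
U = [
    [0.0, 1.0],  # e1 (John Smith in EE)  -> always q2
    [1.0, 0.0],  # e2 (Sarah Smith in CE) -> always q1
    [0.0, 1.0],  # e3 (Kerry Smith in CS) -> always q2
]

def is_row_stochastic(matrix, tol=1e-9):
    """Every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )

def choose_query(intent_idx, rng):
    """Draw a query index for the given intent according to row intent_idx of U."""
    return rng.choices(range(len(QUERIES)), weights=U[intent_idx], k=1)[0]

rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
for intent_idx in range(len(U)):
    print(f"e{intent_idx + 1} ->", QUERIES[choose_query(intent_idx, rng)])
```

Because e1 and e3 both map to q2 with probability 1, the query "Smith" alone does not tell the database system which of the two intents the user has.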

  7. Database system strategy
     Queries:
       Query #  Query
       q1       "Smith CE"
       q2       "Smith"
     Database Strategy:
             e1    e2    e3
       q1    0     1     0
       q2    0.5   0     0.5
     Intent: Sarah Smith in CE
     Intents:
       Intent #  Intent
       e1        ans(y) ← Grades(x, 'Smith', 'EE', y)
       e2        ans(y) ← Grades(x, 'Smith', 'CE', y)
       e3        ans(y) ← Grades(x, 'Smith', 'CS', y)
     Grades:
       First_Name  Last_Name  Dept.  Grade
       …           …          …      …
       Sarah       Smith      CE     A
       John        Smith      EE     B
       Kerry       Smith      CS     D
       …           …          …      …
     • Row-stochastic mapping from queries to intents
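
To see how the two strategies interact, one can compute the expected reward of a single interaction. The sketch below rests on two assumptions that are not stated on the slides: intents are drawn from a uniform prior, and the reward is 1 when the intent decoded by the database system equals the user's true intent and 0 otherwise. The function name `expected_reward` is hypothetical.

```python
# Strategies copied from the slides.
# U: rows are intents e1..e3, columns are queries q1..q2.
# D: rows are queries q1..q2, columns are intents e1..e3.
U = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
D = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5]]

def expected_reward(U, D, prior=None):
    """Probability that the decoded intent matches the true intent.

    Assumes a 0/1 reward (1 only on an exact match) and, if no prior
    is given, a uniform prior over intents.
    """
    n_intents = len(U)
    if prior is None:
        prior = [1.0 / n_intents] * n_intents  # assumption: uniform prior
    total = 0.0
    for i, p_i in enumerate(prior):
        for j, u_ij in enumerate(U[i]):
            # Intent i is expressed as query j, which is decoded back to intent i.
            total += p_i * u_ij * D[j][i]
    return total

print(round(expected_reward(U, D), 4))  # prints 0.6667
```

Under these assumptions the ambiguous query "Smith" is decoded correctly only half of the time, which pulls the overall expected reward down to 2/3.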

  8. Database system must take into account user learning.
     • Current systems assume that users do not learn (static environment).
     • They cannot discover user intents accurately where users learn (dynamic environment).

  9. First, we want to know how users learn
     • Research in psychology shows that humans exhibit reinforcement learning behavior
     • Components of reinforcement learning:
       ‣ Select a query based on its past success, i.e., exploitation.
       ‣ Explore and try new/less successful queries to gain new knowledge, i.e., exploration.
       ‣ Sacrifice immediate success for more success in the long run.

  10. User learning mechanisms
      • We evaluate 6 different human reinforcement learning methods from experimental economics and psychology.
      • They differ in:
        ‣ The rate of exploration/exploitation
        ‣ The degree to which users use the experience from past interactions
        ‣ The degree of reinforcement

  11. Empirical evaluation
      • Yahoo! interaction history of 300,000 interactions.
        Method                    Mean Squared Error
        Bush and Mosteller's      0.0673
        Cross's                   0.0686
        Roth and Erev             0.0666
        Roth and Erev Modified    0.0666
        Win-Stay/Lose-Randomize   0.0713
        Latest Reward             0.1427
      • Roth and Erev explores using a randomized method, remembers all previous interactions, and reinforces based on the reward in each interaction.
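
The three properties of Roth and Erev listed above can be illustrated with a minimal sketch of the basic Roth and Erev update rule. This is not the authors' evaluation code; the class name `RothErevUser`, the initial propensity of 1.0, and the reward values are assumptions chosen for illustration. Each (intent, query) pair keeps a cumulative propensity, queries are drawn with probability proportional to their propensity, and the reward of each interaction is added to the propensity of the query that was used.

```python
import random

class RothErevUser:
    """Basic Roth and Erev learner over (intent, query) pairs (illustrative sketch)."""

    def __init__(self, n_intents, n_queries, initial=1.0):
        # Assumption: every query starts with the same positive propensity.
        self.S = [[initial] * n_queries for _ in range(n_intents)]

    def strategy(self, intent):
        """Row of the user strategy: propensities normalized to probabilities."""
        total = sum(self.S[intent])
        return [s / total for s in self.S[intent]]

    def choose(self, intent, rng):
        """Randomized exploration: sample a query proportionally to its propensity."""
        weights = self.S[intent]
        return rng.choices(range(len(weights)), weights=weights, k=1)[0]

    def reinforce(self, intent, query, reward):
        """Accumulate the reward, so all past interactions keep influencing the strategy."""
        self.S[intent][query] += reward

user = RothErevUser(n_intents=3, n_queries=2)
user.reinforce(intent=2, query=1, reward=1.0)  # hypothetical successful interaction
print(user.strategy(2))  # the reinforced query now has the larger probability
```

Since propensities are only ever increased by rewards and never reset, a query that succeeded early keeps some weight even after many later interactions.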

  12. How should the DBMS learn?
      • We have proposed a reinforcement learning algorithm for the DBMS that considers user learning.
      • It is more effective than the state-of-the-art learning algorithms in current systems.
      • It scales to large databases.
      • Full paper: arxiv.org/abs/1603.04068v4
