

  1. Reinforcement Learning Lecture 18a
     Gillian Hayes
     7th March 2007

  2. Focussed Web Crawling Using RL
     • Searching the web for pages relevant to a specific subject
     • No organised directory of web pages
     • Web crawling: start at one root page, follow its links to other pages, follow their links to further pages, etc.
     • Focussed web crawling: a specific topic. Find the maximum set of relevant pages while traversing the minimum number of irrelevant pages.
     • Why try this? Less bandwidth and storage time (an exhaustive crawl of billions of web pages can take weeks); good for dynamic content, since frequent updates become feasible; gives an index for a particular topic
     • Alexandros Grigoriadis, MSc AI, Edinburgh 2003 + CROSSMARC project – extracting multilingual information from the web in specific domains, e.g. laptop retail information, job adverts on companies’ web pages

  3. Web Crawler
     [Diagram: crawler pipeline – retrieve pages from the www, evaluate the pages (relevant ones join the base set of good pages), extract their links, evaluate the links with the RL link scorer, add them to the link queue]
     • Link queue: the current set of links that still have to be visited. Fetch the link with the highest score on the queue (a minimal sketch of such a queue follows).
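The link queue behaves like a max-priority queue keyed on the scorer's output. Below is a minimal Python sketch, assuming scores are plain floats; the class and method names are illustrative, not taken from the paper.

    # Minimal link-queue sketch: pop the link with the highest score first.
    # Python's heapq is a min-heap, so scores are stored negated.
    import heapq

    class LinkQueue:
        """Links still to be visited, ordered by the link scorer's estimate."""

        def __init__(self):
            self._heap = []

        def push(self, url, score):
            heapq.heappush(self._heap, (-score, url))

        def pop_best(self):
            """Return (url, score) of the most promising link on the queue."""
            neg_score, url = heapq.heappop(self._heap)
            return url, -neg_score

        def __len__(self):
            return len(self._heap)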

  4. • Evaluate the page this link points to, based on a set of text/content attributes. If it is relevant, store it on Good Pages
     • Get the links from the page
     • Evaluate the links and add them to the link queue. Does the link point to a relevant page? Will it lead to relevant pages in the future?
     • Where can we use RL? In the link scorer

  5. RL Crawling
     • Reward when the crawler finds relevant pages
     • Needs to recognise the important attributes and follow the most promising links first
     • Aim is to obtain π*
     • How to formulate the problem? What are the states? What are the actions? Alternatives:
       – State = a link, Action = {follow, don’t follow}
       – State = a web page, Action = the links on that page
     • Learn V? Must do a local search (lookahead) to get a policy
     • Learn Q? More training examples are needed, since Q depends on both s and a, but it is faster to use
     • Choice: actions = links, and learn V using TD(λ)

  6. How to Characterise a State?
     • Use a text analyser to come up with keywords for the domain – words that typically appear on web pages in this subject area
     • Feature vector of 500 binary attributes: presence or absence of each keyword
     • State space: 2^500 ≈ 10^150 states – far too large for a table
     • Use a neural network for function approximation to give V(s)
     • Learn the weights of the network using temporal difference learning
     • Eligibility trace on the weights instead of the states (see the sketch below)
     • Reward is 1/0 if the page is/is not relevant
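To make the update concrete, here is a minimal TD(λ) sketch with the eligibility trace kept on the weights. A linear approximator V(s) = w·f(s) stands in for the neural network used in the actual work (with a network there would be one trace per weight and the gradient would come from backpropagation), and the learning rate, discount and trace-decay values are assumptions.

    # TD(lambda) with eligibility traces on the weights, for binary keyword
    # features and reward 1/0 for relevant/irrelevant pages.
    import numpy as np

    N_FEATURES = 500   # presence/absence of each domain keyword
    ALPHA = 0.01       # learning rate (assumed)
    GAMMA = 0.9        # discount factor (assumed)
    LAMBDA = 0.8       # trace decay (assumed)

    w = np.zeros(N_FEATURES)

    def value(features):
        """Linear stand-in for the neural-net value function V(s)."""
        return float(w @ features)

    def td_lambda_episode(pages):
        """pages: list of (feature_vector, reward) pairs along one crawl path,
        where reward is 1 if that page is relevant and 0 otherwise."""
        global w
        e = np.zeros_like(w)                 # eligibility trace on the weights
        for t in range(len(pages) - 1):
            f_t = pages[t][0]
            f_next, r_next = pages[t + 1]
            delta = r_next + GAMMA * value(f_next) - value(f_t)   # TD error
            e = GAMMA * LAMBDA * e + f_t     # gradient of the linear V w.r.t. w
            w += ALPHA * delta * e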

  7. State Values V
     [Diagram: in the tabular case, the state s indexes a table that returns V(s); with function approximation, s is first encoded as a feature vector f(s), which a network maps to V(f(s))]

  8. Learning Procedure
     • Use a number of training sets of web pages, e.g. different companies’ web sites containing some pages with job adverts, and start with a random policy
     • Learn V^π; need to do GPI to get V*
     • Then incorporate into a regular crawler: the RL neural net evaluates each page – its V value is the score
     • Which link to choose? Must do a one-step lookahead – follow all the links in the current page and evaluate the pages they lead to (a sketch of this step follows)
     • Place the new pages on the link queue according to their score
     • Follow the link at the front of the link queue, i.e. the one leading to the page with the highest likely relevance
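A minimal sketch of the one-step lookahead, reusing value() from the TD(λ) sketch above; fetch(), extract_links() and features() are hypothetical helpers, not functions from the paper or from CROSSMARC.

    def score_and_enqueue_links(page, queue, visited):
        """Follow every link on the current page, evaluate the page it leads
        to with the learned value function, and put the link on the queue
        with that value as its score."""
        for url in extract_links(page):
            if url in visited:
                continue
            linked_page = fetch(url)               # one-step lookahead fetch
            score = value(features(linked_page))   # likely relevance
            queue.push(url, score)
            visited.add(url)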

  9. Performance
     • Compared with the non-RL CROSSMARC web crawler, the RL crawler finds the relevant pages (when there is more than one) by following fewer links, but it fetches more pages overall because of the one-step lookahead
     • Not so good at finding a single relevant page on a site
     • Datasets: up to 2000 pages, 16,000 links, a tiny number of relevant pages in each dataset, English and Greek, 1000 training episodes

  10. Issues
      • Performance depends on the graph structure of the pages
      • Features chosen: many attributes were always 0, so not discriminating enough
      • Need to try bigger datasets
      • The paper outlines alternative learning procedures

      Andrew McCallum’s CORA – searching computer science research papers
      • Treated roughly as a bandit problem, learning Q(a). An action a = a link on a web page together with the words in its neighbourhood
      • Choose the link expected to give the highest future discounted reward (a crude sketch follows)
      • 53,000 documents, half a million links, a 3x increase in efficiency (number of links followed before 75% of the documents are found, vs. breadth-first search)
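As a crude illustration of that choice, the sketch below scores each link by averaging hypothetical per-word value estimates for the words around it; CORA's actual mapping from neighbourhood text to Q values is learned by the system, so both q_word and the averaging rule here are assumptions.

    def choose_link(links, q_word, default=0.0):
        """links: list of (url, neighbourhood_words) pairs.
        q_word: dict mapping a word to an estimated future discounted reward.
        Returns the url whose surrounding text scores highest."""
        def q(words):
            vals = [q_word.get(w, default) for w in words]
            return sum(vals) / len(vals) if vals else default
        return max(links, key=lambda link: q(link[1]))[0]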

  11. Alexandros Grigoriadis and Georgios Paliouras: Focused crawling using temporal difference learning. Proceedings of the Panhellenic Conference on Artificial Intelligence (SETN), Lecture Notes in Artificial Intelligence 3025, pp. 142–153, Springer-Verlag, 2004.
      Andrew McCallum et al.: Building domain-specific search engines with ML techniques. Proc. AAAI-99 Spring Symposium on Intelligent Agents in Cyberspace.
