
CSCI 2350: Social & Economic Networks. WWW: Information Networks



  1. CSCI 2350: Social & Economic Networks
     WWW: Information Networks (Chapters 13, 14)
     Mohammad T. Irfan, 4/22/15

     Questions
     1. What does the web look like? [Ch 13]
     2. How does Google search it? [Ch 14]

  2. Information network
     - Common things with social and economic networks
       - Graphs, paths, giant components
       - Connections to matching markets and auctions

     Web
     - Application for sharing info over the Internet
     - Created by Tim Berners-Lee (1989-91)
     - 2 perspectives
       - Web pages: Make documents easily available to anyone on the Internet
       - Browser: Retrieve and display documents
     - Web organizes information in a unique fashion
       - Different from library system
       - Different from folders in a computer
       - Different from indexing
       - Hypertext

  3. Hypertext
     - Replaces linear structure of text by adding pointers
     - Concept dates back to 1950s

     Precursor to hypertext
     - Citation network

  4. Precursor to hypertext
     - Semantic network

     Precursor to hypertext
     - Vannevar Bush (1945)
     - Associative memory in “Memex”
     - Cited by Tim Berners-Lee

  5. Evolution of the web
     - Navigational functions (1990s)
       - Static web pages
     - Transactional functions
       - Dynamic, real-time operations
     - Web 2.0
       - New attitude to technology, not new technology
         1. Collective creation and maintenance of shared content (Wikipedia)
         2. Move personal data to corporate servers (Gmail)
         3. Network among individuals, not just web pages (Facebook)

     Web as a directed graph
     - Nodes: Web pages
     - Directed edges: Links
     - Bowdoin College → Restaurants and Lodgings → Brunswick Downtown Association → Things to do → Museum of Art → Bowdoin College
       - A directed cycle
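The directed-graph view in slide 5 can be sketched in a few lines of Python. The adjacency list below encodes only the link chain named on the slide; the helper name `find_cycle` is a hypothetical choice for illustration, not something from the course materials.

```python
# Adjacency list: each page maps to the pages it links to.
# Only the links named on the slide are included (an assumption;
# the real pages have many more links).
web = {
    "Bowdoin College": ["Restaurants and Lodgings"],
    "Restaurants and Lodgings": ["Brunswick Downtown Association"],
    "Brunswick Downtown Association": ["Things to do"],
    "Things to do": ["Museum of Art"],
    "Museum of Art": ["Bowdoin College"],
}


def find_cycle(graph, start):
    """Follow links depth-first from `start`.

    Returns a list of pages that begins and ends at `start`
    if some chain of links leads back to it, otherwise None.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in visited:
                visited.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


cycle = find_cycle(web, "Bowdoin College")
print(" -> ".join(cycle))
```

Because edges are directed, the same search started from a page with no path back to itself would return `None`.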

  6. Example

     Strongly Connected Component (SCC)
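A strongly connected component is a maximal set of pages in which every page can reach every other page by following links. As a minimal sketch, the SCCs of a small directed graph can be computed with Kosaraju's two-pass algorithm. The slides do not name an algorithm, so this choice and the toy graph `toy_web` are assumptions made for illustration.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm: return a list of SCCs, each a set of nodes."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Pass 1: depth-first search, recording nodes by finishing time.
    order, seen = [], set()

    def visit(u):
        seen.add(u)
        for v in graph.get(u, []):
            if v not in seen:
                visit(v)
        order.append(u)

    for u in sorted(nodes):
        if u not in seen:
            visit(u)

    # Reverse every edge.
    reverse = {u: [] for u in nodes}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)

    # Pass 2: search the reversed graph in decreasing finishing time.
    # Each search collects exactly one component.
    components, assigned = [], set()
    for u in reversed(order):
        if u in assigned:
            continue
        component, stack = set(), [u]
        assigned.add(u)
        while stack:
            x = stack.pop()
            component.add(x)
            for y in reverse[x]:
                if y not in assigned:
                    assigned.add(y)
                    stack.append(y)
        components.append(component)
    return components


# Hypothetical toy graph: A, B, C link in a cycle; D and E link to each other.
toy_web = {
    "A": ["B"], "B": ["C"], "C": ["A", "D"],
    "D": ["E"], "E": ["D"], "F": ["A"],
}
sccs = strongly_connected_components(toy_web)
print(sorted(sorted(c) for c in sccs))
```

The recursive first pass is fine for a toy graph; a web-scale graph would need an iterative version to avoid hitting Python's recursion limit.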

  7. Bow-tie structure of the web

     Link analysis and web search (Chapter 14)

  8. Web search
     - Google “Bowdoin”
       - What do you see?
       - Why is Bowdoin College ranked first? (Why not James Bowdoin?)
     - Google’s source of information is the web itself
       - No expert intervention
       - There must be enough information intrinsic to the web!

     Information retrieval
     - 1960s: Search repositories of newspapers, patents, etc. by keywords
       - Done by specialized people
     - Challenges
       - Synonymy: scallion vs. green onion
       - Polysemy: jaguar (do you mean the animal, the car, or the football team?)
       - Abundance of information (opposite of needle-in-a-haystack)

  9. Ranking algorithms
     - Voting by in-links
     - Hubs and authorities
     - PageRank

     Voting by in-links
     - Highest in-degree node is ranked first, and so on…
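Voting by in-links amounts to counting, for each page, how many other pages link to it, and sorting by that count. A minimal sketch follows; the page names and the tie-breaking rule (alphabetical) are assumptions for illustration.

```python
def rank_by_in_links(graph):
    """Return (page, in_degree) pairs, highest in-degree first."""
    in_degree = {page: 0 for page in graph}
    for source, targets in graph.items():
        # Count each linking page once, and ignore self-links
        # (both are simplifying assumptions).
        for target in set(targets):
            if target != source:
                in_degree[target] = in_degree.get(target, 0) + 1
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical link structure: page "c" receives the most votes.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranking = rank_by_in_links(links)
print(ranking)
```

Every in-link counts equally here, regardless of how reputable the linking page is; the hubs-and-authorities and PageRank methods refine exactly that point.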

  10. Hubs and authorities algorithm (1998)
      Image source: http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture4/lecture4.html

      Hubs and authorities example: Round 1

  11. Hubs and authorities example (cont.): Round 2

      Hubs and authorities example: Normalization
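The rounds and the normalization step in the example slides can be sketched as follows. This assumes the standard update rules: a page's authority score is the sum of the hub scores of the pages linking to it, and its hub score is the sum of the authority scores of the pages it links to. Starting every score at 1 and normalizing by the sum of scores are also assumptions, as is the toy graph; the slide figures themselves are not recoverable from the text.

```python
def hubs_and_authorities(graph, rounds=2):
    """Run `rounds` of hub/authority updates, then normalize each score set."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(rounds):
        # Authority update: add up hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for u, targets in graph.items():
            for v in targets:
                new_auth[v] += hub[u]
        auth = new_auth
        # Hub update: add up the new authority scores of pages linked to.
        hub = {n: sum(auth[v] for v in graph.get(n, [])) for n in nodes}

    # Normalization: divide by the total so each score set sums to 1.
    # The `or 1.0` guards against a graph with no links at all.
    auth_total = sum(auth.values()) or 1.0
    hub_total = sum(hub.values()) or 1.0
    auth = {n: s / auth_total for n, s in auth.items()}
    hub = {n: s / hub_total for n, s in hub.items()}
    return hub, auth


# Hypothetical graph: three list pages (h1..h3) pointing at three sites (a1..a3).
pages = {
    "h1": ["a1", "a2"],
    "h2": ["a1"],
    "h3": ["a1", "a2", "a3"],
}
hub, auth = hubs_and_authorities(pages, rounds=2)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

Normalizing only matters for keeping the numbers from growing without bound; it does not change the ordering of pages within a round.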

  12. PageRank

      JCPenney scandal (2011)

  13. How JCPenney did it
      - Hired SearchDex
      - Black hat optimization
      Image source: http://blogs.cornell.edu/info2040/2011/11/03/j-c-penney%E2%80%99s-pagerank/

      How they got caught
      - NY Times + Blue Fountain Media
      - Google’s spam cop Matt Cutts (Image source: NY Times)
      - Punishment
        - Precedent: BMW in Germany (2006)

  14. PageRank (PR) (1998)
      - Intuition
      - Update rule
      - Demo
        - http://www.ladamic.com/netlearn/GUESS/pagerank.html

      Modern web search
      - Google, Yahoo!, Bing, Ask
      - PageRank is a central ingredient of Google
      - There are more ingredients
        - In 2004, Google incorporated the “Hilltop” method (2001); Ask incorporated the hubs and authorities algorithm
      - Exact search method: secret!
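The slide names the PageRank update rule without writing it out. A minimal sketch, assuming the usual formulation: in each round every page splits its current PageRank equally among its out-links, and a page's new PageRank is the sum of the shares it receives. The scaling factor `s` (0.85 here), the uniform starting values, and the treatment of pages without out-links are assumptions, as is the three-page graph.

```python
def pagerank(graph, rounds=100, s=0.85):
    """Scaled PageRank by repeated application of the update rule."""
    nodes = sorted(set(graph) | {v for ts in graph.values() for v in ts})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start with equal PageRank

    for _ in range(rounds):
        received = {u: 0.0 for u in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if targets:
                # Split current PageRank equally over out-links.
                share = rank[u] / len(targets)
                for v in targets:
                    received[v] += share
            else:
                # Assumption: a page with no out-links keeps its PageRank.
                received[u] += rank[u]
        # Scaled update: keep a fraction s of what flowed along links and
        # spread the remaining (1 - s) evenly over all pages.
        rank = {u: s * received[u] + (1 - s) / n for u in nodes}
    return rank


# Hypothetical three-page web.
tiny_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(tiny_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The total PageRank stays at 1 in every round, so the values can be read as the share of "attention" each page ends up with. Setting `s=1.0` gives the unscaled rule.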

  15. Modern web search
      - Combination of links, text, and clicks
        - Anchor text: “I’m a student of Bowdoin College.”
      - Moving target
        - Google’s algorithm changes cause millions of dollars of damage to many companies
        - Companies seek help from SEOs to climb up the ranking

  16. Link analysis beyond web search
      - Citation analysis
      - Journal impact scores

      Link analysis of U.S. Supreme Court citations
      - Fowler & Jeon’s study (2008)
      - Hubs and authorities algorithm applied to data spanning 2 centuries!
      - Important precedents have very high authority scores
      - Public recognition comes later (authority score can predict future popularity)

  17. Link analysis of U.S. Supreme Court citations
      - Rise and fall of authority scores
