User Interests Driven Web Personalization based on Multiple Social Networks
Yi Zeng 1, Hongwei Hao 1, Ning Zhong 2, Xu Ren 2, Yan Wang 2
1. Institute of Automation, Chinese Academy of Sciences, P.R. China
2. Beijing University of Technology, P.R. China
Semantic Data at Web Scale
From large scale Web pages to large scale linked open semantic data
Number of Web pages that Google indexes:
• 1998: 270 million
• 2000: 1 billion
• 2008: 1 trillion
• March, 2010: 13 billion RDF triples
• June, 2011: 12 billion RDF triples from the Web
• October, 2011: 31.6 billion RDF triples
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Personalization for Large scale and Web Enabled Semantic Data Processing (cont.)
• An illustration of the basic idea:
[Figure: the original datasets (Semantic Web Dog Food, Twitter, SwetoDBLP) go through interests analysis, evaluation and ranking, which yields Frank van Harmelen’s ranked interests (Spyros Kotoulas, RDF, Ivan Herman, Knowledge, DERI, …). Interests related semantic triples such as [s, p, “semantic Web mining”], [s, p, “RDF triple store”] and [s, p, “Spyros Kotoulas”] form the selected triple set that is related to user interests.]
For more details:
• Yi Zeng, Erzhong Zhou, Yan Wang, Xu Ren, Yulin Qin, Zhisheng Huang, Ning Zhong. Research Interests: Their Dynamics, Structures and Applications in Unifying Search and Reasoning. Journal of Intelligent Information Systems, Volume 37, Number 1, 65-88, Springer, 2011.
• Yi Zeng, Ning Zhong, Yan Wang, Yulin Qin, Zhisheng Huang, Haiyan Zhou, Yiyu Yao, and Frank van Harmelen. User-centric Query Refinement and Processing Using Granularity Based Strategies. Knowledge and Information Systems, Volume 27, Number 3, 419-450, Springer, 2011.
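The selection step illustrated on this slide can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `select_triples`, the case-insensitive substring match and the sample data are all hypothetical and are not taken from the cited papers.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object literal)


def select_triples(triples: List[Triple],
                   ranked_interests: List[str],
                   top_k: int = 3) -> List[Triple]:
    """Keep the triples whose object mentions one of the top-k interests.

    Assumption: a case-insensitive substring match stands in for the
    real interest-to-triple matching, which the slides do not specify.
    """
    top = [term.lower() for term in ranked_interests[:top_k]]
    return [t for t in triples if any(term in t[2].lower() for term in top)]


# Illustrative data only: the ranking and the triples are made up.
ranked = ["Semantic Web", "RDF", "Spyros Kotoulas", "Knowledge", "DERI"]
triples = [
    ("s1", "p", "semantic Web mining"),
    ("s2", "p", "RDF triple store"),
    ("s3", "p", "Spyros Kotoulas"),
    ("s4", "p", "image segmentation"),
]

selected = select_triples(triples, ranked, top_k=3)
for triple in selected:
    print(triple)
```

Only the selected subset then needs to be queried, which is the intuition behind using interests to cut down a large triple set.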
Personalization for Large scale and Web Enabled Semantic Data Processing (cont.)
A Comparative Study of Query Time and Efficiency for Different Strategies
SwetoDBLP dataset: 1.49 x 10^7 RDF triples
Participants: 7 DBLP authors
• Preference order 100%: List 2, List 3 List 1
• Preference order 100%: List 2 List 3
• Preference order 83.3%: List 2 List 3 List 1
• Preference order 16.7%: List 3 List 2 List 1
See references on the previous page
Massive Semantic Data from the Social Web
• The social Web platforms and the microblog platforms adopt and benefit from semantic techniques.
• The semantic Web gets huge data from these social Web platforms.
Cyber-Social Sensors:
150 million users
845 million active users (http://en.wikipedia.org/wiki/Facebook)
• Friends
• Professional Interests
• Education Information
• Work Experiences
• Friends
• Personal Notes
• Likes
350 million users (http://en.wikipedia.org/wiki/Twitter)
• 300 million tweets per day
• 1.6 billion queries per day
• Interesting Places
• Interesting Events
• Following, Followers
• Real time personal information
60 million users
• Interesting news
• From Web of Contents to Web of People
• Users play more and more important roles
Personal Interests Data Fusion Strategies
Weighted Fusion Strategy: $I(i) = \sum_{n=1}^{m} w_n I_n(i)$
Here $I_n(i)$ is the value of interest $i$ from source $n$, $w_n$ is the weight of source $n$, and $m$ is the number of sources.
• Average fusion strategy: $w_1 = w_2 = \dots = w_m = 1/m$
• Time-sensitive fusion strategy: $w_1 : w_2 : \dots : w_m = f_1 : f_2 : \dots : f_m$, with $w_1 + w_2 + \dots + w_m = 1$, where $f_n$ is the update frequency of source $n$
Slides 7-10 are from the following paper:
Yunfei Ma, Yi Zeng, Xu Ren, and Ning Zhong. User Interest Modeling Based on Multi-source Personal Information Fusion and Semantic Reasoning. Proceedings of the 2011 International Conference on Active Media Technology, Lecture Notes in Computer Science 6890, 195-205, Springer, Lanzhou, China, September 7-9, 2011.
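The two weighting schemes can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names and the per-source interest values are hypothetical, and an interest that is missing from a source is assumed to contribute 0. The update frequencies are the ones the slides report for Twitter, Facebook and LinkedIn.

```python
from typing import Dict, List


def average_weights(m: int) -> List[float]:
    """Average fusion: every one of the m sources gets weight 1/m."""
    return [1.0 / m] * m


def time_sensitive_weights(frequencies: List[float]) -> List[float]:
    """Time-sensitive fusion: weights proportional to the update
    frequencies, normalised so that they sum to 1."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


def fuse(sources: List[Dict[str, float]],
         weights: List[float]) -> Dict[str, float]:
    """Weighted sum of the interest values over all sources.

    Assumption: an interest absent from a source has value 0 there.
    """
    terms = set().union(*sources)
    return {
        term: sum(w * src.get(term, 0.0) for w, src in zip(weights, sources))
        for term in terms
    }


# Hypothetical interest values per source (Twitter, Facebook, LinkedIn).
sources = [
    {"RDF": 30.0, "LarKC": 20.0},
    {"RDF": 10.0, "PhD": 5.0},
    {"Professor": 8.0},
]

avg = fuse(sources, average_weights(len(sources)))
# Update frequencies per day as given in the slides.
ts_weights = time_sensitive_weights([2.5, 0.2, 0.0004])
ts = fuse(sources, ts_weights)

print([round(w, 4) for w in ts_weights])
print(sorted(avg.items()))
print(sorted(ts.items()))
```

With frequencies this uneven, the time-sensitive weights are dominated by the most frequently updated source, so the fused ranking stays close to that source's own ranking.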
An Illustration of Multi-source Personal Interests Fusion
• User: Frank van Harmelen
• Data Source: Twitter, Facebook, LinkedIn
[Figure: bar chart of interest values (y-axis, 0 to 60) per interest term (x-axis) for Twitter, Facebook and LinkedIn. Caption: A comparative study of interests from three single sources]
Top-K interests from different sources
• Some of the interests overlap with each other.
• Diversities among these Top-K interests are even more obvious.
An Illustration of Multi-source Personal Interests Fusion
Update frequency (per day): Twitter: $f_1 = 2.5$, Facebook: $f_2 = 0.2$, LinkedIn: $f_3 = 0.0004$
Weighted Interests Fusion Function: $I(i) = 0.9258\, I_1(i) + 0.0741\, I_2(i) + 0.0001\, I_3(i)$
[Figure: bar chart of interest values (y-axis, 0 to 40) per interest term (x-axis) for Twitter, Average Fusion and Time-sensitive Fusion. Caption: A comparative study of interests from a single source and multiple interests sources]
• Average Fusion: Twitter (7), Facebook (7), LinkedIn (2)
• Time-sensitive Fusion: (1) Top-10 overlaps with Twitter; (2) values are very close to the ones from Twitter, but not identical; (3) no interests from Facebook and LinkedIn.
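The coefficients of the fusion function are consistent with normalising the three update frequencies so that the weights sum to 1, which is how the time-sensitive strategy is defined. Worked out with the frequencies given on this slide (values rounded to four decimals):

```latex
\begin{align*}
f_1 + f_2 + f_3 &= 2.5 + 0.2 + 0.0004 = 2.7004 \\
w_1 &= \frac{f_1}{f_1 + f_2 + f_3} = \frac{2.5}{2.7004} \approx 0.9258 \\
w_2 &= \frac{f_2}{f_1 + f_2 + f_3} = \frac{0.2}{2.7004} \approx 0.0741 \\
w_3 &= \frac{f_3}{f_1 + f_2 + f_3} = \frac{0.0004}{2.7004} \approx 0.0001
\end{align*}
```

Because $w_1$ is above 0.92, the fused values track the Twitter values closely, which matches observations (1) to (3) above.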
Interests Representation and Reasoning about Interests (http://wiki.larkc.eu/e-foaf:interest)
Interests Representation using e-FOAF:interest (Frank van Harmelen is interested in RDF to a certain degree):
<foaf:Person rdf:about="http://www.cs.vu.nl/~frankh/">
  <foaf:name>Frank van Harmelen</foaf:name>
  <e-foaf:interest>
    <rdf:Description rdf:about="http://www.wici-lab.org/wici/wiki/index.php/RDF">
      <dc:title>RDF</dc:title>
      <e-foaf:cumulative_interest_value rdf:parseType="Resource">
        <rdf:value rdf:datatype="&xsd;number"> 21.293 </rdf:value>
      </e-foaf:cumulative_interest_value>
    </rdf:Description>
  </e-foaf:interest>
  ...
</foaf:Person>
RDF representation of AI Ontology (A Fragment of AI Ontology):
<rdfs:Class rdf:ID="Graph-based Representation">
  <rdfs:subClassOf rdf:resource="Knowledge Representation"/>
</rdfs:Class>
<rdfs:Class rdf:ID="RDF">
  <rdfs:subClassOf rdf:resource="Graph-based Representation"/>
</rdfs:Class>
Reasoning about interests from RDF to Knowledge Representation. Appeared on Frank van Harmelen’s homepage, but not elsewhere.
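The reasoning step from RDF to Knowledge Representation amounts to following `rdfs:subClassOf` links upward in the ontology fragment. A minimal Python sketch, assuming each class has at most one superclass and using a plain dictionary instead of an RDF store (the names `SUBCLASS_OF` and `broader_interests` are hypothetical):

```python
from typing import Dict, List

# The rdfs:subClassOf links from the AI ontology fragment on this slide.
SUBCLASS_OF: Dict[str, str] = {
    "RDF": "Graph-based Representation",
    "Graph-based Representation": "Knowledge Representation",
}


def broader_interests(term: str, subclass_of: Dict[str, str]) -> List[str]:
    """Return the superclasses of `term`, nearest first.

    Assumption: a single-parent hierarchy; the `seen` set guards
    against cycles in malformed input.
    """
    chain: List[str] = []
    seen = {term}
    current = term
    while current in subclass_of:
        current = subclass_of[current]
        if current in seen:
            break
        seen.add(current)
        chain.append(current)
    return chain


print(broader_interests("RDF", SUBCLASS_OF))
```

An observed interest in RDF can thus be related to the broader topic Knowledge Representation even when only the broader term appears in another source.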
Ranking Strategy for User Interests Related Sources
[Equations: a ranking score $C(T, U)$ with sums over $p = 1, \dots, q$ and $n = 1, \dots, m$, expressed in terms of $S$, $N_n(i)$ and $f_n(i)$]
Active Academic Visit Recommendation Application (AAVRA)
• Collaboration network is already too complex, but…
• Academic collaboration candidates appear not only in publication data, but also in many other social networking environments such as Twitter.
• AAVRA was proposed in [Zeng2012a]; nevertheless, ranking strategies among different social networks have not been investigated.
The upper snapshot is from http://data.semanticweb.org/organization
Data Sources for AAVRA: Twitter data, Semantic Web Dog Food data, DBLP data, Google Maps API
[Zeng2012a] Yi Zeng, Ning Zhong, Xu Ren and Yan Wang. User Interests Driven Web Personalization Based on Multiple Social Networks. Proceedings of the 4th International Workshop on Web Intelligence & Communities, collocated with the 2012 World Wide Web Conference (WWW 2012), Lyon, France, April 16th, 2012.
AAVRA: Data Acquisition
Twitter data acquisition to:
• Locate the end user;
• Find agents that the end user follows;
• Analyze the end user's real-time interests;
• Locate followings and their interests