CareerMap: Visualizing Career Trajectory
Kan Wu, Jie Tang, Zhou Shao, Xinyi Xu, Bo Gao & Shu Zhao
Dept. of Computer Science, Tsinghua University
Science China Information Sciences
Challenges
• Challenge 1: Name ambiguity
– Solution: Unified Probabilistic Models [1]
• Challenge 2: Data incompleteness
– Solution: Spatial-Temporal Factor Graph Model (STFGM)
• Challenge 3: Visualizing many scholars' merged trajectories on the map, e.g. 100 people moving from Boston to New York
– Solution: Hotspot detection algorithm
[1] Jie Tang, A.C.M. Fong, Bo Wang, and Jing Zhang. A Unified Probabilistic Framework for Name Disambiguation in Digital Library. IEEE TKDE, 2012.
Architecture
• Analytic Visualization
– Visualization
– Analysis
• Hotspot Detection
• Career Trajectory Extraction
– Smoothing
– Affiliation Extraction
Spatial-Temporal Factor Graph Model (continued)
• The general idea
– Try to find a coauthor with a known affiliation who shares the affiliation of the target author whose affiliation is missing.
– Each green point, with the common time $t$ marked outside, represents a tuple <Time $t$, Author $a_{i1}$, Author $a_{i2}$>. It is an observation instance in which $a_{i1}$ is the target author and $a_{i2}$ is a coauthor with a known affiliation at $t$.
– Each observation instance is associated with a hidden binary-valued variable representing the affiliation similarity between the two authors: the value is 1 if they belong to the same affiliation at that time, and 0 otherwise.
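The observation instances described above can be sketched as a small data structure. This is only an illustration: the names `Observation` and `build_observations`, and the input layout (a list of `(year, authors)` papers plus a dictionary of known affiliations), are hypothetical and not taken from CareerMap itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Observation:
    """One tuple <Time t, Author a_i1, Author a_i2> (hypothetical layout)."""
    time: int
    target: str                               # a_i1, affiliation missing at `time`
    coauthor: str                             # a_i2, affiliation known at `time`
    same_affiliation: Optional[int] = None    # hidden variable: 1, 0, or not yet inferred


def build_observations(
    target: str,
    papers: List[Tuple[int, List[str]]],
    known_affiliation: Dict[Tuple[str, int], str],
) -> List[Observation]:
    """Pair the target author with every coauthor whose affiliation is known
    in a year where the target's own affiliation is missing."""
    observations = []
    seen = set()
    for year, authors in papers:
        if target not in authors:
            continue
        if (target, year) in known_affiliation:
            continue  # nothing to infer for this year
        for coauthor in authors:
            if coauthor == target or (coauthor, year) not in known_affiliation:
                continue
            if (year, coauthor) in seen:
                continue  # keep one instance per (time, coauthor) pair
            seen.add((year, coauthor))
            observations.append(Observation(year, target, coauthor))
    return observations
```

Each returned instance leaves `same_affiliation` unset, because that value is what the model is meant to infer.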
Spatial-Temporal Factor Graph Model (continued)
• Attribute factor
– captures the features of each tuple <Time $t$, Author $a_{i1}$, Author $a_{i2}$>
• Space factor
– captures the correlation between the hidden variables at the same time
– $N_S$ denotes all the space relations
• Time factor
– captures the correlation between the hidden variables at different times
– $N_T$ denotes all the time relations
Spatial-Temporal Factor Graph Model
• Model Learning
– Maximize the likelihood of the observed data
– $\theta \triangleq (\omega^T, \gamma^T, \delta^T)^T$ denotes the model parameters to be learned
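To make the learning objective concrete, here is a toy sketch of a likelihood built from attribute, space and time factors. Everything specific in it is an assumption, since the slides do not preserve the factor formulas: the log-linear form, the pairing of `omega` with the attribute factor, `gamma` with the space factor and `delta` with the time factor, the use of scalar `gamma` and `delta`, and the simple "agreement" indicator on each relation. The normalizer is computed by brute-force enumeration, which is only feasible for a handful of variables and is not claimed to be the inference method used in STFGM.

```python
import itertools
import math
from typing import List, Sequence, Tuple


def log_score(
    y: Sequence[int],
    features: List[List[float]],
    space_edges: List[Tuple[int, int]],
    time_edges: List[Tuple[int, int]],
    omega: List[float],
    gamma: float,
    delta: float,
) -> float:
    """Unnormalized log-score of one assignment y of the hidden variables."""
    score = 0.0
    # Attribute factor (assumed form): y_i * (omega . x_i)
    for y_i, x_i in zip(y, features):
        score += y_i * sum(w * f for w, f in zip(omega, x_i))
    # Space factor (assumed form): reward agreement on each relation in N_S
    for i, j in space_edges:
        score += gamma * (1.0 if y[i] == y[j] else 0.0)
    # Time factor (assumed form): reward agreement on each relation in N_T
    for i, j in time_edges:
        score += delta * (1.0 if y[i] == y[j] else 0.0)
    return score


def log_likelihood(y_obs, features, space_edges, time_edges, omega, gamma, delta):
    """log P(y_obs) = log_score(y_obs) - log Z, with Z summed over all assignments."""
    n = len(features)
    log_z = math.log(sum(
        math.exp(log_score(y, features, space_edges, time_edges, omega, gamma, delta))
        for y in itertools.product((0, 1), repeat=n)
    ))
    return log_score(y_obs, features, space_edges, time_edges, omega, gamma, delta) - log_z
```

Learning would then search for the parameter values that maximize `log_likelihood` on labeled instances, for example by gradient ascent.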
Smoothing
• The general idea
– Use a weight to reflect the confidence of an affiliation at a given time.
– Use the number of papers with the affiliation at time $t$ as the weight.
– Let $w_1$ and $w_2$ denote the weights at $t_1$ and $t_2$ respectively; a weight center $t_c$ between $t_1$ and $t_2$ is computed from these weights.
– If information between $t_1$ and $t_2$ is missing:
– $\forall t\ (t_1 < t < t_c)$: Affiliation$(a, t)$ = Affiliation$(a, t_1)$
– $\forall t\ (t_c < t < t_2)$: Affiliation$(a, t)$ = Affiliation$(a, t_2)$
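The gap-filling rule above can be sketched as follows. The slide's formula for the weight center is not preserved in this text, so `weight_center` below is a placeholder assumption (the better-supported affiliation covers a proportionally larger share of the gap) and can be swapped for the real formula via the `center` argument. The function names and the yearly time granularity are also assumptions.

```python
from typing import Callable, Dict


def weight_center(t1: int, w1: float, t2: int, w2: float) -> float:
    """ASSUMED formula, not the one from the slides: split the gap in
    proportion to the weights, so a larger w1 pushes t_c toward t2."""
    return t1 + (t2 - t1) * w1 / (w1 + w2)


def fill_gap(
    t1: int, aff1: str, w1: float,
    t2: int, aff2: str, w2: float,
    center: Callable[[int, float, int, float], float] = weight_center,
) -> Dict[int, str]:
    """Assign an affiliation to every missing year strictly between t1 and t2."""
    t_c = center(t1, w1, t2, w2)
    filled = {}
    for t in range(t1 + 1, t2):
        if t < t_c:
            filled[t] = aff1      # t1 < t < t_c: inherit the affiliation at t1
        elif t > t_c:
            filled[t] = aff2      # t_c < t < t2: inherit the affiliation at t2
        # t == t_c is not covered by the rule on the slide, so it stays unfilled
    return filled
```

For instance, with three papers at the first affiliation in 2000 and one paper at the second in 2010, this placeholder center falls at 2007.5, so 2001 through 2007 inherit the first affiliation and 2008 through 2009 the second.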
Example of scholar career trajectory extraction
Hotspot detection algorithm
• The general idea
– The heat centers have more neighbors than the surrounding points.
– The heat centers "absorb" their surrounding points as their neighbors. If a point is "absorbed" by a heat center, its own neighbors are emptied.
– Finally, the points left with nonempty neighbors are the heat centers.
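One way to read the absorption idea above is the greedy sketch below. It is an interpretation under stated assumptions, not the CareerMap implementation: neighbors are defined by a fixed `radius`, distance is plain Euclidean (real map coordinates would need a geographic distance), points are visited in descending order of neighbor count, and a point already absorbed by one center is not absorbed again by another.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def detect_heat_centers(points: List[Point], radius: float) -> Dict[int, Set[int]]:
    """Return {center index: indices of the points it absorbed}."""
    n = len(points)
    # Initial neighbor sets: all other points within `radius`.
    neighbors = {
        i: {j for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius}
        for i in range(n)
    }
    absorbed: Set[int] = set()
    # Visit points with the most neighbors first (ties broken by index).
    order = sorted(range(n), key=lambda k: (-len(neighbors[k]), k))
    for i in order:
        if i in absorbed:
            continue  # already belongs to some heat center
        claimed = neighbors[i] - absorbed
        for j in claimed:
            absorbed.add(j)
            neighbors[j] = set()   # an absorbed point has its neighbors emptied
        neighbors[i] = claimed
    # Points left with nonempty neighbor sets are the heat centers.
    return {i: nb for i, nb in neighbors.items() if nb}
```

With this reading, an isolated point never becomes a heat center, because its neighbor set is empty from the start.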
Trajectory map generated by CareerMap
Analytic Visualization
Some Interesting Case Studies
Summary
1. We introduce CareerMap, a system for visualizing scholars' career trajectories
2. The challenges of building it, and the architecture, technologies and main features of the system
3. Some interesting case studies
Thanks
Q&A