Large-Scale Social Media Analytics Yun Raymond Fu Assistant Professor Electrical and Computer Engineering (ECE), COE College of Computer and Information Science (CCIS) Northeastern University
Motivations
Human-Centered Computing
• Research topics: Machine Learning, Computer Vision, Manifold/Subspace Learning, Transfer Learning, Low-Rank Matrix Analytics, Sparse Representation, Large-Scale Optimization, Demographic Recognition, Internet Vision, Action/Activity/Intention Analysis, Geolocation from Social Context, Social Network Analysis
• Systems: Human-Computer Interaction (LabelRelation, M-face, EAVA, hMouse, Facetransfer, RTM-HAI, Shrug Detector), Human-Centered Cyber-Physical Systems, Health Care, Intelligent Systems
• Application areas: Smart Environments, Social Media Analytics
Motivation 1: Smart Environment Wikipedia.com: conceptually a physical world that is richly and invisibly interwoven with sensors, actuators, displays, and computational elements, embedded seamlessly in the everyday objects of our lives, and connected through a continuous network … The image is from http://sedl.kaist.ac.kr/images/smart_architecture_spaces.jpg
Motivation 2: Social Media in the Cloud
o How to model the multi-label, multi-instance, and multi-task characteristics?
o How to effectively infer meaningful user information from large-scale visual data?
o How to provide targeted services through human-computer interactions?
Motivation 3: Multi-label Social Media
It Is All About Data!
Goal: Interpret given human images in terms of demographic and behavioral attributes (expression, age, gender, occupation, kinship, action, pose, intention, etc.).
Challenges:
• Dimensionality redundancy
• Large scale (big data)
• Unknown distribution
• Large attribute variations
• Multimodal, multi-source, multi-label data
• Noise and outliers
Methodologies for Social Media Computing
Background: Existing Methods
• Global and Local Learning Methods: Local Learning vs. Global Learning, K. Huang, H. Yang, I. King, and M. R. Lyu; Global Versus Local Methods in Nonlinear Dimensionality Reduction, V. de Silva and J. Tenenbaum; Generalized Principal Component Analysis (GPCA), Y. Ma, et al.; Globally-Coordinated Locally-Linear Modeling, C.-B. Liu.
• Localized Subspace Learning Methods: Locally Embedded Linear Subspaces, Z. Li, L. Gao, and A. K. Katsaggelos; Locally Adaptive Subspace, Y. Fu, Z. Li, T. S. Huang, and A. K. Katsaggelos.
• Patches/Parts Based Methods: Flexible X-Y Patches, M. Liu, S. C. Yan, Y. Fu, and T. S. Huang; Patch-based Image Correlation, G-D. Guo and C. Dyer.
• Feature Extraction Methods: Local Binary Pattern (LBP), T. Ojala, M. Pietikainen, and T. Maenpaa; Histogram of Oriented Gradients (HOG) descriptor, N. Dalal and B. Triggs.
• Nonlinear Graph Embedding Methods: Locally Linear Embedding (LLE), S. T. Roweis and L. K. Saul; Isomap, J. B. Tenenbaum, V. de Silva, and J. C. Langford; Laplacian Eigenmaps (LE), M. Belkin and P. Niyogi.
• Linear Subspace Learning Methods: Principal Component Analysis (PCA), M. A. Turk and A. P. Pentland; Multidimensional Scaling (MDS), T. F. Cox and M. A. A. Cox; Locality Preserving Projections (LPP), X. F. He, S. C. Yan, and Y. X. Hu.
• Fisher Graph Methods: Linear Discriminant Analysis (LDA), R. A. Fisher; Marginal Fisher Analysis (MFA), S. C. Yan, et al.; Local Discriminant Embedding (LDE), H.-T. Chen, et al.
• Tensor Subspace Learning Methods: Two-dimensional PCA (TPCA), J. Yang, et al.; Two-dimensional LDA (TLDA), J. Ye, et al.; Tensor Subspace Analysis (TSA), X. He, et al.; Tensor LDE (TLDE), J. Xia, et al.; Rank-r approximation, H. Wang.
• Correlation-based Subspace Learning Methods: Discriminative Canonical Correlation (DCC), T.-K. Kim, et al.; Correlation Discriminant Analysis (CDA), Y. Ma, et al.
Graph Embedded Multilabel Learning
• Machine Learning Framework: Subspace Learning, Inference
• Demographic Recognition (Human-Centered Computing): Emotion/Expression Analysis, Age/Gender Estimation, Ethnic Group Recognition, Kinship Recognition, Occupation Recognition
Courtesy of Tamara Berg
Level 3: Manifold Learning
Dimensionality reduction on the Swiss roll. Courtesy of Sam T. Roweis and Lawrence K. Saul, Science 2002
Level 3: Fisher Graph
Graph Embedding (S. Yan, IEEE TPAMI, 2007)
G = {X, W} is an undirected weighted graph, where W measures the similarity between each pair of vertices.
Laplacian matrix: $L = D - W$, with $D_{ii} = \sum_{j \neq i} W_{ij}$.
Most manifold learning methods can be reformulated as
$y^* = \arg\min_{y^T B y = d} \sum_{i \neq j} \|y_i - y_j\|^2 W_{ij} = \arg\min_{y^T B y = d} y^T L y$,
where d is a constant and B is the constraint matrix.
Within-Locality Graph and Between-Locality Graph. Courtesy of Shuicheng Yan
Discriminant Simplex Analysis
Y. Fu, et al., IEEE Transactions on Information Forensics and Security, 2008.
Level 3: Similarity Metric
• Single-Sample Metric: Euclidean Distance and Pearson Correlation Coefficient
• Multi-Sample Metric: k-Nearest-Neighbor Simplex
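The two single-sample metrics named on this slide can be sketched in a few lines. The example vectors are invented, and `correlation_distance` (one minus the correlation) is just one common way to turn the coefficient into a distance; the slides do not state which form the cited papers use.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_distance(a, b):
    """Assumed definition for illustration: 1 - Pearson correlation."""
    return 1.0 - pearson_correlation(a, b)


# Hypothetical feature vectors: b has the same pattern as a at 10x the scale.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]

print("Euclidean distance:  ", euclidean_distance(a, b))
print("Pearson correlation: ", pearson_correlation(a, b))
print("Correlation distance:", correlation_distance(a, b))
```

The two vectors are far apart in Euclidean terms but perfectly correlated, which is the kind of scale and offset invariance that motivates a correlation-based metric for image features.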
Correlation Embedding Analysis
Objective function components: Correlation Distance and Fisher Graph.
Y. Fu, et al., IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
Level 3: High-Order Data Structure
Representation: m-th order tensors. Here, tensor means multilinear representation.
• 1st order: vector
• 2nd order: matrix
Tensor
Y. Fu, et al., IEEE Transactions on Circuits and Systems for Video Technology, 2009.
Correlation Tensor Analysis
• Given two m-th order tensors, their similarity is measured by the Pearson Correlation Coefficient (PCC).
• CTA objective function: Correlation Distance and Fisher Graph.
• Multilinear representation: m different subspaces.
Y. Fu, et al., IEEE Transactions on Image Processing, 2008.
Large Scale Manifold Learning
Graph-based methods require spectral decomposition of n x n matrices, where n denotes the number of samples. The storage cost and computational cost of building neighborhood maps are O(n^2) and O(n^3), respectively, so applying these methods to large-scale scenarios is almost intractable. Neighborhood search is also a bottleneck at large scale.
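To make the O(n^2) storage and O(n^3) computation figures concrete, here is a back-of-the-envelope calculation. It assumes a dense matrix of 8-byte floating-point entries and ignores constant factors in the operation count, so the numbers are orders of magnitude only; the function name and sample sizes are illustrative.

```python
def dense_graph_cost(n, bytes_per_entry=8):
    """Rough cost of a dense n x n graph matrix.

    Returns (storage in bytes, operation count on the order of n^3).
    Constant factors are ignored; this is an order-of-magnitude estimate.
    """
    storage_bytes = n * n * bytes_per_entry
    operations = n ** 3
    return storage_bytes, operations


for n in (10_000, 100_000, 1_000_000):
    storage, operations = dense_graph_cost(n)
    print(f"n = {n:>9,}: {storage / 1e9:>9,.1f} GB, ~{operations:.0e} operations")
```

Under these assumptions, one million samples already need about 8 TB for the dense matrix alone, before any decomposition is attempted.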
Large Scale Manifold Learning
• Graph-oriented clustering
• K-means clustering
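The slide names k-means clustering without showing how it is used. One plausible reading, stated here as an assumption, is that n samples are first summarized by k << n cluster centers so that the graph is built over the centers instead of every sample. The sketch below is plain Lloyd's k-means with a deterministic farthest-first initialization (chosen only to keep the example reproducible); the data and names are invented.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    """Deterministic initialization: start at points[0], then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = farthest_first_centers(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return centers


# Hypothetical 2-D samples forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
landmarks = kmeans(points, k=2)
print(sorted(landmarks))
```

A graph over the k landmarks needs only a k x k matrix, which is the motivation for clustering first under this reading.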
Previous and Current Work: Social Media Scenario
Expression Manifold
Manifold visualization of 1,965 Frey’s face images by LEA using k = 6 nearest neighbors.
Yun Fu, et al., “Locally Adaptive Subspace and Similarity Metric Learning for Visual Clustering and Retrieval”, CVIU, Vol. 110, No. 3, pp. 390-402, 2008.
Emotion State Manifold
Manifold visualization for 11,627 AAI sequence images of a male subject using the LLE algorithm.
(a) A video frame snapshot and the 3D face tracking result. The yellow mesh visualizes the geometric motion of the face.
(b) Manifold visualization with k = 5 nearest neighbors.
(c) k = 8 nearest neighbors.
(d) k = 15 nearest neighbors and labeling results.
Application for Age Estimation
Press coverage:
• AS International, How Old Are You?, www.asmag.com, Vol. 120, pp. 40-41, Dec. 2008.
• PhysOrg.com, Intelligent Computers See Your Human Traits, May 2008.
• Roland Piquepaille's Technology Trends, Computers can now guess our age, Sep. 2008.
• UIUC News Bureau, Step right up, let the computer look at your face and tell you your age, Sep. 2008.
• ABC Science, Age recognition software has a human eye, Oct. 2008.
• UPI.com, Age estimation software is created, Sep. 2008.
• Eureka! Science News, Step right up, let the computer look at your face and tell you your age, 2008.
• Zdnet.com, Computers can now guess our age, Sep. 2008.
• Webindia123.com, Age estimation software is created, Sep. 2008.
• Newkerala.com, Now, a computer software that can tell age just by looking at your face!, 2008.
• Hindustantimes.com, Computer that says how old you are, Sep. 2008.
• TXonline.net, Age estimation software is created, Sep. 2008.
• Topnews.in, Now, computer software that can tell age just by looking at your face, Oct. 2008.
Age estimation on Einstein’s faces. The estimated ages shown below each face may be somewhat older than the true ages (unknown to us) but are reasonable. Our training data are all Asian faces, so this may echo the phenomenon that Asian faces often look younger than Western faces.
Y. Fu, et al., IEEE TPAMI, CVPR, ICCV, 2009, 2010, 2011.
Why Regression on Manifold?
YGA database:
• 1,600 Asian subjects
• Age range from 0 to 93 years
• 60x60 gray-level patches
• 8,000 images in total: 4,000 female and 4,000 male
Y. Fu, et al., IEEE Transactions on Multimedia, 2008.
Regression Framework
• Multiple linear regression
• Model fitting: Ordinary Least Squares
• Residuals
• Quadratic function
Y. Fu, et al., IEEE Transactions on Multimedia, 2008.
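The regression ingredients listed on this slide (multiple linear regression, ordinary least squares, a quadratic function) can be illustrated with a small fit of age against a single low-dimensional feature. This is a minimal sketch under simplifying assumptions: one scalar feature `y` per face instead of a full manifold embedding, synthetic data generated from a known quadratic, and hypothetical function names. It solves the normal equations directly, which is fine for a 3 x 3 system but not a recommendation for large problems.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_quadratic_ols(features, ages):
    """Least-squares fit of age ~ b0 + b1*y + b2*y^2 via the normal equations."""
    X = [[1.0, y, y * y] for y in features]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * age for row, age in zip(X, ages)) for i in range(3)]
    return solve_linear_system(XtX, Xty)


def predict_age(beta, y):
    """Evaluate the fitted quadratic at feature value y."""
    return beta[0] + beta[1] * y + beta[2] * y * y


# Synthetic data (not from the YGA database): ages follow 30 + 10*y + 2*y^2.
features = [-2.0, -1.0, 0.0, 1.0, 2.0]
ages = [18.0, 22.0, 30.0, 42.0, 58.0]

beta = fit_quadratic_ols(features, ages)
residuals = [age - predict_age(beta, y) for y, age in zip(features, ages)]
print("coefficients:", beta)
print("residuals:   ", residuals)
```

Because the synthetic ages lie exactly on a quadratic, the residuals are numerically zero; with real data they would measure how far each face falls from the fitted curve.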
CEA for Age Estimation (panels: Female, Male)
Y. Fu, et al., IEEE Transactions on Multimedia, 2008.
Automatic Age Estimation
Comparison of MAEs (mean absolute errors, in years) with the result in [33], which uses manual separation of gender.
Y. Fu, et al., IEEE CVPR, ICCV, 2009.
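For readers unfamiliar with the metric, MAE is the average of the absolute differences between estimated and true ages. A minimal sketch follows; the ages are invented for illustration and are not results from the cited papers.

```python
def mean_absolute_error(true_ages, estimated_ages):
    """Average absolute difference, in years, between true and estimated ages."""
    if not true_ages or len(true_ages) != len(estimated_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - e) for t, e in zip(true_ages, estimated_ages))
    return total / len(true_ages)


# Hypothetical true and estimated ages for three test faces.
true_ages = [10, 20, 30]
estimated_ages = [12, 18, 33]
print("MAE (years):", mean_absolute_error(true_ages, estimated_ages))
```

A lower MAE means the estimates are closer to the true ages on average, and overestimates and underestimates do not cancel out.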
Gender Recognition from Body
Bio-Inspired Feature
Y. Fu, et al., ACCV, 2009.
Kinship Recognition
KinFace Database (family album images: son, father, mother, young father)
o Hypothesis: most children resemble their parents as the parents looked at a young age
o Transfer learning is used to bridge the gap