ThemeDelta: Dynamic Segmentations over Temporal Topic Models
Paper By: Samah Gad, Waqas Javed, Sohaib Ghani, Niklas Elmqvist, Tom Ewing, Keith N. Hampton, and Naren Ramakrishnan
Published: IEEE Transactions on Visualization and Computer Graphics 21(5), 2015
Presentation By: Yasha Pushak
What: Text Dataset with Timestamps
Sample posts, shown on the slide along a time axis:
“These are the faces of the Syrian refugees. Men, women, and children who’s homes were destroyed and were forced to flee.”
“I can’t believe I have so many racist friends…”
“Canada extends it’s condolences to France.”
“Shouldn’t we help our homeless before refugees?”
“Diversity is our strength. We strongly condemn the acts aimed at certain Canadians after the Paris attacks.”
“Terror in Paris…”
“Canada is full. Say No to terrorists.”
“My great grand parents immigrated from Russia to escape violence… No one told them ‘We’re full’.”
“Canada stands with Paris.”
Why: Identify Scatter/Gather Relationships
(Figure: the sample posts from the dataset, repeated in several groups.)
But what if we have lots of data?
(Figure: the sample posts from the dataset, repeated many times over.)
What: Derived
Bag-of-words representation
“Canada extends it’s condolences to France.”
Canada: 1
Condolences: 1
France: 1
Syria: 0
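The bag-of-words counts for the sample post can be reproduced with a few lines of Python. This is a minimal sketch, not the paper's preprocessing: the function name `bag_of_words`, the four-word vocabulary, and the simple regular-expression tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in text (case-insensitive)."""
    # Assumed tokenizer: runs of letters and apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur, e.g. "syria" here.
    return {word: counts[word.lower()] for word in vocabulary}

vocabulary = ["Canada", "Condolences", "France", "Syria"]
post = "Canada extends it's condolences to France."
bag = bag_of_words(post, vocabulary)
for word, count in bag.items():
    print(f"{word}: {count}")
```

Word order is discarded. Only the counts per vocabulary word are kept, which is what allows many posts to be summarized in a fixed-size vector.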
Processing the bags of words
Latent Dirichlet Allocation (LDA)
  Input: Bag of words over time
  Output: Topics (groups of keywords at a specific point in time)
Timeline Segmentation
  Input: Topics
  Output: Optimal time intervals containing groups of topics
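To make the segmentation step concrete, here is a deliberately simplified stand-in. It is not the optimization used in the paper: it greedily starts a new segment whenever consecutive time steps share too few keywords, so the result is not guaranteed to be optimal. The Jaccard measure, the `threshold` parameter, and the weekly keyword sets are all assumptions for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def segment_timeline(keywords_per_step, threshold=0.5):
    """Greedily group consecutive time steps into segments.

    A new segment starts whenever the keywords of a step overlap too
    little (Jaccard < threshold) with those of the previous step.
    Returns a list of (start_index, end_index) pairs, both inclusive.
    """
    if not keywords_per_step:
        return []
    segments = []
    start = 0
    for i in range(1, len(keywords_per_step)):
        if jaccard(keywords_per_step[i - 1], keywords_per_step[i]) < threshold:
            segments.append((start, i - 1))
            start = i
    segments.append((start, len(keywords_per_step) - 1))
    return segments

# Hypothetical weekly keyword sets (invented for this sketch).
weeks = [
    {"german", "mask", "influenza"},
    {"german", "mask", "influenza", "army"},
    {"home", "family", "son", "influenza"},
    {"home", "family", "daughter", "influenza"},
]
segments = segment_timeline(weeks, threshold=0.5)
print(segments)
```

A greedy pass like this only compares neighbouring steps. A method that searches over all candidate boundaries can trade off segment length against topic change, which is closer to what "optimal time intervals" implies.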
What? Why? How?
What: Data
  Timestamped text dataset
What: Derived
  Bag of words over time
  Topics (groups of keywords at a specific point in time)
  Time intervals containing groups of topics
Why: Tasks
  Identify changes in topics over time
  Identify scatter/gather relationships
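One way to read the scatter/gather task is in terms of keywords shared between topics in adjacent time intervals: a topic scatters when its keywords move into several later topics, and topics gather when several earlier topics feed one later topic. The sketch below follows that reading. The function names and the example topics are hypothetical, and the definition itself is an assumption rather than the paper's formal one.

```python
from collections import defaultdict

def keyword_links(segment_a, segment_b):
    """Return {(i, j): shared_keywords} for topic i in segment_a and topic j in segment_b."""
    links = {}
    for i, topic_a in enumerate(segment_a):
        for j, topic_b in enumerate(segment_b):
            shared = topic_a & topic_b
            if shared:
                links[(i, j)] = shared
    return links

def scatter_gather(links):
    """Find source topics that scatter and target topics that gather."""
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for i, j in links:
        targets_of[i].add(j)
        sources_of[j].add(i)
    # Scatter: one earlier topic linked to more than one later topic.
    scatter = {i: sorted(js) for i, js in targets_of.items() if len(js) > 1}
    # Gather: one later topic linked to more than one earlier topic.
    gather = {j: sorted(srcs) for j, srcs in sources_of.items() if len(srcs) > 1}
    return scatter, gather

# Hypothetical topics (keyword sets) in two adjacent time intervals.
earlier = [{"refugees", "syria", "homes"}, {"paris", "terror"}]
later = [{"refugees", "paris", "canada"}, {"syria", "homeless"}]
links = keyword_links(earlier, later)
scatter, gather = scatter_gather(links)
print(scatter, gather)
```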
What? Why? How?
How: Encode
  Parallel axes for time segments
  Spatially partition topics along a segment
  Label keywords within topics
  Linked keywords across time intervals
  Segment labels for dates and duration
How: Encode (Free Channels)
  Size of labels for quantitative data
  Width of links for quantitative data
  Link colour for categorical or ordered data
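Mapping a quantitative attribute onto a free channel such as label size or link width comes down to a scale function. The sketch below uses a clamped linear scale. The keyword frequencies and the 8 to 24 point font range are invented for illustration and are not taken from the paper.

```python
def linear_scale(value, domain, output_range):
    """Map value from domain (lo, hi) linearly onto output_range (lo, hi), clamped."""
    d_lo, d_hi = domain
    r_lo, r_hi = output_range
    if d_hi == d_lo:
        return r_lo  # degenerate domain: every value maps to the low end
    t = (value - d_lo) / (d_hi - d_lo)
    t = max(0.0, min(1.0, t))
    return r_lo + t * (r_hi - r_lo)

# Hypothetical keyword frequencies mapped to label font sizes in points.
frequencies = {"influenza": 40, "mask": 10, "german": 25}
lo, hi = min(frequencies.values()), max(frequencies.values())
font_sizes = {w: linear_scale(f, (lo, hi), (8, 24)) for w, f in frequencies.items()}
print(font_sizes)
```

The same function could drive link width by swapping the output range, for example 1 to 6 pixels.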
What? Why? How?
How: Manipulate
  Navigate: geometric zooming and panning
  Select: highlight keywords
  Search: select keywords by searching
How: Reduce
  Filter: by selected keywords, then re-sort
Filtering on “Energy”
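A keyword filter of this kind can be sketched as follows. The data layout (a list of time segments, each a list of keyword sets) and the function name `filter_topics` are assumptions, and the re-sort rule used here, which puts topics matching more selected keywords first, is only a guess at what re-sorting might mean in the tool.

```python
def filter_topics(segments, selected):
    """Keep only topics that contain at least one selected keyword.

    segments is a list of time segments, each a list of topics (keyword sets).
    Within each segment the surviving topics are re-sorted so that topics
    matching more selected keywords come first (an assumed ordering).
    """
    selected = {word.lower() for word in selected}
    filtered = []
    for topics in segments:
        kept = [t for t in topics if selected & {w.lower() for w in t}]
        # The sort is stable, so ties keep their original relative order.
        kept.sort(key=lambda t: len(selected & {w.lower() for w in t}), reverse=True)
        filtered.append(kept)
    return filtered

# Hypothetical topics in two time segments.
segments = [
    [{"jobs", "economy"}, {"energy", "oil", "jobs"}],
    [{"energy", "coal"}, {"health", "care"}, {"tax", "energy", "oil"}],
]
result = filter_topics(segments, ["Energy"])
print(result)
```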
Example: US Presidential Election 2012 - Mitt Romney
Spanish Flu in the News
Sep 09 – Oct 09: “German”, “mask”. Advisories from the First World War to wear a mask.
Oct 10 – Dec 05: “home”, “family”, “son”, “daughter”. Men from the army were allowed to return home.
Dec 06 – Dec 13: “German” disappears. The war was won on November 11.
Expert User Study on Spanish Flu Data
Change-Focused Questions
  How did the newspapers describe the spread of influenza?
  How does the description of the pandemic change over time?
  Are there different times when the influenza pandemic becomes less important? What are those time periods?
Connection-Focused Questions
  What are the categories that appear to be associated with influenza in different newspapers?
  Was there a specific feeling that surrounded the influenza reporting in the newspapers?
Scalability Limits
Thank You
References
ThemeDelta: Dynamic Segmentations over Temporal Topic Models, by Samah Gad, Waqas Javed, Sohaib Ghani, Niklas Elmqvist, Tom Ewing, Keith N. Hampton, and Naren Ramakrishnan, in IEEE Transactions on Visualization and Computer Graphics 21(5), 2015.
Visualization Analysis and Design, by Tamara Munzner, A K Peters Visualization Series, CRC Press, 2014.