Measuring Happiness the Big Data Way


  1. Measuring Happiness the Big Data Way
     DPG Spring Meeting, Dresden 2011
     Peter Dodds, Chris Danforth, Isabel Kloumann, Cathy Bliss, and Kameron Harris
     Department of Mathematics & Statistics, Center for Complex Systems, Vermont Advanced Computing Center, University of Vermont

  2. Outline
     ◮ Some motivation
     ◮ Measuring emotional content
     ◮ Data sets
     ◮ Analysis
     ◮ Songs
     ◮ Blogs
     ◮ Tweets
     ◮ Mechanical Turk
     ◮ References

  3. The Team: thanks to ...
     1. People: Kameron Harris, Isabel Kloumann, Catherine Bliss, Chris Danforth
     2. Machines:
        ◮ 1400 processors + storage at the Vermont Advanced Computing Center
        ◮ 30 TB of storage in Danforth’s office
     3. Support: NSF and NASA

  4. Papers, etc.:
     ◮ PSD, KDH, IMK, CAB, and CMD, “Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter.” http://arxiv.org/abs/1101.5120
     ◮ P. S. Dodds and C. M. Danforth, “Measuring the Happiness of Large-Scale Written Expression: Songs, Blogs, and Presidents.” [8] Journal of Happiness Studies, 2009.
     ◮ http://www.uvm.edu/~pdodds/research/
     ◮ “Does a Nation’s Mood Lurk in Its Songs and Blogs?” by Benedict Carey, New York Times, August 2009.

  5. Happiness:
     ◮ Jefferson: “. . . the pursuit of happiness”
     ◮ Bentham: hedonistic calculus
     ◮ Socrates et al.: eudaimonia [9]

  6. Early drafts:

  7. Happiness:
     Even the odd modern economist likes happiness: “Happiness” by Richard Layard [12]

  8. Desiring happiness, not just for boffins:
     ◮ Average people routinely report that being happy is what they want most in life [12, 13, 7]
     ◮ And it matters: “Happy people live longer: . . .” Survey by Diener and Chan. [7]
     National indices of well-being:
     ◮ Bhutan
     ◮ France
     ◮ Australia

  9. Science ≃ Describe + Explain:
     Lord Kelvin (possibly):
     ◮ “To measure is to know.”
     ◮ “If you cannot measure it, you cannot improve it.”
     ◮ “X-rays will prove to be a hoax.”
     ◮ “There is nothing new to be discovered in physics now. All that remains is more and more precise measurement.”

  10. Emotional content
      So how does one measure
      1. happiness?
      2. levels of other emotional states?
      Just ask people how happy they are.
      ◮ Experience sampling [4, 6, 5] (Csikszentmihalyi et al.)
      ◮ Day reconstruction [10] (Kahneman et al.)
      But self-reporting has drawbacks...
      ◮ relies on memory and self-perception
      ◮ induces misreporting [14]
      ◮ costly

  11. Happiness, attention, and doing:
      Fig. 1. Mean happiness reported during each activity (top) and while mind wandering to unpleasant topics, neutral topics, pleasant topics or not mind wandering (bottom). Dashed line indicates mean of happiness across all samples. Bubble area indicates the frequency of occurrence. The largest bubble (“not mind wandering”) corresponds to 53.1% of the samples, and the smallest bubble (“praying/worshipping/meditating”) corresponds to 0.1% of the samples.
      Killingsworth and Gilbert, Science, 2011 [11]

  12. Measuring Emotional Content:
      We’d like to build an ‘hedonometer’:
      ◮ An instrument to ‘remotely-sense’ emotional states and levels, in real time or post hoc.
      Ideally:
      ◮ Transparent
      ◮ Non-reactive
      ◮ Fast
      ◮ Complementary to self-reported measures
      ◮ Based on written expression
      ◮ Improvable
      ◮ Uses human evaluation

  13. ANEW study words: examples
      ANEW = “Affective Norms for English Words” [3]
      [Figure: frequency (0 to 200) against valence v (1 to 9), with example words for each valence band]
      ◮ v from 8 to 9: love/paradise/triumphant
      ◮ v from 7 to 8: glory/luxury/trophy
      ◮ v from 6 to 7: optimism/pancakes/church
      ◮ v from 5 to 6: engine/paper/street
      ◮ v from 4 to 5: derelict/neurotic/vanity
      ◮ v from 3 to 4: fault/corrupt/lawsuit
      ◮ v from 2 to 3: trauma/hostage/disgusted
      ◮ v from 1 to 2: funeral/rape/suicide

  14. Analysing text:
      Lyrics for Michael Jackson’s Billie Jean:
      “She was more like a beauty queen from a movie scene.
      And mother always told me, be careful who you love.
      And be careful of what you do ’cause the lie becomes the truth.
      Billie Jean is not my lover,
      She’s just a girl who claims that I am the one.”

      ANEW words found in the lyrics, with valence v_k and frequency f_k:

      | k | ANEW word | v_k | f_k |
      |---|-----------|------|-----|
      | 1 | love | 8.72 | 1 |
      | 2 | mother | 8.39 | 1 |
      | 3 | baby | 8.22 | 3 |
      | 4 | beauty | 7.82 | 1 |
      | 5 | truth | 7.80 | 1 |
      | 6 | people | 7.33 | 2 |
      | 7 | strong | 7.11 | 1 |
      | 8 | young | 6.89 | 2 |
      | 9 | girl | 6.87 | 4 |
      | 10 | movie | 6.86 | 1 |
      | 11 | perfume | 6.76 | 1 |
      | 12 | queen | 6.44 | 1 |
      | 13 | name | 5.55 | 1 |
      | 14 | lie | 2.79 | 1 |

      Average valence of a text: $v_{\text{text}} = \frac{\sum_k v_k f_k}{\sum_k f_k}$

      ◮ v_Billie Jean = 7.1
      ◮ v_Thriller = 6.3
      ◮ v_Michael Jackson = 6.4
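
The frequency-weighted average on this slide can be reproduced with a short sketch. The valence scores and frequencies below are copied from the slide's table. The function names and the tokenizer are assumptions made for illustration, since the slides do not say how the lyrics were split into words.

```python
import re
from collections import Counter

# Valence scores v_k (1 = least happy, 9 = happiest) for the ANEW words
# listed in the slide's table.
ANEW_VALENCE = {
    "love": 8.72, "mother": 8.39, "baby": 8.22, "beauty": 7.82,
    "truth": 7.80, "people": 7.33, "strong": 7.11, "young": 6.89,
    "girl": 6.87, "movie": 6.86, "perfume": 6.76, "queen": 6.44,
    "name": 5.55, "lie": 2.79,
}

# Frequencies f_k from the same table.
BILLIE_JEAN_COUNTS = {
    "love": 1, "mother": 1, "baby": 3, "beauty": 1, "truth": 1,
    "people": 2, "strong": 1, "young": 2, "girl": 4, "movie": 1,
    "perfume": 1, "queen": 1, "name": 1, "lie": 1,
}


def count_lexicon_words(text, lexicon):
    """Count exact-match lexicon words in text.

    Assumption: tokens are lowercase runs of letters and apostrophes;
    the original analysis may have tokenized differently.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(token for token in tokens if token in lexicon)


def average_valence(counts, lexicon):
    """Return sum_k v_k f_k / sum_k f_k over the counted lexicon words."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no lexicon words")
    return sum(lexicon[word] * f for word, f in counts.items()) / total


if __name__ == "__main__":
    v = average_valence(BILLIE_JEAN_COUNTS, ANEW_VALENCE)
    print(f"v_text = {v:.1f}")  # prints "v_text = 7.1"
```

With the table's 21 word occurrences the weighted sum is 148.82, so the average is about 7.09, which matches the 7.1 reported for Billie Jean. Words outside the lexicon contribute nothing, and matching is exact, so "lover" does not count as "love" in this sketch.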

  15. Data sets:
      Texts:
      1. Song lyrics (1960–2007)
      2. Song titles (1960–2008)
      3. State of the Union (SOTU) Addresses (1790–2008)
      Sources:
      ◮ hotlyrics.com
      ◮ freedb.com
      ◮ American Presidency Project: www.presidency.ucsb.edu

  16. Data sets:
      4. Blog phrases containing “I feel...”, “I am feeling”, etc., taken from wefeelfine.org (API, 2005–2010)
      ◮ Created by Jonathan Harris & Sep Kamvar

  17. Data sets:
      5.
      6. New York Times (20 years)
      7. Gutenberg.org
      8. Google Books: http://ngrams.googlelabs.com/
      9. . . .

  18. Some numbers:

      | Counts | Song lyrics | Song titles | Blogs | SOTU | Twitter |
      |--------|-------------|-------------|-------|------|---------|
      | All words | 58,610,849 | 60,867,223 | 155,667,394 | 1,796,763 | ∼30 × 10^9 |
      | ANEW words | 3,477,575 (5.9%) | 5,612,708 (9.2%) | 8,581,226 (5.5%) | 61,926 (3.5%) | ∼1 × 10^9 (3.7%) |
      | Individuals | ∼20,000 | ∼632,000 | ∼2,335,000 | 43 | ∼50 × 10^6 |
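
The percentages on this slide are the share of all words that appear in the ANEW list. A minimal sketch of that arithmetic, using only the exact counts given on the slide (the Twitter figures are approximate and left out); the names `COUNTS` and `anew_coverage` are illustrative:

```python
# (all words, ANEW words) for each corpus, copied from the slide.
COUNTS = {
    "Song lyrics": (58_610_849, 3_477_575),
    "Song titles": (60_867_223, 5_612_708),
    "Blogs": (155_667_394, 8_581_226),
    "SOTU": (1_796_763, 61_926),
}


def anew_coverage(all_words, anew_words):
    """Return the percentage of word occurrences that are ANEW words."""
    return 100.0 * anew_words / all_words


if __name__ == "__main__":
    for name, (all_words, anew_words) in COUNTS.items():
        print(f"{name}: {anew_coverage(all_words, anew_words):.2f}%")
```

The results agree with the slide to within rounding. The SOTU figure comes out near 3.45%, which the slide reports as 3.5%.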

  19. Summary:
      Science = Orwell
      Policy = Brave New World

  20. Average happiness h_avg of each text, with words of similar score:

      | Text | h_avg | Words with a similar score |
      |------|-------|----------------------------|
      | Soul/Gospel lyrics [8] | 6.9 | chocolate (6.88), leisurely (6.88), penthouse (6.81) |
      | Pop lyrics [8] | 6.7 | dream (6.73), honey (6.73), sugar (6.74) |
      | Dante’s Paradise [1] | 6.5 | muffin (6.57), rabbit (6.57), smooth (6.58) |
      | Tweets, 9/9/2008 to 12/31/2010 | 6.4 | thought (6.39), face (6.39), blond (6.42) |
      | Rock lyrics [8] | 6.3 | church (6.28), tree (6.32), air (6.34) |
      | Enron Emails [2] | 6.2 | clouds (6.18), alert (6.20), computer (6.24) |
      | State of the Union Messages [8] | 6.1 | grass (6.12), idol (6.12), bottle (6.15) |
      | New York Times (1987–2007) [15] | 6.0 | hotel (6.00), tennis (6.02), wonder (6.03) |
      | Blogs [8] | 5.8 | owl (5.80), whistle (5.81), humble (5.86) |
      | Dante’s Inferno [1] | 5.5 | glacier (5.50), repentant (5.53), mischief (5.57) |
      | Heavy Metal lyrics [8] | 5.4 | lamp (5.41), elevator (5.44), truck (5.47) |

  21. Song lyrics: average happiness (valence)
      [Figure: mean valence v_avg (axis from 5.9 to 6.8) against year (1960 to 2010)]

  22. Song lyrics: average happiness of genres
      [Figure: mean valence v_avg (axis from 4.5 to 7) against year (1960 to 2010), one curve per genre]
      Legend values:
      ◮ Gospel/Soul (6.91)
      ◮ Pop (6.69)
      ◮ Reggae (6.40)
      ◮ Rock (6.27)
      ◮ Rap/Hip-Hop (6.01)
      ◮ Punk (5.61)
      ◮ Metal/Industrial (5.10)
