Intended for the 2015 FedCASIC Meeting
James R. Caplan, PhD
This presentation is my own and does not represent any official position of the Department of Defense
Cognitive psychology formed around how words and ideas are connected. Think Berkeley, Hume, and John Stuart Mill from the 18th and 19th centuries
Reemerged in the 1950s, based on the WWII focus on human performance and attention, developments in computer science (especially artificial intelligence), and interest in linguistics. Think Chomsky and McClelland
Basis of my graduate training: 1968 master's thesis based on word associations
Early studies were human-powered
Subjects sorted statements into "buckets" based on how they seemed to "go together"
The group would discuss results and reach consensus
Category definitions changed with ongoing context, requiring resorting
Definitions were hard to keep in mind as the number of buckets increased past 9 or 10
Consensus-building was arbitrary and results unreliable across groups
Employee attitude surveys typically included open-ended questions, comments, and "Other/Specify" responses
Contractors sanitized personal information, places, and expletives, then categorized and coded answers
Expensive, time-consuming
Added months to final analysis, obviating the advantages of computer administration
Eventually, open-ended questions were dropped from our employee surveys
An important way to know if some questions were confusing or ambiguous
Lost alternatives we never considered
The ability for respondents to interact with us: perhaps an important positive motivator
SPSS comes out with "Text Analysis for Surveys," approx. 2008, with the promise of automated coding and categorizing
Reality: relies on data dictionaries and intensive human intervention; it's a note taker
Quote from Roller and Lavrakas (2015): "computer software programs can provide important assistance in the coding of manifest content, but these programs cannot handle the coding of complex latent content for which the human brain is best suited."
Problem: extensive preparation required, no context sensitivity, no serious natural language processing
Three versions later, no real improvement
2009: IBM purchases SPSS
Many other solutions emerge: Cognos, IBM Media Analytics, and Watson, just from IBM
Other solutions emerge, but the emphasis is on marketing research, biological and medical research, brand and product preferences, analysis of Big Data, and national security
University of Maryland, Institute for Advanced Computing, develops Topic Analysis, based on natural language processing
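To make the idea of natural-language topic analysis concrete, here is a minimal sketch of the general technique using scikit-learn's LatentDirichletAllocation. It is not the UMD tool or any IBM product, and the sample comments are hypothetical.

```python
# A minimal sketch of topic analysis on open-ended survey comments.
# NOT the UMD tool or Watson; it illustrates the general technique
# using scikit-learn's LDA. The sample comments are hypothetical.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "The survey was far too long and many questions were redundant.",
    "Question 12 was confusing; I did not know which option applied.",
    "My supervisor never acts on the feedback we give.",
    "I enjoy my job but the workload has grown unreasonable.",
    "Too many demographic questions repeated from last year.",
]

# Bag-of-words counts, dropping common English stop words
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(comments)

# Fit a small LDA model; n_components is the number of topics to find
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Show the top words that define each discovered topic
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[::-1][:5]]
    print(f"Topic {i}: {', '.join(top)}")
```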
Application of topic analysis to survey data
Solves the problem of valence/affect (known as sentiment analysis by market researchers)
Sanitized dataset from 2008
Question: "If you have comments or concerns that you were not able to express in answering this survey, please enter them in the space provided. Any comments you make on this questionnaire will be kept confidential, and no follow-up action will be taken in response to any specifics reported."
Very preliminary results: IBM threw this into Watson with no instructions. I haven't had a chance to interact with it yet
1. Comments about the survey itself (too long, redundant): 59.2% negative
2. Comments about specific questions: 50.2% negative
3. Comments about the organization: 57.7% negative
4. Comments about work/job satisfaction: 55.1% negative
5. Etc.
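For readers who want to see how a percent-negative breakdown like the one above could be tallied, here is a hedged sketch using the open-source VADER scorer as a stand-in for Watson's valence scoring. The category labels and comments are hypothetical, and this is not the actual pipeline behind these figures.

```python
# A hedged illustration of a percent-negative breakdown by category.
# NOT the Watson pipeline; uses the open-source VADER sentiment
# scorer as a stand-in. Categories and comments are hypothetical.

from collections import defaultdict
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# (category, comment) pairs as a topic model might assign them
coded_comments = [
    ("survey itself", "This survey is way too long and repetitive."),
    ("survey itself", "Nice that leadership asks for our views."),
    ("specific questions", "Question 7 made no sense to me."),
    ("organization", "Morale here has been terrible for years."),
    ("job satisfaction", "I genuinely like the work I do."),
]

analyzer = SentimentIntensityAnalyzer()
totals = defaultdict(int)
negatives = defaultdict(int)

for category, text in coded_comments:
    totals[category] += 1
    # VADER's compound score runs from -1 (negative) to +1 (positive)
    if analyzer.polarity_scores(text)["compound"] < 0:
        negatives[category] += 1

for category in totals:
    pct = 100.0 * negatives[category] / totals[category]
    print(f"{category}: {pct:.1f}% negative")
```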
Survey researchers need to explore these tools
◦ Refine our sentiment analysis
◦ See how clusters correlate with known demographics (a sketch follows)
◦ Check out some "Other/Specify" responses
Still requires the human brain, but the heavy lifting can be done for us with these techniques
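As a sketch of the "clusters versus demographics" suggestion above, the following cross-tabulates hypothetical cluster assignments against a hypothetical demographic field and applies a standard chi-square test of independence. It is an illustration under assumed data, not a prescribed method.

```python
# A minimal sketch of checking how topic clusters correlate with a
# known demographic. The cluster labels, demographic field, and
# values are hypothetical examples.

import pandas as pd
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "cluster":   ["survey", "survey", "org", "job", "org", "job"],
    "component": ["Army", "Navy", "Army", "Air Force", "Navy", "Army"],
})

# Counts of each cluster within each demographic group
table = pd.crosstab(df["cluster"], df["component"])
print(table)

# Chi-square test of independence between cluster and demographic
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.3f}, dof={dof}")
```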
james.r.caplan2.civ@mail.mil