Natural Language Processing for Framing Noah Smith University of Washington nasmith@cs.washington.edu Collaborators: David Bamman (UCB), Amber Boydstun (UCD), Dallas Card (CMU), Justin Gross (UMass), Brendan O’Connor (UMass), Philip Resnik (UMD) May 24, 2016 These slides: http://tinyurl.com/framing-noah
Outline A R K ◮ Motivation: a study in which we’re using NLP ◮ Building a text classifier: 1. Define the classes 2. Annotate training examples 3. Featurize data ◮ Brief tangent: creating new features 4. Learn to classify 5. Evaluate the classifier ◮ Looking ahead
Some Terminology Natural language processing (NLP): Algorithms that do useful things (for someone) with text (or other linguistic data). Framing is choosing “a few elements of perceived reality and assembling a narrative that highlights connections among them to promote a particular interpretation.” Entman (1993, 2007)
Media Framing and Public Opinion ◮ We know that framing works ... sometimes. ◮ Lack of systematic tests of framing effects on public opinion. When do media framing and public opinion covary?
Hypotheses H1: Issue Salience. The covariance between media framing of immigration and public opinion will be stronger during periods of time when immigration is highly salient in the media, relative to periods of time when the issue is not highly salient. H2: Frame Competition. The more diffuse media coverage of immigration is across competing frames, the weaker the covariance between media framing of the issue and public opinion about the issue will be.
Variables ◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal) [Figure: public mood over time, 1952–2012; vertical axis from 40 to 75]
Text Corpus ◮ 13 U.S. newspapers (e.g., NYT, USA Today) ◮ 1980–2012 (132 quarters) ◮ 38,283 articles ◮ Annotated for tone (pro/anti-immigration) and 14 emphasis framing “dimensions” ◮ Random subset of 4,154 manually annotated ◮ Automatic annotation of the rest ◮ (More about this later!)
Variables ◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal) From the text corpus (38,283 articles): ◮ Media tone: count(pro) – count(anti) [Figure: three panels (Pro, Neutral, Anti), 1980–2010; vertical axes from 0 to 750]
Variables ◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal) From the text corpus (38,283 articles): ◮ Media tone: count(pro) – count(anti) ◮ High salience: ≥ 350 articles published in the quarter? (binary) H1: Public mood ∝ Media tone × High salience
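The two corpus-derived variables on this slide can be sketched in a few lines. This is a minimal illustration that takes the slide's formulas literally: media tone as a raw count difference and high salience as a threshold on the quarter's article count. The function name, the input layout, and the toy data are assumptions made for the example; the slides do not say whether any normalization was applied before the regression.

```python
from collections import Counter

SALIENCE_THRESHOLD = 350  # articles per quarter, as stated on the slide


def quarterly_variables(articles):
    """Compute media tone and high salience for each quarter.

    articles: iterable of (quarter, tone) pairs, where tone is one of
    "pro", "neutral", or "anti" (a hypothetical input layout).
    Returns {quarter: {"media_tone": int, "high_salience": bool}}.
    """
    tone_counts = {}
    for quarter, tone in articles:
        tone_counts.setdefault(quarter, Counter())[tone] += 1

    result = {}
    for quarter, counts in tone_counts.items():
        total = sum(counts.values())
        result[quarter] = {
            # Media tone: count(pro) - count(anti)
            "media_tone": counts["pro"] - counts["anti"],
            # High salience: at least 350 articles in the quarter
            "high_salience": total >= SALIENCE_THRESHOLD,
        }
    return result


# Toy input, invented for illustration (not drawn from the corpus).
toy = (
    [("2006Q2", "pro")] * 200
    + [("2006Q2", "anti")] * 120
    + [("2006Q2", "neutral")] * 40
    + [("1985Q1", "anti")] * 3
)
variables = quarterly_variables(toy)
```

Neutral articles count toward salience but cancel out of the tone score, which is why the two variables can move independently.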
Framing Dimensions over Time [Figure]
Variables ◮ Public mood (dependent variable) from Stimson (2014) (higher is more liberal) From the text corpus (38,283 articles): ◮ Media tone: count(pro) – count(anti) ◮ High salience: ≥ 350 articles published in the quarter? (binary) ◮ Frame competition: Shannon entropy across emphasis framing dimensions in the quarter H2: Public mood ∝ –(Media tone × Frame competition)
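Frame competition as Shannon entropy can be sketched as follows. The slide does not specify the logarithm base or whether the distribution is taken over articles' primary frames or over annotated spans; this example assumes base 2 and one frame label per article, and the function name is hypothetical.

```python
import math
from collections import Counter


def frame_competition(frame_labels):
    """Shannon entropy (in bits) of the frame distribution in one quarter.

    frame_labels: iterable with one framing-dimension label per article
    (an assumption; the slides do not state the unit of counting).
    """
    counts = Counter(frame_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# One dominant frame: no competition.
concentrated = frame_competition(["Economic"] * 4)
# Four frames used equally often: maximal competition among four frames.
diffuse = frame_competition(
    ["Economic", "Political", "Morality", "Crime and punishment"]
)
```

Entropy is 0 when all coverage uses a single frame and grows as coverage spreads more evenly over more frames, which matches the "diffuse coverage" wording of H2.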
Regression
                                  coefficient   standard error
Public mood (lagged)                     0.05             0.83
Media tone                             222.09           108.53
High salience                            0.30             1.26
Media tone × high salience               9.57             5.00   (H1)
Frame competition                      -10.06            10.86
Media tone × frame competition         -87.41            43.60   (H2)
Constant                                32.48            27.61
N = 132; adjusted R² = 0.759, RMSE = 3.772, p < 0.05 in bold
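Reading the rows of the regression table as regressors, the model has roughly the following form. This is a sketch: the symbol names are assumptions, t indexes quarters, and the slides do not state the estimator or any detail of the lag beyond the "lagged" label.

```latex
\mathrm{Mood}_t = \beta_0
  + \beta_1\,\mathrm{Mood}_{t-1}
  + \beta_2\,\mathrm{Tone}_t
  + \beta_3\,\mathrm{Sal}_t
  + \beta_4\,(\mathrm{Tone}_t \times \mathrm{Sal}_t)
  + \beta_5\,\mathrm{Comp}_t
  + \beta_6\,(\mathrm{Tone}_t \times \mathrm{Comp}_t)
  + \varepsilon_t
```

Under this reading, H1 predicts a positive interaction coefficient (β₄ > 0) and H2 a negative one (β₆ < 0), which agrees in sign with the reported estimates of 9.57 and -87.41.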
Discussion ◮ Public opinion on immigration ∝ Media tone on immigration ◮ ... more when immigration is a salient issue ◮ ... less when frame competition is high ◮ Still to be accounted for: ◮ Demographic shifts ◮ Major events ◮ ...
Outline ◮ Motivation: a study in which we’re using NLP ✓ ◮ Building a text classifier: 1. Define the classes 2. Annotate training examples 3. Featurize data ◮ Brief tangent: creating new features 4. Learn to classify 5. Evaluate the classifier ◮ Looking ahead
Text Classification Mosteller and Wallace (1963) automatically inferred the authors of the disputed Federalist Papers. Many other examples: ◮ News: politics vs. sports vs. business vs. technology ... ◮ Reviews of films, restaurants, products: positive vs. negative ◮ Email: spam vs. not ◮ What is the reading level of a piece of text? ◮ Will a scientific paper be cited? ◮ Will a piece of proposed legislation pass?
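The five steps in the outline (define classes, annotate, featurize, learn, evaluate) can be walked through on a toy news example. This is only a sketch under stated assumptions: bag-of-words features and a multinomial naive Bayes learner with add-one smoothing are stand-ins chosen for brevity, not the features or learner used in the study, and the training sentences are invented.

```python
import math
from collections import Counter, defaultdict


def featurize(text):
    """Step 3: bag of words (lowercased whitespace tokens mapped to counts)."""
    return Counter(text.lower().split())


def train_naive_bayes(examples):
    """Step 4: collect label counts and per-label word counts."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        feats = featurize(text)
        label_counts[label] += 1
        word_counts[label].update(feats)
        vocab.update(feats)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    label_counts, word_counts, vocab = model
    n_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(count / n_examples)
        for word, freq in featurize(text).items():
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing over the training vocabulary.
            prob = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += freq * math.log(prob)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Steps 1-2: two classes and a handful of hand-labeled (invented) examples.
train = [
    ("the senate passed the budget bill", "politics"),
    ("voters elected a new governor", "politics"),
    ("the team won the final game", "sports"),
    ("the striker scored a late goal", "sports"),
]
held_out = [
    ("the governor signed the bill", "politics"),
    ("a late goal won the game", "sports"),
]

model = train_naive_bayes(train)
# Step 5: evaluate on examples that were not used for training.
predictions = [classify(model, text) for text, _ in held_out]
accuracy = sum(p == gold for p, (_, gold) in zip(predictions, held_out)) / len(held_out)
```

Keeping the evaluation examples separate from the training examples is the important design point; accuracy measured on the training data itself would overstate how well the classifier generalizes.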
Media Frames Codebook: Framing Dimensions Boydstun et al. (2014)
◮ Economic: costs, benefits, or other financial implications
◮ Capacity and resources: availability of physical, human, or financial resources
◮ Morality: religious or ethical implications
◮ Fairness and equality: balance or distribution of rights, responsibilities, and resources
◮ Legality, constitutionality and jurisprudence: rights, freedoms, and the authority of government
◮ Policy prescription and evaluation: discussion of specific policies aimed at addressing problems
◮ Crime and punishment: effectiveness and implications of laws and their enforcement
◮ Security and defense: threats to welfare of the individual, community, or nation
◮ Health and safety: health care, sanitation, and public safety
◮ Quality of life: threats and opportunities for the individual’s health, happiness, and well-being
◮ Cultural identity: traditions, customs, or values of a social group in relation to a policy issue
◮ Public opinion: attitudes and opinions of the general public, including polling and demographics
◮ Political: considerations related to politics and politicians, including lobbying, elections, and attempts to sway voters
◮ External regulation and reputation: international reputation or foreign policy of the United States
◮ Other: any coherent group of frames not covered by the above categories
Outline ◮ Motivation: a study in which we’re using NLP ✓ ◮ Building a text classifier: 1. Define the classes ✓ 2. Annotate training examples 3. Featurize data ◮ Brief tangent: creating new features 4. Learn to classify 5. Evaluate the classifier ◮ Looking ahead
Media Frames Corpus Card et al. (2015) ◮ Articles selected by keyword search across thirteen newspapers, 1980–2012, on three issues ◮ Annotated for primary framing dimension, overall tone (i.e., stance on the issue, pro-/anti-/neutral), and arbitrary spans that evoke framing dimensions ◮ 5,549 (immigration) ◮ 6,298 (same-sex marriage) ◮ 4,077 (smoking) ◮ https://github.com/dallascard/media_frames_corpus
Example (Denver Post, 2006) [WHERE THE JOBS ARE] (Economic) [Critics of illegal immigration can make many cogent arguments to support the position that the U.S. Congress and the Colorado legislature must develop effective and well-enforced immigration policies that will restrict the number of people who migrate here legally and illegally.] (Policy prescription) [It’s true that all forms of [immigration exert influence over our economic and cultural make-up.] (Cultural identity) In some ways, immigration improves our economy by adding laborers, taxpayers and consumers, and in other ways immigration detracts from our economy by increasing the number of students, health care recipients and other beneficiaries of public services.] (Economic) [Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss] (Economic). There are rational arguments to support both sides of this debate, and it’s useful and educational to hear the varying positions.
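Because annotated spans are arbitrary and may nest, as the Cultural identity span inside the longer Economic span does in this example, one simple way to hold them is as character offsets plus a label. The sketch below is illustrative only: the `FrameSpan` class and `frames_at` helper are hypothetical names, and this is not the file format of the released corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameSpan:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    frame: str  # framing dimension from the codebook


def frames_at(spans, position):
    """Return the frames of every span that covers a character position."""
    return [s.frame for s in spans if s.start <= position < s.end]


# A shortened excerpt of the example text, for illustration.
text = (
    "It's true that all forms of immigration exert influence over our "
    "economic and cultural make-up. In some ways, immigration improves "
    "our economy."
)

inner_start = text.index("immigration exert")
inner_end = text.index("make-up.") + len("make-up.")
spans = [
    FrameSpan(0, len(text), "Economic"),                      # outer span
    FrameSpan(inner_start, inner_end, "Cultural identity"),   # nested span
]

labels_at_cultural = frames_at(spans, text.index("cultural"))
labels_at_improves = frames_at(spans, text.index("improves"))
```

Storing spans as independent offset ranges, instead of inline brackets, lets overlapping and nested annotations coexist without any ambiguity about which label belongs to which bracket.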