
A Framework for Political Portmanteau Decomposition (Nabil Hossain)



  1. A Framework for Political Portmanteau Decomposition
     Nabil Hossain, Minh Tran, Henry Kautz
     nhossain@cs.rochester.edu
     Dept. Computer Science, University of Rochester, NY

  2. Political Portmanteau
     • Portmanteau
       • words formed by combining sounds and meanings of two words
       • brunch = breakfast + lunch
       • motel = motor + hotel
     • Political portmanteau (PP)
       • portmanteau in which at least one word refers to political entity
       • libtard = liberal + retard
       • repugnican = repugnant + republican
     • political framing
     • creative, sticky
     • novel slang; can be used in hate speech


  4. Political Portmanteau
     • Portmanteau
       • words formed by combining sounds and meanings of two words
       • brunch = breakfast + lunch
       • motel = motor + hotel
     • Political portmanteau (PP)
       • portmanteau in which at least one word refers to political entity
       • libtard = liberal + retard
       • repugnican = repugnant + republican
     • offensive; political framing
     • creative, humorous, slang, sticky
     • can be used in hate speech
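The blend pattern in the examples above (the start of one word joined to the end of another) can be checked mechanically. The following is a minimal sketch, not the authors' code; the function name `blend_splits` is hypothetical, and it assumes a purely orthographic blend with no overlapping letters.

```python
def blend_splits(blend, first, second):
    """Return every (head, tail) split of `blend` where `head` is a prefix
    of `first` and `tail` is a suffix of `second`.

    Illustrative only: works on spelling, not on sounds.
    """
    splits = []
    for i in range(1, len(blend)):
        head, tail = blend[:i], blend[i:]
        if first.startswith(head) and second.endswith(tail):
            splits.append((head, tail))
    return splits


if __name__ == "__main__":
    print(blend_splits("brunch", "breakfast", "lunch"))
    print(blend_splits("libtard", "liberal", "retard"))
    print(blend_splits("repugnican", "repugnant", "republican"))
```

A word such as motel admits more than one valid split point (m + otel, mo + tel, mot + el), which is one reason spelling alone may not pin down a unique decomposition.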

  5. Contributions
     • Framework for identifying political portmanteau from the web
     • Algorithm for PP detection and decomposition into root words
     • First shared dataset of PP

  6. Method
     [Diagram labels: Reddit Comments; Slang Detection (ICWSM 2018); Potential Slang]
     • Extract words from Reddit news comments
     • Apply slang detection algorithm
     • Classify the detected words into PP vs not-PP
     • Decompose detected PP into root words:
       • X + Y → PP [where X or Y is a political term]
     Hossain, Nabil, Thanh Thuy Trang Tran, and Henry Kautz. "Discovering Political Slang in Readers' Comments." In ICWSM 2018.

  7. Method
     [Diagram labels: Reddit Comments; Slang Detection (ICWSM 2018); Potential Slang; Expert Annotators; PP Detection; Not-PP (repub); PP (libtard)]
     • Extract words from Reddit news comments
     • Apply slang detection algorithm
     • Classify the detected words into PP vs not-PP
     • Decompose detected PP into root words:
       • X + Y → PP [where X or Y is a political term]
     Hossain, Nabil, Thanh Thuy Trang Tran, and Henry Kautz. "Discovering Political Slang in Readers' Comments." In ICWSM 2018.

  8. Method
     [Diagram labels: Reddit Comments; Slang Detection (ICWSM 2018); Potential Slang; Expert Annotators; PP Detection; Not-PP (repub); PP (libtard); PP Decomposition; Political Entities (liberal, cruz, …); Prefix/suffix match: lib + C = libtard; Comment Context; Wordlist; Classifier]
     • Extract words from Reddit news comments
     • Apply slang detection algorithm
     • Classify the detected words into PP vs not-PP
     • Decompose detected PP into root words:
       • E + C → PP or C + E → PP
       • C = {retard, dotard, custard, …}
     Hossain, Nabil, Thanh Thuy Trang Tran, and Henry Kautz. "Discovering Political Slang in Readers' Comments." In ICWSM 2018.
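The prefix/suffix match step can be sketched as a candidate generator. This is an illustration under stated assumptions, not the authors' implementation: the function name `candidate_decompositions`, the `min_len` cutoff, and the toy entity list and wordlist are all hypothetical. It only enumerates (E, C) and (C, E) pairs whose spelling fits the PP; choosing among the candidates is left to the classifier mentioned on the slide.

```python
def candidate_decompositions(pp, entities, wordlist, min_len=2):
    """Enumerate root-word pairs whose prefix + suffix spell out `pp`.

    Each result is an ordered pair (first_word, second_word), where exactly
    one side comes from `entities` (E) and the other from `wordlist` (C).
    `min_len` (an assumption) is the shortest fragment allowed on either side.
    """
    candidates = []
    for i in range(min_len, len(pp) - min_len + 1):
        head, tail = pp[:i], pp[i:]
        for e in entities:
            # E + C: the entity supplies the start of the PP.
            if e.startswith(head):
                for c in wordlist:
                    pair = (e, c)
                    if c != e and c.endswith(tail) and pair not in candidates:
                        candidates.append(pair)
            # C + E: the entity supplies the end of the PP.
            if e.endswith(tail):
                for c in wordlist:
                    pair = (c, e)
                    if c != e and c.startswith(head) and pair not in candidates:
                        candidates.append(pair)
    return candidates


if __name__ == "__main__":
    entities = ["liberal", "cruz", "republican"]          # toy list
    wordlist = ["retard", "dotard", "custard", "repugnant"]  # toy list
    print(candidate_decompositions("libtard", entities, wordlist))
    print(candidate_decompositions("repugnican", entities, wordlist))
```

For libtard the sketch returns several candidates that share the suffix "tard", matching the candidate set C shown on the slide; spelling alone cannot separate them, which is where context features come in.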

  9. Model Details
     • β distribution model: no contextual features
       • Edit distances, word length, usage frequency
       • capture sound blending and word popularity
     • XGBoost: uses pre-trained GloVe word vector features from comments
       • also uses β distribution model features
     [Figures: PP Decomposition Accuracy; PP Detection Accuracy]
     Questions: nhossain@cs.rochester.edu
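The non-contextual features named above (edit distances, word length, usage frequency) could be computed along the following lines. The slides do not specify the exact feature set, so this is a hedged sketch: `levenshtein` is the standard dynamic-programming edit distance, while `blend_features`, its feature names, and the `freq` dictionary of usage counts are hypothetical.

```python
def levenshtein(a, b):
    """Standard Levenshtein edit distance between strings `a` and `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def blend_features(pp, first, second, freq):
    """Build a small feature dict for one candidate decomposition.

    `freq` maps a word to its usage count (hypothetical input); unseen
    words default to 0.
    """
    return {
        "edit_first": levenshtein(pp, first),
        "edit_second": levenshtein(pp, second),
        "len_first": len(first),
        "len_second": len(second),
        "freq_first": freq.get(first, 0),
        "freq_second": freq.get(second, 0),
    }


if __name__ == "__main__":
    toy_freq = {"liberal": 1200, "retard": 300}  # made-up counts
    print(blend_features("libtard", "liberal", "retard", toy_freq))
```

Features of this kind depend only on the words themselves, so they can be computed for any candidate pair without looking at the surrounding comment text.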


  11. Results
     • β distribution model: no contextual features
       • Edit distances, word length, usage frequency
       • capture sound blending and word popularity
     • XGBoost: uses pre-trained GloVe word vector features from comments
       • also uses β distribution model features
     [Figures: PP Decomposition Accuracy; PP Detection Accuracy]
     Questions: nhossain@cs.rochester.edu
     Website: https://cs.rochester.edu/u/nhossain

