TAVeer: An Interpretable Topic-Agnostic Authorship Verification Method
Oren Halvani, Lukas Graner and Roey Regev (Fraunhofer SIT)
PAN: Stylometry and Digital Text Forensics, held in conjunction with CLEF 2020, the Conference and Labs of the Evaluation Forum (Information Access Evaluation meets Multilinguality, Multimodality, and Visualization), 22-25 September 2020, Thessaloniki, Greece
Motivation
A central question that has occupied digital text forensics for decades is how to determine whether two documents were written by the same person. Authorship verification (AV) is the branch of research that deals with this important question. AV can be used for a wide range of applications, including:
- Continuous authentication
- Exposing malicious emails
- Ghostwriting / plagiarism detection
- Authentication of historical writings
- Detection of speech changes in dementia patients
- …
Motivation
Over the years, research activity in the field of AV has steadily increased, which has led to numerous approaches that aim to solve this problem. However, a large number of existing AV approaches consider features within the documents that are not always related to the writing style...
Motivation
Many AV methods, for example, rely on implicitly defined features such as character n-grams:
Example text (source: https://www.ares-conference.eu): "This year ARES & CD-MAKE will be held as an all-digital conference from August 25th to August 28th, 2020. Authors of accepted papers are required to provide prerecorded videos of their paper presentation."
Character 6-grams: "This y", "his ye", "is yea", …
Character n-grams are extracted from texts in an uncontrolled manner and thus capture text units that relate not only to the writing style but also to other document properties such as topic, genre, structure, sentiment, etc. Therefore, it may accidentally happen that the prediction of an AV method is not really based on the writing style, so that it misses its true purpose.
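To make the "uncontrolled" nature of such features concrete, here is a minimal sketch of character n-gram extraction (a plain sliding window over the raw text; the function name extract_char_ngrams is ours, not from the paper):

```python
def extract_char_ngrams(text: str, n: int = 6) -> list[str]:
    """Extract all overlapping character n-grams via a sliding window."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

ngrams = extract_char_ngrams("This year ARES & CD-MAKE will be held ...", n=6)
print(ngrams[:3])  # ['This y', 'his ye', 'is yea']
```

Note how topical tokens (such as "ARES") enter the feature space just as readily as stylistic ones.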
Proposed Feature Categories
To counteract this, we follow an alternative idea in which we consider explicitly defined features. More precisely, we focus on 20 categories of topic-agnostic (TA) words and phrases…
ℒ_TA ≈ 1000 words and phrases
Proposed Feature Categories
Based on ℒ_TA, we propose the following feature categories, which are used by our AV method.
Example sentence: "So that's the way it goes."
Note that here, unlike with standard character n-grams, we have full control over which text units are captured.
Proposed AV Approach
Based on the proposed feature categories, we introduce in the following our alternative AV approach TAVeer. TAVeer can essentially be divided into two phases: training and inference.
[Figure: a training corpus 𝒞 = (c_1, c_2, …, c_n) of Y- and N-labeled verification cases is used to learn a model ℳ; at inference time, ℳ maps an unseen case (D_U, D_A) to a decision (Y or N) plus a confidence score.]
Training: A model ℳ has to be "learned" on the basis of a training corpus 𝒞 consisting of known verification cases labeled as Y (same author) and N (different author)
Inference: ℳ is applied to an unseen verification case in order to accept or reject the questioned authorship
TAVeer (Training)
Required building blocks for the calculation of distances and thresholds…
For each feature category F_i ∈ 𝔽 = {F_1, F_2, …, F_m} and each verification case c_j = (D_U, D_A) in the training corpus 𝒞 = (c_1, c_2, …, c_n):
(1) Feature vector construction: f(D_U, D_A, F_i) turns the two documents into normalized feature vectors X = (x_1, x_2, …, x_k) and Y = (y_1, y_2, …, y_k)
(2) Distance computation: d_ij = dist(X, Y), using the Manhattan metric
(3) Thresholding: the resulting distance matrix (d_11, …, d_1n; …; d_m1, …, d_mn) feeds the per-category thresholding via the EER
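A minimal sketch of steps (1) and (2), assuming a feature category is given as a list of topic-agnostic terms and that "normalized" means relative frequencies over the category hits (one plausible reading; the exact construction f(·) is specified in the paper, and the helper names are ours):

```python
from collections import Counter

def feature_vector(tokens: list[str], category: list[str]) -> list[float]:
    """Relative frequency of each category term among all category hits."""
    counts = Counter(t for t in tokens if t in category)
    total = sum(counts.values()) or 1  # guard against division by zero
    return [counts[term] / total for term in category]

def manhattan(x: list[float], y: list[float]) -> float:
    """Manhattan (L1) distance; at most 2 for two unit-sum vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Toy usage with a made-up mini category:
category = ["so", "the", "way", "it"]
x = feature_vector("so that's the way it goes".split(), category)
y = feature_vector("the way it is".split(), category)
print(manhattan(x, y))
```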
TAVeer (Training)
After all distances have been computed, we determine for each F_i its corresponding threshold θ_{F_i} via the EER (equal error rate), which is the point on the ROC curve where the false positive rate equals the false negative rate.
In our setting, all corpora are balanced (same number of Y/N cases). Therefore, we use the median as an approximation of the EER.
The result of the thresholding procedure is a set Θ = {θ_{F_1}, θ_{F_2}, …, θ_{F_m}}, with θ_{F_i} = median(d_{i1}, d_{i2}, …, d_{in})
[Figure: ROC curve (true positive rate vs. false positive rate) with the EER point and the Area Under the (ROC) Curve highlighted.]
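Since every corpus is balanced, the per-category threshold reduces to a plain median over that category's distances; a short sketch (distance_matrix, mapping each category to its row of distances, is our assumed data layout):

```python
import statistics

def eer_threshold(category_distances: list[float]) -> float:
    """Median as an EER approximation; valid because the corpus is
    balanced (equal numbers of Y- and N-cases)."""
    return statistics.median(category_distances)

# One threshold per feature category, yielding the set Θ:
# thresholds = {F: eer_threshold(row) for F, row in distance_matrix.items()}
```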
TAVeer (Training)
To construct ℳ, we further require a similarity function that: (1) transforms the computed distances into similarity scores falling into [0, 1] and (2) calibrates these scores so that 0.5 marks the decision boundary.
For the intended purpose, we designed the following piecewise function:
$$\mathrm{sim}(d, d_{\max}, \theta_F) = \begin{cases} 1 - \dfrac{d}{2\,\theta_F}, & \text{if } d < \theta_F \\[4pt] \dfrac{1}{2} - \dfrac{d - \theta_F}{2\,(d_{\max} - \theta_F)}, & \text{otherwise} \end{cases}$$
Here, d_max denotes the upper bound of the distance function (for the Manhattan metric, d_max = 2 holds).
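A direct transcription of the piecewise function as reconstructed above (toy numbers; note that it is continuous at θ_F, yields 1 for identical vectors, 0.5 at the threshold, and 0 at the maximal distance):

```python
def sim(d: float, d_max: float, theta: float) -> float:
    """Calibrated similarity in [0, 1]; assumes 0 < theta < d_max."""
    if d < theta:
        return 1.0 - d / (2.0 * theta)
    return 0.5 - (d - theta) / (2.0 * (d_max - theta))

assert sim(0.0, 2.0, 0.8) == 1.0  # identical vectors -> maximal similarity
assert sim(0.8, 2.0, 0.8) == 0.5  # at the threshold -> decision boundary
assert sim(2.0, 2.0, 0.8) == 0.0  # maximal distance -> minimal similarity
```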
TAVeer (Training)
To find a suitable ensemble, we first create a set 𝔽_Θ = {(F_1, θ_{F_1}), (F_2, θ_{F_2}), …, (F_m, θ_{F_m})}
Based on 𝔽_Θ we generate all possible ensembles ℰ_1, ℰ_2, … by using the powerset: 𝒫(𝔽_Θ) ∖ {∅} = { {(F_1, θ_{F_1})} (an atomic ensemble), {(F_1, θ_{F_1}), (F_2, θ_{F_2})} (an ensemble), … }
Next, we construct an aggregated similarity function on top of sim(·) to take an ensemble ℰ into account:
$$\mathrm{sim}_{\mathcal{E}}(D_U, D_A, d_{\max}, \mathcal{E}) = \operatorname{median}\{\, \mathrm{sim}(\mathrm{dist}(f(D_U, D_A, F)), d_{\max}, \theta_F) \mid (F, \theta_F) \in \mathcal{E} \,\}$$
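A sketch of the ensemble enumeration and the aggregated similarity, assuming compute_distance(case, F) wraps the feature extraction and Manhattan distance from the earlier sketch and sim(·) is the function above (helper names are ours):

```python
from itertools import combinations
import statistics

def all_ensembles(f_theta):
    """Enumerate the non-empty powerset of (category, threshold) pairs."""
    pairs = list(f_theta)
    for r in range(1, len(pairs) + 1):
        yield from combinations(pairs, r)

def sim_ensemble(case, ensemble, d_max=2.0):
    """Aggregate the per-category calibrated similarities by their median."""
    return statistics.median(
        sim(compute_distance(case, F), d_max, theta) for F, theta in ensemble
    )
```

Note that with 20 feature categories the powerset contains 2^20 − 1 ≈ 10^6 ensembles, so the per-category distances should be computed once and reused across all ensembles.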
TAVeer (Training)
To find an optimal ℰ that will be chosen as the final model ℳ, a ranking mechanism is needed…
For this, we define a classification function:
$$\mathrm{classify}(D_U, D_A, d_{\max}, \mathcal{E}) = \begin{cases} \text{Y (same author)}, & \text{if } \mathrm{sim}_{\mathcal{E}}(D_U, D_A, d_{\max}, \mathcal{E}) > 0.5 \\ \text{N (different author)}, & \text{otherwise} \end{cases}$$
Using this function, we classify all verification cases (c_1, c_2, …, c_n) in the training corpus 𝒞 for each ensemble ℰ_j ∈ 𝒫(𝔽_Θ) and calculate the respective accuracies.
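The corresponding sketch, evaluating one ensemble's training accuracy (labels assumed to be "Y"/"N" strings; variable names are ours):

```python
def classify(case, ensemble, d_max=2.0) -> str:
    """Y/N decision at the calibrated 0.5 boundary."""
    return "Y" if sim_ensemble(case, ensemble, d_max) > 0.5 else "N"

def accuracy(cases, labels, ensemble) -> float:
    """Fraction of training cases the ensemble classifies correctly."""
    hits = sum(classify(c, ensemble) == y for c, y in zip(cases, labels))
    return hits / len(cases)
```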
TAVeer (Training)
To obtain the optimal ℰ, we sort all resulting ensembles according to the following three criteria (each in descending order):
(1) Accuracy of ℰ (calculated on 𝒞)
(2) Number of feature categories ℰ contains
(3) Median accuracy over all atomic ensembles in ℰ (calculated on 𝒞)
Finally, we select the first ensemble from the sorted list, which represents the final model ℳ.
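The three criteria map naturally onto a lexicographic sort key; a sketch on top of the helpers above (cases, labels and f_theta are assumed to be the training data and the (category, threshold) pairs):

```python
import statistics

def rank_key(ensemble):
    """Lexicographic ranking: accuracy, then ensemble size, then the
    median accuracy of the contained atomic (singleton) ensembles."""
    return (
        accuracy(cases, labels, ensemble),
        len(ensemble),
        statistics.median(
            accuracy(cases, labels, [(F, theta)]) for F, theta in ensemble
        ),
    )

model = max(all_ensembles(f_theta), key=rank_key)  # top-ranked ensemble = ℳ
```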
TAVeer (Inference)
Based on the resulting model ℳ, TAVeer performs the following steps to decide for an unseen verification case c_new = (D_U, D_A) whether both documents were written by the same author.
As in classify(·), TAVeer first computes the aggregated similarity value: s_new = sim_ℰ(D_U, D_A, d_max, ℳ)
Afterwards, a binary prediction (Y/N) regarding the questioned authorship of D_U is obtained by comparing s_new against the decision boundary:
$$\mathrm{decision}(c_{\text{new}}) = \begin{cases} (\text{Y}, s_{\text{new}}), & \text{if } s_{\text{new}} > 0.5 \\ (\text{N}, s_{\text{new}}), & \text{otherwise} \end{cases}$$
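In code form, inference is just the aggregated similarity plus a thresholded decision that also returns the score as a confidence value (sketch, reusing sim_ensemble from above):

```python
def decide(case_new, model, d_max=2.0):
    """Return (label, confidence) for an unseen verification case."""
    s_new = sim_ensemble(case_new, model, d_max)
    return ("Y", s_new) if s_new > 0.5 else ("N", s_new)
```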
Evaluation
To evaluate TAVeer, we made use of four self-compiled corpora (described in detail in the official paper) comprising verification cases with cross, related and mixed topics. Each corpus was partitioned into author-disjoint training and test corpora based on a 40/60% ratio and was designed in a balanced manner (the number of Y-cases equals the number of N-cases).
In total, we selected eight baseline methods (including the SOTA) that have shown their strengths in previous AV studies. After training TAVeer and the respective baselines, we evaluated all methods on the four test corpora, using accuracy as the primary performance measure.
On two corpora, TAVeer outperformed all baselines, while on the other two it performed similarly to the strongest baseline.
Evaluation (Model analysis)
To gain insight into how the individual feature categories performed on the test corpora, we analyzed the trained models. Using sim_ℰ(·), we calculated for all verification cases the aggregated similarity values with respect to the atomic ensembles involved in each model and visualized them as violin plots.
Interpretation: The distributions of the similarity values for each F_i are colored green (Y) and red (N), respectively, while the dashed line represents the decision boundary. The better this line separates both distributions and the less they overlap, the more suitable F_i is for the test corpus.
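A minimal matplotlib sketch of such a plot, assuming sims_y and sims_n are lists holding, per feature category, the similarity values of the Y- and N-cases respectively (the data layout is our assumption):

```python
import matplotlib.pyplot as plt

def violin_per_category(sims_y, sims_n, names):
    """Green (Y) and red (N) violins per feature category, with the
    0.5 decision boundary drawn as a dashed line."""
    fig, ax = plt.subplots()
    pos = range(1, len(names) + 1)
    for data, color, offset in ((sims_y, "green", -0.15), (sims_n, "red", 0.15)):
        parts = ax.violinplot(data, positions=[p + offset for p in pos], widths=0.25)
        for body in parts["bodies"]:
            body.set_facecolor(color)
    ax.axhline(0.5, linestyle="--", color="black")  # decision boundary
    ax.set_xticks(list(pos))
    ax.set_xticklabels(names)
    ax.set_ylabel("aggregated similarity")
    plt.show()
```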