Authorship ID at PAN’11: What -- Why -- How (Patrick Juola)

Mar 09, 2024

Authorship ID at PAN’11
What -- Why -- How
Patrick Juola
Evaluating Variations in Language Laboratory
Duquesne University, Pittsburgh PA, USA
juola@mathcs.duq.edu
Authorship Identification
  ◦ … needs little definition among this group
  ◦ Differs subtly from plagiarism detection
      ▪ Plagiarism: This part and THAT part differ
      ▪ ID: This part is by THAT person
  ◦ But, yeah, still the same problem
Authorship Identification
  ◦ … needs little motivation among this group, either
      ▪ School essays
      ▪ Forged or disputed documents
      ▪ Poison-pen letters (or Email)
      ▪ Anonymous or corporate authorship
  ◦ Lots of reasons to study
… and lots of ways to do it
  ▪ Something of a “professional ad-hocracy”
  ▪ My own system (JGAAP) implements more than 1 million different approaches, most of which “work”
  ▪ … and none of which work perfectly
Hence, this track/lab
  ▪ NSF funded to create “community resources” to evaluate proposed methods
  ▪ NSF funded to create evaluation framework – i.e. on behalf of the NSF, welcome
This track: Email authorship
  ▪ Why one track? Possibly better results from drilling down.
  ▪ Possible ability to re-use analysis; e.g. is one stemmer “better” than another?
  ▪ Why Email? Lots of data, and lots of importance.
      ◦ If we had suggested a track on the Paston letters, who would have come?
Structure: 5 subtasks
  ▪ Closed class: 26 authors
  ▪ Closed class: 72 authors
  ▪ Open class: 26 authors
  ▪ Open class: 72 authors
  ▪ Verification: 1 author at a time
Participants
  ▪ 31 registered groups / 13 submissions
  ▪ Scored by averaging precision, recall, and F score
  ▪ “Winners”:
      ◦ Ludovic Tanguy (University of Toulouse & CNRS, France)
      ◦ Ioannis Kourtis (University of the Aegean, Greece)
      ◦ Mario Zechner (Know-Center, Austria)
      ◦ Tim Snyder (Porfiau, Canada)
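The scoring bullet above ("averaging precision, recall, and F score") can be made concrete with a small sketch. The slide does not spell out the exact procedure, so the following is an assumption: per-author precision, recall, and F1 are macro-averaged over the authors in the gold standard, and the three averages are then combined into one number by a plain mean. The function names `macro_scores` and `combined_score` are hypothetical and are not taken from the lab's evaluation software.

```python
from collections import Counter


def macro_scores(gold, predicted):
    """Macro-averaged (precision, recall, F1) over the authors present in `gold`.

    `gold` and `predicted` are equal-length lists of author labels,
    one entry per test document. Assumes `gold` is non-empty.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    n_gold = Counter(gold)        # documents actually written by each author
    n_pred = Counter(predicted)   # documents attributed to each author
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)

    precisions, recalls, f1s = [], [], []
    for author in n_gold:
        # Precision is taken as 0 when the author was never predicted (an assumption).
        p = correct[author] / n_pred[author] if n_pred[author] else 0.0
        r = correct[author] / n_gold[author]
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(n_gold)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def combined_score(gold, predicted):
    """Plain mean of macro precision, macro recall, and macro F1 (assumed ranking score)."""
    return sum(macro_scores(gold, predicted)) / 3


if __name__ == "__main__":
    gold = ["a", "a", "b", "b"]
    predicted = ["a", "b", "b", "b"]
    print(macro_scores(gold, predicted))
    print(combined_score(gold, predicted))
```

Macro-averaging weights every author equally regardless of how many messages they wrote; a micro-averaged variant would instead pool all documents before computing the ratios.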
… but the real winner is the field
  ▪ … and everyone who participated
      ◦ … or observed
  ▪ … or is motivated to start looking further at this
  ▪ We hope to be back with an improved lab next year based on feedback here
  ▪ We hope to see you all back here with improved technology based on feedback here
  ▪ I look forward to seeing the papers!
Questions for next time
  ▪ New corpus, or extended corpus?
  ▪ Standardized markup?
  ▪ What languages/genres?
  ▪ What evaluation scheme?
  ▪ What other changes?
Dank u wel! (Thank you!)
