Novel Balanced Feature Representation for Wikipedia Vandalism Detection Task

Apr 18, 2023

István Hegedűs, Róbert Ormándi, Richárd Farkas, and Márk Jelasity
University of Szeged, Hungary
ihegedus@inf.u-szeged.hu
Our approach
• Supervised learning
• Rich feature set
• Meta-learning scheme
Vector space model (VSM)
• unigrams
• values:
  – N if the unigram does not occur in the edit
  – A if in added sequence
  – D if in removed sequence
  – C if in changed sequence
• #features = 47 324
• best 100 by InfoGain
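The N/A/D/C encoding can be illustrated with a minimal sketch. The slides do not say how added, removed, and changed sequences are detected, so this example assumes whitespace tokenization and uses the `insert`, `delete`, and `replace` opcodes of Python's `difflib.SequenceMatcher`; the function name `encode_edit` and the sample vocabulary are hypothetical.

```python
from difflib import SequenceMatcher


def encode_edit(old_text, new_text, vocabulary):
    """Map each vocabulary unigram to N, A, D, or C for one edit.

    Assumption: the diff is taken over lowercased whitespace tokens.
    """
    old_tokens = old_text.lower().split()
    new_tokens = new_text.lower().split()
    values = {word: "N" for word in vocabulary}  # default: not in the edit

    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            touched, label = new_tokens[j1:j2], "A"
        elif tag == "delete":
            touched, label = old_tokens[i1:i2], "D"
        elif tag == "replace":
            touched, label = old_tokens[i1:i2] + new_tokens[j1:j2], "C"
        else:  # "equal": unchanged text is not part of the edit
            continue
        for word in touched:
            if word in values:
                values[word] = label  # a later opcode overwrites an earlier one
    return values


vocabulary = ["paris", "nice", "stupid", "today", "article"]
features = encode_edit("paris is nice", "paris is stupid", vocabulary)
```

In this sketch a unigram touched by several diff regions keeps the label of the last one; a real system would need an explicit rule for that case.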
Balanced VSM
• sample is unbalanced
  – 93.9% regular
• BVSM:
    for i in 1 to N do
      D = vandalism AND random_regular
      IG += InfoGainScore(D)
    done
    VSM = best(IG, 100)
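The BVSM loop above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: each round pairs all vandalism edits with an equally sized random sample of regular edits (the slide does not give the sample size), information gain is computed for categorical feature values such as N/A/D/C, and the names `entropy`, `info_gain`, and `balanced_top_features` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def info_gain(column, labels):
    """Information gain of one categorical feature column w.r.t. the labels."""
    total = len(labels)
    by_value = {}
    for value, label in zip(column, labels):
        by_value.setdefault(value, []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in by_value.values())
    return entropy(labels) - remainder


def balanced_top_features(vandalism, regular, n_rounds, k, seed=0):
    """Sum information gain over balanced resamples and return the top-k feature indices."""
    rng = random.Random(seed)
    n_features = len(vandalism[0])
    scores = [0.0] * n_features
    for _ in range(n_rounds):
        # Assumption: draw as many regular edits as there are vandalism edits.
        sample = rng.sample(regular, min(len(vandalism), len(regular)))
        rows = vandalism + sample
        labels = [1] * len(vandalism) + [0] * len(sample)
        for j in range(n_features):
            scores[j] += info_gain([row[j] for row in rows], labels)
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the classes, feature 1 is constant.
vandalism_rows = [("A", "N"), ("A", "N")]
regular_rows = [("N", "N")] * 20
best = balanced_top_features(vandalism_rows, regular_rows, n_rounds=3, k=1)
```

Summing the scores over rounds mirrors the `IG += InfoGainScore(D)` line; the final ranking plays the role of `best(IG, 100)` with `k` in place of 100.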
Other features
• CharacterStatistic
  – uppercase and lowercase ratio
• RepeatedCharSequences
  – asdasdasdasdasd
• ValidWordRatio
  – English/pejorative words
• CommentStatistic
• UserNameOrIP
  – nickname or country from IP
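A few of these features are easy to sketch. The slides name the features but do not define them, so the exact formulas below are assumptions: the uppercase ratio is taken over alphabetic characters, a repeated sequence is a unit of 1 to 4 characters occurring at least three times in a row, and the word list passed to `valid_word_ratio` is a hypothetical stand-in for an English or pejorative dictionary.

```python
import re

# Assumption: a "repeated sequence" is a 1-4 character unit repeated 3+ times in a row.
REPEAT = re.compile(r"(.{1,4}?)\1{2,}")


def uppercase_ratio(text):
    """Fraction of alphabetic characters that are uppercase (0.0 if none)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def longest_repeated_run(text):
    """Length of the longest run such as 'asdasdasd' (0 if there is none)."""
    longest = 0
    for match in REPEAT.finditer(text):
        longest = max(longest, len(match.group(0)))
    return longest


def valid_word_ratio(text, known_words):
    """Fraction of words found in a given word list (0.0 if there are no words)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(word in known_words for word in words) / len(words)


run_length = longest_repeated_run("asdasdasdasdasd")
shouting = uppercase_ratio("THIS page SUCKS")
```

Each function returns a single number per edit, so the results can be appended directly to the feature vector used by the classifiers.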
10-fold cross-validation

Feature set                 AUC (10-fold)
Balanced VSM                0.813
Balanced VSM + stopword     0.843
Other features              0.883
Other + unbalanced VSM      0.884
Other + balanced VSM        0.887
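For readers unfamiliar with the metric, AUC can be computed as the probability that a randomly chosen vandalism edit is scored higher than a randomly chosen regular edit, with ties counted as one half. The sketch below uses that pairwise definition; the function name `auc` and the toy scores are illustrative and say nothing about how the authors computed their numbers.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (labels: 1 = vandalism, 0 = regular)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

Because AUC depends only on the ranking of the scores, it is a reasonable choice for a sample in which one class makes up 93.9% of the data.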
Meta learning
• J48 = 0.3
• NaiveBayes = 0.09
• Logistic = 0.61
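One plausible reading of these numbers is that they are per-classifier weights in a voting ensemble. The slide does not state how the weights are applied, so the sketch below assumes a weighted average of each classifier's predicted vandalism probability; `weighted_vote` and the example probabilities are hypothetical.

```python
# Weights as given on the slide; their use as averaging weights is an assumption.
WEIGHTS = {"J48": 0.3, "NaiveBayes": 0.09, "Logistic": 0.61}


def weighted_vote(probabilities, weights=WEIGHTS):
    """Weighted average of per-classifier vandalism probabilities."""
    total = sum(weights.values())
    return sum(weights[name] * probabilities[name] for name in weights) / total


# Hypothetical outputs of the three base classifiers for one edit.
score = weighted_vote({"J48": 1.0, "NaiveBayes": 0.0, "Logistic": 0.0})
```

Dividing by the sum of the weights keeps the combined score in the range 0 to 1 even if the weights do not add up to exactly one.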
Results (eval)

Feature set            AUC (LogReg)   AUC (Voting)
Balanced VSM           0.744          0.761
Other features         0.865          0.876
Other + balanced       0.854          0.877
Other + unbalanced     0.864          0.880
Summary
• VSM has no significant added value
• meta-learning (+2%)
