IP Reputation Analysis of Public Databases and Machine Learning Techniques
Jared Lee Lewis, Geanina F. Tambaliuc, Husnu S. Narman, Wook-Sung Yoo
Weisberg Division of Computer Science, Marshall University
narman@marshall.edu
https://hsnarman.github.io/
February 2020

Outline
• Introduction
• Blacklists
• Machine Learning Techniques
• System Model
• Results
• Conclusion

Introduction
• Widespread use of the Internet creates many challenges in protecting user data.
• Unfortunately, applications cannot always protect user privacy and, because of new malware, can become a threat to user data security.
• 4 new malware samples are discovered per second.
• More than 200 million new malware samples appear per year.

Microsoft Exchange
To protect users from spam and phishing email, Microsoft Exchange uses 8 filtering criteria:
• Connection Filtering
• Sender Filtering
• Recipient Filtering
• Sender ID
• Content Filtering
• Sender Reputation
• Attachment Filtering
• Junk Email Filtering

The Importance of DNS
The Domain Name System (DNS) plays an important role in filtering and protection techniques because the DNS protocol is used by both cyber-attacks and authorized services.
[Figure: a domain name mapped to an IP address (IP: 153.92.0.100)]

Objective
The objective of this research is to analyze public databases and machine learning techniques for detecting malicious IP addresses and domains, and to introduce the Automated IP Reputation Analyzer Tool (AIRPA), which uses both approaches to check the reputations of IPs and domains.

Public Blacklist Databases
Seven main databases and 102 sub-databases:
• VirusTotal
• URLVoid
• MyIP.MS
• Censys
• AbuseIPDB
• Apility.io
• Shodan

Limitations of Public Blacklist Databases
Unfortunately, the public blacklists have some limitations (free versions):
• VirusTotal: 4 requests/minute
• AbuseIPDB: 1,000 reports and checks per day and 60 requests/minute
• Shodan: 1 request/second
• MyIP.MS: 150 requests/month
• Apility.io: 250 requests/day and 50 requests/minute
• Censys: 250 requests/month
• May not be updated regularly
• May contain wrong information
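
A tool that queries several of these free tiers has to pace its requests. The following is a minimal sketch of a sliding-window limiter, not the approach AIRPA is known to use. The class name, the dictionary keys, and the injectable `clock` parameter are assumptions made for illustration. Only the per-second and per-minute limits from the slide are encoded; daily and monthly quotas would need separate counters.

```python
import time
from collections import deque

# Short-window free-tier limits quoted on the slide: (max requests, window in seconds).
# The keys are illustrative labels, not identifiers from any vendor SDK.
FREE_TIER_LIMITS = {
    "virustotal": (4, 60),
    "abuseipdb": (60, 60),
    "shodan": (1, 1),
    "apility": (50, 60),
}


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.calls = deque()    # timestamps of recently recorded requests

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Register that a request was just sent."""
        self.calls.append(self.clock())
```

A caller would check `wait_time()`, sleep for that long if it is non-zero, send the request, and then call `record()`.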

Machine Learning Models
Using 80,000 good and 80,000 bad domains:
• Logistic Regression
• Bayes
• Random Forest
• Logistic Regression with geolocation
• Bayes with geolocation
• Random Forest with geolocation
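
The slides do not say which features or library were used to train these models. Purely as an illustration of the first technique in the list, the sketch below fits a logistic regression by batch gradient descent using only the standard library. The two features (fraction of digits in a domain name, normalized name length) and the six toy samples are hypothetical and are not taken from the authors' data set.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi  # derivative of the log loss with respect to the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n_samples for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict(w, b, x):
    """Return 1 (bad) if the estimated probability is at least 0.5, else 0 (good)."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


# Hypothetical toy samples: [digit_fraction, normalized_length]; label 1 = bad, 0 = good.
X = [[0.10, 0.00], [0.20, 0.10], [0.15, 0.05],
     [0.80, 0.60], [0.90, 0.70], [0.70, 0.50]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
```

The "with geolocation" variants on the slide would correspond to appending further feature columns to each row of `X`. How the authors encoded geolocation is not stated.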

System Model and App: http://ipreputation.herokuapp.com/
[Figure: system model diagram, labeled "Logistic Regression"]

App: http://ipreputation.herokuapp.com/

App Fast Check: http://ipreputation.herokuapp.com/

Results
Result of testing 1,586 unsafe IPs in the public databases and AIRPA:
• AIRPA, with cross-checking, has the highest correctness rate.
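
The slides do not describe how the cross-check combines the individual databases. One plausible reading, sketched below under that assumption, is a simple vote: an IP is reported as malicious when at least `min_hits` sources flag it. The function name, the `min_hits` threshold, and the example verdicts are hypothetical.

```python
def cross_check(verdicts, min_hits=2):
    """Combine per-source verdicts for one IP.

    verdicts maps a source name to True (flagged), False (clean),
    or None (no answer, for example because a rate limit was hit).
    """
    answered = {src: v for src, v in verdicts.items() if v is not None}
    flagged_by = sorted(src for src, v in answered.items() if v)
    return {
        "malicious": len(flagged_by) >= min_hits,
        "flagged_by": flagged_by,
        "sources_answered": len(answered),
    }


# Made-up verdicts for an address in the 203.0.113.0/24 documentation range.
example = cross_check({
    "VirusTotal": True,
    "AbuseIPDB": True,
    "Shodan": None,
    "Apility.io": False,
})
```

Requiring agreement from more than one source is one way a cross-check could reduce false positives caused by a single outdated or incorrect list.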

Results
Result of testing distinct learning techniques with and without geolocation:
• Logistic Regression with geolocation has the highest correctness.
• Random Forest without geolocation has the lowest correctness.

Results
Runtime of distinct learning techniques with and without geolocation:
• Logistic Regression has the lowest running time.
• Random Forest with geolocation has the highest running time.

Conclusion
• A cross-checking system is better at detecting malicious IPs than individual public databases and also decreases false positives.
• Considering additional parameters with machine learning techniques to find IPs' reputations can improve the results but increases runtime.
• Apility.io among the public databases and Logistic Regression among the machine learning techniques have higher detection rates.

Thank You
narman@marshall.edu
https://hsnarman.github.io/
