

  1. A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT Networks
  Imtiaz Ullah and Qusay H. Mahmoud
  33rd Canadian Conference on Artificial Intelligence, 12-15 May 2020

  2. Agenda
  ▪ Introduction
  ▪ Motivation
  ▪ Problem Statement
  ▪ Related Work
  ▪ Testbed Architecture
  ▪ Results and Analysis
    o Correlated Features
    o Feature Ranking
    o Learning Curve
    o Validation Curve
    o Classification
  ▪ Conclusion
  ▪ Future Work

  3. Introduction
  ▪ Smart digital devices
    o Have become part of our daily lives
    o Improve the quality of life
    o Make communication easier
    o Increase data transfer and information sharing
  ▪ "Things" in the IoT could be anything
    o Physical
    o Virtual
  ▪ Technological challenges
    o Security
    o Power usage
    o Scalability
    o Communication mechanisms

  4. Introduction Cont.
  ▪ The exponential growth of the IoT makes it an attractive target for attackers.
  ▪ The effects of cyber-attacks are becoming more destructive.
  Fig. 1. Source: https://www.forbes.com/sites/gilpress/2016/09/02/internet-of-things-by-the-numbers-what-new-surveys-found/#a60d28116a0e

  5. Motivation
  ▪ The exponential growth of Internet of Things (IoT) devices provides a large attack surface for intruders to launch more destructive cyber-attacks.
  ▪ New detection techniques and algorithms require a well-designed dataset for IoT networks.
  ▪ Available IoT intrusion datasets have a limited number of features.
  ▪ Available IoT datasets contain very few flow-based features.

  6. Problem Statement
  ▪ First, we reviewed the weaknesses of various intrusion detection datasets.
  ▪ Second, we proposed a new dataset, adapted from https://ieee-dataport.org/open-access/iot-network-intrusion-dataset
  ▪ Third, we provide a significant set of features with their corresponding weights.
  ▪ Finally, we propose a new detection and classification methodology using the generated dataset.
  ▪ The IoT Botnet dataset can be accessed at https://sites.google.com/view/iot-network-intrusion-dataset.

  7. Related Work
  ▪ DARPA 98 / 99
    o Developed at MIT Lincoln Lab via an emulated network environment.
    o The DARPA 98 dataset contains seven weeks of network traffic.
    o The DARPA 99 dataset contains five weeks of network traffic.
  ▪ Lee and Stolfo developed the KDD99 dataset from DARPA 98/99.
  ▪ NSL-KDD removed redundant records from the KDD99 dataset.
    o The KDD99 training data contains 78% redundant instances.
    o The testing data contains 75% redundant instances.
  ▪ ISCX dataset, developed at the Canadian Institute for Cybersecurity (CIC), University of New Brunswick.
    o Systematic approach to generating normal and malicious traffic.
    o Multistage attacks.
    o Publicly available.

  8. Related Work Cont.
  ▪ UNSW-NB15
    o Comprehensive modern normal network traffic.
    o Diverse intrusion scenarios.
    o In-depth structured network traffic information.
    o Publicly available.
    o 49 features: Flow, Basic, Content, Time, Additional Generated, Connection, and Labeled features.
  ▪ CICIDS2017
    o Modern normal and malicious network traffic.
    o 80 network features.
    o Reliable normal and malicious network flows.
    o Publicly available.
  ▪ CICDDoS2019
    o Up-to-date normal and malicious DDoS network traffic.
    o 12 DDoS attacks.
    o Publicly available.
    o Comprehensive metadata about IP addresses.

  9. Related Work Cont.
  ▪ BoT-IoT dataset
    o Developed via legitimate and emulated IoT networks.
    o Designed around a typical smart home configuration.
    o Publicly available.
    o 49 features.
  ▪ Botnet IoT dataset
    o Generated using:
      • Nine commercial IoT devices.
      • Two IoT-based botnets, BASHLITE and Mirai.
    o 115 network features.

  10. Testbed Architecture
  ▪ A typical smart home environment.
  ▪ The SKT NGU smart home device and an EZVIZ Wi-Fi camera were used to generate the IoTID20 dataset.
  ▪ Other devices: laptops, tablets, smartphones.
  ▪ The SKT NGU and the EZVIZ Wi-Fi camera are the IoT victim devices; all other devices in the testbed are attacking devices.
  ▪ CICFlowMeter was used to extract features.
  Fig. 2. Source: https://ieee-dataport.org/open-access/iot-network-intrusion-dataset

  11. Testbed Architecture Cont.
  ▪ New IoTID20 dataset for anomalous activity detection in IoT networks.
  ▪ IoTID20 is available in CSV format.
  ▪ Covers various types of IoT attacks and attack families.
  ▪ Large number of general features.
  ▪ Large number of flow-based features.
  ▪ High-rank features.
  Fig. 3. IoTID20 Dataset Attack Taxonomy

  12. Label Features of IoTID20
  Table 1. Binary, Category, and Subcategory labels of the IoTID20 dataset

  Binary   | Category | Subcategory
  ---------|----------|------------------------------------------
  Normal   | Normal   | Normal
  Anomaly  | DoS      | Syn Flooding
  Anomaly  | Mirai    | Brute Force, HTTP Flooding, UDP Flooding
  Anomaly  | MITM     | ARP Spoofing
  Anomaly  | Scan     | Host Port, OS
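  A minimal pandas sketch of inspecting the three label levels in Table 1. The file name IoTID20.csv and the column names Label, Cat, and Sub_Cat are assumptions about the released CSV.

```python
import pandas as pd

# Load the combined IoTID20 CSV; the file name is hypothetical.
df = pd.read_csv("IoTID20.csv")

# Inspect the binary, category, and subcategory label columns of Table 1.
for col in ("Label", "Cat", "Sub_Cat"):
    print(df[col].value_counts(), "\n")
```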

  13. Results and Analysis
  $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
  $\text{Precision} = \frac{TP}{TP + FP}$
  $\text{Recall} = \frac{TP}{TP + FN}$
  $\text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
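  A minimal sketch of the four metrics above, computed directly from raw confusion-matrix counts; the example counts are made up for illustration.

```python
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }

# Illustrative counts only.
print(evaluate(tp=90, tn=85, fp=10, fn=15))
```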

  14. IoTID20 Dataset Correlated Features
  ▪ Correlated features degrade the detection capability of a machine learning algorithm.
  ▪ A correlation coefficient threshold of 0.70 was used to remove the correlated features.
  Table 2. IoTID20 Dataset Correlated Features
  Total Features: 12
  Feature Names: Active_Max, Bwd_IAT_Max, Bwd_Seg_Size_Avg, Fwd_IAT_Max, Fwd_Seg_Size_Avg, Idle_Max, PSH_Flag_Cnt, Pkt_Size_Avg, Subflow_Bwd_Byts, Subflow_Bwd_Pkts, Subflow_Fwd_Byts, Subflow_Fwd_Pkts
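  A minimal sketch of threshold-based correlated-feature removal with pandas, assuming the features are already in a DataFrame `X`; the paper's exact pair-selection rule is not stated, so dropping the second feature of each pair is an assumption.

```python
import numpy as np
import pandas as pd

def drop_correlated(X: pd.DataFrame, threshold: float = 0.70) -> pd.DataFrame:
    """Drop one feature from every pair whose |correlation| exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the upper triangle so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return X.drop(columns=to_drop)
```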

  15. Feature Ranking
  ▪ More than 70% of the features ranked with a value greater than 0.50.
  ▪ The Shapiro-Wilk algorithm was used to rank the IoTID20 features.
  ▪ High-ranked features improve feature selection.
  Fig. 4. Feature ranking via the Shapiro-Wilk algorithm (x-axis: IoTID20 features; y-axis: ranking score, 0 to 1)
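  A minimal sketch of per-feature Shapiro-Wilk scoring. Using the W statistic as the ranking score is an assumption about the paper's procedure, and the subsampling is there because scipy's test is intended for at most 5,000 observations.

```python
import pandas as pd
from scipy.stats import shapiro

def rank_features(X: pd.DataFrame, max_n: int = 5000) -> pd.Series:
    """Score each numeric feature with the Shapiro-Wilk W statistic."""
    scores = {}
    for col in X.select_dtypes("number").columns:
        values = X[col].dropna()
        # Subsample large columns; shapiro() is meant for small samples.
        sample = values.sample(min(max_n, len(values)), random_state=0)
        w, _ = shapiro(sample)
        scores[col] = w
    return pd.Series(scores).sort_values(ascending=False)
```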

  16. Learning Curve
  ▪ A learning curve shows:
    o The relationship between the training and validation scores of an algorithm for various training-set sizes.
    o Whether the algorithm would benefit from more data, or whether the data provided is enough for good performance.
  Fig. 5. Learning Curve for Binary Label (training-set sizes 32,000 to 102,000; score 50 to 100)
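  A minimal scikit-learn sketch of such a learning curve, assuming a prepared feature matrix `X` and binary label vector `y`; the decision tree estimator and the size grid are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

# Train/validation scores at increasing training-set sizes (5-fold CV).
sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    train_sizes=np.linspace(0.3, 1.0, 8), cv=5, scoring="accuracy")

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"{n:>7d}  train={tr:.3f}  validation={va:.3f}")
```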

  17. Learning Curve
  Fig. 6. Learning Curve for Category Label

  18. Learning Curve
  Fig. 7. Learning Curve for Subcategory Label

  19. Validation Curve
  ▪ A validation curve shows:
    o The effectiveness of a classifier on the data it was trained on.
    o The efficiency of the classifier on new test data.
  Fig. 8. Validation Curve for Binary Label (parameter values 1 to 10; score 50 to 100)
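  A minimal scikit-learn sketch of a validation curve over the same `X` and `y`. Sweeping a decision tree's max_depth from 1 to 10 mirrors the figure's x-axis, but the choice of swept parameter is an assumption, not the paper's stated setup.

```python
import numpy as np
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

depths = np.arange(1, 11)  # matches the 1-10 range on the x-axis
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=5, scoring="accuracy")

print(train_scores.mean(axis=1))  # score on the training folds
print(val_scores.mean(axis=1))    # score on the held-out folds
```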

  20. Validation Curve
  Fig. 9. Validation Curve for Category Label

  21. Validation Curve
  Fig. 10. Validation Curve for Subcategory Label

  22. Binary Classification
  ▪ Classifies the dataset as normal or malicious network traffic.
  ▪ SVM, Gaussian NB, LDA, and logistic regression performed poorly for binary-label classification.
  ▪ The decision tree, random forest, and ensemble classifiers performed very well for binary-label classification.
  ▪ 3-, 5-, and 10-fold cross-validation tests were run to check for classifier overfitting.
  ▪ The cross-validation results remained unchanged across folds.
  Fig. 11. F-Score for Binary Label (classifiers: SVM, Gaussian NB, LDA, Logistic Regression, Decision Tree, Random Forest, Ensemble; classes: Normal, Anomaly)
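  A minimal sketch of the binary-label experiment with the classifiers the slide highlights, assuming `X` and a binary label vector `y_binary`; the hyperparameters and the hard-voting ensemble composition are illustrative assumptions.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

dt = DecisionTreeClassifier(random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0)
ensemble = VotingClassifier([("dt", dt), ("rf", rf)], voting="hard")

# 3-, 5-, and 10-fold cross-validation, as on the slide; scores that stay
# stable across fold counts suggest the classifier is not overfitting.
for model in (dt, rf, ensemble):
    for k in (3, 5, 10):
        scores = cross_val_score(model, X, y_binary, cv=k, scoring="f1_macro")
        print(type(model).__name__, k, round(scores.mean(), 3))
```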

  23. Category Classification
  ▪ Classifies the dataset as normal network traffic or one of the following attack categories: DoS, Mirai, MITM, or Scan.
  ▪ The decision tree estimator performs very well for all attack categories.
  ▪ Poor performance by logistic regression, LDA, Gaussian NB, and SVM.
  Fig. 12. F-Score for Category Label (classes: Normal, DoS, Mirai, MITM, Scan)
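  A minimal sketch of per-category F-scores like those in Fig. 12, assuming `X` and a category label vector `y_category`; the 70/30 split is an assumption about the evaluation protocol.

```python
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hold out 30% of the data, stratified by category; the ratio is assumed.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y_category, test_size=0.3, random_state=0, stratify=y_category)

dt = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# Per-class precision, recall, and F-score for Normal, DoS, Mirai, MITM, Scan.
print(classification_report(y_te, dt.predict(X_te)))
```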
