Integrating Human and Synthetic Reasoning Via Model-Based Analysis

Introduction and Explanation
• This is an experimental idea and very rough
  – Glue together very tame AI and user interface through some fault trees
• To capture knowledge
• Improve efficiency
• Overview of work
• (My) Questions!

If you haven’t seen this slide, you haven’t attended any of my talks

How much do we know about network traffic?
[Figure: two time-series panels over d06h00 to d07h00, one showing Activity (Flows/5min) and one showing Activity (GB/5min). Series in each panel: All TCP activity, Identifiable File Transfers, Identifiable Control, Scanning.]

Basic problem
• We don’t know what we know and we don’t know what we don’t know
• Most valuable resource available is analyst head time
• Lots of repetitive mindless attacks
• Lots of low-risk, high-threat attacks
• Have to automate
• Also have to ensure automation isn’t self defeating

A Metric For Knowledge
• Every day we receive k-billion flows
  – We can understand and accurately tag x% of them
  – The closer x gets to 100%, the better
• We improve x by:
  – Hiring more analysts
  – Reducing traffic into the network
  – Automating the process
  – Describing multiple flows at one time
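The metric above is a simple ratio, and a minimal sketch makes the arithmetic explicit. The function name `coverage` and the flow counts are hypothetical, chosen only for illustration; the slides do not give real numbers.

```python
def coverage(tagged_flows: int, total_flows: int) -> float:
    """Return x: the percentage of flows that were understood and tagged."""
    if total_flows <= 0:
        raise ValueError("total_flows must be positive")
    if not 0 <= tagged_flows <= total_flows:
        raise ValueError("tagged_flows must be between 0 and total_flows")
    return 100.0 * tagged_flows / total_flows


# Hypothetical day: 1000 flows seen, 750 of them tagged.
x_before = coverage(750, 1000)

# Describing multiple flows at one time: one new model accounts for
# 100 previously untagged flows, so x rises without extra analyst time.
x_after = coverage(750 + 100, 1000)
```

Under these made-up counts, x moves from 75% to 85% by adding a single model rather than by tagging flows one at a time.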

Prototype System Diagram
[Diagram labels: Flow, DPI, AI (three instances), Models, Reports, Alerts, Maps, Historical Maps, Conflict, Admission Of Ignorance]

What is an AI?
[Diagram labels: DPI, AI, Scans, Important Systems Scanned, Maps]
• An AI is a system that reads in network data and outputs:
  – A domain
  – Some models
  – Alerts
  – Inventory data

Complementary AIs
• Accurate
• Predictable
• Unambiguous human/machine communication
• Humans serve as the final judge
• Don’t overwhelm with trivia

Accuracy
• Control ambiguity
  – ROC curves provide us with a measure of accuracy
  – But we’ve generally been unsure about what TP to use
• AIs will not guess: they abstain rather than risk a bad guess
[Figure: ROC curves, True positive rate (percentage) against False positive rate (percentage), for two parameter values (2 and 6).]
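One way to read "AIs will not guess" is as a classifier with a reject option: the AI commits only when its score is clearly high or clearly low, and otherwise admits ignorance. The sketch below assumes that reading. The function name, the score scale, and both thresholds are hypothetical and do not come from the slides.

```python
def classify(score: float, reject_below: float, accept_above: float) -> str:
    """Three-way decision: commit only outside the ambiguous band."""
    if reject_below > accept_above:
        raise ValueError("reject_below must not exceed accept_above")
    if score >= accept_above:
        return "match"
    if score <= reject_below:
        return "no match"
    # Ambiguous region: hand the case to a human instead of guessing.
    return "unknown"


# Hypothetical thresholds; in practice they would be read off the ROC
# curve for whatever false positive rate is tolerable.
decisions = [classify(s, 0.2, 0.9) for s in (0.95, 0.5, 0.1)]
```

Widening the band between the two thresholds trades coverage for accuracy: fewer flows are tagged automatically, but fewer are tagged wrongly.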

Predictable
• There isn’t much we can do…
  – Reports: periodic and predictable information on the state of the system (e.g., scanning)
  – Alerts: when an actionable event occurs, a notice of the event and a recommended strategy (alter fw rules, take down machine, send people with guns)
  – Internal Intelligence: maps of the inside of the network
  – External Intelligence: maps of the outside world
• Inventory is central

Conflict Resolution
• We know that something is something
  – By fiat (“It’s my webserver”)
  – By published reference (port 80 is http)
  – Deep packet inspection (HTTP/1.0…)
  – Behaviorally (short requests, big transfers)
• Hierarchy of certainty
  – DPI >> Fiat >> Behavioral >> Published reference
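The hierarchy of certainty can be sketched as an ordered lookup: take the claim from the most trusted source that has one. The dictionary keys and the function name `strongest_claim` are hypothetical; only the ordering comes from the slide.

```python
from typing import Optional, Tuple

# Most trusted first, following the slide's ordering.
PRIORITY = ["dpi", "fiat", "behavioral", "published"]


def strongest_claim(claims: dict) -> Optional[Tuple[str, str]]:
    """Return (source, label) from the highest-ranked source with a claim."""
    for source in PRIORITY:
        label = claims.get(source)
        if label is not None:
            return source, label
    return None  # no source has anything to say


# Port 80 is http by published reference, but DPI saw an SSH banner.
winner = strongest_claim({"published": "http", "dpi": "ssh"})
```

Here the DPI claim wins over the published reference, which matches the slide's example of trusting what was actually seen on the wire over the port number.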

Managing Conflict
DPI   Fiat   Behavioral   Published   Result
A     -      -            -           Map as A, alert on lack of published info
A     !A     X            X           Map as A, alert on conflict
A     A      X            X           Map as A
A     -      -            !A          Map as A, alert on masquerade
-     A      -            A           Map as A
-     A      !A           A           Report anomaly, Map as A
-     A      -            -           Map as A
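The conflict table above can be sketched as a first-match rule list. This assumes a reading of the symbols that the slide does not spell out: "A" means the source supports A, "!A" means it contradicts A, "-" means it has no evidence, and "X" in a rule means any value is accepted. The names `RULES` and `resolve` are hypothetical.

```python
from typing import Optional

# (DPI, Fiat, Behavioral, Published) -> result, in the slide's row order.
RULES = [
    (("A", "-", "-", "-"), "Map as A, alert on lack of published info"),
    (("A", "!A", "X", "X"), "Map as A, alert on conflict"),
    (("A", "A", "X", "X"), "Map as A"),
    (("A", "-", "-", "!A"), "Map as A, alert on masquerade"),
    (("-", "A", "-", "A"), "Map as A"),
    (("-", "A", "!A", "A"), "Report anomaly, Map as A"),
    (("-", "A", "-", "-"), "Map as A"),
]


def resolve(dpi: str, fiat: str, behavioral: str, published: str) -> Optional[str]:
    """Return the result of the first rule matching the evidence, if any."""
    evidence = (dpi, fiat, behavioral, published)
    for pattern, result in RULES:
        # "X" in a pattern is a wildcard (assumed meaning).
        if all(p == "X" or p == e for p, e in zip(pattern, evidence)):
            return result
    return None  # combination not covered by the table


outcome = resolve("A", "!A", "A", "-")
```

Combinations the table does not list return `None` here, so the caller has to decide what to do with them; the slides do not say.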

Human/Machine Communication
• AIs don’t raise alerts on normal behavior
  – Reports are for that
• AIs raise alerts on actionable anomalies
  – Provide diagnostics, inventory and history
• AIs raise alerts on conflicts
  – Rely on the user to resolve the conflict and move on

User Controls
• Everyone controls domains: sip, dip, sport, dport, time and protocol value
  – Domains have wildcards
• Agents mark or subscribe to a domain:
  – Mark: this happened in the past and I can infer what happened
    • For AIs, Mark indicates “I recognize this”
  – Subscribe: I will control and worry about this from now on
    • For users, subscription says “This is my territory”
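A domain with wildcards, plus mark and subscribe claims over it, might be sketched as follows. This is an assumption-heavy illustration: `None` stands in for a wildcard, matching is exact equality only (no CIDR or port ranges), and the `Domain`, `Claim`, and `claimants` names and all sample values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class Domain:
    # None means "wildcard" for that field (assumed representation).
    sip: Optional[str] = None
    dip: Optional[str] = None
    sport: Optional[int] = None
    dport: Optional[int] = None
    time: Optional[int] = None
    protocol: Optional[int] = None

    def matches(self, flow: dict) -> bool:
        """True if every non-wildcard field equals the flow's value."""
        return all(
            getattr(self, f.name) is None
            or getattr(self, f.name) == flow.get(f.name)
            for f in fields(self)
        )


@dataclass(frozen=True)
class Claim:
    agent: str
    kind: str  # "mark" (I recognize this) or "subscribe" (my territory)
    domain: Domain


def claimants(flow: dict, claims: List[Claim], kind: str) -> List[str]:
    """Agents holding a claim of the given kind that covers this flow."""
    return [c.agent for c in claims if c.kind == kind and c.domain.matches(flow)]


claims = [
    Claim("scan-ai", "mark", Domain(sip="10.0.0.5")),
    Claim("web-team", "subscribe", Domain(dport=80, protocol=6)),
]
flow = {"sip": "10.0.0.5", "dip": "192.0.2.10", "sport": 51515,
        "dport": 80, "time": 1000, "protocol": 6}
owners = claimants(flow, claims, "subscribe")
```

A flow covered by more than one claim is exactly the situation the conflict rules are meant to handle.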

Models
• AIs don’t output flow data
  – They mark off some segment of flows and group them together as a separate structure
• For example:
  – A “scan”
  – A “BitTorrent Network”
  – A “Surfing session”
• These models, in turn, have questions and structures that are more relevant to analysis
  – Who did the scan hit?
  – How much traffic was transferred in BT?

A Really Ugly UI

What that is
• Certainly not a testament to my visualization skills
• Prototype using two systems
  – Simple scan detection
  – BitTorrent detection
• The black is what’s left

Problems
• As I said, this is all very rough right now
• Problems remaining:
  – Application/Knowledge Layering
  – Model Taxonomy
  – User experience
  – Backtracking
  – Metadata

Application Layering
• Make judgments at different levels of the stack
• Different inferential resolution:
  – Does this IP exist?
  – Does this IP communicate?
  – What does this IP communicate?
  – Is this IP significant in its network?

Model Taxonomy
[Taxonomy diagram labels: Flood, Service, Routing, Scanning, Backscatter, Xfer, Chatty, Worm, DDoS]
• Models replace flows with more compact descriptions of phenomena
  – E.g., “A Scan” is a list of the scanned IPs, and anything that responded
• Trying to begin with broad behavioral descriptions and move down from there
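The "A Scan" example can be sketched as a small structure built from flow records. This assumes flows are reduced to (source IP, destination IP) pairs and that the scanner is already known; the `ScanModel` and `build_scan` names and the sample addresses are hypothetical, and no scan detection is attempted here.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ScanModel:
    scanner: str
    scanned: Set[str] = field(default_factory=set)    # IPs the scanner probed
    responded: Set[str] = field(default_factory=set)  # probed IPs that answered


def build_scan(scanner: str, flows: List[Tuple[str, str]]) -> ScanModel:
    """Collapse a scanner's flows into one compact model."""
    model = ScanModel(scanner)
    # First pass: every destination the scanner sent to.
    for sip, dip in flows:
        if sip == scanner:
            model.scanned.add(dip)
    # Second pass: scanned hosts that sent something back to the scanner.
    for sip, dip in flows:
        if dip == scanner and sip in model.scanned:
            model.responded.add(sip)
    return model


flows = [
    ("203.0.113.9", "192.0.2.1"),
    ("203.0.113.9", "192.0.2.2"),
    ("203.0.113.9", "192.0.2.3"),
    ("192.0.2.2", "203.0.113.9"),   # one target answered
    ("198.51.100.7", "192.0.2.1"),  # unrelated traffic, ignored
]
scan = build_scan("203.0.113.9", flows)
```

Five flow records become one object that directly answers "Who did the scan hit?" and "Who responded?".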

Unsolved Problems
• Weirdometer metrics
  – Flows/IPs/bytes/IP pairs?
• Backtracking
  – How much do we want to see flow vs. model vs. map?
• Response Mechanism
  – What can a CSIRT do?
• Meta metrics
  – How much of the traffic do we understand?
