VAST: Visibility Across Space and Time
Architecture & Usage
Matthias Vallentin
matthias@bro.org
BroCon, August 19, 2014
Outline
1. Introduction: VAST
2. Architecture
   Overview
   Example Workflow: Query
   Data Model
   Implementation
3. Using VAST
4. Demo
VAST: Visibility Across Space and Time
A unified platform for network forensics
[Figure: sample records (10.0.0.1 10.0.0.254 53/udp; 10.0.0.2 10.0.0.254 80/tcp) with the labels Data, VAST, and Queries]
Goals
◮ Interactivity
  ◮ Sub-second response times
  ◮ Iterative query refinement
◮ Scalability
  ◮ Scale with data & number of nodes
  ◮ Sustain high & continuous input rates
◮ Strong and rich typing
  ◮ High-level types and operations
  ◮ Type safety
VAST & Bro
Bro
◮ Generates rich-typed logs representing a summary of activity
  → How to process these huge piles of logs?
◮ Fine-grained events exist during runtime only
  → Make ephemeral events persistent?
VAST: Visibility Across Space and Time
◮ Visibility across Space
  ◮ Unified data model: same expressiveness as Bro
  ◮ Combine host-based and network-based activity
◮ Visibility across Time
  ◮ Historical queries: retrieve data from the past
  ◮ Live queries: get notified when new data matches query
VAST & Big Data Analytics
MapReduce (Hadoop)
Batch-oriented processing: full scan of data
+ Expressive: no restriction on algorithms
- Speed & Interactivity: full scan for each query
In-memory Cluster Computing (Spark)
Load full data set into memory and then run query
+ Speed & Interactivity: fast on arbitrary queries over working set
- Thrashing when working set too large
Distributed Indexing (VAST)
Distributed building and querying of bitmap indexes
+ Fast: only access space-efficient indexes
+ Caching of index hits enables iterative analyses
- Reduced computational model (e.g., no joins in query language)
High-Level Architecture of VAST
[Figure: sample input records (10.0.0.1 10.0.0.254 53/udp; 10.0.0.2 10.0.0.254 80/tcp) and the components Import, Archive, Index, Export]
Import
◮ Unified data model
◮ Sources generate events
Archive
◮ Stores raw data as events
◮ Compressed chunks & segments
Index
◮ Secondary indexes into archive
◮ Horizontally partitioned
Export
◮ Interactive query console
◮ JSON/Bro output
Query Language
Boolean Expressions
◮ Conjunctions &&
◮ Disjunctions ||
◮ Negations !
◮ Predicates
  ◮ LHS op RHS
  ◮ (expr)
Examples
◮ A && B || !(C && D)
◮ orig_h == 10.0.0.1 && &time < now - 2h
◮ &type == "conn" || :string +] "foo"
◮ duration > 60s && service == "tcp"
LHS: Extractors
◮ &type
◮ &time
◮ x.y.z.arg
◮ :type
Relational Operators
◮ < , <= , == , >= , >
◮ in , ni , [+ , +]
◮ !in , !ni , [- , -]
◮ ~ , !~
RHS: Value
◮ T , F
◮ +42 , 1337 , 3.14
◮ "foo"
◮ 10.0.0.0/8
◮ 80/tcp , 53/?
◮ {1, 2, 3}
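The "LHS op RHS" predicate shape and the boolean connectives can be illustrated with a small evaluator. This is a minimal sketch in Python, not VAST's implementation: events are assumed to be plain dictionaries, the extractor is reduced to a field name, expressions are hand-built nested tuples rather than parsed query strings, and only the five comparison operators are modeled. The names `OPS` and `evaluate` are hypothetical.

```python
import operator

# Relational operators from the slide that map directly onto Python comparisons.
# The other operators (in, ni, [+, +], ~, ...) are deliberately left out.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}

def evaluate(expr, event):
    """Evaluate a nested-tuple expression against an event (a dict)."""
    kind = expr[0]
    if kind == "pred":                      # ("pred", field, op, value)
        _, field, op, value = expr
        return field in event and OPS[op](event[field], value)
    if kind == "and":                       # ("and", e1, e2, ...)
        return all(evaluate(e, event) for e in expr[1:])
    if kind == "or":                        # ("or", e1, e2, ...)
        return any(evaluate(e, event) for e in expr[1:])
    if kind == "not":                       # ("not", e)
        return not evaluate(expr[1], event)
    raise ValueError(f"unknown expression kind: {kind}")

# duration > 60s && service == "tcp"   (duration modeled as seconds)
query = ("and",
         ("pred", "duration", ">", 60),
         ("pred", "service", "==", "tcp"))

print(evaluate(query, {"duration": 75, "service": "tcp"}))  # True
print(evaluate(query, {"duration": 5, "service": "tcp"}))   # False
```

A predicate on a field that the event lacks evaluates to false here; that is a choice made for the sketch, not something the slides specify.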
Query
[Figure: Client, Search, Query, Index (Partitions, Indexers), and Archive. The example query src == 10.0.0.1 && port == 53/udp travels from the client to search and on to the index, where it is shown split into its two predicates.]
client
1. Send query string to search
2. Receive query actor
3. Extract results from query
search
1. Parse and validate query string
2. Spawn dedicated query
3. Forward query to index
query
1. Receive hits from index
2. Ask archive for segments
3. Extract events, check candidates
4. Send results to client
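The steps above can be traced in a few lines of Python. This is a toy sketch under stated assumptions, not VAST's actor-based implementation: actors are replaced by plain functions, the index is a dictionary from (field, value) pairs to sets of event IDs, the archive is a dictionary of segments, and only conjunctions of `==` predicates are parsed. All names (`parse`, `index_lookup`, `run_query`, `search`, `SEGMENT_SIZE`) are hypothetical.

```python
SEGMENT_SIZE = 2  # hypothetical: number of event IDs per segment in this toy archive

def parse(query_string):
    """search, step 1: parse and validate a conjunction of `field == value`."""
    predicates = []
    for part in query_string.split("&&"):
        field, sep, value = part.partition("==")
        if not sep or not field.strip() or not value.strip():
            raise ValueError(f"invalid predicate: {part!r}")
        predicates.append((field.strip(), value.strip()))
    return predicates

def index_lookup(index, predicates):
    """index: look up each predicate and intersect the resulting ID sets."""
    hit_sets = [index.get(p, set()) for p in predicates]
    return set.intersection(*hit_sets)

def run_query(predicates, hits, archive):
    """query: fetch segments for the hits, extract events, check candidates."""
    results = []
    for seg_id in sorted({i // SEGMENT_SIZE for i in hits}):
        for event_id, event in archive[seg_id].items():
            if event_id in hits and all(event.get(f) == v for f, v in predicates):
                results.append(event)
    return results

def search(query_string, index, archive):
    """search: parse the query, forward it to the index, run the query."""
    predicates = parse(query_string)
    hits = index_lookup(index, predicates)
    return run_query(predicates, hits, archive)

# Toy data: three events, stored in segments and indexed by (field, value).
events = {
    1: {"src": "10.0.0.1", "port": "53/udp"},
    2: {"src": "10.0.0.2", "port": "80/tcp"},
    3: {"src": "10.0.0.1", "port": "80/tcp"},
}
index, archive = {}, {}
for event_id, event in events.items():
    archive.setdefault(event_id // SEGMENT_SIZE, {})[event_id] = event
    for field, value in event.items():
        index.setdefault((field, value), set()).add(event_id)

print(search("src == 10.0.0.1 && port == 53/udp", index, archive))
# [{'src': '10.0.0.1', 'port': '53/udp'}]
```

The candidate check in `run_query` is redundant for this exact index, but it mirrors the "check candidates" step: index hits are treated as candidates that are confirmed against the extracted events.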
VAST Architecture
[Figure: architecture diagram with sample input records (10.0.0.1 10.0.0.254 53/udp; 10.0.0.2 10.0.0.254 80/tcp) and the components Import, Archive, Index, Export]
Data Representation
Terminology
◮ Data: C++ structures (e.g., 64ull)
◮ Type: interpretation of data (e.g., count)
◮ Value: data + type
◮ Event: value + meta data
  ◮ Type with a unique name (e.g., conn)
  ◮ Meta data
    ◮ A timestamp
    ◮ A unique ID i where i ∈ [1, 2^64 − 1)
◮ Schema: collection of event types
◮ Chunk: serialized & compressed events
  ◮ Meta data: schema + time range + IDs
  ◮ Fixed number of events, variable size
◮ Segment: sequence of chunks
  ◮ Meta data: union of chunk meta data
  ◮ Fixed size, variable number of chunks
[Figure: an Event (ID, TYPE, TIME, followed by the values "foo", 3.14, 7 ms), a Chunk with META, and a Segment with META]
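The relationship between events, chunks, and segment meta data can be sketched as follows. This is an illustration in Python under loud assumptions: JSON plus zlib stands in for whatever serialization and compression VAST actually uses, a schema is reduced to a set of event type names, and the names `Event`, `Chunk`, `make_chunk`, `read_chunk`, and `segment_meta` are hypothetical.

```python
import json
import zlib
from dataclasses import dataclass, asdict

@dataclass
class Event:
    id: int           # unique event ID
    type: str         # event type name, e.g. "conn"
    timestamp: float  # stand-in for a time point
    value: list       # the event's data

@dataclass
class Chunk:
    schema: set       # event type names present in the chunk
    time_range: tuple # (earliest, latest) timestamp
    ids: set          # IDs of the contained events
    payload: bytes    # serialized and compressed events

def make_chunk(events):
    """Serialize and compress events; derive the chunk meta data from them."""
    payload = zlib.compress(json.dumps([asdict(e) for e in events]).encode())
    times = [e.timestamp for e in events]
    return Chunk({e.type for e in events}, (min(times), max(times)),
                 {e.id for e in events}, payload)

def read_chunk(chunk):
    """Decompress and deserialize the events of a chunk."""
    return [Event(**d) for d in json.loads(zlib.decompress(chunk.payload))]

def segment_meta(chunks):
    """Segment meta data: the union of the meta data of its chunks."""
    return {
        "schema": set().union(*(c.schema for c in chunks)),
        "time_range": (min(c.time_range[0] for c in chunks),
                       max(c.time_range[1] for c in chunks)),
        "ids": set().union(*(c.ids for c in chunks)),
    }

e1 = Event(1, "conn", 10.0, ["foo", 3.14, 7])
e2 = Event(2, "conn", 12.5, ["bar", 2.71, 9])
e3 = Event(3, "dns", 20.0, ["baz"])
c1, c2 = make_chunk([e1, e2]), make_chunk([e3])

print(read_chunk(c1) == [e1, e2])          # True
print(segment_meta([c1, c2])["time_range"])  # (10.0, 20.0)
```

Keeping the time range and IDs in the meta data means a reader can decide whether a chunk is relevant without decompressing its payload.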
Types: Interpretation of Data
[Figure: type hierarchy rooted at TYPE. Types shown: bool, int, count, double, time range, time point, string, regex, address, subnet, port, none, vector (of TYPE), set (of TYPE), table (KEY TYPE → VALUE TYPE), record (field 1 … field n). Legend: basic types, container types, compound types, recursive types.]
Index Hits: Sets of Events
Bitvector: sets of events
◮ Query result ≡ set of event IDs from [1, 2^64 − 1)
  → Model as bit vector: [4, 7, 8] = 0000100110⋯
Bitstream: encoded append-only sequence of bits
◮ EWAH (no patents unlike WAH, PLWAH, COMPAX)
◮ Compact, space-efficient representation
◮ Bitwise operations do not require decoding
Bitmap: maps values to bitstreams
◮ push_back(T x): append value x of type T
◮ lookup(T x, Op ◦): get bitstream for x under ◦

Data | B0 B1 B2 B3
  2  |  0  0  1  0
  1  |  0  1  0  0
  2  |  0  0  1  0
  0  |  1  0  0  0
  0  |  1  0  0  0
  1  |  0  1  0  0
  3  |  0  0  0  1

[Figure: a column of bits indexed from 0 to 2^64 − 1]
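The `push_back`/`lookup` interface can be tried out on the data column from the table above. This is a minimal sketch in Python that assumes one uncompressed bit vector per distinct value, held in a Python integer; EWAH encoding is not modeled, and the class name `EqualityBitmap` and the helper `positions` are hypothetical.

```python
import operator

class EqualityBitmap:
    """Maps each distinct value to a bit vector (a Python int, uncompressed)."""

    def __init__(self):
        self.size = 0   # number of values appended so far
        self.bits = {}  # value -> int; bit i is set if position i holds the value

    def push_back(self, x):
        """Append value x at the next position."""
        self.bits[x] = self.bits.get(x, 0) | (1 << self.size)
        self.size += 1

    def lookup(self, x, op=operator.eq):
        """Return the bit vector of positions whose value v satisfies op(v, x)."""
        result = 0
        for value, bits in self.bits.items():
            if op(value, x):
                result |= bits
        return result

def positions(bits):
    """List the positions of the set bits, lowest first."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

bitmap = EqualityBitmap()
for value in [2, 1, 2, 0, 0, 1, 3]:   # the Data column of the table
    bitmap.push_back(value)

print(positions(bitmap.lookup(2)))               # [0, 2]
print(positions(bitmap.lookup(2, operator.lt)))  # [1, 3, 4, 5]
```

Positions here start at 0 and count rows of the table; mapping them onto event IDs is left out of the sketch.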
Composing Results via Bitwise Operations
Combining Predicates
◮ Query Q = X ∧ Y ∧ Z
  ◮ x = 1.2.3.4 ∧ y < 42 ∧ z ∈ "foo"
◮ Bitmap index lookup yields X → B1, Y → B2, and Z → B3
◮ Result R = B1 & B2 & B3
[Figure: B1 & B2 & B3 = R]
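A short worked example of R = B1 & B2 & B3, written in Python with integers as uncompressed bit vectors. The three hit sets are made-up values chosen for illustration, and `to_bits` and `to_ids` are hypothetical helpers.

```python
def to_bits(ids):
    """Build a bit vector (Python int) with one set bit per ID."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits

def to_ids(bits):
    """Recover the sorted list of IDs from a bit vector."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Hypothetical index hits for the three predicates X, Y, and Z.
b1 = to_bits([1, 4, 7, 8])     # hits for X
b2 = to_bits([4, 5, 7, 8, 9])  # hits for Y
b3 = to_bits([0, 4, 8])        # hits for Z

r = b1 & b2 & b3               # conjunction of predicates = bitwise AND
print(to_ids(r))               # [4, 8]
```

Only IDs present in all three hit sets survive the AND, so the archive needs to be consulted for two events instead of the union of all hits.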