This paper is included in the Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI '16).
March 16–18, 2016 • Santa Clara, CA, USA
ISBN 978-1-931971-29-4

Open access to the Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI '16) is sponsored by USENIX.

VAST: A Unified Platform for Interactive Network Forensics
Matthias Vallentin, University of California, Berkeley; Vern Paxson, University of California, Berkeley, and International Computer Science Institute; Robin Sommer, International Computer Science Institute and Lawrence Berkeley National Laboratory
https://www.usenix.org/conference/nsdi16/technical-sessions/presentation/vallentin



VAST: A Unified Platform for Interactive Network Forensics

Matthias Vallentin
UC Berkeley

Vern Paxson
UC Berkeley / ICSI

Robin Sommer
ICSI / LBNL

Abstract

Network forensics and incident response play a vital role in site operations, but for large networks can pose daunting difficulties to cope with the ever-growing volume of activity and resulting logs. On the one hand, logging sources can generate tens of thousands of events per second, which a system supporting comprehensive forensics must somehow continually ingest. On the other hand, operators greatly benefit from interactive exploration of disparate types of activity when analyzing an incident.

In this paper, we present the design, implementation, and evaluation of VAST (Visibility Across Space and Time), a distributed platform for high-performance network forensics and incident response that provides both continuous ingestion of voluminous event streams and interactive query performance. VAST leverages a native implementation of the actor model to scale both intra-machine across available CPU cores, and inter-machine over a cluster of commodity systems.

1 Introduction

Security incidents often leave network operators scrambling to ferret out answers to key questions: How did the attackers get in? What did they do once inside? Where did they come from? What activity patterns serve as indicators reflecting their presence? How do we prevent this attack in the future?

Operators can only answer such questions by drawing upon high-quality logs of past activity recorded over extended time. Incident analysis often starts with a narrow piece of intelligence, typically a local system exhibiting questionable behavior, or a report from another site describing an attack they detected. The analyst then tries to locate the described behavior by examining logs of past activity, often cross-correlating information of different types to build up additional context. Frequently, this process in turn produces new leads to explore iteratively ("peeling the onion"), continuing and expanding until ultimately the analyst converges on as complete an understanding of the incident as they can extract from the available information.

This process, however, remains manual and time-consuming, as no single storage system efficiently integrates the disparate sources of data (e.g., NIDS, firewalls, NetFlow data, service logs, packet traces) that investigations often involve. While standard SIEM systems such as Splunk aggregate logs from different sources into a single database, their data models omit crucial semantics, and they struggle to scale to the data rates that large-scale environments require.

Based on these needs, and drawing upon our years of experience working closely with operational security staff, we formulate three key goals for a system supporting the forensic process [2]:

Interactivity. The potential damage that an attacker can wreak inside an organization grows quickly as a function of time, making fast detection and containment a vital concern. Further, a system's interactivity greatly affects the productivity of an analyst [16]. We thus desire replies to queries to begin arriving within a second or so.

Scalability. The volume of data to archive and process exceeds the capacity of single-machine deployments. A fundamental challenge lies in devising a distributed architecture that scales with the number of nodes in the system, as well as maximally utilizes the cores available in each node.

Expressiveness. Representing arbitrary activity requires a richly typed data model to avoid losing domain-specific semantics when importing data. Similarly, the system should expose a high-level query language to enable analysts to work within their domain, rather than spending time translating their workflows to lower-level system idiosyncrasies.

In this work, we develop a system for network forensics and incident response that aims to achieve these goals. We present the design and implementation of VAST (Visibility Across Space and Time), a unified storage platform that provides: (i) an expressive data model to capture descriptions of various forms of activity; (ii) the capability to use a single, declarative query language to drive both post-facto analyses and detection of future activity; and (iii) the scalability to support archiving and querying of not just log files, but a network's entire activity, from high-level IDS alerts to raw packets from the wire.

The key to VAST's power concerns providing the necessary performance to support both very high data volumes (100,000s of events/sec) and interactive queries against extensive historical data. VAST features an entirely asynchronous architecture designed in terms of the actor model [25], a message-passing abstraction for concurrent systems, to fully utilize all available CPU and storage resources, and to transparently scale from single-node to cluster deployments. To support interactive queries, VAST relies extensively on bitmap indexes that we adapt to support its expressive query language.

Our evaluations show that on a single machine VAST can ingest 100 K events/sec for events with 20 fields, reflecting an input rate of 2 M values/sec. Moreover, distributed ingestion allows for spreading the load over numerous system entry points. Users typically receive a "taste" of their results within 1 sec. This first subset helps users quickly triage the relevance of the result and move on with the analysis by aborting or modifying the current query. We also show that VAST, with its unified approach, can effectively serve as a high-volume bulk packet recorder.

We structure the rest of the paper as follows. In §2 we summarize related work. We present the architecture of VAST in §3 and our implementation in §4. In §5 we evaluate VAST and assess its aptness for the domain. Finally, we conclude in §6.

2 Related Work

Data Warehouses. VAST receives read-only data for archiving, similar to a data warehouse. Dremel [40] stores semi-structured data in situ and offers an SQL interface for ad-hoc queries with interactive response. Dremel's query executor forms a tree structure where intermediate nodes aggregate results from their children. VAST generalizes this approach with its actor-based architecture to both data import and query execution.

Succinct [1] also stores data in situ, but in compressed flat files that do not require decompression when searched. Internally, Succinct operates on suffix trees and therefore supports point, wildcard, and lexicographical lookup on strings. Other data types (e.g., arithmetic, compound) require transformations into strings to maintain a lexicographical ordering. Succinct exhibits high preprocessing costs and modest sequential throughput, rendering it inapt for high-volume scenarios. When the working set fits in memory, Succinct offers competitive performance, but not when primarily executing off the filesystem.

ElasticSearch [17] is a distributed, document-oriented database built on top of Apache Lucene [36], which provides a full-text inverted index. ElasticSearch hides Lucene behind a RESTful API and a scheme to partition data over a cluster of machines. VAST uses a similar deployment model, but instead provides a semi-structured data model and internally relies on different indexing technology more amenable to hit composition and iterative refinements.

Network Forensics. NET-Fli [20] is a single-machine NetFlow indexer relying on secondary bitmap indexes. The authors also present a promising (though patented) bit vector encoding scheme, COMPAX. Instead of hand-optimizing a system for NetFlow records, VAST offers a generic data model. The separation between base data and indexes has also found application in similar systems [51], with the difference of relying on a column store for the base data instead of a key-value store. These existing systems show how one can design a single-machine architecture, whereas we present a design that transparently scales from single-machine to cluster deployments.

The Time Machine [38] records raw network traffic in PCAP format and builds indexes for a limited set of packet header fields. To cope with large traffic volumes, the Time Machine employs a cutoff to cease recording a connection's packets after they exceed a size threshold. The system hard-codes the use of four tree indexes over the connection tuple, and cannot reuse its indexes across restarts. Similarly, NetStore [22], pcapIndex [19], and FloSIS [33] offer custom architectures geared specifically towards flow archival. VAST represents a superset of bulk packet recorders: it supports the same cutoff functionality, and packets simply constitute a particular event type in VAST's data model.

The GRR Rapid Response framework [11] enables live forensic investigations where analysts push out queries to agents running on end-hosts. A NoSQL store accumulates the query results in a central location, but GRR does not feature a persistence component to comprehensively archive end-host activity over long time periods. VAST can serve as a long-term storage backend for host-level data, which allows analysts to query both host and network events in a unified fashion.

Finally, existing aggregators such as Splunk [49] operate on unstructured, high-level logs that lack the semantics to support typed queries, and are not designed for storing data at the massive volumes required by lower-level representations of activity. Splunk in particular cannot dynamically adapt its use of CPU resources to changes in workload.

Distributed Computing. The MapReduce [14] execution model enables arbitrary computation, distributed over a cluster of machines. While generic, MapReduce cannot deliver interactive response times on large datasets due to the full data scan performed for each job. Spark [58] overcomes this limitation with a distributed in-memory cluster computing model where data is efficiently shared between stages of computation. However, for rapid response times, the entire dataset must reside preloaded in memory. But analysts can rarely define a working set a priori, especially for spatially distant data, which can result in thrashing due to frequent loading and evicting of working sets from memory. We envision VAST going hand-in-hand with Hadoop or Spark, where VAST quickly finds a tractable working set and then hands it off to a system well-suited for more complex analysis.

3 Architecture

To support flexible deployments on large-scale clusters as well as single machines, while retaining a high degree of concurrency, we designed VAST in terms of the actor model [25]. In this model, concurrent entities ("actors") execute independently and in parallel, while providing local fault isolation and recovery. Using unique, location-independent addresses, actors communicate asynchronously solely via message passing. They do not share state, which prevents data races by design.

A related model of computation is communicating sequential processes (CSP) [26], in which processes communicate via synchronous channels. As a result, the sender blocks until the receiver has processed the message. This creates a tighter coupling compared to the asynchronous fire-and-forget semantics of actor messaging. CSP emphasizes the channel, while the actor model emphasizes its endpoints: actors have a location-independent address, whereas processes remain anonymous. In the context of distributed systems, the focus on endpoints provides a powerful advantage: the actor model contains a flexible failure propagation model based on monitors, links, and hierarchical supervision trees [4]. These primitives allow for isolating failure domains and implementing local recovery strategies, and thus constitute an essential capability at scale, where component failure is the norm rather than the exception. For these reasons, we deem the actor model a superior fit for our requirements.

We first present the underlying data model and the associated query language (§3.1), and then VAST's components and their structure (§3.2).

3.1 Data Model

VAST's data model consists of types, which define the physical interpretation of data. A type's signature includes a type name and type attributes. A value combines a type with a data instance. An event is a value with additional metadata, such as a timestamp, a unique identifier, and arbitrary key-value pairs. A schema describes the access structure of one or more types.

Boolean Expression     Symbol
  Conjunction          E1 && E2
  Disjunction          E1 || E2
  Negation / Group     !E / (E)
  Predicate            LHS ◦ RHS

Relational Operator ◦  Symbol
  Arithmetic           <, <=, ==, !=, >=, >
  Membership           in, !in

Extractor (LHS/RHS)    Semantics
  :T                   All values having type T
  x.y.z                Value according to schema
  &key                 Event meta data

Types                  Examples
  bool                 T, F
  int / count / real   +42 / 42 / -4.2
  duration / time      10ms / 2014-09-25
  string               "foo", "b\x2Ar"
  addr                 10.0.0.1, ::1
  subnet               192.168.0.0/24
  port                 80/tcp, 53/udp, 8/icmp
  vector<T> / set<T>   [x, x, x] / {x, x, x}
  table<T,U>           {(k,v), ..}

Table 1: VAST's query language.

VAST's type system includes basic types to represent a single value (booleans, signed/unsigned integers, floating-point values, times and durations, strings, IPv4 and IPv6 addresses, subnets, ports), container types for bundled values (vectors, sets, tables), and compound types to create sequenced structures (records), where each named field holds a value (or nil if absent).

Query Language. VAST's query language supports filtering data according to boolean algebra. Table 1 lists the key syntactic elements. A query expression consists of one or more predicates connected with boolean AND/OR/NOT. A predicate has the form LHS ◦ RHS, with ◦ representing a binary relational operator. VAST supports arithmetic and membership operators. At least one side of the operator typically must be an extractor, which specifies the lookup aspect for the value, as follows.

Schema extractors refer to particular values in the schema. For example, in the predicate http.method == "POST", http.method is a schema extractor, and "POST" is the value to match. Meta extractors refer to event metadata, such as &name to reference the event name and &time the event timestamp. For example, the predicate &time > now - 1d selects all events within the last 24 hours. Type extractors leverage the strict typing in VAST to perform queries over all values having a given type. For example, the predicate :addr in 10.0.0.0/8 applies to all IP addresses (any VAST value or record field with type addr).

To represent a log file having a fixed number of columns, VAST automatically transforms each line into a record whose fields correspond to the columns. VAST enforces type safety over queries and only forwards them to those index partitions with a compatible schema.

(a) Single-machine deployment. The NODE actor accommodates all key system actors. (b) Cluster deployment. Multiple NODEs peer to form a cluster, which users access transparently.

Figure 1: VAST system architecture.
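Combining the three extractor kinds, an illustrative conjunction (our example, not one taken from the paper) reads:

```
&time > now - 1d && http.method == "POST" && :addr in 10.0.0.0/8
```

This selects HTTP POST events from the last 24 hours that involve an address inside 10.0.0.0/8.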

3.2 Components

From a high-level view, VAST consists of four key components: (i) import, to parse data from a source into events and assign them a globally unique ID; (ii) archive, to store compressed events and provide a key-value interface for retrieval; (iii) index, to accelerate queries by keeping a partitioned secondary index referencing events in the archive; and (iv) export, to spawn queries and relay them to sinks of various output formats. Each component consists of multiple actors, which can execute all inside the same process, across separate processes on the same host, or on different machines.

A NODE¹ represents a container for other actors. Typically, a NODE maps to a single OS-level process. Users can spawn components all within a NODE (Figure 1(a)) or spread them out over multiple NODEs (Figure 1(b)) to harness available hardware and network resources in the most effective way. NODEs can peer to form a cluster, using Raft [42] for achieving distributed consensus over global state, accessible through a key-value store interface similar to etcd [18]. We refer to this globally replicated state as the meta store. Each NODE has access to its own meta store instance, which performs the fault-tolerant distribution of values. A typical cluster deployment exhibits a shared-nothing architecture, where each NODE constitutes a fully independent system by itself. In the following we discuss each of the four components in more detail.

¹We refer to particular actors in a SMALL CAPS font style.

Import. Data enters VAST via SOURCEs, each of which can parse various input formats. SOURCEs produce batches of events and relay them to IMPORTER, where they receive a unique monotone ID. Upon receiving the ID range and assigning the IDs to the events, IMPORTER relays the batch to ARCHIVE and INDEX.

Each event represents a unique description of activity which analysts need to be able to unambiguously reference. This requires each event to have a unique identifier (ID) as metadata independent of its value. The ID also establishes a link between the archive and index components: an index lookup yields a set of IDs, each of which identifies a single event in the archive. This yields the following requirements on ID generation: (i) 64 bits to represent a sufficiently large number of events, but not larger, since modern processors efficiently operate on 64-bit integers; (ii) monotonicity, because the indexes we use are append-only data structures; and (iii) ID generation should also work in distributed setups.

The sequentiality requirement precludes approaches involving randomness, such as universally unique identifiers [32]. In fact, for an ID space of size N, any random ID generation algorithm experiences collisions after ≈ √N IDs due to the birthday paradox. In combination with the 64-bit requirement, this would degenerate the effective space from 2^64 to only √(2^64) = 2^32 IDs. Our approach uses a single distributed counter in the meta store. Requesting a range of N IDs translates to incrementing this counter by N, which returns a pair with the old and new counter value [o, n) denoting the allocated half-open range with n − o ≤ N IDs. To avoid high latencies from frequent interaction with the meta store, IMPORTER keeps a local cache of IDs and only replenishes it when running low on IDs.
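The counter-plus-cache scheme above can be sketched as follows. In VAST the counter lives in the Raft-replicated meta store; here a plain in-process counter stands in for it, and the names MetaStoreCounter and IdCache are illustrative, not VAST's actual API.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>
#include <utility>

// Stand-in for the meta store's single distributed counter: requesting a
// range of n IDs is a fetch-and-add returning the half-open range [o, o+n).
class MetaStoreCounter {
  std::uint64_t counter_ = 0;  // globally replicated in the real system
public:
  std::pair<std::uint64_t, std::uint64_t> allocate(std::uint64_t n) {
    std::uint64_t old = counter_;
    counter_ += n;
    return {old, counter_};
  }
};

// IMPORTER-side cache: fetch a whole range at once and hand out IDs
// locally, replenishing only when the cached range runs dry.
class IdCache {
  MetaStoreCounter& store_;
  std::uint64_t next_ = 0;
  std::uint64_t end_ = 0;
  std::uint64_t batch_;
public:
  explicit IdCache(MetaStoreCounter& store, std::uint64_t batch = 128)
    : store_(store), batch_(batch) {}
  std::uint64_t next_id() {
    if (next_ == end_)  // running low: one round trip buys batch_ IDs
      std::tie(next_, end_) = store_.allocate(batch_);
    return next_++;
  }
};
```

Several IMPORTERs sharing one counter thus obtain disjoint, monotone ID ranges while paying for a meta-store round trip only once per batch.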

Archive. The ARCHIVE receives events from IMPORTER and stores them compressed on the filesystem.² To avoid frequent I/O operations for small amounts of data, ARCHIVE keeps event batches in a fixed-size memory buffer (by default 128 MB) before writing them to the filesystem. The buffer (which we term segment) keeps batches sorted by the ID of their first event. Because events have continuous IDs within a batch, this process ensures strictly monotonic IDs within a segment.

²VAST supports LZ4 [37] and Snappy [48] for compression; both trade higher speeds for lower compression ratios.

ARCHIVE exposes a key-value interface: queried with a specific ID, it returns a batch containing the ID. The alternative, returning the single matching event, only works for small requests, but would quickly bottleneck the messaging subsystem for moderate request volumes. Internally, ARCHIVE operates at the granularity of segments to further group event batches into larger blocks suitable for sequential filesystem I/O. ARCHIVE keeps an LRU cache of a configurable number of segments in memory. In the future, we plan to store data in a format also shareable with other applications, e.g., HDFS [46].
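The segment buffering described above can be sketched as follows; the flush simply hands the ID-sorted batches to a callback, and the types, names, and sizes are illustrative assumptions, not VAST's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// A batch of events, identified by the ID of its first event.
struct Batch {
  std::uint64_t first_id;  // ID of the first event in the batch
  std::size_t bytes;       // serialized size of the batch
};

// Fixed-size segment: batches accumulate sorted by first event ID, and the
// segment is written out (here: passed to a callback) once full.
template <class FlushFn>
class Segment {
  std::map<std::uint64_t, Batch> batches_;  // sorted by first event ID
  std::size_t bytes_ = 0;
  std::size_t capacity_;
  FlushFn flush_;
public:
  Segment(std::size_t capacity, FlushFn flush)
    : capacity_(capacity), flush_(flush) {}
  void add(Batch b) {
    bytes_ += b.bytes;
    batches_.emplace(b.first_id, b);
    if (bytes_ >= capacity_)  // segment full: write out in ID order
      flush();
  }
  void flush() {
    std::vector<Batch> out;
    for (auto& [id, b] : batches_)
      out.push_back(b);
    flush_(out);
    batches_.clear();
    bytes_ = 0;
  }
};
```

Because each batch carries continuous IDs and the map orders batches by their first ID, a flushed segment contains strictly monotonic IDs regardless of arrival order.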

Index. By itself, ARCHIVE does not provide efficient access to data, since extracting events with specific properties would require a full scan of the archive. Therefore, VAST maintains a comprehensive set of secondary indexes, which we divide into horizontal partitions as a unit of data scaling.

We chose bitmap indexes [41] because they provide an excellent fit for the domain. First, appending new values only requires time linear in the number of values in the new data, which is optimal and yields deterministic performance. Second, bitmap indexes have space-efficient representations that enable us to carry out bitwise operations without expanding them. Third, bitmap indexes compose efficiently: intermediate results have the form of bit vectors, and combining them with logical operations translates into inexpensive bitwise operations. In §4.2 we describe bitmap indexes in more detail.

VAST's index consists of horizontal PARTITIONs, each of which manages an independent set of bitmap indexes. An active partition is mutable and can incorporate events, whereas a passive partition is an immutable unit which the scheduler manages during query processing. INDEX relays each arriving batch of events to the currently active PARTITION, which spawns a set of INDEXERs, one per event type. An INDEXER may internally use further helper actors; e.g., record INDEXERs use one helper per field. In comparison to a relational database, this architecture provides a fine-grained concurrent equivalent to tables (INDEXERs) and their attributes. PARTITION relays each batch concurrently to all INDEXERs, each of which processes only the subset matching its type. The copy-on-write message-passing semantics of the actor model implementation makes this an efficient operation (INDEXERs only need read access to the events) while providing a high degree of parallelism.

Export. While the import component handles event ingestion, the export component deals with retrieving events through queries. There exists one EXPORTER per query, which sends its results to SINKs for rendering. VAST currently includes ASCII, JSON, PCAP [35], Bro [43], and Kafka [29] SINKs.

Figure 2: Continuous query architecture.

For historical queries over existing data, INDEX analyzes the abstract syntax tree (AST) of the query expression to determine the relevant PARTITIONs and constructs a schedule to process them sequentially. This process begins with partition pruning: the selection of relevant partitions for a query. To this end, VAST uses a "meta index," consisting of event metadata and event type information. The meta index has different requirements than the event indexes within a partition: it must tolerate immense throughput and update rates. We currently associate with each partition the time period its events span and record the entire partition schema.
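Pruning on per-partition time spans can be sketched as an interval-overlap test. Integer timestamps and the Partition and prune names are illustrative assumptions, not VAST's API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Meta-index entry: each partition records the time span of its events.
struct Partition {
  std::uint64_t id;
  std::uint64_t begin;  // earliest event timestamp in the partition
  std::uint64_t end;    // latest event timestamp in the partition
};

// A query over [q_begin, q_end] only visits partitions whose span
// overlaps the query interval; all others are pruned up front.
std::vector<std::uint64_t> prune(const std::vector<Partition>& partitions,
                                 std::uint64_t q_begin,
                                 std::uint64_t q_end) {
  std::vector<std::uint64_t> relevant;
  for (const auto& p : partitions)
    if (p.begin <= q_end && q_begin <= p.end)  // closed-interval overlap
      relevant.push_back(p.id);
  return relevant;
}
```

Keeping only coarse per-partition summaries is what lets the meta index sustain the high update rates mentioned above: ingesting a batch merely widens one partition's time span.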

After pruning, INDEX relays the query to the remaining PARTITIONs, which then deconstruct the AST into its predicates to match them with INDEXERs. If necessary, PARTITION loads the INDEXER from the filesystem into memory. Upon performing the predicate lookup, INDEXERs send their hits back to their PARTITION, where they trigger a re-evaluation of the query expression. If the evaluation yields a change, PARTITION relays the delta hits from the last evaluation up to INDEX, which in turn forwards them to the subscribed EXPORTERs. As soon as the first hits arrive at an EXPORTER, it can begin extracting events by asking ARCHIVE. When EXPORTER receives an answer in the form of a batch of events, it concurrently prefetches another batch while proceeding with the "candidate check" to filter out false positives, which may be necessary for some index types (e.g., when using binning for floating-point values, see §4.2). Finally, EXPORTER sends the matching results back to SINK. The process terminates after EXPORTER has no more unprocessed hits to extract.

VAST also supports continuous queries to subscribe to new results as they arrive. As we illustrate in Figure 2, EXPORTER can subscribe to a copy of the full incoming event feed at IMPORTER to filter out those events matching the query expression. Since IMPORTER and EXPORTER live in the same process, this operation does not copy any data. In fact, a continuous query is effectively a candidate check, with the only difference that events now come from IMPORTER instead of ARCHIVE. The main challenge lies in efficiently performing this check. To this end, EXPORTER derives from the AST of the query expression a visitor performing the candidate check. EXPORTER constructs one visitor per unique event type on the fly, and dispatches to the relevant visitor efficiently through a hash table. For example, an expression may include the predicate :addr in 10.1.1.0/24. If the current event has no data of type addr, the visitor discards the predicate.
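The per-type dispatch can be sketched as a hash table from event type to a compiled checker. The toy Event with a single address field and all names here are assumptions for illustration, not VAST's event representation.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Toy event: a type name plus one IPv4 address field (host byte order).
struct Event {
  std::string type;
  std::uint32_t addr;
};

// One checker per event type, standing in for the compiled visitor.
using Checker = std::function<bool(const Event&)>;

struct ContinuousQuery {
  std::unordered_map<std::string, Checker> checkers;  // type -> visitor
  bool match(const Event& e) const {
    auto it = checkers.find(e.type);
    // Events whose type has no checker are discarded without ever
    // evaluating the predicate.
    return it != checkers.end() && it->second(e);
  }
};
```

A hash-table lookup per event keeps the candidate check O(1) in the number of known event types, which matters when the full import feed flows through it.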

3.3 Distribution

When the amount of data exceeds the capacity of a single machine, VAST can distribute the load over multiple machines, as we show in Figure 1(b) for 4 peering NODEs.

Cluster Deployment. In case a SOURCE produces events at a rate that overloads a single system, it can load-balance its events over all NODEs. Alternatively, users can pin SOURCEs to a specific set of NODEs (e.g., if certain data should land on a more powerful system). The dual occurs during querying: the user decides on which NODEs to spawn an EXPORTER, each of which will relay its events to the same SINK. In the common case of round-robin load-balancing of input data, a user query results in spawning one EXPORTER on all NODEs.

Fault Tolerance. Coping with NODE failure concerns two aspects: data loss and query execution. To avoid permanent data loss, we assume a reliable, distributed filesystem, e.g., HDFS. Data loss can still occur during ingestion, when ARCHIVE and INDEX have not yet written their data to the filesystem. VAST minimizes this risk by writing data out as quickly as possible. When a NODE fails during query execution, another takes over responsibility for its persistent ARCHIVE and INDEX data, and re-executes the query on its behalf. EXPORTER can periodically checkpoint its state (consisting of index hits and event identifiers of results that have passed the candidate check) to reduce the amount of duplicate results.

4 Implementation

We now present some implementation highlights of the previously discussed architecture, with a focus on actors (§4.1), bitmap indexes (§4.2), and queries (§4.3).

4.1 Actors

We implemented the components discussed in §3.2 with the C++ Actor Framework (CAF), a native implementation of the actor model [10].

Figure 3: Flow control implemented as back-pressure: overloaded nodes J, I, and E propagate their load status upstream such that data sources can throttle their sending rate.

Performance. CAF offers a high-performance runtime with a type-safe messaging layer that exhibits minimal memory overhead. The runtime can distribute actors dynamically, within the same process, across the network, or to GPUs. Within the same process, CAF uses copy-on-write semantics to enable efficient messaging. CAF's networking layer handles actor communication and routing transparently to the user: the runtime decides whether sending a message translates into a local enqueue operation into the recipient's mailbox, or whether a middleman serializes the message and ships it across the network.

CAF's copy-on-write messaging proves particularly valuable during the indexing process, where PARTITION sends the same set of events to each INDEXER without incurring a copy of the data. Although each INDEXER sees the full data feed, this method still runs faster than chopping up the input sequence and incurring extra memory allocations. This computation style resembles GPU programming, where we make data available "globally" and each execution context picks its relevant subset.

Flow Control. CAF operates entirely asynchronously and does not block: immediately after sending a message, an actor can dequeue the next message from its mailbox. This makes it easy for data producers to overwhelm downstream nodes not equipped with enough processing capacity. A naive reaction entails provisioning more buffer capacity at the edge so that the system can receive more messages. But without including the true bottlenecks in the decision, buffer bloat [31] only worsens the situation by introducing higher latency and jitter. Flow control attempts to prevent this scenario: back-pressure signals overload back to the sender, load shedding reduces the accumulating tasks at the bottleneck, and timeouts at various stages in the data flow graph bound the maximum response time.

CAF currently does not support flow control. During data import, data producers (SOURCEs) can easily overload downstream components (ARCHIVE and INDEX). To throttle the sending rate of SOURCEs, we implemented a simple back-pressure mechanism: when an actor becomes overloaded, it sends an overload message to all its registered upstream components, which either propagate it further or, if being the data producer themselves, throttle their sending rate (see Figure 3). When an overloaded actor becomes underloaded again, it sends an underload message to signal upstream senders that it can process data again. This basic mechanism works well to prevent system crashes due to overload, but does not help absorb peak input rates. To prevent data loss of non-critical, latency-insensitive events, queueing brokers that spill to the filesystem (e.g., Kafka [29]) at the edges help smooth the data arrival rate and facilitate resource provisioning for the average case. We are currently working with the CAF developers to integrate various forms of flow control deeper into the runtime.
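The overload/underload protocol can be sketched as follows. This is plain C++ rather than CAF code, and the watermark values and names are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Upstream data producer: stops sending while throttled.
struct Source {
  bool throttled = false;
  void on_overload()  { throttled = true;  }   // back off
  void on_underload() { throttled = false; }   // resume full rate
};

// Downstream consumer: signals overload past a high watermark and
// underload once its mailbox drains below a low watermark.
struct Downstream {
  std::deque<int> mailbox;
  Source* upstream;
  bool overloaded = false;
  static constexpr std::size_t kHigh = 8;  // illustrative watermarks
  static constexpr std::size_t kLow = 2;
  void enqueue(int msg) {
    mailbox.push_back(msg);
    if (!overloaded && mailbox.size() >= kHigh) {
      overloaded = true;
      upstream->on_overload();   // propagate load status upstream
    }
  }
  void process_one() {
    if (!mailbox.empty())
      mailbox.pop_front();
    if (overloaded && mailbox.size() <= kLow) {
      overloaded = false;
      upstream->on_underload();  // signal that processing caught up
    }
  }
};
```

The hysteresis between the two watermarks avoids rapid overload/underload oscillation when the mailbox hovers near a single threshold.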

4.2 Composable Indexing

VAST exclusively relies on bitmap indexes to accelerate queries. We begin in §4.2.1 by briefly summarizing existing work, on which we then rely in §4.2.2 to define composable higher-level index types.

4.2.1 A Unified Indexing Framework

Traditional tree and hash indexes [6, 30] provide a quick entry point into base data, but they do not compose efficiently for higher-dimensional search queries. Inverted and bitmap indexes avoid this problem by adding an extra level of indirection: instead of looking up base data directly, they operate on the IDs of the base data, allowing for combining results from multiple index lookups via set operations. Consider the event values x(α), where x represents a numeric value and α its ID, e.g., 1(0), 3(1), 1(2), 2(3), and 1(4). An inverted index represents the events as a mapping from values to position lists: 1 → {0,2,4}, 2 → {3}, and 3 → {1}. A bitmap index is an isomorphic data structure where the position lists have the form of bit vectors:³ 1 → 〈10101〉, 2 → 〈00010〉, and 3 → 〈01000〉. When the distinction does not matter, we use the term identifier set to mean either position list or bit vector.
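The running example above can be reproduced in a few lines; this is an illustrative sketch of inverted and bitmap indexes, not VAST's C++ implementation.

```python
# Sketch: an inverted index maps each value to a position list; the
# isomorphic bitmap index uses bit vectors (lists of 0/1) instead.

def build_inverted(values):
    idx = {}
    for pos, v in enumerate(values):
        idx.setdefault(v, []).append(pos)
    return idx

def to_bitmap(inverted, n):
    # Represent each position list as an n-bit vector.
    return {v: [1 if i in positions else 0 for i in range(n)]
            for v, positions in inverted.items()}

values = [1, 3, 1, 2, 1]          # x(alpha): 1(0), 3(1), 1(2), 2(3), 1(4)
inv = build_inverted(values)
bm = to_bitmap(inv, len(values))

assert inv == {1: [0, 2, 4], 2: [3], 3: [1]}
assert bm[1] == [1, 0, 1, 0, 1]   # <10101>
assert bm[2] == [0, 0, 0, 1, 0]   # <00010>
assert bm[3] == [0, 1, 0, 0, 0]   # <01000>

# Set operations on identifier sets combine lookups, e.g., x == 1 OR x == 2:
union = [a | b for a, b in zip(bm[1], bm[2])]
assert union == [1, 0, 1, 1, 1]
```

The final union illustrates why the indirection pays off: multi-predicate queries reduce to bitwise operations on identifier sets.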

There exist hybrid schemes which combine both index types in a single data structure [7], but VAST currently implements its algorithms only in terms of bitmap indexes, where operations from boolean algebra (union, intersection, complement) naturally map to native CPU instructions. In general, an index I provides two basic primitives: adding new values and looking up existing ones under a given predicate. To add a new value x(α), the index adds α to the identifier set for x. A lookup under a predicate I ◦ x retrieves the identifier set S = {α | x(α) ∈ I}. The size of an index, |I| = N, represents the number of values entered, and the cardinality #I = C the number of distinct values.

³The literature often uses the term "bitmap" to refer to a bit vector, i.e., a sequence of bits. We use "bit vector" instead to avoid confusion between "bitmap" and "bitmap index."

Figure 4: Equality, range, and interval encoding exemplified using bitmap indexes.

Our example has N = 5 and C = 3. In the following, we sketch key concepts that affect the inherent space-time trade-off in the implementation of indexes: binning, encoding, compression, and composition.

Binning reduces the cardinality of an index by grouping values into bins or rounding floating-point values to a certain decimal precision. For example, we could create bins [1,2] → 〈10111〉 and [3,4] → 〈01000〉, which reduces the cardinality of the index to C = 2. The surjective nature of binning introduces false positives and therefore requires a candidate check against the base data to verify whether a certain hit qualifies as an actual result. A candidate check can easily dominate the entire query execution time due to materializing additional base data (high I/O costs) and extra post-processing. Therefore, choosing an efficient binning strategy requires careful tuning and domain knowledge, or advanced adaptive algorithms [44, 47].
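A minimal sketch of binning with the ensuing candidate check, using the example bins above (illustrative only; the helper names are hypothetical):

```python
# Sketch of binning: bins [1,2] and [3,4] reduce cardinality, but a
# lookup for x == 2 yields candidates that must be verified against the
# base data to remove false positives.

base = [1, 3, 1, 2, 1]                       # event values, IDs 0..4
bins = {(1, 2): [0, 2, 3, 4], (3, 4): [1]}   # binned identifier sets

def lookup_eq(x):
    for (lo, hi), ids in bins.items():
        if lo <= x <= hi:
            # candidate check: only IDs whose base value matches survive
            return [i for i in ids if base[i] == x]
    return []

assert lookup_eq(2) == [3]        # candidates {0,2,3,4}; only ID 3 survives
assert lookup_eq(3) == [1]
```

The list comprehension inside lookup_eq is exactly the candidate check the text describes; in a real system it triggers I/O to materialize the base data.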

Encoding determines how an index maps values to identifier sets. We show in Figure 4 the three major existing encoding schemes for a fixed cardinality C = 4, which can represent the values 0–3. Equality encoding associates each value with exactly one identifier set. This scheme reflects our running example and consists of exactly C identifier sets. Range encoding associates each value x(α) with a range of C − 1 identifier sets, such that an ID α lands in every set i where x ≤ i. We can omit the last identifier set because x ≤ C − 1 holds for all possible values. Interval encoding splits the index into ⌈C/2⌉ overlapping slices, each of which covers half of the values. In our example, we have the two intervals [0,1] and [1,2].

Compression algorithms for bit vectors typically use variations of run-length encoding, which support bitwise operations without prior decompression. There exist numerous algorithms: BBC [3], WAH [56], COMPAX [20], CONCISE [12], WBC/EWAH [57, 34], PLWAH [15], DFWAH [45], PWAH [53], VLC [13], and VAL [24]. We chose EWAH for VAST because, when we began our project, software patents covered (and still do) the other attractive candidates (WAH, PLWAH, COMPAX), which would have prevented us from releasing our project as open-source software. EWAH also trades space for time: while exhibiting a slightly larger footprint, it executes faster under certain conditions [23] because it can skip entire blocks of words.

Multi-component indexes combine several individual index instances (which might use different approaches) such that each covers a disjoint partition of the value domain. Doing so provides an exponential reduction in space by decreasing the size of the value domain by a multiplicative factor for each component. We can decompose a value x into k components 〈xk, …, x1〉 by representing it with respect to a fixed base (or radix) β = 〈βk, …, β1〉: x = ∑i=1..k xi · ∏j=1..i−1 βj, where xi = ⌊x / ∏j=1..i−1 βj⌋ mod βi, for all i ∈ {1, …, k}. This decomposition scheme applies directly to the index structure as well: a multi-component index Kβ = 〈Ik, …, I1〉 consists of k indexes, where each Ii covers a total of βi values. A base is uniform if βi = βj for all i ≠ j. A uniform base with βi = 2 for all 1 ≤ i ≤ k yields the bit-sliced index [55], because each xi can only take on the values 0 and 1. We denote this special case by Θk = Kβ where |β| = k and βi = βj = 2 for all i ≠ j. Further, we define Φw = Kβ with ∏i=1..k βi ≥ 2^w as an index which supports up to 2^w values.

For example, consider a two-component index Kβ = 〈I2, I1〉 with β = 〈10,10〉, which supports 100 distinct values. Appending a value x(α) = 42 involves first decomposing it into 〈4,2〉, and then appending 4(α) to I2 and 2(α) to I1. Looking up the value x = 23 begins with decomposing x into 〈2,3〉, and then proceeds with computing (I2 = 2) ∧ (I1 = 3). The final step resolves each component lookup according to its encoding scheme. Operators other than {=, ≠} require more elaborate lookup algorithms [8, 9], which we lay out in greater detail separately [52].
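The decomposition in the two-component example above can be sketched as:

```python
# Sketch of value decomposition with respect to a fixed base, as in the
# two-component example (base <10,10> supports 100 distinct values).

def decompose(x, base):
    """Split x into components <x_k, ..., x_1> for base <b_k, ..., b_1>."""
    comps = []
    for b in reversed(base):     # compute x_1 first (least significant)
        comps.append(x % b)
        x //= b
    return list(reversed(comps))

assert decompose(42, [10, 10]) == [4, 2]
assert decompose(23, [10, 10]) == [2, 3]
# A non-uniform base also works, e.g., <4, 8>: 25 = 3 * 8 + 1
assert decompose(25, [4, 8]) == [3, 1]
```

Each component then feeds its own (smaller) index, which is where the exponential space reduction comes from.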

4.2.2 Higher-Level Indexing

For each type in VAST's data model, there exist different requirements derived from the desired query operations. For example, numeric values commonly involve inequality comparisons, and IP address lookups top-k prefix search. Different lookup operations require different index layouts.

Per §3.1, a value consists of a type and corresponding data. A value can exhibit no data, in which case it only carries type information. We define the value index V = 〈N, D〉 as a composite structure with a null index N to represent whether data is null (implemented as a single identifier set), and a data index D to represent a type-specific index, whose instantiations we describe next.

Integral Indexes. The boolean index B = 〈S〉 for type bool consists of a single identifier set, where α ∈ S implies x(α) = true.

For types count and int, the challenge lies both in supporting lookups under {<, ≤, =, ≠, ≥, >} and in representing 2^64 distinct values. To address the high-cardinality challenge, we use a multi-component index Φ64. We found that a uniform base of 10 works well in practice. To support the desired arithmetic operations, we use range encoding, which supports both equality and inequality lookups efficiently.

Signed integers introduce a complication: we cannot map negative values to array indices during encoding, and only positive numbers work with value decomposition. For a w-bit signed integer, we therefore introduce a "bias" of 2^(w−1), which shifts the smallest value of −2^(w−1) to 0 in the unsigned representation. This allows us to use the same index type for signed and unsigned integers internally. Thus, we define the count index as C = Φ64 and the integer index as I = Φ64 with the aforementioned bias.
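A sketch of the bias trick for w = 64 (illustrative, not VAST's code):

```python
# Sketch of the bias for signed integers: shifting by 2^(w-1) maps the
# w-bit signed range [-2^(w-1), 2^(w-1) - 1] onto [0, 2^w - 1], so the
# same unsigned decomposition serves both count and int.

W = 64
BIAS = 1 << (W - 1)

def to_unsigned(x):      # applied before appending to the index
    return x + BIAS

def from_unsigned(u):    # applied when interpreting lookup bounds
    return u - BIAS

assert to_unsigned(-(1 << 63)) == 0            # smallest int maps to 0
assert to_unsigned(0) == BIAS
assert to_unsigned((1 << 63) - 1) == (1 << 64) - 1
assert from_unsigned(to_unsigned(-42)) == -42
# The shift preserves order, so inequality lookups carry over unchanged:
assert to_unsigned(-5) < to_unsigned(3)
```

Order preservation is the crucial property: a range-encoded lookup on the biased values answers the same inequality on the signed originals.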

Floating-Point Index. Type real corresponds to an IEEE 754 double-precision floating point value [28], which consists of one sign bit, 11 bits for the exponent, and 52 bits for the significand. Consequently, we construct the real index F = 〈S, E, M〉 as a boolean index S = B for the sign, a bit-sliced index E = Θ11 for the exponent, and a bit-sliced index M = Θ52 for the significand. Varying the number of bits in E and M allows for trading space for precision without the need to round to a specific decimal point.
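Splitting a double into the three indexed fields can be sketched with Python's struct module (illustrative only):

```python
# Sketch of splitting an IEEE 754 double into the fields indexed by
# F = <S, E, M>: 1 sign bit, 11 exponent bits, 52 significand bits.
import struct

def split_double(x):
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64 bits
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF          # 11-bit biased exponent
    significand = bits & ((1 << 52) - 1)     # 52-bit fraction
    return sign, exponent, significand

s, e, m = split_double(-2.0)
assert s == 1                 # negative
assert e == 1024              # biased exponent: 1023 + 1
assert m == 0                 # 2.0 has an all-zero fraction

assert split_double(1.0) == (0, 1023, 0)
```

Each of the three fields then feeds the corresponding sub-index (B, Θ11, Θ52).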

Temporal Indexes. VAST represents duration data as a 64-bit signed integer in nanosecond resolution, which can represent ±292.3 years. Since duration and int are representationally equal, the duration index D = I directly maps to an integer index.

The type time describes a fixed point in time, which VAST internally expresses as a duration relative to the UNIX epoch. Thus, the time index T = D is representationally equal to the duration index.

String Index. Existing string indexes rely on a dictionary to map each unique string value to a unique numeric value [50]. However, constructing a space-efficient dictionary poses a challenge in itself [39, 27]. Moreover, this design only supports equality lookups naturally: for substring search, one must first search the dictionary key space to obtain the identifier sets, then perform the lookup for each identifier set, and finally combine the results. Instead of using a stateful dictionary, one can rely on hashing to compute a unique string identifier [51]. The possibility of collisions then requires a candidate check. While space-optimal due to the absence of a dictionary, and time-efficient due to fast computation, this approach does not support substring search.

We propose a new approach to string indexing that supports both equality and substring search, yet operates in a stateless fashion without a dictionary. Our string index S = 〈φ, κ1, …, κM〉 consists of an index φ = Kβ for the string length, plus M indexes κi = Φ8 per character, where M is the length of the largest string added to the index. When representing the character-level indexes κi as bit-sliced indexes Θ8, we obtain efficient case-insensitive (substring) search for ASCII-encoded strings. This works because only the 6th bit determines casing in ASCII, and by simply omitting the corresponding identifier set (the slice for that bit) during lookup, case-insensitive search executes faster than case-sensitive search. We plan on supporting search via a subset of regular expressions by compiling a pattern into an automaton that performs a sequence of per-character lookups. Likewise, this structure lends itself to similarity search, e.g., via edit distance.

This design works well for bounded, non-uniform string data, such as URLs or HTTP user agents. For other workloads, we fall back to hashing in combination with tokenization. This preprocessing step splits a string according to a pattern (e.g., whitespace for text, '/' for URIs, etc.), creating a set of multiple smaller strings. Adaptively switching between the index types presents an interesting opportunity for future work, e.g., by inspecting the nature of user queries as well as inferring the data distribution.

Network Indexes. IP addresses constitute a central data type for describing the endpoints of communicating entities. The most common operation on IP addresses is top-k prefix search, e.g., I ∈ 192.168.0.0/24 or I ∉ fd00::/8. We can consider equality lookup a special case with k = 32 for IPv4 and k = 128 for IPv6. There exists a standardized scheme to embed a 32-bit IPv4 address inside a 128-bit IPv6 address [5]: set the first 96 bits to 0 and copy the IPv4 address into the last 32 bits. This yields the address index A = Θ128, where each bit in the address corresponds to one identifier set in the index.

The subnet type consists of a network address and a prefix. Typical queries involve point lookups of IP addresses (e.g., 192.168.0.42 ∈ I) and subset relationships to test whether one subnet contains another (e.g., 192.168.0.0/28 ⊆ I). The subnet index U = 〈A, P〉 consists of an address index A and a single equality-encoded index P for the prefix.

The port type consists of a 16-bit number and a transport-protocol type. The port index P = 〈Φ16, T〉 consists of an index for the 16-bit port number and a single equality-encoded index T for the different port types.

Container Indexes. Container types include vector, set, and table. A container holds a variable number of elements of a homogeneous type, unlike records, which allow for fixed-length heterogeneous data with named fields. For example, a set describes DNS lookup results, where a single host name has multiple associated A records. We support cardinality and subset queries on containers.

The design of string indexes generalizes to containers. Let M denote the maximum number of elements in a container. We define the vector index XV and the set index XS both as 〈φ, V1, …, VM〉: an index φ = Kβ for the container size and M value indexes Vi. The table index XT = 〈φ, X1, …, XM, Y1, …, YM〉 consists of an index φ for the size, a sequence of value indexes Xi = V for the table keys, and a sequence of value indexes Yi = V for the table values. Tables support key lookup, value lookup, and checking for specific mappings.

Figure 5: The QUERY state machine.

4.3 Queries

The actor model provides an apt vehicle for implementing a fully asynchronous architecture, which enables VAST to deliver interactive response times. We illustrate in the following how this applies to query processing.

Finite state machines. We found finite state machines (FSMs) to be an indispensable mechanism for ensuring correct message handling during query execution. Recall that NODE spawns an EXPORTER for each query to bridge ARCHIVE and INDEX. We implemented EXPORTER as a finite state machine (see Figure 5), which begins in the idle state. Upon receiving new hits, EXPORTER asks ARCHIVE for the corresponding batches and transitions into waiting. As soon as the first batch arrives, it transitions into extracting, from where a user can selectively control it to fetch specific results. By letting the user drive the extraction, VAST does not consume resources unless needed.

Predicate-level caching. To speed up related queries, INDEX maintains a predicate cache. If hits for the expression A || B already exist, then a new expression A && D only requires looking up D. This makes iterative query styles viable, where the analyst keeps refining a filter until having pinpointed the desired information.

Evaluating expressions. When INDEX receives a query consisting of multiple predicates, VAST evaluates them concurrently. Recall from §3.2 that during a historical query INDEXERs send their hits back to PARTITION, where they trigger an evaluation of the expression. If the evaluation yields new hits (i.e., a bit vector with new 1-bits), PARTITION forwards them to INDEX, which in turn relays them to EXPORTER.

Figure 6: Expression normalization: negation normal form (NNF) and negation absorbing. The expression A ∧ ¬(B ∨ C) first becomes A ∧ ¬B ∧ ¬C. Thereafter, we can absorb negations further and reduce the intermediate expression to A ∧ B′ ∧ C′, e.g., if B = I < x, then B′ = I ≥ x.

To minimize latency and relay hits as soon as possible, we normalize queries to negation normal form (NNF), which eliminates double negations and pushes negations inwards to the predicate level. We also absorb remaining negations into the predicates, which is possible because each operator has a complement (e.g., < and ≥). Figure 6 illustrates the normalization procedure. The absence of negations, aside from saving an extra complement operation, has a useful property: a 1-bit will never turn to 0 during evaluation.

To understand this benefit, consider a predicate A which decomposes into n sub-predicates. This may occur for predicates of the form :addr in 172.16.0.0/16, where the type extractor acts as a placeholder resolving to n concrete schema extractors. When PARTITION sends A to the n INDEXERs, they report their hits asynchronously as soon as they become available. PARTITION continuously re-evaluates the AST for newly arriving Hi, until having computed A = H1 ∨ ··· ∨ Hn. As soon as a re-evaluation yields one or more new 1-bits, PARTITION relays this delta upstream to INDEX. If we kept the negation ¬A, we would have to wait for all n hits to arrive in order to ensure we are not producing false positives. Without negations, we can relay this change immediately, since a 1-bit cannot turn 0 again in a disjunction.
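The monotonicity argument can be sketched as follows (illustrative; bit vectors as Python lists):

```python
# Sketch of the monotonicity property: in a pure disjunction
# H_1 OR ... OR H_n, incorporating a newly arrived H_i can only turn
# 0-bits into 1-bits, so each delta is safe to relay upstream
# immediately, without waiting for the remaining sub-predicate hits.

def incorporate(current, new_hits):
    """OR newly arrived hits into the running result; return the delta."""
    updated = [a | b for a, b in zip(current, new_hits)]
    delta = [u & (1 - c) for u, c in zip(updated, current)]  # new 1-bits
    return updated, delta

result = [0, 0, 0, 0, 0]
result, delta = incorporate(result, [0, 1, 0, 0, 1])  # H_1 arrives
assert delta == [0, 1, 0, 0, 1]                       # relay immediately
result, delta = incorporate(result, [0, 1, 1, 0, 0])  # H_2 arrives
assert delta == [0, 0, 1, 0, 0]                       # only the new 1-bit
assert result == [0, 1, 1, 0, 1]                      # 1-bits never revert
```

With a negation in the expression, a later-arriving Hi could invalidate an earlier 1-bit, which is exactly what the NNF transformation rules out.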

5 Evaluation

We evaluate our implementation in terms of throughput (§5.1), latency (§5.2), and storage requirements (§5.4). We performed measurements with two types of input: synthetic workloads that we can precisely control, and real-world network traffic. For the former, we implemented a benchmark SOURCE that generates input for VAST according to a configuration file. The SOURCE generates all synthetic data in memory to avoid adding I/O load. For real-world input, we use logs from Bro and raw PCAP traces. For the latter, VAST functions as a flow-oriented bulk packet recorder.

VAST comprises 36,800 lines of C++14 code (excluding whitespace and comments), plus 6,700 lines of C++ unit tests verifying the system's building blocks and basic interactions. We expand on these checks with an end-to-end test of whether the entire pipeline—from import, through querying, to export—yields correct results. For validation, we processed our ground truth (Bro logs and PCAP traces) separately and cross-checked it against the query results VAST delivers. We found full agreement.

We conducted our single-machine evaluation experiments on a 64-bit FreeBSD system with two 8-core Intel Xeon E5-2650 CPUs, 128 GB of RAM, and four 3 TB SAS 7.2 K disks (RAID 10 with 2 GB of cache). Our dataset encompasses a full-packet trace from the upstream link of the International Computer Science Institute (ICSI), containing 10 M packets over a 24-hour window on Feb. 24, 2015. We further use 3.4 M Bro connection logs derived from this trace. For our cluster experiments, we use 1.24 B Bro connection logs (152 GB), split into N slices for N worker nodes, with N ranging from 1 to 24. Each worker node runs FreeBSD 10 on a system with two 8-core Intel Xeon E5430 CPUs, 12 GB of RAM, and 2 x 500 GB SATA disks. An additional machine with two Xeon X5570 CPUs and 24 GB of RAM performs the slicing. The machines share a 1 GE network link.

5.1 Throughput

One key performance metric is the rate of events that VAST can ingest. Recall the data flow: SOURCEs parse and send input to a system entry point, an IMPORTER, which dispatches the events to ARCHIVE and INDEX. Because we can spawn multiple SOURCEs for arbitrary subsets of the data, we did not optimize SOURCEs at this stage in the development, nor ARCHIVE, since it merely sequentially compresses events into fixed-size chunks and writes them out to the filesystem. Instead, we concerned ourselves with achieving high performance at the bottleneck: INDEX, which performs the CPU-intensive task of building bitmap indexes.

Macro Benchmark. For the ingestion benchmark, we configured a batch size of 65,536 events at SOURCE, after observing that greater values entail slightly poorer performance and higher variance (we tested up to 524,288 events per batch).

Figure 7 shows the event rates for three data formats (Bro, PCAP, and a benchmark test) at SOURCE, ARCHIVE, and INDEX as a function of the number of cores provided to CAF's scheduler in the single-system setup. The y-axis shows the throughput in events per second; note the log scale. As mentioned above, by design SOURCE and ARCHIVE exhibit a fairly constant throughput rate. The highly concurrent architecture complicates measurements of aggregate throughput at INDEX, because there exist multiple running INDEXERs, but not all start and finish at the same time. Therefore we compute throughput at INDEX as the number of events processed between start and end of the measurement. Consequently, the throughput can never exceed the input rate of SOURCE. We observe that the indexing rate approaches the input rate for all sources at around 10 cores. Given CAF's work-stealing scheduler, more cores yield no further improvement; in fact, performance decreases slightly. We presume this occurs due to thrashing, since CAF does not pin the worker threads to a specific core, which increases context switches and cache evictions.

VAST parses Bro events at a rate of roughly 100 K events per second, with each event consisting of 20 different values, yielding an aggregate throughput of 2 M values per second. For PCAP, events consist of only the 4-tuple plus the full packet payload; the latter does not need indexing. VAST can read around 260 K packets per second with libpcap. Since ARCHIVE does not skip the payload, it cannot keep up with the input rate. This suggests that we need to parallelize this component in the future, which can involve spawning one COMPRESSOR per event batch to parallelize the process. With our test SOURCE, INDEX converges to the input rate at around 14 cores, and we observe input rates close to 1 M events per second. We conclude that VAST meets the performance and scalability goals for data import on a single machine: the system scales up to the point of the input rate after 10–14 cores.

Figure 7: Throughput in events per second as a function of the number of cores for three data types: Bro connection logs, PCAP traces, and a test source generating synthetic events.

Table 2: Test queries for the single-machine throughput and latency evaluation. The top six queries (A–F) run over Bro connection logs; the bottom three (G–I) run over a PCAP trace.

A (374 results): resp_h == 2001:7fe::53
   Connections to a specific IPv6 address
B (942 results): (duration > 1000s || resp_bytes > 40000) && service == "dns"
   Anomalous DNS / zone transfers
C (13 results): orig_h in 192.150.186.0/23 && orig_bytes > 10000 && service == "http"
   Outgoing HTTP requests > 10 KB (exfiltration)
D (3 results): duration > 1h && service == "ssh"
   Long-lived SSH sessions
E (969,092 results): conn_state != "SF"
   TCP sessions lacking normal termination
F (4,812 results): :addr in 192.150.186.0/23 && :port == 3389/?
   All RDP connections involving ICSI
G (1,077 results): :addr in 192.150.186.0/23 && :port == 3389/?
   Same as above, but applied to the PCAP trace
H (34 results): &time > 2015-02-04+10:00:00 && &time < 2015-02-04+11:00:00 && ((src == 77.255.19.163 && dst == 192.150.187.43 && sport == 49613/? && dport == 443/?) || (src == 192.150.187.43 && dst == 77.255.19.163 && sport == 443/? && dport == 49613/?))
   Extract all packets from a single connection specified by its 4-tuple, restricted to a one-hour time window
I (187,015 results): &time > 2015-02-04+10:00:00 && &time < 2015-02-04+11:00:00 && :addr == 192.150.187.43
   All traffic from a single machine within a one-hour window

Micro Benchmark. To better understand where VAST spends its time, we instrumented CAF's scheduler to obtain fine-grained, per-actor resource statistics. This involved bracketing job execution with resource-tracking calls (getrusage); i.e., we only measure actor execution and leave CAF runtime overhead mostly out of the picture.

In Figure 8, we plot user versus system CPU time for all key actors. Each point represents a single actor instance, with its size scaled to the utilization: user plus system CPU time divided by wallclock time. Note the log scale on both axes. In the top-right corner, we see ARCHIVE, which spends its time compressing events (user) and writing chunks to disk (system). Likewise, INDEX appears nearby; it primarily manages PARTITIONs and builds small "meta indexes" based on time to quickly identify which PARTITION to consider during a query. The bulk of the processing time spreads over numerous INDEXERs, which we can see accumulating on the right-hand side, because building bitmap indexes is a CPU-bound task.

Figure 8: User versus system CPU time for key actors. Each point represents a single actor instance, with point size scaled to utilization: user plus system CPU time divided by wallclock time.

5.2 Latency

Query response time plays a crucial role in assessing the system's viability. VAST spawns one EXPORTER per query, which acts as a middleman receiving hits from INDEX and retrieving the corresponding compressed chunks of events from ARCHIVE. This architecture exhibits two interleaving latency components: the time (i) from the first to the last set of hits received from INDEX, and (ii) from the first to the last result sent to a SINK after a successful candidate check.

To evaluate these latency components, we use the set of test queries given in Table 2, which a security operator of a large enterprise confirmed indeed reflect common searches during an investigation.

Query Pipeline. Figure 9 illustrates the latency elements seen over the test queries. For all queries, we ran VAST with 12 cores and a batch size of 65,536. The first (red) bar corresponds to the time it took until EXPORTER received the first set of hits from INDEX. The green bar shows the time until EXPORTER has sent the first result to its SINKs. We refer to this as "taste" time, since from the user perspective it represents the first system response. The blue bar shows the time until EXPORTER has sent the full set of results to its SINK. The black transparent box corresponds to the time when INDEX finished the computation of hits. Finally, the crosses inside the bar correspond to points in time when hits arrive, and the circles to the times when EXPORTER finishes extracting results from a batch of events.

Figure 9: Query pipeline reflecting various stages of single-node execution. The first stage (Index) may appear absent because it can take too little time to manifest in the plot.

Figure 10: Index latency (full computation of hits) as a function of cores.

We see that extracting results from ARCHIVE (blue bar) accounts for the largest share of execution time. Currently, this time is a linear function of the query selectivity, because EXPORTER does not perform extraction in parallel. We plan to improve this in the future by letting EXPORTER spawn a dedicated helper actor per batch arriving from ARCHIVE, allowing for concurrent sweeps over the candidates. Alternatively, we could offload more computation into ARCHIVE. Selective decompression algorithms [21] present an orthogonal avenue for further improvement.

Index. VAST processes index lookups in a continuous fashion, with the first hits trickling in after a few hundred milliseconds. Figure 10 shows that nearly all index lookups fully complete within 3 seconds once we use more than 4 cores. For query G, we observe scaling gains up to 10 cores. This particular query processes large intermediate bit vectors during the evaluation, which require more time to combine.

Overall, we find that VAST meets our single-machine performance expectations. In particular, we prioritized abstraction over performance in our implementation, and have not micro-optimized code bottlenecks (such as via inspecting profiler call graphs). Given that each layer of abstraction—from low-level bit-wise operations to high-level concurrency constructs—comes at a cost in performance, we believe that future tuning efforts hold promise for even further gains.

Figure 11: Per-node CPU utilization during ingestion.

5.3 Scaling

In addition to single-machine benchmarks, we analyze how VAST scales over multiple machines in a cluster setting, as this constitutes the only viable deployment model for large sites exhibiting copious amounts of data.

Ingestion. Our first measurement quantifies how CPU load during event import varies as a function of the number of cluster nodes. To this end, we ingest 1.24 B Bro connection logs by load-balancing them over the cluster NODEs in batches of 65 K. That is, as in Figure 1(b), a SOURCE on a separate machine parses the logs and generates batches at a median rate of 125 K events per second. Due to the fixed input rate, we assess scaling by looking at the CPU load of each worker.

Figure 11 shows the per-machine inverse CPU utilization 1/U for U = ∑i (ui + si) / ∑i ti, with user CPU time ui, system CPU time si, and wallclock time ti. The value U can exceed 1.0 because each node runs several threads, and CPU time measurements yield the sum over all threads. As one would expect for effective load balancing, we observe linear scaling gains for each added node.

Query. Our second measurement seeks to understand how query latency changes when varying the number of nodes. We show the index completion time of query D in Figure 12. For these measurements, we first primed the file system cache in each case to compensate for a shortcut that our current implementation takes: it maintains the index in numerous small files that cause high seek penalties for reads from disk, an effect we could avoid by optimizing the disk layout through an intermediary step so that the index can read its data sequentially.

Figure 12: Index completion latency as a function of nodes.

We observe linear scaling from 12 nodes upward, but experience problems in the lower half. Other queries show linear scaling for small numbers of nodes. We are in the process of investigating the discrepancy.

5.4 Storage

Unlike systems that process data in situ, VAST relies on secondary indexes that require additional storage space. In the case of the Bro connection logs, the index increases the total storage by 90%. However, VAST also compresses the raw data before storing it in the archive, in this case cutting it down to 47% of its original size. Taken together, VAST requires 1.37 times the volume of its raw input. For PCAP traces, VAST archives entire packets but skips all packet payload during index construction. Archive compression brings the trace down to 92% of its original size, whereas the index for the connection 4-tuple plus timestamps amounts to 4%. In total, VAST still occupies less space than the original data.

String and container indexes require the most storage, due to their composite and variable-length nature. The remaining indexes exhibit a constant-space design, and their concrete size is a direct function of the encoding and layout of the bit vectors.

6 Conclusion

When security analysts today attempt to reconstruct the sequence of events leading to a cyber incident, they struggle to bring together enormous volumes of heterogeneous data. We present VAST [54], a novel platform for forensic analysis that captures and retains a high-fidelity archive of a network's entire activity, leveraging domain-specific semantics to manage high data volumes while supporting rapid queries against historical data. VAST's novelty comes from synthesizing powerful indexing technology with a distributed, entirely asynchronous system architecture that can fully exploit today's highly concurrent architectures. Our evaluation with real-world log and packet data demonstrates the system's potential to support interactive investigation and exploration at a level beyond what current systems offer.


References

[1] AGARWAL, R., KHANDELWAL, A., AND STOICA, I. Succinct: Enabling Queries on Compressed Data. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) (2015).

[2] ALLMAN, M., KREIBICH, C., PAXSON, V., SOMMER, R., AND WEAVER, N. Principles for Developing Comprehensive Network Visibility. In Proceedings of the USENIX Workshop on Hot Topics in Security (HotSec) (2008).

[3] ANTOSHENKOV, G. Byte-aligned Bitmap Compression. In Proceedings of the Conference on Data Compression (DCC) (1995), p. 476.

[4] ARMSTRONG, J. Making Reliable Distributed Systems in the Presence of Software Errors. Ph.D. Thesis, Department of Microelectronics and Information Technology, KTH, Sweden, 2003.

[5] BAO, C., HUITEMA, C., BAGNULO, M., BOUCADAIR, M., AND LI, X. IPv6 Addressing of IPv4/IPv6 Translators. RFC 6052, Internet Engineering Task Force (IETF), 2010.

[6] BAYER, R., AND MCCREIGHT, E. M. Organization and Maintenance of Large Ordered Indexes. In Record of the ACM SIGFIDET Workshop on Data Description and Access (1970), pp. 107–141.

[7] CHAMBI, S., LEMIRE, D., KASER, O., AND GODIN, R. Better Bitmap Performance with Roaring Bitmaps. CoRR abs/1402.6407 (2014).

[8] CHAN, C.-Y., AND IOANNIDIS, Y. E. Bitmap Index Design and Evaluation. In Proceedings of the International Conference on Management of Data (SIGMOD) (1998), pp. 355–366.

[9] CHAN, C.-Y., AND IOANNIDIS, Y. E. An Efficient Bitmap Encoding Scheme for Selection Queries. In Proceedings of the International Conference on Management of Data (SIGMOD) (1999), pp. 215–226.

[10] CHAROUSSET, D., SCHMIDT, T. C., HIESGEN, R., AND WÄHLISCH, M. Native Actors – A Scalable Software Platform for Distributed, Heterogeneous Environments. In Proceedings of the International Workshop on Programming based on Actors, Agents, and Decentralized Control (AGERE!) (2013).

[11] COHEN, M. I., BILBY, D., AND CARONNI, G. Distributed Forensics and Incident Response in the Enterprise. Digital Investigation 8 (2011), S101–S110.

[12] COLANTONIO, A., AND DI PIETRO, R. CONCISE: Compressed 'n' Composable Integer Set. Information Processing Letters 110, 16 (2010), 644–650.

[13] CORRALES, F., CHIU, D., AND SAWIN, J. Variable Length Compression for Bitmap Indices. In Proceedings of the International Conference on Database and Expert Systems Applications (DEXA) (2011), pp. 381–395.

[14] DEAN, J., AND GHEMAWAT, S. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the Conference on Symposium on Operating Systems Design & Implementation (OSDI) (2004), vol. 6, pp. 10–10.

[15] DELIÈGE, F., AND PEDERSEN, T. B. Position List Word Aligned Hybrid: Optimizing Space and Performance for Compressed Bitmaps. In Proceedings of the International Conference on Extending Database Technology (EDBT) (2010), pp. 228–239.

[16] DOHERTY, W. J., AND THADANI, A. J. The Economic Value of Rapid Response Time. IBM (1982).

[17] ElasticSearch. https://www.elastic.co/products/elasticsearch.

[18] etcd. https://github.com/coreos/etcd.

[19] FUSCO, F., DIMITROPOULOS, X., VLACHOS, M., AND DERI, L. pcapIndex: An Index for Network Packet Traces with Legacy Compatibility. SIGCOMM Computer Communication Review 42, 1 (2012), 47–53.

[20] FUSCO, F., STOECKLIN, M. P., AND VLACHOS, M. NET-FLi: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic. Proceedings of the VLDB Endowment 3, 1-2 (2010), 1382–1393.

[21] FUSCO, F., VLACHOS, M., AND DIMITROPOULOS, X. RasterZip: Compressing Network Monitoring Data with Support for Partial Decompression. In Proceedings of the Internet Measurement Conference (IMC) (2012), pp. 51–64.

[22] GIURA, P., AND MEMON, N. NetStore: An Efficient Storage Infrastructure for Network Forensics and Monitoring. In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID) (2010), pp. 277–296.

[23] GUZUN, G., AND CANAHUATE, G. Performance Evaluation of Word-Aligned Compression Methods for Bitmap Indices. Knowledge and Information Systems (2015), 1–28.

[24] GUZUN, G., CANAHUATE, G., CHIU, D., AND SAWIN, J. A Tunable Compression Framework for Bitmap Indices. In Proceedings of the International Conference on Data Engineering (ICDE) (2014), pp. 484–495.

[25] HEWITT, C., BISHOP, P., AND STEIGER, R. A Universal Modular ACTOR Formalism for Artificial Intelligence. In Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI) (1973), pp. 235–245.

[26] HOARE, C. A. R. Communicating Sequential Processes. Communications of the ACM 21, 8 (1978), 666–677.

[27] MÜLLER, I., RATSCH, C., AND FAERBER, F. Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems. In Proceedings of the International Conference on Extending Database Technology (EDBT) (2014), pp. 283–294.

[28] ISO/IEC. Information Technology – Microprocessor Systems – Floating-Point Arithmetic. Standard 60559:2011, 2011.

[29] Apache Kafka. http://kafka.apache.org.

[30] KNUTH, D. E. The Art of Computer Programming: Sorting and Searching, vol. 3. 1998.

[31] KREIBICH, C., WEAVER, N., NECHAEV, B., AND PAXSON, V. Netalyzr: Illuminating the Edge Network. In Proceedings of the Internet Measurement Conference (IMC) (2010), pp. 246–259.

[32] LEACH, P. J., MEALLING, M., AND SALZ, R. A Universally Unique IDentifier (UUID) URN Namespace. RFC 4122, Internet Engineering Task Force (IETF), 2005.

[33] LEE, J., LEE, S., LEE, J., YI, Y., AND PARK, K. FloSIS: A Highly Scalable Network Flow Capture System for Fast Retrieval and Storage Efficiency. In Proceedings of the USENIX Annual Technical Conference (ATC) (2015).

[34] LEMIRE, D., KASER, O., AND AOUICHE, K. Sorting Improves Word-aligned Bitmap Indexes. Data & Knowledge Engineering 69, 1 (2010), 3–28.

[35] libpcap. http://www.tcpdump.org.

[36] Lucene. https://lucene.apache.org.

[37] LZ4: Extremely Fast Compression Algorithm. https://github.com/Cyan4973/lz4.

[38] MAIER, G., SOMMER, R., DREGER, H., FELDMANN, A., PAXSON, V., AND SCHNEIDER, F. Enriching Network Security Analysis with Time Travel. In Proceedings of the ACM SIGCOMM Conference (2008).

[39] MARTÍNEZ-PRIETO, M. A., BRISABOA, N., CÁNOVAS, R., CLAUDE, F., AND NAVARRO, G. Practical Compressed String Dictionaries. Information Systems 56 (2016), 73–108.

[40] MELNIK, S., GUBAREV, A., LONG, J. J., ROMER, G., SHIVAKUMAR, S., TOLTON, M., AND VASSILAKIS, T. Dremel: Interactive Analysis of Web-Scale Datasets. Proceedings of the VLDB Endowment 3, 1-2 (2010), 330–339.

[41] O'NEIL, P. E. Model 204 Architecture and Performance. In Proceedings of the International Workshop on High Performance Transaction Systems (1987), pp. 40–59.

[42] ONGARO, D., AND OUSTERHOUT, J. In Search of an Understandable Consensus Algorithm. In Proceedings of the USENIX Annual Technical Conference (ATC) (2014), pp. 305–319.

[43] PAXSON, V. Bro: A System for Detecting Network Intruders in Real-Time. Computer Networks 31, 23–24 (1999), 2435–2463.

[44] ROTEM, D., STOCKINGER, K., AND WU, K. Optimizing Candidate Check Costs for Bitmap Indices. In Proceedings of the International Conference on Information and Knowledge Management (CIKM) (2005), pp. 648–655.

[45] SCHMIDT, A., AND BEINE, M. A Concept for a Compression Scheme of Medium-Sparse Bitmaps. In Proceedings of the International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA) (2011), pp. 192–195.

[46] SHVACHKO, K., KUANG, H., RADIA, S., AND CHANSLER, R. The Hadoop Distributed File System. In Proceedings of the Symposium on Mass Storage Systems and Technologies (MSST) (2010), pp. 1–10.

[47] SINHA, R. R., AND WINSLETT, M. Multi-resolution Bitmap Indexes for Scientific Data. ACM Transactions on Database Systems (TODS) 32, 3 (2007).

[48] Snappy: A Fast Compressor/Decompressor. https://code.google.com/p/snappy/.

[49] Splunk. http://www.splunk.com.

[50] STOCKINGER, K., CIESLEWICZ, J., WU, K., ROTEM, D., AND SHOSHANI, A. Using Bitmap Index for Joint Queries on Structured and Text Data. New Trends in Data Warehousing and Data Analysis (2009), 1–23.

[51] TAYLOR, T., COULL, S. E., MONROSE, F., AND MCHUGH, J. Toward Efficient Querying of Compressed Network Payloads. In Proceedings of the USENIX Annual Technical Conference (ATC) (2012).

[52] VALLENTIN, M. Scalable Network Forensics. Ph.D. Thesis, University of California, Berkeley, 2016. (in preparation).

[53] VAN SCHAIK, S. J., AND DE MOOR, O. A Memory Efficient Reachability Data Structure Through Bit Vector Compression. In Proceedings of the International Conference on Management of Data (SIGMOD) (2011), pp. 913–924.

[54] VAST. http://vast.io.

[55] WONG, H. K. T., LIU, H.-F., OLKEN, F., ROTEM, D., AND WONG, L. Bit Transposed Files. In Proceedings of the International Conference on Very Large Data Bases (VLDB) (1985), pp. 448–457.

[56] WU, K., OTOO, E., AND SHOSHANI, A. On the Performance of Bitmap Indices for High Cardinality Attributes. In Proceedings of the International Conference on Very Large Data Bases (VLDB) (2004), pp. 24–35.

[57] WU, K., OTOO, E. J., AND SHOSHANI, A. A Performance Comparison of Bitmap Indexes. In Proceedings of the International Conference on Information and Knowledge Management (CIKM) (2001), pp. 559–561.

[58] ZAHARIA, M., CHOWDHURY, M., DAS, T., DAVE, A., MA, J., MCCAULEY, M., FRANKLIN, M. J., SHENKER, S., AND STOICA, I. Resilient Distributed Datasets: A Fault-tolerant Abstraction for In-memory Cluster Computing. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) (2012), pp. 2–2.

Appendix

In §4.2.2 we introduce the structure of VAST's high-level index types, but do not elaborate on how to operate on them with concrete algorithms. Table 3 and Table 4 provide this information in more detail. For each type, we show how to append a value (denoted ⊕) to an index as well as how to query it in terms of logical operations.

The basic index I in these tables represents a bitmap or inverted index with a fixed binning, encoding, and compression scheme. I operates on unsigned integer values and supports the operators {<, ≤, =, ≠, ≥, >}. For the append and lookup algorithms of the concrete encoding schemes, we refer the reader to the literature [8, 9], from which we summarize established results about multi-component indexes in the following.
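To make the basic index concrete, the sketch below implements an equality-encoded bitmap index over unsigned integers supporting all six operators. It is a minimal illustration only (the `BasicIndex` class is hypothetical and omits the binning and compression that VAST applies); bit vectors are modeled as Python integers, with bit n representing row id n:

```python
# Minimal sketch of a basic equality-encoded bitmap index I over unsigned
# integers. One bit vector per distinct value; no binning or compression.
import operator

class BasicIndex:
    def __init__(self, cardinality):
        self.bitmaps = [0] * cardinality  # one bit vector per value, as an int
        self.n = 0                        # number of appended values

    def append(self, x):
        self.bitmaps[x] |= 1 << self.n    # set the bit for row id n
        self.n += 1

    def lookup(self, op, x):
        ops = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
               "<=": operator.le, ">": operator.gt, ">=": operator.ge}
        result = 0
        for v, bm in enumerate(self.bitmaps):
            if ops[op](v, x):
                result |= bm              # OR together matching bit vectors
        return result

idx = BasicIndex(cardinality=4)
for value in [2, 0, 3, 2]:
    idx.append(value)

print(bin(idx.lookup("=", 2)))   # rows 0 and 3 -> 0b1001
print(bin(idx.lookup("<=", 2)))  # rows 0, 1, and 3 -> 0b1011
```

Range lookups here scan all candidate bitmaps; the range-encoded and multi-component schemes in the literature [8, 9] exist precisely to reduce that cost.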

The basic index forms the foundation for the k-component index K_β = ⟨I₁, …, I_k⟩ with |β| = k (see §4.2.1). It underlies many higher-level indexes, and its lookup algorithm varies according to the relational operator of the predicate. Answering equality queries involves computing a simple conjunction:

EQ(i, x) = ⋀_{j=1}^{i} (I_j = x_j)    (1)

Thus, we can answer K_β = x with EQ(k, x), and K_β ≠ x with its complement ¬EQ(k, x). For one-sided range queries, the algorithm less-than-or-equal (LE) implements range lookups of the form K_β ≤ x as follows:

LE(i, x) =
    (I_i ≤ x_i − 1) ∨ (θ_i ∧ LE(i−1, x))    i > 1, 0 < x_i < β_i − 1
    θ_i ∧ LE(i−1, x)                        i > 1, x_i = 0
    (I_i ≤ x_i − 1) ∨ LE(i−1, x)            i > 1, x_i = β_i − 1
    I_i ≤ x_i                               i = 1
    (2)

The extra parameter θ_i depends on the coding scheme and means either I_i = x_i or I_i ≤ x_i. Putting together algorithms EQ and LE, we can now answer K_β ◦ x under all relational operators with the algorithm Ω:

Ω(K_β, ◦, x) =
    EQ(k, x)          ◦ ∈ {=}
    ¬EQ(k, x)         ◦ ∈ {≠}
    LE(k, x)          ◦ ∈ {≤}
    ¬LE(k, x)         ◦ ∈ {>}
    LE(k, x − 1)      ◦ ∈ {<} ∧ x > 0
    ¬LE(k, x − 1)     ◦ ∈ {≥} ∧ x > 0
    0                 ◦ ∈ {<} ∧ x = 0
    1                 ◦ ∈ {≥} ∧ x = 0
    (3)

The two results 0 and 1 denote the empty and the complete identifier set, respectively. In the case of bitmap indexes, 0 represents a bit vector of all 0-bits, whereas 1 consists only of 1-bits.
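The algorithms above can be sketched end-to-end for equality-coded components (so θ_i means I_i = x_i). This is a simplified, uncompressed illustration: the `KComponentIndex` class is hypothetical, bit vectors are Python integers, and the recursion walks from the most toward the least significant component rather than by decreasing index i as in Equation (2):

```python
# Sketch of a k-component index K_beta with the EQ/LE/Omega lookups of
# Equations (1)-(3), using uncompressed equality-encoded components.

def decompose(value, base):
    """Split a value into digits <x_k, ..., x_1> for base <b_k, ..., b_1>."""
    digits = []
    for b in reversed(base):                 # least significant first
        digits.append(value % b)
        value //= b
    return list(reversed(digits))            # most significant first

class KComponentIndex:
    def __init__(self, base):
        self.base = base                     # <b_k, ..., b_1>
        self.components = [[0] * b for b in base]  # equality bitmaps per digit
        self.n = 0

    def append(self, value):
        for comp, digit in zip(self.components, decompose(value, self.base)):
            comp[digit] |= 1 << self.n
        self.n += 1

    def _eq(self, i, d):                     # bit vector for I_i = d
        return self.components[i][d]

    def _le(self, i, d):                     # bit vector for I_i <= d
        result = 0
        for v in range(d + 1):
            result |= self.components[i][v]
        return result

    def EQ(self, digits):                    # Equation (1)
        result = (1 << self.n) - 1
        for i, d in enumerate(digits):
            result &= self._eq(i, d)
        return result

    def LE(self, i, digits):                 # Equation (2), equality coding
        d = digits[i]
        if i == len(digits) - 1:             # least significant component
            return self._le(i, d)
        if d == 0:
            return self._eq(i, d) & self.LE(i + 1, digits)
        if d == self.base[i] - 1:
            return self._le(i, d - 1) | self.LE(i + 1, digits)
        return self._le(i, d - 1) | (self._eq(i, d) & self.LE(i + 1, digits))

    def lookup(self, op, value):             # Equation (3)
        mask = (1 << self.n) - 1             # the complete identifier set
        digits = decompose(value, self.base)
        if op == "=":  return self.EQ(digits)
        if op == "!=": return mask & ~self.EQ(digits)
        if op == "<=": return self.LE(0, digits)
        if op == ">":  return mask & ~self.LE(0, digits)
        if op == "<":  return 0 if value == 0 else self.lookup("<=", value - 1)
        if op == ">=":
            return mask if value == 0 else mask & ~self.lookup("<=", value - 1)

idx = KComponentIndex(base=[10, 10])         # two decimal components
for v in [42, 7, 99, 42, 0]:
    idx.append(v)

print(bin(idx.lookup("=", 42)))   # rows 0 and 3 -> 0b1001
print(bin(idx.lookup("<=", 41)))  # rows 1 and 4 -> 0b10010
```

Note how the `<` and `≥` cases reduce to `LE` on x − 1 and its complement, exactly as in Equation (3).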


basic*
    Structure: I
    Append: I ⊕ x^(α)
    Lookup: I ◦ x

k-component†
    Structure: K_β = ⟨I_k, …, I_1⟩;  Θ_k = K_β with β_i = β_j = 2 and |β| = k;  Φ_w = K_β with ∏_{i=1}^{k} β_i ≤ 2^w
    Append: K_β ⊕ x^(α) ≡ I_i ⊕ x_i^(α)  ∀ 1 ≤ i ≤ k
    Lookup: K_β ◦ x ≡ Ω(K_β, ◦, x)

bool
    Structure: B = S
    Append: B ⊕ x^(α) ≡ S ⊕ α iff x = true
    Lookup: B ◦ x = S if x = true;  ¬S if x = false

count
    Structure: C = Φ₆₄
    Append: C ⊕ x^(α) ≡ Φ₆₄ ⊕ x^(α)
    Lookup: C ◦ x ≡ Φ₆₄ ◦ x

int
    Structure: I = Φ₆₄
    Append: I ⊕ x^(α) ≡ Φ₆₄ ⊕ (x_u^(α) + 2⁶³)
    Lookup: I ◦ x ≡ Φ₆₄ ◦ (x_u + 2⁶³)

real
    Structure: F = ⟨S, E, M⟩ with S = B, E = Θ₁₁, M = Θ₅₂
    Append: F ⊕ ⟨x_s, x_e, x_m⟩^(α) ≡ S ⊕ x_s^(α);  E ⊕ x_e^(α);  M ⊕ x_m^(α)
    Lookup: F ◦ x =
        S = x_s ∧ E ◦ x_e ∧ M ◦ x_m       ◦ ∈ {=, ≠}
        S = 0 ∧ E ◦ x_e ∧ M ◦ x_m         x ≥ 0 ∧ ◦ ∈ {>, ≥}
        S = 1 ∨ (E ◦ x_e ∧ M ◦ x_m)       x ≥ 0 ∧ ◦ ∈ {<, ≤}
        S = 0 ∨ (E ◦̄ x_e ∧ M ◦̄ x_m)       x < 0 ∧ ◦ ∈ {>, ≥}
        S = 1 ∧ E ◦̄ x_e ∧ M ◦̄ x_m         x < 0 ∧ ◦ ∈ {<, ≤}

duration
    Structure: D = I
    Append: D ⊕ x^(α) ≡ I ⊕ x^(α)
    Lookup: D ◦ x ≡ I ◦ x

time
    Structure: T = D
    Append: T ⊕ x^(α) ≡ D ⊕ x^(α)
    Lookup: T ◦ x ≡ D ◦ x

string
    Structure: S = ⟨φ, κ₁, …, κ_M⟩ with κ_i = Θ₈
    Append: S ⊕ ⟨x₁, …, x_n⟩^(α) ≡ φ ⊕ n^(α);  κ_i ⊕ x_i^(α)  ∀ 1 ≤ i ≤ n
    Lookup: S ◦ x =
        0                                                               |x| > M
        φ = 0                                                           |x| = 0
        φ = |x| ∧ ⋀_{i=1}^{|x|} (κ_i = x_i)                             ◦ ∈ {=, ≠}
        φ ≥ |x| ∧ ⋁_{i=1}^{M−|x|+1} ⋀_{j=1}^{|x|} (κ_{i+j−1} = x_j)     ◦ ∈ {∈, ∉}

* The basic index has a fixed binning, coding, and compression scheme; it operates on values x ∈ X ⊆ ℕ₀ and has cardinality C ≤ |X| (see §4.2.1).
† The k-component index operates with a base β = ⟨β_k, …, β_1⟩. We show algorithm Ω in Equation (3). The bit-sliced index [55] is a special case of K_β where β_i = β_j = 2 for all i ≠ j. The multi-component index Φ_w can represent at most 2^w distinct values. We denote by ◦̄ the "mirrored" operator of ◦, e.g., < and >.

Table 3: Summary of append and lookup operations on high-level indexes.
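The `real` row above decomposes a floating-point value into the triple ⟨sign, exponent, mantissa⟩ before indexing each part separately. The sketch below shows that decomposition for an IEEE 754 double; the `decompose_double` helper is illustrative, not VAST's code:

```python
# Sketch: decomposing an IEEE 754 double into the <sign, exponent, mantissa>
# triple that the `real` index F = <S, E, M> operates on (1 sign bit,
# 11 exponent bits, 52 mantissa bits).
import struct

def decompose_double(value):
    bits, = struct.unpack("<Q", struct.pack("<d", value))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11 biased exponent bits
    mantissa = bits & ((1 << 52) - 1)    # 52 fraction bits
    return sign, exponent, mantissa

s, e, m = decompose_double(-1.5)
print(s)  # 1: negative
print(e)  # 1023: biased exponent for 2^0
print(m)  # 2251799813685248 (2^51): fraction 0.5 after the implicit leading 1
```

Because sign, exponent, and mantissa are each small unsigned integers, they fit naturally into the bit-sliced components Θ₁₁ and Θ₅₂.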


addr
    Structure: A = Θ₁₂₈
    Append: A ⊕ ⟨x₁, …, x₁₂₈⟩^(α) ≡ Θ₁₂₈,i ← x_i^(α)  ∀ 1 ≤ i ≤ 128
    Lookup: A ◦ x =
        ⋀_{i=1}^{k} (Θ₁₂₈,i = x_i)    ◦ ∈ {∈}
        ¬(A ∈ x)                      ◦ ∈ {∉}

subnet
    Structure: U = ⟨A, Φ₈⟩
    Append: U ⊕ ⟨x_a, x_p⟩^(α) ≡ A ⊕ x_a^(α);  Φ₈ ⊕ x_p^(α)
    Lookup: U ◦ x =
        Φ₈ ≤ x_p ∧ ⋀_{i=1}^{p} (A_i = x_{a,i})    ◦ ∈ {∈}
        ¬(U ∈ x)                                  ◦ ∈ {∉}

port
    Structure: P = ⟨Φ₁₆, Φ₂⟩
    Append: P ⊕ ⟨x_n, x_t⟩^(α) ≡ Φ₁₆ ⊕ x_n^(α);  Φ₂ ⊕ x_t^(α)
    Lookup: P ◦ x =
        Φ₁₆ ◦ x_n ∧ Φ₂ = x_t    x_t ≠ unknown
        Φ₁₆ ◦ x_n               x_t = unknown

vector*
    Structure: X_V = ⟨φ, V₁, …, V_M⟩
    Append: X_V ⊕ ⟨x₁, …, x_n⟩^(α) ≡ φ ⊕ n^(α);  V_i ⊕ x_i^(α)  ∀ 1 ≤ i ≤ n
    Lookup: X_V ◦ x = see S ◦ x if τ(x) = vector;  see X_S ◦ x if τ(x) = set

set
    Structure: X_S = ⟨φ, V₁, …, V_M⟩
    Append: X_S ⊕ ⟨x₁, …, x_n⟩^(α) ≡ φ ⊕ n^(α);  V_i ⊕ x_i^(α)  ∀ 1 ≤ i ≤ n
    Lookup: X_S ◦ x =
        0                                        |x| > M
        φ = 0                                    |x| = 0
        ⋀_{i=1}^{|x|} ⋁_{j=1}^{M} (V_j = x_i)    otherwise

table†
    Structure: X_T = ⟨φ, X₁, …, X_M, Y₁, …, Y_M⟩ with X_i = V and Y_i = V
    Append: X_T ⊕ ⟨(k₁, v₁), …, (k_n, v_n)⟩^(α) ≡ φ ⊕ n^(α);  X_i ⊕ k_i^(α)  ∀ 1 ≤ i ≤ n;  Y_i ⊕ v_i^(α)  ∀ 1 ≤ i ≤ n
    Lookup:
        X_T ◦ k = ⋁_{i=1}^{M} (X_i = k) for ◦ ∈ {∈};  ¬(X_T ∋ k) for ◦ ∈ {∉}
        X_T ◦ v = ⋁_{i=1}^{M} (Y_i = v) for ◦ ∈ {∈};  ¬(X_T ∋ v) for ◦ ∈ {∉}
        X_T ◦ (k, v) = ⋁_{i=1}^{M} (X_i = k ∧ Y_i = v) for ◦ ∈ {∈};  ¬(X_T ∋ (k → v)) for ◦ ∈ {∉}

* Depending on the type τ(x) of value x, the lookup function can either preserve ordering (as in substring search) or ignore ordering (as in subset search).
† A table value has the form x = ⟨(k₁, v₁), …, (k_n, v_n)⟩. We show lookups for a single key, value, or mapping.

Table 4: Summary of append and lookup operations on high-level indexes.
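The addr and subnet rows above conjoin one bit-sliced component per address bit, so a subnet membership test needs only the top p slices. The sketch below illustrates this for 8-bit "addresses" instead of 128 bits; the `AddrIndex` class is a hypothetical illustration, with bit vectors modeled as Python integers:

```python
# Sketch: a bit-sliced address index in the style of A = Theta_128, shown for
# 8-bit addresses for brevity. One bit vector per address bit; a subnet lookup
# conjoins only the top prefix_len slices against the network prefix.

class AddrIndex:
    BITS = 8

    def __init__(self):
        self.slices = [0] * self.BITS   # slices[0] = most significant bit
        self.n = 0                      # number of appended addresses

    def append(self, addr):
        for i in range(self.BITS):
            if addr & (1 << (self.BITS - 1 - i)):
                self.slices[i] |= 1 << self.n
        self.n += 1

    def in_subnet(self, network, prefix_len):
        """Bit vector of rows whose top prefix_len bits match the network."""
        mask = (1 << self.n) - 1
        result = mask
        for i in range(prefix_len):
            if network & (1 << (self.BITS - 1 - i)):
                result &= self.slices[i]         # this address bit must be 1
            else:
                result &= mask & ~self.slices[i]  # this address bit must be 0
        return result

idx = AddrIndex()
for addr in [0b10100001, 0b10111111, 0b01000000]:
    idx.append(addr)

print(bin(idx.in_subnet(0b10100000, 3)))  # prefix 101: rows 0 and 1 -> 0b11
```

A longer prefix only adds conjunctions, so a /p lookup touches p of the 128 slices regardless of how many addresses are indexed.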