Behind the Kenna Prioritization Algorithm






Contents

Predicting and Preventing Threats: It’s a Data Problem

Using Data Science at Scale for Prioritization

About Kenna’s Patents

About Kenna’s Chief Data Scientist

Prioritization Checklist

Predicting and Preventing Threats: It’s a Data Problem

Security teams are overwhelmed with data—whether from vulnerability scans, dynamic or static application scans, or a myriad of tools that tell them about misconfigurations in their systems or applications. They have no efficient way of prioritizing this massive volume of data, and may rely upon antiquated approaches such as the CVSS to do so—an extraordinarily poor way to stack rank vulnerabilities. Security teams may also play the “let’s close everything” game, submitting long PDF documents to their remediation groups—documents that are summarily tossed into the recycle bin.

That approach doesn’t work in the modern era where automated attacks at scale can easily overpower manual attempts to prioritize and close vulnerabilities. It’s like trying to run alongside a speeding train; technology can outpace and overpower even the most disciplined and focused security professionals.

Security teams need to employ the same tactics at scale as the attacks and threats confronting them. If you’ll accept a brief detour into buzzwords, these teams face a “big data” problem and they need data science in order to properly solve it. Only by automating the process of prioritizing and remediating vulnerabilities can they keep up with the volume of data being thrown at them each day.


This is the problem that Kenna’s platform solves. We equip customers with the ability to keep pace with both targeted and non-targeted attacks by enabling them to solve the prioritization problem at scale. There may be billions of attacks every hour, but the Kenna algorithm helps hone and shape that mass of data into a manageable list of vulnerabilities that are critical for a specific organization to fix immediately, helping teams take exactly the right actions to protect themselves.

The platform enables these actions to be taken quickly. When there are trillions of pieces of data that need to be computed against customer environments, existing vulnerabilities, and exploit intelligence, the ability to automate these computations at lightning speed is absolutely essential. The vastness of the data categorically rules out legacy tools such as Excel sheets or cobbled-together SIEM platforms.

So how does all this work?



Using Data Science at Scale for Prioritization

The goal of the Kenna algorithm is to distill the mountain of vulnerabilities that organizations face down to the ones that are critical to identify and remediate, increasing the efficacy of the security organization. To do this, Kenna brings together multiple layers of exploit intelligence with its customers’ vulnerability data and identifies which threats are the most critical to focus on for remediation.

However, we don’t simply look at the number of attacks; rather, we curate the data to evaluate not just the who but also the how. The goal is to arrive at veracity, or accuracy, as derived from the volume and velocity of successful exploits. By taking the volume and velocity of successful exploits in the wild, then overlaying that data on top of our customers’ environments, we are able to uncover substantial truths about which threats are critical to remediate quickly.

In doing so, instead of “fixing all the things,” our customers focus on fixing the ones that matter. For example, it may be important to note that there were ten successful exploitations of a particular CVE in the past week. But if there were 2,000 successful exploitations of a particular CVE in the past hour, security teams need to take immediate action.

Similarly, we place higher importance on the groups of assets that our users actually care about. If there’s a Java asset group in the New York office that our customer particularly wants to protect, then ten successful exploitations there in the past day may matter much more than 1,000 successful exploits against an irrelevant group of assets in the DMZ.

What we intend to do is quantify the probability that a specific vulnerability on a specific asset will be exploited, doing as much as possible to predict a successful exploitation before it happens. Kenna is, in a way, identifying not just the windows left open in a customer’s environment—but the ones most likely to enable an intruder to gain entry, based on the patterns we see across the Internet. With accurate probability estimates, our customers have the opportunity to close the windows before an attacker gets in.
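To make that weighting concrete, here is a minimal sketch of a velocity-weighted priority calculation in the spirit of the examples above. The half-life decay, the weights, and every name in it are illustrative assumptions for this paper, not Kenna’s production logic.

from dataclasses import dataclass

@dataclass
class ExploitObservation:
    """One successful in-the-wild exploitation of a CVE (illustrative)."""
    cve: str
    hours_ago: float

def velocity_weighted_volume(observations: list[ExploitObservation],
                             half_life_hours: float = 24.0) -> float:
    """Weight each exploitation by recency, so 2,000 successes in the
    past hour outweigh ten spread over the past week."""
    return sum(0.5 ** (o.hours_ago / half_life_hours) for o in observations)

def priority(observations: list[ExploitObservation],
             asset_group_weight: float) -> float:
    """Scale in-the-wild volume and velocity by how much the customer
    cares about the affected asset group (e.g., the Java group in the
    New York office)."""
    return velocity_weighted_volume(observations) * asset_group_weight

Under this kind of decay, recency dominates raw counts, which is exactly the week-versus-hour distinction drawn above.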

All of this is computed every half hour, using a scalable platform that analyzes billions of data points an hour, running the data through our framework to provide actionable, contextual next steps to the end user.

Actionable Data

Data and metrics are only useful if they enable you to make better decisions. Our customers are able to see what they need to act on fast, because we don’t compute vanity metrics; we use the aforementioned probability estimates to decide which actions provide the best bang for the buck in reducing the risk posed by critical vulnerabilities.

Put differently, we run each remediation, mitigation, and fix through a simulation that parses the affected assets and vulnerabilities and calculates what your environment would look like if the fix were applied. Then we rank-order all of the possible remediations to give you an actionable list of fixes, already bundled with the justification for why each fix is necessary. In doing so, we’re not just doing the heavy lifting on remediation strategy; we’re reducing the noise, giving everyone in the organization an explanation of why a particular action ought to be taken first, in the terms that are relevant to them.
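As a sketch of that simulation loop, one might re-score the environment with each candidate fix applied and rank fixes by the risk they remove. All names here are invented for illustration; this is not the Kenna API.

def rank_remediations(environment, fixes, risk_score):
    """Rank candidate fixes by simulated risk reduction.

    environment: set of (asset, vulnerability) pairs currently open
    fixes:       dict mapping fix name -> set of pairs that fix would close
    risk_score:  hypothetical callable scoring an environment's total risk
    """
    baseline = risk_score(environment)
    ranked = []
    for name, closed in fixes.items():
        # What the environment would look like if this fix were applied.
        reduction = baseline - risk_score(environment - closed)
        ranked.append((name, reduction))
    # Largest risk reduction first: an actionable, justified fix list.
    return sorted(ranked, key=lambda item: item[1], reverse=True)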

As a result, each role in the security hierarchy stands to gain substantial benefits. Are you a CISO? Look at the risk reduction from the application of a patch across your entire enterprise. Do you work in vuln management? Justify why this specific vulnerability ought to be remediated first with threat and exploit intelligence, as well as pre-computed probability estimates. Are you a sysadmin and have to apply the patch? Quickly see which assets are affected, and the why and how behind a particular update.

Let’s see what all this looks like within the Kenna platform. We break down the high-level data this way (a toy sketch of this faceting follows the list):

0-Day Vulnerabilities: For any potential zero day, Kenna matches similar assets in our customers’ networks based on metadata we’ve collected about the asset. While patches are generally not available for zero days, the information we provide in Kenna may include workarounds or compensating controls.

Top Priority: Your vulnerability management autopilot. The Top Priority facet takes multiple criteria into account, including elements of all the other facets, to help identify what to prioritize.

Active Internet Breaches: AIBs are vulnerabilities found in a customer’s environment that match vulnerabilities used in successful breaches over the past six months.

Easily Exploitable: Weaponized, easy-to-fire exploits are the lifeblood of casual attackers. “Easily Exploitable” vulnerabilities are known to have an exploit available in one of the exploit sources Kenna tracks, including ExploitDB, Metasploit, and others.

Popular Targets: These are targets of opportunity for attackers, vulnerabilities that are trending across the Kenna customer base and affect widely proliferated technology, making them particularly attractive to attackers.
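Here is the toy faceting sketch promised above: a tagging function over a handful of invented per-vulnerability fields. The field names are ours for illustration only; the platform’s real inputs are far richer.

def facets(vuln: dict) -> list[str]:
    """Tag a vulnerability with the facets described above, given
    illustrative boolean/count fields."""
    tags = []
    if vuln.get("zero_day"):
        tags.append("0-Day Vulnerabilities")
    if vuln.get("breach_matches_6mo", 0) > 0:
        tags.append("Active Internet Breaches")
    if vuln.get("in_exploit_source"):         # e.g., ExploitDB or Metasploit
        tags.append("Easily Exploitable")
    if vuln.get("trending") and vuln.get("widely_deployed"):
        tags.append("Popular Targets")
    if len(tags) >= 2:                        # crude stand-in for the multi-criteria logic
        tags.append("Top Priority")
    return tags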


More About the Science

If the goal is to predict how likely it is that a specific vulnerability will be exploited, then it’s necessary to understand what factors contribute to increasing the probability of exploitation. To do that, we use our exploit intelligence to create a data set that describes “ground truth.” From there, we build probability-based Bayesian models for every vulnerability that exists, while at the same time measuring the effectiveness of every other dataset we import into the platform against that ground truth.

That’s something in-house security teams simply can’t do on their own, because they can’t gauge the effectiveness of their in-house models. At Kenna, we test our models against the ground-truth “breach” dataset every 30 minutes. The exploit intelligence Kenna uses tells us how good our models are, and directs us to the strategy that maximizes effectiveness in every specific situation a customer may face.

Regression coefficients tell us how to weigh the impact of each new dataset we analyze when estimating the probability of a vulnerability being used in a successful attack. We’re able to derive the predictive value, specificity, and sensitivity of potential strategies by measuring the actions those strategies yield against the ground truth dataset. Over time, as the aperture of the data we import grows, so does our ability to define an effective, responsive vulnerability management strategy tailored to a specific enterprise.
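Measuring a strategy against the ground truth reduces to standard confusion-matrix arithmetic. A minimal sketch, with our variable names rather than the platform’s:

def evaluate_strategy(flagged: set, breached: set, all_vulns: set) -> dict:
    """Score a remediation strategy against the ground-truth breach dataset.

    flagged:  vulnerability IDs the strategy prioritizes for fixing
    breached: IDs actually used in successful attacks (ground truth)
    """
    tp = len(flagged & breached)              # prioritized and actually breached
    fp = len(flagged - breached)              # prioritized but never breached
    fn = len(breached - flagged)              # breached but missed
    tn = len(all_vulns - flagged - breached)  # correctly ignored
    return {
        "positive_predictive_value": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity":               tp / (tp + fn) if tp + fn else 0.0,
        "specificity":               tn / (tn + fp) if tn + fp else 0.0,
    }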

Our analysis indicates that a few factors continuously yield high positive predictive value and sensitivity in capturing the vulnerabilities that are likely to be successfully exploited:

1. Our external partners’ SIEM systems’ near-real-time feeds of exploitation attempts correlated to vulnerability scans, and the volume and velocity of these attempts and successes.

2. The existence of an exploit in ExploitDB and, to an even greater degree, the availability of a module in Metasploit.

3. The pervasiveness of a vulnerability across various clients’ disparate environments.

4. The type of vulnerability: remote code execution vulnerabilities are more likely to be exploited.

5. The position of an asset (internal vs. external).

Let’s take a closer look at how we put these factors together.


Here’s an example using only four of the many factors we use: active internet breach, Metasploit, ExploitDB, and trending. Given these factors, we want a function

f(Ebreach, Emsp, Eedb, T)

where Ei is a binary variable corresponding to the types of exploits available, and T is a binary variable corresponding to the Trending status of the vulnerability.

We postulate that, in terms of impact on the probability of a breach event, Ebreach is more important than Emsp, and both are much more important than Eedb, which itself is much more important than T. This gives us 12 permutations for a vulnerability (3*2*2), as follows:

[Table: Rank Order of Probabilities, enumerating the permutations of exploit availability (Eedb=1, Emsp=1, Ebreach=1) crossed with trending status (T=0, T=1).]

In reality, we don’t postulate anything: the platform calculates which factors matter most, and what their impact is, at regular intervals in order to build the algorithm. But for the purposes of illustration, let’s continue.

We start with a baseline score of 2.5 for each vulnerability, and use the space of (2.5, 10.0] as the map for the rest of the ranking. The resulting function assigns a cumulative, rank-ordered value to each of the possible outcomes. You can read more about this approach at:

http://exploringpossibilityspace.blogspot.com/2014/02/thomas-scoring-system.html.
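To make the mapping concrete, here is a minimal sketch of that rank-ordered scoring. The weights, and the reading of the 3*2*2 factorization as three exploit-availability levels crossed with breach and trending states, are illustrative assumptions on our part; as noted above, the platform derives the real impacts empirically.

from itertools import product

BASELINE = 2.5   # floor score assigned to every vulnerability
CEILING = 10.0   # scores are mapped into the space (2.5, 10.0]

# Illustrative weights encoding the postulated ordering:
# breach >> metasploit >> exploitdb >> trending.
WEIGHTS = {"breach": 4.0, "msp": 2.0, "edb": 1.0, "trending": 0.5}

def raw_score(exploit_level: str, breach: int, trending: int) -> float:
    """Sum the weights of the factors present on a vulnerability."""
    base = {"none": 0.0, "edb": WEIGHTS["edb"], "msp": WEIGHTS["msp"]}[exploit_level]
    return base + breach * WEIGHTS["breach"] + trending * WEIGHTS["trending"]

# The 12 permutations: 3 exploit levels * 2 breach states * 2 trending states.
perms = list(product(("none", "edb", "msp"), (0, 1), (0, 1)))
ranked = sorted(perms, key=lambda p: raw_score(*p))

# Map rank order linearly into (BASELINE, CEILING]; the lowest-risk
# permutation lands just above the baseline, the riskiest exactly at 10.0.
step = (CEILING - BASELINE) / len(ranked)
for rank, perm in enumerate(ranked, start=1):
    print(perm, "->", round(BASELINE + rank * step, 3))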

Armed with this qualification of the probability of a vulnerability being exploited, we can begin to build our risk model.


The Result

These rank-ordered probabilities equip security teams with the measured science they need to identify and fix the most critical vulnerabilities. It’s a decision-making platform that can be automated across the entire enterprise, helping teams take the most efficacious action for the security of the organization.

CVSS Smoothing

Our research indicates that CVSS scores are not an effective metric for prioritizing vulnerabilities. While CVSS 10 vulnerabilities are both impactful and more exploited in the wild than the average vulnerability, that relationship does not hold for any other CVSS score range. Moreover, distributions of CVSS scores are not intuitive and often create gaps in the categorization of vulnerabilities as “critical” or otherwise. However, disregarding the research that yields CVSS scores entirely would be equally misguided, since we cannot hope to replicate those efforts.

Our alternative is to use a CVSS “points above average” metric for prioritizing vulnerabilities. As of this writing, the average CVSS score across all vulnerability definitions is 7.7, and we use the deviation from that average to assess a given CVSS score’s usefulness. Essentially this is the same points-above-average metric often seen in fantasy sports:

CVSSpaa = 10 * CVSSvulnerability / CVSSaverage

Functionally, the effect of the smoothing is to polarize the CVSS score distribution (stretch it away from the mean). Whereas previously a CVSS score was an unreliable indicator of a live breach occurring, in the smoothed version almost all such vulnerabilities cluster in the tail end, making the cutoff decision binary and indicative of live breaches.

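A few sample scores make the stretch visible (7.7 is the average quoted above):

CVSS_AVERAGE = 7.7   # average across all vulnerability definitions, per the text

def cvss_paa(cvss: float) -> float:
    """Points-above-average smoothing: stretch scores away from the mean."""
    return 10 * cvss / CVSS_AVERAGE

for score in (5.0, 7.7, 9.0, 10.0):
    print(score, "->", round(cvss_paa(score), 2))
# 5.0 -> 6.49, 7.7 -> 10.0, 9.0 -> 11.69, 10.0 -> 12.99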


About Kenna’s Patents

Kenna has filed patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Identifying Vulnerabilities of Computing Assets Based on Breach Data (patent number 9270695)

Internet Breach Correlation (patent number 8966639)

Ordered Computer Vulnerability Remediation Reporting (patent number 8984643)

These patents describe techniques for ranking a set of vulnerabilities of a computing asset, ranking a set of remediations for a computing asset, and determining a risk score for one or more computing assets. In one technique, vulnerabilities of computing assets in a customer network are received at a vulnerability intelligence platform. Breach data indicating a set of breaches that occurred outside the customer network is also received. A subset of the vulnerabilities that are most susceptible to a breach is identified based on the breach data. In another technique, multiple vulnerabilities of a computing asset are determined, and a risk score is generated for the computing asset based on those vulnerabilities. In another technique, multiple remediations associated with a risk score and multiple vulnerabilities are identified, and the remediations are ordered by how much they would reduce the risk score if applied to remove the corresponding vulnerabilities.


About Kenna’s Chief Data Scientist

Michael Roytman is Kenna’s Chief Data Scientist. He is co-author of Kenna’s key patents, and the driving force behind the algorithm that’s used to prioritize critical vulnerabilities.

Some of Michael’s credentials:

Sole author of the “Vulnerabilities” chapter in the 2016 Verizon Data Breach Investigations Report

Co-author of the “Vulnerabilities” chapter in the 2015 Verizon Data Breach Investigations Report

Frequent speaker, including at RSA, SOURCE, BSides, Metricon, and SIRAcon

Referred to by Gartner security analyst Anton Chuvakin as security “literati”

(blogs.gartner.com/anton-chuvakin/2015/10/13/vulnerability-management-1-problem-after-all-these-years)

His research has been quoted in multiple publications, including the New York Times, the Wall Street Journal, Dark Reading, and SC Magazine

M.S. in Operations Research from Georgia Tech

Michael is available to talk at any time with Kenna’s prospects and customers about the specifics of his work. Many of the questions security teams have can be answered in a brief 30-minute session.


Prioritization Checklist

We hope this white paper helped explain the concepts underlying Kenna’s algorithm. While you don’t necessarily need to use Kenna in order to have a world-class way to prioritize your vulnerabilities, we do think that any such platform you may use needs to have certain components:


The platform needs to account for successful exploits using contextual exploit intelligence

The platform needs to account for the volume & velocity of successful exploits

The platform’s intelligence sources need to be clearly delineated

The platform needs to be “vulnerability scanner agnostic,” able to intake data from any scanner

The platform’s data science needs to work even on non-authenticated scans, inferring the OS to ensure full coverage

The platform needs to have a self-service “proof point” in the form of a free, self-service trial or freemium offering

The platform’s data science needs to be authored by credible sources—with the right patents, publications, and credentials

The platform should not only prioritize vulnerabilities, but also suggest the right patch, providing not only the “what” but also the “what’s next” in the form of the right action to take

The platform should translate its prioritization capabilities into automated reporting, including the ability to track risk posture over time

The Final Proof Point: Your Own Data

The best way to test the effectiveness of Kenna is to try it for yourself. Put 1,000 assets in our platform, and within an hour, see whether the platform’s data science yields actionable results. Our trial is completely self-service, and most users are able to prioritize their vulnerabilities very quickly.

www.kennasecurity.com/signup

Toll Free: (855) 474-7546
Direct: (312) 577-6987
Email: [email protected]

Kenna Security
560 Mission Street, Suite 230
San Francisco, CA 94105

Try for Free (it only takes minutes): www.kennasecurity.com

© COPYRIGHT 2016 KENNA SECURITY, INC. ALL RIGHTS RESERVED.