Risk-Based Access Control Decisions Under Uncertainty
Ian Molloy, Pau-Chen Cheng, Jorge Lobo
IBM Research
{molloyim,pau,jlobo}@us.ibm.com

Luke Dickens, Alessandra Russo
Imperial College
{lwd03,ar3}@doc.ic.ac.uk

Charles Morisset
Royal Holloway, University of London
Abstract—This paper addresses making access control decisions under uncertainty, when the benefit of doing so outweighs the need to guarantee the correctness of the decisions, for instance when there are limited, costly, or failed communication channels to a policy decision point (PDP). Previously, local caching of decisions has been proposed, but when a correct decision is not available, either a PDP must be contacted, or a default decision used. We improve upon this model by using learned classifiers of access control decisions. These classifiers, trained on known decisions, infer decisions when an exact match has not been cached, and use intuitive notions of utility, damage and uncertainty to determine when an inferred decision is preferred over contacting a remote PDP. There is uncertainty in the inferred decisions, introducing risk. We propose a mechanism to quantify such uncertainty and risk. The learning component continuously refines its models based on inputs from a central PDP in cases where the risk is too high or there is too much uncertainty. We validated our models by building a prototype system and evaluating it against real-world access control data. Our experiments show that, over a range of system parameters, it is feasible to use machine learning methods to infer access control decisions. Thus our system yields several benefits, including reduced calls to the PDP, communication cost and latency, and increased net utility and system survivability.
I. INTRODUCTION
Modern access control systems rely on the PEP/PDP ar-
chitecture: a Policy Enforcement Point (PEP) forwards each access request made within an application to a Policy Decision
Point (PDP), which analyses this request and returns an allow/-
deny authorization decision. In some solutions, such as [1]–[3],
a PDP is commonly implemented as a dedicated authorization
server and is located on a different node than the PEPs.
To enforce a consistent policy throughout the system, this
architecture relies on the PEP being able to contact the PDP
to query decisions, and therefore suffers from a single point
of failure. In particular, key factors that affect the performance
of the PEP are latency of the communication with the PDP,
reliability and survivability of this connection (and the PDP
itself), as well as the aggregated impact of communication
costs (which can include contacting a human to perform the authorization). In several contexts such as mobile applications,
these costs may be prohibitive.
A number of approaches have been proposed to address
these issues; one common theme is to cache access control
decisions [4], [5] at the PEP, such that the PEP does not have
to forward the same access request more than once to the PDP:
either the PEP finds the request-decision pair in the history, or
it forwards the request. Others explore the tradeoff between
efficiency and accuracy in an access control context, e.g., Ni
et al. [6] use a machine-learning algorithm to build a classifier
of policy decisions from past request-decision pairs. Once the
classifier is accurate enough, it is used as a local decision
point. As with any such approximate classifier, there is a
degree of uncertainty in every decision; thus, every decision
made is inherently associated with a risk of being wrong, either
incorrectly allowing or denying some requests.
This paper generalises the machine-learning approach
from [6], by taking account of the uncertainty in each locally
inferred decision. We propose a model where this uncertainty
can be used to determine the risk of making this decision. This allows the policy to trade off the utility of making the local
decision against the risk associated with getting it wrong: if
favourable it will enact the local decision, otherwise it defers
to the central PDP. Commonly used access control models [7]–
[11] define access control policies in terms of model-specific
relationships between subjects and objects. Our approach is
not tied to a particular model, but it uses attributes of access
requests, e.g., subjects and objects, and assumes the policy can
be defined using these attributes.
We describe a model for quantifying uncertainty and risk,
and show how these can inform access control decisions. Our
work focuses on estimating the uncertainty about the correct-
ness of the inferred access control decisions, and assumes that estimates of (or distributions over) the gain, loss and damage
associated with correct and incorrect decisions are available
(or can be obtained). There are many situations in which the
model can be applied. For instance, in the scenario described
in [6], expensive calls are made to a call center to resolve access
control decisions. There is also a large body of
research in the estimation of the value of resources and the
cost of security breaches and misuse. [12] describes a proof
of concept method to estimate the value of resources in an
enterprise, such as jewel code, acquisition plans, trade secrets,
and personal information. The estimates of these values can
be continuously refined and verified by the current operations
of the company. For example, IBM was granted over 5,000 patents in 2010 alone, and licenses many of these to other
organizations, providing an accurate distribution of their value
over time. Other examples include cases where fine grained
service level agreements are defined a priori such as the
credit card example briefly discussed in Section IV-D (see
also experiments in Section V-C).
We present a distributed access control system to illustrate
and test this principle: local decision points determine whether
the tradeoff between utility and uncertainty associated with a
local decision is favorable, and defer to the central PDP for a
binding decision if not. This general framework allows us to
explore the balance between avoiding potential errors in local
decisions, and the desire to limit reliance on the central PDP.
We present three methodologies for making decisions:
• Expected Utility A risk neutral approach that weights
expected gains and damage equally.
• Risk Adjusted Utility A pessimistic view of utility, where uncertainty about damage reduces desirability.
• Independent Risk Constraints Augments expected utility
with independent risk thresholds, where high utility de-
cisions can be rejected if outweighed by the risk.
This system is validated with data from a large corporation,
used to grant entitlements and permissions to system adminis-
trators. Classifiers are initially untrained and begin by always
consulting the central PDP. Over time, the classifiers are
trained using the central PDP’s decisions, improving accuracy
and reducing the uncertainty, yielding fewer queries. Due to
space constraints, experiments focus on the risk adjusted utility
measure. Compared with the caching algorithm, our results
show that our learning based approach reduces the queries to the
PDP by as much as 75% and increases the system utility up to
eightfold when the cache hit ratio is low. When the hit ratio
is high, performance is comparable to the caching algorithm.
Our results from this dataset show that our machine-learning
based approach is robust and accurate and can be viably used
for informing distributed access control decisions; and more
conservative measures of risk can be applied if needed. We
believe these techniques can generalize to other use cases
and domains, including information sharing in military and
coalition environments; financial applications like credit re-
port processing; and load balancing services like Netflix or
document repositories. Compared to pure machine-learning
solutions, our methods estimate the uncertainty of each clas-
sification, allowing one to arbitrarily minimize the pitfalls of
using approximate decisions, while ameliorating the higher
communication costs of naive caching solutions.
This paper is organized as follows. Section II briefly dis-
cusses related work. Section III discusses the architecture of
access control systems based on our approach. We present
in Section IV different techniques to assess the risk and
uncertainty of local access control decisions. An experiment
and evaluation of our approach, applied on different scenarios,
is presented in Section V. Section VI concludes this paper.
II. RELATED WORK
The caching of access control decisions has been explored in [1]–[3], and the policy evaluation framework in [13] uses
a caching mechanism, which can dramatically decrease the
time to evaluate access control requests. Clearly, caching
approaches are only valid when the cache is sure to return the
same decision as the central PDP, and techniques to ensure
strong cache consistency are proposed in [14]. In distributed
systems, each node can build its own cache in collaboration
with other nodes, improving the accuracy of each cache [15].
These caching methods return a local decision on a request
only if this exact request has been previously submitted and
answered. In the context of the Bell-LaPadula model [7],
Crampton et al. [4] present a method to calculate previously
unseen decisions, based on the history of submitted requests
and responses, and the formal rules governing Bell-LaPadula.
A similar approach for the RBAC model [8] is presented
in [5] and an extension to Bloom filters is used in [16]. These
inference techniques require knowledge of the underlying
access control model, and are most efficient for hierarchical
and structured access control models. Inference algorithms
have been proposed based on the relationship of the subjects
and objects in a database [17], [18]. However, the structure of
the subject/object space still needs to be known.
The use of machine-learning to make access control deci-
sions is proposed in [6], where the authors note the classifier
can, at times, return a decision that conflicts with the central
PDP. A notion of risk in access control is introduced in [19]–
[21], where they blur the distinction between allowing and
denying an access; instead, each access is associated with a
potential utility and damage. We reuse these notions here, in
order to estimate the potential impact of each access.

III. UNCERTAINTY IN ACCESS CONTROL
Figure 1 depicts the architecture of our system for access
control under uncertainty. An access request req is a triple
(s,o,a), where subject s accesses object o with access mode a.
We assume at least one oracle node with the full policy, called
the central PDP; and a correct local decision is one that agrees
with the central policy. Each node has a local PDP with three
parts. The decision-proposer (proposer) returns a guess of the
correct decision and a measure of the uncertainty. The risk-
assessor (assessor) determines whether the decision is taken
locally or deferred to the central PDP. The decision-resolver
(resolver) returns an enforceable decision. While any discrete set of enforceable decisions is supported, here we restrict the
set to be Datm = {allow, deny}. A PEP passes requests to
the proposer, and enforces decisions from the resolver.
Fig. 1. Architecture of a local node. The PEP passes a request req to the decision-proposer, which returns a proposal (dec, υ) ∈ Dprp to the risk-assessor; the risk-assessor passes a deferrable decision dec ∈ Ddef to the decision-resolver, which returns an enforceable decision in Datm, querying the central PDP when the decision is deferred. Datm: atomic decisions, e.g., {allow, deny}; Dprp: proposals, e.g., Datm × [0, 1]; Ddef: deferrable decisions, e.g., Datm ∪ {defer}.
a) Decision-Proposer: A decision-proposer receives an
access request and returns a proposal (dec, υ), where dec ∈ Datm is a guess of the correct decision for this request, and υ is
an uncertainty measure of the correctness of dec. Most simply, υ is a probability p, and a proposal (allow, p) is logically
equivalent to (deny, 1 − p) if Datm = {allow, deny}. When p is estimated, we may want to quantify the uncertainty in p.
In our learning based approach, the proposer is mainly a
classifier produced by a machine-learning algorithm, which
treats a decision in Datm as a class and assigns a request to
a class based on the request’s attributes; and the uncertainty
υ is a pair (α, β), where α, β ∈ [1, ∞] are parameters of a
beta distribution [22], Beta(µ|α, β). This distribution captures
our uncertainty about p, and µ is a variable approximating
p. The α and β values are computed from the number of
correct and incorrect decisions made by the classifier when
it is evaluated against the training and testing sets of the
learning process. Elements of these sets are past requests
tagged with the corresponding correct decisions. The beta
distribution fits well with our learning based approach and
gives finer control over risk estimation—see Section IV.
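A minimal sketch of this bookkeeping (the counts, the Beta(1, 1) prior and the use of SciPy are illustrative assumptions, not the prototype's code):

```python
# Sketch: deriving the uncertainty parameters (alpha, beta) of a proposer
# from counts of correct/incorrect decisions on held-out requests.
from scipy.stats import beta

def proposal_uncertainty(n_correct, n_incorrect, prior=(1.0, 1.0)):
    """Return (alpha, beta) for the Beta distribution over p, the
    probability that the proposed decision is correct."""
    a = prior[0] + n_correct
    b = prior[1] + n_incorrect
    return a, b

a, b = proposal_uncertainty(n_correct=47, n_incorrect=3)   # illustrative counts
p_hat = a / (a + b)                  # expected probability of correctness
dist = beta(a, b)                    # Beta(mu | alpha, beta) over p
print(p_hat, dist.interval(0.95))    # point estimate and 95% credible interval
```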
The learning approach makes the assumption that two
requests with similar attributes lead to similar decisions.
This assumption is supported by underlying principles of many
access control models, a large body of work on learning access
control policies, such as role mining, and best practices exer-
cised by large organizations. Many models, such as attribute-
based access control, assign rights based on attributes; one
only needs to find an appropriate distance measure, such as the Hamming distance between the subjects’ attributes, to
capture similarity. Though it is possible to come up with
a configuration that violates it, this assumption has been
empirically shown to hold for concrete systems [6] and is supported
by the large body of research on role mining [23], [24]. In some
models such as the Chinese Wall [9], the decision to grant an
access can also depend on the context. Such models would
need to be restructured to encode context into the attributes
of the subjects and objects; the notion of distance between
requests will be influenced by this encoding.
b) Risk Assessor: The risk-assessor takes a request and
a proposal and determines whether a decision can be made locally, or should be deferred to the central PDP. The risk
assessor can return any decision in the set of deferrable
decisions, Ddef = Datm ∪ {defer}.
A trivial assessor might have an uncertainty threshold above
which all decisions are deferred to the central PDP – all other
decisions being made locally. However, different requests have
different potential impacts on the system. Some requests have
high potential utility for the system, and wrongly denying
them would not realise this. Some requests have high potential
damage, and wrongly allowing them is a threat to the integrity
and survival of the system. Other factors to consider are the
cost of contacting the central PDP, and the system’s risk
appetite, which controls how risk-averse decisions should be.
A risk-averse decision trades off expected utility for greater
certainty in the outcome. More details on the risk assessor are
given in Section IV.
c) Decision Resolver: A decision-resolver takes a re-
quest, r, and a deferrable decision, decd ∈ Ddef, and returns
an enforceable decision, deca ∈ Datm. Typically, deca = decd if decd ∈ Datm; otherwise, r is deferred to the central PDP.
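A minimal sketch of the resolver, assuming the central PDP is exposed as a callable oracle:

```python
# Sketch of the decision-resolver: enact atomic decisions locally, otherwise
# query the (assumed) central PDP oracle for a binding answer.
def resolve(request, deferrable_decision, central_pdp):
    if deferrable_decision in ("allow", "deny"):
        return deferrable_decision
    return central_pdp(request)           # defer: ask the oracle
```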
IV. ASSESSING LOCAL DECISIONS
This section considers how the assessor takes a proposal
from the proposer, and returns a firm decision. As discussed,
the proposal may conflict with the decision that would have
been made if passed to the central policy. If a request that
would be allowed by the central PDP is denoted as valid ,
then there are two kinds of errors that the local PDP could
make: false-allows (allowing an invalid request) and false-
denies (denying a valid request). These errors can have neg-
ative consequences: a false-allow can result in leakage of
information, corruption of information, or privilege escalation;
a false-deny can result in disruption of service, breach of
service level agreements (SLA), and possibly reputational
damage. Making a correct decision leads to gains and benefits.
Ideally, an access control system should enable as many
appropriate accesses as possible, and so our primary concern is
to maximize the number and value of these accesses—subject
to limiting costs and damages. This trades off the potential
gains of each access control decision, against the costs and
damages that may result from poor decisions. Consider an assessor deciding whether to locally allow or deny a request,
(s,o,a), or to pass this decision on to a central PDP. The
assessor has a ternary choice to make: whether to allow the
request locally, allow; to deny locally, deny; or to defer the
decision to the central PDP, defer. We assume that the central
PDP is an oracle, and always gives the correct decision. If
the assessor decides allow, then either the request is valid
(a true-allow) and the system makes a gain, g, because an
appropriate access has been made; or the request is invalid and
the decision has caused some amount of damage, dA, due to
an inappropriate access being made. Likewise, if the assessor
decides deny, then either the request was invalid, deny is the
correct decision and nothing additional happens (i.e. a gain of 0), or the request was valid and the deny decision is incorrect,
and a damage, dD, is incurred for a false-deny. If the assessor
decides to defer and the request is valid, then the central PDP
will grant the request and the system will make a gain, g , but
incur a cost, c (the contact cost). If the decision is defer and
request is not valid then the central PDP will deny the request
and there will be no gain and again a contact cost, c. As the
central PDP is an oracle, it never makes an incorrect decision,
so there is no damage associated with this.
These gains, costs and damages are shown in Table I (a).
In general, the gains, costs and damages can be probability
distributions over potential values. For instance, in some cases,
wrongly allowing an access only implies the probability of an attack. For simplicity, in our examples we assume these to be
fixed, known values unless otherwise stated. Standard gains
and costs (negative gains) are kept separate from damage,
which is assumed to be associated with rare and costly
negative impacts; this distinction becomes important, when we
consider risk measures in Sections IV-B and IV-C and/or when
considering probability distributions over these values. The
impacts presented in Table I (a) are not the most general that
could be conceived, but are sufficiently rich to be of interest.
(a) Outcome gains and damage

              Request Valid          Request Invalid
              gain      damage       gain      damage
  allow       g         0            0         dA
  deny        0         dD           0         0
  defer       g − c     0            −c        0

(b) Expected utilities

              U(·|allow, p)              U(·|deny, p)
  allow       E(pg − (1 − p)dA)          E((1 − p)g − p dA)
  deny        E(−p dD)                   E(−(1 − p)dD)
  defer       E(pg − c)                  E((1 − p)g − c)

TABLE I
WHEN GAINS, COSTS AND DAMAGES ARE INCURRED AND THEIR INFLUENCE ON UTILITY.
For example, there may be a gain associated with denying
an invalid request, e.g., an increase in system reputation, or a
damage in allowing a valid request, e.g., increasing the number
of copies of a document.
All of the approaches described in this paper make a trade-
off between the uncertainties in classification and in estimation
of these gains and damages. To apply our methodology to a real example, we must be able to (approximately) compute
these gains and damages or find probability distributions over
them. We note here that there are many applications and
domains where these gains and damages can be estimated
either empirically or analytically. For example, in financial
service transactions like credit cards, these costs may be
estimated easily or provided a priori by some agreement.
Other examples include cases where fine grained service level
agreements are defined a priori. We discuss further details in
Section IV-D.
A. Expected Utility
To aid this decision, the proposer attempts to classify the
request, (s, o, a), to determine whether it should be allowed
or denied, and includes some measure of its uncertainty in this
classification. In the simplest case, the proposer can return one
of two kinds of answers, e.g. (allow, p) or (deny, p). The first
response, (allow, p) means that the proposer has classified the
request as an allow with probability p that this is correct, i.e.
that the request is valid. Likewise, (deny, p) means that the
proposer has classified the request as a deny with probability
p that the request is invalid.
The expected utility for a decision, dec ∈ Ddef , is the
expected gain of each outcome (request valid or invalid)
minus the expected damage, and these utilities are shown in
Table I(b). In this approach, we assume a risk neutral posture, meaning deferrable decisions are based purely on the expected
utility, U (·), for each decision. The assessor simply returns the
decision corresponding to the highest utility.
There are cases where the assessor may not defer but
return a decision contrary to the proposer’s. For example,
when the proposer returns (allow, p) given a request where
dA ≫ dD, dD < c and g ≈ c, then it may occur that U(defer) < U(deny). Since our goal is to execute the central
policy as faithfully as possible, our system always defers to
the central PDP in these cases. Hence, given a request req and
a decision proposer δ returning a decision and a probability,
the expected utility assessor, ρeu, returns, given a request
req: allow if δ(req) = (allow, p) and U(allow) ≥ U(defer);
deny if δ(req) = (deny, p) and U(deny) ≥ U(defer); defer
otherwise.
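A minimal sketch of this expected-utility rule, assuming fixed, known values of g, dA, dD and c for the request at hand:

```python
# Sketch of the expected-utility assessor (Table I(b)).
def expected_utility_assessor(dec, p, g, d_A, d_D, c):
    """dec is the proposer's guess ('allow' or 'deny'); p is the probability
    that this guess is correct. Returns 'allow', 'deny' or 'defer'."""
    # Probability the request is valid (would be allowed by the central PDP).
    p_valid = p if dec == "allow" else 1.0 - p
    u_allow = p_valid * g - (1.0 - p_valid) * d_A
    u_deny = -p_valid * d_D
    u_defer = p_valid * g - c
    # Never contradict the proposer locally: only enact its own guess,
    # otherwise defer to the central PDP.
    if dec == "allow" and u_allow >= u_defer:
        return "allow"
    if dec == "deny" and u_deny >= u_defer:
        return "deny"
    return "defer"
```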
Thus in this first approach, we directly use the certainty of
the decision to compute the expected utility and choose the
decision with the highest utility. However, this approach can
lead to rare but significant damages due to incorrect decisions:
while the probability of an incorrect decision may be small
enough to produce a small expected damage, the magnitude of
that damage can be significant. In the following approaches we
will try to explicitly account for the risk of such significant
damage.
B. Risk Adjusted Utility
Now let’s assume that we are making the same decision,
but that we want to use a risk based measure. In other
words, we want to explicitly account for uncertainty in the
proposal. For this, the proposer returns a beta distribution,
describing uncertain knowledge about the probability of a
correct classification. If the proposer classifies the request as
valid, then it returns (allow, α, β), where α, β ∈ [1, ∞] are
the parameters of the beta distribution [22]. Likewise, if it
classifies the request as invalid, then it returns (deny, α, β).
Again we focus just on the allow case, and assume that
the assessor can return either allow or defer. If the proposer
returns (allow, α, β), we denote by p = α/(α + β) (the expectation
of the beta distribution) the expected probability that this
decision is correct. The risk adjusted utility assumes that any
uncertainty in the utility is dominated by the damage term; this
can be the case when the damage is fixed, known and very large (the
case we examine here), or if there is a probability distribution
over damage with high variance. Taking advantage of this
assumption, we use the expected values of gain and cost,
but are pessimistic when considering the impact of damage
and use the expected shortfall at some significance value of
n. For example, a 5% expected shortfall (n = 0.05) uses
the average damage in the worst 5% of universes. The risk
adjusted utility for various decisions and proposer outputs are
shown in Table II (a).
For our example of a fixed, known value of damage dA
and given proposal (allow, α, β), we can simplify further by
using the pessimistic probability, p̃↓n, which is essentially the
expected shortfall of the probability, p, given α and β: compute
the value x such that the incomplete beta function of x equals n;
p̃↓n is then the expected value of p conditioned on p < x.
This corresponds to taking a pessimistic view of the evidence,
imagining there is a higher probability of incurring damage
than would normally be expected. Therefore the probability
p̃↓n < p = α/(α + β) takes a pessimistic view of how certain the
proposer is to have classified correctly when considering the
impact of damage, and U(allow) ≈ pg − (1 − p̃↓n)dA.
Given a request req and a proposer δ returning a pro-
posal δ(req), the risk adjusted utility assessor, ρau, returns,
(a) Risk adjusted utilities

              U(·|allow, α, β)               U(·|deny, α, β)
  allow       E(pg) + ESn(−(1 − p)dA)        –
  deny        –                              ESn(−(1 − p)dD)
  defer       E(pg − c)                      E((1 − p)g − c)

(b) Risks

              R(·|allow, α, β)               R(·|deny, α, β)
  allow       −ESn(pg − (1 − p)dA)           –
  deny        –                              −ESn(−(1 − p)dD)
  defer       −ESn(pg − c)                   −ESn((1 − p)g − c)

TABLE II
RISK ADJUSTED UTILITIES AND RISKS FOR VARIOUS DECISIONS GIVEN PROPOSER OUTPUTS.
given a request req: allow if δ(req) = (allow, α, β) and
U(allow) ≥ U(defer); deny if δ(req) = (deny, α, β) and
U(deny) ≥ U(defer); defer otherwise. This approach differs
from the risk-neutral approach of the previous section in that
it is risk averse, which mitigates against very large damages.
C. Independent Risk Constraints
Our second risk method, called independent risk constraints, uses the standard expected utility to rank the decisions in
order of risk neutral preference, and then reject any which do
not satisfy our risk constraint(s). The proposer classifies the
requests and returns either (allow, α, β) or (deny, α, β), and
we calculate the expected utilities as they appear in Table I (b)
using p = α/(α + β) for the probability of allow, and p = α/(α + β)
for the probability of deny. The risk, R(·), for each deferrable
decision is calculated independently. The risk for a decision,
dec, takes a pessimistic view of the utility, and is the expected
shortfall, at significance n, of the inverse of the utility. The
risks for each decision and proposer output are shown in Table
II (b).
To simplify, as we did for the risk adjusted utility, we could
assume large fixed, known damages, and approximate the risk
using the pessimistic probability. For example, given proposal
(allow, α, β), we calculate the risk as R(allow) ≈ (1 − p̃↓n)dA.
The decision making process involves either a fixed thresh-
old t > 0, or more generally for each deferrable decision,
dec ∈ Ddef , with utility, udec, a threshold that depends on the
utility, i.e., t(udec). We then take the following steps:
1) Find expected utilities and risks as shown in Tables I (b)
and II (b).
2) Rank the decisions according to their utilities.
3) Check each decision, dec ∈ Ddef in turn (highest utility
first), and choose the first decision whose risk is below t(udec).
For example, assume that probability p = α/(α + β) is suffi-
ciently close to 1 that allow is preferable to defer. Notice
also that defer carries zero risk and will always be risk
acceptable. Given request req and decision proposer δ re-
turning proposal δ(req), the independent risk assessor, ρic,
returns, given a request req: allow if δ(req) = (allow, α, β),
U(allow) ≥ U(defer) and R(allow) ≤ t(U(allow)); deny
if δ(req) = (deny, α, β), U(deny) ≥ U(defer) and
R(deny) ≤ t(U(deny)); defer otherwise.
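A minimal sketch of this ranking-and-filtering procedure for an allow proposal with fixed, known damage dA (the threshold function and the numbers are illustrative):

```python
# Sketch of the independent-risk-constraints assessor for an (allow, alpha, beta)
# proposal: rank decisions by expected utility and pick the first whose risk is
# below the threshold t(u); defer carries zero risk here.
def independent_risk_assessor(a, b, g, d_A, c, t, n=0.05):
    p = a / (a + b)
    p_down = pessimistic_probability(a, b, n)    # from the sketch above
    candidates = {
        "allow": (p * g - (1.0 - p) * d_A,       # expected utility (Table I(b))
                  (1.0 - p_down) * d_A),         # approximate risk (Table II(b))
        "defer": (p * g - c, 0.0),               # deferring is risk free
    }
    # Highest-utility decision first; the first one whose risk passes wins.
    for dec, (u, r) in sorted(candidates.items(), key=lambda kv: -kv[1][0]):
        if r <= t(u):
            return dec
    return "defer"

decision = independent_risk_assessor(a=40, b=2, g=10.0, d_A=100.0, c=1.0,
                                     t=lambda u: 5.0)   # hypothetical threshold
```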
More generally still we may not always have a decision
with zero risk, and hence we may need some way to resolve a
decision when all decisions exceed the risk threshold. Further,
one may wish to be able to override a decision whose risk
is greater than t(udec) by compensating for the additional
aggregate risk. For example, additional risk mitigating mea-
sures, risk escrow services, or bounding the aggregate risk
and limiting its distribution [21] have been proposed. This
is an orthogonal research problem and the most appropriate
technique is a subject of future research.
Both risk methods have their advantages and disadvantages.
For instance, risk adjusted utility behaves more like standard
expected utilities, and there is always a clearly defined decision
for the assessor. Conversely, the independent risk assessor
keeps the two measures, utility and risk, separate; it is more
flexible, and we could define multiple risk thresholds at different
significance levels. However, we do not necessarily know how
to deal with situations where all decisions violate the risk
constraint(s).
D. Estimating Gain and Damage with Uncertainty
The instantiation of the parameters in order to address a
real-world situation is a complex task, as it requires one to
evaluate the damage and utility associated with access control
entities. In some instances, all values are easily derivable and
in a common set of units. For example, when processing credit
card transactions, all values are in currencies, and rates are
agreed upon a priori through formal contracts. Consider the
Visa Debit Processing Service [25], [26]: the damage from a
false allow is the transaction amount, plus a Chargeback Fee,
e.g., a fixed amount of $15 or a percentage, e.g., 5%; the gain
is the profit margin for the items sold, minus any processing
fees, e.g., 20% the value of the transaction; the contact cost is
an Authorisation Fee, a fixed value to process the transaction or a percentage (or both), e.g., $1 or 3%; and the false deny can
be estimated as a fixed fraction of the gains from customers
that do not have alternative forms of payment.
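As an illustration, the parameters for a single $100 transaction might be derived from the example rates above (all figures, including the false-deny fraction, are indicative assumptions rather than real fee schedules):

```python
# Illustrative parameter estimates for a $100 credit-card transaction.
amount = 100.00
gain = 0.20 * amount                       # g: profit margin minus processing fees
damage_false_allow = amount + 15.00        # d_A: transaction amount plus chargeback fee
contact_cost = max(1.00, 0.03 * amount)    # c: authorisation fee (fixed or percentage)
damage_false_deny = 0.10 * gain            # d_D: assumed fraction of lost customer gain
```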
In many scenarios, the actual value of gains, losses, and
damages may be given in different units, e.g., currency,
reputation, and physical assets, and may themselves be uncer-
tain. Our discussion above assumes these measures are point
estimates in a common set of units, for example currency;
converting to such a common unit may itself introduce further uncertainty.
We suggest the same techniques used to calculate the risk
of making an incorrect decision described in Section IV-B can
be applied to selecting conservative, risk-adjusted conversion
rates and handling the uncertainty in value distributions. If one obtains a probability density function (PDF) over values, say
human life to dollars, the expected value can be used for gains,
while a pessimistic value can be used for damages, resulting in
conservative estimates. For example, if we take US household
income as an estimator, then the median income, around $34–
55K, serves as the expected value, and the top 5%, $157K, might be used
for the risk.
Clearly, our method will work best when these values are
known, such as in the credit card processing use case. When these
values have distributions over some ranges, our approach is
sound as long as such distributions are conservative, over-
estimating damages and losses while under-estimating gains.
Further, note that we are being conservative here by taking the
pessimistic probability and the pessimistic damage (the worst n% of the worst n%).
V. EXPERIMENTS
In this section, we describe our prototype system and experimental results. To evaluate risk-based decision mak-
ing, we devise a simple decentralized access control system
consisting of a single local risk-based policy enforcement
and decision point, and a central policy decision point that
we call the oracle. Users submit requests, which are locally
evaluated, and an estimate of the uncertainty calculated as
described in Section IV-B. The system evaluates potential
risks and gains, and determines when decisions can be made
locally, and when the oracle must be queried at a cost. We
simulate a large number of requests, e.g., 100,000, logging the
request, decisions, and uncertainty. Afterwards, we compare
each decision with the oracle, and aggregate the net utility of
the system, including any gains, costs and damages incurred during operation. We first describe our experimental system
in more detail. For clarity, we distinguish between standard
damage (of a false-allow) and loss (the damage of a false-deny).
A. Experiments
The oracle is considered infallible, i.e., it always returns
correct results, and contains a real access control policy from
a system that provisions administrative access to a commercial
system obtained from a corporation. The policy consists of
881 permissions and 852 users, with 25 attributes describing
each user; there are 6691 allowed requests. As we do not
have access to actual permission usage logs, e.g., requests or
frequency of permission use, we simulate incoming requests by sampling from the full policy, generating both valid (allow)
and invalid (deny) requests. Users are selected uniformly
randomly, and for permissions we first determine whether
the request is valid with input probability, pv, then if valid
(invalid) choose from the pool of valid (invalid) requests in
an unbiased way. This allows us to generate sample requests
with an arbitrary frequency of valid transactions, e.g. 30%,
50% or 80%, and hence to model different scenarios, such
as a system under attack (mostly invalid requests), or normal
usage (mostly valid requests).
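A minimal sketch of this request generator (the pool data structures are assumptions about how the policy might be indexed):

```python
# Sketch of the request generator described above: users are drawn uniformly and
# each request is valid with probability p_v. valid_perms / invalid_perms are
# assumed dicts mapping each user to the permissions they do / do not hold.
import random

def sample_request(users, valid_perms, invalid_perms, p_v):
    user = random.choice(users)
    if random.random() < p_v:
        return (user, random.choice(valid_perms[user])), True     # allowed request
    return (user, random.choice(invalid_perms[user])), False      # denied request
```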
B. Classifiers
Our experimental prototype is implemented in Python, and
uses PyML (a wrapper for libSVM) for the support vector machine [27] that serves as
our classifier. A support vector machine (SVM) is a super-
vised learning method for classification and regression. Given
labeled points in a k-dimensional space, an SVM finds a
hyperplane (normal vector w and offset b from the origin) that
separates the two classes and maximizes the minimal distance
between any point and the hyperplane. The class of an unseen
point x is determined by which side of the hyperplane it is on,
i.e., sign(w · x − b) [27]. See Figure 2. An SVM is a linear
classifier; however, a kernel can be applied first to support non-
linear classification. Kernels may result in overfitting when
the number of input datapoints is too small, e.g., less than or
equal to the degree of a polynomial kernel. To prevent this, we
do not train the SVM until a sufficient number of data points
are available. We use both polynomial and Gaussian kernels.
Fig. 2. A Support Vector Machine classifier and (α, β) calculation for uncertainty.

Intuitively, there is less uncertainty in a prediction the farther
away a point is from the hyperplane, a distance given by
|w · x − b|, and the uncertainty is highest between the
support vectors (the points closest to the hyperplane). We
define an (αi, β i) value for each band (hyper-volume)
corresponding to the point i in our cached training set.
Any point correctly classified closer to the hyperplane
increments αi, while each point farther away incorrectly
classified increments β i. For example, in Figure 2, the
incorrectly predicted point k will increment β j , while
correctly classifying a new point x would increment αi (and
points farther away). The α and β values are only valid
within their respective bands, inclusively. We use the points
in the training set as our initial sample, and further update these quantities when we defer to the oracle if the uncertainty
is too high. When we retrain, all α, β values are reset. The
hyperplane itself has α = 1, β = 1.
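A minimal sketch of this per-band bookkeeping, under one reading of the update rule described above (band keys and the update directions are our interpretation):

```python
# Sketch of the per-band (alpha, beta) bookkeeping: bands are keyed by their
# distance from the hyperplane; a correct classification at distance d is
# taken to support all bands at d or farther, while a misclassification at
# distance d penalises all bands at d or closer.
def update_bands(bands, distance, correct):
    """bands: dict mapping band distance -> [alpha, beta]."""
    for band_distance, ab in bands.items():
        if correct and band_distance >= distance:
            ab[0] += 1          # alpha: evidence the band's predictions are right
        elif not correct and band_distance <= distance:
            ab[1] += 1          # beta: evidence the band's predictions are wrong

bands = {0.0: [1, 1]}           # the hyperplane itself starts at alpha = beta = 1
```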
In our setting, the data points represent users. We map a
user to a k-dimensional space by converting their attributes
into binary vectors. For example, consider a Department
attribute of a company that has three values: R&D, Sales,
and Marketing. These are represented by the binary vectors
(0, 0, 1), (0, 1, 0), and (1, 0, 0), respectively. The final k-bit
vector for a user is the concatenation of all attributes. In the
remainder of this paper when we refer to a subject s ∈ S , we
are referring to a point in some k-dimensional space.
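A minimal sketch of this attribute encoding (attribute names and value ordering are illustrative):

```python
# Sketch of the attribute-to-binary-vector mapping: each categorical attribute
# is one-hot encoded and the per-attribute vectors are concatenated.
def encode_user(user, attribute_values):
    """attribute_values: dict attribute -> ordered list of possible values."""
    vector = []
    for attribute, values in attribute_values.items():
        vector.extend(1 if user.get(attribute) == v else 0 for v in values)
    return vector

attrs = {"Department": ["R&D", "Sales", "Marketing"]}
print(encode_user({"Department": "Sales"}, attrs))   # -> [0, 1, 0]
```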
Finally, there are |O × A| SVM classifiers, one per permis-
sion, trained on subjects s ∈ S. An alternative implementation
is to define a single classifier whose input is S × O × A; how-
ever, based on the results from [6], some object-right requests
ever, based on the results from [6], some object-right requests
have more uncertainty than others. To simplify calculating the
uncertainty of the classified requests, we choose to define a
classifier per permission.
Each PDP based on an SVM will store a small number
of data points (subjects and decisions) in a buffer for later
retraining. Before predicting the decision using an SVM, we
first check to see if the point is in the cached buffer. When it is,
we return the known decision with zero uncertainty (assuming
a static policy). When a new input and decision is specified
by querying the oracle, it is provided to the SVM so that it
may be retrained. When the number of samples exceeds the
size of the buffer, a new SVM is trained and the correctly
classified point projected farthest from the hyperplane is discarded.
Thus the buffer contains the points that most closely define the
support vectors. We use polynomial (degree four) and radial
basis function (RBF) kernels in our experiments.
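A minimal sketch of the local prediction path, using a fitted scikit-learn SVC as a stand-in for the PyML classifier in the prototype:

```python
# Sketch: return a cached decision for a point already answered by the oracle,
# otherwise fall back to the trained SVM and report its margin.
def local_decision(x, cache, svm):
    """Return (decision, distance), where distance is |w . x - b|; cached
    points are treated as exact (previously answered by the oracle)."""
    key = tuple(x)
    if key in cache:
        return cache[key], float("inf")              # exact, zero uncertainty
    dec = svm.predict([x])[0]
    margin = abs(svm.decision_function([x])[0])      # distance from the hyperplane
    return dec, margin                               # larger distance -> lower uncertainty
```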
We implement several baseline PDPs to compare against
the risk-based PDP: a local PDP that will always defer to
the central PDP (Default PDP), and a first-in-first-out cache
(FIFO Cache), such that if the request is not in the cache, the
oracle is queried. We use two FIFO cache sizes: (E) the size
of the FIFO cache is equal to the total memory of the SVMs;
(Inf) the FIFO cache is unbounded.
We also implement three learning-based PDPs: the first one
(SACMAT’09) is a PDP consistent with the approach taken
by Ni et al. [6], where we train an SVM for each permission and
accept any decision it returns, i.e., we assume there is zero
uncertainty. The second one (Seeded SVM) is a PDP where
an SVM is constructed for each permission and seeded with n
random users granted the permission, and m random users not
granted the permission. We train and test on the n + m users
and we use n = m = 10. Finally, the third one (Unseeded
SVM) uses an untrained SVM created for each permission.
Before the SVMs can be trained a minimum number of allow
and deny decisions must be specified by querying the oracle.
The buffer of an unseeded SVM is the same as a seeded SVM.
We note that our approach is not tied to any specific
classifier, and we introduce a “black box” classifier as an
example. This classifier randomly selects a level of uncertainty,
and then returns a correct or incorrect decision sampled from that accuracy. The performance of our experiments further
reinforces the validity of our assumption that similar requests
receive similar decisions.
C. Evaluation Results
In our experiments, we first compare our general approach
with a default policy that always contacts the central PDP, and
the two FIFO caches: the bounded-size cache (the same size as all
SVMs combined) and the unbounded one (see the previous section
for details). For these experiments we restrict ourselves to
the seeded SVM settings and evaluate the results against the
scenarios described in Figure 3 (cc = the cost of contacting
the central PDP; g = the gain for a true-allow; d = the damage
of a false-deny; l = the loss of a false-allow. Due to space
limitations we only show tables where true-denies have 0 gain
and 0 loss). The figure shows that, regardless of the parameters,
our approach behaves well, i.e., calls to the central PDP are
reduced (in some cases as much as 75% – see Figure 3(a))
and the utility is better than in any of the other standard
approaches. This confirms the independence of our approach
from the setting of the parameters, making it applicable to
many situations.
Fig. 3. Evaluation under different system parameters (utility vs. transaction number): (a) small cc, g = 2cc, d = l = 2g; (b) small cc, g = 4cc, d = 10g, l = 0; (c) small cc, g = 10cc, d = 2cc, l = 10g.
Fig. 4. Risk-based classifiers outperform traditional caching with high cache-miss rates (central queries and utility vs. transaction number, for the Default PDP, Seed(1,1) SVM, Seed(n,m) SVM, Simple FIFO (E) and Simple FIFO (Inf)): (a) 30% valid requests; (b) 50% valid requests.
Next, we consider the performance impact of the ratio of
invalid versus valid requests on the classifiers. We evaluate
this for two reasons. First, the ratio in a typical deployment
is unknown. Second, [6] shows that SVMs result in more
false-denies than false-allows, so varying the ratio may reveal
useful insights. We found that the sampling rate of valid versus
invalid requests impacts the rate of cache hits for the FIFO
implementations greatly affecting their performance. When the
number of cache hits was low, for example, by sampling more
invalid requests, we explore a larger amount of the possible
request space, resulting in more cache misses before a hit is
returned. Conversely, at a high rate of valid requests, a larger
fraction of the valid sampled requests reside in the cache (in
part due to the smaller sampling size because the fraction of
allowed requests is so low). The higher number of false-denies
also results in increased uncertainty, causing the SVMs to be
retrained more frequently, and thus deferring more requests.
The results are shown in Figure 4.

Next, we consider the potential impact of trusting the SVM
classifiers without first evaluating their uncertainty, as is done
in [6]. Here, the false-allows and false-denies can result in
devastating losses. To illustrate this, we use the decision
resolver (SACMAT’09) described earlier which accepts any
predicted decision, regardless of the risk. Figure 5 illustrates
the potential decrease in utility from a small number of false-
positives in the classifier.
Finally, we evaluate how well the systems perform when
Fig. 5. The impact of not evaluating the classifier risk.
the communication costs become prohibitive. Here, we set
the cost to contact the oracle to be five times the gains
from allowing a valid request, and set the damage to be
four times the communication costs. This type of scenario
may correspond to a military setting where communication is
expensive, e.g., requiring a satellite link, or where radio usage
should be minimized to avoid triangulation by the enemy. The
results, shown in Figure 6, illustrate that while the risk-based
approach outperforms the FIFO caches and reduces queries by 65%, it
cannot keep the utility positive. However, it should be noted
that the FIFO caches lose over three times more utility due to
the communication costs.
Fig. 6. Communication cost is 5 times the gains, and damages are 10 times the gains.
VI . CONCLUSION
This paper presents a new learning and risk-based archi-
tecture for distributed policy enforcement under uncertainty.
The trade-off between the uncertainty and utility of decisions
determines whether they can be taken locally or the central
PDP queried. We present three approaches, Expected Utility,
Risk Adjusted Utility, and Independent Risk Constraints, each
representing a different treatment of risk.
Our approach is validated using data from a large corpo-
ration, and consistently performs better than a naive caching
mechanism, in a number of different scenarios. Under ideal conditions, the rate of central PDP queries is reduced by up to
75% and an eight-fold increase in system utility is seen. Such
improvements occur when the cache hit rate is low.
We plan to compare our method with the inference tech-
niques cited in Section II. Dynamic access control policies
could also be considered, where the correct decision for each
request can change over time. In this context, the uncertainty
returned by the proposer must combine uncertainty due to
classifier error with uncertainty due to possible policy updates.
Finally, additional research is needed to better determine when
a classifier should be retrained and when a given sample
should be discarded, which is particularly challenging when dealing
with dynamic policy changes.
ACKNOWLEDGMENT
This research was sponsored by the U.S. Army Research Laboratory
and the U.K. Ministry of Defence and was accomplished under Agreement
Number W911NF-06-3-0001. The views and conclusions contained in this
document are those of the author(s) and should not be interpreted as repre-
senting the official policies, either expressed or implied, of the U.S. Army
Research Laboratory, the U.S. Government, the U.K. Ministry of Defence
or the U.K. Government. The U.S. and U.K. Governments are authorized to
reproduce and distribute reprints for Government purposes notwithstanding
any copyright notation hereon.
REFERENCES
[1] Entrust, “Getaccess design and administration guide,” September 1999.[2] G. Karjoth, “Access control with IBM Tivoli Access Manager,” ACM
Trans. Inf. Syst. Secur., 2003.[3] Netegrity, “Siteminder concepts guide,” 2000.[4] J. Crampton, W. Leung, and K. Beznosov, “The secondary and approxi-
mate authorization model and its application to Bell-LaPadula policies,”in Proceedings of 11th ACM Symposium in Access Control Models and
Technologies, 2006.[5] Q. Wei, J. Crampton, K. Beznosov, and M. Ripeanu, “Authorization
recycling in rbac systems,” in Proceedings of the 13th ACM symposiumon Access control models and technologies, 2008.
[6] Q. Ni, J. Lobo, S. B. Calo, P. Rohatgi, and E. Bertino, “Automatingrole-based provisioning by learning from examples,” in SACMAT , 2009.
[7] L. LaPadula and D. Bell, “Secure Computer Systems: A MathematicalModel,” Journal of Computer Security, 1996.
[8] D. F. Ferraiolo and D. R. Kuhn, “Role-based access control,” inProceedings of the 15th National Computer Security Conference , 1992.
[9] D. F. C. Brewer and M. J. Nash, “The Chinese Wall Security Policy,”in Proceedings of the IEEE Symposium on Security and Privacy, 1989.
[10] P. Bonatti and P. Samarati, “Regulating service access and informationrelease on the web,” in Proceedings of the 7th ACM conference onComputer and communications security, 2000.
[11] L. Wang, D. Wijesekera, and S. Jajodia, “A logic-based framework for attribute based access control,” in Proceedings of the 2004 ACM workshop on Formal methods in security engineering, 2004.
[12] Y. Park, S. C. Gates, W. Teiken, and S. N. Chari, “System for automaticestimation of data sensitivity with applications to access control andother applications,” in Demo at SACMAT’11, 2011.
[13] K. Borders, X. Zhao, and A. Prakash, “Cpol: high-performance policyevaluation,” in Proceedings of the 12th ACM conference on Computer and communications security, 2005.
[14] M. Wimmer and A. Kemper, “An authorization framework for sharingdata in web service federations,” in Secure Data Management , 2005.
[15] Q. Wei, M. Ripeanu, and K. Beznosov, “Cooperative secondary autho-rization recycling,” in Proceedings of the 16th international symposiumon High performance distributed computing, 2007.
[16] M. V. Tripunitara and B. Carbunar, “Efficient access enforcement indistributed role-based access control (RBAC) deployments,” in Proc. of the 14th ACM Symp. on Access control models and technologies, 2009.
[17] A. Rosenthal and E. Sciore, “Administering permissions for distributeddata: factoring and automated inference,” in Proceedings of the fifteenthannual working conference on Database and application security, 2002.
[18] S. Rizvi, A. Mendelzon, S. Sudarshan, and P. Roy, “Extending queryrewriting techniques for fine-grained access control,” in Proc. of the2004 ACM SIGMOD international Conf. on Management of data , 2004.
[19] L. Zhang, A. Brodsky, and S. Jajodia, “Toward Information Sharing:Benefit And Risk Access Control (BARAC),” 7th IEEE Intl. Workshop
on Policies for Distributed Systems and Networks (POLICY’06) , 2006.[20] P.-C. Cheng, P. Rohatgi, C. Keser, P. A. Karger, G. M. Wagner, and A. S.
Reninger, “Fuzzy multi-level security: An experiment on quantified risk-adaptive access control,” in IEEE Symp. on Security and Privacy, 2007.
[21] I. Molloy, P.-C. Cheng, and P. Rohatgi, “Trading in risk: Using marketsto improve access control,” in NSPW , 2008.
[22] C. M. Bishop, Pattern Recognition and Machine Learning (InformationScience and Statistics). Springer, 2007.
[23] I. Molloy, N. Li, J. Lobo, Y. A. Qi, and L. Dickens, “Mining roles withnoisy data,” in SACMAT’10, 2010.
[24] M. Frank, A. Streich, D. Basin, and J. Buhmann, “A probabilisticapproach to hybrid role mining,” in CCS ’09: Proceedings of the 16th
ACM conference on Computer and communications security, 2009.[25] “Visa Debit Processing Service, Transac-
tion Processing, Authorization Processing,”http://www.visadps.com/services/authorization processing.html.
[26] “Card-Present, Merchants, Visa USA,”http://usa.visa.com/merchants/risk management/card present.html.
[27] V. N. Vapnik, Statistical Learning Theory. Wiley-Interscience, Septem-ber 1998, ISBN 9780471030034.