Risk Assessment and Decision Support for Security Policies and Related Enterprise Operational Processes

Marco Casassa Mont, Richard Brown
HP Laboratories
HPL-2011-12

Keyword(s): Security Policies, Risk Assessment, Decision Support, Access Management, Security Analytics, Modelling, Simulation

Abstract: This paper presents and discusses our work to provide organizations with risk assessment and decision support capabilities when dealing with their strategic security policies. Traditional work in the policy management space primarily focuses on technical languages and frameworks to manage and enforce operational policies. These contributions are important but they do not address strategic decision makers' needs and questions such as: What business and security risks is my organization exposed to, due to the current security policies and related operational processes? How effectively are these policies enforced at the operational level? What is the impact of changing them? We aim at providing strategic decision support in this space by using a rigorous and scientific methodology (and tools) which leverages modeling and simulation techniques. This methodology helps organizations to assess their risk exposure. It factors in policy implementation at the operational level along with relevant threats, processes, interactions and people behaviors. It provides "what-if" analysis by illustrating the consequences of making policy changes and investments. We briefly introduce our methodology and tools and then ground the discussion by illustrating how this approach has been successfully used in a real case study with one of our major customers. This case study focused on the organization's access management processes and related policies: it helped to inform strategic security policies and support changes of current access management processes. Additional work is planned in this space to further validate our approach and build template solutions for different types of organizational policies and processes.

External Posting Date: January 21, 2011 [Fulltext]
Approved for External Publication
Internal Posting Date: January 21, 2011 [Fulltext]

Copyright 2011 Hewlett-Packard Development Company, L.P.


    Risk Assessment and Decision Support for Security Policies and Related Enterprise Operational Processes

    Marco Casassa Mont
    Cloud & Security Lab, Hewlett-Packard Labs
    Bristol, UK
    [email protected]

    Richard Brown
    Cloud & Security Lab, Hewlett-Packard Labs
    Bristol, UK
    [email protected]

    Abstract— This paper presents and discusses our work to provide organizations with risk assessment and decision support capabilities when dealing with their strategic security policies. Traditional work in the policy management space primarily focuses on technical languages and frameworks to manage and enforce operational policies. These contributions are important but they do not address strategic decision makers’ needs and questions such as: What business and security risks is my organization exposed to, due to the current security policies and related operational processes? How effectively are these policies enforced at the operational level? What is the impact of changing them? We aim at providing strategic decision support in this space by using a rigorous and scientific methodology (and tools) which leverages modeling and simulation techniques. This methodology helps organizations to assess their risk exposure. It factors in policy implementation at the operational level along with relevant threats, processes, interactions and people behaviors. It provides “what-if” analysis by illustrating the consequences of making policy changes and investments. We briefly introduce our methodology and tools and then ground the discussion by illustrating how this approach has been successfully used in a real case study with one of our major customers. This case study focused on the organization’s access management processes and related policies: it helped to inform strategic security policies and support changes of current access management processes. Additional work is planned in this space to further validate our approach and build template solutions for different types of organizational policies and processes.

    Security Policies, Risk Assessment, Decision Support, Access Management, Security Analytics, Modelling, Simulation

    I. INTRODUCTION

    Defining strategic policies within organizations is a complex task. Different priorities, trade-offs and viewpoints need to be taken into account by decision makers, including Chief Information Officers and Chief Information Security Officers (CIOs, CISOs), risk managers, business and financial managers, compliance managers, etc.

    This is particularly true when defining strategic security policies: they usually aim at mitigating security risks but they also impact productivity, business availability, compliance, etc. In this paper we specifically focus on these types of policies.

    Various kinds of strategic security policies are of relevance for organizations, including: authentication policies (e.g. access to critical resources requires two-factor authentication); access management policies (e.g. employees must get access to the resources necessary to do their job within 2 days); vulnerability and threat management policies (e.g. 95% of IT systems need to be patched within 30 days); data protection policies (e.g. all sensitive data needs to be encrypted); web access policies; security monitoring policies; policies about access to physical sites.

    In general, once defined, security policies are interpreted by different stakeholders within the organization; refined into enforceable policies; deployed at the operational levels – i.e. within business and IT processes, along with underlying services, applications and systems - by means of security controls. They affect the involved processes. For example, policies dictating how to handle users’ access rights affect user accounts’ provisioning and deprovisioning processes; vulnerability and threat management (VTM) policies affect the processes to deal with patch testing, deployment and monitoring; etc. They also have an impact on employees’ productivity and business agility, e.g. long delays in provisioning access rights to users do not allow them to access the required resources to do their job. VTM policies may disrupt business applications depending on the frequency of patching activities.

    Security decision makers need to assess the risks their companies are exposed to (due to current and foreseeable threat environments) and how current security policies effectively address them; the priorities of various stakeholders and business objectives need to be taken into account; they need to understand the implications, at the operational level, of mandating or changing specific policies; they need to decide which investments (e.g. automation, education, better monitoring/compliance, etc.) are necessary and most suitable in order to support these policies.

    The decision making process in IT security is currently an “ad-hoc” activity: it depends on the expertise and skills of decision makers, their common sense and input received from trusted teams of experts/consultants. However, current trends highlight that the threat environment is becoming more and more dynamic (new vulnerabilities, malware, attacks and misbehaviors are discovered on a daily basis); the budget available for IT and security is shrinking, due to cost cutting and the negative economic climate; decisions need to be fully justified and aligned with business objectives in order to get financial support.

    There is an increased demand for a more rigorous, scientific approach to the security decision making process, to provide evidence that justifies policy decisions and attracts investments. This includes: providing insights about the impact of policy decisions at the operational level, in terms of risks, productivity and compliance; exploring various options in advance by means of rigorous “what-if” analysis.

    This paper describes our work in this space, specifically to provide risk assessment and decision support for security policies and related operational processes. Section II illustrates our scientific approach, based on modeling and simulation techniques and a related methodology. Section III presents how our approach and methodology have been successfully used in a case study with one of our major customers to investigate the risks associated with their access management policies and related processes. Sections IV and V discuss in more detail the modeling and simulation activities involved, along with a description of the outcomes and how they have been used to provide decision support. Sections VI and VII further discuss our approach and related work. Finally, Section VIII draws a few conclusions.

    II. OUR APPROACH TO RISK ASSESSMENT AND DECISION SUPPORT

    Most organizations deal with risk assessment and policy management issues in the context of a wider lifecycle management of security, which includes: risk analysis and assessment; identification of suitable policies to mitigate risks; policy refinement and deployment at an operational level along with security controls; logging, monitoring and auditing (situational awareness); compliance and governance activities, which could trigger further risk assessment activities. This security lifecycle management usually has many gaps and disconnections: different organizational expertise and capabilities are involved at different stages of the lifecycle (e.g. at the business, legal, financial, technical and compliance levels); there are different priorities at different levels; often there are ambiguities and complexity in how to interpret and implement security policies.

    In terms of risk assessment, consultants and decision makers often use standard approaches such as ISO 2700x [6], CoBIT [7], etc. These approaches are based on standard security criteria and common sense: they provide generic guidelines and coarse-grained analysis to decision makers, who still need to instantiate them to their specific needs and operational environments.

    Our work, also referred to as Security Analytics, aims at addressing these issues by providing a rigorous, scientific approach to risk assessment and decision support. It complements current standards (e.g. ISO 2700x) by potentially leveraging their outcomes and further grounding the analysis, by factoring in details at the process, system and human levels. It is based on probabilistic modeling and simulation techniques [2,8,9]; it can also factor in economic aspects to analyse trade-offs (e.g. security vs. productivity or security vs. costs) [5].

    Figure 1 illustrates the typical steps involved in the Security Analytics methodology:

    1. Identification of a specific problem, usually at the strategic level, e.g. explore risk exposure due to specific security policies and their operational implementation and/or understand the impact of policy changes;

    2. Identification of metrics and measures that are suitable to convey the answers;

    3. Building discrete-event probabilistic models [8] which factor in relevant policies, threat environments, business and IT processes, systems, human behaviors and relevant cause-effect relationships. These models are instantiated based on empirical data gathered from the ground. They can reflect the current situation and/or be based on assumptions for what-if analysis;

    4. Carrying out Monte Carlo simulations with the models, to obtain statistically significant data;

    5. Analysis of experimental results to provide answers to the originally stated problem, etc.;

    6. Validation of models and outcomes against expected behaviors and initial empirical data.

    [Figure 1: Problem Definition, Empirical Data Gathering, Modelling, Simulation, Outcome Analysis, Validation]

    Figure 1. Our Security Analytics Methodology

    At HP Labs we have developed a set of tools to support the entire Security Analytics methodology, specifically to deal with: the graphical authoring of probabilistic models; their mapping into a process-based, discrete event modeling language, GNOSIS [1,2]; the planning and execution of Monte Carlo simulation experiments to gather statistically significant results; the drawing of graphical statistical results.

    Let’s briefly consider an example, in the context of Vulnerability and Threat Management [3], to quickly illustrate the Security Analytics methodology. A security policy might mandate that: “All IT systems within the organization must be patched or risk mitigated within 30 days, against known vulnerabilities”.

    At the operational level, a few processes might have been put in place by the organization to track vulnerabilities, test the relevant patches and eventually deploy them within the IT systems. Additional mitigation controls might be available, in case no patches are available, such as locking down devices and/or deploying Host Intrusion Prevention Systems (HIPS).

    The Chief Security Officer (CISO/CSO) - i.e. the decision maker - might want to know how effectively this policy is enforced and what the actual “risk exposure” is.

    By applying the Security Analytics approach, we can use “the time required to patch 95% of systems” as a viable metric to convey the level of risk exposure. We build a model of both the threat environment (i.e. frequency of vulnerabilities, malware and patches) and the current patching and vulnerability management processes in place within the organization. Monte Carlo simulations of the model can be carried out to generate a statistical distribution of the agreed metric. Results might highlight that, for example, on average it takes more than 50 days to patch 95% of the systems. This provides scientific evidence to the decision maker about how effectively the current security policy is implemented, the level of risk exposure and indications about potential causes (e.g. a patch testing process that takes too long and/or not enough available personnel). Our model can then be used to carry out “what-if” analysis, for example to verify whether the initial security policy can be enforced by increasing the number of personnel and/or by investing in additional security controls. Alternatively, the outcomes can be used to inform more realistic security policies, i.e. policies that can be achieved with current organizational means and resources (and justify the acceptance of the additional security risks).
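The kind of experiment described above can be sketched in a few lines of Python. This is only an illustrative toy, not the GNOSIS model used by the authors: the function names, the number of systems and the testing and deployment delay ranges are all hypothetical assumptions.

```python
import math
import random
import statistics


def days_to_patch_95pct(rng, n_systems=200, test_days=(5.0, 20.0),
                        deploy_days=(1.0, 40.0)):
    """One simulated patch cycle: days until 95% of systems are patched.

    Hypothetical model: the patch is tested once (uniform delay), then each
    system is patched after its own independent uniform deployment delay.
    """
    testing = rng.uniform(*test_days)
    completion = sorted(testing + rng.uniform(*deploy_days)
                        for _ in range(n_systems))
    # Index of the system whose patching brings coverage to 95%.
    idx = math.ceil(0.95 * n_systems) - 1
    return completion[idx]


def run_experiment(runs=1000, seed=42, **params):
    """Monte Carlo: repeat the patch cycle many times with a fixed seed."""
    rng = random.Random(seed)
    return [days_to_patch_95pct(rng, **params) for _ in range(runs)]


if __name__ == "__main__":
    samples = run_experiment()
    policy_days = 30  # deadline mandated by the example policy
    share = sum(s <= policy_days for s in samples) / len(samples)
    print(f"mean days to patch 95%: {statistics.mean(samples):.1f}")
    print(f"share of runs meeting the {policy_days}-day policy: {share:.1%}")
```

A "what-if" run then amounts to changing one assumption, for example `run_experiment(deploy_days=(1.0, 15.0))` to mimic extra deployment staff, and comparing the resulting distributions.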

    We have successfully applied this methodology in various security areas, in collaboration with customers, including: VTM [3], Identity and Access Management (IAM) [4,5] and Data Protection.

    The remaining part of this paper focuses on a recent case study to further ground the Security Analytics concepts and illustrate, in more detail, the kind of risk assessment and decision support our methodology can provide. The case study has been carried out, jointly with one of our major customers, in the space of Access Management.

    Please note that the scenario details, actual policies and processes, empirical data and the results/findings have been fully anonymised, for confidentiality reasons.

    III. CASE STUDY: ACCESS MANAGEMENT

    The focus of this case study is on the organization’s access management processes and related policies. This area is perceived as being critical as it exposes organisations to various security risks: depending on how users’ access rights are allocated and removed, credentials could be misused, either accidentally or for criminal purposes.

    Jointly with the customer (also referred to in the paper as the “decision maker”) we agreed to apply Security Analytics to the following problem: what is the risk exposure of the organization due to their current access management policies and their implementation at the operational level? What would be the consequences of changing these policies and further investing in IAM automation?

    Access management affects employees, business services, applications and systems within the customer’s organisation. Carrying out a comprehensive risk assessment across the overall organization was beyond the purposes of the case study. We decided to demonstrate the value of Security Analytics by focusing on a critical business service: the organisation’s Customer Relationship Management (CRM) service, its hosting systems and its database, along with the involved personnel that access it.

    Our goal was to produce a Security Analytics analysis that could be generalized with templates and re-used in other contexts and services.

    The CRM service is managed in an outsourcing context: its development and maintenance are carried out by an external company. It can be accessed both by the organisation’s employees and by external technicians (from the outsourcing company). The security configuration of the IT systems involved is still performed by the organization’s administrators. Our customer wanted to better understand the level of risk exposure due to their current access management policies and related processes. In case of major issues, they wanted to gather scientific evidence to support requests for policy changes and justify further IAM automation investments.

    A. Analysis of Access Management Policies and Processes

    Various strategic access policies are in place in the organisation, mandating key constraints and objectives on how to deal with the management of user access rights:

    • P1: All users’ accounts and access rights must be approved both by managers and security teams;

    • P2: User accounts should be configured according to best security practices;

    • P3: Managers should immediately notify the security team when their employees leave or change their role;

    • P4: Unnecessary user accounts and access rights should be removed as soon as possible.

    We will refer to these policies in various parts of this paper.

    We identified the two core access management processes that directly implement these policies and are relevant for exploring the organisation’s risk exposure: the Provisioning and Deprovisioning processes. Both of them are instrumental in managing users’ access rights and accounts on IT systems. Failures in these processes can lead to the exploitation and misuse of users’ credentials, including: users accessing resources and information they are not entitled to; credentials being used for criminal purposes. Figure 2 provides a high level overview of the key phases involved in these processes.

    [Figure 2: CRM Access Management Processes. Provisioning of Access Rights to a User (triggers: User Joining, User Changing Role; phases: Approval Phase, Configuration/Deployment Phase; metrics: Time to Provision, # failures, # success, …; failures: Miscommunication, Misconfigurations, …). Deprovisioning of Access Rights from a User (triggers: User Leaving, User Changing Role; phases: Deprovisioning Phase, Configuration/Deployment Phase; metrics: Time to Deprovision, # failures, # success, …; failures: Miscommunication, Misconfigurations, …).]

    Figure 2. Relevant Access Management Processes

    Provisioning processes are in charge of assessing the suitability of users to get user accounts and access rights and granting these access rights. They usually involve an approval phase (by management) followed by a deployment phase, where IT systems are configured. Various failure points were identified, including: failures in implementing Separation of Duties (SoD); IT configuration failures; miscommunications between the involved parties, etc. In addition, the time required to provision user accounts plays a key role in this process, not only in terms of productivity but also in encouraging security misbehaviours. For example, when provisioning times are too long, managers who do not trust the process might hold onto credentials that should otherwise be deprovisioned (e.g. because their employees left), just in case they need to use these credentials for emergency activities. This is an obvious violation of security policies and it increases security risks.

    Deprovisioning processes are in charge of removing access rights and user accounts, when no longer needed by employees (e.g. leavers). They usually include a notification phase carried out by employees’ managers, followed by the removal of access rights. This process might fail at various points, e.g. due to missing notifications or failures to remove user accounts. These failures are critical, as they generate hanging accounts, i.e. accounts that are still in place even though the employee does not need them anymore: they could be misused. The overall risk exposure further increases with long deprovisioning times (and with the type of accounts, e.g. super user accounts), as these accounts would be available for exploitation for a longer time.

    The two phases of handling management approvals or raising notifications are carried out by the organization the employee belongs to (i.e. customer’s organization or external company). On the other hand, security configuration and deployment activities are still run by the customer’s company for all personnel: this might increase the chances of miscommunications between the various parties.

    Of particular concern, in terms of risk exposure, are failures related to privileged users, i.e. IT administrators with root access to sensitive systems and information and/or accounts shared by a team.

    Interestingly, we noticed that the current security policies do not mandate specific deadlines for the provisioning and deprovisioning activities; they do not give directions about the types of controls that need to be put in place. This was a key concern for the customer, also based on employees and managers’ feedback.

    B. Applying Security Analytics

    We applied the Security Analytics methodology to this specific domain. In this context, we identified a set of metrics, to be used to measure their risk exposure:

    • Time to carry out a provisioning and deprovisioning process: as anticipated, productivity issues and security risks can be caused by long provisioning and deprovisioning times;

    • Number of failures in carrying out specific process activities: we specifically focused on failures to configure accounts, to notify the system about people leaving and to remove accounts;

    • Number of successes in carrying out various provisioning steps: this is an indicator of the reliability of the processes.

    The following Security Analytics activities were carried out:

    • We grounded the agreed metrics within the provisioning and deprovisioning processes;

    • We built models and ran simulations to analyse the impact on risks of current provisioning and deprovisioning processes and related policies;

    • We built models and ran simulations to analyse the impact on risk of potential new provisioning and deprovisioning processes (what-if analysis), based on new access policies and relevant IAM investments;

    • We provided recommendations to the customer.

    Section IV specifically describes the analysis of risks for current and future provisioning processes along with related policies whilst Section V describes the analysis of risks of current and potential future deprovisioning processes along with related policies.

    It is beyond the purpose of this paper to provide all the details of the case study. Again, all information has been anonymised. The main goal is to illustrate how Security Analytics can be effectively used to provide decision support for policies and risk assessment. More technical details will be provided in an HP Labs technical report.

    IV. MODELLING AND ANALYSIS OF PROVISIONING PROCESSES AND RELATED POLICIES

    This section illustrates the analysis carried out on the Provisioning process.

    We built a probabilistic, discrete-event model [2,9] of the current provisioning process by representing the various steps involved in the organization and factoring in empirical data obtained from experts in the customer’s organization and historical information. Figure 3 illustrates the graphical version of the model built with our Visual Modelling tool (which uses a flow-chart-like approach): it is mapped into an executable model in the GNOSIS modeling language [1].

    Figure 3. Model of Current Provisioning Process

    This model captures the key events of relevance for provisioning of user accounts and rights, i.e. people joining the organization (average: 20 days) and changing roles, hence requiring new access (average: 3 days). It captures the relevant provisioning steps, including:

    • Management Approval phase: this might take a long period of time, up to 15 days. In 5% of the cases it causes Separation of Duties (SoD) issues, when checks are not properly carried out;

    • Management of Requests for Implementation and subsequent Approval Phase by Security Team: these phases can take a long time, up to 5 and 15 days respectively;

    • Implementation/Configuration of User Accounts and Access Rights: this phase can take up to 5 days.

    We modeled the times involved with uniform distributions (see Figure 3): this included the elapsed time to carry out each step and the actual required physical effort (i.e. without any delay or inefficiency).

    Our model simulates and measures the overall time it takes to carry out each complete provisioning activity.
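A minimal sketch of how such an elapsed-time measure can be sampled is shown below. It is not the authors' GNOSIS model. The phase upper bounds (15, 5, 15 and 5 days) come from the text, whereas the lower bounds, the phase names and the assumption that phases simply run one after another are hypothetical placeholders.

```python
import random

# Upper bounds (days) come from the text; the lower bounds are NOT given in
# the paper and are hypothetical placeholders for illustration only.
PHASES = {
    "management_approval": (5.0, 15.0),
    "request_management": (1.0, 5.0),
    "security_team_approval": (5.0, 15.0),
    "implementation": (1.0, 5.0),
}


def provisioning_elapsed_days(rng, phases=PHASES):
    """Elapsed time of one provisioning activity: sum of uniform phase times."""
    return sum(rng.uniform(low, high) for low, high in phases.values())


def simulate(n_activities=10_000, seed=1):
    """Sample many provisioning activities with a reproducible seed."""
    rng = random.Random(seed)
    return [provisioning_elapsed_days(rng) for _ in range(n_activities)]


if __name__ == "__main__":
    times = simulate()
    print(f"min={min(times):.1f} max={max(times):.1f} "
          f"mean={sum(times) / len(times):.1f} days")
```

Summing independent uniform phases gives a bell-shaped distribution concentrated around the middle of the possible range, which is why a histogram of the samples is a more useful output than the worst-case bound alone.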

    The Implementation Phase is critical from a risk exposure perspective. We discovered that in 3% of cases there is a failure to correctly implement user accounts; 95% of these problems are fixed within 5 days.

    More worryingly, we discovered that in 85% of cases, user accounts are not configured according to best security practices, including setting password expiration and lock-out controls. The latter are specifically relevant for disabling hanging accounts. This is an obvious implementation failure of the organisation’s policy P2. Failures of this type are currently not detected and fixed at the operational level: they are remediated only at audit time.

    Our model takes into account the nature of the user accounts to be provisioned: 15% of these accounts are super user accounts whilst 10% are shared accounts (i.e. with login and password details shared by a team).

    The model specifically measures the number of failures happening when handling these critical types of accounts, including SoD failures, implementation failures and failures to set lock-out controls.

    The model runs on a simulated period of time of 1 year; it simulates the fact that new employees join the organization or change their role along with the related access provisioning activities.

    In order to get statistically significant results, Monte Carlo simulations of the model are carried out 1000 times. The analysis of the experimental results provided interesting insights. On average there are 133 provisioning activities per year, 19 of which relate to super users and 14 to shared accounts. On a yearly basis, we registered an average of 6 SoD violations and 113 failures to set security controls (lock-out controls): 15 of these security control failures were related to super users whilst 11 were related to shared accounts.
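The yearly failure counts above can be approximated with a simple sampling sketch. The account mix (15% super user, 10% shared) and the failure rates (5% SoD, 85% missing security controls) come from the text. The fixed count of 133 activities per year, the mutual exclusivity of super user and shared accounts, and all names are simplifying assumptions, so the output is only expected to be roughly in line with the reported averages.

```python
import random
from collections import Counter

P_SUPER, P_SHARED = 0.15, 0.10       # account mix, from the text
P_SOD, P_NO_CONTROLS = 0.05, 0.85    # failure rates, from the text


def account_kind(rng):
    # Assumption: super-user and shared accounts are mutually exclusive.
    u = rng.random()
    if u < P_SUPER:
        return "super"
    if u < P_SUPER + P_SHARED:
        return "shared"
    return "regular"


def simulate_year(rng, n_activities=133):
    """Count failures over one simulated year of provisioning activities."""
    counts = Counter()
    for _ in range(n_activities):
        kind = account_kind(rng)
        counts[kind] += 1
        if rng.random() < P_SOD:
            counts["sod_violation"] += 1
        if rng.random() < P_NO_CONTROLS:
            counts["no_controls"] += 1
            counts[f"no_controls_{kind}"] += 1
    return counts


def yearly_averages(runs=1000, seed=7):
    """Average the yearly counts over many Monte Carlo runs."""
    rng = random.Random(seed)
    total = Counter()
    for _ in range(runs):
        total.update(simulate_year(rng))
    return {key: value / runs for key, value in total.items()}


if __name__ == "__main__":
    for key, value in sorted(yearly_averages().items()):
        print(f"{key}: {value:.1f}")
```

Under these assumptions the expected values are 133 × 0.85 ≈ 113 security control failures and 133 × 0.05 ≈ 6.7 SoD violations per year.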

    The overall elapsed time to fully provision a user account can range from 14 to 45 days; in 67% of cases this takes between 23 and 33 days (Figure 4). This contrasts with the actual required Physical Effort time, which is always below 1 day. These results indicate that there are performance issues related to the provisioning process. There are too many critical business process failures, which result in increased security risks; the overall process takes too long to be carried out. The latter problem is primarily due to the potentially long time required to deal with the Management Approval step.

    [Figure 4: Elapsed Time - Provisioning Process; x-axis: Days (1 to 49), y-axis: Proportion (0 to 0.08); series: Elapsed Time]

    Figure 4. Provisioning Process: Distribution of Elapsed Time

    The results also gave an indication of the potential risk exposure of the organization and policy failures. Long provisioning times can induce people to bypass the process or induce managers to hold onto credentials, as previously explained. Failures in implementing security controls (i.e. lock-out controls) are in violation of Policy P2.

    The customer agreed that our outcomes correctly reflected the current situation. Based on this, the customer asked us to perform “what-if” analysis to assess the risk exposure if changes were to be made. These included:

    • the introduction of role-based access control, with a pre-defined and pre-approved set of roles;

    • the retention of an explicit approval phase, but mainly driven by passive approval (at least for pre-approved roles);

    • the introduction of Identity and Access Management (IAM) tools to automate the approval and configuration phases. These phases are simplified and some of the current approval duplications removed. Automation is introduced to deploy accounts and deal with SoD checks.

    Despite automation, failures could still happen: however, logging and auditing capabilities provided by the IAM tools can remediate them within 1 working day.

    Figure 5 illustrates the model reflecting these assumptions and simplifications. We gathered empirical data both from an IAM solution provider and our experience in this field. In particular we assumed that: there is a likelihood of 0.15% and 0.02% respectively for SoD and configuration failures; the automated configuration of user accounts would now take between 0.1 and 1 day.

    We carried out Monte Carlo simulations, as previously described, which confirmed the improvement in the accuracy of the processes: there would be a negligible number of failures on a yearly basis, which is consistent with what policy P2 mandates.
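    The "negligible failures" outcome can be sanity-checked with simple arithmetic. The sketch below is not the authors' model: it assumes the yearly volume stays at the 133 provisioning activities reported for the current process, and it maps "configuration failures" onto the baseline "failures to set security controls", which is our reading rather than something the text states. The function name `expected_failures` is hypothetical.

```python
# Expected yearly failures = yearly request volume x per-request failure likelihood.
# Assumption: the yearly volume is unchanged by the introduction of IAM automation.
REQUESTS_PER_YEAR = 133

# Average yearly failure counts reported for the current (manual) process.
baseline = {"SoD": 6, "configuration": 113}

# Per-request failure likelihoods assumed for the automated process (0.15%, 0.02%).
iam_likelihood = {"SoD": 0.0015, "configuration": 0.0002}


def expected_failures(requests: int, likelihood: float) -> float:
    """Mean number of failures per year for independent requests."""
    return requests * likelihood


for kind, p in iam_likelihood.items():
    expected = expected_failures(REQUESTS_PER_YEAR, p)
    print(f"{kind} failures: baseline {baseline[kind]}/year, "
          f"with IAM about {expected:.2f}/year")
```

    Under these assumptions both expected counts fall well below one failure per year, which is in line with the simulation outcome described above.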

    In terms of overall provisioning time (elapsed time), the new model is sensitive to the time allocated for “passive approval”. We explored the implication of having different policies in this area – i.e. where this time could be 1 day (case#1), 2 days (case#2) or 5 days (case#3). Figure 6 illustrates that the overall elapsed time can vary from a [0.5,2] days range (case#1) to [0.5, 5.5] days (case#3).

    These outcomes illustrate that introducing IAM automation (together with the changes described above) improves the situation, and they show where the improvement occurs. They helped to inform the decision maker about the impact of investing in IAM automation and how it improves compliance with policy P1 (performance during the approval phase) and P2 (setting of security controls).

    Figure 5. Model of Provisioning Process with IAM Automation

    [Figure: histogram "Elapsed Time - Sensitivity Analysis - Provisioning Process with IAM Automation"; x-axis: Days (0 to 10); y-axis: Proportion (0 to 0.6); series: Elapsed Time - Case #1, Case #2, Case #3]

    Figure 6. Elapsed Time – Sensitivity Analysis

    V. MODELLING AND ANALYSIS OF DEPROVISIONING PROCESSES AND RELATED POLICIES

    This section describes our analysis of the Deprovisioning process. The approach is very similar to the one described in Section IV.

    The deprovisioning process is critical in terms of exposing the organization to risks. It is triggered when people leave the organization (on average every 25 days) or change their role and no longer need their access rights (on average every 3 days). Our analysis identified a few key aspects that have been factored into our model, see Figure 7.

    Figure 7. Model of Current Deprovisioning Process

    These aspects included:

    • There is a heavy reliance on managers’ notifications to trigger the deprovisioning process. This step is successfully carried out only in 60% of cases. It usually takes up to 5 days to notify the security team. This data gave the decision maker indications about the degree of failure in implementing Policy P3;

    • In case of notification failure, only 15% of accounts have a lock-out control properly set (reflecting the performance of the current provisioning process). Accounts with lock-out control are disabled after 45 days;

    • Lock-out controls are ineffective in two situations when deprovisioning fails: (1) shared accounts, whose password is currently not changed at deprovisioning time, so the account is kept alive by the remaining team members accessing it; (2) users who change roles but stay in the organization. These people (20% of cases) keep accessing the accounts, for various reasons;

    • In case of failure, if an account is not disabled by lock-out controls it becomes a hanging account. It can potentially be misused by the personnel or attackers, for criminal purposes. This has a major impact on the organization’s risk exposure. The model tracks the occurrence of these failures;

    • In case of a successful manager’s notification, the Security Team still has to approve the request and then remove the account. These phases can take up to 20 days ([7,20] days range) and 2 days, respectively. The removal process is likely to fail in 3% of cases, hence generating hanging accounts if not mitigated by the lock-out control.

    As previously mentioned, it is likely that 15% of the accounts to be deprovisioned are super user accounts whilst 10% are shared accounts.

    We carried out Monte Carlo simulations for this model, as mentioned in the previous section. The experimental results highlighted that, on average, the organization has to deal with 129 deprovisioning requests per year, of which 49 end up with critical failures, i.e., they generate hanging accounts (7 super user accounts and 5 shared accounts). On average, only 6 accounts were locked out despite these failures.
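    To make the structure of such a simulation concrete, the following is a minimal Monte Carlo sketch of the deprovisioning model, not the authors' actual model or tooling. It uses the probabilities listed above and a fixed volume of 129 requests per year. Two points are our assumptions: the "20% of cases" is applied to every request independently of the shared-account flag, and timing is ignored so that only outcomes are counted. The names `simulate_year` and `monte_carlo` are hypothetical.

```python
import random

# Probabilities taken from the model description.
P_NOTIFY_OK = 0.60      # manager notification succeeds
P_REMOVAL_FAIL = 0.03   # removal fails after a successful notification
P_SHARED = 0.10         # account is a shared account (lock-out ineffective)
# Assumption: applies to every request, independently of the shared flag.
P_KEEPS_ACCESSING = 0.20


def simulate_year(n_requests, lockout_rate, rng):
    """Return (hanging, locked_out) account counts for one simulated year."""
    hanging = locked = 0
    for _ in range(n_requests):
        notified = rng.random() < P_NOTIFY_OK
        removed = notified and rng.random() >= P_REMOVAL_FAIL
        if removed:
            continue
        # The lock-out only helps if it was set and nobody keeps the account alive.
        lock_set = rng.random() < lockout_rate
        kept_alive = (rng.random() < P_SHARED
                      or rng.random() < P_KEEPS_ACCESSING)
        if lock_set and not kept_alive:
            locked += 1
        else:
            hanging += 1
    return hanging, locked


def monte_carlo(n_requests=129, lockout_rate=0.15, runs=1000, seed=1):
    """Average the yearly counts over many independent simulated years."""
    rng = random.Random(seed)
    results = [simulate_year(n_requests, lockout_rate, rng) for _ in range(runs)]
    mean_hanging = sum(h for h, _ in results) / runs
    mean_locked = sum(l for _, l in results) / runs
    return mean_hanging, mean_locked


if __name__ == "__main__":
    hanging, locked = monte_carlo()
    print(f"mean hanging accounts/year: {hanging:.1f}, locked out: {locked:.1f}")
```

    With these simplifications the averages land in the same range as the reported figures (about 49 hanging accounts and 6 locked-out accounts per year); exact agreement should not be expected, since the original model includes details that are not reproduced here.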

    Our experimental results also highlighted that the overall deprovisioning time can be between 10 and 30 days (compared to less than 1 day to carry out the physical work). In 50% of cases it takes between 15 and 26 days. If an account is locked out (without removal), it takes 45 days to disable it. In almost 40% of cases accounts are neither locked out nor deprovisioned at all. See Figure 8.

    [Figure: histogram "Elapsed Time - Current Deprovisioning Process"; x-axis: Days (0 to 60, then "More"); y-axis: Proportion (0 to 0.45)]

    Figure 8. Elapsed Time – Current Deprovisioning Process

    These results highlighted the current failure in implementing policies P3 and P4 and the consequent high level of risk exposure for the organization. It is important to notice that failures in correctly implementing policy P2, at the provisioning level, also have a negative impact at the deprovisioning level.

    A major issue is that these policies are too abstract: they do not set precise goals and constraints. This is reflected in the relaxed implementation of the various processes.

    As part of our decision support activity, we carried out a “what-if” experiment: what are the implications of improving the implementation of policy P2, i.e. what if the percentage of accounts with the lock-down control properly set were 50% or 100% - instead of just 15%? This implies making changes, but only for the provisioning process.

    We carried out simulations by changing the “lock-out control” parameter in the model shown in Figure 7. The new results are encouraging, see Figure 9: they show that the number of hanging accounts can drop from 49 to 34 (case of 50% lock-out control) and 14 (case of 100% lock-out).

    [Figure: histogram "What-if Analysis - Impact of Lock-out Control"; x-axis: Days (0 to 60); y-axis: Proportion (0 to 0.45); series: Elapsed Time - 15% lock-out, 50% lock-out, 100% lock-out]

    Figure 9. Elapsed Time - What-if analysis with Current Deprovisioning Process

    However, these results confirm that the “lock-out control” does not provide the ultimate solution to mitigating risks: failures generating hanging accounts still happen; there are still very long deprovisioning times.
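    The reason the lock-out control cannot eliminate hanging accounts can be seen with a closed-form expected-value estimate. This is an illustrative sketch rather than the authors' simulation model: it reuses the probabilities stated for the current deprovisioning process, assumes that the "20% of cases" applies to every request independently of the shared-account flag, and uses the hypothetical function name `expected_hanging`.

```python
def expected_hanging(lockout_rate, n_requests=129, p_notify_ok=0.60,
                     p_removal_fail=0.03, p_shared=0.10,
                     p_keeps_accessing=0.20):
    """Expected number of hanging accounts per year (outcomes only, no timing)."""
    # Probability that the process itself does not remove the account:
    # either the notification never arrives, or it arrives and removal fails.
    p_not_removed = (1 - p_notify_ok) + p_notify_ok * p_removal_fail
    # Probability that a lock-out is set AND actually disables the account
    # (not a shared account, and nobody keeps accessing it).
    p_lock_effective = lockout_rate * (1 - p_shared) * (1 - p_keeps_accessing)
    return n_requests * p_not_removed * (1 - p_lock_effective)


for rate in (0.15, 0.50, 1.00):
    print(f"lock-out set on {rate:.0%} of accounts: "
          f"about {expected_hanging(rate):.1f} hanging accounts/year")
```

    Under these assumptions the estimates fall close to the simulated values of 49, 34 and 14. Even with the lock-out set on every account, the shared accounts and the accounts that people keep using remain outside its reach, so a residual number of hanging accounts persists.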

    The customer asked us to explore the implications of further investing in IAM automation and making the following changes to the deprovisioning process:

    • Use the HR system to provide automatic notification about personnel changes. We considered the case where the notification time could be configured to 1, 2 or 5 days;

    • Automate the removal of user accounts with an IAM solution: based on solution provider’s information, it can take between 0.01 and 0.1 days to remove them. The failure rate is negligible (0.02%) and mitigated by automated remediation, taking place within 1 day.

    All these assumptions were factored into a new version of the deprovisioning process model, shown in Figure 10.

    Figure 10. Model of Deprovisioning Process with IAM Automation

    The corresponding simulation results were very encouraging. The number of failures and hanging accounts dropped to essentially 0 on a yearly basis. In addition, IAM technological investments drastically reduce the overall deprovisioning time, depending on the frequency of HR updates, as shown in Figure 11. It can take up to 5.5 days if the HR system provides notifications every 5 days.
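    The dependence of the elapsed time on the HR update frequency can be illustrated with a small sampling sketch. This is an assumption-laden illustration, not the model of Figure 10: it assumes that the notification delay is uniformly distributed over the HR update period, that the automated removal time is uniform over the quoted [0.01, 0.1] day range, and it ignores the rare failures and their remediation. The names `sample_elapsed_days` and `summarize` are hypothetical.

```python
import random


def sample_elapsed_days(hr_period_days, rng):
    """One sampled deprovisioning time, in days, under the stated assumptions."""
    # Assumption: the personnel change occurs at a uniformly random point
    # within the HR update period, so the wait is uniform on [0, period].
    notification_delay = rng.uniform(0.0, hr_period_days)
    # Automated removal time range quoted from the solution provider.
    removal_time = rng.uniform(0.01, 0.1)
    return notification_delay + removal_time


def summarize(hr_period_days, runs=10_000, seed=7):
    """Return (min, mean, max) of the sampled elapsed times."""
    rng = random.Random(seed)
    samples = [sample_elapsed_days(hr_period_days, rng) for _ in range(runs)]
    return min(samples), sum(samples) / runs, max(samples)


for period in (1, 2, 5):
    lo, mean, hi = summarize(period)
    print(f"HR update every {period} day(s): "
          f"min {lo:.2f}, mean {mean:.2f}, max {hi:.2f} days")
```

    In this sketch the elapsed time is dominated by the HR notification period, and for the 5-day case it stays below the upper bound of 5.5 days mentioned above.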

    [Figure: histogram "Elapsed Time - Sensitivity Analysis - Deprovisioning Process with IAM Automation"; x-axis: Days (0 to 10); y-axis: labelled Proportion (tick labels 0 to 40000); series: Elapsed Time - Case #1, Case #2, Case #3]

    Figure 11. Elapsed Time - Deprovisioning Process with IAM automation

    These changes would enable the customer to be compliant with policies P3 and P4, at the cost of buying, deploying and periodically managing the IAM solution.

    VI. DISCUSSIONS AND FUTURE WORK

    Our case study was successful. It was completed in 3 months and produced a full Security Analytics Report, followed by a presentation of our findings to the customer. Based on input received from the customer, Security Analytics indeed helped them to ground the analysis of their risks and explore the implications of making investments or modifying their policies.

    This case study provided the decision makers with scientific evidence to support their decision making process in order to address current risks and improve their current access management processes. Additional actions might be taken by the customer to refine their current access policies to mandate more specific constraints and goals.

    A few steps have been critical to the success: gathering empirical data and process knowledge by interacting with various domain experts; refinement and validation of our models and assumptions. This has been achieved thanks to the collaboration of the customer.

    This work helped us to create a template Security Analytics solution in the IAM space, to be used in other customer engagements. It is now available as part of the Security Analytics services provided by HP Information Security [11].

    Additional Security Analytics work is planned by HP Labs in the IAM space. This includes work in the areas of privileged users, vetting, compliance management and situational awareness.

    VII. RELATED WORK

    Modeling and simulation techniques have been successfully used in many fields to assess risks and provide decision support, including weather forecasting, aeronautics and civil engineering. However, their usage in the realm of IT security, security policies and related IT processes is still limited: this area is open to R&D activities and exploitation opportunities. A more detailed assessment of relevant work is available in [4].

    We are not aware of research or commercial solutions that aim at modelling and simulating the overall complexity of security and related policies to support the decision making process. Standards such as ISO 27001 [6], CoBit and ITIL describe best practices and methodologies in terms of information security management, IT governance and service management, respectively: they can be coupled with attack tree analysis [10]. However, decision makers still need to understand, interpret and instantiate results to their specific operational environments.

    VIII. CONCLUSIONS

    This paper presented our work on Security Analytics to provide risk assessment and decision support to decision makers when dealing with their strategic security policies.

    Probabilistic modeling and simulation tools have been used to explore the risk exposure of an organization at the operational level and the implications of specific security policies. What-if analysis was carried out to explore decision options. We described how this methodology has been successfully used in a case study, in the space of user access management, jointly carried out with a customer and how this informed potential investments and policy changes.

    This work created a template Security Analytics solution in the IAM space and a business service offering by HP Information Security. More R&D work will be carried out by HP Labs.

    REFERENCES

    [1] GNOSIS, http://www.hpl.hp.com/research/systems_security/gnosis.html

    [2] M. Collinson, B. Monahan, D. Pym, A Discipline of Mathematical Systems Modelling. Forthcoming monograph, College Publications, London, 2009.

    [3] Y. Beres, J. Griffin, S. Shiu, M. Heitman, D. Markle, P. Ventura, Analysing the Performance of Security Solutions to Reduce Vulnerability Exposure Windows, ACSAC, 33–42, CA, IEEE, 2008.

    [4] A. Baldwin, M. Casassa Mont, S. Shiu, Using Modelling and Simulation for Policy Decision Support in Identity Management, IEEE Policy 2009, 2009

    [5] M. Casassa Mont, Y. Beres, D. Pym, S. Shiu, Economics of Identity and Access Management: Providing Decision Support for Investments, BDIM 2010, 2010

    [6] ISO, ISO 27001, Information Security Risk Assessment, http://www.iso.org/, 2005

    [7] ISACA, Cobit, Control Objectives for Information and related Technologies, http://www.isaca.org/, 2010

    [8] G.S. Fishman, Discrete-Event Simulation: Modelling, Programming and Analysis, Springer-Verlag, 2001

    [9] M. Collinson, B. Monahan, D. Pym, Semantics for Structured Systems Modelling and Simulation. To appear, Proc. Simutools 2010, ACM, 2010

    [10] B. Schneier, Attack Trees, http://www.schneier.com/paper-attacktrees-ddj-ft.html, 1999

    [11] HP Information Security, http://h10131.www1.hp.com/uk/en/information-security/security-innovation/, 2010