
Draft 0.1; 30 Sept 2004

Open Science Grid

Security Incident Handling and Response Guide


Document Log

Issue  Date          Author        Comment
0.1    30 Sept 2004  Doug Pearson  Draft release to the Activity Group
0.2    7 Sept 2004   Doug Pearson  Draft release to the OSG Workshop


i. Document Development Milestones:

September 6, 2004: An abbreviated version of the full Guide will contain completed recommendations for the establishment and maintenance of contact lists and communications methods; preliminary recommendations for containment and notification methods; and an outline of additional content to be developed. Recommendations regarding the additional content development process and schedule will be made. The abbreviated version will be presented to OSG and iVDGL committees on September 6, for review and discussion during the Sept 9-10 OSG Workshop.

November, 2004: Progress in content development, especially aiming for harmonization with EGEE efforts, in preparation for the Second EGEE Conference, November 22-26, 2004.

February 2005: Guidelines developed and processes and services implemented as necessary for OSG-0.

ii. Credits:

This document was developed through the work of the OSG Security Incident Handling Activity Group1, including members Bob Cowles (SLAC), Mark Green (U Buffalo), Michael Helm (ESnet/LBNL), Doug Olson (LBNL), Doug Pearson (IU/REN-ISAC), Dane Skow (Fermilab), Tom Throwe (BNL), and Von Welch (NCSA); and with background developed through the prior works of Yuri Demchenko (University of Amsterdam).

iii. Contact:

Comments, questions, etc. may be referred through the OSG Security Incident Handling Activity Group chair, Doug Pearson <[email protected]>.


iv. Document Work in Progress:

In addition to the specific call-outs in the document for work in progress, the following enhancements, additions, and changes are in progress:

o Consideration needs to be given to the nature of OSG as a framework for the cooperation of grids, and as a grid itself, i.e. OSG-0. Currently the document is heavily slanted to OSG as a framework. One approach to the duality may be to create two documents. One as a concept of operations for grids that participate in the framework and the central services and processes required to facilitate the cooperation; and another document to serve as the specific guide to OSG-0.

o Need to define the minimum set of requirements for OSG-0 and identify the implementation.

o Slant the processes more towards coordination rather than centralized command and control.


1. INTRODUCTION
2. THE PURPOSE OF THIS DOCUMENT
3. DEFINITIONS
4. INCIDENT TAXONOMY AND LEVELS OF SUPPORT
5. POLICIES
    5.1. Reporting and responding to Grid incidents
    5.2. Guidance to the sharing and disclosure of sensitive data
6. ORGANIZATIONAL STRUCTURE
7. SUPPORTING RESOURCES
    7.1. Mailing lists
8. PROCESS
    8.1. Discovery and reporting
    8.2. Triage
    8.3. Containment
    8.4. Initial notification
    8.5. Analysis and response
    8.6. Tracking and progress notification
    8.7. Escalation
    8.8. Reporting
    8.9. Public relations
    8.10. Post-incident analysis
9. COMMUNICATIONS SUPPORT
    9.1. Contact lists
    9.2. Normal communication channels
    9.3. Secure communications
    9.4. Phone bridge
10. WEB SITE
11. PERIODIC REPORTING
12. RELATIONSHIPS TO OTHER ENTITIES
13. OUTREACH
14. SECURITY OPERATIONS CENTER
15. EFFECTIVENESS EVALUATION PROCESS
16. INFORMATION DISCLOSURE GUIDELINES
17. INCIDENT REPORTING FORMATS
18. LOCAL PROCESSES AND TOOLS TO SUPPORT INCIDENT RESPONSE
19. GUIDANCE TO MIDDLEWARE AND GRID SERVICE DEVELOPERS
20. RELATIONSHIPS
21. RELEVANT AND RELATED STANDARDS AND PRACTICES
22. Useful References and Other Works


1. Introduction

The cyberspace defined by Grids transcends organizational boundaries. An operative space requires that participants develop new forms of cooperation with respect to policies, resources, operations, and security. Although the Grid doesn't create fundamentally new cyber security risks, it does amplify risks and broaden the scope of impact of incidents. User identity and authorizations are extended throughout the multi-organizational space. Large numbers of homogeneous systems scattered across organizations are presented to the authenticated user. These and other aspects provide fertile ground for the rapid spread of security incidents and expose an institution to risks commensurate with the security practices of its collaborators. Additionally, the high profile and vast resources of Grids make them attractive hacking targets, whether for notoriety or as a platform for other attacks.

The character of the security vulnerabilities and risks presented by Grid cyberspace provides a rationale for strong coordination among the Grid participants for cyber security incident response.

2. The purpose of this document

The Open Science Grid Consortium2 is an umbrella for guidance and support to various independent Grid efforts, seeking to expand and enable the use of common grid infrastructure and shared resources for the benefit of scientific applications.

This document is targeted at Grids participating in the OSG structure, and is relevant to harmonization with other national and international efforts. The document was developed with an eye to the US PPDG3, iVDGL4 and TeraGrid5 communities, and the European LCG6 and EGEE7 efforts.

The purpose of this document is to guide the development and maintenance of a common capability for handling and response to cyber security incidents on Grids. The capability will be established through (1) common policies and processes, (2) common organizational structures, (3) cross-organizational relationships, (4) common communications methods, and (5) a modicum of centrally-provided services and processes.

The vision articulated in this document is not to establish a centralized OSG incident handling and response organization, but to establish a concept of operations for individual Grid security efforts, permitting a harmonization and collaboration in the collective Grid space.

Ultimately the purpose behind the development of this document is to reduce the incidence, severity, and exposure of Grids to cyber security incidents and to reduce the exposures to institutions and their systems posed by the Grid.

3. Definitions

A cyber security incident is any real or suspected event that poses a breach to explicit or commonly-held security policies and practices, and that poses a real or potential threat to the integrity of services, resources, infrastructure, or identities. Some typical classes of incidents are computer intrusions, denial-of-service attacks, and worm/virus infections.


Although the Grid doesn't create fundamentally new cyber security risks or classes of incidents, it does serve to amplify risks and creates a broader scope of the impact of incidents.

4. Incident taxonomy and levels of support

Need work here

5. Policies

5.1. Reporting and responding to Grid incidents:

Grid Site Charters, Agreements, and other policy documents that guide overall site participation are the explicit sources of site requirements for security incident handling and response. Ideally, those charters and agreements will refer directly to this document for incident handling and response requirements and procedures. The charters and agreements should state that:

o Grid participants MUST follow the guidelines and practices established in the OSG Security Incident Handling and Response Guide.

o Grid participants MUST report incidents that have impact or relationship to Grid resources, services, or identities. Reports MUST be made for incidents with potential impact* as well as incidents with known impact.

o Grid participants MUST respond to incidents where local systems or resources are presenting a threat to Grid security. Response is guided by OSG Security Incident Handling and Response Guide, section 8.3 Containment, and more fully through the methods of section 8 Process.

5.2. Guidance to the sharing and disclosure of sensitive data

5.2.1. Privacy and security concerns

Information exchanged during the investigation of an incident may include data about individuals or groups and behaviours that is not meant to be public information (e.g. contact telephone numbers). Additionally, some application areas have stringent requirements about how their data is to be handled due to legal and ethical considerations (e.g. biomed data). A site handling incident information MUST treat it with security appropriate to the sensitivity of the data involved.

5.2.2. Sharing of processed incident information

Incident information transferred by one site to another MUST only occur through secure means sufficient to prevent casual eavesdropping by the attackers. Additional protections may be required if the information being exchanged is of the nature described above.

* An incident with potential impact could for instance be the compromise of a host holding user identities, but it is unknown whether the identities themselves were compromised.


All media contacts relating to an OSG security incident MUST be handled by the OSG Public Relations contacts. All interviews or information passed to the media MUST be through OSG Public Relations except through prior arrangement.

5.2.3. Handling of shared log, netflow, and other supporting data

Need work here [What should we say? This data is probably not suitable for actual inclusion in the ticketing system, so that system would at best contain a pointer to it. Can the GOC be responsible for maintaining the repository of supporting information associated with an incident? How else do we provide for the protection of the data in cases where there might be prosecution? What about physical evidence, and how do we handle that? What data handling procedures do we need to be sure that logs, etc. haven't been altered?]

6. Organizational structure

The operational organizational structure is comprised of the components:

Security Contacts. Every Grid participant (user, service, or resource) must have assigned Security Contact(s). Institutional approaches may vary significantly. Security Contacts may report from a variety of organizational units, such as system administrators, security engineers, or network engineers. The purview of Security Contacts may be an entire institution or a single department. As possible, the Security Contacts should include 24x7 support desks, in addition to but not replacing, the appointment of specific individuals. A list of Security Contacts is maintained by the Security Operations Center (SOC).

A body of incident handling and response technical experts. The body is a self-organized collection of volunteer technical experts who are available to provide response and remediation advice in the event of large or technically complex incidents. The expert body is organized and available as a mailing list maintained by the SOC.

Ad hoc Incident Response Teams. Depending upon the severity, complexity, and scope of an incident, response may require the ad hoc formation of an Incident Response Team. The team will be formed under the auspices of the SOC and will be composed of Security Contacts, site systems administrators, Grid software specialists, operations centers, etc. as appropriate to the incident. The team leader is responsible for coordinating with the SOC for supporting services, and for maintaining a flow of information regarding incident status to the SOC.

The Security Operations Centers (SOC). Individual Grids may choose to establish individual centers or partner with other Grid efforts in joint efforts. The SOC function will often be provided by the Grid Operations Center (GOC). The SOC is responsible for organizing and coordinating large-scale, multi-institutional response efforts; tracking and reporting on security incidents; monitoring information flows, e.g. closed security lists and REN-ISAC alerts, for threats to the served community; and providing supporting services (see 14 Security Operations Center).

A Security Operations Advisory Group is established with representative participation of the served community, to advise the development and practice of the Security Operations Center.


Cross SOC Coordination (XSOC). The Security Operations Centers of Grids operating under OSG participate in a Cross-SOC Coordination body. The body acts to share information across Grid stovepipes regarding incidents, vulnerabilities, exploits, attacks, practices, tools, etc.

7. Supporting resources

7.1. Mailing lists

Three mailing lists support incident reporting, analysis, and response. In the following, xxx.yyy is to be replaced with the respective Grid, e.g. ivdgl.org or ppdg.net.

The standard grid support processes and Grid Operations Center procedures are employed by end-users and individuals other than Security Contacts to report incidents. The GOC monitors and supports these reporting methods, and relays incident reports to INCIDENT-SEC-L.

INCIDENT-SEC-L ([email protected]) is a closed list composed of the Security Contacts at all sites, including Grid Operations Centers and Security Operations Centers. Only list members may post to the list. The list is intended solely for initial incident reporting, not for incident discussion.

INCIDENT-DISCUSS-L ([email protected]) is a closed list composed of the same members as INCIDENT-SEC-L. Only members may post to the list. The list is intended for discussion of reported incidents.

The reason for differentiating INCIDENT-SEC-L and INCIDENT-DISCUSS-L is to allow alerting mechanisms to be driven by the presence of a new message in INCIDENT-SEC-L. Communications on both lists MUST be encrypted.
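As an illustration of how an alerting mechanism could be driven by new INCIDENT-SEC-L traffic, the following is a minimal sketch. It assumes that the list software stamps each message with a List-Id header containing the list name; the header check, the function name should_alert, and the sample messages are hypothetical and are not part of this Guide.

```python
from email import message_from_string

# Assumption: the list server adds a List-Id header that contains the list name.
ALERT_LIST = "incident-sec-l"


def should_alert(raw_message: str) -> bool:
    """Return True when a message arrived via the initial-report list."""
    msg = message_from_string(raw_message)
    list_id = (msg.get("List-Id") or "").lower()
    return ALERT_LIST in list_id


if __name__ == "__main__":
    report = "List-Id: <incident-sec-l.xxx.yyy>\nSubject: new report\n\n(encrypted body)"
    chatter = "List-Id: <incident-discuss-l.xxx.yyy>\nSubject: re: report\n\n(encrypted body)"
    print(should_alert(report), should_alert(chatter))
```

Because discussion traffic is kept on a separate list, a check of this kind does not need to inspect the (encrypted) message body to decide whether to raise an alert.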

8. Process

The processes for incident handling and responses are:

1. Discovery and reporting
2. Triage
3. Containment
4. Initial notification
5. Analysis and response
6. Tracking and progress notification
7. Escalation
8. Reporting
9. Public relations
10. Post-incident analysis

8.1. Discovery and reporting

Incidents will be discovered through a variety of means including users, system administrators and engineers, operations center monitoring of infrastructure, services, and resources, through monitoring of intelligence channels, such as FIRST and REN-ISAC, and through reports from peers of the aforementioned entities.

If an incident is discovered locally, such as by a user or systems administrator, the local incident reporting process should be used, AND the discovering party should make certain to inform the local incident response team that (1) the incident involves the Grid, and (2) the incident must be reported to INCIDENT-SEC-L according to guidelines at [webpage]. If the initial assessment by the local incident response team reveals a real or potential impact on, or relationship to, Grid resources, services, or identities, the Security Contact must immediately report the incident to INCIDENT-SEC-L.

If an incident is discovered locally, such as by a user or systems administrator, AND a local incident reporting process cannot be engaged or the Security Contact informed, for instance because the incident occurs outside of normal business hours, the discovering party should report the incident directly to the Grid security incident handling community via [webpage], and follow up with local incident reporting procedures when possible.

If an incident is discovered by Security Contacts, grid operations centers, or security operations centers, the discovering party should immediately report the incident to INCIDENT-SEC-L.

In all cases the SOC will monitor incident reports and will assume the responsibility to ensure that involved sites are aware of the incident.
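The three reporting cases above can be summarized as a small decision routine. This is only a sketch of the routing described in this section; the type names (Discoverer, Discovery) and the function reporting_steps are hypothetical and introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Discoverer(Enum):
    LOCAL_PARTY = "user or systems administrator"
    SECURITY_CONTACT = "Security Contact"
    OPERATIONS_CENTER = "grid or security operations center"


@dataclass
class Discovery:
    discoverer: Discoverer
    # Can the local reporting process or the Security Contact be engaged right now?
    local_process_reachable: bool = True


def reporting_steps(d: Discovery) -> List[str]:
    """Return the ordered reporting steps for one discovery, following section 8.1."""
    if d.discoverer is not Discoverer.LOCAL_PARTY:
        return ["report immediately to INCIDENT-SEC-L"]
    if d.local_process_reachable:
        return [
            "use the local incident reporting process",
            "tell the local incident response team that the incident involves the Grid",
            "tell the local team that the incident must be reported to INCIDENT-SEC-L",
        ]
    return [
        "report directly to the Grid security incident handling community",
        "follow up with local incident reporting procedures when possible",
    ]


if __name__ == "__main__":
    after_hours = Discovery(Discoverer.LOCAL_PARTY, local_process_reachable=False)
    for step in reporting_steps(after_hours):
        print(step)
```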

Need template for reporting.

8.2. Triage

8.2.1. Verify the incoming incident reports

Need work here

8.2.2. Assign a severity classification

Through an initial assessment of the incident, assign a severity classification according to:

High:

The incident could lead to exploitation of the trust fabric, i.e. user and host identities, or
the incident could lead to instability of the overall Grid, or
a denial-of-service is in progress against all replicas of a given Grid service.

Medium:

The incident affects an instance of a Grid service, but Grid stability is not at risk, or
a denial-of-service affects one replica of a given Grid service, or
a local attack compromised a privileged user account.

Low:

A local attack compromised individual user, non-privileged credentials, or
a denial-of-service attack or compromise affects only local grid resources.
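The severity levels above lend themselves to a simple rule check, evaluated from High down to Low. The sketch below is illustrative only: the field names on Assessment are hypothetical, and treating every incident that meets no High or Medium criterion as Low is an assumption of this example, not a rule stated in this Guide.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class Assessment:
    # High criteria
    trust_fabric_at_risk: bool = False        # user or host identities could be exploited
    grid_stability_at_risk: bool = False
    dos_against_all_replicas: bool = False
    # Medium criteria
    grid_service_instance_affected: bool = False
    dos_against_one_replica: bool = False
    privileged_account_compromised: bool = False


def classify(a: Assessment) -> Severity:
    """Assign the highest severity level whose criteria are met."""
    if a.trust_fabric_at_risk or a.grid_stability_at_risk or a.dos_against_all_replicas:
        return Severity.HIGH
    if (a.grid_service_instance_affected or a.dos_against_one_replica
            or a.privileged_account_compromised):
        return Severity.MEDIUM
    # Assumption: anything else (non-privileged credentials, local-only impact) is Low.
    return Severity.LOW


if __name__ == "__main__":
    print(classify(Assessment(privileged_account_compromised=True)).value)
```

Checking the High criteria first means that an incident meeting criteria at several levels is always assigned the most severe one.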

8.2.3. Engage response activities

Need work here

8.3. Containment

There are three areas of concern for containment of an attack: (1) preventing further spread of the attack through local services/resources; (2) preventing further attacks from external grid services/resources; and (3) protecting the grid from attacks sourced at a different site. For this discussion, we will assume the local site already has procedures in place to handle (1); however, it should be validated as part of the site registration process.

8.3.1. Protection from attacks through the grid

Attacks originating from the grid might be coming from (1) a grid service hosted at the local site; (2) a grid service hosted at a remote site; (3) a shared authentication (group account where some other process possibly at some other site has handled the authentication and authorization of the user to request this resource/service); and (4) a single grid user. As a general matter, the level of response must take into account a number of factors:

has the resource/service been compromised, or is it just under attack?

the kind of attack - DOS or user or privileged user compromise?

the importance of the resource/service locally?

the importance of the resource/service to the operation of the grid?

the importance of the resource/service to various Virtual Organizations?

As an operational principle for the site, the normal response should be to err on the side of caution and block access from the grid during the initial stages of dealing with an intrusion – only opening access as is prudent and justified, without extraordinary risk. Having this policy results in two beneficial effects: (1) it gives sites more freedom of action and more confidence they can act to protect themselves without bringing down the wrath of the grid community; and (2) it will hopefully result in more redundancy of services, better failover, and applications that are more robust to outages in various parts of the grid (by putting the responsibility on the middleware and application developers to design a more failsafe environment).

Sites MUST inform the grid operations center of actions they take affecting grid resources/services.

For a grid service/resource hosted at the local site, it MUST have an interface allowing it to be disabled and SHOULD be able to inform central scheduling and monitors that it is entering a disabled state. It is assumed that local site policies will handle containment issues from locally hosted resources/services to the rest of their infrastructure.

For a grid service/resource hosted at a remote site, an interface MUST be provided to local services and resources to block requests or access from the remote service. For a compromise-style attack, blocking authorizations at the appropriate level is probably sufficient. Queries from remote monitors and schedulers SHOULD be told that access is blocked at the appropriate level. For DOS-style attacks, lower-level protocol blocking is likely to be necessary but there SHOULD still be a way to inform schedulers and monitors that access is being blocked.

For group and single user accounts, the initial response is probably the same: to temporarily deny access to the resource/service through the appropriate local control on authorization that MUST be provided. In the case of the group account, the follow-up action is different since the service that provides for the "grouping" must be contacted so that it can perform corrective actions before the group is re-enabled. In the single user case the follow-up action goes directly back to the VO for resolution.
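The local authorization control described above could take many forms. The following is a minimal sketch, assuming a simple in-memory block list; the class LocalAccessControl, the subject kinds, and the returned notification list are hypothetical illustrations rather than an interface defined by this Guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Party to contact before a blocked subject is re-enabled, per the text above.
# No specific follow-up party is named for remote services, so None is used.
FOLLOW_UP: Dict[str, Optional[str]] = {
    "remote_service": None,
    "group": "service that provides the grouping",
    "user": "the user's VO",
}


@dataclass
class LocalAccessControl:
    # Maps (kind, name) to the reason access was temporarily denied.
    blocked: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def block(self, kind: str, name: str, reason: str) -> List[str]:
        """Temporarily deny access and return the parties that must be informed."""
        if kind not in FOLLOW_UP:
            raise ValueError(f"unknown subject kind: {kind}")
        self.blocked[(kind, name)] = reason
        notify = ["grid operations center"]  # sites MUST inform the GOC of such actions
        if FOLLOW_UP[kind]:
            notify.append(FOLLOW_UP[kind])
        return notify

    def unblock(self, kind: str, name: str) -> None:
        self.blocked.pop((kind, name), None)

    def check(self, kind: str, name: str) -> Tuple[bool, Optional[str]]:
        """Return (allowed, message); the message tells schedulers and monitors why."""
        reason = self.blocked.get((kind, name))
        if reason is None:
            return True, None
        return False, f"access blocked: {reason}"


if __name__ == "__main__":
    acl = LocalAccessControl()
    print(acl.block("group", "example-group", "suspected credential compromise"))
    print(acl.check("group", "example-group"))
```

Returning an explicit "access blocked" message, rather than silently dropping requests, reflects the recommendation that schedulers and monitors be told when access is being blocked.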

8.3.2. Protection of the grid from attacks through a site

Many of the considerations from the above discussion also apply here. One would like to believe that the Grid Operations Center would generally have the ability to block a site or service that was misbehaving, and while that might be true in cases for specific centrally controlled middleware services, it will not be true for the vast majority of services on the grid nor will it be true for federated grids that have their own operations centers.

A problematic site or service might be reported to the grid operations center by a site on the grid; by a grid operations center, site, or ISP on another grid or independent of grids; or might be discovered by the monitoring capabilities of the grid operations center itself.

Depending on the severity of the attack and based on the sites potentially affected, the grid operations center will attempt to notify site, resource and service providers so they can take appropriate action to protect themselves.

The second phase of containment is the process of narrowing down the things that are blocked to the specific sites, resources, services and users which were compromised. Incident response teams at the sites, in communication with their peers through the established, secure email list, are expected to restore normal operation as quickly as the problem areas can be identified and isolated.

8.4. Initial notification

8.4.1. Incident report template (see "incident reporting formats")

Need work here. Potentially IODEF; what do we want out of IODEF - define our subset of IODEF information; lightweight is good; linkage to EGEE - keep coordinated, track with LCG GOC and Yuri


8.4.2. Communications network

Need work here

8.4.3. Acknowledgements to notifications

Need work here

8.4.4. End-user notifications

when to contact end-users
how to know what users to contact
how to contact users

Need work here

8.4.5. Management

Need work here

8.5. Analysis and response

8.5.1. Considerations:

8.5.1.1. Tracking

All work performed in incident analysis and response should be tracked and reported to the OSG SOC, including:

o responder(s)
o containment actions taken
o what was determined
o what steps taken to respond/recover
o what was the extent of damage
o man-hours required in response
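The tracked items listed above could be captured in a simple record so that gaps are visible before a report goes to the OSG SOC. This is a sketch only: the record layout, the incident_id field, and the JSON output are assumptions made for illustration, since the reporting format is still to be defined (see "incident reporting formats").

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class IncidentWorkLog:
    incident_id: str  # hypothetical identifier, not defined by this Guide
    responders: List[str] = field(default_factory=list)
    containment_actions: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)        # what was determined
    recovery_steps: List[str] = field(default_factory=list)  # steps taken to respond/recover
    extent_of_damage: str = ""
    person_hours: float = 0.0

    def missing_fields(self) -> List[str]:
        """Names of tracked items that have not been filled in yet."""
        return [name for name, value in asdict(self).items()
                if name != "incident_id" and not value]

    def to_report(self) -> str:
        """Serialize the record for submission to the SOC."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    log = IncidentWorkLog("example-001", responders=["site security contact"], person_hours=3.5)
    print(log.missing_fields())
    print(log.to_report())
```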

8.5.1.2. Evidence collection

Need work here

8.5.2. Extent of user and host identity compromise

Determine the extent of known and potential compromise of user and host credentials and passwords. Did the initial containment step treat the entire scope of the compromise? Work with contacts at all sites to revoke/suspend credentials and passwords. Don't permit users to reactivate accounts and change passwords until it's known that the compromise doesn't involve keystroke logging trojans. Need additional work here.

8.5.3. Analysis, removal and recovery

o Maintain regular communications of status and observations to the incident handling mailing list. Need additional work here.

o Need specific procedures for recovery from identity compromises, e.g. a large and organizationally diverse body of users may be affected. How to contact and support all users?

8.5.4. Experts group to aid sites in analysis and response (e-mail list)

Need additional work here.

8.5.5. OSG SOC provides a conference phone bridge for response communications.

Need additional work here.

8.6. Tracking and progress notification

Need additional work here.

ticketing system?; and/or other repositories?

figure out the relationship IODEF to ticketing, plugins to RT?

is incident reporting format (see below), e.g. IODEF sufficient, or is tracking accomplished utilizing a ticketing system in conjunction with reporting format?

8.7. Escalation

o According to the incident severity, services affected, etc. are the proper levels of management informed and engaged?

o According to the incident severity, services affected, etc. is response activity proceeding appropriately?

Need to define and document the management-level interfaces for escalation

8.8. Reporting

o controlling information release
o refer to information disclosure guidelines (below)
o to the immediate community
o to the larger grid community
o to ISACs
o to US-CERT
o to law enforcement
o to the public

Need additional work here.

8.9. Public relations

Incident information is confidential. The responsibility and authority for public communications and relations lies first with each site - with respect to information involving that site, and then with the Grid, VO, or appropriate Management entity. In all cases, the users, system administrators, engineers, technicians, and other support personnel involved in handling of an incident must refrain from making public statements and should provide timely information to site, Grid, and VO management regarding the incident.

Refer to information disclosure guidelines (below).

8.10. Post-incident analysis

Need additional work here.

9. Communications support

9.1. Contact lists

Each Grid security operations center must maintain incident handling and response contact lists. Information should be maintained for:

o Site contact / Security Contacts

o Grid and VO management contacts

o Security Operations Centers for other Grids

Sensitive or confidential information will at times be shared among the contacts; therefore all contacts must be properly designated by their home institution. Currency of the information must be aggressively maintained.

The SOC can maintain this information, or can rely on the REN-ISAC Cyber Security Registry as a mechanism to collect, maintain, and provide access to the information. The Registry contains 24x7 operational and management contact information for cyber security at participating institutions. Registry information is vetted and aggressively maintained. In addition to the broad trusted circle established by the Registry, flexible subcommunities can be defined to circumscribe entities such as Grids, regional networks, GigaPoPs, etc. The Registry contains personal and functional account contact information, and institutional information including owned network address blocks. Information in the Registry is accessible to individuals participating in the Registry.


A schema for Site contact / Security Contact should minimally contain:

o site name

o site location

o CSIRT email

o CSIRT phone

o primary contact name

o primary contact phone

o primary contact 24x7 phone/pager

o primary contact email

o secondary contact […]
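The minimal schema above can be sketched as a record type with a completeness check. This is only an illustration: the class name, field names, and the `missing_fields` helper are assumptions made for the example, not part of any OSG or REN-ISAC tool. Because the source list is truncated after the secondary contact, that entry is kept as a single optional value rather than guessed at.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SiteContact:
    """One Site contact / Security Contact record (illustrative field names)."""
    site_name: str
    site_location: str
    csirt_email: str
    csirt_phone: str
    primary_contact_name: str
    primary_contact_phone: str
    primary_contact_24x7_phone: str  # 24x7 phone or pager
    primary_contact_email: str
    # The source list is truncated after "secondary contact", so the
    # secondary details are held as one optional free-form value.
    secondary_contact: Optional[str] = None


def missing_fields(record: SiteContact) -> List[str]:
    """Return the names of required fields that are empty or whitespace."""
    missing = []
    for f in fields(record):
        if f.name == "secondary_contact":
            continue  # optional in this sketch
        value = getattr(record, f.name)
        if not value or not value.strip():
            missing.append(f.name)
    return missing
```

A check of this kind could be run whenever a record is added or refreshed, which supports the requirement that currency of the information be aggressively maintained.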

methods

trust establishment and maintenance

what if a Physics department is by itself, without a security team, etc.?

9.2. Normal communication channels

9.3. Secure communications

Communications for the Incident Handling and Response Teams must be supported by secure mailing lists supporting digital signature and encryption.

9.4. Phone bridge

Communications for the Incident Handling and Response Teams must be supported by the 24x7 ad hoc availability of a secure method to hold phone conferences.

10. Web site

11. Periodic reporting

o number of handled incidents

o response time

o time-to-live of incidents

o successful practices

o practices requiring improvement
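The quantitative items in the list above (number of handled incidents, response time, time-to-live) could be derived from incident records along the following lines. This is a minimal sketch under stated assumptions: the `IncidentRecord` fields are hypothetical, "handled" is taken to mean "closed", response time is measured from report to first response, and time-to-live from report to closure. The Guide does not define these terms.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class IncidentRecord:
    reported: datetime           # when the incident was reported
    first_response: datetime     # when handling began
    closed: Optional[datetime]   # None while the incident is still open


def summarize(incidents: List[IncidentRecord]) -> Dict[str, Optional[float]]:
    """Compute the quantitative periodic-reporting figures, in hours."""
    closed = [i for i in incidents if i.closed is not None]
    response_hours = [
        (i.first_response - i.reported).total_seconds() / 3600 for i in incidents
    ]
    ttl_hours = [(i.closed - i.reported).total_seconds() / 3600 for i in closed]
    return {
        # Assumption: an incident counts as "handled" once it is closed.
        "handled_incidents": len(closed),
        "mean_response_hours": mean(response_hours) if response_hours else None,
        "mean_time_to_live_hours": mean(ttl_hours) if ttl_hours else None,
    }
```

The qualitative items (successful practices, practices requiring improvement) would still need to be written by the people who handled the incidents.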

12. Relationships to other entities

o TeraGrid

o EGEE

o REN-ISAC

o network service providers

o CERT/CSIRT

13. Outreach

o training and workshops

o conferences

14. Security Operations Center

As described in 2. The Purpose of this Document, "The vision articulated in this document is not to establish a centralized OSG incident handling and response organization, but to establish a concept of operations for individual Grid security efforts, permitting a harmonization and collaboration in the collective Grid space." Among the requirements of a Grid security operations center are:

o Aggressive development and maintenance of contacts

o Develop and maintain the list of locally-assigned Security Contacts

o Coordination of and mailing list support for the experts group which supports sites in analysis and response

o Mailing list server which supports encrypted communications

o Team for building awareness and outreach

o 24x7 incident response hotline

o 24x7 conference phone bridge

o Incident response IRC server, SSL-secured channel

o Monitoring of intelligence flows, e.g. information from organizations such as REN-ISAC, FIRST, etc.

o Ticketing system for tracking and reporting incidents

o more stuff here

15. Effectiveness evaluation process

Need additional work here.


16. Information disclosure guidelines

classification of information to be released

classification of potential information recipients

information release characteristics

17. Incident reporting formats

standardized format necessary for security incident exchange

IODEF? extension for Grid(?); collaboration with EGEE JRA3(?)

tools

Other?

18. Local processes and tools to support incident response

Site and VO administrators should be conversant with the secure communication mechanisms used in support of the security infrastructure.

Centralized logging should be utilized to permit the maintenance of log integrity in the face of system compromises. The logs should be published to an Incident Response team on request through a secure channel.
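As one possible illustration of centralized logging, a service written in Python could forward its log records to a separate log host using the standard library's `logging.handlers.SysLogHandler`. The function name, logger name, host, and port below are assumptions for the example. Note that plain UDP syslog, as used here, is neither authenticated nor encrypted, so a real deployment would need a protected transport in addition to this sketch.

```python
import logging
import logging.handlers


def build_forwarding_logger(name: str, host: str, port: int = 514) -> logging.Logger:
    """Return a logger that sends each record to a remote syslog collector.

    Records leave the local machine as they are emitted, so an intruder who
    later alters local files cannot rewrite what the collector already holds.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # UDP is the handler's default transport for a (host, port) address.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A call such as `build_forwarding_logger("gridsvc", "loghost.example.org")` (a hypothetical host name) would then be made once at service start-up.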

Sites should track connections through border routers and firewalls. Logs of network and user access information should be kept for a minimum of 2 months. The logs should be published to an Incident Response team on request through a secure channel.
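A log rotation job can enforce the two-month minimum by refusing to purge anything younger than the retention period. The sketch below assumes that "2 months" is rounded up to 62 days and that each log is identified by its date. Both are choices made for this example, not requirements stated in the Guide.

```python
from datetime import date, timedelta
from typing import Iterable, List

# Assumption: "2 months" is treated conservatively as 62 days.
MIN_RETENTION = timedelta(days=62)


def purgeable(log_dates: Iterable[date], today: date,
              retention: timedelta = MIN_RETENTION) -> List[date]:
    """Return, oldest first, the log dates strictly older than the retention period."""
    return sorted(d for d in log_dates if today - d > retention)
```

Logs exactly at the boundary are kept, so the job errs on the side of retaining evidence.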

process accounting

evidence collection

Forensics boot disks should be developed in advance of system compromises to facilitate quick and effective response.

secure communications

Each site and VO should publish a local incident response plan, which must include a well-defined point of contact with the grid operations center and security operations center.

Each site should maintain a list of system administrators for each resource accessible to the grid.

Each VO should maintain a list of users who have administrative privilege.


Site and Virtual Organization administrators should maintain an online log of the response to an incident, including the information available and the actions taken. This information should be maintained for a minimum of 2 months.

Site and VO administrators should participate in testing of vulnerabilities.

formal interface to middleware development activities

OCSP

logging

All grid services should log access information on invocation.

Action logs on grid services must be sufficient to determine which identity was associated with all processes and which AA tokens might be exposed in an incident.
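One way to meet these two requirements is for each service to write a structured record per invocation that ties the authenticated identity to the local process and to the credential presented. The sketch below is illustrative only: the field names, the JSON-lines layout, and the helper functions are assumptions, and the distinguished names used with it would be site-specific.

```python
import json
from typing import Iterable, Set


def access_record(timestamp: str, service: str, subject_dn: str,
                  local_account: str, pid: int, token_id: str) -> str:
    """Serialize one service invocation as a single JSON log line."""
    return json.dumps({
        "timestamp": timestamp,
        "service": service,
        "subject_dn": subject_dn,        # identity that invoked the service
        "local_account": local_account,  # account the request was mapped to
        "pid": pid,                      # process started on its behalf
        "token_id": token_id,            # identifier of the AA token presented
    }, sort_keys=True)


def identities_for_pid(log_lines: Iterable[str], pid: int) -> Set[str]:
    """Return every identity the log associates with the given process ID."""
    return {r["subject_dn"] for r in map(json.loads, log_lines) if r["pid"] == pid}


def exposed_tokens(log_lines: Iterable[str], service: str) -> Set[str]:
    """Return the tokens seen by a service, i.e. candidates for revocation if it is compromised."""
    return {r["token_id"] for r in map(json.loads, log_lines) if r["service"] == service}
```

With records of this shape, an incident handler can answer both questions in the requirement, namely which identity a suspicious process belonged to and which tokens passed through a compromised service.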

auditing

process to look at the decisions that were made at various points in time; ability to trace an incident

Services which accept delegated credentials must be auditable to resolve claims of challenged authentication and exposed risk.

19. Guidance to middleware and grid service developers

Middleware and grid services should be secure and facilitate security incident analysis and response. At the initial draft of this document, methods and policies are needed for suspension of identities, and richer logging is required throughout middleware and services.

Define here a structure for facilitating the relationship between security practitioners and middleware/service developers, and for maintaining and communicating the list of desired enhancements.

20. Relationships

The following are activities with which the OSG and iVDGL Security Incident Handling Activity should establish relationships:

DOE Grids PKI Service
DOE Grids Certificate Policy and Certification Practice Statement
Guidelines for Security Incident Response and Resolution
http://www.doegrids.org/Docs/CP-CPS.pdf

EGEE JRA3: Security
http://egee-jra3.web.cern.ch/egee-jra3/index.html


EGEE Global Security Architecture (EU Deliverable DJRA3.1)
section: Security Considerations: Incident Response
https://edms.cern.ch/document/487004/

LCG Joint Security Group
http://proj-lcg-security.web.cern.ch/proj-lcg-security/

JSG Incident Response Activity
http://proj-lcg-security.web.cern.ch/proj-lcg-security/incident_response.html

Agreement on Incident Response For LCG-1
https://edms.cern.ch/file/428035/LAST_RELEASED/LCG_Incident_Response.pdf

21. Relevant and related standards and practices

RFC 2350 - Expectations for Computer Security Incident Response

RFC 2196 - Site Security Handbook

RFC 3013 - Recommended ISP Security

IETF Extended Incident Handling (INCH)
http://www.ietf.org/html.charters/inch-charter.html

IETF Incident Object Description Exchange Format (IODEF)
http://www.ietf.org/internet-drafts/draft-ietf-inch-implement-00.txt

LCG Security Group, Agreement on Incident Response
https://edms.cern.ch/file/428035/LAST_RELEASED/LCG_Incident_Response.pdf

CERT/CC - Handbook for Computer Security Incident Response Teams
http://www.cert.org/archive/pdf/csirt-handbook.pdf

CERT/CC - Incident Reporting Guidelines
http://www.cert.org/tech_tips/incident_reporting.html

CERT/CC - Creating a Computer Security Incident Response Team: A Process for Getting Started
http://www.cert.org/csirts/Creating-A-CSIRT.html

CERT/CC - State of the Practice of Computer Security Incident Response Teams (CSIRTs)
http://www.cert.org/archive/pdf/03tr001.pdf

22. Useful References and Other Works

"White collar" Attacks on Web Services and Grids
Grid Security threats analysis and Grid Security Incident data model definition
Draft Version 0.2, August 12, 2004
Yuri Demchenko <[email protected]>
http://www.uazone.org/demch/analytic/draft-grid-security-incident-02.pdf

1 OSG Security Incident Handling Activity Group
http://www.opensciencegrid.org/activities/incident-response/index.html

2 Open Science Grid Consortium
http://www.opensciencegrid.org

3 PPDG
http://www.ppdg.net/

4 iVDGL: International Virtual Data Grid Laboratory
http://www.ivdgl.org/

5 TeraGrid
http://www.teragrid.org/

6 LCG: LHC Computing Grid Project
http://lcg.web.cern.ch/LCG/

7 EGEE: Enabling Grids for E-science in Europe
http://egee-intranet.web.cern.ch/egee-intranet/gateway.html
