Data Center Best Practice and Architecture

  • 1. Data Center
    Best Practices and Architecture
    for the California State University
    Author(s):DCBPA Task Force
    Date:OctAug 127, 2009
    The content of this document is the result of the collaborative work of the Data Center Best Practice and Architecture (DCBPA) Task Force established under the Systems Technology Alliance committee within the California State University.
    Team members who directly contributed to the content of this document are listed below.
    Samuel G. Scalise, Sonoma, Chair of the STA and the DCBPA Task Force
    Don Lopez, Sonoma
    Jim Michael, Fresno
    Wayne Veres, San Marcos
    Mike Marcinkevicz, Fullerton
    Richard Walls, San Luis Obispo
    David Drivdahl, Pomona
    Ramiro Diaz-Granados, San Bernardino
    Don Baker, San Jose
    Victor Vanleer, San Jose
    Dustin Mollo, Sonoma
    David Stein, PlanNet Consulting
    Mark Berg, PlanNet Consulting
    Michel Davidoff, Chancellor's Office
    Table of Contents
    1. Introduction
    1.1. Purpose
    1.2. Context
    1.3. Audience
    1.4. Development Process
    1.5. Principles and Properties
    2. Framework/Reference Model
    3. Best Practice Components
    3.1. Standards
    3.2. Hardware Platforms
    3.3. Software
    3.4. Delivery Systems
    3.5. Disaster Recovery
    3.6. Total Enterprise Virtualization
    3.7. Management Disciplines
    As society and institutions of higher education increasingly benefit from technology and collaboration, identifying mutually beneficial best practices and architecture becomes more important, which makes this document vital to the behind-the-scenes infrastructure of the university. Key drivers behind the gathering and assimilation of this collection are:
    Many campuses want to know what the others are doing so they can draw from a knowledge base of successful initiatives and lessons learned. Having a head start in thinking through operational practices and effective architectures--as well as narrowing vendor selection for hardware, software and services--creates efficiencies in time and cost.
    Campuses are under financial pressure, and data center capital and operating expenses need to be curbed. For many, current growth trends are unsustainable: limited square footage cannot absorb the demand for more servers and storage unless new technologies are implemented to virtualize and consolidate.
    Efficiencies in power and cooling need to be achieved in order to address green initiatives and reduction in carbon footprint. They are also expected to translate into real cost savings in an energy-conscious economy. Environmentally sound practices are increasingly the mandate and could result in measurable controls on higher energy consumers.
    Creating uniformity across the federation of campuses allows for consolidation of certain systems, reciprocal agreements between campuses to serve as tertiary backup locations, and opt-in subscription to services hosted at campuses with capacity to support other campuses, such as the C-cubed initiative.
    This document is a collection of Best Practices and Architecture for California State University Data Centers. It identifies practices and architecture associated with the provision and operation of mission-critical production-quality servers in a multi-campus university environment. The scope focuses on the physical hardware of servers, their operating systems, essential related applications (such as virtualization, backup systems and log monitoring tools), the physical environment required to maintain these systems, and the operational practices required to meet the needs of the faculty, students, and staff. Data centers that adopt these practices and architecture should be able to house any end-user service from Learning Management Systems, to calendaring tools, to file-sharing.
    This work represents the collective experience and knowledge of data center experts from the 23 campuses and the Chancellor's Office of the California State University system. It is coordinated by the Systems Technology Alliance, whose charge is to advise the Information Technology Advisory Committee (made up of campus Chief Information Officers and key Chancellor's Office personnel) on matters relating to servers (i.e., computers which provide a service for other computers connected via a network) and server applications.
    This is a dynamic, living document that can be used to guide planning to enable collaborative systems, funding, procurement, and interoperability among the campuses and with vendors.
    This document does not prescribe services used by end-users, such as Learning Management Systems or Document Management Systems. As those services and applications are identified by end-users such as faculty and administrators, this document will describe the data center best practices and architecture needed to support such applications.
    Campuses are not required to adopt the practices and architecture elucidated in this document. There may be extenuating circumstances that require alternative architectures and practices. However, it is hoped that these alternatives are documented in this process.
    It is not the goal to describe a single solution, but rather the range of best solutions that meet the diverse needs of diverse campuses.
    This information is intended to be reviewed by key stakeholders who have material knowledge of data center facilities and service offerings from business, technical, operational, and financial perspectives.
    Development Process
    The process for creating and updating these Best Practices and Architecture (P&A) is to identify the most relevant P&A, inventory existing CSU P&A for key aspects of data center operations, identify current industry trends, and document those P&A which best meet the needs of the CSU. This will include information about related training and costs, so that campuses can adopt these P&A with a full understanding of the costs and required expertise.
    The work of creating this document will be conducted by members of the Systems Technology Alliance appointed by the campus Chief Information Officers, by members of the Chancellor's Office Technology Infrastructure Services group, and by contracted vendors.
    Principles and Properties
    In deciding which Practices and Architecture should be adopted, it is important to have a set of criteria that reflect the unique needs, values, and goals of the organization. These Principles and Properties include:
    Long-term viability
    Flexibility to support a range of services
    Security of the systems and data
    Reliable and dependable uptime
    Environmental compatibility
    High availability
    Additionally, the architecture should emphasize criteria that are standards-based. The CSU will implement standards-based solutions in preference to proprietary solutions where this does not compromise the functional implementation.
    The CSU seeks to adhere to standard ITIL practices and workflows where practical. Systems and solutions described herein should relate to corresponding ITIL and service management principles.
    Framework/Reference Model
    The framework is used to describe the components and management processes that lead to a holistic data center design. Data centers are as much about the services offered as they are the equipment and space contained in them. Taken together, these elements should constitute a reference model for a specific CSU campus implementation.
    The Information Technology Infrastructure Library is a set of concepts around managing services and operations. The model was developed by the UK Office of Government Commerce and has been refined and adopted internationally. The ITIL version 2 framework for Service Support breaks out several management disciplines that are incorporated in this CSU reference architecture (see Section 2.7).
    ITIL version 3 has reworked the framework into a collection of five volumes that describe:
    Service Strategy
    Service Design
    Service Transition
    Service Operation
    Continual Service Improvement
    The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) releases updated standards and guidelines for industry consideration in building design. They include recommended and allowable environment envelopes, such as temperature, relative humidity, and altitude for spaces housing datacomm equipment. The purpose of the recommended envelope is to give guidance to data center operators on maintaining high reliability and also operating their data centers in the most energy efficient manner.
    Uptime Institute
    The Uptime Institute addresses architectural, security, electrical, mechanical, and telecommunications design considerations. See Section for specific information on tiering standards as applied to data centers.
    ISO/IEC 20000
    The ISO 20000-1 and ISO 20000-2 processes, which belong to the ISO IT management standards, are an effective resource to draw upon. ISO 20000-1 promotes the adoption of an integrated process approach to effectively deliver managed services to meet the business and customer requirements. It comprises ten sections: Scope; Terms & Definitions; Planning and Implementing Service Management; Requirements for a Management System; Planning & Implementing New or Changed Services; Service Delivery Process; Relationship Processes; Control Processes; Resolution Processes; and Release Process. ISO 20000-2 is a 'code of practice' and describes the best practices for service management within the scope of ISO 20000-1. It comprises nine sections: Scope; Terms & Definitions; The Management System; Planning & Implementing Service Management; Service Delivery Processes; Relationship Processes; Resolution Processes; Control Processes; and Release Management Processes.
    Together, this set of ISO standards is the first global standard for IT service management, and is fully compatible with and supportive of the ITIL framework.
    Hardware Platforms
    Rack-mounted Servers provide the foundation for any data center's compute infrastructure. The most common are 1U and 2U: these form factors compose what is known as the volume market. The high-end market, geared towards high-performance computing (HPC) or applications that need more input/output (I/O) and/or storage, is composed of 4U to 6U rack-mounted servers. The primary distinction between volume market and high-end servers is the I/O and storage capabilities.
    Blade Servers are defined by the removal of many components (power supply units (PSUs), network interface cards (NICs), and storage adapters) from the server itself. These components are grouped together as part of the blade chassis and shared by all the blades. The chassis is the piece of equipment that all of the blade servers plug into. The blade servers themselves contain processors, memory and a hard drive or two. One of the primary caveats to selecting the blade server option is the potential for future blade/chassis incompatibility. Most independent hardware vendors (IHVs) do not guarantee blade/chassis compatibility beyond two generations or five years. Another potential caveat is the high initial investment in blade technology because of additional costs associated with the chassis.
    Towers: There are two primary reasons for using tower servers: price and remote locations. Towers offer the least expensive entrance into the server platform market. Towers can also be placed outside the confines of a data center. This feature can be useful for locating an additional Domain Name System (DNS) server or backup server in a remote office for redundancy purposes.
    Application requirements: Applications with high I/O requirements, such as databases and backup servers, are better suited to high-end (HPC) rack-mounted servers. Applications such as web servers and mail transfer agents (MTAs) work well in a volume-market rack-mounted environment or even in a virtual server environment. These applications allow servers to be easily added and removed to meet spikes in capacity demand. The need to have servers that are physically located at different sites for redundancy or ease of administration can be met by tower servers, especially for low-demand applications. Applications with high I/O requirements perform better with 1U or 2U rack-mounted servers than with blade servers because standalone servers have a dedicated I/O interface rather than the shared one found on the chassis of a blade server.
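    The guidance above can be read as a small set of selection rules. The following Python sketch is illustrative only: the AppProfile fields, the suggest_platform function, and the order in which the rules are checked are assumptions made for this example and are not part of the CSU guidance.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Hypothetical summary of an application's hosting needs."""
    name: str
    high_io: bool = False      # e.g. databases, backup servers
    remote_site: bool = False  # must be located outside the data center
    low_demand: bool = False   # light load, suits a tower at a remote site


def suggest_platform(app: AppProfile) -> str:
    """Return a platform suggestion; rule order is an assumption."""
    if app.high_io:
        # High I/O workloads favor servers with dedicated I/O and storage.
        return "high-end rack-mounted server (4U to 6U)"
    if app.remote_site and app.low_demand:
        # Towers can be placed outside the confines of a data center.
        return "tower server"
    # Web servers, MTAs and similar scale-out workloads.
    return "volume rack-mounted server (1U or 2U) or virtual server"


if __name__ == "__main__":
    for app in (
        AppProfile("database", high_io=True),
        AppProfile("branch DNS", remote_site=True, low_demand=True),
        AppProfile("web front end"),
    ):
        print(f"{app.name}: {suggest_platform(app)}")
```

    A campus would likely weigh further factors, such as vendor support and existing hardware standards, before settling on a platform.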
    Software support can determine the platform an application lives on. Some vendors refuse to support virtual servers, making VMs unsuitable if support is a key requirement. Some software does not support multiple instances of an application, requiring the application to run on a single large server rather than multiple smaller servers.
    Storage requirements can vary from a few gigabytes to accommodate the operating system, application and state data for application servers to terabytes to support large database servers. Applications requiring large amounts of storage should be SAN attached using Fibre Channel or iSCSI. Fibre Channel offers greater reliability and performance but requires a higher skill level from SAN administrators. Support for faster speeds and improved reliability is making iSCSI more attractive. Direct Attached Storage (DAS) is still prevalent because it is less costly and easier to manage than SAN storage. Rack-mounted 4U to 6U servers have the space to house a large number of disk drives and make suitable DAS servers.
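    As a rough illustration of the storage trade-offs described above, the Python sketch below picks an attachment type from capacity and available skills. The function name, its parameters, and especially the 1 TB threshold for "large" storage are assumptions made for this example; the text gives no specific cut-off.

```python
# Assumed threshold for "large amounts of storage"; not taken from the source.
LARGE_STORAGE_GB = 1024


def suggest_storage(capacity_gb: float, has_fc_expertise: bool) -> str:
    """Suggest a storage attachment type (hypothetical helper)."""
    if capacity_gb < LARGE_STORAGE_GB:
        # DAS is less costly and easier to manage than SAN storage.
        return "DAS"
    if has_fc_expertise:
        # Fibre Channel: greater reliability and performance, higher skill level.
        return "SAN (Fibre Channel)"
    # iSCSI: lower skill requirement from SAN administrators.
    return "SAN (iSCSI)"


if __name__ == "__main__":
    print(suggest_storage(200, has_fc_expertise=False))
    print(suggest_storage(8192, has_fc_expertise=True))
    print(suggest_storage(8192, has_fc_expertise=False))
```

    The threshold and the skills flag are placeholders that each campus would replace with its own criteria.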
    Consolidation projects can result in several applications being combined onto a single server or being virtualized. Care must be taken when combining applications to ensure they are compatible with each other and that vendor support can be maintained. Virtualization accomplishes consolidation by allowing each application to behave as if it is running on its own server. The benefits of consolidation include reduced power and space requirements and fewer servers to manage.
    Energy efficiency starts with proper cooling design, server utilization management and power management. Replacing old servers with newer, energy-efficient ones reduces energy use and cooling requirements, and the replacements may be eligible for rebates that allow them to pay for themselves.
    Improved management: Many data centers contain best-of-breed technology, with server platforms and other devices from many different vendors. Servers may be from vendor A, storage from vendor B and network from vendor C. This complicates troubleshooting and leads to finger pointing. Reducing the number of vendors produces standardization and is more likely to allow a single management interface for all platforms....

