Storage and Server

Ankit Gupta (Faculty- CSE Dept. COER)

    SERVER-CENTRIC IT ARCHITECTURE AND ITS LIMITATIONS:

In conventional IT architectures, storage devices are normally connected to only a single server (Figure 1.1). To increase fault tolerance, a storage device is sometimes connected to two servers, with only one server actually able to use it at any one time. In both cases, the storage device exists only in relation to the server to which it is connected. Other servers cannot access the data directly; they always have to go through the server that is connected to the storage device. This conventional IT architecture is therefore called server-centric IT architecture. In this approach, servers and storage devices are generally connected together by SCSI cables.

As mentioned above, in conventional server-centric IT architecture storage devices exist only in relation to the one or two servers to which they are connected. The failure of both of these computers would make it impossible to access the data. Most companies find this unacceptable: at least some of the company data (for example, patient files or websites) must be available around the clock. Although the storage density of hard disks and tapes is increasing all the time due to ongoing technical development, the need for installed storage is increasing even faster.

Consequently, it is necessary to connect ever more storage devices to a computer. This raises the problem that each computer can accommodate only a limited number of I/O cards (for example, SCSI cards). Furthermore, the length of a SCSI cable is limited to a maximum of 25 m. This means that the storage capacity that can be connected to a computer using conventional technologies is limited, so conventional technologies are no longer sufficient to satisfy the growing demand for storage capacity.

In server-centric IT environments the storage device is statically assigned to the computer to which it is connected. In general, a computer cannot access storage devices that are connected to a different computer. If a computer requires more storage space than is connected to it, it is no help whatsoever that another computer still has unused storage space attached (Figure 1.2).

Last, but not least, storage devices are often scattered throughout an entire building or branch. Sometimes this is because new computers are set up all over the campus without much planning and then upgraded repeatedly. Alternatively, computers may be consciously set up close to where users access the data in order to reduce LAN traffic. The result is that the storage devices are distributed throughout many rooms that are neither protected against unauthorized access nor sufficiently air-conditioned. This may sound over the top, but many system administrators could write a book about replacing defective hard disks that are scattered all over the country.

    STORAGE-CENTRIC IT ARCHITECTURE AND ITS ADVANTAGES:

Storage networks can solve the problems of server-centric IT architecture that we have just discussed. Furthermore, storage networks open up new possibilities for data management. The idea behind storage networks is that the SCSI cable is replaced by a network that is installed in addition to the existing LAN and is primarily used for data exchange between computers and storage devices (Figure 1.3).

In contrast to server-centric IT architecture, in storage networks storage devices exist completely independently of any computer. Several servers can access the same storage device directly over the storage network without another server having to be involved. Storage devices are thus placed at the centre of the IT architecture; servers, on the other hand, become an appendage of the storage devices that just process data. IT architectures with storage networks are therefore known as storage-centric IT architectures.

When a storage network is introduced, the storage devices are usually also consolidated. This involves replacing the many small hard disks attached to the computers with a large disk subsystem. Disk subsystems currently (in the year 2009) have a maximum storage capacity of up to a petabyte. The storage network permits all computers to access the disk subsystem and share it. Free storage capacity can thus be flexibly assigned to the computer that needs it at the time. In the same manner, many small tape libraries can be replaced by one big one.

    Intelligent Disk Subsystems:

    SUMMARY OF THE INVENTION:

It is therefore an object of the present invention to provide an intelligent hard disk drive subsystem with the functions of a disk drive controller incorporated into the disk drive package.

It is a further object to provide a standard interface between the intelligent disk drive subsystem and the host system.

It is a further object to provide a more compact disk drive and controller subsystem than disk drives which require a separate disk drive controller.

It is a further object to provide a more economical disk drive and controller subsystem than disk drives which require a separate disk drive controller.

It is a further object to provide a method for formatting a disk which eliminates the need for a track 0 sensing switch in the disk drive.

It is a further object to eliminate the ST506 interface between the controller and the disk drive, and to thereby eliminate the heat, cost and detrimental effect on reliability associated therewith.

It is a further object to provide a disk drive subsystem for which the internal recording technique and process are independent of the interface to the host computer system, whereby data transfer rate and media organization and allocation are transparent to the host computer system.

It is a still further object to provide an intelligent disk drive subsystem which inherently provides the benefits associated with tested pair matching of controllers and disk drives, and which inherently provides single source responsibility for compatibility between the controller and the disk drive.

Disk Arrays: Modular and Integrated Arrays:

The main categories of disk array are: network attached storage (NAS) arrays, modular storage area network (SAN) arrays, monolithic (enterprise) arrays, storage virtualization, and utility storage arrays. A short sketch after this list illustrates the file-level versus block-level distinction between NAS and SAN access.

1. Network attached storage (NAS) arrays: Network attached storage is a hard disk storage system on a network with its own LAN IP address. NAS arrays provide file-level access to storage through protocols such as CIFS and NFS. Examples: 3PAR and ONStor UtiliCat Unified Storage, EMC Celerra family, HP StorageWorks All-In-One Storage Systems, HP ProLiant Storage Server, NetApp Filer, Sun StorageTek 5000 family.

2. Modular storage area network (SAN) arrays: A SAN is a dedicated network, separate from LANs and WANs, that is generally used to connect numerous storage resources to one or many servers. SAN arrays provide block-level access to storage through SCSI-based protocols such as Fibre Channel and iSCSI. Modular storage systems typically consist of separate modules, which afford some level of scalability and can be mounted in a standard rack cabinet. Modular storage systems are also sometimes referred to as departmental. Examples: Fujitsu ETERNUS 4000/3000 series storage arrays, HP StorageWorks EVA family, Hitachi Thunder family, IBM DS4000/FAStT family of storage servers, IBM DS6000 series storage servers, Arena Maxtronic Janus Fibre Channel and iSCSI RAID systems, Infortrend EonStor/EonRAID family, NetApp FAS series unified storage servers, ONStor Pantera.

3. Monolithic (enterprise) arrays: Although this is not a strict definition, an array is considered monolithic when even a basic configuration is physically too large to fit into a standard rack cabinet. These arrays are suited for large-scale environments. Enterprise storage systems often provide ESCON and FICON protocols for mainframes in addition to Fibre Channel and iSCSI for open-systems SANs. Examples: HP XP, IBM Enterprise Storage Server (ESS), IBM DS8000 series of storage servers, Infortrend EonStor/EonRAID family.

4. Storage virtualization: Intelligent SAN or storage servers (software that adds disk controller functionality to standard server hardware platforms), i.e. hardware-independent software that typically runs as a control program on top of a standard OS platform (Windows, Linux, etc.). Examples: FalconStor IPStor Software, IBM SAN Volume Controller, NetApp V-Series storage virtualization solutions, RELDATA Unified Storage Gateway Appliance, EMC Invista.

5. Utility storage arrays: 3PAR InServ Storage Servers, NetApp FAS GX Series, Pillar Data Systems Axiom.
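The following Python sketch contrasts the two access models named above from a host's point of view. It is an illustration of the concept, not a vendor API; the mount point /mnt/nas_share and the block device /dev/sdb are hypothetical examples.

    # A minimal sketch contrasting NAS file-level access with SAN block-level
    # access. The paths below are hypothetical examples, not from the notes.

    BLOCK_SIZE = 512  # bytes per block on the example LUN

    def read_via_nas(path: str, length: int) -> bytes:
        # File-level access (NAS): the array's file system resolves the file
        # name, permissions and on-disk layout; the client simply reads a file.
        with open(path, "rb") as f:
            return f.read(length)

    def read_via_san(device: str, block_number: int) -> bytes:
        # Block-level access (SAN): the host sees a raw disk and addresses it
        # by block offset; any file system on top is managed by the host.
        with open(device, "rb") as dev:
            dev.seek(block_number * BLOCK_SIZE)
            return dev.read(BLOCK_SIZE)

    if __name__ == "__main__":
        data = read_via_nas("/mnt/nas_share/reports/q1.csv", 4096)   # CIFS/NFS mount
        block = read_via_san("/dev/sdb", 2048)                       # FC/iSCSI LUN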

    Disk Physical Structure Components:

Major Parts of a Disk Drive: Disk drives are constructed from several highly specialized parts and subassemblies, each designed to optimally perform a very narrowly defined function within the disk drive. These components are:

1. Disk platters
2. Read and write heads
3. Read/write channel
4. Arms and actuators
5. Drive spindle motor and servo control electronics
6. Buffer memory
7. Disk controller

Disk Platters: The physical media where data is stored in a disk drive is called a platter. Disk platters are rigid, thin circles that spin under the power of the drive spindle motor. Platters are built out of three basic layers:

1. The substrate, which gives the platter its rigid form
2. The magnetic layer, where data is stored
3. A protective overcoat layer that helps minimize damage to the disk drive from microscopically sized dust particles

The three layers within a disk platter are illustrated in Figure 4-1, which shows both the top and bottom sides of a platter.


Read and Write Heads: The recording heads used for transmitting data to and from the platter are called read and write heads. Read/write heads are responsible for recording and playing back data stored on the magnetic layer of disk platters. When writing, they induce magnetic signals that are imprinted on the magnetic molecules in the media, and when reading, they detect the presence of those signals.

The performance and capacity characteristics of disk drives depend heavily on the technology used in the heads. Disk heads in most drives today implement giant magnetoresistive (GMR) technology, which uses the detection of resistance variances within the magnetic layer to read data. GMR recording is based on writing very low-strength signals to accommodate high areal density. This also affects the height at which the heads "fly" over the platter.

The distance between the platter and the heads is called the flying height or head gap, and is approximately 15 nanometers in most drives today. This is much smaller than the diameter of most microscopic dust particles. Considering that head gap tolerances are so incredibly close, it is obviously a good idea to provide a clean and stable environment for the tens, hundreds, or thousands of disk drives running in a server room or data center. Disk drives can run in a wide variety of environments, but reliability improves with air quality: in other words, relatively cool and free from humidity and airborne contaminants.

Read/Write Channel: The read/write channel is implemented in small high-speed integrated circuits that use sophisticated signal processing techniques and signal amplifiers. The magnetoresistive phenomenon that is detected by the read heads is very faint and requires significant amplification.

Readers might find it interesting to ponder that data read from disk is not actually based on detecting the magnetic signal that was written to the media. Instead, it is done by detecting minute differences in the electrical resistance of the media caused by the presence of different magnetic signals. Amazingly, this resistance is detected by a microscopically thin head that does not make contact with the media but floats over it at very high speed.

Arms and Actuators: The read and write heads have to be precisely positioned over specific tracks. As the heads are very small, they are connected to disk arms, which are thin, rigid, triangular pieces of lightweight alloy. Like everything else inside a disk drive, the disk arms are made with microscopic precision so that the read/write heads can be positioned next to the platters quickly and accurately.


The disk arms are connected at the base to the drive actuator, which is responsible for positioning the arms. The actuator's movements are controlled by voice-coil drivers; the name is derived from the voice coil technology used to make audio speakers. Considering that some speakers have to vibrate at very high frequencies to reproduce sounds, it is easy to see how disk actuators can be designed with voice coils to move very quickly. The clicking sounds you sometimes hear in a disk drive are the sounds of the actuator being moved back and forth.

Buffer Memory: The mechanical nature of reading and writing data on rotating platters limits the performance of disk drives to approximately three orders of magnitude (1000 times) less than the performance of data transfers to memory chips. For that reason, disk drives have internal buffer memory to accelerate data transmissions between the drive and the storage controller using it.

Disk Partitioning (Logical Partitioning): Disk partitioning is the creation of divisions of a hard disk. Once a disk is divided into several partitions, directories and files can be grouped by categories such as data type and usage. More separate data categories provide more control, but too many become cumbersome. Space management, access permissions and directory searching are based on the file system installed on a partition. Careful consideration of the size of a partition is necessary, as the ability to change its size later depends on the file system installed on it.

Purposes for Partitioning:

Separation of the operating system files from user files
Having a partition for swapping/paging
Keeping frequently used programs and data near each other
Having cache and log files separate from other files

Primary (or Logical): A primary (or logical) partition contains one file system. In MS-DOS and earlier versions of Microsoft Windows systems, the first partition (C:) must be a "primary partition". Other operating systems may not share this limitation; however, this can depend on other factors, such as a PC's BIOS.

Extended: An extended partition is secondary to the primary partition(s). A hard disk may contain only one extended partition, which can then be sub-divided into logical drives, each of which is (under DOS and Windows) assigned an additional drive letter.

For example, under either DOS or Windows, a hard disk with one primary partition and one extended partition, the latter containing two logical drives, would typically be assigned the three drive letters C:, D: and E: (in that order).
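On PCs that use the classic MBR scheme, the primary and extended partitions described above are recorded in a 64-byte partition table inside the first sector of the disk. The Python sketch below parses that table from a raw disk image; the image name "disk.img" is a hypothetical example, and the sketch decodes only the four primary entries, ignoring the chain of logical drives inside an extended partition.

    # Minimal MBR partition-table parser, under the assumptions stated above.
    import struct

    SECTOR = 512
    EXTENDED_TYPES = {0x05, 0x0F}          # partition types marking an extended partition

    def read_mbr_partitions(image_path: str):
        with open(image_path, "rb") as f:
            mbr = f.read(SECTOR)
        if mbr[510:512] != b"\x55\xaa":    # boot-sector signature
            raise ValueError("missing MBR signature")
        parts = []
        for i in range(4):                 # four primary slots at offset 446
            entry = mbr[446 + 16 * i: 446 + 16 * (i + 1)]
            status, ptype = entry[0], entry[4]
            start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
            if ptype == 0:
                continue                   # empty slot
            parts.append({
                "slot": i + 1,
                "bootable": status == 0x80,
                "type": hex(ptype),
                "extended": ptype in EXTENDED_TYPES,
                "start_lba": start_lba,
                "size_mib": num_sectors * SECTOR // (1024 * 1024),
            })
        return parts

    if __name__ == "__main__":
        for p in read_mbr_partitions("disk.img"):
            print(p)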

RAID and Parity Algorithms: RAID, which stands for Redundant Arrays of Inexpensive Disks (as named by the inventors) or Redundant Arrays of Independent Disks (a name which later developed within the computing industry), is a technology that employs the simultaneous use of two or more hard disk drives to achieve greater levels of performance, reliability, and/or larger data volume sizes.


RAID Principles: RAID combines two or more physical hard disks into a single logical unit by using either special hardware or software. Hardware solutions are often designed to present themselves to the attached system as a single hard drive, and the operating system is unaware of the technical workings. Software solutions are typically implemented in the operating system, and again present the RAID drive as a single drive to applications.

There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping, the splitting of data across more than one disk; and error correction, where redundant data is stored to allow problems to be detected and possibly fixed (known as fault tolerance).
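To make the striping and error-correction ideas concrete, here is a small Python sketch of RAID 5-style parity: a stripe of data strips is protected by one parity strip computed as the bytewise XOR of the others, so any single lost strip can be recomputed from the rest. The strip size and member count are arbitrary illustration values, not parameters from the notes.

    # Minimal sketch of RAID 5-style XOR parity over one stripe.
    from functools import reduce

    def xor_strips(strips):
        # Bytewise XOR of equally sized strips.
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)

    def write_stripe(data_strips):
        # Return the data strips plus their parity strip, as stored across members.
        return data_strips + [xor_strips(data_strips)]

    def recover_strip(stripe, failed_index):
        # Rebuild the strip held by the failed member from all surviving strips.
        survivors = [s for i, s in enumerate(stripe) if i != failed_index]
        return xor_strips(survivors)

    if __name__ == "__main__":
        strip_size = 8                                        # bytes, for illustration
        data = [bytes([i] * strip_size) for i in (1, 2, 3)]   # three data members
        stripe = write_stripe(data)                           # fourth member holds parity
        assert recover_strip(stripe, 1) == data[1]            # lost data strip rebuilt
        assert recover_strip(stripe, 3) == stripe[3]          # lost parity strip rebuilt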


Hot Sparing: If a drive fails in a RAID array that includes redundancy (meaning all levels except RAID 0), it is desirable to get the drive replaced immediately so that the array can be returned to normal operation. There are two reasons for this: fault tolerance and performance. While the array is running in degraded mode after a drive failure, most RAID levels have no fault protection at all until the failed drive is replaced: a RAID 1 array is reduced to a single drive, and a RAID 3 or RAID 5 array becomes equivalent to a RAID 0 array in terms of fault tolerance. At the same time, the performance of the array is reduced, sometimes substantially. A hot spare is a standby drive kept in the array so that rebuilding onto it can begin as soon as a member drive fails, minimizing the time spent in degraded mode.
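As a continuation of the parity sketch above, the following Python fragment shows the idea behind rebuilding onto a hot spare: every stripe's missing strip is recomputed from the surviving members and written to the spare, which then takes the failed member's place. The in-memory lists standing in for drives, the toy strip size and the member count are purely illustrative assumptions.

    # Illustrative hot-spare rebuild for the XOR-parity scheme sketched earlier.
    # Each "drive" is just a list of strips; a real controller rebuilds stripe
    # by stripe while continuing to serve degraded I/O.
    from functools import reduce

    def xor_strips(strips):
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)

    def rebuild_onto_spare(drives, failed, spare):
        # Recompute every strip of the failed member from the survivors.
        survivors = [d for i, d in enumerate(drives) if i != failed]
        for stripe_no in range(len(spare)):
            spare[stripe_no] = xor_strips([d[stripe_no] for d in survivors])
        drives[failed] = spare            # the spare takes the failed member's place
        return drives

    if __name__ == "__main__":
        # Three data drives plus one parity drive, two stripes each (toy sizes).
        data = [[bytes([i + s] * 4) for s in range(2)] for i in (1, 2, 3)]
        parity = [xor_strips([data[0][s], data[1][s], data[2][s]]) for s in range(2)]
        drives = data + [parity]
        original = [row[:] for row in drives]
        spare = [b"\x00" * 4, b"\x00" * 4]
        rebuilt = rebuild_onto_spare(drives, failed=1, spare=spare)
        assert rebuilt[1] == original[1]  # failed member's contents restored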

Disk Organization:

1. Disk Storage Organization
   Tracks, sectors and clusters
   Sides and heads
   Cylinders
   Disk controllers

2. File Systems
   Boot Record
   FAT (File Allocation Table)
   Directory and Directory Entry
   Files

(The sketch below shows how the cylinder, head and sector coordinates of a disk's geometry map to a linear block address.)
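As a brief illustration of the geometry terms in the outline above, the classic mapping from a (cylinder, head, sector) coordinate to a linear block address (LBA) is LBA = (C x heads_per_cylinder + H) x sectors_per_track + (S - 1), with sectors numbered from 1. The geometry values in the Python sketch below are arbitrary examples, not figures from the notes.

    # Classic CHS-to-LBA mapping; the geometry values are arbitrary examples.
    HEADS_PER_CYLINDER = 16
    SECTORS_PER_TRACK = 63

    def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
        # Sectors are traditionally numbered starting at 1, hence "sector - 1".
        return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)

    if __name__ == "__main__":
        print(chs_to_lba(0, 0, 1))    # first sector of the disk -> LBA 0
        print(chs_to_lba(2, 5, 10))   # (2*16 + 5)*63 + 9 = 2340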

Front end Connectivity: Front-end connections are the connections between the host adapters and the storage interfaces. There are various technologies for front-end connectivity:

ATA, Fibre Channel, PATA, SATA, SAS, SCSI


Advanced Technology Attachment (ATA) is a standard interface for connecting storage devices such as hard disks, solid-state disks and CD-ROM drives inside personal computers. The standard is maintained by the X3/INCITS committee T13. Many synonyms and near-synonyms for ATA exist, including abbreviations such as IDE (Integrated Drive Electronics) and ATAPI (Advanced Technology Attachment Packet Interface). Also, with the market introduction of Serial ATA in 2003, the original ATA was retroactively renamed Parallel ATA (PATA).

Fibre Channel, or FC, is a gigabit-speed network technology primarily used for storage networking. Fibre Channel is standardized in the T11 Technical Committee of the InterNational Committee for Information Technology Standards (INCITS), an American National Standards Institute (ANSI)-accredited standards committee. It started out primarily in the supercomputer field, but has become the standard connection type for storage area networks (SANs) in enterprise storage. Despite the common connotations of its name, Fibre Channel signaling can run on both twisted-pair copper wire and fiber-optic cables; said another way, fiber (ending in "er") always denotes an optical connection, whereas fibre (ending in "re") is always the spelling used in "Fibre Channel" and denotes a physical connection which may or may not be optical. Fibre Channel Protocol (FCP) is a transport protocol (similar to TCP used in IP networks) which predominantly transports SCSI commands over Fibre Channel networks.

Serial Attached SCSI (SAS) is a data transfer technology designed to move data to and from computer storage devices such as hard drives and tape drives. It is a point-to-point serial protocol that replaces the parallel SCSI bus technology that first appeared in the mid-1980s in corporate data centers, and it uses the standard SCSI command set.

Serial Advanced Technology Attachment (SATA) is a computer bus primarily designed for the transfer of data between a computer and mass storage devices such as hard disk drives and optical drives. The main advantages over the older parallel ATA interface are faster data transfer, the ability to remove or add devices while operating (hot swapping), thinner cables that let air cooling work more efficiently, and more reliable operation with tighter data integrity checks.

Small Computer System Interface (SCSI) is a set of standards for physically connecting and transferring data between computers and peripheral devices. The SCSI standards define commands, protocols, and electrical and optical interfaces. SCSI is most commonly used for hard disks and tape drives, but it can connect a wide range of other devices, including scanners and CD drives. The SCSI standard defines command sets for specific peripheral device types; the presence of "unknown" as one of these types means that in theory it can be used as an interface to almost any device, but the standard is highly pragmatic and addressed toward commercial requirements. SCSI is an intelligent interface: it hides the complexity of the physical format, and every device attaches to the SCSI bus in a similar manner. SCSI is a peripheral interface: up to 8 or 16 devices can be attached to a single bus. There can be any number of hosts and peripheral devices, but there should be at least one host.
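To show what "defining commands" means in practice, the Python sketch below builds the 6-byte command descriptor block (CDB) for the standard SCSI INQUIRY command, which a host sends to ask a device to identify itself. The layout used here (opcode 0x12, EVPD bit, page code, 16-bit allocation length, control byte) follows the SPC-3 definition; actually sending the CDB would require an OS pass-through mechanism such as SG_IO on Linux, which is not shown.

    # Build a 6-byte SCSI INQUIRY CDB (SPC-3 layout). This only constructs the
    # command bytes; dispatching them to a device needs an OS pass-through
    # interface, which is outside this sketch.
    import struct

    INQUIRY_OPCODE = 0x12

    def build_inquiry_cdb(allocation_length: int = 96,
                          evpd: bool = False,
                          page_code: int = 0) -> bytes:
        # Ask the device for up to allocation_length bytes of standard inquiry
        # data, or of a vital product data page if the EVPD bit is set.
        return struct.pack(
            ">BBBHB",
            INQUIRY_OPCODE,          # byte 0: operation code
            0x01 if evpd else 0x00,  # byte 1: EVPD bit
            page_code,               # byte 2: VPD page code (0 when EVPD is 0)
            allocation_length,       # bytes 3-4: allocation length, big-endian
            0x00,                    # byte 5: control
        )

    if __name__ == "__main__":
        cdb = build_inquiry_cdb()
        assert len(cdb) == 6 and cdb[0] == 0x12
        print(cdb.hex())             # -> 120000006000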