Sun SPARC Enterprise M3000 Server Architecture

Server sun Prof. Hemant Samant


Introduction
  Sun SPARC Enterprise M3000 Server Overview
  Meeting the Needs of Commercial and Scientific Computing
System Architecture
  System Component Overview
  System Outline
System Bus Architecture: Jupiter Interconnect
  Jupiter Interconnect Architecture
  Performance of Sun SPARC Enterprise M3000 Server
SPARC64 VII Processor
  SPARC64 VII Overview
  SPARC64 VII Microarchitecture
  Details of the Microarchitecture
  Cache System
  Reliability, Availability, and Serviceability Functions
I/O Subsystem
  I/O Subsystem Architecture
Reliability, Availability, and Serviceability
  Redundant and Hot-Swap Components
  Advanced Reliability Features
  Error Detection, Diagnosis, and Recovery
System Management
  eXtended System Control Facility
  Oracle Enterprise Manager Ops Center Software
Oracle Solaris 10
  Observability and Performance
  Availability
  Security
  Virtualization and Resource Management
Conclusion


Introduction

Organizations now rely on technology more than ever before. Today, computer systems play a critical role in every function from product design to customer order fulfillment. In many cases, business success is dependent on the continuous availability of IT services. Once only required in pockets of data centers, mainframe-class reliability and serviceability are now essential for systems throughout an enterprise. In addition, powering data center servers and keeping services running through a power outage are significant concerns. Environmental considerations also play a key role, in areas such as power conservation and miniaturization, amid demand to reduce the load on the environment. New computer systems that consume less power and emit fewer greenhouse gases can play an essential role in protecting the environment.

Although availability is a top priority, costs must also remain within budget and operational familiarity maintained. To deliver networked services as efficiently and economically as possible, organizations look to maximize use of every IT asset through consolidation and virtualization strategies. As a result, modern IT system requirements reach far beyond simple measures of compute capacity. Organizations need highly flexible servers with built-in virtualization capabilities and associated tools, technologies, and processes that work to optimize server use. With budgets still in mind, new computing infrastructures must also help protect current investments in technology and training.


A High-Performance, High-Reliability, Ecologically Sustainable Server: Introducing the Sun SPARC Enterprise M3000 Server

Oracle's Sun SPARC Enterprise servers are highly reliable, easy-to-manage, vertically

scalable systems with all the benefits of traditional mainframes without the associated cost,

complexity, or vendor lock-in (Figure 1). In fact, Sun SPARC Enterprise servers deliver

mainframe-class system architecture at open system prices.

The Sun SPARC Enterprise M3000 server is the entry-class model that has many

characteristics of Sun SPARC Enterprise servers, and shares benefits such as operability and

manageability common to the servers. With symmetric multiprocessing (SMP), a memory subsystem of up to 32 GB, and a high-throughput I/O architecture, the server can sustain core business operations. Further, the server runs the powerful Oracle Solaris 10 operating system

(OS) and includes leading virtualization technologies. Through the innovative Oracle Solaris

Containers virtualization technology, the server brings sophisticated resource control to an

open systems platform.

The server combines high performance, high quality, and ecological sustainability with a resilient system architecture, the advanced functions of Oracle Solaris 10, a compact form factor (two rack units [2U] in a rack cabinet), and top-of-class CPU power among entry-class servers. Moreover, Sun SPARC Enterprise servers offer improved performance over the previous generations of Sun servers, with a clear upgrade path that protects existing investments in software, training, and data center practices. By taking advantage of the Sun SPARC Enterprise M3000 server, IT organizations can create a more powerful infrastructure, optimize hardware use, and increase application availability, resulting in lower operational costs and risks.

Figure 1. Sun SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 servers include many features to help improve uptime, application

performance, and data center efficiency.


Sun SPARC Enterprise M3000 Server Overview

The Sun SPARC Enterprise M3000 server offers numerous performance, reliability, and energy-saving characteristics useful to enterprises. The server features an SMP design that uses the latest generation of SPARC64 processors connected to memory and I/O by a new high-speed, low-latency system interconnect, which delivers exceptional throughput to software applications. Characteristics of the Sun SPARC Enterprise M3000 are listed in Table 1. Also architected to reduce unplanned downtime, this server includes reliability, availability, and serviceability (RAS) capabilities to avoid outages and reduce recovery times. Design features such as high-performance CPU and data path integrity, Memory Extended ECC, end-to-end data protection, hot-swappable components, fault-resilient power options, and hardware redundancy boost the reliability of this server. The environment-conscious design of the Sun SPARC Enterprise M3000 server aims to deliver the reduction in energy consumption that enterprises and data centers require. With the adoption of the SPARC64 VII processor, which achieves low power consumption while delivering high performance, and a structural design with improved cooling efficiency and cooling control, the server achieves power savings, space savings, and quiet operation, reducing the environmental load.

TABLE 1. CHARACTERISTICS OF SUN SPARC ENTERPRISE M3000 SERVER

Enclosure                 Two rack units
SPARC64 VII processors    2.52 GHz; 5 MB Level 2 cache; four cores
Memory                    Up to 32 GB; eight DIMM slots
Internal I/O slots        Four PCIe
External I/O chassis      None
Internal storage          Serial Attached SCSI; up to four drives
Dynamic system domains    Maximum of one
External I/O connections  One SAS port

Meeting the Needs of Commercial and Scientific Computing

Suiting a wide range of computing environments, the Sun SPARC Enterprise M3000 server

provides the availability features needed to support commercial computing workloads along

with the raw performance demanded by the high-performance computing community (Table 2).


TABLE 2. SAMPLE WORKLOADS FOR THE SUN SPARC ENTERPRISE M3000 SERVER

Adaptive services
Business processing (ERP, CRM, OLTP, batch)
Database
Decision support
Data mart
Web services
System and network management
Application development
Application services
Scientific engineering

System Architecture

Continually challenged to do more with less, IT organizations realize that meeting processing

requirements with fewer, more-powerful systems holds economic advantages. In the Sun

SPARC Enterprise M3000 server, the system interconnect, processors, memory subsystem,

and I/O subsystem work together to create a reasonably priced, high-performance platform.

System Component Overview

The design of the Sun SPARC Enterprise M3000 server specifically focuses on delivering

high reliability, outstanding performance, and true SMP throughput. The characteristics and

capabilities of every subsystem work toward this goal. The high-bandwidth system bus, powerful SPARC64 VII processor chips, high-density memory options, and high-speed PCI Express (PCIe) provide not only reliable performance for enterprise applications, but also high levels of uptime and throughput.

System Interconnect

Based on mainframe technology, the Jupiter system interconnect enables high performance

and reliability for the Sun SPARC Enterprise M3000 server. The system controller provides

point-to-point connections among the CPU, memory, and I/O subsystems. The system

interconnect delivers as much as 17 GB/sec of peak bandwidth, offering true SMP

throughput. Additional technical details about the system interconnect are found in the

section titled "System Bus Architecture: Jupiter Interconnect."

SPARC64 VII Processor

The Sun SPARC Enterprise M3000 server uses the SPARC64 VII processor developed by Fujitsu. The SPARC64 VII processor, which has a multicore and multithreading architecture, was designed based on experience accumulated over several decades in the mainframe computer field in the pursuit of excellence in reliability and speed. It is built with 65 nm process technology and has a maximum power consumption of 135 watts. Additional technical details about the SPARC64 VII processor are found in the section titled "SPARC64 VII Processor."

Memory

The memory subsystem of the Sun SPARC Enterprise M3000 server accommodates up to 32

GB of memory. The server uses DDR2 DIMMs with two-way memory interleave to

enhance system performance. Available DIMM sizes include 1 GB, 2 GB, and 4 GB.

Further details about the memory subsystem of the Sun SPARC Enterprise M3000 server

are listed in Table 3.

TABLE 3. SUN SPARC ENTERPRISE M3000 SERVER MEMORY SUBSYSTEM SPECIFICATIONS

Maximum memory capacity   32 GB
DIMM slots                8
Bank size                 4 DIMMs
Number of banks           2
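The figures in Table 3 can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that memory is added one whole bank at a time with identical DIMMs, which the text does not state.

```python
# Illustrative arithmetic only. Values come from Table 3 and the DIMM sizes
# listed in the text; the helper name and population rule are assumptions.
DIMM_SLOTS = 8
DIMMS_PER_BANK = 4
DIMM_SIZES_GB = (1, 2, 4)


def total_memory_gb(dimm_size_gb, banks_populated):
    """Capacity when `banks_populated` banks are filled with identical DIMMs."""
    if dimm_size_gb not in DIMM_SIZES_GB:
        raise ValueError("unsupported DIMM size")
    max_banks = DIMM_SLOTS // DIMMS_PER_BANK
    if not 1 <= banks_populated <= max_banks:
        raise ValueError("bank count out of range")
    return dimm_size_gb * DIMMS_PER_BANK * banks_populated


print(total_memory_gb(4, 2))  # both banks filled with 4 GB DIMMs: 32
```

Under these assumptions, the 32 GB maximum corresponds to all eight slots holding 4 GB DIMMs.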

Beyond performance, the memory subsystem of the Sun SPARC Enterprise M3000 server is

built with reliability in mind. ECC protection is implemented for all data stored in main

memory, and the following advanced features foster early diagnosis and fault isolation that

preserve system integrity and raise application availability:

• Memory patrol. Memory patrol periodically scans memory for errors. This proactive function prevents the use of faulty areas of memory before they can cause system or application errors, improving system reliability.

• Memory Extended ECC. The Memory Extended ECC function provides single-bit error correction, supporting continuous processing despite events such as burst read errors, which are sometimes caused by memory device failures.

PCI Express Technology

The Sun SPARC Enterprise M3000 server uses a PCIe bus to provide high-speed data transfer within the I/O subsystem. To support PCIe expansion cards, the server uses a PCIe physical layer (PCIe PHY) ASIC to manage the implementation of the PCIe protocol. PCIe technology doubles the peak data transfer rates of the original PCI technology and reaches a maximum throughput of 20 Gb/sec. In fact, PCIe was developed to accommodate high-speed interconnects such as Fibre Channel, InfiniBand, and Gigabit Ethernet. Additional technical details about the I/O subsystem are found in the section titled "I/O Subsystem."
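The 20 Gb/sec figure is consistent with an eight-lane PCIe 1.x link running at 2.5 GT/s per lane. The text does not state the lane count or PCIe generation, so both are assumptions in the sketch below, which also shows how 8b/10b line encoding reduces the raw rate to usable payload bandwidth.

```python
# Assumptions (not stated in the text): an x8 PCIe 1.x link at 2.5 GT/s per
# lane, using 8b/10b encoding (8 payload bits per 10 transmitted bits).
LANES = 8
GIGATRANSFERS_PER_LANE = 2.5
ENCODING_EFFICIENCY = 8 / 10


def raw_gbit_per_s(lanes=LANES, rate=GIGATRANSFERS_PER_LANE):
    """Raw signalling rate of the link in Gb/s, per direction."""
    return lanes * rate


def payload_gbyte_per_s(lanes=LANES, rate=GIGATRANSFERS_PER_LANE):
    """Payload bandwidth in GB/s per direction after encoding overhead."""
    return raw_gbit_per_s(lanes, rate) * ENCODING_EFFICIENCY / 8


print(raw_gbit_per_s())       # raw Gb/s per direction
print(payload_gbyte_per_s())  # payload GB/s per direction
```

Under these assumptions the raw rate is 20 Gb/s and the payload rate is about 2 GB/s in each direction, before protocol overhead.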


Service Processor eXtended System Control Facility

Simplifying management of computer systems leads to higher availability levels for hosted

applications. With this in mind, the Sun SPARC Enterprise M3000 server includes the

eXtended System Control Facility (XSCF). The XSCF consists of a dedicated processor that

is independent of the server and runs the XSCF Control Package (XCP) to provide remote

monitoring and management capabilities. This service processor regularly monitors environmental sensors, provides advance warning of potential error conditions, and executes proactive system maintenance procedures as necessary. As long as power is supplied to the server, the XSCF continuously monitors the platform, even when the system is inactive.

The XCP enables audit administration; hardware control; hardware status monitoring, reporting, and handling; automatic diagnosis; and domain recovery. Additional technical details about the XSCF and XCP are found in the section titled "System Management."

Power and Cooling

The Sun SPARC Enterprise M3000 server uses separate modules for power and cooling.

Sensors placed throughout the system measure the temperatures at processors and key ASICs

as well as the exhaust temperature. Hardware redundancy in the power and cooling

subsystems, combined with environmental monitoring, keeps the server operating even under

power or fan fault conditions.

Fan Unit

The Sun SPARC Enterprise M3000 server uses fully redundant, hot-swap fans as the

primary cooling system (Table 4). If a single fan fails, the XSCF detects the failure and

switches the remaining fans to high-speed operation to compensate for the reduced airflow.

The server can operate normally under these conditions, allowing ample time to service the

failed unit. Replacement of fan units can occur without interrupting application processing.
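The failover behavior described above can be pictured as a small control rule. The sketch below is purely illustrative: the class, function, and speed labels are hypothetical and do not represent the XSCF firmware or any real interface.

```python
from dataclasses import dataclass


@dataclass
class FanUnit:
    """Hypothetical model of one fan; not an XSCF data structure."""
    name: str
    failed: bool = False
    speed: str = "normal"


def apply_fan_policy(fans):
    """Run healthy fans at high speed whenever any fan has failed.

    Returns True while at least one fan still provides airflow.
    """
    healthy = [fan for fan in fans if not fan.failed]
    target = "high" if len(healthy) < len(fans) else "normal"
    for fan in fans:
        fan.speed = "stopped" if fan.failed else target
    return bool(healthy)


fans = [FanUnit("FAN#0"), FanUnit("FAN#1")]
fans[0].failed = True
print(apply_fan_policy(fans), [fan.speed for fan in fans])
```

With one of two fans marked as failed, the remaining fan is switched to the high-speed setting and the function reports that cooling is still available.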

TABLE 4. POWER AND COOLING SPECIFICATIONS OF SUN SPARC ENTERPRISE M3000 SERVER

Fan units        Two fan units; two 80 mm fans; 1+1 redundant
Power supplies   470 watts of rated power; two units; 1+1 redundant; single-phase
Power cables     Two power cables; 1+1 redundant


Power Supply

The use of redundant power supplies and power cables adds to the fault resilience of the Sun

SPARC Enterprise M3000 server (Table 4). Power is supplied to the server by redundant

hot-swap power supplies, enabling continuous server operation even if a power supply fails.

Because the power units are hot-swappable, they can be replaced during system operation.

Optional Dual Power Feed

Although organizations can control most factors within the data center, utility outages are

often unexpected. The consequences of loss of electrical power can be devastating to IT

operations. To enable organizations to reduce the impact of such incidents, the Sun SPARC

Enterprise M3000 server is dual power feed capable. The AC power subsystem in this server

is completely duplicated, providing optional reception of power from two external,

independent AC power sources. With a dual power feed, server operations are not affected by the failure of a single power grid. Although the dual power feed system and the redundant power supply system cannot be combined, the redundancy that either one provides increases system availability.

Operator Panel

The Sun SPARC Enterprise M3000 server features an operator panel, which has the

following functions:

• Displaying server status

• Storing server identification and user setting information

• Changing between operational and maintenance modes

• Turning on power supplies for a domain

During server startup, the front panel LED status indicators show the state of the XSCF and of server operation (Figure 2).


Figure 2. The server operator panel of the Sun SPARC Enterprise M3000.

System Outline

The Sun SPARC Enterprise M3000 server is an economical, high-power compute platform

with enterprise-class features. This server is designed to reliably carry data center workloads

that undertake core business operations. The Sun SPARC Enterprise M3000 server enclosure measures 2U and supports one processor chip and 32 GB of memory. A single four-core SPARC64 VII processor chip is installed. In addition, the server features four short internal PCIe slots, four internal disk drives, one internal DVD drive, and an external SAS port for attaching additional storage or tape devices. Two power supplies and two fan units power and cool the server. A component diagram and front and rear views of the Sun SPARC Enterprise M3000 server are shown in Figure 3 and Figure 4.

Figure 3. Sun SPARC Enterprise M3000 server components.


Figure 4. Front and rear views of the Sun SPARC Enterprise M3000 server.

System Bus Architecture: Jupiter Interconnect

The ability to deliver fast, predictable performance for a broad set of CPU applications rests

largely on the capabilities of the system bus. The Sun SPARC Enterprise M3000 server

uses a system interconnect designed to deliver consistent low latency. The Jupiter system

bus benefits IT operations by delivering balanced and predictable performance for

application workloads.

Jupiter Interconnect Architecture

The Jupiter interconnect design maximizes the overall performance of the Sun SPARC

Enterprise M3000 server. Implemented as point-to-point connections that use packet-

switched technology, this system bus provides fast response times by transmitting multiple

data streams. Packet switching allows the interconnect to operate at a much higher systemwide throughput by eliminating "dead" cycles on the bus. All routes are

unidirectional, contention-free paths with multiplexed addresses, data, and control plus ECC

in each direction.

System controllers within the Jupiter interconnect architecture direct traffic among

CPUs, memory, and I/O subsystems.


System Interconnect Reliability Features

The built-in redundancy and reliability features of the Sun SPARC Enterprise M3000 server

system interconnect enhance the stability of this server. The Jupiter interconnect protects

against loss or corruption of data with full ECC protection on all system buses and in

memory. When a single-bit data error is detected in a CPU, memory, or an I/O controller,

hardware corrects the data and performs the transfer.
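The text does not say which ECC code the interconnect uses. As a minimal illustration of how single-bit correction works in principle, the sketch below uses the textbook Hamming(7,4) code; real server ECC operates on much wider words, and the function names here are hypothetical.

```python
# Textbook Hamming(7,4) code, used only to illustrate single-bit correction.
# This is not the ECC scheme of the Jupiter interconnect.

def encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1 to 7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def correct(codeword):
    """Fix at most one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome names the faulty position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


word = encode([1, 0, 1, 1])
word[4] ^= 1                      # simulate a single-bit error in transit
print(correct(word))              # ([1, 0, 1, 1], 5)
```

The syndrome computed from the parity checks identifies the flipped position, so the original data is recovered and the transfer can proceed.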

Sun SPARC Enterprise M3000 System Interconnect Architecture

The Sun SPARC Enterprise M3000 system is implemented within a single motherboard. This

server design features one logical system board with one system controller. The system

controller is connected to CPUs, memory, and the I/O controller (PCIe bridge), as shown in

Figure 5.

Figure 5. Sun SPARC Enterprise M3000 server system interconnect.

Performance of Sun SPARC Enterprise M3000 Server

The high bandwidth and overall design of the Jupiter system interconnect maximize the

performance of the Sun SPARC Enterprise M3000 server. Theoretical peak system throughput, I/O bandwidth numbers, and STREAM benchmark results for the Sun SPARC Enterprise M3000 server are found in Table 5.

TABLE 5. PERFORMANCE OF SUN SPARC ENTERPRISE M3000 SERVER

Theoretical system bandwidth at peak time (a)   17 GB/sec
Theoretical I/O bandwidth at peak time (b)      4 GB/sec
Triad results of STREAM benchmark               4.5 GB/sec
Copy results of STREAM benchmark                5.6 GB/sec

(a) The theoretical system bandwidth at peak time is calculated by multiplying the bus width by the bus frequency between the system controller and memory.

(b) The theoretical I/O bandwidth at peak time is calculated by multiplying the bus width by the bus frequency between the system controller and the PCIe bridge.
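The footnote formula and the measured results can be related with a short calculation. In the sketch below, the peak and STREAM values come from Table 5. The bus width and frequency passed to the first function are placeholders for illustration, because the document does not give the actual values.

```python
def peak_bandwidth_gb_per_s(bus_width_bytes, bus_frequency_hz):
    """Footnote formula: bus width multiplied by bus frequency, in GB/s."""
    return bus_width_bytes * bus_frequency_hz / 1e9


THEORETICAL_PEAK_GB_S = 17.0                   # from Table 5
STREAM_GB_S = {"triad": 4.5, "copy": 5.6}      # from Table 5


def fraction_of_peak(measured, peak=THEORETICAL_PEAK_GB_S):
    """Share of the theoretical peak that a measured result achieves."""
    return measured / peak


# Placeholder inputs (hypothetical): a 16-byte bus clocked at 1 GHz.
print(peak_bandwidth_gb_per_s(16, 1_000_000_000))

for name, value in STREAM_GB_S.items():
    print(f"{name}: {fraction_of_peak(value):.0%} of theoretical peak")
```

On these numbers, the STREAM results correspond to roughly a quarter to a third of the theoretical peak, which is a reminder that the peak figure is an upper bound rather than a sustained rate.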

SPARC64 VII Processor

The SPARC64 Series consists of SPARC processors developed by Fujitsu for UNIX servers. The SPARC64 V brought customers high-reliability technology consistent with the mainframe class, along with a frequency exceeding 1 GHz. The SPARC64 VI achieved high throughput by using the SPARC64 V as a base and incorporating a two-core by two-thread architecture. The throughput of the latest SPARC64 VII has been improved further by incorporating a four-core architecture and by modifying the multithreading mechanism. The Sun SPARC Enterprise M3000 server uses this SPARC64 VII processor.

SPARC64 VII Overview

The SPARC64 VII is the latest processor developed by Fujitsu for the SPARC64 Series. It uses

65 nm technology and has an operating frequency of 2.5 GHz. The chip measures 21.3 mm by

20.9 mm and has four built-in cores with a shared 5 MB Level 2 (L2) cache configuration.

The operating power consumption is 135 watts.

Fujitsu designed the SPARC64 VII for increased throughput while maintaining the high

performance and high reliability that have been realized with the existing SPARC64 VI. For

increased throughput, the number of built-in cores has been increased from two to four, and

the multithreading mechanism to be used has been changed from vertical multithreading

(VMT) to simultaneous multithreading (SMT). The L2 cache is configured to be shared by

the four cores, and the throughput has been doubled so that data can be supplied to the four

cores. Also, especially with the field of high-performance computing in mind, an intercore

high-speed synchronization mechanism called hardware barrier has been implemented.


SPARC64 VII Microarchitecture

This section provides an overview of the microarchitecture of the SPARC64 VII. Although the basic structure of the core pipeline of the SPARC64 VII is the same as that of the SPARC64 VI, it uses SMT technology instead of VMT technology to implement multithreading. As shown in Figure 6, the SPARC64 VI processor uses VMT technology to run two threads per core, but only one thread is active at any given time. Within the VMT model, a latency event or specific trigger must occur for processing to switch over to the alternate thread. With SMT technology, both threads within each core on the SPARC64 VII processor can execute simultaneously. As a result, the SPARC64 VII offers the potential to achieve greater throughput and performance. As shown in Figure 7, two threads can be executed simultaneously on each of the four cores.

Figure 6. SPARC64 VI VMT processing mode.

In the SMT design, Fujitsu focused on eliminating interference between threads as much as possible. The chip is configured so that, as a rule, the hardware resources for one thread are isolated from those of the other when both threads are running. In contrast, when either thread is idle, the other thread can use most of the resources of both threads. In this way, the chip is designed to provide higher performance in single-thread operation. Both threads share the core pipeline, but it is controlled so that a pipeline stall in one thread does not block processing in the other thread. In the instruction fetch, instruction decode, and commit stages, one of the two threads is selected in each cycle.
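The difference between VMT and SMT can be illustrated with a toy cycle count. The model below is a deliberate simplification and not a description of the real pipeline: each thread is a trace of work cycles ("W") and stall cycles ("S"), the VMT core runs one thread at a time and switches only when that thread cannot run, and the SMT core is idealized so that threads never compete for resources. All names are hypothetical.

```python
def cycles_to_finish(threads, smt):
    """Count cycles until every thread has consumed its whole trace.

    'W' is a cycle of useful work; 'S' is a cycle spent waiting on a
    latency event such as a cache miss.
    """
    pos = [0] * len(threads)
    active = 0            # thread that currently owns the pipeline (VMT only)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        pending = [i for i, t in enumerate(threads) if pos[i] < len(t)]
        ready = [i for i in pending if threads[i][pos[i]] == "W"]
        stalled = [i for i in pending if threads[i][pos[i]] == "S"]
        for i in stalled:             # stalls elapse in the background
            pos[i] += 1
        if smt:
            for i in ready:           # every ready thread makes progress
                pos[i] += 1
        elif ready:
            if active not in ready:   # switch only when the owner cannot run
                active = ready[0]
            pos[active] += 1
    return cycles


traces = ["WWSSSWW", "WWWW"]
print(cycles_to_finish(traces, smt=False))  # 8
print(cycles_to_finish(traces, smt=True))   # 7
```

In this toy example both schemes hide the stall of the first thread behind work from the second, but only the SMT model also overlaps the work cycles of the two threads.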


Figure 7. SPARC64 VII SMT processing mode.

Details of the Microarchitecture

As shown in Figure 8, a core of the SPARC64 VII is divided into the instruction fetch block

and instruction execution block. The instruction fetch block contains the primary cache

dedicated for instructions (L1I cache), and the instruction execution block contains the

primary cache for operands (L1D cache).


Figure 8. Functional diagram of the SPARC64 VII core.

Instruction Fetch Block

The instruction fetch block, which operates independently of the instruction execution block, loads into the instruction buffer (IBUF) the series of instructions that branch prediction expects to be executed. The IBUF has a capacity of 256 bytes and can store up to 64 instructions. When both threads are running, the IBUF is divided evenly between the threads. If instruction execution stalls, instruction fetch continues until the IBUF becomes full. Conversely, if instruction fetch pauses for a reason such as a cache miss, instructions can be taken from the IBUF and execution can continue as long as the IBUF contains instructions. An instruction fetch can be started in every cycle, and 32 bytes, which comprise eight instructions, are fetched at one time. The throughput of instruction execution is up to four instructions per cycle, so instruction fetch provides twice the throughput of instruction execution. By decoupling instruction fetch from instruction execution, the IBUF conceals the latency of the large-capacity primary instruction cache.
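The buffer and fetch figures in this paragraph follow from the fixed 4-byte width of SPARC instructions. The sketch below simply reproduces that arithmetic; the function and key names are hypothetical, and the last value (how many cycles a full buffer can feed execution at the peak rate) is a derived illustration rather than a figure from the text.

```python
INSTRUCTION_BYTES = 4        # SPARC instructions are fixed-width 32-bit words
IBUF_BYTES = 256
FETCH_BYTES_PER_CYCLE = 32
EXECUTE_WIDTH = 4            # instructions per cycle, from the text


def fetch_figures():
    """Derive the instruction counts quoted in the text from the byte sizes."""
    capacity = IBUF_BYTES // INSTRUCTION_BYTES
    fetch_width = FETCH_BYTES_PER_CYCLE // INSTRUCTION_BYTES
    return {
        "ibuf_instructions": capacity,
        "per_thread_when_shared": capacity // 2,
        "fetched_per_cycle": fetch_width,
        "fetch_to_execute_ratio": fetch_width // EXECUTE_WIDTH,
        "cycles_fed_by_full_ibuf": capacity // EXECUTE_WIDTH,
    }


for key, value in fetch_figures().items():
    print(key, value)
```

The results match the text: 64 buffered instructions, eight fetched per cycle, and a fetch rate twice the peak execution rate.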

Instruction Execution Block

A core of the SPARC64 VII is divided into the instruction fetch block and instruction

execution block. The instruction execution block operates independently of the instruction

fetch block.


Instruction Decode and Issue

In the instruction decode and instruction issue stages, the four instructions in the Instruction Word Register (IWR) are decoded simultaneously, and the resources required for execution (various reservation stations, fetch port and store port, and register update buffer) are determined. The hardware then checks whether those resources are free. If they are, the resources are allocated, the instructions are given instruction identifications (IIDs) ranging from 0 to 63, and the instructions are issued. In other words, the maximum number of in-flight instructions is 64. When both threads are running, the maximum number of instructions for each thread is 32. In each cycle, instructions from only one thread are decoded, and the threads alternate.

When an instruction is issued, the IWR is released. For the instruction in any slot of the IWR, there are no restrictions on the allocation of resources such as reservation stations. Also, there are no restrictions on instruction-type combinations. Therefore, as long as there are free resources, instructions can be issued. Even if there is insufficient space for four instructions, as many instructions as possible are issued in program order. By eliminating stall conditions for instruction issue as much as possible in this way, a high degree of parallelism is ensured for any binary code.
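The issue rules above (64 IIDs in total, 32 per thread when both threads run, up to four instructions per cycle, partial issue in program order) can be modeled in a few lines. The class below is a hypothetical sketch that tracks only IIDs; it ignores reservation stations, ports, and update buffers, and it does not claim to reflect how the hardware actually assigns IID values.

```python
from collections import deque


class IssueWindow:
    """Toy model of IID allocation; names and structure are assumptions."""

    def __init__(self, total_iids=64, threads_running=2):
        self.free = deque(range(total_iids))          # IIDs 0 to 63
        self.per_thread_limit = total_iids // threads_running
        self.in_flight = {t: [] for t in range(threads_running)}

    def issue(self, thread, decoded):
        """Issue up to four decoded instructions in program order."""
        issued = []
        for _ in decoded[:4]:
            at_limit = len(self.in_flight[thread]) >= self.per_thread_limit
            if not self.free or at_limit:
                break                 # later instructions wait for resources
            iid = self.free.popleft()
            self.in_flight[thread].append(iid)
            issued.append(iid)
        return issued

    def commit(self, thread, count):
        """Retire the oldest `count` instructions and recycle their IIDs."""
        for _ in range(min(count, len(self.in_flight[thread]))):
            self.free.append(self.in_flight[thread].pop(0))


window = IssueWindow()
for _ in range(8):
    window.issue(0, ["op"] * 4)       # thread 0 reaches its limit of 32
print(window.issue(0, ["op"] * 4))    # [] because thread 0 is at its limit
print(len(window.issue(1, ["op"] * 4)))  # 4, thread 1 is unaffected
```

The partial-issue rule shows up when only some resources are free: after two of thread 0's instructions commit, a request for four more issues exactly two.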

Instruction Execution

A decoded instruction is registered in a reservation station. The SPARC64 VII has reservation stations for integer operations (reservation station for execution [RSE]) and reservation stations for floating-point operations (RSF). The RSEs and RSFs are each divided into two queues, one per execution unit, so four reservation stations are provided for operations: RSEA, RSEB, RSFA, and RSFB. Each instruction stored in a reservation station is dispatched to the execution unit that corresponds to that reservation station, in the order in which source operands become ready, so up to four operations can be dispatched simultaneously. Basically, the oldest instruction that can be dispatched (oldest ready) is selected from the instructions in a reservation station. However, when a register to be updated by a load instruction is used as a source operand for an operation, the operation is dispatched speculatively before the result of the load instruction is obtained, and the execution stage then determines whether the speculation succeeded. This technique is called speculative dispatch. Speculative dispatch conceals the latency of the pipeline for cache access, increasing the utilization of the execution units. In addition to the RSEs and RSFs, there are reservation stations for branch instructions (RSBR) and reservation stations for calculating addresses for load/store instructions (reservation station for address generation [RSA]).
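The "oldest ready" selection rule can be sketched as a scan over a queue kept in program order. The data layout and names below are hypothetical, and the sketch leaves out speculative dispatch entirely: an entry is selected only when all of its source registers are already available.

```python
def select_oldest_ready(station, ready_registers):
    """Remove and return the oldest entry whose sources are all available.

    `station` is a list of entries in program order (oldest first); each
    entry is a dict with an "iid" and a list of "sources".
    """
    for entry in station:
        if all(src in ready_registers for src in entry["sources"]):
            station.remove(entry)
            return entry
    return None


station = [
    {"iid": 0, "sources": ["r1"]},          # oldest, still waiting on r1
    {"iid": 1, "sources": ["r2"]},
    {"iid": 2, "sources": ["r2", "r3"]},
]
ready = {"r2", "r3"}

print(select_oldest_ready(station, ready)["iid"])  # 1
print(select_oldest_ready(station, ready)["iid"])  # 2
print(select_oldest_ready(station, ready))         # None
```

The oldest entry is skipped while its operand is missing, which is what lets younger instructions execute out of order.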

Instruction Commit

All results of instructions that are executed out of order are stored once in the GPR Update

Buffer (GUB) and FPR Update Buffer (FUB) work registers, which are not visible to software.


To ensure the instruction order in a program, registers such as the general-purpose registers

(GPR) and floating-point registers (FPR) and memory are updated in program order in the

commit stage. In addition, control registers such as the PC are updated at the same time in

the commit stage. Precise interrupts are guaranteed, and processing in execution can always

be canceled. The above method is called a synchronous update method, which not only

makes it easier to re-execute instructions after a branch prediction error, but also contributes

to increased RAS, as explained later in this document. The maximum number of instructions

that can be committed at one time is four. The instruction commit stage is shared by the two

threads, and either thread is selected in each cycle to execute commit processing.
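In-order commit after out-of-order execution can be sketched as draining the head of a program-order queue. The function below is a hypothetical single-thread model: it commits at most four instructions per call and stops at the first instruction that has not finished executing, which is what keeps architectural state precise.

```python
from collections import deque


def commit_cycle(window, completed, width=4):
    """Commit up to `width` instructions, oldest first.

    `window` holds instruction IDs in program order; `completed` is the set
    of IDs whose results are waiting in the update buffers.
    """
    committed = []
    while window and len(committed) < width and window[0] in completed:
        committed.append(window.popleft())
    return committed


window = deque([0, 1, 2, 3, 4, 5])
completed = {0, 1, 3, 4, 5}            # instruction 2 is still executing

print(commit_cycle(window, completed))  # [0, 1]
completed.add(2)
print(commit_cycle(window, completed))  # [2, 3, 4, 5]
```

Instructions 3 to 5 finished early, but their results stay invisible to software until instruction 2 completes and everything ahead of them has committed.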

Cache System

As shown in Figure 9, the cache memory of the SPARC64 VII has a two-layer structure,

consisting of a medium-capacity primary cache (L1 cache) and a high-capacity secondary

cache (L2 cache).

Figure 9. SPARC64 VII processor core and cache design.

The L1 cache consists of a cache dedicated for instructions (L1I cache) and a cache dedicated

for operands (L1D cache). Each of these caches has a capacity of 64 KB, uses the two-way

set associative method, and has a block size of 64 B. The L1D cache is divided into eight

banks on the four-byte address boundaries, and two operands can be accessed at one time.

The L1 cache uses virtual addresses for cache indexes and physical addresses for cache tags.

In the virtually indexed, physically tagged (VIPT) method, consistency can be lost if the same

area of memory is accessed using different virtual addresses, because different indexes are

used for registration

(synonym problem). Through coordination with the L2 cache, the SPARC64 VII resolves

the synonym problem with hardware.
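The arithmetic behind the synonym problem can be worked through from the L1 parameters given above. The sketch below assumes an 8 KB base page size, which this section does not state; the function name `cache_geometry` is invented for illustration.

```python
def cache_geometry(capacity_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = capacity_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


# L1 parameters from the text: 64 KB, two-way set associative, 64 B blocks.
sets, offset_bits, index_bits = cache_geometry(64 * 1024, 2, 64)

# ASSUMPTION: 8 KB pages, so the low 13 address bits are not translated.
PAGE_OFFSET_BITS = 13

# Index bits above the page offset come from the virtual page number, so two
# virtual aliases of one physical line can select different sets.
synonym_bits = max(0, offset_bits + index_bits - PAGE_OFFSET_BITS)
possible_sets_per_line = 2 ** synonym_bits
```

Under that page-size assumption, two index bits are taken from translated address bits, so one physical line could be registered in up to four different sets unless hardware prevents it.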

The L2 cache has a maximum capacity of 5 MB, uses a 10-way set associative method, has
a block size of 256 B, and is shared by the four cores. It adopts a two-bank interleaved
structure, so 64 B of data can be read in each cycle. The bus for sending data that is read
from the L2 cache to the L1 cache has a width of 32 B per two cores, and the bus for sending
data from the L1 cache to the L2 cache has a width of 16 B per core.
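The L2 figures can be checked the same way. This is a back-of-the-envelope sketch; the cycles-per-line value assumes a whole block is streamed at the peak read rate, which the text does not state.

```python
L2_CAPACITY = 5 * 1024 * 1024   # 5 MB, from the text
L2_WAYS = 10                    # 10-way set associative
L2_LINE = 256                   # block size in bytes
L2_READ_PER_CYCLE = 64          # bytes readable per cycle (two-bank interleave)

l2_sets = L2_CAPACITY // (L2_WAYS * L2_LINE)
l2_index_bits = l2_sets.bit_length() - 1

# ASSUMPTION: a full block is read back-to-back at the peak rate.
cycles_per_line = L2_LINE // L2_READ_PER_CYCLE
```

Although the capacity and associativity are not powers of two, the number of sets is, which keeps set indexing a simple bit-field selection.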

The cache update policies of the L1 cache and L2 cache are both write-back. That is, stored
data is written into only one level of the cache hierarchy. In the write-back method, a line
that misses the cache is always loaded into the cache, so that a store operation can be
completed by updating a single cache level. On a cache miss, the old data must therefore be
brought from memory into the cache even when the access is a store; on a cache hit, however,
the store operation is completed in the cache alone. In general, because store operations are
quite frequent, the write-back method has an advantage because it can reduce intercache
traffic and memory access traffic.
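The traffic argument can be made concrete with a toy model. The sketch below is a direct-mapped, write-allocate, write-back cache invented for illustration; it is far simpler than the set-associative caches described above and counts only line fills and write-backs.

```python
class WriteBackCache:
    """Toy direct-mapped, write-allocate, write-back cache (illustrative only)."""

    def __init__(self, num_lines, line_bytes=64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines    # which memory line each slot holds
        self.dirty = [False] * num_lines
        self.mem_reads = 0                # line fills from the next level
        self.mem_writes = 0               # write-backs of dirty lines

    def store(self, address):
        line = address // self.line_bytes
        slot = line % self.num_lines
        if self.tags[slot] != line:       # miss: the old line is fetched first
            if self.dirty[slot]:
                self.mem_writes += 1      # evict the dirty victim
            self.mem_reads += 1
            self.tags[slot] = line
        self.dirty[slot] = True           # the store itself updates only the cache


cache = WriteBackCache(num_lines=4)
for _ in range(100):
    cache.store(0x1000)                   # repeated stores to one line
```

After the loop, the model has performed one line fill and no writes to the next level, whereas a write-through policy would have forwarded each of the 100 stores.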

Meanwhile, because the write-back method keeps the latest data in the cache, if an error occurs

in the relevant processor, there is a risk that the error could affect not only the internal

operation of the processor but also the entire system. The SPARC64 VII has powerful RAS

functions to cope with this problem.

Also, a new hardware barrier mechanism has been implemented in the SPARC64 VII. The

hardware barrier mechanism synchronizes the cores in the CPU chip with each other, and

faster synchronization processing can be implemented compared with a conventional

synchronization process realized by software. This mechanism is especially useful in the

high-performance computing area.

Reliability, Availability, and Serviceability Functions

In the SPARC64 VII, RAS functions comparable to those of mainframe computers have

been implemented. With these RAS functions, errors are reliably detected, their effect is

kept within a limited range, recovery processing is tried, error logs are recorded, and

software is notified.

In other words, the basics of RAS functions are thoroughly implemented. Through the

implementation of the RAS functions, the SPARC64 VII provides high reliability, high

availability, high serviceability, and high data integrity as a processor for mission-critical

UNIX servers.

Reliability, Availability, and Serviceability of Internal RAMs

Among the parts of a processor, RAM has the highest error occurrence frequency. Error

detection and correction methods for the SPARC64 VII processor are highlighted in Table 6.

In the SPARC64 VII, because any one-bit error in RAM can automatically be corrected by

hardware without intervention by software, it does not affect software.

TABLE 6. ERROR DETECTION AND CORRECTION METHOD FOR INTERNAL RAMS

TYPE                         ERROR DETECTION AND PROTECTION METHOD   ERROR CORRECTION METHOD
L1 instruction cache  Data   Parity                                  Invalidation and reread
                      Tag    Parity + duplication                    Rewrite of duplicated data
L1 data cache         Data   SECDED ECC                              One-bit error correction using ECC
                      Tag    Parity + duplication                    Rewrite of duplicated data
L2 cache              Data   SECDED (a) ECC                          One-bit error correction using ECC
                      Tag    SECDED ECC                              One-bit error correction using ECC
Instruction TLB              Parity                                  Invalidation
Data TLB                     Parity                                  Invalidation
Branch history               Parity                                  Recovery from branch prediction failure

(a) SECDED: Single error correction and double error detection.
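The SECDED behavior listed in Table 6 can be demonstrated with a textbook code. The sketch below uses an (8,4) extended Hamming code chosen purely for illustration; the ECC code and word width actually used in the processor are not given in this document, and the function names are invented.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    c = [0] * 8                      # c[0] is the overall parity bit
    c[3], c[5], c[6], c[7] = d       # data bits sit at positions 3, 5, 6, 7
    c[1] = c[3] ^ c[5] ^ c[7]        # parity over positions 1, 3, 5, 7
    c[2] = c[3] ^ c[6] ^ c[7]        # parity over positions 2, 3, 6, 7
    c[4] = c[5] ^ c[6] ^ c[7]        # parity over positions 4, 5, 6, 7
    c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]
    return c


def decode(codeword):
    """Return (status, nibble); status is 'ok', 'corrected', or 'uncorrectable'."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = 0
    for bit in c:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        c[syndrome] ^= 1             # single error; syndrome 0 means bit 0 itself
        status = "corrected"
    else:
        return "uncorrectable", None  # even parity with a syndrome: two errors
    nibble = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return status, nibble
```

Any single flipped bit is repaired transparently, while any two flipped bits are reported rather than silently miscorrected, which matches the "correction" and "detection" halves of the SECDED name.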

For the L1 cache, L2 cache, and TLB, degradation can be performed separately in way units.

Error occurrence counts are made for each function unit. If the error occurrence count per

unit time exceeds the upper limit, degradation is performed and the relevant way is not

subsequently used. Hardware performs degradation automatically; at the same time, it also

performs the required operation to ensure the continuity of coherency automatically. More

specifically, hardware automatically performs the following: (1) operation that writes back to

the L2 cache all the dirty lines in the way of the L1D cache to be degraded, and (2) operation

that writes back to memory the dirty lines in the way of the L2 cache to be degraded. The

degradation of a way is performed without adversely affecting software, and software

operation is free from any effect except for a slowdown of the processing speed.
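The degradation policy described above amounts to a per-way error counter with a rate limit. The following sketch models only that bookkeeping; the class name, the limit, and the window length are hypothetical, since the actual thresholds are not given in this document.

```python
class WayMonitor:
    """Counts errors per cache way in a time window and retires noisy ways."""

    def __init__(self, ways, limit, window):
        self.limit = limit        # hypothetical upper limit per window
        self.window = window      # hypothetical window length in time units
        self.events = {w: [] for w in range(ways)}
        self.degraded = set()

    def record_error(self, way, now):
        """Record an error; return True if it caused the way to be degraded."""
        if way in self.degraded:
            return False
        recent = [t for t in self.events[way] if now - t < self.window]
        recent.append(now)
        self.events[way] = recent
        if len(recent) > self.limit:
            # Real hardware would first write back the dirty lines in this way.
            self.degraded.add(way)
            return True
        return False

    def usable_ways(self):
        return [w for w in self.events if w not in self.degraded]
```

Isolated errors spread over time never trip the limit in this model; only a burst within one window removes a way from service.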

Reliability, Availability, and Serviceability of Internal Registers and Execution Units

The SPARC64 VII also provides error protection for registers and execution units, making

doubly sure that data integrity is guaranteed (Table 7). For integer architecture registers,

ECC is used beginning with the SPARC64 VII to increase reliability. If an error occurs, the ECC

circuit corrects the error. Parity bits have been added to the floating-point architecture

registers and other registers. Also, the parity prediction circuit, residue check circuit, and

other circuits have been added to the execution unit to propagate parity information to

output results. In the unlikely event that a parity error occurs, it is detected, and hardware

automatically re-executes the instruction to attempt recovery as described below. This

function is called instruction retry.

TABLE 7. ERROR DETECTION AND PROTECTION METHOD FOR INTERNAL REGISTERS AND EXECUTION UNITS

TYPE                                                    ERROR DETECTION AND PROTECTION METHOD
Register
  Integer register                                      SECDED ECC
  Floating-point register                               Parity
  PC, PSTATE, etc.                                      Parity
  Computation input-output register                     Parity
Execution unit
  Addition and subtraction, division, shift,            Parity prediction
  and graphic operations
  Multiplication                                        Parity prediction + residue check
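A residue check for multiplication, as listed in Table 7, can be sketched in a few lines. The modulus of 3 used here is an assumption for illustration; the modulus used by the processor is not given in this document.

```python
MODULUS = 3  # ASSUMPTION: a mod-3 residue; the real modulus is not stated


def residue_check(a, b, product):
    """Return True if `product` is consistent with a * b under the residue check."""
    predicted = ((a % MODULUS) * (b % MODULUS)) % MODULUS
    return product % MODULUS == predicted
```

The check works because residues are preserved by multiplication. With a modulus of 3, a single flipped bit changes the result by a power of two, which is never a multiple of 3, so every single-bit error in the product is detected.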

Synchronous Update Method and Instruction Retry

As shown in the explanation of the instruction execution block, the SPARC64 VII uses

the synchronous update method. When an error is detected, all the instructions being

executed at this time are canceled. Interim results before commitment can be discarded,

and only results updated by instructions that have been completed without encountering

any errors remain in programmable resources. Therefore, not only can errors be prevented

from destroying programmable resources, but hardware can also perform an instruction

retry after error

detection. Even in the case of a hang, because stalled instructions can be discarded once and then

retried from the beginning, there is a possibility of recovery.

Instruction retry is triggered by an error and is automatically started. A retry is performed

instruction by instruction to increase the chance of normal execution. When the execution is

completed normally, the state automatically returns to the normal execution state. During

this period, no software intervention is required, and if the instruction retry succeeds, the

error does not affect software. An instruction retry is repeated until the number of retries

reaches the threshold; when the threshold is exceeded, the occurrence of the error is reported

to software by an interrupt. The operational flow is shown in Figure 10.

Figure 10. Instruction retry by hardware after error detection.
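The retry flow can be expressed as a short loop. This is a software analogy only: `HardwareError`, `execute_with_retry`, and `make_flaky` are invented names, the threshold value is arbitrary, and raising an exception stands in for the interrupt that reports the error to software.

```python
class HardwareError(Exception):
    """Stands in for an error detected by a parity or ECC checker."""


def execute_with_retry(instruction, retry_threshold):
    """Run `instruction` (a callable), retrying when an error is detected.

    Returns (result, retries_used). Once the retry count has reached the
    threshold, the error is re-raised, modeling the report to software.
    """
    retries = 0
    while True:
        try:
            return instruction(), retries
        except HardwareError:
            # Uncommitted results are discarded, so a retry starts clean.
            if retries >= retry_threshold:
                raise
            retries += 1


def make_flaky(failures):
    """Build a callable that fails `failures` times, then returns 42."""
    state = {"left": failures}

    def instruction():
        if state["left"] > 0:
            state["left"] -= 1
            raise HardwareError("transient fault")
        return 42

    return instruction
```

A transient fault that clears within the threshold is invisible to the caller except for the retry count, while a persistent fault eventually surfaces as an error.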

Increased Serviceability

The SPARC64 VII has error-checking mechanisms in a variety of locations. If an error

occurs, the system is notified of the error through a dedicated interface. On receipt of this

notification,

the XSCF firmware collects error logs through the dedicated interface and analyzes them.

This series of operations does not affect software and is performed in the background.

With the mechanism described above, a system in which the SPARC64 VII is mounted

can identify the location and type of a failure quickly and accurately while continuing

operation. Thus, the system can obtain information useful for preventive maintenance to

increase serviceability.

I/O Subsystem

A growing reliance on computer systems for every aspect of business operations brings along

a need to store and process ever-increasing amounts of information. Powerful I/O subsystems

are crucial to effectively moving and manipulating these large data sets. The Sun SPARC

Enterprise M3000 server delivers exceptional I/O expansion and performance, enabling

organizations to scale systems and accommodate evolving data storage needs.

I/O Subsystem Architecture

The use of PCI Express (PCIe) technology is crucial to the performance of the I/O subsystem
within the Sun SPARC Enterprise M3000 server. A PCIe bridge supplies the connection between
the main system and all I/O components, such as PCIe slots and internal drives (Figure 11).
The PCIe bus also enables the connection of external I/O devices by using internal PCIe slots.

Figure 11. Sun SPARC Enterprise M3000 server I/O subsystem architecture.

Sun SPARC Enterprise M3000 Server I/O Subsystem

In the Sun SPARC Enterprise M3000 server, a single PCIe bridge mounted on the

motherboard connects all I/O components to the system controllers. The Sun SPARC

Enterprise M3000 server has four PCIe slots.

I/O Devices

Along with a disk device directly integrated into it, the Sun SPARC Enterprise M3000

server supports one internal DVD drive and four internal SAS 2.5-inch hard disk drives.

The Sun SPARC Enterprise M3000 server also supports one external SAS port, which can

be connected to any SAS storage or tape device.

Reliability, Availability, and Serviceability

Reducing downtime, both planned and unplanned, is critical for IT services. System

designs must include mechanisms that foster fault resilience, quick repair, and even rapid

expansion, without impacting the availability of key services. Specifically designed to support

complex, network computing solutions and stringent high-availability requirements, the

system in the Sun SPARC Enterprise M3000 server includes redundant, hot-swap system

components; diagnostic and error recovery features throughout the design; and built-in remote

management features. The advanced architecture of this reliable server enables high levels of

application availability and

rapid recovery from many types of hardware faults, simplifying system operation and lowering

costs for enterprises.

Redundant and Hot-Swap Components

Today’s IT organizations are challenged by the pace of nonstop business operations. In a

networked global economy, revenue opportunities remain available around the clock, forcing

planned downtime windows to shrink and, in some cases, disappear entirely. To meet these

demands, the Sun SPARC Enterprise M3000 server employs built-in redundant and hot-swap

hardware to help mitigate the disruptions caused by individual component failures or

changes to system configurations. In fact, these systems are able to recover from hardware

failures, often with no impact to users or system functionality.

The Sun SPARC Enterprise M3000 server features redundant, hot-swap power supplies and

fan units. Also, redundant internal storage can be created by combining hot-swap disk drives

with disk mirroring software. If a fault occurs, these duplicated components can enable

continued operation. Depending upon the component and type of error, the system could

continue to operate in a degraded mode or could reboot, with the failure automatically

diagnosed and the relevant component automatically configured out of the system. In

addition, hot-swap hardware

within the Sun SPARC Enterprise M3000 server speeds service and allows for the

replacement or addition of components, without stopping the system.

Advanced Reliability Features

Advanced reliability features included within the components of the Sun SPARC

Enterprise M3000 server increase the overall stability of this platform. In addition,

advanced CPU integration and guaranteed data path integrity provide for autonomous

error recovery by the SPARC64 VII processor, reducing the time to initiate corrective

action and subsequently increasing uptime.

XSCF and the predictive self-healing feature in Oracle Solaris further enhance the reliability

of Sun SPARC Enterprise servers. The implementation of XSCF and predictive self-healing

for Sun SPARC Enterprise servers enables the constant monitoring of all CPUs and memory.

Depending upon the nature of the error, persistent CPU soft errors can be resolved by

automatically offlining a thread, core, or entire CPU. In addition, a memory page retirement capability enables

memory pages to be taken offline proactively, in response to multiple corrections to data

access for a specific memory DIMM.
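The idea behind memory page retirement can be sketched as a count of corrected errors per page. The page size, the threshold, and the function name below are assumptions for illustration and do not describe the actual Oracle Solaris policy.

```python
from collections import Counter

PAGE_SIZE = 8192     # ASSUMPTION: 8 KB pages
CE_THRESHOLD = 3     # ASSUMPTION: hypothetical limit on corrected errors per page


def pages_to_retire(error_addresses, threshold=CE_THRESHOLD):
    """Return sorted page numbers whose corrected-error count reaches the threshold."""
    counts = Counter(addr // PAGE_SIZE for addr in error_addresses)
    return sorted(page for page, n in counts.items() if n >= threshold)
```

For example, three corrected errors at addresses within the first 8 KB would mark page 0 for retirement, while single errors on other pages would not.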

Error Detection, Diagnosis, and Recovery

The Sun SPARC Enterprise M3000 server features important technologies that correct

failures early and keep marginal components from causing repeated downtime. Architectural

advances that inherently increase reliability are augmented by the error detection and

recovery capabilities within the server hardware subsystems. Ultimately, the following

features work together to raise application availability:

• End-to-end data protection detects and corrects errors throughout the system, ensuring complete data integrity.

• State-of-the-art fault isolation enables the server to isolate errors within component boundaries and offline only the relevant resources instead of whole components. This feature applies to CPUs (cores), memory, and I/O devices.

• Constant environment monitoring provides a historical log of all pertinent environmental and error conditions.

• The host watchdog feature periodically checks the operation of software, including the domain operating system. This feature also uses the XSCF firmware to trigger error notification and recovery functions.

• Periodic component status checks are performed to determine the status of many system devices to detect signs of an impending fault. Recovery mechanisms are triggered to prevent system and application failures.

• Error logging, multistage alerts, electronic field-replaceable unit identification information, and system fault LED indicators all contribute to rapid problem resolution.

System Management

Providing hands-on, local system administration for server systems is no longer realistic for

many organizations. Around-the-clock system operation, disaster recovery hot sites, and

geographically dispersed organizations lead to requirements for remote management of

systems. One of the many benefits of Oracle servers is the support for lights-out data centers,

enabling expensive support staff to work at any location with network access. The Sun

SPARC Enterprise M3000 system design, combined with the powerful XSCF, XSCF Control

Package (XCP), and system management software, enables administrators to remotely

execute and control nearly any task. These management tools and remote functions lower

administrative loads, saving organizations time and reducing operational expenses.

eXtended System Control Facility

The XSCF is the core technology of remote monitoring and management capabilities in the

Sun SPARC Enterprise M3000 server. The XSCF consists of a dedicated processor that is

independent of the server system and runs the XCP. The Domain to Service Processor

Communication Protocol (DSCP) is used for communication between the XSCF and the

server. The DSCP runs on a private TCP/IP-based or PPP-based communication link

between the service processor and each domain. As long as input power is supplied to the

server, the XSCF constantly monitors the system even when the domain is inactive.

The XSCF regularly monitors the environmental sensors, provides advance warnings of

potential error conditions, and executes proactive system maintenance procedures as

necessary. For example, the XSCF can initiate a server shutdown in response to temperature

conditions that might lead to physical system damage. The XCP running on the service

processor enables administrators to remotely control and monitor a domain as well as the

platform itself. Using a network or serial connection to the XSCF, operators can effectively

administer the server from anywhere on the network. Remote connections to the service

processor run separately from the operating system and provide the full control and authority

of a system console.

DSCP Network

The DSCP service provides a secure TCP/IP and PPP-based communications link between

the service processor and each domain. Without this link, the XSCF cannot communicate

with the domain. The service processor requires one IP address dedicated to the DSCP

service on the XSCF side of the link and one IP address on the domain side.

eXtended System Control Facility Control Package

The XCP enables users to control and monitor the server system quickly and effectively.

The XCP provides a command-line interface (CLI) and Web browser user interface that

gives administrators and operators access to all system controller functions. Password-protected

accounts with specific administration capabilities also provide system security for domain

consoles. Communication between the XSCF and individual domains uses an encrypted

connection based on Secure Shell (SSH) and Secure Sockets Layer (SSL), enabling secure,

remote execution of commands provided by the XCP.

The XCP provides the interface for the following key server functions:

• Audit administration, including the logging of interactions between the XSCF and the domains

• Monitoring and control of power to the components inside the Sun SPARC Enterprise M3000 server

• Interpretation of hardware information presented, and notification of impending problems such as high temperatures or power supply problems, as well as access to the system administration interface

• Integration with the fault management architecture of Oracle Solaris 10 to improve availability through accurate fault diagnosis and predictive fault analysis

• Execution and monitoring of diagnostic programs, such as the OpenBoot PROM (OBP) and power-on self-test (POST)

Role-Based System Management

The XCP supports role-based system access control through the organization of users into

groups. Different privileges are assigned to each group. Privileges allow a user to perform a

specific set of actions on a specific set of hardware, including physical components,

domains, or physical components within a domain. In addition, a user can possess multiple,

different privileges on any number of domains.
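The group-based model described above can be sketched as a lookup from users to groups to privileges. All group names, user names, actions, and targets below are invented for illustration and are not XCP privilege names.

```python
# Hypothetical groups, privileges, and targets, invented for illustration only.
GROUP_PRIVILEGES = {
    "platform-admins": {("power", "platform"), ("configure", "platform")},
    "domain-operators": {("console", "domain0"), ("reset", "domain0")},
}

USER_GROUPS = {
    "alice": {"platform-admins", "domain-operators"},
    "bob": {"domain-operators"},
}


def is_allowed(user, action, target):
    """Return True if any of the user's groups grants `action` on `target`."""
    return any((action, target) in GROUP_PRIVILEGES.get(group, set())
               for group in USER_GROUPS.get(user, set()))
```

In this sketch a user who belongs to several groups holds the union of their privileges, and an unknown user holds none.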

Platform Management

Oracle Enterprise Manager Ops Center software, as well as other third-party tools, offers

advanced management functions that complement the capabilities of the XCP. To simplify

integration, the XSCF can communicate with system management tools by enabling an SNMP

agent on the service processor. The network interface on the service processor facilitates data

transfer to SNMP managers within third-party management applications. SNMP V1, V2, and

V3 and concurrent access from multiple SNMP managers are supported.

The service processor SNMP agent can export the following types of information to an SNMP

manager:

• System information such as chassis ID, platform type, total number of CPUs, and total memory

• Hardware configuration

• Domain status

• Power status

• Environmental status

The service processor SNMP agent can supply system and fault event information using

public management information bases (MIBs). The XSCF supports the configuration of the

following two MIBs (configuration commands can be found in Table 8):

• XSCF extension MIB (SP-MIB). Provides information on the status and configuration of the platform. For fault events, the SP-MIB sends a trap with basic fault information.

• Fault Management MIB (FM-MIB). Records fault event data. The FM-MIB provides the same detailed information as the FMA MIB in an Oracle Solaris domain. This data can help service technicians diagnose failures.

TABLE 8. SERVICE PROCESSOR SNMP AGENT CONFIGURABLE FOR ONE OR BOTH MIBS

MIB                 CONFIGURATION COMMAND
SP traps only       setsnmp enable SP_MIB
FMA traps only      setsnmp enable FM_MIB
SP and FMA traps    setsnmp enable

Oracle Enterprise Manager Ops Center Software

Controlling a rapidly changing IT infrastructure requires intelligent management tools and an

ability to provision servers efficiently. Oracle Enterprise Manager Ops Center is a highly

scalable data center management platform that provides organizations with systems lifecycle

management and automation processes to help manage data center requirements such as

server consolidation, compliance reporting, and rapid provisioning. This management

platform helps enterprises to provision and administer both physical and virtual data center

assets. Oracle Enterprise Manager Ops Center provides a single console to help discover,

provision, update, and manage globally dispersed heterogeneous IT environments, which may

include Oracle and non-Oracle hardware running Windows, Linux, and Oracle Solaris

operating systems. When used in conjunction with the Sun SPARC Enterprise M3000 server,

this enterprise platform can automate the knowledge necessary for patch lifecycle

management and maintenance. Oracle Enterprise Manager Ops Center can help system

administrators automate software installations, simulation, rollback, compliance checking,

reporting, and many other related activities. In addition, Oracle Enterprise Manager Ops

Center can be used to discover the embedded service tag technology in the service processor

and the domain running on the Sun SPARC Enterprise M3000 server.

Oracle Solaris JumpStart can be implemented in this management solution to provision

Oracle Solaris onto the Sun SPARC Enterprise M3000 server. Oracle Enterprise Manager

Ops Center helps facilitate and control administrative actions from a central location to

ensure accountability and auditing. These automation capabilities can be used for knowledge-based

change management in conjunction with existing configuration management investments. Taking

advantage of Oracle Enterprise Manager Ops Center can help organizations create a more reliable

environment that offers considerable cost savings through reduced maintenance and faster

recovery from downtime.

Oracle Solaris 10

With mission-critical business objectives on the line, enterprises need a robust operating

environment with the ability to optimize the performance, availability, security, and use of

hardware assets. In a class by itself, Oracle Solaris 10 offers many innovative technologies to

help IT organizations improve operations and realize the full potential of Sun SPARC

Enterprise servers.

Observability and Performance

IT organizations need to make effective use of the power of hardware platforms. Oracle Solaris

10 supports near-linear scalability proportional to the number of CPUs (cores) and memory

addressability that reaches well beyond the physical memory limits of even Oracle’s largest

server. The following advanced features of Oracle Solaris 10 provide IT organizations with

the ability to identify potential software tuning opportunities and maximize raw system

throughput:

• Oracle Solaris DTrace is a powerful tool that provides a true system-level view of application and kernel activities, even those running in a Java Virtual Machine. DTrace software safely instruments the running operating system kernel and active applications without rebooting the kernel or recompiling (or even restarting) software. By using this feature, administrators can view accurate and concise information in real time and highlight patterns and trends in application execution. The dynamic instrumentation that DTrace provides enables organizations to reduce the time to diagnose problems from days and weeks to minutes and hours, resulting in faster data-driven fixes.

• The highly scalable, optimized TCP/IP stack in Oracle Solaris 10 lowers overhead by reducing the number of instructions required to process packets. This technology also provides support for large numbers of connections and enables server network throughput to grow linearly with the number of CPUs and network interface cards. By taking advantage of the Oracle Solaris 10 network stack, organizations can significantly improve application efficiency and performance.

• The memory handling system of Oracle Solaris 10 provides multiple page size support to enable applications to access virtual memory more efficiently, improving performance for applications that use large memory intensively.

• The Oracle Solaris 10 multithreaded execution model plays an important role in enabling Sun SPARC Enterprise servers to deliver scalable performance. Improvements to the threading capabilities in Oracle Solaris 10 occur with every release, resulting in performance and stability improvements for existing applications without recompilation.

Availability

The ability to rapidly diagnose, isolate, and recover from hardware and application faults is

paramount for meeting the needs of nonstop business operations. Longstanding features of

Oracle Solaris provide for system self-healing. For example, the kernel memory

scrubber constantly scans physical memory, correcting any single-bit errors to reduce the

likelihood of those problems turning into uncorrectable double-bit errors. Oracle Solaris 10

takes a big leap forward in self-healing with the introduction of the fault manager and

service manager features. With these features, business-critical applications and essential

system services can continue uninterrupted in the event of software failures, major hardware

component breakdowns, and software misconfiguration problems.

• Fault manager reduces complexity by automatically diagnosing faults in the system and initiating self-healing actions to help prevent service interruptions. The fault manager diagnosis engine produces a fault diagnosis once discernible patterns are observed from a stream of incoming errors. Following error identification, fault manager provides information to agents that know how to respond to specific faults. Problem components can be configured out of a system before a failure occurs, and in the event of a failure, this feature performs automatic recovery and application restart. For example, an agent designed to respond to a memory error might determine the memory addresses affected by a specific failure and remove the affected locations from the available memory pool.

• Service manager software converts the core set of services packaged with the operating system into first-class objects that administrators can manipulate with a consistent set of administration commands. Using service manager, administrators can take actions on services including start, stop, restart, enable, disable, view status, and snapshot. Service snapshots save the complete configuration of a service, giving administrators a way to roll back any erroneous changes. Snapshots are taken automatically whenever a service starts to help reduce risk by guarding against erroneous changes. Because service manager is integrated with fault manager, when a low-level fault is found to impact a higher-level component of a running service, fault manager can direct service manager to take appropriate action.

In addition to handling error conditions, efficiently managing planned downtime greatly

enhances availability levels. Tools included with Oracle Solaris 10, such as Oracle Solaris

Flash and Oracle Solaris Live Upgrade, can help enterprises achieve more-rapid and

consistent installation of software, upgrades, and patches, leading to improved uptime.

• Oracle Solaris Flash enables IT organizations to quickly install and update systems with an Oracle Solaris 10 configuration tailored to enterprise needs. This technology provides tools for system administrators to build custom rapid-install images (including applications, patches, and parameters) that can be installed at a data rate close to the full speed of the hardware.

• Oracle Solaris Live Upgrade provides mechanisms to upgrade and manage multiple on-disk instances of Oracle Solaris 10. This technology enables system administrators to install a new operating system on a running production system without taking it offline, with the only downtime for the application being the time necessary to reboot the new configuration.

Security

Today’s increasingly connected systems create benefits and challenges. While the global
network offers opportunities to increase revenue, enterprises must pay close attention to
security concerns. The most secure operating system on the planet, Oracle Solaris 10
provides features previously found only in the military-grade Trusted Solaris operating
system. These capabilities enable the strong controls required by governments and financial
institutions, but also benefit all enterprises focused on security concerns and requirements
for auditing capabilities.

x The user rights management and process rights management capabilities in Oracle Solaris

work in conjunction with Oracle Solaris Containers to enable multiple applications to

securely share the same domain. Security risks are reduced by granting users and

applications only the minimum capabilities needed to perform assigned duties. Best yet,

unlike other solutions on

the market, no application changes are required to take advantage of these security

enhancements.

x The security policy in Oracle Solaris 10 can be extended with labeling features

previously available only in highly specialized operating systems or appliances. These

extensions deliver true multilevel security within a commercial grade operating system,

beneficial to civilian organizations with specific regulatory or information protection

requirements.

x Oracle Solaris 10 provides features that fortify platforms against compromise. Firewall

protection technology included within Oracle Solaris 10 distribution protects individual

systems against attack. In addition, file integrity checking and digitally signed binaries

within Oracle Solaris 10 enable administrators to verify that platforms remain untouched

by hackers. Secure remote access capabilities also increase security by centralizing the

administration of system access across multiple operating systems.
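
The file integrity checking mentioned in the last item rests on a simple idea: record a cryptographic digest of every file while the system is in a known-good state, then recompute the digests later and compare. The sketch below illustrates that idea in Python. It is not the Oracle Solaris tooling itself, and the function names (`file_digest`, `build_manifest`, `find_changes`) are hypothetical names chosen for this illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each regular file under root (by relative path) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changes(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two manifests and report added, removed, and modified files."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            name
            for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

A baseline manifest would be built once on a trusted system and stored where an attacker cannot alter it. Any non-empty list returned by `find_changes` on a later run points at files that deserve a closer look.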

Virtualization and Resource Management

The economic need to maximize the use of every IT asset often necessitates consolidating

multiple applications onto single server platforms. Virtualization techniques take

consolidation strategies one step further by helping organizations create administrative and

resource boundaries between applications within each domain on a server. By taking

advantage of Oracle Solaris Containers and Oracle Solaris Resource Manager software,

organizations can improve resource use and reduce downtime without additional

software licensing expenses.

Oracle Solaris Containers

Containers provide a breakthrough approach to virtualization and software partitioning,

supporting the creation of many private execution environments within a single instance

of Oracle Solaris (Figure 13). Within the Container model, each environment holds a unique identity

and maintains resource and namespace isolation. Administrators can configure separate LAN

or virtual LAN connections with exclusive IP stacks for individual Containers, creating

secure separation of network traffic. By supporting fine-grained control over the assignment

of system rights and resources, Containers can ease consolidation efforts.

Applications within Containers are isolated, preventing processes in one Container from

monitoring or affecting processes running in another Container. Even a superuser process

running inside one Container cannot view or affect activity in other Containers. Software

fault and security isolation features in Oracle Solaris Containers prohibit poorly behaved

applications from impacting other Containers. This isolation supports better administrative

control, helping organizations eliminate error propagation, unauthorized access, and

unintentional intrusions.

Figure 13. Containers isolate applications using flexible software mechanisms.

Hosting multiple applications on one system helps organizations use expensive

resources to greater effect. Using Containers can lead to lower costs by helping IT

organizations harness and provision otherwise idle compute power into a secure, isolated

runtime environment for new deployments. For example, a database, Web server, and batch

application (each running on its own system) can be consolidated onto a single server

configured to give each access to one-third of the available system resources. That same

server can be automatically reconfigured

so that the Web server receives 75 percent of the network bandwidth during peak load

conditions. When applied to test and development environments, Containers can minimize

the need for dedicated test systems and facilitate the implementation of multiple

deployment scenarios with ease. At the end of a testing cycle, administrators can also

rapidly duplicate validated configurations for production deployment. With the ability to

dynamically allocate resources, Containers help improve resource use without increasing

the number of operating system instances to manage.
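
The consolidation example above comes down to share arithmetic: a workload's entitlement is its share count divided by the total number of shares assigned. The Python sketch below works through the two cases from the text, assuming a simple proportional-share model. The workload names, the share counts, and the `entitlements` helper are hypothetical; the text itself states only the resulting one-third and 75 percent figures.

```python
from fractions import Fraction


def entitlements(shares: dict[str, int]) -> dict[str, Fraction]:
    """Convert per-workload share counts into fractions of the resource."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one workload needs a positive share count")
    return {name: Fraction(count, total) for name, count in shares.items()}


# Normal operation: three equal shares give each workload one-third.
normal = entitlements({"database": 1, "web": 1, "batch": 1})

# Peak load: 6 of the 8 shares go to the Web server, which is 75 percent.
peak = entitlements({"database": 1, "web": 6, "batch": 1})

for name in normal:
    print(f"{name}: normal {normal[name]}, peak {peak[name]}")
```

Because entitlements depend only on the ratio of share counts, moving from the normal to the peak configuration means changing a single number rather than recalculating every workload's allocation.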

Oracle Solaris Resource Manager

Resource management tools address the needs of consolidation efforts, which require soft

resource boundaries between applications. With no privileges to access underlying

hardware, resource management software leverages operating system controls to govern the

use of CPU, memory, and I/O. Oracle Solaris Resource Manager software enables system

administrators to set and enforce policies that guarantee a share of CPU cycles and virtual

memory space to individual applications. Administrators can also set upper limits on process

count, number of logins, and connect time for each system user ID. In addition, Oracle

Solaris Resource Manager

can be used along with other virtualization technologies to further define resource rights for

each virtualized boundary. In fact, Oracle Solaris Resource Manager enables the dynamic

allocation of processors and individual processor cores to a Container. The power to define

and readily adjust compute resource levels within virtualized environments helps enterprises

improve hardware use and better guarantee the quality of service for individual applications.
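
The per-user upper limits described above (process count, number of logins, and connect time) can be pictured as a small policy record that is checked against current usage. The Python sketch below is a conceptual illustration only. The `UserLimits` and `UserUsage` types, the `violations` function, and all numeric values are hypothetical and do not represent the Oracle Solaris Resource Manager interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserLimits:
    """Hypothetical upper limits for one user ID."""
    max_processes: int
    max_logins: int
    max_connect_minutes: int


@dataclass(frozen=True)
class UserUsage:
    """Hypothetical snapshot of what a user ID currently consumes."""
    processes: int
    logins: int
    connect_minutes: int


def violations(limits: UserLimits, usage: UserUsage) -> list[str]:
    """Return the names of the limits that the usage exceeds."""
    checks = [
        ("processes", usage.processes, limits.max_processes),
        ("logins", usage.logins, limits.max_logins),
        ("connect_minutes", usage.connect_minutes, limits.max_connect_minutes),
    ]
    return [name for name, used, cap in checks if used > cap]


limits = UserLimits(max_processes=200, max_logins=4, max_connect_minutes=480)
usage = UserUsage(processes=150, logins=5, connect_minutes=480)
print(violations(limits, usage))
```

In this sketch usage that exactly meets a limit is allowed and only usage above the limit is reported, so the sample data flags the login count alone.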

Conclusion

To support the high demand for reliability, manageability, and reduced environmental loads

in data centers, infrastructures need to provide ever-increasing performance and capacity

along with power conservation and miniaturization. Outfitted with the SPARC64 VII

processor developed to provide high performance and low power consumption, a large

memory capacity, a reliable architecture, and a system monitoring feature, Oracle's Sun

SPARC Enterprise M3000 server delivers new levels of power, availability, and ease-of-use

to enterprises. Organizations using this server can open the door to a new environment,

fostering greater business opportunities and gaining a strategic asset in the quest to get ahead

and stay ahead of the competition.