Huawei OceanStor UDS Massive Storage System Technical White Paper
Issue 1.1
Date 2014-06
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2013. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not
be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all
statements, information, and recommendations in this document are provided "AS IS" without warranties,
guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website: http://enterprise.huawei.com
HUAWEI OceanStor UDS Massive Storage System
Technical White Paper Contents
Issue 01 (2014-04-106) Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd..
Contents
1 Executive Summary
2 Introduction
3 Solutions
3.1 Product Composition
3.2 Product Features
3.2.1 Exascale Scalability
3.2.2 SoD Architecture
3.2.3 High Security and Reliability
3.2.4 Low TCO
4 Experience
4.1 Solution 1: Massive Resource Pool
4.1.1 Typical Needs and Problems Facing Customers
4.1.2 Solution
4.1.3 Software and Hardware Configurations
4.1.4 Benefits
4.2 Solution 2: Centralized Backup
4.2.1 Typical Needs and Problems Facing Customers
4.2.2 Solution
4.2.3 Software and Hardware Configurations
4.2.4 Benefits
4.3 Solution 3: Web Disk
4.3.1 Typical Needs and Problems Facing Customers
4.3.2 Solution
4.3.3 Software and Hardware Configurations
4.3.4 Solution Network
4.3.5 Benefits
4.4 Solution 4: Centralized Active Archiving
4.4.1 Typical Needs and Problems Facing Customers
4.4.2 Solution
4.4.3 Software and Hardware Configurations
4.4.4 Solution Network
4.4.5 Benefits
5 Conclusion
6 Acronyms and Abbreviations
1 Executive Summary
As the IT industry develops, data volumes are growing at an unprecedented rate, and storing
massive data reliably calls for a new kind of storage system: the massive storage system. This
document describes the HUAWEI UDS massive storage system (UDS for short) in terms of
product composition, application scenarios, and advantages. With large capacity, high
reliability, and outstanding scalability, the UDS brings unique value to customers.
2 Introduction
As the IT industry evolves, IT has become closely tied to people's lives. Data, the
cornerstone and most important asset of the IT industry, is growing at an unprecedented
speed. Big data has become a trend, and all industries call for secure and reliable storage of
massive data.
HUAWEI UDS massive storage system is developed to address problems and challenges
facing customers. The UDS features:
Industry-leading scale-out distributed storage architecture and the distributed hash table
(DHT) algorithm
Diversified external interfaces compatible with Amazon Simple Storage Service (S3)
interfaces
Multi-level data protection technologies such as Multiple Copies (MC) and Erasure Code
(EC)
With large capacity, high reliability, easy maintenance, and flexible scalability, the UDS
applies to scenarios of massive data storage and centralized backup, and supports an exascale
capacity and a secure, reliable, efficient, and converged architecture.
3 Solutions
Based on industry trends and a thorough understanding of customer needs, Huawei releases
the UDS, a massive storage system designed specifically for the big data market. The UDS
employs the DHT-based scale-out storage architecture and multiple data protection
technologies such as EC and MC to ensure data security, and provides unified external
interfaces for the access of multiple types of services, meeting massive data storage
requirements.
3.1 Product Composition
The UDS consists of access nodes (A-Nodes for short) and universal distributed storage nodes
(UDSNs). A-Nodes schedule data, that is, they distribute data requests from upper-layer
services to UDSNs. UDSNs store data. To meet the requirements of massive data storage,
A-Nodes and UDSNs are deployed in high availability (HA) clusters, namely the access cluster
and the storage cluster. Figure 3-1 shows the components deployed in a UDS cabinet.
Figure 3-1 Components deployed in a UDS cabinet
A-Nodes: two T3200 servers
UDSNs: up to seven (flexibly deployed based on the load bearing capability of equipment
rooms and power consumption requirements)
Access switches: two full-10GE switches (S6724 for the enterprise market, S6324 for the
carrier market)
An A-Node is used to process and control access requests initiated by clients, establish object
transmission channels, and manage metadata. A-Nodes can be clustered. When the amount of
concurrent access requests is large, new A-Nodes can be added to improve request processing
capabilities, thereby eliminating data processing bottlenecks.
Figure 3-2 shows the appearance of an A-Node.
Figure 3-2 Appearance of an A-Node
Specifications            Value
Disk type                 SATA, SAS, NL SAS, and SSD
Max. number of disks      12
Max. capacity per disk    4 TB
AC power supplies         100 V to 127 V or 200 V to 240 V, 1+1 power supply redundancy
Power consumption         Without service disks: 350 W
                          Max. power consumption: 650 W
Dimensions                86.1 mm x 446 mm x 585 mm (2 U)
Weight                    18.5 kg (unloaded)
A UDSN is used to store, replicate, and ensure consistency of data and metadata. A UDSN
contains innovative smart disks. Unlike traditional disks, smart disks combine disk drives and
CPUs to provide improved data processing capabilities. The storage capacity of the UDS can
be expanded by adding UDSNs.
Figure 3-3 shows the appearance of a UDSN.
Figure 3-3 Appearance of a UDSN
Specifications            Value
Disk type                 SATA
Max. number of disks      75
Max. capacity per disk    4 TB
AC power supplies         100 V to 127 V or 200 V to 240 V, 2+2 or 1+1 power supply redundancy
Max. power consumption    1350 W
Dimensions                176.5 mm x 446 mm x 790 mm (4 U)
Weight                    45.2 kg (unloaded)
                          97.7 kg (fully-loaded)
The UDS provides massive data storage capabilities through A-Node and UDSN clusters and
cross-cabinet capacity expansion. UDS cabinets are connected by service switches (S6724 for
enterprise markets, S6324 for carrier markets) and core switches over a full 10GE network,
as shown in Figure 3-4.
Figure 3-4 UDS system network diagram
3.2 Product Features
As a massive storage system, the UDS applies to big data scenarios where upper-layer
services vary and data volumes soar. To meet growing data security requirements, the
UDS provides a secure, reliable, massive, efficient, and converged storage architecture to
cope with challenges in the big data era.
3.2.1 Exascale Scalability
For customers who do not want to purchase a large storage capacity during initial deployment,
but will expand the system capacity with the service growth, the UDS reduces their initial
investment and provides flexible capacity scalability. For customers whose service systems
carry heavy workloads, the UDS provides a massive storage resource pool to eliminate
storage capacity bottlenecks.
Based on the elastic DHT, DHT-based one-off addressing, and key technologies such as
decentralized architecture, stateless access cluster, and metadata hashing, the UDS provides
massive storage capacities scalable to exabyte level.
3.2.1.1 DHT
The UDS uses a DHT-based hash algorithm to divide the address space of all storage units and
map the resulting partitions onto the DHT ring. Each storage unit stores data as objects and is
located by its address space. When a data object is read or written, it is located on its
storage unit with a single hash computation (one-off addressing).
A DHT ring has an infinite address space, and its size is elastic: it changes with the
partition size (as shown in Figure 3-5). Theoretically, a DHT ring supports an extremely large
number of storage units, laying a solid foundation for exascale capacity expansion.
Figure 3-5 Elastic DHT ring that supports infinite node expansion
The UDS provides the following advantages based on the DHT:
1 Metadata is evenly distributed across a virtual space comprising all physical nodes,
enabling infinite storage expansion.
2 Any two nodes access data as equals in a point-to-point manner, which avoids the latency of
querying an index on a central node and eliminates performance bottlenecks.
3 Storage capacity can be gradually expanded on demand.
The DHT technical principles are as follows:
Each storage unit (smart disk) corresponds to a physical node and has a unique ID.
In the UDS, data has a key and is stored by the key's hash value. The hash value of a key
corresponds to a storage unit.
Hash values of all keys reside in the integer range [0, 2^32 - 1]. When the UDS is being
initialized, this integer range is divided into multiple same-size partitions, each of which
contains the same number of hash value integers. Each partition represents an equal share of
the hash space.
The capacity of each physical node is usually divided into 20 to 40 partitions.
Each partition corresponds to a virtual node. Data in a partition is stored onto the
corresponding virtual node.
The UDS maintains and updates a mapping table between partitions (or virtual nodes)
and physical nodes.
The DHT ring is an integer range of 0 to 2^128. Each virtual node is mapped to the DHT
ring and each data key is mapped to a virtual node. The data identified by a key's hash value
is stored on the corresponding virtual node.
After the UDS is expanded, for example, when new physical nodes are added, the number of
hash space partitions remains unchanged but the mappings between virtual nodes and
physical nodes are updated automatically. The DHT ring has an infinite address space
and therefore supports unlimited virtual nodes. By adjusting mappings between virtual nodes
and physical nodes, unlimited physical nodes can be added.
Figure 3-6 uses a storage cluster with four physical nodes (each physical node contains five
virtual nodes) as an example to describe the DHT technical principles.
Figure 3-6 DHT technical principles
The hash space (0 to 2^32 - 1) is divided into N same-size partitions. In the preceding
figure, the hash space is divided into 20 partitions from P0 to P19. Each partition contains
the same number of hash values.
The hash value of each key is mapped to a partition. For example, the hash value of key k1 is
mapped to partition P0.
A to T in the preceding figure represent 20 virtual nodes. Data in a partition is stored onto the
corresponding virtual node. For example, data represented by the key whose hash value is
mapped to P0 is stored to virtual node A. Similarly, data represented by the key whose hash
value is mapped to P1 is stored to virtual node B.
Physical nodes 1, 2, 3, and 4 that represent physical storage units (smart disks) provide
persistent data processing capabilities. A physical node has a mapping relationship with
virtual nodes. Usually, a physical node corresponds to multiple virtual nodes. This mapping
relationship is similar to that between partitions and physical nodes.
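The lookup path in Figure 3-6 (key, hash value, partition, virtual node, physical node) can be sketched in a few lines of Python. This is a minimal illustration, not the UDS implementation: the hash function (SHA-256 truncated to 32 bits), the round-robin assignment of partitions to physical nodes, and all names such as `locate` are assumptions made for the example.

```python
import hashlib
import string
from collections import Counter

HASH_BITS = 32            # hash space is [0, 2^32 - 1]
NUM_PARTITIONS = 20       # P0..P19, fixed when the system is initialized
NUM_PHYSICAL_NODES = 4    # physical nodes 1..4 (smart disks)

# One virtual node per partition, named A..T as in Figure 3-6.
VIRTUAL_NODES = string.ascii_uppercase[:NUM_PARTITIONS]

# Assumed round-robin mapping: each physical node owns five partitions.
PARTITION_TO_PHYSICAL = {
    p: p % NUM_PHYSICAL_NODES + 1 for p in range(NUM_PARTITIONS)
}


def key_hash(key: str) -> int:
    """Hash a key into [0, 2^32 - 1] (hash function is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def partition_of(hash_value: int) -> int:
    """Map a hash value to one of the near-equal-size partitions."""
    return (hash_value * NUM_PARTITIONS) >> HASH_BITS


def locate(key: str):
    """Return (partition, virtual node, physical node) for a key."""
    p = partition_of(key_hash(key))
    return p, VIRTUAL_NODES[p], PARTITION_TO_PHYSICAL[p]


if __name__ == "__main__":
    partition, vnode, pnode = locate("k1")
    print(f"k1 -> P{partition} -> virtual node {vnode} -> physical node {pnode}")
    print(Counter(PARTITION_TO_PHYSICAL.values()))
```

Because the location is computed from the key alone, no central index has to be consulted, which is the point of one-off addressing.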
The number of partitions is determined when the UDS is initialized and remains unchanged
as the number of physical nodes increases. Changing the partition quantity changes the
number of hash values in each partition. As a result, data in each partition
and node will be relocated. Therefore, the number of hash partitions is kept unchanged to
avoid data relocation.
Physical nodes can be added or removed online based on capacity requirements. The number
of partitions does not change with the number of physical nodes in the SoD cluster, but
mappings between partitions and physical nodes are automatically updated when the number of
physical nodes changes. As shown in the preceding figure, physical nodes 1, 2, 3, and 4 each
correspond to five virtual nodes (partitions). After the new physical node 5 is added, each
physical node corresponds to four virtual nodes (partitions).
Figure 3-7 Mappings between physical nodes and partitions after a physical node is added
After a new physical node is added, four partitions are allocated to it. Therefore, one
fifth of the cluster's data is migrated to the new node.
Based on the DHT algorithm, if the UDS has M physical nodes and a new node is added,
1/(M+1) of the UDS's data is migrated after partitions are reallocated to all physical nodes.
Similarly, if a node is faulty or removed from the UDS, 1/M of the UDS's data is migrated
after the partition reallocation. Figure 3-8 shows mappings between physical nodes and
partitions after physical node 4 is removed.
Figure 3-8 Mappings between physical nodes and partitions after a physical node is removed
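The migration fractions quoted above can be checked with a small rebalancing sketch. It assumes a simple policy that is not described in this document: when a node joins, partitions are taken from the most heavily loaded nodes until the newcomer holds its fair share, and when a node leaves, its partitions go to the least loaded survivors. The function names and the round-robin starting layout are hypothetical.

```python
from collections import Counter

NUM_PARTITIONS = 20  # fixed at initialization, never changes


def initial_mapping(num_nodes: int) -> dict:
    """Assumed starting layout: partitions spread round-robin over nodes."""
    return {p: p % num_nodes + 1 for p in range(NUM_PARTITIONS)}


def add_node(mapping: dict, new_node: int):
    """Move just enough partitions to give the new node a fair share."""
    mapping = dict(mapping)
    load = Counter(mapping.values())
    target = len(mapping) // (len(load) + 1)  # fair share per node
    moved = []
    for p in sorted(mapping):
        if len(moved) == target:
            break
        owner = mapping[p]
        if load[owner] > target:  # only take from overloaded nodes
            load[owner] -= 1
            mapping[p] = new_node
            moved.append(p)
    return mapping, moved


def remove_node(mapping: dict, node: int):
    """Reassign the removed node's partitions to the least loaded nodes."""
    mapping = dict(mapping)
    orphans = sorted(p for p, n in mapping.items() if n == node)
    load = Counter(n for n in mapping.values() if n != node)
    for p in orphans:
        target = min(load, key=lambda n: (load[n], n))
        mapping[p] = target
        load[target] += 1
    return mapping, orphans


if __name__ == "__main__":
    four = initial_mapping(4)
    five, moved = add_node(four, new_node=5)
    print("moved on add:", len(moved), "of", NUM_PARTITIONS)       # 1/(M+1)
    three, moved = remove_node(four, node=4)
    print("moved on remove:", len(moved), "of", NUM_PARTITIONS)    # 1/M
```

Only the partitions that change owner are migrated; every other partition, and therefore every other key, stays where it was.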
3.2.1.2 Decentralized Architecture
The UDS has two logical clusters: the access cluster and the storage cluster, which consist of
A-Nodes and UDSNs respectively. An A-Node provides access to the object-based storage
service. It also processes and controls access requests initiated by clients, establishes object
transmission channels, and manages metadata. A UDSN is used to store, replicate, and ensure
consistency of data and metadata.
Figure 3-9 shows the DHT algorithm-based equivalent point-to-point data access between
A-Nodes and UDSNs. In the UDS, an A-Node can directly access any UDSN for data
read/write based on the DHT-based addressing. Unlike traditional storage systems, this way of
data access in the UDS does not rely on central nodes, which shortens the latency of data
index query and eliminates access bottlenecks.
Figure 3-9 Decentralized architecture for equivalent point-to-point access
3.2.1.3 Smart Disk
The UDS uses smart disks as storage units, which are also regarded as physical nodes. A
smart disk contains a disk drive, energy-saving Advanced RISC Machines (ARM) chip,
small-capacity memory, and Ethernet ports. Each smart disk is allocated a dedicated IP
address to connect to switches and communicate with other smart disks in a distributed and
interconnected network, as shown in Figure 3-10. The UDS capacity can be expanded by
adding smart disks, enabling fine-grain capacity expansion at the disk-level.
Figure 3-10 Decentralized architecture for equivalent point-to-point access
Each smart disk has fixed data access throughput. Therefore, the throughput of the UDS can
linearly grow with the number of smart disks. For details, see the HUAWEI OceanStor UDS
Massive Storage System Technical White Paper — Smart Disks.
3.2.1.4 Stateless Cluster
In the UDS, A-Nodes are networked in the access cluster. Based on the object-based storage
technology and the DHT algorithm-based one-off addressing, A-Nodes that are loosely
coupled with UDSNs can work as stateless service nodes. An A-Node can process any service
requests allocated to it after load balancing. Unlike traditional storage systems where the
number of nodes used for processing service requests is limited due to state synchronization
and locking mechanisms, the UDS can have an unlimited number of A-Nodes in its access
cluster theoretically, eliminating architecture bottlenecks that hinder exabyte-level capacity
expansion.
3.2.1.5 Metadata Hashing
The UDS does not have dedicated metadata nodes. Instead, metadata services are provided by
A-Nodes, which distribute metadata slices evenly onto UDSNs in the same way as common
data based on the DHT algorithm. When the number of concurrent access requests soars,
metadata service requests are distributed to A-Nodes in the access cluster for load balancing
and A-Nodes can be added on demand to improve the request processing capability,
preventing a bottleneck from occurring.
3.2.1.6 MDC
The UDS provides the Multiple Data Center (MDC) feature to centrally schedule and manage
multiple DCs across regions. To meet different capacity requirements, the UDS can be
expanded from several terabytes to exabytes on demand.
Figure 3-11 Centralized scheduling and management of multiple DCs across regions
As shown in Figure 3-11, the UDS can synchronize data between cross-regional DCs,
customize data copy policies based on service level agreements (SLAs), and preferentially
access data on the nearest DC. The MDC feature ensures exascale capacity expansion in terms
of scalability, reliability, and operability.
3.2.2 SoD Architecture
Sea of Disks (SoD) is an innovative and decentralized storage architecture dedicated to
processing massive unstructured data that is much more frequently read than written. With
the DHT algorithm-based addressing, a large number of power-saving and cost-effective
smart disks are consolidated into a decentralized cluster with unified software and hardware.
Based on the SoD architecture, the UDS provides outstanding availability, scalability, and
maintainability. High performance is delivered while costs and energy consumption are cut
down.
The SoD architecture comprises the access cluster and the storage cluster.
The access cluster consists of A-Nodes that process external requests. A-Nodes have powerful
computing capabilities. Therefore, the access cluster provides computing-intensive services
such as request access, user authentication, data slicing, data aggregation, and data routing.
The storage cluster consists of UDSNs whose computing capability is inferior to that of
A-Nodes. A UDSN comprises power-saving and energy-efficient smart disks. Each smart disk
provides key-value interfaces. All user data is stored in the storage cluster as data slices after
being processed by the access cluster.
Data slices and partitions in the storage cluster are divided and routed based on the DHT.
The DHT determines the location where a data slice is stored. Therefore, the storage cluster
can be regarded as a DHT ring.
3.2.2.1 I/O Process
The data write process is as follows:
1 Request access: The client sets up connections with an A-Node of the UDS and transmits
data to that A-Node.
2 Storage policy selection: The A-Node determines the data storage policy based on preset
configurations.
3 Data slicing: If the amount of data transmitted from the client exceeds 1 MB, the A-Node
divides the data into multiple slices of 1 MB each.
4 Data routing: The A-Node writes the data slices into the storage cluster based on the DHT.
The data read process is as follows:
1 Request access: The client sets up connections with an A-Node of the UDS and sends a
data read request to that A-Node.
2 Data routing: The A-Node locates the partition where the requested data resides based on
the DHT, and obtains the address of the smart disk where the partition resides.
3 Data repair: If any data slice is damaged, the A-Node repairs the data slice based on the
specified data storage policy.
4 Data aggregation: The A-Node aggregates data slices to the original data and sends the
data to the client.
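The slicing step of the write process and the aggregation step of the read process can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the handling of the final slice (left shorter than 1 MB rather than padded) is an assumption, since the document does not specify it.

```python
SLICE_SIZE = 1024 * 1024  # 1 MB, the slice size stated in the write process


def slice_data(data: bytes, slice_size: int = SLICE_SIZE) -> list:
    """Split data into slices of slice_size bytes.

    Data no larger than one slice is kept whole. The last slice may be
    shorter than slice_size (assumption: no padding).
    """
    if len(data) <= slice_size:
        return [data]
    return [data[i:i + slice_size] for i in range(0, len(data), slice_size)]


def aggregate(slices: list) -> bytes:
    """Reassemble the original data from its slices, in order."""
    return b"".join(slices)


if __name__ == "__main__":
    payload = bytes(2 * SLICE_SIZE + 512 * 1024)  # 2.5 MB of zero bytes
    slices = slice_data(payload)
    print([len(s) for s in slices])
    print(aggregate(slices) == payload)
```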
Buffers are reserved in A-Nodes for data slicing and aggregation.
1 Data write: After dividing data into slices, an A-Node buffers some data slices and writes
data slices to different UDSNs to speed up data write.
2 Data read: An A-Node anticipates the range where the data requested by the client
resides and then reads data slices consecutively from smart disks onto the buffer to speed
up data read.
To achieve the optimal throughput with the minimum resources, A-Nodes automatically
adjust buffer sizes and the volume of data concurrently read from or written to smart disks
based on the connection speeds and data volumes of clients.
3.2.2.2 Request Access
The access cluster of the UDS provides standard S3 interfaces and rich S3 ecosystems
(including tools, development packages, and third-party software integration).
Amazon S3 is the de facto standard in the cloud storage field. Built on HTTP, Amazon S3
provides mature interfaces that follow the Representational State Transfer (REST)
architectural style. S3 interfaces are easy to use, reliable, stateless, and inherently easy
to access over networks.
S3 interfaces define a data model consisting of three layers: user, bucket, and object.
1 A user in the UDS can own and manage buckets and objects.
2 A bucket is the container of objects, similar to a folder in a file system. A bucket can
contain multiple objects but no other buckets.
3 An object is a set of data, similar to a file in a file system.
The UDS data model abandons the traditional nested directory structure. A single bucket is
able to house hundreds of millions of objects and is easy to expand. This flat storage structure
is well suited to unstructured data.
The UDS defines a user model consisting of three layers: account, group, and user.
1 An account owns resources in the UDS, and corresponds to an individual user, enterprise,
or organization.
2 A user uses resources in the UDS. An account can create multiple users and grant users
permissions for different resources. Usually, a user corresponds to an employee of an
enterprise or organization.
3 A group is a collection of users. An account can create multiple groups, add users to
different groups, and grant groups permissions for different resources. One user can
belong to different groups. Users in a group inherit all permissions of the group. Usually,
a group corresponds to a department of an enterprise or a sub-organization.
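The account, group, and user model can be illustrated with a small data structure in which a user's effective permissions are the union of direct grants and grants inherited from groups. The class, its methods, and the (resource, action) permission format are all hypothetical; the actual UDS permission interfaces are not described in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """An account owns resources and manages its users and groups."""
    name: str
    user_grants: dict = field(default_factory=dict)   # user -> {(resource, action)}
    group_grants: dict = field(default_factory=dict)  # group -> {(resource, action)}
    members: dict = field(default_factory=dict)       # group -> {user}

    def grant_user(self, user: str, resource: str, action: str) -> None:
        self.user_grants.setdefault(user, set()).add((resource, action))

    def grant_group(self, group: str, resource: str, action: str) -> None:
        self.group_grants.setdefault(group, set()).add((resource, action))

    def add_to_group(self, user: str, group: str) -> None:
        self.members.setdefault(group, set()).add(user)

    def is_allowed(self, user: str, resource: str, action: str) -> bool:
        """Allowed if granted directly or inherited from any group."""
        perm = (resource, action)
        if perm in self.user_grants.get(user, set()):
            return True
        return any(
            user in users and perm in self.group_grants.get(group, set())
            for group, users in self.members.items()
        )


if __name__ == "__main__":
    acct = Account("example-corp")
    acct.grant_group("engineering", "bucket-logs", "read")
    acct.add_to_group("alice", "engineering")
    acct.grant_user("bob", "bucket-logs", "write")
    print(acct.is_allowed("alice", "bucket-logs", "read"))   # inherited
    print(acct.is_allowed("alice", "bucket-logs", "write"))  # not granted
```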
3.2.2.3 Storage Policy
The UDS provides flexible storage policies. A storage policy determines the reliability,
availability, security, and space occupation of data. Upon receiving data access requests from
users, the access cluster of the UDS reads user configurations from the user server and then
determines a data storage policy accordingly.
1. Multiple Copies (MC)
The MC storage policy generates multiple copies for a piece of data. Each data copy is stored
onto different physical nodes in the same storage cluster. Even if some data copies are
completely damaged or lost, users can still access the other data copies. This storage policy
provides high redundancy and reliability, but consumes large storage space.
The MC storage policy works based on a quorum mechanism. This mechanism defines a
group of replication parameters, which are called NWR for short.
N: indicates the number of data copies. A piece of data has N copies in the UDS.
W: indicates the number of data copies that are successfully written onto the UDS. Only
after W data copies are successfully written, the UDS returns a write success message to
the user.
R: indicates the number of data copies that are successfully read from the UDS. Only
after R copies of the requested data are successfully read, the data is returned to the user.
The default MC policy adopted by the UDS is N = 3, W = 2, R = 2 (NWR = 322), that is, each
piece of data has three copies, and a write or a read succeeds only after at least two copies
are written or read, respectively. This storage policy strikes a balance between reliability
and data consistency and applies to most reliability-demanding scenarios.
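The effect of the N, W, and R parameters can be seen in a toy quorum model. Because W + R > N with the default values, every read quorum overlaps every successful write quorum in at least one copy, so a read sees the latest successful write. The sketch below is an assumption-laden illustration: the `Replica` class, the explicit version numbers used to pick the newest copy, and the in-memory storage are all hypothetical and do not describe the UDS internals.

```python
N, W, R = 3, 2, 2  # number of copies, write quorum, read quorum


class Replica:
    """One copy holder; may be offline to simulate a fault."""

    def __init__(self):
        self.store = {}      # key -> (version, value)
        self.online = True

    def put(self, key, version, value) -> bool:
        if not self.online:
            return False
        self.store[key] = (version, value)
        return True

    def get(self, key):
        if not self.online:
            return None
        return self.store.get(key, (0, None))


def quorum_write(replicas, key, version, value, w=W) -> bool:
    """Report success only if at least w copies were written."""
    acks = sum(rep.put(key, version, value) for rep in replicas)
    return acks >= w


def quorum_read(replicas, key, r=R):
    """Return the newest value among at least r successfully read copies."""
    replies = [x for x in (rep.get(key) for rep in replicas) if x is not None]
    if len(replies) < r:
        raise RuntimeError("read quorum not reached")
    return max(replies, key=lambda reply: reply[0])[1]


if __name__ == "__main__":
    replicas = [Replica() for _ in range(N)]
    quorum_write(replicas, "obj", 1, "old")
    replicas[2].online = False                 # one copy misses the update
    print(quorum_write(replicas, "obj", 2, "new"))
    replicas[2].online = True
    replicas[0].online = False                 # a different copy fails
    print(quorum_read(replicas, "obj"))        # still sees the newest value
```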
With this storage policy, the UDS's access cluster slices user data into pieces, and each piece
is replicated to multiple copies. The data copies are then written onto different data partitions
of physical nodes in the DHT.
The UDS is environment-aware, storing data copies onto physical environments independent
from each other to improve data reliability and availability. For example, copies of the same
data are stored onto different storage cabinets, enclosures, and physical nodes, to tolerate
more fault scenarios.
2. Erasure Code (EC)
The EC storage policy generates redundant data for a piece of data. If a piece of data is
partially damaged or lost, the UDS can use its redundant data to reconstruct or repair the
damaged data. The EC storage policy ensures high data reliability and consumes less storage
space, striking a balance between reliability and economy.
After the access cluster divides data into slices, every M consecutive slices form an EC
group. Based on the EC storage policy, the UDS generates N parity slices for the EC group. The
data slices and parity slices are stored onto a consecutive group of data partitions in the
storage cluster, so that they reside on different physical nodes, improving data reliability.
As long as the number of damaged slices does not exceed N, the access cluster is able to
restore the damaged slices using the other ones.
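The principle can be illustrated with the simplest possible erasure code: a single XOR parity slice (N = 1) over M data slices. This is deliberately not the code used by the UDS, which this document does not specify; codes that tolerate N > 1 losses (for example Reed-Solomon) need more elaborate arithmetic. The function names are hypothetical, and all slices are assumed to have equal length.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(slices: list) -> bytes:
    """Compute one XOR parity slice for an EC group of equal-length slices."""
    parity = bytes(len(slices[0]))
    for s in slices:
        parity = xor_bytes(parity, s)
    return parity


def recover(slices: list, parity: bytes) -> list:
    """Rebuild at most one missing slice (marked None) from the others."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost slice")
    repaired = list(slices)
    if missing:
        rebuilt = parity
        for s in slices:
            if s is not None:
                rebuilt = xor_bytes(rebuilt, s)
        repaired[missing[0]] = rebuilt
    return repaired


def overhead(m: int, n: int) -> float:
    """Raw space consumed per unit of user data for M data + N parity."""
    return (m + n) / m


if __name__ == "__main__":
    group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # M = 4
    parity = make_parity(group)                     # N = 1
    damaged = [group[0], group[1], None, group[3]]
    print(recover(damaged, parity) == group)
    print(overhead(4, 1), "x raw space, versus 3 x for three full copies")
```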
3. Data stored in different DCs for cross-region disaster recovery
The Multiple Data Center (MDC) policy is configured per bucket. If an MDC policy is
enabled for a bucket, the access cluster writes data for the bucket to the local storage
cluster and copies of the data to UDS systems in other data centers.
The MDC policy supports asynchronous replication. Data is first written onto the local
storage cluster and then a background asynchronous replication task is initiated to replicate
the data to a remote data center. If this task fails, the UDS initiates the task again after a
periodic background scan.
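The write-locally-then-replicate flow can be sketched as follows. The class, the `send` callback, and the in-memory dictionaries are hypothetical stand-ins for the storage cluster and the remote data center.

```python
from collections import deque


class AsyncReplicator:
    def __init__(self, send):
        self.send = send          # hypothetical callable: send(key, data) -> bool
        self.local = {}           # stands in for the local storage cluster
        self.pending = deque()    # keys not yet confirmed at the remote DC

    def put(self, key, data):
        self.local[key] = data    # the local write completes first
        self.pending.append(key)  # replication happens later, in the background

    def scan(self):
        """One background pass: try every pending key once."""
        for _ in range(len(self.pending)):
            key = self.pending.popleft()
            if not self.send(key, self.local[key]):
                self.pending.append(key)  # failed, retried on the next scan
```

A caller of `put` never waits for the remote data center, which is the point of asynchronous replication. The cost is a window during which the remote copy lags behind the local one.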
For details, see the HUAWEI OceanStor UDS Massive Storage System Technical White Paper
— Reliability.
3.2.2.4 Data Routing
The UDS routes data slices based on the DHT.
After the UDS is initialized, mappings between data partitions of the storage cluster and
physical nodes are determined and recorded. At the same time, the UDS maps the data
partitions evenly to a hash space residing in the range of [0, 2^32 - 1]. The next number after
2^32 - 1 is back to 0. Therefore, the hash space is a ring.
When reading a data slice, an A-Node first uses the consistent hashing algorithm to calculate
the hash value of the data slice based on its key. The hash value also falls within the hash
space. Therefore, both data slices and data partitions reside on the same logical hash ring. The
consistent hashing algorithm stores each data slice onto the first data partition next to the data
slice in counterclockwise direction. After obtaining a data partition, the A-Node can locate the
physical node where the data partition resides, which completes the routing of the data
slice.
If the MC storage policy is used, N copies of a data slice are stored. After locating the first
data partition, the A-Node locates the other N-1 data partitions clockwise along the logical
ring. The other N-1 data partitions are where the other N-1 data copies are stored.
If the EC storage policy is used, an A-Node locates the first data partition for the first data
slice of the EC group and then addresses other data partitions according to specific addressing
rules. For details, see the HUAWEI OceanStor UDS Massive Storage System Technical White Paper — Reliability.
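The routing steps for the MC policy can be sketched as follows. The paper does not name the hash function or the partition layout, so the truncated MD5 hash, the round-robin assignment of partitions to nodes, and all names here are assumptions. The sketch walks the ring in increasing hash order for both the first partition and the remaining copies; the direction is only a convention.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32


def ring_hash(key: str) -> int:
    # Assumption: MD5 truncated to 32 bits so the value falls in [0, 2^32 - 1].
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, partition_count: int, nodes: list):
        step = RING_SIZE // partition_count
        # Partitions are spread evenly over the ring and mapped to nodes once.
        self.points = [i * step for i in range(partition_count)]
        self.owner = {p: nodes[i % len(nodes)] for i, p in enumerate(self.points)}

    def locate(self, key: str, copies: int = 1) -> list:
        # First partition at or after the key's hash, wrapping past the last point.
        start = bisect.bisect_left(self.points, ring_hash(key)) % len(self.points)
        picked = [self.points[(start + i) % len(self.points)] for i in range(copies)]
        return [(p, self.owner[p]) for p in picked]


ring = Ring(partition_count=12, nodes=["udsn1", "udsn2", "udsn3", "udsn4"])
placement = ring.locate("bucket/object/slice-0", copies=3)
print(placement)
```

No lookup table of slice locations is consulted: the position follows from the key and the fixed partition map alone, which is why any A-Node can route any request.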
3.2.2.5 Data Repair
If data slices are found damaged when clients read data on the UDS, A-Nodes repair the data
slices based on the storage policy. This is done to ensure the correctness and reliability of the
data read by the clients.
MC
If the data slice that is stored based on the MC policy is damaged, A-Nodes attempt to
read one of its copies to repair the data.
EC
If the data slice that is stored based on the EC policy is damaged, A-Nodes attempt to
read the EC group where the data slice resides. The intact data slices and parity data
slices in the EC group are used to repair the damaged data slice. For details, see the
HUAWEI OceanStor UDS Massive Storage System Technical White Paper — Reliability.
MDC
If the data slice that is stored based on the MDC policy is damaged, A-Nodes attempt to
read the desired data slice from the backup data center and use the data slice to repair the
damaged one.
In addition to read repair, the UDS constantly scans and verifies data in the cluster and
restores damaged data in the background if any errors are detected. Background repair is
classified into two levels: object-level and slice-level.
Object-level
A-Nodes constantly scan data in the cluster, calculate the digest of the object, and
compare the digest with the correct digest stored in the metadata. If the object data is
incorrect, the UDS repairs the object data using the repair mechanism applied to data
read.
Slice-level
Smart disks of the UDS constantly scan the data that they carry and repair incorrect data
slices by using the anti-entropy mechanism. For details about the anti-entropy
mechanism, see the HUAWEI OceanStor UDS Massive Storage System Technical White
Paper — Smart Disks.
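Object-level background repair amounts to a digest comparison followed by a repair call. The following is a minimal sketch. SHA-256 is an assumption, since the paper does not say which digest the UDS uses, and `repair` is a hypothetical callback standing in for the MC, EC, or MDC read-repair path.

```python
import hashlib


def digest(data: bytes) -> str:
    # Assumption: SHA-256 as the object digest.
    return hashlib.sha256(data).hexdigest()


def scrub(objects: dict, metadata: dict, repair) -> list:
    """Repair every object whose bytes no longer match the digest recorded
    in the metadata, and return the keys that were repaired."""
    repaired = []
    for key, data in objects.items():
        if digest(data) != metadata[key]:
            objects[key] = repair(key)  # fetch a good copy through read repair
            repaired.append(key)
    return repaired
```

Running `scrub` a second time over the same objects returns an empty list, because the repaired objects now match their recorded digests.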
3.2.2.6 Cluster Management
1. Access cluster
The access cluster of the UDS is stateless and idempotent.
Stateless: A-Nodes do not use data layout information to process requests. Each request
can be processed by any A-Node. Therefore, requests are independent from each other.
Idempotent: A client can send the same request multiple times, and these attempts
have no adverse impact. Therefore, users can send a failed request repeatedly until the
request is successfully processed. Most of UDS's external interfaces are idempotent.
The access cluster is decentralized and can be regarded as a loosely coupled group of stateless A-Nodes.
Adding or removing A-Nodes from the access cluster has no impact on the other nodes in the
cluster.
The access cluster's DNS implements load balancing and health checks on A-Nodes in the
cluster. Faulty A-Nodes can be discovered in a timely manner and removed from the address
list.
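Statelessness and idempotency together make a plain retry loop safe. The sketch below is a client-side illustration only. In the UDS the DNS service performs load balancing and health checks, and the `send` callback and node names here are hypothetical.

```python
def send_with_retry(request, nodes, send, max_rounds=3):
    """Try each A-Node in turn; safe only because the request is idempotent."""
    for _ in range(max_rounds):
        for node in nodes:
            try:
                return send(node, request)
            except ConnectionError:
                continue  # stateless cluster: any other A-Node can serve the request
    raise ConnectionError("no A-Node accepted the request")
```

Because no A-Node holds per-request state, a request that failed on one node can be replayed unchanged on another without any cleanup.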
2. Storage cluster
The storage cluster is also decentralized.
UDSNs synchronize information with each other using the gossip protocol. Inspired by
the form of gossip seen in social networks, the gossip protocol is used to transmit
information in large-scale clusters. When being initialized, each UDSN in the storage
cluster obtains the cluster information that records the status of each node in the cluster.
A UDSN periodically selects another UDSN at random and synchronizes cluster
information with the latter. After being synchronized, cluster information on both nodes
is combined and updated.
When synchronizing cluster information using the gossip protocol, a UDSN uses the
phi (φ) accrual failure detector, referred to here as the PHI fault detector, to check the status
of other UDSNs. The PHI fault detector anticipates the time window of the next synchronization
of a UDSN based on previous synchronizations. If cluster information of a UDSN is not
synchronized as anticipated, the PHI fault detector considers the UDSN to be faulty.
When an A-Node attempts to write data onto a UDSN that is unavailable due to faults,
the data will be written onto another UDSN temporarily and then written back to the
intended UDSN after the faults are rectified. This data write process is called the hinted
handoff mechanism. For details about the hinted handoff mechanism, see the HUAWEI OceanStor UDS Massive Storage System Technical White Paper — Smart Disks.
When a UDSN detects another UDSN to be faulty, it records the fault in a local status
table. The access cluster periodically obtains status tables from UDSNs. When detecting
that a UDSN is faulty for a long time, the access cluster removes the UDSN from the
storage cluster and updates cluster information. The updated cluster information is then
transmitted to all UDSNs in the storage cluster through the gossip protocol.
After a UDSN is removed from the storage cluster, data partitions on this UDSN are
automatically migrated to the other UDSNs in the storage cluster. According to the
consistent hashing algorithm, only 1/N of the data needs to be migrated after a UDSN is
removed from a storage cluster consisting of N UDSNs.
When a small number of UDSNs are added to the storage cluster, data partitions on the
other UDSNs in the storage cluster are automatically migrated to the newly added
UDSNs. According to the consistent hashing algorithm, only 1/(N+1) of the data needs to be
migrated after a UDSN is added to a storage cluster consisting of N UDSNs.
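The migration fraction for adding a node can be checked with a small partition model. This sketch assumes partitions are the unit of migration and are spread evenly over the nodes. The function and the node names are hypothetical, and the real balancing algorithm is not described in the paper.

```python
from collections import Counter


def add_node(assignment: dict, new_node: str) -> int:
    """Move just enough partitions to `new_node` to even out the load.
    `assignment` maps partition id -> node name; returns partitions moved."""
    nodes = set(assignment.values()) | {new_node}
    target = len(assignment) // len(nodes)   # partitions per node afterwards
    load = Counter(assignment.values())
    moved = 0
    for partition in sorted(assignment):
        if moved == target:
            break
        owner = assignment[partition]
        if load[owner] > target:             # only overloaded nodes give up partitions
            assignment[partition] = new_node
            load[owner] -= 1
            moved += 1
    return moved


# N = 5 nodes holding 1200 partitions; adding a sixth node moves 1200 / 6 of them.
assignment = {p: f"udsn{p % 5}" for p in range(1200)}
moved = add_node(assignment, "udsn5")
print(moved, len(assignment))
```

Only the partitions handed to the new node move. The other partitions stay where they are, which is what keeps expansion cheap compared with rehashing all data.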
3.2.3 High Security and Reliability
As IT develops, data gradually becomes the most important asset of a company and data
loss has an unpredictable adverse impact on a company. Therefore, increasing importance is
attached to data protection. The UDS employs multi-tenant, Multiple Copies (MC), Erasure
Code (EC), and multiple data consistency technologies to ensure data security and reliability
in massive data storage scenarios.
3.2.3.1 High Reliability Clusters
1. Storage cluster
In the storage cluster, data on any faulty smart disk (caused by a man-made or mechanical
error) can be recovered to other smart disks automatically. A faulty smart disk can be removed
from the UDS system without affecting data availability and data slices on smart disks. Unlike
a traditional storage system whose RAID group degrades when a member disk is faulty, the
UDS works correctly when any disk is faulty.
Immediate maintenance of a faulty disk is not needed in the UDS. Instead, all faulty smart
disks can be replaced at one time and need to be maintained only after certain conditions are met,
for example:
The failure rate or the capacity usage reaches the preset threshold, and critical alarms are
generated to inform users of replacing disks in batches or expanding system capacity. For
example, the disk failure rate reaches 6% or the capacity usage exceeds 80%. You can
configure the disk failure rate and capacity usage based on site requirements. This
threshold-triggered maintenance prolongs the maintenance period and reduces
maintenance costs.
All damaged or slow disks are batch replaced based on a preset periodic maintenance
schedule. This schedule-based maintenance reduces the possibility of system faults. After
smart disks are replaced, data slices are evenly relocated to all smart disks based on the
intelligent balancing algorithm, prolonging disk lifespan, lowering disk failure rate,
reducing data loss, and improving system reliability.
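The threshold-triggered rule can be expressed as a short check. The defaults mirror the example values in the text (a 6% failure rate and 80% capacity usage). The function name and return format are hypothetical.

```python
def maintenance_due(failed_disks, total_disks, used_tb, capacity_tb,
                    max_failure_rate=0.06, max_usage=0.80):
    """Return the reasons (if any) for raising a batch-maintenance alarm."""
    reasons = []
    if failed_disks / total_disks >= max_failure_rate:   # failure rate reaches threshold
        reasons.append("replace failed disks")
    if used_tb / capacity_tb > max_usage:                # capacity usage exceeds threshold
        reasons.append("expand capacity")
    return reasons


print(maintenance_due(failed_disks=5, total_disks=75, used_tb=250, capacity_tb=300))
```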
2. Access cluster
All A-Nodes comprise a distributed access cluster via load balancers. Instead of controlling or
saving layout information about data and metadata, A-Nodes use the DHT-based hash
algorithm to calculate data storage locations.
This is a breakthrough in storage structure. Data layout controlled and recorded by control
nodes or engines in traditional storage architecture is no longer required in the UDS
structure where A-Nodes determine data routes based on rules. This change greatly simplifies
data processing and resolves the bottleneck in cluster scalability and reliability. Nodes in a
cluster are freed from complex synchronization and lock mechanisms that restrict a cluster's
node quantity and affect node consistency and reliability.
The decentralized architecture adopted by the UDS eliminates adverse impact on system
availability when any A-Node is faulty due to human or mechanical errors.
3.2.3.2 Multi-Level Data Protection
1. Smart disk level: disk lifecycle management
Focusing on how to lower disk failure rates and control impacts brought by disk faults, the
UDS disk lifecycle management adopts end-to-end lifecycle management technologies such
as disk detection, disk repair, disk failure control, and pre-reconstruction.
The smart disk lifecycle management greatly reduces disk failure rates, prolongs disk lifespan,
and has the following advantages:
Automatic hardware control: The UDS employs the self-managed smart disk architecture
that enables self-monitoring, self-management, and selective data synchronization of
smart disks, remarkably improving system reliability while lowering hardware failure
rates.
Lowered disk failure rate: With the disk lifecycle management, the UDS obtains disk
status in real time, separates and repairs physical and logical bad sectors in a timely
manner, and implements weight management of smart disks, striking a balance between
system data and service loads and prolonging disk lifespan.
2. UDSN level: EC redundancy algorithm
The Erasure Code (EC) redundancy algorithm developed by Huawei is a superset of
traditional RAID. Different from traditional storage systems using RAID consisting of fixed
member disks, the UDS uses EC to consolidate all disks into a unified storage resource pool.
Each time data is written onto or read from the UDS, disks are automatically selected at
random and form a temporary RAID group onto which data blocks and parity blocks
are written. Compared with traditional RAID, temporary RAID improves the overall system
performance and resource utilization.
The innovative DHT-based EC algorithm enables the UDS to provide:
Longer data durability: When the EC policy is configured to M:N = 15:6, data durability
can reach thirteen nines, minimizing data loss risks.
Faster data reconstruction speed: The UDS distributes data objects to different smart
disks based on the hash algorithm. Multiple disks are involved in the reconstruction,
increasing the capacity for concurrent reconstruction and reducing the time for
reconstructing 1 TB data to four hours. This has greatly reduced the reconstruction time
and improved system reliability.
Improved reconstruction efficiency: Data reconstruction is based on objects and only
damaged objects are reconstructed. Undamaged objects or empty regions are not
processed. In this way, the data reconstruction rate is greatly increased.
Global hot spare and batch disk replacement: The UDS uses global hot spare space
instead of hot spare disks for data reconstruction. When the failure rate of smart disks
reaches the threshold, the UDS will send alarms for disk replacement. In the UDS,
immediate maintenance is not needed, which reduces the quantity of spare parts and the number
of upgrade cycles and saves maintenance labor.
Eliminated data recovery restrictions and improved EC recoverability: Compared with
the traditional RAID algorithm, the refined EC reconstruction algorithm can effectively
eliminate restrictions on the quantity of damaged data. Any damaged data block can be
recovered fully, which enhances data durability and system reliability.
More flexible data redundancy and improved disk utilization: The available EC policy
configuration for tenants is as follows: M (the number of data blocks) ={3/6/9/12/15}, N
(the number of parity blocks) = {1/2/3/6}. The EC policy, which is defined based on user
environments, has direct impacts on data durability and provides users with flexible
redundancy ratio configuration and more choices of data durability. The EC algorithm
can flexibly control disk utilization and reduce costs. For example, when M is 12 and N
is 3, the disk utilization is almost 80%.
Enhanced storage management efficiency and lowered costs: Users do not need to spend
much time on storage planning because all the disks automatically form a unified storage
pool. Users only need to insert new disks to expand the system capacity. The UDS
automatically distributes data to each disk.
Intelligent reconstruction of EC groups and lighter system load: The UDS intelligently
determines the range and size of the damaged data and temporarily reconstructs EC
groups to refine the data to be recovered, which reduces system load and improves
recovery efficiency.
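The disk utilization of an M:N policy follows directly from the block counts: M / (M + N) of the raw capacity in an EC group holds user data. The sketch below tabulates the listed combinations. It ignores metadata and reserved hot spare space, which is an assumption that may explain why the text says "almost 80%" for 12:3.

```python
from fractions import Fraction

DATA_BLOCKS = (3, 6, 9, 12, 15)   # M values listed above
PARITY_BLOCKS = (1, 2, 3, 6)      # N values listed above


def utilization(m: int, n: int) -> Fraction:
    """Share of raw capacity that holds user data in an M:N EC group."""
    return Fraction(m, m + n)


for m in DATA_BLOCKS:
    row = ", ".join(f"{m}:{n}={float(utilization(m, n)):.0%}" for n in PARITY_BLOCKS)
    print(row)
```

For comparison, a three-copy MC policy stores each piece of data three times, so its utilization is one third.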
3. Data center level: cross-regional disaster recovery and failover
The UDS provides the Multiple Data Center (MDC) feature that enables users to access the
massive storage system on the nearest DC, maximizing resource utilization and reducing
investment in storage. The UDS also provides flexible cross-regional data redundancy
policies to prevent data loss in the event of a disaster or unexpected fault occurring on the
active DC. Upon the breakdown of an active DC, the UDS quickly resumes services of the
active DC on a standby DC, minimizing the service interruption duration and ensuring service
continuity. The UDS also provides the load balancing technology for resource management,
enabling users to maximize their resource utilization and improve the return on investment
(ROI).
The multi-tenant-based MDC feature has the following advantages:
Unified resource scheduling: Multiple DCs are globally virtualized and consolidated to a
unified resource pool for improving resource utilization. In addition, the MDC feature
uses policy-based scheduling to ensure preferential data access on the nearest DC.
SLA policy-based control: SLA policies are used to control DR paths, number of DR
backups, and quality of service (QoS), supporting customers' service choice decisions
and maintaining an optimal balance between services and resources.
Cross-regional DR: The MDC feature uses HTTP/REST interfaces to perform DR
among DCs. Data stored on a local DC can be backed up and verified on a
remote DC. Data is transmitted between DCs over optimized networks, remarkably
enhancing the DR efficiency. DCs back up for each other to optimize resource
utilization.
For details, see the HUAWEI OceanStor UDS Massive Storage System Technical White Paper
— Reliability.
3.2.3.3 Continuous Data Detection and Repair
1. Short-term fault handling
The UDS defines a special intermediate state: short-term fault state. If the UDS detects a
short-term fault, it will start the fault diagnosis mechanism and try to recover the fault. If the
recovery fails and the fault persists for more than X (X is user-configurable, for example, 15
minutes), the fault goes to the permanent fault state and the node where the fault occurs exits
from the DHT ring. The UDS then starts the data recovery policy to recover damaged data.
The short-term fault handling technology has the following advantages:
Improved data and performance stability: Only the faults that cannot be detected and
recovered by the short-term handling technology are considered permanent faults and
require adjustment of the DHT ring, reducing the workload for data recovery and
migration.
Transparent to upper-layer services and enhanced business continuity: The short-term
fault handling mechanism is invisible to upper-layer services, ensuring business
continuity and durability.
Efficient system performance utilization and increased data recovery efficiency: After a
fault is considered as a permanent fault, the UDS starts the data rebalancing process and
evenly distributes damaged data to multiple nodes through effective data distribution
control.
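The short-term fault state described in this section behaves like a small state machine with a grace period. The sketch below is minimal. The state names and function signature are hypothetical, and the default grace period uses the 15-minute example value given for X.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    SHORT_TERM_FAULT = "short-term fault"
    PERMANENT_FAULT = "permanent fault"


def node_state(fault_started_at, now, recovered, grace_seconds=15 * 60):
    """Classify a node; times are in seconds, `fault_started_at` is None if no fault."""
    if fault_started_at is None or recovered:
        return State.HEALTHY
    if now - fault_started_at > grace_seconds:
        return State.PERMANENT_FAULT   # node exits the DHT ring, data recovery starts
    return State.SHORT_TERM_FAULT      # keep diagnosing, ring left unchanged


print(node_state(fault_started_at=0, now=600, recovered=False))
```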
2. Data integrity
The UDS provides multi-level data integrity protection measures and supports four-level data
repair from track, slice, object, to data center.
The data integrity protection measures effectively ensure end-to-end data durability and have
the following advantages:
Full integrity protection and improved data durability: The UDS provides end-to-end
repair measures for damaged data at four levels from track, partition, object, to data
center.
Progressive recovery measures and reduced resource usage in data recovery: Based on
the degree of data damage, the UDS tries to recover the data in a range as small as
possible to reduce resources used in data recovery.
3. End-to-end consistency
The UDS supports the end-to-end data consistency check at the application level, object level,
slice level, and physical level.
With the four-level data consistency check, the UDS ensures that no silent errors occur
during the data writing process (from the time data is sent by end users to the time data
is written onto the disk). At each level, once data inconsistency is detected, the UDS can
repair or resend the data quickly.
The end-to-end data consistency check greatly improves data security and has the following
advantages:
Data will not be damaged in storage and transmission, ensuring data correctness.
Possible malicious data tampering from internal personnel of cloud storage service
providers is prevented, increasing data security.
3.2.3.4 End-to-End Data Security
The UDS ensures data security in terms of data transfer, data integrity, identity authentication,
data access control, and data encryption.
1. Data transfer
The UDS provides object-based storage interfaces that are compatible with Amazon S3
interfaces and supports Representational State Transfer (REST) interfaces. Users can upload
SSL-encrypted data to the UDS in a DC using a Huawei or third-party terminal.
2. Identity authentication
The UDS uses access key ID (AK) and secret access key (SK) to authenticate user identities.
The keyed-hash message authentication code algorithm (HMAC) is used in authentication.
Based on the HMAC algorithm, a key and a message are input and a message digest is
output.
Each client user has a pair of AK and SK. The AK is public and identifies a unique user. The
SK is used for calculating signatures. Client users are required to keep the SK safe. An
operation request sent by a client user contains the user's AK and a signature calculated using
the SK (the signature is calculated based on HMAC-SHA1). Upon receiving the request,
the UDS checks the AK and SK stored on it and calculates a signature using the SK. Then the
UDS compares the obtained signature with the one in the request. If the two signatures are
consistent, the authentication succeeds. Figure 3-12 shows the process of identity
authentication.
Figure 3-12 Identity authentication
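The AK/SK check can be sketched with the Python standard library. The layout of the signed string, the Base64 encoding of the signature, the request dictionary, and the key store are assumptions made for illustration. Only the use of HMAC-SHA1 and the compare-on-the-server step come from the text.

```python
import base64
import hashlib
import hmac


def sign(secret_key: str, string_to_sign: str) -> str:
    # HMAC-SHA1 over the request description, keyed with the SK.
    mac = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode()


def authenticate(request: dict, key_store: dict) -> bool:
    """Server side: look up the SK by AK, recompute the signature, then compare."""
    secret_key = key_store.get(request["ak"])
    if secret_key is None:
        return False
    expected = sign(secret_key, request["string_to_sign"])
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, request["signature"])


key_store = {"AKEXAMPLE": "sk-secret"}                 # hypothetical AK/SK pair
request = {
    "ak": "AKEXAMPLE",
    "string_to_sign": "GET\n/bucket/object",           # hypothetical layout
    "signature": sign("sk-secret", "GET\n/bucket/object"),
}
print(authenticate(request, key_store))
```

The SK itself never travels with the request. Only the AK and the signature do, so an eavesdropper cannot sign a different request.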
3. Object access control
The UDS provides a flexible and secure data access mechanism that allows users to set
different access control policies based on bucket and object configurations. Available access
control policies are: READ, WRITE, READ_ACP (users are granted the permission to read
the access control policy), WRITE_ACP, and FULL_CONTROL.
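A permission check over these policies can be sketched as follows. The sketch assumes that FULL_CONTROL implies the four individual permissions, as it does in Amazon S3 ACLs. The paper does not spell this out, and the function name is hypothetical.

```python
PERMISSIONS = {"READ", "WRITE", "READ_ACP", "WRITE_ACP"}


def allowed(grants: set, wanted: str) -> bool:
    """Assumption: FULL_CONTROL implies every individual permission."""
    if wanted not in PERMISSIONS:
        raise ValueError(f"unknown permission: {wanted}")
    return "FULL_CONTROL" in grants or wanted in grants


print(allowed({"READ", "READ_ACP"}, "WRITE"))
```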
4. Static data encryption
The current version of UDS does not support data encryption in the cloud. If sensitive data is
to be stored in the cloud, you are advised to upload the data after encrypting it locally. In the
cloud scenario, keys for encrypted data are kept on clients. Figure 3-13 shows the process of
data encryption in a non-cloud scenario.
Figure 3-13 Data encryption in a non-cloud scenario
5. Data integrity
The UDS uses digital signatures to ensure data integrity during transfer. The current version
of UDS supports object integrity signatures and the later versions will support slice integrity
signatures. The integrity of a data slice is automatically verified by the UDS and the integrity
verification of a data object must be supported by client applications.
6. Data durability
The UDS provides 99.999% data availability and 99.9999% data durability.
3.2.4 Low TCO
3.2.4.1 Energy Saving
Apart from servers and network devices that consume about half of a DC's energy, storage
devices also consume a large portion of a DC's energy. As the number of storage devices
increases, more equipment room space is occupied and a larger amount of energy will be
consumed.
With 4 TB enterprise-class disks equipped with ARM chips, each 4 U 75-slot UDSN in the
UDS provides a 300 TB capacity. A single UDS cabinet can house up to 525 disks and
provides a 2.1 PB capacity. Compared with x86 servers providing the same computing or
storage capacity, the UDS halves the CPU power consumption and equipment room space
occupation with an average power consumption of 4.2 W/TB, ranking the top in the industry.
Moreover, the UDS employs the intelligent CPU frequency control and intelligent fan speed
control technologies to maintain lower power consumption of idle storage units. With the
high-density and power-saving design, the UDS lowers the power consumption of an
equipment room by 45%, providing a comprehensive energy-saving storage solution.
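The capacity figures above can be cross-checked with simple arithmetic. The sketch assumes decimal units (1 PB = 1000 TB) and, for the last value only, that the quoted 4.2 W/TB average applies to a fully populated cabinet.

```python
DISK_TB = 4
SLOTS_PER_UDSN = 75
DISKS_PER_CABINET = 525
WATTS_PER_TB = 4.2

udsn_tb = DISK_TB * SLOTS_PER_UDSN                      # capacity of one 4 U UDSN
cabinet_tb = DISK_TB * DISKS_PER_CABINET                # capacity of one cabinet
udsns_per_cabinet = DISKS_PER_CABINET // SLOTS_PER_UDSN
cabinet_watts = cabinet_tb * WATTS_PER_TB               # assumption: average holds per cabinet

print(udsn_tb, cabinet_tb / 1000, udsns_per_cabinet, round(cabinet_watts))
```

The figures are consistent with each other: 75 disks of 4 TB give 300 TB per UDSN, and 525 disks (seven UDSNs) give 2.1 PB per cabinet.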
3.2.4.2 Automated Management
The UDS provides a graphical management system (as shown in Figure 3-14) that
automatically manages topologies, alarms, configurations, performance, logs, and users.
Moreover, the UDS can be automatically deployed and upgraded without manual
intervention or service interruption. The automatic upgrade and deployment enable service
transparency and simplify system deployment, upgrade, and capacity expansion, greatly
improving the management efficiency while lowering management costs.
In the UDS, the minimum maintenance unit can be an ARM-based smart disk. Disk failures
have minor impact on services. Immediate maintenance is not needed. Instead, faulty smart
disks can be batch replaced after the disk failure rate reaches the preset threshold. This
zero-touch maintenance reduces the quantity of spare parts and the number of upgrade cycles
and saves maintenance labor.
Figure 3-14 Graphical management page
3.2.4.3 Open Interfaces
The UDS provides object-based storage interfaces that are compatible with Amazon S3
interfaces and supports HTTP/HTTPS Representational State Transfer (REST) interfaces.
With the open interfaces, the UDS opens its storage space to various types of customer
applications. The UDS's underlying storage space can be accessed using standard protocols,
regardless of data storage location and data format. Moreover, as the carrier for the
multi-tenant services and multiple instances, the UDS provides customers with different
levels of secure SLA and QoS services tailored to different scenarios, enhancing customer
competitiveness.
4 Experience
With the development of IT, the amount of unstructured data increases exponentially, and
various types of services come out, posing demanding requirements on reliability and
openness of storage systems. Traditional storage systems fail to meet such requirements.
Based on an in-depth analysis of customer needs, Huawei launches the UDS, a massive
storage system that helps customers resolve existing or pending issues of storing the huge
amount of data. Apart from providing massive storage capacity, the UDS supports the access
of multiple types of services by providing interfaces compatible with Amazon S3 interfaces.
Based on its broad compatibility, the UDS can be tailored to various application scenarios,
boasting success stories in multiple vertical industries. Currently, the UDS can be used in the
massive resource pool solution and the centralized backup solution.
4.1 Solution 1: Massive Resource Pool
4.1.1 Typical Needs and Problems Facing Customers
As the IT industry develops by leaps and bounds, various applications come out. These
applications are closely related to our daily life. People have been accustomed to resolving
most of their routine issues on the Internet, for example, sending emails, visiting
e-communities, and paying bills online. These changes bring forth the following new
situations to the storage industry:
The data volume is growing rapidly and is estimated to reach 35 ZB by 2020.
Data that needs to be stored changes from traditional structured data (database type data)
to unstructured data (such as electronic bills).
Sources of data become increasingly diversified, covering various services like SMS,
micro blogs, medical images, and scientific data.
Diversified data storage poses higher requirements on data reliability.
Different storage systems are used to store data from different types of services. Storage
vendors provide complex storage management systems in their own interest. As a result,
customers have to arrange quite a number of IT management personnel to maintain
heterogeneous storage systems and networks, causing the TCO to soar.
The maintenance cost remains high as the number of storage devices increases.
Storage devices keep working for a long time at high power consumption and measures
are taken to adjust working temperature and dissipate heat in equipment rooms, leading
to constant rise of electricity fees in data centers.
To cope with the preceding challenges, customers require a new generation of storage system
that works in a new pattern. It is expected that storage systems of the next generation have the
following features:
Massive data storage
Infinite storage capacity expansion without affecting performance
Storage for data of various services
High reliability
Thin provisioning for flexible space expansion
Simple storage management
Efficient and energy-saving
4.1.2 Solution
The UDS provides the massive resource pool solution to resolve the preceding problems and
meet the preceding needs. The solution is oriented to various upper-layer services, providing
customers with storage capacity for various types of service data. The solution requires that
upper-layer services comply with interface specifications, as described below:
The UDS provides unified storage space to store data of various services, including files,
videos, and images.
The UDS adopts the scale-out architecture for flexible capacity expansion, allowing the
system capacity to be easily expanded from the minimum capacity of 448 TB to exabyte
level for massive data storage.
The performance is improved in line with capacity expansion to prevent performance
bottlenecks caused by increasing data.
Online capacity expansion is supported to ensure hitless services.
Adopting the high-density design, the UDS provides 30% more device space than x86
peers. The energy consumption of UDS equipment rooms is 30% lower than that
consumed by equipment rooms accommodating traditional storage devices.
All UDS nodes are clustered and multiple data protection technologies (such as MC and
EC) and automatic fault detection and repair technologies are employed to ensure system
reliability.
With its external interfaces compatible with Amazon S3 interfaces, the UDS can
interconnect with multiple services (such as file and image services) in various scenarios.
Each service has its specific storage space needs. The UDS supports thin provisioning
that allows flexible space expansion, meeting storage needs of different services while
preventing a waste of storage space caused by space assignment by fixed quota.
The UDS employs S3 authentication, data encryption, and access control to ensure data
security.
The UDS provides device administrators with a web UI management tool that manages
cloud storage devices and upper-layer services in a unified manner. Administrators no
longer need to use different tools to manage different services. This unified management
approach frees administrators from heavy management work and lowers the TCO. Service
operators no longer need to manage complex storage devices and can concentrate on service
operation. In this way, more value is created.
Figure 4-1 shows the massive resource pool scenario.
Figure 4-1 Massive resource pool scenario
4.1.3 Software and Hardware Configurations
This section provides a list that describes the devices, interfaces, and software to be
configured.
[Sample is omitted.]
Table 4-1 Software and hardware configurations of the massive resource pool solution
Location                  Hardware/Software             Model   Quantity  Remarks
Equipment room in XXX DC  A-Node                        RH2288  3         -
                          UDSN                          UDSN    14        -
                          SATA disk                     2 TB    224       -
                          Service switch                S6748   2         -
                          Management switch             S3728   1         -
                          Cabinet                       -       1         -
                          Optical module                -       4         -
                          Distributed storage software  -       1         -
                          System management software    -       1         -
                          License                       -       10        -
4.1.4 Benefits
The UDS massive resource pool solution enables customers to reliably store massive data and
simplify complex storage management, creating greater value. The benefits of the solution
include:
Massive capacity: The UDS provides storage for massive data and a storage capacity up
to exabyte level.
Security and reliability: To ensure multi-level security and reliability from storage nodes,
cabinets, to data centers, the UDS employs user authentication, data transmission
encryption, data integrity check, MC, and EC to protect data in an all-around way.
Low TCO: The UDS employs power-saving ARM chips to reduce energy consumption
per capacity and green technologies such as disk spin-down and intelligent fan speed
control to lower the power consumption of the entire system. Besides, a unified web
management page is provided to centrally manage all massive storage devices. This
unified management means simplifies storage management and cuts down investment in
IT personnel training and device maintenance, reducing the TCO.
4.2 Solution 2: Centralized Backup
4.2.1 Typical Needs and Problems Facing Customers
Backup is an important means of data protection. It is widely used in various application
scenarios and vertical industries, for example, file backup, database backup, banking data
backup, and transportation data backup. Backup systems use a variety of media, including
tapes, virtual tape libraries (VTLs), CD-ROMs, and disk arrays. Mainstream vendors offer
their own backup products, for example, Symantec's NetBackup and Backup Exec, and
CommVault's Simpana. However, traditional backup systems have the following
disadvantages:
1 Dedicated personnel must be assigned to manage and maintain the tapes of physical tape
libraries.
2 Physical tape libraries must be maintained periodically.
3 Data can only be recovered to its state as of the last tape backup.
4 Data on physical tapes can only be accessed sequentially, causing long backup and
recovery windows.
5 Power consumption of physical tape libraries is low, but that of VTLs is high.
6 Tape libraries, particularly VTLs, have limited capacities.
7 Neither physical nor virtual tape libraries can be expanded indefinitely. New devices have
to be purchased.
8 Tape drives and robot arms of physical tape libraries may be damaged, and storage array
engines and disks of VTLs may fail.
9 Physical tape libraries must be carefully relocated.
To cope with the preceding challenges, customers require a new backup solution that has the
following capabilities:
High reliability: Backup data is stored securely and reliably, and is highly available for
recovery.
Ease of management: A GUI-based web management system is provided to manage all
backup tasks and hardware devices in a unified manner. Self-checks are initiated
periodically and alarms are reported automatically once faults are detected. A backup
task-oriented service process is set up to simplify backup and recovery.
Easy expansion: Capacities can be expanded from the initial minimum configurations to
large capacities without affecting performance.
Low cost: Massive data can be backed up at low power consumption and TCO.
4.2.2 Solution
The UDS provides the centralized backup solution to resolve the preceding problems and
meet the preceding needs. The solution is oriented to massive data scenarios, providing
customers with a comprehensive backup solution for ensuring data security. The solution
requires that upper-layer services comply with interface specifications. The UDS has
passed interface tests with Symantec's NetBackup and CommVault's CV backup
software, and will take part in interface tests with more backup software in the future. The
centralized backup solution is described as follows:
The UDS adopts the scale-out architecture for flexible capacity expansion, allowing the
system capacity to be easily expanded from the minimum capacity of 448 TB to exabyte
level for massive data storage.
The performance is improved in line with capacity expansion to prevent performance
bottlenecks caused by increasing data.
The underlying massive storage space interworks with upper-layer backup software to
form a massive data backup solution.
All UDS nodes are clustered and multiple data protection technologies (such as MC and
EC) and automatic fault detection and repair technologies are employed to ensure system
reliability.
The underlying massive resource pool provides highly reliable and low-cost storage
space.
Backup resources are allocated on demand to make full use of storage capacities.
Data can be recovered to a specific point in time without the need for sequential access,
saving recovery time and improving recovery efficiency.
Upper-layer backup services support various backup types such as files, databases, and
applications.
The entire solution can be automatically and quickly deployed across regions.
Backup resources across regions can be managed and scheduled in a unified manner.
Solution components and services can be centrally managed.
Figure 4-2 shows the scenario of the centralized backup solution.
Figure 4-2 Scenario of the centralized backup solution
4.2.3 Software and Hardware Configurations
This section provides a list that describes the devices, interfaces, and software to be configured.
[Sample is omitted.]
Table 4-2 Software and hardware configurations of the centralized backup solution
| Location | Hardware/Software | Model | Quantity | Remarks |
| Equipment room in XXX DC | A-Node | RH2288 | 3 | |
| | UDSN | UDSN | 14 | |
| | SATA disk | 2 TB | 224 | |
| | Service switch | S6748 | 2 | |
| | Management switch | S3728 | 1 | |
| | Cabinet | | 1 | |
| | Optical module | | 4 | |
| | Distributed storage software | | 1 | |
| | Desktop data backup software | | 1 | |
| | System management software | | 1 | |
| | License | | 10 | |
Backup servers are provided by customers and are not listed in the preceding table.
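The 448 TB minimum capacity quoted in section 4.2.2 matches the raw capacity of the disks listed in Table 4-2. The short check below makes the arithmetic explicit. It is only a sketch: the function names are hypothetical, decimal terabytes are assumed, and the result is raw capacity before any MC or EC protection overhead.

```python
def raw_capacity_tb(disk_count, disk_size_tb):
    """Raw (unprotected) capacity in TB for a set of identical disks."""
    return disk_count * disk_size_tb


def disks_per_node(disk_count, node_count):
    """Number of disks per storage node, assuming an even distribution."""
    return disk_count // node_count


# Values taken from Table 4-2: 224 SATA disks of 2 TB across 14 UDSNs.
print(raw_capacity_tb(224, 2))   # 448
print(disks_per_node(224, 14))   # 16
```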
4.2.4 Benefits
The UDS backup solution meets customers' requirements for massive data backup by
combining a massive storage system with upper-layer backup software. It also reliably
protects customers' data by providing diversified data reliability mechanisms. This solution is
economical, efficient, and easy to manage, reducing customers' costs and creating greater
value for them.
Massive capacity: The UDS provides storage for massive data and a storage capacity up
to exabyte level.
Low TCO: The UDS employs power-saving ARM chips to reduce energy consumption
per capacity and green technologies such as disk spin-down and intelligent fan speed
control to lower the power consumption of the entire system. Compared with VTLs, the
UDS enables the same backup software to provide a larger storage capacity, without the
need to add new backup software for capacity expansion, lowering the TCO.
High reliability: The UDS provides multiple data protection technologies such as MC,
EC, and MDC to ensure data reliability. Data can be recovered when any storage device,
cabinet, or data center is faulty, minimizing the recovery point objective (RPO). In
addition, all nodes in the UDS are deployed in clusters such as the service cluster, switch
cluster, and storage cluster, eliminating single points of failure. Data can be restored upon
loss and the recovery time is not affected by any device fault, minimizing the recovery
time objective (RTO).
Ease of management: A GUI-based web management system is provided to manage all
backup tasks and hardware devices in a unified manner. Self-checks are initiated
periodically and alarms are reported automatically once faults are detected. A backup
task-oriented service process is set up to simplify backup and recovery.
High efficiency: Storage capacities are allocated on demand and expanded dynamically,
making full use of storage space and improving storage efficiency.
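The benefits above name EC as one of the data protection technologies. This section does not specify the coding scheme used by the UDS, so the sketch below only illustrates the general idea, assuming EC denotes erasure coding. It uses a single XOR parity fragment, the simplest erasure code, which tolerates the loss of any one fragment. The function names and the sample data are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments):
    """Return one parity fragment computed over all data fragments."""
    return reduce(xor_bytes, fragments)


def recover(surviving_fragments, parity):
    """Rebuild the single missing data fragment from the survivors and parity."""
    return reduce(xor_bytes, surviving_fragments, parity)


# Three data fragments stored on three hypothetical nodes, plus one parity fragment.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = encode(data)

# Simulate the loss of the second fragment and rebuild it.
rebuilt = recover([data[0], data[2]], parity)
print(rebuilt == data[1])  # True
```

Production erasure codes typically use several parity fragments so that more than one simultaneous failure can be tolerated; this example stores 4 fragments for 3 fragments of data, whereas keeping full copies would need at least 6.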
4.3 Solution 3: Web Disk
4.3.1 Typical Needs and Problems Facing Customers
As the information society develops, individuals and enterprises have a greater need for
information sharing and exchange. Existing information sharing platforms fail to meet
the challenges brought by the increasing amount of data and diversified data storage
services. The challenges are as follows:
Individual and enterprise data increases rapidly and new types of data keep emerging.
Enterprises are in urgent need of collaborative office and data backup.
Individual or enterprise data can hardly be accessed from mobile terminals such as
mobile phones.
Enterprise branches are scattered in different locations, causing difficulty in sharing
data. Diversified methods of data sharing cannot be centrally managed.
Data security must be ensured during access to individual and enterprise data over the
Internet.
An all-in-one solution is called for to store individual and enterprise data.
To cope with the preceding challenges, customers require a new data access solution that has
the following features:
For cloud service providers:
− Provides multiple access methods, such as data access from browsers, PCs and
mobile phone clients.
− Supports routine O&M, self-service, metadata billing, and interconnection with
network management systems.
For large- and medium-sized enterprises, governments, and scientific institutions:
− Implements desktop data protection.
− Provides a unified platform to store mission-critical information assets.
− Ensures that data is securely accessed anytime anywhere.
− Meets the requirements of on-demand data sharing among employees and branches.
4.3.2 Solution
The UDS provides the web disk solution to resolve the preceding problems and meet the
preceding needs. The solution uses object-based storage technology to provide terminal
users with online storage services over IP-based networks. The solution is cost-effective and
flexibly scalable, and provides a strong consolidation capability. It offers individual users
large-capacity personal storage that is secure, fast, and easy to use, and provides
enterprise users with secure, reliable, economical, and easy-to-use web services that can be
quickly deployed for production and collaborative office. The web disk solution is
described as follows:
The underlying storage system consists of loosely coupled A-Nodes and UDSNs. A-Nodes
are used for data scheduling, that is, distributing data requests from upper-layer services
to UDSNs. UDSNs are used for data storage. A-Nodes and UDSNs are deployed in
high-availability clusters. The 4 U 75-slot UDSNs support enterprise-class SATA disks.
With the scale-out architecture, the capacity of the UDS can be flexibly expanded from
the initial minimum 300 TB to exabyte level. Performance can grow in line with
capacities, which eliminates performance bottlenecks caused by data growth. Moreover,
the UDS capacity can be expanded online without interrupting services.
Multiple access means are provided and the web disk clients support multiple operating
systems and browsers.
File sharing and synchronization are supported to meet requirements of data sharing
among multiple users and groups.
Interfaces compatible with Amazon S3 interfaces are provided for interconnection with
third-party services.
A comprehensive operation management system provides multi-level administrator
management, self-services such as web disk service subscription, activation, and
termination, and billing metadata for traffic and capacity.
Comprehensive web disk management tools are provided to centrally manage underlying
storage resources covering system monitoring, logs, and alarms, simplifying
administrators' management work.
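The scale-out figures above can be turned into a rough sizing estimate. The sketch below assumes that each 75-slot UDSN is fully populated with 4 TB disks (the disk size listed in Table 4-3), which gives 300 TB of raw capacity per node and is one plausible reading of the 300 TB initial minimum. The function names are hypothetical, decimal units are assumed (1 EB = 1,000,000 TB), and protection overhead from MC or EC is ignored.

```python
import math

SLOTS_PER_UDSN = 75  # from the text: 4 U 75-slot UDSNs
DISK_SIZE_TB = 4     # assumption: 4 TB disks, as listed in Table 4-3


def udsn_raw_capacity_tb(slots=SLOTS_PER_UDSN, disk_size_tb=DISK_SIZE_TB):
    """Raw capacity of one fully populated UDSN, in TB."""
    return slots * disk_size_tb


def udsns_needed(target_tb):
    """Smallest number of UDSNs whose combined raw capacity reaches target_tb."""
    return math.ceil(target_tb / udsn_raw_capacity_tb())


print(udsn_raw_capacity_tb())   # 300
print(udsns_needed(1200))       # 4
print(udsns_needed(1_000_000))  # 3334
```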
Figure 4-3 shows the scenario of the web disk solution.
Figure 4-3 Scenario of the web disk solution
4.3.3 Software and Hardware Configurations
[Sample is omitted.]
Table 4-3 Software and hardware configurations of the web disk solution
| Location | Hardware/Software | Model | Quantity | Remarks |
| Equipment room in XXX DC | A-Node | T3200 | 2 | |
| | UDSN | UDSN | 4 | |
| | Smart disk | 4 TB | 300 | |
| | Service switch | S6724 (for enterprise markets)/S6324 (for carrier markets) | 2 | |
| | Cloud storage service-system data node | | 4 | |
| | Cloud storage service-computing node | | 4 | |
| | Service switch for web disks | S5700 (for enterprise markets)/S5300 (for carrier markets) | 2 | |
| | Cabinet | | 2 | |
| | Optical module | | 12 | |
| | Optical fiber | | 12 | |
| | UDS massive storage system software – per terabyte license (0 TB to 500 TB) | | 500 | |
| | UDS massive storage system software – per terabyte license (501 TB to 1000 TB) | | 500 | Increases with the system capacity. For details, see the quotation template. |
| | UDS massive storage system software – per terabyte license (1001 TB to 5000 TB) | | 200 | Increases with the system capacity. For details, see the quotation template. |
| | HUAWEI cloud storage web disk – user access subsystem software license per node | | 4 | |
| | HUAWEI cloud storage web disk – data storage subsystem software license per node | | 4 | |
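The three per-terabyte license rows in Table 4-3 add up to 1200 TB, which equals the raw capacity of the 300 smart disks of 4 TB each. The sketch below shows one way such tiered quantities could be derived from a system capacity. It is an illustration only: the tier boundaries are read from the table, while the function name and the allocation rule (fill the lowest tier first) are assumptions. Actual quantities should follow the quotation template.

```python
# (lower, upper) bounds in TB for each license tier, read from Table 4-3.
TIERS = [(0, 500), (500, 1000), (1000, 5000)]


def license_quantities(capacity_tb):
    """Split a capacity in TB into per-terabyte license quantities per tier."""
    if capacity_tb < 0 or capacity_tb > TIERS[-1][1]:
        raise ValueError("capacity is outside the tiers listed in the table")
    # Each tier covers the part of the capacity that falls between its bounds.
    return [max(0, min(capacity_tb, upper) - lower) for lower, upper in TIERS]


print(license_quantities(300 * 4))  # [500, 500, 200]
print(license_quantities(300))      # [300, 0, 0]
```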
4.3.4 Solution Network
This section provides the topology of the solution. Figure 4-4 shows an example topology. You
can modify the topology based on actual networks and upper-layer services.
Figure 4-4 Network diagram of the web disk solution
4.3.5 Benefits
The UDS web disk solution tailors the massive storage system to web disk services and
client management, meeting customers' requirements for an integrated E2E solution. The web
disk solution applies not only to carriers for external operation but also to enterprises for
internal use.
For cloud service providers:
− Enhanced customer loyalty
− More revenue streams from new services
− Web-based storage platforms that integrate various services to improve
competitiveness
For medium- and large-sized enterprises, governments, and scientific institutions:
− E2E data security
− Cross-region file sharing platform for higher working efficiency
− Permission- and domain-based management and organization structure import for
agile adaptation to service changes
− Resource statistics and report query that support O&M decision-making
4.4 Solution 4: Centralized Active Archiving
4.4.1 Typical Needs and Problems Facing Customers
Data must be archived once its volume reaches a certain level. Both carriers and enterprises
face the following problems in data archiving:
Archive systems cannot be expanded in line with growing data.
Data is archived offline, causing inefficient management, query, and access.
Archive media are prone to damage caused by environments and climates.
To address the preceding problems, customers need a new archive solution that ensures:
Online archive data available anytime
Easy management, query, and archiving
Less expensive archive media
4.4.2 Solution
The UDS provides the active archive solution to resolve the preceding problems and meet the
preceding needs. Active archive means that data is archived online, so archived data remains
accessible at any time. Active data refers to hotspot data within the archived data.
Archiving software can be provided by customers or third-party vendors. Huawei has a
partner that provides the active archiving software.
A-Nodes and UDSNs are loosely coupled in the UDS. A-Nodes schedule data and
distribute upper-layer data requests to UDSNs and UDSNs are responsible for data
storage. Both A-Nodes and UDSNs can be deployed in clusters. High-density 4 U 75-slot
UDSNs support enterprise-class SATA disks.
With the scale-out architecture, the capacity of the UDS can be flexibly expanded from
the initial minimum 300 TB to exabyte level. Performance can grow in line with
capacities, which eliminates performance bottlenecks caused by data growth. Moreover,
the UDS capacity can be expanded online without interrupting services.
The UDS also provides automatic management tools for system administrators. The
automatic management tools can centrally manage all storage devices and their
upper-layer services, simplifying the administrative work and cutting the TCO.
Figure 4-5 Scenario of the active archive solution
4.4.3 Software and Hardware Configurations
This section provides a list that describes the devices, interfaces, and software to be
configured.
[Sample is omitted.]
Table 4-4 Software and hardware configurations of the active archive solution
| Location | Hardware/Software | Model | Quantity | Remarks |
| Equipment room in XXX DC | Service node | T3200 | 2 | |
| | A-Node | T3200 | 2 | |
| | UDSN | UDSN | 4 | |
| | Smart disk | 4 TB | 300 | |
| | Service switch | S6724 (for enterprise markets)/S6324 (for carrier markets) | 2 | |
| | Cabinet | | 1 | |
| | Optical module | | 12 | |
| | Optical fiber | | 12 | |
| | UDS massive storage system software – per terabyte license (0 TB to 500 TB) | | 500 | |
| | UDS massive storage system software – per terabyte license (501 TB to 1000 TB) | | 500 | Increases with the system capacity. For details, see the quotation template. |
| | UDS massive storage system software – per terabyte license (1001 TB to 5000 TB) | | 200 | Increases with the system capacity. For details, see the quotation template. |
4.4.4 Solution Network
This section provides the topology of the solution. Figure 4-6 shows an example topology. You can modify the topology based on actual networks and upper-layer services.
Figure 4-6 Network diagram of the active archive solution
4.4.5 Benefits
Archiving media is less expensive than primary storage media. Placing data on media with
the appropriate price/performance ratio cuts the TCO.
Data is accessible at any time and data migration is transparent to ongoing services.
Archive data can be quickly and easily managed, queried, and accessed.
Storage systems are flexibly scalable to cope with explosive data growth.
5 Conclusion
The growth of massive data has become an irreversible trend in IT development. All
industries call for secure and reliable storage of massive data. To address customers'
problems and challenges, Huawei has launched the UDS massive storage system.
With an industry-leading scale-out distributed storage architecture and the DHT algorithm, the
UDS does not use the storage engines (controllers) found in traditional storage systems.
Instead, it distributes data across different storage nodes. This architecture eliminates
performance bottlenecks caused by storage controllers. In addition, the UDS supports hitless
capacity expansion up to exabytes, and system performance can grow linearly with capacity.
The UDS has a significant cost advantage over traditional storage systems once the data
amount reaches a certain scale. With these features, the UDS is well suited to massive data
storage scenarios.
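The controller-free data distribution described above can be illustrated with a consistent hashing ring, a common way to build a DHT. This is a generic sketch and not the UDS implementation: the class name `HashRing`, the node names, the use of SHA-256, and the number of virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hashing ring that maps object keys to storage nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at many points (virtual nodes)
        # so that objects spread evenly across nodes.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")

    def locate(self, object_key):
        """Return the node that owns object_key: the next point clockwise."""
        idx = bisect.bisect_right(self._points, self._hash(object_key))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical node names; any client can compute the location on its own.
ring = HashRing(["udsn-1", "udsn-2", "udsn-3"])
print(ring.locate("example-bucket/report.txt"))

# Adding a node moves only the objects that the new node takes over.
bigger = HashRing(["udsn-1", "udsn-2", "udsn-3", "udsn-4"])
keys = [f"object-{i}" for i in range(1000)]
moved = sum(1 for k in keys if ring.locate(k) != bigger.locate(k))
print(f"{moved} of {len(keys)} objects move after expansion")
```

Because every object's location is computed from its key, no central controller has to be consulted on the data path, and expansion relocates only a fraction of the stored objects.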
With broad compatibility, the UDS can be used in diversified solutions tailored to different
scenarios. Various types of upper-layer applications can access the UDS and use the
underlying massive storage space of the UDS through UDS's external interfaces that are
compatible with Amazon S3 interfaces.
Data security is the focus of massive data scenarios. The UDS provides various data
protection mechanisms such as MC and EC to ensure data security, boosting customers'
confidence in the cloud era.
6 Acronyms and Abbreviations
Table 6-1 Acronyms and abbreviations
Acronyms and Abbreviations Full Spelling
UDS Universal Distributed Storage
SoD Sea of Disks
A-Node Access Node
UDSN Universal Distributed Storage Node
ACL Access Control List
AK Access Key ID
SK Secret Key