HUAWEI TECHNOLOGIES CO., LTD. All rights reserved www.huawei.com Internal OBN110206 iManager N2000 BMS High-Availability System Maintenance ISSUE 1.0

111OBN209221 iManager N2000 BMS High-Availability System Maintenance ISSUE1.0

Overview

The HA solution provides remote dual-system backup for the N2000 network management system.

The N2000 HA system constantly monitors the system software and hardware. If a failure occurs, the system switches over from the active node to the standby node, which ensures the reliability of the N2000.

[Figure: Site A and Site B, each with a router, connected over a WAN.]

Hardware Architecture

[Figure: hardware architecture. Labeled elements: the active server and the standby server (each with a LAN port), two LAN switches, two routers, a 2M leased line, the IP network, the client, and the NEs.]

Software Architecture

[Figure: software architecture of the HA system.]

Active server (Site A) and standby server (Site B), each running: Watchman server application, N2000 server application, Sybase database, VxVM and VVR, Solaris.

Watchman clients A and B, each running: Watchman client application, Java virtual machine, Solaris/Windows.

N2000 clients, each running: N2000 client application, Java virtual machine, Solaris/Windows.

Single System VS Dual System

Single system: Solaris 8, Sybase, N2000

Dual system: Solaris 8, Veritas VxVM, Veritas VVR, Watchman, Sybase, N2000

Software Architecture

Watchman: developed by Huawei; controls the application switchover between the active and standby nodes.

Veritas Volume Manager (VxVM): the disk and disk volume management software offered by Veritas.

Veritas Volume Replicator (VVR): synchronizes and replicates data between the active and standby nodes.

Functions of the Watchman

Real-time monitoring

Real-time backup

Continuous detection

Real-time refreshing

Floating IP address

Alarm management

Log management

HA System Installation Flow

Start

Install software on the active node

Install software on the standby node

Install the Watchman client

Install the N2000 client

End

HA System Installation Flow

The procedure for installing the N2000 HA system on the active and standby nodes is as follows:

Install the Solaris OS

Install the Solaris patches

Install the Solaris patch for Veritas

Install the Veritas software

Install the Watchman server

Install Sybase

Install the Sybase patches

Install the N2000 server

Verify the installation and synchronize the system data

Watchman Principles

Switchover Control

Switchover control protects the high-availability system: when the active node becomes abnormal, all services can be switched to the standby node.

There are two switchover modes:

Manual switchover: If you need to restart, optimize, or maintain the active server, you can adjust the data replication direction through a manual switchover, and then start the application on the standby node to manage the whole system (the standby server becomes the new active server).

Automatic switchover: When a process on the active server is abnormal, or the communication between the active server and the device/client is interrupted, the system automatically performs the active/standby switchover.

Automatic Switchover

The Watchman switches over automatically if any of the following events occurs:

The monitor process fails to poll other processes.

The N2000 triggers the switchover.

The database server triggers the switchover.

The heartbeat connection is interrupted.

The replication state is abnormal.

Automatic Switchover

The result of an automatic switchover depends on what triggered it.

If the switchover is triggered by the monitor process, the N2000, or the database server, only one active server exists afterwards. After you fix the problem, you can manually switch the services from one node to the other.

If the switchover is triggered by a heartbeat interruption or a replication exception, two active servers exist afterwards. After you fix the problem, you must perform a forced switchover. This may take a long time, depending on the amount of data.
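
The mapping from trigger to result can be written down as a small lookup. The sketch below is only an illustration in Python: the names `Trigger`, `DUAL_ACTIVE_TRIGGERS`, and `outcome` are hypothetical and are not part of the Watchman software. It encodes the five trigger events and the two results described above.

```python
from enum import Enum


class Trigger(Enum):
    """The five automatic switchover triggers (names are illustrative)."""
    MONITOR_POLL_FAILURE = "monitor process fails to poll other processes"
    N2000_REQUEST = "N2000 triggers the switchover"
    DATABASE_REQUEST = "database server triggers the switchover"
    HEARTBEAT_INTERRUPTED = "heartbeat connection is interrupted"
    REPLICATION_ABNORMAL = "replication state is abnormal"


# Triggers after which both nodes end up acting as the active server.
DUAL_ACTIVE_TRIGGERS = {
    Trigger.HEARTBEAT_INTERRUPTED,
    Trigger.REPLICATION_ABNORMAL,
}


def outcome(trigger: Trigger) -> dict:
    """Return the number of active servers and the recovery action."""
    if trigger in DUAL_ACTIVE_TRIGGERS:
        return {"active_servers": 2, "recovery": "forced switchover"}
    return {"active_servers": 1, "recovery": "manual switchover"}


for trigger in Trigger:
    result = outcome(trigger)
    print(f"{trigger.name}: {result['active_servers']} active server(s), "
          f"recover by {result['recovery']}")
```

Keeping the dual-active triggers in one set makes it easy to see which faults call for a forced switchover during recovery.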

Process Management

The Watchman provides a process management mechanism for convenient and intelligent process management.

A node's icon is green, grey, or red when the node is in the running, standby, or alarm state, respectively.

The Watchman supports enabling and disabling a process manually.

The Watchman automatically restarts any process that has crashed. If the system fails to start the process after several tries, it sets the status of the node to alarm and automatically performs an active/standby switchover.

The Watchman supports specifying dependencies between processes, so that the processes are started and stopped in the expected order.
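
The restart and ordering behaviour described above can be sketched in a few lines of Python. This is not Watchman code: the process names, the `DEPENDENCIES` map, the limit of three attempts (the slide only says "several tries"), and the choice to stop processes in the reverse of the start order are all assumptions made for illustration. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

MAX_RESTART_ATTEMPTS = 3  # assumed value; the source only says "several tries"

# Hypothetical dependency map: each process lists the processes it needs first.
DEPENDENCIES = {
    "database": set(),
    "n2000_server": {"database"},
    "n2000_monitor": {"n2000_server"},
}


def start_order(dependencies):
    """Return process names so that every process follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


def stop_order(dependencies):
    """Assumption: processes are stopped in the reverse of the start order."""
    return list(reversed(start_order(dependencies)))


def supervise(process_name, try_start, max_attempts=MAX_RESTART_ATTEMPTS):
    """Try to restart a crashed process; return the resulting node state.

    try_start is a caller-supplied function that returns True on success.
    """
    for _ in range(max_attempts):
        if try_start(process_name):
            return "running"
    # Every attempt failed: the node enters the alarm state, which is the
    # point at which an active/standby switchover would follow.
    return "alarm"


print(start_order(DEPENDENCIES))
print(stop_order(DEPENDENCIES))
print(supervise("n2000_server", lambda name: False))
```

A topological sort is one straightforward way to turn "A needs B" relationships into a start sequence; the real product may represent process relationships differently.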

Alarm Management

When a node or a process is faulty, the Watchman reports an alarm. Each alarm represents a fault. Using the alarm, you can find the cause of the fault, remove the fault, and recover the HA backup system.

With the Watchman client, you can:

Start or stop the alarm sound

Browse alarms

Query alarms

Acknowledge alarms

Clear alarms

Dump alarms

Log Management

While the Watchman system runs, it records the system status and user operations in the log. You can:

Browse logs: The Watchman automatically refreshes the latest logs on the client.

Query logs: Query the logs for a given time period for maintenance purposes.

Dump logs: Manually dump the queried logs to a specified directory.

Clear logs: Clear the logs in the log browse area.
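
The query and dump operations amount to filtering log records by time and writing the result to a file. The Python sketch below illustrates that idea only. The `LogEntry` fields, the CSV output, the file name `watchman_logs.csv`, and the sample records are all hypothetical; the slides do not describe the actual Watchman log format.

```python
import csv
import tempfile
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class LogEntry:
    timestamp: datetime
    user: str
    message: str


def query_logs(entries, start, end):
    """Return the entries whose timestamp lies in the interval [start, end]."""
    return [e for e in entries if start <= e.timestamp <= end]


def dump_logs(entries, directory, filename="watchman_logs.csv"):
    """Write the entries to a CSV file in directory and return its path."""
    target = Path(directory) / filename
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["timestamp", "user", "message"])
        for e in entries:
            writer.writerow([e.timestamp.isoformat(), e.user, e.message])
    return target


# Made-up sample records, for illustration only.
sample = [
    LogEntry(datetime(2024, 1, 1, 8, 0), "admin", "manual switchover started"),
    LogEntry(datetime(2024, 1, 2, 9, 30), "admin", "process enabled"),
]

with tempfile.TemporaryDirectory() as directory:
    hits = query_logs(sample, datetime(2024, 1, 1), datetime(2024, 1, 1, 23, 59))
    path = dump_logs(hits, directory)
    print(path.name, len(hits))
```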

Summary

HA system architecture

Watchman working principle

Watchman maintenance