Hardware Platform Monitoring Guide

NetApp, Inc.
495 East Java Drive
Sunnyvale, CA 94089 USA
Telephone: +1 (408) 822-6000
Fax: +1 (408) 822-4501
Support telephone: +1 (888) 4-NETAPP
Documentation comments: [email protected]
Information Web: www.netapp.com

Part number: 215-06774_A0
June 2012

Contents

Sources of troubleshooting information ................................................... 25

Where LEDs appear .................................................................................................. 25

Where messages are displayed .................................................................................. 25

How AutoSupport e-mail messages help with troubleshooting ................................ 26

Forms and use of diagnostic tools ............................................................................. 26

Where to find documentation .................................................................................... 27

System LEDs ............................................................................................... 31

FAS20xx and SA200 system LEDs .......................................................................... 31

Location and meaning of LEDs on the front of FAS20xx and SA200 chassis ...................................................................................................... 31

Location and meaning of LEDs on the back of FAS20xx and SA200 controller modules ................................................................................... 33

Location and meaning of FAS20xx and SA200 PSU LEDs ......................... 35

FAS22xx system LEDs ............................................................................................. 36

Location and meaning of LEDs on the front of FAS22xx chassis ................ 37

Location and meaning of FAS22xx internal drive LEDs .............................. 38

Location and meaning of LEDs on the back of FAS22xx controllers .......... 40

Location and meaning of FAS22xx PSU LEDs ............................................ 43

Location and meaning of FAS22xx internal FRU LEDs .............................. 45

30xx and SA300 system LEDs .................................................................................. 45

Location and meaning of LEDs on the front of 30xx and SA300 controllers ................................................................................................ 45

Location and meaning of LEDs on the back of 30xx and SA300 controllers ................................................................................................ 46

Location and meaning of 30xx and SA300 PSU LEDs ................................ 47

31xx system LEDs .................................................................................................... 49

Location and meaning of LEDs on the front of 31xx chassis ....................... 49

Location and meaning of LEDs on the back of 31xx controllers .................. 50

Location and meaning of 31xx fan LEDs ..................................................... 51

Location and meaning of 31xx PSU LEDs ................................................... 52

Location and meaning of 31xx FRU LEDs ................................................... 53

32xx and SA320 system LEDs .................................................................................. 54

Location and meaning of LEDs on the front of 32xx and SA320 chassis .... 54

Location and meaning of LEDs on the back of 32xx and SA320 controllers ................................................................................................ 55

Location and meaning of the LED on the back of 32xx and SA320 I/O expansion modules .................................................................................. 58

Location and meaning of 32xx and SA320 fan LEDs .................................. 59

Location and meaning of 32xx and SA320 PSU LEDs ................................ 59

Location and meaning of 32xx and SA320 internal FRU LEDs ................... 60

60xx and SA600 system LEDs .................................................................................. 61

Location and meaning of LEDs on the front of 60xx and SA600 controllers ................................................................................................ 61

Location and meaning of LEDs on the back of 60xx and SA600 controllers ................................................................................................ 62

Location and meaning of 60xx and SA600 fan LEDs .................................. 63

Location and meaning of 60xx and SA600 PSU LEDs ................................ 64

62xx and SA620 system LEDs .................................................................................. 65

Location and meaning of LEDs on the front of 62xx and SA620 chassis .... 65

Location and meaning of LEDs on the back of 62xx and SA620 controllers ................................................................................................ 67

Location and meaning of the 62xx and SA620 I/O expansion module LED ......................................................................................................... 71

Location and meaning of 62xx and SA620 fan LEDs .................................. 72

Location and meaning of 62xx and SA620 PSU LEDs ................................ 72

Location and meaning of 62xx and SA620 internal FRU LEDs ................... 73

HBA LEDs ................................................................................................................ 74

Location and meaning of dual-port Fibre Channel HBA LEDs ........................ 74

Location and meaning of dual-port, 4-Gb or 8-Gb, target-mode Fibre Channel HBA LEDs ................................................................................ 75

Location and meaning of dual-port, 8-Gb Fibre Channel Virtual Interface HBA LEDs ............................................................................... 77

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: four-LED version ..................................................................................... 78

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: 12-LED version ....................................................................................... 80

Location and meaning of fiber-optic iSCSI target HBA LEDs .................... 81

Location and meaning of copper iSCSI target HBA LEDs .......................... 82

Location and meaning of dual-port, 10-Gb, FCoE unified target HBA LEDs ........................................................................................................ 84

Location of dual-port, 3-Gb SAS HBA ports ................................................ 86

Location of quad-port, 3-Gb SAS HBA ports ............................................... 86

MetroCluster adapter LEDs ...................................................................................... 87

Location and meaning of dual-port, 2-Gb VI-MetroCluster adapter LEDs ........................................................................................................ 87

Location and meaning of dual-port, 4-Gb MetroCluster adapter LEDs ....... 89

Location and meaning of dual-port, 8-Gb MetroCluster adapter LEDs ....... 90

GbE NIC LEDs ......................................................................................................... 92

Location and meaning of single-port GbE NIC LEDs .................................. 92

Location and meaning of single-port, 10-GbE NIC LEDs (FAS2050 systems only) ........................................................................................... 94

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports fiber optic cables with SFP+ modules or copper SFP+ cables ....................................................................................................... 95

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports fiber optic cables with X6569 SFP+ modules or copper SFP+ cables .................................................................................................... 96

Location and meaning of multiport GbE NIC LEDs .................................... 98

TOE NIC LEDs ....................................................................................................... 100

Location and meaning of single-port TOE NIC LEDs ............................... 100

Location and meaning of dual-port, 10GBase-SR TOE NIC LEDs ........... 102

Location and meaning of dual-port, 10GBase-CX4 TOE NIC LEDs ......... 103

Location and meaning of quad-port TOE NIC LEDs ................................. 104

NVRAM adapter LEDs ........................................................................................... 105

Location and meaning of NVRAM5 and NVRAM6 LEDs ........................ 106

Location and meaning of NVRAM7 LEDs ................................................. 107

Location and meaning of NVRAM5 and NVRAM6 media converter LEDs ...................................................................................................... 108

Location and meaning of NVRAM8 LEDs ................................................. 108

Flash Cache module and PAM LEDs ..................................................................... 114

Location and meaning of PAM LEDs ......................................................... 114

Location and meaning of Flash Cache module LEDs ................................. 114

Startup messages ...................................................................................... 117

POST messages ....................................................................................................... 117

Boot messages ......................................................................................................... 118

FAS20xx and SA200 startup progress .................................................................... 118

Method of viewing progress on the console ................................................ 118

Method of viewing progress through the BIOS Status sensor .................... 119

3020 and 3050 system POST error messages ......................................................... 120

Abort Autoboot–POST Failure(s): CPU ..................................................... 120

Abort Autoboot–POST Failure(s): MEMORY ........................................... 120

Abort Autoboot–POST Failure(s): RTC, RTC_IO ..................................... 121

Abort Autoboot–POST Failure(s): UCODE ............................................... 121

Autoboot of backup image aborted ............................................................. 121

Autoboot of backup image failed ................................................................ 122

Autoboot of primary image aborted ............................................................ 122

Autoboot of primary image failed ............................................................... 122

Invalid FRU EEPROM Checksum .............................................................. 123

Memory init failure ..................................................................................... 123

No Memory found ....................................................................................... 123

Unsupported system bus speed ................................................................... 124

3040, 3070, 31xx, 60xx, SA300, and SA600 system POST error messages .......... 124

0200: Failure Fixed Disk ............................................................................. 124

0230: System RAM Failed at offset: ........................................................... 124

0231: Shadow RAM failed at offset ............................................................ 125

0232: Extended RAM failed at address line ................................................ 125

0235: Multiple-bit ECC error occurred ....................................................... 126

023C: Bad DIMM found in slot # ............................................................... 126

023E: Node Memory Interleaving disabled ................................................ 127

0241: Agent Read Timeout ......................................................................... 127

0242: Invalid FRU information ................................................................... 127

0250: System battery is dead ....................................................................... 128

0251: System CMOS checksum bad ........................................................... 128

0253: Clear CMOS jumper detected ........................................................... 128

0260: System timer error ............................................................................. 129

0280: Previous boot incomplete .................................................................. 129

02C2: No valid Boot Loader in System Flash–Non Fatal ........................... 129

02C3: No valid Boot Loader in System Flash–Fatal ................................... 129

02F9: FGPA jumper detected ...................................................................... 130

02FA: Watchdog Timer Reboot (PciInit) .................................................... 130

02FB: Watchdog Timer Reboot (MemTest) ............................................... 131

02FC: LDTStop Reboot (HTLinkInit) ........................................................ 131

No message on console ............................................................................... 131

FAS22xx, 32xx, 62xx, SA320, and SA620 system POST error messages ............. 132

0200: Failure Fixed Disk ............................................................................. 132

0230: System RAM Failed at offset: ........................................................... 132

0231: Shadow RAM Failed at offset: .......................................................... 133

0232: Extended RAM Failed at address line: .............................................. 133

BIOS detected uncorrectable ECC error in DIMM slot: ............................. 133

No message on the console ......................................................................... 133

BIOS detected errors or invalid configuration in DIMM slot: .................... 133

BIOS detected unknown errors in DIMM slot: ........................................... 134

023A: ONTAP Detected Bad DIMM in slot: .............................................. 134

023B: BIOS detected SPD checksum error in DIMM slot: ........................ 134

BIOS detected pattern write/read mismatch in DIMM slot: ....................... 134

0241: SMBus Read Timeout ....................................................................... 134

0242: Invalid FRU information ................................................................... 135

0250: System battery is dead - Replace and run SETUP ............................ 135

0251: System CMOS checksum bad ........................................................... 135

0260: System timer error ............................................................................. 135

0271: Check date and time settings ............................................................. 135

0280: Previous boot incomplete - Default configuration used .................... 136

02A1: SP Not Found ................................................................................... 136

02A2: BMC System Error Log (SEL) Full ................................................. 136

02A3: No Response From SP To FRU ID Read Request ........................... 136

SP FRU Entry is Blank or Checksum Error ................................................ 136

No Response to Controller FRU ID Read Request via IPMI ...................... 137

No Response to Midplane FRU ID Read Request via IPMI ....................... 137

02C2: No valid Boot Loader in System Flash - Non Fatal ......................... 137

02C3: No valid Boot Loader in System Flash - Fatal ................................. 138

Fatal Error: No DIMM detected and system can not continue boot! .......... 138

Fatal Error! All channels are disabled! ....................................................... 138

Software memory test failed! ...................................................................... 139

Fatal Error! RDIMMs and UDIMMs are mixed! ........................................ 139

Fatal Error! UDIMM in 3rd slot is not supported! ...................................... 139

Fatal Error! All DIMM failed and system can not continue boot! .............. 139

Boot error messages ................................................................................................ 140

Boot device err ............................................................................................ 140

Cannot initialize labels ................................................................................ 140

Cannot read labels ....................................................................................... 140

Configuration exceeds max PCI space ........................................................ 140

DIMM slot # has correctable ECC errors .................................................... 141

Dirty shutdown in degraded mode .............................................................. 141

Disk label processing failed ........................................................................ 141

Drive %s.%d not supported ......................................................................... 141

Error detection detected too many errors to analyze at once ...................... 142

FC-AL loop down, adapter %d ................................................................... 142

File system may be scrambled .................................................................... 142

Halted disk firmware too old ....................................................................... 143

Halted: Illegal configuration ....................................................................... 143

Invalid PCI card slot %d ............................................................................. 143

No /etc/rc ..................................................................................................... 143

No disk controllers ...................................................................................... 144

No disks ....................................................................................................... 144

No /etc/rc, running setup ............................................................................. 144

No network interfaces ................................................................................. 144

No NVRAM present .................................................................................... 145

NVRAM #n downrev .................................................................................. 145

NVRAM: wrong pci slot ............................................................................. 145

Panic: DIMM slot #n has uncorrectable ECC errors ................................... 145

This platform is not supported on this release ............................................. 145

Too many errors in too short time ............................................................... 146

Warning: Motherboard Revision not available ........................................... 146

Warning: Motherboard Serial Number not available .................................. 146

Warning: system serial number is not available .......................................... 146

Watchdog error ............................................................................................ 146

Watchdog failed .......................................................................................... 147

EMS and operational messages ............................................................... 149

Environmental EMS messages ................................................................................ 149

Chassis fan FRU failed ................................................................................ 149

Chassis over temperature on XXXX ........................................................... 150

Chassis over temperature shutdown on XXXX .......................................... 150

Chassis Power Degraded: 3.3V in warn high state ..................................... 150

Chassis power degraded: PS# ..................................................................... 151

Chassis Power Fail: PS# .............................................................................. 151

Chassis Power Shutdown ............................................................................ 151

Chassis power shutdown: 3.3V in warn low state ....................................... 152

Chassis Power Supply: PS# removed .......................................................... 152

Chassis power supply degraded: PS# .......................................................... 153

Chassis power supply fail: PS# ................................................................... 153

Chassis power supply off: PS# .................................................................... 153

Chassis power supply off: PS# .................................................................... 154

Chassis power supply OK: PS# ................................................................... 154

Chassis power supply removed: PS# .......................................................... 154

Chassis under temperature on XXXX ......................................................... 155

Chassis under temperature shutdown on XXXX ........................................ 155

Fan: # is spinning below tolerable speed .................................................... 155

monitor.chassisFan.degraded ...................................................................... 156

monitor.chassisFan.ok ................................................................................. 156

monitor.chassisFan.removed ....................................................................... 156

monitor.chassisFan.slow ............................................................................. 156

monitor.chassisFan.stop .............................................................................. 157

monitor.chassisFan.warning ........................................................................ 157

monitor.chassisFanFail.xMinShutdown ...................................................... 157

monitor.chassisPower.degraded .................................................................. 157

monitor.chassisPower.ok ............................................................................. 158

monitor.chassisPowerSupplies.ok ............................................................... 158

monitor.chassisPowerSupply.degraded ....................................................... 158

monitor.chassisPowerSupply.notPresent .................................................... 158

monitor.chassisPowerSupply.off ................................................................. 159

monitor.chassisPowerSupply.ok ................................................................. 159

monitor.chassisTemperature.cool ................................................................ 159

monitor.chassisTemperature.ok .................................................................. 159

monitor.chassisTemperature.warm ............................................................. 159

monitor.cpuFan.degraded ............................................................................ 160

monitor.cpuFan.failed ................................................................................. 160

monitor.cpuFan.ok ...................................................................................... 160

monitor.ioexpansionPower.degraded .......................................................... 161

monitor.ioexpansionPower.ok ..................................................................... 161

monitor.ioexpansionTemperature.cool ........................................................ 161

monitor.ioexpansionTemperature.ok .......................................................... 161

monitor.ioexpansionTemperature.warm ..................................................... 162

monitor.ioexpansion.unpresent ................................................................... 162

monitor.nvmembattery.warninglow ............................................................ 162

monitor.nvramLowBattery .......................................................................... 162

monitor.power.unreadable ........................................................................... 163

monitor.shutdown.cancel ............................................................................ 163

monitor.shutdown.cancel.nvramLowBattery .............................................. 163

monitor.shutdown.chassisOverTemp .......................................................... 163

monitor.shutdown.chassisUnderTemp ........................................................ 164

monitor.shutdown.emergency ..................................................................... 164

monitor.shutdown.ioexpansionOverTemp .................................................. 164

monitor.shutdown.chassisUnderTemp ........................................................ 164

monitor.shutdown.nvramLowBattery.pending ........................................... 165

monitor.temp.unreadable ............................................................................. 165

Multiple chassis fans have failed ................................................................ 165

Multiple fan failure on XXXX .................................................................... 166

Multiple power supply fans failed ............................................................... 166

nvmem.battery.capacity.low ....................................................................... 166

nvmem.battery.capacity.low.warn .............................................................. 167

nvmem.battery.capacity.normal .................................................................. 167

nvmem.battery.current.high ........................................................................ 167

nvmem.battery.current.high.warn ............................................................... 167

nvmem.battery.sensor.unreadable ............................................................... 168

nvmem.battery.temp.high ............................................................................ 168

nvmem.battery.temp.low ............................................................................. 168

nvmem.battery.temp.normal ....................................................................... 169

nvmem.battery.voltage.high ........................................................................ 169

nvmem.battery.voltage.high.warn ............................................................... 169

nvmem.battery.voltage.normal .................................................................... 169

nvmem.voltage.high .................................................................................... 170

nvmem.voltage.high.warn ........................................................................... 170

nvmem.voltage.normal ................................................................................ 170

nvram.bat.missing.error ............................................................................... 170

nvram.battery.capacity.low ......................................................................... 171

nvram.battery.capacity.low.critical ............................................................. 171

nvram.battery.capacity.low.warn ................................................................ 171

nvram.battery.capacity.normal .................................................................... 171

nvram.battery.charging.nocharge ................................................................ 172

nvram.battery.charging.normal ................................................................... 172

nvram.battery.charging.wrongcharge .......................................................... 172

nvram.battery.current.high .......................................................................... 172

nvram.battery.current.high.warn ................................................................. 173

nvram.battery.current.low ........................................................................... 173

nvram.battery.current.low.warn .................................................................. 173

nvram.battery.current.normal ...................................................................... 174

nvram.battery.end_of_life.high ................................................................... 174

nvram.battery.end_of_life.normal ............................................................... 174

nvram.battery.fault ...................................................................................... 174

nvram.battery.fault.warn ............................................................................. 175

nvram.battery.fcc.low .................................................................................. 175

nvram.battery.fcc.low.critical ...................................................................... 175

nvram.battery.fcc.low.warn ......................................................................... 175

nvram.battery.fcc.normal ............................................................................ 176

nvram.battery.power.fault ........................................................................... 176

nvram.battery.power.normal ....................................................................... 176

nvram.battery.sensor.unreadable ................................................................. 176

nvram.battery.temp.high ............................................................................. 177

nvram.battery.temp.high.warn .................................................................... 177

nvram.battery.temp.low ............................................................................... 177

nvram.battery.temp.low.warn ...................................................................... 178

nvram.battery.temp.normal ......................................................................... 178

nvram.battery.voltage.high .......................................................................... 178

nvram.battery.voltage.high.warn ................................................................. 178

nvram.battery.voltage.low ........................................................................... 179

nvram.battery.voltage.low.warn .................................................................. 179

nvram.battery.voltage.normal ..................................................................... 179

nvram.hw.initFail ........................................................................................ 179

SAS EMS messages ................................................................................................ 180

ds.sas.config.warning .................................................................................. 180

ds.sas.crc.err ................................................................................................ 180

ds.sas.drivephy.disableErr ........................................................................... 180

ds.sas.element.fault ..................................................................................... 181

ds.sas.element.xport.error ............................................................................ 181

ds.sas.hostphy.disableErr ............................................................................ 182

ds.sas.invalid.word ...................................................................................... 182

ds.sas.loss.dword ......................................................................................... 182

ds.sas.multPhys.disableErr .......................................................................... 183

ds.sas.phyRstProb ........................................................................................ 183

ds.sas.running.disparity ............................................................................... 183

ds.sas.ses.disableErr .................................................................................... 184

ds.sas.xfer.element.fault .............................................................................. 184

ds.sas.xfer.export.error ................................................................................ 184

ds.sas.xfer.not.sent ...................................................................................... 185

ds.sas.xfer.unknown.error ........................................................................... 185

sas.adapter.bad ............................................................................................ 186

sas.adapter.bootarg.option ........................................................................... 186

sas.adapter.debug ........................................................................................ 186

sas.adapter.exception ................................................................................... 186

sas.adapter.failed ......................................................................................... 187

sas.adapter.firmware.download ................................................................... 187

sas.adapter.firmware.fault ........................................................................... 187

sas.adapter.firmware.update.failed .............................................................. 187

sas.adapter.not.ready ................................................................................... 188

sas.adapter.offline ........................................................................................ 188

sas.adapter.offlining .................................................................................... 188

sas.adapter.online ........................................................................................ 189

sas.adapter.online.failed .............................................................................. 189

sas.adapter.onlining ..................................................................................... 189

sas.adapter.reset ........................................................................................... 189

sas.adapter.unexpected.status ...................................................................... 190

sas.cable.error .............................................................................................. 190

sas.cable.pulled ............................................................................................ 190

sas.cable.pushed .......................................................................................... 190

sas.config.mixed.detected ........................................................................... 191

sas.device.invalid.wwn ................................................................................ 191

sas.device.quiesce ........................................................................................ 191

sas.device.resetting ...................................................................................... 192

sas.device.timeout ....................................................................................... 192

sas.initialization.failed ................................................................................. 193

sas.link.error ................................................................................................ 193

sas.port.disabled .......................................................................................... 193

sas.port.down ............................................................................................... 193

sas.shelf.conflict .......................................................................................... 194

sasmon.adapter.phy.disable ......................................................................... 194

sasmon.adapter.phy.event ........................................................................... 195

sasmon.disable.module ................................................................................ 195

shm.threshold.spareBlocksConsumed ......................................................... 195

shm.threshold.spareBlocksConsumedMax ................................................. 196

SES EMS messages ................................................................................................. 196

ses.access.noEnclServ ................................................................................. 196

ses.access.noMoreValidPaths ...................................................................... 197

ses.access.noShelfSES ................................................................................ 197

ses.access.sesUnavailable ............................................................................ 198

ses.badShareStorageConfigErr .................................................................... 198

ses.bridge.fw.getFailWarn ........................................................................... 199

ses.bridge.fw.mmErr ................................................................................... 199

ses.channel.rescanInitiated .......................................................................... 199

ses.disk.pctl.timeout .................................................................................... 199

ses.config.drivePopError ............................................................................. 200

ses.config.IllegalEsh270 .............................................................................. 200

ses.config.shelfMixError ............................................................................. 200

ses.config.shelfPopError ............................................................................. 200

ses.disk.configOk ........................................................................................ 201

ses.disk.illegalConfigWarn ......................................................................... 201

ses.disk.pctl.timeout .................................................................................... 201

ses.download.powerCyclingChannel .......................................................... 201

ses.download.shelfToReboot ...................................................................... 202

ses.download.suspendIOForPowerCycle .................................................... 202

ses.drive.PossShelfAddr .............................................................................. 202

ses.drive.shelfAddr.mm ............................................................................... 203

ses.exceptionShelfLog ................................................................................. 203

ses.extendedShelfLog .................................................................................. 204

ses.fw.emptyFile .......................................................................................... 204

ses.fw.resourceNotAvailable ....................................................................... 204

ses.giveback.restartAfter ............................................................................. 205

ses.giveback.wait ......................................................................................... 205

ses.psu.coolingReqError .............................................................................. 205

ses.psu.powerReqError ................................................................................ 205

ses.remote.configPageError ........................................................................ 206

ses.remote.elemDescPageError ................................................................... 206

ses.remote.faultLedError ............................................................................. 206

ses.remote.flashLedError ............................................................................ 207

ses.remote.shelfListError ............................................................................ 207

ses.remote.statPageError ............................................................................. 207

ses.shelf.changedID ..................................................................................... 207

ses.shelf.ctrlFailErr ...................................................................................... 208

ses.shelf.em.ctrlFailErr ................................................................................ 208

ses.shelf.IdBasedAddr ................................................................................. 208

ses.shelf.invalNum ...................................................................................... 209

ses.shelf.mmErr ........................................................................................... 209

ses.shelf.OSmmErr ...................................................................................... 210

ses.shelf.powercycle.done ........................................................................... 210

ses.shelf.powercycle.start ............................................................................ 210

ses.shelf.sameNumReassign ........................................................................ 210

ses.shelf.unsupportAllowErr ....................................................................... 211

ses.shelf.unsupportedErr ............................................................................. 211

ses.startTempOwnership ............................................................................. 211

ses.status.ATFCXError ............................................................................... 211

ses.status.ATFCXInfo ................................................................................. 212

ses.status.currentError ................................................................................. 212

ses.status.currentInfo ................................................................................... 212

ses.status.currentWarning ............................................................................ 213

ses.status.displayError ................................................................................. 213

ses.status.displayInfo ................................................................................... 213

ses.status.displayWarning ........................................................................... 214

ses.status.driveError .................................................................................... 214

ses.status.driveOk ........................................................................................ 214

ses.status.driveWarning ............................................................................... 215

ses.status.electronicsError ........................................................................... 215

ses.status.electronicsInfo ............................................................................. 215

ses.status.electronicsWarn ........................................................................... 215

ses.status.ESHPctlStatus ............................................................................. 216

ses.status.fanError ....................................................................................... 216

ses.status.fanInfo ......................................................................................... 216

ses.status.fanWarning .................................................................................. 216

ses.status.ModuleError ................................................................................ 217

ses.status.ModuleInfo .................................................................................. 217

ses.status.ModuleWarn ................................................................................ 217

ses.status.psError ......................................................................................... 218

ses.status.psInfo ........................................................................................... 218

ses.status.psWarning ................................................................................... 218

ses.status.temperatureError ......................................................................... 219

ses.status.temperatureInfo ........................................................................... 219

ses.status.temperatureWarning .................................................................... 220

ses.status.upsError ....................................................................................... 220

ses.status.upsInfo ......................................................................................... 220

ses.status.volError ....................................................................................... 221

ses.status.volWarning .................................................................................. 221

ses.system.em.mmErr .................................................................................. 221

ses.tempOwnershipDone ............................................................................. 222

sfu.adapterSuspendIO ................................................................................. 222

sfu.auto.update.off.impact ........................................................................... 222

sfu.ctrllerElmntsPerShelf ............................................................................ 222

sfu.downloadCtrllerBridge .......................................................................... 223

sfu.downloadError ....................................................................................... 223

sfu.downloadingController .......................................................................... 223

sfu.downloadingCtrllerR1XX ..................................................................... 223

sfu.downloadStarted .................................................................................... 224

sfu.downloadSuccess ................................................................................... 224

sfu.downloadSummary ................................................................................ 224

sfu.downloadSummaryErrors ...................................................................... 224

sfu.FCDownloadFailed ............................................................................... 224

sfu.firmwareDownrev ................................................................................. 225

sfu.firmwareUpToDate ............................................................................... 225

sfu.partnerInaccessible ................................................................................ 225

sfu.partnerNotResponding ........................................................................... 226

sfu.partnerRefusedUpdate ........................................................................... 226

sfu.partnerUpdateComplete ......................................................................... 226

sfu.partnerUpdateTimeout ........................................................................... 226

sfu.rebootRequest ........................................................................................ 227

sfu.rebootRequestFailure ............................................................................. 227

sfu.resumeDiskIO ........................................................................................ 227

sfu.SASDownloadFailed ............................................................................. 227

sfu.statusCheckFailure ................................................................................ 228

sfu.suspendDiskIO ...................................................................................... 228

sfu.suspendSES ........................................................................................... 228

Flash Cache module and PAM module EMS messages ......................................... 229

extCache.io.BlockChecksumError .............................................................. 229

extCache.io.cardError .................................................................................. 229

extCache.io.readError .................................................................................. 229

extCache.io.writeError ................................................................................ 230

extCache.offline .......................................................................................... 230

extCache.ReconfigComplete ....................................................................... 230

extCache.ReconfigFailed ............................................................................ 230

extCache.ReconfigStart ............................................................................... 231

extCache.UECCerror ................................................................................... 231

extCache.UECCmax ................................................................................... 231

fal.chan.offline.comp ................................................................................... 232

fal.chan.online.erase.warn ........................................................................... 232

fal.chan.online.fail ....................................................................................... 232

fal.chan.online.read.warn ............................................................................ 232

fal.chan.online.rep.fail ................................................................................. 233

fal.chan.online.rep.part ................................................................................ 233

fal.chan.online.rep.succ ............................................................................... 233

fal.chan.online.rep.ver.err ........................................................................... 233

fal.chan.online.write.warn ........................................................................... 234

fal.init.failed ................................................................................................ 234

fmm.bad.block.detected .............................................................................. 234

fmm.device.stats.missing ............................................................................ 234

fmm.domain.card.failure ............................................................................. 235

fmm.domain.core.failure ............................................................................. 235

fmm.hourly.device.report ............................................................................ 235

fmm.threshold.bank.degraded ..................................................................... 235

fmm.threshold.bank.offline ......................................................................... 236

fmm.threshold.card.degraded ...................................................................... 236

fmm.threshold.card.failure .......................................................................... 236

fmm.threshold.core.offline .......................................................................... 236

iomem.bbm.bbtl.overflow ........................................................................... 237

iomem.bbm.init.failed ................................................................................. 237

iomem.bbm.new.flash ................................................................................. 237

iomem.card.disable ...................................................................................... 237

iomem.card.enable ...................................................................................... 238

iomem.card.fail.cecc ................................................................................... 238

iomem.card.fail.data.crc .............................................................................. 238

iomem.card.fail.desc.crc .............................................................................. 238

iomem.card.fail.dimm ................................................................................. 239

iomem.card.fail.firmware.primary .............................................................. 239

iomem.card.fail.fpga ................................................................................... 239

iomem.card.fail.fpga.primary ...................................................................... 240

iomem.card.fail.fpga.rev ............................................................................. 240

iomem.card.fail.internal .............................................................................. 241

iomem.card.fail.pci ...................................................................................... 241

iomem.card.fail.uecc ................................................................................... 241

iomem.dimm.log.checksum ........................................................................ 242

iomem.dimm.log.init ................................................................................... 242

iomem.dimm.log.read ................................................................................. 242

iomem.dimm.log.sync ................................................................................. 242

iomem.dimm.log.write ................................................................................ 243

iomem.dimm.mismatch.banks ..................................................................... 243

iomem.dimm.mismatch.burst ...................................................................... 243

iomem.dimm.mismatch.casLatency ............................................................ 243

iomem.dimm.mismatch.columns ................................................................ 244

iomem.dimm.mismatch.dataWidth ............................................................. 244

iomem.dimm.mismatch.eccWidth ............................................................... 244

iomem.dimm.mismatch.ranks ..................................................................... 244

iomem.dimm.mismatch.rows ...................................................................... 245

iomem.dimm.mismatch.vendor ................................................................... 245

iomem.dimm.spd.banks ............................................................................... 245

iomem.dimm.spd.burst ................................................................................ 245

iomem.dimm.spd.casLatency ...................................................................... 246

iomem.dimm.spd.checksum ........................................................................ 246

iomem.dimm.spd.columns .......................................................................... 246

iomem.dimm.spd.dataWidth ....................................................................... 246

iomem.dimm.spd.detect .............................................................................. 247

iomem.dimm.spd.eccWidth ......................................................................... 247

iomem.dimm.spd.ranks ............................................................................... 247

iomem.dimm.spd.read ................................................................................. 247

iomem.dimm.spd.rows ................................................................................ 248

iomem.dma.crc.data .................................................................................... 248

iomem.dma.crc.desc .................................................................................... 248

iomem.dma.internal ..................................................................................... 248

iomem.dma.stall .......................................................................................... 249

iomem.ecc.cecc ........................................................................................... 249

iomem.ecc.correct.off .................................................................................. 249

iomem.ecc.correct.on .................................................................................. 249

iomem.ecc.detect.off ................................................................................... 250

iomem.ecc.detect.on .................................................................................... 250

iomem.ecc.inject .......................................................................................... 250

iomem.ecc.summary .................................................................................... 250

iomem.ecc.uecc ........................................................................................... 251

iomem.fail.stripe .......................................................................................... 251

iomem.firmware.package.access ................................................................. 251

iomem.firmware.primary ............................................................................ 252

iomem.firmware.program.complete ............................................................ 252

iomem.firmware.program.fail ..................................................................... 252

iomem.firmware.program.reboot ................................................................ 252

iomem.firmware.program.start .................................................................... 252

iomem.firmware.rev .................................................................................... 253

iomem.flash.mismatch.id ............................................................................ 253

iomem.fru.badInfo ....................................................................................... 253

iomem.fru.checksum ................................................................................... 253

iomem.fru.read ............................................................................................ 254

iomem.fru.write ........................................................................................... 254

iomem.i2c.link.down ................................................................................... 254

iomem.i2c.read.addrNACK ......................................................................... 254

iomem.i2c.read.dataNACK ......................................................................... 255

iomem.i2c.read.timeout ............................................................................... 255

iomem.i2c.write.addrNACK ....................................................................... 255

iomem.i2c.write.dataNACK ........................................................................ 255

iomem.i2c.write.timeout ............................................................................. 256

iomem.init.detect.fpga ................................................................................. 256

iomem.init.detect.pci ................................................................................... 256

iomem.init.fail ............................................................................................. 256

iomem.memory.flash.syndrome .................................................................. 256

iomem.memory.none ................................................................................... 257

iomem.memory.power.high ........................................................................ 257

iomem.memory.power.low ......................................................................... 257

iomem.memory.scrub.start .......................................................................... 257

iomem.memory.size .................................................................................... 258

iomem.memory.zero.complete .................................................................... 258

iomem.memory.zero.start ............................................................................ 258

iomem.nor.op.failed .................................................................................... 258

iomem.pci.error.config.bar .......................................................................... 258

iomem.pio.op.failed ..................................................................................... 259

iomem.remap.block ..................................................................................... 259

iomem.remap.target.bad .............................................................................. 259

iomem.temp.report ...................................................................................... 259

iomem.train.complete .................................................................................. 260

iomem.train.fail ........................................................................................... 260

iomem.train.notReady ................................................................................. 260

iomem.train.start .......................................................................................... 260

iomem.vmargin.high ................................................................................... 261

iomem.vmargin.low .................................................................................... 261

iomem.vmargin.nominal ............................................................................. 261

monitor.extCache.failed .............................................................................. 261

monitor.flexscale.noLicense ........................................................................ 261

USB boot device EMS messages ............................................................................ 262

usb.adapter.debug ........................................................................................ 262

usb.adapter.exception .................................................................................. 262

usb.adapter.failed ........................................................................................ 262

usb.adapter.reset .......................................................................................... 263

usb.device.failed .......................................................................................... 263

usb.device.initialize.failed ........................................................................... 263

usb.device.maximum.connected ................................................................. 264

usb.device.protocol.mismatch ..................................................................... 264

usb.device.removed ..................................................................................... 265

usb.device.timeout ....................................................................................... 265

usb.device.unsupported ............................................................................... 265

usb.device.unsupported.speed ..................................................................... 266

usb.external.device.not.used ........................................................................ 266

usb.externalHub.notSupported .................................................................... 266

usb.port.error ............................................................................................... 266

usb.port.reset ............................................................................................... 267

usb.port.state.indeterminate ......................................................................... 267

usb.port.status.inconsistent .......................................................................... 267

usbmon.boot.device.failed ........................................................................... 268

usbmon.boot.device.pfa ............................................................................... 268

usbmon.disable.module ............................................................................... 268

usbmon.unable.to.monitor ........................................................................... 269

FCoE HBA EMS messages ..................................................................................... 269

ispcna.mpi.dump ......................................................................................... 269

ispcna.mpi.dump.saved ............................................................................... 269

ispcna.mpi.initFailed ................................................................................... 270

Operational error messages ..................................................................................... 270

Disk hung during swap ................................................................................ 270

Disk n is broken ........................................................................................... 271

Dumping core .............................................................................................. 271

Error dumping core ..................................................................................... 271

FC-AL LINK_FAILURE ............................................................................ 271

FC-AL RECOVERABLE ERRORS ........................................................... 271

Panicking ..................................................................................................... 272

RMC Alert: Boot Error ............................................................................... 272

RMC Alert: Down Appliance ..................................................................... 272

RMC Alert: OFW POST Error .................................................................... 272

RLM messages .......................................................................................... 275

When and how RLM AutoSupport e-mail messages are sent ................................. 275

What RLM AutoSupport e-mail messages include ................................................. 276

When and how RLM EMS messages are sent ........................................................ 276

RLM-generated AutoSupport messages .................................................................. 276

Heartbeat loss warning ................................................................................ 276

Reboot (power loss) critical ........................................................................ 277

Reboot warning ........................................................................................... 277

Reboot (watchdog reset) warning ............................................................... 277

RLM heartbeat loss ..................................................................................... 277

RLM heartbeat stopped ............................................................................... 278

System boot failed (POST failed) ............................................................... 278

User triggered (RLM test) ........................................................................... 278

User_triggered (system nmi) ....................................................................... 278

User_triggered (system power cycle) .......................................................... 278

User_triggered (system power off) ............................................................. 279

User_triggered (system power on) .............................................................. 279

User_triggered (system reset) ...................................................................... 279

EMS messages about the RLM ............................................................................... 279

rlm.driver.hourly.stats ................................................................................. 279

rlm.driver.mailhost ...................................................................................... 280

rlm.driver.network.failure ........................................................................... 280

rlm.driver.timeout ........................................................................................ 280

rlm.firmware.update.failed .......................................................................... 281

rlm.firmware.upgrade.reqd .......................................................................... 281

rlm.firmware.version.unsupported .............................................................. 282

rlm.heartbeat.bootFromBackup ................................................................... 282

rlm.heartbeat.resumed ................................................................................. 282

rlm.heartbeat.stopped .................................................................................. 283

rlm.network.link.down ................................................................................ 283

rlm.notConfigured ....................................................................................... 284

rlm.orftp.failed ............................................................................................ 284

rlm.snmp.traps.off ....................................................................................... 285

rlm.systemDown.alert ................................................................................. 285

rlm.systemDown.notice ............................................................................... 285

rlm.systemDown.warning ........................................................................... 286

rlm.systemPeriodic.keepAlive .................................................................... 286

rlm.systemTest.notice .................................................................................. 287

rlm.userlist.update.failed ............................................................................. 287

BMC messages .......................................................................................... 289

How and when BMC AutoSupport e-mail notifications are sent ............................ 289

What BMC e-mail notifications include ................................................................. 289

BMC-generated AutoSupport messages ................................................................. 289

BMC_ASUP_UNKNOWN ......................................................................... 290

REBOOT (abnormal) .................................................................................. 290

REBOOT (power loss) ................................................................................ 290

REBOOT (watchdog reset) ......................................................................... 290

SYSTEM_BOOT_FAILED (POST failed) ................................................ 290

SYSTEM_POWER_OFF (environment) .................................................... 291

USER_TRIGGERED (bmc test) ................................................................. 291

USER_TRIGGERED (system nmi) ............................................................ 291

USER_TRIGGERED (system power cycle) ............................................... 291

USER_TRIGGERED (system power off) ................................................... 291

USER_TRIGGERED (system power on) ................................................... 292

USER_TRIGGERED (system power soft-off) ........................................... 292

USER_TRIGGERED (system reset) ........................................................... 292

EMS messages about the BMC ............................................................................... 292

bmc.asup.crit ............................................................................................... 292

bmc.asup.error ............................................................................................. 293

bmc.asup.init ............................................................................................... 293

bmc.asup.queue ........................................................................................... 293

bmc.asup.send ............................................................................................. 293

bmc.asup.smtp ............................................................................................. 294

bmc.batt.id ................................................................................................... 294

bmc.batt.invalid ........................................................................................... 294

bmc.batt.mfg ................................................................................................ 294

bmc.batt.rev ................................................................................................. 295

bmc.batt.seal ................................................................................................ 295

bmc.batt.unknown ....................................................................................... 295

bmc.batt.unseal ............................................................................................ 295

bmc.batt.upgrade ......................................................................................... 295

bmc.batt.upgrade.busy ................................................................................. 296

bmc.batt.upgrade.failed ............................................................................... 296

bmc.batt.upgrade.failure .............................................................................. 296

bmc.batt.upgrade.ok .................................................................................... 297

bmc.batt.upgrade.power-off ........................................................................ 297

bmc.batt.upgrade.voltagelow ...................................................................... 297

bmc.batt.voltage .......................................................................................... 297

bmc.config.asup.off ..................................................................................... 298

bmc.config.corrupted .................................................................................. 298

bmc.config.default ....................................................................................... 298

bmc.config.default.pef.filter ........................................................................ 298

bmc.config.default.pef.policy ...................................................................... 299

bmc.config.fru.systemserial ........................................................................ 299

bmc.config.mac.error .................................................................................. 299

bmc.config.net.error .................................................................................... 299

bmc.config.upgrade ..................................................................................... 300

bmc.power.on.auto ...................................................................................... 300

bmc.reset.ext ................................................................................................ 300

bmc.reset.int ................................................................................................ 300

bmc.reset.power .......................................................................................... 300

bmc.reset.repair ........................................................................................... 301

bmc.reset.unknown ...................................................................................... 301

bmc.sensor.batt.charger.off ......................................................................... 301

bmc.sensor.batt.charger.on .......................................................................... 301

bmc.sensor.batt.time.run.invalid ................................................................. 301

bmc.ssh.key.missing .................................................................................... 302

Service Processor messages ..................................................................... 303

When and how SP AutoSupport e-mail messages are sent ..................................... 303

What SP AutoSupport e-mail messages include ..................................................... 304

When and how SP EMS messages are sent ............................................................. 304

SP-generated AutoSupport messages ...................................................................... 304

HEARTBEAT_LOSS ................................................................................. 304

REBOOT (abnormal) .................................................................................. 305

SYSTEM_BOOT_FAILED (POST failed) ................................................ 305

USER_TRIGGERED (sp test) .................................................................... 305

USER_TRIGGERED (system nmi) ............................................................ 305

USER_TRIGGERED (system power cycle) ............................................... 306

USER_TRIGGERED (system power off) ................................................... 306

USER_TRIGGERED (system reset) ........................................................... 306

EMS messages about the SP ................................................................................... 306

sp.firmware.upgrade.reqd ............................................................................ 306

sp.firmware.version.unsupported ................................................................ 307

sp.heartbeat.resumed ................................................................................... 307

sp.heartbeat.stopped .................................................................................... 307

sp.network.link.down .................................................................................. 308

sp.notConfigured ......................................................................................... 308

sp.orftp.failed .............................................................................................. 309

sp.snmp.traps.off ......................................................................................... 309

sp.userlist.update.failed ............................................................................... 309

spmgmt.driver.hourly.stats .......................................................................... 310

spmgmt.driver.mailhost ............................................................................... 311

spmgmt.driver.network.failure .................................................................... 311

spmgmt.driver.timeout ................................................................................ 311

Abbreviations ............................................................................................ 313

Copyright information ............................................................................. 329

Trademark information ........................................................................... 331

How to send your comments .................................................................... 333

Index ........................................................................................................... 335

Sources of troubleshooting information

Your storage system alerts you when problems occur and informs you of events that do not pose problems. It does so with LEDs and messages that appear on your system console.

Monitoring messages and LEDs and using this guide to determine the meaning of messages and LEDs can help you prevent or correct problems on your system.

The following systems are included in this guide:

• FAS20xx and SA200
• FAS22xx
• 30xx and SA300
• 31xx
• 32xx and SA320
• 60xx
• 62xx and SA620

Where LEDs appear

LEDs appear on the front of system chassis, the back of controllers, on PSUs, and on fan FRUs. They also appear on adapters that might be installed on your system.

LEDs for one system family differ from LEDs for another system family. For example, LEDs on FAS20xx and SA200 systems differ from those on 60xx and SA600 systems.

Where messages are displayed

Your system displays messages in different places, depending on the type of message.

The following table lists the types of messages your system might generate and where you can see them on your system.

| Error message type | Where the type of message is displayed |
|---|---|
| POST error messages | System console |
| Boot error messages | System console |
| EMS environmental messages and other operational messages | System console or LCD display |
| RLM notifications about the system and EMS messages about the RLM | AutoSupport e-mail messages and the system console |
| BMC notifications about the system and EMS messages about the BMC | AutoSupport e-mail messages and the system console |
| SP notifications about the system and EMS messages about the SP | AutoSupport e-mail messages and the system console |

Your system also logs messages. See the System Administration Guide for the version of Data ONTAP that your system is running for information about message logs.

Additional information about messages that appear on your system console or in logs may be available through the Syslog Translator on the NOW site.
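The message table above can be read as a simple lookup from message type to display location. The following Python sketch restates it in that form. The short keys (`post_error`, `rlm`, and so on) and the function name are hypothetical labels chosen for this illustration; they are not identifiers used by Data ONTAP.

```python
# Illustrative lookup built from the message table in this section.
# The keys are hypothetical short names, not Data ONTAP identifiers.
MESSAGE_DESTINATIONS = {
    "post_error": ("System console",),
    "boot_error": ("System console",),
    "ems_operational": ("System console", "LCD display"),
    "rlm": ("AutoSupport e-mail messages", "System console"),
    "bmc": ("AutoSupport e-mail messages", "System console"),
    "sp": ("AutoSupport e-mail messages", "System console"),
}


def where_displayed(message_type):
    """Return the places where a given type of message can appear."""
    try:
        return MESSAGE_DESTINATIONS[message_type]
    except KeyError:
        raise ValueError("unknown message type: %r" % (message_type,)) from None


if __name__ == "__main__":
    for name in sorted(MESSAGE_DESTINATIONS):
        print(name, "->", ", ".join(where_displayed(name)))
```

A lookup like this might be useful in a local runbook script that tells an operator where to look first for a given class of message.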

How AutoSupport e-mail messages help with troubleshooting

Your system has an AutoSupport feature, which sends e-mail containing information about your system to technical support. AutoSupport provides customized real-time support to monitor the performance of your system.

AutoSupport messages are generated and sent when specific events occur within a system or a cluster. Messages also are sent weekly to provide support personnel information about system performance. If necessary, technical support contacts you at the e-mail address that you specify to help resolve a potential system problem.

You also can have AutoSupport messages sent to addresses that you designate, such as your internal support organization.

Descriptions of the AutoSupport messages that you receive are available through the Message Matrices page on the NOW site.

For information about configuring AutoSupport, see the System Administration Guide for the version of Data ONTAP that your system is running.

Note: AutoSupport is enabled by default. You should keep it enabled because it can significantly speed the determination and resolution of problems if they occur on your system.

Forms and use of diagnostic tools

Diagnostic tools enable you to troubleshoot problems with your storage system hardware. Forms and use of diagnostics differ, depending on your system model. You need to understand how to use the applicable form of diagnostics for your system.

The following diagnostic tools are available on different systems:

System-level diagnostics

System-level diagnostics are available on FAS22xx, 32xx, and 62xx systems by entering sldiag commands at the Maintenance mode prompt.

The sldiag commands enable you to specify devices, tests, and options; run diagnostics based on the command; and then view the results. They are documented in man pages and in the command reference documents on the NetApp Support Site at support.netapp.com.

Additional information about system-level diagnostics is available in the System-Level Diagnostics Guide on the NetApp Support Site at support.netapp.com.

SYSDIAG tool

The SYSDIAG tool is available on systems earlier than FAS22xx, 32xx, and 62xx by entering the boot_diags command at the boot environment prompt and then navigating menu options.

The command boots the diagnostic program and then displays the Diagnostic Monitor, the interface providing access to diagnostic menus. After you select and run a test, the SYSDIAG tool generates a message and displays it on the system console if the test finds an error.

Additional information about the SYSDIAG tool is available in the Diagnostics Guide on the NetApp Support Site at support.netapp.com.
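The choice between the two diagnostic tools depends only on the system family, so it can be sketched as a small lookup. The Python example below is a minimal illustration, not a NetApp utility. The function name is hypothetical, and the set of "earlier" families is an assumption: it is taken from the other families this guide covers, whereas the text says only "systems earlier than FAS22xx, 32xx, and 62xx".

```python
# Families that use system-level diagnostics (sldiag), per this section.
SLDIAG_FAMILIES = {"FAS22xx", "32xx", "62xx"}

# Assumption: the remaining families covered by this guide are the
# "earlier" systems that use the SYSDIAG tool.
SYSDIAG_FAMILIES = {"FAS20xx", "30xx", "31xx", "60xx"}


def diagnostic_entry_point(family):
    """Return (command, prompt) for starting diagnostics on a system family."""
    if family in SLDIAG_FAMILIES:
        return ("sldiag", "Maintenance mode prompt")
    if family in SYSDIAG_FAMILIES:
        return ("boot_diags", "boot environment prompt")
    raise ValueError("system family not covered: %r" % (family,))


if __name__ == "__main__":
    for fam in ("FAS22xx", "31xx"):
        command, prompt = diagnostic_entry_point(fam)
        print("%s: enter %s at the %s" % (fam, command, prompt))
```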

Where to find documentation

Documentation is available for specific system families and disk shelves that might be attached to your storage system. You can find documentation on the NetApp Support Site at support.netapp.com.

Use the following table to learn what documents contain information that might assist you with troubleshooting specific systems or disk shelves.

| Platform or disk shelf type | System or disk shelf model | Document |
|---|---|---|
| FAS systems | 62xx, 60xx, 32xx, 31xx, 30xx, FAS22xx, and FAS20xx systems | Hardware Platform Monitoring Guide (This guide) |
| | FAS900 series | FAS900 Hardware Service Guide |
| | FAS250 and FAS270 systems | FAS250/FAS270 Hardware and Service Guide |
| Filer systems | F800 filers | F800 Hardware Installation Guide |
| | F87 filers | F87 Hardware and Service Guide |
| | F85 filers | F85 Hardware and Service Guide |
| V-Series systems and gFiler gateways | 30xx, 31xx, 32xx, 60xx, and 62xx systems | Hardware Platform Monitoring Guide (This guide) |
| | V900 gFiler, V270c gFiler, GF825 gFiler | gFiler Hardware Maintenance Guide |
| SA systems | SA200, SA300, SA320, SA600, and SA620 systems | Hardware Platform Monitoring Guide (This guide) |
| NearStore systems | R200 systems | R200 Hardware and Service Guide |
| | R150 systems | R150 Hardware and Service Guide |
| | R100 systems | R100 Hardware and Service Guide |
| Disk shelves | DS2246 | DS2246 Installation and Service Guide |
| | DS4243 | DS4243 Installation and Service Guide |
| | DS14mk2 FC | DS14mk2 FC Hardware Guide |
| | DS14mk2 AT | DS14mk2 AT Hardware Guide |
| | FC9 | FC9 Hardware Guide |
| Third-party hardware | Switches, routers, storage subsystems, and tape backup devices | Applicable third-party hardware documentation |

System LEDs

LEDs enable you to monitor your storage system and its components.

Each storage system platform has LEDs on the chassis, controller, fans, and PSUs. These LEDs provide high-level status of your system and network activity.

Your system might have adapters installed and configured on it. These adapters also have LEDs, which show you whether the adapter has power, whether there is a network connection, and whether data is being transmitted.

Note: For information about disk shelf LEDs, see the appropriate disk shelf guide on the NetApp Support Site at support.netapp.com.

FAS20xx and SA200 system LEDs

FAS20xx and SA200 systems have LEDs that you can check to learn whether the system and its individual components are turned on and are operating normally.

LEDs are visible on the front and the back of the system and on the power supply.

Location and meaning of LEDs on the front of FAS20xx and SA200 chassis

You can check the LEDs on the front of the system to learn whether the power is turned on, whether there is activity on the controller, whether the system is halted, or whether there is a fault in the chassis.

The following illustration shows the LEDs on the front of the FAS20xx and SA200 chassis.

1 Power LED

2 Fault LED

3 Controller module A LED

4 Controller module B LED

The following table explains what the LEDs on the front of the chassis mean.

| Label | LED name | Status indicator | Description |
|---|---|---|---|
| | Power | Green | The system is receiving power. |
| | | Off | The system is not receiving power. |
| | Fault | Amber | The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller module, or internal disk. The LED also is lit when there is a field-replaceable unit failure, Data ONTAP is not running on a controller module, or the system is in Maintenance mode. |
| | | Off | The system is operating normally. |
| A/B | Controller A or B | Green | The controller is operating and is active. |
| | | Blinking | This LED blinks in proportion to activity; the greater the activity, the more frequently the LED blinks. When activity is absent or very low, the LED does not blink. |
| | | Off | No activity is detected. |

Note: If an internal disk drive fails or is disabled, the fault light on the front of the chassis turns on. When you remove the faulty or disabled disk drive, the fault light turns off. However, the failure of disk drives in expansion disk shelves does not affect the fault light on the front of the chassis.
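For quick reference, the front-panel table for FAS20xx and SA200 chassis can be encoded as a lookup keyed by LED name and observed state. The Python sketch below is only an illustration of the table; the key spellings and function names are hypothetical, and the descriptions are shortened from the table text.

```python
# Front-panel LED meanings for FAS20xx and SA200 chassis, condensed from
# the table in this section. Keys are hypothetical (led, state) pairs.
FRONT_LED_MEANINGS = {
    ("power", "green"): "The system is receiving power.",
    ("power", "off"): "The system is not receiving power.",
    ("fault", "amber"): "The system halted or a fault occurred in the chassis.",
    ("fault", "off"): "The system is operating normally.",
    ("controller", "green"): "The controller is operating and is active.",
    ("controller", "blinking"): "Activity is present; blink rate rises with activity.",
    ("controller", "off"): "No activity is detected.",
}


def describe_led(led, state):
    """Return the documented meaning of a front-panel LED state."""
    key = (led.lower(), state.lower())
    if key not in FRONT_LED_MEANINGS:
        raise ValueError("no documented meaning for %r" % (key,))
    return FRONT_LED_MEANINGS[key]


def chassis_needs_attention(fault_state):
    """An amber fault LED is the only front-panel state that signals a problem."""
    return fault_state.lower() == "amber"


if __name__ == "__main__":
    print(describe_led("Fault", "Amber"))
    print(chassis_needs_attention("off"))
```

Note that an amber fault LED has several possible causes (PSU, fan, controller module, internal disk, Maintenance mode), so a check like `chassis_needs_attention` can only prompt further investigation on the system console.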

Location and meaning of LEDs on the back of FAS20xx and SA200 controller modules

You can check the LEDs on the back of the controller module to learn whether the controller module is functioning properly, or to learn the status of the system network or disk shelf connections or NVMEM.

The following LEDs are on the back of the controller module:

• Fibre Channel port
• Remote management port
• Ethernet port
• NVMEM
• Controller module fault

The following illustration shows the location of LEDs on the rear of FAS2050 and SA200 controller modules.

The LEDs on the back of FAS2020 controller modules are the same as on the back of FAS2050 and SA200 controller modules, except for the placement of some labels.

The following illustration shows the location of LEDs on the back of FAS2040 controller modules.

The following table explains what the LEDs on the back of the controller modules mean.

| Label | Port type | LED type | Status indicator | Description |
|---|---|---|---|---|
| | Fibre Channel | LNK | Green | Link is established and communication is happening. |
| | | | Off | No link is established. |
| | SAS | LNK | Green | Link is established on at least one external SAS lane. |
| | | | Off | No link is established on any external SAS lane. |
| | Remote management | LNK (Left) | Green | A valid network connection is established. |
| | | | Off | There is no network connection present. |
| | | ACT (Right) | Amber | There is data activity. |
| | | | Off | There is no network activity present. |
| | Ethernet | LNK (Left) | Green | A valid network connection is established. |
| | | | Off | There is no network connection present. |
| | | ACT (Right) | Amber | There is data activity. |
| | | | Off | There is no network activity present. |
| | N/A | NVMEM status LED | Blinking green | NVMEM is in battery-backed standby mode. |
| | | | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running. |
| | | | Off (power off) | The system is shut down, NVMEM is not armed, and the battery is not enabled. |
| | N/A | Controller module fault LED | Amber | The controller module is starting up, Data ONTAP is initializing, the controller module is in Maintenance mode, or a controller module fault is detected. |
| | | | Off | The controller module is functioning properly. |

Attention: Do not replace DIMMs or any other system hardware when the NVMEM LED is blinking. Doing so might cause you to lose data. Always flush NVMEM contents to disk by entering a halt command at the system prompt before replacing the hardware.

Attention: To protect critical data in NVMEM, you cannot update BIOS or BMC firmware when NVMEM is in use. Before updating firmware, ensure that NVMEM no longer contains critical data by performing a halt command to cleanly shut down Data ONTAP. When the system reboots to the boot environment prompt, you can update your firmware.

Location and meaning of FAS20xx and SA200 PSU LEDs

You can check the LEDs on each PSU in your system to see whether the PSU has power and is functioning properly.

The following illustration shows the location of the PSU LEDs, which are visible at the back of the system.

Note: The following illustration shows the PSU of FAS2050 and SA200 systems. The location of PSU LEDs in FAS2020 and FAS2040 systems is different, but the LEDs are functionally identical.
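The two NVMEM cautions above amount to a pre-service check: the system must have been halted cleanly, and the NVMEM LED must not be blinking. The Python sketch below expresses that rule as a checklist helper. It is an illustration under stated assumptions, not a NetApp tool: the function and parameter names are hypothetical, and both inputs are observations an operator supplies, not values read from the hardware.

```python
def ok_to_replace_hardware(nvmem_led_blinking, halted_cleanly):
    """Apply the two cautions in this section as a simple checklist.

    nvmem_led_blinking: True if the NVMEM status LED is blinking green
        (NVMEM is in battery-backed standby mode and may hold data).
    halted_cleanly: True if a halt command was entered at the system
        prompt, which flushes NVMEM contents to disk.
    """
    return halted_cleanly and not nvmem_led_blinking


if __name__ == "__main__":
    # Blinking LED: do not service, regardless of how the system stopped.
    print(ok_to_replace_hardware(nvmem_led_blinking=True, halted_cleanly=True))
    # Clean halt and LED not blinking: the cautions are satisfied.
    print(ok_to_replace_hardware(nvmem_led_blinking=False, halted_cleanly=True))
```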

1 AC LED

2 Fault LED

The following table explains what the PSU LEDs mean.

| Icon | LED name | LED color | Description |
|---|---|---|---|
| | AC | Green | AC input is good and the switch is on. |
| | | Off | AC input is bad or the switch is off. |
| | Fault | Amber | The power supply is not functioning properly and needs service. See the system console for any applicable error messages. |
| | | Off | The power supply is functioning properly. |

FAS22xx system LEDs

FAS22xx systems have LEDs that you can check to learn whether the system and its individual components are turned on and are operating normally. FAS22xx systems include FAS2220 and FAS2240 systems.

LEDs are visible on the front of the chassis, on the back of controllers, and on the PSUs.

FAS22xx systems are available in three models: the 2U FAS2220 system, the 2U FAS2240-2 system, and the 4U FAS2240-4 system.

Location and meaning of LEDs on the front of FAS22xx chassis

You can check the LEDs on the front of the chassis to learn whether the power is turned on, the controller is active, the system is halted, or a fault in the chassis has occurred.

The following illustration shows the LEDs on the front of a FAS2220 or FAS2240-2 system with the bezel in place.

1

2

1 LEDs

2 Shelf ID digital display

FAS2240-4 systems have 4U chassis, but the placement and function of the LEDs are the same as onFAS2220 and FAS2240-2 systems.

The following table explains what the LEDs mean.

LED name | Status indicator | Description
Power | Green | Power is being supplied to the system.
Power | Off | No power is being supplied to the system.
Fault | Amber | A fault has occurred in the controller, PSU, or onboard storage, or Data ONTAP is not running.
Fault | Off | The system is operating normally.

The shelf ID digital display shows the shelf ID of the chassis, which contains disk drives.

Note: If the FAS2220 or FAS2240 system has no attached disk shelves, then the chassis can have any ID number. However, if disk shelves are attached, the chassis shelf and attached disk shelves must have unique ID numbers.
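The uniqueness rule in the preceding note can be checked mechanically when you plan shelf IDs. The following Python sketch is illustrative only: the function name `find_duplicate_shelf_ids` and the representation of shelf IDs as plain integers are assumptions made for this example and are not part of any NetApp tool.

```python
from collections import Counter


def find_duplicate_shelf_ids(chassis_id, attached_shelf_ids):
    """Return the sorted shelf IDs that are used more than once.

    chassis_id is the ID shown on the chassis shelf ID display;
    attached_shelf_ids lists the IDs of the attached disk shelves.
    With no attached shelves, the chassis can have any ID, so
    nothing is reported.
    """
    if not attached_shelf_ids:
        return []
    counts = Counter([chassis_id, *attached_shelf_ids])
    return sorted(shelf_id for shelf_id, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical layout: chassis ID 1 with three attached shelves.
    print(find_duplicate_shelf_ids(1, [1, 2, 2]))
```

An empty result means that the planned IDs satisfy the rule; any IDs in the result must be changed before the shelves are attached.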

When the bezel is removed, a third LED, indicating activity, is revealed below the fault LED. The following table explains what the activity LED means.

LED name | Status indicator | Description
Activity | Green | A link is established between the controller and storage.

Location and meaning of FAS22xx internal drive LEDs

When the bezel of the system is removed, you can view the LEDs on the internal disk drive carriers, which indicate whether the disk drive is functioning normally.

The following illustration shows the front of a disk drive carrier in FAS2220 and FAS2240-4 systems and the location of its two LEDs.

1 Activity LED

2 Fault LED

The following illustration shows the front of a disk drive carrier in FAS2240-2 systems and the location of its two LEDs.

1 Activity LED

2 Fault LED

Although the drive carriers are different in appearance, the behavior of the LEDs is the same. The following table explains what the LEDs mean.

LED | LED color | Description
Activity | Solid green | The disk drive has power.
Activity | Blinking green | The disk drive has power, and I/O is in progress.
Fault | Solid amber | There is an error with the functioning of the disk drive.
Fault | Not illuminated | The disk drive is functioning normally.

Location and meaning of LEDs on the back of FAS22xx controllers

You can check the LEDs on the back of the controller to learn the status of its network or disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.

The following illustration shows the ports and LEDs on the back of the controller.

1 SAS port LEDs

2 SAS ports

3 Controller fault LED

4 NVMEM status LED

5 Optional mezzanine card LEDs (either 2/4/8 Gbps FC or 10 GbE) (FAS2240 systems only)

6 Optional mezzanine card ports (either 2/4/8 Gbps FC or 10 GbE) (FAS2240 systems only)

7 Serial port

8 USB port

9 Remote management Ethernet 10/100 Mb port LEDs


10 Remote management Ethernet 10/100 Mb port

11 Private management 10/100 Mb Ethernet port LEDs

12 Private management 10/100 Mb Ethernet port

13 GbE Ethernet port LEDs

14 GbE Ethernet port

If you have a FAS2240 system, the optional mezzanine card provides one of the following sets of ports:

• Two 2/4/8 Gbps FC ports, each with one LNK LED
• Two 10-GbE ports, each with one activity LED and one LNK LED

The following table describes the meaning of the LEDs on the back of the controller.

Name | Type | Status indicator | Description
Serial attached SCSI (SAS) | Link | Green | Link is established on at least 1 external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Controller fault | Activity | Amber | The controller module is starting up, Data ONTAP is initializing, the controller module is in Maintenance mode, or a controller module fault is detected. Note: The LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
NVMEM status (label: NVMEM) | - | Blinking green | NVMEM is in battery-backed standby mode.
NVMEM status (label: NVMEM) | - | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.
Fibre Channel | Link | Green | A connection is established on the port.
Fibre Channel | Link | Off | No connection is established on the port.
Ethernet | Link | Green | A link is established between the port and some upstream device.
Ethernet | Link | Off | No link is established.
Ethernet | Activity | Blinking amber | Traffic is flowing over the connection.
Ethernet | Activity | Off | No traffic is flowing over the connection.
Remote management | Link | Green | A link is established between the port and some upstream device.
Remote management | Link | Off | No link is established.
Remote management | Activity | Blinking amber | Traffic is flowing over the connection.
Remote management | Activity | Off | No traffic is flowing over the connection.
Private management | Link | Green | A link is established between the port and a downstream disk shelf.
Private management | Link | Off | No link is established.
Private management | Activity | Blinking amber | Traffic is flowing over the connection.
Private management | Activity | Off | No traffic is flowing over the connection.

Location and meaning of FAS22xx PSU LEDs

You can check the LEDs on each PSU to see whether its power is on and whether the PSU and integrated fan modules are working properly.

The PSUs on FAS2220 and FAS2240-2 systems are different from the PSUs on FAS2240-4 systems, but the PSU LEDs function the same way.

The following illustration shows the location of PSU LEDs on the back of FAS2220 and FAS2240-2 systems.

1 PSU OK

2 DC fault

3 AC fault

4 Fan fault

The following illustration shows the location of PSU LEDs on the back of the FAS2240-4 system.

1 Fan fault

2 AC fault

3 PSU OK

4 DC fault

The following table describes what the PSU LEDs on FAS22xx systems mean.

Name | Status indicator | Description
PSU OK | Green | The PSU is functioning normally. Note: The other three LEDs are not illuminated.
DC fault | Amber | The PSU cannot provide DC voltage to the disk shelf within margin.
AC fault | Amber | The PSU is not turned on or the AC power cord is not plugged in.
Fan fault | Amber | An error occurred with the function of the fan.
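If you record PSU LED observations in a script or checklist tool, the table above can be expressed as a simple lookup. The following Python sketch is an illustration only: the names `PSU_LED_MEANINGS` and `describe_psu_leds` are hypothetical, and the LEDs must still be read by eye because the sketch does not query any hardware.

```python
# Meanings of the FAS22xx PSU LEDs, copied from the table above.
PSU_LED_MEANINGS = {
    "PSU OK": "The PSU is functioning normally.",
    "DC fault": "The PSU cannot provide DC voltage to the disk shelf within margin.",
    "AC fault": "The PSU is not turned on or the AC power cord is not plugged in.",
    "Fan fault": "An error occurred with the function of the fan.",
}


def describe_psu_leds(lit_leds):
    """Return the table description for each lit LED, in table order.

    lit_leds is a collection of LED names as printed in the table.
    Unknown names raise ValueError so that typing errors are noticed.
    """
    unknown = set(lit_leds) - set(PSU_LED_MEANINGS)
    if unknown:
        raise ValueError(f"unknown LED name(s): {sorted(unknown)}")
    return [text for name, text in PSU_LED_MEANINGS.items() if name in lit_leds]


if __name__ == "__main__":
    # Hypothetical observation: the AC fault and Fan fault LEDs are lit.
    for line in describe_psu_leds({"AC fault", "Fan fault"}):
        print(line)
```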

Location and meaning of FAS22xx internal FRU LEDs

FAS22xx systems contain LEDs inside the controller that assist in troubleshooting the FRUs inside it.

The following FRUs are in the controller and have LEDs on or near them:

• DIMMs (2)
• RTC battery
• Boot media device
• Mezzanine card

The FRU LEDs remain unlit when the FRU is functioning normally and turn amber when a problem occurs. They stay lit for at least 10 minutes even after you remove the controller from the chassis.

30xx and SA300 system LEDs

30xx and SA300 systems have LEDs that you can check to learn whether the system and its components are turned on and are operating normally.

LEDs are visible on the front and rear of each system and on the power supplies.

Note: 30xx systems do not include 3140, 3160, and 3170 systems, which are referred to collectively as 31xx systems. 30xx systems also do not include 3210, 3240, and 3270 systems, which are referred to collectively as 32xx systems.

Location and meaning of LEDs on the front of 30xx and SA300 controllers

You can check the LEDs on the front of the controller to learn whether the power is turned on, whether there is activity on the controller, whether the system is halted, or whether a fault has occurred.

The following illustration shows the LEDs on the front of the controller.


1 Activity LED

2 Status LED

3 Power LED

The following table explains the meaning of the LEDs.

LED label | Status indicator | Description
Activity | Green | The system is operating and is active.
Activity | Blinking | The system is actively processing data.
Activity | Off | No activity is detected.
Status | Green | The system is operating normally.
Status | Amber | The system halted or a fault occurred. The fault is displayed in the LCD. Note: This LED remains lit during boot, while the operating system loads.
Power | Green | The system is receiving power.
Power | Off | The system is not receiving power.

Location and meaning of LEDs on the back of 30xx and SA300 controllers

You can check the LEDs on the back of the controller to learn the status of the controller network connections.

The following LEDs are visible on the back of the controller:

• FC port LEDs
• GbE port LEDs
• RLM LEDs

The following illustration shows the location of LEDs on the back of the controller.

1 FC port LEDs

2 GbE port LEDs

3 RLM LEDs

The following table explains what the LEDs on the back of the controller mean.

Port type | LED type | Status indicator | Description
FC | LNK | Off | No link with the Fibre Channel is established.
FC | LNK | Green | A link is established.
GbE and RLM | LNK | On | A valid network connection is established.
GbE and RLM | LNK | Off | There is no network connection.
GbE and RLM | ACT | On | There is data activity.
GbE and RLM | ACT | Off | There is no network activity present.

Location and meaning of 30xx and SA300 PSU LEDs

You can check the LEDs on the PSUs to learn whether they are functioning normally.

The following illustration shows the location of the PSU LEDs on the back of the system.

1 PSU 1

2 PSU 2

3 PSU LEDs

The following table explains what the PSU LEDs mean.

AC LED | OK or Status LED | Description
Amber | Green | No fault is indicated.
Off | Off | There is no external power; check the connections and the power source.
Amber | Off | (3020 and 3050 systems) CFE prompt. (3040, 3070, and SA300 systems) The system displays the LOADER> prompt because it has not booted Data ONTAP.
Flashing amber | Amber | There is a power supply fault; replace the power supply.

31xx system LEDs

31xx systems have LEDs that you can check to learn whether the system and its individual components are turned on and are operating normally.

LEDs are visible on the front and rear of each system and on the fan FRUs and the power supplies.

Location and meaning of LEDs on the front of 31xx chassis

You can check the LEDs on the front of the chassis to learn whether the power is turned on, the controller is active, the system is halted, or a fault in the chassis has occurred.

The following illustration shows the LEDs on the front of the chassis.

1 LEDs on the front of the system

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

• Power
• Fault
• Controller A activity
• Controller B activity

Controller A is the controller in the top of the chassis, and Controller B is the controller in the bottom of the chassis.

Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom order:

• Power
• Fault
• Controller A activity
• Controller B activity

The following table explains what the LEDs mean.

LED name | Status indicator | Description
Power | Green | At least one of the two PSUs is delivering power to the system.
Power | Off | Neither PSU is delivering power to the system.
Fault | Amber | The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, or controller. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. Note: You can check the fault light on the back of each controller to see where the problem occurred. Note: The fault light does not come on when you remove the controller from a dual-controller system in an HA pair.
Fault | Off | Both controllers are operating normally.
Activity (label: A/B) | Blinking green | Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity (label: A/B) | Off | Data ONTAP is not running on the controller.

Location and meaning of LEDs on the back of 31xx controllers

You can check the LEDs on the back of the controller to learn the status of its network or disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.

The following LEDs are visible on the back of the controller:

• Ethernet port
• Fault
• Fibre Channel port

The following illustration shows the location of the LEDs on the back of the controller.

The following table explains the behavior of the LEDs on the back of the controller.

Type name | LED type | Status indicator | Description
Ethernet port | Link (left) | Green | A link is established between the port and some upstream device.
Ethernet port | Link (left) | Off | No link is established.
Ethernet port | Activity (right) | Amber | Traffic is flowing over the connection.
Ethernet port | Activity (right) | Off | No traffic is flowing over the connection.
Management port (Ethernet) | Link (left) | Green | A link is established between the port and some upstream device.
Management port (Ethernet) | Link (left) | Off | No link is established.
Management port (Ethernet) | Activity (right) | Amber | Traffic is flowing over the connection.
Management port (Ethernet) | Activity (right) | Off | No traffic is flowing over the connection.
Controller fault | Activity | Amber | The controller is the one causing the front panel LED to be illuminated. Note: This LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
Fibre Channel | Link | Green | A loop connection is established on the port.
Fibre Channel | Link | Off | No loop connection is established on the port.

Location and meaning of 31xx fan LEDs

You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.

When the bezel is removed, the fan module FRUs and their LEDs are visible. The following illustration shows the LED on a fan module FRU.

1 Fan module FRU LED

The fan module FRU LED is amber and turns on when a problem occurs in the fan. If you see error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU LED to locate the FRU where the problem occurred.

Location and meaning of 31xx PSU LEDs

You can check the LEDs on each AC PSU or DC PSU to see whether its power is on and whether the PSU is working properly.

The following illustration shows the location of AC PSU LEDs on the back of the system. DC PSUs have different power connectors, but their LEDs are the same.

1 Fault LED

2 Power LED

The following table describes what the AC PSU and DC PSU LEDs mean.

PSU type | PSU condition | Power LED status | Fault LED status
AC or -48VDC | PSU is present and switched on. Normal mode. | Green | Off
AC or -48VDC | PSU is missing or switched off. The other PSU is off or functioning normally. | Off | Off
AC or -48VDC | PSU fault: AC in or -48VDC is out of range, or there is a DC fault or fan fault. | Off | Blinking amber

Location and meaning of 31xx FRU LEDs

31xx systems have 15 internal LEDs that assist in troubleshooting FRUs.

Eleven LEDs are next to FRUs on the controller board: (up to eight) DIMMs, CompactFlash, RLM, and the RTC battery. When an LED is lit, it indicates that the FRU next to it needs to be replaced.

Four LEDs are on the PCIe riser, one per PCIe slot. When one of the LEDs is lit, it indicates that there is a problem with the card in that particular PCIe slot.

The FRU LEDs stay lit for at least 10 minutes even after you remove the controller from the system.

32xx and SA320 system LEDs

32xx and SA320 systems have LEDs that you can check to learn whether the system and its individual components are turned on and are operating normally.

LEDs are visible on the front of the chassis, on the back of controllers and I/O expansion modules, and on fan FRUs and power supplies.

Location and meaning of LEDs on the front of 32xx and SA320 chassis

You can check the LEDs on the front of the chassis to learn whether the power is turned on, the controller is active, the system is halted, or a fault in the chassis has occurred.

The following illustration shows the LEDs on the front of the chassis.

1 LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

• Power
• Fault
• Controller A activity
• Controller B activity

When two controllers are installed in the chassis, Controller A is the controller in the top bay, and Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are installed in the chassis, the controller is always in the top bay and the I/O expansion module is always in the bottom bay.

The following table explains what the LEDs mean.

LED name | Status indicator | Description
Power | Green | Power is being supplied to the system.
Power | Off | No power is being supplied to the system.
Fault | Amber | The system halted, or a fault occurred in the chassis.
Fault | Off | The controllers are operating normally, or the controller and the I/O expansion module are operating normally.
Controller A/B | Blinking green | Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity. Note: If an I/O expansion module is installed in the chassis, the corresponding controller activity LED is not lit.
Controller A/B | Off | Data ONTAP is not running on the controller.

Location and meaning of LEDs on the back of 32xx and SA320 controllers

You can check the LEDs on the back of the controller to learn the status of its network or disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.

The following illustration shows the ports and LEDs on the back of the controller.

1 SAS port LEDs

2 SAS ports

3 HA port LEDs (LEDs pointing up belong to the upper port; LEDs pointing down belong to the lower port.)

4 HA ports

5 Fibre Channel port LEDs (LED pointing up belongs to the upper port; LED pointing down belongs to the lower port.)

6 Fibre Channel ports

7 1-GbE port LEDs

8 1-GbE ports

9 Management Ethernet 10/100 Mb port LEDs

10 Private management 10/100 Mb Ethernet port

11 USB (top) and serial console (bottom) ports (External USB devices are not currently supported.)

12 Controller fault LED

13 NVMEM LED

The following table describes the meaning of the LEDs on the back of the controller.

Name | Type | Status indicator | Description
Serial attached SCSI (SAS) | Link | Green | Link is established on at least 1 external SAS lane.
Serial attached SCSI (SAS) | Link | Off | No link is established on any external SAS lane.
Fibre Channel | Link | Green | A connection is established on the port.
Fibre Channel | Link | Off | No connection is established on the port.
Ethernet | Link | Green | A link is established between the port and some upstream device.
Ethernet | Link | Off | No link is established.
Ethernet | Activity | Amber | Traffic is flowing over the connection.
Ethernet | Activity | Off | No traffic is flowing over the connection.
Remote management | Link | Green | A link is established between the port and some upstream device.
Remote management | Link | Off | No link is established.
Remote management | Activity | Amber | Traffic is flowing over the connection.
Remote management | Activity | Off | No traffic is flowing over the connection.
Private management | Link | Green | A link is established between the port and a downstream disk shelf.
Private management | Link | Off | No link is established.
Private management | Activity | Amber | Traffic is flowing over the connection.
Private management | Activity | Off | No traffic is flowing over the connection.
Controller fault | Activity | Amber | A problem has occurred in the controller. This in turn has caused the system fault LED on the front of the chassis to be illuminated. Note: The LED might be illuminated on both controllers.
Controller fault | Activity | Off | The controller is functioning properly.
NVMEM status (label: NVMEM) | - | Blinking green | NVMEM is in battery-backed standby mode.
NVMEM status (label: NVMEM) | - | Off (power on) | The system is running normally, and NVMEM is armed if Data ONTAP is running.

Location and meaning of the LED on the back of 32xx and SA320 I/O expansion modules

You can check the back of the I/O expansion module to detect whether a fault has occurred.

The following illustration shows the ports and LEDs on the back of an I/O expansion module.

1 PCIe slots (labeled 3, 4, 5, and 6)

2 Fault LED

The following table describes the meaning of the LED on the I/O expansion module.

Name | Type | Status indicator | Description
I/O expansion module fault | Activity | Amber | A fault has occurred.
I/O expansion module fault | Activity | Off | The I/O expansion module is functioning normally.

Location and meaning of 32xx and SA320 fan LEDs

You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.

When the bezel is removed, the fan module FRUs and their LEDs are visible. The following illustration shows the LED on a fan module FRU.

1 LED

The fan module FRU LED is amber and illuminates when a problem occurs in the fan. If you see error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU LED to locate the FRU where the problem occurred.

Location and meaning of 32xx and SA320 PSU LEDs

You can check the LEDs on each PSU to see whether its power is on and whether the PSU is working properly.

The following illustration shows the location of PSU LEDs on the back of the system.

1 Fault LED

2 Power LED

The following table describes what the PSU LEDs mean.

Power LED status | Fault LED status | PSU condition
Green | Off | PSU is present and switched on. Normal mode.
Off | Off | PSU is missing or switched off. The other PSU is off or functioning normally.
Off | Blinking amber | PSU fault: AC in is out of range, or there is a DC fault or fan fault.
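Because each PSU condition corresponds to one combination of the two LEDs, the table above maps naturally onto a keyed lookup. The following Python sketch shows one way to encode it for a site checklist or runbook script. The names `PSU_CONDITIONS` and `psu_condition` are hypothetical, and the fallback message for unlisted combinations is an assumption, because the table does not describe them.

```python
# (Power LED status, Fault LED status) -> PSU condition, from the table above.
PSU_CONDITIONS = {
    ("green", "off"): "PSU is present and switched on. Normal mode.",
    ("off", "off"): (
        "PSU is missing or switched off. "
        "The other PSU is off or functioning normally."
    ),
    ("off", "blinking amber"): (
        "PSU fault: AC in is out of range, or there is a DC fault or fan fault."
    ),
}


def psu_condition(power_led, fault_led):
    """Look up the PSU condition for one observed LED combination."""
    key = (power_led.strip().lower(), fault_led.strip().lower())
    # Combinations that the table does not list are reported as such
    # instead of being guessed.
    return PSU_CONDITIONS.get(key, "Combination not listed in the table.")


if __name__ == "__main__":
    print(psu_condition("Off", "Blinking amber"))
```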

Location and meaning of 32xx and SA320 internal FRU LEDs

32xx systems contain LEDs inside the controller and I/O expansion module that assist in troubleshooting the FRUs inside them.

The following FRUs are in the controller and have LEDs on or near them:

• DIMMs (up to 4)
• RTC battery
• USB device
• PCIe slots (2)

The I/O expansion module has four PCIe slots, each with an LED.

The FRU LEDs remain unlit when the FRU is functioning normally and turn amber when a problem occurs. They stay lit for at least 10 minutes even after you remove the controller or I/O expansion module from the chassis.

60xx and SA600 system LEDs

60xx and SA600 systems have LEDs that you can check to learn whether the system and its components are turned on and are operating normally.

LEDs are visible on the front and rear of each system, and on the fan FRUs and the power supplies.

Location and meaning of LEDs on the front of 60xx and SA600 controllers

You can check the LEDs on the front of the controller to learn whether the power is turned on, whether the system is active, whether the system is halted, or whether there is a fault in the chassis.

The following illustration shows the LEDs on the front of the controller.

1 Activity LED

2 Status LED

3 Power LED

The following table explains what the LEDs on the front of the controller mean.

LED label | Status indicator | Description
Activity | Green | The system is operating and is active.
Activity | Blinking | The system is actively processing data.
Activity | Off | No activity is detected.
Status | Green | The system is operating normally.
Status | Amber | The system halted or a fault occurred. The fault is displayed in the LCD. Attention: The LED remains lit during boot, while the operating system loads.
Power | Green | The system is receiving power.
Power | Off | The system is not receiving power.

Location and meaning of LEDs on the back of 60xx and SA600 controllers

You can check the LEDs on the back of the controller to learn the status of network and disk shelf connections.

The following illustration shows the location of LEDs on the back of the controller.

1 GbE port LEDs

2 RLM port LEDs

3 Fibre Channel port LEDs

The following table explains what the LEDs on the rear of the controller mean.

Port type | LED type | Status indicator | Description
Fibre Channel | LNK (Green) | Off | No link with the Fibre Channel is established.
Fibre Channel | LNK (Green) | Blinking (6030 and 6070 systems) | A link is established and communication is happening.
Fibre Channel | LNK (Green) | Solid (6040, 6080, and SA600 systems) | A link is established and communication is happening.
GbE and RLM | LNK | On | A valid network connection is established.
GbE and RLM | LNK | Off | There is no network connection.
GbE and RLM | ACT | On | There is data activity.
GbE and RLM | ACT | Off | There is no network activity present.

Location and meaning of 60xx and SA600 fan LEDs

You can check the fan LEDs to learn whether the fan is functioning properly.

The following illustration shows the location of the fan LEDs, which you can see when you remove the bezel from the system.

1 Fan

2 LEDs

The following table describes the behavior of the fan LEDs.

LED status | Description
Orange blinking | The fan failed.
Off | There is no power to the system, or the fan is operational.

Location and meaning of 60xx and SA600 PSU LEDs

You can check the LEDs to learn whether the PSUs are providing power to your system and whether they are functioning properly.

The following illustration shows the location of the PSU LEDs on your system.

1 LEDs

2 Power supply

The following table explains what the PSU LEDs mean.

Amber (AC input) | Green (PSU status) | Description | Corrective action
On | On | The AC power source is good, and the PSU is providing power to the system. | N/A
On | Off | AC power is present, but the PSU is not delivering power to the system. | Ensure that the PSU is properly seated and that its cables are connected and secure.
On | Blinking | AC power is present, but the power supply is not enabled. | 1. Log in to the RLM and enter the following command: system power on. Note: Using the system power command might cause an improper shutdown of the storage system. During power-cycling, a brief pause occurs before power is turned back on. 2. If the problem persists, contact technical support.
Off | Off | AC power is either not present or not within operational limits. | Check the AC switch, AC power cable, and upstream circuit breakers.
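The table above pairs every LED combination with a corrective action, so it can serve as a small decision table. The following Python sketch encodes it for use in a runbook or checklist script. The names `PsuDiagnosis`, `PSU_TABLE`, and `diagnose_psu` are hypothetical, the corrective-action strings are condensed from the table, and the sketch does not contact the RLM or read any hardware state.

```python
from typing import NamedTuple


class PsuDiagnosis(NamedTuple):
    description: str
    corrective_action: str


# (amber AC-input LED, green PSU-status LED) -> diagnosis, from the table above.
PSU_TABLE = {
    ("on", "on"): PsuDiagnosis(
        "The AC power source is good, and the PSU is providing power to the system.",
        "N/A",
    ),
    ("on", "off"): PsuDiagnosis(
        "AC power is present, but the PSU is not delivering power to the system.",
        "Ensure that the PSU is properly seated and that its cables are "
        "connected and secure.",
    ),
    ("on", "blinking"): PsuDiagnosis(
        "AC power is present, but the power supply is not enabled.",
        "Log in to the RLM and enter: system power on. "
        "If the problem persists, contact technical support.",
    ),
    ("off", "off"): PsuDiagnosis(
        "AC power is either not present or not within operational limits.",
        "Check the AC switch, AC power cable, and upstream circuit breakers.",
    ),
}


def diagnose_psu(amber_ac_input, green_psu_status):
    """Return the diagnosis for one observed LED combination.

    Raises KeyError for combinations that the table does not cover.
    """
    key = (amber_ac_input.strip().lower(), green_psu_status.strip().lower())
    if key not in PSU_TABLE:
        raise KeyError(f"LED combination not covered by the table: {key}")
    return PSU_TABLE[key]


if __name__ == "__main__":
    result = diagnose_psu("On", "Blinking")
    print(result.description)
    print(result.corrective_action)
```

Before acting on the "system power on" step, read the note in the table about the risk of an improper shutdown.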

62xx and SA620 system LEDs

62xx and SA620 systems have LEDs that you can check to learn whether the system and its individual components are turned on and are operating normally.

LEDs are visible on the front of the chassis, the rear of controllers and I/O expansion modules, and on fan FRUs and power supplies.

Location and meaning of LEDs on the front of 62xx and SA620 chassis

You can check the LEDs on the front of the chassis to learn whether the power is turned on, the controller is active, the system is halted, or a fault in the chassis has occurred.

The following illustration shows the LEDs on the front of the 62xx and SA620 chassis.

1 Chassis LEDs

When the bezel is in place, the LEDs are arranged horizontally in the following left-to-right order:

• Power
• Fault
• Controller A activity
• Controller B activity

When two controllers are installed in the chassis, Controller A is the controller in the top bay, and Controller B is the controller in the bottom bay. When a controller and an I/O expansion module are installed in the chassis, the controller is always in the top bay and the I/O expansion module is always in the bottom bay.

Note: When the bezel is removed, the LEDs are arranged vertically in the following top-to-bottom order:

• Power
• Fault
• Controller A activity
• Controller B activity

The following table explains what the LEDs mean.

LED name | Status indicator | Description
Power | Green | At least one of the two PSUs is delivering power to the system.
Power | Off | Neither PSU is delivering power to the system.
Fault | Amber | The system halted or a fault occurred in the chassis. The error might be in a PSU, fan, controller, or I/O expansion module. The LED also is lit when there is a FRU failure, Data ONTAP is not running on a controller, or the system is in Maintenance mode. Note: You can check the fault light on the back of each controller to see where the problem occurred. Note: The fault light does not come on when you remove the controller from a dual-controller system in an HA pair.
Fault | Off | The system is operating normally.
Activity | Blinking green | Data ONTAP is running on the controller. The length of time that the light remains on is proportional to the controller's activity.
Activity | Off | Data ONTAP is not running on the controller.

Location and meaning of LEDs on the back of 62xx and SA620 controllers

You can check the LEDs on the back of the controller to learn the status of its network or disk shelf connections, or, in an HA pair, to identify the controller where a fault occurred.

The following illustration shows the LEDs on the left side of the back of the 62xx and SA620 controllers.

1 Remote management port LEDs

2 Private management port LEDs

3 GbE port LEDs

4 Controller fault LED

5 Remote management port

6 Private management port

7 GbE port

8 10-GbE ports

9 10-GbE port LEDs

The following table describes the meaning of the LEDs on the left side of the back of the controller.

LED name | LED type | Status indicator | Description
Fault | Activity | Amber | The controller is the one causing the front panel fault LED to be illuminated. Note: The LED might be illuminated on both controllers.
Fault | Activity | Off | The controller is functioning properly.
Remote management | Link (Left) | Green | A link is established between the port and some upstream device.
Remote management | Link (Left) | Off | No link is established.
Remote management | Activity (Right) | Amber | Traffic is flowing over the connection.
Remote management | Activity (Right) | Off | No traffic is flowing over the connection.
Private management | Link (Left) | Green | A link is established between the port and a downstream disk shelf.
Private management | Link (Left) | Off | No link is established.
Private management | Activity (Right) | Amber | Traffic is flowing over the connection.
Private management | Activity (Right) | Off | No traffic is flowing over the connection.
GbE (label includes the port number) | Link (Left) | Green | A link is established between the port and some upstream device.
GbE (label includes the port number) | Link (Left) | Off | No link is established.
GbE (label includes the port number) | Activity (Right) | Amber | Traffic is flowing over the connection.
GbE (label includes the port number) | Activity (Right) | Off | No traffic is flowing over the connection.
10 GbE (label includes the port number) | Activity (Top) | Amber | Traffic is flowing over the connection.
10 GbE (label includes the port number) | Activity (Top) | Off | No traffic is flowing over the connection.
10 GbE (label includes the port number) | Link (Bottom) | Green | A link is established between the port and some upstream device.
10 GbE (label includes the port number) | Link (Bottom) | Off | No link is established.

The following illustration shows the location of ports and LEDs on the right side of the back of the controller.

1 USB port

2 8-Gb Fibre Channel port LED

3 8-Gb Fibre Channel ports

4 Console port

The following table describes the meaning of the LEDs on the right of the back of the controller.

LED name | LED type | Status indicator | Description
8-Gb Fibre Channel (label includes the port number) | Link | Green | A connection is established on the port.
8-Gb Fibre Channel (label includes the port number) | Link | Off | No connection is established on the port.

Location and meaning of the 62xx and SA620 I/O expansion module LED

You can check the back of the I/O expansion module to see whether a fault has occurred.

The following illustration shows the ports and LEDs on the back of a 62xx and SA620 I/O expansion module.

1 Fault LED

2 PCIe slots

3 Vertical I/O slots

The following table describes the meaning of the LED on the I/O expansion module.

LED label | LED name | LED type | Status indicator | Description
(symbol) | Fault | Activity | Amber | A fault has occurred.
(symbol) | Fault | Activity | Off | The I/O expansion module is operating properly.

Location and meaning of 62xx and SA620 fan LEDs

You can check the LED on each fan module FRU to pinpoint problems that can occur in the FRU.

When the bezel is removed, the fan module FRUs and their LEDs are visible. The following illustration shows the LED on a fan module FRU.

1 LED

The fan module FRU LED is amber and turns on when a problem occurs in the fan. If you see error messages indicating a fan problem, you can remove the bezel and use the illuminated fan FRU LED to locate the FRU where the problem occurred.

Location and meaning of 62xx and SA620 PSU LEDs

You can check the LEDs on each PSU to see whether its power is on and whether the PSU is working properly.

The following illustration shows the location of PSU LEDs on the back of the system.

1 Fault LED

2 Power LED

The following table describes what the PSU LEDs mean.

Power LED status | Fault LED status | PSU condition
Green | Off | PSU is present and switched on. Normal mode.
Off | Off | PSU is switched off.
Off | Blinking amber | PSU fault: AC in is out of range, or there is a DC fault or fan fault.
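The PSU table can be read as a lookup keyed on the two LED states. The following is a minimal sketch of that reading; the function name `psu_condition` and the state strings are illustrative assumptions and do not correspond to any NetApp tool. Combinations that the table does not list are reported as undocumented rather than guessed.

```python
# Hypothetical helper (not a NetApp API): maps the observed
# (power LED, fault LED) pair to the PSU condition from the table.
PSU_CONDITIONS = {
    ("green", "off"): "PSU is present and switched on (normal mode)",
    ("off", "off"): "PSU is switched off",
    ("off", "blinking amber"): "PSU fault: AC in out of range, DC fault, or fan fault",
}


def psu_condition(power_led: str, fault_led: str) -> str:
    """Return the documented PSU condition, or a marker for unlisted states."""
    key = (power_led.strip().lower(), fault_led.strip().lower())
    return PSU_CONDITIONS.get(key, "not documented in the table")


if __name__ == "__main__":
    print(psu_condition("Green", "Off"))
    print(psu_condition("Off", "Blinking amber"))
```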

Location and meaning of 62xx and SA620 internal FRU LEDs

62xx and SA620 systems contain LEDs near FRUs inside the controller and I/O expansion module that assist in troubleshooting the FRUs.

The following FRU LEDs are in the controller:

• DIMMs (up to 12)
• RTC battery
• USB boot device
• PCIe slots
• 10-GbE slot
• I/O slots (2)

The following FRU LEDs are in the I/O expansion module:

• PCIe slots
• I/O slots

FRU LEDs are off when the FRU is functioning normally and turn amber when a problem occurs. They stay lit for at least 10 minutes even after you remove the controller or I/O expansion module from the chassis.

HBA LEDs

HBAs have LEDs that you can check to learn whether the adapter has power, whether a link is established, or whether an error has occurred.

Storage systems might have Fibre Channel or iSCSI host bus adapters installed and configured on them.

Location and meaning of dual-port Fibre Channel HBA LEDs

You can check the LEDs on the HBA to learn the status of the Fibre Channel connection.

The following illustration shows the location of the LEDs on a dual-port Fibre Channel HBA.

1 Green LED

2 Amber LED

The following table explains what the LEDs on a dual-port Fibre Channel HBA mean.

Green | Amber | Description
On | On | The power is on.
Off | Blinking | Sync is lost.
Off | On | Signal is acquired.
On | Off | Ready.
Flashing | Off | 4 seconds solid green followed by one flash: 1-Gb link speed. 4 seconds solid green followed by two flashes: 2-Gb link speed.
Flashing | Blinking | Adapter firmware error has been detected.

Location and meaning of dual-port, 4-Gb or 8-Gb, target-mode Fibre Channel HBA LEDs

You can check the LEDs to learn whether the HBA power is on, whether a firmware error has been detected, and whether a link has been established.

The following illustration shows the location of LEDs on a dual-port, 4-Gb or 8-Gb, target-mode Fibre Channel HBA.

1 Amber

2 Green

3 Yellow

4 Port a

5 Port b

6 Yellow

7 Green

8 Amber

9 TX

10 RX

11 TX

12 RX

The following table explains what the LEDs mean.

Yellow | Green | Amber | Description
Off | Off | Off | Power is off.
On | On | On | Power is on, before firmware initialization.
Blinking | Blinking | Blinking | Power is on, after firmware initialization.
Blinking alternately | Blinking alternately | Blinking alternately | A firmware error is detected.
Off | Off | On/blinking | 4-Gb HBA: 1-Gbps link/I/O is established. 8-Gb HBA: On for 2-Gbps link up. If there is I/O activity, the LED blinks several times per second.
Off | On/blinking | Off | 4-Gb HBA: 2-Gbps link/I/O is established. 8-Gb HBA: On for 4-Gbps link up. If there is I/O activity, the LED blinks several times per second.
On/blinking | Off | Off | 4-Gb HBA: 4-Gbps link/I/O is established. 8-Gb HBA: On for 8-Gbps link up. If there is I/O activity, the LED blinks several times per second.
Blinking | Off | Blinking | Beacon.

Location and meaning of dual-port, 8-Gb Fibre Channel Virtual Interface HBA LEDs

You can check the LEDs to learn whether the HBA power is on, whether a firmware error has been detected, and whether a link has been established.

The following illustration shows the location of LEDs on the dual-port, 8-Gb Fibre Channel Virtual Interface HBA.

1 Amber LED

2 Green LED

3 Yellow LED

4 Port a

5 Port b

6 Transmitter port

7 Receiver port

The following table explains what the LEDs mean.

Yellow | Green | Amber | Description
Off | Off | Off | Power off
On | On | On | Power on, before firmware initialization
Blinking | Blinking | Blinking | Power on, after firmware initialization
Blinking alternately | Blinking alternately | Blinking alternately | Firmware error
Off | Off | On/blinking | Online, 2 Gbps link/I/O activity
Off | On/blinking | Off | Online, 4 Gbps link/I/O activity
On/blinking | Off | Off | Online, 8 Gbps link/I/O activity
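In the online rows of this table, exactly one of the three LEDs is on or blinking, and which one it is gives the link speed. The sketch below encodes only that observation; `link_speed_gbps` and its state strings ("off", "on", "blinking") are hypothetical names chosen for illustration, and the power and firmware patterns are deliberately returned as `None`.

```python
from typing import Optional

ACTIVE = {"on", "blinking"}


def link_speed_gbps(yellow: str, green: str, amber: str) -> Optional[int]:
    """Return 2, 4, or 8 for the online rows of the table, else None."""
    states = [s.strip().lower() for s in (yellow, green, amber)]
    # Reject anything other than the three state names used in this sketch.
    if any(s not in ACTIVE and s != "off" for s in states):
        return None
    active = [s in ACTIVE for s in states]
    if active.count(True) != 1:
        # Power off, initialization, and firmware-error rows land here.
        return None
    # Column order is yellow, green, amber, which maps to 8, 4, 2 Gbps.
    return (8, 4, 2)[active.index(True)]


if __name__ == "__main__":
    print(link_speed_gbps("off", "blinking", "off"))
```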

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: four-LED version

You can check the LEDs on the HBA to learn the status of the storage system Fibre Channel link and whether data is being transferred.

The following illustration shows the location of LEDs.


1 Port A (as identified by Data ONTAP)

2 Port B (as identified by Data ONTAP)

3 Port C (as identified by Data ONTAP)

4 Port D (as identified by Data ONTAP)

5 Port A LED

6 Port C LED

7 Port B LED

8 Port D LED

The following table describes what the LEDs mean.

LED label | Status indicator | Description
By port letter | White | There is a loss of sync or no link.
By port letter | Blinking white | There is a fault.
By port letter | Amber | 1-Gbps link is established.
By port letter | Blinking amber | 1-Gbps data transfer is taking place.
By port letter | Green | 2-Gbps link is established.
By port letter | Blinking green | 2-Gbps data transfer is taking place.
By port letter | Blue | 4-Gbps link is established.
By port letter | Blinking blue | 4-Gbps data transfer is taking place.
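On the four-LED version, each port LED encodes two things at once: its color gives the link speed, and blinking indicates data transfer (or, for white, a fault). A minimal sketch of that decoding follows; `describe_port_led` is a hypothetical helper written for this illustration, not part of Data ONTAP.

```python
# Illustrative decoding of the per-port LED on the four-LED version.
SPEED_BY_COLOR = {"amber": 1, "green": 2, "blue": 4}


def describe_port_led(color: str, blinking: bool) -> str:
    """Translate LED color and blink state into the table's description."""
    color = color.strip().lower()
    if color == "white":
        return "fault" if blinking else "loss of sync or no link"
    if color in SPEED_BY_COLOR:
        speed = SPEED_BY_COLOR[color]
        state = "data transfer in progress" if blinking else "link established"
        return f"{speed}-Gbps {state}"
    return "not documented in the table"


if __name__ == "__main__":
    print(describe_port_led("blue", blinking=True))
```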

Location and meaning of quad-port, 4-Gb, Fibre Channel HBA LEDs: 12-LED version

You can check the LEDs on the HBA to learn the status of the Fibre Channel connection and whether data is being transferred.

The following illustration shows the location of LEDs.

1 Port A (as identified in Data ONTAP)

2 Port B (as identified in Data ONTAP)


3 Port C (as identified in Data ONTAP)

4 Port D (as identified in Data ONTAP)

5 Ports A through D yellow LEDs

6 Ports A through D green LEDs

7 Ports A through D amber LEDs

The following table describes what the LEDs mean.

Yellow LEDs | Green LEDs | Amber LEDs | Description
Off | Off | Off | The power is off.
On | On | On | The power is on (before firmware initialization).
Blinking | Blinking | Blinking | The power is on (after firmware initialization).
Blinking alternately | Blinking alternately | Blinking alternately | A firmware error is detected.
Off | Off | On | 1-Gbps link is established.
Off | Off | Blinking | 1-Gbps data transfer is taking place.
Off | On | Off | 2-Gbps link is established.
Off | Blinking | Off | 2-Gbps data transfer is taking place.
On | Off | Off | 4-Gbps link is established.
Blinking | Off | Off | 4-Gbps data transfer is taking place.

Location and meaning of fiber-optic iSCSI target HBA LEDs

You can check the LEDs on the HBA to learn whether the HBA is on, whether it is connected to the network, and whether there is data activity.

The following illustration shows the location of LEDs on a fiber optic, iSCSI, target HBA.


1 LINK LED

2 ACT LED

3 Port 2

4 Port 1

The following table explains what the LEDs on a fiber optic, iSCSI, target HBA mean.

LED label | Status indicator | Description
LINK | Yellow | The HBA is on and connected to the network.
LINK | Off | The HBA is not connected to the network.
ACT | Green | A connection is established.
ACT | Blinking green | There is data activity.

Location and meaning of copper iSCSI target HBA LEDs

You can check the HBA LEDs to learn whether the HBA is running at 1 Gbps, whether a connection is established, and whether there is data activity.

The following illustration shows the location of LEDs on a copper iSCSI target HBA.


1 Speed LED

2 ACT LED

3 Port 2

4 Port 1

The following table explains what the LEDs on a copper iSCSI target HBA mean.

LED label | Status indicator | Description
Speed | Green | The HBA is running at 1 Gbps.
Speed | Off | The HBA is not running at 1 Gbps.
ACT | Amber | A connection is established.
ACT | Blinking amber | There is data activity.

Location and meaning of dual-port, 10-Gb, FCoE unified target HBA LEDs

You can check the LEDs on the HBA to learn about SAN or LAN traffic over the HBA and the status of the HBA and the connection.

The following illustration shows the location of LEDs on a dual-port, 10-Gb, FCoE (Fibre Channel over Ethernet) HBA.

1 One of two LAN LEDs

2 One of two SAN LEDs

3 Port a

4 Port b

5 One of two transmitter ports

6 One of two receiver ports

The ports in the preceding illustration are labeled a and b because Data ONTAP identifies ports alphabetically. The physical ports are labeled Port 1 for Port a and Port 2 for Port b.

Note: These HBAs are supported only in target mode and single system image controller failover cfmode. You cannot use this HBA as an initiator to connect to disks or tape, and you cannot use it for Fabric MetroCluster interconnect configurations.

The following table explains what the LEDs on a dual-port, 10-Gb, FCoE HBA mean.

Port | SAN traffic green LED | LAN traffic green LED | Hardware state
a | Off | Off | Power off.
a | Slow flashing (unison) | Slow flashing (unison) | Power on/no link.
a | On | On | Power on/link established, no activity.
a | On | Flashing | Power on/link established, Rx/Tx Ethernet activity only.
a | Flashing | On | Power on/link established, Rx/Tx storage activity only.
a | Flashing | Flashing | Power on/link established, Rx/Tx Ethernet and storage activity.
a | Slow flashing, alternating with other LED | Slow flashing, alternating with other LED | Beaconing.
b | Off | Off | Power off.
b | Slow flashing (unison) | Slow flashing (unison) | Power on/no link.
b | On | On | Power on/link established, no activity.
b | On | Flashing | Power on/link established, Rx/Tx Ethernet activity only.
b | Flashing | On | Power on/link established, Rx/Tx storage activity only.
b | Flashing | Flashing | Power on/link established, Rx/Tx Ethernet and storage activity.
b | Slow flashing, alternating with other LED | Slow flashing, alternating with other LED | Beaconing.
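Because ports a and b follow the same pattern, the table reduces to one lookup on the pair of SAN and LAN LED states. The sketch below shows this; the function name `fcoe_port_state` and the short state tokens ("slow-unison", "slow-alternating") are assumptions introduced here for illustration only.

```python
# Hypothetical lookup for one FCoE port; keys are (SAN LED, LAN LED).
FCOE_STATES = {
    ("off", "off"): "Power off",
    ("slow-unison", "slow-unison"): "Power on, no link",
    ("on", "on"): "Power on, link established, no activity",
    ("on", "flashing"): "Power on, link established, Rx/Tx Ethernet activity only",
    ("flashing", "on"): "Power on, link established, Rx/Tx storage activity only",
    ("flashing", "flashing"): "Power on, link established, Rx/Tx Ethernet and storage activity",
    ("slow-alternating", "slow-alternating"): "Beaconing",
}


def fcoe_port_state(san_led: str, lan_led: str) -> str:
    """Return the hardware state for one port, or a marker if unlisted."""
    key = (san_led.strip().lower(), lan_led.strip().lower())
    return FCOE_STATES.get(key, "not documented in the table")


if __name__ == "__main__":
    print(fcoe_port_state("flashing", "on"))
```

A flashing SAN LED with a steady LAN LED therefore points to storage traffic only, which is the case the usage line prints.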


Location of dual-port, 3-Gb SAS HBA ports

Dual-port, 3-Gb SAS HBAs do not have LEDs that you can monitor.

The following illustration shows the location of ports on a dual-port, 3-Gb SAS HBA and its cable.

1 Port A

2 Port B

3 QSFP-to-Mini-SAS copper cable–Mini-SAS connector (to card)

4 QSFP-to-Mini-SAS copper cable–QSFP connector (to disk shelf)

Location of quad-port, 3-Gb SAS HBA ports

Quad-port, 3-Gb SAS HBAs do not have LEDs that you can monitor.

The following illustration shows the location of ports on a quad-port, 3-Gb SAS HBA and its cable.

1 Port A

2 Port B

3 Port C

4 Port D

5 SAS QSFP-to-QSFP copper cable

MetroCluster adapter LEDs

MetroCluster adapters have LEDs that you can check to learn whether the adapter has power and whether an error has occurred.

Location and meaning of dual-port, 2-Gb VI-MetroCluster adapter LEDs

You can check the LEDs on the adapter to learn whether the power is on, whether a signal has been acquired, or whether an error has occurred.

The following illustration shows the location of LEDs on a dual-port 2-Gb VI-MetroCluster adapter.


1 One of two amber LEDs

2 One of two green LEDs

3 Port A

4 Port B

5 One of two transmitter ports

6 One of two receiver ports

The following table explains what the LEDs mean.

Green | Amber | Description
Off | Off | Power is off.
On | On | Power is on.
Off | Blinking at half-second intervals | Synchronization has been lost.
Off | On | A signal has been acquired.
On | Off | Adapter is online.
Blinking at half-second intervals | Blinking at half-second intervals | A system error has occurred.

Location and meaning of dual-port, 4-Gb MetroCluster adapter LEDs

You can check the LEDs on the adapter to learn whether power is on, whether there is activity, or whether an error has occurred.

The following illustration shows the LEDs on the dual-port, 4-Gb MetroCluster adapter.

1 Amber LED

2 Green LED

3 Yellow LED

4 Port a

5 Port b

6 Yellow LED


7 Green LED

8 Amber LED

9 Transmitter port

10 Receiver port

11 Transmitter port

12 Receiver port

The following table describes what the LEDs mean.

Yellow | Green | Amber | Description
Off | Off | Off | Power is off.
On | On | On | Power is on, before firmware initialization.
Blinking | Blinking | Blinking | Power is on, after firmware initialization.
Blinking alternately | Blinking alternately | Blinking alternately | A firmware error has occurred.
Off | Off | On/blinking | Online, 1 Gbps link/I/O activity.
Off | On/blinking | Off | Online, 2 Gbps link/I/O activity.
On/blinking | Off | Off | Online, 4 Gbps link/I/O activity.

Location and meaning of dual-port, 8-Gb MetroCluster adapter LEDs

You can check the LEDs on the adapter to learn whether power is on, whether there is activity, or whether an error has occurred.

The following illustration shows the LEDs on the dual-port, 8-Gb MetroCluster adapter.

1 Amber LED

2 Green LED

3 Yellow LED

4 Port a

5 Port b

6 Transmitter port

7 Receiver port

The following table describes what the LEDs mean.

Yellow | Green | Amber | Description
Off | Off | Off | Power off
On | On | On | Power on, before firmware initialization
Blinking | Blinking | Blinking | Power on, after firmware initialization
Blinking alternately | Blinking alternately | Blinking alternately | Firmware error
Off | Off | On/blinking | Online, 2 Gbps link/I/O activity
Off | On/blinking | Off | Online, 4 Gbps link/I/O activity
On/blinking | Off | Off | Online, 8 Gbps link/I/O activity

GbE NIC LEDs

Gigabit Ethernet NICs have LEDs that you can check to learn the status of the Ethernet connection and, in some cases, transfer speeds.

The GbE NICs in your system might be fiber optic-based or copper-based. They might have one,two, or four ports.

Location and meaning of single-port GbE NIC LEDs

You can check the LEDs on your single-port copper or fiber GbE NIC to learn whether there is a network connection and whether there is data activity. On copper GbE NICs, you also can learn how fast data is being transmitted.

The following illustration shows the location of LEDs on copper and fiber single-port GbE NICs.


1 Copper 10Base-T/100Base-TX/1000Base-T NIC

2 Fiber 1000Base-SX NIC

The following table explains what the LEDs on single-port copper GbE NICs mean.

LED type | Status indicator | Description
ACT/LNK | Green | A valid network connection is established.
ACT/LNK | Blinking green or blinking amber | There is data activity.
ACT/LNK | Off | There is no network connection.
10=OFF | Off | Data transmits at 10 Mbps.
100=GRN | Green | Data transmits at 100 Mbps.
1000=YLW | Yellow | Data transmits at 1000 Mbps.
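On a copper NIC the speed LED is also off at 10 Mbps, so it is only meaningful together with the ACT/LNK LED. The sketch below combines the two; treating an unlit ACT/LNK LED as "speed unknown" is an interpretation of the table rather than something it states, and `copper_speed_mbps` is a hypothetical name.

```python
from typing import Optional

# Speed LED color to transfer rate, as listed in the copper NIC table.
SPEED_MBPS = {"off": 10, "green": 100, "yellow": 1000}


def copper_speed_mbps(act_lnk_led: str, speed_led: str) -> Optional[int]:
    """Return the transfer rate in Mbps, or None when no link is shown."""
    if act_lnk_led.strip().lower() == "off":
        # Assumption: without a link, the speed LED carries no information.
        return None
    return SPEED_MBPS.get(speed_led.strip().lower())


if __name__ == "__main__":
    print(copper_speed_mbps("green", "yellow"))
```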

The following table explains what the LEDs on single-port fiber GbE NICs mean.

LED type | Status indicator | Description
LNK | On | A valid network connection is established.
LNK | Off | There is no network connection.
ACT | On | There is data activity.
ACT | Off | There is no network activity present.

Location and meaning of single-port, 10-GbE NIC LEDs (FAS2050 systems only)

You can check the LEDs on your single-port, 10-GbE NIC to learn whether there is a network connection and whether there is data activity. This NIC is used only in FAS2050 systems.

The following illustration shows the location of LEDs on the single-port, 10-GbE NIC.

1 LINK/ACT LED

2 Port A


The following table explains what the LEDs on the single-port 10-Gb NIC mean.

LED label | Status indicator | Description
LINK/ACT | Green | A valid network connection is established.
LINK/ACT | Blinking amber | There is data activity.
LINK/ACT | Off | There is no network connection present.

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports fiber optic cables with SFP+ modules or copper SFP+ cables

You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and SFP+ optical modules or copper SFP+ cables to learn whether there is a network connection and whether there is data activity.

The following illustration shows the location of LEDs and ports on the NIC.

1 LINK/ACT LED for Port A

2 LINK/ACT LED for Port B


3 Port A

4 Port B

5 SFP module latches

The following table explains what the LEDs on the NIC mean.

LED label | Status indicator | Description
LINK/ACT | Green | A valid network connection is established.
LINK/ACT | Blinking amber | There is data activity.
LINK/ACT | Off | There is no network connection present.

Location and meaning of LEDs on the dual-port 10-GbE NIC that supports fiber optic cables with X6569 SFP+ modules or copper SFP+ cables

You can check the LEDs on your dual-port 10-GbE NIC that supports fiber optic cables and X6569 SFP+ optical modules or copper SFP+ cables to learn whether there is a network connection, whether there is data activity, and whether the card is operating at 10-Gb speed.

The following illustration shows the location of LEDs and ports on the NIC.

[Illustration: each port has LEDs labeled GRN=10G and ACT/LNK]

1 Port A 10-Gb link LED

2 Port A ACT/Link LED

3 Port A with SFP+ installed

4 Port B with no SFP+ connector

5 Port B 10-Gb link LED

6 Port B ACT/Link LED


The following table explains what the LEDs on the card mean.

LED label | Status indicator | Description
GRN=10G | Green | The NIC is operating at 10-Gb speed.
LINK/ACT | Green | A valid network connection is established.
LINK/ACT | Blinking amber | There is data activity.
LINK/ACT | Off | There is no network connection present.

Location and meaning of multiport GbE NIC LEDs

You can check the LEDs on your multiport copper or fiber GbE NIC to learn whether there is a network connection and whether there is data activity. On copper GbE NICs, you also can learn how fast data is being transmitted.

The following illustration shows the location of LEDs on copper and fiber dual-port GbE NICs.

1 Copper 10Base-T/100Base-TX/1000Base-T NIC

2 Fiber 1000Base-SX NIC

3 Network speed LEDs

The following illustration shows the location of LEDs on copper quad-port GbE NICs.


Note: The orientation of the ports on NICs might differ.

1 ACT LED

2 LNK LED

3 Port a

4 Port b

5 Port c

6 Port d

The following table explains what the LEDs on a copper multiport GbE NIC mean.

LED type | Status indicator | Description
ACT | Green | A valid network connection is established.
ACT | Blinking green or blinking amber | There is data activity.
ACT | Off | There is no network connection.
LNK | Off | Data transmits at 10 Mbps.
LNK | Green | Data transmits at 100 Mbps.
LNK | Amber | Data transmits at 1000 Mbps.

The following table explains what the LEDs on the fiber multiport GbE NICs mean.

LED type | Status indicator | Description
LNK | On | A valid network connection is established.
LNK | Off | There is no network connection.
ACT | On | There is data activity.
ACT | Off | There is no network activity present.

TOE NIC LEDs

TOE NICs have LEDs that you can check to learn the state of the network connection.

TOE NICs might have one port or multiple ports.

Location and meaning of single-port TOE NIC LEDs

The single-port TCP offload engine is a 10GBase-SR fiber optic NIC. You can check the NIC LEDs to learn whether it is on, whether there is a network connection, or whether the operating system has booted.

The following illustration shows the location of LEDs on the NIC.


1 Fiber optic LC port

2 LINK LED

3 ACT LED

4 STAT (power) LED

The following table explains what the LEDs mean.

LED type | Status indicator | Description
ACT/LNK | Green | A valid network connection is established.
ACT/LNK | Blinking green | There is data activity.
ACT/LNK | Off | There is no network connection.
STAT | Red | The NIC is receiving power and is on.
STAT | Off | The operating system has booted.

Location and meaning of dual-port, 10GBase-SR TOE NIC LEDs

You can check the LEDs on the TOE NIC to learn whether there is a network connection or data activity.

The following illustration shows the location of LEDs on the TOE NIC.

1 LINK/ACT LED, port A

2 LINK/ACT LED, port B

3 Fiber optic LC, port A

4 Fiber optic LC, port B

The following table explains what the LEDs on the TOE NIC mean.

LED label | Status indicator | Description
LINK/ACT | Green | A valid network connection is established.
LINK/ACT | Green | There is data activity.
LINK/ACT | Off | There is no network connection present.

Location and meaning of dual-port, 10GBase-CX4 TOE NIC LEDs

You can check the LEDs on the TOE NIC to learn whether there is a network connection or data activity.

Note: The 10GBase-CX4 dual-port TOE NIC is for use only on systems running Data ONTAP 10.0.3 or later.

The following illustration shows the location of LEDs on the TOE NIC.

1 LINK/ACT LED A

2 Port A

3 LINK/ACT LED B

4 Port B

The following table explains what the LEDs on the TOE NIC mean.

LED type | Status indicator | Description
LINK/ACT | Green | A valid network connection is established.
LINK/ACT | Blinking green | There is data activity.
LINK/ACT | Off | There is no network connection present.

Location and meaning of quad-port TOE NIC LEDs

You can check the LEDs on the TOE NIC to learn whether there is data activity and the speed of data transmission.

The following illustration shows the location of LEDs on the TOE NIC.

1 Activity LEDs: LED 1 corresponds to port a, LED 2 corresponds to port b, and so on.

2 Port a

3 Port b

4 Port c

5 Port d


6 Activity LEDs: LED 1 corresponds to port a, LED 2 corresponds to port b, and so on.

7 Port d

8 Port c

9 Port b

10 Port a

The following table explains what the LEDs on the TOE NIC mean.

LED label | Status indicator | Description
Labeled by port number | Yellow | Data transmits at 1 Gbps.
Labeled by port number | Green | Data transmits at 10/100 Mbps.
Labeled by port number | Blinking | There is data activity.

NVRAM adapter LEDs

NVRAM adapter LEDs enable you to determine whether NVRAM is holding unwritten data and, in HA pairs, to check the connection between the two nodes.

NVRAM preserves unwritten data if your system loses power. NVRAM also is the HA interconnect when your system is in an HA pair, except when you use MetroCluster.

Different systems have different kinds of NVRAM adapters. NVRAM5, NVRAM6, and NVRAM8 adapters plug into the motherboard. NVRAM7 is integrated into the motherboard. The following table shows the type of NVRAM that different systems support.

NVRAM type | Systems
NVRAM5 | 3020 and 3050
NVRAM6 | 3040, 3070, and SA300; 60xx and SA600
NVRAM7 | 31xx
NVRAM8 | 62xx

Location and meaning of NVRAM5 and NVRAM6 LEDs

You can check the LEDs to learn whether there is valid data in NVRAM when your system loses power. When you use the NVRAM adapter as an HA interconnect, you also can check the LEDs to learn whether there is a connection between the nodes.

Two sets of LEDs by each port on the faceplate operate when you use the NVRAM5 or NVRAM6 adapter as an HA interconnect. NVRAM adapters also have an internal LED that you can see through the faceplate. The following illustration shows LEDs on the NVRAM5 and NVRAM6 adapter.

[Illustration: NVRAM5 adapter faceplate with LEDs labeled LO2, PH2, LO1, and PH1]

The following table explains what the LEDs on an NVRAM5 or NVRAM6 adapter mean.

LED type | Indicator | Status | Description
Internal | Red | Blinking | There is valid data in NVRAM. Note: The LED might blink red if your system did not shut down properly, as in the case of a power failure or panic. The data is replayed when the system boots again.
PH1 | Green | On | The physical connection is working.
PH1 | Green | Off | No physical connection exists.
LO1 | Yellow | On | The logical connection is working.
LO1 | Yellow | Off | No logical connection exists.

Location and meaning of NVRAM7 LEDs

You can check the LEDs to learn whether there is any unwritten data in NVRAM if your controller loses power.

Each 31xx controller has two NVRAM7 LEDs:

• One is near the left front corner of the motherboard next to the NVRAM DIMM. The LED is labeled D35 and NVRAM Data Valid When Lit. You can see the LED only after you remove the controller from the chassis.
• One is near the right rear corner of the motherboard. It is labeled D87. You can see the LED through the rear grille of the controller, as shown in the following illustration.

1 NVRAM7 LED

NVRAM7 LEDs flash red if unwritten data is being held in NVRAM when power to the controller is turned off. If you remove the NVRAM7 battery or NVRAM7 DIMM when the red LEDs are flashing, you lose data that is being held in NVRAM.

Note: In an HA pair, each node continually monitors its partner and mirrors its partner's NVRAM data. Therefore, if you remove a controller from a 31xx system in an HA pair without first shutting it down, you can disregard the illuminated NVRAM LEDs on the motherboard of the removed controller.

Location and meaning of NVRAM5 and NVRAM6 media converter LEDs

You can check the LED to learn whether the media converter has power, whether a link is present, and whether the converter is operating normally.

The following illustration shows the location of the LED on NVRAM5 and NVRAM6 media converters.

1 LED

2 Media converter

The following table explains what the LED on NVRAM5 and NVRAM6 media converters means.

Indicator | Status | Description
Green | On | Normal operation
Green/amber | On | Power is present but link is down.
Green | Flickering or off | Power is present but link is down.

Location and meaning of NVRAM8 LEDs

You can check the LEDs on the NVRAM8 adapter to check the connection between controllers in an HA pair and to learn the status of data when the system loses power.

Five LEDs are on the faceplate, and one LED on the adapter board is visible through the faceplate grille.

The following illustration shows the LEDs on the NVRAM8 adapter.

[Illustration: faceplate LEDs labeled LNK, ACT, and INT]

1 InfiniBand port 0 link LED

2 InfiniBand port 0 activity LED

3 InfiniBand port 0 connector

4 Internal link select LED

5 InfiniBand port 1 link LED

6 InfiniBand port 1 activity LED

7 InfiniBand port 1 connector

Port 0 link and activity LEDs are relevant when port 0 of the controller is connected to a partner in an HA pair. The following table explains the meaning of the port 0 LEDs.

LED name | Status indicator | Description
Port 0 link | Green | A physical connection is working on the port 0 connector.
Port 0 link | Off | A physical connection is not working on the port 0 connector.
Port 0 activity | Amber | A logical connection is working on the port 0 connector.
Port 0 activity | Off | A logical connection is not working on the port 0 connector.

Port 1 LEDs reflect the state of the port 1 connector used between two controllers installed in different chassis or the state of the internal InfiniBand connection used between two controllers installed in the same chassis. The following table explains the meaning of the port 1 LEDs.

LED name | Status indicator | Internal link select LED on (internal midplane connection) | Internal link select LED off (external cable connection)
Port 1 link | Green | An internal physical connection is working over the midplane. | An external physical connection is working on the port 1 connector.
Port 1 link | Off | An internal physical connection is not working over the midplane. | An external physical connection is not working on the port 1 connector.
Port 1 activity | Amber | An internal logical connection is working over the midplane. | An external logical connection is working on the port 1 connector.
Port 1 activity | Off | An internal logical connection is not working over the midplane. | An external logical connection is not working on the port 1 connector.

Port 1 LEDs depend on the state of the internal link select LED, which in HA pair configurations depends on how the controllers are connected. The following table explains the meaning of the internal link select LED.

LED name | Status indicator | Description
Internal link select | Green | The HA pair consists of two controllers in the same chassis connected over the internal midplane.
Internal link select | Off | The HA pair consists of two controllers in different chassis connected by an external cable.
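The port 1 LEDs are read in two steps: the internal link select LED says which path is in use, and the link and activity LEDs then report the physical and logical connection on that path. A minimal sketch of that two-step reading follows; `port1_status` and its dictionary keys are hypothetical names chosen for illustration.

```python
# Illustrative two-step interpretation of the NVRAM8 port 1 LEDs.
def port1_status(internal_link_select_on: bool,
                 link_led_on: bool,
                 activity_led_on: bool) -> dict:
    """Combine the three LED observations into one status summary."""
    if internal_link_select_on:
        path = "internal midplane"
    else:
        path = "external cable on the port 1 connector"
    return {
        "path": path,
        "physical_connection_working": link_led_on,  # green link LED
        "logical_connection_working": activity_led_on,  # amber activity LED
    }


if __name__ == "__main__":
    print(port1_status(True, True, False))
```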

A destage status LED, on the top of the adapter board, is visible through the grille of the faceplate halfway between the top of the faceplate and the InfiniBand port 0 LEDs. The LED shows the status of NVRAM8 data after an unexpected loss of system power.

Data might need to be destaged, or saved from active DRAM to nonvolatile flash memory, after an unexpected power loss. Destaging lasts about one minute. Once data has been destaged, it must be restaged, or restored from nonvolatile flash memory to active DRAM, during system initialization.

The destage LED might be lit as red or green. Its behavior depends on whether the system power is on or off. When the system power is off, the LED behavior depends on whether the NVRAM8 adapter is running on battery power. The battery automatically turns off after data is destaged.

The following table explains the meaning of the destage status LED when the NVRAM8 adapter is in the controller.

Destage LED status indicator | System power on | System power off, battery power on | System power off, battery power off
Red | The NVRAM8 adapter has destage data that needs to be restored. | Invalid | N/A
Green | The NVRAM8 adapter has restored data and is ready for the next destage. | Invalid | N/A
Alternating red and green | Invalid | The NVRAM8 adapter is destaging data. | N/A
Off | Invalid | Invalid | The NVRAM8 adapter has finished destaging data.
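The destage table depends on three inputs: the LED indication, whether system power is on, and, when it is off, whether the adapter is on battery power. The sketch below encodes the four valid cells and returns `None` for every cell the table marks Invalid or N/A; the function name `destage_status` and the state strings are assumptions made for this example.

```python
from typing import Optional


def destage_status(led: str, system_power_on: bool,
                   battery_power_on: bool = False) -> Optional[str]:
    """Return the table's meaning for a valid cell, else None."""
    led = led.strip().lower()
    if system_power_on:
        # battery_power_on is ignored: the table has one column for power on.
        meanings = {
            "red": "destage data needs to be restored",
            "green": "data restored; ready for the next destage",
        }
        return meanings.get(led)
    if battery_power_on:
        return "destaging data" if led == "alternating red and green" else None
    return "finished destaging data" if led == "off" else None


if __name__ == "__main__":
    print(destage_status("alternating red and green", False, True))
```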

You can use the destage status LED when the adapter is removed from the system to determine whether destage data is in the NVRAM8 adapter.

The following illustration shows the location of the destage status LED.

1 Destage status LED

2 InfiniBand port 0 LEDs

You activate the destage status LED when the NVRAM8 adapter is removed from the controller by pressing and holding the button marked SW6 and STATUS on the bottom of the adapter board. The following illustration shows the location of the button.


1 Button for activating destage status LED

The LED consists of a red LED and a green LED that might turn on separately or together, creating a light that appears amber. The following table explains the meaning of the destage status LED when the button is pressed.

LED color Description

None (Off) No status; no battery power.

Amber Miscellaneous status for debugging.

Green No data in flash memory; not destaged.

Red Data in flash memory; destaged.


Flash Cache module and PAM LEDs

Flash Cache modules and Performance Acceleration Modules (PAMs) have LEDs that you can check to ensure that the card has power or to learn about its performance.

Flash Cache modules are available in capacities of 256 GB, 512 GB, and 1 TB. PAMs have a capacity of 16 GB.

This document uses the term Flash Cache module to refer to caching modules with capacities greater than 16 GB. Before the release of Data ONTAP 7.3.5, such adapters were called Performance Acceleration Modules (PAM II). The name of the 16-GB caching module remains Performance Acceleration Module (PAM I).

Location and meaning of PAM LEDs

The PAM has two LEDs, both visible through the perforations of the PCIe bracket. You can check the LEDs to ensure that the module is in place and has power.

The position of the LEDs relative to the system depends on the model of the system it is installed in. Different systems can have horizontal or vertical expansion slots.

The following table describes the behavior of the module LEDs.

LED Description

Green Power ready indicator. Replace the card if the LED is off.

Blinking blue Indicates the presence of the card. The LED dims slightly on heavy loads. Replace the card if it does not blink after you boot Data ONTAP.

Location and meaning of Flash Cache module LEDs

Each Flash Cache module has two LEDs, which you can check to see if the module is operating properly and to view its performance.

The illustration shows the LEDs on a module.


1 Fault

2 Activity

The following table explains what the LEDs on the module mean.

LED type Status indicator Description

Fault Solid amber A fault has occurred.

Activity Blinking green There is activity on the card. The LED blinks once every two seconds when the card is idle and increases the blink rate as its performance increases up to 10 times per second.


Startup messages

When you apply power to your system, it verifies the hardware that is in the system, loads the operating system, and displays startup informational and error messages on the system console.

There are two types of startup error messages:

• POST error messages

• Boot error messages

Both error message types are displayed on the system console, and an e-mail notification is sent out by the remote management subsystem, if it is configured to do so.

POST messages

POST is a series of tests run from the motherboard PROM. These tests check the hardware on the motherboard and differ depending on your system configuration.

POST messages appear on the system console before Data ONTAP software is loaded.

The following text is an example of a POST message on the console on a system that uses the LOADER boot environment. Systems using the CFE boot environment display similar messages.

Phoenix TrustedCore(tm) Server
Copyright 1985-2005 Phoenix Technologies Ltd. All Rights Reserved

Portions Copyright (c) 2005-2009 NetApp All Rights Reserved
BIOS Version: 1.7X9

CPU= Dual Core AMD Opteron(tm) Processor 885 X 4
Testing RAM.
512MB RAM tested
32768MB RAM installed
Fixed Disk 0: NACF1GBJU-A11

Boot Loader version 1.6.1X2
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2009 NetApp

CPU Type: Dual Core AMD Opteron(tm) Processor 885

Starting AUTOBOOT press Ctrl-C to abort...

Note: If your system has an LCD, it displays POST messages without a header.


Boot messages

After the boot is successfully completed, your system loads the operating system. Messages provide information about your system and alert you to errors that occur during boot.

Note: The exact boot messages that appear on your system console depend on your system configuration.

The following message is an example of the start of a boot message that appears on the system console of a FAS6030 storage system at first boot.

NetApp Release 7.3.1X19: Sat Nov 22 02:04:05 PST 2008
Copyright (C) 1992-2008 NetApp.
Starting boot on Wed Mar 25 00:51:31 GMT 2009
Wed Mar 25 00:52:13 GMT [diskown.isEnabled:info]: Software ownership has been enabled ...
Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: Disk 0b17 is a local HA mailbox disk
Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: Disk 0b16 is a local HA mailbox disk
...
Wed Mar 25 00:51:17 GMT [cf.fm.partner:info]: Cluster monitor: partner 'node2'
...
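When you save console output to a file, the bracketed boot messages in the example can be split into fields for searching or filtering. The following Python sketch assumes that each message follows the layout seen in the example (timestamp, then [event:severity], then text); the pattern and the function name `parse_boot_message` are illustrative and are not part of Data ONTAP.

```python
import re

# Matches lines such as:
#   Wed Mar 25 00:51:17 GMT [fmmb.current.lock.disk:info]: Disk 0b17 is a local HA mailbox disk
# The layout is inferred from the example output only.
BOOT_MSG = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+) "
    r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<text>.*)$"
)


def parse_boot_message(line: str):
    """Return a dict of fields, or None if the line is not in the bracketed format."""
    match = BOOT_MSG.match(line.strip())
    return match.groupdict() if match else None
```

Banner lines such as the release and copyright lines do not match the pattern and return None, so they can be skipped when filtering by severity.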

FAS20xx and SA200 startup progress

FAS20xx and SA200 systems do not display POST error messages on the system console.

You can track BIOS and boot loader progress by watching a progress indicator on the system console and by monitoring a sensor through the BMC.

Method of viewing progress on the console

You can view BIOS and boot loader progress by monitoring the progress indicator on your system console.

The initial BIOS message appears on the console about five seconds after the system starts. After that, and before the boot loader runs, continued POST progress is indicated by a line of dots (.) or plus signs (+). These dots or plus signs follow the line showing the BIOS version, as shown in the console output below:

AMI BIOS8 Modular BIOS
Copyright (C) 1985-2006, American Megatrends, Inc. All Rights Reserved

Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved
BIOS Version 3.0
...................
Boot Loader version 1.3
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (C) 2002-2005 Network Appliance Inc.
CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz
Starting AUTOBOOT press Ctrl-C to abort...

The dots or plus signs are a progress indicator to show that the BIOS is not hung. If the system restarts after a fault, the dots are replaced by plus signs to indicate that the system NVMEM is armed, or being protected, during the boot process.

The BIOS should begin loading Data ONTAP within about 25 seconds after the initial greeting.
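If you capture console output, the progress indicator can be interpreted with a few lines of Python. This is a sketch that assumes the run of dots or plus signs has already been isolated from the rest of the output; the function name `progress_indicator` and the returned labels are hypothetical.

```python
def progress_indicator(line: str) -> str:
    """Classify a console line that consists only of '.' or '+' progress characters."""
    chars = set(line.strip())
    if chars == {"."}:
        # Dots: POST is progressing normally.
        return "POST in progress"
    if chars == {"+"}:
        # Plus signs: the system restarted after a fault and NVMEM is armed.
        return "POST in progress, NVMEM armed"
    return "not a progress line"
```

A line that mixes other characters with the dots or plus signs is reported as not being a progress line, which keeps ordinary console text from being misread.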

Method of viewing progress through the BIOS Status sensor

The BMC monitors boot progress; you can determine the boot progress status through the BIOS Status sensor by entering the sensors show BMC command.

The following text shows partial output of the BMC sensors show command:

bmc shell -> sensors show
name          State   ID   Reading     Crit-Low  Warn-Low  Warn-Hi  Crit-Hi
-------------------------------------------------------------------------------------
1.1V          Normal  #77  1121 mV     95 mV     --        --       1239 mV
1.2V          Normal  #76  1239 mV     1038 mV   --        --       1357 mV
1.5V          Normal  #75  1522 mV     1309 mV   --        --       1699 mV
1.8V          Normal  #74  1829 mV     1569 mV   --        --       2029 mV
12.0V         Normal  #70  12080 mV    10160 mV  --        --       13840 mV
2.5V          Normal  #73  2520 mV     2116 mV   --        --       2870 mV
3.3V          Normal  #72  3374 mV     2808 mV   --        --       3799 mV
BIOS Status   Normal  #f0  Loader #20  --        --        --       --
Batt 8.0V     Normal  #50  7552 mV     --        --        8512 mV  8576 mV
Batt Amp      Normal  #59  0 mA        --        --        2112 mA  2208 mA

In the sensors show output, the BIOS Status sensor displays one of three states: Normal, Hung, or Error. In the Reading column, the sensor displays BIOS and boot loader progress. In the example output, the BIOS Status sensor displays a state of Normal and a reading of Loader #20, indicating that the boot loader is running normally.

The following table lists the BIOS and boot loader progress values.

Status Description

0x00 System software has cleanly shut down. (Sent only by Data ONTAP.)

0x01 Memory initialization is in progress.

0x02 NVMEM initialization is in progress (when NVMEM is armed).

0x05 User has entered setup.

0x13 Booting to Data ONTAP (or boot loader).

0x1F BIOS is starting up. (Special message to the BMC.) This is the first BIOS status message. It might be quickly followed by another.

0x20 Boot loader is running.

0x21 Boot loader is programming the primary firmware hub. The BMC does not allow the system to be powered down at this time.

0x22 Boot loader is programming the alternate firmware hub. The BMC does not allow the system to be powered down at this time.

0x2F Boot loader has transferred control to Data ONTAP. Data ONTAP might send this periodically to inform the BMC that Data ONTAP is running, if the BMC has rebooted.

0x60 BMC has shut power off.

0x61 BMC has turned power on.

0x62 BMC has reset the system.

0x63 BMC Watchdog power cycle.

0x64 BMC Watchdog cold reset.

The BIOS Status sensor also displays BIOS and boot loader error codes. If the BIOS Status sensor displays a Hung or Error state, contact technical support for interpretation of the codes.
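The progress values in the table can be looked up programmatically from a captured Reading value. The sketch below assumes that the number after the # sign in the Reading column (for example, Loader #20) is the hexadecimal status value from the table, as the example output suggests; the function name `describe_reading` is hypothetical.

```python
import re

# Progress values copied from the BIOS and boot loader progress table.
BIOS_STATUS = {
    0x00: "System software has cleanly shut down.",
    0x01: "Memory initialization is in progress.",
    0x02: "NVMEM initialization is in progress (when NVMEM is armed).",
    0x05: "User has entered setup.",
    0x13: "Booting to Data ONTAP (or boot loader).",
    0x1F: "BIOS is starting up.",
    0x20: "Boot loader is running.",
    0x21: "Boot loader is programming the primary firmware hub.",
    0x22: "Boot loader is programming the alternate firmware hub.",
    0x2F: "Boot loader has transferred control to Data ONTAP.",
    0x60: "BMC has shut power off.",
    0x61: "BMC has turned power on.",
    0x62: "BMC has reset the system.",
    0x63: "BMC Watchdog power cycle.",
    0x64: "BMC Watchdog cold reset.",
}


def describe_reading(reading: str) -> str:
    """Map a Reading value such as 'Loader #20' to its table description."""
    # Assumption: the trailing '#NN' is the hexadecimal progress value.
    match = re.search(r"#([0-9A-Fa-f]{1,2})\s*$", reading)
    if not match:
        return "unrecognized reading"
    code = int(match.group(1), 16)
    return BIOS_STATUS.get(code, "value 0x%02X is not in the progress table" % code)
```

A value that is missing from the table is reported as such rather than guessed, because the sensor can also display error codes that only technical support can interpret.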

3020 and 3050 system POST error messages

POST error messages might appear on the system console if your system encounters errors while the CFE initializes the hardware.

Abort Autoboot–POST Failure(s): CPU

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Abort Autoboot–POST Failure(s): CPU

Description At least one CPU fails to start up properly.

Corrective action 1. Power-cycle the system to see whether the problem persists.

2. Replace the motherboard tray if the problem persists.

Abort Autoboot–POST Failure(s): MEMORY

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for that error message.

Message Abort Autoboot–POST Failure(s): MEMORY

Description The memory test failed.

Corrective action 1. Make sure that DIMMs are seated properly, then power-cycle your system.

2. Replace the DIMM if the problem persists.

Note: There is an LED next to each DIMM on the motherboard. When a DIMM fails, its LED lights to help you find the failed DIMM.

Abort Autoboot–POST Failure(s): RTC, RTC_IO

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Abort Autoboot–POST Failure(s): RTC, RTC_IO

Description The Common Firmware Environment (CFE) cannot read the real-time clock (RTC_IO) or the RTC date is invalid (RTC).

Corrective action 1. Use the set date and set time commands to set the date and time.

2. Make sure that the RTC battery is still good.

Abort Autoboot–POST Failure(s): UCODE

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Abort Autoboot–POST Failure(s): UCODE

Description At least one CPU fails to load the microcode.

Corrective action 1. Power-cycle your system to see whether the problem persists.

2. Replace the motherboard tray if the problem persists.

Autoboot of backup image aborted

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Autoboot of backup image aborted

Description Autoboot is stopped due to a key being pressed during the autoboot process.

Corrective action Power-cycle the system and avoid pressing any keys during the autoboot process.


Autoboot of backup image failed

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Autoboot of Back up image failed

Description The kernel could not be found on the CompactFlash card.

Corrective action 1. Check the CompactFlash card connection.

2. Make sure that the CompactFlash card content is valid; if it is not, replace the CompactFlash card.

3. Follow the netboot procedure in your CompactFlash card documentation to download a new kernel.

Autoboot of primary image aborted

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Autoboot of primary image aborted

Description Autoboot is stopped due to a key being pressed during the autoboot process.

Corrective action Power-cycle the system and avoid pressing any keys during the autoboot process.

Autoboot of primary image failed

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Autoboot of primary image failed

Description The kernel could not be found on the CompactFlash card.

Corrective action 1. Check the CompactFlash card connection.

2. Make sure that the CompactFlash card content is valid; if it is not, replace the CompactFlash card.

3. Follow the netboot procedure in your CompactFlash card documentation to download a new kernel.


Invalid FRU EEPROM Checksum

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Invalid FRU EEPROM Checksum

Description The system backplane or motherboard Electrically Erasable Programmable Read-Only Memory (EEPROM) is corrupted.

Corrective action Call technical support.

Memory init failure

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for that error message.

Message Memory init failure: Data segment does not compare at XXXX

Description The Common Firmware Environment (CFE) failed to initialize the system memory properly. XXXX denotes the memory address.

Corrective action 1. Make sure that the DIMM is supported.

2. Make sure that the DIMM is seated properly.

3. Replace the DIMM if the problem persists.

Note: There is an LED next to each DIMM on the motherboard. When a DIMM fails, its LED lights to help you find the failed DIMM.

No Memory found

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message No Memory found

Description The Common Firmware Environment (CFE) cannot detect the system DIMMs.

Corrective action 1. Make sure that the DIMM is seated properly and power-cycle your system.

2. Replace the DIMM if the problem persists.

Note: There is an LED next to each DIMM on the motherboard. When a DIMM fails, its LED lights to help you find the failed DIMM.


Unsupported system bus speed

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message Unsupported system bus speed 0xXXXX defaulting to 1000Mhz

Description The Common Firmware Environment (CFE) detects an unsupported DIMM.

Corrective action 1. Make sure that the DIMM is seated properly.

2. Replace the DIMM if the problem persists.

Note: There is an LED next to each DIMM on the motherboard. When a DIMM fails, its LED lights to help you find the failed DIMM.

3040, 3070, 31xx, 60xx, SA300, and SA600 system POST error messages

POST error messages might appear on the system console if your system encounters errors while the BIOS and boot loader initialize the hardware.

0200: Failure Fixed Disk

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0200: Failure Fixed Disk

Description A disk error occurred.

Corrective action Complete the following steps to see if the CompactFlash card is bad.

1. Enter the following command at the boot environment prompt:

boot_diags

2. Select the cf-card test.

3. If the test shows that the CompactFlash card is bad, replace it. If the CompactFlash card is good, replace the motherboard.

0230: System RAM Failed at offset:

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0230: System RAM Failed at offset

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot environment prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.

0231: Shadow RAM failed at offset

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0231: Shadow RAM failed at offset

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot environment prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.

0232: Extended RAM failed at address line

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0232: Extended RAM failed at address line

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot loader prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.

0235: Multiple-bit ECC error occurred

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0235: Multiple-bit ECC error occurred

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot loader prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.

023C: Bad DIMM found in slot #

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 023C: Bad DIMM found in slot #

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot loader prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.


023E: Node Memory Interleaving disabled

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 023E: Node Memory Interleaving disabled

Description A bad DIMM was detected, which causes BIOS to disable Node Interleaving.

Corrective action Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot loader prompt:

boot_diags

3. Select the following test: mem.

4. Replace the failed DIMMs.

0241: Agent Read Timeout

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0241: Agent Read Timeout

Description Timeout occurs when BIOS tries to read or write information through System Management Bus (SMBUS) or Inter-Integrated Circuit (I2C).

Corrective action Run the Agent diagnostic test.

1. Enter the following command at the boot loader prompt:

boot_diags

2. Select and run the following tests: agent, 2, and 6.

3. Select and run the following tests: mb, 2, and 8.

0242: Invalid FRU information

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0242: Invalid FRU information

Description The information from the field-replaceable unit (FRU) Electrically Erasable Programmable Read-Only Memory (EEPROM) is invalid.

Corrective action 1. Enter the following command at the boot environment prompt:

boot_diags

2. To determine the FRU involved, select the following tests: mb and 74.

3. Check whether the FRU’s model name, serial number, part number, and revision are correct in one of the following ways:

• Visually inspect the FRU.

• Look for error messages indicating that the FRU information is invalid or could not be read.

4. Contact technical support if you suspect a misprogrammed FRU.

0250: System battery is dead

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0250: System battery is dead–Replace and run SETUP

Description The real-time clock (RTC) battery is dead.

Corrective action 1. Reboot the system.

2. If the problem persists, replace the RTC battery.

3. Reset the RTC.

0251: System CMOS checksum bad

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0251: System CMOS checksum bad–Default configuration used

Description CMOS checksum is bad, possibly because the system was reset during BIOS boot or because of a dead RTC battery.

Corrective action 1. Reboot the system.

2. If the problem persists, replace the RTC battery.

3. Reset the RTC.

0253: Clear CMOS jumper detected

Note: This message occurs only on 60xx and SA600 systems.


Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0253: Clear CMOS jumper detected–Please remove for normal operation

Description The clear CMOS jumper is installed on the main board.

Corrective action Remove the clear CMOS jumper and reset the system.

0260: System timer error

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0260: System timer error

Description The system clock is not ticking.

Corrective action Replace the HT1000 chip.

0280: Previous boot incomplete

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 0280: Previous boot incomplete–Default configuration used

Description The previous boot was incomplete, and the default configuration was used.

Corrective action Reboot the system.

02C2: No valid Boot Loader in System Flash–Non Fatal

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02C2: No valid Boot Loader in System Flash–Non Fatal

Description No valid boot loader is found in system flash memory while the option to Halt For Invalid Boot Loader is disabled in setup. As a result, the system can still boot from CompactFlash if it has a valid boot loader.

Corrective action Enter the update_flash command two times to place a good boot loader in the system flash.

02C3: No valid Boot Loader in System Flash–Fatal

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02C3: No valid Boot Loader in System Flash–Fatal

Description No valid boot loader is found in system flash memory while the option to Halt For Invalid Boot Loader is enabled in setup. As a result, the system halts. Users should take corrective action.

Corrective action Place a valid version of the boot loader in the system flash by completing either of the following series of steps:

1. Boot from the backup boot image.

2. Enter the update_flash command.

or

1. Enter BIOS setup and disable boot from system flash.

2. Save the setting.

3. Reboot to the boot environment prompt, and then enter the update_flash command two times.

02F9: FGPA jumper detected

Note: This message occurs only on 60xx and SA600 systems.

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02F9: FGPA jumper detected–Please remove for normal operation

Description The Field Programmable Gate Array (FPGA) jumper was installed on the motherboard.

Corrective action 1. Remove the FPGA jumper.

2. Reboot the system.

02FA: Watchdog Timer Reboot (PciInit)

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02FA: Watchdog Timer Reboot (PciInit)

Description The watchdog times out while BIOS is doing PCI initialization.

Corrective action 1. Power-cycle the system a few times or reset the system through the RLM.

2. If the problem persists, check the PCI interface. At the boot environment prompt, enter the following command:

boot_diags

3. Select and run the following tests: mb, 4, 71.

4. Replace the motherboard if the diagnostics show a problem.

02FB: Watchdog Timer Reboot (MemTest)

Note: This message appears only on 60xx and SA600 systems.

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02FB: Watchdog Timer Reboot (MemTest)

Description The watchdog times out while BIOS is testing the extended memory.

Corrective action 1. Power-cycle the system a few times or reset the system through the RLM.

2. If the problem persists, check the memory interface. At the boot loader prompt, enter the following command:

boot_diags

3. Select and run the following tests: mem and 1.

4. Replace the DIMMs if the diagnostics show a problem.

5. Replace the motherboard if the problem persists.

02FC: LDTStop Reboot (HTLinkInit)

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message 02FC: LDTStop Reboot (HTLinkInit)

Description The watchdog times out while BIOS is setting up the HT link speed.

Corrective action 1. Power-cycle the system a few times or reset the system through the Remote LAN Module (RLM).

2. If the problem persists, replace the motherboard.

No message on console

Note: Always power-cycle your system when you receive this message. If the system repeats the error message, follow the corrective action for the error message.

Message No message on console. Problem might be reported in the Remote LAN Module (RLM) system event log with the code 037h or in the SMBIOS system event log (SEL) with the error code 237h.

Description There is not enough memory to accommodate the SMBIOS structure.

Corrective action Perform one of the following steps:

• Remove some adapters from PCI slots.

• Check the DIMMs and replace any bad ones by completing the following steps:

1. Make sure that each DIMM is seated properly, then power-cycle the system.

2. If the problem persists, run the diagnostics to determine which DIMMs failed. Enter the following command at the boot loader prompt:

boot_diags

3. Select the following test: mem

4. Replace the failed DIMMs.

FAS22xx, 32xx, 62xx, SA320, and SA620 system POST error messages

POST error messages might appear on the system console if your system encounters errors while the BIOS and boot loader initialize the hardware.

0200: Failure Fixed Disk

Message 0200: Failure Fixed Disk

Description A disk error occurred.

Corrective action Replace the USB boot device.

SP error code 000h

0230: System RAM Failed at offset:

Message 0230: System RAM Failed at offset:

Description The BIOS cannot initialize the system memory, or a DIMM has failed.

Corrective action Check and replace the bad DIMM modules.

SP error code 030h
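In the entries of this section that carry a four-digit console code, the SP error code matches the last two hexadecimal digits of that code (for example, 0200 with 000h and 0230 with 030h). This guide does not state the relationship as a rule, so the following Python sketch treats it as an observed pattern only; the function names `split_post_message` and `sp_error_code` are hypothetical.

```python
import re

# A console POST message that starts with a four-digit hexadecimal code,
# for example "0230: System RAM Failed at offset:".
POST_LINE = re.compile(r"^(?P<code>[0-9A-Fa-f]{4}): (?P<text>.+)$")


def split_post_message(line: str):
    """Return (numeric code, message text), or None if the line has no code prefix."""
    match = POST_LINE.match(line.strip())
    if not match:
        return None
    return int(match.group("code"), 16), match.group("text")


def sp_error_code(post_code: int) -> str:
    """Format the low byte of a POST code in the style of the SP error codes (e.g. '030h').

    Assumption: the pairing is inferred from the entries in this section.
    """
    return "%03Xh" % (post_code & 0xFF)
```

Messages without a numeric prefix, such as the DIMM slot messages, return None from `split_post_message`, so their SP error codes still have to be read from the individual entries.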


0231: Shadow RAM Failed at offset:

Message 0231: Shadow RAM Failed at offset:

Description The BIOS cannot initialize the system memory or a DIMM has failed.

Corrective action Check and replace the bad DIMM modules.

SP error code 031h

0232: Extended RAM Failed at address line:

Message 0232: Extended RAM Failed at address line:

Description The BIOS cannot initialize the system memory, or a DIMM has failed.

Corrective action Check and replace the bad DIMM modules.

SP error code 032h

BIOS detected uncorrectable ECC error in DIMM slot:

Message BIOS detected uncorrectable ECC error in DIMM slot:

Description BIOS detected an uncorrectable ECC error in the displayed DIMM slot.

Corrective action Check and replace the bad DIMM modules.

SP error code 035h

No message on the console

Message No message on the console.

Description There is not enough memory to accommodate SMBIOS structure.

Corrective action Check and replace the bad DIMM modules.

SP error code 037h

BIOS detected errors or invalid configuration in DIMM slot:

Message BIOS detected errors or invalid configuration in DIMM slot:

Description BIOS detected unknown errors in the displayed DIMM.

Corrective action Check and replace the bad DIMM modules.

SP error code 038h


BIOS detected unknown errors in DIMM slot:

Message BIOS detected unknown errors in DIMM slot:

Description BIOS detected unknown errors in the displayed DIMM.

Corrective action Check and replace the bad DIMM modules.

SP error code 038h

023A: ONTAP Detected Bad DIMM in slot:

Message 023A: ONTAP Detected Bad DIMM in slot:

Description Data ONTAP detected a bad DIMM and disabled it in the displayed DIMM slot.

Corrective action Check and replace the bad DIMM modules.

SP error code 03Ah

023B: BIOS detected SPD checksum error in DIMM slot:

Message 023B: BIOS detected SPD checksum error in DIMM slot:

Description BIOS detected an SPD checksum error in the displayed DIMM slot.

Corrective action Check and replace the bad DIMM modules.

SP error code 03Bh

BIOS detected pattern write/read mismatch in DIMM slot:

Message BIOS detected pattern write/read mismatch in DIMM slot:

Description BIOS detected a pattern write/read mismatch in the displayed DIMM slot.

Corrective action Check and replace the bad DIMM modules.

SP error code 03Ch

0241: SMBus Read Timeout

Message 0241: SMBus Read Timeout

Description A timeout occurs when the BIOS tries to read or write information through the System Management Bus (SMBUS) or Inter-Integrated Circuit (I2C) bus.

Corrective action Run system-level diagnostics to check the SMBUS.

SP error code 041h

0242: Invalid FRU information

Message 0242: Invalid FRU information

Description The information from the field-replaceable unit (FRU) Electrically Erasable Programmable Read-Only Memory (EEPROM) is invalid.

Corrective action Program the FRU information through the SP or system-level diagnostics.

SP error code 042h

0250: System battery is dead - Replace and run SETUP

Message 0250: System battery is dead - Replace and run SETUP

Description The real-time clock (RTC) battery is dead.

Corrective action Replace the CMOS battery.

SP error code 050h

0251: System CMOS checksum bad

Message 0251: System CMOS checksum bad -- Default configuration used

Description CMOS checksum is bad, possibly because the system was reset during BIOS boot or because of a dead RTC battery.

Corrective action None. BIOS corrects the error automatically, and the system continues normal boot.

SP error code 051h

0260: System timer error

Message 0260: System timer error

Description The system clock is not ticking.

Corrective action Replace the chipset.

SP error code 060h

0271: Check date and time settings

Message 0271: Check date and time settings

Description Date or time setting is invalid.

Corrective action 1. Set date and time in a proper range.

2. Make sure that the RTC battery is in and not dead.

SP error code 071h

0280: Previous boot incomplete - Default configuration used

Message 0280: Previous boot incomplete -- Default configuration used

Description The previous boot was incomplete, and the default configuration is used.

Corrective action Reboot the system.

SP error code 080h

02A1: SP Not Found

Message 02A1: SP Not Found

Description SP does not respond or SP hangs.

Corrective action Check and replace the SP.

SP error code 0A2h

02A2: BMC System Error Log (SEL) Full

Message 02A2: BMC System Error Log (SEL) Full

Description SP system error log (SEL) is full.

Corrective action Clear the SEL log for SP.

SP error code 0A2h

02A3: No Response From SP To FRU ID Read Request

Message 02A3: No Response From SP To FRU ID Read Request

Description Service Processor fails to respond to the FRU ID read request.

Corrective action Check and replace the Service Processor.

SP error code 0A3h

SP FRU Entry is Blank or Checksum Error

Message SP FRU Entry is Blank or Checksum Error

Description FRU information is invalid.

Corrective action Check and replace the FRU.

SP error code 0A3h

No Response to Controller FRU ID Read Request via IPMI

Message No Response to Controller FRU ID Read Request via IPMI

Description SP does not respond to a controller FRU information inquiry.

Corrective action Check and replace the SP.

SP error code 0A4h

No Response to Midplane FRU ID Read Request via IPMI

Message No Response to Midplane FRU ID Read Request via IPMI

Description The SP does not respond to a midplane FRU information inquiry.

Corrective action Check and replace the SP.

SP error code 0A5h

02C2: No valid Boot Loader in System Flash - Non Fatal

Message 02C2: No valid Boot Loader in System Flash - Non Fatal

Description No valid boot loader is found in system flash memory while the option to Halt For Invalid Boot Loader is disabled in setup. As a result, the system can still boot from the boot media if it has a valid boot loader.

Corrective action Take one of the following actions:

• If the system can boot to the boot loader prompt through the boot media, run the following command to place a good boot loader in system flash:

flash

• If the system cannot boot to the boot loader prompt through the boot media, boot from the backup image through the SP and then enter the following command to place a good boot loader in the corrupted portion of system flash:

flash

SP error code 0C2h

02C3: No valid Boot Loader in System Flash - Fatal

Message 02C3: No valid Boot Loader in System Flash - Fatal

Description No valid boot loader is found in system flash memory while the option to Halt For Invalid Boot Loader is enabled in setup. As a result, the system halts. Users should take corrective action.

Corrective action Place a valid version of the boot loader in the system flash by completing the following steps:

1. Boot the system from the backup boot image.

2. Enter the following command:

flash

SP error code 0C3h
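The difference between messages 02C2 and 02C3 comes down to the Halt For Invalid Boot Loader option in setup. The following Python sketch models that decision as described in the two entries above. The function name and its two boolean parameters are hypothetical and exist only for this illustration; the sketch does not describe how the BIOS is actually implemented.

```python
def boot_loader_post_result(flash_loader_valid, halt_for_invalid_boot_loader):
    """Model which boot loader message is reported, per the 02C2 and 02C3 entries.

    Returns None when system flash holds a valid boot loader. Otherwise returns
    a dict with the console message, the SP error code, and whether the system halts.
    """
    if flash_loader_valid:
        return None
    if halt_for_invalid_boot_loader:
        # Option enabled in setup: the error is fatal and the system halts.
        return {
            "message": "02C3: No valid Boot Loader in System Flash - Fatal",
            "sp_error_code": "0C3h",
            "halts": True,
        }
    # Option disabled in setup: the system can still try the boot media.
    return {
        "message": "02C2: No valid Boot Loader in System Flash - Non Fatal",
        "sp_error_code": "0C2h",
        "halts": False,
    }
```

In the non-fatal case the system continues only if the boot media itself holds a valid boot loader, which this sketch does not model.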

Fatal Error: No DIMM detected and system can not continue boot!

Message Fatal Error: No DIMM detected and system can not continue boot!

Description All DIMM serial presence detect (SPD) EEPROMs are inaccessible due to the hanging of the Inter-Integrated Circuit (I2C) switch for System Management Bus (SMBUS). The system regards the condition as if there were no DIMMs on the system.

Corrective action Complete the following steps:

1. If the message persists, try to power-cycle the system.

2. If the problem persists after power-cycling the system, replace the motherboard.

SP error code 0E8h

Fatal Error! All channels are disabled!

Message Fatal Error! All channels are disabled!

Description All channels of DIMM are disabled.

Corrective action Complete the following steps:

1. Clear CMOS.

2. Power-cycle the system.

3. If the problem persists, replace all DIMMs.

SP error code 0EAh

Software memory test failed!

Message Software memory test failed!

Description Software memory test failed in memory reference code (MRC).

Corrective action Check and replace the bad DIMM modules.

SP error code 0EBh

Fatal Error! RDIMMs and UDIMMs are mixed!

Message Fatal Error! RDIMMs and UDIMMs are mixed!

Description The registered dual inline memory modules (RDIMMs) and unregistered dual inline memory modules (UDIMMs) are mixed in the system.

Corrective action Make sure that the RDIMMs and UDIMMs are not mixed.

SP error code 0EDh

Fatal Error! UDIMM in 3rd slot is not supported!

Message Fatal Error! UDIMM in 3rd slot is not supported!

Description An unregistered dual inline memory module (UDIMM) is populated in the third slot.

Corrective action Make sure that an unregistered dual inline memory module (UDIMM) is not plugged into the third slot.

SP error code 0EEh

Fatal Error! All DIMM failed and system can not continue boot!

Message Fatal Error! All DIMM failed and system can not continue boot!

Description All DIMMs are mapped out either as bad or having the disable flag set. The system has no memory to continue.

Corrective action Complete the following steps:

1. Clear CMOS.

2. Power-cycle the system.

3. If the problem persists, replace all DIMMs.

SP error code N/A

Boot error messages

Boot error messages might appear after the hardware passes all POSTs and your system encounters errors while loading the operating system.

Boot device err

Message Boot device err

Description A CompactFlash card could not be found to boot from.

Corrective action Insert a valid CompactFlash card.

Cannot initialize labels

Message Cannot initialize labels

Description When the system tries to create a new file system, it cannot initialize the disk labels.

Corrective action Usually, you do not need to create and initialize a file system; do so only after consulting technical support.

Cannot read labels

Message Cannot read labels

Description When your system tries to initialize a new file system, it has a problem reading the disk labels it wrote to the disks.

This problem can occur because the system failed to read the disk size, or because the written disk labels were invalid.

Corrective action Usually, you do not need to create and initialize a file system; do so only after consulting technical support.

Configuration exceeds max PCI space

Message Configuration exceeds max PCI space

Description The memory space for mapping PCI adapters has been exhausted, for one of two reasons:

• There are too many PCI adapters in the system.

• An adapter is demanding too many resources.

Corrective action 1. Verify that all expansion adapters in your system are supported.

2. Contact technical support for help. Have a list ready of all expansion adapters installed in your system.

DIMM slot # has correctable ECC errors

Message DIMM slot # has correctable ECC errors

Description The specified DIMM slot has correctable error correction code (ECC) errors.

Corrective action Run diagnostics on your DIMMs. If the problem persists, replace the specified DIMM.

Dirty shutdown in degraded mode

Message Dirty shutdown in degraded mode

Description The file system is inconsistent because you did not shut down the system cleanly when it was in degraded mode.

Corrective action Contact technical support for instructions about repairing the file system.

Disk label processing failed

Message Disk label processing failed

Description Your system detects that the disk is not in the correct drive bay.

Corrective action Make sure that the disk is in the correct bay.

Drive %s.%d not supported

Message Drive %s.%d not supported

Description %s: the disk number; %d: the disk ID number. The system detects an unsupported disk drive.

Corrective action 1. Remove the drive immediately or the system drops down to the programmable ROM (PROM) monitor within 30 seconds.

2. Check the System Configuration Guide on the NetApp Support Site at support.netapp.com to verify support for your disk drive.

Error detection detected too many errors to analyze at once

Message Error detection detected too many errors to analyze at once

Description This message occurs when other error messages occur at the same time.

Corrective action See the other error messages and their respective corrective actions. If the problem persists, contact technical support.

FC-AL loop down, adapter %d

Message FC-AL loop down, adapter %d

Description The system cannot detect the Fibre Channel-Arbitrated Loop (FC-AL) loop or adapter.

Corrective action 1. Identify the adapter by entering the following command:

storage show adapter

2. Turn off the power on your system and verify that the adapter is properly seated in the expansion slot.

3. Verify that all Fibre Channel cables are connected.

File system may be scrambled

Message File system may be scrambled

Descriptions and corrective actions The following entries list errors that cause the file system to become inconsistent and steps you can take to correct the problem.

Description An unclean shutdown when your system is in degraded mode and when NVRAM is not working.

Corrective action Contact technical support to learn how to start the system from a system boot diskette and repair the file system.

Description The number of disks detected in the disk array is different from the number of disks recorded in the disk labels. The system cannot start when more than one disk is missing.

Corrective action Make sure that all disks on the system are properly installed in the disk shelves.

Description The system encounters a read error while reconstructing parity.

Corrective action Contact technical support for help.

Description A disk failed at the same time the system crashed.

Corrective action Contact technical support to learn how to repair the file system.

Halted disk firmware too old

Message Halted disk firmware too old

Description The disk firmware is an old version.

Corrective action Update the disk firmware by entering the following command:

disk_fw_update

Halted: Illegal configuration

Message Halted: Illegal configuration

Description The HA pair configuration is incorrect.

Corrective action 1. Check the console for details.

2. Verify that all cables are correctly connected.

Invalid PCI card slot %d

Message Invalid PCI card slot %d

Description %d: the expansion slot number. The system detects an adapter that is not supported.

Corrective action Replace the unsupported adapter with an adapter that is included in the System Configuration Guide at http://now.netapp.com.
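Several boot error messages contain %s and %d placeholders that the system fills in with a name or number when it prints the message. As an illustration of how such templates relate to the text that appears on the console, the following Python sketch converts a template into a regular expression and extracts the values. The helper names and the sample values used in the usage note (0a, 16, 3) are made up for this example; real console output might differ in spacing or surrounding text.

```python
import re

# Message templates taken from the boot error message entries in this section.
BOOT_MESSAGE_TEMPLATES = [
    "Drive %s.%d not supported",
    "FC-AL loop down, adapter %d",
    "Invalid PCI card slot %d",
]


def template_to_regex(template):
    """Turn a printf-style template into an anchored regular expression."""
    pattern = ""
    # re.split keeps the placeholders because the group is capturing.
    for part in re.split(r"(%s|%d)", template):
        if part == "%s":
            pattern += r"(\S+?)"   # shortest run of non-space characters
        elif part == "%d":
            pattern += r"(\d+)"    # decimal number
        else:
            pattern += re.escape(part)
    return re.compile("^" + pattern + "$")


def match_boot_message(line):
    """Return (template, values) for the first matching template, or None."""
    for template in BOOT_MESSAGE_TEMPLATES:
        match = template_to_regex(template).match(line.strip())
        if match:
            return template, match.groups()
    return None
```

For example, match_boot_message("Drive 0a.16 not supported") yields the first template with the values ("0a", "16"), which are returned as strings.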

No /etc/rc

Message No /etc/rc

Description The /etc/rc file is corrupted.

Corrective action 1. At the hostname> prompt, enter

setup

2. As the system prompts for system configuration information, use the information you recorded in your system configuration information worksheet in the Getting Started Guide.

For more information about your system setup program, see the appropriate system administration guide.

No disk controllers

Message No disk controllers

Description The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disk controllers.

Corrective action 1. Turn off your system power.

2. Verify that all NICs are properly seated in the appropriate expansion slots.

No disks

Message No disks

Description The system cannot detect any Fibre Channel-Arbitrated Loop (FC-AL) disks.

Corrective action Verify that all disks are properly seated in the drive bays.

No /etc/rc, running setup

Message No /etc/rc, running setup

Description The system cannot find the /etc/rc file and automatically starts setup.

Corrective action As the system prompts for system configuration information, use the information you recorded in your system configuration information worksheet in the Getting Started Guide.

For more information about your system setup program, see the appropriate system administration guide.

No network interfaces

Message No network interfaces

Description The system cannot detect any network interfaces.

Corrective action 1. Turn off the system and verify that all network interface cards (NICs) are seated properly in the appropriate expansion slots.

2. Run diagnostics to check the onboard Ethernet port.

3. If the problem persists, contact technical support.

No NVRAM present

Message No NVRAM present

Description The system cannot detect the NVRAM adapter.

Corrective action Make sure that the NVRAM adapter is securely installed in the appropriate expansion slot.

NVRAM #n downrev

Message NVRAM #n downrev

Description n: the serial number of the nonvolatile RAM (NVRAM) adapter. The NVRAM adapter is an early revision that cannot be used with the system.

Corrective action Check the console for information about which revision of the NVRAM adapter is required. Replace the NVRAM adapter.

NVRAM: wrong pci slot

Message NVRAM: wrong pci slot

Description The system cannot detect the nonvolatile RAM (NVRAM) adapter.

Corrective action • For a stand-alone 3020 or 3050 system, make sure that the NVRAM adapter is in slot 1.

• For a 3020 or 3050 system in an HA pair, make sure that the NVRAM adapter is in slot 2.

Panic: DIMM slot #n has uncorrectable ECC errors

Message Panic: DIMM slot #n has uncorrectable ECC errors. Replace these DIMMS.

Description The specified DIMM has uncorrectable ECC errors.

Corrective action Replace the specified DIMM.

This platform is not supported on this release

Message This platform is not supported on this release. Please consult the release notes.

Please downgrade to a supported release! Shutting down: EOL platform

Description This platform is not supported on this release. Please consult the release notes for your software.

Corrective action You must downgrade your software version to a compatible release.

Verify that you have the correct URL for software download.

Too many errors in too short time

Message Too many errors in too short time

Description The error detection system is experiencing problems. This message occurs when other error messages occur at the same time.

Corrective action See the other error messages and their respective corrective actions. If the problem persists, call technical support.

Warning: Motherboard Revision not available

Message Warning: Motherboard Revision not available. Motherboard is not programmed.

Description The system motherboard is not programmed with the correct revision.

Corrective action Replace the motherboard.

Warning: Motherboard Serial Number not available

Message Warning: Motherboard Serial Number not available. Motherboard is not programmed

Description The system motherboard is not programmed with the correct serial number.

Corrective action Replace the motherboard.

Warning: system serial number is not available

Message Warning: system serial number is not available. System backplane is not programmed.

Description The backplane of your system does not have the correct system serial number.

Corrective action Report the problem to technical support so that your system can be replaced.

Watchdog error

Message Watchdog error

Description An error occurred during the testing of the watchdog timer.

Corrective action Replace the motherboard.

Watchdog failed

Message Watchdog failed

Description Your system watchdog reset hardware, used to reset your system from a system hang condition, is not functioning properly.

Corrective action Replace the motherboard.

EMS and operational messages

You might encounter various messages on your system during normal operation.

The EMS collects event data from various parts of the Data ONTAP kernel and displays information about those events in AutoSupport messages. EMS messages appear on your system console or LCD and provide information about disk drives, disk shelves, system power supply, system fans, and acceleration modules.

Operational error messages might appear on your system console or LCD when the system is operating, when it is halted, or when it is restarting because of system problems.

Environmental EMS messages

EMS messages appear on the console and in AutoSupport messages if your system encounters extremes in its operational environment. They also appear on the LCD display if your system has one.

Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the system is never shut down because of a single power supply failure. Removing one power supply does not shut down the system.

Note: Degraded power might be caused by bad power supplies, bad wall power, or bad components on the motherboard. If spare power supplies are available, try replacing them to see whether that alleviates the problem.

Chassis fan FRU failed

Message Chassis fan FRU failed: current speed is 4272 RPM, on [time stamp].

LCD display Fans stopped; replace them

LED behavior FRU LED: Green if problem is PSU; off if problem is fan.

Description This message occurs when a system fan fails.

Corrective action Check LEDs on the fans and power supply.

• If both fan LEDs are green, run diagnostics on the power supplies.

• If the fan LED is off, replace the fan.

SNMP trap ID #414: Chassis fan is degraded

Chassis over temperature on XXXX

LCD display Temperature exceeds limits

Message Chassis over temperature on XXXX at [time stamp].

Description This message occurs when the system is operating above the high-temperature threshold.

Corrective action 1. Make sure that the system has proper ventilation.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #372: Chassis temperature is too hot

Chassis over temperature shutdown on XXXX

Message Chassis over temperature shutdown on XXXX at [time stamp].

LCD display Temperature exceeds limits

Description This message occurs when the system is operating above the high-temperature threshold. The system shuts down immediately.

Corrective action 1. Make sure that the system has proper ventilation.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #371: Chassis temperature is too hot

Chassis Power Degraded: 3.3V in warn high state

Message Chassis Power Degraded: 3.3V is in warn high state current voltage is 3273 mV on XXXX at [time stamp].

LCD display Power supply degraded

Description This message occurs when the system is operating above the high-voltage threshold.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #403: Chassis power is degraded
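The voltage messages carry the rail, the warning state, and the measured voltage in a fixed wording, so a monitoring script could pull those fields out of a console or log line. The following Python sketch shows one way to do that for the message format shown in this entry. The function name, the returned field names, and the regular expression are assumptions for this example and are not taken from Data ONTAP; actual log lines might include prefixes or differ in capitalization.

```python
import re

# Assumption: the wording matches the Message field shown in this guide.
_POWER_MESSAGE = re.compile(
    r"Chassis power (?P<event>degraded|shutdown): "
    r"(?P<rail>\d+(?:\.\d+)?)V is in warn (?P<state>high|low) state "
    r"current voltage is (?P<millivolts>\d+) mV",
    re.IGNORECASE,
)


def parse_power_message(line):
    """Extract event, rail, warning state, and measured voltage, or return None."""
    match = _POWER_MESSAGE.search(line)
    if match is None:
        return None
    return {
        "event": match.group("event").lower(),
        "rail_volts": float(match.group("rail")),
        "state": match.group("state").lower(),
        "millivolts": int(match.group("millivolts")),
    }
```

The same pattern also matches the "Chassis power shutdown: 3.3V is in warn low state" message, because only the event and state words differ.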

Chassis power degraded: PS#

Message Chassis Power degraded: PS#

LCD display Power supply degraded

LED behavior FRU LED: Amber

Description This message occurs when there is a problem with one of the power supplies.

Corrective action 1. Check that the power supply is seated properly in its bay and that all power cords are connected.

2. Power-cycle your system and run diagnostics on the identified power supply.

3. If the problem persists, replace the identified power supply.

SNMP trap ID #392: Chassis power supply is degraded

Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the system is never shut down because of a single power supply failure. Removing one power supply does not shut down the system.

Chassis Power Fail: PS#

Message Chassis Power Fail: PS#

LCD display Power supply degraded

Description This message occurs when the power supply fails.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #6: Chassis power is degraded

Chassis Power Shutdown

Message Chassis Power Shutdown: Chassis Power Supply Fail: PS#

LCD display Power supply degraded

LED behavior FRU LED: Amber

Description This message occurs when the system is in a warning state. The system shuts down immediately.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #392: Chassis power supply is degraded

Note: In 31xx systems, both controllers in a chassis share the power supplies. As a result, the system is never shut down because of a single power supply failure. Removing one power supply does not shut down the system.

Chassis power shutdown: 3.3V in warn low state

Message Chassis power shutdown: 3.3V is in warn low state current voltage is 3273 mV on XXXX at [time stamp].

LCD display Power supply degraded

Description This message occurs when the system is operating below the low-voltage threshold. The system shuts down immediately.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #403: Chassis power is degraded

Chassis Power Supply: PS# removed

Message Chassis Power Supply: PS# removed system will shutdown in 2 minutes

LCD display Power supply degraded

LED behavior FRU LED: Amber

Description This message occurs when the power supply unit is removed from the system. The system will shut down unless the power supply is replaced.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #501: Chassis power supply is degraded

Chassis power supply degraded: PS#

Note: This message appears only on 31xx systems.

Message Chassis power supply degraded: PS#

LED behavior FRU LED: Amber

Description This message occurs when there is a problem with one of the power supplies.

Corrective action 1. Check that the power supply is seated properly in its bay and that all power cords are connected.

2. Power-cycle your system and run diagnostics on the identified power supply.

3. If the problem persists, replace the identified power supply.

SNMP trap ID #392: Chassis power supply is degraded

Chassis power supply fail: PS#

LCD display Power supply degraded

Message Chassis power supply fail: PS#

Description This message occurs when the system is operating below the low-voltage threshold. The system shuts down immediately.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID N/A

Chassis power supply off: PS#

Note: This message appears only on 31xx systems.

Message Chassis Power supply off: PS#

LED behavior FRU LED: Off

Description This message occurs when the power supply unit is turned off.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is present and is switched off, turn the switch on.

• If the power supply is present and turned on, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #395: Power supply not present

Chassis power supply off: PS#

Message Chassis power supply off: PS#

LCD display Power supply degraded

Description This message occurs when one or more chassis power supplies are turned off.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #395: Power supply not present

Chassis power supply OK: PS#

Note: This message appears only on 31xx systems.

Message Chassis power supply OK: PS#

LED behavior FRU LED: Green

Description This message occurs when the power supply is operating normally.

Corrective action None.

SNMP trap ID #397: Chassis power supply (%id) is OK

Chassis power supply removed: PS#

Note: This message appears only on 31xx systems.

Message Chassis power supply removed: PS#

LED behavior N/A

Description This message occurs when the power supply unit is removed from the system.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.

• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #394: I/O expansion module is not present in the chassis

Chassis under temperature on XXXX

Message Chassis under temperature on XXXX at [time stamp].

LCD display Temperature exceeds limits

Description This message occurs when the system is operating below the low-temperature threshold.

Corrective action 1. Raise the ambient temperature around the system.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #372: Chassis temperature is too cold

Chassis under temperature shutdown on XXXX

Message Chassis under temperature shutdown on XXXX at [time stamp].

LCD display Temperature exceeds limits

Description This message occurs when the system is operating below the low-temperature threshold. The system shuts down immediately.

Corrective action 1. Check that the system has proper ventilation. You might need to raise the ambient temperature around the system.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #371: Chassis temperature is too cold

Fan: # is spinning below tolerable speed

Message Fan: # is spinning below tolerable speed replace immediately to avoid overheating

LCD display Fans stopped; replace them

Description This message occurs when one or more chassis fans are spinning too slowly.

Corrective action Check LEDs on the fans.

• If both fan LEDs are green, run diagnostics on the motherboard.

• If the fan LED is off, replace the fan.

SNMP trap ID #415: Chassis fan is degraded

monitor.chassisFan.degraded

Message monitor.chassisFan.degraded

Severity ALERT

Description This message is issued when a chassis fan is degraded.

Corrective action The fan unit should be replaced.

SNMP trap ID #412 Chassis fan is degraded: %s

monitor.chassisFan.ok

Message monitor.chassisFan.ok

Severity NOTICE

Description This message occurs when the chassis fans are OK.

Corrective action N/A

SNMP trap ID #366 Chassis FRU is OK

monitor.chassisFan.removed

Message monitor.chassisFan.removed

Severity ALERT

Description This message occurs when a chassis fan is removed.

Corrective action Replace the fan unit.

SNMP trap ID #363 Chassis FRU is removed

monitor.chassisFan.slow

Message monitor.chassisFan.slow

Severity ALERT

Description This message occurs when a chassis fan is spinning too slowly.

Corrective action Replace the fan unit.

SNMP trap ID #365 Chassis FRU contains at least one fan spinning slowly

monitor.chassisFan.stop

Message monitor.chassisFan.stop

Severity ALERT

Description This message occurs when a chassis fan is stopped.

Corrective action Replace the fan unit.

SNMP trap ID #364 Chassis FRU contains at least one stopped fan

monitor.chassisFan.warning

Message monitor.chassisFan.warning

Severity ALERT

Description This message is issued when a chassis fan is spinning either too slowly or too fast. This is a warning message.

Corrective action The fan unit should be replaced.

SNMP trap ID #415 Chassis fan is in warning state

monitor.chassisFanFail.xMinShutdown

Message monitor.chassisFanFail.xMinShutdown

Severity EMERG

Description This message indicates that multiple chassis fans have failed and the system will shut down in a few minutes unless the problem is corrected.

Corrective action Make sure the system fans are working.

SNMP trap ID #511 Multiple Chassis Fan failure: System will shut down in 2 minutes.

monitor.chassisPower.degraded

Message monitor.chassisPower.degraded

Severity NOTICE

Description This message indicates that a power supply is degraded.

Corrective action 1. If spare power supplies are available, try replacing them to see whether that alleviates the problem.

2. Otherwise, contact technical support for further instruction.

SNMP trap ID #403 Chassis power is degraded

monitor.chassisPower.ok

Message monitor.chassisPower.ok

Severity NOTICE

Description This message indicates that the motherboard power is OK.

Corrective action N/A

SNMP trap ID #406 Normal operation

monitor.chassisPowerSupplies.ok

Message monitor.chassisPowerSupplies.ok

Severity INFO

Description This message indicates that all power supplies are OK.

Corrective action N/A

SNMP trap ID #396 Normal operation

monitor.chassisPowerSupply.degraded

Message monitor.chassisPowerSupply.degraded

Severity INFO

Description This message indicates that a power supply is degraded.

Corrective action A replacement power supply might be required. Contact technical support for further instruction.

SNMP trap ID #392 Chassis power supply is degraded

monitor.chassisPowerSupply.notPresent

Message monitor.chassisPowerSupply.notPresent

Severity NOTICE

Description This message indicates that a power supply is not present.

Corrective action Replace the power supply.

SNMP trap ID #394 Power supply not present

monitor.chassisPowerSupply.off

Message monitor.chassisPowerSupply.off

Severity NOTICE

Description This message indicates that a power supply is turned off.

Corrective action Turn on the power supply.

SNMP trap ID #395 Power supply not present

monitor.chassisPowerSupply.ok

Message monitor.chassisPowerSupply.ok

Severity INFO

Description This message indicates that the power supply is OK.

Corrective action None.

SNMP trap ID #397 Chassis power supply (%id) is OK

monitor.chassisTemperature.cool

Message monitor.chassisTemperature.cool

Severity ALERT

Description This message occurs when the chassis temperature is too cool.

Corrective action Raise the temperature around the system.

SNMP trap ID #372 Chassis temperature is too cool

monitor.chassisTemperature.ok

Message monitor.chassisTemperature.ok

Severity NOTICE

Description This message occurs when the chassis temperature is normal.

Corrective action N/A

SNMP trap ID #376 Normal operation

monitor.chassisTemperature.warm

Message monitor.chassisTemperature.warm

Severity ALERT

Description This message occurs when the chassis temperature is too warm.

Corrective action Check to see whether air conditioning units are needed, or whether they are functioning properly.

SNMP trap ID #372 Chassis temperature is too warm

monitor.cpuFan.degraded

Message monitor.cpuFan.degraded

Severity NOTICE

Description This message indicates that a CPU fan is degraded.

Corrective action 1. Replace the identified fan.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #383 A CPU fan is not operating properly

monitor.cpuFan.failed

Message monitor.cpuFan.failed

Severity NOTICE

Description This message indicates that a CPU fan is degraded.

Corrective action 1. Replace the identified fan.

2. Power-cycle the system and run diagnostics on the system.

SNMP trap ID #381: CPU fan is stopped

monitor.cpuFan.ok

Message monitor.cpuFan.ok

Severity INFO

Description This message indicates that a CPU fan is OK.

Corrective action N/A

SNMP trap ID #386 Normal operation

monitor.ioexpansionPower.degraded

Message monitor.ioexpansionPower.degraded

Severity NOTICE

Description This message indicates that power on the I/O expansion module is degraded.

Corrective action Degraded power might be caused by bad power supplies, bad wall power, or bad components on the motherboard. If spare power supplies are available, try exchanging them to see whether the problem is resolved. Otherwise, contact technical support.

SNMP trap ID #403 Power on IO expansion is degraded:

monitor.ioexpansionPower.ok

Message monitor.ioexpansionPower.ok

Severity NOTICE

Description This message indicates that power on the I/O expansion module is OK.

Corrective action None.

SNMP trap ID #406 Power on IO expansion module is OK

monitor.ioexpansionTemperature.cool

Message monitor.ioexpansionTemperature.cool

Severity ALERT

Description This warning message occurs when the I/O expansion module is too cold.

Corrective action The system cannot function in an environment that is too cold; find ways to warm the system.

SNMP trap ID #372 I/O expansion module is too cold:

monitor.ioexpansionTemperature.ok

Message monitor.ioexpansionTemperature.ok

Severity NOTICE

Description This message occurs when the temperature of the I/O expansion module is normal. It can occur in the following two cases: 1) as LOG_NOTICE, to show that a bad condition has reverted to normal; 2) as LOG_INFO, hourly, to indicate that the temperature is OK.

Corrective action None.

SNMP trap ID #376 Temperature of the I/O expansion module is OK.

monitor.ioexpansionTemperature.warm

Message monitor.ioexpansionTemperature.warm

Severity ALERT

Description This warning message occurs when the I/O expansion module is too warm.

Corrective action Evaluate the environment in which the system is functioning: Are air conditioning units needed or is the current air conditioning not functioning properly?

SNMP trap ID #372 I/O expansion module is too warm:

monitor.ioexpansion.unpresent

Message monitor.ioexpansion.unpresent

Severity NOTICE

Description This message occurs when the I/O expansion module is not inserted into the chassis.

Corrective action None.

SNMP trap ID #394: I/O expansion module is not present in the chassis.

monitor.nvmembattery.warninglow

Message monitor.nvmembattery.warninglow

Severity WARNING

Description This message occurs when the NVMEM (nonvolatile memory) lithium battery is low on power.

Corrective action Replace the NVMEM battery as soon as practical.

SNMP trap ID #63 NVMEM battery is low on power and should be replaced as soon as practical.

monitor.nvramLowBattery

Message monitor.nvramLowBattery

Severity NODE_ERROR

Description This message occurs when the NVRAM batteries are discovered to be at a dangerously low power level.

Corrective action Contact technical support.

SNMP trap ID N/A

monitor.power.unreadable

Message monitor.power.unreadable

Severity INFO

Description This message occurs when a power sensor in the controller module is not readable.

Corrective action Shut down the system and power-cycle the controller module. If the sensor is still not readable, replace the controller module.

SNMP trap ID N/A

monitor.shutdown.cancel

Message monitor.shutdown.cancel

Severity WARNING

Description This message is issued when an automatic shutdown sequence has been canceled.

Corrective action None.

SNMP trap ID #6 Automatic shutdown sequence canceled

monitor.shutdown.cancel.nvramLowBattery

Message monitor.shutdown.cancel.nvramLowBattery

Severity WARNING

Description This message is issued when an automatic shutdown sequence has been postponed due to RAID reconstruction.

Corrective action Unknown

SNMP trap ID #6 NVRAM battery is dangerously Low. Halt delayed until %s finishes.

monitor.shutdown.chassisOverTemp

Message monitor.shutdown.chassisOverTemp

Severity CRIT

Description This message occurs just before shutdown, indicating that the chassis temperature is too hot.

Corrective action Check to see whether air conditioning units are needed, or whether they are functioning properly.

SNMP trap ID #371 Chassis temperature is too hot

monitor.shutdown.chassisUnderTemp

Message monitor.shutdown.chassisUnderTemp

Severity CRIT

Description This message occurs just before shutdown, indicating that the chassis temperature has become too cold.

Corrective action Raise the temperature around the system.

SNMP trap ID #371 Chassis temperature is too cold

monitor.shutdown.emergency

Message monitor.shutdown.emergency

Severity NODE_FAULT

Description This message is issued when an emergency shutdown is initiated.

Corrective action None.

SNMP trap ID #6 Emergency shutdown: %s

monitor.shutdown.ioexpansionOverTemp

Message monitor.shutdown.ioexpansionOverTemp

Severity CRIT

Description This message occurs when the I/O expansion module is too hot. This message is sent just before shutdown.

Corrective action The system environment is too hot; cool the environment.

SNMP trap ID #371 I/O expansion module is too hot:

monitor.shutdown.nvramLowBattery.pending

Message monitor.shutdown.nvramLowBattery.pending

Severity WARNING

Description This message is issued when an automatic shutdown sequence is pending due to a low battery.

Corrective action Replace the battery.

SNMP trap ID #62 Emergency shutdown: NVRAM battery dangerously low in degraded mode. Replace the battery immediately!

monitor.temp.unreadable

Message monitor.temp.unreadable

Severity INFO

Description This message occurs when the controller module temperature is not readable. The system does not automatically shut down if it becomes too hot for reliable operation.

Corrective action Shut down the system and power-cycle the controller module. If the temperature is still not readable, replace the controller module.

SNMP trap ID N/A

Multiple chassis fans have failed

Message Multiple chassis fans have failed; system will shut down in 2 minutes.

LCD display Fans stopped; replace them.

Description This message occurs during a multiple chassis fan failure. The system shuts down in two minutes if this condition is uncorrected.

Corrective action 1. Replace both fans.

2. Power-cycle and run diagnostics on the system.

SNMP trap ID #511: Chassis fan is degraded

Multiple fan failure on XXXX

Message Multiple fan failure on XXXX at [time stamp].

LCD display Fans stopped; replace them.

LED behavior FRU LED: Amber

Description This message occurs when both system fans fail. The system shuts down immediately.

Corrective action 1. Replace both fans.

2. Power-cycle and run diagnostics on the system.

SNMP trap ID #6 Emergency shutdown

Multiple power supply fans failed

Message Multiple power supply fans failed; system will shut down in 2 minutes.

LCD display Power supply degraded

Description This message occurs when multiple power supplies and fans have failed. The system shuts down in two minutes if this condition is uncorrected.

Corrective action Your action depends on whether the power supply is present.

• If the power supply is not inserted, insert it.
• If the power supply is inserted, power-cycle your system and run diagnostics on the identified power supply. If the problem persists, replace the identified power supply.

SNMP trap ID #521: Chassis power is degraded

nvmem.battery.capacity.low

Message nvmem.battery.capacity.low

Severity NODE_ERROR

Description This message occurs when the NVMEM battery lacks the capacity to preserve the NVMEM contents for the required minimum of 72 hours. The system is at risk of data loss if the power fails. This message repeats every hour while the problem continues, and the system shuts down in 24 hours if automatic recharging of the battery does not restore its charge.

Corrective action Correct any environmental problems, such as chassis over-temperature. The battery charges automatically. If the capacity is not restored in several hours, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A
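The timing in the nvmem.battery.capacity.low description (an hourly repeat and a shutdown after 24 hours) can be worked out for a given first occurrence. This is a rough sketch only: the function name is hypothetical, and it assumes that the 24-hour countdown starts with the first message, which the entry does not state explicitly.

```python
from datetime import datetime, timedelta

REPEAT_INTERVAL = timedelta(hours=1)   # the message repeats every hour
SHUTDOWN_DELAY = timedelta(hours=24)   # shutdown if the charge is not restored


def capacity_low_timeline(first_seen):
    """Return (shutdown_time, repeat_times) for a capacity-low condition.

    Assumes the 24-hour countdown starts at the first message, which the
    entry above does not state explicitly.
    """
    shutdown_time = first_seen + SHUTDOWN_DELAY
    repeats = []
    t = first_seen + REPEAT_INTERVAL
    while t < shutdown_time:
        repeats.append(t)
        t += REPEAT_INTERVAL
    return shutdown_time, repeats
```

Under that assumption, a first message at 08:00 gives a projected shutdown at 08:00 the next day, with hourly repeats in between unless the battery recharges.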

nvmem.battery.capacity.low.warn

Message nvmem.battery.capacity.low.warn

Severity INFO

Description This message occurs when the NVMEM battery capacity is below normal.

Corrective action None.

SNMP trap ID N/A

nvmem.battery.capacity.normal

Message nvmem.battery.capacity.normal

Severity INFO

Description This message occurs when the NVMEM battery capacity is normal.

Corrective action None.

SNMP trap ID N/A

nvmem.battery.current.high

Message nvmem.battery.current.high

Severity NODE_ERROR

Description This message occurs when the NVMEM battery current is excessively high and the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVMEM battery current is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvmem.battery.current.high.warn

Message nvmem.battery.current.high.warn

Severity INFO

Description This message occurs when the NVMEM battery current is above normal.

Corrective action None.

SNMP trap ID N/A

nvmem.battery.sensor.unreadable

Message nvmem.battery.sensor.unreadable

Severity INFO

Description This message occurs when the battery state of the battery-backed memory (NVMEM) is unknown. One of the battery sensors is not readable.

Corrective action Shut down the system and power-cycle the controller module. If the problem is not corrected, replace the battery. If the sensor is still not readable, replace the controller module.

SNMP trap ID N/A

nvmem.battery.temp.high

Message nvmem.battery.temp.high

Severity NODE_ERROR

Description This message occurs when the NVMEM battery is too hot and the system is at a high risk of data loss if power fails.

Corrective action If the system is excessively warm, allow it to cool gradually. If the NVMEM battery temperature reading is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvmem.battery.temp.low

Message nvmem.battery.temp.low

Severity NODE_ERROR

Description This message occurs when the NVMEM battery is too cold and the system is at a high risk of data loss if power fails.

Corrective action If the system is excessively cold, allow it to warm gradually. If the NVMEM battery temperature reading is still too low, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvmem.battery.temp.normal

Message nvmem.battery.temp.normal

Severity INFO

Description This message occurs when the NVMEM battery temperature is normal.

Corrective action None.

SNMP trap ID N/A

nvmem.battery.voltage.high

Message nvmem.battery.voltage.high

Severity NODE_ERROR

Description This message occurs when the NVMEM battery voltage is excessively high and the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVMEM battery voltage is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvmem.battery.voltage.high.warn

Message nvmem.battery.voltage.high.warn

Severity INFO

Description This message occurs when the NVMEM battery voltage is above normal.

Corrective action None.

SNMP trap ID N/A

nvmem.battery.voltage.normal

Message nvmem.battery.voltage.normal

Severity INFO

Description This message occurs when the NVMEM battery voltage is normal.

Corrective action None.

SNMP trap ID N/A

nvmem.voltage.high

Message nvmem.voltage.high

Severity NODE_ERROR

Description This message occurs when the NVMEM supply voltage is high and the system is at a high risk of data loss if power fails.

Corrective action First, correct any environmental or battery problems. If the problem continues, replace the controller module.

SNMP trap ID N/A

nvmem.voltage.high.warn

Message nvmem.voltage.high.warn

Severity INFO

Description This message occurs when the NVMEM supply voltage is above normal.

Corrective action None.

SNMP trap ID N/A

nvmem.voltage.normal

Message nvmem.voltage.normal

Severity INFO

Description This message occurs when the NVMEM supply voltage is normal.

Corrective action None.

SNMP trap ID N/A

nvram.bat.missing.error

Message nvram.bat.missing.error

Severity NODE_ERROR

Description This message occurs when the battery in the chassis is degrading.

Corrective action Contact technical support.

SNMP trap ID N/A

nvram.battery.capacity.low

Message nvram.battery.capacity.low

Severity NODE_ERROR

Description This message occurs when the NVRAM battery lacks the capacity to preserve the NVRAM contents for the required minimum of 72 hours. The system is at risk of data loss if the power fails. This message repeats every hour while the problem continues, and the system shuts down in 24 hours if automatic recharging of the battery does not restore its charge.

Corrective action Correct any environmental problems, such as chassis over-temperature. The battery charges automatically. If the capacity is not restored in several hours, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.capacity.low.critical

Message nvram.battery.capacity.low.critical

Severity NODE_ERROR

Description This message occurs when the NVRAM battery capacity is dangerously low. To prevent data loss, the system will shut down in 20 minutes.

Corrective action Correct any environmental problems, such as chassis over-temperature. The battery charges automatically. If the capacity is not restored automatically, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.capacity.low.warn

Message nvram.battery.capacity.low.warn

Severity INFO

Description This message occurs when the NVRAM battery capacity is below normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.capacity.normal

Message nvram.battery.capacity.normal

Severity INFO

Description This message occurs when the NVRAM battery capacity is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.charging.nocharge

Message nvram.battery.charging.nocharge

Severity NODE_ERROR

Description This message occurs when the NVRAM battery is requesting to be charged but the charger is not charging the battery. To prevent data loss, the system will shut down in 20 minutes.

Corrective action Replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.charging.normal

Message nvram.battery.charging.normal

Severity INFO

Description This message occurs when the NVRAM battery charging status is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.charging.wrongcharge

Message nvram.battery.charging.wrongcharge

Severity NODE_ERROR

Description This message occurs when the NVRAM battery charger is charging the battery even though the battery is not requesting to be charged. To prevent data loss, the system will be shut down in 20 minutes.

Corrective action Replace the NVRAM battery. If the problem persists, replace the NVRAM card.

SNMP trap ID N/A

nvram.battery.current.high

Message nvram.battery.current.high

Severity NODE_ERROR

Description This message occurs when the NVRAM battery current is excessively high and the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM battery current is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.current.high.warn

Message nvram.battery.current.high.warn

Severity INFO

Description This message occurs when the NVRAM battery current is above normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.current.low

Message nvram.battery.current.low

Severity NODE_ERROR

Description This message occurs when the NVRAM battery has a short circuit.

Corrective action Replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.current.low.warn

Message nvram.battery.current.low.warn

Severity NODE_ERROR

Description This message occurs when the NVRAM battery current is below normal.

Corrective action First, correct any environmental problems. If the NVRAM battery current is still below normal, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.current.normal

Message nvram.battery.current.normal

Severity INFO

Description This message occurs when the NVRAM battery current is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.end_of_life.high

Message nvram.battery.end_of_life.high

Severity INFO

Description This message occurs when the NVRAM battery-cycle count indicates that the battery has reached its anticipated life expectancy.

Corrective action None.

SNMP trap ID N/A

nvram.battery.end_of_life.normal

Message nvram.battery.end_of_life.normal

Severity INFO

Description This message occurs when the NVRAM battery-cycle count indicates that the battery is well below its anticipated life expectancy.

Corrective action None.

SNMP trap ID N/A

nvram.battery.fault

Message nvram.battery.fault

Severity NODE_ERROR

Description This message occurs when the NVRAM battery is reporting a fatal fault condition. To prevent data loss, the system will shut down in 2 minutes.

Corrective action Correct any environmental problems, such as chassis over-temperature. If the battery still reports a fatal fault condition, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.fault.warn

Message nvram.battery.fault.warn

Severity INFO

Description This message occurs when the NVRAM battery is reporting a non-fatal fault condition.

Corrective action Correct any environmental problems, such as chassis over-temperature.

SNMP trap ID N/A

nvram.battery.fcc.low

Message nvram.battery.fcc.low

Severity NODE_ERROR

Description This message occurs when the NVRAM battery full-charge capacity is low. To prevent data loss, the system will shut down in 24 hours.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM full-charge capacity is still dangerously low, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.fcc.low.critical

Message nvram.battery.fcc.low.critical

Severity NODE_ERROR

Description This message occurs when the NVRAM battery full-charge capacity is dangerously low. To prevent data loss, the system will shut down in 20 minutes.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM full-charge capacity is still dangerously low, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.fcc.low.warn

Message nvram.battery.fcc.low.warn

Severity INFO

Description This message occurs when the NVRAM battery full-charge capacity is below normal.

Corrective action Replace the NVRAM battery/card during your next scheduled downtime (within 3 months).

SNMP trap ID N/A

nvram.battery.fcc.normal

Message nvram.battery.fcc.normal

Severity INFO

Description This message occurs when the NVRAM battery full-charge capacity is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.power.fault

Message nvram.battery.power.fault

Severity NODE_ERROR

Description This message occurs when the NVRAM battery is not getting powered.

Corrective action Correct any environmental problems, such as chassis over-temperature. If the NVRAM battery is still not getting power, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.power.normal

Message nvram.battery.power.normal

Severity INFO

Description This message occurs when the NVRAM battery power is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.sensor.unreadable

Message nvram.battery.sensor.unreadable

Severity INFO

Description This message occurs when the battery state of the battery-backed memory (NVRAM) is unknown. One of the battery sensors is not readable.

Corrective action Shut down the system and power-cycle the controller module. If the problem is not corrected, replace the NVRAM battery/card. If the sensor is still not readable, replace the controller module.

SNMP trap ID N/A

nvram.battery.temp.high

Message nvram.battery.temp.high

Severity NODE_ERROR

Description This message occurs when the NVRAM battery is too hot and the system is at a high risk of data loss if power fails.

Corrective action If the system is excessively warm, allow it to cool gradually. If the NVRAM battery temperature reading is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.temp.high.warn

Message nvram.battery.temp.high.warn

Severity INFO

Description This message occurs when the NVRAM battery temperature is high.

Corrective action None.

SNMP trap ID N/A

nvram.battery.temp.low

Message nvram.battery.temp.low

Severity NODE_ERROR

Description This message occurs when the NVRAM battery is too cold and the system is at a high risk of data loss if power fails.

Corrective action If the system is excessively cold, allow it to warm gradually. If the NVRAM battery temperature reading is still too low, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.temp.low.warn

Message nvram.battery.temp.low.warn

Severity INFO

Description This message occurs when the NVRAM battery temperature is low.

Corrective action None.

SNMP trap ID N/A

nvram.battery.temp.normal

Message nvram.battery.temp.normal

Severity INFO

Description This message occurs when the NVRAM battery temperature is normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.voltage.high

Message nvram.battery.voltage.high

Severity NODE_ERROR

Description This message occurs when the NVRAM battery voltage is excessively high and the system will shut down.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM battery voltage is still too high, replace the battery pack. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.voltage.high.warn

Message nvram.battery.voltage.high.warn

Severity INFO

Description This message occurs when the NVRAM battery voltage is above normal.

Corrective action None.

SNMP trap ID N/A

nvram.battery.voltage.low

Message nvram.battery.voltage.low

Severity NODE_ERROR

Description This message occurs when the NVRAM battery voltage is critically low. To prevent data loss, the system will shut down in 2 minutes.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM battery voltage is still critically low, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.voltage.low.warn

Message nvram.battery.voltage.low.warn

Severity INFO

Description This message occurs when the NVRAM battery voltage is below normal. To prevent data loss, the system will shut down in 24 hours.

Corrective action First, correct any environmental problems, such as chassis over-temperature. If the NVRAM battery voltage is still below normal, replace the NVRAM battery/card. If the problem persists, replace the controller module.

SNMP trap ID N/A

nvram.battery.voltage.normal

Message nvram.battery.voltage.normal

Severity INFO

Description This message occurs when the NVRAM battery voltage is normal.

Corrective action None.

SNMP trap ID N/A
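Several of the nvram.battery.* descriptions above state how long the system waits before shutting down. When more than one of these events is active, it can help to see which has the shortest stated delay. The sketch below copies the delays from the descriptions above; the table name and helper name are hypothetical. Events whose descriptions give no specific delay (for example, nvram.battery.voltage.high) are deliberately left out rather than guessed.

```python
from datetime import timedelta

# Shutdown delays as stated in the nvram.battery.* descriptions above.
# Events without a stated delay are intentionally omitted.
SHUTDOWN_DELAYS = {
    "nvram.battery.fault": timedelta(minutes=2),
    "nvram.battery.voltage.low": timedelta(minutes=2),
    "nvram.battery.capacity.low.critical": timedelta(minutes=20),
    "nvram.battery.charging.nocharge": timedelta(minutes=20),
    "nvram.battery.charging.wrongcharge": timedelta(minutes=20),
    "nvram.battery.fcc.low.critical": timedelta(minutes=20),
    "nvram.battery.capacity.low": timedelta(hours=24),
    "nvram.battery.fcc.low": timedelta(hours=24),
    "nvram.battery.voltage.low.warn": timedelta(hours=24),
}


def most_urgent(event_names):
    """Return the event with the shortest stated shutdown delay, or None."""
    timed = [name for name in event_names if name in SHUTDOWN_DELAYS]
    if not timed:
        return None
    return min(timed, key=SHUTDOWN_DELAYS.__getitem__)
```

Given both nvram.battery.fcc.low and nvram.battery.fault, the helper returns nvram.battery.fault, because its stated delay is 2 minutes rather than 24 hours.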

nvram.hw.initFail

Message nvram.hw.initFail

Severity ERR

Description This message occurs when the Data ONTAP NVRAM hardware fails to initialize.

Corrective action Typically, this type of error is unexpected and indicates that the NVRAM hardware is failing and should be replaced. Contact technical support for assistance with the replacement.

SNMP trap ID N/A

SAS EMS messages

SAS EMS messages inform you of events and problems involving your system SAS disk drives.

ds.sas.config.warning

Message ds.sas.config.warning

Severity WARNING

Description This message occurs when the system detects a configuration problem on the shelf I/O module.

Corrective action 1. Reseat the disk shelf I/O module.

2. If that does not fix the problem, replace the disk shelf I/O module.

SNMP trap ID N/A

ds.sas.crc.err

Message ds.sas.crc.err

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) cyclic redundancy check (CRC) error is detected.

Corrective action N/A

SNMP trap ID N/A

ds.sas.drivephy.disableErr

Message ds.sas.drivephy.disableErr

Severity ERR

Description This message occurs when a physical layer device (PHY) on a serial-attached SCSI (SAS) I/O module is disabled because of one of the following reasons:

• Manually bypassed

• Exceeded loss of double word synchronization threshold
• Exceeded running disparity threshold
• Transmitter fault
• Exceeded cyclic redundancy check (CRC) error threshold
• Exceeded invalid double word threshold
• Exceeded PHY reset problem threshold
• Exceeded broadcast change threshold
• Mirroring disabled on the other I/O module

Corrective action Replace the disabled disk drive.

SNMP trap ID #574

ds.sas.element.fault

Message ds.sas.element.fault

Severity ERR

Description This message indicates a transport error.

Corrective action 1. Check cabling to the disk shelf.

2. Check the status LED on the disk shelf and make sure that fault LEDs are not on.

3. Clear any fault condition, if possible.

4. See the quick reference card beneath the disk shelf for information about the meanings of the LEDs.

SNMP trap ID N/A

ds.sas.element.xport.error

Message ds.sas.element.xport.error

Severity ERR

Description This message indicates a transport error.

Corrective action 1. Check cabling to the disk shelf.

2. Check the status LED on the disk shelf and make sure that fault LEDs are not on.

3. Clear any fault condition, if possible.

4. See the quick reference card beneath the disk shelf for information about the meanings of the LEDs.

SNMP trap ID N/A

ds.sas.hostphy.disableErr

Message ds.sas.hostphy.disableErr

Severity ERR

Description This message occurs when a host physical layer device (PHY) on a serial-attached SCSI (SAS) I/O module is disabled because of one of the following reasons:

• Manually bypassed
• Exceeded loss of double word synchronization threshold
• Exceeded running disparity threshold
• Transmitter fault
• Exceeded cyclic redundancy check (CRC) error threshold
• Exceeded invalid double word threshold
• Exceeded PHY reset problem threshold
• Exceeded broadcast change threshold
• Mirroring disabled on the other I/O module

Corrective action Replace the disk shelf module to which the host physical layer device belongs.

SNMP trap ID N/A

ds.sas.invalid.word

Message ds.sas.invalid.word

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) word error is detected in a SAS primitive. These errors can be caused by the disk drive, the cable, the host bus adapter (HBA), or the shelf I/O module.

Corrective action The SAS specification allows for a certain bit error rate, so these errors can occur. No action is needed if individual errors show up occasionally.

SNMP trap ID N/A
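One way to judge whether such errors show up only occasionally is to count how often each message name appears in a recent time window. The sketch below is illustrative only: the function name, the (timestamp, message name) input format, the 24-hour window, and the threshold of 5 are all assumptions, not values taken from the SAS specification or from Data ONTAP.

```python
from collections import Counter
from datetime import datetime, timedelta


def frequent_messages(events, window=timedelta(hours=24), threshold=5):
    """Return {message_name: count} for names seen more than `threshold`
    times within `window` of the most recent event.

    `events` is a list of (datetime, message_name) tuples. The window and
    threshold defaults are hypothetical and should be tuned per site.
    """
    if not events:
        return {}
    latest = max(ts for ts, _ in events)
    counts = Counter(name for ts, name in events if latest - ts <= window)
    return {name: n for name, n in counts.items() if n > threshold}


if __name__ == "__main__":
    base = datetime(2012, 6, 1, 12, 0, 0)
    sample = [(base + timedelta(minutes=i), "ds.sas.invalid.word") for i in range(6)]
    sample.append((base, "ds.sas.loss.dword"))
    print(frequent_messages(sample))
```

A name that never crosses the threshold matches the "occasional" case described above; a name that does cross it is only a prompt to look at the drive, cable, HBA, or shelf I/O module more closely.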

ds.sas.loss.dword

Message ds.sas.loss.dword

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) loss of double word synchronization error is detected in a SAS primitive.

Corrective action N/A

SNMP trap ID N/A

ds.sas.multPhys.disableErr

Message ds.sas.multPhys.disableErr

Severity ERR

Description This message occurs when physical layer devices (PHYs) are disabled on multiple disk drives in a serial-attached SCSI (SAS) disk shelf.

Corrective action 1. Check whether the problems on the physical layer devices are valid.

2. If multiple physical layer devices are disabled at the same time, replace the disk shelf module.

SNMP trap ID N/A

ds.sas.phyRstProb

Message ds.sas.phyRstProb

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) physical layer device (PHY) reset error is detected in a SAS primitive.

Corrective action N/A

SNMP trap ID N/A

ds.sas.running.disparity

Message ds.sas.running.disparity

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) running disparity error is detected in a SAS primitive. These errors occur when the numbers of logical 1s and 0s are too far out of balance.

Corrective action N/A

SNMP trap ID N/A

ds.sas.ses.disableErr

Message ds.sas.ses.disableErr

Severity NODE_ERROR

Description This message occurs when a virtual SCSI Enclosure Services (SES) physical layer device (PHY) on a serial-attached SCSI (SAS) I/O module is disabled due to one of the following reasons:

• Manually bypassed
• Exceeded loss of double word synchronization threshold
• Exceeded running disparity threshold
• Transmitter fault
• Exceeded cyclic redundancy check (CRC) error threshold
• Exceeded invalid double word threshold
• Exceeded PHY reset problem threshold
• Exceeded broadcast change threshold

Corrective action Replace the shelf module to which the concerned SES physical layer device belongs.

SNMP trap ID N/A
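The PHY disable messages above differ mainly in which component the corrective action says to replace. The lookup below is a minimal sketch of that mapping; the dictionary and helper names are hypothetical, and the wording is condensed from the entries above rather than taken from any Data ONTAP interface.

```python
# Component that each PHY-disable message's corrective action points to,
# condensed from the entries in this section (names are illustrative).
REPLACEMENT_TARGET = {
    "ds.sas.drivephy.disableErr": "disabled disk drive",
    "ds.sas.hostphy.disableErr": "disk shelf module that owns the host PHY",
    "ds.sas.multPhys.disableErr": "disk shelf module (if multiple PHYs are disabled at the same time)",
    "ds.sas.ses.disableErr": "shelf module that owns the SES PHY",
}


def replacement_target(message_name):
    """Return the component to consider replacing, or None if the message
    is not one of the PHY-disable messages listed above."""
    return REPLACEMENT_TARGET.get(message_name)


if __name__ == "__main__":
    for name in ("ds.sas.drivephy.disableErr", "sas.port.down"):
        print(name, "->", replacement_target(name))
```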

ds.sas.xfer.element.fault

Message ds.sas.xfer.element.fault

Severity ERR

Description This message indicates that an element had a fault during an I/O request. It might be because of a transient condition in link connectivity.

Corrective action 1. Check cabling to the shelf.

2. Check the status LED on the shelf, and make sure that fault LEDs are not on.

3. Clear any fault condition, if possible.

4. See the quick reference card beneath the shelf for information about the meanings of the LEDs.

SNMP trap ID N/A

ds.sas.xfer.export.error

Message ds.sas.xfer.export.error

Severity ERR

Description This message indicates a transport error during an I/O request. It might be due to a transient condition in link activity.

Corrective action 1. Check cabling to the shelf.

2. Check the status LED on the shelf, and make sure that fault LEDs are not on.

3. Clear any fault condition, if possible.

4. See the quick reference card beneath the shelf for information about the meanings of the LEDs.

SNMP trap ID N/A

ds.sas.xfer.not.sent

Message ds.sas.xfer.not.sent

Severity ERR

Description This message indicates that an I/O transfer could not be sent. It might be because of a transient condition in link connectivity.

Corrective action 1. Check cabling to the shelf.

2. Check the status LED on the shelf, and make sure that fault LEDs are not on.

3. Clear any fault condition, if possible.

4. See the quick reference card beneath the shelf for information about the meanings of the LEDs.

SNMP trap ID N/A

ds.sas.xfer.unknown.error

Message ds.sas.xfer.unknown.error

Severity ERR

Description This message indicates that an unknown error occurred during an I/O request.

Corrective action N/A

SNMP trap ID N/A

sas.adapter.bad

Message sas.adapter.bad

Severity ALERT

Description This message occurs when the serial-attached SCSI (SAS) adapter fails to initialize.

Corrective action 1. Reseat the adapter.

2. If reseating the adapter failed to help, replace the adapter.

SNMP trap ID N/A

sas.adapter.bootarg.option

Message sas.adapter.bootarg.option

Severity INFO

Description The serial-attached SCSI (SAS) adapter driver is setting an option based on the setting of a bootarg/environment variable.

Corrective action None

SNMP trap ID N/A

sas.adapter.debug

Message sas.adapter.debug

Severity INFO

Description This message occurs during a serial-attached SCSI (SAS) adapter driver debug event.

Corrective action None

SNMP trap ID N/A

sas.adapter.exception

Message sas.adapter.exception

Severity WARNING

Description This message occurs when the serial-attached SCSI (SAS) adapter driver encounters an error with the adapter. The adapter is reset to recover.

Corrective action None.

SNMP trap ID N/A

sas.adapter.failed

Message sas.adapter.failed

Severity ERR

Description This message occurs when the serial-attached SCSI (SAS) adapter driver cannot recover the adapter after resetting it multiple times. The adapter is put offline.

Corrective action 1. If the adapter is in use, check the cabling.

2. If connected to disk shelves, check the seating of IOM cards and disks.

3. If the problem persists, try replacing the adapter.

4. If the issue is still not resolved, contact technical support.

SNMP trap ID N/A

sas.adapter.firmware.download

Message sas.adapter.firmware.download

Severity INFO

Description This message occurs when firmware is being updated on the serial-attached SCSI (SAS) adapter.

Corrective action None.

SNMP trap ID N/A

sas.adapter.firmware.fault

Message sas.adapter.firmware.fault

Severity WARNING

Description This message occurs when a firmware fault is detected on the serial-attached SCSI (SAS) adapter and it is being reset to recover.

Corrective action None.

SNMP trap ID N/A

sas.adapter.firmware.update.failed

Message sas.adapter.firmware.update.failed

Severity CRIT

Description This message occurs when firmware on the serial-attached SCSI (SAS) adapter cannot be updated.

Corrective action Replace the adapter as soon as possible. The SAS adapter driver attempts to continue using the adapter without updating the firmware image.

SNMP trap ID N/A

sas.adapter.not.ready

Message sas.adapter.not.ready

Severity ERR

Description This message occurs when the serial-attached SCSI (SAS) adapter does not become ready after being reset.

Corrective action The SAS adapter driver automatically attempts to recover from this error. If the error keeps occurring, the adapter might need to be replaced.

SNMP trap ID N/A

sas.adapter.offline

Message sas.adapter.offline

Severity INFO

Description This message indicates the name of the associated serial-attached SCSI (SAS) host bus adapter (HBA).

Corrective action None.

SNMP trap ID N/A

sas.adapter.offlining

Message sas.adapter.offlining

Severity INFO

Description This message occurs when the serial-attached SCSI (SAS) adapter is going offline after all outstanding I/O requests have finished.

Corrective action None.

SNMP trap ID N/A

sas.adapter.online

Message sas.adapter.online

Severity INFO

Description This message indicates that the serial-attached SCSI (SAS) adapter is now online.

Corrective action None.

SNMP trap ID N/A

sas.adapter.online.failed

Message sas.adapter.online.failed

Severity LOG_ERR

Description This message indicates the name of the associated serial-attached SCSI (SAS) host bus adapter (HBA).

Corrective action 1. If the HBA is in use, check the cabling.

2. If the HBA is connected to disk shelves, check the seating of IOM cards.

SNMP trap ID N/A

sas.adapter.onlining

Message sas.adapter.onlining

Severity INFO

Description This message indicates that the serial-attached SCSI (SAS) adapter is in the process of going online.

Corrective action None.

SNMP trap ID N/A

sas.adapter.reset

Message sas.adapter.reset

Severity INFO

Description This message occurs when the Data ONTAP serial-attached SCSI (SAS) driver is resetting the specified HBA. This can occur during normal error handling or by user request.

Corrective action None.

SNMP trap ID N/A

sas.adapter.unexpected.status

Message sas.adapter.unexpected.status

Severity WARNING

Description This message occurs when the serial-attached SCSI (SAS) adapter returns an unexpected status and is reset to recover.

Corrective action None.

SNMP trap ID N/A

sas.cable.error

Message sas.cable.error

Severity WARNING

Description This message occurs when information about the cable attached to the serial-attached SCSI (SAS) adapter port cannot be retrieved.

Corrective action None.

SNMP trap ID N/A

sas.cable.pulled

Message sas.cable.pulled

Severity INFO

Description The cable attached to the serial-attached SCSI (SAS) adapter port was pulled out.

Corrective action None.

SNMP trap ID N/A

sas.cable.pushed

Message sas.cable.pushed

Severity INFO

Description The cable attached to the serial-attached SCSI (SAS) adapter port was pushed in.

Corrective action None.

SNMP trap ID N/A

sas.config.mixed.detected

Message sas.config.mixed.detected

Severity WARNING

Description This message occurs when a serial-attached SCSI (SAS) disk shelf contains a mixture of SAS drives, serial advanced technology attachment (SATA) drives, or bridged SAS drives. Mixing drive types within a disk shelf is not supported.

Corrective action Ensure that each SAS disk shelf is populated with drives of only one type.

SNMP trap ID N/A
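The rule that each SAS disk shelf holds drives of only one type can be checked against an inventory listing. The following is a sketch under stated assumptions: the inventory format (shelf ID mapped to a list of drive-type strings), the type labels, and the function name are all hypothetical and do not correspond to any Data ONTAP command output.

```python
def shelves_with_mixed_drives(inventory):
    """Return the sorted shelf IDs whose drives are not all of one type.

    `inventory` maps a shelf ID to a list of drive-type labels, for example
    {"1": ["SAS", "SAS"], "2": ["SAS", "SATA"]}. The labels are illustrative.
    """
    return sorted(
        shelf_id
        for shelf_id, drive_types in inventory.items()
        if len(set(drive_types)) > 1
    )


if __name__ == "__main__":
    sample = {"1": ["SAS", "SAS"], "2": ["SAS", "SATA"], "3": []}
    print(shelves_with_mixed_drives(sample))
```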

sas.device.invalid.wwn

Message sas.device.invalid.wwn

Severity ERR

Description This message occurs when the serial-attached SCSI (SAS) device responds with an invalid worldwide name.

Corrective action Power-cycling the device might allow it to recover from this problem.

SNMP trap ID N/A

sas.device.quiesce

Message sas.device.quiesce

Severity INFO

Description This message indicates that at least one command to the specified device has not completed in the normally expected time. In this case, the driver stops sending additional commands to the device until all outstanding commands have had an opportunity to be completed. This condition is automatically handled by the Data ONTAP serial-attached SCSI (SAS) driver.

Corrective action

This condition by itself does not mean that the target device is problematic. High workloads might cause link saturation leading to device contention for the bus. Transport issues might also cause link throughput to decrease, thereby causing I/Os to take longer than normal.

If you see this message only on occasion, no action is required. The system handles the condition automatically.

SNMP trap ID N/A

sas.device.resetting

Message sas.device.resetting

Severity WARNING

Description This message indicates that device-level error recovery has escalated to resetting the device. It is usually seen in association with error conditions such as device-level timeouts or transmission errors.

This message reports the recovery action taken by the Data ONTAP serial-attached SCSI (SAS) driver when evaluating associated device-related or link-related error conditions.

Corrective action None.

SNMP trap ID N/A

sas.device.timeout

Message sas.device.timeout

Severity ERR

Description This message occurs when not all outstanding commands to the specified device were completed within the allotted time. As part of the standard error handling sequence managed by the Data ONTAP serial-attached SCSI (SAS) driver, all commands to the device are aborted and reissued.

Corrective action

Device-level timeouts are a common indication of a SAS link stability problem. In some cases, the link is operating normally and the specified device is having trouble processing I/O requests in a timely manner. In such cases, the specified device should be evaluated for possible replacement.

Quite often the problem results from the partial failure of a component involved in the SAS transport. Common things to check include the following:

• Complete seating of drive carriers in enclosure bays
• Properly secured cable connections
• IOM seating
• Crimped or otherwise damaged cables

SNMP trap ID N/A

sas.initialization.failed

Message sas.initialization.failed

Severity ERR

Description This message occurs when the serial-attached SCSI (SAS) adapter fails to initialize the link and appears to be unattached or disconnected.

Corrective action 1. If the adapter is in use, check the cabling.

2. If the adapter is connected to disk shelves, check the seating of IOM cards.

SNMP trap ID N/A

sas.link.error

Message sas.link.error

Severity ERR

Description This message occurs when the serial-attached SCSI (SAS) adapter cannot recover the link and is going offline.

Corrective action 1. If the adapter is in use, check the cabling.

2. If the adapter is connected to disk shelves, check the seating of IOM cards and disks.

3. If this does not resolve the issue, contact technical support.

SNMP trap ID N/A

sas.port.disabled

Message sas.port.disabled

Severity WARNING

Description The serial-attached SCSI (SAS) adapter port went down because the operator disabled it.

Corrective action None.

SNMP trap ID N/A

sas.port.down

Message sas.port.down

Severity WARNING

Description The serial-attached SCSI (SAS) adapter port went down through no action by the operator.

Corrective action None.

SNMP trap ID N/A

sas.shelf.conflict

Message sas.shelf.conflict

Severity ERR

Description This message occurs when the system detects that two or more serial-attached SCSI (SAS) disk shelves have the same shelf ID. The SAS domain is functional, but references to disk shelves will be based on disk shelf serial numbers, not disk shelf IDs.

Corrective action Reassign disk shelf IDs so that no conflict exists.

SNMP trap ID N/A
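Before reassigning IDs, it can help to list which shelf IDs are shared and by which shelves. The sketch below groups shelves by ID and reports only the IDs used more than once; the (serial number, shelf ID) input format and the function name are assumptions for illustration, not Data ONTAP output.

```python
from collections import defaultdict


def conflicting_shelf_ids(shelves):
    """Return {shelf_id: [serial numbers]} for every shelf ID that is
    assigned to two or more shelves.

    `shelves` is an iterable of (serial_number, shelf_id) pairs.
    """
    by_id = defaultdict(list)
    for serial_number, shelf_id in shelves:
        by_id[shelf_id].append(serial_number)
    return {
        shelf_id: sorted(serials)
        for shelf_id, serials in by_id.items()
        if len(serials) > 1
    }


if __name__ == "__main__":
    sample = [("SN001", 10), ("SN002", 11), ("SN003", 10)]
    print(conflicting_shelf_ids(sample))
```

Keying the report by serial number mirrors the behavior described above, where the system falls back to serial numbers while the ID conflict exists.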

sasmon.adapter.phy.disable

Message sasmon.adapter.phy.disable

Severity ERR

Description This message occurs when a serial-attached SCSI (SAS) transceiver (physical layer device) attached to a SAS host bus adapter (HBA) is disabled due to one of the following reasons:

• Exceeded loss of double word synchronization error threshold
• Exceeded running disparity error threshold
• Exceeded invalid double word error threshold
• Exceeded physical layer device reset problem threshold
• Exceeded broadcast change threshold

Corrective action

1. If the adapter is in use, check the cabling.

2. If the adapter is connected to the disk shelves, check the seating of the IOM cards.

3. If that does not fix the problem, contact technical support.

SNMP trap ID N/A

sasmon.adapter.phy.event

Message sasmon.adapter.phy.event

Severity DEBUG

Description This message occurs when a serial-attached SCSI (SAS) transceiver (physical layer device) attached to a SAS host bus adapter (HBA) experiences a transient error. These errors are observed on a received double word (dword) or when resetting a PHY.

These errors include disparity errors, invalid dword errors, physical layer device (PHY) reset problem errors, loss of dword synchronization errors, and PHY change events. The SAS specification allows for a certain bit error rate, so these errors can occur under normal operating conditions.

There is no cause for concern if these individual errors show up occasionally.

Corrective action

None.

SNMP trap ID N/A

sasmon.disable.module

Message sasmon.disable.module

Severity INFO

Description This message occurs when the Data ONTAP module responsible for monitoring the serial-attached SCSI (SAS) domain’s transient errors is disabled because the environment variable disable-sasmon? is set to true.

Corrective action Set the environment variable disable-sasmon? to false to enable this monitor module.

SNMP trap ID N/A

shm.threshold.spareBlocksConsumed

Message shm.threshold.spareBlocksConsumed

Severity NOTICE

Description This message occurs when the spares consumed value exceeds the first threshold on an SSD.

Corrective action None.

shm.threshold.spareBlocksConsumedMax

Message shm.threshold.spareBlocksConsumedMax

Severity WARNING

Description This message occurs when the spares consumed value exceeds the second threshold on an SSD.

Corrective action None.

SES EMS messages

SES messages appear in AutoSupport messages if failures or warning conditions occur in your system’s storage components.

ses.access.noEnclServ

Message ses.access.noEnclServ

Severity NODE_ERROR

Description This message occurs when SCSI Enclosure Services (SES) in the storage system cannot establish contact with the enclosure monitoring process in any disk shelf on the channel. Some disk shelves require that disks be installed and functioning in particular shelf bays.

Corrective action

Note: This message applies to DS14/DS14mk2/DS14mk4 disk shelves that are not -AT-type shelves. DS14mk2 is used in this message as an example.

1. In disk shelves that require certain disk placement, verify that disks are installed in the indicated bays: DS14/DS14mk2 FC: bays 0 and/or 1

Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk shelf.

2. If disks are placed correctly but the error persists for more than an hour, halt the storage system, power-cycle the disk shelf, and reboot.

3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM) might need to be replaced. In SCSI-based shelves, replace the shelf.

ses.access.noMoreValidPaths

Message ses.access.noMoreValidPaths

Severity NODE_ERROR

Description This message occurs when SCSI Enclosure Services (SES) in the storage system loses access to the enclosure monitoring process in the disk shelf. Some disk shelves require that disks be installed and functioning in particular shelf bays.

Corrective action

Note: This message applies to DS14/DS14mk2/DS14mk4 disk shelves that are not -AT-type shelves. DS14mk2 is used in this message as an example.

1. This message occurs when SES in the storage system loses access to the enclosure monitoring process in the disk shelf. Some disk shelves require that disks be installed and functioning in particular shelf bays: DS14/DS14mk2 FC: bays 0 and/or 1

Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk shelf.

2. If disks are placed correctly, but the error persists for more than an hour, halt the storage system, power-cycle the disk shelf, and reboot.

3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM) might need to be replaced. In SCSI-based shelves, replace the shelf.

ses.access.noShelfSES

Message ses.access.noShelfSES

Severity NODE_ERROR

Description This message occurs when SCSI Enclosure Services (SES) in the storage system cannot establish contact with the SES process in the indicated disk shelf. Some disk shelves require that disks be installed and functioning in particular disk shelf bays.

Corrective action

Note: This message applies to DS14/DS14mk2/DS14mk4 disk shelves that are not -AT-type shelves. DS14mk2 is used in this message as an example.

1. In disk shelves that require certain disk placement, verify that disks are installed in the indicated bays: DS14/DS14mk2 FC: bays 0 and/or 1

Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk shelf.

2. If disks are placed correctly but the error persists for more than an hour, halt the storage system, power-cycle the disk shelf, and reboot.

3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM) might need to be replaced. In SCSI-based shelves, replace the shelf.

ses.access.sesUnavailable

Message ses.access.sesUnavailable

Severity NODE_ERROR

Description This message occurs when SCSI Enclosure Services (SES) in the storage system cannot establish contact with the enclosure monitoring process in one or more disk shelves on the channel. Some disk shelves require that disks be installed and functioning in particular disk shelf bays.

Corrective action

Note: This message applies to DS14/DS14mk2/DS14mk4 disk shelves that are not -AT-type shelves. DS14mk2 is used in this message as an example.

1. In disk shelves that require certain disk placement, verify that disks are installed in the indicated bays: DS14/DS14mk2 FC: bays 0 and/or 1

Note: SCSI-based shelves, serial-attached SCSI (SAS) shelves, and DS14mk2 AT shelves do not rely on disk placement for SES.

SES in the storage system tries periodically to reestablish contact with the disk shelf.

2. If disks are placed correctly but the error persists for more than an hour, halt the storage system, power-cycle the disk shelf, and reboot.

3. If the error persists, then SES hardware (for example, VEM, LRC, or IOM) might need to be replaced. In SCSI-based shelves, replace the shelf.

ses.badShareStorageConfigErr

Message ses.badShareStorageConfigErr

Severity NODE_ERROR

Description This message occurs when a disk shelf module that is not supported in a SharedStorage system, such as an LRC module, is detected in a SharedStorage system.

Corrective action Replace the unsupported module with one that is supported, such as an ESH, ESH2, or AT-FCX module.

ses.bridge.fw.getFailWarn

Message ses.bridge.fw.getFailWarn

Severity WARNING

Description This message occurs when the bridge firmware revision cannot be obtained.

Corrective action Check the connection to the bank of Maxtor drives.

ses.bridge.fw.mmErr

Message ses.bridge.fw.mmErr

Severity SVC_ERROR

Description This message occurs when the bridge firmware revision is inconsistent.

Corrective action Check the firmware revision numbers and make sure that they are consistent. You might have to update the firmware.

ses.channel.rescanInitiated

Message ses.channel.rescanInitiated

Severity INFO

Description This message identifies the name of the adapter port or switch port being rescanned; for example, “7a” or “myswitch:5”.

Corrective action None.

ses.config.drivePopError

Message ses.config.drivePopError

Severity WARNING

Description This message occurs when the channel has more disk drives on it than are allowed.

Systems using synchronous mirroring allow more disk drives per channel than other systems.

Corrective action

Your action depends on whether you intend to use synchronous mirroring.

• If you intend to use synchronous mirroring, make sure that the license is installed.

• If you do not intend to use synchronous mirroring, reduce the number of disk drives on the channel to no more than the maximum allowed.

ses.config.IllegalEsh270

Message ses.config.IllegalEsh270

Severity NODE_ERROR

Description This message occurs when Data ONTAP detects one or more ESH disk shelf modules in a disk shelf that is attached to a FAS270 system. This is not a supported configuration.

Corrective action Replace the ESH modules with ESH2 modules.

ses.config.shelfMixError

Message ses.config.shelfMixError

Severity NODE_ERROR

Description This message occurs when the channel has a mixture of ATA and Fibre Channel disk shelves; this is not a supported configuration.

Corrective action

Mixed-mode operation of ATA and Fibre Channel disks on the system is only supported on separate loops. Move all Fibre Channel-based disk shelves to one loop and place all Fibre Channel-to-ATA-based disk shelves on another loop.

ses.config.shelfPopError

Message ses.config.shelfPopError

Severity NODE_ERROR

Description This message occurs when the channel has more shelves on it than are allowed.

Corrective action Reduce the number of disk shelves on the channel to the number specified.

ses.disk.configOk

Message ses.disk.configOk

Severity INFO

Description This message occurs when there are no longer any drives in slots 20 through 23 of a FAS2050 or an SA200 system.

Corrective action None.

ses.disk.illegalConfigWarn

Message ses.disk.illegalConfigWarn

Severity WARNING

Description This message occurs when disk drives are inserted into the bottom row of a FAS2050 or an SA200 system. Disk drives are not supported in those slots.

Corrective action None.

ses.disk.pctl.timeout

Message ses.disk.pctl.timeout

Severity DEBUG

Description This message occurs when a power control request submitted to the specified SCSI Enclosure Services (SES) module is not completed within 60 seconds.

Corrective action

Normally, there is no corrective action required for this error because the timeout might be due to a transient error. However, if you see this message frequently, there might be an issue with the I/O module in the shelf, which might need to be replaced.

ses.download.powerCyclingChannel

Message ses.download.powerCyclingChannel

Severity INFO

Description This message occurs when the power-cycling channel event is issued after a disk shelf firmware download to disk shelves that require a power-cycle to activate the new code.

Corrective action None.

ses.download.shelfToReboot

Message ses.download.shelfToReboot

Severity INFO

Description This message occurs after the completion of shelf firmware transfer to the DS14mk2 AT disk shelf. At this point, the disk shelf requires about another five minutes to transfer the new firmware to its nonvolatile program memory, whereupon it reboots to begin to execute the new firmware. During this reboot, a Fibre Channel loop reinitialization occurs, temporarily interrupting the loop.

Corrective action None.

ses.download.suspendIOForPowerCycle

Message ses.download.suspendIOForPowerCycle

Severity INFO

Description This message occurs when the suspending I/O event signals that the storage subsystem is temporarily stopping I/O to disks while one or more disk shelves have their power cycled after a download, if required by the disk shelf design.

Corrective action None.

ses.drive.PossShelfAddr

Message ses.drive.PossShelfAddr

Severity WARNING

Description This message occurs in conjunction with the message ses.drive.shelfAddr.mm when there are devices that have apparently taken a wrong address; the adapter shows device addresses that SCSI Enclosure Services (SES) indicates should not exist, and vice versa.

This error is not a fatal condition. It means that SES cannot perform certain operations on the affected disk drives, such as setting failure LEDs, because it is not certain which disk shelf the affected disk drive is in.

Corrective action

1. If the problem is throughout the disk shelf, replace the disk shelf.

2. If the error is only one disk drive per disk shelf, the drive might have taken an incorrect address at power-on.

3. Arrange to make this disk drive a spare, and then reseat it to cause it to take its address again.

4. If the problem persists, insert a different spare disk drive into the slot. If the error then clears, replace the original disk drive.

5. If the problem persists, there is a hardware problem with the individual disk bay. Replace the disk shelf.

ses.drive.shelfAddr.mm

Message ses.drive.shelfAddr.mm

Severity NODE_ERROR

Description This message occurs when there is a mismatch between the position of the drives detected by the disk shelf and the address of the drives detected by the Fibre Channel loop or SCSI bus.

This error indicates that a disk drive took an address other than what the disk shelf should have provided, or that SCSI Enclosure Services (SES) in a disk shelf cannot be contacted for address information, or that a disk drive unexpectedly does not participate in device discovery on the loop or bus.

If the message EMS_ses_drive_possShelfAddr subsequently appears, follow the corrective actions in that message.

In this condition, the SES process in the system might be unable to perform certain operations on the disk, such as setting failure LEDs or detecting disk swaps.

Corrective action

Note: This message applies to DS14/DS14mk2/DS14mk4 disk shelves that are not -AT-type shelves. DS14mk2 is used in this message as an example.

1. If this occurs to multiple disk drives on the same loop, check the I/O modules at the back of the disk shelves on that loop for errors.

2. In disk shelves that require certain disk placement, verify that disks are installed in the indicated bays: DS14/DS14mk2 FC: bays 0 and/or 1

Note: SCSI-based disk shelves and DS14mk2 AT disk shelves do not rely on disk placement for SES.
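The mismatch described in this entry is, in effect, a difference between two address sets: the addresses that SES expects from drive positions and the addresses that the adapter sees on the loop or bus. The sketch below shows that comparison in isolation. It is an illustration under assumptions only: the function name, the use of plain integers as addresses, and the two input lists are hypothetical and are not how Data ONTAP collects or stores this information.

```python
def address_mismatches(ses_addresses, adapter_addresses):
    """Compare the addresses SES expects with the addresses the adapter sees.

    Returns a dict with two sorted lists:
      missing_on_adapter    - expected by SES but not seen by the adapter
      unexpected_on_adapter - seen by the adapter but not expected by SES
    """
    ses = set(ses_addresses)
    adapter = set(adapter_addresses)
    return {
        "missing_on_adapter": sorted(ses - adapter),
        "unexpected_on_adapter": sorted(adapter - ses),
    }


if __name__ == "__main__":
    # Hypothetical example: one drive took address 19 instead of 18.
    print(address_mismatches([16, 17, 18], [16, 17, 19]))
```

A single pair of entries, one in each list, fits the case of one drive taking an incorrect address; many entries on the same loop point toward the I/O module check in step 1.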

ses.exceptionShelfLog

Message ses.exceptionShelfLog

Severity DEBUG

Description This message occurs when an I/O module encounters an exception condition.

Corrective action

1. Check the system logs to see whether any disk errors recently occurred.

2. Pull an AutoSupport message file that contains the latest copy of the shelf log information from each disk shelf.

3. Try to correlate the date and time from the errors in the message file with the date and time of events in the shelf log file.
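The correlation in step 3 can be sketched as a nearest-in-time match between two lists of timestamped records. The example below assumes that the error entries and shelf log events have already been parsed into (datetime, text) tuples; the function name, the record format, and the five-minute tolerance are hypothetical choices, and the actual log formats are not covered here.

```python
from datetime import datetime, timedelta


def correlate(errors, shelf_events, tolerance=timedelta(minutes=5)):
    """Pair each error with the shelf log events recorded close to it.

    `errors` and `shelf_events` are lists of (datetime, text) tuples.
    Returns a list of (error_text, [nearby shelf event texts]) in the
    same order as `errors`.
    """
    result = []
    for err_time, err_text in errors:
        nearby = [
            text
            for ev_time, text in shelf_events
            if abs(ev_time - err_time) <= tolerance
        ]
        result.append((err_text, nearby))
    return result


if __name__ == "__main__":
    t = datetime(2012, 6, 1, 10, 0)
    errors = [(t, "disk error A")]
    shelf = [
        (t + timedelta(minutes=2), "shelf event 1"),
        (t + timedelta(minutes=30), "shelf event 2"),
    ]
    print(correlate(errors, shelf))
```

This assumes both logs use the same clock and time zone; if the shelf log and the system log can drift apart, widen the tolerance accordingly.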

ses.extendedShelfLog

Message ses.extendedShelfLog

Severity DEBUG

Description This message occurs when a disk encounters an error and the system requests that additional log information be obtained from both modules in the disk shelf reporting the error to aid in debugging problems.

Corrective action

1. Check the system logs to see whether any disk errors recently occurred.

2. Pull an AutoSupport message file that contains the latest copy of the shelf log information from each disk shelf.

3. Try to correlate the date and time from the errors in the message file with the date and time of events in the shelf log file.
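
The correlation in step 3 can be sketched in Python. This is a minimal sketch, assuming the errors and the shelf log events have already been parsed into (timestamp, text) pairs. The function name correlate, the sample entries, and the five-minute window are illustrative assumptions, not formats or values defined by Data ONTAP.

```python
from datetime import datetime, timedelta


def correlate(errors, shelf_events, window=timedelta(minutes=5)):
    """Pair each error with the shelf log events that fall within `window` of it.

    `errors` and `shelf_events` are lists of (datetime, text) tuples. Both
    shapes are hypothetical; real logs must first be parsed into this form.
    """
    matches = []
    for err_time, err_text in errors:
        nearby = [text for ev_time, text in shelf_events
                  if abs(ev_time - err_time) <= window]
        matches.append((err_text, nearby))
    return matches


# Illustrative data only; these are not real log lines.
errors = [(datetime(2012, 6, 1, 10, 0, 30), "disk error reported")]
shelf_events = [
    (datetime(2012, 6, 1, 10, 0, 5), "module A exception"),
    (datetime(2012, 6, 1, 14, 30, 0), "unrelated later event"),
]
result = correlate(errors, shelf_events)
```

Widening or narrowing the window trades missed correlations against spurious ones. Clock differences between the two sources, if any, would have to be accounted for before comparing timestamps.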

ses.fw.emptyFile

Message ses.fw.emptyFile

Severity WARNING

Description This message occurs when a firmware file is found to be empty during a disk shelf firmware update.

Corrective action Obtain the correct firmware file and place it in the etc/shelf_fw directory. You can download the firmware file from the NOW site at http://now.netapp.com/.

ses.fw.resourceNotAvailable

Message ses.fw.resourceNotAvailable

Severity ERR

Description This message occurs when there is not enough contiguous memory available to download disk shelf firmware.

Corrective action 1. Reduce the amount of system activity before performing a manual disk shelf firmware update.

2. If the disk shelf firmware update fails again, reboot the storage system.

ses.giveback.restartAfter

Message ses.giveback.restartAfter

Severity INFO

Description This message occurs when SCSI Enclosure Services (SES) is restarted after giveback.

Corrective action None.

ses.giveback.wait

Message ses.giveback.wait

Severity INFO

Description This message occurs when SCSI Enclosure Services (SES) information is not available because the system is waiting for giveback.

Corrective action None.

ses.psu.coolingReqError

Message ses.psu.coolingReqError

Severity LOG_CRIT

Description This message occurs when the installed power supplies are placed so that air-flow requirements of the disk shelf are not met. The power supply chassis and their power supplies are an integral part of the disk shelf cooling and air-flow design.

Corrective action

Verify that the power supplies are placed in the locations required to provide proper air flow according to the disk shelf specifications.

DS14-style shelves always require both power supplies. SAS-Shelf24 requires power supplies in power supply bays 1 and 4 for proper air flow and cooling.

ses.psu.powerReqError

Message ses.psu.powerReqError

Severity LOG_CRIT

Description This message occurs when too few power supplies are installed to redundantly satisfy the current-draw requirements of the disk drives in the disk shelf. This might occur if a power supply is removed or fails. Some disk drive models require more power than others. If the disk shelf specifications for the installed drive models specify more power supplies to support that disk type, then this condition can also occur at disk swap or insertion in some disk shelves.

Corrective action

Verify that the number of power supplies installed satisfies the power requirements of the installed disk drives.

DS14-style shelves always require both power supplies. SAS-Shelf24 requires power supplies in power supply bays 1 and 4 for proper cooling and air flow. If any disk drives are 10K RPM or faster, then power supply bays 2 and 3 must also have power supplies.
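
The SAS-Shelf24 placement rule stated above can be expressed as a short check. This is a minimal Python sketch, assuming the installed bay numbers and the fastest drive speed in the shelf are already known. The function name missing_psu_bays and its inputs are hypothetical and are not part of any Data ONTAP interface.

```python
def missing_psu_bays(installed_bays, max_drive_rpm):
    """Return the SAS-Shelf24 power supply bays that still need a power supply.

    Rule taken from the text: bays 1 and 4 are always required; bays 2 and 3
    are also required if any disk drive is 10K RPM or faster.
    """
    required = {1, 4}
    if max_drive_rpm >= 10000:
        required |= {2, 3}
    return sorted(required - set(installed_bays))


# Power supplies in bays 1 and 4 only, with 15K RPM drives installed.
missing = missing_psu_bays([1, 4], 15000)
```

An empty result means the placement satisfies the rule as stated here. The disk shelf specifications remain the authority for any particular drive model.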

ses.remote.configPageError

Message ses.remote.configPageError

Severity INFO

Description This message occurs when a request to another system in a SharedStorage configuration fails. This request was for a specific disk shelf's SCSI Enclosure Services (SES) configuration page.

Corrective action Contact technical support.

ses.remote.elemDescPageError

Message ses.remote.elemDescPageError

Severity INFO

Description This message occurs when a request to another system in a SharedStorage configuration fails. This request was for the element descriptor pages that the other system has local access to.

Corrective action Contact technical support.

ses.remote.faultLedError

Message ses.remote.faultLedError

Severity INFO

Description This message occurs when a request to another system to have it set the fault LED of a disk drive on a disk shelf fails.

Corrective action Contact technical support.

ses.remote.flashLedError

Message ses.remote.flashLedError

Severity INFO

Description This message occurs when a request to another system to have it flash the LED of a disk drive on a disk shelf fails.

Corrective action Contact technical support.

ses.remote.shelfListError

Message ses.remote.shelfListError

Severity INFO

Description This message occurs when a request to another system in a SharedStorage configuration fails. This request was for a list of the disk shelves that the other system has local access to.

Corrective action Contact technical support.

ses.remote.statPageError

Message ses.remote.statPageError

Severity INFO

Description This message occurs when a request to another system in a SharedStorage configuration fails. This request was for the SCSI Enclosure Services (SES) status pages that the other system has local access to.

Corrective action Contact technical support.

ses.shelf.changedID

Message ses.shelf.changedID

Severity WARNING

Description This message occurs on a SAS disk shelf when the disk shelf ID changes after power is applied to the disk shelf.

Corrective action

1. Verify that the disk shelf ID displayed in this message is the same as the disk shelf ID shown on the disk shelf.

2. If they are different, perform one of the following steps:

• If the disk shelf ID displayed in this message is the one you want, reset the disk shelf ID on the thumbwheel to match it.

• If you want the new disk shelf ID instead of the disk shelf ID displayed in the message, verify that the disk shelf ID you want does not conflict with other disk shelves in the domain.

3. Power-cycle the disk shelf chassis. You can wait to perform this procedure until your next maintenance window.

4. If the warning persists on both disk shelf modules after you complete the procedure, replace the disk shelf chassis. If it persists on only one disk shelf module, replace the disk shelf module.

ses.shelf.ctrlFailErr

Message ses.shelf.ctrlFailErr

Severity SVC_ERROR

Description This message occurs when the adapter and loop ID of the SCSI Enclosure Services (SES) target for which the SES has control fail.

Corrective action

1. Check the LEDs on the disk shelf and the disk shelf modules on the back of the disk shelf to see whether there are any abnormalities. If the modules appear to be problematic, replace the applicable module.

2. If the SES target is a disk drive, check to see whether the disk drive failed. If it failed, replace the disk drive.

ses.shelf.em.ctrlFailErr

Message ses.shelf.em.ctrlFailErr

Severity SVC_ERROR

Description This message occurs when SCSI Enclosure Services (SES) control to the internal disk drives of a system fails.

Corrective action 1. Enter environment shelf to see whether that disk shelf is still being actively monitored.

2. If the environment shelf command indicates a failure, there is a hardware failure in the system's internal disk shelf.

ses.shelf.IdBasedAddr

Message ses.shelf.IdBasedAddr

Severity WARNING

Description This message occurs on a serial-attached SCSI (SAS) disk shelf when the SAS addresses of the devices are based on the disk shelf ID instead of the disk shelf backplane serial number. This indicates problems communicating with the disk shelf backplane.

Corrective action

1. Reseat the master disk shelf module, as indicated by the output of the environment shelf command.

2. If the problem persists, reseat the slave disk shelf module.

3. If the problem persists, find the new master disk shelf module and replace it.

4. If the problem persists, replace the other disk shelf module.

5. If the problem persists, replace the disk shelf enclosure.

ses.shelf.invalNum

Message ses.shelf.invalNum

Severity WARNING

Description This message occurs when Data ONTAP detects that a serial-attached SCSI (SAS) shelf connected to the system has an invalid shelf number.

Corrective action 1. Power-cycle the shelf.

2. If the problem persists, replace the shelf modules.

3. If the problem persists, replace the shelf.

ses.shelf.mmErr

Message ses.shelf.mmErr

Severity SVC_FAULT

Description This message occurs when there is a disk shelf that is not supported by the platform it was booted on.

Corrective action

1. Check whether the current version of Data ONTAP supports the disk shelf.

2. If the current version of Data ONTAP does not support the disk shelf, install a version that does support the disk shelf.

If the disk shelf is supported, the error might be cleared by hourly attempts by Data ONTAP to establish proper contact with the disk shelf.

ses.shelf.OSmmErr

Message ses.shelf.OSmmErr

Severity SVC_ERROR

Description This message occurs when there are incompatible Data ONTAP versions in a SharedStorage configuration that would cause SCSI Enclosure Services (SES) not to function properly.

Corrective action Update the system that has an earlier Data ONTAP version to match the one that has the latest Data ONTAP version.

ses.shelf.powercycle.done

Message ses.shelf.powercycle.done

Severity INFO

Description This message occurs when a disk shelf power-cycle finishes.

Corrective action None.

ses.shelf.powercycle.start

Message ses.shelf.powercycle.start

Severity INFO

Description This message occurs when a disk shelf is power-cycled and SCSI Enclosure Services (SES) needs to wait for it to finish.

Corrective action None.

ses.shelf.sameNumReassign

Message ses.shelf.sameNumReassign

Severity WARNING

Description This message occurs when Data ONTAP detects more than one serial-attached SCSI (SAS) disk shelf connected to the same adapter with the same shelf number.

Corrective action

1. Change the shelf number on the shelf to one that does not conflict with other shelves attached to the same adapter. Halt the system and reboot the shelf.

2. If the problem persists, contact technical support.
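
The condition behind this message, two or more shelves with the same shelf number on one adapter, can be checked against an inventory before shelf numbers are changed. This is a minimal Python sketch, assuming the inventory is available as (adapter, shelf number) pairs. The function name duplicate_shelf_numbers and the adapter names are hypothetical.

```python
from collections import defaultdict


def duplicate_shelf_numbers(shelves):
    """Return {adapter: [shelf numbers that appear more than once on it]}.

    `shelves` is a list of (adapter, shelf_number) pairs. This inventory
    format is an assumption made for illustration.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for adapter, number in shelves:
        counts[adapter][number] += 1

    duplicates = {}
    for adapter, numbers in counts.items():
        repeated = sorted(n for n, c in numbers.items() if c > 1)
        if repeated:
            duplicates[adapter] = repeated
    return duplicates


# Shelf number 1 is used twice on adapter "0a"; reuse on "0b" is not a conflict.
inventory = [("0a", 1), ("0a", 1), ("0a", 2), ("0b", 1)]
conflicts = duplicate_shelf_numbers(inventory)
```

The same shelf number on different adapters is not flagged, which matches the description of the message.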

ses.shelf.unsupportAllowErr

Message ses.shelf.unsupportAllowErr

Severity SVC_FAULT

Description This message occurs when a disk shelf is not supported by Data ONTAP. Data ONTAP will continue to use the disk shelf, but environmental monitoring of the disk shelf is not possible.

Corrective action

1. Check whether the current version of Data ONTAP supports the disk shelf.

2. If the current version of Data ONTAP does not support the disk shelf, install a version that does support the disk shelf.

If the disk shelf is supported, the error might be cleared by hourly attempts by Data ONTAP to establish proper contact with the disk shelf.

ses.shelf.unsupportedErr

Message ses.shelf.unsupportedErr

Severity SVC_FAULT

Description This message occurs when there is a disk shelf that is not supported by Data ONTAP.

Corrective action Check whether this disk shelf is supported by a newer version of Data ONTAP.If it is, upgrade to the appropriate version.

ses.startTempOwnership

Message ses.startTempOwnership

Severity DEBUG

Description This message occurs when SCSI Enclosure Services (SES) is starting temporary ownership acquisition of disks owned by other nodes. This involves removing the disk reservations while the SES operations are in progress.

Corrective action Contact technical support.

ses.status.ATFCXError

Message ses.status.ATFCXError

Severity NODE_ERROR

Description This message occurs when the reporting disk shelf detects an error in the indicated AT-FCX module. The module might not be able to perform I/O to disks within the disk shelf.

Corrective action 1. Verify that the AT-FCX module is fully seated and secured.

2. If the problem persists, replace the AT-FCX module.

ses.status.ATFCXInfo

Message ses.status.ATFCXInfo

Severity INFO

Description This message occurs when a previously reported error in the AT-FCX module is corrected, or the system reports other information that does not necessarily require customer action.

Corrective action None.

ses.status.currentError

Message ses.status.currentError

Severity NODE_ERROR

Description This message occurs when a critical condition is detected in the indicated storage shelf current sensor. The shelf might be able to continue operation.

Corrective action 1. Verify that the power supply and the AC line are supplying power.

2. Monitor the power grid for abnormalities.

3. Replace the power supply.

4. If the problem persists, contact technical support.

ses.status.currentInfo

Message ses.status.currentInfo

Severity INFO

Description This message occurs when an error or warning condition previously reported by or about the disk shelf current sensor is corrected, or the system reports other information about the current in the disk shelf that does not necessarily require customer action.

Corrective action None.

ses.status.currentWarning

Message ses.status.currentWarning

Severity WARNING

Description This message occurs when a warning condition is detected in the indicated storage shelf current sensor. The shelf might be able to continue operation.

Corrective action 1. Verify that the power supply and the AC line are supplying power.

2. Monitor the power grid for abnormalities.

3. Replace the power supply.

4. If the problem persists, contact technical support.

ses.status.displayError

Message ses.status.displayError

Severity NODE_ERROR

Description This message occurs when the SCSI Enclosure Services (SES) module in the disk shelf detects an error in the disk shelf display panel. The disk shelf might be unable to provide correct addresses to its disks.

Corrective action

1. If possible, verify that the connection between the disk shelf and the display is secure.

2. Verify that the SES module or modules are fully seated; replacing them might solve the problem.

3. If the problem persists, the SES module that detected the warning condition might be faulty.

4. If the problem persists after the module or modules are replaced, replace the disk shelf.

5. If the problem persists, contact technical support.

ses.status.displayInfo

Message ses.status.displayInfo

Severity INFO

Description This message occurs when a previous condition in the display panel is corrected.

Corrective action None.

ses.status.displayWarning

Message ses.status.displayWarning

Severity WARNING

Description This message occurs when the SCSI Enclosure Services (SES) module detects a warning condition for the disk shelf display panel. The disk shelf might be unable to provide correct addresses to its disks.

Corrective action

1. If possible, verify that the connection between the disk shelf and the display is secure.

2. Verify that the SES module or modules are fully seated; replacing them might solve the problem.

3. If the problem persists, the SES module that detected the warning condition might be faulty.

4. If the problem persists after the module or modules are replaced, replace the disk shelf.

5. If the problem persists, contact technical support.

ses.status.driveError

Message ses.status.driveError

Severity NODE_ERROR

Description This message occurs when a critical condition is detected for the disk drive in the shelf. The drive might fail.

Corrective action

1. Make sure that the drive is not running on a degraded volume. If it is, then add as many spares as necessary into the system, up to the specified level.

2. After the volume is no longer in degraded mode, replace the drive that is failing.

ses.status.driveOk

Message ses.status.driveOk

Severity INFO

Description This message occurs when a disk drive that was previously experiencing a problem returns to normal operation.

Corrective action None.

ses.status.driveWarning

Message ses.status.driveWarning

Severity NODE_ERROR

Description This message occurs when a non-critical condition is detected for the disk drive in the shelf. The drive might fail.

Corrective action

1. Make sure that the drive is not running on a degraded volume. If it is, then add as many spares as necessary into the system, up to the specified level.

2. After the volume is no longer in degraded mode, replace the drive that is failing.

ses.status.electronicsError

Message ses.status.electronicsError

Severity NODE_ERROR

Description This message occurs when a failure has been detected in the module that provides disk SCSI Enclosure Services (SES) monitoring capability.

Corrective action Replace the module. In some disk shelf types, this function is integrated into the Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.

ses.status.electronicsInfo

Message ses.status.electronicsInfo

Severity INFO

Description This message occurs when a problem previously reported about the disk shelf SCSI Enclosure Services (SES) electronics is corrected or when other information about the enclosure electronics that does not necessarily require customer action is reported.

Corrective action None.

ses.status.electronicsWarn

Message ses.status.electronicsWarn

Severity WARNING

Description This message occurs when a non-fatal condition is detected in the module that provides disk SCSI Enclosure Services (SES) monitoring capability.

Corrective action Replace the module. In some disk shelf types, this function is integrated into the Fibre Channel, SCSI, or serial-attached SCSI (SAS) interface modules.

ses.status.ESHPctlStatus

Message ses.status.ESHPctlStatus

Severity DEBUG

Description This message occurs when a change in the power control status is detected in the indicated disk shelf.

Corrective action None.

ses.status.fanError

Message ses.status.fanError

Severity NODE_ERROR

Description This message occurs when the indicated disk shelf cooling fan or fan module fails, and the shelf or its components are not receiving required cooling airflow.

Corrective action 1. Verify that the fan module is fully seated and secured. (The fan is integrated into the power supply module in some disk shelves.)

2. If the problem persists, replace the fan module.

3. If the problem persists, contact technical support.

ses.status.fanInfo

Message ses.status.fanInfo

Severity INFO

Description This message occurs when a condition previously reported about the disk shelf cooling fan or fan module is corrected or when other information about the fans that does not necessarily require customer action is reported.

Corrective action None.

ses.status.fanWarning

Message ses.status.fanWarning

Severity WARNING

Description This message occurs when a disk shelf cooling fan is not operating to specification, or a component of a fan module has stopped functioning. The disk shelf components continue to receive cooling airflow but might eventually reach temperatures that are out of specification.

Corrective action

1. Verify that the fan module is fully seated and secured. (The fan is integrated into the power supply module in some disk shelves.)

2. If the problem persists, replace the fan module.

3. If the problem persists, contact technical support.

ses.status.ModuleError

Message ses.status.ModuleError

Severity NODE_ERROR

Description This message occurs when the reporting disk shelf detects an error in the indicated disk shelf module.

Corrective action 1. Verify that the shelf module is fully seated and secure.

2. If the problem persists, replace the disk shelf module.

ses.status.ModuleInfo

Message ses.status.ModuleInfo

Severity INFO

Description This message occurs when a previously reported error in the shelf module is corrected or when other information that does not necessarily require customer action is reported.

Corrective action None.

ses.status.ModuleWarn

Message ses.status.ModuleWarn

Severity WARNING

Description This message occurs when the reporting disk shelf detects a warning in the indicated disk shelf module.

Corrective action 1. Verify that the shelf module is fully seated and secure.

2. If the problem persists, replace the disk shelf module.

ses.status.psError

Message ses.status.psError

Severity NODE_ERROR

Description This message occurs when a critical condition is detected in the indicated storage shelf power supply. The power supply might fail.

Corrective action

1. Verify that power input to the shelf is correct. If separate events of this type are reported simultaneously, the common power distribution point might be at fault.

2. If the shelf is in a cabinet, verify that the power distribution unit is ON and functioning properly. Make sure that the shelf power cords are fully inserted and secured, the supply is fully seated and secured, and the supply is switched ON.

3. Verify that power supply fans, if any, are functioning. If the problem persists, replace the power supply.

4. If the problem persists, contact technical support.

ses.status.psInfo

Message ses.status.psInfo

Severity INFO

Description This message occurs when a condition previously reported about the disk shelf power supply is corrected or when other information about the power supply that does not necessarily require customer action is reported.

Corrective action None.

ses.status.psWarning

Message ses.status.psWarning

Severity WARNING

Description This message occurs when a warning condition is detected in the indicated storage shelf power supply. The power supply might be able to continue operation.

Corrective action

1. Verify that the disk shelf is receiving power. If separate events of this type are reported simultaneously, the common power distribution point might be at fault.

2. If the disk shelf is in a cabinet, verify that the power distribution unit status is ON and functioning properly. Make sure that the disk shelf power cords are fully inserted and secured, the power supply is fully seated and secured, and the power supply is switched on.

3. If the problem persists, replace the power supply.

4. If the problem persists, contact technical support.

ses.status.temperatureError

Message ses.status.temperatureError

Severity NODE_ERROR

Description This message occurs when the indicated disk shelf temperature sensor reports a temperature that exceeds the specifications for the disk shelf or its components.

Corrective action

1. Verify that the ambient temperature where the shelf is installed is within equipment specifications using the environment shelf [adapter] command, and that airflow clearances are maintained.

2. If the same disk shelf also reports fan or fan module failures, correct that problem now. If the problem is reported by the ambient temperature sensor (located on the operator panel), verify that the connection between the disk shelf and the panel is secure, if possible.

3. If the problem persists, and if the shelf has multiple temperature sensors of which only one exhibits the problem, replace the module that contains the sensor that reports the error. If the problem persists, contact technical support for assistance.

Note: You can display temperature thresholds for each shelf through the environment shelf command.

ses.status.temperatureInfo

Message ses.status.temperatureInfo

Severity INFO

Description This message occurs when an error or warning condition previously reported by or about the disk shelf temperature sensor is corrected or when other information about the temperature in the disk shelf that does not necessarily require customer action is reported.

Corrective action None.

ses.status.temperatureWarning

Message ses.status.temperatureWarning

Severity WARNING

Description This message occurs when the indicated disk shelf temperature sensor reports a temperature that is close to exceeding the specifications for the disk shelf or its components.

Corrective action

1. Verify that the ambient temperature where the disk shelf is installed is within equipment specifications by using the environment shelf [adapter] command, and that airflow clearances are maintained.

2. If this disk shelf also reports fan or fan module errors or warnings, correct those problems now.

3. If the problem persists, and the shelf has multiple temperature sensors and only one of them exhibits the problem, replace the module that contains the sensor.

4. If the problem persists, contact technical support.

Note: Temperature thresholds for each shelf can be displayed through the environment shelf command.
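
The distinction between a temperature warning (close to exceeding specification) and a temperature error (exceeding specification) can be illustrated with a small classifier. This is a minimal Python sketch, assuming a high warning threshold and a high critical threshold have been read from the environment shelf output by hand. The function name, the parameter names, and the sample values are hypothetical, and low-temperature thresholds are left out for brevity.

```python
def classify_temperature(reading_c, warn_high_c, crit_high_c):
    """Classify a sensor reading against two high-temperature thresholds.

    Returns "error" above the critical threshold, "warning" at or above the
    warning threshold, and "ok" otherwise. The threshold semantics are an
    assumption made for illustration.
    """
    if reading_c > crit_high_c:
        return "error"
    if reading_c >= warn_high_c:
        return "warning"
    return "ok"


# Sample thresholds only; actual values come from the shelf itself.
status = classify_temperature(41, warn_high_c=40, crit_high_c=50)
```

Comparing readings from several sensors in the same shelf this way can help identify the case described in the corrective action, where only one sensor exhibits the problem.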

ses.status.upsError

Message ses.status.upsError

Severity NODE_ERROR

Description This message occurs when the disk shelf detects a failure in the uninterruptible power supply (UPS) attached to it. This might occur, for example, if power to the UPS is lost.

Corrective action

1. Restore power to the UPS.

2. Verify that the connection from the UPS to the disk shelf is in place and secured and that the UPS is enabled.

3. If the problem persists, contact technical support.

ses.status.upsInfo

Message ses.status.upsInfo

Severity INFO

Description This message occurs when a condition previously reported about the uninterruptible power supply (UPS) attached to the disk shelf is corrected or when other information about the UPS that does not necessarily require customer action is reported.

Corrective action None.

ses.status.volError

Message ses.status.volError

Severity NODE_ERROR

Description This message occurs when a critical condition is detected in the indicated disk storage shelf voltage sensor. The shelf might be able to continue operation.

Corrective action

1. Verify that the power supply and the AC line are supplying power.

2. Monitor the power grid for abnormalities.

3. Replace the power supply.

4. If the problem persists, contact technical support.

ses.status.volWarning

Message ses.status.volWarning

Severity WARNING

Description This message occurs when a warning condition is detected in the indicated storage shelf voltage sensor. The shelf might be able to continue operation.

Corrective action 1. Verify that the power supply and the AC line are supplying power.

2. Monitor the power grid for abnormalities.

3. Replace the power supply.

4. If the problem persists, contact technical support.

ses.system.em.mmErr

Message ses.system.em.mmErr

Severity NODE_FAULT

Description This message occurs when Data ONTAP does not support this system with internal disk drives.

Corrective action Check whether this system is currently supported. If it is, upgrade to the appropriate Data ONTAP version.

ses.tempOwnershipDone

Message ses.tempOwnershipDone

Severity DEBUG

Description This message occurs when SCSI Enclosure Services (SES) completes temporary ownership acquisition.

Corrective action Contact technical support.

sfu.adapterSuspendIO

Message sfu.adapterSuspendIO

Severity INFO

Description This message occurs during a disk shelf firmware update on a disk shelf that cannot perform I/O while updating firmware. Typically, the shelves involved are bridge-based as opposed to LRC-based or ESH-based.

Corrective action None.

sfu.auto.update.off.impact

Message sfu.auto.update.off.impact

Severity WARNING

Description This message occurs when the automated disk shelf firmware update cannot be completed on a downrev disk shelf enclosure because the (hidden) global option shelf.fw.auto.update is set to off.

Corrective action Use the storage download shelf command to update. To have the automatic update enabled, set the hidden option shelf.fw.auto.update to on.

sfu.ctrllerElmntsPerShelf

Message sfu.ctrllerElmntsPerShelf

Severity INFO

Description This message occurs when a disk shelf firmware download determines the number of controller elements per shelf that can be downloaded.

Corrective action None.

sfu.downloadCtrllerBridge

Message sfu.downloadCtrllerBridge

Severity INFO

Description This message occurs when a disk shelf firmware download starts on a particular disk shelf.

Corrective action None.

sfu.downloadError

Message sfu.downloadError

Severity ERR

Description This message occurs when a disk shelf firmware update fails to successfully download firmware to a disk shelf or shelves in the system.

Corrective action 1. Redownload the latest disk shelf firmware from the NOW site at http://now.netapp.com/NOW/download/tools/diskshelf/.

2. Attempt to download disk shelf firmware again by using the storage download shelf command.

sfu.downloadingController

Message sfu.downloadingController

Severity INFO

Description This message occurs when a disk shelf firmware download starts on a particular disk shelf.

Corrective action None.

sfu.downloadingCtrllerR1XX

Message sfu.downloadingCtrllerR1XX

Severity INFO

Description This message occurs when a disk shelf firmware download starts on a particular disk shelf.

Corrective action None.

sfu.downloadStarted

Message sfu.downloadStarted

Severity INFO

Description This message occurs when a disk shelf firmware update starts to download disk shelf firmware.

Corrective action None.

sfu.downloadSuccess

Message sfu.downloadSuccess

Severity INFO

Description This message occurs when disk shelf firmware is updated successfully.

Corrective action None.

sfu.downloadSummary

Message sfu.downloadSummary

Severity INFO

Description This message occurs when a disk shelf firmware update is completed successfully.

Corrective action None.

sfu.downloadSummaryErrors

Message sfu.downloadSummaryErrors

Severity ERR

Description This message occurs when a disk shelf firmware update is completed without successfully downloading to all shelves it attempted.

Corrective action Issue the storage download shelf command again.

sfu.FCDownloadFailed

Message sfu.FCDownloadFailed

Severity ERR

Description This message occurs when a disk shelf firmware update fails to download shelf firmware to a Fibre Channel or an ATA shelf successfully.

Corrective action 1. Redownload the latest disk shelf firmware from the NOW site at http://now.netapp.com/NOW/download/tools/diskshelf/.

2. Attempt to download disk shelf firmware again by using the storage download shelf command.

sfu.firmwareDownrev

Message sfu.firmwareDownrev

Severity WARNING

Description This message occurs when disk shelf firmware is downrev and therefore cannot be updated automatically.

Corrective action 1. Copy updated disk shelf firmware into the /etc/shelf_fw directory on the storage appliance.

2. Manually issue the storage download shelf command.

sfu.firmwareUpToDate

Message sfu.firmwareUpToDate

Severity INFO

Description This message occurs when a disk shelf firmware update is requested but the system determines that all shelves are already updated to the latest version of firmware available.

Corrective action None.

sfu.partnerInaccessible

Message sfu.partnerInaccessible

Severity ERR

Description This message occurs in an HA pair in which communication between partner nodes cannot be established.

Corrective action 1. Verify that the HA pair interconnect is operational.

2. Retry the storage download shelf command.

sfu.partnerNotResponding

Message sfu.partnerNotResponding

Severity ERR

Description This message occurs in an HA pair in which one node does not respond to firmware download requests from another node. In this case, the other node cannot download disk shelf firmware.

Corrective action

Verify that the HA pair interconnect is up and running on both nodes of the configuration and then attempt to redownload the disk shelf firmware, using the storage download shelf command.

sfu.partnerRefusedUpdate

Message sfu.partnerRefusedUpdate

Severity ERR

Description This message occurs in an HA pair in which one node refuses firmware download requests from its partner node. In this case, the partner node cannot download disk shelf firmware.

Corrective action 1. Verify that both partners are running the same version of Data ONTAP and that the HA pair interconnect is up and running on both nodes of the configuration.

2. Attempt the storage download shelf command again.

sfu.partnerUpdateComplete

Message sfu.partnerUpdateComplete

Severity INFO

Description This message occurs in an HA pair in which a partner downloads disk shelf firmware and the download is completed. At this point, this notification is sent and SCSI Enclosure Services (SES) are resumed by the partner.

Corrective action None.

sfu.partnerUpdateTimeout

Message sfu.partnerUpdateTimeout

Severity INFO

Description This message occurs in an HA pair in which a partner downloads disk shelf firmware but the download times out. At this point, this notification is sent and SCSI Enclosure Services (SES) are resumed by the partner.

Corrective action 1. Verify that the HA pair interconnect is operational.

2. Retry the storage download shelf command.

sfu.rebootRequest

Message sfu.rebootRequest

Severity INFO

Description This message occurs when the disk shelf firmware update is completed. The disk shelf reboots to run the new code.

Corrective action None.

sfu.rebootRequestFailure

Message sfu.rebootRequestFailure

Severity ERR

Description This message occurs when an attempt to issue a reboot request after downloading shelf firmware fails, indicating a software error.

Corrective action Reboot the storage system, if possible, and try the firmware update again.

sfu.resumeDiskIO

Message sfu.resumeDiskIO

Severity INFO

Description This message occurs when a disk shelf firmware update is completed and disk I/O is resumed.

Corrective action None.

sfu.SASDownloadFailed

Message sfu.SASDownloadFailed

Severity ERR

Description This message occurs when a disk shelf firmware update fails to download shelf firmware to a shelf successfully.

Corrective action 1. Redownload the latest disk shelf firmware from the NOW site at http://now.netapp.com/NOW/download/tools/diskshelf/.

2. Download disk shelf firmware again by using the storage download shelf command.

sfu.statusCheckFailure

Message sfu.statusCheckFailure

Severity ERR

Description This message occurs when the storage download shelf command encounters a failure while attempting to read the status of the firmware update in progress.

Corrective action Retry the storage download shelf command.

sfu.suspendDiskIO

Message sfu.suspendDiskIO

Severity INFO

Description This message occurs when a disk shelf firmware update is started and disk I/O is suspended.

Corrective action None.

sfu.suspendSES

Message Suspending enclosure services -- partner is updating disk shelf firmware.

Severity INFO

Description This message occurs when a disk shelf firmware update is requested in an HA pair environment. In this case, one partner node updates the firmware on the disk shelf module, and the other partner node temporarily disables SCSI Enclosure Services (SES) while the firmware update is in progress.

Corrective action None.
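As a quick reference, the corrective actions for the sfu events listed above can be summarized in a small lookup. The following Python sketch is illustrative only: the event names are copied from the entries above, but the set names and the triage helper are hypothetical and are not part of Data ONTAP.

```python
# Illustrative sketch only. RETRY_DOWNLOAD, NO_ACTION, and triage() are
# hypothetical names; the event names are copied from the sfu entries above.

# Events whose corrective action ends with rerunning "storage download shelf"
# (after the prerequisite step named in each entry).
RETRY_DOWNLOAD = frozenset({
    "sfu.FCDownloadFailed",
    "sfu.firmwareDownrev",
    "sfu.partnerInaccessible",
    "sfu.partnerNotResponding",
    "sfu.partnerRefusedUpdate",
    "sfu.partnerUpdateTimeout",
    "sfu.SASDownloadFailed",
    "sfu.statusCheckFailure",
})

# Events whose corrective action is "None."
NO_ACTION = frozenset({
    "sfu.firmwareUpToDate",
    "sfu.partnerUpdateComplete",
    "sfu.rebootRequest",
    "sfu.resumeDiskIO",
    "sfu.suspendDiskIO",
    "sfu.suspendSES",
})


def triage(event_name):
    """Return a coarse hint for an sfu event name."""
    if event_name in RETRY_DOWNLOAD:
        return "retry-download"
    if event_name in NO_ACTION:
        return "none"
    # Anything else (for example sfu.rebootRequestFailure, which calls for a
    # reboot first) needs the full entry in this guide.
    return "see-guide"
```

A hint of "retry-download" does not replace the entry itself: most of these events require a preliminary step, such as verifying the HA pair interconnect or redownloading the firmware, before the command is issued again.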

Flash Cache module and PAM module EMS messages

The caching module WAFL cache, hardware driver, and system monitoring can generate error messages. All messages are reported through the EMS.

This document uses the term Flash Cache module to refer to caching modules with capacities greater than 16 GB. Before the release of Data ONTAP 7.3.5, such adapters were called Performance Acceleration Modules (PAM II). The name of the 16-GB caching module remains Performance Acceleration Module (PAM I).
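The naming convention in the preceding paragraph can be expressed as a short Python sketch. The function name caching_module_name is hypothetical and exists only to illustrate the rule; capacities below 16 GB are rejected because this guide does not describe them.

```python
# Illustrative sketch only; caching_module_name() is a hypothetical helper.
def caching_module_name(capacity_gb):
    """Return the term this guide uses for a caching module of the given size."""
    if capacity_gb == 16:
        return "Performance Acceleration Module (PAM I)"
    if capacity_gb > 16:
        # Called PAM II before the release of Data ONTAP 7.3.5.
        return "Flash Cache module"
    raise ValueError("capacity not covered by this guide: %r GB" % capacity_gb)
```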

extCache.io.BlockChecksumError

Message extCache.io.BlockChecksumError

Severity NODE_ERROR

Description This message occurs when the external cache detects a block checksum verification error while performing a read operation. The operation will be retried from persistent storage (RAID).

Corrective action Contact technical support.

extCache.io.cardError

Message extCache.io.cardError

Severity NODE_ERROR

Description This message occurs when the external cache detects a card failure on read or write I/O. If the I/O was a read, the operation will be retried from persistent storage (RAID).

Corrective action Contact technical support.

extCache.io.readError

Message extCache.io.readError

Severity NODE_ERROR

Description This message occurs when the external cache detects an I/O error on a read. The operation will be retried from persistent storage (RAID).

Corrective action Contact technical support.

extCache.io.writeError

Message extCache.io.writeError

Severity NODE_ERROR

Description This message occurs when the external cache detects an I/O error on a write. This causes the external cache component to be disabled and might result in degraded performance until the problem is corrected.

Corrective action Contact technical support.

extCache.offline

Message extCache.offline

Severity SVC_ERROR

Description This message occurs when the external cache is automatically taken offline and disabled. This can happen after an I/O error on the external cache and might result in degraded performance until the problem is corrected. Check the Event Management System (EMS) log for earlier errors.

Corrective action Contact technical support.

extCache.ReconfigComplete

Message extCache.ReconfigComplete

Severity NODE_ERROR

Description This message occurs when the Write Anywhere File Layout (WAFL) external cache has detected a failure of one or more cache memory cards, and was able to successfully reconfigure to continue operation with the remaining cards.

Corrective action None.

extCache.ReconfigFailed

Message extCache.ReconfigFailed

Severity NODE_ERROR

Description This message occurs when an attempt to reconfigure the external cache has failed. The message identifies what step of the reconfiguration failed.

Corrective action Contact technical support.

extCache.ReconfigStart

Message extCache.ReconfigStart

Severity NODE_ERROR

Description This message occurs when the Write Anywhere File Layout (WAFL) external cache has detected a failure of one or more cache memory cards. An attempt will be made to restart the cache with the remaining card(s). Even if the cache is restarted, performance may be degraded due to the reduced size of cache available. See related EMS messages for details of the failing unit.

Corrective action Contact technical support.

extCache.UECCerror

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message extCache.UECCerror

Severity NODE_ERROR

Description This message occurs when an uncorrectable multi-bit ECC memory error is reported to the Write Anywhere File Layout (WAFL) file system external cache. When this event occurs, the data will be re-read from persistent storage (RAID) and operation continues. See related EMS messages for details about the failing unit.

Corrective action If multiple uncorrectable multi-bit ECC errors are issued, this indicates that a hardware component might be failing and should be considered for replacement.
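The replacement rule in the note for this message (more than 10 uncorrectable ECC memory errors per day for three consecutive days on a 16-GB Performance Acceleration Module) can be checked mechanically. The Python sketch below is a minimal illustration, assuming that you have already tallied extCache.UECCerror events into one count per calendar day, ordered oldest first; the function and constant names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical. The two limits come from
# the note for extCache.UECCerror in this guide.
UECC_PER_DAY_LIMIT = 10   # "more than 10 ... errors occur per day"
CONSECUTIVE_DAYS = 3      # "for three consecutive days"


def should_replace_pam(daily_uecc_counts):
    """Return True if the note's replacement rule is met.

    daily_uecc_counts: one error count per calendar day, oldest first.
    """
    streak = 0
    for count in daily_uecc_counts:
        # A day extends the streak only if it strictly exceeds the limit.
        streak = streak + 1 if count > UECC_PER_DAY_LIMIT else 0
        if streak >= CONSECUTIVE_DAYS:
            return True
    return False
```

For example, daily counts of 11, 12, and 13 satisfy the rule, whereas 11, 12, 10, 13, and 14 do not, because the third day does not exceed 10 and the streak restarts.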

extCache.UECCmax

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message extCache.UECCmax

Severity NODE_ERROR

Description This message occurs when the Write Anywhere File Layout (WAFL) file system external cache has detected excessive multi-bit uncorrectable ECC memory errors in a recent period. When too many multi-bit ECC errors are reported, WAFL disables the external cache until the failing component is replaced, resulting in degraded performance. See related EMS messages for details about the failing unit.

Corrective action Contact technical support.

fal.chan.offline.comp

Message fal.chan.offline.comp

Severity INFO

Description This message occurs when the FAL (Flash Adaptation Layer) finishes taking a channel offline.

Corrective action None.

fal.chan.online.erase.warn

Message fal.chan.online.erase.warn

Severity INFO

Description This message occurs when an erase of a label block fails while attempting to bring online a channel of a card. This could lead to a failure to read the label (see the fal.chan.online.read.warn event).

Corrective action None.

fal.chan.online.fail

Message fal.chan.online.fail

Severity SVC_ERROR

Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring online a channel of a card for the mentioned reason.

Corrective action None.

fal.chan.online.read.warn

Message fal.chan.online.read.warn

Severity INFO

Description This message occurs when the read of a label fails while attempting to bring online a channel of a module. This is expected on the first boot with a Flash Cache module. Otherwise, it means existing FAL (Flash Adaptation Layer) label information is lost. The current version of software does not depend on label information, so this loss is not a problem right now. However, future versions of software might store cache data persistently. If persistent data is stored on a card and this version of software is booted on such a system, failure to read the label might lead to loss of some cached data.

Corrective action None.

fal.chan.online.rep.fail

Message fal.chan.online.rep.fail

Severity SVC_ERROR

Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring online all channels in a caching module. The reasons for failure are listed in the accompanying fal.chan.online.fail events.

Corrective action Contact technical support.

fal.chan.online.rep.part

Message fal.chan.online.rep.part

Severity SVC_ERROR

Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring online some channels in a caching module. The reasons for failure are listed in the accompanying fal.chan.online.fail events.

Corrective action Contact technical support.

fal.chan.online.rep.succ

Message fal.chan.online.rep.succ

Severity INFO

Description This message occurs when the FAL (Flash Adaptation Layer) successfully brings online all channels in a card.

Corrective action None.

fal.chan.online.rep.ver.err

Message fal.chan.online.rep.ver.err

Severity SVC_ERROR

Description This message occurs when the FAL (Flash Adaptation Layer) fails to bring online all channels in a caching module because of version mismatch.

Corrective action Follow the documented revert procedure.

fal.chan.online.write.warn

Message fal.chan.online.write.warn

Severity INFO

Description This message occurs when a write of a label block fails while attempting to bring online a channel of a module. This could lead to a failure to read the label (see the fal.chan.online.read.warn event).

Corrective action None.

fal.init.failed

Message fal.init.failed

Severity SVC_ERROR

Description This message occurs when the FAL (Flash Adaptation Layer) fails to initialize. This error likely indicates a software bug.

Corrective action Contact technical support.

fmm.bad.block.detected

Message fmm.bad.block.detected

Severity DEBUG

Description This message occurs when Flash Management Module (FMM) gets a message from a flash device driver reporting that a bad block is detected.

Corrective action None.

fmm.device.stats.missing

Message fmm.device.stats.missing

Severity DEBUG

Description This message occurs when the onboard copy of statistics maintained by Flash Management Module (FMM) is missing. This can happen when a device is initially activated in the controller.

Corrective action None.

fmm.domain.card.failure

Message fmm.domain.card.failure

Severity SVC_ERROR

Description This message occurs when the Flash Management Module (FMM) detects that a flash device failed. Typically, this is the result of a hardware failure on the flash device itself.

Corrective action Repair or replace the failed flash device.

fmm.domain.core.failure

Message fmm.domain.core.failure

Severity DEBUG

Description This message occurs when Flash Management Module (FMM) detects that a core domain on a flash device managed by FMM has failed. Typically, this is the result of a hardware failure on the flash device itself. Core failure is not considered to be fatal.

Corrective action None.

fmm.hourly.device.report

Message fmm.hourly.device.report

Severity DEBUG

Description This message is sent by Flash Management Module (FMM) every hour, to report the status of a flash device that FMM manages.

Corrective action None.

fmm.threshold.bank.degraded

Message fmm.threshold.bank.degraded

Severity DEBUG

Description This message occurs when Flash Management Module (FMM) detects that in a flash device, the percentage of a bank that is offline is above a warning threshold. FMM responds with the action described by the action parameter.

Corrective action None.

fmm.threshold.bank.offline

Message fmm.threshold.bank.offline

Severity DEBUG

Description This message occurs when Flash Management Module (FMM) detects that in a flash device, a critical percentage of a bank is offline, beyond which the bank cannot operate. FMM responds with the action described by the action parameter.

Corrective action None.

fmm.threshold.card.degraded

Message fmm.threshold.card.degraded

Severity SVC_ERROR

Description This message occurs when the Flash Management Module (FMM) detects that the offline percentage of a flash device exceeds a specified warning threshold. FMM responds with the action described by the action parameter.

Corrective action Repair or replace this degraded flash device.

fmm.threshold.card.failure

Message fmm.threshold.card.failure

Severity SVC_ERROR

Description This message occurs when Flash Management Module (FMM) detects that the offline percentage of a flash device exceeds a specified critical threshold beyond which the device cannot operate. FMM responds with the action described by the action parameter.

Corrective action This flash device can no longer operate and will be taken offline. Repair or replace the flash device.
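The relationship between the fmm.threshold.card.degraded and fmm.threshold.card.failure events can be sketched as a comparison of an offline percentage against two thresholds. This Python sketch is illustrative only: this guide does not state the actual threshold values, so warning_pct and critical_pct are caller-supplied placeholders, and classify_card is a hypothetical name, not an FMM interface.

```python
# Illustrative sketch only; classify_card() and its parameters are
# hypothetical. The guide does not publish the real threshold values.
def classify_card(offline_pct, warning_pct, critical_pct):
    """Return the EMS event name implied by an offline percentage, or None."""
    if not 0 <= offline_pct <= 100:
        raise ValueError("offline_pct must be between 0 and 100")
    if warning_pct > critical_pct:
        raise ValueError("warning_pct must not exceed critical_pct")
    # Check the critical threshold first: a device past it cannot operate.
    if offline_pct > critical_pct:
        return "fmm.threshold.card.failure"
    if offline_pct > warning_pct:
        return "fmm.threshold.card.degraded"
    return None
```

Both comparisons are strict because the descriptions say the offline percentage "exceeds" the threshold; a value exactly equal to a threshold produces no event in this sketch.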

fmm.threshold.core.offline

Message fmm.threshold.core.offline

Severity DEBUG

Description This message occurs when Flash Management Module (FMM) detects that an excessive number of blocks in a core of a flash device have gone bad. The threshold for a core is defined as a percentage of bad blocks, and when that threshold is exceeded, FMM responds with the action described by the action parameter.

Corrective action None.

iomem.bbm.bbtl.overflow

Message iomem.bbm.bbtl.overflow

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that the Bad Block Transaction Log has overflowed.

Corrective action None.

iomem.bbm.init.failed

Message iomem.bbm.init.failed

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that an operation to a NOR flash memory has failed.

Corrective action None.

iomem.bbm.new.flash

Message iomem.bbm.new.flash

Severity DEBUG

Description This message occurs when the caching module driver detects that a NAND flash package has been replaced.

Corrective action None.

iomem.card.disable

Message iomem.card.disable

Severity WARNING

Description This message occurs when the caching module has been disabled as a result of an explicit diagnostic command.

Corrective action None.

iomem.card.enable

Message iomem.card.enable

Severity INFO

Description This message occurs when the caching module has been enabled as a result of an explicit diagnostic command.

Corrective action None.

iomem.card.fail.cecc

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message iomem.card.fail.cecc

Severity NODE_ERROR

Description This message occurs when the caching module driver takes a caching module offline due to an excessive number of correctable memory errors.

Corrective action Replace the caching module.

iomem.card.fail.data.crc

Message iomem.card.fail.data.crc

Severity NODE_ERROR

Description This message occurs when the caching module driver takes a caching module offline due to an excessive number of detected data cyclic redundancy check (CRC) errors.

Corrective action Replace the caching module.

iomem.card.fail.desc.crc

Message iomem.card.fail.desc.crc

Severity NODE_ERROR

Description This message occurs when the caching module driver takes a caching module offline due to an excessive number of detected descriptor cyclic redundancy check (CRC) errors.

Corrective action Replace the caching module.

iomem.card.fail.dimm

Message iomem.card.fail.dimm

Severity NODE_ERROR

Description This message occurs when the caching module driver takes a caching module offline due to failure of a memory DIMM.

Corrective action Replace the caching module.

iomem.card.fail.firmware.primary

Message iomem.card.fail.firmware.primary

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that the module is not running on the primary firmware image. The card does not function unless it is running on the primary image.

Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic tool. 32xx and 62xx systems use system-level diagnostics, which is a different diagnostic tool. For details about using system-level diagnostics, see the System-Level Diagnostics Guide on the NetApp Support Site at support.netapp.com.

1. Enter the following command at the boot environment prompt:

boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Take one of the following actions:

• If your system has a 16-GB Performance Acceleration Module, select the iomem submenu and then run test 62, Update FPGA [Extended].

• If your system has a 256-GB or 512-GB Performance Acceleration Module, select the pam2 submenu and then run test 61, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.
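The choice of submenu and test number in step 3 depends only on the module capacity, which the following Python sketch captures for quick reference. The function name fpga_update_test is hypothetical; the submenu names and test numbers are those given in the steps above, and other capacities are rejected because the steps do not cover them.

```python
# Illustrative sketch only; fpga_update_test() is a hypothetical helper.
# Submenu names and test numbers are taken from step 3 above.
def fpga_update_test(capacity_gb):
    """Return (submenu, test_number) for the Update FPGA [Extended] test."""
    if capacity_gb == 16:
        return ("iomem", 62)
    if capacity_gb in (256, 512):
        return ("pam2", 61)
    raise ValueError("capacity not covered by these steps: %r GB" % capacity_gb)
```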

iomem.card.fail.fpga

Message iomem.card.fail.fpga

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a fatal operational error with the onboard field programmable gate array (FPGA) hardware and is taking the caching module offline.

Corrective action Contact technical support.

iomem.card.fail.fpga.primary

Message iomem.card.fail.fpga.primary

Severity NODE_ERROR

Description This message occurs when the acceleration card driver detects that the card is not running on the primary firmware image. The card does not function unless it is running on the primary image.

Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic tool. 32xx and 62xx systems use system-level diagnostics, which is a different diagnostic tool. For details about using system-level diagnostics, see the System-Level Diagnostics Guide on the NetApp Support Site at support.netapp.com.

Take one of the following actions:

• If you have a 16-GB Performance Acceleration Module, complete the following steps:

1. Enter the following command at the boot environment prompt:

boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Run test 62, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.

• If you have a Flash Cache module, the FPGA firmware should be programmed automatically. Other EMS messages earlier in the log should indicate why programming failed.

iomem.card.fail.fpga.rev

Message iomem.card.fail.fpga.rev

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that the field programmable gate array (FPGA) firmware image is a revision not supported by the driver.

Corrective action Note: The following steps are for systems that use the SYSDIAG diagnostic tool. 32xx and 62xx systems use system-level diagnostics, which is a different diagnostic tool. For details about using system-level diagnostics, see the System-Level Diagnostics Guide on the NetApp Support Site at support.netapp.com.

Take one of the following actions:

• If you have a 16-GB Performance Acceleration Module, complete the following steps:

1. Enter the following command at the boot environment prompt:

boot_diags

2. Select xtnd yes on the diagnostic main menu.

3. Run test 62, Update FPGA [Extended].

4. Exit diagnostics and reboot the system.

• If you have a Flash Cache module, the FPGA firmware should be programmed automatically. Other EMS messages earlier in the log should indicate why programming failed.

iomem.card.fail.internal

Message iomem.card.fail.internal

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a fatal internal error on the caching module and is taking the module offline.

Corrective action Contact technical support.

iomem.card.fail.pci

Message iomem.card.fail.pci

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a fatal PCI error on the caching module and is taking the module offline.

Corrective action Contact technical support.

iomem.card.fail.uecc

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message iomem.card.fail.uecc

Severity NODE_ERROR

Description This message occurs when the caching module driver takes a caching module offline due to an excessive number of uncorrectable memory errors.

Corrective action Replace the caching module.

iomem.dimm.log.checksum

Message iomem.dimm.log.checksum

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a checksum error in the error log for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.log.init

Message iomem.dimm.log.init

Severity INFO

Description This message occurs when the caching module driver initializes the error log for a DIMM.

Corrective action None.

iomem.dimm.log.read

Message iomem.dimm.log.read

Severity NODE_ERROR

Description This message occurs when the caching module driver fails to read the error log for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.log.sync

Message iomem.dimm.log.sync

Severity INFO

Description This message occurs when the caching module driver is writing the error log for a DIMM to persistent storage.

Corrective action None.

iomem.dimm.log.write

Message iomem.dimm.log.write

Severity NODE_ERROR

Description This message occurs when the caching module driver fails to write the error log for a DIMM on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.banks

Message iomem.dimm.mismatch.banks

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of banks that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.burst

Message iomem.dimm.mismatch.burst

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a burst size that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.casLatency

Message iomem.dimm.mismatch.casLatency

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a column address select (CAS) latency that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.columns

Message iomem.dimm.mismatch.columns

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of columns that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.dataWidth

Message iomem.dimm.mismatch.dataWidth

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a data synchronous dynamic RAM (SDRAM) width that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.eccWidth

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message iomem.dimm.mismatch.eccWidth

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with an ECC synchronous dynamic RAM (SDRAM) width that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.ranks

Message iomem.dimm.mismatch.ranks

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of ranks that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.rows

Message iomem.dimm.mismatch.rows

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of rows that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.

iomem.dimm.mismatch.vendor

Message iomem.dimm.mismatch.vendor

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a manufacturer ID that does not match that of the other installed DIMMs on the caching module.

Corrective action Replace the caching module.
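Each iomem.dimm.mismatch event above reports one DIMM attribute that differs among the DIMMs installed on a caching module. The Python sketch below illustrates that idea. It is not the driver's implementation: the dictionary keys (banks, cas_latency, and so on) are hypothetical field names chosen for this example, and only the event names come from this guide.

```python
# Illustrative sketch only; attribute keys and find_mismatches() are
# hypothetical. Event names are copied from the entries above.
MISMATCH_EVENTS = {
    "banks": "iomem.dimm.mismatch.banks",
    "burst": "iomem.dimm.mismatch.burst",
    "cas_latency": "iomem.dimm.mismatch.casLatency",
    "columns": "iomem.dimm.mismatch.columns",
    "data_width": "iomem.dimm.mismatch.dataWidth",
    "ecc_width": "iomem.dimm.mismatch.eccWidth",
    "ranks": "iomem.dimm.mismatch.ranks",
    "rows": "iomem.dimm.mismatch.rows",
    "vendor": "iomem.dimm.mismatch.vendor",
}


def find_mismatches(dimms):
    """Return event names for attributes that are not identical on all DIMMs.

    dimms: list of dicts, each holding every key in MISMATCH_EVENTS.
    """
    events = []
    for attribute, event in MISMATCH_EVENTS.items():
        # More than one distinct value means at least one DIMM differs.
        if len({dimm[attribute] for dimm in dimms}) > 1:
            events.append(event)
    return events
```

In every case the corrective action in this guide is the same: replace the caching module.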

iomem.dimm.spd.banks

Message iomem.dimm.spd.banks

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of banks incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.burst

Message iomem.dimm.spd.burst

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a burst size incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.casLatency

Message iomem.dimm.spd.casLatency

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a column address select (CAS) latency incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.checksum

Message iomem.dimm.spd.checksum

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a checksum error for the identifying information read from the serial presence detect (SPD) electronically erasable programmable read-only memory (EEPROM) of a DIMM installed on the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.columns

Message iomem.dimm.spd.columns

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of columns incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.dataWidth

Message iomem.dimm.spd.dataWidth

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a data synchronous dynamic RAM (SDRAM) width incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.detect

Message iomem.dimm.spd.detect

Severity INFO

Description This message occurs when the caching module driver detects the presence of an installed DIMM during initialization.

Corrective action None.

iomem.dimm.spd.eccWidth

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message iomem.dimm.spd.eccWidth

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with an ECC synchronous dynamic RAM (SDRAM) width incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.ranks

Message iomem.dimm.spd.ranks

Severity NODE_ERROR

Description This message occurs when the acceleration card driver detects a DIMM with a number of ranks incompatible with the memory controller of the acceleration card.

Corrective action Replace the acceleration card.

iomem.dimm.spd.read

Message iomem.dimm.spd.read

Severity NODE_ERROR

Description This message occurs when the caching module driver fails to read the identifying information from the synchronous dynamic RAM (SDRAM) electronically erasable programmable read-only memory (EEPROM) of a DIMM installed on the caching module.

Corrective action Replace the caching module.

iomem.dimm.spd.rows

Message iomem.dimm.spd.rows

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a DIMM with a number of rows incompatible with the memory controller of the caching module.

Corrective action Replace the caching module.

iomem.dma.crc.data

Message iomem.dma.crc.data

Severity WARNING

Description This message occurs when the caching module driver detects a data checksum error for data in transit across the PCI link between the system and the caching module.

Corrective action Contact technical support.

iomem.dma.crc.desc

Message iomem.dma.crc.desc

Severity WARNING

Description This message occurs when the caching module driver detects a descriptor checksum error for data in transit across the PCI link between the system and the caching module.

Corrective action Contact technical support.

iomem.dma.internal

Message iomem.dma.internal

Severity WARNING

Description This message occurs when the caching module driver detects an internal direct memory access (DMA) error during data transfer.

Corrective action Contact technical support.

iomem.dma.stall

Message iomem.dma.stall

Severity WARNING

Description This message occurs when the acceleration card driver detects that a direct memory access (DMA) channel has unexpectedly stalled and is attempting to restart the DMA channel for normal operation.

Corrective action None.

iomem.ecc.cecc

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.

Message iomem.ecc.cecc

Severity WARNING

Description This message occurs when a correctable ECC memory error is detected while accessing the memory of a caching module. If frequent, correctable ECC errors usually indicate that a hardware memory component of the caching module is failing.

Corrective action None.

iomem.ecc.correct.off

Message iomem.ecc.correct.off

Severity WARNING

Description This message occurs when the error correction code (ECC) memory error correction has been disabled for a caching module.

Corrective action ECC error correction should never be disabled for the caching module under normal operating conditions. The only way that this can occur is if it has been explicitly disabled through a private diagnostic interface. If this message is encountered under normal operating conditions, contact technical support.

iomem.ecc.correct.on

Message iomem.ecc.correct.on

Severity INFO

Description This message occurs when the error correction code (ECC) memory error correction has been enabled for a caching module.

Corrective action None.

iomem.ecc.detect.off

Message iomem.ecc.detect.off

Severity WARNING

Description This message occurs when the error correction code (ECC) memory error detection has been disabled for an acceleration card.

Corrective action ECC error detection should never be disabled for the caching module under normal operating conditions. The only way that this can occur is if the functionality has been explicitly disabled via a private diagnostic interface. If this message is encountered under normal operating conditions, contact technical support.

iomem.ecc.detect.on

Message iomem.ecc.detect.on

Severity INFO

Description This message occurs when the error correction code (ECC) memory error detection has been enabled for a caching module.

Corrective action None.

iomem.ecc.inject

Message iomem.ecc.inject

Severity WARNING

Description This message occurs when an error correction code (ECC) memory error is manually injected into the memory of a caching module. This injection event will only occur during diagnostic testing.

Corrective action None.

iomem.ecc.summary

Message iomem.ecc.summary

Severity WARNING

Description This message occurs when the caching module driver makes its periodic error summary report indicating that uncorrectable memory errors have been detected on the acceleration card.

Corrective action Replace the acceleration card.

iomem.ecc.uecc

Message iomem.ecc.uecc

Severity NODE_ERROR

Description This message occurs when an uncorrectable ECC memory error is detected while accessing the memory of a caching module. Uncorrectable ECC errors indicate that a hardware memory component of the caching module has failed or is failing. Uncorrectable memory errors can only be isolated to a pair of DIMMs on the caching module.

Corrective action None.

Note: If you have a 16-GB Performance Acceleration Module, and if more than 10 uncorrectable ECC memory errors occur per day for three consecutive days, replace the module.
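The replacement rule in the note can be sketched in Python as follows. This is a minimal illustration, not part of Data ONTAP: the function name and its input (one calendar date per logged iomem.ecc.uecc event, gathered from the EMS log by some means not covered here) are assumptions, and "three consecutive days" is read as three consecutive calendar days.

```python
from collections import Counter
from datetime import date, timedelta

# Thresholds taken from the note; everything else here is hypothetical.
ERRORS_PER_DAY = 10
CONSECUTIVE_DAYS = 3


def should_replace_module(error_dates):
    """Return True if more than ERRORS_PER_DAY errors were logged on each of
    CONSECUTIVE_DAYS consecutive calendar days.

    error_dates: iterable of datetime.date, one entry per logged error.
    """
    per_day = Counter(error_dates)
    # Days on which the daily threshold was exceeded, in calendar order.
    bad_days = sorted(d for d, n in per_day.items() if n > ERRORS_PER_DAY)
    run = 0
    previous = None
    for day in bad_days:
        if previous is not None and day - previous == timedelta(days=1):
            run += 1
        else:
            run = 1
        if run >= CONSECUTIVE_DAYS:
            return True
        previous = day
    return False
```

Exactly 10 errors on a day does not count toward the run, because the note says "more than 10".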

iomem.fail.stripe

Message iomem.fail.stripe

Severity INFO

Description An erase stripe is being failed.

Corrective action None.

iomem.firmware.package.access

Message iomem.firmware.package.access

Severity NODE_ERROR

Description This message occurs when the caching module driver encounters a problem while accessing the firmware package. The caching module might continue to function, but it is recommended that you follow the corrective action at the earliest opportunity.

Corrective action Reinstall the Data ONTAP software package or service image.

iomem.firmware.primary

Message iomem.firmware.primary

Severity WARNING

Description This message occurs when the caching module driver detects that the card is not running on the primary firmware image. The card does not function unless it is running on the primary image.

Corrective action None.

iomem.firmware.program.complete

Message iomem.firmware.program.complete

Severity INFO

Description This message occurs when the caching module driver finishes the programming procedure for the caching module firmware.

Corrective action None.

iomem.firmware.program.fail

Message iomem.firmware.program.fail

Severity NODE_ERROR

Description This message occurs when the caching module driver fails to program the card firmware.

Corrective action Contact technical support.

iomem.firmware.program.reboot

Message iomem.firmware.program.reboot

Severity INFO

Description This message occurs when the caching module driver triggers a reboot due to programming firmware on one or more caching modules.

iomem.firmware.program.start

Message iomem.firmware.program.start

Severity INFO

Description This message occurs when the caching module driver begins the programming procedure for the module firmware.

Corrective action None.

iomem.firmware.rev

Message iomem.firmware.rev

Severity WARNING

Description This message occurs when the caching module driver detects that the field-programmable gate array (FPGA) firmware image is a revision not supported by the driver.

Corrective action None.

iomem.flash.mismatch.id

Message iomem.flash.mismatch.id

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a flash device with an identifier that does not match the identifier contained in the field-replaceable unit (FRU) information. The caching module is not functional until you resolve this issue.

Corrective action Contact technical support.

iomem.fru.badInfo

Message iomem.fru.badInfo

Severity WARNING

Description This message occurs when the caching module driver detects invalid information in the field-replaceable unit (FRU) electronically erasable programmable read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.fru.checksum

Message iomem.fru.checksum

Severity WARNING

Description This message occurs when the caching module driver detects a checksum error in the card field-replaceable unit (FRU) information for the caching module.

Corrective action Replace the caching module.

iomem.fru.read

Message iomem.fru.read

Severity WARNING

Description This message occurs when the caching module driver encounters an error reading the field-replaceable unit (FRU) electronically erasable programmable read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.fru.write

Message iomem.fru.write

Severity WARNING

Description This message occurs when the caching module driver encounters an error writing the field-replaceable unit (FRU) electronically erasable programmable read-only memory (EEPROM) of the caching module.

Corrective action Replace the caching module.

iomem.i2c.link.down

Message iomem.i2c.link.down

Severity WARNING

Description This message occurs when the caching module driver detects the failure of the Inter-Integrated Circuit (I2C) serial link on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.addrNACK

Message iomem.i2c.read.addrNACK

Severity WARNING

Description This message occurs when the caching module driver detects an address negative acknowledgment (NACK) error condition when reading data from an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.dataNACK

Message iomem.i2c.read.dataNACK

Severity WARNING

Description This message occurs when the caching module driver detects a data negative acknowledgment (NACK) error condition when reading data from an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.read.timeout

Message iomem.i2c.read.timeout

Severity WARNING

Description This message occurs when the caching module driver times out while trying to read data from an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.addrNACK

Message iomem.i2c.write.addrNACK

Severity WARNING

Description This message occurs when the caching module driver detects an address negative acknowledgment (NACK) error condition when writing data to an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.dataNACK

Message iomem.i2c.write.dataNACK

Severity WARNING

Description This message occurs when the caching module driver detects a data negative acknowledgment (NACK) error condition when writing data to an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.

iomem.i2c.write.timeout

Message iomem.i2c.write.timeout

Severity WARNING

Description This message occurs when the caching module driver times out while trying to write data to an Inter-Integrated Circuit (I2C) device on the caching module.

Corrective action Replace the caching module.
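The seven iomem.i2c.* messages share one severity and one corrective action and differ only in the operation and the failure condition encoded in the name. The following Python sketch shows one way a log-triage script might decode them. The helper name and the shape of its return value are hypothetical; only the message names, severity, and corrective action come from the entries in this guide.

```python
# Sketch only: the values below are copied from the iomem.i2c.* entries in
# this guide; describe_i2c_event() and its return shape are hypothetical.
CONDITIONS = {
    "addrNACK": "address negative acknowledgment (NACK)",
    "dataNACK": "data negative acknowledgment (NACK)",
    "timeout": "timeout",
}
SEVERITY = "WARNING"
ACTION = "Replace the caching module."


def describe_i2c_event(name):
    """Decode an iomem.i2c.* message name.

    Returns a dict, or None if the name is not one of the seven
    iomem.i2c.* messages listed in this guide.
    """
    parts = name.split(".")
    if parts[:2] != ["iomem", "i2c"]:
        return None
    tail = parts[2:]
    if tail == ["link", "down"]:
        operation, condition = None, "serial link failure"
    elif len(tail) == 2 and tail[0] in ("read", "write") and tail[1] in CONDITIONS:
        operation, condition = tail[0], CONDITIONS[tail[1]]
    else:
        return None
    return {
        "operation": operation,
        "condition": condition,
        "severity": SEVERITY,
        "action": ACTION,
    }
```

For example, describe_i2c_event("iomem.i2c.write.timeout") reports a write operation that timed out, with the WARNING severity and the replacement action.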

iomem.init.detect.fpga

Message iomem.init.detect.fpga

Severity INFO

Description This message occurs when the field-programmable gate array (FPGA) on a caching module is detected and initialized for use by the driver.

Corrective action None.

iomem.init.detect.pci

Message iomem.init.detect.pci

Severity INFO

Description This message occurs when a caching module is detected in a PCI slot and is being initialized for use by the system.

Corrective action None.

iomem.init.fail

Message iomem.init.fail

Severity NODE_ERROR

Description This message occurs when the caching module driver fails to initialize a caching module.

Corrective action Look for the specific failure log messages in the EMS log prior to this message; they identify the reason for the failure.

iomem.memory.flash.syndrome

Message iomem.memory.flash.syndrome

Severity DEBUG

Description This message occurs when the caching module driver detects a syndrome code associated with a flash memory access.

Corrective action None.

iomem.memory.none

Message iomem.memory.none

Severity NODE_ERROR

Description This message occurs when the caching module driver cannot detect any installed memory on a caching module.

Corrective action Replace the caching module.

iomem.memory.power.high

Message iomem.memory.power.high

Severity WARNING

Description This message occurs when the memory of the caching module has been configured to operate in high power mode.

Corrective action Memory high power mode should never be enabled for the caching module under normal operating conditions. The only way that this can occur is if it has been explicitly enabled via a private diagnostic interface. If this message is encountered under normal operating conditions, contact technical support.

iomem.memory.power.low

Message iomem.memory.power.low

Severity INFO

Description This message occurs when the memory DIMMs of the caching module have been configured to operate in low power mode.

Corrective action None.

iomem.memory.scrub.start

Message iomem.memory.scrub.start

Severity INFO

Description This message occurs when the background error correction code (ECC) memory scrubbing process on a caching module is starting.

Corrective action None.

iomem.memory.size

Message iomem.memory.size

Severity INFO

Description This message occurs when the caching module driver has determined the amount of memory installed on a caching module.

Corrective action None.

iomem.memory.zero.complete

Message iomem.memory.zero.complete

Severity INFO

Description This message occurs when the boot-time zeroing of the memory of a caching module is complete.

Corrective action None.

iomem.memory.zero.start

Message iomem.memory.zero.start

Severity INFO

Description This message occurs when the boot-time zeroing of the memory of a caching module is starting.

Corrective action None.

iomem.nor.op.failed

Message iomem.nor.op.failed

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that an operation to a NOR flash memory has failed.

Corrective action None.

iomem.pci.error.config.bar

Message iomem.pci.error.config.bar

Severity NODE_ERROR

Description This message occurs when the caching module driver detects a misconfigured Base Address Register (BAR) on the caching hardware.

Corrective action Boot into diagnostics and use the applicable menu option to reprogram the primary field-programmable gate array (FPGA) image on the caching module. If the problem persists, replace the caching module.

iomem.pio.op.failed

Message iomem.pio.op.failed

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that a programmed I/O (PIO) NAND flash access failed.

Corrective action None.

iomem.remap.block

Message iomem.remap.block

Severity INFO

Description This message occurs when a bad erase block is being remapped to a spare block.

Corrective action None.

iomem.remap.target.bad

Message iomem.remap.target.bad

Severity INFO

Description This message occurs when the target of a remap is found to be bad.

Corrective action None.

iomem.temp.report

Message iomem.temp.report

Severity INFO

Description This message occurs periodically to report the operating temperature of the field-programmable gate array (FPGA) on the caching module.

Corrective action None.

iomem.train.complete

Message iomem.train.complete

Severity INFO

Description This message occurs when the caching module driver has successfully trained one of the memory controllers for a memory DIMM bank to report the calibrated idelay setting.

Corrective action None.

iomem.train.fail

Message iomem.train.fail

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that the card memory controllers have failed to train for the installed DIMMs.

Corrective action Replace the caching module.

iomem.train.notReady

Message iomem.train.notReady

Severity NODE_ERROR

Description This message occurs when the caching module driver detects that a caching module memory controller has failed to become ready for operation after calibration.

Corrective action Replace the caching module.

iomem.train.start

Message iomem.train.start

Severity INFO

Description This message occurs when the caching module driver initiates training of the memory controllers on the acceleration card to calibrate them to the installed memory modules.

Corrective action None.

iomem.vmargin.high

Message iomem.vmargin.high

Severity WARNING

Description This message occurs when the acceleration card driver has been configured to margin a voltage level high for testing purposes.

Corrective action None.

iomem.vmargin.low

Message iomem.vmargin.low

Severity WARNING

Description This message occurs when the caching module driver has been configured to margin a voltage level low for testing purposes.

Corrective action None.

iomem.vmargin.nominal

Message iomem.vmargin.nominal

Severity INFO

Description This message occurs when voltage margining has been returned to nominal level on the caching module.

Corrective action None.

monitor.extCache.failed

Message monitor.extCache.failed

Severity LOG_WARNING

Description This message occurs if the monitor detects the Write Anywhere File Layout (WAFL) external cache subsystem (FlexScale) has failed and is no longer available for use.

Corrective action Consult the system logs to determine the original cause of the error.

monitor.flexscale.noLicense

Message monitor.flexscale.noLicense

Severity INFO

Description This message occurs if the monitor detects that the caching module is present but the FlexScale product is not licensed. FlexScale requires a license for use.

Corrective action Obtain a license for the FlexScale product, or remove the caching module.

USB boot device EMS messages

The universal serial bus boot device on 32xx and 62xx systems can generate informational, warning, and error messages. All messages are reported through the EMS.

usb.adapter.debug

Message usb.adapter.debug

Severity INFORMATION

Description This message indicates a Data ONTAP universal serial bus (USB) adapter driver debug event.

Corrective action None.

usb.adapter.exception

Message usb.adapter.exception

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver encounters an error with the adapter. The adapter is reset to recover.

Corrective action None.

usb.adapter.failed

Message usb.adapter.failed

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver cannot recover the adapter after resetting it multiple times. The adapter and the devices attached to it will not be used anymore.

Corrective action Take the following actions:

1. If the adapter is in use, verify that all attached devices are supported devices and that they are seated correctly.

2. If the problem persists, replace the attached devices.

3. If the problem still persists, contact technical support for help in diagnosing a USB issue.

usb.adapter.reset

Message usb.adapter.reset

Severity INFORMATION

Description This message occurs when the Data ONTAP universal serial bus (USB) driver resets the specified adapter. This can occur during normal error handling.

Corrective action If the problem persists, then contact technical support.

usb.device.failed

Message usb.device.failed

Severity ERROR

Description This message occurs when multiple consecutive commands to the specified universal serial bus (USB) device are not completed within the allotted time. All recovery actions have been taken and the device cannot be used anymore.

Corrective action Take the following actions:

1. Ensure that all attached devices are supported devices and that they are seated correctly.

2. If the problem persists, replace the attached devices.

3. If the problem still persists, contact technical support for help in diagnosing a USB issue.

usb.device.initialize.failed

Message usb.device.initialize.failed

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver fails to initialize the device attached to the associated port in the associated adapter for one of the following reasons: cannot set a unique address for the device; device descriptor is invalid or contains incorrect data; cannot set an active configuration for the device; or the device has multiple interfaces. Note that the Data ONTAP USB driver only supports USB 2.0 bulk-only mass storage devices.

Corrective action Take one of the following actions:

1. If the device is connected to an external USB port, try reinserting the device.

2. If that fails, try replacing the device with a device from a different product family.

3. If the device is connected to the motherboard and the problem persists, contact technical support for help in diagnosing a USB issue.

usb.device.maximum.connected

Message usb.device.maximum.connected

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects a new USB device inserted into the associated port in the associated adapter. This new device cannot be initialized because the maximum number of USB devices supported by the Data ONTAP USB adapter driver is already connected to the system.

Corrective action Take the following actions:

1. Remove a USB device that is already connected but is not being used.

2. Wait for 10 seconds, then reinsert the new device.

usb.device.protocol.mismatch

Message usb.device.protocol.mismatch

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects a protocol mismatch in the device attached to the associated port in the associated adapter. It can be due to one of the following reasons:

• Unsupported interface.
• Unsupported device class or device subclass.
• Does not support the required pipes.
• Does not support required end points.
• Does not support the required maximum transfer packet size.

Note that the Data ONTAP USB driver only supports USB 2.0 bulk-only mass storage devices.

Corrective action Take one of the following actions:

• If the device is connected to an external USB port, try replacing the device with a device from a different product family.

• If the device is connected to the motherboard, contact technical support for help in diagnosing a USB issue.

usb.device.removed

Message usb.device.removed

Severity INFORMATION

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver successfully detects and handles the removal of the associated device, and the device is no longer accessible.

Corrective action None.

usb.device.timeout

Message usb.device.timeout

Severity ERROR

Description This message occurs when an outstanding command to the specified universal serial bus (USB) device is not completed within the allotted time. As part of the standard error handling sequence managed by the Data ONTAP USB adapter driver, this command to the device is aborted and reissued.

Corrective action Device level timeouts are a common indication of a USB link stability problem. In some cases, the link is operating normally and the specified device is having internal trouble processing I/O requests in a timely manner. In such cases, evaluate the specified device for possible replacement. Quite often the problem results from the partial failure of a component involved in the USB transport. The most common thing to check is the seating of the USB device into the USB port or the header.

Take one of the following actions:

• If the device is connected to an external USB port, try replacing the device with a device from a different product family.

• If the device is connected to the motherboard, contact technical support for help in diagnosing the USB issue.

usb.device.unsupported

Message usb.device.unsupported

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects an unsupported device attached to the default boot device port on the motherboard.

Corrective action Contact technical support for a replacement USB boot device.

usb.device.unsupported.speed

Message usb.device.unsupported.speed

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects a non high-speed device in the associated port.

Corrective action Remove all non high-speed devices attached to the system because the Data ONTAP USB adapter driver does not support non high-speed devices.

usb.external.device.not.used

Message usb.external.device.not.used

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects a USB device connected to the external port.

Corrective action Remove the external USB device connected to the system.

usb.externalHub.notSupported

Message usb.externalHub.notSupported

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects a USB hub device.

Corrective action Remove all hub devices attached to the system because the USB adapter driver does not support USB hub devices.

usb.port.error

Message usb.port.error

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects an unrecoverable error on the associated port.

Corrective action Take the following actions:

1. If a device is attached to the associated port, try reinserting the device.

2. If the problem persists, try replacing the device.

3. If the problem still persists, contact technical support for assistance in diagnosing a USB issue.

usb.port.reset

Message usb.port.reset

Severity INFORMATION

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver resets the specified port on the associated adapter. This can occur during normal error handling.

Corrective action If the problem persists, contact technical support.

usb.port.state.indeterminate

Message usb.port.state.indeterminate

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver cannot determine the status of the associated port.

Corrective action Take the following actions:

1. If a device is attached to the associated port, try reinserting the device.

2. If the problem persists, try replacing the device.

3. If the problem still persists, contact technical support for assistance in diagnosing a USB issue.

usb.port.status.inconsistent

Message usb.port.status.inconsistent

Severity ERROR

Description This message occurs when the Data ONTAP universal serial bus (USB) adapter driver detects an inconsistent state of the associated port and cannot communicate with the attached device.

Corrective action If a device is attached to the associated port, try reinserting the device. If that fails, try replacing the device. If the problem persists, contact technical support for assistance in diagnosing a USB issue.

usbmon.boot.device.failed

Message usbmon.boot.device.failed

Severity ERROR

Description This message occurs when the Data ONTAP module that is responsible for monitoring the health of the universal serial bus (USB) boot devices determines that the associated boot device will fail all writes to the media.

Corrective action Take the following actions:

1. Replace the device.

2. If the problem persists, contact technical support for help in diagnosing the USB issue.

usbmon.boot.device.pfa

Message usbmon.boot.device.pfa

Severity WARNING

Description This message occurs when the Data ONTAP universal serial bus (USB) boot device health monitor PFA (predictive failure analysis) determines that failure is forthcoming for the associated boot device.

Corrective action Take the following actions:

1. Replace the device.

2. If the problem persists, contact technical support for help in diagnosing the USB issue.

usbmon.disable.module

Message usbmon.disable.module

Severity INFORMATION

Description This message occurs when the Data ONTAP module that is responsible for monitoring the health of the universal serial bus (USB) boot devices is disabled.

Corrective action 1. Halt the system by entering the following command at the system prompt:

halt

2. After the system boots to the LOADER prompt, run the setenv disable-usbmon? false command at the LOADER prompt.

3. Continue to boot the system by entering the following command at the LOADER prompt:

boot_ontap

usbmon.unable.to.monitor

Message usbmon.unable.to.monitor

Severity WARNING

Description This message occurs when the Data ONTAP module that is responsible for monitoring the health of the universal serial bus (USB) boot devices cannot extract health information from the monitored device.

Corrective action Take the following actions:

1. Replace the device.

2. If the problem persists, contact technical support.

FCoE HBA EMS messages

FCoE messages appear if the CNA (Converged Network Adapter) MPI (Management Port Interface) driver detects an unexpected event or illegal condition or if the HBA fails to initialize.

ispcna.mpi.dump

Message ispcna.mpi.dump

Severity SVC_ERROR

Description This message occurs when an unexpected event or illegal condition is detected by the CNA (Converged Network Adapter) Management Port Interface (MPI) driver and the contents of the adapter's Static RAM and memory must be dumped. After the dump, the adapter is reset and the contents of the dump are stored in a file in the /etc/log/ql8mpi directory.

Corrective action None; the adapter was reset.

ispcna.mpi.dump.saved

Message ispcna.mpi.dump.saved

Severity SVC_ERROR

Description This message occurs when an unexpected event or illegal condition is detected by the CNA (Converged Network Adapter) Management Port Interface (MPI) driver and the contents of the adapter's Static RAM and memory are saved. The dump files are stored on the system's root volume in the /etc/log/ql8mpi directory, with the following file name format: mpi[adapter]_[date]_[time].bin

Corrective action Send the dump file to technical support for analysis.
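When several dump files have accumulated, it can help to sort them by adapter before sending them to technical support. The Python sketch below splits a file name according to the mpi[adapter]_[date]_[time].bin format. The function name and the sample file name in the usage note are hypothetical, and because this guide does not state how the date and time fields are encoded, the sketch keeps them as opaque strings and assumes only that no field contains an underscore.

```python
import re

# Pattern for the documented name format mpi[adapter]_[date]_[time].bin.
# Field encodings are not documented here, so each field is captured as text.
DUMP_NAME = re.compile(
    r"^mpi(?P<adapter>[^_]+)_(?P<date>[^_]+)_(?P<time>[^_]+)\.bin$"
)


def parse_dump_name(filename):
    """Return a dict with 'adapter', 'date', and 'time' strings,
    or None if the name does not follow the documented format."""
    match = DUMP_NAME.match(filename)
    return match.groupdict() if match else None
```

For instance, a made-up name such as mpi1a_20120601_101500.bin would yield the adapter field "1a"; a name that lacks the mpi prefix or the .bin suffix yields None.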

ispcna.mpi.initFailed

Message ispcna.mpi.initFailed

Severity NODE_ERROR

Description This message occurs when the CNA (Converged Network Adapter) fails to initialize.

Corrective action Take corrective actions based on the indicated reason for the failure.

Operational error messages

Operational error messages might appear on your system console or LCD when the system is operating, when it is halted, or when it is restarting because of system problems.

Disk hung during swap

Message Disk hung during swap

Description A disk error occurred as you were hot-swapping a disk.

Fatal? Yes.

Corrective action 1. Disconnect the disk from the power supply by opening the latch and pulling it halfway out.

2. Wait 15 seconds to allow all disks to spin down.

3. Reinstall the disk.

4. Restart the system by entering the following command:

boot

Disk n is broken

Message Disk n is broken

Description n is the RAID group disk number. The solution depends on whether you have a hot spare in the system.

Fatal? No.

Corrective action See the appropriate system administration guide for information about how to locate a disk based on the RAID group disk number and how to replace a faulty disk.

Dumping core

Message Dumping core

Description The system is dumping core after a system crash.

Fatal? Yes.

Corrective action Write down the system crash message on the system console and report the problem to technical support.

Error dumping core

Message Error dumping core

Description The system cannot dump core during a system crash and restarts without dumping core.

Fatal? Yes.

Corrective action Report the problem to technical support.

FC-AL LINK_FAILURE

Message FC-AL LINK_FAILURE

Description Fibre Channel arbitrated loop has link failures.

Fatal? No.

Corrective action Report the problem to technical support.

FC-AL RECOVERABLE ERRORS

Message FC-AL RECOVERABLE ERRORS

Description Fibre Channel arbitrated loop has been determined to be unreliable. The link errors are recoverable in the sense that the system is still up and running.

Fatal? No.

Corrective action Report the problem to technical support.

Panicking

Message Panicking

Description The system is crashing. If the system does not hang while crashing, the message Dumping core appears.

Fatal? Yes.

Corrective action Report the problem to technical support.

RMC Alert: Boot Error

Message RMC Alert: Boot Error

Description RMC card sent a DOWN APPLIANCE message. Causes might be a down system, a boot error, or an OFW POST error.

Fatal? Yes.

Corrective action Harness script filters them and creates a case.

Contact technical support.

RMC Alert: Down Appliance

Message RMC Alert: Down Appliance

Description RMC card sent a DOWN APPLIANCE message. Causes might be a down system, a boot error, or an OFW POST error.

Fatal? Yes.

Corrective action Harness script filters them and creates a case.

Contact technical support.

RMC Alert: OFW POST Error

Message RMC Alert: OFW POST Error

Description RMC card sent a DOWN APPLIANCE message. Causes might be a down system, a boot error, or an OFW POST error.


Fatal? Yes.

Corrective action Harness script filters them and creates a case.

Contact technical support.


RLM messages

The RLM provides remote management capabilities for some storage systems and continuously monitors system health. Two types of messages are associated with the RLM and can help you monitor your system and troubleshoot problems.

The following systems contain RLMs:

• 30xx and SA300 systems
• 31xx systems
• 60xx and SA600 systems

The RLM sends AutoSupport messages when certain problems occur with the system. These might include a reboot failure or a user-triggered power cycle.

Data ONTAP generates EMS messages when RLM events and errors occur. These might include a firmware update failure or a communication error.

Note: For more information about what the RLM does, see the System Administration Guide for the version of Data ONTAP that your system is running.

When and how RLM AutoSupport e-mail messages are sent

The RLM generates AutoSupport e-mail messages when the system goes down or when certain problems occur.

The RLM sends AutoSupport e-mail messages under the following conditions:

• The system reboots unexpectedly
• The system stops communicating with the RLM
• A watchdog reset occurs
• The system is power-cycled
• Firmware POST errors occur
• A user-initiated AutoSupport message occurs

The subject line of e-mail messages contains the words "System Notification" and includes the host name of the system and the message type. The following text shows an example of an RLM AutoSupport e-mail subject line:

System Notification from system (RLM HBTSTOPPED)CRITICAL
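As an illustration only, the subject line could be split into its parts with a short script such as the following Python sketch. The field layout (host name, message type in parentheses, then log level) is inferred from the single example above and is an assumption, as are the names parse_rlm_subject and RlmSubject; actual subject lines might differ.

```python
import re
from typing import NamedTuple, Optional


class RlmSubject(NamedTuple):
    """Parts of an RLM AutoSupport subject line (hypothetical structure)."""
    host: str
    message_type: str
    level: str


# Assumed layout, inferred from the one example in this guide:
#   System Notification from <host> (<message type>)<level>
_SUBJECT_RE = re.compile(
    r"System Notification from (?P<host>\S+)\s*"
    r"\((?P<message_type>[^)]+)\)\s*(?P<level>[A-Za-z]+)\s*$"
)


def parse_rlm_subject(subject: str) -> Optional[RlmSubject]:
    """Return the parsed subject, or None if the line does not match."""
    match = _SUBJECT_RE.search(subject)
    if match is None:
        return None
    return RlmSubject(
        host=match["host"],
        message_type=match["message_type"].strip(),
        level=match["level"].upper(),
    )


if __name__ == "__main__":
    print(parse_rlm_subject(
        "System Notification from system (RLM HBTSTOPPED)CRITICAL"
    ))
```

A mail filter could use the returned level field to route critical notifications separately; subject lines that do not match return None rather than raising an error.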

Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.

Note: The RLM must be properly configured to send AutoSupport messages. For information about configuring the RLM, see the System Administration Guide and the Software Setup Guide for the version of Data ONTAP that your system is running.


What RLM AutoSupport e-mail messages include

RLM AutoSupport e-mail messages have different sections that contain different kinds of information about your system.

RLM e-mail messages include the following sections and information:

• Subject line: a system notification from the RLM of the system, stating the system condition or event that caused the AutoSupport message and the log level.

• Message body: the RLM configuration and version information, the system ID, serial number, model number, and host name.

• Attachments: SELs, the system sensor state as determined by the RLM, and console logs.

Note: For more information about the contents of AutoSupport messages, see the System Administration Guide for the version of Data ONTAP running on your system.

When and how RLM EMS messages are sent

Data ONTAP generates EMS messages when problems occur with the RLM and displays them on the system console.

Problems that trigger EMS messages might include failed network configuration, failed RLM heartbeat, or firmware update errors.

The console message includes the name of the EMS message and a brief description of the event or problem. The following text contains an example of an RLM EMS message:

[rlm.orftp.failed:warning]: RLM communication error, unsupported send request
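The structure of this console message (EMS message name, severity, and description) can be picked apart with a small script. The following Python sketch is illustrative only: the bracketed name:severity layout is taken from the example above, while the function name parse_ems_line and the tolerance for leading text (such as a timestamp) before the bracket are assumptions.

```python
import re
from typing import NamedTuple, Optional


class EmsMessage(NamedTuple):
    """Fields of an EMS console message (hypothetical structure)."""
    name: str
    severity: str
    text: str


# Layout taken from the example in this guide:
#   [<ems.message.name>:<severity>]: <description>
_EMS_RE = re.compile(
    r"\[(?P<name>[A-Za-z0-9_.]+):(?P<severity>[A-Za-z_]+)\]:\s*(?P<text>.*)"
)


def parse_ems_line(line: str) -> Optional[EmsMessage]:
    """Return the parsed EMS message, or None if no EMS tag is found."""
    # search() rather than match(): assumes other text may precede the tag.
    match = _EMS_RE.search(line)
    if match is None:
        return None
    return EmsMessage(match["name"], match["severity"].lower(), match["text"].strip())


if __name__ == "__main__":
    sample = ("[rlm.orftp.failed:warning]: RLM communication error, "
              "unsupported send request")
    print(parse_ems_line(sample))
```

The name field can then be looked up in the EMS message entries later in this chapter to find the corrective action.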

RLM-generated AutoSupport messages

The RLM continuously monitors the system's health and generates AutoSupport messages when the system goes down or when other problems, such as startup errors, occur.

Heartbeat loss warning

Message Heartbeat loss warning

Description The Remote LAN Module (RLM) detects that the system is offline, possibly because the system stopped serving data.

Corrective action If this system shutdown was manually triggered, no action is necessary. Otherwise, complete the following steps.


1. Check the status of your system and verify that the system and disk shelves are operational.

2. Contact technical support if the problem persists.

Reboot (power loss) critical

Message Reboot (power loss) critical

Description The Remote LAN Module (RLM) detects that the system lost AC power.

Corrective action If you switched off the system before you received the notification, no action is necessary. Otherwise, restore power to the system.

Reboot warning

Message Reboot warning

Description The Remote LAN Module (RLM) detects an abnormal system reboot.

Corrective action If this was a manually triggered or expected reboot, no action is necessary. Otherwise, complete the following steps.

1. Check the status of the system and determine the cause of the reboot.

2. Contact technical support if the system fails to reboot.

Reboot (watchdog reset) warning

Message Reboot (watchdog reset) warning

Description The Remote LAN Module (RLM) detects a watchdog reset error.

Corrective action 1. Check the system to verify that it is operational.

2. If your system is operational, run diagnostics on your entire system.

3. Contact technical support if the storage system is not serving data.

RLM heartbeat loss

Message RLM heartbeat loss

Description The Remote LAN Module (RLM) detects the loss of heartbeat from Data ONTAP. The system possibly stopped serving data.

Corrective action 1. Connect to the RLM command-line interface (CLI) to check whether the RLM is operational.


2. Contact technical support if the problem persists.

RLM heartbeat stopped

Message RLM heartbeat stopped

Description The system software cannot see the RLM.

Corrective action 1. Connect to the RLM command-line interface (CLI) to check whether the RLM is operational.

2. Contact technical support if the problem persists.

System boot failed (POST failed)

Message System boot failed (POST failed)

Description The Remote LAN Module (RLM) detects that a system error occurred during the POST and the system software cannot be booted.

Corrective action 1. Run diagnostics on your system.

2. Contact technical support if running diagnostics does not detect any faulty components.

User triggered (RLM test)

Message User triggered (RLM test)

Description The Remote LAN Module (RLM) received the rlm test command, which tests the RLM configuration.

Corrective action No action is necessary.

User_triggered (system nmi)

Message User_triggered (system nmi)

Description A user is initiating a system core dump (nmi) through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system power cycle)

Message User_triggered (system power cycle)


Description A user is initiating a system power-cycle through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system power off)

Message User_triggered (system power off)

Description A user is powering off the system through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system power on)

Message User_triggered (system power on)

Description A user is powering on the system through the Remote LAN Module (RLM).

Corrective action No action is necessary.

User_triggered (system reset)

Message User_triggered (system reset)

Description A user is resetting the system through the Remote LAN Module (RLM).

Corrective action No action is necessary.

EMS messages about the RLM

Data ONTAP generates EMS messages when problems occur with the RLM. These problems might include failed network configuration or firmware update errors.

rlm.driver.hourly.stats

Message rlm.driver.hourly.stats

Severity Warning

Description The system encountered an error while trying to get hourly statistics from the Remote LAN Module (RLM).

Corrective action 1. Check whether the RLM is online by entering the following command at the Data ONTAP prompt:

rlm status


2. If the RLM is operational and the problem persists, enter the following command to reboot the RLM:

rlm reboot

rlm.driver.mailhost

Message rlm.driver.mailhost

Severity Warning

Description This message occurs when Remote LAN Module (RLM) setup verifies whether a mailhost specified in Data ONTAP can be reached. In this case, RLM setup cannot connect to the specified mailhost.

Corrective action 1. Verify that a valid mailhost is configured in Data ONTAP by checking the system AutoSupport configuration.

2. Ensure that Data ONTAP can successfully connect to the specified mailhost by entering a test AutoSupport command.

rlm.driver.network.failure

Message rlm.driver.network.failure

Severity Warning

Description A failure occurred during the network configuration of the Remote LAN Module (RLM). The system could not assign the RLM a Dynamic Host Configuration Protocol (DHCP) or fixed IP address.

Corrective action

1. Check whether the RLM is online by entering the following command at the Data ONTAP prompt:

rlm status

2. If the RLM is operational and the problem persists, enter the following command to reboot the RLM:

rlm reboot

rlm.driver.timeout

Message rlm.driver.timeout

Severity Warning


Description A failure occurred during communication with the Remote LAN Module (RLM).

Corrective action 1. Check whether the RLM is online by entering the following command at the Data ONTAP prompt:

rlm status

2. If the RLM is operational and the problem persists, enter the following command to reboot the RLM:

rlm reboot

rlm.firmware.update.failed

Message rlm.firmware.update.failed

Severity SVC_ERROR

Description An error occurred during an update to the Remote LAN Module (RLM) firmware. The firmware update might have failed for the following reasons:

• An incorrect RLM firmware image or a corrupted image file
• A communication error while sending new firmware to the RLM
• An update failure while applying new firmware at the RLM
• A system reset or loss of power during an update

Corrective action

1. Download the firmware image by entering the following command:

software install http://pathto/RLM_FW.zip -f

2. Make sure that the RLM is still operational by entering the following command at the system prompt:

rlm status

3. Retry updating the RLM firmware. For more information, see the section on updating RLM firmware in the System Administration Guide.

4. If the failure persists, contact technical support.

rlm.firmware.upgrade.reqd

Message rlm.firmware.upgrade.reqd

Severity WARNING

Description The Remote LAN Module (RLM) firmware version and the version of Data ONTAP are incompatible and cannot communicate correctly about a particular capability.


Corrective action Update the firmware version of the RLM to the version recommended for your version of Data ONTAP.

For more information, see the section on upgrading RLM firmware in the System Administration Guide.

rlm.firmware.version.unsupported

Message rlm.firmware.version.unsupported

Severity WARNING

Description The firmware on the Remote LAN Module (RLM) is an unsupported version and must be upgraded.

Corrective action

Update the firmware version of the RLM to the version recommended for your version of Data ONTAP.

For more information, see the section on upgrading RLM firmware in the System Administration Guide.

rlm.heartbeat.bootFromBackup

Message rlm.heartbeat.bootFromBackup

Severity WARNING

Description The system rebooted the Remote LAN Module (RLM) from its backup firmware to restore RLM availability. The RLM is considered unavailable when the system stops receiving heartbeat notifications from the RLM. To restore availability, the system tries to reboot the RLM from the RLM's primary firmware. If that fails, the system tries to reboot the RLM from the RLM's backup firmware. This message is generated if the reboot from backup firmware restores availability.

Corrective action

Update the firmware version of the RLM to the version recommended for your version of Data ONTAP.

For more information, see the section on upgrading RLM firmware in the System Administration Guide.

rlm.heartbeat.resumed

Message rlm.heartbeat.resumed

Severity WARNING

Description The system detected the resumption of Remote LAN Module (RLM) heartbeat notifications, indicating that the RLM is now available. The earlier issue indicated by the rlm.heartbeat.stopped message was resolved.


Corrective action None needed.

rlm.heartbeat.stopped

Message rlm.heartbeat.stopped

Severity WARNING

Description The system did not receive an expected heartbeat message from the Remote LAN Module (RLM). The RLM and the system exchange heartbeat messages, which they use to detect when one or the other is unavailable.

Corrective action

1. Connect to the RLM CLI.

2. Collect debugging information by entering the following commands:

rlm version

rlm config

priv set advanced

rlm log debug

rlm log messages

3. Run the RLM diagnostics:

a. From the boot loader prompt, enter

boot_diags

b. When the diagnostics main menu appears, select agent.

c. To test the syst/agent/RLM interface, select tests 2 and 6.

4. See the section on troubleshooting RLM problems in the System Administration Guide.

5. If the problem persists, contact technical support.

rlm.network.link.down

Message rlm.network.link.down

Severity WARNING

Description The Remote LAN Module (RLM) detected a link error on the RLM network port. This can happen if a network cable is not plugged into the RLM network port. It can also happen if the network that the RLM is connected to cannot run at 10/100 Mbps.


Corrective action

1. Check whether the network cable is correctly plugged into the RLM network port.

2. Check the link status LED on the RLM.

3. Verify that the network that the RLM is connected to supports autonegotiation to 10/100 Mbps or is running at one of those speeds; otherwise, RLM network connectivity does not work.

rlm.notConfigured

Message rlm.notConfigured

Severity WARNING

Description This message occurs weekly to remind you to configure the Remote LAN Module (RLM). The RLM is a physical device that is incorporated into your system to provide remote access and remote management capabilities. To use the full functionality of RLM, you need to configure it first.

Corrective action

1. Use the rlm setup command to configure the RLM. If necessary, use the rlm status command to obtain its MAC address.

2. Use the rlm status command to verify the RLM network configuration.

3. Use the rlm test autosupport command to verify that the RLM can send AutoSupport e-mail. Note that AutoSupport mailhosts and recipients must be properly configured in Data ONTAP before issuing this command.

rlm.orftp.failed

Message rlm.orftp.failed

Severity WARNING

Description A communication error occurred while sending or receiving information from the Remote LAN Module (RLM).

Corrective action 1. Check whether the RLM is operational by entering the following command at the Data ONTAP prompt:

rlm status

2. If the RLM is operational and this error persists, enter the following command to reboot the RLM:

rlm reboot


3. If this message persists after you reboot the RLM, contact technical support.

rlm.snmp.traps.off

Message rlm.snmp.traps.off

Severity INFO

Description The advanced privilege level in Data ONTAP was used to disable the SNMP trap feature of the Remote LAN Module (RLM). This message occurs at boot.

This message also occurs when the SNMP trap capability was disabled and a user invokes a Data ONTAP command to use the RLM to send an SNMP trap.

Corrective action

To enable RLM SNMP trap support, set the rlm.snmp.traps option to On.

rlm.systemDown.alert

Message rlm.systemDown.alert

Severity ALERT

Description System remote management detected a system down event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM) firmware. The trap includes a string describing the specific event that triggered the trap. The string is structured in the following form with key=value pairs:

Remote Management Event: type={system_down|system_up|test|keep_alive}, severity={alert|warning|notice|normal|debug|info}, event={post_error|watchdog_reset|power_loss}

Corrective action

1. Check the system to verify that it has power and is operational.

2. If your system is operational, run diagnostics on your entire system.

3. Contact technical support if the system is not serving data.

rlm.systemDown.notice

Message rlm.systemDown.notice

Severity NOTICE

Description System remote management detected a system down event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM) firmware. The trap includes a string describing the specific event that triggered the trap. The string is structured in the following form with key=value pairs:


Remote Management Event: type={system_down|system_up|test|keep_alive}, severity={alert|warning|notice|normal|debug|info}, event={power_off_via_rlm|power_cycle_via_rlm|reset_via_rlm}

Corrective action

1. Check the system to verify that it has power and is operational.

2. If your system is operational, run diagnostics on your entire system.

3. Consult technical support if the system is not serving data.

rlm.systemDown.warning

Message rlm.systemDown.warning

Severity WARNING

Description System remote management detected a system down event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM) firmware. The trap includes a string describing the specific event that triggered the trap. The string is structured in the following form with key=value pairs:

Remote Management Event: type={system_down|system_up|test|keep_alive}, severity={alert|warning|notice|normal|debug|info}, event={loss_of_heartbeat}

Corrective action

1. Check the system to verify that it has power and is operational.

2. If your system is operational, run diagnostics on your entire system.

3. Consult technical support if the system is not serving data.

rlm.systemPeriodic.keepAlive

Message rlm.systemPeriodic.keepAlive

Severity INFO

Description System remote management sent a periodic keep-alive event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM) firmware. The trap includes a string describing the specific event that triggered the trap. The string is structured in the following form with key=value pairs:

Remote Management Event: type={system_down|system_up|test|keep_alive}, severity={alert|warning|notice|normal|debug|info}, event={periodic_message}

Corrective action

None needed.


rlm.systemTest.notice

Message rlm.systemTest.notice

Severity NOTICE

Description System remote management detected a test event.

This is only an SNMP trap that is sent out by the Remote LAN Module (RLM) firmware. The trap includes a string describing the specific event that triggered the trap. The string is structured in the following form with key=value pairs:

Remote Management Event: type={system_down|system_up|test|keep_alive}, severity={alert|warning|notice|normal|debug|info}, event={test}

Corrective action

None needed.
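The SNMP trap strings described in the preceding rlm.systemDown, rlm.systemPeriodic, and rlm.systemTest entries share one key=value form, so a trap receiver could parse them with a few lines of code. The following Python sketch is illustrative only. The allowed type and severity values are copied from the templates above; the function name parse_trap_string and the sample string (assembled from one combination of the listed alternatives) are assumptions, not output captured from an RLM.

```python
PREFIX = "Remote Management Event:"

# Values listed in the trap string templates in this guide.
ALLOWED_TYPES = {"system_down", "system_up", "test", "keep_alive"}
ALLOWED_SEVERITIES = {"alert", "warning", "notice", "normal", "debug", "info"}


def parse_trap_string(text: str) -> dict:
    """Split an RLM trap string into a dict of its key=value pairs.

    Raises ValueError if the prefix is missing, a pair has no '=',
    or the type or severity is not one of the documented values.
    """
    text = text.strip()
    if not text.startswith(PREFIX):
        raise ValueError("missing 'Remote Management Event:' prefix")
    fields = {}
    for part in text[len(PREFIX):].split(","):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {part.strip()!r}")
        fields[key.strip()] = value.strip()
    if fields.get("type") not in ALLOWED_TYPES:
        raise ValueError(f"unexpected type: {fields.get('type')!r}")
    if fields.get("severity") not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {fields.get('severity')!r}")
    return fields


if __name__ == "__main__":
    # Hypothetical sample built from the documented alternatives.
    sample = ("Remote Management Event: type=system_down, "
              "severity=alert, event=post_error")
    print(parse_trap_string(sample))
```

The event value is left unvalidated because each EMS entry lists a different set of events for that key.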

rlm.userlist.update.failed

Message rlm.userlist.update.failed

Severity WARNING

Description There was an error while updating user information for the Remote LAN Module (RLM). When user information is updated on Data ONTAP, the RLM is also updated with the new changes. This enables users to log in to the RLM.

Corrective action

1. Check whether the RLM is operational by entering the following command at the Data ONTAP prompt:

rlm status

2. If the RLM is operational and this error persists, reboot the RLM by entering the following command:

rlm reboot

3. Retry the operation that caused the error message.

4. If this message persists after you reboot the RLM, contact technical support.


BMC messages

The BMC provides remote platform management capabilities on FAS20xx and SA200 systems.

BMC capabilities include remote access, monitoring, troubleshooting, logging, and alerting features. The BMC sends AutoSupport messages through its independent management interface, regardless of the state of the system.

How and when BMC AutoSupport e-mail notifications are sent

BMC e-mail notifications are sent to configured recipients designated by the AutoSupport feature.

The e-mail notifications have the title "System Alert from BMC of filer serial number," followed by the message type. The serial number is that of the controller with which the BMC is associated.

Typical BMC-generated AutoSupport messages occur under the following conditions:

• The system reboots unexpectedly
• A system reboot fails
• A user-issued action triggers an AutoSupport message

What BMC e-mail notifications include

The different parts of BMC e-mail messages contain information about your system.

BMC e-mail notifications include the following information:

• Subject line: a system notification from the BMC of the system, listing the system condition or event that caused the AutoSupport message and the log level.

• Message body: the IP address, netmask, and other information about the system.

• Attachments: system configuration and sensor information.

BMC-generated AutoSupport messages

The BMC can generate a variety of messages telling you of problems or events occurring on your system.


BMC_ASUP_UNKNOWN

Message BMC_ASUP_UNKNOWN

Description Unknown Baseboard Management Controller (BMC) error.

Corrective action Report the problem to technical support.

REBOOT (abnormal)

Message REBOOT (abnormal)

Description An abnormal reboot occurred.

Corrective action Verify that the system has returned to operation.

REBOOT (power loss)

Message REBOOT (power loss)

Description A power failure was detected, and the system restarted. This occurs when the system is power-cycled by the external switches or in a true power loss.

Corrective action Verify that the system has returned to operation.

REBOOT (watchdog reset)

Message REBOOT (watchdog reset)

Description The system stopped responding and was rebooted by the Baseboard Management Controller (BMC). This occurs when the BMC watchdog is triggered.

Corrective action Verify that the system has returned to operation.

SYSTEM_BOOT_FAILED (POST failed)

Message SYSTEM_BOOT_FAILED (POST failed)

Description The system failed to pass the BIOS POST. This occurs when the BIOS status sensor is in a failed or hung state.

Corrective action

1. Issue a system reset backup command from the Baseboard Management Controller (BMC) console, and if the system can come up to the boot loader, issue the flash command to update the primary BIOS firmware.

2. If the system is still nonresponsive, contact technical support.


SYSTEM_POWER_OFF (environment)

Message SYSTEM_POWER_OFF (environment)

Description An environmental sensor entered a critical, nonrecoverable state, and Data ONTAP has been requested to power off the system.

Corrective action Verify the environmental conditions of the system.

USER_TRIGGERED (bmc test)

Message USER_TRIGGERED (bmc test)

Description A user triggered the Baseboard Management Controller (BMC) AutoSupport internal test through the BMC console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system nmi)

Message USER_TRIGGERED (system nmi)

Description A user requested a core dump through the BMC console, SMASH, or IPMI.

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power cycle)

Message USER_TRIGGERED (system power cycle)

Description A user issued a power-cycle command through the Baseboard Management Controller (BMC) console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power off)

Message USER_TRIGGERED (system power off)

Description A user issued a power off command through the Baseboard Management Controller (BMC) console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.


USER_TRIGGERED (system power on)

Message USER_TRIGGERED (system power on)

Description A user issued a power on command through the Baseboard Management Controller (BMC) console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system power soft-off)

Message USER_TRIGGERED (system power soft-off)

Description A user issued a power soft-off command through the Baseboard Management Controller (BMC) console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.

USER_TRIGGERED (system reset)

Message USER_TRIGGERED (system reset)

Description A user issued a reset command through the Baseboard Management Controller (BMC) console, Systems Management Architecture for Server Hardware (SMASH), or Intelligent Platform Management Interface (IPMI).

Corrective action Verify that the command was issued by an authorized user.
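The BMC message types listed above follow a common pattern: an uppercase category, optionally followed by a detail in parentheses. As an illustration only, the following Python sketch splits a message type into those two parts and flags the USER_TRIGGERED category, for which every entry above recommends verifying that the command was issued by an authorized user. The function names split_bmc_message and needs_authorization_check are hypothetical and are not part of any NetApp tool.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the message types in this guide:
#   CATEGORY            for example, BMC_ASUP_UNKNOWN
#   CATEGORY (detail)   for example, REBOOT (watchdog reset)
_BMC_TYPE_RE = re.compile(r"(?P<category>[A-Z_]+)(?:\s*\((?P<detail>[^)]*)\))?")


def split_bmc_message(message_type: str) -> Tuple[str, Optional[str]]:
    """Return (category, detail); detail is None when there are no parentheses."""
    match = _BMC_TYPE_RE.fullmatch(message_type.strip())
    if match is None:
        raise ValueError(f"unrecognized BMC message type: {message_type!r}")
    return match["category"], match["detail"]


def needs_authorization_check(message_type: str) -> bool:
    """True for USER_TRIGGERED messages, whose corrective action is to
    verify that an authorized user issued the command."""
    category, _ = split_bmc_message(message_type)
    return category == "USER_TRIGGERED"


if __name__ == "__main__":
    for sample in ("USER_TRIGGERED (system power cycle)",
                   "REBOOT (watchdog reset)",
                   "BMC_ASUP_UNKNOWN"):
        print(sample, split_bmc_message(sample), needs_authorization_check(sample))
```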

EMS messages about the BMC

The EMS might send messages to your system console about the BMC.

bmc.asup.crit

Message bmc.asup.crit

Description This message occurs when the Baseboard Management Controller (BMC) sends an AutoSupport message of a CRITICAL priority.

Corrective action

The action you take depends on whether the operating environment for the system, storage, or associated cabling has changed.

• If the operating environment has changed, shut down and power off the system until the environment is restored to normal operations.

• If the operating environment has not changed, check for previous errors and warnings. Also check for hardware statistics from Fibre Channel, SCSI, disk drives, other communications mechanisms, and previous administrative activities.

bmc.asup.error

Message bmc.asup.error

Description This message occurs when the Baseboard Management Controller (BMC) fails to construct the necessary attachments of an AutoSupport message.

Corrective action This message indicates an internal error with the BMC's AutoSupport processing. Contact technical support.

bmc.asup.init

Message bmc.asup.init

Description This message occurs when the Baseboard Management Controller (BMC) fails to initialize its AutoSupport subsystem due to a lack of resources.

Corrective action This message indicates an internal error with the BMC's AutoSupport processing. Contact technical support.

bmc.asup.queue

Message bmc.asup.queue

Description This message occurs when the Baseboard Management Controller (BMC) has too many outstanding AutoSupport messages and no longer has enough resources to service them.

Corrective action

This message might indicate an issue with your AutoSupport configuration.

1. Ensure that your system is configured to use the correct AutoSupport SMTP mail host, and that the mail host is properly configured to handle AutoSupport messages originating from the BMC.

2. For additional help, contact technical support.

bmc.asup.send

Message bmc.asup.send

Description This message occurs when the Baseboard Management Controller (BMC) sends an AutoSupport message.

Corrective action 1. Follow the corrective action recommended for the AutoSupport message that was sent.


2. For additional help, contact technical support.

bmc.asup.smtp

Message bmc.asup.smtp

Description This message occurs when the Baseboard Management Controller (BMC) fails to contact the mailhost when attempting to send an AutoSupport message.

Corrective action

This message indicates an issue with your AutoSupport configuration.

1. Ensure that your system is configured to use the correct AutoSupport SMTP mail host and that the mail host is properly configured to handle AutoSupport messages originating from the BMC.

2. For additional help, contact technical support.

bmc.batt.id

Message bmc.batt.id

Description This message occurs when the Baseboard Management Controller (BMC) cannot read the part number information stored in the battery configuration firmware.

Corrective action Contact technical support for the current procedure to determine whether the battery failed.

bmc.batt.invalid

Message bmc.batt.invalid

Description This message occurs when the Baseboard Management Controller (BMC) determines that the battery installed is not the correct model for your system.

Corrective action Contact technical support to request the appropriate replacement battery for your model of system.

bmc.batt.mfg

Message bmc.batt.mfg

Description This message occurs when the Baseboard Management Controller (BMC) cannot read the manufacturer information stored in the battery configuration firmware.

Corrective action Contact technical support for the current procedure to determine whether the battery failed.


bmc.batt.rev

Message bmc.batt.rev

Description This message occurs when the Baseboard Management Controller (BMC) cannot read the revision code stored in the battery configuration firmware.

Corrective action Contact technical support for the current procedure to determine whether the battery failed.

bmc.batt.seal

Message bmc.batt.seal

Description This message occurs when the Baseboard Management Controller (BMC) cannot seal the battery's configuration information after a battery upgrade.

Corrective action Contact technical support for the current procedure to determine whether the battery failed.

bmc.batt.unknown

Message bmc.batt.unknown

Description This message occurs when the Baseboard Management Controller (BMC) determines that the installed battery is not a recognized part that is approved for use in your system.

Corrective action Contact technical support to request the appropriate replacement battery for your model of system.

bmc.batt.unseal

Message bmc.batt.unseal

Description This message occurs when the Baseboard Management Controller (BMC) cannot unseal the battery's configuration information to determine whether the battery firmware requires an upgrade.

Corrective action Contact technical support for the current procedure to determine whether the battery failed.

bmc.batt.upgrade

Message bmc.batt.upgrade

Description This message occurs before an upgrade of the battery's configuration firmware. The Baseboard Management Controller (BMC) generates it to show the present and new revisions of the battery configuration.

Corrective action None.

bmc.batt.upgrade.busy

Message bmc.batt.upgrade.busy

Description This message occurs when the Baseboard Management Controller (BMC) determines that the battery configuration firmware requires an upgrade, but that the BMC is too busy to perform the upgrade.

Corrective action It is normal to get this message one time after a BMC upgrade. However, if this message is issued more than once, it indicates a problem with your system. Contact technical support for the current procedure to determine whether your system needs to be replaced.

bmc.batt.upgrade.failed

Message bmc.batt.upgrade.failed

Description This message occurs when the Baseboard Management Controller (BMC) cannot upgrade the battery configuration firmware to the latest revision.

Corrective action In most cases, this error does not impact the functionality of your system, but replacing the battery might be advised at your next maintenance window. Contact technical support for the current procedure to determine whether the battery needs to be replaced.

bmc.batt.upgrade.failure

Message bmc.batt.upgrade.failure

Description This message occurs during a battery upgrade. The Baseboard Management Controller (BMC) generates it once for every configuration item in the battery configuration firmware that could not be updated.

Corrective action 1. Remove and reinsert the controller module. In most cases, this forces the BMC to reattempt and successfully upgrade the battery.

2. If you see this message more than once, contact technical support for the current procedure to determine whether the battery needs to be replaced.
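The "more than once" condition in the bmc.batt.upgrade.busy and bmc.batt.upgrade.failure entries above could be checked against a list of logged message names. The following Python is only a minimal sketch: the function name and the sample log are hypothetical, and it assumes you have already extracted the message names from your own logs.

```python
from collections import Counter

def messages_seen_more_than_once(event_names):
    """Return the message names that appear more than once, sorted alphabetically."""
    counts = Counter(event_names)
    return sorted(name for name, count in counts.items() if count > 1)

# Hypothetical sample: message names pulled from a system log.
log = [
    "bmc.batt.upgrade.busy",
    "bmc.batt.upgrade.failure",
    "bmc.batt.upgrade.failure",
]
print(messages_seen_more_than_once(log))
```

A single bmc.batt.upgrade.busy after a BMC upgrade is not reported, which matches the guidance that one occurrence is normal.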

bmc.batt.upgrade.ok

Message bmc.batt.upgrade.ok

Description This message occurs when the entire battery upgrade process is complete.

Corrective action None.

bmc.batt.upgrade.power-off

Message bmc.batt.upgrade.power-off

Description This message occurs in the rare event where the Baseboard Management Controller (BMC) cannot turn on system power, and the battery has not been checked to determine whether it requires a configuration upgrade.

Corrective action 1. Remove and reinsert the controller module.

2. If you continue to see this message, contact technical support for the current procedure to determine whether the controller module needs to be replaced.

bmc.batt.upgrade.voltagelow

Message bmc.batt.upgrade.voltagelow

Description This message occurs when the Baseboard Management Controller (BMC) finds that the battery is discharged to below 6.0 V while the battery requires a configuration firmware update.

Corrective action This message is printed every 10 minutes until the battery is recharged. If you continue to see this message after one hour, contact technical support for the current procedure to determine whether the battery needs to be replaced.
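The one-hour rule above can be expressed as a small check over the times at which the message was logged. This is a sketch under stated assumptions, not a NetApp tool: the function name and sample timestamps are hypothetical, all timestamps are assumed to belong to the same low-voltage episode, and "after one hour" is interpreted as a span of 60 minutes or more between the first and latest message.

```python
from datetime import datetime, timedelta

# From the corrective action above: escalate if the message continues for one hour.
ESCALATION_WINDOW = timedelta(hours=1)

def should_contact_support(timestamps):
    """Return True if the message is still appearing an hour after it first appeared.

    Assumes every timestamp belongs to the same low-voltage episode.
    """
    if not timestamps:
        return False
    return max(timestamps) - min(timestamps) >= ESCALATION_WINDOW

# Hypothetical log times: one message every 10 minutes, seven in total (0 to 60 minutes).
first = datetime(2012, 6, 1, 12, 0)
seen = [first + timedelta(minutes=10 * i) for i in range(7)]
print(should_contact_support(seen))
```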

bmc.batt.voltage

Message bmc.batt.voltage

Description This message occurs in the rare event where the Baseboard Management Controller (BMC) determines that the battery configuration firmware requires an update and the battery is successfully prepared for the update, but the BMC cannot read the battery voltage sensor.

Corrective action Contact technical support for the current procedure to determine whether the battery needs to be replaced.

bmc.config.asup.off

Message bmc.config.asup.off

Description This message occurs in the rare event that the Baseboard Management Controller (BMC) detects corruption in the BMC's internal cached copy of the AutoSupport mail host and/or configured destinations. AutoSupport messages from the BMC are disabled until the system boots.

Corrective action Boot the system to ensure that the BMC's cache of the AutoSupport configuration is correct.

bmc.config.corrupted

Message bmc.config.corrupted

Description This message occurs in the rare event that the Baseboard Management Controller (BMC) internal configuration is corrupted and is being reset to defaults. Notably, the SSH service on the BMC LAN interface is disabled until the system boots.

Corrective action 1. Boot the system. Upon boot, the Secure Shell (SSH) host keys for the BMC are regenerated. The previous host keys for the BMC are no longer valid and cannot be used for logins.

2. Contact technical support to determine whether your system needs maintenance.

bmc.config.default

Message bmc.config.default

Description This message occurs in the rare event that the Baseboard Management Controller (BMC) internal configuration is corrupted and is being reset to defaults. Notably, the Secure Shell (SSH) service on the BMC LAN interface is disabled until the system boots.

Corrective action 1. Boot the system. Upon boot, the SSH host keys for the BMC are regenerated. The previous host keys for the BMC are no longer valid and cannot be used for logins.

2. Contact technical support to determine whether your system needs maintenance.

bmc.config.default.pef.filter

Message bmc.config.default.pef.filter

Description This message occurs in the rare event that the Baseboard Management Controller (BMC) internal configuration is corrupted and is being reset to defaults. Notably, the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.

Corrective action Most users need to take no action. However, if you want to use custom Intelligent Platform Management Interface (IPMI) PEF tables, you need to reenable the BMC IPMI LAN interface, and reload any custom PEF tables that might be defined for your site.

bmc.config.default.pef.policy

Message bmc.config.default.pef.policy

Description This message occurs in the rare event that the Baseboard Management Controller (BMC) internal configuration is corrupted and is being reset to defaults. Notably, the BMC's Platform Event Filter (PEF) tables are being cleared to factory defaults.

Corrective action Most users need to take no action. However, if you want to use custom IPMI PEF tables, you need to reenable the BMC Intelligent Platform Management Interface (IPMI) LAN interface, and reload any custom PEF tables that might be defined for your site.

bmc.config.fru.systemserial

Message bmc.config.fru.systemserial

Description This message occurs when the Baseboard Management Controller (BMC) detects an invalid System Serial Number field in the system's field-replaceable unit (FRU) configuration area.

Corrective action Contact technical support to determine the maintenance procedure for your system.

bmc.config.mac.error

Message bmc.config.mac.error

Description This message occurs when the Baseboard Management Controller (BMC) Ethernet Media Access Control (MAC) identifier is invalid.

Corrective action Contact technical support to determine the corrective procedure for your system.

bmc.config.net.error

Message bmc.config.net.error

Description This message occurs when the Baseboard Management Controller (BMC) cannot start networking support on the BMC LAN interface.

Corrective action Contact technical support to determine the corrective procedure for your system.

bmc.config.upgrade

Message bmc.config.upgrade

Description This message occurs when the Baseboard Management Controller (BMC) internal configuration defaults are updated.

Corrective action None.

bmc.power.on.auto

Message bmc.power.on.auto

Description This message occurs when, upon power up, the Baseboard Management Controller (BMC) detects that the system was previously soft powered-off.

Corrective action None.

bmc.reset.ext

Message bmc.reset.ext

Description This message occurs when the Baseboard Management Controller (BMC) detects that a bmc reboot command was issued on the system previously.

Corrective action None.

bmc.reset.int

Message bmc.reset.int

Description This message occurs when the Baseboard Management Controller (BMC) was reset through the BMC command sequence ngs smash; set reboot=1; priv set diag.

Corrective action None.

bmc.reset.power

Message bmc.reset.power

Description This message occurs when the Baseboard Management Controller (BMC) detects a system power up, or after the BMC is upgraded.

Corrective action None.

bmc.reset.repair

Message bmc.reset.repair

Description This message occurs when the Baseboard Management Controller (BMC) detects and corrects an internal BMC error.

Corrective action If you receive this message frequently, contact technical support to determine the corrective procedure for your system.

bmc.reset.unknown

Message bmc.reset.unknown

Description This message occurs when the Baseboard Management Controller (BMC) cannot determine why it was reset.

Corrective action This message usually indicates a BMC internal error. Contact technical support to determine the corrective procedure for your system.

bmc.sensor.batt.charger.off

Message bmc.sensor.batt.charger.off

Description This message occurs when the Baseboard Management Controller (BMC) detects that the battery charger cannot be disabled for the hourly battery load test.

Corrective action Contact technical support to determine the corrective procedure for your system.

bmc.sensor.batt.charger.on

Message bmc.sensor.batt.charger.on

Description This message occurs when the Baseboard Management Controller (BMC) cannot reenable the battery charger after the hourly battery load test.

Corrective action Contact technical support to determine the corrective procedure for your system.

bmc.sensor.batt.time.run.invalid

Message bmc.sensor.batt.time.run.invalid

Description This message occurs when the Baseboard Management Controller (BMC) detects that the battery's calculated run time differs substantially from the battery's run-time sensor.

Corrective action None.

bmc.ssh.key.missing

Message bmc.ssh.key.missing

Description This message occurs when the Baseboard Management Controller (BMC) detects that the Secure Shell (SSH) host keys for the BMC are corrupted or missing.

Corrective action Reboot the system. The boot sequence regenerates the host key and makes the BMC SSH service available again.

Service Processor messages

The Service Processor (SP) enables you to access, monitor, and troubleshoot FAS22xx, 32xx, 62xx, SA320, and SA620 storage systems remotely. Two types of messages are associated with the SP and can help you monitor your system and troubleshoot problems.

The SP sends AutoSupport messages when certain problems occur. These might include a loss of heartbeat or a reboot failure.

Data ONTAP generates EMS messages when SP events and errors occur. These might include a reminder to configure the SP or an alert to an SP communication problem.

Note: For more information about what the SP does, see the System Administration Guide for the version of Data ONTAP that your system is running.

When and how SP AutoSupport e-mail messages are sent

The SP generates AutoSupport e-mail messages when the system goes down or when certain problems occur.

The SP sends the messages under the following conditions:

• The storage system reboots unexpectedly.
• The storage system stops communicating with the SP.
• A watchdog reset occurs.
  The watchdog is a built-in hardware sensor that monitors the storage system for a hung or unresponsive condition. If the watchdog detects such a condition, it resets the storage system so that the system can automatically reboot and begin functioning.
• The storage system is power-cycled.
• Firmware power-on self-test (POST) errors occur.
• A user initiates an AutoSupport message.
• A user resets the system using the SP.

The subject line of e-mail messages contains the word Notification and includes the host name of the system and the message type. The following text shows an example of an SP AutoSupport e-mail subject line:

System Notification from host_name (HEARTBEAT_LOSS [WARNING]

Messages are sent to recipients that you designate when you configure AutoSupport in Data ONTAP.

Note: The SP must be properly configured to send AutoSupport messages. For information about configuring the SP, see the System Administration Guide and the Software Setup Guide for the version of Data ONTAP that your system is running.

What SP AutoSupport e-mail messages include

SP AutoSupport e-mail messages have different sections that contain different kinds of information about your system.

SP e-mail messages include the following sections and information:

• Subject line: a system notification from the SP of the system, stating the system condition or event that caused the AutoSupport message and the log level.

• Message body: the SP configuration and version information, the system ID, serial number, model, and host name.

• Attachments: System Event Logs, the system sensor state as determined by the SP, and console logs.

When and how SP EMS messages are sent

Data ONTAP generates EMS messages when problems occur with the SP and displays them on the system console.

Problems that trigger EMS messages might include installation of the wrong version of firmware, communication failure, or a network configuration failure.

The console message includes the name of the EMS message and a brief description of the event or problem. The following text contains an example of an SP EMS message:

Date [sp.notConfigured:warning] The system's Service Processor (SP) is not configured. Use the 'sp setup' command to configure it.
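As a minimal sketch of how a script might pick the message name and severity out of such a console line, the following Python assumes the bracketed [name:severity] layout shown in the example above. The function name parse_ems_line and the regular expression are illustrative assumptions, not part of Data ONTAP.

```python
import re

# Hypothetical pattern: assumes the "[event.name:severity] text" layout shown above.
EMS_PATTERN = re.compile(
    r"\[(?P<name>[A-Za-z0-9_.\-]+):(?P<severity>[A-Za-z]+)\]\s*(?P<text>.*)"
)

def parse_ems_line(line):
    """Return (name, severity, text) for an EMS console line, or None if no match."""
    match = EMS_PATTERN.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("severity").upper(), match.group("text")

example = (
    "Date [sp.notConfigured:warning] The system's Service Processor (SP) "
    "is not configured. Use the 'sp setup' command to configure it."
)
print(parse_ems_line(example))
```

The severity is upper-cased so that it can be compared directly with the Severity values listed in the EMS message entries.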

SP-generated AutoSupport messages

The SP continuously monitors the system's health and generates AutoSupport messages when problems occur.

HEARTBEAT_LOSS

Message HEARTBEAT_LOSS

Description This message is sent by the Service Processor (SP) when it detects loss of heartbeat from Data ONTAP, possibly because the system has stopped serving data.

Corrective action If this was a manually triggered or expected reboot, no action is needed. Otherwise, complete the following steps:

1. Check the status of the system and determine whether it is operational.

2. Contact technical support.

REBOOT (abnormal)

Message REBOOT (abnormal)

Description This message is sent by the Service Processor (SP) when it detects an abnormal reboot of the system.

Corrective action If this was a manually triggered or expected reboot, no action is needed. Otherwise, complete the following steps:

1. Check the status of the system and determine the cause of reboot.

2. If the system fails to boot, contact technical support.

SYSTEM_BOOT_FAILED (POST failed)

Message SYSTEM_BOOT_FAILED (POST failed)

Description This message is sent by the Service Processor (SP) when the system firmware has a power-on self-test (POST) failure and cannot load and run Data ONTAP.

Corrective action 1. Run diagnostics on your system.

2. Contact technical support.

USER_TRIGGERED (sp test)

Message USER_TRIGGERED (sp test)

Description This message is sent by the Service Processor (SP) when the sp test autosupport command is run from the Data ONTAP CLI. This is a test mechanism to verify the SP configuration.

Corrective action None.

USER_TRIGGERED (system nmi)

Message USER_TRIGGERED (system nmi)

Description This message is sent by the Service Processor (SP) when a user issues a system core dump (NMI) SP command.

Corrective action None.

USER_TRIGGERED (system power cycle)

Message USER_TRIGGERED (system power cycle)

Description This message is sent by the Service Processor (SP) when a user power-cycles the system using the SP.

Corrective action None.

USER_TRIGGERED (system power off)

Message USER_TRIGGERED (system power off)

Description This message is sent by the Service Processor (SP) when a user powers off the system using the SP.

Corrective action None.

USER_TRIGGERED (system reset)

Message USER_TRIGGERED (system reset)

Description This message is sent by the Service Processor (SP) when a user resets the system using the SP.

Corrective action None.
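The corrective actions in the entries above fall into two groups: the USER_TRIGGERED messages need no action, while the others call for investigation unless the event was expected. The following Python is a sketch that encodes only that grouping. The function name and the returned labels are hypothetical, and any message type not listed in this section is reported as "unknown" rather than guessed.

```python
# Message types taken from the entries above; the labels are illustrative only.
INVESTIGATE_TYPES = {
    "HEARTBEAT_LOSS",
    "REBOOT (abnormal)",
    "SYSTEM_BOOT_FAILED (POST failed)",
}

def follow_up(message_type):
    """Classify an SP AutoSupport message type by the corrective action listed above."""
    if message_type.startswith("USER_TRIGGERED"):
        return "none"
    if message_type in INVESTIGATE_TYPES:
        return "investigate"
    return "unknown"

for name in ("USER_TRIGGERED (system reset)", "HEARTBEAT_LOSS", "SOMETHING_ELSE"):
    print(name, "->", follow_up(name))
```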

EMS messages about the SP

Data ONTAP generates EMS messages when problems occur with the SP.

sp.firmware.upgrade.reqd

Message sp.firmware.upgrade.reqd

Severity WARNING

Description This message occurs when the Service Processor (SP) firmware version and the Data ONTAP software version are incompatible and cannot communicate correctly about a particular capability.

Corrective action Update the firmware version of the SP to the version recommended for your version of Data ONTAP. The firmware and update instructions are available on the NetApp Support Site. After you update the firmware, this message should no longer occur. If the message occurs again, contact technical support and explain that you already updated the firmware to the recommended version.

sp.firmware.version.unsupported

Message sp.firmware.version.unsupported

Severity WARNING

Description This message occurs when the firmware on the Service Processor (SP) is an unsupported version and must be upgraded.

Corrective action Upgrade the SP firmware to a supported version. The firmware and instructions are available on the NOW site. After the SP is running the new firmware, this message should no longer occur. If the message occurs again, contact technical support and explain that you already updated the firmware to the recommended version.

sp.heartbeat.resumed

Message sp.heartbeat.resumed

Severity INFO

Description This message occurs when the system detects resumption of Service Processor (SP) heartbeat notifications indicating that the SP is now available. The earlier issue indicated by the sp.heartbeat.stopped event has been resolved.

Corrective action None.

sp.heartbeat.stopped

Message sp.heartbeat.stopped

Severity WARNING

Description This message occurs when Data ONTAP does not receive expected Service Processor (SP) heartbeat notifications. The SP and Data ONTAP exchange heartbeat messages so that they can detect when one or the other is unavailable. This event is generated when Data ONTAP has not received an expected heartbeat message from the SP.

Corrective action 1. Connect to the SP CLI and enter the following commands:

sp version

priv set advanced

sp log debug

sp log messages

2. Run SP system diagnostics.

3. If you still see this EMS message, contact technical support.
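The sp.heartbeat.stopped and sp.heartbeat.resumed pair described above can be pictured as a timeout check on the last heartbeat received. The following Python is a conceptual sketch only, not how Data ONTAP implements it: the class name, the method names, and the 30-second timeout in the usage lines are all hypothetical, because this guide does not state the heartbeat interval.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time and reports stop and resume transitions."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds  # hypothetical value chosen by the caller
        self.last_seen = None
        self.stopped = False

    def record_heartbeat(self, now):
        """Note a heartbeat; return the resume event name if heartbeats had stopped."""
        self.last_seen = now
        if self.stopped:
            self.stopped = False
            return "sp.heartbeat.resumed"
        return None

    def check(self, now):
        """Return the stop event name once, when the timeout is first exceeded."""
        if self.last_seen is None or self.stopped:
            return None
        if now - self.last_seen > self.timeout_seconds:
            self.stopped = True
            return "sp.heartbeat.stopped"
        return None

monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record_heartbeat(now=0)
print(monitor.check(now=45))
print(monitor.record_heartbeat(now=60))
```

Reporting each transition only once mirrors the way the two EMS messages mark the start and the end of an outage rather than every missed heartbeat.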

sp.network.link.down

Message sp.network.link.down

Severity WARNING

Description This message occurs when the Service Processor (SP) detects a link error on the SP network port. This can happen if a network cable is not plugged into the SP network port. It can also happen if the network that the SP is connected to cannot run at 10/100 Mbps.

Corrective action 1. Check whether the network cable is correctly plugged into the SP network port.

2. Check the link status LED on the SP.

3. Verify that the network that the SP is connected to supports autonegotiation to 10/100 Mbps or is running at one of those speeds; otherwise, SP network connectivity does not work. The SP supports a 10/100 Mbps Ethernet network in autonegotiation mode.

sp.notConfigured

Message sp.notConfigured

Severity WARNING

Description This message occurs weekly to remind you to configure the Service Processor (SP). The SP is a physical device that is incorporated into your system to provide remote access and remote management capabilities. To use the full functionality of the SP, you must configure it first.

Corrective action Ensure that AutoSupport mailhosts and recipients are properly configured in Data ONTAP, and then take the following actions:

1. Configure the SP by entering the following command:

sp setup

If necessary, use the sp status command to obtain the SP's MAC address.

2. Verify the SP network configuration by entering the following command:

sp status

3. Verify that the SP can send AutoSupport messages by entering the following command:

sp test autosupport

sp.orftp.failed

Message sp.orftp.failed

Severity WARNING

Description This message occurs when there is a communication error while sending information to or receiving information from the Service Processor (SP). This error could be due to the following reasons:

• Communication error while the information is being sent or received.
• SP is nonoperational.

Corrective action 1. Check whether the SP is operational by entering the following command at the Data ONTAP prompt:

sp status

2. If the SP is operational and this message persists, reboot the SP by entering the following command at the Data ONTAP prompt:

sp reboot

3. If this message persists after you reboot the SP, contact technical support.

sp.snmp.traps.off

Message sp.snmp.traps.off

Severity INFO

Description This message occurs each time a system boots, if the advanced privilege level in Data ONTAP was used to disable the SNMP Trap feature of the Service Processor (SP).

This message also occurs when the SNMP Trap capability is disabled and a user invokes a Data ONTAP command to use the SP to send an SNMP trap.

Corrective action SP SNMP Trap support is currently disabled. To enable this feature, set the sp.snmp.traps option to On.

sp.userlist.update.failed

Message sp.userlist.update.failed

Severity WARNING

Description This message occurs when there is an error updating user information for the Service Processor (SP). When user information is updated on Data ONTAP, the SP is also updated with the new changes. This enables users to log in to the SP.

The user information update for the SP might have failed for the following reasons:

• Communication error with the SP.
• SP might not be operational.

Corrective action 1. Check whether the SP is operational by entering the following command at the Data ONTAP prompt:

sp status

2. If the SP is operational and this message persists, reboot the SP by entering the following command at the Data ONTAP prompt:

sp reboot

3. Retry the operation that caused the error message.

4. If this message persists after you reboot the SP, contact technical support.

spmgmt.driver.hourly.stats

Message spmgmt.driver.hourly.stats

Severity WARNING

Description This message occurs when the system encounters an error while trying to get hourly statistics from the Service Processor (SP). The error could be due to the following reasons:

• Communication error with the SP.
• SP is not operational.

Corrective action 1. Check whether the SP is online by entering the following command at the Data ONTAP prompt:

sp status

2. If the SP is online and this message persists, reboot the SP by entering the following command at the Data ONTAP prompt:

sp reboot

3. If this message persists after you reboot the SP, contact technical support.

spmgmt.driver.mailhost

Message spmgmt.driver.mailhost

Severity WARNING

Description This message occurs when the Service Processor (SP) setup attempts to verify whether a mailhost specified in Data ONTAP can be reached. In this case, SP setup cannot connect to the specified mailhost.

Corrective action 1. Verify that a valid mailhost is configured in Data ONTAP by checking the system AutoSupport configuration.

2. Ensure that Data ONTAP can successfully connect to the specified mailhost by running a command that sends a test AutoSupport message.

spmgmt.driver.network.failure

Message spmgmt.driver.network.failure

Severity WARNING

Description This message occurs when the system encounters a failure during network configuration of the Service Processor (SP). The system cannot assign the SP a DHCP (Dynamic Host Configuration Protocol) or fixed IP address.

Corrective action 1. Check whether the network cable is correctly plugged into the SP network port.

2. Check the link status LED on the SP.

3. Verify that the network that the SP is connected to supports autonegotiation to 10/100 Mbps or is running at one of those speeds; otherwise, SP network connectivity does not work. The SP supports a 10/100 Mbps Ethernet network in autonegotiation mode.

spmgmt.driver.timeout

Message spmgmt.driver.timeout

Severity WARNING

Description This message occurs when there is a failure during communication with the Service Processor (SP) firmware. The failure could be due to the following reasons:

• Communication error with the SP.
• SP is not operational.

Corrective action 1. Check whether the SP is online by entering the following command at the Data ONTAP prompt:

sp status

2. If the SP is operational and this message persists, reboot the SP by entering the following command at the Data ONTAP prompt:

sp reboot

After the reboot, this message should no longer occur. If the message occurs again, contact technical support and explain that you already performed the preceding steps.
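The Severity values listed for the SP EMS messages in this section can be collected into a lookup table, for example to filter a list of logged message names down to the warnings. The following Python is a minimal sketch: the names and severities are copied from the entries above, while the dictionary name and the function are hypothetical.

```python
# Severity of each SP EMS message, as listed in the entries of this section.
SP_EMS_SEVERITY = {
    "sp.firmware.upgrade.reqd": "WARNING",
    "sp.firmware.version.unsupported": "WARNING",
    "sp.heartbeat.resumed": "INFO",
    "sp.heartbeat.stopped": "WARNING",
    "sp.network.link.down": "WARNING",
    "sp.notConfigured": "WARNING",
    "sp.orftp.failed": "WARNING",
    "sp.snmp.traps.off": "INFO",
    "sp.userlist.update.failed": "WARNING",
    "spmgmt.driver.hourly.stats": "WARNING",
    "spmgmt.driver.mailhost": "WARNING",
    "spmgmt.driver.network.failure": "WARNING",
    "spmgmt.driver.timeout": "WARNING",
}

def warnings_only(message_names):
    """Keep the names whose listed severity is WARNING; unlisted names are dropped."""
    return [name for name in message_names if SP_EMS_SEVERITY.get(name) == "WARNING"]

print(warnings_only(["sp.heartbeat.resumed", "sp.orftp.failed", "sp.snmp.traps.off"]))
```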

Abbreviations

A list of abbreviations and their spelled-out forms is included here for your reference.

A

ABE (Access-Based Enumeration)

ACE (Access Control Entry)

ACL (access control list)

ACP (Alternate Control Path)

AD (Active Directory)

ALPA (arbitrated loop physical address)

ALUA (Asymmetric Logical Unit Access)

AMS (Account Migrator Service)

API (Application Program Interface)

ARP (Address Resolution Protocol)

ASCII (American Standard Code for Information Interchange)

ASP (Active Server Page)

ATA (Advanced Technology Attachment)

B

BCO (Business Continuance Option)

BIOS (Basic Input Output System)

BCS (block checksum type)

BLI (block-level incremental)

BMC (Baseboard Management Controller)

C

CD-ROM (compact disc read-only memory)

CDDI (Copper Distributed Data Interface)

CDN (content delivery network)

CFE (Common Firmware Environment)

CFO (controller failover)

CGI (Common Gateway Interface)

CHA (channel adapter)

CHAP (Challenge Handshake Authentication Protocol)

CHIP (Client-Host Interface Processor)

CIDR (Classless Inter-Domain Routing)

CIFS (Common Internet File System)

CIM (Common Information Model)

CLI (command-line interface)

CP (consistency point)

CPU (central processing unit)

CRC (cyclic redundancy check)

CSP (communication service provider)

D

DAFS (Direct Access File System)

DBBC (database consistency checker)

DCE (Distributed Computing Environment)

DDS (Decru Data Decryption Software)

dedupe (deduplication)

DES (Data Encryption Standard)

DFS (Distributed File System)

DHA (Decru Host Authentication)

DHCP (Dynamic Host Configuration Protocol)

DIMM (dual-inline memory module)

DITA (Darwin Information Typing Architecture)

DLL (Dynamic Link Library)

DMA (direct memory access)

DMTF (Distributed Management Task Force)

DNS (Domain Name System)

DOS (Disk Operating System)

DPG (Data Protection Guide)

DTE (Data Terminal Equipment)

E

ECC (Elliptic Curve Cryptography) or (EMC Control Center)

ECDN (enterprise content delivery network)

ECN (Engineering Change Notification)

EEPROM (electrically erasable programmable read-only memory)

EFB (environmental fault bus)

EFS (Encrypted File System)

EGA (Enterprise Grid Alliance)

EISA (Extended Infrastructure Support Architecture)

ELAN (Emulated LAN)

EMU (environmental monitoring unit)

ESH (embedded switching hub)

F

FAQs (frequently asked questions)

FAS (fabric-attached storage)

FC (Fibre Channel)

FC-AL (Fibre Channel-Arbitrated Loop)

FC SAN (Fibre Channel storage area network)

FC Tape SAN (Fibre Channel Tape storage area network)

FC-VI (virtual interface over Fibre Channel)

FCP (Fibre Channel Protocol)

FDDI (Fiber Distributed Data Interface)

FQDN (fully qualified domain name)

FRS (File Replication Service)

FSID (file system ID)

FSRM (File Storage Resource Manager)

FTP (File Transfer Protocol)

G

GbE (Gigabit Ethernet)

GID (group identification number)

GMT (Greenwich Mean Time)

GPO (Group Policy Object)

GUI (graphical user interface)

GUID (globally unique identifier)

H

HA (high availability)

HBA (host bus adapter)

HDM (Hitachi Device Manager Server)

HP (Hewlett-Packard Company)

HTML (hypertext markup language)

HTTP (Hypertext Transfer Protocol)

I

IB (InfiniBand)

IBM (International Business Machines Corporation)

ICAP (Internet Content Adaptation Protocol)

ICP (Internet Cache Protocol)

ID (identification)

IDL (Interface Definition Language)

ILM (information lifecycle management)

IMS (If-Modified-Since)

I/O (input/output)

IP (Internet Protocol)

IP SAN (Internet Protocol storage area network)

IQN (iSCSI Qualified Name)

iSCSI (Internet Small Computer System Interface)

ISL (Inter-Switch Link)

iSNS (Internet Storage Name Service)

ISP (Internet storage provider)

J

JBOD (just a bunch of disks)

JPEG (Joint Photographic Experts Group)

K

KB (Knowledge Base)

Kbps (kilobits per second)

KDC (Kerberos Distribution Center)

L

LAN (local area network)

LBA (Logical Block Access)

LCD (liquid crystal display)

LDAP (Lightweight Directory Access Protocol)

LDEV (logical device)

LED (light emitting diode)

LFS (log-structured file system)

LKM (Lifetime Key Management)

LPAR (system logical partition)

LREP (logical replication tool utility)

LUN (logical unit number)

LUSE (Logical Unit Size Expansion)

LVM (Logical Volume Manager)

M

MAC (Media Access Control)

Mbps (megabits per second)

MCS (multiple connections per session)

MD5 (Message Digest 5)

MDG (managed disk group)

MDisk (managed disk)

MIB (Management Information Base)

MIME (Multipurpose Internet Mail Extension)

MMC (Microsoft Management Console)

MMS (Microsoft Media Streaming)

MPEG (Moving Picture Experts Group)

MPIO (multipath network input/output)

MRTG (Multi-Router Traffic Grapher)

MSCS (Microsoft Cluster Service)

MSDE (Microsoft SQL Server Desktop Engine)

MTU (Maximum Transmission Unit)

N

NAS (network-attached storage)

NDMP (Network Data Management Protocol)

NFS (Network File System)

NHT (NetApp Health Trigger)

NIC (network interface card)

NMC (Network Management Console)

NMS (network management station)

NNTP (Network News Transport Protocol)

NTFS (New Technology File System)

NTLM (NetLanMan)

NTP (Network Time Protocol)

NVMEM (nonvolatile memory management)

NVRAM (nonvolatile random-access memory)

O

OFM (Open File Manager)

OFW (Open Firmware)

OLAP (Online Analytical Processing)

OS/2 (Operating System 2)

OSMS (Open Systems Management Software)

OSSV (Open Systems SnapVault)

P

PC (personal computer)

PCB (printed circuit board)

PCI (Peripheral Component Interconnect)

pcnfsd (storage daemon)

(PC)NFS (Personal Computer Network File System)

PDU (protocol data unit)

PKI (Public Key Infrastructure)

POP (Post Office Protocol)

POST (power-on self-test)

PPN (physical path name)

PROM (programmable read-only memory)

PSU (power supply unit)

PVC (permanent virtual circuit)

Q

QoS (Quality of Service)

QSM (Qtree SnapMirror)

R

RAD (report archive directory)

RADIUS (Remote Authentication Dial-In Service)

RAID (redundant array of independent disks)

RAID-DP (redundant array of independent disks, double-parity)

RAM (random access memory)

RARP (Reverse Address Resolution Protocol)

RBAC (role-based access control)

RDB (replicated database)

RDMA (Remote Direct Memory Access)

RIP (Routing Information Protocol)

RISC (Reduced Instruction Set Computer)

RLM (Remote LAN Module)

RMC (remote management controller)

ROM (read-only memory)

RPM (revolutions per minute)

rsh (Remote Shell)

RTCP (Real-time Transport Control Protocol)

RTP (Real-time Transport Protocol)

RTSP (Real Time Streaming Protocol)

S

SACL (system access control list)

SAN (storage area network)

SAS (storage area network attached storage) or (serial-attached SCSI)

SATA (serial advanced technology attachment)

SCSI (Small Computer System Interface)

SFO (storage failover)

SFSR (Single File SnapRestore operation)

SID (Secure ID)

SIMM (single inline memory module)

SLB (Server Load Balancer)

SLP (Service Location Protocol)

SNMP (Simple Network Management Protocol)

SNTP (Simple Network Time Protocol)

SP (Storage Processor)

SPN (service principal name)

SPOF (single point of failure)

SQL (Structured Query Language)

SRM (Storage Resource Management)

SSD (solid state disk)

SSH (Secure Shell)

SSL (Secure Sockets Layer)

STP (shielded twisted pair)

SVC (switched virtual circuit)

T

TapeSAN (tape storage area network)

TCO (total cost of ownership)

TCP (Transmission Control Protocol)

TCP/IP (Transmission Control Protocol/Internet Protocol)

TOE (TCP offload engine)

TP (twisted pair)

TSM (Tivoli Storage Manager)

TTL (Time To Live)

U

UDP (User Datagram Protocol)

UI (user interface)

UID (user identification number)

Ultra ATA (Ultra Advanced Technology Attachment)

UNC (Uniform Naming Convention)

UPS (uninterruptible power supply)

URI (universal resource identifier)

URL (uniform resource locator)

USP (Universal Storage Platform)

UTC (Universal Coordinated Time)

UTP (unshielded twisted pair)

UUID (universal unique identifier)

UWN (unique world wide number)

V

VCI (virtual channel identifier)

VCMDB (Volume Configuration Management Database)

VDI (Virtual Device Interface)

VDisk (virtual disk)

VDS (Virtual Disk Service)

VFM (Virtual File Manager)

VFS (virtual file system)

VI (virtual interface)

vif (virtual interface)

VIRD (Virtual Router ID)

VLAN (virtual local area network)

VLD (virtual local disk)

VOD (video on demand)

VOIP (voice over IP)

VRML (Virtual Reality Modeling Language)

VTL (Virtual Tape Library)

W

WAFL (Write Anywhere File Layout)

WAN (wide area network)

WBEM (Web-Based Enterprise Management)

WHQL (Windows Hardware Quality Lab)

WINS (Windows Internet Name Service)

WORM (write once, read many)

WWN (worldwide name)

WWNN (worldwide node name)

WWPN (worldwide port name)

www (worldwide web)

Z

ZCS (zoned checksum)

Copyright information

Copyright © 1994–2012 NetApp, Inc. All rights reserved. Printed in the U.S.

No part of this document covered by copyright may be reproduced in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system) without prior written permission of the copyright owner.

Software derived from copyrighted NetApp material is subject to the following license and disclaimer:

THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.

The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).

Trademark information

NetApp, the NetApp logo, Network Appliance, the Network Appliance logo, Akorri, ApplianceWatch, ASUP, AutoSupport, BalancePoint, BalancePoint Predictor, Bycast, Campaign Express, ComplianceClock, Cryptainer, CryptoShred, Data ONTAP, DataFabric, DataFort, Decru, Decru DataFort, DenseStak, Engenio, Engenio logo, E-Stack, FAServer, FastStak, FilerView, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexSuite, FlexVol, FPolicy, GetSuccessful, gFiler, Go further, faster, Imagine Virtually Anything, Lifetime Key Management, LockVault, Manage ONTAP, MetroCluster, MultiStore, NearStore, NetCache, NOW (NetApp on the Web), Onaro, OnCommand, ONTAPI, OpenKey, PerformanceStak, RAID-DP, ReplicatorX, SANscreen, SANshare, SANtricity, SecureAdmin, SecureShare, Select, Service Builder, Shadow Tape, Simplicity, Simulate ONTAP, SnapCopy, SnapDirector, SnapDrive, SnapFilter, SnapLock, SnapManager, SnapMigrator, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapSuite, SnapValidator, SnapVault, StorageGRID, StoreVault, the StoreVault logo, SyncMirror, Tech OnTap, The evolution of storage, Topio, vFiler, VFM, Virtual File Manager, VPolicy, WAFL, Web Filer, and XBB are trademarks or registered trademarks of NetApp, Inc. in the United States, other countries, or both.

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. A complete and current list of other IBM trademarks is available on the web at www.ibm.com/legal/copytrade.shtml.

Apple is a registered trademark and QuickTime is a trademark of Apple, Inc. in the United States and/or other countries. Microsoft is a registered trademark and Windows Media is a trademark of Microsoft Corporation in the United States and/or other countries. RealAudio, RealNetworks, RealPlayer, RealSystem, RealText, and RealVideo are registered trademarks and RealMedia, RealProxy, and SureStream are trademarks of RealNetworks, Inc. in the United States and/or other countries.

All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.

NetApp, Inc. is a licensee of the CompactFlash and CF Logo trademarks.

NetApp, Inc. NetCache is certified RealSystem compatible.

How to send your comments

You can help us to improve the quality of our documentation by sending us your feedback.

Your feedback is important in helping us to provide the most accurate and high-quality information. If you have suggestions for improving this document, send us your comments by email to [email protected]. To help us direct your comments to the correct division, include in the subject line the product name, version, and operating system.

You can also contact us in the following ways:

• NetApp, Inc., 495 East Java Drive, Sunnyvale, CA 94089 U.S.
• Telephone: +1 (408) 822-6000
• Fax: +1 (408) 822-4501
• Support telephone: +1 (888) 463-8277

Index

3020 and 3050 systems
    POST error messages 120

3040 and 3070 systems
    POST error messages 124

30xx systems
    activity LED 45
    controller front LEDs 45
    FC port LEDs 46
    GbE port LEDs 46
    LEDs on the back of controllers 46
    power LED 45
    PSU LEDs 47
    RLM LEDs 46
    status LED 45

31xx systems
    controller activity LED 49
    Ethernet port LEDs 50
    fan LED 51
    fault LED 49, 50
    Fibre Channel port LED 50
    FRU LEDs 53
    LEDs on the back of the controller 50
    LEDs on the front of the chassis 49
    POST error messages 124
    power LED 49
    PSU LEDs 52

32xx systems
    chassis fault LED 54
    controller activity LED 54
    controller fault LED 55
    controller-I/O expansion module configuration 54
    dual-controller configuration 54
    fan LED 59
    Fibre Channel port LEDs 55
    GbE port LEDs 55
    I/O expansion module fault LED 58
    internal FRU LEDs 60
    LEDs on the back of the controller 55
    LEDs on the back of the I/O expansion module 58
    LEDs on the front of the chassis 54
    management port LEDs 55, 58
    NVMEM LED 55
    POST error messages 132
    power LED 54
    PSU LEDs 59
    SAS port LEDs 55

60xx systems
    activity LED 61
    fan LEDs 63
    Fibre Channel port LEDs 62
    GbE port LEDs 62
    LEDs on the back of the controller 62
    LEDs on the front of the controller 61
    POST error messages 124
    power LED 61
    PSU LEDs 64
    RLM LEDs 62
    status LED 61

62xx systems
    10-GbE port LEDs 67
    8-Gb Fibre Channel port LEDs 67
    chassis fault LED 65
    console port 67
    controller activity LED 65
    controller fault LED 67
    controller-I/O expansion module configuration 65
    dual-controller configuration 65
    fan LEDs 72
    GbE port LEDs 67
    I/O expansion module fault LED 71
    internal FRU LEDs 73
    LEDs on the back of the controller 67
    LEDs on the back of the I/O expansion module 71
    POST error messages 132
    power LED 65
    private management port LEDs 67, 71
    PSU LEDs 72
    remote management port LEDs 67
    USB port 67

A

AutoSupport messages 26

B

BIOS and boot loader progress
    method of viewing progress on the console 118
    method of viewing progress through the Bios Status sensor 119

BMC
    e-mail contents 289
    function 289
    how and when e-mail AutoSupport messages are sent 289
    systems containing the 289

BMC-generated messages
    BMC_ASUP_UNKNOWN 290
    REBOOT (abnormal) 290
    REBOOT (power loss) 290
    REBOOT (watchdog reset) 290
    SYSTEM_BOOT_FAILED (POST failed) 290
    SYSTEM_POWER_OFF (environment) 291
    USER_TRIGGERED (bmc test) 291
    USER_TRIGGERED (system nmi) 291
    USER_TRIGGERED (system power cycle) 291
    USER_TRIGGERED (system power off) 291
    USER_TRIGGERED (system power on) 292
    USER_TRIGGERED (system power soft-off) 292
    USER_TRIGGERED (system reset) 292

Boot error messages
    Boot device err 140
    Cannot initialize labels 140
    Cannot read labels 140
    Configuration exceeds max PCI space 140
    DIMM slot # has correctable ECC errors 141
    Dirty shutdown in degraded mode 141
    Disk label processing failed 141
    Drive %s.%d not supported 141
    Error detection detected too many errors to analyze at once 142
    FC-AL loop down, adapter %d 142
    File system may be scrambled 142
    Halted disk firmware too old 143
    Halted:Illegal configuration 143
    Invalid PCI card slot %d 143
    No /etc/rc 143
    No /etc/rc, running setup 144
    No disk controllers 144
    No disks 144
    No network interfaces 144
    No NVRAM present 145
    NVRAM #n downrev 145
    NVRAM:wrong pci slot 145
    Panic:DIMM slot #n has uncorrectable ECC errors 145
    This platform is not supported on this release 145
    Too many errors in too short time 146
    Warning:Motherboard Revision not available 146
    Warning:Motherboard Serial Number not available 146
    Warning:system serial number is not available 146
    Watchdog error 146
    Watchdog failed 147

D

degraded power, possible cause and remedy 149

diagnostic tools
    forms and use of 26

documentation
    where to find platform troubleshooting 27

E

EMS messages
    what information they provide 149

EMS messages about the BMC
    bmc.asup.crit 292
    bmc.asup.error 293
    bmc.asup.init 293
    bmc.asup.queue 293
    bmc.asup.send 293
    bmc.asup.smtp 294
    bmc.batt.id 294
    bmc.batt.invalid 294
    bmc.batt.mfg 294
    bmc.batt.rev 295
    bmc.batt.seal 295
    bmc.batt.unknown 295
    bmc.batt.unseal 295
    bmc.batt.upgrade 295
    bmc.batt.upgrade.busy 296
    bmc.batt.upgrade.failed 296
    bmc.batt.upgrade.failure 296
    bmc.batt.upgrade.ok 297
    bmc.batt.upgrade.power-off 297
    bmc.batt.upgrade.voltagelow 297
    bmc.batt.voltage 297
    bmc.config.asup.off 298
    bmc.config.corrupted 298
    bmc.config.default 298
    bmc.config.default.pef.filter 298
    bmc.config.default.pef.policy 299
    bmc.config.fru.systemserial 299
    bmc.config.mac.error 299
    bmc.config.net.error 299
    bmc.config.upgrade 300
    bmc.power.on.auto 300
    bmc.reset.ext 300
    bmc.reset.int 300
    bmc.reset.power 300
    bmc.reset.repair 301
    bmc.reset.unknown 301
    bmc.sensor.batt.charger.off 301
    bmc.sensor.batt.charger.on 301
    bmc.sensor.batt.time.run.invalid 301
    bmc.ssh.key.missing 302

EMS messages about the RLM
    rlm.driver.hourly.stats 279
    rlm.driver.mailhost 280
    rlm.driver.network.failure 280
    rlm.driver.timeout 280
    rlm.firmware.update.failed 281
    rlm.firmware.upgrade.reqd 281
    rlm.firmware.version.unsupported 282
    rlm.heartbeat.bootFromBackup 282
    rlm.heartbeat.resumed 282
    rlm.heartbeat.stopped 283
    rlm.network.link.down 283
    rlm.notConfigured 284
    rlm.orftp.failed 284
    rlm.snmp.traps.off 285
    rlm.systemDown.alert 285
    rlm.systemDown.notice 285
    rlm.systemDown.warning 286
    rlm.systemPeriodic.keepAlive 286
    rlm.systemTest.notice 287
    rlm.userlist.update.failed 287

EMS messages about the SP
    sp.network.link.down 308
    sp.notConfigured 308
    sp.firmware.upgrade.reqd 306
    sp.firmware.version.unsupported 307
    sp.heartbeat.resumed 307
    sp.heartbeat.stopped 307
    sp.orftp.failed 309
    sp.snmp.traps.off 309
    sp.userlist.update.failed 309
    spmgmt.driver.hourly.stats 310
    spmgmt.driver.mailhost 311
    spmgmt.driver.network.failure 311
    spmgmt.driver.timeout 311

environmental EMS messages
    nvmem.battery.capacity.low 166
    nvram.battery.capacity.low 171
    nvram.battery.capacity.low.critical 171
    nvram.battery.capacity.low.warn 171
    nvram.battery.capacity.normal 171
    nvram.battery.charging.nocharge 172
    nvram.battery.charging.normal 172
    nvram.battery.charging.wrongcharge 172
    nvram.battery.current.high.warn 173
    nvram.battery.current.low 173
    nvram.battery.current.low.warn 173
    nvram.battery.current.normal 174
    nvram.battery.end_of_life.high 174
    nvram.battery.fault 174
    nvram.battery.fault.warn 175
    nvram.battery.fcc.low 175
    nvram.battery.fcc.low.critical 175
    nvram.battery.fcc.low.warn 175
    nvram.battery.fcc.normal 176
    nvram.battery.power.fault 176
    nvram.battery.power.normal 176
    nvram.battery.sensor.unreadable 176
    nvram.battery.temp.high.warn 177
    nvram.battery.temp.low 177
    nvram.battery.temp.normal 178
    nvram.battery.voltage.high 178
    nvram.battery.voltage.high.warn 178
    nvram.battery.voltage.low 179
    nvram.battery.voltage.low.warn 179
    nvram.battery.voltage.normal 179
    Chassis fan FRU failed 149
    Chassis over temperature on XXXX 150
    Chassis over temperature shutdown on XXXX 150
    Chassis Power Degraded:3.3V in warn high state 150
    Chassis power degraded:PS# 151
    Chassis Power Fail:PS# 151
    Chassis Power Shutdown 151
    Chassis power shutdown:3.3V is in warn low state 152
    Chassis power supply degraded:PS# 153
    Chassis power supply fail:PS# 153
    Chassis power supply off:PS# 153, 154
    Chassis power supply OK:PS# 154
    Chassis power supply removed:PS# 154
    Chassis Power Supply:PS# removed 152
    Chassis under temperature on XXXX 155
    Chassis under temperature shutdown on XXXX 155
    Fan:# is spinning below tolerable speed 155
    monitor.chassisFan.degraded 156
    monitor.chassisFan.ok 156
    monitor.chassisFan.removed 156
    monitor.chassisFan.slow 156
    monitor.chassisFan.stop 157
    monitor.chassisFan.warning 157
    monitor.chassisFanFail.xMinShutdown 157
    monitor.chassisPower.degraded 157
    monitor.chassisPower.ok 158
    monitor.chassisPowerSupplies.ok 158
    monitor.chassisPowerSupply.degraded 158
    monitor.chassisPowerSupply.notPresent 158
    monitor.chassisPowerSupply.off 159
    monitor.chassisPowerSupply.ok 159
    monitor.chassisTemperature.cool 159
    monitor.chassisTemperature.ok 159
    monitor.chassisTemperature.warm 159
    monitor.cpuFan.degraded 160
    monitor.cpuFan.failed 160
    monitor.cpuFan.ok 160
    monitor.ioexpansion.unpresent 162
    monitor.ioexpansionPower.degraded 161
    monitor.ioexpansionPower.ok 161
    monitor.ioexpansionTemperature.cool 161
    monitor.ioexpansionTemperature.ok 161
    monitor.ioexpansionTemperature.warm 162
    monitor.nvmembattery.warninglow 162
    monitor.nvramLowBattery 162
    monitor.power.unreadable 163
    monitor.shutdown.cancel 163
    monitor.shutdown.cancel.nvramLowBattery 163
    monitor.shutdown.chassisOverTemp 163
    monitor.shutdown.chassisUnderTemp 164
    monitor.shutdown.emergency 164
    monitor.shutdown.ioexpansionOverTemp 164
    monitor.shutdown.nvramLowBattery.pending 165
    monitor.temp.unreadable 165
    Multiple chassis fans have failed 165
    Multiple fan failure on XXXX 166
    Multiple power supply fans failed 166
    nvmem.battery.capacity.low.warn 167
    nvmem.battery.capacity.normal 167
    nvmem.battery.current.high 167
    nvmem.battery.current.high.warn 167
    nvmem.battery.sensor.unreadable 168
    nvmem.battery.temp.high 168
    nvmem.battery.temp.low 168
    nvmem.battery.temp.normal 169
    nvmem.battery.voltage.high 169
    nvmem.battery.voltage.high.warn 169
    nvmem.battery.voltage.normal 169
    nvmem.voltage.high 170
    nvmem.voltage.high.warn 170
    nvmem.voltage.normal 170
    nvram.bat.missing.error 170
    nvram.battery.current.high 172
    nvram.battery.end_of_life.normal 174
    nvram.battery.temp.high 177
    nvram.hw.initFail 179

F

FAS20xx systems
    controller module fault LED 33
    controller module LEDs 31
    Ethernet port LEDs 33
    fault LED 31
    Fibre Channel port LEDs 33
    LEDs on the back of the controller module 33
    LEDs on the front of the chassis 31
    NVMEM LED 33
    power LED 31
    PSU LEDs 35
    remote management port LEDs 33
    startup progress, viewing 118

FAS22xx systems
    power LED 37
    chassis fault LED 37
    controller activity LED 37
    controller fault LED 40
    Fibre Channel port LEDs 40
    GbE port LEDs 40
    internal drive LEDs 38
    internal FRU LEDs 45
    LEDs on the back of the controller 40
    LEDs on the front of the chassis 37
    management port LEDs 40
    mezzanine card 40
    NVMEM LED 40
    POST error messages 132
    PSU LEDs 43
    SAS port LEDs 40

FCoE HBA EMS messages
    ispcna.mpi.dump.saved 269
    ispcna.mpi.initFailed 270
    ispcna.mpi.dump 269

Flash Cache module and PAM EMS messages
    fal.chan.online.write.warn 234
    fmm.threshold.bank.degraded 235
    fmm.threshold.card.degraded 236
    fmm.threshold.core.offline 236
    extCache.io.BlockChecksumError 229
    extCache.io.cardError 229
    extCache.io.readError 229
    extCache.io.writeError 230
    extCache.offline 230
    extCache.ReconfigComplete 230
    extCache.ReconfigFailed 230
    extCache.ReconfigStart 231
    extCache.UECCerror 231
    extCache.UECCmax 231
    fal.chan.offline.comp 232
    fal.chan.online.erase.warn 232
    fal.chan.online.fail 232
    fal.chan.online.read.warn 232
    fal.chan.online.rep.fail 233
    fal.chan.online.rep.part 233
    fal.chan.online.rep.succ 233
    fal.chan.online.rep.ver.err 233
    fal.init.failed 234
    fmm.bad.block.detected 234
    fmm.device.stats.missing 234
    fmm.domain.card.failure 235
    fmm.domain.core.failure 235
    fmm.hourly.device.report 235
    fmm.threshold.bank.offline 236
    fmm.threshold.card.failure 236
    iomem.bbm.bbtl.overflow 237
    iomem.bbm.new.flash 237
    iomem.card.disable 237
    iomem.card.enable 238
    iomem.card.fail.cecc 238
    iomem.card.fail.data.crc 238
    iomem.card.fail.desc.crc 238
    iomem.card.fail.dimm 237, 239
    iomem.card.fail.firmware.primary 239
    iomem.card.fail.fpga 239
    iomem.card.fail.fpga.primary 240
    iomem.card.fail.fpga.rev 240
    iomem.card.fail.internal 241
    iomem.card.fail.pci 241
    iomem.card.fail.uecc 241
    iomem.dimm.log.checksum 242
    iomem.dimm.log.init 242
    iomem.dimm.log.read 242
    iomem.dimm.log.sync 242
    iomem.dimm.log.write 243
    iomem.dimm.mismatch.banks 243
    iomem.dimm.mismatch.burst 243
    iomem.dimm.mismatch.casLatency 243
    iomem.dimm.mismatch.columns 244
    iomem.dimm.mismatch.dataWidth 244
    iomem.dimm.mismatch.eccWidth 244
    iomem.dimm.mismatch.ranks 244
    iomem.dimm.mismatch.rows 245
    iomem.dimm.mismatch.vendor 245
    iomem.dimm.spd.banks 245
    iomem.dimm.spd.burst 245
    iomem.dimm.spd.casLatency 246
    iomem.dimm.spd.checksum 246
    iomem.dimm.spd.columns 246
    iomem.dimm.spd.dataWidth 246
    iomem.dimm.spd.detect 247
    iomem.dimm.spd.eccWidth 247
    iomem.dimm.spd.ranks 247
    iomem.dimm.spd.read 247
    iomem.dimm.spd.rows 248
    iomem.dma.crc.data 248
    iomem.dma.crc.desc 248
    iomem.dma.internal 248
    iomem.dma.stall 249
    iomem.ecc.cecc 249
    iomem.ecc.correct.off 249
    iomem.ecc.correct.on 249
    iomem.ecc.detect.off 250
    iomem.ecc.detect.on 250
    iomem.ecc.inject 250
    iomem.ecc.summary 250
    iomem.ecc.uecc 251
    iomem.fail.stripe 251
    iomem.firmware.package.access 251
    iomem.firmware.primary 252
    iomem.firmware.program.complete 252
    iomem.firmware.program.fail 252
    iomem.firmware.program.reboot 252
    iomem.firmware.program.start 252
    iomem.firmware.rev 253
    iomem.flash.mismatch.id 253
    iomem.fru.badInfo 253
    iomem.fru.checksum 253
    iomem.fru.read 254
    iomem.fru.write 254
    iomem.i2c.link.down 254
    iomem.i2c.read.addrNACK 254
    iomem.i2c.read.dataNACK 255
    iomem.i2c.read.timeout 255
    iomem.i2c.write.addrNACK 255
    iomem.i2c.write.dataNACK 255
    iomem.i2c.write.timeout 256
    iomem.init.detect.fpga 256
    iomem.init.detect.pci 256
    iomem.init.fail 256
    iomem.memory.flash.syndrome 256
    iomem.memory.none 257
    iomem.memory.power.high 257
    iomem.memory.power.low 257
    iomem.memory.scrub.start 257
    iomem.memory.size 258
    iomem.memory.zero.complete 258
    iomem.memory.zero.start 258
    iomem.nor.op.failed 258
    iomem.pci.error.config.bar 258
    iomem.pio.op.failed 259
    iomem.remap.block 259
    iomem.remap.target.bad 259
    iomem.temp.report 259
    iomem.train.complete 260
    iomem.train.fail 260
    iomem.train.notReady 260
    iomem.train.start 260
    iomem.vmargin.high 261
    iomem.vmargin.low 261
    iomem.vmargin.nominal 261
    message generation and reporting 229
    monitor.extCache.failed 261
    monitor.flexscale.noLicense 261

H

HBA LEDs
    dual port, 8-Gb Fibre Channel Virtual Interface HBA 77
    dual-port Fibre Channel 74
    dual-port, 10-Gb, FCoE unified target 84
    dual-port, 3-Gb SAS 86
    dual-port, 4-Gb, target-mode Fibre Channel 75
    dual-port, 8-Gb, target-mode Fibre Channel 75
    fiber-optic iSCSI target 81
    quad-port, 4-Gb, Fibre Channel, 12-LED version 80
    quad-port, 4-Gb, Fibre Channel, four-LED version 78
    quad-port, 8-Gb SAS 86

L

LEDs
    30xx controller front 45
    30xx PSU 47
    31xx system fan LEDs 51
    31xx system FRU LEDs 53
    31xx system LEDs on the back of the controller 50
    31xx system LEDs on the front of the chassis 49
    31xx system PSU LEDs 52
    32xx system fan LEDs 59
    32xx system internal FRU LEDs 60
    32xx system LEDs on the back of the controller 55
    32xx system LEDs on the back of the I/O expansion module 58
    32xx system PSU LEDs 59
    60xx system fan LEDs 63
    60xx system LEDs on the back of the controller 62
    60xx system LEDs on the front of the controller 61
    60xx system PSU LEDs 64
    62xx LEDs on the back of the controller 67
    62xx PSU LEDs 72
    62xx system fan LEDs 72
    62xx system internal FRU LEDs 73
    62xx system LEDs on front of chassis 65
    62xx system LEDs on the back of the I/O expansion module 71
    copper, iSCSI, target HBA 82
    dual port, 8-Gb Fibre Channel Virtual Interface HBA 77
    dual-port Fibre Channel HBA 74
    dual-port GbE NICs 95, 96
    dual-port, 10-Gb, FCoE unified target HBA 84
    dual-port, 10GBase-CX4 TOE NICs 103
    dual-port, 2-Gb VI-MetroCluster adapter 87
    dual-port, 3-Gb SAS 86
    dual-port, 4-Gb MetroCluster adapter 89
    dual-port, 4-Gb, target-mode Fibre Channel HBA 75
    dual-port, 8-Gb MetroCluster adapter 90
    dual-port, 8-Gb, target-mode Fibre Channel HBA 75
    FAS20xx system LEDs on the back of the controller module 33
    FAS20xx system LEDs on the front of the chassis 31
    FAS22xx internal drive LEDs 38
    FAS22xx system internal FRU LEDs 45
    FAS22xx system LEDs on the back of the controller 40
    FAS22xx system LEDs on the front of the chassis 37
    FAS22xx system PSU LEDs 43
    FAS30xx, back of controllers 46
    fiber-optic iSCSI target HBA 81
    Flash Cache module 114
    HBA LEDs
        copper, iSCSI, target 82
    multiport GbE NICs 98
    NVRAM5 adapter 106
    NVRAM5 and NVRAM6 media converter 108
    NVRAM6 adapter 106
    NVRAM7 adapter 107
    NVRAM8 adapter 108
    onboard drive failures, FAS20xx systems 31
    Performance Acceleration Module (PAM) 114
    PSU, FAS20xx systems 35
    PSU, SA200 systems 35
    quad-port TOE NICs 104
    quad-port, 3-Gb SAS 86
    quad-port, 4-Gb, Fibre Channel HBA, 12-LED version 80
    quad-port, 4-Gb, Fibre Channel HBA, four-LED version 78
    SA200 system LEDs on the back of the controller module 33
    SA200 system LEDs on the front of the chassis 31
    SA300 controller front 45
    SA300 PSU 47
    SA300, back of controllers 46
    SA320 system fan LEDs 59
    SA320 system internal FRU LEDs 60
    SA320 system LEDs on the back of the controller 55
    SA320 system LEDs on the back of the I/O expansion module 58
    SA320 system PSU LEDs 59
    SA600 system fan LEDs 63
    SA600 system LEDs on the back of the controller 62
    SA600 system LEDs on the front of the controller 61
    SA600 system PSU LEDs 64
    SA620 LEDs on the back of the controller 67
    SA620 PSU LEDs 72
    SA620 system fan LEDs 72
    SA620 system internal FRU LEDs 73
    SA620 system LEDs on front of chassis 65
    SA620 system LEDs on the back of the I/O expansion module 71
    single-port GbE NICs 92
    single-port GbE NICs, FAS2050 systems only 94
    single-port TOE NICs 100

M

MetroCluster adapter LEDs
    dual-port, 2-Gb VI-MetroCluster adapter 87
    dual-port, 4-Gb MetroCluster adapter 89
    dual-port, 8-Gb MetroCluster adapter 90

N

NIC LEDs
    dual-port GbE 95, 96
    multiport GbE 98
    single-port GbE 92
    single-port GbE, FAS2050 systems only 94

NVRAM5 adapter
    LEDs 106
    which systems support the 105

NVRAM6 adapter
    LEDs 106
    which systems support the 105

NVRAM7 adapter
    LEDs 107
    which systems support the 105

NVRAM8 adapter
    destage status 108
    HA pair 108
    LEDs 108
    which systems support the 105

O

operational error messages
    Disk hung during swap 270
    Disk n is broken 271
    Dumping core 271
    Error dumping core 271
    FC-AL LINK_FAILURE 271
    FC-AL RECOVERABLE ERRORS 271
    Panicking 272
    RMC Alert:Boot Error 272
    RMC Alert:Down Appliance 272
    RMC Alert:OFW POST Error 272
    when they appear 149

P

POST error messages, 3020 and 3050 systems
    Abort Autoboot–POST Failure(s):CPU 120
    Abort Autoboot–POST Failure(s):MEMORY 120
    Abort Autoboot–POST Failure(s):RTC, RTC_IO 121
    Abort Autoboot–POST Failure(s):UCODE 121
    Autoboot of backup image aborted 121
    Autoboot of backup image failed 122
    Autoboot of primary image aborted 122
    Autoboot of primary image failed 122
    Invalid FRU EEPROM Checksum 123
    Memory init failure 123
    No Memory found 123
    Unsupported system bus speed 124

POST error messages, 3040, 3070, and SA300 systems
    0200:Failure Fixed Disk 124
    0230:System RAM Failed at offset: 124
    0231:Shadow RAM failed at offset 125
    0232:Extended RAM failed at address line 125, 130
    0235:Multiple-bit ECC error occurred 126
    023C:Bad DIMM found in slot # 126
    023E:Node Memory Interleaving disabled 127
    0241:Agent Read Timeout 127
    0242:Invalid FRU information 127
    0250:System battery is dead 128
    0251:System CMOS checksum bad 128
    0253:Clear CMOS jumper detected 128
    0260:System timer error 129
    0280:Previous boot incomplete 129
    02C2:No valid Boot Loader in System Flash–Non Fatal 129
    02C3:No valid Boot Loader in System Flash–Fatal 129
    02FA:Watchdog Timer Reboot (PciInit) 130
    02FB:Watchdog Timer Reboot (MemTest) 131
    02FC:LDTStop Reboot (HTLinkInit) 131
    No message on console 131

POST error messages, 31xx systems
    0200:Failure Fixed Disk 124
    0230:System RAM Failed at offset: 124
    0231:Shadow RAM failed at offset 125
    0232:Extended RAM failed at address line 125
    0235:Multiple-bit ECC error occurred 126
    023C:Bad DIMM found in slot # 126
    023E:Node Memory Interleaving disabled 127
    0241:Agent Read Timeout 127
    0242:Invalid FRU information 127
    0250:System battery is dead 128
    0251:System CMOS checksum bad 128
    0253:Clear CMOS jumper detected 128
    0260:System timer error 129
    0280:Previous boot incomplete 129
    02C2:No valid Boot Loader in System Flash–Non Fatal 129
    02C3:No valid Boot Loader in System Flash–Fatal 129
    02FA:Watchdog Timer Reboot (PciInit) 130
    02FB:Watchdog Timer Reboot (MemTest) 131
    02FC:LDTStop Reboot (HTLinkInit) 131
    No message on console 131

POST error messages, 32xx and SA320 systems
    023A:ONTAP Detected Bad DIMM in slot: 134
    023B:BIOS detected SPD checksum error in DIMM slot: 134
    0280:Previous boot incomplete - Default configuration used 136
    BIOS detected pattern write/read mismatch in DIMM slot: 134
    Fatal Error:No DIMM detected and system can not continue boot! 138
    Fatal Error! All channels are disabled! 138
    Fatal Error! All DIMM failed and system can not continue boot! 139
    Software memory test failed! 139
    0200:Failure Fixed Disk 132
    0230:System RAM Failed at offset: 132
    0231:Shadow RAM Failed at offset: 133
    0232:Extended RAM Failed at address line: 133
    0241:SMBus Read Timeout 134
    0242:Invalid FRU information 135
    0250:System battery is dead - Replace and run SETUP 135
    0251:System CMOS checksum bad 135
    0260:System timer error 135
    0271:Check date and time settings 135
    02A2:BMC System Error Log (SEL) Full 136
    02A3:No Response From SP To FRU ID Read Request 136
    02C2:No valid Boot Loader in System Flash - Non Fatal 137
    02C3:No valid Boot Loader in System Flash - Fatal 138
    BIOS detected uncorrectable ECC error in DIMM slot: 133
    BIOS detected unknown errors in DIMM slot 134
    Fatal Error! RDIMMs and UDIMMs are mixed! 139
    Fatal Error! UDIMM in 3rd slot is not supported! 139
    No message on the console 133

POST error messages, 60xx and SA600 systems
    0200:Failure Fixed Disk 124
    0230:System RAM Failed at offset: 124
    0231:Shadow RAM failed at offset 125
    0232:Extended RAM failed at address line 125
    0235:Multiple-bit ECC error occurred 126
    023C:Bad DIMM found in slot # 126
    023E:Node Memory Interleaving disabled 127
    0241:Agent Read Timeout 127
    0242:Invalid FRU information 127
    0250:System battery is dead 128
    0251:System CMOS checksum bad 128
    0253:Clear CMOS jumper detected 128
    0260:System timer error 129
    0280:Previous boot incomplete 129
    02C2:No valid Boot Loader in System Flash–Non Fatal 129
    02C3:No valid Boot Loader in System Flash–Fatal 129
    02F9:FGPA jumper detected 130
    02FA:Watchdog Timer Reboot (PciInit) 130
    02FB:Watchdog Timer Reboot (MemTest) 131
    02FC:LDTStop Reboot (HTLinkInit) 131
    No message on console 131

POST error messages, 62xx and SA620 systems023A:ONTAP Detected Bad DIMM in slot:: 134023B:BIOS detected SPD checksum error in

DIMM slot: 1340271:Check date and time settings 1350280:Previous boot incomplete - Default

configuration used 136Fatal Error:No DIMM detected and system can not

continue boot! 138Fatal Error! All channels are disabled! 138Fatal Error! All DIMM failed and system can not

continue boot! 139Software memory test failed! 1390200:Failure Fixed Disk 1320230:System RAM Failed at offset: 1320231:Shadow RAM Failed at offset: 1330232:Extended RAM Failed at address line: 1330241:SMBus Read Timeout 1340242:Invalid FRU information 1350250:System battery is dead - Replace and run

SETUP 1350251:System CMOS checksum bad 1350260:System timer error 13502A2:BMC System Error Log (SEL) Full 13602A3:No Response From SP To FRU ID Read

Request 13602C2:No valid Boot Loader in System Flash - Non

Fatal 13702C3:No valid Boot Loader in System Flash - Fatal

138BIOS detected pattern write/read mismatch in

DIMM slot: 134BIOS detected uncorrectable ECC error in DIMM

slot: 133BIOS detected unknown errors in DIMM slot 134

Fatal Error! RDIMMs and UDIMMs are mixed!139

Fatal Error! UDIMM in 3rd slot is not supported!139

No message on the console 133POST error messages, FAS22xx systems

  0200:Failure Fixed Disk 132
  0230:System RAM Failed at offset: 132
  0232:Extended RAM Failed at address line: 133
  0250:System battery is dead - Replace and run SETUP 135
  0260:System timer error 135
  0271:Check date and time settings 135
  02A2:BMC System Error Log (SEL) Full 136
  02C3:No valid Boot Loader in System Flash - Fatal 138
  Fatal Error:No DIMM detected and system can not continue boot! 138
  Fatal Error! All channels are disabled! 138
  Fatal Error! UDIMM in 3rd slot is not supported! 139
  No Response to Controller FRU ID Read Request via IPMI 137
  SP FRU Entry is Blank or Checksum Error 136
  0231:Shadow RAM Failed at offset: 133
  0251:System CMOS checksum bad 135
  0280:Previous boot incomplete - Default configuration used 136
  02A1:SP Not Found 136
  02C2:No valid Boot Loader in System Flash - Non Fatal 137
  BIOS detected pattern write/read mismatch in DIMM slot: 134
  BIOS detected uncorrectable ECC error in DIMM slot: 133
  BIOS detected unknown errors in DIMM slot 134
  BIOS detected unknown errors in DIMM slot: 133
  Fatal Error! RDIMMs and UDIMMs are mixed! 139
  No message on the console 133
  No Response to Midplane FRU ID Read Request via IPMI 137

PSU LEDs

  22xx systems 43
  30xx systems 47
  31xx systems 52
  32xx systems 59
  60xx systems 64
  62xx systems 72
  FAS20xx systems 35

  SA200 systems 35
  SA300 systems 47
  SA320 systems 59
  SA600 systems 64
  SA620 systems 72

R

RLM
  AutoSupport e-mail contents 276
  types of messages 275
  when AutoSupport messages are sent 275
  when RLM EMS messages are sent 276

RLM-generated messages
  Heartbeat loss warning 276
  Reboot (power loss) critical 277
  Reboot (watchdog reset) warning 277
  Reboot warning 277
  RLM heartbeat loss 277
  RLM heartbeat stopped 278
  System boot failed (POST failed) 278
  User triggered (RLM test) 278
  User_triggered (system nmi) 278
  User_triggered (system power cycle) 278
  User_triggered (system power off) 279
  User_triggered (system power on) 279
  User_triggered (system reset) 279

S

SA200 systems
  controller module fault LED 33
  controller module LEDs 31
  Ethernet port LEDs 33
  fault LED 31
  Fibre Channel port LEDs 33
  LEDs on the back of the controller module 33
  LEDs on the front of the chassis 31
  NVMEM LED 33
  power LED 31
  PSU LEDs 35
  remote management port LEDs 33
  startup progress, viewing 118

SA300 systems
  activity LED 45
  controller front LEDs 45
  FC port LEDs 46
  GbE port LEDs 46
  LEDs on the back of controllers 46
  POST error messages 124
  power LED 45
  PSU LEDs 47
  RLM LEDs 46
  status LED 45

SA320 systems
  chassis fault LED 54
  controller activity LED 54
  controller fault LED 55
  controller-I/O expansion module configuration 54
  dual-controller configuration 54
  fan LED 59
  Fibre Channel port LEDs 55
  GbE port LEDs 55
  I/O expansion module fault LED 58
  internal FRU LEDs 60
  LEDs on the back of the controller 55
  LEDs on the back of the I/O expansion module 58
  LEDs on the front of the chassis 54
  management port LEDs 55, 58
  NVMEM LED 55
  POST error messages 132
  power LED 54
  PSU LEDs 59
  SAS port LEDs 55

SA600 systems
  activity LED 61
  fan LEDs 63
  Fibre Channel port LEDs 62
  GbE port LEDs 62
  LEDs on the back of the controller 62
  LEDs on the front of the controller 61
  POST error messages 124
  power LED 61
  PSU LEDs 64
  RLM LEDs 62
  status LED 61

SA620 systems
  10-GbE port LEDs 67
  8-Gb Fibre Channel port LEDs 67
  chassis fault LED 65
  console port 67
  controller activity LED 65
  controller fault LED 67
  controller-I/O expansion module configuration 65
  dual-controller configuration 65
  fan LED 72
  GbE port LEDs 67
  I/O expansion module fault LED 71
  internal FRU LEDs 73
  LEDs on the back of the controller 67

  LEDs on the back of the I/O expansion module 71
  POST error messages 132
  power LED 65
  private management port LEDs 67, 71
  PSU LEDs 72
  remote management port LEDs 67
  USB port 67

SAS EMS messages
  shm.threshold.spareBlocksConsumed 195
  shm.threshold.spareBlocksConsumedMax 196
  ds.sas.config.warning 180
  ds.sas.crc.err 180
  ds.sas.drivephy.disableErr 180
  ds.sas.element.fault 181
  ds.sas.element.xport.error 181
  ds.sas.hostphy.disableErr 182
  ds.sas.invalid.word 182
  ds.sas.loss.dword 182
  ds.sas.multPhys.disableErr 183
  ds.sas.phyRstProb 183
  ds.sas.running.disparity 183
  ds.sas.ses.disableErr 184
  ds.sas.xfer.element.fault 184
  ds.sas.xfer.export.error 184
  ds.sas.xfer.not.sent 185
  ds.sas.xfer.unknown.error 185
  sas.adapter.bad 186
  sas.adapter.bootarg.option 186
  sas.adapter.debug 186
  sas.adapter.exception 186
  sas.adapter.failed 187
  sas.adapter.firmware.download 187
  sas.adapter.firmware.fault 187
  sas.adapter.firmware.update.failed 187
  sas.adapter.not.ready 188
  sas.adapter.offline 188
  sas.adapter.offlining 188
  sas.adapter.online 189
  sas.adapter.online.failed 189
  sas.adapter.onlining 189
  sas.adapter.reset 189
  sas.adapter.unexpected.status 190
  sas.cable.error 190
  sas.cable.pulled 190
  sas.cable.pushed 190
  sas.config.mixed.detected 191
  sas.device.invalid.wwn 191
  sas.device.quiesce 191
  sas.device.resetting 192
  sas.device.timeout 192
  sas.initialization.failed 193
  sas.link.error 193
  sas.port.disabled 193
  sas.port.down 193
  sas.shelf.conflict 194
  sasmon.adapter.phy.disable 194
  sasmon.adapter.phy.event 195
  sasmon.disable.module 195

SAS HBAs
  dual-port, 3-Gb SAS HBA ports and cable 86
  quad-port, 3-Gb SAS HBA ports and cable 86

Service Processor
  See SP

SES EMS messages
  ses.shelf.unsupportAllowErr 211
  ses.access.noEnclServ 196
  ses.access.noMoreValidPaths 197
  ses.access.noShelfSES 197
  ses.access.sesUnavailable 198
  ses.badShareStorageConfigErr 198
  ses.bridge.fw.getFailWarn 199
  ses.bridge.fw.mmErr 199
  ses.channel.rescanInitiated 199
  ses.config.drivePopError 200
  ses.config.IllegalEsh270 200
  ses.config.shelfMixError 200
  ses.config.shelfPopError 200
  ses.disk.configOk 201
  ses.disk.illegalConfigWarn 201
  ses.disk.pctl.timeout 199, 201
  ses.download.powerCyclingChannel 201
  ses.download.shelfToReboot 202
  ses.download.suspendIOForPowerCycle 202
  ses.drive.PossShelfAddr 202
  ses.drive.shelfAddr.mm 203
  ses.exceptionShelfLog 203
  ses.extendedShelfLog 204
  ses.fw.emptyFile 204
  ses.fw.resourceNotAvailable 204
  ses.giveback.restartAfter 205
  ses.giveback.wait 205
  ses.psu.coolingReqError 205
  ses.psu.powerReqErrorr 205
  ses.remote.configPageError 206
  ses.remote.elemDescPageError 206
  ses.remote.faultLedError 206
  ses.remote.flashLedError 207
  ses.remote.shelfListError 207
  ses.remote.statPageError 207
  ses.shelf.changedID 207

  ses.shelf.ctrlFailErr 208
  ses.shelf.em.ctrlFailErr 208
  ses.shelf.IdBasedAddr 208
  ses.shelf.invalNum 209
  ses.shelf.mmErr 209
  ses.shelf.OSmmErr 210
  ses.shelf.powercycle.done 210
  ses.shelf.powercycle.start 210
  ses.shelf.sameNumReassign 210
  ses.shelf.unsupportedErr 211
  ses.startTempOwnership 211
  ses.status.ATFCXError 211
  ses.status.ATFCXInfo 212
  ses.status.currentError 212
  ses.status.currentInfo 212
  ses.status.currentWarning 213
  ses.status.displayError 213
  ses.status.displayInfo 213
  ses.status.displayWarning 214
  ses.status.driveError 214
  ses.status.driveOk 214
  ses.status.driveWarning 215
  ses.status.electronicsError 215
  ses.status.electronicsInfo 215
  ses.status.electronicsWarn 215
  ses.status.ESHPctlStatus 216
  ses.status.fanError 216
  ses.status.fanInfo 216
  ses.status.fanWarning 216
  ses.status.ModuleError 217
  ses.status.ModuleInfo 217
  ses.status.ModuleWarn 217
  ses.status.psError 218
  ses.status.psInfo 218
  ses.status.psWarning 218
  ses.status.temperatureError 219
  ses.status.temperatureInfo 219
  ses.status.temperatureWarning 220
  ses.status.upsError 220
  ses.status.upsInfo 220
  ses.status.volError 221
  ses.status.volWarning 221
  ses.system.em.mmErr 221
  ses.tempOwnershipDone 222
  sfu.adapterSuspendIO 222
  sfu.auto.update.off.impact 222
  sfu.ctrllerElmntsPerShelf 222
  sfu.downloadCtrllerBridge 223
  sfu.downloadError 223
  sfu.downloadingController 223
  sfu.downloadingCtrllerR1XX 223
  sfu.downloadStarted 224
  sfu.downloadSuccess 224
  sfu.downloadSummary 224
  sfu.downloadSummaryErrors 224
  sfu.FCDownloadFailed 224
  sfu.firmwareDownrev 225
  sfu.firmwareUpToDate 225
  sfu.partnerInaccessible 225
  sfu.partnerNotResponding 226
  sfu.partnerRefusedUpdate 226
  sfu.partnerUpdateComplete 226
  sfu.partnerUpdateTimeout 226
  sfu.rebootRequest 227
  sfu.rebootRequestFailure 227
  sfu.resumeDiskIO 227
  sfu.SASDownloadFailed 227
  sfu.statusCheckFailure 228
  sfu.suspendDiskIO 228
  sfu.suspendSES 228

SP
  AutoSupport e-mail contents 304
  EMS messages about the SP 306
  SP-generated AutoSupport messages 304
  when AutoSupport messages are sent 303
  when SP EMS messages are sent 304

SP messages
  types available for troubleshooting 303

SP-generated messages
  HEARTBEAT_LOSS 304
  REBOOT (abnormal) 305
  SYSTEM_BOOT_FAILED (post failed) 305
  USER_TRIGGERED (sp test) 305
  USER_TRIGGERED (system nmi) 305
  USER_TRIGGERED (system power cycle) 306
  USER_TRIGGERED (system power off) 306
  USER_TRIGGERED (system reset) 306

startup error messages
  boot messages 118
  POST messages 117
  types of 117

T

TOE NIC LEDs
  dual-port, 10GBase-CX4 103
  quad-port 104
  single-port 100

tools
  forms and use of diagnostic 26

troubleshooting
  types of SP messages available for 303

Troubleshooting
  How AutoSupport messages help with troubleshooting 26
  information sources 25
  Where LEDs appear 25
  where messages are displayed 25
  where to find platform documentation for 27

U

USB boot device
  EMS messages 262

USB EMS messages
  usb.adapter.debug 262
  usb.adapter.exception 262
  usb.adapter.failed 262
  usb.adapter.reset 263
  usb.device.failed 263
  usb.device.initialize.failed 263
  usb.device.maximum.connected 264
  usb.device.protocol.mismatch 264
  usb.device.removed 265
  usb.device.timeout 265
  usb.device.unsupported 265
  usb.device.unsupported.speed 266
  usb.external.device.not.used 266
  usb.externalHub.notSupported 266
  usb.port.error 266
  usb.port.reset 267
  usb.port.state.indeterminate 267
  usb.port.status.inconsistent 267
  usbmon.boot.device.failed 268
  usbmon.boot.device.pfa 268
  usbmon.disable.module 268
  usbmon.unable.to.monitor 269
