Clear Faults at the Solaris OS Level

In the previous post, I provided the steps to detect and clear faults in the ALOM environment. This post covers the same task at the OS level, where the steps are only slightly different.

1. The fmadm faulty command is used to display any faulty components in the system.

# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Sept 12 13:23:32 c49f99s3-1234-s4t6-8i76-w43b6732t6k  PCIEX-8000-3S  Critical

Fault class : fault.io.pciex.device-interr max 40%
              fault.io.pciex.bus-linkerr 20%
Affects     : dev:////pci@400/pci@0/pci@8/scsi@0
              dev:////pci@400/pci@0/pci@8
              faulted but still in service
FRU         : "MB" (hc://:product-id=SUNW,
              T5240:chassis-id=ABC123456:server-id=ITsiti:serial=
              0328MSL-09309L005K:part=540794001/motherboard=0) faulty

Description : A problem has been detected on one of the specified devices or on
              one of the specified connecting buses.
Refer to http://sun.com/msg/PCIEX-8000-3S for more information.

Response    : One or more device instances may be disabled

Impact      : Loss of services provided by the device instances associated with
              this fault

Action      : If a plug-in card is involved check for badly-seated cards or
              bent pins. Otherwise schedule a repair procedure to replace the
              affected device(s).  Use fmadm faulty to identify the devices or
              contact Sun for support.
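
If several faults are listed, it can help to pull out just the EVENT-ID values before moving on. The following is a minimal sketch, not an official tool: list_fault_ids is a hypothetical helper name, and it assumes the summary line has the same layout as the output above (month, day, time, EVENT-ID, MSG-ID, severity).

```shell
# Hypothetical helper: read "fmadm faulty" output on stdin and print
# the EVENT-ID of each fault summary line.
# Assumption: a summary line has exactly six whitespace-separated fields
# and the third field is a time stamp in HH:MM:SS form.
list_fault_ids() {
    awk 'NF == 6 && $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { print $4 }'
}

# Typical use on a live system (run as root):
#   fmadm faulty | list_fault_ids
```

With the output shown above, this would print the single EVENT-ID c49f99s3-1234-s4t6-8i76-w43b6732t6k.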

2. Once Fault Management has faulted a component in your system, you will want to repair it. After the affected component has been fixed or replaced, use the fmadm repair command to explicitly mark the fault as repaired. It accepts a UUID (the EVENT-ID shown by fmadm faulty), an FMRI, or a location as its argument.

# fmadm repair c49f99s3-1234-s4t6-8i76-w43b6732t6k
fmadm: recorded repair to c49f99s3-1234-s4t6-8i76-w43b6732t6k
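
To confirm that the fault is really cleared, run fmadm faulty again. On the systems I have used, it prints nothing when no faults remain. The repair and the re-check can be combined in a small sketch like the one below. The function name repair_and_verify and the FMADM variable are my own assumptions for illustration; they are not part of fmadm.

```shell
# FMADM can be overridden so the sketch can be tried without a real
# fault manager; by default it calls the real fmadm command.
FMADM=${FMADM:-fmadm}

# Hypothetical helper: mark each given event ID as repaired, then
# check that "fmadm faulty" no longer reports anything.
# Assumption: "fmadm faulty" produces empty output when no faults remain.
repair_and_verify() {
    for id in "$@"; do
        "$FMADM" repair "$id" || return 1
    done
    if [ -n "$("$FMADM" faulty)" ]; then
        echo "faults still present" >&2
        return 1
    fi
    echo "no faults reported"
}

# Typical use on a live system (run as root):
#   repair_and_verify c49f99s3-1234-s4t6-8i76-w43b6732t6k
```

If faults are still listed after the repair, the helper returns a non-zero status so the result can be checked in a script.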
