Sunday, August 23, 2015

How to check if disk is failing or failed on Solaris

Failed disk:

1. /var/adm/messages shows "disk not responding to selection"
2. iostat -En shows only Transport Errors increasing; Soft and Hard Errors stay at 0
3. format lists the disk as "<drive not available>"

Failing disk:

1. /var/adm/messages shows read/write errors
2. Soft/Hard error counters in iostat -En are increasing
3. The disk is still available under the format command

On an old Sun Fire V440, a failed disk looks like this:


/var/adm/messages contains:
Aug 21 13:48:57 servername scsi: [ID 107833 kern.warning] WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):
Aug 21 13:48:57 servername     disk not responding to selection
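
To pull these entries out of a long log, something like the following may help. It is only a sketch: the function name find_unresponsive_disks is made up for this post, it matches only the exact message text shown above, and it assumes the WARNING line with the device path comes right before the message, as in this excerpt. A POSIX-style awk is assumed (on Solaris, nawk or /usr/xpg4/bin/awk is probably the safer choice).

```shell
#!/bin/sh
# Hypothetical helper: print the device path and instance from the
# WARNING line that precedes each "disk not responding to selection"
# message. Reads the files given as arguments, or stdin if none.
find_unresponsive_disks() {
    awk '
    /WARNING:/ {
        path = $(NF - 1)                       # e.g. /pci@1f,700000/scsi@2/sd@0,0
        inst = substr($NF, 1, length($NF) - 1) # "(sd1):" -> "(sd1)"
    }
    /disk not responding to selection/ {
        if (path != "") print path, inst
    }' "$@"
}
```

Usage would be along the lines of `find_unresponsive_disks /var/adm/messages | sort | uniq -c` to see how often each device was reported.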


iostat -En shows only transport errors:
c1t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 1
Vendor: FUJITSU  Product: MAW3073NCSUN72G Revision: 1703 Serial No: XXX
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
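
The counter line can be checked with a small script as well. This is a rough sketch, not an official rule: the function name triage_iostat and the verdict labels are invented here, and it looks at a single snapshot, while the real signal is whether the counters keep increasing between runs. Treat the result as a hint and compare two runs taken some time apart.

```shell
#!/bin/sh
# Hypothetical helper: read `iostat -En` output on stdin and print
# "<device> <verdict>" for each device.
# Heuristic (an assumption taken from the failed/failing lists in this post):
#   soft or hard errors > 0    -> failing
#   only transport errors > 0  -> suspect-failed
#   all three counters 0       -> ok
triage_iostat() {
    awk '
    /Soft Errors:/ {
        dev = $1
        soft = 0; hard = 0; trans = 0
        # Each counter is the field right after "<Name> Errors:"
        for (i = 1; i <= NF; i++) {
            if ($i == "Soft" && $(i + 1) == "Errors:") soft = $(i + 2)
            if ($i == "Hard" && $(i + 1) == "Errors:") hard = $(i + 2)
            if ($i == "Transport" && $(i + 1) == "Errors:") trans = $(i + 2)
        }
        if (soft + hard > 0) verdict = "failing"
        else if (trans + 0 > 0) verdict = "suspect-failed"
        else verdict = "ok"
        print dev, verdict
    }'
}
```

Run it as `iostat -En | triage_iostat`.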


Under format, the disk is no longer available:
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <drive not available>
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@1,0
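
Since format is interactive, a listing like this is usually captured by feeding it empty input. A minimal sketch for picking out the unavailable drives follows; the function name list_unavailable_drives is made up, and it assumes the listing layout shown above (index, device name, then the "<drive not available>" marker on the same line).

```shell
#!/bin/sh
# Hypothetical helper: read a `format` disk listing on stdin and print
# the names of drives marked "<drive not available>".
list_unavailable_drives() {
    # Matching lines look like: "0. c1t0d0 <drive not available>",
    # so the device name is the second field.
    awk '/<drive not available>/ { print $2 }'
}
```

For example: `echo | format | list_unavailable_drives`.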


metastat shows the affected submirror in "Needs maintenance" state:
# metastat d6
d6: Mirror
    Submirror 0: d16
      State: Okay       
    Submirror 1: d26
      State: Needs maintenance

...
d26: Submirror of d6
    State: Needs maintenance
    Invoke: metareplace d6 c1t0d0s2 <new device>
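
With many metadevices it can be easier to list only the ones that need attention. The following is a sketch under the assumption that metastat output has the layout shown above (a "dN:" or "Submirror N: dN" line followed by its "State:" line); the function name list_needs_maintenance is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `metastat` output on stdin and print each
# metadevice whose State line says "Needs maintenance" (once per device).
list_needs_maintenance() {
    awk '
    /^d[0-9]+:/ {              # e.g. "d26: Submirror of d6"
        split($1, parts, ":")
        name = parts[1]
    }
    /Submirror [0-9]+:/ {      # e.g. "    Submirror 1: d26"
        name = $NF
    }
    /State: Needs maintenance/ {
        if (name != "" && seen[name]++ == 0) print name
    }'
}
```

Run it as `metastat | list_needs_maintenance`; for the output above it would print d26.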