Showing posts with label hardware. Show all posts
Showing posts with label hardware. Show all posts

Sunday, August 23, 2015

How to check if disk is failing or failed on Solaris

How to check if disk is failing or failed on Solaris

Failed disk:

1. It shows "disk not responding to selection" in /var/adm/messages
2. It only shows increased transport errors
3. it's not visible under format command ("disk not available")

Failing disk:

1. It shows read/write errors in /var/adm/messages
2. Soft/Hard error counters are increasing
3. Disk is available under format command

On old Sun Fire V440 it looks like this: 


/var/adm/messages contain:
Aug 21 13:48:57 servername scsi: [ID 107833 kern.warning] WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):
Aug 21 13:48:57 servername     disk not responding to selection


iostat -En shows only transport errors:
c1t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 1
Vendor: FUJITSU  Product: MAW3073NCSUN72G Revision: 1703 Serial No: XXX
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0


Under format disk is no longer available:
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <drive not available>
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@1,0


metastat output:
# metastat d6
d6: Mirror
    Submirror 0: d16
      State: Okay       
    Submirror 1: d26
      State: Needs maintenance

...
d26: Submirror of d6
    State: Needs maintenance
    Invoke: metareplace d6 c1t0d0s2 <new device>


Thursday, August 20, 2015

changing ILO settings from OS using hponcfg

hponcfg is quite useful tool if you're going to automate changing ILO settings on multiple machines.
It works the same way on various versions of ILO.

To get the current settings into file:

hponcfg -w current.xml 

Note: I've noticed that it won't drop the "whole" config but only the most important things (i.e. secondary and tertiary dns server won't be included even if it's defined).

To set some new settings described in the xml file:

hponcfg -f update.xml

Note: you don't need to put whole config, you can change one parameter if needed.


If you're lucky and ILO driver works properly you should see something like this:

hponcfg -w current.xml
HP Lights-Out Online Configuration utility
Version 4.0.1 Date 09/24/2012 (c) Hewlett-Packard Company, 2012
Firmware Revision = 1.16 Device type = iLO 3 Driver name =
Management Processor configuration is successfully written to file


If you can't connect to ILO from OS:

HPONCFG RILOE-II/iLO setup and configuration utility
Version 4.0.1
Date 09/24/2012 (c) Hewlett-Packard Company, 2012

ERROR: Unable to establish communication with iLO/RILOE-II.


Try to restart hp-snmp-agents and usually it will resolve the problem.

/etc/init.d/hp-snmp-agents stop
/etc/init.d/hp-snmp-agents start