Thursday, September 10, 2015

Failed login control on RHEL6 with pam_tally2

The pam_tally2 module is available in RHEL and CentOS and can be used to protect your system against brute-force attacks.

Enabling pam_tally2

Edit /etc/pam.d/password-auth and add this line on top of the auth lines:
auth        required      pam_tally2.so onerr=fail deny=3 unlock_time=900

Then add the following line on top of the account lines:
account     required      pam_tally2.so

The parameters to this module are simple:
onerr=fail - if something weird happens (such as being unable to open the tally file), return the corresponding PAM error code. With onerr=succeed the module returns PAM_SUCCESS instead.

deny=3 - deny access if the tally for this user exceeds 3.

unlock_time=900 - allow access again 900 seconds (15 minutes) after the last failed attempt. If this option is used, the user is locked out for the specified amount of time after exceeding the maximum allowed attempts. If it is not set, an administrator has to unlock the account manually.

Check that the following options are set in /etc/ssh/sshd_config:
UsePAM yes
ChallengeResponseAuthentication yes
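
A quick way to verify these options is a small shell helper. This is only a sketch: the function name check_sshd_option is made up for illustration, and it assumes a plain "Keyword value" layout with no Match blocks. It relies on the documented sshd behaviour that the first value found for a keyword wins.

```shell
#!/bin/sh
# check_sshd_option FILE KEYWORD VALUE (hypothetical helper, not part of OpenSSH)
# Succeeds if the first non-commented occurrence of KEYWORD in FILE is VALUE.
# Keywords are compared case-insensitively, values exactly.
check_sshd_option() {
  awk -v key="$2" -v val="$3" '
    tolower($1) == tolower(key) { ok = ($2 == val); exit }
    END { exit !ok }
  ' "$1"
}

# Example usage on a real system (uncomment to run):
# check_sshd_option /etc/ssh/sshd_config UsePAM yes && echo "UsePAM is enabled"
# check_sshd_option /etc/ssh/sshd_config ChallengeResponseAuthentication yes \
#   && echo "ChallengeResponseAuthentication is enabled"
```

Commented-out lines such as "#UsePAM no" are ignored because their first field starts with "#".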

Testing pam_tally2

login as: pajarito
Using keyboard-interactive authentication.
Access denied
Using keyboard-interactive authentication.
Access denied
Using keyboard-interactive authentication.
Access denied
Using keyboard-interactive authentication.
Account locked due to 3 failed logins

As you can see, after the third failed attempt the user's account was locked.

Verifying and unlocking users

To check the current pam_tally2 statistics, run the pam_tally2 command:
# pam_tally2
Login           Failures Latest failure     From
jsmith              3    09/09/15 15:17:21

To unlock a user, use the "-r" (reset) flag:
# pam_tally2 -u pajarito -r
Login           Failures Latest failure     From
jsmith              3    09/09/15 15:20:49

Finally, if the output of pam_tally2 is empty, no failed logins are currently recorded, so no account is locked.
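
If you want to process the statistics in a script, the table can be filtered with awk. The following is a minimal sketch that assumes the column layout shown above (login in the first column, failure count in the second); the function name users_with_failures is hypothetical, and the threshold is whatever you pass in, not necessarily the point at which PAM locks the account.

```shell
#!/bin/sh
# users_with_failures MIN (hypothetical helper)
# Reads pam_tally2 output on stdin and prints every login whose failure
# count is at least MIN. The header line is skipped.
users_with_failures() {
  awk -v min="$1" 'NR > 1 && ($2 + 0) >= (min + 0) { print $1 }'
}

# Example usage on a real system (needs root):
# pam_tally2 | users_with_failures 3
```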

Monday, September 7, 2015

Automated partition creation with fdisk and sfdisk

To perform automated partition creation or modification you can pass all the commands via echo directly to fdisk:

echo -e "o\nn\np\n1\n\n\nw" | fdisk /dev/sdc

The commands are:
o - create a new empty DOS partition table
n - add a new partition
p - create a primary partition
1 - use partition number 1
(enter) - set first cylinder to the default value (1)
(enter) - set the last cylinder to the default value (end of the drive)
w - write table to disk and exit
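
Because the handling of "echo -e" escapes differs between shells, the same keystrokes can also be produced with printf. This is a sketch of an equivalent command stream, not a different procedure; the function name fdisk_commands is made up for illustration.

```shell
#!/bin/sh
# fdisk_commands (hypothetical helper): prints the same keystrokes as the
# echo example above, one per line. Empty arguments stand for a bare Enter.
fdisk_commands() {
  printf '%s\n' o n p 1 '' '' w
}

# Example usage (destructive: overwrites the partition table on /dev/sdc):
# fdisk_commands | fdisk /dev/sdc
```

Keeping the keystrokes as separate arguments makes it easier to see and change a single answer than editing one long escaped string.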

Quick way to clone partition table from one drive to another

You can use sfdisk to save the partition table from the already prepared drive and copy it to another.

As you can see below, the "-d" option dumps the partition table to a text file which can easily be altered if needed.

[root@centos ~]# sfdisk -d /dev/sdb > file

[root@centos ~]# cat file
# partition table of /dev/sdb
unit: sectors

/dev/sdb1 : start=       63, size=  1044162, Id=83
/dev/sdb2 : start=        0, size=        0, Id= 0
/dev/sdb3 : start=        0, size=        0, Id= 0
/dev/sdb4 : start=        0, size=        0, Id= 0

[root@centos ~]# sfdisk /dev/sdc < file
Checking that no-one is using this disk right now ...

Disk /dev/sdc: 65 cylinders, 255 heads, 63 sectors/track
 /dev/sdc: unrecognized partition table type
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdc1            63   1044224    1044162  83  Linux
/dev/sdc2             0         -          0   0  Empty
/dev/sdc3             0         -          0   0  Empty
/dev/sdc4             0         -          0   0  Empty
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
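
Since the dump is plain text, it can be altered in a pipeline before it is applied. The sketch below changes the partition type of one entry; it assumes the old dump format shown above (fields ending in "Id=NN"), and the function name set_part_type is hypothetical.

```shell
#!/bin/sh
# set_part_type DEVICE TYPE (hypothetical helper)
# Reads an "sfdisk -d" dump on stdin and rewrites the Id= field of the
# line that describes DEVICE. All other lines pass through unchanged.
set_part_type() {
  sed "s|^\($1 .*Id=\) *[0-9a-fA-F]*|\1$2|"
}

# Example usage: copy the layout of /dev/sdb to /dev/sdc, but mark the
# first partition as type 8e (Linux LVM) on the way:
# sfdisk -d /dev/sdb | set_part_type /dev/sdb1 8e | sfdisk /dev/sdc
```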

Friday, September 4, 2015

Simple process monitoring script with email alerting

If you don't have or don't want to install additional software for system/application monitoring (like Nagios, Zabbix, Munin, Big Brother, etc.) you may use this simple script:


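#!/bin/bash
PROGRAM="program"            # placeholder: name of the process to watch
MAIL="admin@example.com"     # placeholder: alert recipient
TMPFILE="/tmp/monitor.lock"  # placeholder: lock file path
DATE=$(date)
HOST=$(uname -n)

# grep always counts itself, so a result of 1 means the program is not running
OUTPUT=$(ps -ef | grep -c "$PROGRAM")
if [ "$OUTPUT" -eq 1 ]; then
  if [ -f "$TMPFILE" ]; then
    echo "Lock file exists"
  else
    echo "$DATE $HOST program \"$PROGRAM\" is not running" | mailx -s "\"$PROGRAM\" is not running on $HOST" "$MAIL"
    touch "$TMPFILE"
  fi
fi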


Put the name of the process that you expect to be running in the PROGRAM variable, and make sure that the name of the monitoring script itself does not contain the same string.
Basically, if the program is running, "ps -ef | grep program" returns 2 or more rows (one for the program itself and one for "grep program").
Otherwise it returns only one row ("grep program"), which triggers the alert, and you get an email.
By creating TMPFILE the script avoids bothering you again and again about the same issue.
Make sure to remove that file after you restart the monitored process.

Once the script is ready, save it and add it to cron, e.g.:
$ crontab -e
* * * * * /path/to/the/script > /dev/null 2>&1
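
The lock-file logic described above can be isolated into a small function. The sketch below is a variation, not the original script: it also removes the lock file once the process is seen again, so the manual cleanup step is no longer needed. The function name check_state and the three state names are made up for illustration.

```shell
#!/bin/sh
# check_state COUNT LOCKFILE (hypothetical helper)
# COUNT is the result of: ps -ef | grep -c "$PROGRAM"
# Prints "running", "alert" (first detection) or "suppressed" (already alerted).
check_state() {
  if [ "$1" -gt 1 ]; then
    rm -f "$2"          # process is back: clear the lock for the next outage
    echo running
  elif [ -f "$2" ]; then
    echo suppressed     # an alert was already sent for this outage
  else
    touch "$2"          # remember that we alerted
    echo alert
  fi
}

# Example usage inside the monitoring script:
# state=$(check_state "$OUTPUT" "$TMPFILE")
# [ "$state" = "alert" ] && echo "$PROGRAM is not running" | mailx -s "alert" "$MAIL"
```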

Friday, August 28, 2015

can't start kdump service on virtual machine

# service kdump start
No kdump initial ramdisk found.                            [WARNING]
Rebuilding /boot/initrd-2.6.32-504.23.4.el6.x86_64kdump.img
No module vmmemctl found for kernel 2.6.32-504.23.4.el6.x86_64, aborting.
Failed to run mkdumprd

# lsmod | grep vmmemctl
vmmemctl        13966 0

Fixing VMMEMCTL module issue:

You can disable this module by editing /etc/vmware-tools/locations and changing the answer VMMEMCTL_CONFED from yes to no.

More general approach:

A more general way to handle missing modules is to ignore the ones which cannot be found:
Edit /etc/sysconfig/kdump and set MKDUMPRD_ARGS="--allow-missing"
# service kdump start
WARNING: No module vmmemctl found for kernel 2.6.32-504.23.4.el6.x86_64, continuing anyway
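
The edit of /etc/sysconfig/kdump can be scripted as well. This is a minimal sketch; the function name set_mkdumprd_args is hypothetical, and note that it overwrites any arguments already present in MKDUMPRD_ARGS.

```shell
#!/bin/sh
# set_mkdumprd_args FILE (hypothetical helper)
# Sets MKDUMPRD_ARGS="--allow-missing" in FILE. An existing MKDUMPRD_ARGS
# line is replaced (its previous arguments are lost), otherwise the
# setting is appended.
set_mkdumprd_args() {
  file=$1
  if grep -q '^MKDUMPRD_ARGS=' "$file"; then
    sed 's/^MKDUMPRD_ARGS=.*/MKDUMPRD_ARGS="--allow-missing"/' "$file" > "$file.tmp" &&
      cat "$file.tmp" > "$file" &&   # rewrite in place to keep owner and mode
      rm -f "$file.tmp"
  else
    echo 'MKDUMPRD_ARGS="--allow-missing"' >> "$file"
  fi
}

# Example usage on a real system (as root, keep a backup first):
# cp /etc/sysconfig/kdump /etc/sysconfig/kdump.bak
# set_mkdumprd_args /etc/sysconfig/kdump && service kdump start
```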

Tuesday, August 25, 2015

Difference between du and df outputs

Sometimes people say they performed a cleanup but the filesystem is still (almost) full, and df gives different results than du:

$ df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        20G   19G     0 100% /tmp

$ du -sm /tmp
1       /tmp

To find the missing bit you need to check if the deleted files are still in use (in other words those files might be still open):
# lsof | grep deleted
mysqld    2456     mysql    5u   REG       0,19        0  2025554220 (deleted) /tmp/iboy1WVS
mysqld    2456     mysql    6u   REG       0,19        0  2025554284 (deleted) /tmp/ibwlUTGy
mysqld    2456     mysql    7u   REG       0,19        0  2025554322 (deleted) /tmp/ibecOavf

To reclaim the space you need to restart the process which is still using those files.
If you can't or don't want to kill running processes, you can try to truncate those "deleted" files:
cat /dev/null > /proc/2456/fd/5
cat /dev/null > /proc/2456/fd/6
cat /dev/null > /proc/2456/fd/7
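
When there are many deleted files, the /proc paths can be generated from the lsof output instead of being typed by hand. The sketch below assumes the column order shown above (COMMAND, PID, USER, FD, ...); newer lsof versions may print extra columns, so check the output before acting on it. The function name deleted_fd_paths is hypothetical.

```shell
#!/bin/sh
# deleted_fd_paths (hypothetical helper)
# Reads lsof output on stdin and prints /proc/PID/fd/N for every deleted
# file that is held open through a numeric file descriptor.
deleted_fd_paths() {
  awk '/\(deleted\)/ && $4 ~ /^[0-9]+/ {
    fd = $4
    sub(/[^0-9].*$/, "", fd)      # strip the access mode: "5u" -> "5"
    print "/proc/" $2 "/fd/" fd
  }'
}

# Example usage (review the list before truncating anything):
# lsof | deleted_fd_paths
# lsof | deleted_fd_paths | while read -r path; do : > "$path"; done
```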

Sunday, August 23, 2015

How to check if disk is failing or failed on Solaris

Failed disk:

1. It shows "disk not responding to selection" in /var/adm/messages
2. iostat -En shows only increased transport errors
3. It is not available under the format command ("drive not available")

Failing disk:

1. It shows read/write errors in /var/adm/messages
2. Soft/Hard error counters are increasing
3. Disk is available under format command

On an old Sun Fire V440 it looks like this:

/var/adm/messages contains:
Aug 21 13:48:57 servername scsi: [ID 107833 kern.warning] WARNING: /pci@1f,700000/scsi@2/sd@0,0 (sd1):
Aug 21 13:48:57 servername     disk not responding to selection

iostat -En shows only transport errors:
c1t0d0          Soft Errors: 0 Hard Errors: 0 Transport Errors: 1
Vendor: FUJITSU  Product: MAW3073NCSUN72G Revision: 1703 Serial No: XXX
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

Under format the disk is no longer available:
       0. c1t0d0 <drive not available>
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

metastat output:
# metastat d6
d6: Mirror
    Submirror 0: d16
      State: Okay       
    Submirror 1: d26
      State: Needs maintenance

d26: Submirror of d6
    State: Needs maintenance
    Invoke: metareplace d6 c1t0d0s2 <new device>
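
To spot disks with non-zero error counters quickly, the iostat -En output can be filtered with awk. This is a sketch that assumes the summary line layout shown above ("Soft Errors: N Hard Errors: N Transport Errors: N"); the function name disks_with_errors is hypothetical, and on Solaris you may need nawk instead of awk.

```shell
#!/bin/sh
# disks_with_errors (hypothetical helper)
# Reads "iostat -En" output on stdin and prints every device whose soft,
# hard or transport error counter is non-zero.
disks_with_errors() {
  awk '/Soft Errors:/ && ($4 + $7 + $10) > 0 {
    print $1, "soft=" $4, "hard=" $7, "transport=" $10
  }'
}

# Example usage on a Solaris host:
# iostat -En | disks_with_errors
```

Running it periodically and comparing the results shows which counters are increasing, which is the distinction between the failed and failing cases listed above.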

Thursday, August 20, 2015

Changing iLO settings from the OS using hponcfg

hponcfg is quite a useful tool if you're going to automate changing iLO settings on multiple machines.
It works the same way on various versions of iLO.

To dump the current settings into a file:

hponcfg -w current.xml 

Note: I've noticed that it won't dump the whole config but only the most important things (e.g. the secondary and tertiary DNS servers won't be included even if they're defined).

To apply new settings described in an XML file:

hponcfg -f update.xml

Note: you don't need to include the whole config; you can change a single parameter if needed.

If you're lucky and the iLO driver works properly, you should see something like this:

hponcfg -w current.xml
HP Lights-Out Online Configuration utility
Version 4.0.1 Date 09/24/2012 (c) Hewlett-Packard Company, 2012
Firmware Revision = 1.16 Device type = iLO 3 Driver name =
Management Processor configuration is successfully written to file

If hponcfg can't connect to iLO from the OS, you will see:

HPONCFG RILOE-II/iLO setup and configuration utility
Version 4.0.1
Date 09/24/2012 (c) Hewlett-Packard Company, 2012

ERROR: Unable to establish communication with iLO/RILOE-II.

Try restarting hp-snmp-agents; this usually resolves the problem.

/etc/init.d/hp-snmp-agents stop
/etc/init.d/hp-snmp-agents start
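
For the multi-machine case mentioned at the top, a simple loop over ssh is usually enough. The following is only a sketch under several assumptions: key-based ssh access as a user allowed to run hponcfg, hponcfg available in the remote PATH, and /tmp/update.xml as a scratch location. The function name push_ilo_config and the host names in the usage example are made up.

```shell
#!/bin/sh
# push_ilo_config XML HOST... (hypothetical helper)
# Copies XML to each host and applies it there with "hponcfg -f".
# Set RUN=echo to print the commands instead of executing them.
push_ilo_config() {
  xml=$1
  shift
  for host in "$@"; do
    ${RUN:-} scp "$xml" "$host:/tmp/update.xml" &&
      ${RUN:-} ssh "$host" hponcfg -f /tmp/update.xml ||
      echo "failed on $host" >&2
  done
}

# Example usage (dry run first, then for real):
# RUN=echo push_ilo_config update.xml server1 server2
# push_ilo_config update.xml server1 server2
```

The dry-run switch is there so that the exact commands can be reviewed before anything touches a management processor.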

Saturday, June 20, 2015

Solaris gzip and tar one-liners

As the Solaris tar does not handle compression, you can use the following one-liners to archive and compress files in one line:

1. Archive and compress folder:

tar cf - folder_name | gzip -c > filename.tar.gz

2. Decompress and unpack:

gzcat filename.tar.gz | tar -xpf -
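
Before relying on an archive it is worth checking that it unpacks cleanly. The sketch below does a round trip in a scratch directory; it uses "gzip -dc", which does the same job as gzcat, so that it also runs on systems without gzcat. All file and directory names are throwaway examples.

```shell
#!/bin/sh
# Round-trip check in a scratch directory; every path below is throwaway.
workdir=$(mktemp -d) || exit 1
mkdir -p "$workdir/src/folder_name" "$workdir/dst"
echo "hello" > "$workdir/src/folder_name/file.txt"

# 1. Archive and compress
(cd "$workdir/src" && tar cf - folder_name | gzip -c > "$workdir/filename.tar.gz")

# List the archive contents without unpacking
listing=$(gzip -dc "$workdir/filename.tar.gz" | tar tf -)
echo "$listing"

# 2. Decompress and unpack into a different directory
(cd "$workdir/dst" && gzip -dc "$workdir/filename.tar.gz" | tar xpf -)

# The restored copy should be identical to the original
cmp "$workdir/src/folder_name/file.txt" "$workdir/dst/folder_name/file.txt" &&
  echo "round trip OK"
```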