royong
11-06-2008, 09:53
Hi all
The following is a quick shell script that I obtained from the Internet and amended for use on my own managed servers. It monitors software RAID devices by comparing the current /proc/mdstat against a saved copy taken while the array was healthy. You'll need to set up a cron job to run the script as often as you want the RAID status checked; I run mine twice a day, as shown below. Requires sendmail (or equivalent) on the local machine.
Here goes ... let's set up the directory structure first
# cd /root
# mkdir myscripts
# cd myscripts
# mkdir raidcheck
Now for the executables
cat > /root/myscripts/raidcheck/raidgetstatus.sh <<"EOF"
#!/bin/sh
# get active raid status
cat /proc/mdstat
EOF
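If /proc/mdstat is missing (for example, the md driver is not loaded), the plain cat above prints nothing useful and the later comparison simply sees an empty status. A minimal sketch of a guarded version, assuming a hypothetical function name get_raid_status and an optional file argument that are not part of the original script:

```shell
#!/bin/sh
# Hypothetical variant of raidgetstatus.sh (names are illustrative only).
# The file to read is a parameter, defaulting to /proc/mdstat, so the
# guard can be tried out on any file.
get_raid_status() {
    statfile=${1:-/proc/mdstat}
    if [ ! -r "$statfile" ]; then
        # Report the problem on stderr and signal failure to the caller.
        echo "cannot read $statfile" >&2
        return 1
    fi
    cat "$statfile"
}

# Print the status, or a fallback message if the file is unreadable.
get_raid_status || echo "RAID status unavailable"
```

The caller can then tell "status unavailable" apart from "status changed" by checking the return code.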
Part 2
cat > /root/myscripts/raidcheck/raidcheck.sh <<"EOF"
#! /bin/sh
MAILTO="support@XXXX.com"
RAIDHOST=`/bin/hostname`
DATE=`/bin/date`
LOCATION=/root/myscripts/raidcheck
# you must have issued
# raidgetstatus.sh >raidgood
# in the local directory before this routine will work
# get the current raid status
RAIDSTATUS=`$LOCATION/raidgetstatus.sh`
# get the existing good raid status
RAIDREF=`cat $LOCATION/raidgood`
if [ "${RAIDSTATUS}" != "${RAIDREF}" ]; then
echo "" > $LOCATION/tmp
echo "An error has been detected in the RAID Device status of ${RAIDHOST}" >> $LOCATION/tmp
echo "Tested on ${DATE}" >> $LOCATION/tmp
echo "" >> $LOCATION/tmp
echo "A good RAID device status should be:" >> $LOCATION/tmp
echo "${RAIDREF}" >> $LOCATION/tmp
echo "" >> $LOCATION/tmp
echo "The current device status is:" >> $LOCATION/tmp
echo "${RAIDSTATUS}" >> $LOCATION/tmp
cat $LOCATION/tmp | mail -s "ALERT ALERT ALERT - RAID Device Failure on ${RAIDHOST}" $MAILTO
fi
if [ "${RAIDSTATUS}" = "${RAIDREF}" ]; then
echo "" > $LOCATION/tmp
echo "RAID Device on ${RAIDHOST} is functioning normally" >> $LOCATION/tmp
echo "Tested on ${DATE}" >> $LOCATION/tmp
echo "A good RAID device status should be:" >> $LOCATION/tmp
echo "${RAIDREF}" >> $LOCATION/tmp
echo "" >> $LOCATION/tmp
echo "The current device status is:" >> $LOCATION/tmp
echo "${RAIDSTATUS}" >> $LOCATION/tmp
cat $LOCATION/tmp | mail -s "RAID Device Functioning normally on ${RAIDHOST}" $MAILTO
fi
EOF
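The two if tests in raidcheck.sh check opposite conditions, so the same decision can be written as a single if/else. A rough sketch of just the comparison, assuming a hypothetical helper called raid_state (not in the original script) and made-up status strings standing in for the real /proc/mdstat contents:

```shell
#!/bin/sh
# Hypothetical helper (illustrative only): prints "changed" when the
# current status text differs from the reference text, "ok" otherwise.
raid_state() {
    current=$1
    reference=$2
    if [ "$current" != "$reference" ]; then
        echo "changed"
    else
        echo "ok"
    fi
}

# Made-up strings standing in for /proc/mdstat output.
raid_state "md0 : active raid1 sda1[0] sdb1[1]" "md0 : active raid1 sda1[0] sdb1[1]"   # prints "ok"
raid_state "md0 : active raid1 sda1[0]" "md0 : active raid1 sda1[0] sdb1[1]"           # prints "changed"
```

In the real script the two arguments would be "${RAIDSTATUS}" and "${RAIDREF}", and the result would pick which email to send.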
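Run the following once, while the array is healthy, to capture a GOOD reference status (the script is not executable yet at this point, so it is run through sh)
# cd /root/myscripts/raidcheck/
# sh ./raidgetstatus.sh > raidgood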
Make the scripts executable and add two daily cron jobs, to run at 0801hrs and 2001hrs respectively
# chmod +x /root/myscripts/raidcheck/*.sh
# echo "01 8 * * * root /root/myscripts/raidcheck/raidcheck.sh > /dev/null" >> /etc/crontab
# echo "01 20 * * * root /root/myscripts/raidcheck/raidcheck.sh > /dev/null" >> /etc/crontab
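One thing to watch with the echo commands above: running them a second time appends the same entries to /etc/crontab again. A small sketch of an append-only-if-missing helper, assuming a hypothetical function name add_line_once and using a scratch file from mktemp in place of the real /etc/crontab:

```shell
#!/bin/sh
# Hypothetical helper (illustrative only): append a line to a file
# only if that exact line is not already present.
add_line_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file rather than the real /etc/crontab.
CRONFILE=$(mktemp)
ENTRY="01 8 * * * root /root/myscripts/raidcheck/raidcheck.sh > /dev/null"
add_line_once "$ENTRY" "$CRONFILE"
add_line_once "$ENTRY" "$CRONFILE"   # second call adds nothing
cat "$CRONFILE"
```

Pointing the second argument at /etc/crontab would give the same behaviour on the real file.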
Now to test it out
# /root/myscripts/raidcheck/raidcheck.sh
You should receive an email reporting the status of the RAID devices. Going forward, the script will run at 0801hrs and 2001hrs and email a status report each time. If the current /proc/mdstat differs in any way from the "good" copy saved in /root/myscripts/raidcheck/raidgood, the email will be an alert asking you to check the RAID status; otherwise it confirms that the array is functioning normally.
Hope this helps .... :P