Disk Error Alerts

From Amahi Wiki
Special thanks to NeverSimple for documenting this process.
 
  

Latest revision as of 23:42, 19 April 2015

WARNING
This is recommended only for advanced users; proceed with caution.


Finding out that a disk is bad only after it crashes can be disastrous, so an early warning of a failing disk is valuable. One way to get that warning is smartmontools (http://smartmontools.sourceforge.net/), a free software package that monitors S.M.A.R.T. attributes and runs hard drive self-tests. S.M.A.R.T. may give you enough warning to back up all your data safely before the drive dies. Nothing replaces regular backups, but some warning is far better than none.


First, Sendmail is off by default on a system with Amahi installed. Enable it by running the following as the root user (NOTE: skip this step if you use Postfix):

service sendmail start
chkconfig sendmail on

If you prefer to have alerts sent to an email address outside your HDA, try one of the following tutorials:

* Enable Outgoing Emails


smartmontools may already be installed on your system. If not, install it as the root user:

yum -y install smartmontools

smartmontools comes with two programs: smartctl, which is meant for interactive use, and smartd, a daemon that continuously monitors S.M.A.R.T. data.

You can do a quick test to see whether it recognizes your drives (replace /dev/sda with the drive or drives present on your system):

smartctl -i /dev/sda
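
If the server has several drives, the same check can be repeated in a loop. The following is a minimal sketch, not part of the original instructions: the function name probe_drives and the SMARTCTL override variable are hypothetical, and the device names are passed in by you.

```shell
#!/bin/sh
# Hypothetical helper: run "smartctl -i" on every device given as an argument.
# SMARTCTL may be overridden (an assumption made for testing); it defaults to smartctl.
SMARTCTL="${SMARTCTL:-smartctl}"

probe_drives() {
    for dev in "$@"; do
        echo "== $dev =="
        # -i prints identity information (model, serial number, SMART support)
        "$SMARTCTL" -i "$dev" || echo "could not query $dev" >&2
    done
}
```

Run it as root with the drives you actually have, for example: probe_drives /dev/sda /dev/sdb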

To set up smartd to monitor your system automatically, edit the file /etc/smartd.conf (on some systems it is /etc/smartmontools/smartd.conf) and look for a line that begins with DEVICESCAN. Comment it out by adding a '#' to the beginning of the line, so that it looks something like this:

#DEVICESCAN -H -m root -n standby,10,q
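
If you would rather not edit the file by hand, the same change can be scripted. This is a rough sketch under stated assumptions: disable_devicescan is a hypothetical helper name, and the config path is whatever you pass to it.

```shell
#!/bin/sh
# Hypothetical helper: comment out any active DEVICESCAN line in a smartd config file.
# Usage: disable_devicescan /etc/smartd.conf
disable_devicescan() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Prefix '#' to lines that start with DEVICESCAN. Lines that are already
    # commented start with '#', so they do not match and stay unchanged.
    if sed 's/^DEVICESCAN/#&/' "$conf" > "$tmp"; then
        # Write back with cat so the original file keeps its owner and permissions.
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

It is sensible to keep a copy of the original file first, for example with cp /etc/smartd.conf /etc/smartd.conf.bak, before running the helper as root.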


Add the following line to /etc/smartd.conf:

/dev/sda -n standby -a -I 194 -W 6,45,55 -R 5 -M daily -M test -m root

This is only an example line. Its parts mean the following:

'/dev/sda' is the drive you want to monitor.
'-n standby' does not wake the drive up to poll it for status if it is 'sleeping' or in 'standby'.
'-a' turns on the most common monitoring options. You probably want this.
'-I 194' ignores changes in the normalized value of attribute 194 (temperature), but...
'-W 6,45,55' tracks temperature changes >= 6 Celsius, reports temperatures >= 45 Celsius, and sends mail when the temperature is >= 55 Celsius.
'-R 5' reports changes in the raw value of attribute 5 (Reallocated Sector Count).
'-M daily' repeats warning emails daily. (The default is to send only one warning email for each type of disk problem.)
'-M test' sends a single test email immediately upon smartd startup. This lets you verify that email is delivered correctly.
'-m root' sends warning emails to the address root (you can replace that with any email address, provided your HDA can send mail to it).

You'll need a line like that for every drive in the server that you want to monitor. It is worth checking the smartd man page to see all the available options. There are a lot of them.
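
For a server with several drives, the per-drive lines can be generated rather than typed. A minimal sketch, assuming you want the same options as the example line for every drive; smartd_lines is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print one smartd.conf line per drive, reusing the
# options from the example line for /dev/sda.
smartd_lines() {
    opts='-n standby -a -I 194 -W 6,45,55 -R 5 -M daily -M test -m root'
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$opts"
    done
}
```

Review the output first (smartd_lines /dev/sda /dev/sdb), then append it to the config file as root, for example with smartd_lines /dev/sda /dev/sdb >> /etc/smartd.conf. Adjust the options per drive if their temperature limits differ.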

Start the daemon with:

service smartd start

To enable start on boot:

chkconfig smartd on

You can read local mail sent to root using Webmin.

NOTE: You will receive a test email each day or so, one for each drive you identify to be monitored.