Nagios setup and notification testing

  15 Aug 2014

Nagios, what is it exactly? It is a monitoring application for networks, computer systems and IT infrastructure. It is said to be the most widely used piece of software for monitoring servers.

Nagios watches host resources such as disk usage, memory usage and processor load, so you can spot problems early. It checks network services such as HTTP, SMTP, SSH, POP3 and ICMP. It can alert you or the people in your specified contacts (via email, SMS etc.), so you know immediately when service or host problems occur and when they get fixed. It also has a web frontend to share availability information with other people who care about your infrastructure. That means you can plan, manage, budget and upgrade your entire IT systems as well as minimize downtime and business losses.

It sounds awesome and I want to try it. How do I install it, then? I will walk through the installation on Ubuntu Server (the version doesn’t matter, I assume), and I will use aptitude as it seems a bit more user friendly than apt-get. (If you don’t have it, install it with: sudo apt-get install aptitude.) Let’s quickly install the nagios3 package with:

aptitude install nagios3

Then, you will be prompted for the password that is used as an administration login to its web interface.

nagios admin password prompted

Enter the password, and aptitude will also install some common, useful Nagios plugins.

Optionally, verify that the nagios user was created and that the process is running:

    $ cat /etc/passwd | grep nagios
    nagios:x:114:116::/var/lib/nagios:/bin/false

    $ ps -ef | grep nagios
    nagios   25908     1  0 17:28 ?        00:00:00 /usr/sbin/nagios3 -d /etc/nagios3/nagios.cfg

You should see output similar to the example above. Confirm that the main configuration is good by using the -v option:

nagios3 -v /etc/nagios3/nagios.cfg

Then go to http://(Host)/nagios3/. Log in with ‘nagiosadmin’ as the user and the password you provided during installation.

welcome page

The installation is done, and it has been pretty easy so far. But the installed Nagios only watches the host it is installed on, so next we will add some remote hosts and services to monitor. First, edit the contacts_nagios2.cfg file in the /etc/nagios3/conf.d directory and change the email root@localhost line to your own email address. This is for email alerts. You can also check whether the Nagios host can send email with:

/usr/bin/printf "%b" "Test, please ignore.\n notify-service-by-email alert. \n" | /usr/bin/mail -s "Testing Alert" "your-email@address"

Here your-email@address is a placeholder for the actual email address you want the message sent to.


Make sure you check your Spam folder too, not only the Inbox: with some email providers, Gmail for instance, the message usually lands in Spam. Once you mark it as not spam, later messages should arrive in the Inbox.
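The contact change described above can also be made without opening an editor. This is only a minimal sketch: set_contact_email is a hypothetical helper name, and it assumes the file still contains the default root@localhost address and that the new address contains no / or & characters.

```shell
#!/bin/sh
# Sketch: swap the default contact address in a Nagios contacts file.
# set_contact_email is a hypothetical helper, not part of Nagios.
set_contact_email() {
    cfg_file="$1"    # e.g. /etc/nagios3/conf.d/contacts_nagios2.cfg
    new_email="$2"   # assumed to contain no "/" or "&" characters

    # Keep a .bak copy of the file, then replace only root@localhost
    sed -i.bak "s/root@localhost/$new_email/" "$cfg_file"
}
```

It would be called as set_contact_email /etc/nagios3/conf.d/contacts_nagios2.cfg you@example.com, where you@example.com is a placeholder. The .bak copy is there so the original file can be restored if the result looks wrong.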

Now we are good to add remote hosts to monitor…

The remote host we will add is a staging server to monitor. Inside the /etc/nagios3/conf.d directory, there are *.cfg files that look like this:

    $/etc/nagios3/conf.d# tree
    ├── contacts_nagios2.cfg
    ├── extinfo_nagios2.cfg
    ├── generic-host_nagios2.cfg
    ├── generic-service_nagios2.cfg
    ├── host-gateway_nagios3.cfg
    ├── hostgroups_nagios2.cfg
    ├── localhost_nagios2.cfg
    ├── services_nagios2.cfg
    └── timeperiods_nagios2.cfg

    0 directories, 9 files

These files are how you instruct Nagios to do the things mentioned before (monitor, alert etc.). Let’s say we add one .cfg file for each remote host. So, just create the file /etc/nagios3/conf.d/staging.cfg with contents like:

    define host{
            use                     generic-host            ; Name of host template to use
            host_name               staging
            alias                   staging.serverdomain
            address                 <IP-address>
            }

where address is the server’s IP address, host_name is the name of your server and alias is the fully qualified domain name of the server.
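If you have several remote hosts, writing one file per host by hand gets repetitive. Here is a small sketch of generating the definition instead. make_host_cfg is a hypothetical helper name, and the host name, alias and IP address in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: print a Nagios host definition for one remote host.
# make_host_cfg is a hypothetical helper, not part of Nagios.
# Arguments: $1 = host_name, $2 = alias, $3 = address
make_host_cfg() {
    cat <<EOF
define host{
        use                     generic-host            ; Name of host template to use
        host_name               $1
        alias                   $2
        address                 $3
        }
EOF
}
```

For example, make_host_cfg live live.serverdomain 192.0.2.10 > /etc/nagios3/conf.d/live.cfg would write a definition for a second host. The function only prints to standard output, so you can review the result before redirecting it into conf.d.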

Next, we will look at which services should be checked. To do so, look at the file named services_nagios2.cfg in the same directory mentioned above. At this point, just use check_ssh and check_http, which are good services to monitor as we are just getting started.
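Before wiring a check into Nagios, it can help to run the plugin by hand against the remote host. The sketch below assumes the plugins live in /usr/lib/nagios/plugins, which is where the Ubuntu packages usually put them (adjust PLUGIN_DIR if yours differ). run_check and PLUGIN_DIR are hypothetical names. The exit codes 0, 1 and 2 are the standard plugin codes for OK, WARNING and CRITICAL.

```shell
#!/bin/sh
# Sketch: run one Nagios plugin by hand and name the state it reports.
# PLUGIN_DIR and run_check are hypothetical names; the default path is an
# assumption about where the plugin packages were installed.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/lib/nagios/plugins}"

run_check() {
    plugin="$1"   # e.g. check_ssh or check_http
    host="$2"     # host name or IP address to test

    status=0
    "$PLUGIN_DIR/$plugin" -H "$host" || status=$?

    # Plugins signal their state through the exit code
    case "$status" in
        0) echo "state: OK" ;;
        1) echo "state: WARNING" ;;
        2) echo "state: CRITICAL" ;;
        *) echo "state: UNKNOWN" ;;
    esac
}
```

Calling run_check check_ssh staging prints the plugin’s own output line followed by the state. If a check fails here, it will most likely fail inside Nagios too, so this is a quick way to catch firewall or DNS problems first.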

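Add all the hosts we need to monitor, that is staging and live, to the hostgroups those checks apply to (defined in hostgroups_nagios2.cfg). Every host listed as a member needs its own host definition, so live needs a file like staging.cfg too.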

    # A list of your web servers
    define hostgroup {
            hostgroup_name  http-servers
                    alias           HTTP servers
                    members         localhost, staging, live
            }

    # A list of your ssh-accessible servers
    define hostgroup {
            hostgroup_name  ssh-servers
                    alias           SSH servers
                    members         localhost, staging, live
            }

Restart Nagios:

/etc/init.d/nagios3 restart
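A typo in any .cfg file can stop Nagios from coming back up, so it is worth running the -v check from earlier before every restart. Here is a minimal sketch that combines the two steps. restart_if_valid and the NAGIOS_BIN, NAGIOS_CFG and NAGIOS_INIT variables are hypothetical names of mine, with defaults taken from the paths used in this post.

```shell
#!/bin/sh
# Sketch: restart Nagios only when the configuration check passes.
# The variable and function names are hypothetical; the defaults are
# the paths used elsewhere in this post.
NAGIOS_BIN="${NAGIOS_BIN:-nagios3}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios3}"

restart_if_valid() {
    # The -v check exits non-zero when the configuration has errors
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        "$NAGIOS_INIT" restart
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}
```

Run restart_if_valid as root after editing the files. When it refuses to restart, run nagios3 -v /etc/nagios3/nagios.cfg directly to see the actual error messages.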

If your configuration files are correct, the Hostgroup Overview page should now list the hosts you added under both hostgroups.

After a few minutes, every service should change from PENDING to OK, which means the HTTP and SSH services are running fine.
