Nagios Core Configuration - host, hostgroup, hostdependency, hostescalation

If you followed the steps in the Nagios Core installation document on this website, the predefined configuration files with host-related objects are located at:

/opt/nagios-<VERSION>/etc/objects/localhost.cfg
/opt/nagios-<VERSION>/etc/objects/printer.cfg
/opt/nagios-<VERSION>/etc/objects/switch.cfg
/opt/nagios-<VERSION>/etc/objects/windows.cfg

Preface

Nagios Core needs to know basic information about each monitored node, such as its IP address or FQDN. The “host” object holds all node-related information so that it can be used elsewhere in Nagios Core (for example in a service definition).

“host” objects can be grouped into objects that cover a larger part of the monitored infrastructure. It is also possible to configure relations between monitored “host” objects. This lets Nagios Core correlate events and reduce the load on the support team.

hosts

Description

The “host” object is probably one of the main objects used in Nagios Core. It usually describes a monitored node and provides information related to that node. This information can be used by objects related to the “host” (such as “command” or “service” objects).

The definition of a “host” object also includes the “command” used for “up/down” monitoring of the host. Commonly this is an ICMP test of node availability, but it can be customized.

Location

Nagios Core already provides a predefined set of “host” objects that you can copy and modify for your node.
They are located at (if you followed the Nagios Core installation described on this website):
/opt/nagios-<VERSION>/etc/objects/localhost.cfg
/opt/nagios-<VERSION>/etc/objects/printer.cfg
/opt/nagios-<VERSION>/etc/objects/switch.cfg
/opt/nagios-<VERSION>/etc/objects/windows.cfg

Location Customization

If you prefer to store your customized configuration in your own configuration file, you can define the path to that file.

In this case Nagios Core needs to know where to look for the customized configuration file.

To do so, update the main Nagios Core configuration file, “nagios.cfg”.
It is possible to specify:

# Direct path to your customized configuration file
cfg_file=/<path>/<to>/<your>/<config>/<file>
# Path to the directory to search for configuration files
cfg_dir=/<path>/<to>/<your>/<config>/<dir>
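
As a concrete illustration, the two options could look like this in “nagios.cfg”. This is only a sketch: the paths below are hypothetical and must be replaced with your own. Comments are kept on their own lines, because in the main configuration file only lines that start with “#” are treated as comments.

```
# Hypothetical single file holding custom host definitions
cfg_file=/opt/nagios-custom/etc/objects/my_hosts.cfg

# Hypothetical directory; every file ending in .cfg below it is read
cfg_dir=/opt/nagios-custom/etc/conf.d
```

After changing “nagios.cfg”, it is a good idea to verify the configuration by running the nagios binary with the -v option against the main configuration file before restarting the service.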

Official documentation

In most of my documents I avoid copying the official documentation. On the other hand, I think it is really handy at this point, so I will not reinvent the wheel.

Description:

A host definition is used to define a physical server, workstation, device, etc. that resides on your network.

Definition Format:

define host{
            host_name                     host_name                        ; Mandatory parameter
            alias                         alias                            ; Mandatory parameter
            display_name                  display_name
            address                       address
            parents                       host_names
            hourly_value                  #
            hostgroups                    hostgroup_names
            check_command                 command_name
            initial_state                 [o,d,u]
            max_check_attempts            #                                ; Mandatory parameter
            check_interval                #
            retry_interval                #
            active_checks_enabled         [0/1]
            passive_checks_enabled        [0/1]
            check_period                  timeperiod_name                  ; Mandatory parameter
            obsess_over_host|obsess       [0/1]
            check_freshness               [0/1]
            freshness_threshold           #
            event_handler                 command_name
            event_handler_enabled         [0/1]
            low_flap_threshold            #
            high_flap_threshold           #
            flap_detection_enabled        [0/1]
            flap_detection_options        [o,d,u]
            process_perf_data             [0/1]
            retain_status_information     [0/1]
            retain_nonstatus_information  [0/1]
            contacts                      contacts                         ; Mandatory parameter
            contact_groups                contact_groups                   ; Mandatory parameter
            notification_interval         #
            first_notification_delay      #
            notification_period           timeperiod_name                  ; Mandatory parameter
            notification_options          [d,u,r,f,s]
            notifications_enabled         [0/1]
            stalking_options              [o,d,u]
            notes                         note_string
            notes_url                     url
            action_url                    url
            icon_image                    image_file
            icon_image_alt                alt_string
            vrml_image                    image_file
            statusmap_image               image_file
            2d_coords                     x_coord,y_coord
            3d_coords                     x_coord,y_coord,z_coord
           }

Directive Descriptions:

host_name: This directive is used to define a short name used to identify the host. It is used in host group and service definitions to reference this particular host. Hosts can have multiple services (which are monitored) associated with them. When used properly, the $HOSTNAME$ macro will contain this short name.
alias: This directive is used to define a longer name or description used to identify the host. It is provided in order to allow you to more easily identify a particular host. When used properly, the $HOSTALIAS$ macro will contain this alias/description.
address: This directive is used to define the address of the host. Normally, this is an IP address, although it could really be anything you want (so long as it can be used to check the status of the host). You can use a FQDN to identify the host instead of an IP address, but if DNS services are not available this could cause problems. When used properly, the $HOSTADDRESS$ macro will contain this address. Note: If you do not specify an address directive in a host definition, the name of the host will be used as its address. A word of caution about doing this, however - if DNS fails, most of your service checks will fail because the plugins will be unable to resolve the host name.
display_name: This directive is used to define an alternate name that should be displayed in the web interface for this host. If not specified, this defaults to the value you specify for the host_name directive. Note: The current CGIs do not use this option, although future versions of the web interface will.
parents: This directive is used to define a comma-delimited list of short names of the "parent" hosts for this particular host. Parent hosts are typically routers, switches, firewalls, etc. that lie between the monitoring host and a remote host. A router, switch, etc. which is closest to the remote host is considered to be that host's "parent". Read the "Determining Status and Reachability of Network Hosts" document located here for more information. If this host is on the same network segment as the host doing the monitoring (without any intermediate routers, etc.) the host is considered to be on the local network and will not have a parent host. Leave this value blank if the host does not have a parent host (i.e. it is on the same segment as the Nagios host). The order in which you specify parent hosts has no effect on how things are monitored.
hourly_value: This directive is used to represent the value of the host to your organization. The value is currently used when determining whether to send notifications to a contact. If the host's hourly value plus the hourly values of all of the host's services is greater than or equal to the contact's minimum value, the contact will be notified. For example, you could set this value and the minimum value of contacts such that a system administrator would be notified when a development server goes down, but the CIO would only be notified when the company's production ecommerce database server was down. The value could also be used as a sort criteria when generating reports or for calculating a good system administrator's bonus. The hourly value defaults to zero.
hostgroups: This directive is used to identify the short name(s) of the hostgroup(s) that the host belongs to. Multiple hostgroups should be separated by commas. This directive may be used as an alternative to (or in addition to) using the members directive in hostgroup definitions.
check_command: This directive is used to specify the short name of the command that should be used to check if the host is up or down. Typically, this command would try and ping the host to see if it is "alive". The command must return a status of OK (0) or Nagios will assume the host is down. If you leave this argument blank, the host will not be actively checked. Thus, Nagios will likely always assume the host is up (it may show up as being in a "PENDING" state in the web interface). This is useful if you are monitoring printers or other devices that are frequently turned off. The maximum amount of time that the host check command can run is controlled by the host_check_timeout option.
initial_state: By default Nagios will assume that all hosts are in UP states when it starts. You can override the initial state for a host by using this directive. Valid options are: o = UP, d = DOWN, and u = UNREACHABLE.
max_check_attempts: This directive is used to define the number of times that Nagios will retry the host check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the host check. Note: If you do not want to check the status of the host, you must still set this to a minimum value of 1. To bypass the host check, just leave the check_command option blank.
check_interval: This directive is used to define the number of "time units" between regularly scheduled checks of the host. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.
retry_interval: This directive is used to define the number of "time units" to wait before scheduling a re-check of the host. Hosts are rescheduled at the retry interval when they have changed to a non-UP state. Once the host has been retried max_check_attempts times without a change in its status, it will revert to being scheduled at its "normal" rate as defined by the check_interval value. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.
active_checks_enabled *: This directive is used to determine whether or not active checks (either regularly scheduled or on-demand) of this host are enabled. Values: 0 = disable active host checks, 1 = enable active host checks (default).
passive_checks_enabled *: This directive is used to determine whether or not passive checks are enabled for this host. Values: 0 = disable passive host checks, 1 = enable passive host checks (default).
check_period: This directive is used to specify the short name of the time period during which active checks of this host can be made.
obsess_over_host / obsess *: This directive determines whether or not checks for the host will be "obsessed" over using the ochp_command.
check_freshness *: This directive is used to determine whether or not freshness checks are enabled for this host. Values: 0 = disable freshness checks, 1 = enable freshness checks (default).
freshness_threshold: This directive is used to specify the freshness threshold (in seconds) for this host. If you set this directive to a value of 0, Nagios will determine a freshness threshold to use automatically.
event_handler: This directive is used to specify the short name of the command that should be run whenever a change in the state of the host is detected (i.e. whenever it goes down or recovers). Read the documentation on event handlers for a more detailed explanation of how to write scripts for handling events. The maximum amount of time that the event handler command can run is controlled by the event_handler_timeout option.
event_handler_enabled *: This directive is used to determine whether or not the event handler for this host is enabled. Values: 0 = disable host event handler, 1 = enable host event handler.
low_flap_threshold: This directive is used to specify the low state change threshold used in flap detection for this host. More information on flap detection can be found here. If you set this directive to a value of 0, the program-wide value specified by the low_host_flap_threshold directive will be used.
high_flap_threshold: This directive is used to specify the high state change threshold used in flap detection for this host. More information on flap detection can be found here. If you set this directive to a value of 0, the program-wide value specified by the high_host_flap_threshold directive will be used.
flap_detection_enabled *: This directive is used to determine whether or not flap detection is enabled for this host. More information on flap detection can be found here. Values: 0 = disable host flap detection, 1 = enable host flap detection.
flap_detection_options: This directive is used to determine what host states the flap detection logic will use for this host. Valid options are a combination of one or more of the following: o = UP states, d = DOWN states, u = UNREACHABLE states.
process_perf_data *: This directive is used to determine whether or not the processing of performance data is enabled for this host. Values: 0 = disable performance data processing, 1 = enable performance data processing.
retain_status_information: This directive is used to determine whether or not status-related information about the host is retained across program restarts. This is only useful if you have enabled state retention using the retain_state_information directive. Value: 0 = disable status information retention, 1 = enable status information retention.
retain_nonstatus_information: This directive is used to determine whether or not non-status information about the host is retained across program restarts. This is only useful if you have enabled state retention using the retain_state_information directive. Value: 0 = disable non-status information retention, 1 = enable non-status information retention.
contacts: This is a list of the short names of the contacts that should be notified whenever there are problems (or recoveries) with this host. Multiple contacts should be separated by commas. Useful if you want notifications to go to just a few people and don't want to configure contact groups. You must specify at least one contact or contact group in each host definition.
contact_groups: This is a list of the short names of the contact groups that should be notified whenever there are problems (or recoveries) with this host. Multiple contact groups should be separated by commas. You must specify at least one contact or contact group in each host definition.
notification_interval: This directive is used to define the number of "time units" to wait before re-notifying a contact that this host is still down or unreachable. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will not re-notify contacts about problems for this host - only one problem notification will be sent out.
first_notification_delay: This directive is used to define the number of "time units" to wait before sending out the first problem notification when this host enters a non-UP state. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will start sending out notifications immediately.
notification_period: This directive is used to specify the short name of the time period during which notifications of events for this host can be sent out to contacts. If a host goes down, becomes unreachable, or recovers during a time which is not covered by the time period, no notifications will be sent out.
notification_options: This directive is used to determine when notifications for the host should be sent out. Valid options are a combination of one or more of the following: d = send notifications on a DOWN state, u = send notifications on an UNREACHABLE state, r = send notifications on recoveries (UP state), f = send notifications when the host starts and stops flapping, and s = send notifications when scheduled downtime starts and ends. If you specify n (none) as an option, no host notifications will be sent out. If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states. Example: If you specify d,r in this field, notifications will only be sent out when the host goes DOWN and when it recovers from a DOWN state.
notifications_enabled *: This directive is used to determine whether or not notifications for this host are enabled. Values: 0 = disable host notifications, 1 = enable host notifications.
stalking_options: This directive determines which host states "stalking" is enabled for. Valid options are a combination of one or more of the following: o = stalk on UP states, d = stalk on DOWN states, and u = stalk on UNREACHABLE states. More information on state stalking can be found here.
notes: This directive is used to define an optional string of notes pertaining to the host. If you specify a note here, you will see it in the extended information CGI (when you are viewing information about the specified host).
notes_url: This variable is used to define an optional URL that can be used to provide more information about the host. If you specify a URL, you will see a red folder icon in the CGIs (when you are viewing host information) that links to the URL you specify here. Any valid URL can be used. If you plan on using relative paths, the base path will be the same as what is used to access the CGIs (i.e. /cgi-bin/nagios/). This can be very useful if you want to make detailed information on the host, emergency contact methods, etc. available to other support staff.
action_url: This directive is used to define an optional URL that can be used to provide more actions to be performed on the host. If you specify a URL, you will see a red "splat" icon in the CGIs (when you are viewing host information) that links to the URL you specify here. Any valid URL can be used. If you plan on using relative paths, the base path will be the same as what is used to access the CGIs (i.e. /cgi-bin/nagios/).
icon_image: This variable is used to define the name of a GIF, PNG, or JPG image that should be associated with this host. This image will be displayed in the various places in the CGIs. The image will look best if it is 40×40 pixels in size. Images for hosts are assumed to be in the logos/ subdirectory in your HTML images directory (i.e. /usr/local/nagios/share/images/logos).
icon_image_alt: This variable is used to define an optional string that is used in the ALT tag of the image specified by the <icon_image> argument.
vrml_image: This variable is used to define the name of a GIF, PNG, or JPG image that should be associated with this host. This image will be used as the texture map for the specified host in the statuswrl CGI. Unlike the image you use for the <icon_image> variable, this one should probably not have any transparency. If it does, the host object will look a bit weird. Images for hosts are assumed to be in the logos/ subdirectory in your HTML images directory (i.e. /usr/local/nagios/share/images/logos).
statusmap_image: This variable is used to define the name of an image that should be associated with this host in the statusmap CGI. You can specify a JPEG, PNG, or GIF image if you want, although I would strongly suggest using a GD2 format image, as other image formats will result in a lot of wasted CPU time when the statusmap image is generated. GD2 images can be created from PNG images by using the pngtogd2 utility supplied with Thomas Boutell's gd library. The GD2 images should be created in uncompressed format in order to minimize CPU load when the statusmap CGI is generating the network map image. The image will look best if it is 40×40 pixels in size. You can leave this option blank if you are not using the statusmap CGI. Images for hosts are assumed to be in the logos/ subdirectory in your HTML images directory (i.e. /usr/local/nagios/share/images/logos).
2d_coords: This variable is used to define coordinates to use when drawing the host in the statusmap CGI. Coordinates should be given in positive integers, as they correspond to physical pixels in the generated image. The origin for drawing (0,0) is in the upper left hand corner of the image and extends in the positive x direction (to the right) along the top of the image and in the positive y direction (down) along the left hand side of the image. For reference, the size of the icons drawn is usually about 40×40 pixels (text takes a little extra space). The coordinates you specify here are for the upper left hand corner of the host icon that is drawn. Note: Don't worry about what the maximum x and y coordinates that you can use are. The CGI will automatically calculate the maximum dimensions of the image it creates based on the largest x and y coordinates you specify.
3d_coords: This variable is used to define coordinates to use when drawing the host in the statuswrl CGI. Coordinates can be positive or negative real numbers. The origin for drawing is (0.0,0.0,0.0). For reference, the size of the host cubes drawn is 0.5 units on each side (text takes a little more space). The coordinates you specify here are used as the center of the host cube.
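
To tie the directives above together, here is a minimal sketch of a “host” definition. The host name and address are hypothetical. The command “check-host-alive”, the time period “24x7” and the contact group “admins” are assumed to exist already, as they do in the sample configuration files shipped with Nagios Core. Adjust all of them to your environment.

```
define host{
            host_name                     web01                 ; hypothetical short name
            alias                         Web server 01         ; hypothetical description
            address                       192.0.2.10            ; documentation-range IP, replace with the real one
            check_command                 check-host-alive      ; assumed to be defined (sample commands.cfg)
            max_check_attempts            3                     ; alert after 3 failed checks
            check_interval                5                     ; minutes, with the default interval_length of 60
            retry_interval                1
            check_period                  24x7                  ; assumed time period (sample timeperiods.cfg)
            contact_groups                admins                ; assumed contact group (sample contacts.cfg)
            notification_interval         60
            notification_period           24x7
            notification_options          d,u,r                 ; DOWN, UNREACHABLE and recovery
           }
```

Inline comments in object definitions start with a semicolon. Everything after the “;” on a line is ignored.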

hostgroup

Description

Sometimes it is really handy to group “host” objects into a larger group so that you can maintain them as one object. With the “hostgroup” object, Nagios Core makes it possible to create one object that includes several “host” objects or other “hostgroup” objects.

A good example is grouping all nodes by platform.

For example in this hierarchy:

All_Hosts
- - Servers
- - - - MS_Windows
- - - - Linux
- - - - UX
- - - - BSD
- - Network
- - - - Cisco
- - - - HP ProCurve
- - - - H3C
- - - - Ruby
- - Another
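
The upper part of this hierarchy could be sketched with nested “hostgroup” objects as follows. All group and host names are hypothetical, and the member hosts are assumed to be defined elsewhere.

```
define hostgroup{
                 hostgroup_name         linux
                 alias                  Linux servers
                 members                web01,db01              ; hypothetical hosts
                }

define hostgroup{
                 hostgroup_name         ms_windows
                 alias                  MS Windows servers
                 members                dc01                    ; hypothetical host
                }

define hostgroup{
                 hostgroup_name         servers
                 alias                  Servers
                 hostgroup_members      ms_windows,linux        ; includes the hosts of both sub groups
                }

define hostgroup{
                 hostgroup_name         all_hosts
                 alias                  All hosts
                 hostgroup_members      servers
                }
```

With this layout a new Linux server only has to be added to the “linux” group. It then becomes part of “servers” and “all_hosts” through the “hostgroup_members” directive.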

Another use case is to group all nodes by location, so that you can create several “hostgroups” that include other “hostgroup” or “host” objects.

For example in this hierarchy:

Customer (will include all “hostgroups” on “Region” level)
- - Region (will include all “hostgroups” on “Country” level)
- - - - Country (will include all “hostgroups” on “Town” level)
- - - - - - Town (will include all “hostgroups” on “Street” level)
- - - - - - - - Street (will include all “hostgroups” on “Building” level)
- - - - - - - - - - Building (will include all “hostgroups” on “Room” level)
- - - - - - - - - - - - Room (will include all “hostgroups” on “Rack” level)
- - - - - - - - - - - - - - Rack (will include all “hosts” mounted in this “Rack”)
- - - - - - - - - - - - - - - - Host

Location Customization

If you prefer to store your customized configuration in your own configuration file, you can define the path to that file.

In this case Nagios Core needs to know where to look for the customized configuration file.

To do so, update the main Nagios Core configuration file, “nagios.cfg”.
It is possible to specify:

# Direct path to your customized configuration file
cfg_file=/<path>/<to>/<your>/<config>/<file>
# Path to the directory to search for configuration files
cfg_dir=/<path>/<to>/<your>/<config>/<dir>

Official documentation

In most of my documents I avoid copying the official documentation. On the other hand, I think it is really handy at this point, so I will not reinvent the wheel.

Description:

A host group definition is used to group one or more hosts together for simplifying configuration with object tricks or display purposes in the CGIs.

Definition Format:

define hostgroup{
                 hostgroup_name         hostgroup_name          ; Mandatory parameter
                 alias                  alias                   ; Mandatory parameter
                 members                hosts
                 hostgroup_members      hostgroups
                 notes                  note_string
                 notes_url              url
                 action_url             url
                }

Directive Descriptions:

hostgroup_name: This directive is used to define a short name used to identify the host group.
alias: This directive is used to define a longer name or description used to identify the host group. It is provided in order to allow you to more easily identify a particular host group.
members: This is a list of the short names of hosts that should be included in this group. Multiple host names should be separated by commas. This directive may be used as an alternative to (or in addition to) the hostgroups directive in host definitions.
hostgroup_members: This optional directive can be used to include hosts from other "sub" host groups in this host group. Specify a comma-delimited list of short names of other host groups whose members should be included in this group.
notes: This directive is used to define an optional string of notes pertaining to the host group. If you specify a note here, you will see it in the extended information CGI (when you are viewing information about the specified host group).
notes_url: This variable is used to define an optional URL that can be used to provide more information about the host group. If you specify a URL, you will see a red folder icon in the CGIs (when you are viewing hostgroup information) that links to the URL you specify here. Any valid URL can be used. If you plan on using relative paths, the base path will be the same as what is used to access the CGIs (i.e. /cgi-bin/nagios/). This can be very useful if you want to make detailed information on the host group, emergency contact methods, etc. available to other support staff.
action_url: This directive is used to define an optional URL that can be used to provide more actions to be performed on the host group. If you specify a URL, you will see a red "splat" icon in the CGIs (when you are viewing hostgroup information) that links to the URL you specify here. Any valid URL can be used. If you plan on using relative paths, the base path will be the same as what is used to access the CGIs (i.e. /cgi-bin/nagios/).


hostdependency

Description

When Nagios Core monitors a larger infrastructure, root cause analysis is required when a failure is detected. This way we can reduce the load on the support team and focus on the main issues.

To configure topology-based correlation between “host” objects, it is possible to use the “hostdependency” object of Nagios Core.

In this case Nagios Core correlates events based on the configured relations. Instead of raising alarms for every “host” behind the affected node, Nagios Core can skip active checks and suppress notifications for the dependent “host” objects while the master “host” is in a failure state. Note that the “Unreachable” state itself is derived from the “parents” directive of the “host” object, not from “hostdependency”. To prevent notifications for a “host” in the “Unreachable” state, leave “u” out of its “notification_options”.
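
The “Unreachable” handling mentioned above can be sketched with the “parents” and “notification_options” directives documented in the “host” section. Both host names are hypothetical, and the remaining mandatory directives are left out to keep the example short.

```
define host{
            host_name                     router01              ; hypothetical router between Nagios and srv01
            ; remaining mandatory directives omitted
           }

define host{
            host_name                     srv01                 ; hypothetical server behind router01
            parents                       router01              ; failed checks of srv01 are reported as UNREACHABLE while router01 is DOWN
            notification_options          d,r                   ; no "u", so no notifications for UNREACHABLE
            ; remaining mandatory directives omitted
           }
```

With this setup only the outage of “router01” produces a problem notification, which points the support team directly at the probable root cause.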

Location Customization

If you prefer to store your customized configuration in your own configuration file, you can define the path to that file.

In this case Nagios Core needs to know where to look for the customized configuration file.

To do so, update the main Nagios Core configuration file, “nagios.cfg”.
It is possible to specify:

# Direct path to your customized configuration file
cfg_file=/<path>/<to>/<your>/<config>/<file>
# Path to the directory to search for configuration files
cfg_dir=/<path>/<to>/<your>/<config>/<dir>

Official documentation

In most of my documents I avoid copying the official documentation. On the other hand, I think it is really handy at this point, so I will not reinvent the wheel.

Description:

Host dependencies are an advanced feature of Nagios that allow you to suppress notifications for hosts based on the status of one or more other hosts. Host dependencies are optional and are mainly targeted at advanced users who have complicated monitoring setups. More information on how host dependencies work (read this!) can be found here.

Definition Format:

define hostdependency{
                      dependent_host_name             host_name               ; Mandatory parameter
                      dependent_hostgroup_name        hostgroup_name
                      host_name                       host_name               ; Mandatory parameter
                      hostgroup_name                  hostgroup_name
                      inherits_parent                 [0/1]
                      execution_failure_criteria      [o,d,u,p,n]
                      notification_failure_criteria   [o,d,u,p,n]
                      dependency_period               timeperiod_name
                     }

Directive Descriptions:

dependent_host_name: This directive is used to identify the short name(s) of the dependent host(s). Multiple hosts should be separated by commas.
dependent_hostgroup_name: This directive is used to identify the short name(s) of the dependent hostgroup(s). Multiple hostgroups should be separated by commas. The dependent_hostgroup_name may be used instead of, or in addition to, the dependent_host_name directive.
host_name: This directive is used to identify the short name(s) of the host(s) that is being depended upon (also referred to as the master host). Multiple hosts should be separated by commas.
hostgroup_name: This directive is used to identify the short name(s) of the hostgroup(s) that is being depended upon (also referred to as the master host). Multiple hostgroups should be separated by commas. The hostgroup_name may be used instead of, or in addition to, the host_name directive.
inherits_parent: This directive indicates whether or not the dependency inherits dependencies of the host that is being depended upon (also referred to as the master host). In other words, if the master host is dependent upon other hosts and any one of those dependencies fails, this dependency will also fail.
execution_failure_criteria: This directive is used to specify the criteria that determine when the dependent host should not be actively checked. If the master host is in one of the failure states we specify, the dependent host will not be actively checked. Valid options are a combination of one or more of the following (multiple options are separated with commas): o = fail on an UP state, d = fail on a DOWN state, u = fail on an UNREACHABLE state, and p = fail on a pending state (e.g. the host has not yet been checked). If you specify n (none) as an option, the execution dependency will never fail and the dependent host will always be actively checked (if other conditions allow for it to be). Example: If you specify u,d in this field, the dependent host will not be actively checked if the master host is in either an UNREACHABLE or DOWN state.
notification_failure_criteria: This directive is used to define the criteria that determine when notifications for the dependent host should not be sent out. If the master host is in one of the failure states we specify, notifications for the dependent host will not be sent to contacts. Valid options are a combination of one or more of the following: o = fail on an UP state, d = fail on a DOWN state, u = fail on an UNREACHABLE state, and p = fail on a pending state (e.g. the host has not yet been checked). If you specify n (none) as an option, the notification dependency will never fail and notifications for the dependent host will always be sent out. Example: If you specify d in this field, the notifications for the dependent host will not be sent out if the master host is in a DOWN state.
dependency_period: This directive is used to specify the short name of the time period during which this dependency is valid. If this directive is not specified, the dependency is considered to be valid during all times.
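
As a hedged illustration of the directives above, the following sketch suppresses notifications for two hosts while their router is DOWN or UNREACHABLE. The host names are taken from the Example section of this page. The "24x7" time period is assumed to exist, as it does in the sample configuration shipped with Nagios Core. Adjust the criteria to your own needs.

```
define hostdependency{
        host_name                       router-TO.xyz.org               ; Master host
        dependent_host_name             host1.xyz.org,host2.xyz.org     ; Dependent hosts
        inherits_parent                 1                               ; Also fail if a dependency of the master fails
        execution_failure_criteria      n                               ; Keep actively checking the dependent hosts
        notification_failure_criteria   d,u                             ; No notifications while the master is DOWN or UNREACHABLE
        dependency_period               24x7
        }
```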

hostescalation

Description

The main idea of the “hostescalation” object in Nagios Core is to configure an automated escalation process for detected issues.

For example, it can be used to escalate issues automatically along the hierarchy of the delivery model:
- 1st notification of the issue goes to the 1st level support team
- 10th notification of the issue goes to the 2nd level support team
- 20th notification of the issue goes to the 3rd level support team
- 30th notification of the issue goes to the Escalation Manager

I know that this idea looks nice, and it does make it possible to automate the business process.

Still, please do not use it in this way, for the following reasons:

- If the 1st level team has already started the investigation, they have probably already contacted the relevant vendors.
- After some time the same alarm is sent to the 2nd and later to the 3rd level support team.
- The 2nd and 3rd level teams then have to start the whole investigation from scratch, instead of continuing to work on the case with the information that the 1st level support team has already collected.

On the other hand, the “hostescalation” object gives us the possibility to automate the fixing of some issues. In this case we can run a script that tries to fix the issue, instead of sending a notification at the 1st detection of the issue. If the issue is still present during the next polling period, we can send a notification to the responsible team.

In this way it is possible to automate the handling of some host related issues. For example, if we are running a virtual server, we can try to start it again before we send the “Host Down” alarm.
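
One possible way to build this is sketched below. It is only a sketch under several assumptions and not the only approach: the first notification goes to a pseudo contact whose notification command is a repair script, and the escalation hands the second notification to the real team. The script path, the contact "auto-repair", the command "restart-virtual-server", the host "vm1.xyz.org" and its IP address are all hypothetical. The "24x7" time period, the "admins" contact group and the "check-host-alive" command are assumed to exist, as in the sample configuration.

```
# Hypothetical repair command. The script is an assumption, it is not shipped with Nagios Core.
define command{
        command_name                    restart-virtual-server
        command_line                    /usr/local/bin/restart_vm.sh $HOSTNAME$
        }

# Pseudo contact: "notifying" it runs the repair command instead of sending a message.
define contact{
        contact_name                    auto-repair
        alias                           Automatic repair
        host_notifications_enabled      1
        service_notifications_enabled   0                        ; Host problems only
        host_notification_period        24x7
        service_notification_period     24x7
        host_notification_options       d                        ; React on DOWN only
        service_notification_options    n
        host_notification_commands      restart-virtual-server
        service_notification_commands   restart-virtual-server   ; Directive is required, but unused here
        }

define host{
        host_name                       vm1.xyz.org              ; Hypothetical virtual server
        alias                           vm1
        address                         10.1.1.110               ; Hypothetical IP
        check_command                   check-host-alive
        check_period                    24x7
        max_check_attempts              3
        contacts                        auto-repair              ; 1st notification runs the repair command
        notification_interval           5                        ; Must be greater than 0, otherwise there is no 2nd notification
        notification_period             24x7
        notification_options            d
        }

# From the 2nd notification on, the responsible team is notified instead.
define hostescalation{
        host_name                       vm1.xyz.org
        contact_groups                  admins
        first_notification              2
        last_notification               0                        ; Keep using this escalation
        notification_interval           0                        ; Notify the team only once
        escalation_options              d
        }
```

Nagios Core also offers event handlers (the "event_handler" directive of the host object) for running a script on a state change, which may be the simpler choice when no escalation to another team is needed.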

Location Customization

If you prefer to store your customized configuration in your own configuration file, it is possible to define the path to that file.

In this case Nagios Core needs to know where to search for the customized configuration file.

Therefore the main Nagios Core configuration file, “nagios.cfg”, has to be updated. It is possible to specify:

# Direct path to your customized configuration file
cfg_file=/<path>/<to>/<your>/<config>/<file>
# Path to the directory where to search for the config files
cfg_dir=/<path>/<to>/<your>/<config>/<dir>
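
As an illustration only, the two directives could look like this in “nagios.cfg”. The file name "hostescalation.cfg" and the directory name "customers" are made up for this sketch, and "<VERSION>" is the same placeholder used elsewhere on this page.

```
# One explicitly listed object file (hypothetical file name)
cfg_file=/opt/nagios-<VERSION>/etc/objects/hostescalation.cfg

# A whole directory: Nagios Core reads every file with a .cfg extension in it, including subdirectories
cfg_dir=/opt/nagios-<VERSION>/etc/customers
```

Comments are kept on their own lines here, because the main configuration file treats only lines that start with "#" as comments.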

Official documentation

In most of my documents I avoid copying the official documentation. On the other hand, I think at this point it is really handy, as I will not reinvent the wheel.

Description:

Host escalations are completely optional and are used to escalate notifications for a particular host. More information on how notification escalations work can be found in the official Nagios Core documentation on notification escalations.

Definition Format:

define hostescalation{
                      host_name               host_name          ; Mandatory parameter
                      hostgroup_name          hostgroup_name
                      contacts                contacts           ; At least one of contacts or contact_groups is mandatory
                      contact_groups          contactgroup_name  ; At least one of contacts or contact_groups is mandatory
                      first_notification      #                  ; Mandatory parameter
                      last_notification       #                  ; Mandatory parameter
                      notification_interval   #                  ; Mandatory parameter
                      escalation_period       timeperiod_name
                      escalation_options      [d,u,r]
                     }

Directive Descriptions:

host_name: This directive is used to identify the short name of the host that the escalation should apply to.
hostgroup_name: This directive is used to identify the short name(s) of the hostgroup(s) that the escalation should apply to. Multiple hostgroups should be separated by commas. If this is used, the escalation will apply to all hosts that are members of the specified hostgroup(s).
first_notification: This directive is a number that identifies the first notification for which this escalation is effective. For instance, if you set this value to 3, this escalation will only be used if the host is down or unreachable long enough for a third notification to go out.
last_notification: This directive is a number that identifies the last notification for which this escalation is effective. For instance, if you set this value to 5, this escalation will not be used if more than five notifications are sent out for the host. Setting this value to 0 means to keep using this escalation entry forever (no matter how many notifications go out).
contacts: This is a list of the short names of the contacts that should be notified whenever there are problems (or recoveries) with this host. Multiple contacts should be separated by commas. Useful if you want notifications to go to just a few people and don't want to configure contact groups. You must specify at least one contact or contact group in each host escalation definition.
contact_groups: This directive is used to identify the short name of the contact group that should be notified when the host notification is escalated. Multiple contact groups should be separated by commas. You must specify at least one contact or contact group in each host escalation definition.
notification_interval: This directive is used to determine the interval at which notifications should be made while this escalation is valid. If you specify a value of 0 for the interval, Nagios will send the first notification when this escalation definition is valid, but will then prevent any more problem notifications from being sent out for the host. Notifications are only sent out when the host recovers. This is useful if you want to stop having notifications sent out after a certain amount of time. Note: If multiple escalation entries for a host overlap for one or more notification ranges, the smallest notification interval from all escalation entries is used.
escalation_period: This directive is used to specify the short name of the time period during which this escalation is valid. If this directive is not specified, the escalation is considered to be valid during all times.
escalation_options: This directive is used to define the criteria that determine when this host escalation is used. The escalation is used only if the host is in one of the states specified in this directive. If this directive is not specified in a host escalation, the escalation is considered to be valid during all host states. Valid options are a combination of one or more of the following: r = escalate on an UP (recovery) state, d = escalate on a DOWN state, and u = escalate on an UNREACHABLE state. Example: If you specify d in this field, the escalation will only be used if the host is in a DOWN state.
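
To make the directive descriptions above more concrete, here is a minimal sketch of an escalation that is only effective for the 3rd, 4th and 5th notification. The host "db1.xyz.org" and the contact group "network-managers" are hypothetical. "admins" and "24x7" are assumed to exist, as in the sample configuration. The sketch also assumes that the host itself has a notification_interval greater than 0, because otherwise a third notification is never sent.

```
define hostescalation{
        host_name               db1.xyz.org                  ; Hypothetical host
        contact_groups          admins,network-managers      ; Both groups are notified while the escalation is effective
        first_notification      3                            ; Effective from the 3rd notification
        last_notification       5                            ; Not used after the 5th notification
        notification_interval   30                           ; Interval between notifications while the escalation is valid
        escalation_period       24x7
        escalation_options      d,u                          ; Only for DOWN and UNREACHABLE states
        }
```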


Example

Host Correlation

In this example we configure several hosts located in different datacenters across several regions. One main datacenter is used for the services shared across all regions, and our Nagios Core server is located there as well.

In addition, each region and each city has one datacenter that interconnects the local branches and provides specific services used only at the local level.

This is only a light overview to present the configuration possibilities. For better understanding please see the map.

Map

                                                                                 Customer - XYZ
                                                                                          |router-xyz1 (10.0.0.2)
                                                                                          |---------------------- router-xyz (10.0.0.1 <- HSRP IP) | ----- NagiosCore (10.0.0.101)
                                                                                          |router-xyz2 (10.0.0.3)
                       |------------------------------------------------------------------|-------------------------------------------------------------------|
                       | router-APAC-xyz1(10.0.0.12)                                      |router-EMEA-xyz1 (10.0.0.22)                                       | router-AMS-xyz1 (10.0.0.32)
                       | router-APAC-xyz2(10.0.0.13)                                      |router-EMEA-xyz2 (10.0.0.23)                                       | router-AMS-xyz2 (10.0.0.33)
                 Region APAC                                                        Region EMEA                                                         Region AMS
                       | router-APAC-xyz1(10.1.0.2)                                       |router-EMEA-xyz1 (10.2.0.2)                                        | router-AMS-xyz1 (10.3.0.2)
                       | router-APAC-xyz2(10.1.0.3)                                       |router-EMEA-xyz2 (10.2.0.3)                                        | router-AMS-xyz2 (10.3.0.3)
         |--------------------------|   |----------------------------|   |----------------------------|
         |router-TO-xyz1(10.1.0.12) |router-KL-xyz1(10.1.0.22)             |router-LO-xyz1(10.2.0.12)   |router-PA-xyz1(10.2.0.22)             |router-NY-xyz1(10.3.0.12)   |router-TR-xyz1(10.3.0.22)
         |router-TO-xyz2(10.1.0.13) |router-KL-xyz2(10.1.0.23)             |router-LO-xyz2(10.2.0.13)   |router-PA-xyz2(10.2.0.23)             |router-NY-xyz2(10.3.0.13)   |router-TR-xyz2(10.3.0.23)
       Japan-Tokyo          Malaysia-Kuala_Lumpur                        UK-London                France-Paris                              USA-New_York              Canada-Toronto
         |router-TO-xyz1(10.1.1.2)  |router-KL-xyz1(10.1.2.2)              |router-LO-xyz1(10.2.1.2)    |router-PA-xyz1(10.2.2.2)              |router-NY-xyz1(10.3.1.2)    |router-TR-xyz1(10.3.2.2)
         |router-TO-xyz2(10.1.1.3)  |router-KL-xyz2(10.1.2.3)              |router-LO-xyz2(10.2.1.3)    |router-PA-xyz2(10.2.2.3)              |router-NY-xyz2(10.3.1.3)    |router-TR-xyz2(10.3.2.3)
         |router-TO (10.1.1.1 HSRP) |router-KL (10.1.2.1 HSRP)             |router-LO (10.2.1.1 HSRP)   |router-PA (10.2.2.1 HSRP)             |router-NY (10.3.1.1 HSRP)   |router-TR (10.3.2.1 HSRP)
      |---------------|   |---------------|   |---------------|   |---------------|   |---------------|   |---------------|
Host1(10.1.1.101)     |      Host3(10.1.2.101)   |                  Host5(10.2.1.101)   |        Host7(10.2.2.101)   |                  Host9(10.3.1.101)   |        Host11(10.3.2.101)  |
          Host2(10.1.1.102)           Host4(10.1.2.102)                      Host6(10.2.1.102)            Host8(10.2.2.102)                      Host10(10.3.1.102)            Host12(10.3.2.102)

Configuration

# Template that we will use for the shared host configuration
define host{
        name                            xyz-host                                              ; Template name
        check_period                    24x7                                                  ; Monitoring 24/7
        check_interval                  5                                                     ; Polling interval 5 min
        retry_interval                  1
        max_check_attempts              3                                                     ; Poll the device 3 times before changing to a HARD state
        check_command                   check-host-alive                                      ; Monitoring command
        notification_interval           0                                                     ; Send the notification only once
        notification_options            d,f                                                   ; When the device is DOWN or FLAPPING
        contact_groups                  admins                                                ; Who will be contacted
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        notification_period             24x7
        register                        0                                                     ; Template only, do not register it as a Nagios Core object
        }

######################## Routers                                                            # Configuration of "host" objects for our routers
## Main Customer XYZ DC
define host{
        host_name                       router-xyz.xyz.org                                  ; Router name
        alias                           router-xyz
        parents                         localhost                                           ; Parent "host"
        address                         10.0.0.1                                            ; IP
        use                             xyz-host                                            ; Template to be used
       }

define host{
        host_name                       router-xyz1.xyz.org                                 ; Router name
        alias                           router-xyz1
        parents                         router-xyz.xyz.org                                  ; Parent "host"
        address                         10.0.0.2                                            ; IP
        use                             xyz-host                                            ; Template to be used
       }

define host{
        host_name                       router-xyz2.xyz.org                                 ; Router name
        alias                           router-xyz2
        parents                         router-xyz.xyz.org                                  ; Parent "host"
        address                         10.0.0.3                                            ; IP
        use                             xyz-host                                            ; Template to be used
       }

#### APAC Main DC
    define host{
            host_name                       router-APAC-xyz1.xyz.org
            alias                           router-APAC-xyz1
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.12
            use                             xyz-host
           }

    define host{
            host_name                       router-APAC-xyz2.xyz.org
            alias                           router-APAC-xyz2
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.13
            use                             xyz-host
           }

###### Japan-Tokyo
      define host{
              host_name                       router-TO-xyz1.xyz.org
              alias                           router-TO-xyz1
              parents                         router-APAC-xyz1.xyz.org,router-APAC-xyz2.xyz.org
              address                         10.1.0.12
              use                             xyz-host
             }

      define host{
              host_name                       router-TO-xyz2.xyz.org
              alias                           router-TO-xyz2
              parents                         router-APAC-xyz1.xyz.org,router-APAC-xyz2.xyz.org
              address                         10.1.0.13
              use                             xyz-host
             }

      define host{
              host_name                       router-TO.xyz.org
              alias                           router-TO
              parents                         router-TO-xyz1.xyz.org,router-TO-xyz2.xyz.org
              address                         10.1.1.1
              use                             xyz-host
             }

###### Malaysia-Kuala_Lumpur
      define host{
              host_name                       router-KL-xyz1.xyz.org
              alias                           router-KL-xyz1
              parents                         router-APAC-xyz1.xyz.org,router-APAC-xyz2.xyz.org
              address                         10.1.0.22
              use                             xyz-host
             }

      define host{
              host_name                       router-KL-xyz2.xyz.org
              alias                           router-KL-xyz2
              parents                         router-APAC-xyz1.xyz.org,router-APAC-xyz2.xyz.org
              address                         10.1.0.23
              use                             xyz-host
             }

     define host{
              host_name                       router-KL.xyz.org
              alias                           router-KL
              parents                         router-KL-xyz1.xyz.org,router-KL-xyz2.xyz.org
              address                         10.1.2.1
              use                             xyz-host
             }

#### EMEA Main DC
    define host{
            host_name                       router-EMEA-xyz1.xyz.org
            alias                           router-EMEA-xyz1
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.22
            use                             xyz-host
           }

    define host{
            host_name                       router-EMEA-xyz2.xyz.org
            alias                           router-EMEA-xyz2
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.23
            use                             xyz-host
           }

###### UK-London
      define host{
              host_name                       router-LO-xyz1.xyz.org
              alias                           router-LO-xyz1
              parents                         router-EMEA-xyz1.xyz.org,router-EMEA-xyz2.xyz.org
              address                         10.2.0.12
              use                             xyz-host
             }

      define host{
              host_name                       router-LO-xyz2.xyz.org
              alias                           router-LO-xyz2
              parents                         router-EMEA-xyz1.xyz.org,router-EMEA-xyz2.xyz.org
              address                         10.2.0.13
              use                             xyz-host
             }

      define host{
              host_name                       router-LO.xyz.org
              alias                           router-LO
              parents                         router-LO-xyz1.xyz.org,router-LO-xyz2.xyz.org
              address                         10.2.1.1
              use                             xyz-host
             }

###### France-Paris
      define host{
              host_name                       router-PA-xyz1.xyz.org
              alias                           router-PA-xyz1
              parents                         router-EMEA-xyz1.xyz.org,router-EMEA-xyz2.xyz.org
              address                         10.2.0.22
              use                             xyz-host
             }

      define host{
              host_name                       router-PA-xyz2.xyz.org
              alias                           router-PA-xyz2
              parents                         router-EMEA-xyz1.xyz.org,router-EMEA-xyz2.xyz.org
              address                         10.2.0.23
              use                             xyz-host
             }

      define host{
              host_name                       router-PA.xyz.org
              alias                           router-PA
              parents                         router-PA-xyz1.xyz.org,router-PA-xyz2.xyz.org
              address                         10.2.2.1
              use                             xyz-host
             }

#### AMS  Main DC
    define host{
            host_name                       router-AMS-xyz1.xyz.org
            alias                           router-AMS-xyz1
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.32
            use                             xyz-host
           }

    define host{
            host_name                       router-AMS-xyz2.xyz.org
            alias                           router-AMS-xyz2
            parents                         router-xyz1.xyz.org,router-xyz2.xyz.org
            address                         10.0.0.33
            use                             xyz-host
           }

###### USA-New_York
      define host{
              host_name                       router-NY-xyz1.xyz.org
              alias                           router-NY-xyz1
              parents                         router-AMS-xyz1.xyz.org,router-AMS-xyz2.xyz.org
              address                         10.3.0.12
              use                             xyz-host
             }

      define host{
              host_name                       router-NY-xyz2.xyz.org
              alias                           router-NY-xyz2
              parents                         router-AMS-xyz1.xyz.org,router-AMS-xyz2.xyz.org
              address                         10.3.0.13
              use                             xyz-host
             }

      define host{
              host_name                       router-NY.xyz.org
              alias                           router-NY
              parents                         router-NY-xyz1.xyz.org,router-NY-xyz2.xyz.org
              address                         10.3.1.1
              use                             xyz-host
             }

###### Canada-Toronto
      define host{
              host_name                       router-TR-xyz1.xyz.org
              alias                           router-TR-xyz1
              parents                         router-AMS-xyz1.xyz.org,router-AMS-xyz2.xyz.org
              address                         10.3.0.22
              use                             xyz-host
             }

      define host{
              host_name                       router-TR-xyz2.xyz.org
              alias                           router-TR-xyz2
              parents                         router-AMS-xyz1.xyz.org,router-AMS-xyz2.xyz.org
              address                         10.3.0.23
              use                             xyz-host
             }

      define host{
              host_name                       router-TR.xyz.org
              alias                           router-TR
              parents                         router-TR-xyz1.xyz.org,router-TR-xyz2.xyz.org
              address                         10.3.2.1
              use                             xyz-host
             }

######################## HOSTS

define host{
        host_name                       host1.xyz.org
        alias                           host1
        parents                         router-TO.xyz.org
        address                         10.1.1.101
        use                             xyz-host
       }

define host{
        host_name                       host2.xyz.org
        alias                           host2
        address                         10.1.1.102
        parents                         router-TO.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host3.xyz.org
        alias                           host3
        address                         10.1.2.101
        parents                         router-KL.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host4.xyz.org
        alias                           host4
        address                         10.1.2.102
        parents                         router-KL.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host5.xyz.org
        alias                           host5
        address                         10.2.1.101
        parents                         router-LO.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host6.xyz.org
        alias                           host6
        address                         10.2.1.102
        parents                         router-LO.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host7.xyz.org
        alias                           host7
        address                         10.2.2.101
        parents                         router-PA.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host8.xyz.org
        alias                           host8
        address                         10.2.2.102
        parents                         router-PA.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host9.xyz.org
        alias                           host9
        address                         10.3.1.101
        parents                         router-NY.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host10.xyz.org
        alias                           host10
        address                         10.3.1.102
        parents                         router-NY.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host11.xyz.org
        alias                           host11
        address                         10.3.2.101
        parents                         router-TR.xyz.org
        use                             xyz-host
       }

define host{
        host_name                       host12.xyz.org
        alias                           host12
        address                         10.3.2.102
        parents                         router-TR.xyz.org
        use                             xyz-host
       }

########################  HostGroup
define hostgroup{
                 hostgroup_name         Customer_XYZ                                                       ; Name of the host group
                 alias                  Customer_XYZ                                                       ; Alias
                 members                router-xyz.xyz.org,router-xyz1.xyz.org,router-xyz2.xyz.org         ; Host members of the host group
                 hostgroup_members      Region_APAC,Region_EMEA,Region_AMS                                 ; Host group members of the host group
                }
## APAC
  define hostgroup{
                   hostgroup_name         Region_APAC
                   alias                  Region_APAC
                   members                router-APAC-xyz1.xyz.org,router-APAC-xyz2.xyz.org
                   hostgroup_members      Japan-Tokyo,Malaysia-Kuala_Lumpur
                  }
#### Japan-Tokyo
    define hostgroup{
                     hostgroup_name         Japan-Tokyo
                     alias                  Japan-Tokyo
                     members                host1.xyz.org,host2.xyz.org,router-TO-xyz1.xyz.org,router-TO-xyz2.xyz.org
                    }

#### Malaysia-Kuala_Lumpur
    define hostgroup{
                     hostgroup_name         Malaysia-Kuala_Lumpur
                     alias                  Malaysia-Kuala_Lumpur
                     members                host3.xyz.org,host4.xyz.org,router-KL-xyz1.xyz.org,router-KL-xyz2.xyz.org
                    }

## EMEA
  define hostgroup{
                   hostgroup_name         Region_EMEA
                   alias                  Region_EMEA
                   members                router-EMEA-xyz1.xyz.org,router-EMEA-xyz2.xyz.org
                   hostgroup_members      UK-London,France-Paris
                  }

#### UK-London
    define hostgroup{
                     hostgroup_name         UK-London
                     alias                  UK-London
                     members                host5.xyz.org,host6.xyz.org,router-LO-xyz1.xyz.org,router-LO-xyz2.xyz.org
                    }
#### France-Paris
    define hostgroup{
                     hostgroup_name         France-Paris
                     alias                  France-Paris
                     members                host7.xyz.org,host8.xyz.org,router-PA-xyz1.xyz.org,router-PA-xyz2.xyz.org
                    }

## AMS
  define hostgroup{
                   hostgroup_name         Region_AMS
                   alias                  Region_AMS
                   members                router-AMS-xyz1.xyz.org,router-AMS-xyz2.xyz.org
                   hostgroup_members      USA-New_York,Canada-Toronto
                  }

#### USA-New_York
    define hostgroup{
                     hostgroup_name         USA-New_York
                     alias                  USA-New_York
                     members                host9.xyz.org,host10.xyz.org,router-NY-xyz1.xyz.org,router-NY-xyz2.xyz.org
                    }

#### Canada-Toronto
    define hostgroup{
                     hostgroup_name         Canada-Toronto
                     alias                  Canada-Toronto
                     members                host11.xyz.org,host12.xyz.org,router-TR-xyz1.xyz.org,router-TR-xyz2.xyz.org
                    }

Now it is possible to reload the Nagios Core configuration:

[root@NagiosCore ~]# /etc/init.d/nagios reload

To see the updated configuration, open your web GUI and check either:
- Map
or
- Host Groups
