New in PNP 0.6.x

PNP 0.6.x Preview

Work on the new version 0.6.x is in full progress.

Starting with version 0.6.x we switch from Subversion to Git. The source code is already available on SourceForge.

Functions implemented already

About PNP

System requirements

PNP requires valid performance data from Nagios plugins.

So what is this performance data?

The output of a nagios plugin up to nagios 2.x is limited to one line. When the plugin produces performance data, it is divided into two parts. The pipe symbol (“|”) is used as a delimiter.

Example check_icmp :

 OK - 127.0.0.1: rta 2.687ms, lost 0% | rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;;

resulting in the text on the left side of the pipe symbol

 OK - 127.0.0.1: rta 2.687ms, lost 0%

and the performance data

  rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;;

Performance data is designed for automatic processing. The format is specified within the Developer Guidelines (you'll find an excerpt here) but should be exemplified here nonetheless:

  rta=2.687ms;3000.000;5000.000;0;
   |    |  |    |         |     | |
   |----|--|----|---------|-----|-|----- * label 
        |--|----|---------|-----|-|----- * current value
           |----|---------|-----|-|----- unit ( UOM = UNIT of Measurement ) 
                |---------|-----|-|----- warning threshold
                          |-----|-|----- critical threshold 
                                |-|----- minimum value 
                                  |----- maximum value
                                  

Values marked with * are mandatory. All other values are optional.

Several data series are separated by blanks. The actual data must not contain any blanks. If the label contains blanks, it has to be surrounded by single quotes.
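
The format described above can be illustrated with a small parser. This is only a sketch and not part of PNP: the function names parse_perfdata and split_plugin_output are made up for this example, and thresholds are kept as plain strings rather than being interpreted.

```python
import re

# One series: label (optionally in single quotes), "=", then the data without blanks.
SERIES_RE = re.compile(r"(?:'([^']+)'|([^\s=']+))=(\S+)")
# The current value is a number, directly followed by an optional unit (UOM).
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def split_plugin_output(output):
    """Split one line of plugin output at the first pipe symbol."""
    text, _, perfdata = output.partition("|")
    return text.strip(), perfdata.strip()


def parse_perfdata(perfdata):
    """Return one dict per data series found in the performance data."""
    series = []
    for quoted, bare, data in SERIES_RE.findall(perfdata):
        fields = data.split(";")
        match = VALUE_RE.match(fields[0])
        if match is None:
            continue  # value is not numeric, skip this series
        fields += [""] * (5 - len(fields))  # optional fields may be missing
        series.append({
            "label": quoted or bare,
            "value": float(match.group(1)),
            "uom": match.group(2),
            "warn": fields[1],
            "crit": fields[2],
            "min": fields[3],
            "max": fields[4],
        })
    return series


if __name__ == "__main__":
    line = ("OK - 127.0.0.1: rta 2.687ms, lost 0% | "
            "rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;;")
    text, perfdata = split_plugin_output(line)
    for entry in parse_perfdata(perfdata):
        print(entry)
```

Missing optional fields simply come back as empty strings, which mirrors the rule that only label and current value are mandatory.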

Required Software

License

PNP is licensed under GPL 2

Download

Development of PNP is organized using Sourceforge.Net. PNP is registered under “PNP4nagios”.

The current stable version of 0.6.x can be found in the download area: Sourceforge Download

Starting with PNP 0.6.x the source code repository was switched from SVN to GIT.

The current development can be viewed anytime at http://pnp4nagios.git.sourceforge.net/. Clicking on PNP Devel version will download an archive containing the latest version.

Support

Before asking support questions, please make sure that you have checked the points described under verify your installation.

The developers and helpers are present on a separate board at http://www.nagios-portal.org and will be informed about new postings in the PNP-section. Postings in English will be answered as well.
After registering as a user please fill in the profile regarding operating system and PNP version used. Please mention if you used a package or compiled the sources. Please mark successfully solved threads by adding ”[solved]” to the title as it helps other users to find a solution for their problem.

The mailing lists on Sourceforge can be used to request support (and are limited to English):

pnp4nagios-users: users list for general questions regarding configuration. Please state your operating system and PNP version

pnp4nagios-devel: devel list for suggestions and error reports. Please state your operating system and PNP version

pnp4nagios-checkins: the checkin list automatically contains changes to the SVN repository

Storage

Performance data will be stored in Round Robin Databases using RRDtool. That means that after some time the oldest data will be dropped at the “end” and it will be replaced by new values “at the beginning”.

Different intervals provide different resolutions. With the defaults, data is stored at a resolution of one minute for the last two days, five minutes for ten days, 30 minutes for 90 days and 6 hours for four years. The increasing interval causes the data to be averaged, which leads to smaller maximum values. This is not an error in PNP.

Using this storage format the size of the files will stay the same over time. Per datasource you will need approx. 400 KB.
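
The figure of roughly 400 KB can be checked with a back-of-the-envelope calculation. The sketch below uses the four default resolutions from the text. Two values are assumptions made for this example and not taken from the text: 8 bytes per stored value, and three consolidation functions (AVERAGE, MIN and MAX) per resolution. File headers are ignored.

```python
# (step in seconds, retention in days), as listed in the text above
ARCHIVES = [
    (60, 2),              # 1 minute for 2 days
    (5 * 60, 10),         # 5 minutes for 10 days
    (30 * 60, 90),        # 30 minutes for 90 days
    (6 * 3600, 4 * 365),  # 6 hours for 4 years
]

BYTES_PER_VALUE = 8          # assumption: one double per stored value
CONSOLIDATION_FUNCTIONS = 3  # assumption: AVERAGE, MIN and MAX are kept


def rows(step_seconds, days):
    """Number of rows needed to cover `days` at the given step."""
    return days * 86400 // step_seconds


def estimate_bytes():
    """Rough size of one data source, without any file header."""
    total_rows = sum(rows(step, days) for step, days in ARCHIVES)
    return total_rows * BYTES_PER_VALUE * CONSOLIDATION_FUNCTIONS


if __name__ == "__main__":
    print(f"about {estimate_bytes() / 1024:.0f} KB per data source")
```

Under these assumptions the estimate comes out a little below 400 KB, which fits the approximate value given above.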

Statistics and links to Sourceforge

The art of collecting data

PNP supports several modes to process performance data. The modes differ in complexity and the performance to be expected.

The following image shows the connections between Nagios, PNP and RRDtool

Nagios invokes a command for every host and every service whose performance data should be processed. Depending on the mode you choose the data will be passed to process_perfdata.pl or will be written to temporary files and processed at a later time. process_perfdata.pl writes the data to XML files and stores them in RRD files using RRDtool.

Before you choose a mode please read the documentation and decide which way will be the best for installation.

The modes in comparison

Synchronous Mode

The “synchronous mode” is the simplest and easiest to set up. Nagios will call the Perl script process_perfdata.pl for every service and host, respectively, to process the data. The synchronous mode works very well up to about 1,000 services in a 5 minute interval.

Bulk Mode

In bulk mode Nagios writes the necessary data to a temporary file. After expiration of a defined time the file will be processed in one piece and deleted afterwards.

The number of calls of process_perfdata.pl will be reduced to a fraction. Depending on the interval and the amount of collected data there will be far fewer system calls. In return, each run of process_perfdata.pl will take longer.

Note: When using this mode you should keep an eye on the runtime of process_perfdata.pl. While it is running to process data, Nagios will not execute any checks.

snippet of var/perfdata.log:

2007-10-18 12:05:01 [21138] 71 Lines processed
2007-10-18 12:05:01 [21138] .../spool/service-perfdata-1192701894-PID-21138 deleted
2007-10-18 12:05:01 [21138] PNP exiting (runtime 0.060969s) ...

71 lines were processed in 0.06 seconds. This corresponds to the data volume of about 2000 services with a processing interval of 10 seconds. It means Nagios was blocked for about 0.06 seconds.
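
To keep an eye on the runtime, the log can be scanned automatically. The following sketch relies only on the two line formats visible in the snippet above. The function name summarize_runs is made up for this example, and grouping by PID is a simplification because PIDs can be reused over time.

```python
import re

LINES_RE = re.compile(r"\[(\d+)\] (\d+) Lines processed")
RUNTIME_RE = re.compile(r"\[(\d+)\] PNP exiting \(runtime ([\d.]+)s\)")


def summarize_runs(log_lines):
    """Collect processed line count and runtime per process_perfdata.pl PID."""
    runs = {}
    for line in log_lines:
        match = LINES_RE.search(line)
        if match:
            runs.setdefault(match.group(1), {})["lines"] = int(match.group(2))
        match = RUNTIME_RE.search(line)
        if match:
            runs.setdefault(match.group(1), {})["runtime"] = float(match.group(2))
    return runs


if __name__ == "__main__":
    sample = [
        "2007-10-18 12:05:01 [21138] 71 Lines processed",
        "2007-10-18 12:05:01 [21138] PNP exiting (runtime 0.060969s) ...",
    ]
    for pid, run in summarize_runs(sample).items():
        print(pid, run)
```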

Bulk Mode with NPCD

From the point of view of Nagios this is the best way of processing because Nagios will not be blocked.

Nagios again uses a temporary file to store the data and executes a command after a certain time has expired. Instead of being processed immediately by process_perfdata.pl, the file is moved to a spool directory. Because moving a file within the same filesystem takes almost no time, Nagios can return to its crucial work immediately.

The NPCD daemon (Nagios Performance C Daemon) will monitor the directory for new files and will pass the names to process_perfdata.pl. Processing of performance data is decoupled completely from Nagios. NPCD itself is able to start multiple threads for processing the data.

Bulk Mode with npcdmod

This scenario includes npcdmod.o, an NEB-module. This module reduces the configuration of the “Bulk Mode with NPCD” to a mere two lines in nagios.cfg

This mode is similar to “Bulk Mode with NPCD” and offers exactly the same functionality and the same performance.

Gearman Mode

Since version 0.6.12 PNP4Nagios can be run as a Gearman worker. This makes large Nagios environments possible using mod_gearman. Nagios and PNP4Nagios can run on different machines.

You need a mod_gearman environment up and running as described by Sven Nierlein on http://labs.consol.de/lang/en/nagios/mod-gearman/.

The decision

Which mode you choose will depend on the size of your Nagios installation. You will find these terms throughout the documentation.

PNP 0.6.x Downloads

Current stable PNP Version

Changes can be tracked on pnp4nagios.git.sourceforge.net

The current Version is pnp4nagios-0.6.18.tar.gz

Latest Devel Version

pnp4nagios-head.tar.gz

This is always the latest Git HEAD version


Last Update: Sat Sep 1 04:05:01 CEST 2012

ChangeLog

pnp-0.6.19 ??/??/2012

pnp-0.6.18 06/28/2012

pnp-0.6.17 03/25/2012

pnp-0.6.16 11/21/2011

pnp-0.6.15 09/15/2011

pnp-0.6.14 08/05/2011

pnp-0.6.13 05/19/2011

pnp-0.6.12 04/22/2011

pnp-0.6.11 01/15/2011

pnp-0.6.10 12/15/2010

pnp-0.6.7 09/27/2010

pnp-0.6.6 08/07/2010

pnp-0.6.5 07/09/2010

pnp-0.6.4 06/03/2010

pnp-0.6.3 03/16/2010

pnp-0.6.2 12/23/2009

pnp-0.6.1 11/22/2009

pnp-0.6.0 10/30/2009

Upgrade to version 0.6.x

The web-frontend has been completely rewritten and is now based on the PHP MVC framework Kohana. This leads to changed dependencies which must be checked prior to installation.

Note: At first an upgrade is like a new installation. Afterwards some changes should be made which are described further down.

Without specifying any options during ./configure PNP 0.4.x was installed below an existing Nagios-Installation at /usr/local/nagios.

Without specifying any options during ./configure PNP 0.6.x will be installed in a separate directory at /usr/local/pnp4nagios, i.e. it should be viewed as an independent application.

Note: It is sufficient to copy the *.rrd files from the old to the new location. They contain the data. The *.xml files are recreated every time new performance data arrives as they contain meta information. The internal structure of the XML files has changed, so you wouldn't be able to use them either way.
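
Copying only the *.rrd files while keeping the per-host directory layout could look like the sketch below. The function name copy_rrd_files is made up for this example. Note that ownership and permissions are not adjusted here and still have to be checked by hand.

```python
import shutil
from pathlib import Path


def copy_rrd_files(old_dir, new_dir):
    """Copy every *.rrd file below old_dir to the same relative path in new_dir."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    copied = []
    for rrd in sorted(old_dir.rglob("*.rrd")):
        target = new_dir / rrd.relative_to(old_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(rrd, target)  # keeps timestamps, does not change ownership
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Paths as used by the default installations described in this chapter.
    for path in copy_rrd_files("/usr/local/nagios/share/perfdata",
                               "/usr/local/pnp4nagios/var/perfdata"):
        print("copied", path)
```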

Comparison of the structure

Summary of a PNP 0.4.14 installation

./configure
...
*** Configuration summary for pnp 0.4.14 05-02-2009 ***

  General Options:
  -------------------------         -------------------
  Nagios user/group:                nagios nagios
  Install directory:                /usr/local/nagios
  HTML Dir:                         /usr/local/nagios/share/pnp
  Config Dir:                       /usr/local/nagios/etc/pnp
  Location of rrdtool binary:       /usr/bin/rrdtool Version 1.3.1
  RRDs Perl Modules:                FOUND (Version 1.3001)
  RRD Files stored in:              /usr/local/nagios/share/perfdata
  process_perfdata.pl Logfile:      /usr/local/nagios/var/perfdata.log
  Perfdata files (NPCD) stored in:  /usr/local/nagios/var/spool/perfdata/

Summary of a PNP 0.6.0 installation

./configure
...
*** Configuration summary for pnp4nagios-0.6.0 07-30-2009 ***

  General Options:
  -------------------------         -------------------
  Nagios user/group:                nagios nagios
  Install directory:                /usr/local/pnp4nagios
  HTML Dir:                         /usr/local/pnp4nagios/share
  Config Dir:                       /usr/local/pnp4nagios/etc
  Location of rrdtool binary:       /usr/bin/rrdtool Version 1.3.1
  RRDs Perl Modules:                FOUND (Version 1.3001)
  RRD Files stored in:              /usr/local/pnp4nagios/var/perfdata
  process_perfdata.pl Logfile:      /usr/local/pnp4nagios/var/perfdata.log
  Perfdata files (NPCD) stored in:  /usr/local/pnp4nagios/var/spool

  Web Interface Options:
  -------------------------         -------------------
  HTML URL:                         http://localhost/pnp4nagios/
  Apache Config File:               /etc/apache2/conf.d/pnp4nagios.conf

Comparing the two summaries shows which parameters have to be changed and suggests the upgrade strategy.

Adjustments

The templates of the action_url definitions have changed. Instead of ”/nagios/pnp” the URL should be ”/pnp4nagios” and instead of “index.php” now “graph” will be used.

define host {
  name       host-pnp
  register   0
  action_url /pnp4nagios/graph?host=$HOSTNAME$
}

define service {
  name       srv-pnp
  register   0
  action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}

The definitions for the preview popup function are similar

define host {
   name       host-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=_HOST_
   register   0
}

define service {
   name       srv-pnp
   action_url /pnp4nagios/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
   register   0
}

Attention: It is not an error that the strings in front and after “class” contain only one quote.

Other than described in the 0.4.x documentation these templates can be used for Nagios 2.x and 3.x.

The variables in the files in the templates folder have to be initialised before first use. Example:

$lower = "";

Earlier you were able to append to variables which weren't initialised before first use. Example:

foreach ($DS as $i) {
    $def[1] .= "DEF:var$i=$rrdfile:$DS[$i]:AVERAGE " ;
}

Now you have to change that to

$def[1] = "";
foreach ($DS as $i) {
    $def[1] .= "DEF:var$i=$rrdfile:$DS[$i]:AVERAGE " ;
}

Constants in template files don't work anymore, so they have to be converted to variables.

define("_WARNRULE", '#FFFF00');

may be changed to

 $WARNRULE = '#FFFF00';

Please keep in mind that all occurrences have to be changed ;-).
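
For templates with many constants the conversion can be scripted. The following is only a rough sketch based on regular expressions, not a PHP parser. The function name constants_to_variables is made up for this example, a leading underscore is dropped as in the example above, and occurrences inside string literals are replaced as well. The result should therefore be reviewed by hand.

```python
import re

# Matches define("NAME", value); with single or double quotes around NAME.
DEFINE_RE = re.compile(r"""define\(\s*(["'])(\w+)\1\s*,\s*(.*?)\s*\)\s*;""")


def constants_to_variables(php_source):
    """Rewrite define() constants and their uses into PHP variables."""
    names = [match.group(2) for match in DEFINE_RE.finditer(php_source)]

    # Step 1: turn each define(...) into an assignment.
    converted = DEFINE_RE.sub(
        lambda m: f"${m.group(2).lstrip('_')} = {m.group(3)};", php_source)

    # Step 2: replace the remaining bare uses of each constant name.
    for name in names:
        pattern = rf"(?<![$\w]){re.escape(name)}\b"
        converted = re.sub(pattern, "$" + name.lstrip("_"), converted)
    return converted


if __name__ == "__main__":
    template = (
        "define(\"_WARNRULE\", '#FFFF00');\n"
        "$def[1] .= \"HRULE:80\" . _WARNRULE;\n"
    )
    print(constants_to_variables(template))
```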

Upgrade scenario using NPCD

  1. planning the new setup
  2. perform test installation and acquaint oneself with the new system
  3. create backup of the old installation
  4. install PNP 0.6.x at /usr/local/pnp4nagios
  5. make install-config
  6. make install-webconf
  7. reload Apache
  8. test Apache-config
    1. call of /pnp4nagios has to report an empty perfdata directory
  9. create /usr/local/pnp4nagios/etc/npcd.cfg from npcd.cfg-sample
    1. check paths and adapt changes from 0.4.x if necessary
  10. adjust all paths in nagios.cfg to the new PNP installation
  11. adjust all paths in the command definitions
  12. stop npcd using /etc/init.d/npcd stop
  13. make install-init installs the new init script for npcd
  14. /etc/init.d/nagios stop
  15. copy /usr/local/nagios/share/perfdata to /usr/local/pnp4nagios/var/perfdata. Attention: check the permissions
  16. /etc/init.d/npcd start
  17. /etc/init.d/nagios start

Installation

The installation of PNP will be described in more detail. It is expected that nagios was compiled from source and is located in /usr/local/nagios.
Attention: The description applies to the developer version PNP 0.6.0.
Please note that PNP has to be configured after the installation.

Make and more

The installation of PNP is controlled by makefiles. The system is analyzed after invocation of ./configure and the detected values are transferred to the makefiles.

Please unpack PNP as user root:

tar -xvzf pnp4nagios-HEAD.tar.gz
cd pnp4nagios

./configure is to be called from the directory pnp4nagios.

./configure

Note: Without specifying any options user and group will be “nagios”. If you have different values then please use the parameters ”--with-nagios-user” and ”--with-nagios-group”, respectively. Using Icinga the call might be

./configure --with-nagios-user=icinga --with-nagios-group=icinga

Some lines run across the screen. The output at the end is important.

*** Configuration summary for pnp4nagios-0.6.2 23-12-2009 ***

  General Options:
  -------------------------         -------------------
  Nagios user/group:                nagios nagios
  Install directory:                /usr/local/pnp4nagios
  HTML Dir:                         /usr/local/pnp4nagios/share
  Config Dir:                       /usr/local/pnp4nagios/etc
  Location of rrdtool binary:       /usr/bin/rrdtool Version 1.2.12
  RRDs Perl Modules:                FOUND (Version 1.2012)
  RRD Files stored in:              /usr/local/pnp4nagios/var/perfdata
  process_perfdata.pl Logfile:      /usr/local/pnp4nagios/var/perfdata.log
  Perfdata files (NPCD) stored in:  /usr/local/pnp4nagios/var/spool

  Web Interface Options:
  -------------------------         -------------------
  HTML URL:                         http://localhost/pnp4nagios/
  Apache Config File:               /etc/apache2/conf.d/pnp4nagios.conf


  Review the options above for accuracy.  If they look okay,
  type 'make all' to compile.

The paths shown should be checked. If the displayed values aren't correct you can change them calling ./configure with appropriate options.
Attention: “Location of rrdtool binary” means path including name of binary! If necessary it can be specified using the following syntax:

 ./configure --with-rrdtool=/usr/local/rrdtool-1.2.xx/bin/rrdtool
 ./configure --help 

shows the supported options.

Invoking

 make all

compiles the components like NPCD which are written in C

 make install

copies everything to the right places in the file system. The paths were already shown during ./configure.

After the installation of the program and HTML files you can copy a sample Apache configuration file to your web-server config directory

 make install-webconf

You can call

 make install-config

optionally. This way config files for process_perfdata.pl and npcd are copied to etc/pnp.

To install the NPCD Init script call

 make install-init

All these steps are combined in

 make fullinstall

Attention: After copying the configuration file for the web server you have to restart the web server (service httpd restart or /etc/init.d/apache2 restart, respectively).

Update

The update of a 0.6.x version works (nearly) the same way as an installation. Please note that you have to call ./configure with the same options you used during the first installation. Please check if you changed anything in the folder share/templates.dist. Own templates should be placed in share/templates to avoid being overwritten.
Attention: If you changed config.php then you should save this file before it is overwritten when you execute make install-config.

You can skip make install-webconf and make install-init because nothing changed between 0.6.x versions.

The components

After installation the components of PNP were copied to the appropriate places in the file system. These are

the PHP-Files for the web-frontend in

 /usr/local/pnp4nagios/share/pnp

the data collector process_perfdata.pl in

 /usr/local/pnp4nagios/libexec

sample config files with the suffix -sample in

 /usr/local/pnp4nagios/etc

the config file config.php for the web frontend in

 /usr/local/pnp4nagios/etc

Configuration

The configuration of the already mentioned modes of performance data processing will be described in more detail.

Synchronous Mode

The synchronous mode is the simplest way to integrate the data collector process_perfdata.pl into nagios. Every event will trigger an execution of process-service-perfdata.

Initially you have to enable processing of performance data in nagios.cfg. Please note that this directive might already exist in the config file. Default is “0”.

 process_performance_data=1

Data processing has to be disabled in the definition of every host or service whose performance data should NOT be processed.

define service {
   ...
   process_perf_data 0
   ...
}

Since Nagios 3.x it is possible to deactivate the export of environment variables (as part of optimizing the system for maximum performance). Unfortunately this directive has to be enabled to use the synchronous mode. So either you use the default value (which means that the export is enabled) or you define the variable in nagios.cfg

enable_environment_macros=1

Additionally the command to process performance data is to be specified in nagios.cfg

 service_perfdata_command=process-service-perfdata

Starting with Nagios 3.0 it may be useful to enable processing of performance data for hosts as well. Due to changed host check logic Nagios 3 now performs regularly scheduled host checks.

 host_perfdata_command=process-host-perfdata

Nagios has to be notified about the referenced commands as well. If you used the quickstart installation guides for Nagios you can modify the definitions in commands.cfg. You can see that calling process_perfdata.pl doesn't require any arguments apart from specifying the option -d (DATATYPE) if you want to process performance data resulting from host checks.

define command {
       command_name    process-service-perfdata
       command_line    /usr/bin/perl /usr/local/pnp4nagios/libexec/process_perfdata.pl
}

define command {
       command_name    process-host-perfdata
       command_line    /usr/bin/perl /usr/local/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}

Note: process_perfdata.pl cannot be started under control of ePN (embedded Perl Nagios). Therefore the script is explicitly called using /usr/bin/perl (or wherever your perl binary is located). If you use Nagios 3.x or do not use ePN there is no need to specify /usr/bin/perl.

Bulk Mode

Bulk mode is a bit more complicated than the synchronous mode but reduces the load on the nagios server significantly because the data collector process_perfdata.pl is not invoked for every service/host check.

In bulk mode Nagios writes the data to a temporary file in a defined format. This file is processed by process_perfdata.pl at certain intervals. Nagios takes care of starting and running it periodically.

Processing of performance data has to be enabled in nagios.cfg

 process_performance_data=1

Additionally some new directives are required

#
# service performance data
#
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file

#
# host performance data starting with Nagios 3.0
# 
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file

Attention: Please note that these template definitions differ from the ones delivered in nagios.cfg!
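
The templates above produce one line per check result, consisting of KEY::VALUE pairs. The sketch below shows how such a line can be taken apart. It assumes that Nagios writes each \t from the template as a real tab character. The function name parse_bulk_line and the sample values are made up for this example.

```python
def parse_bulk_line(line):
    """Split one line of a perfdata file into a dict of KEY -> value."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        # Split at the first "::" only, the value may contain any other text.
        key, separator, value = field.partition("::")
        if separator:
            record[key] = value
    return record


if __name__ == "__main__":
    # Hypothetical line following the service template shown above.
    sample = "\t".join([
        "DATATYPE::SERVICEPERFDATA",
        "TIMET::1192701894",
        "HOSTNAME::localhost",
        "SERVICEDESC::PING",
        "SERVICEPERFDATA::rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;;",
        "SERVICECHECKCOMMAND::check_icmp",
    ])
    for key, value in parse_bulk_line(sample).items():
        print(f"{key:22} {value}")
```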

The directives and their meaning:

The used commands have to be announced to Nagios. If you used the quickstart installation guides for Nagios you can modify the definitions in commands.cfg.

define command{
       command_name    process-service-perfdata-file
       command_line    /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/service-perfdata
}

define command{
       command_name    process-host-perfdata-file
       command_line    /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/host-perfdata
}

NOTE:

Because there is more data to process than in synchronous mode process_perfdata.pl will take longer to do this so you should check the TIMEOUT value in etc/process_perfdata.cfg and adjust it appropriately.

Bulk Mode with NPCD

The configuration is identical to the Bulk Mode except for the used command. Processing of performance data has to be enabled in nagios.cfg

 process_performance_data=1

Additionally some new directives are required

#
# service performance data
#
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file

#
# host performance data starting with Nagios 3.0
# 
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file

Attention: Please note that these template definitions differ from the ones delivered in nagios.cfg!

The directives and their meaning:

The used commands have to be announced to Nagios. If you used the quickstart installation guides for Nagios you can modify the definitions in commands.cfg.

define command{
       command_name    process-service-perfdata-file
       command_line    /bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata.$TIMET$
}

define command{
       command_name    process-host-perfdata-file
       command_line    /bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata.$TIMET$
}

Using these commands the file service-perfdata will be moved to var/spool/ after the interval specified in service_perfdata_file_processing_interval has passed. The Nagios macro $TIMET$ is appended to the filename to avoid overwriting of old files unintentionally. The macro $TIMET$ contains the current timestamp in time_t format (seconds since the UNIX epoch).
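
What the /bin/mv command does can be sketched in a few lines. This is only an illustration and not a replacement for the command definitions above. The function name move_to_spool is made up for this example, and the timestamp stands in for the $TIMET$ macro.

```python
import os
import time


def move_to_spool(perfdata_file, spool_dir, now=None):
    """Move the perfdata file into the spool directory with a timestamp suffix."""
    timestamp = int(time.time() if now is None else now)
    name = f"{os.path.basename(perfdata_file)}.{timestamp}"
    target = os.path.join(spool_dir, name)
    # Within one filesystem a rename is a cheap operation, so the caller
    # is not held up for long.
    os.rename(perfdata_file, target)
    return target


if __name__ == "__main__":
    print(move_to_spool("/usr/local/pnp4nagios/var/service-perfdata",
                        "/usr/local/pnp4nagios/var/spool"))
```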

In the directory /usr/local/pnp4nagios/var/spool/ files are gathered to be processed by NPCD.

NPCD monitors the spool directory and passes the file names to process_perfdata.pl. This way processing of performance data is completely decoupled from nagios.

Before starting NPCD you have to check the paths to the spool directory and to process_perfdata.pl specified in the config file npcd.cfg. The only thing that remains is to start NPCD.

 /usr/local/pnp4nagios/bin/npcd -d -f /usr/local/pnp4nagios/etc/npcd.cfg

The option -d starts NPCD as a daemon in the background.

Bulk Mode with NPCD and npcdmod

This mode uses the event broker module npcdmod.o. The flow of data is identical to “bulk mode with NPCD”. The internal perfdata routines of Nagios activated by the “*_perf_data_*” directives in nagios.cfg are *NOT* used anymore. The module npcdmod.o takes over the task of processing the data required by PNP.

Pro:

Adjustments in nagios.cfg:

process_performance_data=1
broker_module=/usr/local/pnp4nagios/lib/npcdmod.o config_file=/usr/local/pnp4nagios/etc/npcd.cfg

All other directives mentioned on this page must NOT be used.

Attention: If you have changed the value of event_broker_options from -1 to another value then please note that PNP needs bits 2 and 3 set (0b01100). Make sure that the resulting value has these bits set because otherwise there will be no performance data to process.
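
Whether a given value of event_broker_options has the required bits set can be verified with a bitwise AND. In this sketch the function name has_required_bits is made up for the example. Bits 2 and 3 correspond to the decimal values 4 and 8.

```python
REQUIRED_BITS = 0b01100  # bits 2 and 3, decimal 4 + 8 = 12


def has_required_bits(event_broker_options):
    """True if both bits needed by PNP are set in the given value."""
    return (event_broker_options & REQUIRED_BITS) == REQUIRED_BITS


if __name__ == "__main__":
    for value in (-1, 12, 4, 0):
        print(value, has_required_bits(value))
```

The value -1 has all bits set and therefore always passes the check.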

After restarting Nagios information regarding the start of the module will be logged.

Excerpt from nagios.log

[1277545053] npcdmod: Copyright (c) 2008-2009 Hendrik Baecker (andurin@process-zero.de) - http://www.pnp4nagios.org
[1277545053] npcdmod: /usr/local/pnp4nagios/etc/npcd.cfg initialized
[1277545053] npcdmod: spool_dir = '/usr/local/pnp4nagios/var/spool/'.
[1277545053] npcdmod: perfdata file '/usr/local/pnp4nagios/var/perfdata.dump'.
[1277545053] npcdmod: Ready to run to have some fun!
[1277545053] Event broker module '/usr/local/pnp4nagios/lib/npcdmod.o' initialized successfully.

Gearman Mode

Since version 0.6.12 PNP4Nagios can be run as a Gearman worker. This makes large Nagios environments possible using mod_gearman. Nagios and PNP4Nagios can run on different machines.

You need a mod_gearman environment up and running as described by Sven Nierlein on http://labs.consol.de/lang/en/nagios/mod-gearman/.

You'll find a section on gearman in etc/process_perfdata.cfg:

PREFORK = 1
GEARMAN_HOST = localhost:4730
REQUESTS_PER_CHILD = 10000
ENCRYPTION = 1
KEY = should_be_changed
#KEY_FILE = /usr/local/pnp4nagios/etc/secret.key

Using PREFORK = <n> you specify the number of child processes.

GEARMAN_HOST = <host>:<port> specifies host and port of the server running the gearman daemon providing the data.

REQUESTS_PER_CHILD = <n> enables you to define the number of requests processed per child process.

ENCRYPTION = <n> specifies whether to use encryption (“1”) or not. Encryption is activated by default, which should be changed only in special cases. You can either use KEY = <key phrase> to specify the key phrase directly or KEY_FILE = <key file> to specify the location of a file containing the key phrase. The start script etc/init.d/pnp_gearman_worker contains links to the Perl script process_perfdata.pl and the config file process_perfdata.cfg.
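
Before starting the worker, the Gearman settings can be given a quick sanity check. The sketch below reads lines of the form KEY = VALUE as shown in the excerpt above. The function names parse_cfg and gearman_warnings as well as the checks themselves are made up for this example and are not part of PNP.

```python
SAMPLE_KEY = "should_be_changed"  # placeholder value from the excerpt above


def parse_cfg(text):
    """Read KEY = VALUE lines, ignoring blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def gearman_warnings(settings):
    """Return a list of hints about suspicious Gearman settings."""
    warnings = []
    encrypted = settings.get("ENCRYPTION", "1") != "0"
    if encrypted and "KEY_FILE" not in settings:
        if settings.get("KEY", "") in ("", SAMPLE_KEY):
            warnings.append(
                "ENCRYPTION is on but KEY is unset or still the sample value")
    if ":" not in settings.get("GEARMAN_HOST", ""):
        warnings.append("GEARMAN_HOST should look like <host>:<port>")
    return warnings


if __name__ == "__main__":
    with open("/usr/local/pnp4nagios/etc/process_perfdata.cfg") as handle:
        for warning in gearman_warnings(parse_cfg(handle.read())):
            print("WARNING:", warning)
```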

After starting the daemon process using

 /etc/init.d/pnp_gearman_worker start

the performance data provided by the gearmand daemon on the Nagios server will be processed.

Checking the installation

If everything went well until now you can try to call PNP using your web browser. When using the installation with default values PNP should be called using http://<server name>/pnp4nagios/. The first time you will see a page “PNP4Nagios Environment Tests” which includes different checks of necessary components. Obviously all checks have to be passed successfully before you can proceed. Please follow the instructions given on that page.

If all tests have passed *successfully* the file pnp4nagios/share/install.php can be deleted or renamed. The web interface is not reachable until then.

Alternatively you can create a file called pnp4nagios/share/install.ignore which will prevent the call of the installer after further updates.

If you receive the message “PHP magic_quotes_gpc is deprecated” then please locate your php.ini and set the value to Off.

Called without any arguments PNP looks for RRD and XML files in pnp4nagios/var/perfdata and shows all graphs of the first host.

ATTENTION: Immediately after (re-)starting Nagios after you enabled the processing of performance data you will get error messages in your browser because performance data has to be collected and stored in RRD files. Depending on the check interval you are using you have to wait some time before you can view the first graphs.

Debug Logfile

Calling make install-config during installation will create a sample config file etc/process_perfdata.cfg-sample. The values in the sample file will correspond to the defaults used by process_perfdata.pl so normally you do not have a file called process_perfdata.cfg while running the procedure.
However you can influence the way process_perfdata.pl works by changing options which have to be specified in process_perfdata.cfg.

The most important options when getting started with PNP are LOG_LEVEL and LOG_FILE. We recommend setting the LOG_LEVEL value to “2” so you can track what process_perfdata.pl will do. Most likely we will ask for excerpts from perfdata.log if you open a support request on the mailing lists as well as the output of the verify_pnp_config script so please provide them ;-).

During normal operation the debug level should be set to 0 to avoid performance issues due to unnecessary entries in the log file.

Something went wrong

Some basic settings should be checked

1. Have any RRD and XML files been created? process_perfdata.pl will create a new directory under pnp/perfdata for every host. In this directory an RRD database and an XML file will be created for every service. The host data will be stored in _HOST_.xml and _HOST_.rrd respectively.
If graphing stops all of a sudden then open the appropriate XML file. There are two tags called <RC> and <TXT>. <RC> shows the return code of the RRDtool update and <TXT> a textual description.
Sometimes you have to specify additional options so that performance data is produced. In some cases a wrapper script might help.
However not all checks provide performance data. That applies - among others - to “check_ping” in contrast to “check_icmp” which does provide data (starting with Nagios plugin version 1.4.12 check_ping does provide performance data).
Using the web interface the detail information of hosts/services shows a field “Performance Data”. If it is empty there is no data available so no files are written to the appropriate directory and that is why PNP does not provide you with graphs!
The following image shows the information of a “PING” service. The output of the plugin is surrounded by a blue border, the performance data by a red one.
status information

2. Has nagios called process_perfdata.pl? In the config file for process_perfdata.pl (etc/process_perfdata.cfg) you can increase the debug level. Data processing will be logged in var/perfdata.log.

3. Graphs are shown without text? Have a look at the requirements.

4. Some graphs are shown, others report the error “parser error: Input is not proper UTF-8” or something similar. Please check if your data contains “special” characters not present in the ASCII set. Try to set XML_ENC in process_perfdata.cfg to ISO-8859-1 or something appropriate. Wait until the xml file is newly created and retry.

5. Using the npcdmod module the value of the nagios.cfg directive event_broker_options may have to be adapted if it was modified. You'll find some details here.

6. You can use the script verify_pnp_config.pl after installation to check your settings and if performance data is present.

7. Things look OK, but some files are being left in the spool directory (/usr/local/pnp4nagios/var/spool/<perfdata_filename>-PID-<process_perfdata_pid>). If process_perfdata.pl is not able to write to the destination directory (/usr/local/pnp4nagios/share/perfdata/<host>), it will stop and not remove the file. That will increase the size of the spool directory and slow down performance data processing. This problem is likely to occur if you have copied directories from a previous installation and/or manually created directories and left them with wrong permissions or wrong ownership.
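
The check described in item 1 can be scripted. The following Python lines are only a sketch and not part of PNP. They assume nothing more than what is stated above, namely that the XML file contains the tags <RC> and <TXT>. The function name rrd_update_status, the sample content including the surrounding <RRD> tag, and the reading of return code 0 as success are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def rrd_update_status(xml_text):
    """Return (return_code, text) taken from the <RC> and <TXT> tags."""
    root = ET.fromstring(xml_text)
    rc = root.find(".//RC")    # search at any depth below the root
    txt = root.find(".//TXT")
    if rc is None or txt is None:
        raise ValueError("no <RC>/<TXT> tags found")
    return int(rc.text), (txt.text or "").strip()


# Hypothetical, shortened content; real files contain many more tags.
sample = "<NAGIOS><RRD><RC>1</RC><TXT>illegal attempt to update</TXT></RRD></NAGIOS>"
rc, txt = rrd_update_status(sample)
# Assumption: a return code of 0 means the update succeeded.
print("update failed:" if rc != 0 else "update ok:", txt)
```

In practice you would read the text from the XML file of the affected service instead of using a literal string.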

verify_pnp_config

In case of problems there is a script called verify_pnp_config.pl located on http://verify.pnp4nagios.org. It enables you to check the configuration settings as well as the performance data of hosts or services. It can be used before and during runtime of PNP.

Download

wget http://verify.pnp4nagios.org/verify_pnp_config

Test

The verify script needs three start options (--mode, --config and --pnpcfg in the following call):

lenny:~# perl verify_pnp_config --mode npcdmod --config=/usr/local/nagios/etc/nagios.cfg --pnpcfg=/usr/local/pnp4nagios/etc
[INFO]  ========== Starting Environment Checks ============
[INFO]  My version is: verify_pnp_config-0.6.14-R.31
[INFO]  Reading /usr/local/nagios/etc/nagios.cfg
[OK  ]  Running product is 'nagios'
[OK  ]  object_cache_file is defined
[OK  ]  object_cache_file=/usr/local/nagios/var/objects.cache
[INFO]  Reading /usr/local/nagios/var/objects.cache
[OK  ]  resource_file is defined
[OK  ]  resource_file=/usr/local/nagios/etc/resource.cfg
[INFO]  Reading /usr/local/nagios/etc/resource.cfg
[INFO]  Reading /usr/local/pnp4nagios/etc/process_perfdata.cfg
[INFO]  Reading /usr/local/pnp4nagios/etc/pnp4nagios_release
[OK  ]  Found PNP4Nagios version "0.6.14"
[OK  ]  Effective User is 'nagios'
[OK  ]  User nagios exists with ID '1000'
[OK  ]  Effective group is 'nagios'
[OK  ]  Group nagios exists with ID '1000'
[INFO]  ========== Checking npcdmod Mode Config  ============
[OK  ]  process_performance_data is 1 compared with '/1/'
[OK  ]  event_broker_options is defined
[OK  ]  event_broker_options=-1
[OK  ]  event_broker_option bits 2 and 3 enabled (12)
[OK  ]  broker_module is defined
[OK  ]  broker_module=/usr/local/pnp4nagios/lib/npcdmod.o config_file=/usr/local/pnp4nagios/etc/npcd.cfg
[OK  ]  npcdmod.o config file is /usr/local/pnp4nagios/etc/npcd.cfg
[OK  ]  /usr/local/pnp4nagios/etc/npcd.cfg used by npcdmod.o is readable
[OK  ]  npcd daemon is running
[OK  ]  /usr/local/pnp4nagios/etc/npcd.cfg is used by npcd and readable
[OK  ]  npcd and npcdmod.o are using the same config file (/usr/local/pnp4nagios/etc/npcd.cfg)
[INFO]  Nagios config looks good so far
[INFO]  ========== Checking config values ============
[INFO]  Reading /usr/local/pnp4nagios/etc/npcd.cfg
[OK  ]  Script /usr/local/pnp4nagios/libexec/process_perfdata.pl is executable
[INFO]  ========== Starting global checks ============
[OK  ]  status_file is defined
[OK  ]  status_file=/dev/shm/status.dat
[INFO]  Reading /dev/shm/status.dat
[INFO]  ==== Starting rrdtool checks ====
[OK  ]  RRDTOOL is defined
[OK  ]  RRDTOOL=/usr/bin/rrdtool
[OK  ]  /usr/bin/rrdtool is executable
[OK  ]  RRDtool 1.3.1  Copyright 1997-2008 by Tobias Oetiker <tobi@oetiker.ch>
[OK  ]  USE_RRDs is defined
[OK  ]  USE_RRDs=1
[OK  ]  Perl RRDs modules are loadable
[INFO]  ==== Starting directory checks ====
[OK  ]  RRDPATH is defined
[OK  ]  RRDPATH=/usr/local/pnp4nagios/var/perfdata
[OK  ]  Perfdata directory '/usr/local/pnp4nagios/var/perfdata' exists
[WARN]  62 hosts/services are not providing performance data
[WARN]  'process_perf_data 1' is set for 43 hosts/services which are not providing performance data!
[WARN]  'process_perf_data 0' is set for 27 of your hosts/services
[OK  ]  'process_perf_data 1' is set for 243 of your hosts/services
[INFO]  ==== System sizing ====
[OK  ]  269 hosts/service objects defined
[INFO]  ==== Check statistics ====
[WARN]  Warning: 3, Critical: 0
[WARN]  Checks finished...
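
If you want to post-process such a report, for example to pull out only the warnings before opening a support request, a few lines of Python are sufficient. This is a sketch under assumptions: it relies only on the "[LEVEL]  message" layout visible in the listing above, the function name summarize is made up, and the tag [CRIT] is an assumed spelling because the sample output contains no critical message.

```python
import re
from collections import Counter

# "[OK  ]  message": level name padded with blanks inside the brackets
LINE = re.compile(r"^\[(INFO|OK|WARN|CRIT)\s*\]\s+(.*)$")


def summarize(output):
    """Count the messages per level and collect WARN/CRIT messages."""
    counts = Counter()
    problems = []
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # not a report line
        level, message = match.groups()
        counts[level] += 1
        if level in ("WARN", "CRIT"):
            problems.append(message)
    return counts, problems


sample = """\
[INFO]  ========== Starting Environment Checks ============
[OK  ]  Running product is 'nagios'
[WARN]  62 hosts/services are not providing performance data
[WARN]  Warning: 3, Critical: 0
"""
counts, problems = summarize(sample)
print(dict(counts))
for message in problems:
    print("check:", message)
```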

Nagios web frontend

Of course PNP should be easily accessible. You do not want to search long for the right graph.

Nagios itself features external URLs using so called extended info configs. Due to changes between Nagios 2.x and Nagios 3.x both versions are described.

Nagios 2.x

With Nagios 2.x the integration of external URLs into the nagios web interface is made using Extended Info Objects for services. For PNP we use the directive action_url to call the PNP web frontend with the appropriate options.

define serviceextinfo {
   host_name             localhost
   service_description   load
   action_url            /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}

You have to specify an additional Extended Info Definition for every service.

Nagios 3.x

Since Nagios 3.0 the action_url directive has been moved to the host or service definition. This simplifies the definition of URLs to the PNP interface. The serviceextinfo and hostextinfo definitions are deprecated.

First two nagios templates are defined. If you used the Nagios quickstart installation guides you can append these lines to templates.cfg:

define host {
   name       host-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
   register   0
}

define service {
   name       srv-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
   register   0
}

These two templates can now be included via “use srv-pnp” or “use host-pnp” for services and hosts respectively. If you used the quickstart installation guide you might for example edit the file localhost.cfg and add the template to the host or service definition as follows:

define host{
        use                     linux-server,host-pnp    ; Name of host templates to use
                                                         ; This host definition will inherit all variables that are defined
                                                         ; in (or inherited by) the linux-server host template definition.
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        }
define service{
        use                     local-service,srv-pnp   ; Name of service template to use
        host_name               localhost
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }

The links to the correct URLs are created automagically.

Tip: if you want to open the PNP window in your main frame (on the right of the menu) instead of a new page, just set action_url_target=main in your Nagios cgi.cfg.

Popups

You can integrate PNP into Nagios in a way that you have current graphs without clicking any icons. This can be accomplished using the CGI Includes which allow us to include JavaScript code in the status detail view ( status.cgi ).

Prerequisites:

Definition:

define host {
   name       host-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=_HOST_
   register   0
}

define service {
   name       srv-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
   register   0
}

After a restart of Nagios (after modifying the definitions) the result might look like this:

PNP Web Frontend

The behaviour of the PNP Web-Frontend can be controlled through the config file etc/config.php. This file will be overwritten during updates of PNP as the paths and options are detected during ./configure.

Own adjustments should be made in etc/config_local.php. If this file does not exist the file config.php can be taken as a guideline.

etc/config.php

The most important parameters follow:

The path to the RRDtool binary. Will be detected by ./configure

 $conf['rrdtool'] = "/usr/bin/rrdtool";

Height and width of the RRD graphs

 $conf['graph_width'] = "500";
 $conf['graph_height'] = "100";

Screen sizes may vary, page sizes won't. The following two directives enable you to specify different sizes for the creation of PDFs. If they aren't specified the values of the graph sizes are taken.

 $conf['pdf_width'] = "675";
 $conf['pdf_height'] = "100";

Additional options passed with every call of RRDTool, for example --slope-mode to smooth the graphs

 $conf['graph_opt'] = "";

The path to the RRD and XML files created by process_perfdata.pl

 $conf['rrdbase'] = "/usr/local/pnp4nagios/var/perfdata/";

The path to the config file for the pages.

 $conf['page_dir'] = "/usr/local/pnp4nagios/etc/pages/";

PNP pages will be refreshed every n seconds

 $conf['refresh'] = "90";

Max. age of RRD files in seconds. After reaching this value links to the graphs will be marked as inactive

 $conf['max_age'] = 60*60*6;

Base URL to the Nagios CGIs

 $conf['nagios_base'] = "/nagios/cgi-bin";

List of users who are allowed to view links to the services of the current host

 $conf['allowed_for_service_links'] = "EVERYONE";

List of users who can view/access the host search field

 $conf['allowed_for_host_search'] = "EVERYONE";

If PNP is called with a host only ( index.php?host=<myserver> ), the defined user is shown an overview of all services related to this host

 $conf['allowed_for_host_overview'] = "EVERYONE";

The periods of time the RRD graphs will show are determined using the array $views[]. The title and number of graphs can be specified globally in this place

$views[] = array('title' => 'One Hour',  'start' => (60*60) );
$views[] = array('title' => '4 Hours',   'start' => (60*60*4) );
$views[] = array('title' => '25 Hours',  'start' => (60*60*25) );
$views[] = array('title' => 'One Week',  'start' => (60*60*25*7) );
$views[] = array('title' => 'One Month', 'start' => (60*60*24*32) );
$views[] = array('title' => 'One Year',  'start' => (60*60*24*380) );

You can add more views ($views[5], …) but please keep in mind that under normal circumstances ALL views you defined are shown.
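
The 'start' values are plain numbers of seconds. As a worked example, the following Python lines evaluate the expressions from the listing above; only the variable name views is taken over, the rest is illustration. The result shows that each view covers slightly more than its title suggests, for example 7 x 25 hours for 'One Week'.

```python
# The products below are copied from the $views[] definitions above.
views = [
    ("One Hour", 60 * 60),
    ("4 Hours", 60 * 60 * 4),
    ("25 Hours", 60 * 60 * 25),
    ("One Week", 60 * 60 * 25 * 7),
    ("One Month", 60 * 60 * 24 * 32),
    ("One Year", 60 * 60 * 24 * 380),
]

for title, start in views:
    # 86400 seconds = one day
    print(f"{title:<10}{start:>10} s  ({start / 86400:.2f} days)")
```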

Timeranges

In the overview PNP shows five timeranges which can be defined in config.php.

Additionally you can influence the timeranges via the URL. This can be useful to automatically create PDF documents. The ranges can be defined using the options “start” and “end”.

Example:

 pnp4nagios/graph?host=<hostname>&srv=<servicedesc>&start=-1week

The graph will start one week prior to the current date and time. It will end at the current timestamp.

start  end  view  result
-      -    -     all views ending at current timestamp
x      -    -     all views starting at defined date
-      x    -     all views ending at defined date
x      x    -     one view between the two dates
-      -    x     one view ending at current timestamp
x      -    x     one view starting at defined date
-      x    x     one view ending at defined date

Examples of different specifications

format     description
2009W04    4th week of 2009
1.5.2009   May 1st, 2009
-1 day     one day back
-3 weeks   3 weeks back
-1 year    one year back
yesterday  yesterday
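
When such URLs are generated by a script, for example for periodic PDF reports, the options have to be URL-encoded because some of the formats above contain blanks. The following Python sketch shows one way to do that. The function name pnp_timerange_url and the default base path are assumptions for illustration; only the options host, srv, start, end and view come from this documentation.

```python
from urllib.parse import urlencode


def pnp_timerange_url(host, service, start=None, end=None, view=None,
                      base="/pnp4nagios/graph"):
    """Build a graph URL; options left at None are omitted."""
    params = {"host": host, "srv": service}
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    if view is not None:
        params["view"] = view
    # urlencode escapes blanks and other special characters
    return base + "?" + urlencode(params)


print(pnp_timerange_url("localhost", "PING", start="-1week"))
print(pnp_timerange_url("localhost", "PING", start="-3 weeks", end="yesterday"))
```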

Pages

“pages” provide the opportunity to collect graphs of different hosts/services on one page. That way, as an example, you can display the traffic rates of all tape libraries. Regular expressions are possible so you can accomplish a lot with only a few definitions, provided that you have appropriate names. The directory specified using “$conf['page_dir']” contains one or more files with the extension “.cfg”.

Comments start with a hash sign (#) and are possible within lines as well. Each file contains a “page” definition which specifies the name of the page and determines whether the following graph definitions contain regular expressions or not.
The description behind page_name appears in the list of available pages and will be used as title of the browser window. Attention: “host_name” and “service_desc” refer to the name of the file in the perfdata directory, not to the definition in Nagios. Blanks are replaced by underscores (_).

define page {
       use_regex 1		# 0 = use no regular expressions, 1 = use regular expressions
       page_name test-page	# page description
}

One or more “graph” definitions follow:

define graph {
       host_name       host1,host2,host3
       service_desc    Current_Load
}

Attention: A list of host names will only work if you set use_regex 0!

define graph {
       host_name       host4
       service_desc    Current_Users
}

And now some definitions with regular expressions. At first all hosts whose names are starting with “Tape”:

define graph {
       host_name       ^Tape
       service_desc    Traffic
}

all hosts whose names are ending with “00”:

define graph {
       host_name       00$
       service_desc    Load
}

all services of localhost whose names contain “a” or “o”, respectively:

define graph {
       host_name       localhost
       service_desc    a|o
}

all services whose names contain an underscore followed by (at least) three digits on all hosts whose names start with “UX”:

define graph {
       host_name       ^UX
       service_desc    _\d{3}
}

In some cases you may want to limit the display to just one graph. To accomplish this you can use the optional directive “source” followed by a number specifying the position of the data source within the RRD file, starting at 0.

define graph {
       host_name       host1,host2,host3
       service_desc    PING
       source          1
}
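
To try out an expression before putting it into a page definition you can test it against a list of names. The following Python sketch does this with Python's re module. PNP itself is written in PHP, so this is only an approximation, but the simple patterns used above behave the same way in both. The function name select and the host and service names are made up. The assumption that a pattern may match anywhere in the name follows from the “a|o” example above.

```python
import re


def select(names, pattern, use_regex=True):
    """Return the names a graph definition would select (sketch)."""
    if use_regex:
        rx = re.compile(pattern)
        # unanchored search: the pattern may match anywhere in the name
        return [n for n in names if rx.search(n)]
    # use_regex 0: the value is a comma-separated list of exact names
    wanted = pattern.split(",")
    return [n for n in names if n in wanted]


hosts = ["Tape01", "TapeLib", "web00", "UX_db", "localhost"]
services = ["Current_Load", "Traffic", "Disk_001", "PING"]

print(select(hosts, r"^Tape"))
print(select(hosts, r"00$"))
print(select(services, r"a|o"))
print(select(services, r"_\d{3}"))
print(select(hosts, "Tape01,web00", use_regex=False))
```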

Data export

PNP provides access to RRD data using the xport controller. The output format can be specified. At the moment the formats xml, json and csv are supported.

The controller can be called using the URL

/pnp4nagios/xport/<format>?host=<hostname>&srv=<servicedesc>

where <format> has to be replaced with the desired format.

You can also use wget to generate images and place them in periodic reports. One example may be:

wget -O image.png 'http://<user>:<pass>@<nagios-server>/pnp4nagios/image?host=<hostname>&srv=<service>&view=2&source=0'

view=<n> limits the graph to the timeperiod specified in config.php
source=<n> only shows one data source if you have more than one in your RRD file

Instead of view you can use start and/or end to specify the time period. For details please look at "time ranges".
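
A script that fetches exported data should reject unsupported formats before contacting the server. The following Python sketch builds the xport URL described above. The function name xport_url and the constant SUPPORTED_FORMATS are assumptions for illustration; the URL layout and the three formats are taken from this section.

```python
from urllib.parse import urlencode

# Formats named in this section
SUPPORTED_FORMATS = ("xml", "json", "csv")


def xport_url(fmt, host, service, base="/pnp4nagios/xport"):
    """Build the URL of the xport controller for one host/service."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{base}/{fmt}?" + urlencode({"host": host, "srv": service})


print(xport_url("json", "localhost", "PING"))
```

The result still has to be prefixed with the scheme and the name of your Nagios server before it can be passed to wget or a similar tool.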

What are templates?

PNP uses templates to influence the appearance of RRD graphs.

The selected check_command determines which template is used to control the graph. The following describes where templates are stored and how the “right” template is chosen.

What template will be used when?

Templates are stored at two places in the file system.

If the graph for the service “http” on host “localhost” should be shown, PNP will look for the XML file perfdata/localhost/http.xml and read its contents. The XML files are created automatically and contain information about the particular host and service. The header contains information about the plugin and the performance data. The XML tag <TEMPLATE> identifies which PNP template will be used for this graph.

/localhost/http.xml

<NAGIOS>
  <DATASOURCE>
    <TEMPLATE>check_http</TEMPLATE>
    <DS>1</DS>
    <NAME>time</NAME>
    <UNIT>s</UNIT>
    <ACT>0.006721</ACT>
    <WARN>1.000000</WARN>
    <CRIT>2.000000</CRIT>
    <MIN>0.000000</MIN>
    <MAX></MAX>
  </DATASOURCE>
  <DATASOURCE>
    <TEMPLATE>check_http</TEMPLATE>
    <DS>2</DS>
    <NAME>size</NAME>
    <UNIT>B</UNIT>
    <ACT>263</ACT>
    <WARN></WARN>
    <CRIT></CRIT>
    <MIN>0</MIN>
    <MAX></MAX>
  </DATASOURCE>
...
</NAGIOS>

PNP will look for a template with the name check_http.php in the following sequence:

  1. templates/check_http.php
  2. templates.dist/check_http.php
  3. templates/default.php
  4. templates.dist/default.php

The template default.php takes an exceptional position as it is used every time no other applicable template is found.
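
The search sequence can be expressed in a few lines of code. The following Python sketch is not PNP's own implementation (PNP is written in PHP); it only mirrors the four steps listed above. The function name find_template and the parameter base_dir, the directory that contains templates/ and templates.dist/, are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def find_template(name, base_dir):
    """Return the first existing template file in the documented order."""
    base = Path(base_dir)
    candidates = [
        base / "templates" / f"{name}.php",
        base / "templates.dist" / f"{name}.php",
        base / "templates" / "default.php",
        base / "templates.dist" / "default.php",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Demonstration in a temporary directory holding only shipped templates
with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp) / "templates.dist"
    dist.mkdir()
    (dist / "check_http.php").touch()
    (dist / "default.php").touch()
    first = find_template("check_http", tmp).relative_to(tmp).as_posix()
    fallback = find_template("check_foo", tmp).relative_to(tmp).as_posix()

print(first)
print(fallback)
```

A file of the same name placed in templates/ would be found first, so own adjustments take precedence over the shipped files.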

Creating own templates

PNP templates are PHP files which are included during execution of PNP using the PHP function include(). This means that any PHP code in templates will be interpreted, so manipulation of all values is possible.

PNP templates must have the following characteristics:

  1. templates must contain valid PHP code.
  2. templates must not create any output.
  3. the two arrays $opt[] and $def[] have to be filled

These two arrays are used to call 'rrdtool graph' so every option is possible that RRDtool supports. All options of RRDtool are described very thoroughly on the RRDtool Homepage.

If both arrays contain more than one set of data, a graph will be created for every set.

Inside the templates the data from the related XML files can be used.

Using the relatively simple template response.php we will describe the most important options.

<?php
#
$opt[1] = "--title \"Response Time For $hostname / $servicedesc\" ";
#
$def[1] =  "DEF:var1=$RRDFILE[1]:$DS[1]:AVERAGE " ;
$def[1] .= "AREA:var1#00FF00:\"Response Times \" " ;
$def[1] .= "LINE1:var1#000000 " ;
$def[1] .= "GPRINT:var1:LAST:\"%3.4lg %s$UNIT[1] LAST \" ";
$def[1] .= "GPRINT:var1:MAX:\"%3.4lg %s$UNIT[1] MAX \" ";
$def[1] .= "GPRINT:var1:AVERAGE:\"%3.4lg %s$UNIT[1] AVERAGE \" ";
?>

Note: the digit one (1) and the lowercase letter “l” look alike in this listing. The format “%3.4lg” contains the lowercase letter “l”, not the digit.

$opt[1] = "--title … sets RRDtool options for the first set of data, here the title. Embedded quotes are escaped using a backslash (\). The variables $hostname and $servicedesc are set when PNP is called and are available in the template as well.

$def[1] = "DEF:var1=$RRDFILE[1]:$DS[1]:AVERAGE "; defines which data is to be read from which RRD file. $RRDFILE[1] contains the path to the RRD file of this service. $DS[1] refers to the first data series from the RRD file.

$def[1] .= "AREA:var1#00FF00:\"Response Times \" "; the operator “.=” appends more text to the string in $def[1]. An area will be drawn using data from the variable var1. The color is given in hexadecimal RRGGBB notation (red, green, blue); #00FF00 is pure green. The label is “Response Times”.

$def[1] .= "LINE1:var1#000000 "; To finish off the area just drawn, a line (LINE1) will be drawn in black (#000000).

$def[1] .= "GPRINT:var1:LAST:\"%3.4lg %s$UNIT[1] LAST \" ";
$def[1] .= "GPRINT:var1:MAX:\"%3.4lg %s$UNIT[1] MAX \" ";
$def[1] .= "GPRINT:var1:AVERAGE:\"%3.4lg %s$UNIT[1] AVERAGE \" ";

The three GPRINT lines build up the caption for the graph. The current values are formatted using the printf syntax.

Available variables

Using the data collector process_perfdata.pl PNP stores not only performance data but other values exported by Nagios. These values are stored in the XML file associated to the appropriate service.

In the first part of the XML file the performance data is stored in separate components.

<NAGIOS>
  <DATASOURCE>
    <TEMPLATE>check_http</TEMPLATE>
    <DS>1</DS>
    <NAME>time</NAME>
    <UNIT>s</UNIT>
    <ACT>0.006721</ACT>
    <WARN>1.000000</WARN>
    <CRIT>2.000000</CRIT>
    <MIN>0.000000</MIN>
    <MAX></MAX>
  </DATASOURCE>
....
</NAGIOS>

The field <DS> designates the data source and is used to identify the data series of the RRD files and is the key of the following arrays as well.

The array $UNIT[1] contains the unit of measurement of the first data series.

The XML file contains other information. When process_perfdata.pl is used in default mode all available macros are at hand with the current values. For the benefit of readability the following lines show only an extract.

<NAGIOS>
...
  <NAGIOS_SERVICENOTIFICATIONID>8418</NAGIOS_SERVICENOTIFICATIONID>
  <NAGIOS_SERVICENOTIFICATIONNUMBER>0</NAGIOS_SERVICENOTIFICATIONNUMBER>
  <NAGIOS_SERVICEOUTPUT>HTTP OK HTTP/1.1 200 OK - 10087 bytes in 0.125 seconds</NAGIOS_SERVICEOUTPUT>
  <NAGIOS_SERVICEPERCENTCHANGE>0.00</NAGIOS_SERVICEPERCENTCHANGE>
  <NAGIOS_SERVICEPERFDATA>time=0.124811s;;;0.000000 size=10087B;;;0</NAGIOS_SERVICEPERFDATA>
  <NAGIOS_SERVICEPERFDATAFILE></NAGIOS_SERVICEPERFDATAFILE>
  <NAGIOS_SERVICEPROBLEMID>0</NAGIOS_SERVICEPROBLEMID>
  <NAGIOS_SERVICESTATE>OK</NAGIOS_SERVICESTATE>
  <NAGIOS_SERVICESTATEID>0</NAGIOS_SERVICESTATEID>
  <NAGIOS_SERVICESTATETYPE>HARD</NAGIOS_SERVICESTATETYPE>
  <NAGIOS_SHORTDATETIME>27-12-2007 13:51:23</NAGIOS_SHORTDATETIME>
...
</NAGIOS>

The various XML fields can be used as variables in the PNP templates. Each field is available as a variable with the same name.

The value of the field <NAGIOS_SERVICEOUTPUT> is available as the variable $NAGIOS_SERVICEOUTPUT.
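
To see which values a template can work with, you can read an XML file yourself. The following Python sketch builds the same kind of structures: one dictionary per <DATASOURCE> field, keyed by the number in <DS>, and plain values for all other fields. It is an illustration and not PNP's own code; the function name load_template_variables is made up and the sample is a shortened version of the listings above.

```python
import xml.etree.ElementTree as ET


def load_template_variables(xml_text):
    """Return (arrays, scalars) built from a PNP XML file."""
    root = ET.fromstring(xml_text)
    arrays = {}
    for ds in root.findall("DATASOURCE"):
        index = int(ds.findtext("DS"))  # <DS> is the key of the arrays
        for field in ds:
            if field.tag != "DS":
                arrays.setdefault(field.tag, {})[index] = field.text or ""
    # every other field directly below the root becomes a plain value
    scalars = {c.tag: (c.text or "") for c in root if c.tag != "DATASOURCE"}
    return arrays, scalars


sample = """<NAGIOS>
  <DATASOURCE>
    <TEMPLATE>check_http</TEMPLATE>
    <DS>1</DS>
    <NAME>time</NAME>
    <UNIT>s</UNIT>
    <ACT>0.006721</ACT>
  </DATASOURCE>
  <DATASOURCE>
    <TEMPLATE>check_http</TEMPLATE>
    <DS>2</DS>
    <NAME>size</NAME>
    <UNIT>B</UNIT>
    <ACT>263</ACT>
  </DATASOURCE>
  <NAGIOS_SERVICESTATE>OK</NAGIOS_SERVICESTATE>
</NAGIOS>"""

arrays, scalars = load_template_variables(sample)
print(arrays["UNIT"][1], arrays["NAME"][2], scalars["NAGIOS_SERVICESTATE"])
```

Here arrays["UNIT"][1] corresponds to $UNIT[1] in a template and scalars["NAGIOS_SERVICESTATE"] to $NAGIOS_SERVICESTATE.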

Custom Templates

As already described under “What are templates?” the appearance of graphs depends on the check command used.

There are situations where this behaviour must be overruled, for example when universal commands have been defined.

PNP, or more precisely process_perfdata.pl, will search for a config file (<check_command>.cfg) in the etc/check_commands directory and read its contents (if available). The following options can be defined in it:

CUSTOM_TEMPLATE

Starting from the following example of a Nagios command definition:

define command {
  command_name check_nrpe
  command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a "$ARG2$"
}

This would lead to a call of the check_nrpe.php template even when the monitored host would use a completely different plugin which is called via NRPE.

As our example command is called check_nrpe, PNP will look for the file etc/check_commands/check_nrpe.cfg.

During installation a sample config file with the extension .cfg-sample is copied to etc/check_commands.

# check_command check_nrpe!load!-w 4,4,4 -c 5,5,5
# ________0__________|       |       |
# ________1__________________|       |
# ________2__________________________|
#
CUSTOM_TEMPLATE = 1

CUSTOM_TEMPLATE = 1 ensures that only the contents of $ARG1$ will be used as a template name. As $ARG1$ contains “load” in this example the template name would result in “load.php”.

CUSTOM_TEMPLATE = 0,1 results in → “check_nrpe_load.php”

CUSTOM_TEMPLATE = 1,0 results in → “load_check_nrpe.php”

This option has effect only during creation of the RRD database.
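
The way the template name is assembled can be reproduced with a short Python sketch. It is not the code used by process_perfdata.pl. The function name template_name is made up, and the replacement of slashes and blanks by underscores is an assumption about how unsuitable characters are handled.

```python
import re


def template_name(check_command, custom_template=None):
    """Derive a template name (without .php) from a check_command."""
    # position 0 is the command name, 1 is $ARG1$, 2 is $ARG2$, ...
    parts = check_command.split("!")
    if custom_template is None:
        name = parts[0]
    else:
        indices = [int(i) for i in custom_template.split(",")]
        name = "_".join(parts[i] for i in indices if i < len(parts))
    # Assumption: slashes and blanks are replaced by underscores
    return re.sub(r"[/\s]", "_", name)


command = "check_nrpe!load!-w 4,4,4 -c 5,5,5"
for setting in (None, "1", "0,1", "1,0"):
    print(setting, "->", template_name(command, setting) + ".php")
```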

DATATYPE

The option “DATATYPE” controls the data source type used when the RRD database is created. The default is “GAUGE”. For continuously increasing counter values the type should be “COUNTER”. Plugin developers should use the unit “c” for counters, but this is not always the case.

To set all datasources to COUNTER

DATATYPE = COUNTER

Setting datasources to different types

DATATYPE = GAUGE,GAUGE,COUNTER,COUNTER

More datatypes are explained in the RRDTool documentation found at rrdcreate.

This option has effect only during creation of the RRD database.

USE_MIN_ON_CREATE and USE_MAX_ON_CREATE

In a few situations it might be necessary to limit the values which are valid for RRDTool.

RRD databases can be created with fixed minimum and maximum values. You will find further details at http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html.

Account for the maximum value taken from the performance data

USE_MAX_ON_CREATE = 1

Account for the minimum value taken from the performance data

USE_MIN_ON_CREATE = 1

This option has effect only during creation of the RRD database.

RRD_STORAGE_TYPE

RRD_STORAGE_TYPE = SINGLE

The option RRD_STORAGE_TYPE defines the kind of data storage.

Possible values are MULTIPLE and SINGLE, respectively.

SINGLE: A RRD database per service

MULTIPLE: One or more RRD databases per service. Each datasource will be stored in a separate RRD database.

ATTENTION: The data will not be migrated automatically! You will find a conversion script here.

This option has effect only during creation of the RRD database.

RRD_HEARTBEAT

Starting with PNP 0.6.1

RRD_HEARTBEAT = 305

After <RRD_HEARTBEAT> seconds RRDtool expects new data.

More information at http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html

This option has effect only during creation of the RRD database.
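
All of the options above use the same “NAME = value” notation, so a file in etc/check_commands is easy to inspect with a script. The following Python sketch reads such a file and expands DATATYPE to one type per data source. It is an illustration under assumptions: the function names are made up, “#” is treated as the start of a comment anywhere in a line, and data sources without an entry in a DATATYPE list are assumed to fall back to GAUGE.

```python
def parse_check_command_cfg(text):
    """Parse 'NAME = value' lines; '#' starts a comment."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        options[key] = value
    return options


def datatypes_for(options, count):
    """Return one data source type for each of 'count' data sources."""
    types = [t.strip() for t in options.get("DATATYPE", "GAUGE").split(",")]
    if len(types) == 1:
        return types * count  # a single value applies to all data sources
    # Assumption: data sources without an entry fall back to GAUGE
    return (types + ["GAUGE"] * count)[:count]


sample_cfg = """\
# check_command check_nrpe!load!-w 4,4,4 -c 5,5,5
CUSTOM_TEMPLATE = 1
DATATYPE = GAUGE,GAUGE,COUNTER,COUNTER
RRD_HEARTBEAT = 305
"""
options = parse_check_command_cfg(sample_cfg)
print(options)
print(datatypes_for(options, 4))
```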

Hints on Template Names

In most situations, one can easily get desired template names, by using suitable command object definitions.

Consider the following example:

define command {
  command_name check_by_ssh
  command_line /usr/bin/ssh $HOSTADDRESS$ $ARG1$
}

with commands like:

  …
  check_command check_by_ssh!/usr/lib/nagios/plugins/check_load -w 4,4,4 -c 5,5,5
  …

Even when using “CUSTOM_TEMPLATE = 1” one would end up in template names like “_usr_lib_nagios_plugins_check_load_-w_4,4,4_-c_5,5,5”, which is highly undesired, especially because of the parameters in it.

Solution 1: Split parameters into separate $ARGn$

A simple solution is to use the following command object definition:

define command {
  command_name check_by_ssh
  command_line /usr/bin/ssh $HOSTADDRESS$ $ARG1$ $ARG2$
}

with commands like:

  …
  check_command check_by_ssh!/usr/lib/nagios/plugins/check_load!-w 4,4,4 -c 5,5,5
  …

(notice the additional “!”)

This even works when $ARG2$ is left empty.

Of course one would still need to set “CUSTOM_TEMPLATE = 1”.

Solution 2: Hide the remote executor inside the command object definition

Another way is to “hide” the remote executor in the respective command object definitions.

Instead of defining:

define command {
  command_name check_by_ssh
  command_line /usr/bin/ssh $HOSTADDRESS$ $ARG1$ $ARG2$
}

one would define the following for every command to be remotely executed:

define command {
  command_name check_load_by_ssh
  command_line /usr/bin/ssh $HOSTADDRESS$ /usr/lib/nagios/plugins/check_load $ARG1$
}

with commands like:

  …
  check_load_by_ssh!-w 4,4,4 -c 5,5,5
  …

Of course one must not set “CUSTOM_TEMPLATE = 1” in this case.

Which of the two solutions above one follows is largely a matter of taste.

Distributed Systems

If Nagios is implemented as a distributed system you have to decide where PNP should be installed.

From a technical point of view this question is not important. PNP can be installed on the slave(s) as well as on the master server, or only on the master.

If PNP is running on the master you have to make sure that data passed via send_nsca from the slave server(s) contains performance data. Often another check command is used on the master.

To help PNP on the master to recognize which check command was used on the slave to collect the information process_perfdata.pl responds to an additional field at the end of the performance data.

OK - 127.0.0.1: rta 2.687ms, lost 0% | rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;; [check_icmp]

If PNP finds a string enclosed in brackets at the end of performance data it will be recognized as check command and will be used as PNP template.
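
How such a line can be taken apart is shown in the following Python sketch. It is not the code of process_perfdata.pl. The function name split_plugin_output is made up, and the sketch assumes that the check command in brackets contains no blanks.

```python
import re


def split_plugin_output(line):
    """Split a line into text, performance data and check command."""
    text, _, perfdata = line.partition("|")
    perfdata = perfdata.strip()
    check_command = None
    # a string in brackets at the very end names the check command
    match = re.search(r"\[([^\]\s]+)\]$", perfdata)
    if match:
        check_command = match.group(1)
        perfdata = perfdata[:match.start()].rstrip()
    return text.strip(), perfdata, check_command


line = ("OK - 127.0.0.1: rta 2.687ms, lost 0% | "
        "rta=2.687ms;3000.000;5000.000;0; pl=0%;80;100;; [check_icmp]")
text, perfdata, check_command = split_plugin_output(line)
print(text)
print(perfdata)
print(check_command)
```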

Nagios documentation related to this topic can be found here. The command used in the documentation can be adapted easily.

define command{
	command_name	submit_check_result
	command_line	/usr/local/nagios/libexec/eventhandlers/submit_check_result $HOSTNAME$ '$SERVICEDESC$' $SERVICESTATE$ '$SERVICEOUTPUT$'
	}

should be changed to

define command{
	command_name	submit_check_result
	command_line	/usr/local/nagios/libexec/eventhandlers/submit_check_result $HOSTNAME$ '$SERVICEDESC$' $SERVICESTATE$ '$SERVICEOUTPUT$ | $SERVICEPERFDATA$ [$SERVICECHECKCOMMAND$]'
	}

check_multi plugin

The plugin check_multi is one of the first plugins which uses new features of Nagios 3.x. Check_multi can execute multiple Nagios plugins but returns the results as a single service. The output of check_multi comprises several lines to be able to display the amount of information.

This results in some difficulties for PNP which has to extract the information of several plugins from the performance data. Together with Matthias Flacke, developer of check_multi, we have found a solution to assign the data to the appropriate plugins.

RRDtool Cache Daemon

In large installations one will notice sooner or later that processing the performance data results in a relatively high I/O load. RRDtool has to perform a great many disk updates but cannot use the disk cache in an optimal way.

One improvement is made by collecting and sorting the data. It is more effective to write many updates to an RRD database in one block. The disk cache can be used more effectively that way.

The current RRDtool ( SVN trunk 1550+ ) contains rrdcached which should improve exactly this situation.

At this point I'd like to thank Florian octo Forster, Kevin Brintnall and Tobi Oetiker. The development of this daemon has been coordinated in an exemplary manner on the rrd-developers mailing list.

Mode of operation

rrdcached runs as a daemon in the background and opens a UNIX or TCP socket to wait for requests from rrdtool. For security reasons newer versions of rrdcached no longer accept absolute paths for network access, so UNIX sockets are the only possible way.

rrdcached

rrdcached recognizes some important options which are passed during startup.

Option -l defines the socket on which the daemon listens for update requests. The default TCP port is 42217.

-l unix:/path/to/rrdcached.sock
-l /path/to/rrdcached.sock
-l 127.0.0.1
-l 127.0.0.1:8888

Option -P specifies which commands are usable with the RRD data bases

-P FLUSH,PENDING

Option -s allows changing the group ownership of the unix socket

-s nagios

Option -m sets the permissions of the unix socket in the usual octal format

-m 0660

Option -w specifies the interval (in seconds) the data will be written to disk.

-w 1800

Option -z defines a maximum delay which will be used to spread the write cycles over a certain range [0-delay] to avoid parallel write accesses. The value of option -z must not be larger than -w.

-z 1800

Option -p defines a PID file

-p /var/run/rrdcached.pid

Option -j defines the path to a journaling directory. All requests will be logged there so that they can be processed after a restart in case the daemon crashes.

-j /var/cache/rrdcached

These options may result in a call of rrdcached with the following parameters

 rrdcached -w 1800 -z 1800 -p /tmp/rrdcached.pid -j /tmp  -s nagios -m 0660 -l unix:/tmp/rrdcached.sock
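
If the daemon is started from a script, the constraint that -z must not be larger than -w can be checked before the call. The following Python sketch assembles the argument list for the options described above. The function name rrdcached_args and its parameter names are made up for illustration; the option letters and their order follow the example call.

```python
def rrdcached_args(socket, write_interval=1800, delay=1800, pid_file=None,
                   journal_dir=None, group=None, mode=None):
    """Return the argument list for starting rrdcached."""
    if delay > write_interval:
        raise ValueError("-z must not be larger than -w")
    args = ["rrdcached", "-w", str(write_interval), "-z", str(delay)]
    if pid_file:
        args += ["-p", pid_file]
    if journal_dir:
        args += ["-j", journal_dir]
    # -s and -m are placed before -l, as in the example call
    if group:
        args += ["-s", group]
    if mode:
        args += ["-m", mode]
    args += ["-l", socket]
    return args


args = rrdcached_args("unix:/tmp/rrdcached.sock", pid_file="/tmp/rrdcached.pid",
                      journal_dir="/tmp", group="nagios", mode="0660")
print(" ".join(args))
```

The list can be handed to subprocess.run() to start the daemon.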

rrdtool

RRDtool itself will be informed about the daemon using the option --daemon=<socket>.

 rrdtool --daemon=unix:/tmp/rrdcached.sock update ...

Of course this has to correspond with the options of rrdcached!

Integration into PNP

Because two components of PNP have to be prepared for the use of rrdcached there are changes in two config files.

1. Adjustment of process_perfdata.cfg for the data collector process_perfdata.pl

# EXPERIMENTAL rrdcached Support
# Use only with rrdtool svn revision 1511+
#
RRD_DAEMON_OPTS = unix:/var/run/rrdcached.sock

2. Adjustment of config_local.php (or config.php) for the web interface

#
# EXPERIMENTAL rrdcached Support
# Use only with rrdtool svn revision 1511+
#
# $conf['RRD_DAEMON_OPTS'] = 'unix:/tmp/rrdcached.sock';
$conf['RRD_DAEMON_OPTS'] = 'unix:/var/run/rrdcached.sock';

The sample files contain the relevant options.
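Since the address configured in PNP has to match the one passed to rrdcached with -l, a small consistency check can help to spot mismatches such as /tmp/rrdcached.sock versus /var/run/rrdcached.sock. The sketch below assumes the "KEY = value" layout of process_perfdata.cfg shown above; both function names are hypothetical and not part of PNP.

```shell
# Hypothetical helper: print the active value of RRD_DAEMON_OPTS from a
# process_perfdata.cfg style file. Commented-out lines are ignored.
get_rrd_daemon_opts() {
    sed -n 's/^[[:space:]]*RRD_DAEMON_OPTS[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: succeeds if the configured address equals the
# address that was given to rrdcached with -l.
socket_matches() {
    cfg_file=$1
    listen_addr=$2
    [ "$(get_rrd_daemon_opts "$cfg_file")" = "$listen_addr" ]
}
```

For example, socket_matches /usr/local/pnp4nagios/etc/process_perfdata.cfg unix:/var/run/rrdcached.sock would succeed only if the file contains exactly that address; the path of the config file is an assumption based on the installation prefix used elsewhere in this document.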


NPCD

NPCD (Nagios-Perfdata-C-Daemon) was written to provide an asynchronous mode to handle performance data with nagios.

Introduction

In large nagios installations, your average check latency may increase to an unacceptably high value. This means that Nagios should do a check at time x but actually does it y seconds later.

If you tell the Nagios core to process the performance data after every single check, this works well up to a certain number of checks, but above this limit you will run into latency problems.

To reduce the number of actions per check, you can use Bulk Mode, which gathers performance data for some time and then lets the Nagios core execute the <host|service>_perfdata_file_processing_command. Alternatively, you can tell Nagios to just move the perfdata files to a spool directory.

This move is a very fast action for the Nagios core. The core is then done with the processing of performance data and can continue to do what it should do: execute other checks, send notifications, and so on.
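The move itself can be sketched as a single mv call. The helper name spool_perfdata_file and the timestamp suffix are assumptions made for this illustration; the suffix merely keeps successive files in the spool directory from overwriting each other.

```shell
# Hypothetical helper: move a perfdata file into the spool directory.
# A timestamp is appended to the file name (an assumption of this sketch)
# so that files moved at different times do not collide.
spool_perfdata_file() {
    perfdata_file=$1
    spool_dir=$2
    # Nothing to do if the file is missing or empty.
    [ -s "$perfdata_file" ] || return 0
    mv "$perfdata_file" "$spool_dir/$(basename "$perfdata_file").$(date +%s)"
}
```

Because the source and target are usually on the same file system, the move is only a rename and does not depend on the size of the file.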

How it works

As mentioned above, the Nagios process has finished its work once it has moved the performance data file to a spool directory, but this does not yet bring the data into the RRD files.

For this task you can start npcd, which watches the defined spool directory and starts an action for every file it finds.

Once NPCD is running, it builds a list of the filenames found in perfdata_spool_dir, starts a new thread for every filename, and executes perfdata_file_run_cmd with the optional perfdata_file_run_cmd_args as additional arguments.

Since the perfdata files in the spool dir are in the same format as for the 'normal' bulk mode, NPCD should execute process_perfdata.pl in Bulk Mode.
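The processing described above can be approximated by a simple loop. This is a sequential sketch under stated assumptions, not the actual NPCD implementation, which is written in C and uses threads; the function name process_spool_once is hypothetical.

```shell
# Simplified, sequential stand-in for one pass of NPCD over the spool directory.
# $1: spool directory    (perfdata_spool_dir)
# $2: command to run     (perfdata_file_run_cmd)
# $3: optional arguments (perfdata_file_run_cmd_args)
process_spool_once() {
    spool_dir=$1
    run_cmd=$2
    run_cmd_args=$3
    for file in "$spool_dir"/*; do
        # Skip anything that is not a regular file (also covers an empty directory).
        [ -f "$file" ] || continue
        # $run_cmd_args is left unquoted on purpose so that it may hold several arguments.
        "$run_cmd" $run_cmd_args "$file"
    done
}
```

With the values from npcd.cfg-sample, a call would look like process_spool_once /usr/local/pnp4nagios/var/spool/ /usr/local/pnp4nagios/libexec/process_perfdata.pl -b.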

Advantages / Disadvantages

Pro:

Con:

NPCD Config

NPCD is controlled through its own configuration file, such as the npcd.cfg-sample file shipped with PNP.

Just rename it to npcd.cfg to start NPCD like this:

/usr/local/pnp4nagios/bin/npcd -f /usr/local/pnp4nagios/etc/npcd.cfg

or

/usr/local/pnp4nagios/bin/npcd -d -f /usr/local/pnp4nagios/etc/npcd.cfg

to run in Daemon Mode (background).

Hint: If you decide to not rename the config file, it might be overwritten by a future update of PNP.

npcd.cfg-sample

These are the essential configuration directives for NPCD:

# Privilege Options
user = nagios
group = nagios

# Logging Options
log_type = syslog
log_file = /usr/local/pnp4nagios/var/npcd.log
max_logfile_size = 10485760
log_level=0

# Processing Options
perfdata_spool_dir = /usr/local/pnp4nagios/var/spool/
perfdata_file_run_cmd = /usr/local/pnp4nagios/libexec/process_perfdata.pl
perfdata_file_run_cmd_args = -b

# Thread Options
npcd_max_threads=5

# greedy options
use_load_threshold = 0
load_threshold = 10.0

# Process Options
pid_file=/var/run/npcd.pid

The directives

Wrapper script

check_procs is an example of a plugin which doesn't deliver performance data:

./check_procs -a ndo2db -w 1: -c 0:
PROCS OK: 2 processes with args 'ndo2db'

This can be changed with the following wrapper script

check_procs.sh

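#!/bin/bash
# Run the original plugin with all arguments passed through unchanged.
LINE=$(/usr/local/nagios/libexec/check_procs "$@")
RC=$?
# The third word of the plugin output is the process count.
COUNT=$(echo "$LINE" | awk '{print $3}')
# Subtract 1, presumably because the wrapper's own command line also matches the -a filter.
PROCS=$(expr "$COUNT" - 1)
LINE=$(echo "$LINE" | sed "s/: $COUNT /: $PROCS /")
# Append the performance data after the pipe symbol and keep the plugin's return code.
echo "$LINE | procs=$PROCS"
exit $RC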

Now you'll get the number together with the required label

./check_procs.sh -a ndo2db -w 1: -c 0:
PROCS OK: 2 processes with args 'ndo2db'| procs=2

2.6. Performance data

Performance data is defined by Nagios as “everything after the | of the plugin output” - please refer to Nagios documentation for information on capturing this data to logfiles. However, it is the responsibility of the plugin writer to ensure the performance data is in a “Nagios plugins” format. This is the expected format:

'label'=value[UOM];[warn];[crit];[min];[max]

Notes:

  1. space separated list of label/value pairs
  2. label can contain any characters
  3. the single quotes for the label are optional. Required if spaces, = or ' are in the label
  4. label length is arbitrary, but ideally the first 19 characters are unique (due to a limitation in RRD). Be aware of a limitation in the amount of data that NRPE returns to Nagios
  5. to specify a quote character, use two single quotes
  6. warn, crit, min or max may be null (for example, if the threshold is not defined or min and max do not apply). Trailing unfilled semicolons can be dropped
  7. min and max are not required if UOM=%
  8. value, min and max in class [-0-9.]. Must all be the same UOM
  9. warn and crit are in the range format (see Section 2.5). Must be the same UOM
  10. UOM (unit of measurement) is one of:
    • no unit specified - assume a number (int or float) of things (eg, users, processes, load averages)
    • s - seconds (also us, ms)
    • % - percentage
    • B - bytes (also KB, MB, TB, GB?)
    • c - a continuous counter (such as bytes transmitted on an interface)

It is up to third party programs to convert the Nagios plugins performance data into graphs.

Origin: http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201
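To illustrate the format 'label'=value[UOM];[warn];[crit];[min];[max], the following sketch splits a single perfdata item into its fields. It is not how process_perfdata.pl parses the data; parse_perfdata_item is a hypothetical helper, and labels enclosed in single quotes are deliberately not handled.

```shell
# Hypothetical helper: split one perfdata item into its fields.
# The body runs in a subshell (parentheses) so that the changes to IFS
# and the globbing option do not leak into the caller.
parse_perfdata_item() (
    item=$1
    label=${item%%=*}   # everything before the first "="
    rest=${item#*=}     # everything after the first "="
    # Split the remainder on semicolons; missing fields stay empty.
    IFS=';'
    set -f
    set -- $rest
    value_uom=$1 warn=$2 crit=$3 min=$4 max=$5
    # The value is the leading run of [-0-9.]; whatever follows is the UOM.
    uom=$(printf '%s' "$value_uom" | sed 's/^[-0-9.]*//')
    value=${value_uom%"$uom"}
    printf 'label=%s value=%s uom=%s warn=%s crit=%s min=%s max=%s\n' \
        "$label" "$value" "$uom" "$warn" "$crit" "$min" "$max"
)
```

Called with the check_icmp example, parse_perfdata_item 'rta=2.687ms;3000.000;5000.000;0;' prints label=rta value=2.687 uom=ms warn=3000.000 crit=5000.000 min=0 max= on one line.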