Varnish is a web application accelerator that sits in front of your web server. It speeds up your application by caching some, if not all, of the content. This reduces the load on your web server and can also reduce the load on your backend, as the frontend needs fewer lookups.

As Varnish is the first thing users hit, it is imperative that it works properly and that you have statistics on how it is performing. This plugin gives you better insight into how effectively Varnish is running and tells you if it is having problems.

You can download the latest version here:
check_varnish-v1.0.tar

Install

To use this plugin you need varnishstat, which is installed by default when you install Varnish.

Perl is also required for this plugin. If you don’t have Perl installed you can install it by running one of the commands below:

sudo apt-get install perl

or

sudo yum install perl

Now you can download the file above and extract it:

tar -xf check_varnish-v1.0.tar

You should now have a file “check_varnish.pl”. Make sure that it has execute permissions:

chmod u+x check_varnish.pl

You now need to move this file to your Nagios plugins folder. Consult your Nagios configuration to find out where this is. Mine was ‘/usr/lib/nagios/plugins’.

mv check_varnish.pl /usr/lib/nagios/plugins/

How To

check_varnish.pl – Monitor and report on varnish usage

check_varnish.pl [-a|--cache] [-b|--bin <varnishstatbinary>] [-d|--backend <total|ratio>] [-s|--stats <varnish statfield>] [-t|--technique <lt|gt>] [-w|--warning <number>] [-c|--critical <number>] [-h|--help]

DESCRIPTION

This script will report on various Varnish stats, including:

Varnish cache hit ratio

Backend error count (total or ratio)

Any other counter in varnishstat

If no counters are requested, the script will check that Varnish is running.
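For reference, a cache hit ratio is conventionally the number of hits divided by hits plus misses. The sketch below is not taken from check_varnish.pl. It is a minimal shell illustration, assuming that `varnishstat -1` prints one counter per line with the name in the first column and the value in the second. The helper name hit_ratio and the sample numbers are made up for the example.

```shell
# Read `varnishstat -1` style output on stdin and print hits / (hits + misses).
# Accepts both bare counter names (cache_hit) and prefixed ones (MAIN.cache_hit).
hit_ratio() {
    awk '
        $1 == "cache_hit"  || $1 == "MAIN.cache_hit"  { hits = $2 }
        $1 == "cache_miss" || $1 == "MAIN.cache_miss" { misses = $2 }
        END {
            total = hits + misses
            # Avoid dividing by zero when no lookups have been counted yet
            if (total == 0) { print "0.00" } else { printf "%.2f\n", hits / total }
        }
    '
}

# Made-up sample counters, for illustration only
printf '%s\n' \
    'cache_hit     800   1.50 Cache hits' \
    'cache_miss    200   0.40 Cache misses' | hit_ratio
# prints 0.80
```

On a live host the same function could be fed directly with `varnishstat -1 | hit_ratio`.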

OPTIONS

-a --cache – make the script output cache_hit ratio perfdata

-b --bin <varnishstat> – specify a different location for the varnishstat binary. Default is ‘/usr/bin/varnishstat’

-d --backend <all|success|unhealthy|busy|fail|reuse|toolate|recycle|retry> – make the script output backend data; you can output ratio, total or both

-h --help – output this message

-w --warning <number> – specify the warning threshold. Required for cache and backend checks

-c --critical <number> – specify the critical threshold. Required for cache and backend checks

-s --stats <varnishstat field> – specify a comma-separated list of the stats you wish to check. Critical and warning thresholds can be specified, and all values will be compared against them.

-t --technique <lt|gt> – when specifying stats you can also specify which technique to use when comparing the values to the thresholds. Specify lt for less than and gt for greater than. Default is gt
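To illustrate how the warning and critical thresholds interact with the lt and gt techniques, here is a rough shell sketch. It is not the plugin's own code. The function name check_threshold is hypothetical, and the return values follow the standard Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL), which the plugin is assumed to use as well.

```shell
# Return a Nagios-style status for a single value: 0 = OK, 1 = WARNING, 2 = CRITICAL.
# Technique "gt" alerts when the value exceeds a threshold, "lt" when it falls below.
check_threshold() {
    value=$1 warn=$2 crit=$3 technique=${4:-gt}
    awk -v v="$value" -v w="$warn" -v c="$crit" -v t="$technique" 'BEGIN {
        # Adding 0 forces numeric rather than string comparison
        v += 0; w += 0; c += 0
        if (t == "lt") { if (v < c) exit 2; if (v < w) exit 1 }
        else           { if (v > c) exit 2; if (v > w) exit 1 }
        exit 0
    }'
}

# A hit ratio of 0.55 checked with "lt", warning below 0.6 and critical below 0.4
status=0
check_threshold 0.55 0.6 0.4 lt || status=$?
echo "status=$status"
# prints status=1
```

The critical threshold is tested first so that a value breaching both thresholds is reported as critical rather than warning.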

EXAMPLES

Check varnish is running

./check_varnish.pl

Check varnish Cache Hit Ratio, warn if the ratio is below 0.8 and go critical if it is below 0.6

./check_varnish.pl -a -w 0.8 -c 0.6

Check varnish Backends

./check_varnish.pl -d all

Check varnish client requests and drops

./check_varnish.pl -s client_drop,client_req

Nagios Set Up

Once you have run the plugin from the command line and everything is working, you can add the command definition to your Nagios configuration:

define command {
    command_name                    check_varnish
    command_line                    $USER1$/check_varnish.pl $ARG1$
    register                        1
}

$USER1$ is the macro pointing to your Nagios plugins folder, and $ARG1$ holds any command line arguments you specify in the service.

define service {
    host_name                       localhost
    service_description             Varnish
    check_command                   check_varnish!--cache -w 0.6 -c 0.4
    register                        1
}

The service above will give a warning if the hit ratio goes below 0.6 and a critical if the ratio goes below 0.4.
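To make the macro substitution concrete, here is a rough shell imitation of what Nagios does with the command and service definitions. It is only an illustration. The value of $USER1$ is assumed to be /usr/lib/nagios/plugins (it is normally set in your Nagios resource file), and real Nagios supports further arguments ($ARG2$ and so on) separated by additional exclamation marks, which this sketch ignores.

```shell
# Illustrative only: mimic how a check_command becomes the command line that is run.
user1="/usr/lib/nagios/plugins"                 # assumed value of $USER1$
command_line='$USER1$/check_varnish.pl $ARG1$'  # from the command definition
check_command='check_varnish!--cache -w 0.6 -c 0.4'

# Everything after the first "!" is treated as $ARG1$ in this simplified sketch
arg1="${check_command#*!}"

# Substitute the two macros with their values
expanded=$(printf '%s\n' "$command_line" |
    sed -e 's|\$USER1\$|'"$user1"'|' -e 's|\$ARG1\$|'"$arg1"'|')

echo "$expanded"
# prints /usr/lib/nagios/plugins/check_varnish.pl --cache -w 0.6 -c 0.4
```

Running the printed command by hand as the Nagios user is a quick way to confirm that the service definition will behave as expected.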

NRPE

The line below can be used in the NRPE configuration for remote monitoring:

command[check_varnish_cache_hit]=/usr/lib/nagios/plugins/check_varnish.pl --cache -w 0.6 -c 0.4

The NRPE service could look like this:

define service {
    host_name                       varnishserver
    service_description             Varnish Cache Hit Ratio
    check_command                   check_nrpe!check_varnish_cache_hit
    register                        1
}