Nagiosgraph for Postgresql replication

5 Jun

To track PostgreSQL replication lag with Nagios, you need to create a plugin that reports the replication delay. I initially tried to read the ‘slave_lag’ directly from PostgreSQL, but permissions etc. were a pain, so I just created a cron job that dumps it to a file every 5 minutes, and the plugin reads that file. The command to read the lag from PostgreSQL itself is left in, commented out:


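#!/bin/bash

# too hard with permissions
#delay=$( sudo -u postgresql psql -h127.0.0.1 -p5433 -c "SELECT extract(epoch from now() - pg_last_xact_replay_timestamp()) AS slave_lag;" 2>/dev/null | tail -n 3 | head -n 1 | awk '{$1=$1};1')

# Last line of the cron dump; the lag in seconds is the fourth field
delay=$(tail -n 1 /tmp/postgres_lag.txt 2>/dev/null | awk '{print $4}')

# No reading, or not a number: report UNKNOWN rather than a misleading OK
if [ -z "$delay" ] || ! delay_int=$(printf "%.0f" "$delay" 2>/dev/null); then
 echo "UNKNOWN- Replication Delay: no reading available"
 exit 3
fi
output="Replication Delay: $delay seconds"

# Thresholds: up to 300s is OK, up to 2000s is WARNING, anything above is CRITICAL
if [ "$delay_int" -le 300 ]; then
 echo "OK- $output"
 exit 0
elif [ "$delay_int" -le 2000 ]; then
 echo "WARNING- $output"
 exit 1
else
 echo "CRITICAL- $output"
 exit 2
fi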
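
For reference, here is a rough sketch of what the cron side could look like. This is not my exact script: the line layout ("date time lag seconds"), the function names, the LAG_FILE variable and the "run" argument are all assumptions for illustration. The only hard requirement is that the lag ends up as the fourth whitespace-separated field of the last line, because that is what the plugin reads. It also assumes the cron user is allowed to run psql against the standby.

```shell
#!/bin/sh
# Hypothetical cron helper: appends one lag sample per run to the file the plugin reads.
# Assumed line layout: "<date> <time> lag <seconds>", so the lag is field 4.

LAG_FILE="${LAG_FILE:-/tmp/postgres_lag.txt}"

# Format one sample line; $1 is the lag in seconds.
format_lag_line() {
    printf '%s lag %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1"
}

# Ask the standby for its replay lag (same query as the commented-out line in the plugin).
# -A -t make psql print just the bare value, with no headers or alignment.
query_lag() {
    psql -h127.0.0.1 -p5433 -At -c "SELECT extract(epoch from now() - pg_last_xact_replay_timestamp());"
}

# Only query and write when called with "run", e.g. from cron.
if [ "${1:-}" = "run" ]; then
    lag=$(query_lag) && [ -n "$lag" ] && format_lag_line "$lag" >> "$LAG_FILE"
fi
```

A crontab entry along the lines of `*/5 * * * * /path/to/postgres_lag.sh run` (the path is a placeholder) would then refresh the file every 5 minutes. Nothing is written when the query returns nothing, so the plugin would keep seeing the last good sample.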

You then need to edit your nagiosgraph ‘map’ file and add this:

# Replication delay
/output:.*eplication Delay: ([.\d]+)\sseconds/
and push @s, [ 'seconds',
 [ 'data', GAUGE, $1 ] ];
