Getting alerts from an offline node

rad
Posts: 11
Joined: Wed Jan 20, 2010 8:48 pm

Post by rad » Wed Jan 20, 2010 9:05 pm

Greetings. I am running FreeNATS 1.04.6b via the virtual appliance. One of the servers I have been monitoring with it has been offline for a few days now, but I was still getting alerts about it from FreeNATS. Every so often I would get a ping failed email, followed shortly by an alert closed message. I checked the node history and found the chart showing activity spikes anywhere from several times an hour to one or two every few hours, with a top value listed as 0.1. I connected to the VM, logged in, and ran a continuous ping against the downed server to see if I was getting a response from a router or something else that FreeNATS might be misinterpreting as a response from the node. While running those pings, I checked the node history again and saw a new activity spike. I checked back through the pings in the SSH session to the VM and found nothing but ping failures.

Any thoughts on what could be causing the activity spikes to be recorded? The VM is running on a VMware vSphere 4 host.
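
One way to make the comparison described above less manual is to timestamp every probe, so that any stray reply can be lined up against a spike in the node history graph. The following is only a minimal sketch and not part of FreeNATS: it assumes the Linux iputils ping (with its -c count and -W timeout options), and the function names, log path and example address are made up for illustration.

```shell
#!/bin/sh
# Sketch: log one timestamped line per probe so replies can be matched
# against spikes in the FreeNATS history graph.
# Assumption: iputils ping on Linux (-c = count, -W = timeout in seconds).

# format_result STATUS HOST: print one log line for a single probe.
# STATUS is the exit status of ping (0 means a reply was received).
format_result() {
    if [ "$1" -eq 0 ]; then
        state="REPLY"
    else
        state="no-reply"
    fi
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$2" "$state"
}

# watch_host HOST [LOGFILE]: probe roughly once per second until Ctrl-C.
# The default log path is a hypothetical choice.
watch_host() {
    host=$1
    log=${2:-/tmp/ping-watch.log}
    while :; do
        ping -c 1 -W 1 "$host" >/dev/null 2>&1
        format_result "$?" "$host" >>"$log"
        sleep 1
    done
}

# Example use (192.0.2.10 is a placeholder address):
#   watch_host 192.0.2.10
#   grep REPLY /tmp/ping-watch.log
```

If the grep finds nothing for the minutes where the graph shows a spike, that suggests the system ping and the FreeNATS test are seeing different things.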

dave
Site Admin
Posts: 260
Joined: Fri May 30, 2008 9:09 pm
Location: UK

Re: Getting alerts from an offline node

Post by dave » Thu Jan 21, 2010 7:31 am

Hi,

Well, it's either some sort of paranormal sense (I see dead servers...) or, more likely, a bug, though an unusual one that I'm surprised hasn't shown up before.

A few questions:

- Are the spike/pass results showing 0.1 or 0.001?
- Are the units showing as seconds (s) or milliseconds (ms)?
- How "far" (in logical terms - how many hops) away is the node?
- When you tried to ping the host, I'm presuming you used the standard GNU ping? If so, see below

Ping test:

Can you try the following, which will test the in-built ping functionality rather than the (no doubt actually unbuggy) GNU version:

- Download freenats-1.04.6b.tar.gz, expand it and copy the server/test directory into /opt/freenats/server/ (see below for step-by-step)
- cd to /opt/freenats/server/test
- As root do: ./ping.sh [hostname-or-ip]
- This will continuously use the internal FreeNATS ping against the host (Ctrl-C to stop) - see if this has the same random return results
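
If the ping.sh results need to be compared with the history graph afterwards, each output line can be prefixed with a timestamp. This is a small sketch that assumes only that ping.sh writes one result per line (its exact output format is not shown in this thread); stamp_lines, the log path and the example address are hypothetical.

```shell
#!/bin/sh
# Sketch: prefix every line read from stdin with the current time.
# Assumption: the command being piped in prints one result per line.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example use, as root in /opt/freenats/server/test
# (192.0.2.10 is a placeholder address):
#   ./ping.sh 192.0.2.10 | stamp_lines | tee /tmp/freenats-ping.log
```

Output may arrive in bursts if the piped command buffers its output, so the timestamps are best treated as approximate.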

Depending on what we find, I'll have a poke in the code and/or send you a debug version of something to try and get some more details on the phantom returns.

Cheers,

Dave.

Step-by-Step download+install test instructions:

cd /tmp
wget http://www.purplepixie.org/freenats/dow ... .6b.tar.gz
gzip -d freenats-1.04.6b.tar.gz
tar -xvf freenats-1.04.6b.tar
cd freenats-1.04.6b/server
cp -Rf test /opt/freenats/server/

Post by rad » Thu Jan 21, 2010 3:47 pm

Thanks for getting back to me.

1. The results are showing as 0.1
2. It looks like the graph is showing the units as milliseconds. Here's a screen cap of the graph for yesterday, when the server was offline:

[Image: screen capture of the node history graph]

3. It's in the same LAN.
4. Probably. I just used the ping command at the CUI console.

I'll try the rest of your idea later and get back to you when I do.

Post by dave » Fri Jan 22, 2010 5:40 pm

Hi,

OK, having spent a little while reading up on the ICMP message specs, I have made a tweak which might address your problem.

It's in the latest alpha 1.08.1a.

The chances are you can go ahead and upgrade your production box to this version, but if you have loads of config you might like to back it up first or even run the alpha on a second VM.

Instructions on downloading a custom version to a VM are here: http://www.purplepixie.org/freenats/wik ... ic_Version

The download URL would be http://www.purplepixie.org/freenats/dow ... .1a.tar.gz in the wget line.

Let me know how you get on.

Regards,

Dave.

Post by rad » Fri Jan 22, 2010 5:46 pm

Thanks, Dave. I'll try to set up a second VM and install that update this afternoon.

Post by rad » Tue Jan 26, 2010 4:49 pm

Hi, Dave. I've copied the test folder from 1.08.1a into the server folder on my dev VM. I'll let you know how things go.

As an update, we've had a couple nodes at remote sites go offline for a little while due to site power failures over the last few days. During those times, I had the same issue occur. However, after looking at the results, it appears that I am only getting the ghost alerts from servers. Nodes for network switches, UPS management cards, and remote access cards inside the servers do not have any ghost activity spikes recorded in the history graph during the time that the site was offline.

Post by dave » Tue Jan 26, 2010 5:08 pm

Hi,

Curiouser and curiouser.

Please make sure you update the entire dev VM to 1.08.1a, not just the test directory. In fact, you don't need to copy that - just configure a node (or nodes) in 1.08.1a and see what happens when they're off.

Cheers,

Dave.

Post by rad » Tue Jan 26, 2010 5:20 pm

Will do. Thanks.

Post by rad » Wed Jan 27, 2010 4:32 pm

I upgraded my dev VM to 1.08.1a and will keep you posted on how things go. Everything seems to be working fine after the upgrade, regardless of all the MySQL errors that went by during the schema upgrade.

Post by dave » Wed Jan 27, 2010 6:29 pm

Hi - the MySQL errors are par for the course owing to the way I upgrade (a "rough" upgrade via myrug).

Let me know if it still sees phantoms.

Cheers,

Dave.
