Friday, August 5, 2011

How SNMP probes in InterMapper determine lost packets or "No SNMPvX Response"

Here is an excellent write-up by William W. Fisher about the behavior of SNMP probes in InterMapper. As we are building our DC Insight solution on top of InterMapper, a thorough understanding of InterMapper is very important.
I would like to republish it here for those who are interested in a deeper understanding of how InterMapper behaves.
InterMapper polls devices by sending a request packet and waiting for a response packet. If InterMapper sends a request packet and does not receive a response within the specified timeout (usually 3 seconds), it counts that packet as 'lost' and retries the request. If InterMapper's request fails to elicit a response three consecutive times (three is the default), the device's status is set to down.
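The polling rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not InterMapper's actual code: the function name poll_device and the send_request callback are hypothetical, and the defaults simply mirror the 3-second timeout and three attempts mentioned above.

```python
from typing import Callable, Tuple


def poll_device(send_request: Callable[[float], bool],
                timeout: float = 3.0,
                max_attempts: int = 3) -> Tuple[str, int]:
    """Run one poll cycle and return (status, lost_packets).

    send_request is a hypothetical callback: it sends one request,
    waits up to `timeout` seconds, and returns True if a response
    arrived in time.
    """
    lost = 0
    for _ in range(max_attempts):
        if send_request(timeout):
            return "up", lost
        # No response within the timeout: count the packet as lost
        # and retry.
        lost += 1
    # Every consecutive attempt went unanswered.
    return "down", lost
```

A device that misses one request but answers the retry stays up with one lost packet; a device that misses all three attempts is set to down.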
With an SNMP probe, the lost packets are SNMP packets. There are three possibilities for where the packet was 'lost':
1. The request didn't reach the target device.
2. The target device did not generate a response within 3 seconds.
3. The response did not make it back to InterMapper.
The SNMP probe is slightly complicated by the fact that the final retry will be a ping packet instead of SNMP. We implemented it this way after finding that some devices do not reliably answer SNMP packets on time. For example, a busy router might leave SNMP packets unanswered, but answer pings immediately. (Responding to an SNMP query is more computationally intensive than answering an ICMP echo.) A device that answers the final ping retry is marked as "No SNMP response".
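To make the ping fallback concrete, here is a minimal sketch of how the three outcomes could be told apart. It assumes the default of three attempts with the last one sent as a ping; the names probe_status, snmp_get and ping are hypothetical stand-ins, not InterMapper functions, and each callback is assumed to return True when a reply arrives within the timeout.

```python
from typing import Callable


def probe_status(snmp_get: Callable[[], bool],
                 ping: Callable[[], bool],
                 attempts: int = 3) -> str:
    """Classify a device after one poll cycle.

    Assumption for this sketch: all attempts except the last are
    SNMP requests, and the final retry is an ICMP echo (ping).
    """
    for _ in range(attempts - 1):
        if snmp_get():
            return "up"
    # Final retry: fall back to a ping instead of SNMP.
    if ping():
        # The device is reachable but did not answer SNMP in time.
        return "No SNMP response"
    return "down"
```

With this split, a busy router that ignores SNMP but answers the ping ends up as "No SNMP response" rather than down.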
If pings get through fine, but an occasional SNMP packet is lost to one particular device, my sense is that nothing is wrong with the network. I would advise that you increase the threshold for packet loss for that one device to 10% and leave it at that.
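As a rough illustration of what a 10% packet-loss threshold means, the sketch below computes a loss percentage from sent and lost counts and compares it with a per-device threshold. The function names are hypothetical, and whether InterMapper uses a strict "greater than" comparison or a particular sampling window is an assumption here, not something stated above.

```python
def loss_percent(sent: int, lost: int) -> float:
    """Percentage of request packets that went unanswered."""
    if sent == 0:
        return 0.0
    return 100.0 * lost / sent


def exceeds_threshold(sent: int, lost: int,
                      threshold_percent: float = 10.0) -> bool:
    """True if the loss is above the threshold (strict comparison assumed)."""
    return loss_percent(sent, lost) > threshold_percent
```

For example, a device that drops 3 SNMP packets out of 100 stays under a 10% threshold, while one that drops 12 out of 100 goes over it.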
Let me know if this helps.


Bill Fisher
Dartware, LLC