Zabbix 2.2 features, part 11 – SNMP monitoring improvements

Zabbix offers many methods for data gathering, including SNMP. SNMP has been a popular protocol for many years and will probably stay that way: it’s used on routers, switches, UPS devices, storage arrays and lots of other devices. Zabbix 2.2 will improve the existing SNMP support in several ways.

SNMPv3 related improvements

SNMP versions 1 and 2c are still popular, but offer very weak security measures. SNMPv3, among other improvements, provides better security and has been gaining popularity. Zabbix 2.2 has two SNMPv3 related improvements.

Context name support

SNMPv3 offers the ability to specify a “context”. Context is a way to identify one of many identical or similar entities behind a single SNMP endpoint. Citing from RFC 3411:

An SNMP context, or just “context” for short, is a collection of management information accessible by an SNMP entity. An item of management information may exist in more than one context. An SNMP entity potentially has access to many contexts.

For example, a UPS management station might be connected to several individual devices, but expose only a single SNMP interface. These devices would have identical OIDs for some things (system description, temperature), so a plain SNMP request would have no way to say which device we want the information about. This is where SNMP context comes into play: here the context would be the individual UPS device name.
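The UPS scenario above can be modelled with a small sketch. This is not Zabbix or Net-SNMP code: the `endpoint` table, the `lookup` function, the device names `ups-a` and `ups-b`, the OID and the readings are all made up for illustration. It only shows why the same OID needs a context name to become unambiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one SNMP endpoint that fronts several UPS units. */
struct reading {
    const char *context; /* SNMPv3 context name, here the UPS device name */
    const char *oid;     /* the same OID is reused in every context */
    int value;           /* illustrative temperature reading (made-up data) */
};

static const struct reading endpoint[] = {
    {"ups-a", "1.3.6.1.2.1.33.1.2.7.0", 24},
    {"ups-b", "1.3.6.1.2.1.33.1.2.7.0", 31},
};

/* Returns 0 and stores the reading if (context, oid) exists, -1 otherwise. */
int lookup(const char *context, const char *oid, int *value)
{
    size_t i;

    assert(context != NULL && oid != NULL && value != NULL);

    for (i = 0; i < sizeof(endpoint) / sizeof(endpoint[0]); i++) {
        if (strcmp(endpoint[i].context, context) == 0 &&
            strcmp(endpoint[i].oid, oid) == 0) {
            *value = endpoint[i].value;
            return 0;
        }
    }
    return -1; /* unknown context or OID */
}
```

Calling `lookup` with the same OID but the contexts `ups-a` and `ups-b` yields two different readings, which is the distinction the context name provides.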

Starting with Zabbix 2.2, for SNMPv3 a context name may be specified. It is supported in:

  • normal items
  • low level discovery rules
  • low level discovery item prototypes
  • network discovery

[Screenshot: Specifying SNMP context in network discovery configuration]

To allow easy management of devices where multiple items would need to use context, user macros may be used in the context field in all of the supported locations.

SHA/AES support

Currently, Zabbix supports only MD5 for authentication and DES for privacy in SNMPv3. More secure methods are available and have been requested by Zabbix users, so 2.2 will also support SHA for authentication and AES for privacy.

[Screenshot: AES and SHA support in network discovery configuration]

SNMP retry and timeout changes

Currently, Zabbix server or proxy does not specify a timeout or retry count for SNMP operations, so the defaults from the Net-SNMP library are used. The default timeout is one second and the default retry count is 5, thus a check which takes a bit longer than one second never succeeds, but is still attempted 6 times in total. Additionally, this significantly slows down SNMP network discovery, as any device not responding to SNMP still takes around 6 seconds to probe (for each SNMP check).

Starting with Zabbix 2.2, the daemons will use the Timeout parameter value from the corresponding configuration file (defaulting to 3 seconds) and will attempt no retries if the request fails (for example, if it times out because of network issues or incorrect credentials).
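The arithmetic behind these numbers can be sketched as follows. The function names `worst_case_seconds` and `timeout_to_usec` are hypothetical helpers, not part of Zabbix or Net-SNMP; the sketch assumes that every attempt against an unresponsive device waits for the full timeout.

```c
#include <assert.h>

/* Worst-case seconds spent on one SNMP check against a silent device:
 * the first attempt plus each retry waits for the full timeout. */
int worst_case_seconds(int timeout_s, int retries)
{
    assert(timeout_s > 0 && retries >= 0);
    return timeout_s * (retries + 1);
}

/* Converts a timeout in seconds to microseconds, the unit in which
 * the per-attempt session timeout is expressed. */
long timeout_to_usec(int timeout_s)
{
    assert(timeout_s > 0);
    return (long)timeout_s * 1000L * 1000L;
}
```

With the Net-SNMP defaults, `worst_case_seconds(1, 5)` gives 6 seconds per check; with the Zabbix 2.2 behaviour and the default Timeout, `worst_case_seconds(3, 0)` gives 3 seconds, and a slow but healthy device gets up to 3 seconds to answer instead of 1.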

Complex OIDs for low level discovery

Currently, SNMP low level discovery uses only the last number of the OID as the index. This causes problems when the index consists of more than one number. For example, in the following OIDs the last two numbers together represent the index:

CISCO-POP-MGMT-MIB::cpmDS1ActiveDS0s.6.0
CISCO-POP-MGMT-MIB::cpmDS1ActiveDS0s.6.1
CISCO-POP-MGMT-MIB::cpmDS1ActiveDS0s.7.0

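When discovering by OID CISCO-POP-MGMT-MIB::cpmDS1ActiveDS0s, Zabbix would create items for the first two OIDs with the indexes 0 and 1, then fail to create an item for the third OID, because its last number, 0, is already taken by the first one. In Zabbix 2.2, the full part of the OID that follows the discovered OID will be used as the index.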
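The difference between the two approaches can be sketched with plain string handling. This is an illustration only, not the Zabbix implementation: `last_component` and `full_index` are hypothetical helper names, and the sketch assumes OIDs are compared in their textual form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old approach as described above: keep only the text after the last dot. */
const char *last_component(const char *oid)
{
    const char *dot;

    assert(oid != NULL);

    dot = strrchr(oid, '.');
    return dot != NULL ? dot + 1 : oid;
}

/* New approach as described above: everything after "<base>." is the index.
 * Returns NULL if oid does not lie below base. */
const char *full_index(const char *base, const char *oid)
{
    size_t n;

    assert(base != NULL && oid != NULL);

    n = strlen(base);
    if (strncmp(oid, base, n) != 0 || oid[n] != '.')
        return NULL;
    return oid + n + 1;
}
```

For the three OIDs listed earlier, `last_component` returns `0`, `1` and `0`, so the first and third collide, while `full_index` with the base `CISCO-POP-MGMT-MIB::cpmDS1ActiveDS0s` returns `6.0`, `6.1` and `7.0`, which are all distinct.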

Additionally, strings as indexes will be supported in Zabbix 2.2.

8 thoughts on “Zabbix 2.2 features, part 11 – SNMP monitoring improvements”

  1. Have I already mentioned I love you guys?

    This is way cool! The Complex OID’s for LLD will revitalize my idea of monitoring CDP neighbours. (as Cisco uses the second last character for index)
    In a large and complex network, revealing the root cause of an outage can be a pain.
    300 “device offline” and one “Neighbour lost” triggers active. Filter, and go straight to repair on the port or cable in question. 🙂

  2. > Starting with Zabbix 2.2, the daemons will use the Timeout parameter value from the corresponding configuration file (defaulting to 3 seconds) and will attempt no retries if the request failed (for example, timed out because of network issues or incorrect credentials).

    In Zabbix 2.0 we used Timeout = 30 seconds so that external checks could run. SNMP checks worked fine, because they used the default timeout of the SNMP library.

    Now, in Zabbix 2.2, we have trouble: Timeout = 30 is too big, and we need very many pollers to run the checks. With Timeout = 3 we need the same number of pollers as before, but external checks do not work, because they need more than 3 seconds to complete!

    How can we solve this?

  3. I solved my trouble by editing the source code. I added a separate SNMPTimeout parameter for SNMP checks. Here is my trivial patch:

    diff -Naur old/zabbix_proxy/proxy.c src/zabbix_proxy/proxy.c
    --- old/zabbix_proxy/proxy.c    2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_proxy/proxy.c    2013-12-04 15:20:53.000000000 +0600
    @@ -43,6 +43,7 @@
     #include "../zabbix_server/pinger/pinger.h"
     #include "../zabbix_server/poller/poller.h"
     #include "../zabbix_server/poller/checks_ipmi.h"
    +#include "../zabbix_server/poller/checks_snmp.h"
     #include "../zabbix_server/trapper/trapper.h"
     #include "../zabbix_server/snmptrapper/snmptrapper.h"
     #include "proxyconfig/proxyconfig.h"
    diff -Naur old/zabbix_server/poller/checks_snmp.c src/zabbix_server/poller/checks_snmp.c
    --- old/zabbix_server/poller/checks_snmp.c      2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_server/poller/checks_snmp.c      2013-12-04 15:22:28.000000000 +0600
    @@ -33,6 +33,7 @@
     }
     zbx_snmp_index_t;
     
    +int CONFIG_SNMP_TIMEOUT;
     static zbx_snmp_index_t        *snmpidx = NULL;
     static int             snmpidx_count = 0, snmpidx_alloc = 16;
     
    @@ -268,10 +269,10 @@
                            break;
            }
     
    -       session.retries = 0;                            /* number of retries after failed attempt */
    -                                                       /* (net-snmp default = 5) */
    -       session.timeout = CONFIG_TIMEOUT * 1000 * 1000; /* timeout of one attempt in microseconds */
    -                                                       /* (net-snmp default = 1 second) */
    +       session.retries = 0;                                    /* number of retries after failed attempt */
    +                                                               /* (net-snmp default = 5) */
    +       session.timeout = CONFIG_SNMP_TIMEOUT * 1000 * 1000;    /* timeout of one attempt in microseconds */
    +                                                               /* (net-snmp default = 1 second) */
     
     #ifdef HAVE_IPV6
            if (SUCCEED != get_address_family(item->interface.addr, &family, err, MAX_STRING_LEN))
    diff -Naur old/zabbix_server/poller/checks_snmp.h src/zabbix_server/poller/checks_snmp.h
    --- old/zabbix_server/poller/checks_snmp.h      2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_server/poller/checks_snmp.h      2013-12-04 15:21:18.000000000 +0600
    @@ -26,7 +26,7 @@
     #include "sysinfo.h"
     
     extern char    *CONFIG_SOURCE_IP;
    -extern int     CONFIG_TIMEOUT;
    +extern int     CONFIG_SNMP_TIMEOUT;
     
     int    get_value_snmp(DC_ITEM *item, AGENT_RESULT *value);
     
    diff -Naur old/zabbix_server/server.c src/zabbix_server/server.c
    --- old/zabbix_server/server.c  2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_server/server.c  2013-12-04 15:19:47.000000000 +0600
    @@ -44,6 +44,7 @@
     #include "pinger/pinger.h"
     #include "poller/poller.h"
     #include "poller/checks_ipmi.h"
    +#include "poller/checks_snmp.h"
     #include "timer/timer.h"
     #include "trapper/trapper.h"
     #include "snmptrapper/snmptrapper.h"

  4. A fixed and more general patch with an added SNMPRetries parameter:

    diff -Naur old/zabbix_proxy/proxy.c src/zabbix_proxy/proxy.c
    --- old/zabbix_proxy/proxy.c    2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_proxy/proxy.c    2013-12-04 16:36:49.000000000 +0600
    @@ -43,6 +43,7 @@
     #include "../zabbix_server/pinger/pinger.h"
     #include "../zabbix_server/poller/poller.h"
     #include "../zabbix_server/poller/checks_ipmi.h"
    +#include "../zabbix_server/poller/checks_snmp.h"
     #include "../zabbix_server/trapper/trapper.h"
     #include "../zabbix_server/snmptrapper/snmptrapper.h"
     #include "proxyconfig/proxyconfig.h"
    @@ -407,6 +408,10 @@
     #endif
                    {"Timeout",                     &CONFIG_TIMEOUT,                        TYPE_INT,
                            PARM_OPT,       1,                      30},
    +               {"SNMPTimeout",                 &CONFIG_SNMP_TIMEOUT,                   TYPE_INT,
    +                       PARM_OPT,       1,                      30},
    +               {"SNMPRetries",                 &CONFIG_SNMP_RETRIES,                   TYPE_INT,
    +                       PARM_OPT,       1,                      10},
                    {"TrapperTimeout",              &CONFIG_TRAPPER_TIMEOUT,                TYPE_INT,
                            PARM_OPT,       1,                      300},
                    {"UnreachablePeriod",           &CONFIG_UNREACHABLE_PERIOD,             TYPE_INT,
    diff -Naur old/zabbix_server/poller/checks_snmp.c src/zabbix_server/poller/checks_snmp.c
    --- old/zabbix_server/poller/checks_snmp.c      2013-12-04 15:19:26.000000000 +0600
    +++ src/zabbix_server/poller/checks_snmp.c      2013-12-04 16:33:11.000000000 +0600
    @@ -33,6 +33,8 @@
     }
     zbx_snmp_index_t;
     
    +int CONFIG_SNMP_TIMEOUT;
    +int CONFIG_SNMP_RETRIES;
     static zbx_snmp_index_t        *snmpidx = NULL;
     static int             snmpidx_count = 0, snmpidx_alloc = 16;
     
    @@ - #include "pinger/pinger.h"
     #include "poller/poller.h"
     #include "poller/checks_ipmi.h"
    +#include "poller/checks_snmp.h"
     #include "timer/timer.h"
     #include "trapper/trapper.h"
     #include "snmptrapper/snmptrapper.h"
    @@ -360,6 +361,10 @@
     #endif
                    {"Timeout",                     &CONFIG_TIMEOUT,                        TYPE_INT,
                            PARM_OPT,       1,                      30},
    +               {"SNMPTimeout",                 &CONFIG_SNMP_TIMEOUT,                   TYPE_INT,
    +                       PARM_OPT,       1,                      30},
    +               {"SNMPRetries",                 &CONFIG_SNMP_RETRIES,                   TYPE_INT,
    +                       PARM_OPT,       1,                      10},
                    {"TrapperTimeout",              &CONFIG_TRAPPER_TIMEOUT,                TYPE_INT,
                            PARM_OPT,       1,                      300},
                    {"UnreachablePeriod",           &CONFIG_UNREACHABLE_PERIOD,             TYPE_INT,
