Zabbix 2.4 features, part 7 – Improved troubleshooting

We already looked at one incredibly useful feature available in 2.4: the ability to change the log level while Zabbix daemons are running. Zabbix 2.4 also provides several other improvements that should help a lot with troubleshooting.

Among the improvements packed into 2.4 are some smaller ones that can still save quite a bit of time through better validation and clearer error messages.

Configuration file validation

  • Previously, configuration file validation was a bit haphazard: some things were checked before writing to the logfile and some after, so the daemon could seemingly start successfully but then write a complaint about the configuration file to the log. With Zabbix 2.4, the configuration file is checked for problems before anything is written to the logfile, and all errors are reported to the user in the terminal.
  • Alias parameter keys in the agent configuration file are now validated to be proper Zabbix item keys. Previously, an alias could contain invalid characters, but an item with such a key could not be used.
  • It was possible to set StartPollersUnreachable to 0 while having agent, SNMP and other items configured that would actually use unreachable pollers. If a host became unreachable, it would never be checked again. This was extremely confusing, so Zabbix server and proxy now refuse to start if StartPollersUnreachable is set to 0 while regular, IPMI or Java pollers are started.
$ zabbix_server
zabbix_server [10476]: ERROR: "StartPollersUnreachable"
 configuration parameter must not be 0 if regular, IPMI or Java
 pollers are started
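
The StartPollersUnreachable rule can be sketched in a few lines. This is a minimal illustration, not the actual Zabbix implementation: the function name `validate_pollers` and the dictionary-based configuration are hypothetical, and StartPollers, StartIPMIPollers and StartJavaPollers are assumed to be the parameters that count as regular, IPMI and Java pollers. Errors are collected into a list so that they could all be printed to the terminal before any logging starts.

```python
def validate_pollers(config):
    """Return a list of error strings for a parsed configuration dict."""
    errors = []
    # Missing parameters are treated as 0 here for simplicity;
    # this does not reflect the real Zabbix defaults.
    unreachable = config.get("StartPollersUnreachable", 0)
    other_pollers = sum(
        config.get(name, 0)
        for name in ("StartPollers", "StartIPMIPollers", "StartJavaPollers")
    )
    if unreachable == 0 and other_pollers > 0:
        errors.append(
            '"StartPollersUnreachable" configuration parameter must not be 0 '
            "if regular, IPMI or Java pollers are started"
        )
    return errors


if __name__ == "__main__":
    # Hypothetical configuration that should be rejected.
    for error in validate_pollers({"StartPollers": 5, "StartPollersUnreachable": 0}):
        print("ERROR:", error)
```

Returning a list rather than exiting on the first problem makes it easy to add further checks and report every configuration error in one go.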

More detail than just ZBX_NOTSUPPORTED

If an item failed on the Zabbix agent previously, we would only get back the dreaded ZBX_NOTSUPPORTED. Zabbix agents now provide detailed information on why an item became not supported. From the user's perspective, errors that were previously visible only in the agent daemon logfile are now displayed in the frontend. Additionally, the maximum error message length has been increased from 128 to 2048 symbols, which should prevent the most useful part of the error message from being cut off in some cases.
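
To see why the larger limit matters, here is a small sketch of what truncation does to a long error message. The helper `truncate_error` and the sample message are made up for illustration, and "symbols" is assumed to mean characters; only the 128 and 2048 limits come from the text.

```python
OLD_LIMIT = 128   # maximum error message length before 2.4
NEW_LIMIT = 2048  # maximum error message length in 2.4


def truncate_error(message, limit):
    """Cut an error message down to at most `limit` characters."""
    return message[:limit]


if __name__ == "__main__":
    # Made-up message whose most useful part (the reason) comes at the end.
    message = "Cannot process item: " + "x" * 150 + " reason: permission denied"
    for limit in (OLD_LIMIT, NEW_LIMIT):
        kept = truncate_error(message, limit)
        print(limit, kept.endswith("permission denied"))
```

With the old limit the trailing reason is lost, while the new limit keeps the whole message.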

On the trigger side, if evaluation of the nodata() trigger function fails due to a lack of data on the server, a more informative message is displayed. The previously vague message "Evaluation failed for function…" has been extended with "item does not have enough data after server start or item creation".

Note that the error message length for triggers is still limited to 128 symbols, and truncation of the value can be seen in the screenshot above. Feature request ZBXNEXT-2445 asks to increase the length of this field.

Starting daemon with the wrong database

Previously, starting Zabbix server with a proxy database would fail with messages that could cause varying levels of confusion. Starting a proxy with a server database would result in either incorrect operation or a proxy crash, depending on the proxy version. Incorrect operation could even result in nearly complete configuration loss if the proxy was pointed at a production server database.

In 2.4, both daemons check whether the database matches their expectations and gracefully refuse to start if it does not:

 11021:20140907:185445.167 cannot use database "zabbix_this":
  Zabbix server cannot work with a Zabbix proxy database
 18155:20140907:185909.249 cannot use database "zabbix_that":
  Zabbix proxy cannot work with a Zabbix server database
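
A check of this kind can be sketched as follows. How Zabbix actually tells the two databases apart is not described here; this sketch assumes, purely for illustration, that a server database has at least one row in a `users` table while a proxy database has none. The function names and the use of SQLite are hypothetical, chosen only to keep the example self-contained.

```python
import sqlite3


def database_kind(conn):
    """Guess 'server' or 'proxy' from the users table (assumed heuristic)."""
    row = conn.execute("SELECT 1 FROM users LIMIT 1").fetchone()
    return "server" if row is not None else "proxy"


def check_database(conn, daemon, dbname):
    """Return None if `daemon` may use the database, else an error string."""
    kind = database_kind(conn)
    if kind == daemon:
        return None
    return (
        'cannot use database "%s": Zabbix %s cannot work with a Zabbix %s database'
        % (dbname, daemon, kind)
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # An empty users table stands in for a proxy database in this sketch.
    conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, alias TEXT)")
    error = check_database(conn, "server", "zabbix_this")
    if error is not None:
        print(error)
```

The important design point is that the check runs before the daemon touches any data, so a mismatch leads to a clean refusal to start rather than a crash or configuration loss.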