One of the questions for those of us who use Zabbix on a large scale is “Just how much data can Zabbix ingest before it blows up spectacularly?” Some of the work I’ve been doing lately revolves around that question. I have an extremely large environment (more than 32,000 devices) that could potentially be monitored entirely by Zabbix in the future.
Zabbix offers a lot of methods for data gathering, including SNMP. SNMP has been a popular protocol for many years and probably will stay that way – it’s used on routers, switches, UPS devices, storage arrays and lots of other devices. Zabbix 2.2 will improve the existing SNMP support in several ways.
Some might recall that back in 2011 we dug into old logfiles and produced a five-year graph of the user count in the #zabbix IRC channel. At the same time, monitoring at a higher rate – hourly – was set up, and data collection started. Now that two years have passed since that graph, let’s take a look at the new one, how the user count has changed in that time and how Zabbix copes with a seven-year graph.
Zabbix trigger expressions provide an incredibly flexible way of defining problem conditions. If you can express your problem in plain English or any other human language, there is a good chance it can be represented using triggers.
I’ve noticed that even experienced Zabbix users are not always aware of the true power of triggers. This article is about defining problems in a smart way so that all alerts generated by Zabbix are about real issues. No more flapping, no more false alarms. Interested?
Zabbix comes with an impressive list of supported metrics for virtually all platforms. It covers the monitoring of operating system performance and availability, including CPU, memory, network, processes, files, kernel parameters and more. Zabbix also performs agent-less checks for well-known services such as FTP, SSH, IMAP, POP3, HTTP, TCP, etc.
The Monitoring -> Latest data page in the Zabbix frontend lets you see item values. Items are grouped by application (if assigned), and the groups can be expanded and collapsed. Previously, any such operation would result in a full page reload. Zabbix 2.2 will make this happen without a page reload.
In previous articles in the Zabbix 2.2 series we already discussed several improvements for web monitoring – the ability to template it, to customise the number of retries and to specify an HTTP proxy on the scenario level. There’s more – in 2.2 it will also be possible to parse content from a page and reuse it in further scenario steps.
In the previous article in the Zabbix 2.2 series we explored a new ability to extract values from a webpage. This was not the only feature that was extended this way – several other items gained similar functionality, notably file content parsing and logfile parsing. The latter has been a popular feature request and should be good news for many users.
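The idea of parsing content from one page and reusing it in a later step can be sketched roughly as follows. This is only an illustration of the concept in Python, not how Zabbix implements web scenarios: the helper names `extract_variable` and `fill_step`, the `{sid}` placeholder syntax and the sample HTML are all assumptions made for the example.

```python
import re

def extract_variable(body, pattern):
    """Return the first capture group of pattern found in body, or None."""
    match = re.search(pattern, body)
    return match.group(1) if match else None

def fill_step(template, variables):
    """Substitute {name} placeholders in a later step with extracted values."""
    for name, value in variables.items():
        template = template.replace("{" + name + "}", value)
    return template

# Hypothetical body returned by the first scenario step.
login_page = '<form><input type="hidden" name="sid" value="a1b2c3"></form>'

# Pull the session id out of the page with a regular expression...
variables = {"sid": extract_variable(login_page, r'name="sid" value="([^"]+)"')}

# ...and reuse it in the POST data of a later step.
next_post = fill_step("sid={sid}&action=logout", variables)
print(next_post)
```

The point of the sketch is the data flow: a value that is only known after the first request (here a session id) becomes an input to a subsequent request.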
And now for some more detail on changes for item keys vfs.file.regexp[], vfs.file.regmatch[], log[] and logrt[].
Zabbix has long been able to check whether a webpage contains a specific string – using the web.page.regexp[] agent item one could verify whether page contents match a regular expression and return the matched string. But what if multiple matches were possible and we were interested in a specific one? There was no built-in way to do that, but one is coming in Zabbix 2.2.
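To make the problem concrete, here is a minimal Python sketch of picking one specific match, and one specific capture group, out of several. It does not reproduce the Zabbix agent's implementation or the item key's parameter syntax; the function `pick_match`, its `occurrence` and `group` parameters and the sample page text are hypothetical.

```python
import re

def pick_match(text, pattern, occurrence=1, group=0):
    """Return the requested capture group of the Nth match, or None.

    occurrence is 1-based; group 0 means the whole matched string.
    """
    matches = list(re.finditer(pattern, text))
    if occurrence < 1 or occurrence > len(matches):
        return None
    return matches[occurrence - 1].group(group)

# Hypothetical page content with several possible matches.
page = "node1 load=0.42\nnode2 load=1.37\nnode3 load=0.08\n"

# Returning only the first matched string loses the other values.
first = pick_match(page, r"load=([0-9.]+)")

# Selecting the second match and only its captured number.
second_value = pick_match(page, r"load=([0-9.]+)", occurrence=2, group=1)

print(first)
print(second_value)
```

Returning just the captured group rather than the whole matched string is what makes the result usable as a numeric value.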
So who cares about monitoring the environment?
Exciting news! After a lot of hard work and hundreds of cups of coffee we’re proud to announce that the new documentation of the Zabbix API is complete. The improved API documentation provides both a high-level overview of the available methods and an in-depth description of each method.