Zabbix has long been able to check whether a webpage contains a specific string: using the web.page.regexp[] agent item, one could verify whether page contents match a regular expression and return the matched string. But what if multiple matches were possible and we were interested in a specific one? There was no built-in way to do that, but one is coming in Zabbix 2.2.
Articles in 2.2 feature series:
- Part 1 – Automatic database upgrading
- Part 2 – Templated web monitoring
- Part 3 – Web scenario retries
- Part 4 – HTTP proxy for web monitoring
- Part 5 – Better value mapping
- Part 6 – Returning values from webpages
- Part 7 – Value extracting from logfiles and more
- Part 8 – Reusing content in web monitoring
- Part 9 – No more full page reload in latest data
- Part 10 – Support of loadable modules
- Part 11 – SNMP monitoring improvements
If you have an application that exposes some internal variables using a webpage – for example, like this:
application.sessions.free=685
application.sessions.active=1013
application.cache.free=425
– you might want to get actual values for nice graphs, easy triggering and so on. But a regular expression of [0-9]+ would not be able to distinguish between these lines.
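To see the ambiguity concretely, here is a small Python sketch. It uses Python's re module, which is only a stand-in for the regular expression engine the Zabbix agent uses, and the PAGE constant is simply the sample output above. Treat it as an illustration of the problem, not of Zabbix internals.

```python
import re

# Sample status page from the example above, one variable per line.
PAGE = (
    "application.sessions.free=685\n"
    "application.sessions.active=1013\n"
    "application.cache.free=425\n"
)

# A bare number pattern matches on every line, so the first hit wins.
first_number = re.search(r"[0-9]+", PAGE).group(0)
all_numbers = re.findall(r"[0-9]+", PAGE)

print(first_number)  # 685, the free session count, not the active one
print(all_numbers)   # ['685', '1013', '425']
```

The pattern alone carries no information about which variable a number belongs to, which is why the surrounding text has to become part of the expression.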
Currently that is possible using user parameters, external checks or Zabbix sender. Zabbix 2.2 will make this even easier by providing a built-in method, available as an extension to the existing web.page.regexp[] item. The current syntax accepts 5 parameters:
web.page.regexp[host, <path>, <port>, <regexp>, <length>]
This is extended in 2.2 by adding a sixth parameter: output. This parameter allows specifying individual regular expression subgroups, making it easy to extract the desired number in our example.
Confusing? Let’s look at an item key we could use to extract the active session count:
web.page.regexp[host,application/status,12345,"application.sessions.active=([0-9]+)",,\1]
The first three parameters are simple: they specify the host, path and port for the page we want to reach. The fourth parameter is the regular expression, and it is a bit more interesting. Notice how we look for the specific text we need the value for, application.sessions.active= . After that, a subgroup in parentheses captures all digits that follow the equals sign. As the regular expression includes square brackets (which are also used to enclose item key parameters), we had to double-quote it, which is a good idea in other cases, too. We skip the fifth parameter, length.
The new, sixth, parameter is set to \1 . This tells Zabbix to extract the first subgroup and use that as the result. In our case there is only a single subgroup, so we don’t have much choice there.
As a result, Zabbix will extract the session count number only and we will be able to see graphs and write triggers easily.
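The effect of the regexp and output parameters can be approximated in a few lines of Python. This is a sketch under assumptions: REGEXP and OUTPUT are hypothetical names standing in for the fourth and sixth item key parameters, and Python's re module is not the engine Zabbix uses, although this simple pattern behaves the same way in both.

```python
import re

# Sample status page from the earlier example.
PAGE = (
    "application.sessions.free=685\n"
    "application.sessions.active=1013\n"
    "application.cache.free=425\n"
)

# Hypothetical stand-ins for the regexp and output item key parameters.
REGEXP = r"application.sessions.active=([0-9]+)"
OUTPUT = r"\1"

match = re.search(REGEXP, PAGE)
# Match.expand() fills backreferences such as \1 with the captured subgroups.
value = match.expand(OUTPUT) if match else None

print(value)  # 1013
```

Because the literal text before the equals sign is part of the pattern, only the line for active sessions can match, and the subgroup keeps just the number.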
Additional notes
While the basic usage is simple enough, there are some extra things that can be done and are worth knowing.
Adding extra strings
As some might have wondered, this syntax for the new parameter seems to imply that we can actually prepend or append other strings to the extracted value, and that’s true. For example, setting the last parameter to Session count: \1 would store a human-readable string. Of course, that could not be graphed and triggers would become much more complicated, so in this case it is not recommended.
Reordering data
Similarly, we may also reorder the output by shuffling around the subgroup references. For example, the following item key would return the session count first, followed by a colon, a space and the application.sessions.active string (which was the first matched subgroup):
web.page.regexp[127.0.0.1,status,80,"(application.sessions.active)=([0-9]+)",,\2: \1]
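Both ideas, wrapping extra text around a subgroup and swapping the order of subgroups, can be tried out with Python's Match.expand(). As before, this only mimics the described behaviour with Python's re module; STATUS_LINE is an assumed sample of what the page returns.

```python
import re

# Assumed sample line from the status page.
STATUS_LINE = "application.sessions.active=1013"

# Two subgroups: the variable name and its numeric value.
REGEXP = r"(application.sessions.active)=([0-9]+)"

match = re.search(REGEXP, STATUS_LINE)

# Swap the order of the subgroups, as in the item key above.
reordered = match.expand(r"\2: \1")
# Surround a subgroup with literal text.
labelled = match.expand(r"Session count: \2")

print(reordered)  # 1013: application.sessions.active
print(labelled)   # Session count: 1013
```

Note that with two subgroups in the expression the number is now \2, not \1.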
Referencing non-existent subgroups
In our example, we created one subgroup and referenced it with \1 . What would happen if we referenced a non-existent subgroup, like \2 ? Very simple: it would be replaced with an empty string.
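A hand-written helper makes this rule explicit. The function name render_output and its approach are hypothetical, written only to mirror the behaviour described here; it is not how the Zabbix agent is implemented.

```python
import re

def render_output(match, template):
    # Replace each \0..\9 reference in the template with the matching
    # subgroup; a reference to a subgroup that does not exist becomes "".
    def replace_reference(ref):
        index = int(ref.group(1))
        if index > match.re.groups:
            return ""  # non-existent subgroup
        return match.group(index) or ""
    return re.sub(r"\\([0-9])", replace_reference, template)

match = re.search(
    r"application.sessions.active=([0-9]+)",
    "application.sessions.active=1013",
)

existing = render_output(match, r"\1")
missing = render_output(match, r"[\2]")

print(existing)  # 1013
print(missing)   # []
```

The square brackets in the second template are only there to make the empty replacement visible.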
Returning more
Probably the most attractive functionality is returning specific subgroups, but in some cases it would be beneficial to return either the whole match or the whole line. That functionality is not lost – omitting the output parameter will return the whole line and \0 will return the whole matched string.
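The difference between the whole line and the whole match shows up only when the expression covers part of a line. The sketch below assumes the page is searched line by line and handles only \0 to stay short; the extract function is a hypothetical illustration, not the agent's actual code.

```python
import re

# Sample status page from the earlier example.
PAGE = (
    "application.sessions.free=685\n"
    "application.sessions.active=1013\n"
    "application.cache.free=425\n"
)

def extract(page, regexp, output=None):
    # Assumption: the page is scanned line by line and the first match wins.
    for line in page.splitlines():
        match = re.search(regexp, line)
        if match is None:
            continue
        if output is None:
            return line  # output omitted: the whole line
        # Only \0 (the whole matched text) is handled in this sketch.
        return output.replace("\\0", match.group(0))
    return None

# This expression deliberately matches only part of the line.
REGEXP = r"sessions.active=[0-9]+"

whole_line = extract(PAGE, REGEXP)
whole_match = extract(PAGE, REGEXP, r"\0")

print(whole_line)   # application.sessions.active=1013
print(whole_match)  # sessions.active=1013
```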
Documentation says that “0 – is replaced with the matched text”
https://www.zabbix.com/documentation/2.2/manual/config/items/itemtypes/zabbix_agent
Current article says ” will return the whole matched string”
Documentation looks more clear for me 🙂
Because text != string 🙂
“string” could be interpreted as “line”, so “text” is a more clear term for me.
Actually I looked to the doc because it was not clear (IMO) what will actually do.
hmm. might be – maybe it could be more clear if it said “whole matched substring” 😉
thanks for the clarification
meh, “\ 0” without a space between them are hidden in comments here 🙂