“Why on earth was I not notified?!”

“Why on earth was I not notified?!” Ever heard that question from a colleague? Setting up notifications can be a challenge, and not only for beginners. Normally, debugging such cases is cumbersome and complex, and requires a good understanding of how Zabbix works. Were you ever asked for a list of people who would be notified of some event? It’s hard to tell until the event actually happens. Or at least it used to be.

The Action Simulator tries to relieve you of these problems and make you and your co-workers happy again.

Update: Presenting the Action simulator at the Zabbix Conference 2013

Disclaimer

First off, the Action Simulator is not part of Zabbix, or at least not yet. It’s a community effort instead. It’s not mature software, but rather work in progress. It certainly has bugs and doesn’t cover every use case yet. But that should not frighten you: You can readily use it with your 2.0.4 frontend. No re-compilation, no database schema changes and it will certainly not damage your data. But let’s take a look at what it can do first!

What it does and what it looks like

The Action Simulator’s goal is to clearly answer who is notified in case of a specific event. Notifications often don’t turn out as expected, for various reasons, like permissions or mistakes in action conditions. To debug such cases, the simulator provides additional information. No events are caused and no notifications are actually sent. It’s really a simulator. The only visible change is an additional column in the trigger configuration list. It contains links that open a pop-up window. The pop-up contains a list of all notifications and collapsible debugging information.

Trigger configuration list with new column

The only change to the frontend is a new column in the trigger configuration list

The information in the pop-up helps you to hunt down errors in:

  • Action conditions (logical errors, typos, lax conditions, …)
  • Operations (forgotten users, group memberships)
  • Permissions to host
  • Media setup (user media not defined, severity, …)

Example

Let’s assume you set up a trigger to fire if the checksum of sshd changes. You expect 3 actions to run, as soon as the event happens, doing the following:

  • The two GNU/Linux admins are notified via SMS and e-mail
  • Head of security gets an e-mail
  • Network administration gets an e-mail

It turns out that this doesn’t happen! The only thing that works is e-mail for the GNU/Linux admins. What happened to SMS? What happened to the other two e-mails?

Taking a look at the Action Simulator pop-up, things become clear:

Annotated action pop-up

Debugging tables make it easy to see why things happen as they do

How to try it

You can easily test the Action Simulator in a copy of the frontend in any environment. No database changes or re-compilation are necessary. Just copy your frontend directory, apply the patch to the copy and configure your web server to serve the copy in parallel.

# Download the patch
wget -O /tmp/action_simv3.2.patch http://www.geofrogger.net/review/actionsim/v3/action_simv3.2.patch

# Copy your frontend
cp -pr /usr/share/zabbix /usr/local/share/zabbix_actionsim

# Patch it
cd /usr/local/share/zabbix_actionsim
patch -p0 < /tmp/action_simv3.2.patch
  • Update, 24 Jan 2013: http://www.geofrogger.net/review/actionsim/v4/action_simv4.patch
  • Update, 17 Feb 2013: http://www.geofrogger.net/review/actionsim/v4/action_simv4.1.patch
  • Update, 17 Feb 2013: http://www.geofrogger.net/review/actionsim/v4/action_simv4.2.patch
  • Update, 18 Feb 2013: http://www.geofrogger.net/review/actionsim/v4/action_simv4.3.patch
  • Update, 22 Mar 2013: http://www.geofrogger.net/review/actionsim/v4/action_simv4.4.patch
  • Update, 7 Sep 2013: Way advanced, but still at an alpha stage:
    http://www.geofrogger.net/review/actionsim/v5/action_simv5.0a.patch
    http://www.geofrogger.net/review/actionsim/v5/action_simv5.0c.patch
  • Update, 20 Nov 2013: Important functional and performance update:
    http://www.geofrogger.net/review/actionsim/v5/action_simv5.1.patch
  • Update, 20 Nov 2013: Zabbix 2.2 draft of version 5.1:
    http://www.geofrogger.net/review/actionsim/v5/zabbix-2.2/action_simv5.1-zabbix2.2.patch
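Which strip level a given patch version needs is not always obvious: the commands above use -p0 for v3.2, while the release notes in the comments use -p1 for v4 and later. The following is a minimal sketch, not part of the Action Simulator itself, that tries both levels against the frontend copy without modifying any file. It assumes GNU patch (for its --dry-run option); the function name find_strip_level is made up for illustration.

```shell
#!/bin/sh
# Sketch: report which strip level (-p0 or -p1) a patch applies with.
# ASSUMPTION: GNU patch is installed; --dry-run changes no files.
# find_strip_level is a hypothetical helper, not part of Zabbix or the patch.

find_strip_level() {
    frontend_dir=$1   # the COPY of the frontend, never the live one
    patch_file=$2     # the downloaded patch file

    for level in 0 1; do
        # -s: quiet, -f: never prompt, -d: change into the frontend directory
        if patch --dry-run -s -f -p"$level" -d "$frontend_dir" \
               < "$patch_file" >/dev/null 2>&1; then
            echo "$level"
            return 0
        fi
    done

    # Neither level applies cleanly (wrong Zabbix version, modified frontend, ...)
    return 1
}

# Usage (paths taken from the commands above):
#   level=$(find_strip_level /usr/local/share/zabbix_actionsim /tmp/action_simv3.2.patch) \
#       && echo "apply with: patch -p$level"
```

If the function prints nothing and returns non-zero, the patch does not apply cleanly at either level, which usually points to a frontend version the patch was not written for.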

Keep an eye open for later versions! All that’s left is to copy and adapt your web server configuration so that it serves this directory under a different URL, and then reload the web server. You can run as many frontends in parallel as you want. The patch applies to a vanilla Zabbix 2.0.4 frontend without errors.
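How the web server configuration is copied and adapted depends on your setup. As a rough sketch, assuming Apache httpd with an Alias-based frontend configuration, the existing configuration can be rewritten for the copy with sed. The URL paths /zabbix and /zabbix-actionsim, the configuration file locations and the function name derive_actionsim_conf are assumptions made for illustration; only the two directory paths come from the commands above.

```shell
#!/bin/sh
# Sketch: derive a web server config for the patched copy from the existing one.
# Reads the original config on stdin and writes the adapted config to stdout.
# ASSUMPTIONS: Apache-style "Alias <url> <dir>" lines; paths contain no '#'
# characters (used here as the sed delimiter).

derive_actionsim_conf() {
    old_dir=$1   # original frontend directory, e.g. /usr/share/zabbix
    new_dir=$2   # patched copy, e.g. /usr/local/share/zabbix_actionsim
    old_url=$3   # URL path of the original frontend (assumed: /zabbix)
    new_url=$4   # URL path for the copy (assumed: /zabbix-actionsim)

    # 1st expression: point every directory reference at the copy.
    # 2nd expression: change the URL path on Alias lines only.
    sed -e "s#${old_dir}#${new_dir}#g" \
        -e "s#^\([[:space:]]*Alias[[:space:]]\{1,\}\)${old_url}\([[:space:]]\)#\1${new_url}\2#"
}

# Usage (file locations are hypothetical and differ between distributions):
#   derive_actionsim_conf /usr/share/zabbix /usr/local/share/zabbix_actionsim \
#       /zabbix /zabbix-actionsim \
#       < /etc/apache2/conf.d/zabbix.conf > /etc/apache2/conf.d/zabbix_actionsim.conf
#   apachectl graceful
```

Review the generated file before reloading the web server, since the rewrite is purely textual and knows nothing about the rest of your configuration.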

Some known limitations

  • Only works properly for Zabbix Super Admins
  • Only shows notifications — no commands
  • Has no sense of escalation yet
  • Doesn’t yet consider temporal constraints for actions and user media, as well as some other action conditions
  • Update, 17 Feb 2013: Commands are shown from v4 on.
  • Update, 10 Sep 2013: Escalation and temporal constraints are considered from v5 on

Please be lenient, report bugs and share your opinion on the current design. The author is working towards escalation at the time of this writing.

For further information, known bugs and limitations, as well as future plans, please visit: https://www.zabbix.org/wiki/Docs/action_simulator

Also consider voting for the ZBXNEXT-97 ticket!


13 Responses to “Why on earth was I not notified?!”

  1. HenrikJ says:

    Volter, awesome idea!

  2. Very nice. Debugging why users get or don’t get notifications has always been a hassle.

    About time that Zabbix SIA employs a UI specialist for Zabbix 3.0…

  3. volter says:

    Action simulator version 4 is out!

    It offers vast improvements in function, as well as a number of corrections. Among them:

    * Handles remote commands
    * Time constraints now work as action conditions and for user media, but so far only the current time is used
    * Template condition now works for discovered triggers
    * Show operation description for messaging instead of “User” or “Group”
    * Improve debugging output for host group and maintenance condition
    * Added time and escalation step columns in preparation for escalation handling
    * Label non-operational UI elements “NI” — Not implemented
    * Actions without conditions no longer cause errors in debugging
    * Show templateids instead of triggerids in template condition debugging

    Being in your frontend directory, please run:

    patch -p1 < /path/to/action_simv4.patch

    http://www.geofrogger.net/review/actionsim/v4/
    http://www.geofrogger.net/review/actionsim/action_sim.txt

  4. Raymond Kuiper says:

    Very nice! I’m sure to try it :)

  5. Romeo Theriault says:

    This is really, really helpful. I hope this gets included in future versions of zabbix.

  6. volter says:

    * Now actually works with other backends than PostgreSQL
    * Draw an empty cell on disabled trigger rows instead of dropping it

    Being in your frontend directory, please run:

    patch -p1 < /path/to/action_simv4.1.patch

    http://www.geofrogger.net/review/actionsim/v4/
    http://www.geofrogger.net/review/actionsim/action_sim.txt

  7. volter says:

    * Doesn’t show the “Actions” column for users other than Super Admins
    * Doesn’t show the “Actions” column for lists that only contain templates

    Being in your frontend directory, please run:

    patch -p1 < /path/to/action_simv4.2.patch

    http://www.geofrogger.net/review/actionsim/v4/
    http://www.geofrogger.net/review/actionsim/action_sim.txt

  8. volter says:

    4.3 corrects an error in the debugging table, when using template conditions and a trigger was not template-based

  9. volter says:

    4.4 works with PHP < 5.3

  10. volter says:

    New version 5 alpha, see the updated article for details!

  11. volter says:

    Important update 5.1:

    - Corrected escalation time calculation
    - Corrected period interpretation
    - Speed-up

    http://www.geofrogger.net/review/actionsim/action_sim.txt
    http://www.geofrogger.net/review/actionsim/v5/