Remote commands are a powerful yet underestimated feature of Zabbix for reacting to trigger-based events and fixing basic issues automatically. However, they require a specific and fairly complex setup before they can be used in a real-world environment.
Learn how to use Zabbix actions and remote commands in a production environment and how to keep your setup simple and robust.
Contents
I. Introduction (0:42)
II. Simple use case: Remote Action (1:45)
III. How to manage different types of web servers and techniques (6:30)
IV. Event Tags (9:22)
V. Service on Kubernetes (19:28)
VI. How to do it better (23:53)
VII. Conclusion (28:31)
Simple use case: Remote Action
Let’s consider a simple use case: we have a web server, and we monitor its availability as well as the validity of its certificate. If either check fails, the initial action is to restart the web server.
Monitoring and remote action use case
For instance, if the web server is not responding or its certificate is reported as invalid, the certificate may have been renewed in the meantime while the web server was never restarted to pick it up.
1. In this case, we need to restart the web server to see what’s going on. To do that in Zabbix, we set up two items, web server availability and certificate validity, and define the triggers for them.
Basic item/trigger configuration
2. Then we need to configure an action. Here we create trigger conditions to select when the action applies, which means our triggers are hard-coded into the action. The action fires only when one of these triggers changes its status.
In the action definition, you can see SSH-type execution. This means that Zabbix will actually “SSH into” the target system, log in automatically using a user name and password or SSH keys, and then execute the defined command on the target system.
NOTE. The action in Zabbix targets the current host, i.e. the host whose trigger has fired.
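To make the SSH-type execution more concrete, here is a minimal sketch of what an equivalent manual call could look like, assuming the OpenSSH client is used. The host name, user, key path, and restart command are hypothetical placeholders, and this is not how Zabbix implements the call internally.

```python
import shlex


def build_ssh_argv(host, user, command, key_file=None):
    """Return the argument list for an OpenSSH client call that runs `command` on `host`."""
    argv = ["ssh", "-o", "BatchMode=yes"]  # BatchMode: fail instead of prompting for input
    if key_file:
        argv += ["-i", key_file]  # use key-based login instead of a password
    argv += [f"{user}@{host}", command]
    return argv


# Hypothetical values for illustration only.
argv = build_ssh_argv(
    host="web01.example.com",
    user="zabbix",
    command="sudo systemctl restart httpd",
    key_file="/home/zabbix/.ssh/id_rsa",
)
print(shlex.join(argv))
```

The command is kept as a single argument so that the remote shell, not the local one, interprets it.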
How to manage different types of web servers and techniques
If we have different types of web servers that are controlled by different commands, a single hard-coded remote command no longer fits all of them.
A mix of different types of servers
We use the same template and the same items for all of these services, but the remote commands differ, because no single command works for every service.
The challenges:
- How can we use the same templates to define items and triggers, yet use separate actions to execute the proper remote command?
- How can we pass additional specific information to the remote command to limit the number of actions?
Event Tags
Event Tags were introduced on the trigger level in Zabbix 3.2. In Zabbix 4.2, Event Tags support was added on the host and template level.
Among other things, Event Tags can be used as filters in actions, and that is the option we are looking for. The idea is to attach event tags to the trigger or to the host so that they carry information about the type of web server and service. This information is then used inside the action to execute the right commands.
For our simple use case, we:
- Define event tags on the host level to enable / select actions.
- Define macros on the host level to use the proper service name and mechanism in the remote command.
- Use event tags as a filter condition in Zabbix actions.
1. Host macros and tags
Here we set the remote command (RC) tag to ‘On’ so that it can be used as a filter in the action, and we define the service, in this case the web server. The trigger that executes the remote command is displayed on the dashboard.
2. Action conditions
Action conditions based on event tags
The action does not reference specific triggers. In fact, there is no trigger filter at all; we filter on our event tags instead. The ‘On’ value of the RC tag lets us turn the remote action on and off selectively.
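The tag-based filter described above can be sketched in a few lines. This is a simplified model, not Zabbix code: it assumes one value per tag name, and the `Service` tag and the host names are hypothetical; only the RC tag with the value On comes from the setup described here.

```python
def action_matches(event_tags, required_tags):
    """Return True when every required tag is present with the required value."""
    return all(event_tags.get(name) == value for name, value in required_tags.items())


# The action filter: run only for events that carry RC=On.
ACTION_FILTER = {"RC": "On"}

# Tags inherited by events from two hypothetical hosts.
web01_event = {"RC": "On", "Service": "Webserver"}
db01_event = {"Service": "Database"}

print(action_matches(web01_event, ACTION_FILTER))  # True
print(action_matches(db01_event, ACTION_FILTER))   # False
```

Because the filter looks only at tags, adding a new host to the automation means tagging it, with no change to the action itself.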
3. Action command
Lastly, we need to unify the execution in one action instead of maintaining a separate action for each web server and service type.
Here we use SSH and replace the user name and password with macros. Macros can be set on the host level, on the template level, and on a global level.
Since we are using SSH, we can assume that a shell is available. The command is therefore executed in the shell assigned to the SSH user, and our Zabbix macros are used in the command line.
NOTE. If you know the tag name, you can append it to the event tags macro, as in {EVENT.TAGS.<tag name>}, to expand it to the value of that tag for the event.
So we look at the service manager macro to see whether we have to run systemctl or service. Then we add some logging information and execute the restart using either the systemctl or the service command.
If you look through the corresponding action log, which is maintained for each executed action, you’ll see that our macros have been expanded properly. So, we see the system and ‘Service Webserver’ working.
This means that a single action covers systemctl, service, and so on, regardless of whether the web server is Apache or another product and whether it runs on CentOS, Debian, or any other distribution.
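The branching logic of the unified command can be sketched as follows. The macro names `{$SERVICE_MANAGER}` and `{$SERVICE}`, their values, and the `zabbix-rc` log tag are assumptions for illustration; the original action performs this branching in the remote shell, while the sketch builds the resulting command line in Python.

```python
import shlex


def build_restart_command(service_manager, service_name):
    """Build the shell command line that logs and then restarts a service."""
    if service_manager == "systemd":
        restart = ["systemctl", "restart", service_name]
    elif service_manager == "sysv":
        restart = ["service", service_name, "restart"]
    else:
        raise ValueError(f"unknown service manager: {service_manager!r}")
    # Write a log entry first so the restart is traceable on the target host.
    log = ["logger", "-t", "zabbix-rc", f"restarting {service_name} via {service_manager}"]
    return f"{shlex.join(log)} && {shlex.join(restart)}"


# Hypothetical host macros; in Zabbix these would be expanded before execution.
host_macros = {"{$SERVICE_MANAGER}": "systemd", "{$SERVICE}": "httpd"}
print(build_restart_command(host_macros["{$SERVICE_MANAGER}"], host_macros["{$SERVICE}"]))
```

Rejecting unknown service managers keeps a typo in a host macro from turning into an unintended command on the target.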
Service on Kubernetes
Zabbix can monitor services on Linux distributions as well as in Docker or Kubernetes deployments. Our approach works in these scenarios as well, but it needs some additional definitions and minor changes.
1. Host macros and tags
Here we introduce a new host macro, {$DEPLOYMENT}. With Kubernetes, you typically have a controller or a master where you execute commands. So, you do not execute your commands on the container hosts but on one central station, which makes things much easier. We need this macro because deployments are identified by configuration names or configuration file names. In our case, the configuration name is ‘nginx-deployment’.
Then we add a new tag, Kubernetes, which works like our remote command tag and identifies the action to use.
2. We also need to update our previous action. To do this, we add an additional filter saying that the previous action should be executed only if the tag Kubernetes does not exist. This way, we can distinguish between systems that run Kubernetes and other systems. Then we add the new action.
3. In this scenario, the target is the Kubernetes manager, and the commands in the command line change accordingly.
So, with this solution, we are able to restart any web server in the Kubernetes environment and also on a virtual machine, a local host, etc.
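The split between the two actions can be sketched as a small routing function. The tag value for Kubernetes and the use of `kubectl rollout restart` are assumptions; the command actually used in the action is not shown here, and only the RC tag, the Kubernetes tag name, and the ‘nginx-deployment’ name come from the setup above.

```python
import shlex


def select_action(event_tags):
    """Return which action applies to an event: 'kubernetes', 'ssh', or None."""
    if event_tags.get("RC") != "On":
        return None  # remote commands are switched off for this host
    if "Kubernetes" in event_tags:
        return "kubernetes"  # new action, executed on the Kubernetes manager
    return "ssh"  # previous action, runs only when the Kubernetes tag does not exist


def build_kubernetes_command(deployment):
    """Build a restart command for a deployment (assumes kubectl is available)."""
    return shlex.join(["kubectl", "rollout", "restart", f"deployment/{deployment}"])


event = {"RC": "On", "Kubernetes": "On"}  # hypothetical tag value
if select_action(event) == "kubernetes":
    print(build_kubernetes_command("nginx-deployment"))
```

The two branches are mutually exclusive, which mirrors the "tag does not exist" condition added to the previous action.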
How to do it better
However, the larger your deployment or environment, the more pitfalls you’ll face, such as managing access credentials, setting up shared keys, etc. It is also very difficult to test those actions.
One way to do this better is a Taskrunner, which performs tasks on a remote host. Taskrunners are often used in larger environments.
The idea here is that we use the Zabbix action not to run the remote command against the target host but to trigger a predefined task on the Taskrunner, which then performs the required actions.
For instance, we are using the Rundeck Taskrunner, which works well for us. There you can define different so-called projects, and each project can have different jobs, while each job is a collection of tasks. So, when the Zabbix middleware triggers a job, the job executes its tasks, and then you get a response in Zabbix.
Taskrunner offers a lot of advantages, including:
- easier integration:
Easier integration with Taskrunner
- simplified actions:
Simplified actions with Taskrunner
In this case, when a trigger fires, Zabbix does not perform the task against the remote system itself. Instead, it addresses a middleware, which executes the proper task, gets the result, and updates the event in Zabbix. So, you’ll see the outcome in Zabbix immediately.
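As an illustration of the middleware step, the sketch below builds (but does not send) an HTTP request that asks Rundeck to run a job through its job-run endpoint with token authentication. The base URL, API version, job ID, token, and option names are placeholders, and the middleware used in this setup may work differently.

```python
import json
import urllib.request


def build_rundeck_run_request(base_url, api_version, job_id, token, options):
    """Build a POST request that asks Rundeck to run a job with the given options."""
    url = f"{base_url.rstrip('/')}/api/{api_version}/job/{job_id}/run"
    body = json.dumps({"options": options}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Rundeck-Auth-Token": token,  # API token created in Rundeck
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder values; the API version depends on the Rundeck release in use.
req = build_rundeck_run_request(
    base_url="https://rundeck.example.com",
    api_version=41,
    job_id="restart-webserver-job-id",
    token="secret-token",
    options={"host": "web01", "service": "httpd"},
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would start the job; the job definition in Rundeck then decides how the restart is carried out, so credentials for the target hosts stay out of Zabbix.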
Conclusion
Event tags make remote command execution in Zabbix simpler and more flexible. In larger environments, you can additionally use middleware, such as a Taskrunner.
Nice Blog Herr Alper!
Just what I was looking for but…..
I defined a SNMP Item
On that item I defined a trigger
In that trigger I defined a trigger tag, which gets its value from a regsub.
Now in an action for that trigger, {EVENT.TAGS.Tagname} does not seem to have a value.
The same TAG has a value in the item.
But when using the regsub in the action (as defined in the trigger), I got the value I expected to get in {EVENT.TAGS.Tagname}, so I can do what I had in mind.
Are you sure the TAG also should have the value in the action?
I’m on 4.4
Theo