Fedora is a global open source project and Linux distribution that provides a platform for innovation and collaboration.
Its infrastructure is managed by a dedicated team of professionals and volunteers who maintain a wide array of services, from build systems to collaboration platforms.
The challenge
For many years, Fedora relied on Nagios for its primary monitoring. While reliable for its time, Nagios presented several significant challenges as the infrastructure grew:
- Technical debt. The system was very old and lacked the modern features required for complex infrastructure.
- Simplistic alerting. Nagios was limited to basic “OK,” “Warning,” or “Critical” states, offering no nuance or sophisticated levels of severity.
- A lack of native trend data. Nagios does not store check history or trend data. To obtain historical insights, the team had to run a separate collectd instance and manually add items to it.
- Configuration drift. Monitoring was managed via a monolithic Ansible role that wrote out text configuration files. Because application definitions and their monitoring were in different places, new nodes or services were sometimes missed in the monitoring setup.
- Monolithic complexity. The Ansible code used to drive Nagios was extremely dense, utilizing complex loops that made it difficult to read, follow, or debug, and sometimes limited flexibility in rolling out new checks.
The solution
Fedora chose Zabbix as its next-generation monitoring platform due to its open source nature, active maintenance, ability to self-host, and robust feature set that addressed Nagios’s shortcomings. The transition focused on several key technical improvements:
- Ansible-driven configuration. Fedora uses the Zabbix Ansible collection to drive the Zabbix API. This ensures that 100% of the infrastructure configuration (including templates, host definitions, and SAML authentication) is managed as code.
- Decentralized monitoring definitions. Unlike the monolithic Nagios role, application monitoring is now defined directly within the relevant application’s Ansible role. Adding a node to monitoring typically requires only two Ansible tasks: ensuring the template is up to date and adding the host to that template.
- Sophisticated trigger logic. By moving trigger logic from the agent to the server, Zabbix allows Fedora to use historical trend data (e.g., values over the last hour) rather than just the most recent check result.
- Versatile data collection. Zabbix can monitor everything from RAID devices and certificates to database queries and network devices out of the box, which made it a better fit than more HTTP-focused tools.
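The difference between judging only the most recent check result and judging a window of history can be illustrated with a minimal Python sketch. This is not Fedora's or Zabbix's actual code: the function names, the sample data, and the threshold are assumptions made up for illustration. The windowed function is roughly what a Zabbix trigger expression such as avg(/host/key,1h)>5 expresses in recent Zabbix versions.

```python
from statistics import mean


def last_value_trigger(samples, threshold):
    """Most-recent-result check: only the latest sample decides the state."""
    return samples[-1][1] > threshold


def windowed_trigger(samples, threshold, now, window_seconds=3600):
    """Fire only when the average over the trailing window exceeds the threshold."""
    recent = [value for ts, value in samples if now - window_seconds < ts <= now]
    return bool(recent) and mean(recent) > threshold


# Hypothetical load samples as (unix_timestamp, value), one every 10 minutes.
samples = [(0, 1.0), (600, 1.2), (1200, 0.9), (1800, 1.1), (2400, 1.0), (3000, 9.0)]
now = 3000

print("last value fires:", last_value_trigger(samples, 5.0))
print("1h average fires:", windowed_trigger(samples, 5.0, now))
```

In this made-up series a single spike trips the last-value check, while the one-hour average stays below the threshold, so a short burst does not raise an alert on its own.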
The results
The migration to Zabbix has transformed Fedora’s operational visibility in the following ways:
- Unified visibility. The team now has integrated trend data and monitoring in one place, eliminating the need for separate tools like collectd.
- Improved reliability. Managing monitoring through the Zabbix API and Ansible roles has reduced the risk of “missing” nodes, as monitoring is now part of the application’s definition of done.
- Infrastructure as code. The ability to rebuild the entire monitoring configuration from Ansible (even without a database backup) provides high resilience and simplifies upgrades.
- Community alignment. By adopting Zabbix, Fedora has standardized its operations with CentOS (which already uses Zabbix), allowing for shared expertise across teams.
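The idea of catching missing nodes and rebuilding monitoring from code can be sketched as a small reconciliation step. Fedora does this through the Zabbix Ansible collection, not through hand-written requests, so the Python below is only an illustration under stated assumptions: the helper names, hostnames, group ID, and template ID are hypothetical. The payload follows the general shape of a Zabbix API host.create JSON-RPC call, and nothing is actually sent.

```python
import json


def missing_hosts(inventory, monitored):
    """Return inventory hosts that monitoring does not know about yet."""
    return sorted(set(inventory) - set(monitored))


def host_create_request(hostname, group_id, template_id, request_id=1):
    """Build (but do not send) a JSON-RPC payload shaped like host.create."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": hostname,
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": template_id}],
            # Assumed: one default Zabbix agent interface addressed by DNS name.
            "interfaces": [
                {"type": 1, "main": 1, "useip": 0, "ip": "",
                 "dns": hostname, "port": "10050"}
            ],
        },
        "id": request_id,
    }


# Hypothetical inventory and current monitoring state.
inventory = ["build01.example.org", "db01.example.org", "wiki01.example.org"]
monitored = ["db01.example.org"]

requests_to_send = [
    host_create_request(host, "2", "10001", request_id)
    for request_id, host in enumerate(missing_hosts(inventory, monitored), start=1)
]
print(json.dumps(requests_to_send[0], indent=2))
```

Because the desired state is derived from the inventory, running the same comparison against an empty monitoring server would yield a request for every host, which is the property that makes a rebuild from code possible.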
In conclusion
By moving from Nagios to Zabbix, Fedora has successfully retired significant technical debt and implemented a modern, scalable, and fully automated monitoring system. The flexibility of the Zabbix API combined with the power of Ansible has allowed the project to move monitoring from a centralized “black box” to a core component of every application’s deployment.
To learn more about how Zabbix can modernize large-scale open source infrastructures, get in touch with us.
About Fedora
The Fedora Project is an international partnership of open source and free software developers sponsored by Red Hat. This collaboration combines community-led creativity with Red Hat’s resource investment to drive innovation of Linux technologies.