In today’s class let’s talk about where the disk space goes. Which items and hosts objects consume the disk space the most.

The post will cover things like:

I. Biggest tables in a database
II. Biggest data coming to the instance right now
III. Biggest data inside one partition of the DB table
IV. Print hosts and items which consumes the most disk space

Biggest tables

In general, the leading tables are:

history
history_uint
history_str
history_text
history_log
events

‘history_uint’ will store integers. ‘history’ will store decimal numbers.
‘history_str’, ‘history_text’, ‘history_log’ stores textual data.
In the table ‘events’ goes problem events, internal events, agent auto-registration events, discovery events.

Have a look yourself in a database which tables take the most space. On MySQL:

SELECT table_name,
       table_rows,
       data_length,
       index_length,
       round(((data_length + index_length) / 1024 / 1024 / 1024),2) "Size in GB"
FROM information_schema.tables
WHERE table_schema = "zabbix"
ORDER BY round(((data_length + index_length) / 1024 / 1024 / 1024),2) DESC
LIMIT 8;

On PostgreSQL:

SELECT *, pg_size_pretty(total_bytes) AS total , pg_size_pretty(index_bytes) AS index ,
       pg_size_pretty(toast_bytes) AS toast , pg_size_pretty(table_bytes) AS table
FROM (SELECT *, total_bytes-index_bytes-coalesce(toast_bytes, 0) AS table_bytes
   FROM (SELECT c.oid,
             nspname AS table_schema,
             relname AS table_name ,
             c.reltuples AS row_estimate ,
             pg_total_relation_size(c.oid) AS total_bytes ,
             pg_indexes_size(c.oid) AS index_bytes ,
             pg_total_relation_size(reltoastrelid) AS toast_bytes
      FROM pg_class c
      LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE relkind = 'r' ) a) a;

Detect big data coming to instance in the last 5 minutes

MySQL ‘history_log’:

SELECT SUM(LENGTH(value)) AS 'chars',
CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS 'URL'
FROM history_log
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

MySQL ‘history_text’:

SELECT SUM(LENGTH(value)) AS 'chars',
CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS 'URL'
FROM history_text
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

MySQL ‘history_str’:

SELECT SUM(LENGTH(value)) AS 'chars',
CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS 'URL'
FROM history_str
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

PostgreSQL ‘history_text’:

SELECT CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS URL,
SUM(LENGTH(value)) FROM history_text
WHERE clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

PostgreSQL ‘history_log’:

SELECT CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS URL,
SUM(LENGTH(value)) FROM history_log
WHERE clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

PostgreSQL ‘history_str’:

SELECT CONCAT('history.php?itemids%5B0%5D=', itemid ,'&action=showlatest' ) AS URL,
SUM(LENGTH(value)) FROM history_str
WHERE clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY itemid ORDER BY SUM(LENGTH(value)) DESC LIMIT 5;

Which hosts consume the most space

MySQL ‘history_text’:

SELECT SUM(LENGTH(history_text.value)) AS 'chars', hosts.name AS 'name'
FROM history_text
JOIN items ON (items.itemid=history_text.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_text.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_text.value)) DESC LIMIT 5;

MySQL ‘history_log’:

SELECT SUM(LENGTH(history_log.value)) AS 'chars', hosts.name AS 'name'
FROM history_log
JOIN items ON (items.itemid=history_log.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_log.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_log.value)) DESC LIMIT 5;

MySQL ‘history_str’:

SELECT SUM(LENGTH(history_str.value)) AS 'chars', hosts.name AS 'name'
FROM history_str
JOIN items ON (items.itemid=history_str.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_str.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_str.value)) DESC LIMIT 5;

PostgreSQL ‘history_text’:

SELECT SUM(LENGTH(history_text.value)) AS "chars", hosts.name AS "name"
FROM history_text
JOIN items ON (items.itemid=history_text.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_text.clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_text.value)) DESC LIMIT 5;

PostgreSQL ‘history_log’:

SELECT SUM(LENGTH(history_log.value)) AS "chars", hosts.name AS "name"
FROM history_log
JOIN items ON (items.itemid=history_log.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_log.clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_log.value)) DESC LIMIT 5;

PostgreSQL ‘history_str’:

SELECT SUM(LENGTH(history_str.value)) AS "chars", hosts.name AS "name"
FROM history_str
JOIN items ON (items.itemid=history_str.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE history_str.clock > EXTRACT(epoch FROM NOW()-INTERVAL '5 MINUTE')
GROUP BY hosts.name ORDER BY SUM(LENGTH(history_str.value)) DESC LIMIT 5;

Analyze in partition level (MySQL)

On MySQL, if database table partitioning is enabled we can list the biggest partitions on a filesystem:

cd /var/lib/mysql/zabbix
ls -lh history_log#*

It will print:

-rw-r-----. 1 mysql mysql  44M Jan 24 20:23 history_log#p#p2021_02w.ibd
-rw-r-----. 1 mysql mysql  24M Jan 24 21:20 history_log#p#p2021_03w.ibd
-rw-r-----. 1 mysql mysql 128K Jan 11 00:59 history_log#p#p2021_04w.ibd

From previous output, we can take partition name ‘p2021_02w’ and use it in a query:

SELECT ho.hostid, ho.name, count(*) AS records, 
(count(*)* (SELECT AVG_ROW_LENGTH FROM information_schema.tables 
WHERE TABLE_NAME = 'history_log' and TABLE_SCHEMA = 'zabbix')/1024/1024) AS 'Total size average (Mb)', 
sum(length(history_log.value))/1024/1024 + 
sum(length(history_log.clock))/1024/1024 +
sum(length(history_log.ns))/1024/1024 + 
sum(length(history_log.itemid))/1024/1024 AS 'history_log Column Size (Mb)'
FROM history_log PARTITION (p2021_02w)
LEFT OUTER JOIN items i on history_log.itemid = i.itemid 
LEFT OUTER JOIN hosts ho on i.hostid = ho.hostid 
WHERE ho.status IN (0,1)
GROUP BY ho.hostid
ORDER BY 4 DESC
LIMIT 10;

You can reproduce a similar scenario while listing:

ls -lh history_text#*
ls -lh history_str#*

Free up disk space (MySQL)

Deleting a host in GUI will not free up data space on MySQL. It will create empty rows in table where the new data can be inserted. If you want to really free up disk space, we can rebuild partition. At first list all possible partition names:

SHOW CREATE TABLE history\G

To rebuild partition:

ALTER TABLE history REBUILD PARTITION p202101160000;

Free up disk space (PostgreSQL)

On PostgreSQL, there is a process which is responsible for vacuuming the table. To ensure a vacuum has been done lately, kindly run:

SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;

In output, we look at ‘n_dead_tup’ it means a dead tuple.
If the last auto vacuum has not occurred in last 10 days, it’s bad. We have to install a different definition. We can increase vacuum priority by having:

vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 3000
autovacuum_max_workers = 6

Alright. That is it for today.