In today’s class, let’s talk about where the disk space goes: which items and host objects consume the most disk space.
The post covers:
I. Biggest tables in a database
II. Biggest data coming to the instance right now
III. Biggest data inside one partition of the DB table
IV. Print hosts and items which consume the most disk space
Biggest tables
In general, the leading tables are:
history, history_uint
history_str, history_text, history_log
events
‘history_uint’ stores integers, while ‘history’ stores decimal numbers.
‘history_str’, ‘history_text’, and ‘history_log’ store textual data.
The ‘events’ table stores problem events, internal events, agent auto-registration events, and discovery events.
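The mapping between an item’s value type and its history table is fixed in the Zabbix schema. As a quick reference, the sketch below maps the numeric ‘value_type’ stored in the ‘items’ table to the history table it writes to (the helper function name is ours, for illustration):

```shell
# Map a Zabbix items.value_type to the history table that stores its data.
value_type_to_table() {
  case "$1" in
    0) echo history ;;       # numeric float
    1) echo history_str ;;   # character
    2) echo history_log ;;   # log
    3) echo history_uint ;;  # numeric unsigned
    4) echo history_text ;;  # text
    *) echo unknown ;;
  esac
}

value_type_to_table 3   # prints: history_uint
```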
Have a look yourself at which tables take the most space in your database. On MySQL:
SELECT table_name,
       table_rows,
       data_length,
       index_length,
       round(((data_length + index_length) / 1024 / 1024 / 1024), 2) AS "Size in GB"
FROM information_schema.tables
WHERE table_schema = "zabbix"
ORDER BY round(((data_length + index_length) / 1024 / 1024 / 1024), 2) DESC
LIMIT 8;
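If you prefer to run this check from a shell (for example, from a cron job), the same query can be fed to the mysql client. A minimal sketch, where the user, password handling, and schema name are assumptions to adjust for your setup:

```shell
# Sketch: run the table-size query non-interactively (credentials are assumptions).
SQL='SELECT table_name,
       round(((data_length + index_length) / 1024 / 1024 / 1024), 2) AS size_gb
FROM information_schema.tables
WHERE table_schema = "zabbix"
ORDER BY size_gb DESC LIMIT 8;'

# mysql -u zabbix -p zabbix -e "$SQL"   # uncomment with real credentials
echo "$SQL"                             # shown here so the query can be reviewed
```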
On PostgreSQL:
SELECT *,
       pg_size_pretty(total_bytes) AS total,
       pg_size_pretty(index_bytes) AS index,
       pg_size_pretty(toast_bytes) AS toast,
       pg_size_pretty(table_bytes) AS table
FROM (
  SELECT *, total_bytes - index_bytes - coalesce(toast_bytes, 0) AS table_bytes
  FROM (
    SELECT c.oid,
           nspname AS table_schema,
           relname AS table_name,
           c.reltuples AS row_estimate,
           pg_total_relation_size(c.oid) AS total_bytes,
           pg_indexes_size(c.oid) AS index_bytes,
           pg_total_relation_size(reltoastrelid) AS toast_bytes
    FROM pg_class c
    LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE relkind = 'r'
  ) a
) a;
Detect big data coming to the instance in the last 5 minutes
MySQL ‘history_log’:
SELECT SUM(LENGTH(value)) AS 'chars',
       CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS 'URL'
FROM history_log
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
MySQL ‘history_text’:
SELECT SUM(LENGTH(value)) AS 'chars',
       CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS 'URL'
FROM history_text
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
MySQL ‘history_str’:
SELECT SUM(LENGTH(value)) AS 'chars',
       CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS 'URL'
FROM history_str
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
PostgreSQL ‘history_text’:
SELECT CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS URL,
       SUM(LENGTH(value))
FROM history_text
WHERE clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
PostgreSQL ‘history_log’:
SELECT CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS URL,
       SUM(LENGTH(value))
FROM history_log
WHERE clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
PostgreSQL ‘history_str’:
SELECT CONCAT('history.php?itemids%5B0%5D=', itemid, '&action=showlatest') AS URL,
       SUM(LENGTH(value))
FROM history_str
WHERE clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY itemid
ORDER BY SUM(LENGTH(value)) DESC
LIMIT 5;
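The queries above emit a relative URL; to open it in a browser, prepend your Zabbix frontend’s base URL. A small sketch, where the base URL and the itemid are made-up values for illustration:

```shell
# Build a clickable link from the relative URL returned by the queries above.
# BASE and ITEMID are assumptions -- substitute your frontend URL and a real itemid.
BASE='http://zabbix.example.com/zabbix'
ITEMID=28336
URL="${BASE}/history.php?itemids%5B0%5D=${ITEMID}&action=showlatest"
echo "$URL"
```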
Which hosts consume the most space
MySQL ‘history_text’:
SELECT SUM(LENGTH(history_text.value)) AS 'chars',
       hosts.name AS 'name'
FROM history_text
JOIN items ON (items.itemid = history_text.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_text.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_text.value)) DESC
LIMIT 5;
MySQL ‘history_log’:
SELECT SUM(LENGTH(history_log.value)) AS 'chars',
       hosts.name AS 'name'
FROM history_log
JOIN items ON (items.itemid = history_log.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_log.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_log.value)) DESC
LIMIT 5;
MySQL ‘history_str’:
SELECT SUM(LENGTH(history_str.value)) AS 'chars',
       hosts.name AS 'name'
FROM history_str
JOIN items ON (items.itemid = history_str.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_str.clock > UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE)
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_str.value)) DESC
LIMIT 5;
PostgreSQL ‘history_text’:
SELECT SUM(LENGTH(history_text.value)) AS "chars",
       hosts.name AS "name"
FROM history_text
JOIN items ON (items.itemid = history_text.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_text.clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_text.value)) DESC
LIMIT 5;
PostgreSQL ‘history_log’:
SELECT SUM(LENGTH(history_log.value)) AS "chars",
       hosts.name AS "name"
FROM history_log
JOIN items ON (items.itemid = history_log.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_log.clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_log.value)) DESC
LIMIT 5;
PostgreSQL ‘history_str’:
SELECT SUM(LENGTH(history_str.value)) AS "chars",
       hosts.name AS "name"
FROM history_str
JOIN items ON (items.itemid = history_str.itemid)
JOIN hosts ON (hosts.hostid = items.hostid)
WHERE history_str.clock > EXTRACT(epoch FROM NOW() - INTERVAL '5 MINUTE')
GROUP BY hosts.name
ORDER BY SUM(LENGTH(history_str.value)) DESC
LIMIT 5;
Analyze at the partition level (MySQL)
On MySQL, if database table partitioning is enabled, we can list the biggest partitions on the filesystem:
cd /var/lib/mysql/zabbix
ls -lh history_log#*
It will print:
-rw-r-----. 1 mysql mysql  44M Jan 24 20:23 history_log#p#p2021_02w.ibd
-rw-r-----. 1 mysql mysql  24M Jan 24 21:20 history_log#p#p2021_03w.ibd
-rw-r-----. 1 mysql mysql 128K Jan 11 00:59 history_log#p#p2021_04w.ibd
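To get a single grand total across all partitions of one table, `du -c` can be used instead of `ls`. A small sketch (the helper name is ours; run it from the MySQL data directory for the zabbix schema):

```shell
# Sum the on-disk size of the given partition files and print a grand total,
# e.g.: cd /var/lib/mysql/zabbix && sum_partition_usage history_log#p#*.ibd
sum_partition_usage() {
  du -ch "$@" | tail -n 1   # last line of 'du -c' is the "total" line
}
```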
From the previous output, we can take the partition name ‘p2021_02w’ and use it in a query:
SELECT ho.hostid,
       ho.name,
       count(*) AS records,
       (count(*) * (SELECT AVG_ROW_LENGTH
                    FROM information_schema.tables
                    WHERE TABLE_NAME = 'history_log'
                      AND TABLE_SCHEMA = 'zabbix') / 1024 / 1024) AS 'Total size average (Mb)',
       sum(length(history_log.value)) / 1024 / 1024
         + sum(length(history_log.clock)) / 1024 / 1024
         + sum(length(history_log.ns)) / 1024 / 1024
         + sum(length(history_log.itemid)) / 1024 / 1024 AS 'history_log Column Size (Mb)'
FROM history_log PARTITION (p2021_02w)
LEFT OUTER JOIN items i ON history_log.itemid = i.itemid
LEFT OUTER JOIN hosts ho ON i.hostid = ho.hostid
WHERE ho.status IN (0,1)
GROUP BY ho.hostid
ORDER BY 4 DESC
LIMIT 10;
You can repeat the same analysis for the other partitioned tables by listing:
ls -lh history_text#*
ls -lh history_str#*
Free up disk space (MySQL)
Deleting a host in the GUI will not free up disk space on MySQL. It only leaves empty rows in the table, where new data can be inserted. If you want to really free up disk space, rebuild the partitions. First, list all partition names:
SHOW CREATE TABLE history\G
To rebuild partition:
ALTER TABLE history REBUILD PARTITION p202101160000;
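If several partitions need rebuilding, the statements can be generated in a loop and reviewed before being piped to the mysql client. A sketch, with a helper function name of our own choosing:

```shell
# Hypothetical helper: emit one REBUILD statement per partition name given,
# so the output can be inspected first and then piped to mysql.
rebuild_statements() {
  local p
  for p in "$@"; do
    echo "ALTER TABLE history REBUILD PARTITION ${p};"
  done
}

# usage (review output, then pipe):
#   rebuild_statements p202101160000 p202101170000 | mysql zabbix
rebuild_statements p202101160000
```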
Free up disk space (PostgreSQL)
On PostgreSQL, the autovacuum process is responsible for vacuuming tables. To check whether a vacuum has run lately, run:
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum FROM pg_stat_all_tables WHERE n_dead_tup > 0 ORDER BY n_dead_tup DESC;
In the output, the ‘n_dead_tup’ column shows the number of dead tuples in each table.
If the last autovacuum has not occurred in the last 10 days, that is a bad sign and the configuration needs different settings. We can increase the vacuum priority with:
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 3000
autovacuum_max_workers = 6
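These values belong in postgresql.conf (or an include file read from its conf.d directory), followed by a configuration reload. A sketch of staging them in a fragment file; the file path and the psql reload step are assumptions to adapt to your installation:

```shell
# Write the autovacuum tuning into a config fragment for review.
# A local path is used here for illustration; on a real server this would be
# something like /etc/postgresql/<ver>/main/conf.d/zz_autovacuum.conf.
CONF=./zz_autovacuum.conf
cat > "$CONF" <<'EOF'
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 3000
autovacuum_max_workers = 6
EOF
# then reload (assumes superuser access):
#   psql -U postgres -c "SELECT pg_reload_conf();"
```

Note that changing autovacuum_max_workers requires a full server restart, not just a reload.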
Alright. That is it for today.