There are many types of queries and of course all are important , but when it comes to monitoring and debugging , we are generally in a hurry and do not remember what to query to check the problem/issue though . We at that time thinks what’s can we find and what will be the query for this , we then search on internet all somehow make some queries and then after issue we forgot . Here in this Blog i will brief some queries and its uses specially in case of Monitoring and Debugging Postgresql Related issue .
In case of Monitoring and Debugging , We generally have following category :
- Size Related Issues
- Performance Related issues
Sometimes These issues are interlinked
Size Related Issues
Following will be basic postgres commands/queries which will be very helpful in monitoring or debugging size related issues :
- How to check Size of Database — — many times we need to check that which database is culprit i.e… which database is consuming major space . Following is query to check size of database in decreasing order according to size consumed
SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database ORDER BY pg_database_size(pg_database.datname) desc;
- After that We may need to Find Which Table/Index is consuming the most space for this we have the following query :
SELECT nspname || '.' || relname AS "relation", pg_size_pretty(pg_relation_size(C.oid)) AS "size" FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') ORDER BY pg_relation_size(C.oid) DESC LIMIT 20;
- Many Times There are Not much records in the Table But due to dead tuples size of table/index may increase , So to check which table has highest dead tuples following the the query :
- First Goto the Database then :
SELECT schemaname,relname,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup DESC limit 10;
schemaname | relname | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum
— — — — — — + — — — — — — — — — — — — — — — — — — — — — — — — — + — — — — — — + — — — — — — + — — — — — — — — — — — — — — — — — + — — — — — — — — — — — — — — — — —
public | campaign_session_denormalized_data | 1123219 | 114268349 | 2021–01–10 18:27:34.050087+05:30 | 2021–01–19 14:08:58.062574+05:30
Here , from above output, we can also determine if auto vacuum is running properly or not i.e.. When last auto vacuum ran on this any particular table whose dead tuples are high .
Overtime due to MVCC, your table will grow in size (called table bloat) — this is why regular VACUUM is needed. This query will show you list of tables and indexes with the most bloats.
The value represents the number of “wasted bytes”, or the difference between what is actually used by the table and index, and what we compute that it should be.
The way it works is it estimates the optimized size of the table/index by a calculation from each row sizes times total rows, and compare that against the actual table size. Do note that this is an estimate, not an actual figure.
with foo as ( SELECT schemaname, tablename, hdr, ma, bs, SUM((1-null_frac)*avg_width) AS datawidth, MAX(null_frac) AS maxfracsum, hdr+( SELECT 1+COUNT(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename ) AS nullhdr FROM pg_stats s, ( SELECT (SELECT current_setting('block_size')::NUMERIC) AS bs, CASE WHEN SUBSTRING(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants GROUP BY 1,2,3,4,5 ), rs as ( SELECT ma,bs,schemaname,tablename, (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr, (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM foo ), sml as ( SELECT schemaname, tablename, cc.reltuples, cc.relpages, bs, CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta, COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta -- very rough approximation, assumes all cols FROM rs JOIN pg_class cc ON cc.relname = rs.tablename JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema' LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) SELECT current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/ ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::FLOAT/otta END)::NUMERIC,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes, iname, /*ituples::bigint, ipages::bigint, iotta,*/ ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::FLOAT/iotta END)::NUMERIC,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes FROM sml ORDER BY wastedbytes DESC
Query extracted but rewrote from checkpostgres
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
— — — — — — — — — + — — — — — — + — — — — — — — — — — — — — — — — -+ — — — — + — — — — — — -+ — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -+ — — — — + — — — — — — —
dashboard | public | job_logs | 1.1 | 4139507712 | job_logs_pkey | 0.2 | 0
dashboard | public | job_logs | 1.1 | 4139507712 | index_job_logs_on_job_id_and_created_at | 0.4 | 0
dashboard | public | events | 1.1 | 3571736576 | events_pkey | 0.1 | 0
dashboard | public | events | 1.1 | 3571736576 | index_events_on_tenant_id | 0.1 | 0
dashboard | public | events | 1.1 | 3571736576 | index_events_on_event_type | 0.2 | 0
dashboard | public | jobs | 1.1 | 2013282304 | index_jobs_on_status | 0.0 | 0
dashboard | public | jobs | 1.1 | 2013282304 | index_jobs_on_tag | 0.3 | 0
dashboard | public | jobs | 1.1 | 2013282304 | index_jobs_on_tenant_id | 0.2 | 0
dashboard | public | jobs | 1.1 | 2013282304 | index_jobs_on_created_at | 0.2 | 0
dashboard | public | jobs | 1.1 | 2013282304 | index_jobs_on_created_at_queued_or_running | 0.0 | 21086208
You should be looking at:
- tbloat: table bloat, ratio between what it current is, and what it can be optimized to.
- wastedbytes: number of bytes wasted
- ibloat & wastedibytes: same as above, but for indexes.
When you see a table with high bloats, then consider running VACUUM ANALYZE on it,
Performance Related issues
For performance Related monitoring we need to run following queries too much in day to day scenario
- Get Running Queries (And Lock statuses) in PostgreSQL
SELECT S.pid, age(clock_timestamp(), query_start),usename,query,L.mode,L.locktype,L.granted,s.datname FROM pg_stat_activity S inner join pg_locks L on S.pid = L.pid order by L.granted, L.pid DESC;
Above query many help if some particular query is running for hours and has taken some lock due to which others queries are stuck on lock . It also shows database name on which qurie is running . We can also determine if some connection are open and no query running so there may be some session leak in your database (for more for about session leak read following article)
- To kill some particular query which is causing the issue following is the query :
This will send a SIGINT to the current process.
Hope It will help all of you in altleast some times in your debugging . Please comment and follow me.