AWS Redshift has proven to be a highly scalable and reliable data warehouse solution. Like any other database/data warehouse system, we need to constantly monitor it. Some systems are more sophisticated than others, some offer better management consoles than others.
In the case of AWS Redshift the console will give you infrastructure or cluster related info related to health, configuration, etc. But, we also need to monitor things like data health, query execution, space available. There’s an extensive list of system tables and views that you can query to get this information. They are easy to identify because their names are prefixed with STL, STV, SVL or SVV.
I have made my own catalog of useful admin queries I constantly run in order to have a better idea of where I stand.
STL – tables are generated from logs that have been persisted to disk to provide a history of the system.
More information, here.
STV – tables are virtual tables that contain snapshots of the current system data.
More information, here.
SVV – View that reference STV tables.
SVL – Views that reference SVL table.
More information, here,
AWS Redshift useful admin queries
I have made a list of the queries I use most on a regular basis (which I keep updating constantly):
List of current sessions/connections
SELECT * FROM stv_sessions;
List of locked tables
SELECT * FROM stv_locks;
Terminate or kill a session/connection
Cancel a running query
Error detail for queries that fail
SELECT * FROM stl_load_errors ORDER BY starttime DESC;
Get list of DDL statements
DDL statements include:
CREATE SCHEMA, TABLE, VIEW
DROP SCHEMA, TABLE, VIEW
ALTER SCHEMA, TABLE
SELECT xid, starttime, SEQUENCE, SUBSTRING(TEXT,1,200) AS TEXT FROM stl_ddltext ORDER BY xid DESC, SEQUENCE;
List of tables by user
SELECT * FROM pg_tables WHERE schemaname = 'USER_NAME';
Summary data for a given query
SELECT * FROM svl_query_summary WHERE query = QUERY_ID ORDER BY stm, seg, step;
List of queries currently ‘Running’
SELECT pid, TRIM(user_name), starttime, query, SUBSTRING(query,1,20), status FROM stv_recents WHERE status = 'Running';
Disk space summary
SELECT SUM(capacity) / 1024 AS capacity_gbytes, SUM(used) / 1024 AS used_gbytes, (SUM( capacity) - SUM(used)) / 1024 AS free_gbytes FROM stv_partitions WHERE part_begin = 0;
Disk space summary by table
SELECT TRIM(pgdb.datname) AS DATABASE, TRIM(pgn.nspname) AS SCHEMA, TRIM(a.name) AS TABLE, b.mbytes, a.rows FROM (SELECT db_id, id, name, SUM(ROWS) AS ROWS FROM stv_tbl_perm a GROUP BY db_id, id, name) AS a JOIN pg_class AS pgc ON pgc.oid = a.id JOIN pg_namespace AS pgn ON pgn.oid = pgc.relnamespace JOIN pg_database AS pgdb ON pgdb.oid = a.db_id JOIN (SELECT tbl, COUNT(*) AS mbytes FROM stv_blocklist GROUP BY tbl) b ON a.id = b.tbl ORDER BY mbytes DESC, a.db_id, a.name;
Table definition: fields summary including sortkey and distkey
-- Show which schemas are included in the search SHOW search_path; -- Add all necessary schemas to include in search path SET search_path TO '$user','public','SCHEMA_1','SCHEMA_2'; -- Show table definitions SELECT * FROM pg_table_def WHERE schemaname = 'SCHEMA_NAME' -- optional AND tablename = 'TABLE_NAME'; -- optional
Show query text for specific query
SELECT query, TRIM(querytxt) AS sqlquery FROM stl_query WHERE query = 295235 ORDER BY query DESC LIMIT 5;
I hope this information is as useful for you as it is for me!