PostgreSQL is renowned for its robustness, scalability, and flexibility, making it one of the most powerful relational database management systems available today.
However, like any complex software, even the best PostgreSQL setups can encounter performance issues, slow queries, or unexpected failures.
Whether you’re managing a mission-critical application or a high-traffic web service, understanding how to troubleshoot and optimize PostgreSQL is essential for maintaining a smooth and responsive database environment.
In this guide, we’ll walk you through the essential techniques and tools needed to diagnose and resolve common PostgreSQL performance bottlenecks.
We’ll cover everything from monitoring server health and analyzing query performance to working with logs and identifying potential trouble spots in real time.
Whether you’re a seasoned PostgreSQL DBA or a developer seeking to fine-tune your system, this guide will provide you with practical steps and insights to ensure your PostgreSQL instance runs at peak performance.
Let’s dive into the world of PostgreSQL troubleshooting and give you the tools to keep your database healthy, responsive, and running efficiently.
Monitoring the PostgreSQL Message Log: Uncovering Insights and Issues
The PostgreSQL message log is a valuable source of information about server activity, errors, and warnings. Monitoring this log proactively can help you detect and address problems early on.
- Setting Up Log Rotation: Most Linux distributions have log file rotation configured by default for PostgreSQL packages. Ensure that log rotation is in place to prevent the log file from growing excessively large.
- Configuring PostgreSQL Logging: Fine-tune PostgreSQL’s logging parameters to control the amount and type of information logged. Key parameters include:
- log_line_prefix: Defines the format of each log line, including timestamps and process IDs.
- log_min_duration_statement: Sets the minimum duration for a statement to be logged, helping to filter out short-running queries.
- Parameters like log_checkpoints, log_connections, log_disconnections, log_lock_waits, log_temp_files, log_autovacuum_min_duration, and log_error_verbosity control logging of specific events.
- Utilizing Log Analyzers: Tools like pgBadger can parse PostgreSQL log files and generate HTML reports with visualizations, simplifying log analysis. These reports can provide insights into:
- Slow queries and their execution times.
- Errors and warnings, helping to identify recurring issues and areas for improvement.
- Manual Log Scanning: You can also scan log files manually using command-line utilities like grep to search for specific patterns or error messages.
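As a minimal sketch of manual log scanning, the snippet below creates a small hypothetical log file and pulls out error-level messages; in practice you would point grep at your real log file (for example, /var/log/postgresql/postgresql-16-main.log on Debian/Ubuntu — the path varies by distribution and version):

```shell
# Hypothetical sample log; substitute your actual PostgreSQL log file.
cat > /tmp/pg_sample.log <<'EOF'
2024-01-01 10:00:01 UTC [123] LOG:  checkpoint starting: time
2024-01-01 10:00:05 UTC [456] ERROR:  deadlock detected
2024-01-01 10:00:09 UTC [789] FATAL:  password authentication failed for user "bob"
EOF

# Extract error-level messages (ERROR, FATAL, PANIC):
grep -E 'ERROR|FATAL|PANIC' /tmp/pg_sample.log
```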
Real-Time Viewing Using pgAdmin: A Glimpse into Server Activity
pgAdmin, a graphical administration tool, offers real-time insights into PostgreSQL server activity. It provides a dashboard with views displaying:
- General Server Overview: Overall server status and key metrics.
- Connections: Active connections, including user names and databases.
- Activity: Current database operations and their status.
- Running Transactions: Transactions in progress, providing insights into concurrency and potential blocking.
pgAdmin serves as a convenient tool for quickly assessing the current state of the PostgreSQL server and identifying potential areas of concern.
Checking User and Computer Connectivity: Ensuring Proper Access
The pg_stat_activity system view tracks all running PostgreSQL backends, including information about connected users and clients.
- Verifying User Connection: To check if a user is connected, query pg_stat_activity, filtering by the usename column. The datname column indicates the database the user is connected to.
- Identifying Client Connections: You can determine if a specific computer is connected by examining the client_addr, client_port, and optionally, the client_hostname columns in pg_stat_activity. The application_name field, if set by the client application, can provide further context about the connection.
Repeatedly Executing Queries in psql: Monitoring Changes Over Time
The psql interactive terminal provides the \watch meta-command, allowing you to repeatedly execute a query at specified intervals. This feature is useful for:
- Monitoring Dynamic Metrics: Observe changes in system statistics or query results over time.
- Tracking Server Activity: Track active connections, running queries, or other server-level metrics.
Checking Running Queries: Gaining Visibility into Active Operations
You can inspect currently running queries using the pg_stat_activity view.
- Listing All Running Queries: A basic query on pg_stat_activity reveals information about connected backends, including the query text, user name, database name, and process ID.
- Filtering by Backend Type: Use the backend_type column to distinguish between client backends (regular user sessions) and other PostgreSQL processes.
- Identifying Active Queries: Filter queries by the state column, selecting only those with state = 'active' to focus on queries that are currently executing.
Monitoring the Progress of Commands and Queries: Tracking Execution Status
PostgreSQL provides progress reporting for certain long-running commands, added incrementally across recent releases.
- Identifying Commands with Progress Reporting: Progress reporting is currently available for commands like ANALYZE, VACUUM, CREATE INDEX, REINDEX, BASE BACKUP, and COPY.
- Accessing Progress Information: Specific catalog views provide progress details for each command type:
- pg_stat_progress_analyze: Progress of ANALYZE commands.
- pg_stat_progress_vacuum: Progress of VACUUM commands.
- pg_stat_progress_cluster: Progress of VACUUM FULL and CLUSTER commands.
- pg_stat_progress_create_index: Progress of CREATE INDEX and REINDEX commands.
- pg_stat_progress_basebackup: Progress of BASE BACKUP commands.
- pg_stat_progress_copy: Progress of COPY commands.
- Interpreting Progress Metrics: Each command type has specific phases or states. Consult the documentation for details on interpreting progress metrics.
- Monitoring CREATE INDEX Progress: Monitor the pg_stat_progress_create_index view to track progress of index creation, including the current_locker_pid column if using the CONCURRENTLY option.
- Calculating BASE BACKUP Progress: Utilize the backup_streamed and backup_total columns in pg_stat_progress_basebackup to calculate the backup progress percentage.
- Determining COPY Progress: Calculate the progress percentage for COPY commands based on the information in pg_stat_progress_copy.
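For example, an in-flight CREATE INDEX can be watched with a query along these lines (a sketch; blocks_total may be zero early in execution, hence the NULLIF guard):

```sql
SELECT p.pid,
       p.phase,
       p.blocks_done,
       p.blocks_total,
       round(100.0 * p.blocks_done / nullif(p.blocks_total, 0), 1) AS pct_done,
       a.query
FROM pg_stat_progress_create_index p
JOIN pg_stat_activity a USING (pid);
```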
Identifying and Resolving Blocked Queries: Maintaining Database Responsiveness
Blocked queries can hinder performance and user experience.
- Detecting Blocked Queries: The wait_event_type and wait_event columns in pg_stat_activity can indicate if a query is waiting for another process or resource.
- Finding the Blocking Process: The pg_blocking_pids(pid) function returns an array of process IDs for sessions blocking the specified process.
- Managing Locks: The pg_locks view provides information about acquired locks, helping to understand the cause of blocking.
- Resolving Deadlocks: PostgreSQL automatically handles deadlocks, but understanding their causes can help prevent them.
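To see lock waits directly, pg_locks can be joined back to pg_stat_activity; a minimal sketch listing lock requests that have not yet been granted:

```sql
SELECT l.locktype, l.relation::regclass AS relation, l.mode, l.granted,
       a.pid, a.usename, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;
```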
Killing a Specific Session: Handling Unresponsive Processes
In cases where a session becomes unresponsive or causes issues, you may need to terminate it.
- Canceling a Query: The pg_cancel_backend(pid) function attempts to cancel the current query being executed by the specified backend process. This action is typically less disruptive than terminating the entire session.
- Terminating a Backend Process: The pg_terminate_backend(pid) function forcefully terminates the backend process, ending the session abruptly.
- Permission Considerations: Only superusers or users with appropriate privileges can cancel or terminate other users’ sessions.
- Using statement_timeout: Setting the statement_timeout parameter can automatically cancel queries that exceed a specified duration, preventing long-running queries from consuming excessive resources.
- Handling Idle Transactions: Queries on pg_stat_activity can identify and terminate idle transactions that are holding resources unnecessarily.
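The last two points can be sketched as follows (the 30-second and 10-minute thresholds are arbitrary examples, and pg_terminate_backend requires appropriate privileges):

```sql
-- Cancel any statement in the current session that runs longer than 30 seconds.
SET statement_timeout = '30s';

-- Terminate sessions that have sat idle in a transaction for over 10 minutes.
SELECT pid, usename, state_change,
       pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < current_timestamp - interval '10 minutes';
```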
Tracking Table Usage: Understanding Data Access Patterns
You can monitor table usage to determine if a table is actively being used and when it was last accessed.
- Checking Current Usage: Capture a snapshot of table usage statistics using the pg_stat_user_tables view. After a period of time, compare the snapshot to the current pg_stat_user_tables data to see if usage counts have changed.
- Identifying Last Usage:
- For read operations, PostgreSQL 16 introduces the last_seq_scan (last sequential scan) and last_idx_scan (last index scan) columns in pg_stat_user_tables.
- For write operations, PostgreSQL doesn’t have built-in information about last write time. You can:
- Utilize a custom table to periodically store snapshots of pg_stat_user_tables data.
- Analyze table file timestamps using the table_file_access_info function, which provides the last access and modification times based on the filesystem. Keep in mind that file timestamps may be affected by processes other than direct data modifications (e.g., autovacuum).
Monitoring I/O Statistics: Assessing Disk Usage and Performance
Monitoring I/O statistics helps you understand how PostgreSQL is utilizing disk resources.
- Server-Level I/O Statistics: The pg_stat_io view provides cumulative I/O statistics for the entire server, including reads and writes for different backend types.
- Table-Level I/O Statistics: The pg_statio_user_tables view offers I/O statistics for individual tables, showing block reads and hits for both heap and index data.
- Calculating Total Bytes: Multiply the block counts by the block size to get total bytes read or written.
Managing Disk Space Usage by Temporary Data: Optimizing Storage
PostgreSQL can utilize temporary files for various operations, including temporary tables, sorting, and hashing. Managing this temporary data effectively is crucial for performance and storage efficiency.
- Temporary Tablespaces: If the temp_tablespaces parameter is set, temporary files will be stored in the specified tablespaces. Use queries to check the size of these tablespaces.
- Default Temporary File Location: When temp_tablespaces is empty, temporary files are stored in the pgsql_tmp subdirectory under the base directory of the data directory (for example, base/pgsql_tmp).
- Monitoring Temporary File Usage: The pg_stat_database view provides cumulative statistics on temporary file usage per database, including the number of files and total bytes used.
- Controlling Temporary File Size: The temp_file_limit parameter restricts the total size of all temporary files used by queries.
- Cleaning Up Unused Files: In releases prior to PostgreSQL 14, manual cleanup of temporary files in the pgsql_tmp directory might be needed after a system crash.
- Logging Temporary File Creation: Set log_temp_files to a non-zero value to log the creation of temporary files exceeding the specified size.
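The last two parameters can be set in postgresql.conf; the values below are illustrative, not recommendations:

```
temp_file_limit = '10GB'   # abort queries whose temporary files exceed 10 GB in total
log_temp_files = 10240     # log temporary files larger than 10 MB (value is in kB)
```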
Understanding Why Queries Slow Down: A Multifaceted Approach
Queries that previously performed well may experience slowdowns due to various factors.
- Outdated Statistics: Outdated table statistics can lead the optimizer to choose inefficient query plans. Regularly run ANALYZE on tables involved in slow queries.
- Increased Data Volume: As tables grow, queries may need to process more data, leading to slower execution times. Use LIMIT to restrict the result set, consider cursors for large datasets, and optimize queries to fetch only necessary data.
- System Overload: If the database server is overloaded due to high CPU, memory, or disk I/O usage, query performance can suffer. Investigate and address system resource contention.
- Memory Pressure: Insufficient memory can force PostgreSQL to use disk for operations that would ideally be performed in memory, slowing down queries. Monitor memory usage and consider increasing work_mem appropriately.
- Table and Index Bloat: Bloated tables and indexes can significantly degrade performance. Identify bloat using queries and implement appropriate VACUUM strategies to reclaim space and improve efficiency.
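A quick first pass at spotting bloat candidates is to look at dead-tuple counts in pg_stat_user_tables; this is a rough heuristic, not a precise bloat measurement:

```sql
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / greatest(n_live_tup + n_dead_tup, 1), 1) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```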
Analyzing Real-Time Performance with pg_stat_statements: Deep Dive into Query Behavior
The pg_stat_statements extension provides detailed statistics on query execution, enabling in-depth performance analysis.
- Installation and Configuration: Install the extension and configure parameters like shared_preload_libraries in postgresql.conf.
- Analyzing Query Statistics: The pg_stat_statements view offers insights into:
- Query frequency (calls column).
- Total execution time (total_exec_time column).
- Average execution time (calculated as total_exec_time / calls).
- Memory and I/O usage.
- Understanding Query Normalization: pg_stat_statements replaces constant values in queries with placeholders, aggregating statistics for similar queries.
- Tracking Planning Time: Enable the pg_stat_statements.track_planning parameter to capture query planning time, providing insight into optimizer performance.
- Analyzing Temporary File Usage: pg_stat_statements tracks temporary file statistics for queries, helping to identify potential bottlenecks related to temporary file I/O.
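Enabling the extension takes two steps: preload the library (which requires a server restart), then create the extension in each database where you want to query the view. A sketch:

```sql
-- In postgresql.conf (server restart required):
--   shared_preload_libraries = 'pg_stat_statements'
-- Optionally, also capture planning time:
--   pg_stat_statements.track_planning = on

-- Then, in the target database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```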
Tracking Important Metrics Over Time with pg_statviz: Visualizing Performance Trends
The pg_statviz extension captures snapshots of PostgreSQL statistics, allowing you to track key metrics over time and visualize performance trends.
- Installation and Configuration: Install the extension and set up periodic snapshots using a job scheduler like cron.
- Visualizing Data: The pg_statviz utility generates graphs from the collected data, visualizing various metrics such as tuple read/write rates and transaction rates.
- Customizing Output: Configure the pg_statviz utility to filter data by time ranges, select specific modules for analysis, and specify output directories.
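Scheduling snapshots with cron might look like this (the database name and 15-minute interval are illustrative; consult pg_statviz --help for the command-line utility’s output and filtering options):

```
# crontab entry: take a pg_statviz snapshot every 15 minutes
*/15 * * * * psql -d mydb -c "SELECT pgstatviz.snapshot()" >/dev/null
```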
By diligently monitoring PostgreSQL’s logs and system views, and by using extensions like pg_stat_statements and pg_statviz, you can gain valuable insight into server performance, identify bottlenecks, and apply effective troubleshooting strategies to ensure smooth operation and an optimal user experience.
PostgreSQL Monitoring and Diagnosis SQL Queries
Checking Whether a User Is Connected
SELECT datname FROM pg_stat_activity WHERE usename = 'bob';
This query checks whether the user ‘bob’ is currently connected to any database. The datname column shows the name of the database for each of the user’s sessions.
Checking Whether a Computer Is Connected
SELECT datname, usename, client_addr, client_port,
application_name FROM pg_stat_activity
WHERE backend_type = 'client backend';
This query lists information about all client connections to the database. The output includes the database name (datname), username (usename), client IP address (client_addr), client port number (client_port), and the application name (application_name) associated with each connection. The backend_type filter ensures only client connections are included.
Repeatedly Executing a Query in psql
\watch 5
This psql meta-command repeatedly executes the previous query every 5 seconds. To stop the execution, press Ctrl + C.
Checking Which Queries Are Running
SELECT datname, usename, state, backend_type, query
FROM pg_stat_activity;
This query retrieves information about all active backends, including the database name, username, backend state, backend type, and the currently executing query.
To filter for only client backends, you can add:
WHERE backend_type = 'client backend'
To show only active queries, you can modify the query as follows:
SELECT datname, usename, state, query
FROM pg_stat_activity
WHERE state = 'active'
AND backend_type = 'client backend';
Watching the Longest Queries
SELECT
current_timestamp - query_start AS runtime,
datname, usename, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY 1 DESC;
This query lists currently running queries sorted by their runtime (longest running at the top). The runtime is calculated by subtracting the query start time (query_start) from the current timestamp.
To filter queries running for more than 1 minute:
SELECT
current_timestamp - query_start AS runtime,
datname, usename, query
FROM pg_stat_activity
WHERE state = 'active'
AND current_timestamp - query_start > '1 min'
ORDER BY 1 DESC;
Monitoring the Progress of Commands
To get the pid of the commands whose progress you want to monitor:
SELECT pid, query
FROM pg_stat_activity
WHERE state = 'active';
Then, depending on the type of command, query the corresponding pg_stat_progress_* view using the obtained pid.
Checking Which Queries Are Active or Blocked
SELECT datname, usename, wait_event_type, wait_event, pid, backend_type, query
FROM pg_stat_activity
WHERE wait_event_type IS NOT NULL
AND wait_event_type NOT IN ('Activity', 'Client');
This query identifies queries that are currently waiting on some event. The wait_event_type and wait_event columns provide details about the type of wait and the object being waited on.
Knowing Who Is Blocking a Query
SELECT datname, usename, wait_event_type, wait_event, pid, pg_blocking_pids(pid) AS blocked_by, backend_type, query
FROM pg_stat_activity
WHERE wait_event_type IS NOT NULL
AND wait_event_type NOT IN ('Activity', 'Client');
This query is similar to the previous one but adds a blocked_by column, using the pg_blocking_pids(pid) function to identify the process ID(s) of the backend(s) blocking each query.
Killing a Specific Session
To cancel the current query of a backend (if there is one):
SELECT pg_cancel_backend(pid);
Replace pid with the process ID of the backend you want to cancel.
To terminate a backend forcefully:
SELECT pg_terminate_backend(pid);
Replace pid with the process ID of the backend you want to terminate.
Knowing Whether Anybody Is Using a Specific Table
Step 1: Create a temporary table to store the current table usage statistics:
CREATE TEMPORARY TABLE tmp_stat_user_tables AS
SELECT * FROM pg_stat_user_tables;
Step 2: Wait for some time to allow for table usage.
Step 3: Run the following query to check for changes in table usage statistics:
SELECT * FROM pg_stat_user_tables n
JOIN tmp_stat_user_tables t
ON n.relid=t.relid
AND (n.seq_scan,n.idx_scan,n.n_tup_ins,n.n_tup_upd,n.n_tup_del)
<> (t.seq_scan,t.idx_scan,t.n_tup_ins,t.n_tup_upd,t.n_tup_del);
Knowing When a Table Was Last Used
To get the last sequential and index scan times for a table:
SELECT last_seq_scan, last_idx_scan
FROM pg_stat_user_tables
WHERE relname='table_name';
Replace 'table_name' with the actual name of the table.
To get the last access and modification times of the files associated with a table:
SELECT * FROM table_file_access_info('schema_name','table_name');
Replace 'schema_name' with the schema of the table and 'table_name' with the actual name of the table. This uses the custom table_file_access_info function defined in the source.
Monitoring I/O Statistics
To get the total bytes read and written by backend type:
SELECT backend_type,
       sum(reads * op_bytes) AS bytes_read,
       sum(writes * op_bytes) AS bytes_written
FROM pg_stat_io
GROUP BY backend_type;
This query calculates the total I/O bytes for each backend type by multiplying the read and write operation counts by op_bytes (the size of each unit of I/O) and summing across the view’s per-object, per-context rows.
To get I/O statistics for a specific table:
WITH b AS (
SELECT current_setting('block_size')::int AS blcksz
)
SELECT heap_blks_read * blcksz AS heap_read, heap_blks_hit * blcksz AS heap_hit,
idx_blks_read * blcksz AS idx_read, idx_blks_hit * blcksz AS idx_hit
FROM pg_statio_user_tables, b
WHERE relname='table_name';
Replace 'table_name' with the actual name of the table. This query retrieves I/O statistics for the table’s heap and index blocks, distinguishing block reads from cache hits.
Usage of Disk Space by Temporary Data
To check if temporary tablespaces are defined:
SELECT current_setting('temp_tablespaces');
To check the disk space used by each temporary tablespace:
WITH temporary_tablespaces AS (
SELECT unnest(string_to_array(current_setting('temp_tablespaces'), ',')) AS temp_tablespace
)
SELECT tt.temp_tablespace,
pg_tablespace_location(t.oid) AS location,
pg_tablespace_size(t.oid) AS size
FROM temporary_tablespaces tt
JOIN pg_tablespace t ON t.spcname = tt.temp_tablespace
ORDER BY 1;
This query extracts information about temporary tablespaces and displays their names, locations, and the disk space they are using.
To find the directory where temporary files are stored for query processing (when temp_tablespaces is not set):
SELECT current_setting('data_directory') || '/base/pgsql_tmp';
This query returns the path to the directory where temporary files are stored for query processing.
To get the cumulative number of temporary files and space used by each database:
SELECT datname, temp_files, temp_bytes, stats_reset
FROM pg_stat_database
WHERE datname IS NOT NULL;
This query displays information about temporary file usage for each database, including the cumulative number of files, total bytes used, and the timestamp of the last statistics reset.
Analyzing the Real-time Performance of Your Queries
To get the top 10 most frequent queries:
SELECT query FROM pg_stat_statements ORDER BY calls DESC LIMIT 10;
This query lists the 10 most frequently executed queries based on the calls column in the pg_stat_statements view.
To get the queries with the highest average execution time:
SELECT query, total_exec_time/calls AS avg, calls
FROM pg_stat_statements ORDER BY 2 DESC;
This query displays queries sorted by their average execution time, calculated by dividing the total execution time by the number of calls.
Tracking Important Metrics over Time
To take a snapshot of the current database statistics using the pg_statviz extension:
SELECT pgstatviz.snapshot();
This function creates a timestamped snapshot of various database statistics and stores it in tables within the pgstatviz schema.
Please note that some of these SQL statements require superuser privileges or specific permissions to access system views and functions. Ensure you have the necessary privileges before running these queries.