Understanding Write-Ahead Logging (WAL) in PostgreSQL

In PostgreSQL, replication mechanisms—whether physical or logical—rely heavily on the Write-Ahead Logging (WAL) system.

WAL serves as a fundamental component to ensure data consistency, durability, and to support replication processes.

This article explains how WAL works in PostgreSQL, including its role in replication, file management, and system recovery.

Table of Contents

What is Write-Ahead Logging (WAL)?

Write-Ahead Logging (WAL) is a technique used in PostgreSQL to ensure data integrity and durability.

Get Your Free Linux training!

Join our free Linux training and discover the power of open-source technology. Enhance your skills and boost your career! Learn Linux for Free!

Whenever a transaction modifies the database—whether it involves an **INSERT**, **UPDATE**, **DELETE**, or a **COMMIT**—PostgreSQL first writes the changes to a log file before updating any data files on disk.

This process ensures that in the event of a system failure (such as a crash or power outage), no committed transactions are lost. The log file is referred to as the WAL log, and it plays a crucial role in the recovery and replication processes.

How Does WAL Work?

Each change made by a transaction is captured in a WAL record. These records are initially written to an in-memory buffer. Upon transaction commit, they are flushed to disk in a WAL segment file. The WAL record contains information about the transaction, such as the operation type (**INSERT**, **UPDATE**, **DELETE**), along with the data being modified.

A critical component of WAL is the Log Sequence Number (LSN). The LSN uniquely identifies the location of a WAL record within the log file.

It is a 64-bit number, represented as two 32-bit hexadecimal values separated by a “/”. For example, you can query the current WAL LSN with the following SQL command:

SELECT pg_current_wal_lsn();

The LSN helps PostgreSQL track where each change is stored in the WAL log, which is important for replication and recovery.

The output shows the current LSN in the format <LSN High>/<LSN Low>, where both parts are hexadecimal numbers.
In this case, 0/3000020 represents the position in the WAL log where the most recent record has been written.

Checkpoints in PostgreSQL

A checkpoint in PostgreSQL marks the point where all data files are updated to reflect the changes recorded in the WAL. The checkpoint ensures that if the system crashes, recovery can begin from a known, consistent point. Checkpoints occur periodically and are necessary for PostgreSQL to maintain a balance between system performance and data safety.

During a checkpoint, the system writes all WAL records up to that point into the data files, which reduces the recovery time after a crash. The checkpoint process also involves writing the WAL to disk in such a way that only committed transactions are included, ensuring data consistency.

WAL Segments and WAL File Management

The WAL log itself is managed through WAL segments, which are physical files where WAL records are stored. These segment files are typically 16 MB in size and are stored in the pg_wal directory of the PostgreSQL data directory. The names of these WAL segment files are 24-digit numbers that follow a sequential pattern.

For example, you can query the current WAL segment name using the following SQL:

SELECT pg_walfile_name(pg_current_wal_lsn());

WAL segment files are managed by PostgreSQL under several conditions:

Segment file rotation: A new WAL segment is generated when the current segment file is filled.
Manual switching: The pg_switch_wal() function can be issued to force a new WAL segment to begin.
Archive mode: If the archive_mode setting is enabled, WAL segment files are archived after they are written.

Once a checkpoint occurs, PostgreSQL determines how many WAL segment files need to be retained. It bases this estimation on the number of files consumed in previous checkpoint cycles and ensures that between min_wal_size (default 80 MB) and max_wal_size (default 1 GB) worth of WAL files are kept.

WAL-based Replication in PostgreSQL

WAL forms the backbone of PostgreSQL’s replication features, particularly for physical replication and logical replication. These replication techniques leverage WAL to replicate data across multiple PostgreSQL instances.

WAL Archiving and Log Shipping

One common method of replication is log shipping, where WAL files are copied from the primary server to a secondary (standby) server. This is a form of file-based replication that allows the standby server to replay WAL files and mirror the primary server’s state.

To archive WAL files, PostgreSQL can be configured to execute a user-defined script that copies each WAL segment to a remote location. This process is referred to as WAL archiving. Once archived, these WAL files can be used for point-in-time recovery or to set up standby servers.

Log shipping can be divided into two types:

File-level log shipping: WAL files are copied to the standby server, which operates in recovery mode and applies these WAL files as they arrive.
Block-level log shipping (Streaming Replication): This is an improvement over file-level log shipping, where WAL records are streamed directly to the standby server in real-time, reducing replication lag.

Streaming replication involves the primary server sending WAL records directly to the standby server over a network connection, using the replication protocol. This method allows the standby server to be in sync with the primary server with minimal delay, making it ideal for high-availability configurations.

Replication Slots

In the context of replication, a replication slot is a mechanism that ensures WAL files are retained as long as they are needed by replication clients. A replication slot tracks the oldest WAL segment file and oldest transaction ID that a standby server requires. This feature prevents the primary server from discarding WAL files that are still needed by a standby, avoiding replication failures.

Replication slots are especially useful when dealing with slow or intermittent connections between the primary and standby servers. PostgreSQL will retain the necessary WAL files for as long as they are required by the replication slots.

Recovering from Crashes Using WAL

One of the key benefits of WAL is its role in crash recovery. In the event of a system crash, PostgreSQL uses the WAL to replay committed transactions that may not have been written to the data files. Upon restarting, PostgreSQL begins recovery from the last known redo point (or checkpoint), ensuring that no committed transactions are lost.

This recovery process uses the WAL to bring the database back to a consistent state. If the system crashes before a checkpoint, WAL records are used to replay any changes since the last checkpoint. This guarantees that the database can recover to a state where all committed transactions are present.

Conclusion

In PostgreSQL, Write-Ahead Logging (WAL) is a critical feature that ensures data integrity, consistency, and durability. By writing all changes to a WAL log before modifying data files, PostgreSQL can recover from system crashes and failures while ensuring that no committed transactions are lost.

Furthermore, WAL plays a central role in replication, enabling both physical and logical replication mechanisms, including log shipping and streaming replication. Understanding how WAL works is essential for managing PostgreSQL clusters, ensuring data safety, and optimizing replication performance.

Monda

Sunday 1st of December 2024

I have to say, reading this post really helped me understand Write-Ahead Logging (WAL) in PostgreSQL better! When I first started using PostgreSQL in production, I wasn’t fully aware of how crucial WAL is to the database’s overall performance and reliability. Like many newcomers, I thought of WAL mainly as a mechanism for crash recovery, but I’ve come to realize that it's much more than that!