Mar 01, 2025

Navigating PostgreSQL - Pages and Blocks

Dive deep into PostgreSQL's storage architecture to understand how pages and blocks form the foundation of database performance and efficiency
#PostgreSQL #Database #Tutorial #Pages #Storage #Performance

Table of Contents

  1. What Are Pages and Blocks?
  2. Page Structure
    1. Page Header (PageHeaderData)
    2. Item Identifiers (ItemIdData)
    3. Items (Rows/Tuples)
    4. CTID: The Physical Location Identifier
  3. Page Types
  4. Takeaways
  5. References

What Are Pages and Blocks?

In PostgreSQL, the terms “page” and “block” are often used interchangeably. A page is a fixed-size block of data, 8 kB (8192 bytes) by default; the size can only be changed when compiling the server.

When PostgreSQL reads or writes data to disk, it does so in these page-sized chunks. This is the fundamental storage and I/O unit in PostgreSQL.

postgres=# SELECT current_setting('block_size');
current_setting
-----------------
8192
(1 row)
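
Because every page has the same size, finding a block inside a relation's files comes down to simple arithmetic. The sketch below is only an illustration: it assumes the default 8192-byte block size and the default 1 GB segment size (large relations are split into several files), and `locate_block` is a hypothetical helper, not a PostgreSQL function.

```python
BLOCK_SIZE = 8192                 # default block size, as reported above
SEGMENT_SIZE = 1024 ** 3          # assumed default: relations are split into 1 GB files
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE


def locate_block(block_number: int) -> tuple[int, int]:
    """Return (segment_index, byte_offset_within_segment) for a block number."""
    if block_number < 0:
        raise ValueError("block numbers start at 0")
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    return segment, block_in_segment * BLOCK_SIZE


print(BLOCKS_PER_SEGMENT)      # 131072
print(locate_block(0))         # (0, 0)
print(locate_block(3))         # (0, 24576)
print(locate_block(131072))    # (1, 0)
```

With a non-default block size, only the `BLOCK_SIZE` constant would need to change.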

Page Structure

Each page in PostgreSQL follows a specific layout with several components:

[Figure: Anatomy of a PostgreSQL page]

  1. PageHeaderData (24 bytes) - Contains metadata about the page
  2. ItemIdData - Array of item identifiers pointing to the actual items
  3. Free space - Unallocated space where new data can be inserted
  4. Items - The actual data items (rows/tuples)
  5. Special space - Used by index access methods to store specialized data

Let’s look at these components in more detail:


Don’t be intimidated by these technical details! Understanding PostgreSQL’s storage layout provides context for query execution and performance. This knowledge forms the foundation for deeper exploration of PostgreSQL’s internal workings in future articles.


Page Header (PageHeaderData)

The page header occupies the first 24 bytes of each page and contains crucial information:

Page Header

Field               | Size    | Description
--------------------+---------+---------------------------------------------------------
pd_lsn              | 8 bytes | LSN (Log Sequence Number) for WAL (Write-Ahead Logging)
pd_checksum         | 2 bytes | Page checksum for data integrity
pd_flags            | 2 bytes | Flag bits
pd_lower            | 2 bytes | Offset to the start of free space
pd_upper            | 2 bytes | Offset to the end of free space
pd_special          | 2 bytes | Offset to the start of special space
pd_pagesize_version | 2 bytes | Page size and version information
pd_prune_xid        | 4 bytes | Oldest unpruned XMAX on the page

The pageinspect extension makes these fields visible from SQL:

CREATE EXTENSION pageinspect;
-- Get the page header of the first page (block 0) of the table "test"
postgres=# SELECT * FROM page_header(get_raw_page('test', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 0/16FC9DB0 |        0 |     4 |   928 |   960 |    8192 |     8192 |       4 |         0
(1 row)

In this output only 32 bytes of free space remain between lower (928) and upper (960), and special equals the page size because an ordinary table page has no special space.
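
To make the 24-byte layout concrete, here is a minimal Python sketch that decodes a page header with the standard `struct` module. It is an illustration, not how pageinspect is implemented: it assumes a little-endian server, `parse_page_header` is a hypothetical helper, and the sample bytes are built from the values shown above rather than read from a real data file.

```python
import struct

# PageHeaderData layout as listed in the table above (24 bytes).
# pd_lsn is stored as two 32-bit halves; little-endian byte order is assumed.
PAGE_HEADER = struct.Struct("<IIHHHHHHI")


def parse_page_header(page: bytes) -> dict:
    """Decode the first 24 bytes of a raw page into named fields."""
    (lsn_hi, lsn_lo, checksum, flags, lower, upper,
     special, pagesize_version, prune_xid) = PAGE_HEADER.unpack_from(page, 0)
    return {
        "lsn": f"{lsn_hi:X}/{lsn_lo:X}",
        "checksum": checksum,
        "flags": flags,
        "lower": lower,
        "upper": upper,
        "special": special,
        # The high byte carries the page size, the low byte the layout version.
        "pagesize": pagesize_version & 0xFF00,
        "version": pagesize_version & 0x00FF,
        "prune_xid": prune_xid,
    }


# Synthetic header built from the values in the output above (not read from disk).
raw = PAGE_HEADER.pack(0, 0x16FC9DB0, 0, 4, 928, 960, 8192, 8192 | 4, 0)
header = parse_page_header(raw.ljust(8192, b"\0"))
print(PAGE_HEADER.size)                                       # 24
print(header["lsn"], header["pagesize"], header["version"])   # 0/16FC9DB0 8192 4
```

Packing the page size and version into one 16-bit field works because page sizes are multiples of 256, which leaves the low byte free for the version number.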

Item Identifiers (ItemIdData)

Following the page header are item identifiers, each requiring 4 bytes. These act as pointers to the actual items stored in the page. Each identifier contains:

  • A byte-offset to the start of an item
  • The length of the item in bytes
  • Attribute bits affecting its interpretation

When a new row is inserted, a new item identifier is allocated from the beginning of the unallocated space, and pd_lower is increased accordingly.
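
The three parts of an item identifier are packed into a single 32-bit word. The following sketch shows one way to unpack it; the bit order (offset in the low 15 bits, then 2 flag bits, then a 15-bit length) is an assumption that matches typical little-endian builds, and `decode_item_id` and `item_id_count` are hypothetical helpers.

```python
import struct

LP_FLAGS = {0: "LP_UNUSED", 1: "LP_NORMAL", 2: "LP_REDIRECT", 3: "LP_DEAD"}


def decode_item_id(raw: bytes) -> tuple[int, str, int]:
    """Split a 4-byte item identifier into (offset, flag name, length).

    Assumes a little-endian build where the bit fields are packed as
    lp_off (15 bits), lp_flags (2 bits), lp_len (15 bits) from the low bit up.
    """
    (word,) = struct.unpack("<I", raw)
    lp_off = word & 0x7FFF
    lp_flags = (word >> 15) & 0x3
    lp_len = (word >> 17) & 0x7FFF
    return lp_off, LP_FLAGS[lp_flags], lp_len


def item_id_count(pd_lower: int, header_size: int = 24, item_id_size: int = 4) -> int:
    """Number of item identifiers implied by pd_lower."""
    return (pd_lower - header_size) // item_id_size


# Hypothetical identifier: a 32-byte item stored at offset 8160, in normal use.
word = 8160 | (1 << 15) | (32 << 17)
print(decode_item_id(struct.pack("<I", word)))   # (8160, 'LP_NORMAL', 32)
print(item_id_count(928))                        # 226
```

Since the identifier array starts right after the 24-byte header, a pd_lower of 928 (as in the page_header output earlier) corresponds to 226 identifiers.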

Items (Rows/Tuples)

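The actual data items (rows or tuples) are stored from the end of the unallocated space backward, so pd_upper decreases as rows are added. Each row starts with a fixed-size header structure called HeapTupleHeaderData (23 bytes on most machines), followed by an optional null bitmap and the actual column values.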
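
The two allocation directions can be illustrated with a toy model. `ToyPage` below is hypothetical and tracks only the two free-space offsets; it assumes an 8192-byte page without special space and 8-byte alignment of items, which is typical on 64-bit platforms but not guaranteed.

```python
from dataclasses import dataclass

PAGE_SIZE = 8192
HEADER_SIZE = 24
ITEM_ID_SIZE = 4
MAXALIGN = 8   # assumed alignment; typical on 64-bit platforms


@dataclass
class ToyPage:
    """Tracks only the two free-space offsets of an otherwise empty heap page."""
    lower: int = HEADER_SIZE   # end of the item identifier array
    upper: int = PAGE_SIZE     # start of the item area (no special space)

    def free_space(self) -> int:
        return self.upper - self.lower

    def add_item(self, length: int) -> int:
        """Reserve space for one item and return its 1-based item number."""
        aligned = -(-length // MAXALIGN) * MAXALIGN   # round up to the alignment
        if aligned + ITEM_ID_SIZE > self.free_space():
            raise ValueError("page is full")
        self.lower += ITEM_ID_SIZE    # identifier array grows forward
        self.upper -= aligned         # item area grows backward
        return (self.lower - HEADER_SIZE) // ITEM_ID_SIZE


page = ToyPage()
print(page.add_item(28), page.lower, page.upper)   # 1 28 8160
print(page.add_item(28), page.lower, page.upper)   # 2 32 8128
print(page.free_space())                           # 8096
```

In this model, repeating `add_item(28)` until the page is full stops after 226 items with lower at 928 and upper at 960. Those are the same offsets as in the page_header output shown earlier, which would be consistent with a table of small fixed-width rows.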

The row header contains information like:

  • Transaction IDs (t_xmin, t_xmax)
  • Command IDs (t_cid)
  • Item pointer (t_ctid)
  • Information masks and flags
  • Offset to user data
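
As a rough illustration of how these fields sit next to each other, the sketch below unpacks the fixed 23-byte part of a heap tuple header. It assumes a little-endian build; `parse_tuple_header` is a hypothetical helper, and the sample values are made up rather than taken from a real table.

```python
import struct

# Fixed part of HeapTupleHeaderData: t_xmin, t_xmax, t_cid (4 bytes each),
# t_ctid (block number as two 16-bit halves plus a 16-bit item number),
# t_infomask2, t_infomask (2 bytes each) and t_hoff (1 byte).
TUPLE_HEADER = struct.Struct("<IIIHHHHHB")
HEAP_NATTS_MASK = 0x07FF   # low 11 bits of t_infomask2 hold the attribute count


def parse_tuple_header(raw: bytes) -> dict:
    """Decode the fixed-size tuple header found at the start of an item."""
    (xmin, xmax, cid, blk_hi, blk_lo, offnum,
     infomask2, infomask, hoff) = TUPLE_HEADER.unpack_from(raw, 0)
    return {
        "t_xmin": xmin,
        "t_xmax": xmax,
        "t_cid": cid,
        "t_ctid": ((blk_hi << 16) | blk_lo, offnum),
        "natts": infomask2 & HEAP_NATTS_MASK,
        "t_infomask": infomask,
        "t_hoff": hoff,
    }


# Made-up example: a two-column row inserted by transaction 750, stored as (0,1).
raw = TUPLE_HEADER.pack(750, 0, 0, 0, 0, 1, 2, 0x0800, 24)
tup = parse_tuple_header(raw)
print(TUPLE_HEADER.size)                              # 23
print(tup["t_xmin"], tup["t_ctid"], tup["natts"])     # 750 (0, 1) 2
print(tup["t_hoff"])                                  # 24
```

The t_hoff value of 24 in the example reflects that user data starts at an aligned offset after the 23-byte header when the row has no null bitmap.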

CTID: The Physical Location Identifier

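In PostgreSQL, each row version has a physical identifier called the CTID (often read as "current tuple identifier"), which represents its location within the table as a pair of values: (page_number, tuple_index). The page number is the zero-based block number, and the tuple index is the one-based position of the row's item identifier on that page. A CTID is unique within a table at any given moment, but it changes when the row is updated or moved by VACUUM FULL, so it is not suitable as a long-term row identifier.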
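
A CTID can be turned back into a byte position with the numbers introduced so far. The sketch below is an assumption-laden illustration: it uses the default 8192-byte block size, the 24-byte page header, and 4-byte item identifiers, treats the relation as a single file, and `parse_ctid` and `item_id_position` are hypothetical helpers.

```python
import re

BLOCK_SIZE = 8192
HEADER_SIZE = 24
ITEM_ID_SIZE = 4


def parse_ctid(text: str) -> tuple[int, int]:
    """Parse the text form of a ctid, e.g. '(0,1)', into (block, item)."""
    match = re.fullmatch(r"\((\d+),(\d+)\)", text.strip())
    if match is None:
        raise ValueError(f"not a ctid: {text!r}")
    block, item = int(match.group(1)), int(match.group(2))
    if item < 1:
        raise ValueError("item numbers start at 1")
    return block, item


def item_id_position(ctid: str) -> int:
    """Byte offset of the item identifier a ctid names, counted from block 0."""
    block, item = parse_ctid(ctid)
    return block * BLOCK_SIZE + HEADER_SIZE + (item - 1) * ITEM_ID_SIZE


print(parse_ctid("(0,1)"))          # (0, 1)
print(item_id_position("(0,1)"))    # 24
print(item_id_position("(2,5)"))    # 16424
```

Note that the CTID leads to the item identifier, not directly to the row: the identifier in turn holds the offset of the row within the page.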

Page Types

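PostgreSQL uses the same basic page layout for several kinds of content:

  • Heap (table) pages: Store table rows (heap tuples); "table page" and "heap page" refer to the same thing
  • Index pages: Store index entries, with access-method-specific data in the special space
  • Bitmap pages: Used inside hash indexes to track overflow pages that can be reused (PostgreSQL has no on-disk bitmap index type)
  • TOAST pages: Oversized values are moved to a separate TOAST table, which is itself stored in ordinary heap pages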

Takeaways

  • A page (block) is the unit in which PostgreSQL stores, reads, and writes table and index data; it is 8192 bytes by default.
  • Each page starts with a 24-byte header containing metadata such as pd_lower and pd_upper, which delimit the free space.
  • Item identifiers grow forward from the header, while rows grow backward from the end of the page.
  • A row's CTID gives its physical location as (page number, item number).

References