Understanding EXPLAIN

EXPLAIN is your primary tool for understanding how the database executes queries. Learning to read execution plans is essential for query optimization.

EXPLAIN Basics

-- Basic explain (estimated plan)
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;

-- With actual execution times
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 123;

-- With buffer/IO information
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 123;

-- Different output formats
EXPLAIN (FORMAT JSON) SELECT ...;
EXPLAIN (FORMAT YAML) SELECT ...;
EXPLAIN (FORMAT XML) SELECT ...;

-- All options
EXPLAIN (
    ANALYZE,      -- Actually run the query
    BUFFERS,      -- Show buffer usage
    COSTS,        -- Show cost estimates (default: on)
    TIMING,       -- Show actual timing (default: on with ANALYZE)
    VERBOSE,      -- Show additional info
    FORMAT TEXT   -- Output format
) SELECT ...;

-- Basic explain
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;

-- With analyze (MySQL 8.0.18+)
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 123;

-- Extended explain
EXPLAIN EXTENDED SELECT * FROM orders WHERE customer_id = 123;

-- JSON format (more details)
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE customer_id = 123;

-- Tree format (MySQL 8.0.16+)
EXPLAIN FORMAT=TREE SELECT * FROM orders WHERE customer_id = 123;

-- Estimated execution plan
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM orders WHERE customer_id = 123;
GO
SET SHOWPLAN_ALL OFF;

-- Actual execution plan with statistics
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT * FROM orders WHERE customer_id = 123;

-- XML plan
SET SHOWPLAN_XML ON;
GO
SELECT * FROM orders WHERE customer_id = 123;
GO
SET SHOWPLAN_XML OFF;

-- In SSMS: Ctrl+L for estimated, Ctrl+M for actual plan

Reading PostgreSQL EXPLAIN Output

Basic Structure

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 123;

EXPLAIN Output Breakdown

Index Scan using idx_orders_customer on orders  (cost=0.42..8.44 rows=1 width=48) 
   (actual time=0.015..0.016 rows=1 loops=1) 
   Index Cond: (customer_id = 123)
   Buffers: shared hit=3
Planning Time: 0.085 ms
Execution Time: 0.032 ms

Understanding the Output

Component	Example	Meaning
Node Type	`Index Scan`	How data is accessed
Index	`idx_orders_customer`	Which index is used
Cost	`0.42..8.44`	Startup cost..total cost (arbitrary units)
Rows	`rows=1`	Estimated rows to return
Width	`width=48`	Average row size in bytes
Actual Time	`0.015..0.016`	Real execution time (ms)
Actual Rows	`rows=1`	Real rows returned
Loops	`loops=1`	Times this node executed
Buffers	`shared hit=3`	Pages read from cache

Key Components

Cost Explained

The cost is shown as (cost=startup..total):

Cost Example

(cost=0.42..8.44 rows=1 width=48)
--    ^^^^ ^^^^ --
--    |    └─ Total cost to return all rows
--    └─ Startup cost (before first row)

Part	Value	Meaning
Startup	`0.42`	Work before returning first row
Total	`8.44`	Total work to return all rows

Cost Factors

The optimizer uses seq_page_cost, random_page_cost, cpu_tuple_cost, and other parameters to calculate cost. You can view these:

SHOW seq_page_cost;       -- Default: 1.0 --
SHOW random_page_cost;    -- Default: 4.0 --
SHOW cpu_tuple_cost;      -- Default: 0.01 --

Important

Cost units are arbitrary - they're only useful for comparing plans within the same query, not across different queries.

Row Estimates

Row Estimate

(cost=0.42..8.44 rows=1 width=48)

The rows=1 is the estimated number of rows this node will return.

Why estimates matter:

Optimizer uses row estimates to choose plans
Wrong estimates → wrong plan → slow queries

Common causes of bad estimates:

Stale statistics (run ANALYZE)
Correlated columns
Complex expressions
Non-uniform data distribution

Actual Execution (EXPLAIN ANALYZE only)

Actual Metrics

(actual time=0.015..0.016 rows=1 loops=1)
--           ^^^^^ ^^^^^ ^^^^ ^^^^^^^ --
--           |     |     |    └─ Times node executed
--           |     |     └─ Actual rows returned
--           |     └─ Time to return all rows (ms)
--           └─ Time to return first row (ms)

Part	Value	Meaning
Start time	`0.015`	Time to return first row (ms)
End time	`0.016`	Time to return all rows (ms)
Rows	`1`	Actual rows returned
Loops	`1`	Times this node executed

Red Flag: Estimate Mismatch

When estimated and actual rows differ significantly:

Index Scan  (cost=... rows=100) (actual rows=10000) -- [!code error]
--                    ^^^^^^^          ^^^^^^^
--                    Expected 100     Got 10,000!

This means statistics are stale. Fix with:

ANALYZE table_name;

Buffer Usage (BUFFERS option)

EXPLAIN (ANALYZE, BUFFERS) SELECT ...;

Buffer Statistics

Buffers: shared hit=3 read=2 dirtied=0 written=0
--       ^^^^^^^^^^ ^^^^^^ --
--       From cache From disk (slow!)

Metric	Meaning
shared hit	Pages found in cache ✅ Fast
shared read	Pages read from disk ⚠️ Slow
dirtied	Pages modified
written	Pages written to disk

More hits = faster. High read count means data isn't cached - consider if this is expected or if you need more shared_buffers.

Sort  (cost=100..110 rows=100)
  Sort Key: orders.created_at DESC
  ->  Hash Join  (cost=10..80 rows=100)
        Hash Cond: (orders.customer_id = customers.id)
        ->  Seq Scan on orders  (cost=0..50 rows=1000)
              Filter: (total > 100)
        ->  Hash
              ->  Index Scan on customers  (cost=0..20 rows=50)
                    Index Cond: (status = 'active')

Read from inside out, bottom to top:

Index Scan on customers → finds active customers
Hash → builds hash table from customers
Seq Scan on orders → filters orders > 100
Hash Join → matches orders to customers
Sort → sorts result by date

Index Scan (rows=100) (actual rows=50000) -- [!code error]

Solution:

ANALYZE table_name; 
-- Or for specific columns:
ANALYZE table_name(column_name);

Pattern: Seq Scan on Large Table

Seq Scan on large_table (cost=0..100000) -- [!code warning]
  Filter: (indexed_column = 'value')

Solutions:

\di large_table -- Check if index exists

CREATE INDEX idx ON large_table(indexed_column); 

ANALYZE large_table; -- Check statistics

-- Check for function on column
WHERE LOWER(column) = 'value'
WHERE column = 'VALUE'

Pattern: High Loops Count

Nested Loop (actual loops=100000) -- [!code error]
  ->  Seq Scan on table_a
  ->  Index Scan on table_b

Solutions:

CREATE INDEX idx ON table_b(join_column); 

-- Or force hash join (if appropriate)
SET enable_nestloop = off;

Pattern: Sort on Disk

Sort (Sort Method: external merge Disk: 50MB) -- [!code warning]

Solutions:

SET work_mem = '256MB'; 

-- Or add index to avoid sort
CREATE INDEX idx ON table(sort_column);

Visual Tools

explain.depesz.com

Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
Copy output to https://explain.depesz.com
Get visual breakdown with problem areas highlighted

Features:

Highlights slow nodes
Shows exclusive vs inclusive time
Color-codes by performance

pgAdmin Graphical Explain

Open Query Tool
Write your query
Click "Explain Analyze" (F7)
View graphical plan

Features:

Visual node representation
Click nodes for details
Shows data flow

pev2 (Postgres Explain Visualizer 2)

https://explain.dalibo.com/

Run EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
Paste JSON output
Get interactive visualization

Features:

Modern interface
Statistics comparison
Shareable links

Best Practices

EXPLAIN Checklist

Always use ANALYZE for real data, not estimates
Use BUFFERS to see I/O patterns
Compare estimated vs actual rows to detect statistics issues
Look for the most expensive nodes first
Check for sequential scans on large tables
Watch for high loop counts in nested loops
Monitor buffer reads vs hits for caching efficiency

Understanding EXPLAIN

EXPLAIN Basics

Reading PostgreSQL EXPLAIN Output

Basic Structure

Key Components

Cost Explained

Row Estimates

Actual Execution (EXPLAIN ANALYZE only)

Buffer Usage (BUFFERS option)

Common Node Types

Scan Types

Join Types

Interpreting Complex Plans

Reading Plan Trees

Cost Accumulation

Troubleshooting Patterns

Pattern: Wrong Row Estimate

Pattern: Seq Scan on Large Table

Pattern: High Loops Count

Pattern: Sort on Disk

Visual Tools

explain.depesz.com

pgAdmin Graphical Explain

pev2 (Postgres Explain Visualizer 2)

Best Practices

Next Steps

Join Optimization

Anti-patterns

On this page

Understanding EXPLAIN

Sequential Scan (Seq Scan)

Index Scan

Index Only Scan

Bitmap Scan

Nested Loop

Hash Join

Merge Join

Join Optimization

Anti-patterns

On this page