Advanced Index Types

Partial indexes, covering indexes, expression indexes, and other specialized indexing techniques

Beyond basic B-Tree and specialized indexes, databases offer advanced indexing techniques to optimize specific query patterns and reduce storage overhead.

Partial Indexes

Partial indexes index only a subset of rows, based on a WHERE condition.

-- Index only active users
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';

-- Index only recent orders
CREATE INDEX idx_recent_orders ON orders(customer_id)
WHERE created_at > '2024-01-01';

-- Index only non-null values
CREATE INDEX idx_phone ON customers(phone)
WHERE phone IS NOT NULL;

-- Query must match the partial condition
-- ✅ Uses partial index
SELECT * FROM users WHERE email = 'x@example.com' AND status = 'active';

-- ❌ Cannot use partial index (condition doesn't match)
SELECT * FROM users WHERE email = 'x@example.com';

-- Index only unprocessed items
CREATE INDEX idx_pending_orders ON orders(id)
WHERE status = 'pending';

-- Index soft-deleted records separately
CREATE INDEX idx_active_products ON products(category_id)
WHERE deleted_at IS NULL;

-- Index hot data (frequently accessed)
CREATE INDEX idx_recent_logs ON logs(event_type)
WHERE created_at > CURRENT_DATE - INTERVAL '7 days';

-- Index specific enum values
CREATE INDEX idx_vip_customers ON customers(email)
WHERE tier = 'vip';

-- Unique constraint on subset
CREATE UNIQUE INDEX idx_unique_active_email ON users(email)
WHERE status = 'active';

Unique Partial Indexes

Partial unique indexes allow duplicates in non-indexed rows:

-- Allow multiple inactive emails, but active must be unique
CREATE UNIQUE INDEX idx_unique_active_email ON users(email)
WHERE status = 'active';

Size Reduction

-- Full index: All 10 million rows
CREATE INDEX idx_orders_full ON orders(customer_id);
-- Size: 200 MB

-- Partial index: Only pending orders (50,000 rows)
CREATE INDEX idx_orders_pending ON orders(customer_id)
WHERE status = 'pending';
-- Size: 1 MB

Improved Insert Performance

Less rows to index = faster inserts for rows that don't match the condition.

More Efficient Queries

Smaller index = faster scans, better cache utilization.

Covering Indexes (INCLUDE)

Covering indexes include additional columns that aren't part of the index key, enabling index-only scans.

-- Include columns not used for searching
CREATE INDEX idx_orders_covering ON orders(customer_id)
INCLUDE (total, status, created_at);

-- Query uses index-only scan
EXPLAIN ANALYZE
SELECT total, status FROM orders WHERE customer_id = 123;
-- Output: Index Only Scan using idx_orders_covering

-- All columns in SELECT and WHERE must be in index
-- ❌ Cannot use index-only scan (name not included)
SELECT name FROM orders WHERE customer_id = 123;

-- SQL Server: INCLUDE clause
CREATE INDEX idx_orders_covering ON orders(customer_id)
INCLUDE (total, status, created_at);

-- Filtered covering index
CREATE INDEX idx_active_orders ON orders(customer_id)
INCLUDE (total, status)
WHERE status = 'active';

✅ Use Covering Indexes When:

Scenario	Example
Frequent SELECT of few columns	Dashboard queries
Avoiding table lookups	High-performance reads
Index-only aggregations	`SELECT COUNT(*), SUM(total)`

❌ Avoid When:

Scenario	Reason
Many included columns	Large index size
Frequent updates to included columns	Index maintenance overhead
Included columns are wide	Memory/storage cost

-- Check if query uses index-only scan
EXPLAIN (ANALYZE, BUFFERS)
SELECT total, status FROM orders WHERE customer_id = 123;

-- Look for:
-- "Index Only Scan" (PostgreSQL)
-- "Index Seek + Key Lookup" vs "Index Seek only" (SQL Server)

Expression Indexes

Index the result of an expression or function rather than a column directly.

-- Index on lowercase email
CREATE INDEX idx_email_lower ON users(LOWER(email));

-- Query must use same expression
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- Index on date part
CREATE INDEX idx_order_year ON orders(EXTRACT(YEAR FROM created_at));

-- Query
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

-- Index on computed value
CREATE INDEX idx_order_total ON order_items(quantity * price);

-- Query
SELECT * FROM order_items WHERE quantity * price > 1000;

-- Index on custom function result
CREATE FUNCTION get_domain(email TEXT) RETURNS TEXT AS $$
    SELECT split_part(email, '@', 2);
$$ LANGUAGE SQL IMMUTABLE;

CREATE INDEX idx_email_domain ON users(get_domain(email));

-- Query by domain
SELECT * FROM users WHERE get_domain(email) = 'gmail.com';

-- IMMUTABLE requirement
-- Function MUST be IMMUTABLE (same input = same output always)
-- STABLE and VOLATILE functions cannot be indexed

Function Immutability

Only IMMUTABLE functions can be used in expression indexes. The function must always return the same result for the same input.

-- Index specific JSON path
CREATE INDEX idx_user_email ON users((data->>'email'));
CREATE INDEX idx_user_age ON users(((data->>'age')::int));

-- Query
SELECT * FROM users WHERE data->>'email' = 'user@example.com';
SELECT * FROM users WHERE (data->>'age')::int > 21;

-- Index nested JSON
CREATE INDEX idx_address_city ON users((data->'address'->>'city'));

-- Query
SELECT * FROM users WHERE data->'address'->>'city' = 'New York';

Unique Indexes

-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);

-- Composite unique
CREATE UNIQUE INDEX idx_user_tenant ON users(tenant_id, email);

-- Equivalent to UNIQUE constraint
ALTER TABLE users ADD CONSTRAINT unique_email UNIQUE (email);

-- Unique only for active records
CREATE UNIQUE INDEX idx_active_email ON users(email)
WHERE status = 'active';

-- Allows: inactive users with duplicate emails
-- Prevents: active users with duplicate emails

-- Unique only for non-null
CREATE UNIQUE INDEX idx_unique_phone ON users(phone)
WHERE phone IS NOT NULL;

-- Case-insensitive unique
CREATE UNIQUE INDEX idx_email_unique_ci ON users(LOWER(email));

-- Prevents both 'User@Example.com' and 'user@example.com'

-- Unique on computed value
CREATE UNIQUE INDEX idx_unique_full_name 
ON users((first_name || ' ' || last_name));

Multi-Column Index Strategies

Leftmost Prefix Rule

Index (A, B, C) can efficiently serve queries on:

A
A, B
A, B, C

But NOT efficiently on:

B alone
C alone
B, C

Equality Before Range

Put equality conditions before range conditions.

-- For query: WHERE status = 'active' AND created_at > '2024-01-01'
-- ✅ Good: (status, created_at)
CREATE INDEX idx_orders ON orders(status, created_at);

-- ❌ Less optimal: (created_at, status)
-- Range on first column limits use of second

High Cardinality First

Put more selective columns first for better filtering.

-- user_id has high cardinality, status has low cardinality
-- ✅ More selective first
CREATE INDEX idx_orders ON orders(user_id, status);

Consider Query Variations

Sometimes multiple indexes are better than one composite.

-- If you query by A, by B, and by A+B:
CREATE INDEX idx_a ON t(a);
CREATE INDEX idx_b ON t(b);
-- Database can combine with bitmap scan

Database-Specific Features

-- BRIN (Block Range INdex) for large sorted tables
CREATE INDEX idx_logs_time ON logs USING BRIN (created_at);
-- Very small, great for append-only tables

-- SP-GiST for partitioned spaces
CREATE INDEX idx_ip ON networks USING SPGIST (ip_range inet_ops);

-- Bloom filter index
CREATE EXTENSION bloom;
CREATE INDEX idx_bloom ON t USING BLOOM (a, b, c);
-- Good for many-column equality searches

-- Concurrent index creation
CREATE INDEX CONCURRENTLY idx_email ON users(email);

-- Index with specific collation
CREATE INDEX idx_name ON users(name COLLATE "C");

-- Descending index (MySQL 8.0+)
CREATE INDEX idx_created ON orders(created_at DESC);

-- Functional index (MySQL 8.0.13+)
CREATE INDEX idx_email_lower ON users((LOWER(email)));

-- Invisible index (for testing)
ALTER TABLE users ALTER INDEX idx_email INVISIBLE;
-- Query won't use it, but it's still maintained

-- Prefix index for large columns
CREATE INDEX idx_description ON products(description(100));

-- InnoDB: Primary key is clustered
-- Secondary indexes include primary key automatically

-- Columnstore index (for analytics)
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_cs
ON orders(customer_id, total, created_at);

-- Filtered index
CREATE INDEX idx_pending ON orders(customer_id)
WHERE status = 'pending';

-- Index with online rebuild
ALTER INDEX idx_email ON users REBUILD WITH (ONLINE = ON);

-- Compressed index
CREATE INDEX idx_email ON users(email)
WITH (DATA_COMPRESSION = PAGE);

-- Computed column index
ALTER TABLE orders ADD total_with_tax AS (total * 1.1);
CREATE INDEX idx_total_tax ON orders(total_with_tax);

Index Maintenance

Best Practices Summary

Index Design Guidelines

Start with query patterns - Index based on actual queries
Use EXPLAIN - Verify indexes are being used
Monitor usage - Remove unused indexes
Consider writes - More indexes = slower writes
Partial for subsets - Don't index what you don't query
Covering for reads - Include columns to avoid lookups
Expression for transformations - Index computed values
Maintain regularly - Rebuild bloated indexes

Advanced Index Types

Partial Indexes

Size Reduction

Improved Insert Performance

More Efficient Queries

Covering Indexes (INCLUDE)

✅ Use Covering Indexes When:

❌ Avoid When:

Expression Indexes

Unique Indexes

Multi-Column Index Strategies

Leftmost Prefix Rule

Equality Before Range

High Cardinality First

Consider Query Variations

Database-Specific Features

Index Maintenance

Best Practices Summary

Next Steps

Query Optimization

PostgreSQL Guide

On this page

Advanced Index Types

Monitoring Index Health

Rebuilding Indexes

Analyzing Index Usage

Query Optimization

PostgreSQL Guide

On this page