DevDocsDev Docs
Indexing Strategies

Advanced Index Types

Partial indexes, covering indexes, expression indexes, and other specialized indexing techniques

Beyond basic B-Tree and specialized indexes, databases offer advanced indexing techniques to optimize specific query patterns and reduce storage overhead.

Partial Indexes

Partial indexes index only a subset of rows, based on a WHERE condition.

-- Index only active users
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';

-- Index only recent orders
CREATE INDEX idx_recent_orders ON orders(customer_id)
WHERE created_at > '2024-01-01';

-- Index only non-null values
CREATE INDEX idx_phone ON customers(phone)
WHERE phone IS NOT NULL;

-- Query must match the partial condition
-- ✅ Uses partial index
SELECT * FROM users WHERE email = 'x@example.com' AND status = 'active';

-- ❌ Cannot use partial index (condition doesn't match)
SELECT * FROM users WHERE email = 'x@example.com';
-- Index only unprocessed items
CREATE INDEX idx_pending_orders ON orders(id)
WHERE status = 'pending';

-- Index soft-deleted records separately
CREATE INDEX idx_active_products ON products(category_id)
WHERE deleted_at IS NULL;

-- Index hot data (frequently accessed)
CREATE INDEX idx_recent_logs ON logs(event_type)
WHERE created_at > CURRENT_DATE - INTERVAL '7 days';

-- Index specific enum values
CREATE INDEX idx_vip_customers ON customers(email)
WHERE tier = 'vip';

-- Unique constraint on subset
CREATE UNIQUE INDEX idx_unique_active_email ON users(email)
WHERE status = 'active';

Unique Partial Indexes

Partial unique indexes allow duplicates in non-indexed rows:

-- Allow multiple inactive emails, but active must be unique
CREATE UNIQUE INDEX idx_unique_active_email ON users(email)
WHERE status = 'active';

Size Reduction

-- Full index: All 10 million rows
CREATE INDEX idx_orders_full ON orders(customer_id);
-- Size: 200 MB

-- Partial index: Only pending orders (50,000 rows)
CREATE INDEX idx_orders_pending ON orders(customer_id)
WHERE status = 'pending';
-- Size: 1 MB

Improved Insert Performance

Less rows to index = faster inserts for rows that don't match the condition.

More Efficient Queries

Smaller index = faster scans, better cache utilization.

Covering Indexes (INCLUDE)

Covering indexes include additional columns that aren't part of the index key, enabling index-only scans.

-- Include columns not used for searching
CREATE INDEX idx_orders_covering ON orders(customer_id)
INCLUDE (total, status, created_at);

-- Query uses index-only scan
EXPLAIN ANALYZE
SELECT total, status FROM orders WHERE customer_id = 123;
-- Output: Index Only Scan using idx_orders_covering

-- All columns in SELECT and WHERE must be in index
-- ❌ Cannot use index-only scan (name not included)
SELECT name FROM orders WHERE customer_id = 123;
-- SQL Server: INCLUDE clause
CREATE INDEX idx_orders_covering ON orders(customer_id)
INCLUDE (total, status, created_at);

-- Filtered covering index
CREATE INDEX idx_active_orders ON orders(customer_id)
INCLUDE (total, status)
WHERE status = 'active';

✅ Use Covering Indexes When:

ScenarioExample
Frequent SELECT of few columnsDashboard queries
Avoiding table lookupsHigh-performance reads
Index-only aggregationsSELECT COUNT(*), SUM(total)

❌ Avoid When:

ScenarioReason
Many included columnsLarge index size
Frequent updates to included columnsIndex maintenance overhead
Included columns are wideMemory/storage cost
-- Check if query uses index-only scan
EXPLAIN (ANALYZE, BUFFERS)
SELECT total, status FROM orders WHERE customer_id = 123;

-- Look for:
-- "Index Only Scan" (PostgreSQL)
-- "Index Seek + Key Lookup" vs "Index Seek only" (SQL Server)

Expression Indexes

Index the result of an expression or function rather than a column directly.

-- Index on lowercase email
CREATE INDEX idx_email_lower ON users(LOWER(email));

-- Query must use same expression
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- Index on date part
CREATE INDEX idx_order_year ON orders(EXTRACT(YEAR FROM created_at));

-- Query
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

-- Index on computed value
CREATE INDEX idx_order_total ON order_items(quantity * price);

-- Query
SELECT * FROM order_items WHERE quantity * price > 1000;
-- Index on custom function result
CREATE FUNCTION get_domain(email TEXT) RETURNS TEXT AS $$
    SELECT split_part(email, '@', 2);
$$ LANGUAGE SQL IMMUTABLE;

CREATE INDEX idx_email_domain ON users(get_domain(email));

-- Query by domain
SELECT * FROM users WHERE get_domain(email) = 'gmail.com';

-- IMMUTABLE requirement
-- Function MUST be IMMUTABLE (same input = same output always)
-- STABLE and VOLATILE functions cannot be indexed

Function Immutability

Only IMMUTABLE functions can be used in expression indexes. The function must always return the same result for the same input.

-- Index specific JSON path
CREATE INDEX idx_user_email ON users((data->>'email'));
CREATE INDEX idx_user_age ON users(((data->>'age')::int));

-- Query
SELECT * FROM users WHERE data->>'email' = 'user@example.com';
SELECT * FROM users WHERE (data->>'age')::int > 21;

-- Index nested JSON
CREATE INDEX idx_address_city ON users((data->'address'->>'city'));

-- Query
SELECT * FROM users WHERE data->'address'->>'city' = 'New York';

Unique Indexes

-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users(email);

-- Composite unique
CREATE UNIQUE INDEX idx_user_tenant ON users(tenant_id, email);

-- Equivalent to UNIQUE constraint
ALTER TABLE users ADD CONSTRAINT unique_email UNIQUE (email);
-- Unique only for active records
CREATE UNIQUE INDEX idx_active_email ON users(email)
WHERE status = 'active';

-- Allows: inactive users with duplicate emails
-- Prevents: active users with duplicate emails

-- Unique only for non-null
CREATE UNIQUE INDEX idx_unique_phone ON users(phone)
WHERE phone IS NOT NULL;
-- Case-insensitive unique
CREATE UNIQUE INDEX idx_email_unique_ci ON users(LOWER(email));

-- Prevents both 'User@Example.com' and 'user@example.com'

-- Unique on computed value
CREATE UNIQUE INDEX idx_unique_full_name 
ON users((first_name || ' ' || last_name));

Multi-Column Index Strategies

Leftmost Prefix Rule

Index (A, B, C) can efficiently serve queries on:

  • A
  • A, B
  • A, B, C

But NOT efficiently on:

  • B alone
  • C alone
  • B, C

Equality Before Range

Put equality conditions before range conditions.

-- For query: WHERE status = 'active' AND created_at > '2024-01-01'
-- ✅ Good: (status, created_at)
CREATE INDEX idx_orders ON orders(status, created_at);

-- ❌ Less optimal: (created_at, status)
-- Range on first column limits use of second

High Cardinality First

Put more selective columns first for better filtering.

-- user_id has high cardinality, status has low cardinality
-- ✅ More selective first
CREATE INDEX idx_orders ON orders(user_id, status);

Consider Query Variations

Sometimes multiple indexes are better than one composite.

-- If you query by A, by B, and by A+B:
CREATE INDEX idx_a ON t(a);
CREATE INDEX idx_b ON t(b);
-- Database can combine with bitmap scan

Database-Specific Features

-- BRIN (Block Range INdex) for large sorted tables
CREATE INDEX idx_logs_time ON logs USING BRIN (created_at);
-- Very small, great for append-only tables

-- SP-GiST for partitioned spaces
CREATE INDEX idx_ip ON networks USING SPGIST (ip_range inet_ops);

-- Bloom filter index
CREATE EXTENSION bloom;
CREATE INDEX idx_bloom ON t USING BLOOM (a, b, c);
-- Good for many-column equality searches

-- Concurrent index creation
CREATE INDEX CONCURRENTLY idx_email ON users(email);

-- Index with specific collation
CREATE INDEX idx_name ON users(name COLLATE "C");
-- Descending index (MySQL 8.0+)
CREATE INDEX idx_created ON orders(created_at DESC);

-- Functional index (MySQL 8.0.13+)
CREATE INDEX idx_email_lower ON users((LOWER(email)));

-- Invisible index (for testing)
ALTER TABLE users ALTER INDEX idx_email INVISIBLE;
-- Query won't use it, but it's still maintained

-- Prefix index for large columns
CREATE INDEX idx_description ON products(description(100));

-- InnoDB: Primary key is clustered
-- Secondary indexes include primary key automatically
-- Columnstore index (for analytics)
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_cs
ON orders(customer_id, total, created_at);

-- Filtered index
CREATE INDEX idx_pending ON orders(customer_id)
WHERE status = 'pending';

-- Index with online rebuild
ALTER INDEX idx_email ON users REBUILD WITH (ONLINE = ON);

-- Compressed index
CREATE INDEX idx_email ON users(email)
WITH (DATA_COMPRESSION = PAGE);

-- Computed column index
ALTER TABLE orders ADD total_with_tax AS (total * 1.1);
CREATE INDEX idx_total_tax ON orders(total_with_tax);

Index Maintenance

Best Practices Summary

Index Design Guidelines

  1. Start with query patterns - Index based on actual queries
  2. Use EXPLAIN - Verify indexes are being used
  3. Monitor usage - Remove unused indexes
  4. Consider writes - More indexes = slower writes
  5. Partial for subsets - Don't index what you don't query
  6. Covering for reads - Include columns to avoid lookups
  7. Expression for transformations - Index computed values
  8. Maintain regularly - Rebuild bloated indexes

Next Steps

On this page