SQL mistakes that kill performance and how to fix them
Understanding common SQL anti-patterns helps you write efficient queries from the start. These patterns appear repeatedly in slow-performing applications.
-- Anti-pattern: SELECT * fetches every column
SELECT * FROM orders WHERE customer_id = 123; -- [!code error]
-- Why it's bad:
-- 1. Transfers unnecessary data
-- 2. Cannot use index-only scans
-- 3. Breaks if schema changes
-- 4. Includes large TEXT/BLOB columns
-- Fix: name only the columns you need
SELECT
    id,
    total,
    status,
    created_at
FROM orders
WHERE customer_id = 123;

-- With covering index, enables index-only scan
CREATE INDEX idx_orders_customer
    ON orders (customer_id)
    INCLUDE (id, total, status, created_at);
# Application code:customers = query("SELECT * FROM customers LIMIT 100")for customer in customers: -- [!code error] orders = query(f""" -- [!code error] SELECT * FROM orders -- [!code error] WHERE customer_id = {customer.id} -- [!code error] """) -- [!code error] # Process orders...# Results in 101 queries:# 1 for customers + 100 for orders
-- Solution 1: JOIN
-- NOTE: limit the *customers* (in a derived table), not the joined result.
-- A bare "LIMIT 100" after the join would cap the number of order ROWS,
-- which is not equivalent to "100 customers with all their orders".
SELECT c.*, o.*
FROM (SELECT * FROM customers LIMIT 100) c
LEFT JOIN orders o ON c.id = o.customer_id;

-- Solution 2: Batch fetch
SELECT * FROM orders
WHERE customer_id IN (1, 2, 3, ... 100);

-- Solution 3: Subquery
SELECT * FROM orders
WHERE customer_id IN (
    SELECT id FROM customers LIMIT 100
);
-- PostgreSQL: Find repeated queries
-- mean_exec_time requires pg_stat_statements 1.8+ (PostgreSQL 13+);
-- on older versions the column is named mean_time.
SELECT
    query,
    calls,
    mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%customer_id = $1%'
ORDER BY calls DESC;
-- Look for:
-- - High call counts with similar query patterns
-- - Many queries with different literal values
-- - Sequential query execution patterns

-- MySQL: Enable general log temporarily
SET GLOBAL general_log = 'ON';
-- Check mysql.general_log table
-- Anti-pattern: a function wrapped around the column defeats the index
SELECT * FROM users WHERE LOWER(email) = 'user@example.com'; -- [!code error]
SELECT * FROM orders WHERE YEAR(created_at) = 2024; -- [!code error]
SELECT * FROM products WHERE CONCAT(category, '-', name) = 'Electronics-Phone'; -- [!code error]
-- Solution 1: Store normalized data
-- Intentional full-table UPDATE (no WHERE): one-time normalization pass.
UPDATE users SET email = LOWER(email);
SELECT * FROM users WHERE email = 'user@example.com';

-- Solution 2: Expression index
CREATE INDEX idx_email_lower ON users (LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- Solution 3: Range instead of function (half-open interval)
SELECT * FROM orders
WHERE created_at >= '2024-01-01'
  AND created_at < '2025-01-01';

-- Solution 4: Generated column
-- A generated column requires an explicit data type (MySQL and PostgreSQL).
-- PostgreSQL note: use the || operator instead of CONCAT() here, since the
-- generation expression must be immutable.
ALTER TABLE products
    ADD COLUMN full_key VARCHAR(255)
    GENERATED ALWAYS AS (CONCAT(category, '-', name)) STORED;
CREATE INDEX idx_full_key ON products (full_key);
-- OR on different columns prevents optimal index use
SELECT *
FROM products
WHERE category_id = 5 OR supplier_id = 10; -- [!code warning]
-- Database might:
-- 1. Do full table scan
-- 2. Do bitmap scans and combine (better)
-- 3. Use neither index efficiently
-- Solution 1: UNION ALL (often faster)
-- The second branch excludes rows already returned by the first.
-- NULL-safe: a bare "category_id <> 5" would silently drop rows where
-- category_id IS NULL (three-valued logic), making this NOT equivalent
-- to the original OR query.
SELECT * FROM products WHERE category_id = 5
UNION ALL
SELECT * FROM products
WHERE supplier_id = 10
  AND (category_id <> 5 OR category_id IS NULL);

-- Solution 2: UNION (deduplicates)
SELECT * FROM products WHERE category_id = 5
UNION
SELECT * FROM products WHERE supplier_id = 10;

-- Solution 3: Composite index (if pattern is common)
CREATE INDEX idx_category_supplier ON products (category_id, supplier_id);
-- ✅ OR on same column uses index
SELECT *
FROM products
WHERE category_id = 5 OR category_id = 10;
-- Equivalent to: IN (5, 10)

-- ✅ OR with bitmap scans (PostgreSQL)
-- Database can combine multiple index scans
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE category_id = 5 OR supplier_id = 10;
-- Look for: BitmapOr with multiple Bitmap Index Scans
-- String compared to integer (index on customer_id INTEGER)
SELECT * FROM orders WHERE customer_id = '123'; -- [!code error]
-- Database may cast every row's customer_id to string

-- Different string types
SELECT * FROM users WHERE phone = 12345; -- [!code error]
-- phone is VARCHAR

-- Date/timestamp mismatch
SELECT * FROM orders WHERE created_at = '2024-01-01'; -- [!code error]
-- created_at is TIMESTAMP, '2024-01-01' is DATE
-- Match types exactly
SELECT * FROM orders WHERE customer_id = 123;

-- Explicit casting
SELECT * FROM users WHERE phone = '12345';

-- Proper date handling (half-open timestamp range)
SELECT * FROM orders
WHERE created_at >= '2024-01-01 00:00:00'::timestamp
  AND created_at < '2024-01-02 00:00:00'::timestamp;

-- Or use date range with expression index
SELECT * FROM orders WHERE created_at::date = '2024-01-01';
-- Check column types
\d+ table_name

-- PostgreSQL: Look for implicit casts in EXPLAIN
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = '123';
-- May show: Filter: ((customer_id)::text = '123'::text)
-- Or check for seq scan where index expected
-- No index on frequently filtered column
SELECT * FROM logs WHERE event_type = 'error'; -- [!code error]
-- Full table scan on millions of rows

-- Wrong column order in composite index
-- Index: (status, customer_id)
SELECT * FROM orders WHERE customer_id = 123; -- [!code warning]
-- Cannot efficiently use index (leftmost prefix rule)

-- Too many indexes
-- Every INSERT/UPDATE must maintain all indexes
-- More indexes = slower writes
-- Index on filter columns
CREATE INDEX idx_logs_event_type ON logs (event_type);

-- Correct column order (query-driven)
-- For: WHERE customer_id = ? AND status = ?
CREATE INDEX idx_orders ON orders (customer_id, status);

-- Cover common queries
CREATE INDEX idx_orders_covering
    ON orders (customer_id)
    INCLUDE (total, status);

-- Partial index for subset
CREATE INDEX idx_active_orders
    ON orders (customer_id)
    WHERE status = 'active';
-- PostgreSQL: Find missing indexes
-- NOTE: "table" is a reserved keyword; an unquoted alias "AS table" is a
-- syntax error in PostgreSQL, so alias as table_name instead.
-- NULLIF guards against division by zero when seq_scan is 0.
SELECT
    relname AS table_name,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_tup_read / NULLIF(seq_scan, 0) AS avg_rows_per_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC;
-- High seq_tup_read with low idx_scan = missing index

-- MySQL: Analyze table
SHOW INDEX FROM table_name;
EXPLAIN SELECT ...; -- Look for full table scans

-- SQL Server: Missing index DMVs
SELECT * FROM sys.dm_db_missing_index_details;
-- OFFSET-based pagination degrades with page number
SELECT *
FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 100000; -- [!code error]
-- Database must:
-- 1. Find 100,020 rows
-- 2. Sort them all
-- 3. Skip 100,000
-- 4. Return 20
-- Page 1: Fast
-- Page 5000: Very slow
-- Cursor-based (keyset) pagination

-- First page
SELECT *
FROM products
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page (use last row's values)
-- Row-value comparison seeks directly to the cursor position.
SELECT *
FROM products
WHERE (created_at, id) < ('2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- With index: (created_at DESC, id DESC)
-- Always O(1) regardless of page number
CREATE INDEX idx_products_cursor ON products (created_at DESC, id DESC);
-- ✅ Indexed columns for sorting
CREATE INDEX idx_products_date_id ON products (created_at DESC, id DESC);

-- ✅ Limit maximum offset
SELECT *
FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET LEAST(1000, $offset); -- Cap at 50 pages

-- ✅ Use estimated counts for UI
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE relname = 'products';
-- Faster than COUNT(*) for display purposes