Most “we need a bigger database” conversations are actually “we need to fix five queries” conversations. Hardware is cheap until it isn’t, and throwing RDS instance sizes at a slow application is the most expensive way to avoid reading a query plan.
This is the playbook we use when an IWWOMI client tells us their database is on fire. It assumes Postgres because that’s what most production systems we touch run on, but the principles port directly to MySQL, SQL Server, and anything else that speaks SQL.
The 80/20 rule is closer to 95/5
Pull the slow query log from any production database and you’ll find the same shape every time: a handful of queries account for the overwhelming majority of database time. Not 80/20 — more like 95/5. One bad query running 200 times a second will dwarf the cost of a hundred well-written ones.
This matters because it changes where you spend your effort. Don’t refactor the schema. Don’t migrate to a different database engine. Find the top ten queries by total time and fix those.
In Postgres, the pg_stat_statements extension gives you exactly this:
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Sort by total_exec_time, not mean_exec_time. A query that takes 2 seconds but runs once a day matters less than one that takes 5ms but runs ten thousand times a minute.
If you cannot name your top five most expensive queries from memory, you are not optimizing — you are guessing.
Indexing: when to add, when to walk away
Indexes feel free. They are not. Every index is a parallel data structure the database must update on every INSERT, UPDATE, and DELETE that touches the indexed columns. This is write amplification, and it is the reason your seemingly innocent “let me index everything that might be queried” instinct will eventually destroy your write throughput.
A missing index on a 10M-row table can take a query from 5ms to 4 seconds. The same index on a write-heavy table can take a transaction from 2ms to 20ms. Both numbers matter.
Rules we follow:
- Index columns used in
WHERE,JOIN ON, andORDER BY. If a column never appears in any of those, it doesn’t need an index. - For composite indexes, lead with the most selective column. An index on
(tenant_id, created_at)is useless if every query filters bycreated_atonly. - Use partial indexes when most rows are irrelevant. If 99% of your
orderstable isstatus = 'completed'and you only ever querystatus = 'pending', a partial index is 100x smaller and faster:
CREATE INDEX idx_orders_pending
ON orders (created_at)
WHERE status = 'pending';
- Drop indexes that aren’t used.
pg_stat_user_indexestells you which indexes have anidx_scancount of zero. Those are pure overhead.
For a deeper treatment of how B-tree indexes actually work, use-the-index-luke.com is still the best resource on the internet.
The N+1 problem and why your ORM is lying to you
The N+1 query problem is the single most common performance bug we find in code reviews. It looks innocent:
posts = Post.objects.all()
for post in posts:
print(post.author.name)
That’s one query for posts, then N queries for authors. With 500 posts you’ve just done 501 round trips to the database. At 2ms per round trip that’s a one-second response time from queries that should take 5ms total.
ORMs hide this because they make post.author look like a property access. It isn’t — it’s a network call. Every ORM has a solution: select_related / prefetch_related in Django, includes in Rails, eager loading in Sequelize, Include in EF Core. Learn yours and use it.
The fix:
posts = Post.objects.select_related('author').all()
One query. Done.
Reading EXPLAIN ANALYZE without crying
EXPLAIN ANALYZE is the single most important tool in your database toolkit. Most developers run it, see a wall of text, and close the tab. Don’t.
Take this query:
EXPLAIN ANALYZE
SELECT u.email, COUNT(o.id)
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2026-01-01'
GROUP BY u.email;
A healthy plan reads from bottom to top and looks like this:
HashAggregate (cost=... rows=12000) (actual time=180.2..183.4 rows=11842 loops=1)
-> Hash Right Join (actual time=20.1..150.3 rows=84000 loops=1)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (actual time=0.01..40.2 rows=500000 loops=1)
-> Hash (actual time=18.9..18.9 rows=12000 loops=1)
-> Index Scan using idx_users_created_at on users u
(actual time=0.03..15.2 rows=12000 loops=1)
Three things to look at, in order:
- Seq Scan on a big table. That
Seq Scan on ordersis reading 500K rows. If you usually filter orders by user, add an index onorders.user_id. - Row estimate vs actual. If the planner thinks a step returns 100 rows and it actually returns 100,000, your statistics are stale. Run
ANALYZE. - The slowest node. Time is cumulative up the tree. Find the node where the gap between its
actual timeand its children’sactual timeis largest. That’s where the work is happening.
Connection pooling: PgBouncer or bust
Postgres connections are heavy. Each one forks a backend process and allocates several megabytes of memory. A Postgres instance configured for 100 connections will fall over the moment your Node.js app fleet decides it needs 800.
Use PgBouncer. Put it between your app and Postgres. A minimal production config:
[databases]
appdb = host=db.internal port=5432 dbname=appdb
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
server_idle_timeout = 600
server_tls_sslmode = require
Two settings matter most:
pool_mode = transactionis what you want 95% of the time. It lets one Postgres connection serve thousands of clients by handing the connection over between transactions. The catch: you cannot use session-level features likeLISTEN/NOTIFY, prepared statements (without workarounds), or temporary tables.default_pool_size = 25means PgBouncer will open at most 25 connections per database+user pair. With 2000 client connections sharing 25 server connections, you’re getting 80x amplification on your Postgres connection budget.
For cloud-managed Postgres, see our cloud migration guide for notes on RDS Proxy and equivalent services.
Caching: Redis, Memcached, and the hardest problem in computer science
The two hard things in computer science are naming things, cache invalidation, and off-by-one errors. We can’t help with the first one.
Redis vs Memcached: for most cases, pick Redis. Memcached is simpler and slightly faster for pure key-value workloads, but Redis gives you sorted sets, pub/sub, Lua scripting, persistence, and replication. The performance gap is smaller than it used to be, and the feature gap matters.
Cache invalidation strategies, ranked by how much pain they cause:
- TTL-only. Set a TTL and let stale data exist for that window. Simplest. Works when stale data is acceptable for 30 seconds to 5 minutes.
- Write-through. On every write, update the cache and the database in the same transaction-ish flow. No staleness, but doubles the surface area for bugs.
- Cache-aside with explicit invalidation. Read from cache, fall through to DB on miss. On writes, delete the cache key. The default for most apps. The trap: you must invalidate every key that derives from the changed data, and lists of things are easy to forget.
- Event-driven invalidation. Publish change events from the DB (via WAL, CDC, or application events) and let cache consumers invalidate themselves. Powerful, but it’s a distributed system now.
Cache what’s expensive to compute and cheap to be wrong about. Don’t cache user balances. Do cache the home page feed.
Read replicas and sharding: know when
Read replicas are easy. Provision one, point read-only queries at it, done. The two gotchas:
- Replication lag. A read replica is always a few milliseconds (or seconds, under load) behind the primary. If a user writes a comment and immediately reloads the page reading from a replica, the comment is missing. Either route post-write reads to the primary, or accept the eventual consistency.
- You’re not actually reducing write load. Replicas replay every write. They scale reads, not writes.
Sharding is hard. Don’t do it until you have to. The signals that you have to:
- Your primary’s write IOPS is saturated and vertical scaling has run out.
- A single table is so large that maintenance operations (
VACUUM, index builds, backups) take longer than your maintenance window. - You’re already doing application-level multi-tenancy and a tenant-keyed shard scheme is natural.
Sharding turns one operational problem into N operational problems. If you adopt it, consider whether the microservices architecture pattern of database-per-service gets you most of the benefit with less coordination cost.
Postgres-specific: VACUUM, partial indexes, JSONB
A few Postgres details that bite in production:
VACUUM is not optional. Postgres uses MVCC, which means updates and deletes don’t remove rows — they mark them dead. VACUUM reclaims that space. autovacuum handles this in the background, but on high-write tables the defaults are too conservative. Tune per-table:
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_scale_factor = 0.02
);
This makes autovacuum trigger at 5% dead tuples instead of the default 20%. On a 100M-row table that’s the difference between vacuuming at 5M dead rows and 20M.
JSONB indexing. JSONB is great until you query inside it without an index and discover sequential scans on a 50GB table. GIN indexes make JSONB containment queries fast:
CREATE INDEX idx_events_payload ON events USING GIN (payload jsonb_path_ops);
-- This query now uses the index
SELECT * FROM events WHERE payload @> '{"type": "checkout"}';
Use jsonb_path_ops when you only need containment (@>); it’s smaller and faster than the default GIN operator class.
Partial indexes for soft-deletes. If your tables have a deleted_at column and every query filters WHERE deleted_at IS NULL, every index should be partial:
CREATE INDEX idx_users_email
ON users (email)
WHERE deleted_at IS NULL;
Smaller index, faster scans, and the planner uses it automatically when your query matches the predicate.
Monitoring is the optimization
You can’t optimize what you don’t measure. The minimum:
pg_stat_statementsenabled and queried weekly.- Slow query log threshold at 100ms in development, 500ms in production.
- Dashboards for connection count, replication lag, cache hit ratio (target: above 99%), and transaction rate.
- Alerts on autovacuum lag, deadlocks per minute, and replica lag exceeding your application’s tolerance.
We covered the broader observability stack in DevOps best practices — database monitoring slots in there.
For deeper reading, the official Postgres performance docs are good and short.
Need a performance audit?
If your application is slow and you can’t tell whether the problem is the database, the application code, or the infrastructure, that’s exactly the audit IWWOMI runs most often. We dig into the query log, the plans, the indexes, and the schema, and come back with a prioritized list of the five things that will actually move the latency curve.
Get in touch and we’ll take a look.