-
|
We have an orders table with over 50 million rows. Queries come in many forms: Filter by user_id + created_at (most common) Filter by status + created_at (for background jobs) Filter by region + status (for reporting) Full‑text search on notes column (rare) We currently have single‑column indexes on user_id, status, and created_at. Some queries are fast, others are slow (full table scan). Adding more indexes slows down inserts/updates. What’s the right indexing strategy? Should we use composite indexes? Covering indexes? Partial indexes? How many indexes is “too many”? |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 1 reply
-
|
You’ve hit the classic indexing dilemma – speed vs write overhead. There’s no single perfect index, but here’s a practical approach for PostgreSQL / MySQL (similar logic applies elsewhere) 1. Replace single‑column indexes with composite onesYour single indexes on user_id and created_at are partially useful but not optimal. The database can only use one index per table reference in most plans (unless bitmap scan). Create composite indexes that match the most common filtering order: Query pattern Best composite index 2. Use covering indexes for the most critical queriesA covering index includes all columns that the query needs, so the database never touches the table itself Example: a query that only needs id, user_id, created_at, and amount: CREATE INDEX idx_covering ON orders (user_id, created_at, id, amount);Now EXPLAIN will show Using index (no table access). This is extremely fast but adds more maintenance 3. Partial indexes for rare conditionsIf a query always looks for status = 'pending' (only 1% of rows), a partial index saves space and overhead: CREATE INDEX idx_pending_created ON orders (created_at) WHERE status = 'pending';The index is tiny and only updated when rows change from/to pending 4. Full‑text search – separate solutionDon’t put a B‑Tree index on notes. Use PostgreSQL’s GIN index with tsvector: CREATE INDEX idx_notes_gin ON orders USING GIN (to_tsvector('english', notes));Or better, move search to a dedicated tool like Elasticsearch or Meilisearch if it’s critical 5. How many indexes is “too many”?A rough guideline for an OLTP table (frequent writes):
|
Beta Was this translation helpful? Give feedback.
You’ve hit the classic indexing dilemma – speed vs write overhead. There’s no single perfect index, but here’s a practical approach for PostgreSQL / MySQL (similar logic applies elsewhere)
1. Replace single‑column indexes with composite ones
Your single indexes on user_id and created_at are partially useful but not optimal. The database can only use one index per table reference in most plans (unless bitmap scan). Create composite indexes that match the most common filtering order:
Query pattern Best composite index
WHERE user_id = ? AND created_at > ?(user_id, created_at)– user_id first (equality), then created_at (range)WHERE status = ? AND created_at > ?(status, created_at)– statu…