11 February 2026

Choosing the Right MongoDB Shard Key: A Data-Driven Approach

Sharding is how MongoDB scales horizontally — splitting data across multiple servers so no single machine has to handle everything. It works well when done right, but the entire strategy hinges on one critical decision: choosing the shard key.

I built a shard key analyzer that wraps MongoDB’s native analysis commands in an interactive UI, letting you evaluate and compare candidate shard keys before committing. This post covers the concepts behind sharding, what makes a good shard key, the MongoDB commands that help you decide, and how the analyzer ties it all together.

What Is Sharding?

A standalone MongoDB instance or replica set stores all your data on one set of machines. That works until it doesn’t — disk fills up, queries slow down, write throughput hits a ceiling. Sharding splits a collection’s documents across multiple shards (each shard being its own replica set), so reads and writes can be distributed.

MongoDB uses the shard key — one or more fields present in every document — to decide which shard owns each document. The shard key divides the data into chunks, and MongoDB’s balancer moves chunks between shards to keep things even.

            mongos (router)
           /       |       \
      Shard A   Shard B   Shard C
      ch 1-4    ch 5-8    ch 9-12

When a query includes the shard key, mongos routes it directly to the right shard (targeted query). When it doesn’t, mongos broadcasts the query to every shard and merges results (scatter-gather). Targeted queries are fast; scatter-gather queries get slower as you add shards.

Why the Shard Key Matters

The shard key determines three things:

How data is distributed — whether documents spread evenly or pile up on one shard
How queries are routed — whether reads hit one shard or all of them
How writes are distributed — whether inserts fan out or create hot spots

MongoDB supports resharding and refining a shard key if you need to change course after sharding, so you’re not permanently locked in. That said, resharding involves copying data across shards, so it’s worth putting in the effort upfront to choose a good key. Here are the three failure modes a poor shard key can lead to:

Hot spots — If your shard key has low cardinality (few distinct values) or is monotonically increasing (like a timestamp), all new writes land on the same shard. The other shards sit idle while one is overloaded.

Scatter-gather queries — If your application mostly queries by a field that isn’t the shard key, every query hits every shard. Adding more shards makes queries slower, not faster.

Jumbo chunks — If one shard key value appears in millions of documents, MongoDB can’t split that chunk further. You end up with an immovable chunk that keeps growing.

Shard Key Best Practices

A good shard key has three properties:

High Cardinality

The key should have many distinct values — ideally as many as there are documents. This gives MongoDB room to split data into small, balanced chunks.

Key	Cardinality	Verdict
`userId` (UUID)	Millions	Good
`email`	Millions	Good
`region`	4	Bad — only 4 chunks possible
`status`	5	Bad — data piles into a few chunks
`isActive`	2	Bad — two chunks, one probably huge

Low Frequency (Even Distribution)

Even with many distinct values, the distribution matters. If 80% of documents have the same customerId, that customer’s chunk becomes a hot spot.

Non-Monotonic

Values shouldn’t increase or decrease over time. Monotonically increasing keys (timestamps, auto-incrementing IDs, ObjectIds) cause all inserts to go to the shard that owns the maximum chunk range.

Key	Monotonic?	Problem
`createdAt`	Yes	All inserts go to one shard
`_id` (ObjectId)	Yes	ObjectId starts with a timestamp
`UUID`	No	Random distribution
`{ region: 1, orderId: 1 }`	Partially	Compound key can help — random within each region

Match Your Query Patterns

The shard key should appear in your most common queries. If 90% of your reads filter by customerId, then customerId in the shard key means 90% of reads target a single shard.

Compound Shard Keys

When no single field checks all the boxes, a compound shard key combines multiple fields. For example, { customerId: 1, createdAt: 1 } gives you:

Targeted reads — queries filtering by customerId hit one shard
Range queries within a customer — date range scans within a customer stay on one shard
Better distribution — the combination has higher cardinality than either field alone

Common Patterns by Use Case

Use Case	Good Shard Key	Why
E-commerce orders	`{ customerId: 1 }`	Most queries are per-customer
Multi-tenant SaaS	`{ tenantId: 1, entityId: 1 }`	Isolates tenants; entityId prevents jumbo chunks for large tenants
IoT / time-series	`{ deviceId: 1, timestamp: 1 }`	Distributes by device; enables time range queries
Social media posts	`{ userId: 1, createdAt: 1 }`	User timeline queries are most common

The Two MongoDB Commands That Help

Starting in MongoDB 7.0, two database commands let you evaluate shard keys with real data before committing. They work on both replica sets and sharded clusters, so you can analyze before sharding.

`configureQueryAnalyzer`

This command turns on query sampling for a collection. While active, MongoDB records a sample of the reads and writes hitting that collection into config.sampledQueries.

db.adminCommand({
  configureQueryAnalyzer: "mydb.orders",
  mode: "full",
  samplesPerSecond: 5
})

mode — "full" to start, "off" to stop
samplesPerSecond — how many queries per second to capture (1–50)

Each sampled query looks like this:

{
  "cmdName": "find",
  "ns": "mydb.orders",
  "cmd": {
    "filter": { "customerId": "c45fc6a3-89e1-46a1-861a-64ff578e2d31" }
  },
  "expireAt": "2026-03-11T04:14:21.618Z"
}

You want sampling running long enough to capture representative traffic. For production, 20–30 minutes at a low rate (1–5/sec) is a reasonable starting point. Sampled queries expire automatically via a TTL index (default ~27 days).

`analyzeShardKey`

This command evaluates a candidate shard key by examining two data sources:

db.adminCommand({
  analyzeShardKey: "mydb.orders",
  key: { customerId: 1 },
  keyCharacteristics: true,
  readWriteDistribution: true
})

Key characteristics (from the documents themselves):

Cardinality — how many distinct values the key has
Frequency — how evenly those values are distributed
Monotonicity — whether values increase/decrease over time

Read/write distribution (from sampled queries):

Read targeting — what percentage of reads would target a single shard
Write targeting — what percentage of writes would target a single shard
Scatter-gather ratio — how many queries would need to hit all shards

The analysis is read-only and doesn’t modify your data. You can run it repeatedly with different candidates. The more sampled queries available, the more accurate the read/write distribution metrics.

The Catch

These commands are powerful but not easy to use interactively. analyzeShardKey returns raw JSON with nested objects, and comparing multiple candidates means running the command multiple times and manually diffing the output. There’s no scoring, no visualization, and no side-by-side comparison. That’s the gap the shard key analyzer fills.

How the Shard Key Analyzer Helps

The shard key analyzer is a web application (React + Express) that wraps both commands in a guided workflow with scoring and visualization.

Workflow

Connect   -- enter your Atlas connection string
Explore   -- browse databases, collections, schemas, indexes
Sample    -- start configureQueryAnalyzer (runs in background)
Traffic   -- let your app run, or use the workload simulator
Analyze   -- submit candidate shard keys (runs analyzeShardKey)
Report    -- compare candidates with scores and charts

Scoring System

The analyzer translates the raw analyzeShardKey output into scores on five metrics, each weighted:

Metric	Weight	What It Measures
Cardinality	25%	Number of distinct shard key values
Frequency	20%	Evenness of value distribution
Monotonicity	15%	Whether values increase/decrease over time
Read Targeting	20%	Percentage of reads targeting a single shard
Write Targeting	20%	Percentage of writes targeting a single shard

Each metric scores 0–100, and the weighted average gives an overall score with a letter grade:

Grade	Score	Meaning
A	90–100	Excellent candidate
B	80–89	Good candidate
C	70–79	Fair — consider alternatives
D	60–69	Poor — likely problematic
F	<60	Avoid — will cause issues

For example, { status: 1 } on an e-commerce orders collection might score:

Cardinality: 15/100 (only 5 values)
Frequency: 30/100 (most orders are “delivered”)
Monotonicity: 100/100 (not monotonic)
Read Targeting: 20/100 (few queries filter by status alone)
Write Targeting: 25/100 (updates scatter across shards)
Overall: 36/100 (F)

While { customerId: 1 } on the same collection:

Cardinality: 95/100 (millions of customers)
Frequency: 85/100 (fairly even distribution)
Monotonicity: 100/100 (UUIDs aren’t monotonic)
Read Targeting: 90/100 (most queries filter by customer)
Write Targeting: 80/100 (writes naturally target by customer)
Overall: 90/100 (A)

Warnings

The analyzer generates warnings when it detects patterns that indicate problems:

Low cardinality — fewer than 100 distinct values
Monotonic key — values increase or decrease over time
Hot spot detected — one value appears far more often than average
High scatter-gather — more than 50% of reads hit all shards
Shard key updates — writes that modify the shard key field (causes document migration between shards)

Candidate Recommendations

Before you even run the analysis, the explorer page generates candidate suggestions by sampling 100 documents and analyzing field characteristics:

Fields named customerId, userId, or tenantId get a boost
Fields named timestamp or createdAt get a penalty
High-cardinality fields score higher
Fields with many nulls are penalized
Compound key combinations are generated from the top candidates

This gives you a starting point — you’re not guessing at random field combinations.

Workload Simulator

If you don’t have real application traffic yet (maybe you’re designing a new system), the built-in workload simulator generates realistic query patterns. Three profiles are available:

E-commerce — customer lookups, order placement, status updates, regional aggregations
Social Media — user feeds, post creation, likes, comments, trending queries
IoT — device data writes, time-range reads, device lookups

The simulator runs real reads and writes against your database, so use it only on test data. For production collections, use your actual application traffic instead.

Getting Started

Prerequisites

Requirement	Details
MongoDB	7.0+ on Atlas (M30+ for sharding) or any replica set
Node.js	18+
Database User	`clusterManager` role (covers all needed commands)
Network	Your IP in the Atlas whitelist

Setup

git clone https://github.com/tzehon/research.git
cd research/shard-key-analyzer
npm install
npm run dev

This starts the Express backend on port 3001 and the Vite dev server on port 5173. Open http://localhost:5173.

Step-by-Step Walkthrough

1. Connect — Enter your MongoDB connection string:

mongodb+srv://<username>:<password>@<cluster>.xxxxx.mongodb.net/

The app verifies your MongoDB version (7.0+), deployment type (replica set or sharded), and permissions.

2. Explore — Browse your databases and collections. Select a collection to see its schema, document count, indexes, and field-level statistics. The explorer also generates initial shard key candidate suggestions based on field analysis.

3. Start Sampling — Go to the Sampling page, set a sampling rate (start with 1–5/sec for production), and click Start. This runs configureQueryAnalyzer in the background. Leave it running while you generate traffic.

4. Generate Traffic — Either let your real application run its normal workload (recommended), or use the Workload page to run the simulator against test data. The sampling captures queries as they flow in. More queries means better read/write distribution data.

5. Analyze Candidates — Go to the Analysis page. Pick candidates from the suggestions or add your own (e.g., { customerId: 1 }, { customerId: 1, createdAt: 1 }). Click Analyze. The tool runs analyzeShardKey for each candidate and calculates scores.

6. Review the Report — The report page shows:

Overall recommendation with the highest-scoring candidate
Radar charts comparing all candidates across the five metrics
Detailed score breakdowns with explanations
Warnings about potential issues
Side-by-side comparison bars

Supporting Indexes

Before you actually shard a collection, MongoDB requires a supporting index that matches or prefixes your shard key. The analyzer checks for this and shows you the exact createIndex command if one is needed:

db.orders.createIndex({ customerId: 1 })

If you’re evaluating { customerId: 1, createdAt: 1 }, you need an index on { customerId: 1, createdAt: 1 } (or an index where those fields are a prefix).

Production Safety

If you’re running this against a production cluster, be aware of the overhead each operation introduces:

Operation	Impact	Recommendation
`configureQueryAnalyzer`	CPU/IO overhead for intercepting queries	Low sampling rate (1–5/sec)
`analyzeShardKey`	Read-only, reads documents proportional to sample size	Default 10,000 is fine; avoid during peak hours
Workload Simulator	Runs real reads and writes	Never use on production
Sample Data Loading	Inserts documents	Test databases only

The safest production workflow: connect, start sampling at a low rate, let your real application generate traffic for 20–30 minutes, then analyze.

Permissions Quick Reference

Command	Minimum Role
`configureQueryAnalyzer`	`dbAdmin` on the target database, or `clusterManager`
`analyzeShardKey`	`enableSharding` on the collection, or `clusterManager`
Delete sampled queries	`clusterManager`

The clusterManager role covers everything. On Atlas, the Atlas Admin built-in role includes it.

Resources

Choose a Shard Key — MongoDB’s guide to shard key traits
analyzeShardKey command reference
configureQueryAnalyzer command reference
Resharding a collection
Refining a shard key
Sharding overview
Shard key analyzer source

tth

Personal technical blog