Azure Cosmos DB
Cosmos DB is Microsoft’s globally distributed, multi-model NoSQL database. It stores JSON documents (and other data models) with single-digit-millisecond reads and writes at any scale, in any Azure region. You get tunable consistency, automatic indexing, and a partition-based architecture that scales horizontally without redesigning your data layer.
The short version: if your application needs fast, flexible document storage that works globally, Cosmos DB is the managed option that removes most of the operational burden.
Resource Hierarchy
Cosmos DB organizes resources in a clear hierarchy. Understanding this structure matters because cost, throughput, and data distribution decisions happen at different levels.
graph TD
A[Cosmos DB Account] --> B[Database 1]
A --> C[Database 2]
B --> D[Container A]
B --> E[Container B]
D --> F["Items (JSON documents)"]
D --> G["Partition Key (e.g. /tenantId)"]
E --> H["Items (JSON documents)"]
E --> I["Partition Key (e.g. /userId)"]
style A fill:#2d5aa0,color:#fff
style B fill:#3a7bc8,color:#fff
style C fill:#3a7bc8,color:#fff
style D fill:#4a9e5c,color:#fff
style E fill:#4a9e5c,color:#fff
- Account - top-level resource. Defines the API model (NoSQL, MongoDB, etc.), regions, networking, and backup policy. One account can span multiple Azure regions.
- Database - logical grouping for containers. Throughput can be provisioned at the database level (shared across containers) or at individual containers.
- Container - where your data lives. This is the primary unit for throughput, indexing policy, and partition strategy. Container design is the most consequential architectural decision.
- Item - a single JSON document (or row/node/entity depending on the API). Each item belongs to exactly one logical partition within its container.
Partition Keys
The partition key is the single most important design choice in Cosmos DB. It determines how your data is distributed across physical partitions, which directly affects performance, cost, and query efficiency.
What it is: A property path in your documents (like /userId or /tenantId) that Cosmos DB uses to place items into logical partitions. Items with the same partition key value live together and can be read in a single, efficient operation.
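To make the idea of a property path concrete, here is a minimal in-memory sketch of how a path such as /tenantId selects a value from each item and groups items into logical partitions. The helper names (`partition_key_value`, `group_by_logical_partition`) and the sample items are hypothetical, and the sketch raises on a missing property, which is a simplification and not a statement about how the service handles that case.

```python
from typing import Any


def partition_key_value(item: dict[str, Any], path: str) -> Any:
    """Resolve a partition key path such as "/tenantId" against a JSON-like item."""
    value: Any = item
    for segment in path.strip("/").split("/"):
        value = value[segment]  # this sketch raises KeyError if the property is missing
    return value


def group_by_logical_partition(items: list[dict[str, Any]], path: str) -> dict[Any, list]:
    """Group items by their partition key value, one bucket per logical partition."""
    partitions: dict[Any, list] = {}
    for item in items:
        partitions.setdefault(partition_key_value(item, path), []).append(item)
    return partitions


# Hypothetical sample data for illustration only.
items = [
    {"id": "1", "tenantId": "contoso", "name": "Ada"},
    {"id": "2", "tenantId": "contoso", "name": "Grace"},
    {"id": "3", "tenantId": "fabrikam", "name": "Linus"},
]
partitions = group_by_logical_partition(items, "/tenantId")
```

Both contoso items land in the same logical partition, which is why a query filtered on that tenant does not need to touch fabrikam's data.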
Why it matters:
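- Queries scoped to a single partition key value are routed to one logical partition, which keeps them fast and cheap. A point read (item id plus partition key value) is the cheapest operation of all.
- Queries that span multiple partition keys become fan-out (cross-partition) queries, which consume more RUs and take longer.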
- Write-heavy workloads need partition keys that distribute traffic evenly to avoid hot partitions.
How to choose:
| Good partition keys | Why they work |
|---|---|
| userId | High cardinality, natural access pattern for user-centric apps |
| tenantId | Groups all tenant data together, works well for multi-tenant SaaS |
| orderId | Each order is independent, distributes writes evenly |
| deviceId | IoT scenarios where each device writes its own telemetry |
| sessionId | Session state accessed as a unit |
Common mistakes:
- Choosing a low-cardinality key (like status or country) creates hot partitions.
- Choosing a key that does not align with your read patterns forces expensive cross-partition queries.
- Using a timestamp as partition key concentrates all recent writes on one partition.
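The effect of cardinality on write distribution can be sketched with a small simulation. This is a toy model under stated assumptions: it uses SHA-256 modulo a bucket count as a stand-in for the service's internal hashing, and the bucket count, write count, and key values are all made up for illustration.

```python
import hashlib
from collections import Counter


def bucket_for(key_value: str, partition_count: int) -> int:
    """Map a partition key value to a bucket (stand-in for a physical partition)."""
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def write_distribution(key_values, partition_count: int) -> Counter:
    """Count how many writes land in each bucket."""
    return Counter(bucket_for(value, partition_count) for value in key_values)


def hottest_share(distribution: Counter) -> float:
    """Fraction of all writes absorbed by the busiest bucket."""
    return max(distribution.values()) / sum(distribution.values())


PARTITIONS = 10   # assumed bucket count
WRITES = 10_000   # assumed number of writes

# Low cardinality: only three distinct status values.
statuses = ["active", "pending", "closed"]
by_status = write_distribution((statuses[i % 3] for i in range(WRITES)), PARTITIONS)

# High cardinality: one distinct value per user.
by_user = write_distribution((f"user-{i}" for i in range(WRITES)), PARTITIONS)

print(f"busiest bucket share by status: {hottest_share(by_status):.0%}")
print(f"busiest bucket share by userId: {hottest_share(by_user):.0%}")
```

With three distinct values, at most three buckets ever receive writes and the busiest one takes at least a third of the traffic, while the per-user key spreads writes across all ten.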
When no single property works well, Cosmos DB supports hierarchical partition keys (e.g., /tenantId + /userId) to get both grouping and distribution.
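As a rough sketch of what a hierarchical key looks like in configuration terms, the helper below builds a partition key definition as a plain dictionary. The `kind` values ("Hash" for a single path, "MultiHash" with version 2 for a hierarchy) and the three-level limit reflect my understanding of the NoSQL API and should be verified against current documentation; the function names are hypothetical.

```python
def partition_key_definition(*paths: str) -> dict:
    """Build a partition key definition for one path or a hierarchy of paths.

    Assumption: hierarchical keys use kind "MultiHash" (version 2) and allow
    at most three levels. Verify against the current service documentation.
    """
    if not 1 <= len(paths) <= 3:
        raise ValueError("expected between one and three partition key paths")
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"partition key path must start with '/': {path!r}")
    if len(paths) == 1:
        return {"paths": list(paths), "kind": "Hash"}
    return {"paths": list(paths), "kind": "MultiHash", "version": 2}


def key_values(item: dict, definition: dict) -> list:
    """Extract the ordered key values an item contributes, one per level."""
    return [item[path.lstrip("/")] for path in definition["paths"]]


definition = partition_key_definition("/tenantId", "/userId")
values = key_values({"id": "1", "tenantId": "contoso", "userId": "u-17"}, definition)
```

The first level (/tenantId) keeps a tenant's data addressable as a group, while the second level (/userId) gives a large tenant room to spread out.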
Request Units (RU/s)
Cosmos DB measures throughput in Request Units per second (RU/s). Every operation (read, write, query, index update) costs a specific number of RUs. Think of RU/s as a currency: you provision a budget, and your operations spend against it.
A single point read of a 1 KB item costs 1 RU. Everything else is relative to that baseline.
| Operation | Approximate cost |
|---|---|
| Point read (1 KB) | 1 RU |
| Write (1 KB) | ~5-6 RU |
| Query (returns 1 item, in-partition) | ~3 RU |
| Cross-partition query | Varies, significantly more |
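The table lends itself to back-of-the-envelope capacity planning. The sketch below multiplies the approximate per-operation costs by an expected request rate and adds headroom. The 5.5 RU write cost is the midpoint of the range above; the 20% headroom, the rounding to multiples of 100 RU/s, and the workload numbers are assumptions for illustration. Real costs depend on item size, indexing policy, and query shape.

```python
import math

# Approximate per-operation costs for 1 KB items, taken from the table above.
RU_COST = {"point_read": 1.0, "write": 5.5, "query": 3.0}


def required_ru_per_second(ops_per_second: dict[str, float], headroom: float = 0.2) -> int:
    """Estimate RU/s to provision, rounded up to the next multiple of 100.

    headroom is an assumed safety margin (0.2 means 20% extra).
    """
    base = sum(RU_COST[op] * rate for op, rate in ops_per_second.items())
    return math.ceil(base * (1 + headroom) / 100) * 100


# Hypothetical workload: operations per second by type.
workload = {"point_read": 500, "write": 100, "query": 50}
estimate = required_ru_per_second(workload)
print(f"provision roughly {estimate} RU/s")
```

Here the base load is 500 + 550 + 150 = 1,200 RU/s, and the headroom and rounding bring the estimate to 1,500 RU/s.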
Two capacity models:
- Provisioned throughput - you set a fixed RU/s budget, or enable autoscale, which scales between 10% of a maximum you choose and that maximum. Good for predictable workloads. You pay for provisioned capacity whether you use it or not.
- Serverless - pay per RU consumed, no provisioning. Good for dev/test, bursty workloads, or low-traffic applications. Each container has a throughput cap, commonly cited as 5,000 RU/s; the limit has changed over time, so check the current service quotas.
Consistency Levels
Cosmos DB offers five consistency levels, from strongest to weakest. This is a spectrum, not a binary choice. Each level trades off read freshness against latency and availability.
| Level | Guarantee | Best for |
|---|---|---|
| Strong | Reads always return the most recent committed write. Linearizable. | Financial transactions, inventory systems where correctness is critical |
| Bounded Staleness | Reads lag behind writes by at most K versions or T seconds | Multi-region reads that need near-strong consistency, typically with a single write region |
| Session | Within a session, reads see that session’s own writes (read-your-writes) | Most applications. The practical default. |
| Consistent Prefix | Reads never see out-of-order writes, but may lag | Scenarios where ordering matters more than immediacy |
| Eventual | No ordering or freshness guarantee. Lowest latency. | High-throughput reads where slight staleness is acceptable |
Session consistency is the default and the right choice for most applications. It gives you read-your-writes within a session while keeping latency low and allowing multi-region distribution.
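The read-your-writes guarantee can be illustrated with a toy model. The classes below are hypothetical and heavily simplified: one write replica, one lagging read replica, and a single integer sequence number standing in for a session token. Real session tokens are more involved, so treat this as a sketch of the idea, not of the service's implementation.

```python
class Replica:
    """Holds data plus the sequence number of the last write it has applied."""

    def __init__(self):
        self.lsn = 0
        self.data = {}

    def apply(self, lsn, key, value):
        self.data[key] = value
        self.lsn = lsn


class ToyAccount:
    """One write replica and one read replica that lags until replicate() runs."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.log = []

    def write(self, key, value) -> int:
        lsn = self.primary.lsn + 1
        self.primary.apply(lsn, key, value)
        self.log.append((lsn, key, value))
        return lsn  # returned to the client as its session token

    def replicate(self):
        for lsn, key, value in self.log:
            if lsn > self.secondary.lsn:
                self.secondary.apply(lsn, key, value)

    def read_eventual(self, key):
        # No freshness guarantee: read whatever the lagging replica has.
        return self.secondary.data.get(key)

    def read_session(self, key, session_token):
        # Only use a replica that has caught up to the client's own writes.
        caught_up = self.secondary.lsn >= session_token
        replica = self.secondary if caught_up else self.primary
        return replica.data.get(key)


account = ToyAccount()
token = account.write("cart", "v1")
stale = account.read_eventual("cart")        # lagging replica has not seen the write
fresh = account.read_session("cart", token)  # session read sees the client's own write
account.replicate()
after = account.read_eventual("cart")
```

The session read stays correct for the writer without forcing every other reader to wait for replication, which is the trade-off that makes Session a practical default.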
Change Feed
The change feed is an ordered, persistent stream of changes to items in a container. Every create and update is captured in the order it occurred within each logical partition.
What you can do with it:
- Trigger downstream processing when documents change (event-driven patterns).
- Materialize views or projections in other containers or services.
- Build real-time notification pipelines.
- Replicate data to external systems.
- Feed event sourcing architectures.
The change feed integrates directly with Azure Functions (via Cosmos DB trigger), or you can consume it with the change feed processor library for more control over checkpointing and error handling.
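Note: In its default (latest version) mode, the change feed captures creates and updates but not deletes. To surface a deletion, use a soft-delete pattern: set a flag such as deleted on the item, optionally with a TTL so the item is removed later, so that the update appears in the feed.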
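The checkpointing and soft-delete ideas can be sketched with an in-memory stand-in for a container. Everything here is hypothetical (`ToyContainer`, `read_changes`, the `deleted` flag name, the 3600-second TTL), and the toy keeps every item version in the feed, which is a simplification. In practice you would use the Azure Functions trigger or the change feed processor library instead of writing this yourself.

```python
from collections import defaultdict


class ToyContainer:
    """In-memory stand-in: one ordered list of changes per logical partition."""

    def __init__(self, partition_key: str):
        self.partition_key = partition_key
        self.feed = defaultdict(list)

    def upsert(self, item: dict) -> None:
        self.feed[item[self.partition_key]].append(dict(item))

    def soft_delete(self, item: dict) -> None:
        # A soft delete is just an update, so it shows up in the feed.
        # "deleted" and the TTL value are assumed names and values.
        self.upsert({**item, "deleted": True, "ttl": 3600})


def read_changes(container: ToyContainer, checkpoints: dict) -> list:
    """Return changes after each partition's checkpoint, then advance the checkpoint."""
    changes = []
    for key_value, versions in container.feed.items():
        start = checkpoints.get(key_value, 0)
        changes.extend(versions[start:])
        checkpoints[key_value] = len(versions)
    return changes


container = ToyContainer("userId")
checkpoints: dict = {}

container.upsert({"id": "1", "userId": "u1", "state": "new"})
container.upsert({"id": "1", "userId": "u1", "state": "active"})
first_batch = read_changes(container, checkpoints)

container.soft_delete({"id": "1", "userId": "u1", "state": "active"})
second_batch = read_changes(container, checkpoints)
third_batch = read_changes(container, checkpoints)
```

Because the checkpoint advances only after a batch is returned, a consumer that restarts picks up where it left off and does not reprocess the whole container.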
Multi-Model APIs
Cosmos DB supports multiple wire protocols through different API options. You choose the API when creating an account.
| API | Wire protocol | Use when |
|---|---|---|
| NoSQL | Cosmos DB native (SQL-like query syntax) | New projects, full feature access, best integration with Azure |
| MongoDB | MongoDB wire protocol | Migrating from MongoDB, using MongoDB drivers and tools |
| Cassandra | CQL (Cassandra Query Language) | Migrating from Apache Cassandra |
| Gremlin | Gremlin (Apache TinkerPop) | Graph traversal workloads |
| Table | Azure Table Storage protocol | Simple key-value, migrating from Table Storage |
The NoSQL API gives you the richest feature set and the most direct access to Cosmos DB capabilities. The compatibility APIs are primarily useful for lift-and-shift migrations.
Vector Search
Cosmos DB supports vector indexing and search natively within the NoSQL API. This is useful for AI and RAG (Retrieval-Augmented Generation) scenarios where you want to store embeddings alongside your application data rather than maintaining a separate vector database.
You can define vector policies on containers, index embedding fields, and run vector similarity searches (cosine, dot product, Euclidean) directly in your queries. This keeps your vectors co-located with the documents they describe, simplifying architecture for AI-enhanced applications.
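To show what the three similarity measures compute, here is a small pure-Python sketch that ranks documents locally. The policy dictionaries and the query text at the end follow the shapes I understand the NoSQL API to use (a `vectorEmbeddings` policy, a `vectorIndexes` entry in the indexing policy, and the `VectorDistance` function); treat them as assumptions to check against current documentation. The /embedding path, the three-dimensional vectors, and the sample documents are made up.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norms = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norms


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vector, documents, k):
    """Rank documents by cosine similarity to the query vector, best first."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical documents with tiny embeddings stored next to application data.
documents = [
    {"id": "d1", "title": "Partition keys", "embedding": [1.0, 0.0, 0.0]},
    {"id": "d2", "title": "Billing", "embedding": [0.0, 1.0, 0.0]},
    {"id": "d3", "title": "Hot partitions", "embedding": [0.9, 0.1, 0.0]},
]
results = top_k([1.0, 0.0, 0.0], documents, k=2)

# Assumed policy and query shapes; verify before use.
VECTOR_EMBEDDING_POLICY = {
    "vectorEmbeddings": [
        {"path": "/embedding", "dataType": "float32",
         "distanceFunction": "cosine", "dimensions": 3}
    ]
}
VECTOR_INDEXES = [{"path": "/embedding", "type": "quantizedFlat"}]
QUERY_TEXT = (
    "SELECT TOP 2 c.id, VectorDistance(c.embedding, @queryVector) AS score "
    "FROM c ORDER BY VectorDistance(c.embedding, @queryVector)"
)
```

The local ranking is only there to make the math visible; in the service, the index and the query do this work next to the data.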
Common Use Cases
- Real-time web and mobile apps - user profiles, session state, content feeds with low-latency global reads.
- IoT and telemetry - high-throughput device data ingestion with partition-per-device patterns.
- E-commerce - product catalogs, shopping carts, order tracking with flexible schemas.
- Gaming - player state, leaderboards, session data across regions.
- Content management - flexible document schemas that evolve without migrations.
- AI/RAG applications - storing documents alongside vector embeddings for similarity search.
- Multi-tenant SaaS - tenant-isolated data with partition-based separation.
In Entra-Adjacent Systems
Cosmos DB shows up in systems built around Microsoft Entra when those systems need durable application state that does not belong inside Entra or Microsoft Graph. Common patterns include:
- Storing workflow checkpoints, reconciliation records, and onboarding progress for identity automation.
- Tracking sync state between Graph and downstream target systems.
- Coordinating job ownership and retry metadata across distributed workers.
- Using the change feed to trigger downstream processing when identity-related state changes.
In these cases, Cosmos DB acts as the application state layer while Entra and Graph remain the identity source of truth.
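One way to coordinate job ownership across workers is optimistic concurrency: a worker only wins a job if the item has not changed since the worker read it. Cosmos DB exposes this through an item's `_etag` and a conditional (if-match) replace. The sketch below imitates that with an in-memory store, so the class, exception, and field names (`ToyStateStore`, `PreconditionFailed`, `owner`, `attempts`) are hypothetical.

```python
import uuid


class PreconditionFailed(Exception):
    """Raised when the caller's etag no longer matches the stored item."""


class ToyStateStore:
    """In-memory stand-in for a container with etag-checked replaces."""

    def __init__(self):
        self._items = {}

    def create(self, item: dict) -> dict:
        stored = {**item, "_etag": uuid.uuid4().hex}
        self._items[item["id"]] = stored
        return dict(stored)

    def read(self, item_id: str) -> dict:
        return dict(self._items[item_id])

    def replace(self, item: dict, if_match: str) -> dict:
        current = self._items[item["id"]]
        if current["_etag"] != if_match:
            raise PreconditionFailed(item["id"])
        stored = {**item, "_etag": uuid.uuid4().hex}  # every write gets a new etag
        self._items[item["id"]] = stored
        return dict(stored)


def try_claim(store: ToyStateStore, job_id: str, worker: str) -> bool:
    """Claim an unowned job; return False if it is owned or another worker got there first."""
    job = store.read(job_id)
    if job["owner"] is not None:
        return False
    try:
        store.replace({**job, "owner": worker}, if_match=job["_etag"])
        return True
    except PreconditionFailed:
        return False


store = ToyStateStore()
store.create({"id": "sync-42", "owner": None, "attempts": 0})

# Two workers read the same version, then both try to claim it.
snapshot_a = store.read("sync-42")
snapshot_b = store.read("sync-42")
store.replace({**snapshot_a, "owner": "worker-a"}, if_match=snapshot_a["_etag"])
try:
    store.replace({**snapshot_b, "owner": "worker-b"}, if_match=snapshot_b["_etag"])
    b_won = True
except PreconditionFailed:
    b_won = False
```

The second worker's write is rejected because its snapshot is stale, so exactly one worker ends up owning the job without any external lock.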