Reliable Worker with Service Bus
Scenario Setup
An onboarding system needs to process access requests for many tenants. Each request turns into a series of steps: validate the Graph-side identity record, provision a downstream entitlement, record the result, and notify operators if a step fails repeatedly.
Each message is a unit of work with consequences, so the architecture needs a broker that supports retries, message settlement, and inspection of failures.
Why this pattern exists
This pattern exists because some identity automation is really workflow coordination, not event streaming.
Use Service Bus for Workflows when the message means “someone must complete this step” rather than “here is another event in the stream.” Builders choose it when delivery guarantees, sessions, and failure handling are part of the contract.
Main architecture
The flow usually looks like this:
- A Graph-driven decision or operator action creates a work item.
- The producer writes that work item to a Service Bus queue or topic.
- A worker, often running on Azure Functions, receives the message under a lock.
- The worker calls downstream APIs, updates state, and either completes the message or abandons it for retry.
- Repeated failures move the message to a dead-letter queue for inspection.
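The receive/complete/abandon/dead-letter loop above can be sketched in miniature. This is a self-contained, in-memory simulation of peek-lock settlement, not the azure-servicebus SDK; `run_worker` and `MAX_DELIVERIES` are illustrative names, with `MAX_DELIVERIES` standing in for a queue's max delivery count setting.

```python
import collections

MAX_DELIVERIES = 3  # assumption: mirrors the queue's max delivery count


def run_worker(queue, handler):
    """Drain the queue, settling each message the way a peek-lock receiver would."""
    dead_letter = []
    deliveries = collections.Counter()
    while queue:
        msg = queue.popleft()            # receive the message under a "lock"
        deliveries[msg["id"]] += 1
        try:
            handler(msg)                 # call downstream APIs, update state
        except Exception:
            if deliveries[msg["id"]] >= MAX_DELIVERIES:
                dead_letter.append(msg)  # repeated failure: dead-letter it
            else:
                queue.append(msg)        # abandon: message returns for retry
        # on success the message simply is not re-queued, i.e. it is completed
    return dead_letter
```

The point of the sketch is the settlement contract: a message leaves the queue permanently only by being completed or dead-lettered, never by merely being read.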
The supporting service boundaries stay explicit:
- Microsoft Graph Control Plane remains the identity source and API trigger surface.
- Azure Functions for Identity Workloads is a common compute host for the worker.
- Service Bus for Workflows owns reliable delivery and coordination.
- Cosmos DB for Identity State or Azure Storage Basics can hold supporting records and artifacts.
Why Service Bus instead of Event Hubs
Choose Service Bus instead of Event Hubs when the workload needs broker behavior, not stream behavior.
Service Bus is the better fit when:
- one worker must own each message,
- retries and settlement semantics matter,
- related messages should stay ordered through sessions,
- poison messages need dead-letter inspection.
Event Hubs for Identity Events is better when multiple consumers need to read a high-volume stream independently, but it does not provide the same workflow-oriented guarantees.
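The session guarantee in the list above (related messages stay ordered, with one owner) can be illustrated with a small grouping sketch. This is an assumption-laden simulation, not real session handling; `assign_sessions` and the `session_id` field are illustrative names for whatever key relates the messages, such as a tenant id.

```python
from collections import defaultdict


def assign_sessions(messages):
    """Group messages by session id so exactly one worker owns each session,
    preserving arrival order within it (mimics Service Bus sessions)."""
    sessions = defaultdict(list)
    for msg in messages:
        sessions[msg["session_id"]].append(msg)
    return dict(sessions)
```

Each returned group is handed to a single worker, so steps for one tenant never interleave even when many tenants are processed in parallel.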
Builder-level error handling
Keep the failure model simple and explicit:
- transient downstream failures usually lead to retry,
- repeated or non-recoverable failures move to the dead-letter queue,
- operators inspect dead-lettered messages to find bad payloads, missing prerequisites, or broken downstream dependencies,
- idempotent worker logic prevents a retried message from causing duplicate side effects.
If the workflow spans many steps, persist step progress outside the broker so a retry can resume safely instead of guessing from message history alone.
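The idempotency rule above can be made concrete. A minimal sketch, assuming the producer sets a stable message id; the `processed` set stands in for a durable store such as Cosmos DB, and `handle` is an illustrative name, not an SDK API.

```python
processed = set()  # assumption: in production this lives in durable storage


def handle(msg, side_effects):
    """Idempotent handler: a retried message must not repeat its side effect."""
    key = msg["id"]             # assumption: producer sets a stable message id
    if key in processed:
        return "skipped"        # already done; complete without re-running
    side_effects.append(key)    # the real work: provision, record, notify
    processed.add(key)
    return "done"
```

Because the dedupe check happens before the side effect, a message redelivered after a lock expiry or abandon completes cleanly instead of provisioning twice.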
Topics vs queues
Use a queue when exactly one worker pipeline should own the step.
Use a topic when multiple workflow consumers each need their own managed copy, such as:
- one worker provisions the target system,
- one consumer records compliance telemetry,
- one consumer triggers an operator notification path.
That is still workflow fan-out, not analytics streaming.
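The topic fan-out described above boils down to one rule: every subscription gets its own independent copy of each published message. A toy sketch under that assumption; `Topic`, `subscribe`, and `publish` are illustrative names, not the SDK surface.

```python
class Topic:
    """Minimal topic: every subscription receives its own copy of each message."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        self.subscriptions[name] = []
        return self.subscriptions[name]

    def publish(self, msg):
        for queue in self.subscriptions.values():
            queue.append(dict(msg))  # independent copy per subscription
```

Each consumer settles its copy on its own schedule: the provisioning worker can dead-letter its copy without affecting the compliance or notification paths.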
Where Entra-specific detail belongs
If the workflow step delegates into product-specific provisioning or hybrid sync behavior, link outward to that product documentation rather than restating those internals here.
When not to use it
Do not use this pattern when the workload is a high-throughput event stream, a simple low-risk buffer, or a pure analytics ingestion path.
It is also a poor fit when:
- replayable event history matters more than per-message ownership,
- the system is simple enough for Azure Queue Storage,
- the work does not justify broker complexity.
In those cases, Event Hubs or simpler storage-backed queues are often a cleaner choice.