Event Stream to Data Explorer
Scenario Setup
An engineering team wants to inspect large volumes of audit-adjacent identity activity across many tenants and connectors. Events arrive from Graph-derived export jobs, provisioning companions, and worker telemetry, and the team needs fast investigation rather than per-message workflow control.
This pattern sends those events through Event Hubs, ingests them into Azure Data Explorer, and uses KQL for exploration after the stream has landed.
Why this pattern exists
This pattern exists because event ingestion and event analysis are different jobs.
- Event Hubs is the streaming layer that accepts high-volume events, partitions them, and lets multiple consumers read independently.
- Azure Data Explorer is the analytics layer that lets builders query the accumulated event data with KQL.
Keeping those roles separate prevents a system from pretending that an analytics cluster is a broker, or that a streaming hub is where investigation happens.
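To make the analytics side of that boundary concrete, here is a minimal sketch of the landing table and JSON ingestion mapping on the Data Explorer side. The table name, column names, and JSON paths are assumptions for illustration; the actual Event Hubs data connection that references this mapping is configured separately (portal or ARM).

```kusto
// Hypothetical landing table for identity events (all names are assumptions).
.create table IdentityEvents (
    Timestamp: datetime,
    TenantId: string,
    Source: string,        // e.g. "graph-export", "provisioning", "worker"
    EventType: string,
    Status: string,
    Payload: dynamic
)

// Map the JSON body arriving from Event Hubs onto those columns.
// The "$.ts", "$.tenantId", etc. paths assume a particular event shape.
.create table IdentityEvents ingestion json mapping "IdentityEventsMapping"
'[{"column":"Timestamp","path":"$.ts"},{"column":"TenantId","path":"$.tenantId"},{"column":"Source","path":"$.source"},{"column":"EventType","path":"$.eventType"},{"column":"Status","path":"$.status"},{"column":"Payload","path":"$.payload"}]'
```

Note that nothing here gives the table broker semantics: events land append-only, and all investigation happens over the accumulated rows.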
Main Flow
```mermaid
flowchart LR
    subgraph sources["Identity Event Sources"]
        graphJobs[Graph-driven jobs]
        prov[Provisioning companions]
        workers[Worker telemetry]
    end
    subgraph stream["Streaming Ingestion"]
        hubs[Event Hubs]
    end
    subgraph analytics["Analytics Exploration"]
        adx[Azure Data Explorer]
        kql[KQL queries]
    end
    graphJobs --> hubs
    prov --> hubs
    workers --> hubs
    hubs --> adx
    adx --> kql
```
Read the services through their separate responsibilities:
- Event Hubs for Identity Events handles throughput, retention, and independent consumers.
- Azure Data Explorer and KQL handle exploration, summarization, and investigation over the landed dataset.
Streaming ingestion vs analytics exploration
The key lesson on this page is the boundary between the stream and the analytics store.
- Streaming ingestion is about accepting events, preserving partition-based ordering, and making them available to downstream consumers.
- Analytics exploration is about querying the accumulated events to answer operational questions.
That means Event Hubs is the transport surface, while Data Explorer is where operators ask questions such as:
- Which tenant had the highest provisioning failure rate today?
- Which Graph-driven export path started timing out after a deployment?
- Which connector family is producing duplicate retry patterns?
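The first of those questions can be sketched as a KQL query over the landed events. The table and column names (`IdentityEvents`, `TenantId`, `EventType`, `Status`) are assumptions, not a prescribed schema:

```kusto
// Which tenant had the highest provisioning failure rate today?
// Assumes a landed IdentityEvents table with TenantId/EventType/Status columns.
IdentityEvents
| where Timestamp >= startofday(now())
| where EventType == "provisioning"
| summarize
    total = count(),
    failed = countif(Status == "failed")
    by TenantId
| extend failureRate = todouble(failed) / total
| top 5 by failureRate desc
```

The other two questions follow the same shape: filter the landed stream, `summarize` by the dimension under suspicion, and rank. None of this touches the Event Hubs side, which is exactly the boundary this page is drawing.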
Where this pattern fits around Entra
Use this pattern when the workload touches existing Entra product surfaces but the builder need is still operational analysis.
Examples include:
- investigating provisioning drift while the detailed provisioning behavior still belongs in Entra Application Provisioning,
- analyzing hybrid sync-related events while sync semantics still belong in Entra Cloud Sync or Entra Connect Sync,
- correlating Graph-driven changes without treating Graph itself as the event backbone.
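The last case, correlating Graph-driven changes with downstream outcomes, can be sketched as a self-join over the landed table. All names here (`IdentityEvents`, `Source`, `Payload.objectId`) and the one-hour correlation window are illustrative assumptions:

```kusto
// Provisioning failures within an hour of a Graph-driven change to the same object.
// Graph is only the source of some landed events, not the event backbone itself.
IdentityEvents
| where Source == "graph-export"
| project ChangeTime = Timestamp, TenantId, ObjectId = tostring(Payload.objectId)
| join kind=inner (
    IdentityEvents
    | where EventType == "provisioning" and Status == "failed"
    | project FailTime = Timestamp, TenantId, ObjectId = tostring(Payload.objectId)
) on TenantId, ObjectId
| where FailTime between (ChangeTime .. (ChangeTime + 1h))
```

The correlation lives entirely in Data Explorer; the detailed provisioning and sync behavior still belongs to the Entra surfaces listed above.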
Trade-offs
This pattern gives you replayable ingestion and strong analysis capability, but it does not give you workflow semantics.
- Event Hubs does not replace a delivery-sensitive broker.
- Data Explorer does not replace transactional workflow state.
- KQL helps with investigation, not durable orchestration.
If a record represents a step that must be retried, coordinated, or dead-lettered, it probably belongs in Service Bus for Workflows or a transactional state store instead.
When not to use it
Do not use this pattern when the workload is really a queue-driven command flow, a per-entity onboarding pipeline, or a low-volume system where ad hoc inspection is enough.
It is also the wrong fit when:
- you need durable workflow ownership and retries,
- you only need the current authoritative state and not a large event history,
- the event volume is too small to justify a streaming plus analytics pipeline.
In those cases, keep the design simpler with Cosmos DB, Storage, or Service Bus instead of adding a stream and analytics cluster prematurely.