Event Stream to Data Explorer

Scenario Setup

An engineering team wants to inspect large volumes of audit-adjacent identity activity across many tenants and connectors. Events are arriving from Graph-derived export jobs, provisioning companions, and worker telemetry, and the team needs fast investigation rather than per-message workflow control.

This pattern sends those events through Event Hubs, ingests them into Azure Data Explorer, and uses KQL for exploration after the stream has landed.

Why this pattern exists

This pattern exists because event ingestion and event analysis are different jobs.

  • Event Hubs is the streaming layer that accepts high-volume events, partitions them, and lets multiple consumers read independently.
  • Azure Data Explorer is the analytics layer that lets builders query the accumulated event data with KQL.

Keeping those roles separate prevents a system from pretending that an analytics cluster is a broker, or that a streaming hub is where investigation happens.

Main Flow

```mermaid
flowchart LR
    subgraph sources["Identity Event Sources"]
        graphJobs[Graph-driven jobs]
        prov[Provisioning companions]
        workers[Worker telemetry]
    end

    subgraph stream["Streaming Ingestion"]
        hubs[Event Hubs]
    end

    subgraph analytics["Analytics Exploration"]
        adx[Azure Data Explorer]
        kql[KQL queries]
    end

    graphJobs --> hubs
    prov --> hubs
    workers --> hubs
    hubs --> adx
    adx --> kql
```
Read the diagram through the services' separate responsibilities:

Streaming ingestion vs analytics exploration

The key lesson on this page is the boundary between the stream and the analytics store.

  • Streaming ingestion is about accepting events, preserving partition-based ordering, and making them available to downstream consumers.
  • Analytics exploration is about querying the accumulated events to answer operational questions.

That means Event Hubs is the transport surface, while Data Explorer is where operators ask questions such as:

  • Which tenant had the highest provisioning failure rate today?
  • Which Graph-driven export path started timing out after a deployment?
  • Which connector family is producing duplicate retry patterns?
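Each of those questions maps onto a short KQL sketch. The table and column names below are illustrative assumptions about how the events might land, not part of the source:

```kusto
// Which tenant had the highest provisioning failure rate today?
// Assumes an IdentityEvents table with Tenant, EventType, Result, Timestamp.
IdentityEvents
| where Timestamp > startofday(now())
| where EventType == "ProvisioningResult"
| summarize FailureRate = countif(Result == "Failure") * 1.0 / count() by Tenant
| top 1 by FailureRate

// Which connector family is producing bursts of retries?
// Flags connectors with more than 10 retries in any 5-minute window.
IdentityEvents
| where EventType == "Retry"
| summarize Retries = count() by Connector, bin(Timestamp, 5m)
| where Retries > 10
```

The point is that these are read-side investigations over accumulated events; nothing in them drives per-message workflow, which stays the streaming layer's concern.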

Where this pattern fits around Entra

Use this pattern when the workload touches existing Entra product surfaces but the builder need is still operational analysis.

Examples include:

  • investigating provisioning drift while the detailed provisioning behavior still belongs in Entra Application Provisioning,
  • analyzing hybrid sync-related events while sync semantics still belong in Entra Cloud Sync or Entra Connect Sync,
  • correlating Graph-driven changes without treating Graph itself as the event backbone.

Trade-offs

This pattern gives you replayable ingestion and strong analysis capability, but it does not give you workflow semantics.

  • Event Hubs does not replace a delivery-sensitive broker.
  • Data Explorer does not replace transactional workflow state.
  • KQL helps with investigation, not durable orchestration.

If a record represents a step that must be retried, coordinated, or dead-lettered, it probably belongs in Service Bus for Workflows or a transactional state store instead.

When not to use it

Do not use this pattern when the workload is really a queue-driven command flow, a per-entity onboarding pipeline, or a low-volume system where ad hoc inspection is enough.

It is also the wrong fit when:

  • you need durable workflow ownership and retries,
  • you only need the current authoritative state and not a large event history,
  • the event volume is too small to justify a streaming plus analytics pipeline.

In those cases, keep the design simpler with Cosmos DB, Storage, or Service Bus instead of adding a stream and analytics cluster prematurely.