Deep Dive: Streaming

Build responsive UIs with real-time Copilot responses.

Why Streaming Matters

LLM responses can take seconds to generate. Without streaming:

  • Users stare at a blank screen
  • Long responses feel even longer
  • No feedback that anything is happening

With streaming:

  • Users see an immediate response
  • Content appears progressively
  • App feels faster (even if total time is the same)

Event Stream Anatomy

Event sequence for a simple response

sequenceDiagram
    participant App
    participant Session as Copilot Session

    App->>Session: send(prompt)
    Session-->>App: turnStart
    Session-->>App: textDelta("I")
    Session-->>App: textDelta(" can")
    Session-->>App: textDelta(" help")
    Session-->>App: textDelta(" you")
    Session-->>App: textDelta(" with")
    Session-->>App: textDelta(" that")
    Session-->>App: textDelta(".")
    Session-->>App: turnEnd
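
The sequence above maps directly onto a `for await` loop. Below is a minimal sketch that substitutes a mock async generator for `session.send()` (an assumption, so the snippet runs on its own); the event shapes follow the diagram:

```typescript
// Event shapes assumed from the diagram above
type StreamEvent =
  | { type: "turnStart" }
  | { type: "textDelta"; delta: string }
  | { type: "turnEnd" };

// Mock stand-in for session.send() so the example is self-contained
async function* mockSend(): AsyncGenerator<StreamEvent> {
  yield { type: "turnStart" };
  for (const piece of ["I", " can", " help", " you", " with", " that", "."]) {
    yield { type: "textDelta", delta: piece };
  }
  yield { type: "turnEnd" };
}

// Concatenate deltas exactly as a UI would
async function collect(): Promise<string> {
  let text = "";
  for await (const event of mockSend()) {
    if (event.type === "textDelta") text += event.delta;
  }
  return text;
}

collect().then(t => console.log(t)); // → "I can help you with that."
```

The same loop works unchanged against the real session, since only `textDelta` events carry visible text.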

Event sequence with a tool call

sequenceDiagram
    participant App
    participant Session as Copilot Session
    participant Tool as Your Tool Handler

    App->>Session: send(prompt)
    Session-->>App: turnStart
    Session-->>App: textDelta("Let me check...")
    Session-->>App: toolCall
    Note over Session: Copilot pauses until tool result arrives
    App->>Tool: Run handler
    Tool-->>App: Tool result
    App->>Session: submitToolResult()
    Session-->>App: textDelta("The answer is...")
    Session-->>App: turnEnd
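
The pause-and-resume flow can be sketched as follows. The `toolCall` event and `submitToolResult()` names mirror the diagram; the `MockSession` class is an assumption standing in for the real session so the example runs standalone:

```typescript
type ToolEvent =
  | { type: "turnStart" }
  | { type: "textDelta"; delta: string }
  | { type: "toolCall"; name: string }
  | { type: "turnEnd" };

// Mock session: pauses at the tool call until a result is submitted
class MockSession {
  private pendingResult?: string;
  private resume?: (result: string) => void;

  submitToolResult(result: string): void {
    if (this.resume) this.resume(result);
    else this.pendingResult = result; // Result arrived before the wait began
  }

  private waitForResult(): Promise<string> {
    if (this.pendingResult !== undefined) return Promise.resolve(this.pendingResult);
    return new Promise(resolve => (this.resume = resolve));
  }

  async *send(_prompt: string): AsyncGenerator<ToolEvent> {
    yield { type: "turnStart" };
    yield { type: "textDelta", delta: "Let me check... " };
    yield { type: "toolCall", name: "lookup" };
    const result = await this.waitForResult(); // Pause until the app responds
    yield { type: "textDelta", delta: `The answer is ${result}.` };
    yield { type: "turnEnd" };
  }
}

async function run(): Promise<string> {
  const session = new MockSession();
  let text = "";
  for await (const event of session.send("What is 6 * 7?")) {
    if (event.type === "textDelta") text += event.delta;
    if (event.type === "toolCall") {
      session.submitToolResult("42"); // Run your handler, then unblock the stream
    }
  }
  return text;
}
```

The key point: iteration simply stalls after `toolCall` until your app submits a result, so the same consumer loop handles both diagrams.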

Processing Patterns

Pattern 1: Simple print-as-you-go

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    process.stdout.write(event.delta);
  }
}

Pattern 2: Accumulate then process

let fullResponse = "";

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    fullResponse += event.delta;
    updateUI(fullResponse); // Re-render with full text
  }
}

// Do something with complete response
parseAndStore(fullResponse);

Pattern 3: Chunked processing

For markdown, code syntax highlighting, or other structured content:

let buffer = "";

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    buffer += event.delta;

    // Process complete lines only
    const lines = buffer.split("\n");
    if (lines.length > 1) {
      // Process all complete lines
      for (const line of lines.slice(0, -1)) {
        processLine(line);
      }
      // Keep incomplete line in buffer
      buffer = lines[lines.length - 1];
    }
  }
}

// Process any remaining content
if (buffer) {
  processLine(buffer);
}

Pattern 4: Debounced updates

Prevent UI jank from too-frequent updates:

let pending = "";
let updateScheduled = false;

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    pending += event.delta;

    if (!updateScheduled) {
      updateScheduled = true;
      requestAnimationFrame(() => {
        updateUI(pending);
        updateScheduled = false;
      });
    }
  }
}

// Final update
updateUI(pending);

Pattern 5: Typed character effect

For a typewriter-like appearance:

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

let displayedLength = 0;
let fullText = "";

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    fullText += event.delta;
    // Let the typewriter catch up to the buffered text
    while (displayedLength < fullText.length) {
      outputElement.textContent = fullText.slice(0, ++displayedLength);
      await sleep(20); // ~20 ms per character
    }
  }
}

Building UIs

The main design constraint is that one turn can alternate between streamed text and tool pauses. Your UI should treat the stream as a stateful conversation, not just a firehose of tokens.

stateDiagram-v2
    [*] --> Idle
    Idle --> Streaming: send()
    Streaming --> WaitingOnTool: toolCall
    WaitingOnTool --> Streaming: submitToolResult()
    Streaming --> Completed: turnEnd
    Streaming --> Aborted: abort/cancel
    WaitingOnTool --> Aborted: abort/cancel
    Completed --> Idle: next turn
    Aborted --> Idle: reset UI state
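
One way to encode the diagram is a small transition function the UI consults on every event. The state and event names mirror the diagram; the reducer itself is a sketch, not part of any SDK:

```typescript
type UiState = "idle" | "streaming" | "waitingOnTool" | "completed" | "aborted";

// Returns the next UI state; pairs not in the diagram leave the state unchanged
function nextState(state: UiState, event: string): UiState {
  switch (`${state}:${event}`) {
    case "idle:send": return "streaming";
    case "streaming:toolCall": return "waitingOnTool";
    case "waitingOnTool:submitToolResult": return "streaming";
    case "streaming:turnEnd": return "completed";
    case "streaming:abort":
    case "waitingOnTool:abort": return "aborted";
    case "completed:reset":
    case "aborted:reset": return "idle";
    default: return state;
  }
}
```

Driving render logic off `nextState` keeps decisions like "show the cursor", "show the tool spinner", and "enable Send" in one place instead of scattered across event handlers.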

Terminal CLI

import * as readline from "readline";

async function chat(session: CopilotSession) {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });

  while (true) {
    const input = await new Promise<string>(resolve =>
      rl.question("\n> ", resolve)
    );

    if (input === "exit") break;

    process.stdout.write("\n");

    for await (const event of session.send(input)) {
      if (event.type === "textDelta") {
        process.stdout.write(event.delta);
      }
    }
  }

  rl.close();
}

React with hooks

import { useState, useCallback } from "react";

function useCopilotStream(session: CopilotSession) {
  const [text, setText] = useState("");
  const [isStreaming, setIsStreaming] = useState(false);

  const send = useCallback(async (prompt: string) => {
    setText("");
    setIsStreaming(true);

    try {
      for await (const event of session.send(prompt)) {
        if (event.type === "textDelta") {
          setText(prev => prev + event.delta);
        }
      }
    } finally {
      setIsStreaming(false);
    }
  }, [session]);

  return { text, isStreaming, send };
}

// Usage
function ChatComponent() {
  const { text, isStreaming, send } = useCopilotStream(session);

  return (
    <div>
      <button onClick={() => send("Hello!")}>Send</button>
      <div className="response">
        {text}
        {isStreaming && <span className="cursor"></span>}
      </div>
    </div>
  );
}

Server-Sent Events (SSE)

// Server (Express)
app.get("/stream", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  const prompt = req.query.prompt as string;

  // Stop streaming if the client disconnects
  let closed = false;
  req.on("close", () => { closed = true; });

  for await (const event of session.send(prompt)) {
    if (closed) break;
    if (event.type === "textDelta") {
      res.write(`data: ${JSON.stringify({ type: "delta", text: event.delta })}\n\n`);
    }
  }

  if (!closed) {
    res.write(`data: ${JSON.stringify({ type: "done" })}\n\n`);
    res.end();
  }
});

// Client
const evtSource = new EventSource(`/stream?prompt=${encodeURIComponent(prompt)}`);
evtSource.onmessage = (e) => {
  const data = JSON.parse(e.data);
  if (data.type === "delta") {
    output.textContent += data.text;
  } else if (data.type === "done") {
    evtSource.close();
  }
};

WebSocket bidirectional

// Server
wss.on("connection", (ws) => {
  let session: CopilotSession | undefined;

  ws.on("message", async (data) => {
    const msg = JSON.parse(data.toString());

    if (msg.type === "init") {
      session = await client.createSession();
      ws.send(JSON.stringify({ type: "ready" }));
    }

    if (msg.type === "send") {
      if (!session) {
        ws.send(JSON.stringify({ type: "error", message: "Send \"init\" first" }));
        return;
      }
      for await (const event of session.send(msg.prompt)) {
        if (event.type === "textDelta") {
          ws.send(JSON.stringify({ type: "delta", text: event.delta }));
        }
      }
      ws.send(JSON.stringify({ type: "done" }));
    }
  });
});

Cancellation

Abort streaming mid-response

const controller = new AbortController();

// Start streaming
const streamTask = (async () => {
  for await (const event of session.send(prompt, { signal: controller.signal })) {
    if (event.type === "textDelta") {
      process.stdout.write(event.delta);
    }
  }
})();

// Cancel after user presses Ctrl+C
process.on("SIGINT", () => {
  controller.abort();
});

try {
  await streamTask;
} catch (e) {
  if (e.name === "AbortError") {
    console.log("\n[Cancelled]");
  }
}

React with cancellation

function CancelableChat({ session }: { session: CopilotSession }) {
  const [controller, setController] = useState<AbortController | null>(null);
  const [text, setText] = useState("");

  const send = async (prompt: string) => {
    const ctrl = new AbortController();
    setController(ctrl);
    setText("");

    try {
      for await (const event of session.send(prompt, { signal: ctrl.signal })) {
        if (event.type === "textDelta") {
          setText(prev => prev + event.delta);
        }
      }
    } catch (e) {
      if (e.name !== "AbortError") throw e;
    } finally {
      setController(null);
    }
  };

  const cancel = () => controller?.abort();

  return (
    <div>
      <button onClick={() => send("Write a long story")}>Send</button>
      {controller && <button onClick={cancel}>Cancel</button>}
      <div>{text}</div>
    </div>
  );
}

Performance Optimization

1. Reduce DOM updates

// Bad: Update DOM on every delta
for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    element.textContent += event.delta; // Triggers layout on every delta
  }
}

// Good: Batch updates (note: a single final write gives up progressive
// display; see the requestAnimationFrame pattern below for a middle ground)
let buffer = "";
for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    buffer += event.delta;
  }
}
element.textContent = buffer; // Single DOM update

2. Use requestAnimationFrame

let buffer = "";
let rafScheduled = false;

for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    buffer += event.delta;
    if (!rafScheduled) {
      rafScheduled = true;
      requestAnimationFrame(() => {
        element.textContent = buffer;
        rafScheduled = false;
      });
    }
  }
}

3. Virtual scrolling for long responses

For very long outputs, render only visible content:

// Use a library like react-window or react-virtualized
import { VariableSizeList } from "react-window";

function StreamingOutput({ lines }) {
  return (
    <VariableSizeList
      height={400}
      itemCount={lines.length}
      itemSize={index => measureLine(lines[index])}
    >
      {({ index, style }) => (
        <div style={style}>{lines[index]}</div>
      )}
    </VariableSizeList>
  );
}

Pitfalls

1. Not handling backpressure

If your processing is slower than token generation:

// Problem: Events queue up in memory
for await (const event of session.send(prompt)) {
  await slowProcess(event); // Slows iteration
}

// Solution: Buffer events and drain them in a background loop
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));
const buffer: any[] = [];
let done = false;

const processLoop = async () => {
  while (!done || buffer.length > 0) {
    const event = buffer.shift();
    if (!event) { await sleep(10); continue; }
    await slowProcess(event);
  }
};

const processing = processLoop(); // Run in background
for await (const event of session.send(prompt)) {
  buffer.push(event);
}
done = true;
await processing; // Drain any remaining events

2. Memory leaks with long streams

// Problem: Growing string in memory
let fullText = "";
for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    fullText += event.delta; // String grows without bound
  }
}

// Solution: Write to disk or limit buffer size
const stream = fs.createWriteStream("output.txt");
for await (const event of session.send(prompt)) {
  if (event.type === "textDelta") {
    stream.write(event.delta);
  }
}
stream.end();

3. UI freezing

// Problem: Synchronous processing blocks UI
for await (const event of session.send(prompt)) {
  heavyComputation(event); // Blocks main thread
}

// Solution: Offload to a Web Worker (microtasks won't help: they still
// run on the main thread before the browser can paint)
for await (const event of session.send(prompt)) {
  worker.postMessage(event); // heavyComputation runs inside the worker
}

Exercises

  1. Basic: Build a CLI that shows a spinner while waiting for turnStart, then streams output
  2. Intermediate: Create a React component with a “Stop” button that cancels streaming
  3. Advanced: Build a streaming markdown renderer that syntax-highlights code blocks as they appear

Each exercise builds on the streaming patterns covered above.