# Experiments

A sandbox for building and breaking things with the Copilot SDK.

## Purpose

This folder is for hands-on experimentation. Unlike the structured learning in `concepts/` and `deep-dives/`, here you:

- Build mini-projects to explore ideas
- Test edge cases and limitations
- Try techniques before using them in real code
- Document what you learn

## Structure

```
experiments/
├── README.md               # This file
├── [experiment-name]/
│   ├── README.md           # What you're testing, what you learned
│   ├── src/                # Code
│   └── notes.md            # Observations, gotchas, ideas
```

## Experiment Ideas

### Beginner

| Experiment | What You'll Learn |
| --- | --- |
| `echo-bot` | Basic client/session setup, simple responses |
| `streaming-logger` | Event types, streaming patterns |
| `tool-playground` | Tool definitions, calling, results |

### Intermediate

| Experiment | What You'll Learn |
| --- | --- |
| `multi-session` | Concurrent sessions, isolation |
| `context-limits` | How the context window affects responses |
| `error-recovery` | Graceful handling of failures |

### Advanced

| Experiment | What You'll Learn |
| --- | --- |
| `agent-loop` | Autonomous multi-turn workflows |
| `mcp-integration` | Using external tool servers |
| `persistence` | Saving/restoring sessions |
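To get a feel for the `streaming-logger` idea without credentials, here is a minimal sketch. The mock async generator stands in for the SDK's event stream (the real call would come from a session object, not shown here); the event names `textDelta` and `toolCall` are borrowed from elsewhere in this doc, and the exact event shape is an assumption.

```typescript
// Minimal streaming-logger sketch. The SDK's real event stream is replaced
// by a mock async generator so the logging logic can run standalone.

type StreamEvent = { type: string; text?: string };

// Mock stand-in for the SDK's event stream -- NOT a real SDK call.
async function* mockStream(): AsyncGenerator<StreamEvent> {
  yield { type: "textDelta", text: "Hello" };
  yield { type: "toolCall" };
  yield { type: "textDelta", text: " world" };
  yield { type: "done" };
}

// The reusable part: log each event and tally event types.
async function logStream(
  events: AsyncIterable<StreamEvent>
): Promise<Record<string, number>> {
  const counts: Record<string, number> = {};
  for await (const event of events) {
    counts[event.type] = (counts[event.type] ?? 0) + 1;
    console.log(`[${event.type}]`, event.text ?? "");
  }
  return counts;
}
```

Swapping `mockStream()` for a real session's stream keeps `logStream` unchanged, which is the point of the experiment: the logging pattern is independent of where the events come from.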

## Running Experiments

Each experiment should be self-contained. Typical setup:

```bash
cd experiments/[experiment-name]
npm install      # or: pip install -r requirements.txt
npm start        # or: python main.py
```
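For the Node path, a minimal `package.json` might look like the following; the experiment name, versions, and the `tsx` dev dependency are placeholders to adjust, not requirements:

```json
{
  "name": "my-experiment",
  "private": true,
  "type": "module",
  "scripts": {
    "start": "tsx src/main.ts"
  },
  "dependencies": {
    "@github/copilot-sdk": "*"
  },
  "devDependencies": {
    "tsx": "^4.0.0"
  }
}
```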

## Template

Starting a new experiment? Copy this structure:

```bash
mkdir experiments/my-experiment
cd experiments/my-experiment
```

Create `README.md`:

````markdown
# [Experiment Name]

## Goal
What are you trying to learn or test?

## Setup
```bash
npm init -y
npm install @github/copilot-sdk
```

## Run
```bash
npx tsx src/main.ts
```

## Results
What did you observe?

## Key Learnings
- Bullet points of insights
- Things that surprised you
- What you'd do differently
````

## Documenting Findings

Good experiment notes include:

1. **What you expected** to happen
2. **What actually happened**
3. **Why** (if you figured it out)
4. **Code snippets** that demonstrate the finding
5. **Implications** for real-world use

### Example finding

```markdown
## Finding: Tool calls block text streaming

**Expected**: Text continues streaming while tool executes in background
**Actual**: No textDelta events arrive until submitToolResult() returns

**Implication**: For good UX, show "thinking..." indicator during tool execution

**Code**:
```typescript
// This waits - no text streams until tool completes
for await (const event of session.send("What's the weather?")) {
  if (event.type === "toolCall") {
    await slowApiCall(); // 3 second delay
    await session.submitToolResult(event.callId, result);
    // Text streaming resumes NOW
  }
}
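The "thinking..." indicator mentioned in the finding can be factored into a small helper. This is a sketch, not SDK code: `withThinkingIndicator` wraps any slow promise (in practice, the tool call plus `submitToolResult`, here simulated with a timeout) and fires a callback on an interval until the work settles.

```typescript
// Run `work`, invoking `onTick` periodically until it settles.
// Useful for a "thinking..." indicator while a tool call blocks streaming.
async function withThinkingIndicator<T>(
  work: () => Promise<T>,
  onTick: () => void,
  intervalMs = 250
): Promise<T> {
  const timer = setInterval(onTick, intervalMs);
  try {
    return await work();
  } finally {
    clearInterval(timer); // stop the indicator whether work succeeds or throws
  }
}

// Usage sketch (the timeout stands in for a slow tool call):
// const result = await withThinkingIndicator(
//   () => slowToolCall(),
//   () => process.stdout.write("thinking...\r")
// );
```

The `finally` block matters: the indicator must stop even when the tool call throws, otherwise the timer leaks and the UI spins forever.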

## Experiments To Avoid

- **Don't test** in production with real user data
- **Don't store** API keys or secrets in experiment folders
- **Don't commit** large generated outputs (add to .gitignore)
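A per-experiment `.gitignore` covering the points above might start like this (entries are suggestions to adjust for your stack, not a fixed list):

```gitignore
node_modules/
dist/
output/
*.log
.env
```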

## Sharing Experiments

If an experiment yields valuable insights, consider:
1. Adding to `notes/` as a documented finding
2. Creating a deep-dive topic if substantial enough
3. Opening an issue/PR on the main SDK if you found a bug