md-server
HTTP API and MCP server that converts documents, web pages, and media to markdown. Auto-detects input type (PDF, Office, images, URLs), delegates to MarkItDown or Crawl4AI, and returns clean markdown with metadata. Single /convert endpoint handles everything.
Overview
- Language: Python
- Repo: peteretelej/md-server
- Install: uvx md-server or pip install md-server
- Status: stable (Production/Stable)
Architecture
The project runs in two modes: an HTTP server (Litestar + Uvicorn) and an MCP server (FastMCP over stdio). Both share the same core conversion logic.
Core modules (src/md_server/core/):
- converter.py - DocumentConverter class, the central workhorse. Exposes convert_file(), convert_url(), convert_content(), and convert_text(). URL conversion has two paths: MarkItDown.convert() for static pages, and Crawl4AI’s AsyncWebCrawler for JavaScript-rendered content. File/content conversion uses MarkItDown.convert_stream() with StreamInfo for format hints. Includes markdown cleaning, four truncation modes (chars, tokens, sections, paragraphs), safe truncation that avoids breaking code blocks, and SSRF validation for URLs.
- detection.py - ContentTypeDetector with layered detection: magic bytes (PDF, PNG, JPEG, ZIP/Office, audio), filename extension via mimetypes, and content heuristics (null bytes, non-printable character ratio, BOM detection). Also handles the unified input type dispatch (detect_input_type) that figures out whether a request contains a URL, base64 content, or raw text.
- config.py - Settings via Pydantic Settings with env var support. Configures conversion timeout, max file size, debug mode, API key auth, and SSRF controls (allow_localhost, allow_private_networks).
- errors.py - Error hierarchy with HTTP error classification. classify_http_error() maps upstream HTTP status codes and connection errors into typed exceptions (NotFoundError, AccessDeniedError, URLTimeoutError, etc.) with user-facing suggestions.
- factories.py - MarkItDownFactory for creating configured MarkItDown instances.
- validation.py - Input validation utilities.
- browser.py - BrowserChecker for detecting Playwright/Crawl4AI availability at startup.
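The layered detection described for detection.py can be sketched roughly as follows. This is a minimal illustration, not the project's ContentTypeDetector: the function name sniff_content_type, the signature table, the 4096-byte sample size, and the 30% non-printable threshold are all assumptions made for the example.

```python
import mimetypes
from typing import Optional

# Hypothetical signature table; the real detector may cover more formats.
MAGIC_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),  # also the container for DOCX/XLSX/PPTX
]

# Byte-order marks that indicate text even when null bytes are present.
TEXT_BOMS = (b"\xef\xbb\xbf", b"\xff\xfe", b"\xfe\xff")


def sniff_content_type(data: bytes, filename: Optional[str] = None) -> str:
    """Guess a MIME type using magic bytes, then extension, then heuristics."""
    # Layer 1: magic bytes at the start of the payload.
    for prefix, mime in MAGIC_SIGNATURES:
        if data.startswith(prefix):
            return mime

    # Layer 2: filename extension via the standard mimetypes table.
    if filename:
        guessed, _encoding = mimetypes.guess_type(filename)
        if guessed:
            return guessed

    # Layer 3: content heuristics on a leading sample.
    sample = data[:4096]
    if sample.startswith(TEXT_BOMS):
        return "text/plain"
    if b"\x00" in sample:
        return "application/octet-stream"
    # Control bytes other than tab, LF, and CR count as non-printable.
    non_printable = sum(1 for b in sample if b < 32 and b not in (9, 10, 13))
    if sample and non_printable / len(sample) > 0.30:  # assumed threshold
        return "application/octet-stream"
    return "text/plain"
```

Checking magic bytes first means a mislabeled upload (for example a PDF named report.txt) is still routed by its actual content.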
HTTP layer (src/md_server/):
- app.py - Litestar application with dependency injection. Registers ConvertController, health/formats endpoints, and optional API key auth middleware. Runs browser detection at startup to configure JS rendering capability.
- controllers.py - ConvertController with a single POST /convert endpoint. Parses three input modes (JSON body, multipart file upload, raw binary), delegates to DocumentConverter, and supports content negotiation (JSON response by default, raw markdown via Accept: text/markdown header or output_format option).
- models.py - Pydantic models for API requests/responses (ConvertResponse, ErrorResponse, ConversionMetadata, TruncationInfo).
- middleware/auth.py - Optional API key authentication middleware.
- security/url_validator.py - SSRF protection that blocks requests to private networks and localhost unless explicitly allowed.
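From a client's point of view, the JSON input mode of POST /convert might be exercised as in the sketch below. The url and content field names and the Accept: text/markdown header come from the description above; the helper name build_convert_request and the exact shape of the JSON body are assumptions, so check the project's API documentation before relying on them.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_convert_request(
    base_url: str,
    *,
    url: Optional[str] = None,
    file_bytes: Optional[bytes] = None,
    raw_markdown: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST /convert request with a JSON body."""
    if (url is None) == (file_bytes is None):
        raise ValueError("pass exactly one of url or file_bytes")

    if url is not None:
        payload = {"url": url}
    else:
        # File content travels base64-encoded in the JSON body.
        payload = {"content": base64.b64encode(file_bytes).decode("ascii")}

    headers = {"Content-Type": "application/json"}
    if raw_markdown:
        # Ask for raw markdown instead of the default JSON response.
        headers["Accept"] = "text/markdown"

    return urllib.request.Request(
        base_url.rstrip("/") + "/convert",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

With a server running locally, the request could then be sent with urllib.request.urlopen(build_convert_request("http://127.0.0.1:8080", url="https://example.com")).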
MCP layer (src/md_server/mcp/):
- server.py - FastMCP server with a single convert_to_markdown tool. Wraps the same DocumentConverter as the HTTP layer. Handles base64 file content decoding and maps errors to ToolError with suggestions.
- handlers.py - handle_read_resource() function that dispatches URL vs file conversion, validates inputs, and returns either raw markdown or structured JSON based on output_format.
- tools.py - MCP tool schema definition (read_resource tool with properties for URL, file_content, render_js, truncation options, etc.).
- models.py - MCP-specific response models (MCPSuccessResponse, MCPErrorResponse).
- errors.py - Factory functions for structured MCP error responses with suggestions.
SDK (src/md_server/sdk/):
- converter.py - Programmatic SDK for using md-server as a Python library.
- remote.py - HTTP client for talking to a remote md-server instance.
Metadata (src/md_server/metadata/):
- extractor.py - Extracts title, token count (via tiktoken), and language detection (via langdetect) from converted markdown. Can optionally inject YAML frontmatter with extracted metadata.
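To make the frontmatter step concrete, here is a small standard-library sketch. It is not the project's extractor: the function names are hypothetical, the title rule (first level-1 heading) is an assumption, and a plain word count stands in for the tiktoken token count so the example has no third-party dependencies. Language detection is omitted.

```python
import json
import re
from typing import Optional


def extract_title(markdown: str) -> Optional[str]:
    """Return the text of the first level-1 ATX heading, if there is one."""
    match = re.search(r"^#[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    return match.group(1) if match else None


def add_frontmatter(markdown: str) -> str:
    """Prepend a YAML frontmatter block with simple extracted metadata."""
    metadata = {
        "title": extract_title(markdown),
        # Stand-in for a real token count (the project uses tiktoken).
        "words": len(markdown.split()),
    }
    lines = ["---"]
    for key, value in metadata.items():
        if value is not None:
            # JSON scalars are valid YAML, so json.dumps handles quoting.
            lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + markdown
```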
Data flow for a URL conversion:
- POST /convert with {"url": "https://..."} (or MCP convert_to_markdown call)
- Controller parses request, creates options dict
- DocumentConverter.convert_url() validates URL (SSRF check), checks if JS rendering requested
- If JS rendering: Crawl4AI launches headless Chromium, crawls page, returns markdown
- If static: MarkItDown.convert(url) fetches and converts
- Apply truncation and cleaning options
- Extract metadata (title, tokens, language)
- Return markdown (with optional YAML frontmatter) or JSON with full metadata
Key Design Decisions
Two conversion backends. MarkItDown handles static content and file formats (PDF, Office, images). Crawl4AI with Playwright handles JavaScript-rendered pages. The render_js flag switches between them. This gives good coverage without making every request pay the browser startup cost.
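The backend switch can be pictured as a small dispatcher. In this sketch the two backends are injected as plain callables so that it runs without MarkItDown or Crawl4AI installed; the function name, its parameters, and the choice to raise when no browser is available are assumptions, not the project's actual behavior.

```python
import asyncio
from typing import Awaitable, Callable


async def convert_url_sketch(
    url: str,
    render_js: bool,
    *,
    static_convert: Callable[[str], str],
    browser_convert: Callable[[str], Awaitable[str]],
    browser_available: bool,
) -> str:
    """Route a URL to the browser backend only when render_js is set."""
    if render_js:
        if not browser_available:
            # Assumed policy: fail loudly rather than silently degrade.
            raise RuntimeError("JavaScript rendering requested but no browser is available")
        return await browser_convert(url)
    # The static path is synchronous, so run it off the event loop.
    return await asyncio.to_thread(static_convert, url)
```

Because the browser path is only taken on request, static conversions never wait on a headless browser launch.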
Single endpoint, auto-detection. Instead of separate /convert/url, /convert/file, /convert/text endpoints, a single POST /convert uses content type detection to figure out what to do. A JSON body with a url field triggers URL conversion, a content field triggers base64 file conversion, and a multipart body triggers file upload. This simplifies the API surface.
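A rough sketch of that dispatch is shown below. The url and content fields and the multipart and raw-binary modes come from the description; the function name classify_request, the text field name, the returned labels, and the field precedence are assumptions for illustration.

```python
from typing import Any, Mapping, Optional


def classify_request(content_type: str, body: Optional[Mapping[str, Any]] = None) -> str:
    """Decide which conversion path a POST /convert request should take."""
    # Drop parameters such as "; charset=utf-8" or "; boundary=...".
    media_type = content_type.split(";", 1)[0].strip().lower()

    if media_type == "multipart/form-data":
        return "upload"

    if media_type == "application/json":
        body = body or {}
        # Assumed precedence: url, then base64 content, then raw text.
        for field in ("url", "content", "text"):
            if body.get(field):
                return field
        raise ValueError("JSON body needs one of: url, content, text")

    # Anything else is treated as a raw binary upload.
    return "binary"
```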
Dual-mode server. The same conversion logic runs as either an HTTP API (for integration with other services) or an MCP server (for direct AI assistant use). The __main__.py entrypoint uses --mcp-stdio to switch modes. This avoids code duplication and keeps behavior consistent.
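The mode switch might look something like this argparse sketch. The --host, --port, and --mcp-stdio flags appear in the commands in this document; the default values and the helper names are assumptions, and the sketch only reports the selected mode rather than starting a server.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags used to pick a server mode."""
    parser = argparse.ArgumentParser(prog="md-server")
    parser.add_argument("--host", default="127.0.0.1")  # assumed default
    parser.add_argument("--port", type=int, default=8080)  # assumed default
    parser.add_argument(
        "--mcp-stdio",
        action="store_true",
        help="run as an MCP server over stdio instead of HTTP",
    )
    return parser.parse_args(argv)


def select_mode(args: argparse.Namespace) -> str:
    """Return 'mcp' when --mcp-stdio is set, otherwise 'http'."""
    return "mcp" if args.mcp_stdio else "http"
```

Both modes would then hand off to the same converter, which is what keeps their behavior consistent.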
SSRF protection by default. URL validation blocks private networks (10.x, 172.16.x, 192.168.x) and localhost unless explicitly enabled via settings. This is important because the server fetches arbitrary URLs from user input.
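One way to implement this kind of check with the standard library is sketched below. The allow_localhost and allow_private_networks names mirror the settings mentioned earlier; the function name is_url_allowed, the injectable resolver, and the decision to reject non-HTTP schemes are assumptions, and the project's validator may differ in detail.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_url_allowed(
    url: str,
    allow_localhost: bool = False,
    allow_private_networks: bool = False,
    resolver=socket.getaddrinfo,
) -> bool:
    """Return False for URLs that point at loopback or private addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False

    try:
        # IP literals are checked directly, without a DNS lookup.
        addresses = [ipaddress.ip_address(parts.hostname)]
    except ValueError:
        try:
            infos = resolver(parts.hostname, None)
        except OSError:
            return False  # unresolvable hosts are rejected
        addresses = [ipaddress.ip_address(info[4][0]) for info in infos]

    if not addresses:
        return False

    for address in addresses:
        if address.is_loopback:
            if not allow_localhost:
                return False
        elif address.is_private or address.is_link_local:
            if not allow_private_networks:
                return False
    return True
```

Every resolved address is checked, since a hostname can map to several records. A check like this runs before the fetch, so a production validator also has to consider hosts whose DNS answer changes between validation and request.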
Structured truncation. Four truncation modes (chars, tokens, sections, paragraphs) with safe boundary detection. Token-based truncation uses tiktoken’s cl100k_base encoding for accuracy. Section-based truncation splits at ## headings. All modes avoid breaking inside code blocks by checking fence count parity.
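The fence-parity idea can be illustrated for the chars mode as follows. This is a simplified sketch, not the project's truncation code: the function name is hypothetical, it only recognizes backtick fences, and backing off to the last complete line is an added assumption.

````python
FENCE = "```"


def truncate_chars_safely(markdown: str, limit: int) -> str:
    """Cut markdown to at most `limit` characters without splitting a code block."""
    if len(markdown) <= limit:
        return markdown

    cut = markdown[:limit]
    # If the cut lands mid-line, back off to the last complete line.
    if markdown[limit] != "\n" and "\n" in cut:
        cut = cut[: cut.rfind("\n")]

    lines = cut.split("\n")
    fence_lines = [i for i, line in enumerate(lines) if line.lstrip().startswith(FENCE)]
    # An odd number of fences means the cut fell inside an open code block,
    # so drop everything from the unmatched opening fence onward.
    if len(fence_lines) % 2 == 1:
        lines = lines[: fence_lines[-1]]

    return "\n".join(lines).rstrip()
````

The result is never longer than the limit, and any code block that survives is complete.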
Content negotiation. The HTTP API defaults to JSON responses with full metadata, but clients can request raw markdown via an Accept: text/markdown header or the output_format option. The MCP server defaults to raw markdown since that is what AI assistants typically want.
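The negotiation rule for the HTTP side could be reduced to a helper along these lines. The function name wants_markdown, the "markdown" option value, and the rule that an explicit option overrides the header are assumptions made for the example.

```python
from typing import Optional


def wants_markdown(accept: str = "", output_format: Optional[str] = None) -> bool:
    """Return True when the client asked for raw markdown instead of JSON."""
    # Assumed precedence: an explicit option wins over the Accept header.
    if output_format is not None:
        return output_format.strip().lower() == "markdown"

    # Strip parameters such as ";q=0.9" before comparing media types.
    media_types = [part.split(";", 1)[0].strip().lower() for part in accept.split(",")]
    return "text/markdown" in media_types
```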
Development
# Install dependencies
uv sync
# Run HTTP server
uv run md-server --host 127.0.0.1 --port 8080
# Run MCP server
uv run md-server --mcp-stdio
# Run tests
uv run pytest
# Type checking
uv run mypy src/
# Lint
uv run ruff check src/
# Docker
docker build -t md-server .
docker run -p 8080:8080 md-server