Oxc Architecture

How Oxc’s Rust codebase is organized, how data flows through the toolchain, and what matters for contributing.

Source: reading the actual crate source code at v0.116.0.

Crate Map

The workspace has ~30 crates in crates/, apps in apps/, and Node.js bindings in napi/.

Core (parsing and AST)

Crate          Purpose
oxc_parser     Recursive descent parser for JS/TS/JSX/TSX. Entry: Parser::new(&allocator, source, source_type).parse()
oxc_ast        AST node type definitions (all the Expression, Statement, etc. enums and structs)
oxc_ast_visit  Generated Visit and VisitMut traits for walking the AST
oxc_semantic   Semantic analysis: scopes, symbols, references, control flow graph. SemanticBuilder traverses the AST in a single pass
oxc_traverse   Mutable AST traversal used by the transformer (distinct from the read-only visitor)
oxc_span       Span (pair of u32 byte offsets), SourceType, Atom (interned string)
oxc_syntax     Language-level types: operators, precedence, scope flags, symbol flags, ModuleRecord

Tools

Crate                      Purpose
oxc_linter                 Linting engine: rule trait, rule registry, config loading, Linter::run() loop
oxc_transformer            Babel-compatible transforms (TS stripping, JSX, ES2015-ES2026 downleveling)
oxc_codegen                Code printer. Walks the AST via a Gen trait and emits source text + optional source maps
oxc_minifier               Code minification (constant folding, dead code elimination)
oxc_mangler                Variable name mangling (shortening identifiers)
oxc_formatter              Prettier-compatible code formatting (in development)
oxc_isolated_declarations  TypeScript .d.ts generation without a full type checker
oxc_regular_expression     RegExp pattern parser used by the main parser and linter

Infrastructure

Crate                Purpose
oxc_allocator        Bump-allocator arena. All AST nodes are allocated here for fast bulk deallocation
oxc_diagnostics      OxcDiagnostic type with builder pattern (.with_label(), .with_help(), etc.), plus DiagnosticService for rendering
oxc_macros           Proc macros, primarily declare_oxc_lint!
oxc_ast_macros       Proc macros for AST type generation (#[ast])
oxc_data_structures  Shared data structures: CodeBuffer, stack types, etc.
oxc_ecmascript       Pure ECMAScript spec operations (abstract equality, ToNumber, etc.)
oxc_cfg              Control flow graph builder and types
oxc_sourcemap        Source map generation
oxc_index            Type-safe index types (like newtype_index! in rustc)
oxc_napi             Shared NAPI utilities for Node.js bindings
oxc_language_server  LSP server for editor integration

Apps and Bindings

Path             Purpose
apps/oxlint/     CLI binary for the linter
apps/oxfmt/      CLI binary for the formatter
napi/parser/     Node.js binding for the parser
napi/transform/  Node.js binding for the transformer
napi/minify/     Node.js binding for the minifier

Data Flow

How source code flows through Oxc’s pipeline:

Source Text (String)
        |
        v
+------------------+
|   oxc_allocator   |  Arena created, shared by all stages
+------------------+
        |
        v
+------------------+
|   oxc_parser      |  Lexer -> Recursive Descent Parser
|                   |  Output: Program<'a> (AST in arena)
|                   |         + errors + ModuleRecord
+------------------+
        |
        v
+------------------+
|  oxc_semantic     |  Single-pass AST walk via Visit trait
|                   |  Builds: AstNodes (parent pointers),
|                   |          Scoping (scopes, symbols, refs),
|                   |          ClassTable, optional CFG
+------------------+
        |
        +-----> Linter (read-only, per-node rule dispatch)
        |
        +-----> Transformer (mutates AST via Traverse trait)
        |               |
        |               v
        |       +------------------+
        |       |   oxc_codegen    |  Walks AST via Gen trait
        |       |                  |  Output: String + SourceMap
        |       +------------------+
        |
        +-----> Minifier (AST transforms, then codegen)
        |
        +-----> Isolated Declarations (.d.ts emission)

Key design point: The allocator is created once and owns all AST memory. Parsing allocates AST nodes into it; semantic analysis reads them; the transformer mutates them in-place; codegen reads them again. When the allocator is dropped, all AST memory is freed in one shot.

The parser produces a structurally valid AST even on syntax errors (error recovery). More expensive checks like scope resolution and duplicate binding detection are delegated to SemanticBuilder to keep the parser fast.


The Linter Pipeline

From the CLI entry point (apps/oxlint/src/main.rs) through to diagnostics:

1. CLI Parsing (bpaf)
   apps/oxlint/src/main.rs -> CliRunner::run()

2. Config Loading
   ConfigLoader resolves .oxlintrc.json / oxlint.json
   -> ConfigStoreBuilder builds ConfigStore (rules + severity)

3. File Discovery
   Walk (powered by `ignore` crate, respects .gitignore)
   -> produces Vec<PathBuf>

4. Per-file Processing (parallelized via rayon)
   For each file:
   a. Read source text into arena allocator
   b. Parser::new(&allocator, &source, source_type).parse()
      -> ParserReturn { program, errors, module_record, ... }
   c. SemanticBuilder::new().build(&program)
      -> Semantic { nodes, scoping, classes, cfg, ... }
   d. Build ContextSubHost (wraps semantic + tokens for linter)

5. Rule Execution (Linter::run)
   - Resolve which rules apply to this file path
   - Filter rules: skip if file has no matching AST node types
   - For small files (<200k nodes): iterate rules, each rule walks all nodes
   - For large files (>200k nodes): iterate nodes, dispatch to matching rules
     (avoids cache thrashing by keeping rule data in inner loop)
   - Each rule receives (AstNode, LintContext)
   - Rules report diagnostics via ctx.diagnostic() or ctx.diagnostic_with_fix()

6. Disable Directives
   Respects // eslint-disable, // eslint-disable-next-line, etc.
   Filtered during/after rule execution

7. Diagnostic Collection + Reporting
   Messages collected, sorted, deduplicated
   DiagnosticService renders via miette (fancy terminal output)
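
The two loop orders in step 5 can be sketched in isolation. This is a minimal illustration, not Oxc's code: Kind, ToyRule, and dispatch are hypothetical stand-ins for AstKind, the rule enum, and the body of Linter::run, and the threshold is passed in as a parameter rather than hard-coded.

```rust
// Hypothetical stand-ins; Oxc's real types are AstNode, LintContext and RuleEnum.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Kind {
    Debugger,
    Binary,
    Call,
}

pub struct ToyRule {
    pub name: &'static str,
    pub wants: Kind,
}

/// Returns (rule name, node index) for every match, in visiting order.
pub fn dispatch(
    nodes: &[Kind],
    rules: &[ToyRule],
    large_file_threshold: usize,
) -> Vec<(&'static str, usize)> {
    let mut hits = Vec::new();
    if nodes.len() > large_file_threshold {
        // Large file: nodes in the outer loop, rules in the inner loop,
        // so the smaller rule data is what gets re-read on every iteration.
        for (i, kind) in nodes.iter().enumerate() {
            for rule in rules {
                if rule.wants == *kind {
                    hits.push((rule.name, i));
                }
            }
        }
    } else {
        // Small file: each rule walks the whole node list once.
        for rule in rules {
            for (i, kind) in nodes.iter().enumerate() {
                if rule.wants == *kind {
                    hits.push((rule.name, i));
                }
            }
        }
    }
    hits
}
```

Both branches find the same matches; only the order of discovery (and the memory access pattern) differs, which is why the final diagnostics are sorted afterwards.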

The LintService in crates/oxc_linter/src/service/runtime.rs handles the parallelism. It uses rayon for data-parallel file processing and an AllocatorPool to recycle bump allocators across files (avoiding repeated allocation of the arena itself).


Rule Anatomy

Every linter rule follows the same structure. Here is the simplest possible rule, no_debugger:

// crates/oxc_linter/src/rules/eslint/no_debugger.rs

use oxc_ast::AstKind;
use oxc_diagnostics::OxcDiagnostic;
use oxc_macros::declare_oxc_lint;
use oxc_span::Span;
use crate::{AstNode, context::LintContext, rule::Rule};

// 1. Diagnostic factory function
fn no_debugger_diagnostic(span: Span) -> OxcDiagnostic {
    OxcDiagnostic::warn("`debugger` statement is not allowed")
        .with_label(span)
}

// 2. Rule struct (must be Debug + Default + Clone)
#[derive(Debug, Default, Clone)]
pub struct NoDebugger;

// 3. Metadata macro: docs, plugin, category, fix capability
declare_oxc_lint!(
    /// ### What it does
    /// Checks for usage of the `debugger` statement
    NoDebugger,
    eslint,
    correctness,
    fix
);

// 4. Rule implementation
impl Rule for NoDebugger {
    fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {
        if let AstKind::DebuggerStatement(stmt) = node.kind() {
            ctx.diagnostic_with_fix(
                no_debugger_diagnostic(stmt.span),
                |fixer| fixer.delete(&stmt.span),
            );
        }
    }
}

// 5. Tests inline in the same file
#[test]
fn test() {
    use crate::tester::Tester;
    let pass = vec!["var test = { debugger: 1 }; test.debugger;"];
    let fail = vec!["if (foo) debugger"];
    Tester::new(NoDebugger::NAME, NoDebugger::PLUGIN, pass, fail)
        .test_and_snapshot();
}

The Rule Trait

Defined in crates/oxc_linter/src/rule.rs:

pub trait Rule: Sized + Default + fmt::Debug {
    /// Called for every AST node (the main hook)
    fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {}

    /// Called once per file (for whole-program checks)
    fn run_once(&self, ctx: &LintContext) {}

    /// Called for Jest/Vitest test nodes specifically
    fn run_on_jest_node<'a, 'c>(&self, node: &PossibleJestNode<'a, 'c>, ctx: &'c LintContext<'a>) {}

    /// Check if rule should run on this file
    fn should_run(&self, ctx: &ContextHost) -> bool { true }

    /// Deserialize config from JSON (ESLint format)
    fn from_configuration(_value: serde_json::Value) -> Result<Self, ...> { Ok(Self::default()) }
}

Most rules only implement run. The run_once method is for rules that need a whole-file view (e.g., checking for duplicate imports). The run_on_jest_node method is for test-framework-specific rules.

The declare_oxc_lint! Macro

This generates the RuleMeta implementation (NAME, PLUGIN, CATEGORY, FIX, documentation) so you do not need to write it by hand. The arguments are:

declare_oxc_lint!(
    /// Documentation (becomes the website docs page)
    StructName,
    plugin_name,    // eslint, typescript, react, unicorn, etc.
    category,       // correctness, suspicious, pedantic, perf, style, restriction, nursery
    fix_kind,       // fix, suggestion, dangerous_fix, pending, none (default)
    config = Type,  // optional: config struct for rules with options
);

Diagnostic Builder Pattern

OxcDiagnostic uses a chained builder:

OxcDiagnostic::warn("message")        // or ::error("message")
    .with_label(span)                  // highlight the problematic code
    .with_labels([span1, span2])       // multiple highlights
    .with_help("Try doing X instead")  // suggestion text
    .with_note("Additional context")   // informational note
    .with_url("https://...")           // link to docs
    .with_error_code("eslint", "no-debugger")  // rule identifier

Rules report diagnostics through LintContext:

  • ctx.diagnostic(d) - report without fix
  • ctx.diagnostic_with_fix(d, |fixer| ...) - report with auto-fix
  • ctx.diagnostic_with_fix_of_kind(d, FixKind::DangerousFix, |fixer| ...) - report with a specific fix kind

The fixer API: fixer.replace(span, "new text"), fixer.delete(&span), fixer.insert_before(node, "text"), fixer.insert_after(node, "text").
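
To make the span-based fix model concrete, here is a minimal sketch of how a set of fixes could be applied to source text. It assumes a fix is just "replace bytes start..end with some text"; Fix and apply_fixes are hypothetical names for illustration and do not mirror Oxc's internal fixer types.

```rust
/// Hypothetical fix record: replace the bytes `start..end` with `text`.
#[derive(Debug, Clone)]
pub struct Fix {
    pub start: u32,
    pub end: u32,
    pub text: String,
}

impl Fix {
    pub fn replace(start: u32, end: u32, text: &str) -> Self {
        Fix { start, end, text: text.to_string() }
    }
    /// Deleting is replacing a span with nothing.
    pub fn delete(start: u32, end: u32) -> Self {
        Fix::replace(start, end, "")
    }
    /// Inserting is replacing an empty span.
    pub fn insert_before(offset: u32, text: &str) -> Self {
        Fix::replace(offset, offset, text)
    }
}

/// Applies fixes in source order, skipping any fix that overlaps one already applied.
/// Offsets are byte offsets and must fall on UTF-8 character boundaries.
pub fn apply_fixes(source: &str, mut fixes: Vec<Fix>) -> String {
    fixes.sort_by_key(|f| (f.start, f.end));
    let mut out = String::with_capacity(source.len());
    let mut cursor = 0usize;
    for fix in &fixes {
        let (start, end) = (fix.start as usize, fix.end as usize);
        if start < cursor {
            continue; // overlaps a previously applied fix
        }
        out.push_str(&source[cursor..start]); // untouched text before the fix
        out.push_str(&fix.text);
        cursor = end;
    }
    out.push_str(&source[cursor..]);
    out
}
```

Expressing delete and insert as special cases of replace keeps the application step to a single pass over sorted, non-overlapping spans.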

Rule Categories

Rules are organized by plugin (matching ESLint ecosystem) in crates/oxc_linter/src/rules/:

rules/
  eslint/          # Core ESLint rules
  typescript/      # @typescript-eslint rules
  react/           # eslint-plugin-react
  react_perf/      # eslint-plugin-react-perf
  jsx_a11y/        # eslint-plugin-jsx-a11y
  unicorn/         # eslint-plugin-unicorn
  import/          # eslint-plugin-import
  jest/            # eslint-plugin-jest
  vitest/          # eslint-plugin-vitest
  jsdoc/           # eslint-plugin-jsdoc
  nextjs/          # @next/eslint-plugin-next
  node/            # eslint-plugin-node
  promise/         # eslint-plugin-promise
  oxc/             # Oxc-specific rules
  vue/             # eslint-plugin-vue

RuleEnum and Codegen

RuleEnum is 16 bytes (verified by a size assertion test) to keep the rule dispatch loop cache-friendly. The enum and its match arms are code-generated by oxc_linter_codegen (run via cargo lintgen). You add a new rule file, register it, and re-run codegen.
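
A size guard of this kind can be sketched with std::mem::size_of. The enum below is a two-variant toy, not the generated RuleEnum, and it assumes that rules with larger option structs keep them behind a Box so that every variant stays pointer-sized or smaller.

```rust
use std::mem::size_of;

// Hypothetical rules: most carry no state, configurable ones box their options.
#[derive(Debug, Default, Clone)]
pub struct NoDebugger;

#[derive(Debug, Default, Clone)]
pub struct MaxLinesConfig {
    pub max: usize,
    pub skip_comments: bool,
    pub skip_blank_lines: bool,
}

#[derive(Debug, Default, Clone)]
pub struct MaxLines(pub Box<MaxLinesConfig>);

#[derive(Debug, Clone)]
pub enum ToyRuleEnum {
    NoDebugger(NoDebugger),
    MaxLines(MaxLines),
}

/// Runtime form of the check, convenient inside a unit test.
pub const fn rule_enum_fits(limit: usize) -> bool {
    size_of::<ToyRuleEnum>() <= limit
}

// Compile-time form: the build fails if a new variant makes the enum grow past 16 bytes.
const _: () = assert!(size_of::<ToyRuleEnum>() <= 16);
```

The compile-time assertion catches regressions earlier than a test would, at the cost of a less descriptive error message.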


Key Rust Patterns

Arena Allocation (oxc_allocator)

All AST nodes live in a bump allocator. This means:

  1. No individual frees. The entire AST is deallocated at once when the Allocator drops.
  2. Box<'a, T> and Vec<'a, T> are arena-backed versions of std::boxed::Box and std::vec::Vec.
  3. The allocator grows by doubling chunk sizes. For best performance, reuse allocators with .reset() rather than creating new ones.
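
use oxc_allocator::Allocator;
use oxc_parser::Parser;

// `source: &str` and `source_type: SourceType` are assumed to be in scope.
let allocator = Allocator::new();
let ret = Parser::new(&allocator, source, source_type).parse();
// ret.program contains arena-allocated AST nodes
// When `allocator` drops, all AST memory is freed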

The AllocatorPool in the linter service recycles allocators across files to avoid repeated system allocation.
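
The reset-and-recycle idea can be illustrated with a toy arena and pool. This is a simplified sketch under loud assumptions: ToyArena is a typed, index-based arena backed by a Vec (Oxc's allocator is a byte-level bump allocator that hands out references), and ToyPool is a hypothetical Mutex-guarded free list, not the real AllocatorPool.

```rust
use std::sync::Mutex;

/// Toy typed arena: values live in one Vec and are addressed by index.
pub struct ToyArena<T> {
    items: Vec<T>,
}

impl<T> ToyArena<T> {
    pub fn new() -> Self {
        ToyArena { items: Vec::new() }
    }
    pub fn alloc(&mut self, value: T) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }
    pub fn get(&self, id: usize) -> &T {
        &self.items[id]
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
    pub fn capacity(&self) -> usize {
        self.items.capacity()
    }
    /// Frees every value at once but keeps the backing storage for reuse.
    pub fn reset(&mut self) {
        self.items.clear();
    }
}

/// Hypothetical pool: hands out a recycled arena if one is available.
pub struct ToyPool<T> {
    free: Mutex<Vec<ToyArena<T>>>,
}

impl<T> ToyPool<T> {
    pub fn new() -> Self {
        ToyPool { free: Mutex::new(Vec::new()) }
    }
    pub fn get(&self) -> ToyArena<T> {
        self.free.lock().unwrap().pop().unwrap_or_else(ToyArena::new)
    }
    /// Resets the arena and returns it to the pool for the next file.
    pub fn put(&self, mut arena: ToyArena<T>) {
        arena.reset();
        self.free.lock().unwrap().push(arena);
    }
}
```

An arena taken from the pool after a put starts empty but keeps its previous capacity, so the next file does not pay for growing the storage again.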

The Visitor Pattern (oxc_ast_visit)

Read-only traversal uses the Visit trait (generated). SemanticBuilder implements Visit to walk the AST in a single pass:

impl<'a> Visit<'a> for SemanticBuilder<'a> {
    // Simplified: the generated method also receives the function's scope flags
    fn visit_function(&mut self, func: &Function<'a>, flags: ScopeFlags) {
        // enter: create scope, process bindings
        walk_function(self, func, flags);  // recurse into children
        // exit: close scope
    }
}

The linter does NOT use the visitor pattern directly for rule dispatch. Instead, SemanticBuilder builds an AstNodes structure (a flat vec of nodes with parent pointers), and the linter iterates that flat list. Each AstNode has a kind() returning an AstKind enum, which rules pattern-match on:

fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {
    let AstKind::BinaryExpression(expr) = node.kind() else { return };
    // inspect expr...
}
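
The flat-list-with-parent-pointers layout can be sketched as follows. ToyNodes and its string kinds are hypothetical simplifications of AstNodes and AstKind, meant only to show why a rule can both iterate linearly and look upward without a recursive walk.

```rust
/// Hypothetical flat node table: node `i` has kind `kinds[i]` and parent `parents[i]`.
pub struct ToyNodes {
    kinds: Vec<&'static str>,
    parents: Vec<Option<usize>>,
}

impl ToyNodes {
    pub fn new() -> Self {
        ToyNodes { kinds: Vec::new(), parents: Vec::new() }
    }
    /// Appends a node and returns its id (its index in the flat list).
    pub fn add(&mut self, kind: &'static str, parent: Option<usize>) -> usize {
        self.kinds.push(kind);
        self.parents.push(parent);
        self.kinds.len() - 1
    }
    pub fn kind(&self, id: usize) -> &'static str {
        self.kinds[id]
    }
    /// Walks parent pointers from `id` up to the root.
    pub fn ancestors(&self, id: usize) -> impl Iterator<Item = usize> + '_ {
        std::iter::successors(self.parents[id], move |&p| self.parents[p])
    }
    pub fn iter(&self) -> impl Iterator<Item = (usize, &'static str)> + '_ {
        self.kinds.iter().copied().enumerate()
    }
}

/// A toy "rule": flags `debugger` statements nested inside a function.
pub fn debugger_inside_function(nodes: &ToyNodes) -> Vec<usize> {
    nodes
        .iter()
        .filter(|&(id, kind)| {
            kind == "DebuggerStatement"
                && nodes.ancestors(id).any(|a| nodes.kind(a) == "Function")
        })
        .map(|(id, _)| id)
        .collect()
}
```

The rule itself is a plain loop over a slice; the tree structure is only consulted when a match needs context from its ancestors.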

Mutable traversal for the transformer uses oxc_traverse::Traverse trait with enter/exit hooks for each AST node type:

impl<'a> Traverse<'a, State> for MyTransform {
    fn enter_expression(&mut self, expr: &mut Expression<'a>, ctx: &mut TraverseCtx<'a, State>) {
        // mutate the expression
    }
    fn exit_expression(&mut self, expr: &mut Expression<'a>, ctx: &mut TraverseCtx<'a, State>) {
        // post-order processing
    }
}
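
Why the exit hook matters is easiest to see on a toy tree. The sketch below is not oxc_traverse: Expr, ToyTraverse, walk_expr, and FoldAdds are hypothetical, heap-allocated stand-ins, and there is no context argument. It shows a constant-folding pass that only works in post-order, because a parent can be folded only after its children have been.

```rust
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Ident(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical enter/exit hooks, both no-ops by default.
pub trait ToyTraverse {
    fn enter_expr(&mut self, _expr: &mut Expr) {}
    fn exit_expr(&mut self, _expr: &mut Expr) {}
}

/// Drives the traversal: enter, recurse into children, exit.
pub fn walk_expr<T: ToyTraverse>(t: &mut T, expr: &mut Expr) {
    t.enter_expr(expr);
    if let Expr::Add(left, right) = expr {
        walk_expr(t, left);
        walk_expr(t, right);
    }
    t.exit_expr(expr);
}

/// Folds `Num + Num` into a single `Num` on the way back up.
#[derive(Default)]
pub struct FoldAdds {
    pub visited: usize,
}

impl ToyTraverse for FoldAdds {
    fn enter_expr(&mut self, _expr: &mut Expr) {
        self.visited += 1;
    }
    fn exit_expr(&mut self, expr: &mut Expr) {
        // Children were already visited, so nested sums are folded by now.
        let folded = match expr {
            Expr::Add(left, right) => match (&**left, &**right) {
                (Expr::Num(a), Expr::Num(b)) => a.checked_add(*b),
                _ => None,
            },
            _ => None,
        };
        if let Some(n) = folded {
            *expr = Expr::Num(n); // replace the node in place
        }
    }
}
```

Running FoldAdds over (1 + 2) + 3 folds the inner sum first and then the outer one, while x + 1 is left untouched.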

UniquePromise for Safety

The parser uses a zero-sized UniquePromise type to enforce at the type level that only one ParserImpl exists per thread at a time. This is needed for soundness of unsafe pointer operations in the lexer’s source cursor:

// Only Parser::parse() can create a UniquePromise
pub fn parse(self) -> ParserReturn<'a> {
    let unique = UniquePromise::new();  // private constructor
    let parser = ParserImpl::new(self.allocator, ..., unique);
    parser.parse()
}
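
The token pattern itself can be shown in a few lines. This is a generic sketch, not the parser's code: UniqueToken, Cursor, and count_bytes are hypothetical names, and the private field only restricts who can construct the token. Whether at most one token is live at a time is still a promise made by the single function that creates it.

```rust
pub mod promise {
    /// Zero-sized token. The private field means code outside this module
    /// cannot build one with a struct literal.
    pub struct UniqueToken {
        _private: (),
    }

    /// Hypothetical consumer that can only be created by handing over a token.
    pub struct Cursor<'s> {
        source: &'s str,
        pos: usize,
        _token: UniqueToken,
    }

    impl<'s> Cursor<'s> {
        pub fn new(source: &'s str, token: UniqueToken) -> Self {
            Cursor { source, pos: 0, _token: token }
        }
        pub fn next_byte(&mut self) -> Option<u8> {
            let byte = self.source.as_bytes().get(self.pos).copied();
            if byte.is_some() {
                self.pos += 1;
            }
            byte
        }
    }

    /// The only place a token is minted, so every Cursor originates here.
    pub fn count_bytes(source: &str) -> usize {
        let token = UniqueToken { _private: () };
        let mut cursor = Cursor::new(source, token);
        let mut count = 0;
        while cursor.next_byte().is_some() {
            count += 1;
        }
        count
    }
}
```

Because the token is zero-sized, passing it around costs nothing at runtime; it exists purely to make the constructor's precondition visible in the type signature.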

RuleRunFunctionsImplemented Optimization

At codegen time, the linter analyzes which Rule trait methods each rule actually overrides. This lets the dispatch loop skip calling run() on a rule that only implements run_once(), and vice versa. The generated RuleRunner impl carries this info as a const:

pub trait RuleRunner: Rule {
    const NODE_TYPES: Option<&AstTypesBitset>;  // which AST node kinds this rule cares about
    const RUN_FUNCTIONS: RuleRunFunctionsImplemented;  // which hooks are implemented
}
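
A reduced version of this idea: an associated const tells a generic dispatcher which hook is worth calling. Everything here is hypothetical (RunFns, ToyRule, lint_with), and the const is written by hand, whereas the text above says Oxc derives it at codegen time.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum RunFns {
    Run,
    RunOnce,
}

pub trait ToyRule {
    /// Which hook this rule actually overrides.
    const RUN_FUNCTIONS: RunFns;
    fn run(&self, _node: &str, _hits: &mut Vec<String>) {}
    fn run_once(&self, _nodes: &[&str], _hits: &mut Vec<String>) {}
}

pub struct NoDebugger;
impl ToyRule for NoDebugger {
    const RUN_FUNCTIONS: RunFns = RunFns::Run;
    fn run(&self, node: &str, hits: &mut Vec<String>) {
        if node == "debugger" {
            hits.push("no-debugger".to_string());
        }
    }
}

pub struct NoEmptyFile;
impl ToyRule for NoEmptyFile {
    const RUN_FUNCTIONS: RunFns = RunFns::RunOnce;
    fn run_once(&self, nodes: &[&str], hits: &mut Vec<String>) {
        if nodes.is_empty() {
            hits.push("no-empty-file".to_string());
        }
    }
}

/// Returns the diagnostics plus the number of hook calls that were made.
pub fn lint_with<R: ToyRule>(rule: &R, nodes: &[&str]) -> (Vec<String>, usize) {
    let mut hits = Vec::new();
    let mut calls = 0;
    match R::RUN_FUNCTIONS {
        // Whole-file rule: one call, no per-node loop at all.
        RunFns::RunOnce => {
            rule.run_once(nodes, &mut hits);
            calls += 1;
        }
        RunFns::Run => {
            for node in nodes {
                rule.run(node, &mut hits);
                calls += 1;
            }
        }
    }
    (hits, calls)
}
```

Since RUN_FUNCTIONS is a constant per rule type, the match is resolved per monomorphized instance, and the per-node loop disappears entirely for whole-file rules.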

AstTypesBitset for Rule Filtering

Each rule declares which AstType variants it inspects (also computed at codegen time). Before running a file, the linter checks if the file’s AST contains any matching node types. If not, the rule is skipped entirely. For large files, rules are bucketed by AstType to avoid iterating all rules for each node.
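
The file-level filter boils down to a bitwise intersection test. The sketch below assumes fewer than 64 node kinds so that a single u64 suffices; ToyType, TypeSet, RuleInfo, and rules_to_run are hypothetical, and the real AstTypesBitset may well be wider.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[repr(u8)]
pub enum ToyType {
    Debugger = 0,
    Binary = 1,
    Call = 2,
    Function = 3,
}

/// One bit per node kind.
#[derive(Clone, Copy, Default, PartialEq, Eq, Debug)]
pub struct TypeSet(u64);

impl TypeSet {
    pub const fn empty() -> Self {
        TypeSet(0)
    }
    pub const fn with(self, ty: ToyType) -> Self {
        TypeSet(self.0 | (1u64 << (ty as u8)))
    }
    pub const fn intersects(self, other: TypeSet) -> bool {
        (self.0 & other.0) != 0
    }
    pub fn from_types(types: &[ToyType]) -> Self {
        types.iter().fold(Self::empty(), |set, &ty| set.with(ty))
    }
}

pub struct RuleInfo {
    pub name: &'static str,
    /// `None` means the rule does not declare node kinds and must always run.
    pub node_types: Option<TypeSet>,
}

/// Keeps only the rules that can possibly match something in this file.
pub fn rules_to_run(file_types: TypeSet, rules: &[RuleInfo]) -> Vec<&'static str> {
    rules
        .iter()
        .filter(|rule| match rule.node_types {
            Some(wanted) => wanted.intersects(file_types),
            None => true,
        })
        .map(|rule| rule.name)
        .collect()
}
```

For a file that contains only binary and call expressions, a rule interested in debugger statements is dropped after a single AND instruction, before any node is visited.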

Span as u32 Pair

Source positions use u32 instead of usize, limiting files to ~4 GiB but halving the size of every span. This matters because spans are everywhere in the AST and keeping them small improves cache utilization.
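
The size difference is easy to check directly. Span32 and SpanUsize below are hypothetical comparison types (Oxc's own type is simply Span), and try_new is an illustrative helper showing where the ~4 GiB limit would surface.

```rust
use std::convert::TryFrom;

/// Two u32 byte offsets: 8 bytes in total.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span32 {
    pub start: u32,
    pub end: u32,
}

/// The same idea with usize offsets, for comparison: 16 bytes on 64-bit targets.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanUsize {
    pub start: usize,
    pub end: usize,
}

impl Span32 {
    /// Returns None when an offset does not fit in u32 (source larger than ~4 GiB).
    pub fn try_new(start: usize, end: usize) -> Option<Self> {
        Some(Span32 {
            start: u32::try_from(start).ok()?,
            end: u32::try_from(end).ok()?,
        })
    }

    /// Slices the original source text covered by this span.
    pub fn source_text<'s>(&self, source: &'s str) -> &'s str {
        &source[self.start as usize..self.end as usize]
    }
}
```

On a 64-bit target the u32 version is half the size of the usize version, and since nearly every AST node carries a span, that saving is multiplied across the whole tree.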

Rayon for Parallelism

File processing is parallelized via rayon. Each file gets its own allocator (from the pool) and is parsed and linted independently of every other file. Worker threads send their results over an mpsc channel, and DiagnosticService on the receiving end collects and reports them.
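
The fan-out and collect shape can be sketched with the standard library alone. This sketch deliberately swaps rayon for one std::thread per file to stay dependency-free, and lint_file is a hypothetical stand-in for the parse, semantic, and rule-execution steps.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical per-file "lint": flags every line containing `debugger`.
fn lint_file(path: &str, source: &str) -> Vec<String> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains("debugger"))
        .map(|(i, _)| format!("{}:{}: unexpected debugger", path, i + 1))
        .collect()
}

/// Lints every (path, source) pair on its own thread and gathers all diagnostics.
pub fn lint_all(files: Vec<(String, String)>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<Vec<String>>();
    let workers: Vec<_> = files
        .into_iter()
        .map(|(path, source)| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(lint_file(&path, &source)).expect("receiver is alive");
            })
        })
        .collect();
    // Drop the original sender so the receiver loop ends once all workers finish.
    drop(tx);
    let mut all: Vec<String> = rx.iter().flatten().collect();
    for worker in workers {
        worker.join().expect("worker panicked");
    }
    all.sort(); // arrival order is nondeterministic, so sort before reporting
    all
}
```

A thread pool such as rayon's avoids spawning one thread per file, but the channel side looks the same: many senders, one receiver, and a sort at the end to make the output deterministic.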


Your Projects Connection

Since you write Rust (4 projects: tree, winmux, winpane, windows-cli-tools) and TypeScript (6 projects), here is how the Oxc crates map to contribution paths:

Best Starting Points for Contributing

  1. oxc_linter rules - The most accessible entry point. Each rule is a self-contained file with clear structure, inline tests, and snapshot testing. You do not need to understand the parser or semantic analysis internals. Just pattern-match on AstKind variants and report diagnostics. The no_debugger rule above is ~100 lines including tests.

  2. oxc_linter infrastructure - Config loading, fix application, diagnostic formatting. This is standard Rust (serde, file I/O, CLI plumbing) without the complexity of compiler internals.

  3. oxc_codegen - If you are interested in code generation. Each AST node implements a Gen trait that prints itself. Adding support for new syntax or fixing formatting issues is localized.

Worth Understanding (but more complex)

  1. oxc_transformer - Requires understanding the Traverse pattern and how Oxc mutates arena-allocated AST nodes. The ES2015-2026 modules are organized similarly to Babel plugins. Relevant if you use TypeScript and want to understand how TS stripping works.

  2. oxc_semantic - The bridge between parsing and everything else. Understanding Scoping (scope tree + symbol table + reference resolution) is useful for writing rules that need to understand variable bindings, e.g., “is this variable used?” or “does this name refer to a global?”.

Deep Internals (for understanding, not casual contribution)

  1. oxc_parser - A high-performance recursive descent parser. Changes here require deep JS/TS grammar knowledge and carry high correctness risk. The codebase prioritizes performance (unsafe code, SIMD in the lexer, arena allocation).

  2. oxc_allocator - Bump allocator internals. You’ll use it as a consumer (just call Allocator::new()) but modifying it requires understanding memory layout and unsafe Rust.

Cross-Pollination with Your TypeScript Projects

The napi/parser, napi/transform, and napi/minify crates expose Oxc’s functionality to Node.js. If you integrate oxlint into your TypeScript projects, the practical starting point is the config format (.oxlintrc.json) and the rule naming conventions. The VS Code extension in editors/ is another path if you use Oxc in your editor workflow.

Your Windows Expertise

Oxc runs on Windows. The ignore crate handles path normalization, and the linter tests run on Windows CI. Since you write Windows CLI tools, you could contribute to platform-specific issues (path handling, terminal output, performance on NTFS) if you encounter them.