Oxc Architecture

How Oxc’s Rust codebase is organized, how data flows through the toolchain, and what matters for contributing.

Source: reading the actual crate source code at v0.116.0.

Crate Map

The workspace has ~30 crates in crates/, apps in apps/, and Node.js bindings in napi/.

Core (parsing and AST)

Crate	Purpose
`oxc_parser`	Recursive descent parser for JS/TS/JSX/TSX. Entry: `Parser::new(&allocator, source, source_type).parse()`
`oxc_ast`	AST node type definitions (all the `Expression`, `Statement`, etc. enums and structs)
`oxc_ast_visit`	Generated `Visit` and `VisitMut` traits for walking the AST
`oxc_semantic`	Semantic analysis: scopes, symbols, references, control flow graph. `SemanticBuilder` traverses the AST in a single pass
`oxc_traverse`	Mutable AST traversal used by the transformer (distinct from the read-only visitor)
`oxc_span`	`Span` (pair of `u32` byte offsets), `SourceType`, `Atom` (arena-allocated string slice)
`oxc_syntax`	Language-level types: operators, precedence, scope flags, symbol flags, `ModuleRecord`

Tools

Crate	Purpose
`oxc_linter`	Linting engine: rule trait, rule registry, config loading, `Linter::run()` loop
`oxc_transformer`	Babel-compatible transforms (TS stripping, JSX, ES2015-ES2026 downleveling)
`oxc_codegen`	Code printer. Walks the AST via a `Gen` trait and emits source text + optional source maps
`oxc_minifier`	Code minification (constant folding, dead code elimination)
`oxc_mangler`	Variable name mangling (shortening identifiers)
`oxc_formatter`	Prettier-compatible code formatting (in development)
`oxc_isolated_declarations`	TypeScript `.d.ts` generation without a full type checker
`oxc_regular_expression`	RegExp pattern parser used by the main parser and linter

Infrastructure

Crate	Purpose
`oxc_allocator`	Bump-allocator arena. All AST nodes are allocated here for fast bulk deallocation
`oxc_diagnostics`	`OxcDiagnostic` type with builder pattern (`.with_label()`, `.with_help()`, etc.), plus `DiagnosticService` for rendering
`oxc_macros`	Proc macros, primarily `declare_oxc_lint!`
`oxc_ast_macros`	Proc macros for AST type generation (`#[ast]`)
`oxc_data_structures`	Shared data structures: `CodeBuffer`, stack types, etc.
`oxc_ecmascript`	Pure ECMAScript spec operations (abstract equality, `ToNumber`, etc.)
`oxc_cfg`	Control flow graph builder and types
`oxc_sourcemap`	Source map generation
`oxc_index`	Type-safe index types (like `newtype_index!` in rustc)
`oxc_napi`	Shared NAPI utilities for Node.js bindings
`oxc_language_server`	LSP server for editor integration

Apps and Bindings

Path	Purpose
`apps/oxlint/`	CLI binary for the linter
`apps/oxfmt/`	CLI binary for the formatter
`napi/parser/`	Node.js binding for the parser
`napi/transform/`	Node.js binding for the transformer
`napi/minify/`	Node.js binding for the minifier

Data Flow

How source code flows through Oxc’s pipeline:

Source Text (String)
        |
        v
+-------------------+
|   oxc_allocator   |  Arena created, shared by all stages
+-------------------+
        |
        v
+-------------------+
|   oxc_parser      |  Lexer -> Recursive Descent Parser
|                   |  Output: Program<'a> (AST in arena)
|                   |         + errors + ModuleRecord
+-------------------+
        |
        v
+-------------------+
|   oxc_semantic    |  Single-pass AST walk via Visit trait
|                   |  Builds: AstNodes (parent pointers),
|                   |          Scoping (scopes, symbols, refs),
|                   |          ClassTable, optional CFG
+-------------------+
        |
        +-----> Linter (read-only, per-node rule dispatch)
        |
        +-----> Transformer (mutates AST via Traverse trait)
        |               |
        |               v
        |       +------------------+
        |       |   oxc_codegen    |  Walks AST via Gen trait
        |       |                  |  Output: String + SourceMap
        |       +------------------+
        |
        +-----> Minifier (AST transforms, then codegen)
        |
        +-----> Isolated Declarations (.d.ts emission)

Key design point: The allocator is created once and owns all AST memory. Parsing allocates AST nodes into it; semantic analysis reads them; the transformer mutates them in-place; codegen reads them again. When the allocator is dropped, all AST memory is freed in one shot.

The parser produces a structurally valid AST even on syntax errors (error recovery). More expensive checks like scope resolution and duplicate binding detection are delegated to SemanticBuilder to keep the parser fast.

The Linter Pipeline

From the CLI entry point (apps/oxlint/src/main.rs) through to diagnostics:

1. CLI Parsing (bpaf)
   apps/oxlint/src/main.rs -> CliRunner::run()

2. Config Loading
   ConfigLoader resolves .oxlintrc.json / oxlint.json
   -> ConfigStoreBuilder builds ConfigStore (rules + severity)

3. File Discovery
   Walk (powered by `ignore` crate, respects .gitignore)
   -> produces Vec<PathBuf>

4. Per-file Processing (parallelized via rayon)
   For each file:
   a. Read source text into arena allocator
   b. Parser::new(&allocator, &source, source_type).parse()
      -> ParserReturn { program, errors, module_record, ... }
   c. SemanticBuilder::new().build(&program)
      -> Semantic { nodes, scoping, classes, cfg, ... }
   d. Build ContextSubHost (wraps semantic + tokens for linter)

5. Rule Execution (Linter::run)
   - Resolve which rules apply to this file path
   - Filter rules: skip if file has no matching AST node types
   - For smaller files (up to 200k AST nodes): iterate rules, each rule walks all nodes
   - For larger files (more than 200k AST nodes): iterate nodes, dispatch to matching rules
     (avoids cache thrashing by keeping rule data in the inner loop)
   - Each rule receives (AstNode, LintContext)
   - Rules report diagnostics via ctx.diagnostic() or ctx.diagnostic_with_fix()

6. Disable Directives
   Respects // eslint-disable, // eslint-disable-next-line, etc.
   Filtered during/after rule execution

7. Diagnostic Collection + Reporting
   Messages collected, sorted, deduplicated
   DiagnosticService renders via miette (fancy terminal output)
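
The two loop orders in step 5 can be sketched with plain vectors. This is a simplified, std-only illustration rather than the linter's actual code: `Kind`, `ToyRule`, and `run_rules` are hypothetical names, and the size threshold is passed in as a parameter instead of being hard-coded.

```rust
/// Hypothetical stand-in for an AST node kind.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Kind { Debugger, Binary, Call }

/// Hypothetical rule: reports every node whose kind equals `wants`.
pub struct ToyRule { pub name: &'static str, pub wants: Kind }

/// Runs all rules over all nodes, choosing the loop order by file size.
/// Returns (rule name, node index) pairs for every match.
pub fn run_rules(
    nodes: &[Kind],
    rules: &[ToyRule],
    large_threshold: usize,
) -> Vec<(&'static str, usize)> {
    let mut hits = Vec::new();
    if nodes.len() > large_threshold {
        // Large file: nodes in the outer loop, rules in the inner loop,
        // so the smaller rule data is what gets re-read on each iteration.
        for (i, kind) in nodes.iter().enumerate() {
            for rule in rules {
                if rule.wants == *kind { hits.push((rule.name, i)); }
            }
        }
    } else {
        // Small file: rules in the outer loop, each rule scans every node.
        for rule in rules {
            for (i, kind) in nodes.iter().enumerate() {
                if rule.wants == *kind { hits.push((rule.name, i)); }
            }
        }
    }
    hits
}
```

Both branches find the same matches; only the order in which they are discovered differs, which is why the choice can be made purely on performance grounds.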

The LintService in crates/oxc_linter/src/service/runtime.rs handles the parallelism. It uses rayon for data-parallel file processing and an AllocatorPool to recycle bump allocators across files (avoiding repeated allocation of the arena itself).

Rule Anatomy

Every linter rule follows the same structure. Here is the simplest possible rule, no_debugger:

// crates/oxc_linter/src/rules/eslint/no_debugger.rs

use oxc_ast::AstKind;
use oxc_diagnostics::OxcDiagnostic;
use oxc_macros::declare_oxc_lint;
use oxc_span::Span;
use crate::{AstNode, context::LintContext, rule::Rule};

// 1. Diagnostic factory function
fn no_debugger_diagnostic(span: Span) -> OxcDiagnostic {
    OxcDiagnostic::warn("`debugger` statement is not allowed")
        .with_label(span)
}

// 2. Rule struct (must be Debug + Default + Clone)
#[derive(Debug, Default, Clone)]
pub struct NoDebugger;

// 3. Metadata macro: docs, plugin, category, fix capability
declare_oxc_lint!(
    /// ### What it does
    /// Checks for usage of the `debugger` statement
    NoDebugger,
    eslint,
    correctness,
    fix
);

// 4. Rule implementation
impl Rule for NoDebugger {
    fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {
        if let AstKind::DebuggerStatement(stmt) = node.kind() {
            ctx.diagnostic_with_fix(
                no_debugger_diagnostic(stmt.span),
                |fixer| fixer.delete(&stmt.span),
            );
        }
    }
}

// 5. Tests inline in the same file
#[test]
fn test() {
    use crate::tester::Tester;
    let pass = vec!["var test = { debugger: 1 }; test.debugger;"];
    let fail = vec!["if (foo) debugger"];
    Tester::new(NoDebugger::NAME, NoDebugger::PLUGIN, pass, fail)
        .test_and_snapshot();
}

The `Rule` Trait

Defined in crates/oxc_linter/src/rule.rs:

pub trait Rule: Sized + Default + fmt::Debug {
    /// Called for every AST node (the main hook)
    fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {}

    /// Called once per file (for whole-program checks)
    fn run_once(&self, ctx: &LintContext) {}

    /// Called for Jest/Vitest test nodes specifically
    fn run_on_jest_node<'a, 'c>(&self, node: &PossibleJestNode<'a, 'c>, ctx: &'c LintContext<'a>) {}

    /// Check if rule should run on this file
    fn should_run(&self, ctx: &ContextHost) -> bool { true }

    /// Deserialize config from JSON (ESLint format)
    fn from_configuration(_value: serde_json::Value) -> Result<Self, ...> { Ok(Self::default()) }
}

Most rules only implement run. The run_once method is for rules that need a whole-file view (e.g., checking for duplicate imports). The run_on_jest_node method is for test-framework-specific rules.
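
To illustrate why a rule can override just one hook, here is a std-only miniature of a trait with default no-op methods. `MiniRule`, `NoDebuggerLike`, `NoEmptyFileLike`, and `lint` are hypothetical names for this sketch, and "nodes" are just whitespace-separated words; the real `Rule` trait operates on `AstNode` and `LintContext`.

```rust
/// Hypothetical miniature of a rule trait with default no-op hooks.
pub trait MiniRule: Default {
    /// Per-node hook; the default does nothing.
    fn run(&self, _node: &str, _out: &mut Vec<String>) {}
    /// Per-file hook; the default does nothing.
    fn run_once(&self, _source: &str, _out: &mut Vec<String>) {}
}

/// Overrides only `run`: flags nodes spelled "debugger".
#[derive(Default)]
pub struct NoDebuggerLike;

impl MiniRule for NoDebuggerLike {
    fn run(&self, node: &str, out: &mut Vec<String>) {
        if node == "debugger" {
            out.push("unexpected debugger".to_string());
        }
    }
}

/// Overrides only `run_once`: flags files that contain no code.
#[derive(Default)]
pub struct NoEmptyFileLike;

impl MiniRule for NoEmptyFileLike {
    fn run_once(&self, source: &str, out: &mut Vec<String>) {
        if source.trim().is_empty() {
            out.push("empty file".to_string());
        }
    }
}

/// Drives both hooks: once per file, then once per "node".
pub fn lint<R: MiniRule>(source: &str) -> Vec<String> {
    let rule = R::default();
    let mut out = Vec::new();
    rule.run_once(source, &mut out);
    for node in source.split_whitespace() {
        rule.run(node, &mut out);
    }
    out
}
```

Calling `lint::<NoDebuggerLike>("if foo debugger")` exercises the per-node hook, while `lint::<NoEmptyFileLike>("  ")` exercises the per-file hook; in each case the hook that was not overridden silently does nothing.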

The `declare_oxc_lint!` Macro

This generates the RuleMeta implementation (NAME, PLUGIN, CATEGORY, FIX, documentation) so you do not need to write it by hand. The arguments are:

declare_oxc_lint!(
    /// Documentation (becomes the website docs page)
    StructName,
    plugin_name,    // eslint, typescript, react, unicorn, etc.
    category,       // correctness, suspicious, pedantic, perf, style, restriction, nursery
    fix_kind,       // fix, suggestion, dangerous_fix, pending, none (default)
    config = Type,  // optional: config struct for rules with options
);

Diagnostic Builder Pattern

OxcDiagnostic uses a chained builder:

OxcDiagnostic::warn("message")        // or ::error("message")
    .with_label(span)                  // highlight the problematic code
    .with_labels([span1, span2])       // multiple highlights
    .with_help("Try doing X instead") // suggestion text
    .with_note("Additional context")   // informational note
    .with_url("https://...")           // link to docs
    .with_error_code("eslint", "no-debugger")  // rule identifier

Rules report diagnostics through LintContext:

ctx.diagnostic(d) - report without fix
ctx.diagnostic_with_fix(d, |fixer| ...) - report with auto-fix
ctx.diagnostic_with_fix_of_kind(d, FixKind::DangerousFix, |fixer| ...) - report with a specific fix kind

The fixer API: fixer.replace(span, "new text"), fixer.delete(&span), fixer.insert_text_before(&node, "text"), fixer.insert_text_after(&node, "text").
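
Conceptually, each fix is a span plus replacement text that is later spliced into the source. The sketch below shows that idea with std only. `TextFix`, `delete`, and `apply_fixes` are hypothetical names, not the linter's fix types, and the overlap policy (skip the later fix) is an assumption made for the example. Offsets are byte offsets and are assumed to fall on character boundaries.

```rust
/// Hypothetical text edit: replace the bytes in `start..end` with `text`.
pub struct TextFix { pub start: u32, pub end: u32, pub text: String }

/// A deletion is a replacement with empty text.
pub fn delete(start: u32, end: u32) -> TextFix {
    TextFix { start, end, text: String::new() }
}

/// Applies fixes in source order, skipping any fix that is out of range
/// or overlaps one that was already applied.
pub fn apply_fixes(source: &str, mut fixes: Vec<TextFix>) -> String {
    fixes.sort_by_key(|f| f.start);
    let mut out = String::with_capacity(source.len());
    let mut cursor = 0usize; // end of the last applied fix
    for fix in &fixes {
        let (start, end) = (fix.start as usize, fix.end as usize);
        if start < cursor || start > end || end > source.len() {
            continue; // overlapping or invalid span
        }
        out.push_str(&source[cursor..start]); // untouched text before the fix
        out.push_str(&fix.text);
        cursor = end;
    }
    out.push_str(&source[cursor..]); // untouched tail
    out
}
```

For the failing test case of no_debugger, `apply_fixes("if (foo) debugger", vec![delete(9, 17)])` removes the statement and leaves the rest of the source untouched.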

Rule Categories

Rules are organized by plugin (matching ESLint ecosystem) in crates/oxc_linter/src/rules/:

rules/
  eslint/          # Core ESLint rules
  typescript/      # @typescript-eslint rules
  react/           # eslint-plugin-react
  react_perf/      # eslint-plugin-react-perf
  jsx_a11y/        # eslint-plugin-jsx-a11y
  unicorn/         # eslint-plugin-unicorn
  import/          # eslint-plugin-import
  jest/            # eslint-plugin-jest
  vitest/          # eslint-plugin-vitest
  jsdoc/           # eslint-plugin-jsdoc
  nextjs/          # @next/eslint-plugin-next
  node/            # eslint-plugin-node
  promise/         # eslint-plugin-promise
  oxc/             # Oxc-specific rules
  vue/             # eslint-plugin-vue

`RuleEnum` and Codegen

RuleEnum is 16 bytes (verified by a size assertion test) to keep the rule dispatch loop cache-friendly. The enum and its match arms are code-generated by oxc_linter_codegen (run via cargo lintgen). You add a new rule file, register it, and re-run codegen.

Key Rust Patterns

Arena Allocation (`oxc_allocator`)

All AST nodes live in a bump allocator. This means:

No individual frees. The entire AST is deallocated at once when the Allocator drops.
Box<'a, T> and Vec<'a, T> are arena-backed versions of std::boxed::Box and std::vec::Vec.
The allocator grows by doubling chunk sizes. For best performance, reuse allocators with .reset() rather than creating new ones.

let allocator = Allocator::new();
let ret = Parser::new(&allocator, source, source_type).parse();
// ret.program contains arena-allocated AST nodes
// When `allocator` drops, all AST memory is freed

The AllocatorPool in the linter service recycles allocators across files to avoid repeated system allocation.
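
The "no individual frees, reuse via reset" idea can be shown with a toy arena built on a `Vec`. This is only an analogy under stated assumptions: `ToyArena` is a hypothetical type that hands out indices, whereas the real bump allocator hands out references tied to the allocator's lifetime and manages raw chunks of memory.

```rust
/// Hypothetical typed arena: values are appended to one backing `Vec`
/// and addressed by index; nothing is freed individually.
pub struct ToyArena<T> { items: Vec<T> }

impl<T> ToyArena<T> {
    pub fn new() -> Self { Self { items: Vec::new() } }

    /// Stores `value` and returns its index (a stand-in for an arena reference).
    pub fn alloc(&mut self, value: T) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }

    pub fn get(&self, id: usize) -> &T { &self.items[id] }

    pub fn len(&self) -> usize { self.items.len() }

    pub fn capacity(&self) -> usize { self.items.capacity() }

    /// Discards every value at once but keeps the backing memory,
    /// so the next "file" does not have to grow the arena again.
    pub fn reset(&mut self) { self.items.clear(); }
}
```

After `reset()`, the arena is empty but its capacity is unchanged, which is the property that makes recycling allocators across files cheaper than creating fresh ones.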

The Visitor Pattern (`oxc_ast_visit`)

Read-only traversal uses the Visit trait (generated). SemanticBuilder implements Visit to walk the AST in a single pass:

impl<'a> Visit<'a> for SemanticBuilder<'a> {
    fn visit_function(&mut self, func: &Function<'a>, flags: ScopeFlags) {
        // enter: create scope, process bindings
        walk_function(self, func, flags);  // recurse into children
        // exit: close scope
    }
}

The linter does NOT use the visitor pattern directly for rule dispatch. Instead, SemanticBuilder builds an AstNodes structure (a flat vec of nodes with parent pointers), and the linter iterates that flat list. Each AstNode has a kind() returning an AstKind enum, which rules pattern-match on:

fn run<'a>(&self, node: &AstNode<'a>, ctx: &LintContext<'a>) {
    let AstKind::BinaryExpression(expr) = node.kind() else { return };
    // inspect expr...
}
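
A flat node list with parent pointers can be modelled in a few lines. The sketch below is std-only and uses hypothetical names (`ToyKind`, `ToyNode`, `ancestors`, `debugger_inside_if`); it shows how a rule can iterate the list without recursion and still look upward through the tree.

```rust
/// Hypothetical node kinds, standing in for `AstKind`.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ToyKind { Program, IfStatement, DebuggerStatement }

/// One entry in the flat node list; `parent` indexes into the same list.
pub struct ToyNode { pub kind: ToyKind, pub parent: Option<usize> }

/// Follows parent links from `id` up to the root, collecting ancestor kinds.
pub fn ancestors(nodes: &[ToyNode], id: usize) -> Vec<ToyKind> {
    let mut out = Vec::new();
    let mut current = nodes[id].parent;
    while let Some(p) = current {
        out.push(nodes[p].kind);
        current = nodes[p].parent;
    }
    out
}

/// Iterates the flat list and returns the indices of `debugger`
/// statements that are nested somewhere inside an `if`.
pub fn debugger_inside_if(nodes: &[ToyNode]) -> Vec<usize> {
    nodes
        .iter()
        .enumerate()
        .filter(|(_, n)| n.kind == ToyKind::DebuggerStatement)
        .filter(|(i, _)| ancestors(nodes, *i).contains(&ToyKind::IfStatement))
        .map(|(i, _)| i)
        .collect()
}
```

Because every node is reachable by index, the dispatch loop is a simple linear scan, and context-sensitive rules pay for ancestor lookups only when they need them.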

Mutable traversal for the transformer uses oxc_traverse::Traverse trait with enter/exit hooks for each AST node type:

impl<'a> Traverse<'a, State> for MyTransform {
    fn enter_expression(&mut self, expr: &mut Expression<'a>, ctx: &mut TraverseCtx<'a, State>) {
        // pre-order: runs before children are visited; may mutate the expression
    }
    fn exit_expression(&mut self, expr: &mut Expression<'a>, ctx: &mut TraverseCtx<'a, State>) {
        // post-order: runs after children have been visited
    }
}
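
The difference between enter and exit hooks is easiest to see on a tiny tree. The following std-only sketch uses a hypothetical boxed `Expr` type, a hypothetical `ToyTraverse` trait, and a `FoldAdds` pass; it is not the `oxc_traverse` API, only the same pre-order/post-order shape.

```rust
/// Hypothetical expression tree (boxed here, not arena-allocated).
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical pass with pre-order and post-order hooks.
pub trait ToyTraverse {
    fn enter_expr(&mut self, _expr: &mut Expr) {}
    fn exit_expr(&mut self, _expr: &mut Expr) {}
}

/// Calls `enter_expr`, recurses into children, then calls `exit_expr`.
pub fn walk_expr<T: ToyTraverse>(pass: &mut T, expr: &mut Expr) {
    pass.enter_expr(expr);
    if let Expr::Add(lhs, rhs) = expr {
        walk_expr(pass, lhs);
        walk_expr(pass, rhs);
    }
    pass.exit_expr(expr);
}

/// Counts nodes on enter and folds `Add(Num, Num)` into `Num` on exit.
pub struct FoldAdds { pub visited: usize }

impl ToyTraverse for FoldAdds {
    fn enter_expr(&mut self, _expr: &mut Expr) { self.visited += 1; }

    fn exit_expr(&mut self, expr: &mut Expr) {
        // Children were already visited, so nested additions are folded by now.
        let folded = match expr {
            Expr::Add(lhs, rhs) => match (&**lhs, &**rhs) {
                (Expr::Num(a), Expr::Num(b)) => Some(a + b),
                _ => None,
            },
            Expr::Num(_) => None,
        };
        if let Some(n) = folded {
            *expr = Expr::Num(n); // replace the node in place
        }
    }
}
```

Folding in the exit hook matters: by the time the outer addition is processed, its left child has already been rewritten to a number, so `(1 + 2) + 3` collapses fully in one traversal.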

`UniquePromise` for Safety

The parser uses a zero-sized UniquePromise type to enforce at the type level that only one ParserImpl exists per thread at a time. This is needed for soundness of unsafe pointer operations in the lexer’s source cursor:

// Only Parser::parse() can create a UniquePromise
pub fn parse(self) -> ParserReturn<'a> {
    let unique = UniquePromise::new();  // private constructor
    let parser = ParserImpl::new(self.allocator, ..., unique);
    parser.parse()
}
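
The general technique, a zero-sized token that outside code cannot construct, can be sketched with std only. The names `promise::Promise`, `promise::with_promise`, and `Cursor` below are hypothetical. Note that this toy only shows how a private field makes the token unforgeable; it does not by itself enforce the one-parser-per-thread guarantee described above.

```rust
pub mod promise {
    /// Zero-sized proof token. The private unit field prevents code
    /// outside this module from constructing one.
    pub struct Promise(());

    /// The single sanctioned entry point: creates a token and hands it to `f`.
    pub fn with_promise<R>(f: impl FnOnce(Promise) -> R) -> R {
        f(Promise(()))
    }
}

/// Hypothetical cursor whose constructor demands the token by value,
/// so callers have to go through `promise::with_promise`.
pub struct Cursor<'s> {
    source: &'s str,
    pos: usize,
}

impl<'s> Cursor<'s> {
    pub fn new(source: &'s str, _proof: promise::Promise) -> Self {
        Self { source, pos: 0 }
    }

    /// Returns the next byte and advances, or `None` at end of input.
    pub fn next_byte(&mut self) -> Option<u8> {
        let byte = self.source.as_bytes().get(self.pos).copied();
        self.pos += byte.is_some() as usize;
        byte
    }
}
```

The token carries no data, so passing it around costs nothing at runtime; its only job is to make "who is allowed to create a cursor" a compile-time question.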

`RuleRunFunctionsImplemented` Optimization

At codegen time, the linter analyzes which Rule trait methods each rule actually overrides. This lets the dispatch loop skip calling run() on a rule that only implements run_once(), and vice versa. The generated RuleRunner impl carries this info as a const:

pub trait RuleRunner: Rule {
    const NODE_TYPES: Option<&AstTypesBitset>;  // which AST node kinds this rule cares about
    const RUN_FUNCTIONS: RuleRunFunctionsImplemented;  // which hooks are implemented
}

`AstTypesBitset` for Rule Filtering

Each rule declares which AstType variants it inspects (also computed at codegen time). Before running a file, the linter checks if the file’s AST contains any matching node types. If not, the rule is skipped entirely. For large files, rules are bucketed by AstType to avoid iterating all rules for each node.
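
The file-level filter amounts to a bitset intersection. Here is a minimal std-only sketch, assuming a hypothetical four-variant `NodeType` enum packed into a `u64`; `TypeBits`, `FilteredRule`, and `applicable` are illustrative names, and the real bitset covers far more node types.

```rust
/// Hypothetical node types; each discriminant is used as a bit position.
#[derive(Clone, Copy)]
pub enum NodeType { Debugger = 0, Binary = 1, Call = 2, Class = 3 }

/// Set of node types packed into one integer.
#[derive(Clone, Copy, PartialEq, Debug, Default)]
pub struct TypeBits(u64);

impl TypeBits {
    pub const fn empty() -> Self { Self(0) }

    /// Returns a copy of the set with `ty` added.
    pub const fn with(self, ty: NodeType) -> Self {
        Self(self.0 | (1 << (ty as u64)))
    }

    /// True if the two sets share at least one node type.
    pub const fn intersects(self, other: Self) -> bool {
        (self.0 & other.0) != 0
    }
}

pub struct FilteredRule {
    pub name: &'static str,
    /// `None` means the rule cannot be filtered and always runs.
    pub node_types: Option<TypeBits>,
}

/// Keeps only rules that could match at least one node type present in the file.
pub fn applicable(rules: &[FilteredRule], file_types: TypeBits) -> Vec<&'static str> {
    rules
        .iter()
        .filter(|r| r.node_types.map_or(true, |bits| bits.intersects(file_types)))
        .map(|r| r.name)
        .collect()
}
```

A file with no class declarations never pays for a class-only rule: the check is a single AND per rule, performed once per file rather than once per node.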

`Span` as `u32` Pair

Source positions use u32 instead of usize. This limits source files to 4 GiB but, on 64-bit targets, halves the size of every span (8 bytes instead of 16). This matters because spans are everywhere in the AST and keeping them small improves cache utilization.
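
The size difference is easy to check with `std::mem::size_of`. In this sketch `SmallSpan` and `WideSpan` are hypothetical stand-ins (the real type is `Span` in `oxc_span`), and `try_new` is an illustrative helper showing where the 4 GiB limit comes from.

```rust
use std::mem::size_of;

/// Compact span: two `u32` byte offsets.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SmallSpan { pub start: u32, pub end: u32 }

/// The alternative being avoided: pointer-sized offsets.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct WideSpan { pub start: usize, pub end: usize }

impl SmallSpan {
    /// Returns `None` if either offset does not fit in a `u32`.
    pub fn try_new(start: usize, end: usize) -> Option<Self> {
        Some(Self {
            start: u32::try_from(start).ok()?,
            end: u32::try_from(end).ok()?,
        })
    }

    /// Length of the span in bytes.
    pub fn len(self) -> u32 { self.end - self.start }
}

/// Returns (size of the compact span, size of the wide span) in bytes.
pub fn span_sizes() -> (usize, usize) {
    (size_of::<SmallSpan>(), size_of::<WideSpan>())
}
```

The compact span is always 8 bytes, while the wide one is two pointer widths, so the saving applies on 64-bit targets and disappears on 32-bit ones.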

Rayon for Parallelism

File processing is parallelized via rayon. Each file gets its own allocator (from the pool) and is parsed and linted independently. Diagnostics from all worker threads are collected through an mpsc channel owned by DiagnosticService.
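
The "many workers, one diagnostic channel" shape can be reproduced with the standard library alone. This sketch substitutes `std::thread::scope` for rayon (one thread per file rather than a work-stealing pool), and `lint_file` and `lint_all` are hypothetical names with a deliberately trivial check.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical per-file check: reports files that contain `debugger`.
fn lint_file(name: &str, source: &str) -> Option<String> {
    source
        .contains("debugger")
        .then(|| format!("{name}: unexpected debugger"))
}

/// Lints each file on its own scoped thread and collects the
/// diagnostics through a single mpsc channel.
pub fn lint_all(files: &[(&str, &str)]) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    thread::scope(|scope| {
        for &(name, source) in files {
            let tx = tx.clone(); // each worker gets its own sender
            scope.spawn(move || {
                if let Some(diag) = lint_file(name, source) {
                    tx.send(diag).expect("receiver is alive");
                }
            });
        }
    }); // all workers have finished once the scope returns
    drop(tx); // close the channel so the receiving iterator terminates
    let mut diagnostics: Vec<String> = rx.into_iter().collect();
    diagnostics.sort(); // arrival order depends on thread scheduling
    diagnostics
}
```

Sorting after collection mirrors the "collected, sorted, deduplicated" step of the pipeline: because workers finish in nondeterministic order, stable output has to be imposed on the receiving side.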

Connection to Your Projects

As someone who writes Rust (4 projects: tree, winmux, winpane, windows-cli-tools) and TypeScript (6 projects), here is how the Oxc crates map to contribution paths:

Best Starting Points for Contributing

oxc_linter rules - The most accessible entry point. Each rule is a self-contained file with clear structure, inline tests, and snapshot testing. You do not need to understand the parser or semantic analysis internals. Just pattern-match on AstKind variants and report diagnostics. The no_debugger rule above is under 100 lines including tests.
oxc_linter infrastructure - Config loading, fix application, diagnostic formatting. This is standard Rust (serde, file I/O, CLI plumbing) without the complexity of compiler internals.
oxc_codegen - If you are interested in code generation. Each AST node implements a Gen trait that prints itself. Adding support for new syntax or fixing formatting issues is localized.

Worth Understanding (but more complex)

oxc_transformer - Requires understanding the Traverse pattern and how Oxc mutates arena-allocated AST nodes. The ES2015-2026 modules are organized similarly to Babel plugins. Relevant if you use TypeScript and want to understand how TS stripping works.
oxc_semantic - The bridge between parsing and everything else. Understanding Scoping (scope tree + symbol table + reference resolution) is useful for writing rules that need to understand variable bindings, e.g., “is this variable used?” or “does this name refer to a global?”.

Deep Internals (for understanding, not casual contribution)

oxc_parser - A high-performance recursive descent parser. Changes here require deep JS/TS grammar knowledge and carry high correctness risk. The codebase prioritizes performance (unsafe code, SIMD in the lexer, arena allocation).
oxc_allocator - Bump allocator internals. You’ll use it as a consumer (just call Allocator::new()) but modifying it requires understanding memory layout and unsafe Rust.

Cross-Pollination with Your TypeScript Projects

The napi/parser, napi/transform, and napi/minify crates expose Oxc’s functionality to Node.js. If you integrate oxlint into your TypeScript projects, understanding the config format (.oxlintrc.json) and rule naming conventions is the practical starting point. The VS Code extension in editors/ is another path if you use Oxc in your editor workflow.

Your Windows Expertise

Oxc runs on Windows. The ignore crate handles path normalization, and the linter tests run on Windows CI. Since you write Windows CLI tools, you could contribute to platform-specific issues (path handling, terminal output, performance on NTFS) if you encounter them.