Extending Incan: Builtins vs New Syntax
This document is for contributors who want to add new language features.
Incan is implemented as a multi-stage compiler:
- Frontend: Lexer → Parser → AST → Typechecker (`typechecker/`)
- Backend: Lowering (AST → IR) → Emitter (IR → Rust)
That separation is intentional (clarity, correctness, debuggability), but it means that adding new syntax typically touches multiple stages.
Required reading (contributors)
Before making language changes, read these end-to-end:
- Incan Compiler Architecture — internal pipeline + module layout
- How Incan works — conceptual pipeline schematic
- RFC index — required for new language features (syntax/semantics), not for bugs/chores
Architecture schematic (high-level)
flowchart TD
incanSource[Incan_source_.incn] --> parser[Parse_and_validate]
parser --> typecheck[Typecheck_and_resolve]
typecheck --> lower[Lower_to_Rust]
lower --> cargo[Cargo_build]
cargo --> binary[Executable]
Where things live (crates and modules)
Incan’s “language surface” spans a small number of key crates/modules:
| Crate/Module | Purpose |
|---|---|
| `crates/incan_syntax` | Lexer/parser/AST/diagnostics (shared by compiler, formatter, and LSP to prevent drift) |
| `crates/incan_core` | Semantic registries + pure helpers shared across the ecosystem (should not drift) |
| `crates/incan_stdlib` | Runtime support for generated programs (preferred home for “just a function” behavior) |
| `crates/incan_derive` | Derives used by generated Rust programs (runtime-side) |
| `src/frontend` | Module resolution + typechecker (turns syntax into a typed program) |
| `src/backend` | Lowering + IR + emission (turns typed program into Rust) |
| `src/format/` | Source formatter (`incan fmt`) |
| `src/lsp/` | Language server (reuses frontend to provide IDE diagnostics) |
When you’re unsure where to implement something, start by deciding which crate owns the responsibility.
Rule of Thumb
Prefer a library/builtin over a new keyword.
Add a new keyword / syntax form only when the feature:
- Introduces control flow that cannot be expressed as a call (e.g. `match`, `yield`, `await`, `?`)
- Requires special typing rules that would be awkward or misleading as a function
- Needs non-standard evaluation of its operands (short-circuiting, implicit returns, pattern binding, etc.)
If the feature is “some behavior” (logging, printing, tracing, helpers), it should usually be:
- A stdlib function (preferred), or
- A compiler builtin (when it must lower to special Rust code).
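The "non-standard evaluation" criterion is easiest to see in the target language. The sketch below, in plain Rust with hypothetical helper names (`and_fn`, `noisy`), counts operand evaluations: a function call always evaluates both arguments before the body runs, while an operator can skip its right-hand side.

```rust
use std::cell::Cell;

/// Hypothetical "and" written as an ordinary function: both arguments are
/// evaluated by the caller before the body runs.
fn and_fn(lhs: bool, rhs: bool) -> bool {
    lhs && rhs
}

/// Returns `value` and records that it was evaluated.
fn noisy(counter: &Cell<u32>, value: bool) -> bool {
    counter.set(counter.get() + 1);
    value
}

/// Operand evaluations when `false and true` is expressed as a call.
fn evals_as_call() -> u32 {
    let counter = Cell::new(0);
    let _ = and_fn(noisy(&counter, false), noisy(&counter, true));
    counter.get()
}

/// Operand evaluations when the same expression uses the `&&` operator.
fn evals_as_operator() -> u32 {
    let counter = Cell::new(0);
    let _ = noisy(&counter, false) && noisy(&counter, true);
    counter.get()
}
```

Here `evals_as_call()` returns 2 and `evals_as_operator()` returns 1. A feature that depends on the second behavior cannot be a stdlib function or a call-shaped builtin without changing its meaning.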
Path A: Adding a Function (Stdlib or Compiler Builtin)
A.1: Stdlib function (no new syntax)
Use this when the behavior can live in a runtime support crate (e.g. `incan_stdlib`) without any compiler special-casing.
Typical work:
- Add the runtime implementation in `crates/incan_stdlib/`
- Expose it via the prelude if appropriate
- Document it in the language guide
This avoids changing the lexer/parser/AST/IR.
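As a rough illustration of how small this path can be, a runtime helper is just an ordinary Rust function. The module and function names below (`text`, `repeat_str`) are made up for the example and are not taken from `incan_stdlib`; how the prelude re-exports such a function is not shown.

```rust
/// Hypothetical runtime helper module; the names are illustrative and are
/// not part of the real `incan_stdlib` crate.
pub mod text {
    /// Repeats `s` exactly `times` times. Plain Rust: no lexer, parser,
    /// AST, or IR change is involved.
    pub fn repeat_str(s: &str, times: usize) -> String {
        s.repeat(times)
    }
}
```

Generated programs would then call something like `text::repeat_str("ab", 3)` as a normal function.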
A.2: Compiler builtin function (special lowering/emission)
Use this when you want a function-call surface syntax, but it must emit a particular Rust pattern.
Incan already has enum-dispatched builtins in IR (`BuiltinFn`) and emission logic in `emit/expressions/builtins.rs`.
End-to-end checklist:
- Frontend symbol table: add the builtin name and signature so it typechecks
  - `src/frontend/symbols.rs` → `SymbolTable::add_builtins()`
- IR builtin enum: add a new variant and name mapping
  - `src/backend/ir/expr.rs` → `enum BuiltinFn` + `BuiltinFn::from_name()`
- Lowering: ensure calls to that name lower to `IrExprKind::BuiltinCall`
  - `src/backend/ir/lower/expr.rs` uses `BuiltinFn::from_name(name)` for identifiers
- Emission: emit the Rust code for the new builtin
  - `src/backend/ir/emit/expressions/builtins.rs` → `emit_builtin_call()`
- Docs/tests: add/adjust as needed
This path is often much cheaper than adding new syntax, while still letting you control the generated Rust.
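The checklist above can be sketched in miniature. This is a simplified stand-in, not the compiler's code: the names `BuiltinFn`, `from_name`, and `emit_builtin_call` come from the checklist, but the variants, signatures, string-based emission, and the `lower_and_emit_call` helper are assumptions made for illustration.

```rust
/// Simplified stand-in for the IR builtin enum. The variants (`Print`, `Len`)
/// are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuiltinFn {
    Print,
    Len,
}

impl BuiltinFn {
    /// Maps a source-level name to a builtin, if the name is one.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "print" => Some(Self::Print),
            "len" => Some(Self::Len),
            _ => None,
        }
    }
}

/// Emits Rust source text for a builtin call. A real emitter might build a
/// token stream instead; strings keep the sketch short. Both arms assume a
/// single argument.
pub fn emit_builtin_call(builtin: BuiltinFn, args: &[String]) -> String {
    let joined = args.join(", ");
    match builtin {
        BuiltinFn::Print => format!("println!(\"{{}}\", {joined})"),
        BuiltinFn::Len => format!("({joined}).len()"),
    }
}

/// Hypothetical helper combining lowering and emission: builtin names take
/// the special path, every other name becomes a plain call.
pub fn lower_and_emit_call(name: &str, args: &[String]) -> String {
    match BuiltinFn::from_name(name) {
        Some(builtin) => emit_builtin_call(builtin, args),
        None => format!("{name}({})", args.join(", ")),
    }
}
```

Adding a builtin in this toy model means one new enum variant, one new `from_name` arm, and one new emission arm, which mirrors the IR, lowering, and emission steps of the checklist.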
A.3: Compiler builtin method (special method lowering/emission)
Use this when you want to add a method on existing types (e.g. `list.some_method()`) that needs special Rust emission.
Incan has enum-dispatched methods in IR (`MethodKind`) and emission logic in `emit/expressions/methods.rs`.
End-to-end checklist:
- IR method enum: add a new variant and name mapping
  - `src/backend/ir/expr.rs` → `enum MethodKind` + `MethodKind::from_name()`
- Lowering: automatic (uses `MethodKind::from_name(name)` for all method calls)
  - `src/backend/ir/lower/expr.rs` already handles this
- Emission: emit the Rust code for the new method
  - `src/backend/ir/emit/expressions/methods.rs` → `emit_known_method_call()`
- Docs/tests: add/adjust as needed
Unknown methods pass through as regular Rust method calls, so you don't break Rust interop by adding known methods.
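The pass-through behavior can be illustrated with a small model. `MethodKind`, `from_name`, and `emit_known_method_call` are named in the checklist; the variants, signatures, string-based output, and the `emit_method_call` wrapper here are assumptions for the sketch.

```rust
/// Simplified stand-in for the IR method enum; the variants are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MethodKind {
    Append,
    Upper,
}

impl MethodKind {
    /// Maps a method name to a known method, if the compiler special-cases it.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "append" => Some(Self::Append),
            "upper" => Some(Self::Upper),
            _ => None,
        }
    }
}

/// Emits Rust text for a method the compiler knows about.
fn emit_known_method_call(kind: MethodKind, receiver: &str, args: &[String]) -> String {
    let joined = args.join(", ");
    match kind {
        MethodKind::Append => format!("{receiver}.push({joined})"),
        MethodKind::Upper => format!("{receiver}.to_uppercase()"),
    }
}

/// Hypothetical wrapper: known methods get special emission, unknown ones
/// pass through unchanged as regular Rust method calls.
pub fn emit_method_call(name: &str, receiver: &str, args: &[String]) -> String {
    match MethodKind::from_name(name) {
        Some(kind) => emit_known_method_call(kind, receiver, args),
        None => format!("{receiver}.{name}({})", args.join(", ")),
    }
}
```

In this model `xs.append(1)` becomes `xs.push(1)`, while an unrecognized call such as `xs.iter()` is emitted verbatim, so adding a known method only changes the output for that one name.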
Path B: Add a New Keyword / Syntax Form
Use this only when the feature is genuinely syntactic/control-flow.
End-to-end checklist (typical):
Lexer: `crates/incan_syntax/src/lexer/*`
- Add a `KeywordId` and a `KEYWORDS` entry (canonical spelling/metadata) in `crates/incan_core/src/lang/keywords.rs`
- Ensure tokenization emits `TokenKind::Keyword(KeywordId::YourKeyword)`
- Update lexer parity tests (keyword/operator/punctuation registry parity)
Word-operators (special case)
If the new “keyword” is meant to behave like an operator (it participates in expression precedence like `and`, `or`, `not`, `in`, `is`), treat it as a word-operator:
- Add it to `crates/incan_core/src/lang/operators.rs` (precedence/fixity source of truth)
- Add a corresponding `KeywordId` + `KEYWORDS` entry in `crates/incan_core/src/lang/keywords.rs` (so the lexer will still lex it as a keyword)
- Update expression parsing in `crates/incan_syntax/src/parser/expr.rs` to place it at the right precedence level
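To make "the right precedence level" concrete, here is a minimal precedence-climbing sketch over whitespace-separated tokens. The precedence numbers and function names (`binary_precedence`, `parse_expr`, `group`) are invented for the example and do not reflect Incan's actual operator table or parser; the unary `not` is left out, and empty input is not handled.

```rust
/// Binding power for binary word-operators (higher binds tighter).
/// The numbers are illustrative, not Incan's actual precedence table.
fn binary_precedence(word: &str) -> Option<u8> {
    match word {
        "or" => Some(1),
        "and" => Some(2),
        "in" | "is" => Some(3),
        _ => None,
    }
}

/// Precedence climbing; returns a fully parenthesized string so the
/// grouping chosen by the parser is visible.
fn parse_expr(tokens: &[&str], pos: &mut usize, min_prec: u8) -> String {
    let mut lhs = tokens[*pos].to_string(); // operand (identifier or literal)
    *pos += 1;
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let prec = match binary_precedence(op) {
            Some(p) if p >= min_prec => p,
            _ => break,
        };
        *pos += 1;
        // Left-associative: the right side must bind strictly tighter.
        let rhs = parse_expr(tokens, pos, prec + 1);
        lhs = format!("({lhs} {op} {rhs})");
    }
    lhs
}

/// Convenience entry point: tokenize on whitespace and parse from the start.
fn group(source: &str) -> String {
    let tokens: Vec<&str> = source.split_whitespace().collect();
    parse_expr(&tokens, &mut 0, 1)
}
```

With this table, `group("a or b and c")` yields `(a or (b and c))`. Adding a word-operator to the sketch is one new arm in `binary_precedence`, which is the role a single source-of-truth operator registry plays.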
Parser: `crates/incan_syntax/src/parser/*`
- Parse the syntax and build a new AST node (usually an `Expr` or `Statement` variant)
AST: `crates/incan_syntax/src/ast.rs`
- Add the new `Expr::<YourNode>` or `Statement::<YourNode>` variant
Formatter: `src/format/formatter.rs`
- Teach the formatter how to print the new node
Typechecker: `src/frontend/typechecker/`
- `check_decl.rs` – add type-level rules (models, classes, traits)
- `check_stmt.rs` – add statement-level rules (assignments, control flow)
- `check_expr/*.rs` – add expression-level rules (calls, operators, match)
(Optional) Scanners: `src/backend/ir/scanners.rs`
- Ensure feature detection traverses the new node if relevant
Lowering (AST → IR): `src/backend/ir/lower/*`
- Lower the new AST node into an IR representation
IR (if needed): `src/backend/ir/expr.rs` / `stmt.rs` / `decl.rs`
- Add a new `IrExprKind`/`IrStmtKind` variant if the feature is not expressible via existing IR
Emitter (IR → Rust): `src/backend/ir/emit/**/*.rs`
- Emit correct Rust for the new IR node
Editor tooling (optional but recommended): `editors/*`
- `editors/vscode/*`: keyword highlighting / indentation patterns
Docs + tests
- Add a guide snippet and at least one parse/typecheck/codegen regression test
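To show how many stages one keyword touches, the sketch below pushes a made-up `unless` keyword through toy versions of the lexer, parser, AST, lowering, and emitter. Everything here is hypothetical: the grammar (`<keyword> <cond> <body>` over pre-split words), the type shapes, and the string-based IR are simplifications, and the formatter and typechecker stages are omitted.

```rust
/// Hypothetical registry; the real `KeywordId`/`KEYWORDS` carry more metadata.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeywordId {
    If,
    Unless, // the new, made-up keyword
}

pub const KEYWORDS: &[(&str, KeywordId)] =
    &[("if", KeywordId::If), ("unless", KeywordId::Unless)];

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TokenKind {
    Keyword(KeywordId),
    Ident(String),
}

/// Lexer: a word is a keyword if the registry knows it, else an identifier.
pub fn lex_word(word: &str) -> TokenKind {
    KEYWORDS
        .iter()
        .find(|(spelling, _)| *spelling == word)
        .map(|(_, id)| TokenKind::Keyword(*id))
        .unwrap_or_else(|| TokenKind::Ident(word.to_string()))
}

/// AST: one new variant for the new syntax form.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Statement {
    If { cond: String, body: String },
    Unless { cond: String, body: String },
}

/// Parser for the toy grammar `<keyword> <cond> <body>`.
pub fn parse_statement(words: &[&str]) -> Option<Statement> {
    let (cond, body) = (words.get(1)?.to_string(), words.get(2)?.to_string());
    match lex_word(words.first()?) {
        TokenKind::Keyword(KeywordId::If) => Some(Statement::If { cond, body }),
        TokenKind::Keyword(KeywordId::Unless) => Some(Statement::Unless { cond, body }),
        TokenKind::Ident(_) => None,
    }
}

/// IR: `unless` is expressible via the existing `If`, so no new IR variant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum IrStmt {
    If { cond: String, body: String },
}

/// Lowering: desugar `unless c` into `if !(c)`.
pub fn lower(stmt: &Statement) -> IrStmt {
    match stmt {
        Statement::If { cond, body } => IrStmt::If { cond: cond.clone(), body: body.clone() },
        Statement::Unless { cond, body } => {
            IrStmt::If { cond: format!("!({cond})"), body: body.clone() }
        }
    }
}

/// Emitter: IR to Rust source text.
pub fn emit(ir: &IrStmt) -> String {
    match ir {
        IrStmt::If { cond, body } => format!("if {cond} {{ {body}; }}"),
    }
}
```

Parsing `["unless", "ready", "wait()"]`, lowering, and emitting produces `if !(ready) { wait(); }`. Even in this reduced form the keyword needed a registry entry, a token, an AST variant, a parser arm, and a lowering arm, which is why the builtin paths are usually cheaper.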
Practical guidance
- If you find yourself adding a keyword to achieve “a function with a special implementation”, pause and consider making it a builtin function instead.
- If you add a new AST/IR enum variant, rely on Rust’s exhaustiveness errors as your checklist: the compiler will tell you which match arms you need to update.
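The exhaustiveness tip only works when matches avoid a catch-all arm. A small sketch, using a toy `IrExprKind` whose variants are invented for the example (the real enum differs):

```rust
/// Toy IR expression kinds; `Neg` plays the role of a newly added variant.
pub enum IrExprKind {
    Int(i64),
    Add(Box<IrExprKind>, Box<IrExprKind>),
    Neg(Box<IrExprKind>),
}

/// There is no `_ =>` arm: deleting the `Neg` arm below makes rustc reject
/// the match with error E0004 (non-exhaustive patterns), which is the
/// "checklist" effect described above.
pub fn emit(expr: &IrExprKind) -> String {
    match expr {
        IrExprKind::Int(n) => n.to_string(),
        IrExprKind::Add(l, r) => format!("({} + {})", emit(l), emit(r)),
        IrExprKind::Neg(inner) => format!("(-{})", emit(inner)),
    }
}
```

A wildcard arm would compile silently after a new variant is added, so matches over AST/IR enums are best written without one wherever each variant needs distinct handling.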