RFC 017: Validated newtypes with implicit coercion (pydantic-like feel)¶
Status: Planned
Created: 2026-01-12
Target: v0.2
0.1 vs 0.2 (what’s implemented when)¶
This RFC describes the intended v0.2 direction.
For v0.1 stabilization, we may land only the minimum foundations needed so newtype construction and model/class construction semantics “make sense” (e.g., reserved hook recognition, no validation bypass paths, better diagnostics).
Area¶
- Incan Language (syntax/semantics)
- Runtime / Core crates (stdlib/core/derive)
Summary¶
Introduce a pydantic-like “validated boundary” for newtype by standardizing a canonical validation/conversion hook on
every newtype and allowing the compiler to perform implicit coercion from the underlying type in well-defined contexts
(function arguments, typed initializers, and model/class field initialization). Invalid input fails fast with a
ValidationError panic by default to preserve “pydantic feel”.
Because panics can be a poor fit for reusable library code and long-running services, the proposal also keeps an explicit
Result-returning constructor for recoverable flows, and introduces opt-out controls (e.g. @no_implicit_coercion) plus
structured/aggregated validation errors for model/class initialization.
Motivation¶
We want guardrails for common primitives (especially numerics) without forcing boilerplate everywhere. In practice, many values originate from boundaries (CLI/config/env/API), and teams want “it just works”-ergonomics with strong invariants and high-quality error messages.
Today, users can implement validation manually (e.g. helper constructors returning Result), but this requires explicit
calls at every use site. The desired “pydantic feel” is:
- pass raw values at boundaries
- have them coerced + validated automatically
- get a clear
ValidationErroron invalid inputs
Concrete “pydantic-like” types this model can express (v1 targets):
- Constrained numerics (pure):
PositiveInt,NonNegativeInt, “finite float”, etc. - Parsing/normalization types (pure): e.g.
HttpUrl,UUID4,IPvAnyAddressimplemented as parse + normalize (no DNS/network IO). - Human-friendly boundary parsing stays explicit: e.g.
BytesCount = newtype int[ge=0]plus an explicitparse_byte_size(str) -> Result[BytesCount, ValidationError](v1 does not introduce implicitstr -> intparsing). - Redaction/masking types: e.g.
SecretStrcan be a normalnewtype str; masking is a formatter/serializer concern (outside this RFC’s compiler changes).
Conversely, some pydantic types are intentionally out-of-scope for v1 implicit coercion because they are non-deterministic
(filesystem existence checks like FilePath, clock-based checks like PastDate, etc.). Those can still exist as explicit
constructors, or via a future “impure validation” extension.
Proposal sketch¶
Newtype validation/conversion hook (canonical)¶
For a newtype:
type Attempts = newtype int:
def from_underlying(n: int) -> Result[Attempts, ValidationError]:
if n <= 0:
return Err(ValidationError("attempts must be >= 1"))
return Ok(Attempts(n))
Notes:
from_underlyingis the canonical “validated conversion” from the newtype’s underlying type.- It is defined in the newtype body (supported today by parser); there is no separate
implblock in Incan. - If a newtype does not define
from_underlying, the compiler auto-generates a default implementation:from_underlying(x) = Ok(T(x))(no extra validation).- If the underlying type includes constraints (e.g.
int[gt=0]), the generatedfrom_underlyingenforces those constraints.
from_underlyingis treated as a reserved hook name for newtypes (compiler-recognized for coercion/validation).- Contract (important for predictability and tooling):
from_underlyingmust be deterministic and side-effect free (no IO, no global mutation, no time/random dependence, no randomized hashing dependence).from_underlyingmust not panic; it returnsErr(ValidationError)on invalid input.- On
Ok(...), it must return exactly the validated newtype instance corresponding to the input (no “partial” values). - This intentionally rules out validators that consult the filesystem, the clock, randomness, or the network; those require explicit validation, or a future opt-in extension with tighter guardrails.
Relationship to TryFrom[T] (stdlib):
- Incan already has a
TryFrom[T]trait with@classmethod def try_from(cls, value: T) -> Result[Self, str]. - This RFC introduces
from_underlying(...) -> Result[T, ValidationError]as a dedicated hook because:- we want a structured
ValidationError(paths, chains, constraint metadata) rather than a plainstr, and - Incan does not yet have
implblocks for newtypes, so a compiler-recognized hook inside the newtype body keeps the surface area minimal.
- we want a structured
- Future-friendly: the compiler could auto-provide
TryFrom[U]for a newtype by delegating tofrom_underlyingand stringifying the error, butfrom_underlyingremains the canonical validation contract.
Constrained underlying type syntax (nice surface form)¶
To avoid the semantic implication that int(...) accepts constraint keyword arguments at runtime, represent constraints
as type-level parameters using bracket syntax:
(Inspiration: this is similar in spirit to Python's typing.Annotated[...] constraints and pydantic's constrained types.
The goal is “feel”, not copying pydantic syntax.)
type NonNegativeInt = newtype int[ge=0]
type PositiveInt = newtype int[gt=0]
Semantics:
int("5")remains a plain conversion toint(no constraints).- The constraints live on the constrained type (
int[ge=0]) and are enforced by the newtype’s validation hook (from_underlying) and by implicit coercion sites. - This keeps constraint syntax in the “type parameter” family (similar to
list[T],Result[T, E]) rather than looking like a runtime call with keyword arguments. - For v1, constrained primitive types are intended to be used as the underlying type of a
newtype(not as a general “drop-in replacement” forint/floateverywhere). This keeps the surface area small while preserving the feel.
Constraint vocabulary (v1, proposal):
int[...]:ge,gt,le,lt(values must be compile-time constants: literals or namedconsts)float[...](optional for v1):ge,gt,le,lt(values must be compile-time constants: literals or namedconsts)- IEEE note: comparisons with NaN are always false, so NaN should fail all constraints by default.
Constraint semantics:
ge: greater than or equal (\(\ge\))gt: greater than (\(>\))le: less than or equal (\(\le\))lt: less than (\(<\))
Syntax notes:
- Multiple constraints are allowed:
int[ge=0, lt=10](order-insensitive; all constraints must hold). - Constraint parameters are compile-time constants (not arbitrary expressions).
- Only a single constraint block is permitted on a primitive type (e.g.
int[...]cannot be followed by another[...]). - Duplicate keys are rejected (e.g.
int[gt=0, gt=1]is an error).
Newtype-on-newtype (transitive coercion)¶
Newtypes should be able to wrap other newtypes:
type PositiveInt = newtype int[gt=0]
type RetryAttempts = newtype PositiveInt
In this case, implicit coercion should be transitive:
- expected
RetryAttempts, gotint- coerce
int -> PositiveInt(validategt=0) - then coerce
PositiveInt -> RetryAttempts
- coerce
Because each newtype has exactly one underlying type, the coercion chain is deterministic. The compiler should reject cycles (if they can be formed) with a clear diagnostic.
Implicit coercion sites (compiler rewriting)¶
When a context expects a validated newtype T_target but receives a value of some type S0, the compiler may insert
implicit coercions consisting only of validated newtype conversions (from_underlying), potentially chained across
nested newtypes.
Critically, the compiler does not insert ambient primitive conversions (e.g. str -> int, int -> float) as part of
implicit coercion. If you want parsing/coercion between primitives, write it explicitly at the call site (e.g. int("5")).
Proposed contexts¶
- Function arguments
- Typed initializers
- Model/class field initialization
- Newtype construction (checked by default;
T(x)treated as a coercion/validation site)
Controls & policy (avoid hidden panics)¶
Implicit coercion is intended as a boundary ergonomics feature, but it can introduce hidden panics in reusable library code (especially when used via function arguments). Provide opt-out controls so library/service authors can make call sites explicit.
Failure behavior (pydantic-like)¶
If from_underlying(...) returns Err(e) during implicit coercion, panic with a ValidationError (or wrap the returned
e into a ValidationError), similar to pydantic raising/throwing a validation exception.
Aggregation policy:
- Function arguments and typed initializers: fail fast (panic on first invalid coercion).
- Model/class field initialization: aggregate all field coercion errors and panic once with a multi-error
ValidationErrorfor better diagnostics.
Impact / compatibility¶
- Behavioral: introduces implicit conversions at specific boundaries; must be carefully scoped to avoid “magic everywhere”.
- Runtime: implicit coercion failure panics; this must be clearly documented as a boundary feature.
- Performance: newtypes should remain zero-cost wrappers in generated code; validation cost is paid only at coercion sites
and should not allocate/box unless the user’s
from_underlyingdoes so.
Implementation notes (high-level)¶
Likely layers:
- Syntax/parser: support constrained primitive type syntax like
int[gt=0]as a type-level annotation. - Frontend/typechecker: detect coercion sites and insert validated coercion nodes.
- Lowering/codegen: emit calls to
from_underlyingand panic with rich errors on failure. - Stdlib/core: define
ValidationErrorshape and formatting. - Docs: document coercion behavior and explicit alternatives.
- Tests: parser + typechecker + codegen + runtime behavior.
Unresolved questions¶
- Panic strategy: unwind vs abort for
ValidationErrorpanics. - Constraint catalog scope for v1.
- Non-panicking coercion mode (
Resultmode). - Serialization/FFI boundaries.
Checklist (comprehensive)¶
Spec / semantics¶
- [ ] Lock down the canonical hook name and signature:
- [ ]
from_underlying(underlying: U) -> Result[Self, ValidationError]as the compiler-recognized entrypoint - [ ] deterministic / side-effect free contract, and “must not panic” rule
- [ ]
- [ ] Define the failure policy precisely per coercion site:
- [ ] function args + typed initializers: fail-fast (panic on first invalid coercion)
- [ ] model/class construction: aggregate all field coercion failures and panic once
- [ ] Define transitive coercion rules for newtype-on-newtype (chain building + cycle detection).
- [ ] Define opt-out controls (library-friendly):
- [ ]
@no_implicit_coercion(type-level) and/or per-site opt-out - [ ] “Result mode” / explicit construction guidance
- [ ]
- [ ] Define constraint semantics for
int[...]/float[...]:- [ ] permitted keys (
ge/gt/le/lt) + compile-time const rules for values - [ ] NaN policy for float constraints
- [ ] duplicate keys + incompatible constraint combinations diagnostics
- [ ] permitted keys (
Syntax / AST¶
- [ ] Parser: support constrained primitive type syntax in type position (e.g.
int[gt=0],float[ge=0.0]). - [ ] AST: represent constrained types in a structured way (not as ad-hoc strings).
- [ ] Formatter: preserve / normalize constraint formatting (order-insensitive, stable output).
Frontend (typechecker)¶
- [ ] Recognize
from_underlyingas a reserved newtype hook and validate its shape:- [ ] static (no receiver)
- [ ] exactly one parameter of the underlying type (including constrained underlying types)
- [ ] return type is
Result[Newtype, ValidationError]
- [ ] For newtypes without
from_underlying, define and implement the auto-generated default behavior:- [ ]
Ok(T(x))if no constraints - [ ] enforce constraints from underlying type if present
- [ ]
- [ ] Insert implicit coercions at the approved sites:
- [ ] function arguments
- [ ] typed initializers
- [ ] model/class field initialization
- [ ] newtype construction
T(x)(checked by default)
- [ ] Ensure the compiler does not insert ambient primitive conversions (no
str -> int, noint -> float). - [ ] Diagnostics quality:
- [ ] span-accurate errors pointing at the call site / field initializer
- [ ] actionable hints (show explicit
T.from_underlying(...)/try_...alternatives) - [ ] include coercion chain in errors (e.g.
int -> PositiveInt -> RetryAttempts)
Lowering / IR¶
- [ ] Introduce an explicit IR representation for “validated coercion” (preferred) or document the exact rewriting strategy.
- [ ] Ensure lowering avoids recursion traps (e.g.
T(x)insideimpl T/ inside the hook itself). - [ ] Preserve deterministic evaluation order for default values and coercion calls.
Codegen (Rust emission)¶
- [ ] Emit
from_underlying(...)calls at coercion sites. - [ ] Emit failure behavior:
- [ ] fail-fast sites: panic with
ValidationError(include message + path context) - [ ] aggregated sites: collect multiple errors and panic once
- [ ] fail-fast sites: panic with
- [ ] Keep generated code zero-cost outside coercion sites (no extra allocations on happy path beyond user code).
- [ ] Decide and document when generated code may use
.expect(...)/.unwrap(...)vs structured error helpers.
Runtime / core crates¶
- [ ] Define canonical
ValidationErrortype:- [ ] structured fields (path, kind, message, optional nested causes)
- [ ] stable formatting (human-readable + optionally machine-readable)
- [ ] Provide helper APIs for:
- [ ] creating per-field errors
- [ ] aggregating multiple errors
- [ ] panic/raise helpers with
#[track_caller]
Tooling / docs¶
- [ ] Update Book + Reference docs:
- [ ] explain
from_underlying - [ ] document implicit coercion sites and opt-out controls
- [ ] show explicit alternatives (
Result-returning flows)
- [ ] explain
- [ ] Add examples:
- [ ] constrained numeric newtypes
- [ ] parsing/normalization newtypes (pure)
- [ ] model/class aggregated validation example
Testing (must be comprehensive)¶
- [ ] Parser tests for constraint syntax and errors.
- [ ] Typechecker tests:
- [ ] correct insertion of coercions at each site
- [ ] no ambient primitive conversions
- [ ] transitive coercion chains + cycle detection errors
- [ ] spans + hint text assertions for key diagnostics
- [ ] Codegen snapshot tests:
- [ ] emitted
from_underlyingcalls at each coercion site - [ ] emitted aggregation logic for model/class initialization
- [ ] emitted
- [ ] Runtime behavior tests:
- [ ] panic messages include correct path/field context
- [ ] aggregated errors contain all invalid fields