RFC 055: std.fs - pathlib-shaped filesystem APIs with chunked file I/O
- Status: Implemented
- Created: 2026-04-11
- Author(s): Danny Meijer (@dannymeijer)
- Related:
  - RFC 000 (core language builtins, including whole-file helpers)
  - RFC 005 (Rust interop)
  - RFC 010 (temporary filesystem objects)
  - RFC 022 (namespaced stdlib modules and compiler handoff)
  - RFC 023 (compilable stdlib and Rust module binding)
  - RFC 041 (first-class Rust interop authoring)
  - RFC 056 (std.io in-memory byte cursors)
- Issue: https://github.com/dannys-code-corner/incan/issues/286
- RFC PR: —
- Written against: v0.2
- Shipped in: v0.3
Summary
This RFC introduces std.fs as Incan's path-centric filesystem module. The compatibility target for the core path surface is CPython 3.14 pathlib.Path, while Incan also exposes explicit filesystem extensions inspired by Rust's std::fs where pathlib is intentionally incomplete. The module should be broad enough that users can stay inside std.fs.Path for ordinary filesystem work, including metadata, traversal, copy, move, linking, cleanup, and disk-usage operations, without needing a separate os- or shutil-shaped layer. The contract is not limited to whole-file helpers: std.fs must also provide chunked file I/O through Path.open(...) and a File surface with bounded reads, writes, seeking, and durability operations so large files do not require loading the entire payload into memory.
Motivation
Incan already exposes minimal whole-file builtins for the common text case, but that is not a complete filesystem story. Programs need ordinary path operations, metadata, directory lifecycle operations, and binary-safe reads and writes. They also need a truthful large-file path: authors must be able to open a file and process it incrementally in chunks rather than pretending read_bytes() is sufficient for every workload. Today, authors fall through to rust::std::fs and rust::std::io, which works but fragments documentation, discoverability, and examples. A namespaced stdlib surface matches RFC 022's direction and gives the project one place to define contracts for path behavior, binary I/O, durability, and errors while letting the runtime stay Rust-native underneath.
Goals
- Provide std.fs.Path as Incan's primary filesystem entry point, with method names and grouping that should match CPython 3.14 pathlib.Path wherever semantics and language grammar allow.
- Make large files chunkable from day one: the std.fs contract must include open(...), bounded read(size), exact read_exact(size), write(...), tell(), and seek(...) on a file type so streaming workloads do not depend on rust::.
- Keep the API path-centric for ordinary tasks: tutorials should center Path, not ad hoc compiler builtins or direct rust::std::fs usage.
- Expose a small set of explicit Incan extensions where pathlib is not enough for everyday work, including honest existence checks, recursive delete, path-centric copy and move operations, structured open flags, and durability primitives.
- Keep Rust interop (RFC 005) as the escape hatch for advanced or host-specific behavior such as memory mapping, ACL-specific operations, or exotic platform knobs.
Non-Goals
- Defining async filesystem APIs in this RFC.
- Standardizing in-memory byte cursors here; that belongs to RFC 056 (std.io).
- Mirroring Rust's std::fs names one-to-one; Rust remains the implementation substrate, not the tutorial vocabulary.
- Introducing a parallel std.os or std.shutil module for common path-owned chores unless a later RFC shows that std.fs is the wrong home.
- Importing every newer pathlib addition mechanically. CPython 3.14 remains the baseline reference, but Incan should still choose a coherent path-centric surface rather than mirroring every adjacent Python helper without judgment.
Guide-level explanation
Authors work with std.fs.Path the way they would with pathlib.Path: they join components, inspect lexical parts, create directories, and read or write files using path methods.
    from std.fs import Path

    model = Path("model.bin")
    data = model.read_bytes()?

    out = Path("out") / "copy.bin"
    out.write_bytes(data)?

    cfg_dir = Path("config")
    if not cfg_dir.exists():
        cfg_dir.mkdir(parents=True)?
When the file is too large to load all at once, authors open it and process bounded chunks.
    from std.fs import Path

    fh = Path("video.bin").open("rb")?
    header = fh.read_exact(16)?
    chunk = fh.read(8192)?
    offset = fh.tell()?
The mental model is simple: Path owns path-based operations; File owns open-file streaming; whole-file helpers are convenience operations, not the only supported route.
Reference-level explanation
Compatibility target and extensions
- The standard library must expose std.fs for path-based filesystem work.
- std.fs.Path should follow CPython 3.14 pathlib.Path for method spelling and behavior when that surface already exists there.
- When Incan exposes capabilities that are outside pathlib proper, the docs must describe them as explicit Incan extensions rather than implying they are standard pathlib behavior.
- The extension set includes try_exists, copy, copy_into, move, move_into, remove_tree, scandir, disk_usage, OpenOptions, File.sync, and File.sync_data.
- CPython 3.14 compatibility is a baseline, not a wholesale import rule: Incan should adopt the pieces that strengthen a coherent path-centric filesystem module, not every adjacent helper automatically.
- Implementations may use Rust std::fs, std::io, and related crates internally, but user-visible semantics are defined by this RFC and stdlib docs, not by Rust's type names.
Required capabilities
std.fs must provide the following baseline:
- Path(path: str | Path) -> Path.
- Path joining via / and/or joinpath(...).
- Lexical properties parent, name, suffix, and stem.
- Predicates exists(), is_file(), is_dir(), and is_symlink() where the host OS supports symlink inspection.
- try_exists() -> Result[bool, E] as the honest existence probe when callers must distinguish "missing" from "could not determine".
- mkdir(...) with parents / exist_ok style options.
- Whole-file binary helpers read_bytes() and write_bytes(...).
- open(...) with the full Python-style mode family: "r", "w", "a", "x", "rb", "wb", "ab", "xb", "r+", "w+", "a+", "x+", "rb+", "wb+", "ab+", and "xb+".
- A File surface that must support read(size), read_exact(size), write(data), tell(), seek(offset, whence=0), sync(), and sync_data().
Large-file behavior is normative:
- File.read(size) must return at most size bytes and must not require loading the remainder of the file into memory.
- File.read(size) must return an empty bytes value at EOF rather than failing.
- File.read_exact(size) must fail if fewer than size bytes remain.
- Path.read_bytes() remains a valid convenience API, but it must be documented as a whole-file helper rather than the preferred route for large inputs.
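The read rules above can be illustrated with a Python analogue, since Python's binary file objects already return at most size bytes from read(size) and an empty bytes value at EOF. This is a sketch rather than Incan code: read_exact below is a hypothetical helper, and raising EOFError stands in for the Err(...) that File.read_exact would return.

```python
import tempfile


def read_exact(fh, size: int) -> bytes:
    """Return exactly `size` bytes or raise EOFError (hypothetical helper)."""
    buf = bytearray()
    remaining = size
    while remaining > 0:
        chunk = fh.read(remaining)  # bounded: never asks for more than is still needed
        if not chunk:               # empty bytes means EOF arrived early
            raise EOFError(f"wanted {size} bytes, got {len(buf)}")
        buf.extend(chunk)
        remaining -= len(chunk)
    return bytes(buf)


with tempfile.TemporaryFile() as fh:
    fh.write(b"0123456789")
    fh.seek(0)
    header = read_exact(fh, 4)   # exactly 4 bytes
    chunk = fh.read(100)         # at most 100 bytes; only 6 remain
    at_eof = fh.read(100)        # empty bytes at EOF, not an error
```

The distinction matters for parsers: a fixed-size header should use the exact read so truncation is reported, while a streaming loop should use the bounded read and stop on the empty result.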
Durability semantics are also normative:
- Successful write(...), write_bytes(...), or object drop must not by themselves imply crash-safe persistence.
- sync() is the explicit durability operation and must request persistence of file content and associated metadata.
- fsync() must exist as an alias of sync() with identical semantics.
- sync_data() is the lighter durability operation and may omit metadata that is not required for data visibility, subject to host-platform behavior documented by the stdlib.
open(...) mode semantics are also normative:
- "r" / "rb" must open an existing file for reading and fail if the path does not exist.
- "w" / "wb" must open for writing, creating the file if needed and truncating it if it already exists.
- "a" / "ab" must open for append, creating the file if needed and writing new data at the end.
- "x" / "xb" must open for exclusive creation and fail if the target already exists.
- Modes containing "+" must permit both reading and writing.
- Modes containing "b" must use binary I/O.
- Modes without "b" must use text I/O and therefore participate in the encoding / newline rules defined elsewhere in this RFC.
Text I/O defaults are also normative:
- Text mode must default to encoding="utf-8" and errors="strict".
- read_text(...) and text-mode open(...) must use universal newline handling on read, accepting "\n", "\r\n", and "\r" as line boundaries and normalizing them in the returned text.
- write_text(...) and text-mode open(...) must default to writing "\n" line endings unless newline=... is explicitly provided.
- Callers may override encoding, errors, and newline when interoperating with legacy systems or external formats that require different text conventions.
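Python's text-mode open(...) implements the same universal-newline idea, so it can serve as a rough illustration of the read and write rules above. This is an analogue, not Incan code, and the file names are made up for the example. One difference worth noting: Python's default on write translates "\n" to the host line separator, so the sketch passes newline="\n" explicitly to mirror the default this RFC specifies.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "mixed.txt")

# Write raw bytes so all three newline conventions reach disk untouched.
with open(path, "wb") as fh:
    fh.write(b"unix\nwindows\r\nold-mac\rend")

# newline=None enables universal newline handling on read.
with open(path, "r", encoding="utf-8", errors="strict", newline=None) as fh:
    text = fh.read()

# newline="\n" writes "\n" verbatim instead of translating to os.linesep.
out_path = os.path.join(tmp_dir, "out.txt")
with open(out_path, "w", encoding="utf-8", newline="\n") as fh:
    fh.write("a\nb\n")
with open(out_path, "rb") as fh:
    raw = fh.read()
```

After the read, text contains only "\n" line boundaries, and raw holds the bytes exactly as written regardless of host platform.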
Existence-query behavior is also normative:
- exists(), is_file(), is_dir(), and is_symlink() may follow CPython 3.14's ergonomic bool style, where callers get false for missing or inaccessible paths instead of a raised OS error.
- try_exists() exists specifically because the bool style is lossy: it must return Ok(false) only when absence is known, and Err(...) when the runtime cannot determine existence without ambiguity.
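The difference between the lossy predicates and the honest probe can be sketched in Python. The helper below is hypothetical and only approximates the contract: a returned bool plays the role of Ok(...), and a propagated OSError plays the role of Err(...). Treating only "file not found" as known absence is an assumption of this sketch.

```python
import os
import tempfile


def try_exists(path: str) -> bool:
    """Return True/False when existence is known; raise OSError when it is not."""
    try:
        os.stat(path)
    except FileNotFoundError:
        return False  # absence is known
    # Any other OSError (for example PermissionError) propagates:
    # the probe failed, so existence is unknown rather than false.
    return True


tmp_dir = tempfile.mkdtemp()
present = try_exists(tmp_dir)
absent = try_exists(os.path.join(tmp_dir, "missing.txt"))
```

A bool-style exists() would collapse the propagated error into False, which is exactly the ambiguity the extension is meant to remove.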
Copy, move, and tree-removal behavior is also normative:
- copy(...) and copy_into(...) must work for both regular files and directory trees.
- If follow_symlinks is true, copying a symlink must copy the symlink target's contents. If follow_symlinks is false, the symlink itself must be recreated at the destination rather than dereferenced.
- If preserve_metadata is false, the implementation must guarantee copied file data and directory structure, but it must not promise preservation of ownership, timestamps, ACLs, extended attributes, or platform-specific metadata.
- If preserve_metadata is true, the implementation should preserve permissions, modification/access times, flags, and extended attributes where the host platform can do so. The docs must state clearly which metadata classes are best-effort rather than guaranteed.
- move(...) and move_into(...) must behave like ordinary renames when the source and destination are on the same filesystem and must fall back to copy-then-delete semantics when they are not.
- remove_tree() must delete a directory tree rooted at self.
- remove_tree() must fail if self names a regular file; callers must use unlink() for files.
- remove_tree() must fail if self is a symbolic link, including a symlink to a directory; it must never recurse into a symlink target.
- remove_tree() must remove entries bottom-up so non-empty directories are not removed before their children.
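The remove_tree() rules above can be made concrete with a small Python sketch. It is an illustration of the stated constraints, not the shipped implementation: the function is hypothetical, raised OSError subclasses stand in for Err(...), and error handling during traversal is deliberately minimal.

```python
import os
import tempfile


def remove_tree(root: str) -> None:
    """Delete a real directory tree bottom-up; refuse files and symlinks."""
    if os.path.islink(root):
        raise OSError(f"refusing to remove symlink: {root}")
    if not os.path.isdir(root):
        raise NotADirectoryError(f"not a directory: {root}")
    # topdown=False yields children before parents; os.walk does not
    # descend into symlinked directories unless followlinks=True.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.unlink(os.path.join(dirpath, name))
        for name in dirnames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                os.unlink(full)   # remove the link itself, never its target
            else:
                os.rmdir(full)    # already emptied by an earlier iteration
    os.rmdir(root)


base = tempfile.mkdtemp()
tree = os.path.join(base, "tree")
os.makedirs(os.path.join(tree, "a", "b"))
with open(os.path.join(tree, "a", "b", "f.txt"), "w") as fh:
    fh.write("x")
remove_tree(tree)
```

The two up-front checks are what keep the operation from turning into a polymorphic delete: files go through unlink(), and a symlink to a directory is rejected before any traversal starts.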
Expected API shape
This subsection names the user-visible spellings. It is not an implementation plan.
Path
- Path(path: str | Path) -> Path.
- / and joinpath(...) for path composition.
- parent() -> Path, name() -> str, suffix() -> str, stem() -> str.
- exists() -> bool, is_file() -> bool, is_dir() -> bool, is_symlink() -> bool.
- try_exists() -> Result[bool, E] as an explicit Incan extension.
- mkdir(parents: bool, exist_ok: bool) -> Result[(), E].
- read_bytes() -> Result[bytes, E], write_bytes(data: bytes) -> Result[(), E].
- read_text(encoding: str, errors: str) -> Result[str, E], write_text(data: str, encoding: str, errors: str, newline: str | None) -> Result[(), E].
- open(mode: str, buffering: int, encoding: str | None, errors: str | None, newline: str | None) -> Result[File, E]. Supports the full Python-style mode family defined above.
- iterdir() -> Result[list[Path], E], glob(pattern: str) -> Result[list[Path], E], rglob(pattern: str) -> Result[list[Path], E].
- stat() -> Result[PathStat, E], lstat() -> Result[PathStat, E].
- copy(target: Path | str, follow_symlinks: bool, preserve_metadata: bool) -> Result[Path, E]. Copies a file or directory tree to target and returns the new path. With preserve_metadata=False, only file data and directory structure are guaranteed; with preserve_metadata=True, metadata preservation is attempted where supported and documented.
- copy_into(target_dir: Path | str, follow_symlinks: bool, preserve_metadata: bool) -> Result[Path, E]. Copies this path into an existing target directory and returns the copied path.
- move(target: Path | str) -> Result[Path, E], move_into(target_dir: Path | str) -> Result[Path, E]. Path-centric move operations following CPython 3.14 naming. Same-filesystem moves should use rename/replace-class semantics; cross-filesystem moves must behave as copy-then-delete.
- rename(target: Path | str) -> Result[Path, E], replace(target: Path | str) -> Result[Path, E].
- unlink() -> Result[(), E], rmdir() -> Result[(), E], remove_tree() -> Result[(), E].
- resolve() -> Result[Path, E], absolute() -> Result[Path, E], Path.cwd() -> Result[Path, E], Path.home() -> Result[Path, E].
- disk_usage() -> Result[DiskUsage, E] as an explicit Incan extension returning at least total, used, and free in bytes for the filesystem containing this path.
- scandir() -> Result[list[DirEntry], E] as an explicit Incan extension.
- touch(exist_ok: bool) -> Result[(), E], chmod(readonly: bool) -> Result[(), E], symlink_to(target: Path | str) -> Result[(), E], hardlink_to(target: Path | str) -> Result[(), E], samefile(other: Path | str) -> Result[bool, E], is_mount() -> Result[bool, E], expanduser() -> Result[Path, E].
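The move contract (rename on the same filesystem, copy-then-delete across filesystems) is the same approach Python's shutil.move takes. The sketch below spells the fallback out in Python as an illustration only. The move function and its injectable rename parameter are hypothetical, introduced here so the cross-filesystem branch can be exercised without a second mount.

```python
import errno
import os
import shutil
import tempfile


def move(src: str, dst: str, rename=os.rename) -> str:
    """Rename when possible; copy then delete when the rename crosses filesystems.

    `rename` is injectable only so the fallback can be exercised in a test.
    """
    try:
        rename(src, dst)
        return dst
    except OSError as exc:
        if exc.errno != errno.EXDEV:   # unrelated failures are not masked
            raise
    if os.path.isdir(src) and not os.path.islink(src):
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        shutil.copy2(src, dst, follow_symlinks=False)
        os.unlink(src)
    return dst


def cross_device_rename(src, dst):
    """Stand-in that always fails the way a cross-filesystem rename does."""
    raise OSError(errno.EXDEV, "Invalid cross-device link")


work = tempfile.mkdtemp()
source = os.path.join(work, "a.txt")
with open(source, "w") as fh:
    fh.write("payload")
moved = move(source, os.path.join(work, "b.txt"), rename=cross_device_rename)
```

Only the cross-device error triggers the fallback; a permission failure or a missing source still surfaces to the caller unchanged.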
OpenOptions
- OpenOptions() -> OpenOptions.
- read(v: bool) -> OpenOptions, write(v: bool) -> OpenOptions, append(v: bool) -> OpenOptions, truncate(v: bool) -> OpenOptions.
- create(v: bool) -> OpenOptions, create_new(v: bool) -> OpenOptions.
- open(path: Path | str) -> Result[File, E].
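Mode strings and OpenOptions describe the same choices in two vocabularies. The Python sketch below shows one plausible translation from the sixteen mode strings this RFC lists to structured flags. OpenFlags and parse_mode are hypothetical names, the field names are borrowed from the builder methods above, and the mapping is a reading of the normative mode rules rather than the shipped implementation.

```python
from dataclasses import dataclass

# The sixteen mode strings listed in this RFC.
SUPPORTED_MODES = {base + suffix for base in "rwax" for suffix in ("", "b", "+", "b+")}


@dataclass(frozen=True)
class OpenFlags:
    read: bool
    write: bool
    append: bool
    truncate: bool
    create: bool
    create_new: bool
    binary: bool


def parse_mode(mode: str) -> OpenFlags:
    """Translate a Python-style mode string into explicit open flags."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    base = mode[0]
    plus = "+" in mode
    return OpenFlags(
        read=base == "r" or plus,
        write=base in ("w", "a", "x") or plus,
        append=base == "a",
        truncate=base == "w",
        create=base in ("w", "a"),
        create_new=base == "x",
        binary="b" in mode,
    )


append_plus = parse_mode("a+")
exclusive = parse_mode("xb")
```

Under this reading, "a+" maps to read, write, append, and create, while "xb" sets create_new without create. The structured form is useful when a caller needs a combination that no mode string expresses.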
File
- read(size: int) -> Result[str, E], read_bytes(size: int) -> Result[bytes, E].
- read_exact(size: int) -> Result[bytes, E].
- write(data: str) -> Result[int, E], write_bytes(data: bytes) -> Result[int, E].
- tell() -> Result[int, E], seek(offset: int, whence: int) -> Result[int, E].
- sync() -> Result[(), E] as an explicit durability primitive.
- fsync() -> Result[(), E] as an alias of sync() with identical semantics.
- sync_data() -> Result[(), E] as an explicit durability primitive that may omit non-essential metadata writes.
- flush() -> Result[(), E]. Flushes user-space buffers but does not imply durable persistence.
DirEntry
- path: Path, file_name() -> str.
- is_file() -> bool, is_dir() -> bool, is_symlink() -> bool.
- metadata() -> Result[PathStat, E].
Errors and compatibility
- Operations must surface failure through ordinary Result returns unless a helper is explicitly documented otherwise.
- Error payloads should be actionable, including at minimum the relevant path and the underlying OS message for filesystem failures.
- This RFC is additive. Existing programs and builtins keep compiling while std.fs becomes the documented default.
Design details
Why chunked file I/O belongs in std.fs
Large-file chunking is a filesystem concern before it is an in-memory parsing concern. A program that wants to hash, upload, transcode, or scan a multi-gigabyte file must be able to open that file and consume bounded reads without routing through read_bytes(). That is why Path.open(...) and a File contract are part of the std.fs approval unit rather than a deferred convenience.
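Hashing is a convenient illustration of why bounded reads matter: the digest can be updated one chunk at a time, so memory use stays constant regardless of file size. The sketch below is a Python analogue of the open-then-read(size) loop, not Incan code. The helper name and the 8192-byte chunk size are illustrative choices.

```python
import hashlib
import os
import tempfile


def sha256_chunked(path: str, chunk_size: int = 8192) -> str:
    """Hash a file using bounded reads so memory use stays at one chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)  # returns at most chunk_size bytes
            if not chunk:                # empty bytes signals EOF
                break
            digest.update(chunk)
    return digest.hexdigest()


tmp_dir = tempfile.mkdtemp()
big = os.path.join(tmp_dir, "big.bin")
payload = bytes(range(256)) * 100   # 25,600 bytes, spans several chunks
with open(big, "wb") as fh:
    fh.write(payload)
streamed = sha256_chunked(big)
```

The streamed digest matches a whole-buffer hash of the same bytes; the only difference is peak memory.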
CPython baseline plus explicit Incan extensions
The core design rule is: if CPython 3.14 pathlib.Path already has the operation, Incan should prefer the same spelling and broadly compatible behavior. When Incan needs more, the RFC should say so plainly. try_exists, path-centric copy and move helpers, remove_tree, OpenOptions, and durability methods are not "secret pathlib"; they are deliberate Incan extensions informed by Rust's std::fs because they solve real filesystem tasks cleanly.
CPython 3.14 also made exists()-style queries more aggressively bool-shaped by returning false rather than surfacing OS errors for inaccessible paths. That is a reasonable default for quick predicates, but it is not sufficient for correctness-sensitive code. Incan therefore keeps the ergonomic bool predicates and also standardizes try_exists() for callers that need to preserve the distinction between "missing" and "unknown because the probe failed."
CPython 3.14 introduced path copy and move helpers as well. Those belong in Incan's std.fs story for a simple reason: users should be able to stay on Path for ordinary filesystem work instead of reaching for a second filesystem module. This RFC adopts the CPython 3.14 spellings directly: copy, copy_into, move, and move_into.
Some shutil ideas do fit naturally in a path-centric module. Disk-usage queries are a good example because they are path-owned and filesystem-facing: a Path.disk_usage() method keeps the operation in std.fs without introducing a second high-level file-operations namespace. By contrast, shell/environment helpers such as path-variable expansion or executable lookup should not be pulled into std.fs just because Python happens to expose them elsewhere.
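For reference, the Python helper this idea comes from is shutil.disk_usage, which returns total, used, and free in bytes for the filesystem containing a path. Those are the same three fields this RFC requires of DiskUsage. The snippet is a Python analogue, and the temporary directory is an arbitrary choice of path.

```python
import shutil
import tempfile

# Query the filesystem that holds the temporary directory.
usage = shutil.disk_usage(tempfile.gettempdir())
print(f"total={usage.total} used={usage.used} free={usage.free}")
```

Because the query is keyed by a path, exposing it as a Path method keeps the caller inside std.fs.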
This same principle applies to whole-file helpers. The RFC does not expose parallel module-level read_bytes(...) / write_bytes(...) / read_text(...) / write_text(...) shortcuts, because keeping a second "string path first" style alive would weaken the intended Path-first model. Incan should teach: construct a Path, then operate on that path.
Existing compiler builtins such as read_file and write_file may remain for compatibility. This RFC does not deprecate or remove them. The design claim is narrower: std.fs.Path is the canonical filesystem model for new APIs, documentation, and examples, and any builtin overlap should converge on the same observable semantics where practical.
The text defaults follow the same philosophy. Incan should choose explicit, modern defaults rather than inheriting ambient host conventions, so UTF-8 with strict error handling and "\n" output are the normative defaults. At the same time, the API must stay parameterized because migration and interoperability work are real use cases; callers need to be able to opt into ASCII, Latin-1, Windows line endings, or other legacy conventions when the target system requires them.
Copy and tree semantics that should not stay implicit
The weak point in many filesystem APIs is not naming but hand-wavy behavior around metadata, symlinks, and recursive deletion. This RFC should be explicit there.
copy(..., preserve_metadata=False) is intentionally the lower-guarantee path: callers get copied bytes and the expected directory structure, but not a portability promise about timestamps, ownership, ACLs, or extended attributes. preserve_metadata=True is the opt-in request for richer preservation, with the understanding that some metadata classes are inherently host-sensitive and must remain best-effort unless the platform can guarantee them.
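Python's shutil draws a similar line between copying bytes and copying bytes plus metadata, which makes the two guarantee levels easy to observe. The sketch below is an analogue only: shutil.copyfile stands in for preserve_metadata=False and shutil.copy2 for preserve_metadata=True. The file names and the pinned timestamp are arbitrary.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src = os.path.join(work, "src.bin")
with open(src, "wb") as fh:
    fh.write(b"payload")
os.utime(src, (1_000_000_000, 1_000_000_000))  # pin atime/mtime to a known past value

# Bytes only: the copy gets a fresh modification time.
data_only = shutil.copyfile(src, os.path.join(work, "data_only.bin"))

# Bytes plus metadata, best effort: the pinned modification time carries over.
with_meta = shutil.copy2(src, os.path.join(work, "with_meta.bin"))

data_only_mtime = int(os.stat(data_only).st_mtime)
with_meta_mtime = int(os.stat(with_meta).st_mtime)
```

Both copies hold identical bytes; only the metadata differs, which is why the lower-guarantee path is a safe default for portability.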
remove_tree() also needs a hard boundary: it is for real directory trees, not a polymorphic "delete whatever I point at" shortcut. Python's rmtree() explicitly says the path "must point to a directory (but not a symbolic link to a directory)" (shutil 3.14), and that is the right safety posture here too.
Durability should also stay explicit. Python normally requires flush() plus os.fsync(...) for a strong persistence request (os 3.14), and Rust exposes sync_all() / sync_data() rather than treating ordinary writes or drop as durability boundaries (Rust std::fs::File). Incan should follow that model: successful writes mean normal write success, while sync() / sync_data() express persistence intent. Because the runtime is Rust-backed, the RFC does not require an explicit close() method as part of the public contract.
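The Python pattern referenced above looks like the following sketch. It is included as an analogue of write-then-sync(), not as Incan code, and write_durably is a hypothetical helper name. Where the platform provides it, os.fdatasync is the closest Python counterpart to sync_data().

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data, then explicitly request persistence (hypothetical helper)."""
    with open(path, "wb") as fh:
        fh.write(data)          # success here is not a durability guarantee
        fh.flush()              # push Python's user-space buffer to the OS
        os.fsync(fh.fileno())   # ask the OS to persist the file to storage


tmp_dir = tempfile.mkdtemp()
target = os.path.join(tmp_dir, "journal.bin")
write_durably(target, b"committed")
```

The three steps map onto the File surface in this RFC: write(...) for the data, flush() for user-space buffers, and sync() for the persistence request.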
Interaction with Rust interop
Authors may still use rust::std::fs and rust::std::io for capabilities this RFC does not standardize. The stdlib module should remain the documented default for portable baseline filesystem work.
Alternatives considered
- Whole-file helpers only: smaller surface, but it does not solve large-file chunking and would leave serious workloads on rust::.
- Keep std.fs and std.io in one RFC: rejected because filesystem review and in-memory byte-cursor review are separable approval units.
- Rust-shaped public API: accurate to the backing implementation, but a worse tutorial and documentation story for Incan users.
- Separate std.os / std.shutil style modules: rejected for now because ordinary path-owned chores should stay path-centric in Incan.
Drawbacks
- std.fs is still a broad surface area even after splitting out std.io.
- Promise drift is a risk: docs must not claim "pathlib parity" while quietly changing semantics or inventing extensions without labeling them as such.
- Durability, recursive deletion, and cross-platform metadata have semantic corners that require careful stdlib documentation and tests.
- Whole-file helpers remain easy to misuse on large inputs, so docs must explicitly teach when to choose chunked open(...) instead.
Implementation architecture
(Non-normative.) The public Path, File, OpenOptions, metadata, directory-entry, disk-usage, and error types should be normal Incan stdlib code. Rust-backed helpers are the host boundary for std::fs, std::io, and platform filesystem calls; they should not replace the authored Incan API shape.
Layers affected
- Stdlib / runtime (incan_stdlib): new std.fs module and supporting types such as Path, File, OpenOptions, and DirEntry.
- Language surface: imports, constructors, methods, and helpers must be available without ad hoc special cases.
- Reference docs: documentation must explain the difference between whole-file helpers and chunked file-handle APIs.
- LSP / tooling: completions and hovers for std.fs members.
- Tests / docs-site: API docs and examples must cover both whole-file and chunked large-file workflows.
Implementation Plan
- Register std.fs in the stdlib namespace metadata and keep it distinct from existing names such as std.web.Path.
- Add the authored std.fs.incn surface for Path, File, OpenOptions, DirEntry, PathStat, DiskUsage, and IoError.
- Add narrow Rust host-boundary helpers for path construction, lexical operations, filesystem predicates, directory operations, whole-file byte/text I/O, file open modes, file reads/writes, seeking, durability, metadata, traversal, copy/move, links, permissions, and disk usage.
- Ensure from std.fs import Path, File resolves through the normal stdlib loader without ad hoc compiler special cases.
- Ensure methods and operators on Path, File, and related stdlib types lower and emit through the RFC 023 stdlib path.
- Preserve existing read_file / write_file behavior while making std.fs.Path the documented default for new code.
- Add typechecker, codegen, runtime, and smoke coverage for path construction, joining, lexical properties, predicates, try_exists, directories, whole-file byte/text I/O, open/read/read_exact/write/tell/seek, durability, traversal, metadata, copy/move, tree removal, links, permissions, and failure paths.
- Update authored user docs: the dedicated std.fs reference, file I/O how-to, tutorial examples that already mention Path, stdlib reference navigation, and release notes.
Implementation log
Spec / lifecycle
- Confirm RFC 055 is the blocking prerequisite for RFC 010's filesystem Path contract.
- Establish implementation scope as the complete RFC 055 std.fs contract.
Stdlib / runtime
- Register std.fs in the stdlib namespace metadata.
- Add the authored std.fs.incn declarations.
- Add runtime-backed Path construction and lexical path helpers.
- Add path joining support.
- Add exists, is_file, is_dir, is_symlink, and try_exists.
- Add mkdir with parent/exist-ok behavior.
- Add whole-file byte and text helpers.
- Add Path.open(...) mode handling.
- Add File.read, read_bytes, read_exact, write, write_bytes, tell, seek, flush, sync, fsync, and sync_data.
- Add traversal, metadata, copy/move, tree removal, link, permission, and disk-usage helpers.
- Add filesystem error modeling sufficient for examples and diagnostics.
Compiler / tooling
- Ensure std.fs imports resolve through the normal stdlib loader.
- Ensure emitted Rust uses the stdlib module without hardcoded path special cases.
- Ensure LSP completions/hovers discover the new namespace through registry metadata.
- Preserve existing read_file / write_file behavior.
Tests
- Add typechecker tests for importing Path and File.
- Add codegen/snapshot coverage for std.fs imports.
- Add runtime tests for path construction, joining, lexical properties, predicates, and try_exists.
- Add runtime tests for directory creation and cleanup.
- Add runtime tests for whole-file byte and text I/O.
- Add runtime tests for open/read/read_exact/write/tell/seek behavior.
- Add runtime tests for durability method availability.
- Add failure-path coverage for missing files, exact-read EOF, and exclusive create.
Docs
- Update the file I/O reference with the implemented std.fs surface.
- Update the file I/O how-to with whole-file vs chunked guidance.
- Update tutorials that already mention Path so they import and use std.fs truthfully.
- Add std.fs to the standard-library reference index and navigation if the docs structure requires it.
- Add release notes for the shipped surface.
Design Decisions
- std.fs.Path is the canonical filesystem model. Existing builtins such as read_file and write_file may remain for compatibility, but new APIs, documentation, and examples should prefer Path.
- Construction uses direct calls such as Path("config.toml"), not Path.new(...).
- Path.open(...) commits to the full Python-style mode-string family.
- Path remains the single path object for both files and directories. Opening a path yields File; there is no parallel Folder abstraction.
- Durability is explicit. Successful writes and object drop do not imply crash-safe persistence; callers use sync() or sync_data() when they need persistence guarantees. fsync() exists as an alias of sync().
- Text I/O defaults are utf-8, strict, and "\n" output, with override parameters for interoperability and migration work.
- std.fs stays path-centric: copy, move, recursive deletion, scanning, and disk-usage queries belong on Path rather than being pushed into separate os- or shutil-style modules.