Migration Guide
This document covers deft’s CMake-to-deft.toml migration tool, implemented
entirely in migrate.rs. It is intentionally a “best
effort, never crash, always tell you what’s left” tool rather than a full
CMake interpreter. The design rationale and every limitation described below
are directly reflected in the source.
deft migrate [--from=cmake] [--path DIR]
Currently --from accepts only the literal value "cmake" (the default);
any other value returns DeftError::Config immediately. --path defaults to
. and must contain a CMakeLists.txt.
deft migrate --from=cmake
Zero-dependency token scanner mechanics
In keeping with deft’s “no heavy deps” rule (see
architecture.md), migrate.rs pulls in no
CMake-parsing crate — not even a generic grammar/parser-combinator library.
The entire extraction pipeline is plain &str scanning:
- Comment stripping. strip_comments splits each line at the first # and
  keeps only the part before it:

  text.lines().map(|line| line.split('#').next().unwrap_or("")).collect::<Vec<_>>().join("\n")

  This intentionally does not handle CMake’s rarer #[[ ... ]] bracket-comment
  syntax; the doc comment in the source explains the reasoning explicitly:
  “a naive tool errs on the side of flagging unfamiliar syntax rather than
  silently mis-parsing it.” Only the text after the first # on each line is
  removed, so the continuation lines of a multi-line bracket comment are left
  in place and may surface as unrecognized tokens rather than being silently
  consumed.
- Call extraction. extract_calls(text, command) finds every invocation of a
  given CMake command (e.g. "add_executable") and returns the raw text
  between its parentheses, one String per call site. It performs a
  case-insensitive substring search for "<command>(" (CMake command names are
  case-insensitive), then hands the byte offset immediately following the
  open paren to the depth-counting parser below.
- Token unquoting. unquote(token) strips a single matching pair of
  leading/trailing " or ' quote characters from a token, leaving unquoted
  tokens (CMake variable names, bare library names) untouched.
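The comment-stripping and unquoting steps above can be sketched as follows. The strip_comments body is the expression quoted in this section; the unquote body is an assumption reconstructed from its description, not the code in migrate.rs.

```rust
/// Drops everything from the first `#` onward on each line.
fn strip_comments(text: &str) -> String {
    text.lines()
        .map(|line| line.split('#').next().unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Strips one matching pair of surrounding `"` or `'` quotes, if present.
/// Assumed implementation: the guide only describes the behavior.
fn unquote(token: &str) -> &str {
    for q in ['"', '\''] {
        // Require at least two bytes so a lone quote is not sliced away.
        if token.len() >= 2 && token.starts_with(q) && token.ends_with(q) {
            return &token[1..token.len() - 1];
        }
    }
    token
}
```

Feeding strip_comments a two-line #[[ ... ]] comment shows the limitation described above: the opening line is emptied, while the second line passes through unchanged.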
Five CMake primitives are recognized this way:
project, add_executable, add_library,
target_include_directories, target_link_libraries. Anything else —
variables (set(...), ${VAR} expansion), control flow, generator
expressions — is not interpreted at all by the scanner; see
Graceful Error Recovery for how that surfaces to
the user instead of being silently dropped.
Parentheses depth-counting parser
CMake call arguments can themselves contain nested parens (most commonly in
if(...) conditions, though the migrator doesn’t extract if at all today —
the depth counter is written generally enough to handle any future nested
call). extract_calls’s inner loop is a simple depth counter over raw bytes:
let mut depth = 1usize;
let mut end = start;
while end < bytes.len() && depth > 0 {
    match bytes[end] {
        b'(' => depth += 1,
        b')' => depth -= 1,
        _ => {}
    }
    end += 1;
}
if depth == 0 {
    calls.push(text[start..end - 1].to_string());
}
search_from = end.max(start + 1);
Starting depth is 1 (we’ve already consumed the call’s own opening paren).
Each subsequent ( increments depth, each ) decrements it; the call’s
argument text ends the moment depth returns to 0, i.e. at the matching
close paren — this correctly balances nested parens within one call’s
argument list. The known, accepted limitation (stated directly in the source
comment) is that this scan does not understand parens inside quoted
strings — a CMake argument like "weird(unbalanced" would desynchronize
the depth count. This is treated as acceptable for a tool whose entire
premise is “naive, best-effort, flags what it can’t handle” rather than a
spec-complete CMake parser.
After each match attempt, balanced or not, the search resumes from end (or
from start + 1 when end == start, which happens only if the open paren is the
last byte of the input; this guarantees forward progress), so all call sites
of the same command throughout the file are collected, not just the first.
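For context, here is a minimal, self-contained sketch of how the depth-counting loop might sit inside a complete extract_calls. The inner loop is the one shown above. The surrounding search is an assumption: it lowercases an ASCII copy of the text so byte offsets stay aligned with the original, and uses str::get to stop cleanly when the resume offset runs past the end.

```rust
/// Returns the raw argument text of every `command(...)` call in `text`.
/// Sketch only: the outer search loop is assumed, not copied from migrate.rs.
fn extract_calls(text: &str, command: &str) -> Vec<String> {
    // ASCII lowercasing preserves byte length, so offsets match `text`.
    let lower = text.to_ascii_lowercase();
    let needle = format!("{}(", command.to_ascii_lowercase());
    let bytes = text.as_bytes();
    let mut calls = Vec::new();
    let mut search_from = 0usize;

    // `get` yields None once `search_from` passes the end of the text.
    while let Some(pos) = lower.get(search_from..).and_then(|rest| rest.find(&needle)) {
        let start = search_from + pos + needle.len();
        let mut depth = 1usize; // the call's own open paren is already consumed
        let mut end = start;
        while end < bytes.len() && depth > 0 {
            match bytes[end] {
                b'(' => depth += 1,
                b')' => depth -= 1,
                _ => {}
            }
            end += 1;
        }
        if depth == 0 {
            // `end - 1` is the matching close paren; exclude it.
            calls.push(text[start..end - 1].to_string());
        }
        search_from = end.max(start + 1);
    }
    calls
}
```

In this sketch an unbalanced call is skipped rather than reported, and a paren inside a quoted argument such as "weird(unbalanced" leaves depth above zero at the end of input, so that call is dropped, which illustrates the accepted limitation.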
Dominant Language Resolution
CMake places no single-language restriction on a target — a single
add_executable can legally mix .c and .cpp source files. deft’s layout
model forbids this entirely (see
manifest.md). Rather than making
migration an all-or-nothing failure for any project that mixes languages,
parse_cmake computes a majority verdict and partitions the rest:
let cpp_count = project.sources.iter().filter(|s| is_cpp_source(s)).count();
let c_count = project.sources.len() - cpp_count;
project.is_c = c_count > cpp_count;
is_cpp_source recognizes the same extension set as deft’s own
Language::from_extension (compiler.rs): cc, cpp,
cxx, c++, cp. Because c_count is computed as the remainder, every other
entry in the source list (in practice, .c files) counts toward c_count.
Tie-breaking. The comparison is strict (c_count > cpp_count), so a tie
(equal counts, including the zero-sources case) resolves to is_c = false,
i.e. C++ wins ties — the source comment notes this matches deft init’s
own default behavior (no flags → C++ executable), keeping the two tools
consistent.
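The majority verdict and its tie-breaking can be checked with a small standalone sketch. The extension list comes from this section; is_c_dominant is a hypothetical wrapper name, and matching extensions case-sensitively is an assumption.

```rust
use std::path::Path;

/// True when the file extension is one of the C++ extensions listed above.
/// Case-sensitive matching is an assumption of this sketch.
fn is_cpp_source(path: &str) -> bool {
    matches!(
        Path::new(path).extension().and_then(|e| e.to_str()),
        Some("cc" | "cpp" | "cxx" | "c++" | "cp")
    )
}

/// Returns true when C is the dominant language. The comparison is strict,
/// so ties (including an empty source list) resolve to C++.
fn is_c_dominant(sources: &[&str]) -> bool {
    let cpp_count = sources.iter().filter(|s| is_cpp_source(s)).count();
    let c_count = sources.len() - cpp_count;
    c_count > cpp_count
}
```

Calling is_c_dominant on one .c file and one .cpp file returns false, as does calling it on an empty slice, matching the tie rule described above.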
Partitioning the minority. Once the dominant language is decided, the
full source list collected from add_executable/add_library is split:
let dominant_is_cpp = !project.is_c;
let (dominant, conflicting): (Vec<String>, Vec<String>) = all_sources
    .into_iter()
    .partition(|s| is_cpp_source(s) == dominant_is_cpp);
project.sources = dominant;
project.conflicting_sources = conflicting;
project.sources (dominant-language files) are listed in the generated
manifest’s migration notes as files to manually move into the appropriate
strict-layout entry file (src/main.cpp, etc. — deft has no “sources list”
in deft.toml, so even the dominant-language files require a manual move,
just without a language conflict to resolve first). conflicting_sources
(minority-language files) are excluded from the primary migration path
entirely and surfaced as TODOs — see below.
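A short, self-contained sketch of the partition step follows. split_by_language is a hypothetical wrapper around the partition call shown above, and is_cpp_source is re-declared here (with case-sensitive matching as an assumption) so the example stands alone.

```rust
use std::path::Path;

/// C++ extensions as listed in this guide; case-sensitive by assumption.
fn is_cpp_source(path: &str) -> bool {
    matches!(
        Path::new(path).extension().and_then(|e| e.to_str()),
        Some("cc" | "cpp" | "cxx" | "c++" | "cp")
    )
}

/// Splits the collected sources into (dominant, conflicting) for a given
/// language verdict. Hypothetical wrapper around the partition shown above.
fn split_by_language(all_sources: Vec<String>, is_c: bool) -> (Vec<String>, Vec<String>) {
    let dominant_is_cpp = !is_c;
    all_sources
        .into_iter()
        .partition(|s| is_cpp_source(s) == dominant_is_cpp)
}
```

With a C++ verdict (is_c = false), main.cpp and app.cpp land in the first vector and legacy/parser.c and legacy/util.c in the second. Relative order within each group is preserved because partition visits items in sequence.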
Graceful Error Recovery
The tool’s central guarantee, stated directly in the module doc comment, is
that deft migrate must never abort on a mixed-language project or on
any CMake construct it doesn’t understand — every unmapped element is
preserved as an explicit, human-readable TODO in the generated deft.toml
rather than triggering a panic or hard error. Three categories of “unmapped”
input are handled this way:
1. Unmapped link libraries. CMake’s target_link_libraries gives bare
library/target names (mylib, pthread, Boost::filesystem) that have no
automatic mapping to deft’s gh:user/lib dependency shorthand — deft cannot
guess a GitHub repository from a bare name. Every captured name becomes a
commented-out, ready-to-uncomment-and-edit line:
[dependencies]
# TODO: map these CMake target_link_libraries to deft `gh:user/lib` deps
# (deft cannot infer a repository from a bare library name):
# "gh:<user>/mylib" = "x.y.z" # was: mylib
2. Unmapped (minority-language) source files. The conflicting_sources
computed above are rendered as a dedicated TODO block at the end of the
manifest, naming the dominant/excluded languages, the exact file list, and
the structural reason (deft’s one-language-per-package rule):
# TODO: Manually resolve mixed-language translation units
# deft enforces one language per package; C++ was chosen as the
# dominant language by source count. These C source(s) were excluded
# from this migration and need a plan (e.g. a sibling deft package):
# - legacy/parser.c
# - legacy/util.c
3. Unmapped complex CMake logic. parse_cmake’s final pass scans the
(comment-stripped) full text, case-insensitively, for five keyword markers
that indicate constructs the scanner does not interpret at all:
foreach(, function(, macro(, while(, generate_export_header(. Any
match is recorded in complex_hits and surfaced as a runtime note
(printed, not embedded in the TOML file itself, since there’s no specific
location/value to anchor a TODO comment to):
note: CMakeLists.txt uses constructs deft does not parse (foreach, function).
Review the file manually for logic not captured above.
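The first and third recovery categories above can be sketched in a few lines. This is a hedged illustration rather than the code in migrate.rs: render_dependency_todos and find_complex_hits are hypothetical names, the output strings are copied from the samples in this section, and emitting a bare [dependencies] header when no libraries were captured is an assumption.

```rust
/// Keyword markers listed above, each including its opening paren.
const COMPLEX_MARKERS: [&str; 5] = [
    "foreach(",
    "function(",
    "macro(",
    "while(",
    "generate_export_header(",
];

/// Renders one commented-out dependency line per captured library name.
/// Hypothetical helper; the strings mirror the sample TOML above.
fn render_dependency_todos(libs: &[&str]) -> String {
    let mut out = String::from("[dependencies]\n");
    if libs.is_empty() {
        return out; // assumption: an empty table header is still written
    }
    out.push_str("# TODO: map these CMake target_link_libraries to deft `gh:user/lib` deps\n");
    out.push_str("# (deft cannot infer a repository from a bare library name):\n");
    for lib in libs {
        out.push_str(&format!("# \"gh:<user>/{lib}\" = \"x.y.z\" # was: {lib}\n"));
    }
    out
}

/// Returns the marker names (without the paren) that occur in `text`,
/// compared case-insensitively. Expects comment-stripped input.
fn find_complex_hits(text: &str) -> Vec<&'static str> {
    let lower = text.to_ascii_lowercase();
    COMPLEX_MARKERS
        .iter()
        .copied()
        .filter(|marker| lower.contains(*marker))
        .map(|marker| marker.trim_end_matches('('))
        .collect()
}
```

Because the scan in this sketch is a plain substring search, endforeach( and endfunction( also count as hits for foreach and function. That is harmless here, since those closers only appear alongside the constructs being flagged.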
None of this — a mixed-language project, an unrecognized library name, or an
unparsed foreach loop — ever causes deft migrate to return an Err or
panic. The only conditions that actually fail the command outright are: an
unsupported --from value, a missing CMakeLists.txt at the resolved path,
or a deft.toml that already exists at the destination (overwrite
protection, mirroring deft init’s same check — see
cli.md).
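The three hard-failure conditions can be pictured as a single preflight check. The sketch below is illustrative only: MigrateError and preflight are hypothetical names (per this guide, the real tool reports an unsupported --from value as DeftError::Config), and both the order of the checks and the assumption that deft.toml is written into the same directory as CMakeLists.txt are guesses.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical error type standing in for deft's own errors.
#[derive(Debug, PartialEq)]
enum MigrateError {
    UnsupportedSource(String),
    MissingCMakeLists(PathBuf),
    ManifestExists(PathBuf),
}

/// Runs the three checks that can fail the command outright and returns
/// the path of the CMakeLists.txt to parse on success.
fn preflight(from: &str, dir: &Path) -> Result<PathBuf, MigrateError> {
    if from != "cmake" {
        return Err(MigrateError::UnsupportedSource(from.to_string()));
    }
    let cmake = dir.join("CMakeLists.txt");
    if !cmake.is_file() {
        return Err(MigrateError::MissingCMakeLists(cmake));
    }
    // Overwrite protection: refuse to clobber an existing manifest.
    let manifest = dir.join("deft.toml");
    if manifest.exists() {
        return Err(MigrateError::ManifestExists(manifest));
    }
    Ok(cmake)
}
```

Everything that happens after a successful preflight falls under the recovery paths described above and is expected to produce TODOs or notes rather than errors.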
Diagnostic Feedback
migrate::run separates its output into two channels with different
visibility guarantees:
Non-essential notices (print_notices, stdout, suppressed by -q).
Printed only when !quiet:
- If any dominant-language sources were detected, a note listing them and
the exact strict-layout entry file path they need to be manually moved
into (deft has no sources list — moving files in is unavoidable manual
work even for successfully-migrated projects):
  note: deft uses a strict layout — there is no "sources" list in deft.toml.
  Move/merge these detected sources into src/main.cpp by hand:
    - main.cpp
    - app.cpp
- If any of the five complex-construct keywords were detected, the
  complex_hits note shown above.
The unmapped-source warning (print_unmapped_warning, stderr,
unconditional). This is the one diagnostic deft prints even under
--quiet — the module doc comment explains why: “Mixed-language fallout is
always reported, even under --quiet: it lists exactly what the migration
could NOT map automatically, which is the one thing a user re-running this
non-interactively still needs to see.” It only fires when
conflicting_sources is non-empty:
warning: mixed-language CMake project detected — C++ was chosen as the
dominant language (3 C++ vs. 2 C source file(s)). The following C file(s)
could not be mapped automatically and were left as TODOs in deft.toml:
- legacy/parser.c
- legacy/util.c
This asymmetry — informational layout guidance is quiet-suppressible,
structural data-loss-risk warnings are not — mirrors deft’s general
philosophy that -q controls progress noise, never correctness-relevant
information (the same principle that keeps hard errors visible under -q
across every other command; see cli.md).
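The two-channel rule can be sketched with generic writers, so the gating logic is testable without touching the real stdout and stderr. emit_diagnostics is a hypothetical function (the guide names print_notices and print_unmapped_warning but does not show their bodies), and the warning text here is abbreviated.

```rust
use std::io::Write;

/// Writes quiet-suppressible notices to `out` and the unconditional
/// mixed-language warning to `err`. Hypothetical sketch of the gating only.
fn emit_diagnostics(
    quiet: bool,
    notices: &[String],
    conflicting_sources: &[String],
    out: &mut impl Write,
    err: &mut impl Write,
) -> std::io::Result<()> {
    // Informational notes: dropped entirely under -q.
    if !quiet {
        for notice in notices {
            writeln!(out, "note: {notice}")?;
        }
    }
    // Unmapped-source warning: ignores `quiet`, fires only when needed.
    if !conflicting_sources.is_empty() {
        writeln!(err, "warning: mixed-language CMake project detected")?;
        for file in conflicting_sources {
            writeln!(err, "  - {file}")?;
        }
    }
    Ok(())
}
```

In a real binary the two writers would be std::io::stdout() and std::io::stderr(). Passing Vec<u8> buffers instead makes it easy to assert that quiet = true silences the first channel while leaving the second intact.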
After migration, the recommended next steps for a developer are, in order:
- Move the listed dominant-language source files into the strict-layout
  entry path noted by print_notices.
- Resolve the # TODO blocks for unmapped dependencies and
  conflicting-language sources in the generated deft.toml.
- Manually review CMakeLists.txt for any logic guarded by the complex_hits
  keywords, since none of that control flow was translated.
- Run deft build (or deft doctor first, if uncertain about the local
  toolchain) to validate the migrated package.