Reflection¶
Reflection solves a problem that comes up constantly in data pipelines: you want to write a transformation that works on any set of columns or models, not just the ones that exist today. As the schema evolves (new columns added, old ones renamed), reflection-based models stay correct without edits, because the column list is recomputed at every compile. smelt's reflection API operates entirely in the meta-world: column names, model paths, and source identifiers are never string literals in your SQL; they are compile-time values that become SQL identifiers at their splice points. The type checker validates every lifted identifier against the scope where it appears, so schema errors surface in the editor rather than at query time.
This page covers two reflection areas:
- Narrow reflection: smelt.columns_of reflects a single model, source, or seed into a List<ColumnRef>, letting you iterate over a table's columns.
- Wide reflection: smelt.models.with_tag, smelt.models.all, smelt.sources.with_tag, and smelt.sources.all reflect the entire workspace into List<ModelRef> or List<SourceRef> values, letting you build cross-model queries that update automatically as the workspace grows.
Complete example — coalesce all numeric columns in any model:
-- models/orders_safe.sql
-- smelt.columns_of(smelt.orders) returns List<ColumnRef> at compile time.
-- filter keeps only numeric columns; map wraps each in COALESCE.
-- The spread splices the resulting SelectItems into the SELECT list.
SELECT
customer_name,
...smelt.columns_of(smelt.orders)
|> filter(fn c => c.is_numeric)
|> map(fn c => COALESCE(c.name, 0))
FROM smelt.orders
-- Engine sees: SELECT customer_name, COALESCE(id, 0), COALESCE(amount, 0), COALESCE(discount, 0) FROM smelt.orders
smelt.columns_of¶
Signature¶
smelt.columns_of is a meta-only accessor. It takes exactly one positional argument — a TableExpr-typed value — and returns the column list of that table as a List<ColumnRef>.
The argument may be any of the following:
- A smelt.<path> reference resolving to a model, source, or seed. The existing schema-resolution machinery supplies the column list from the target's declared schema.
- A smelt.define parameter declared TableExpr or TableExpr<{…}>. At function body-check time the result type is List<ColumnRef> parametrically; the concrete column list is not materialised until expansion time at each call site. A TableExpr<{required columns}> parameter contributes only the required columns to body-check-time reasoning; at expansion time the call-site argument's full schema (which may include additional columns under the row-tail) supplies the complete list.
- A CTE alias or subquery alias within the same model body, resolved through the standard TableExpr-typed expression path.
smelt.columns_of must be called with exactly one positional argument. Named arguments emit ColumnsOfNamedArgument. An argument whose type is not assignable to TableExpr emits ColumnsOfRequiresTableExpr. If the schema cannot be resolved at expansion time (for example because an upstream model has an Unknown schema), ColumnsOfUnresolvableSchema is emitted and the surrounding HOF call drops its splice without further diagnostics.
The returned list preserves the declared column order of the source schema. For models, sources, and seeds this is the order columns appear in their schema; for TableExpr parameters at expansion time this is the order columns appear in the call-site argument's schema.
smelt.columns_of is Salsa-cached: given the same workspace snapshot, it returns byte-equal results. LSP invalidation is automatic when an upstream schema changes.
Example — list all columns¶
-- List all columns of the orders model.
-- At expansion time, smelt.columns_of(smelt.orders) produces a List<ColumnRef>
-- whose elements correspond to the four columns declared in orders.sql.
SELECT
...smelt.columns_of(smelt.orders) |> map(fn c => c.name)
FROM smelt.orders
For a fuller worked example, see Worked example: coalesce_numeric below.
ColumnRef¶
ColumnRef is a closed meta-only record type that represents a single column from a resolved schema. You cannot construct a ColumnRef directly — values originate only from smelt.columns_of.
Fields¶
| Field | Type | Meaning |
|---|---|---|
| name | Text | The column's identifier as it appears in the source schema (un-quoted; case-preserved) |
| type | DataType | The column's DataType from the DataType vocabulary |
| is_numeric | Boolean | TRUE if type is in the Numeric constraint set |
Access each field using dot-notation inside a HOF lambda:
smelt.columns_of(smelt.orders)
|> filter(fn c => c.is_numeric) -- Boolean field
|> map(fn c => c.name) -- Text field, lifts to identifier in splice
Any field name other than name, type, or is_numeric emits ColumnRefFieldUnknown.
ColumnRef is closed: the field set is exactly the three fields above. Adding a field requires a spec edit and a compiler change. ColumnRef is also not user-writable — you cannot use it as a smelt.define parameter or return type annotation, and you cannot construct a ColumnRef value in a list literal.
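To make the closed-record behaviour concrete, here is a minimal Python model of ColumnRef and smelt.columns_of. It is an illustration under stated assumptions, not smelt's implementation: `NUMERIC_TYPES`, `columns_of`, and `project` are hypothetical names, the real Numeric constraint set may differ, and the schema is the four-column orders schema used in this page's examples.

```python
from dataclasses import dataclass

# Assumption: a stand-in for the Numeric constraint set; the real set may differ.
NUMERIC_TYPES = {"INTEGER", "DOUBLE"}

# The closed field set: exactly these three names.
FIELDS = ("name", "type", "is_numeric")


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical model of the closed three-field record."""
    name: str
    type: str
    is_numeric: bool


def columns_of(schema):
    """Reflect an ordered (name, type) schema into ColumnRefs, keeping declared order."""
    return [ColumnRef(n, t, t in NUMERIC_TYPES) for n, t in schema]


def project(col, field_name):
    """Project a field; unknown names fail like the ColumnRefFieldUnknown diagnostic."""
    if field_name not in FIELDS:
        raise AttributeError(
            f"ColumnRef has no field {field_name}; expected one of: {', '.join(FIELDS)}"
        )
    return getattr(col, field_name)


orders = [("id", "INTEGER"), ("customer_name", "VARCHAR"),
          ("amount", "DOUBLE"), ("discount", "DOUBLE")]
cols = columns_of(orders)
numeric_names = [project(c, "name") for c in cols if project(c, "is_numeric")]
print(numeric_names)  # ['id', 'amount', 'discount']
```

The frozen dataclass mirrors the fact that values cannot be built or extended by users in smelt; in this sketch that is only approximated by immutability.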
Body-check vs expansion-time¶
Inside a smelt.define function body where the argument to smelt.columns_of is a TableExpr parameter, the type checker operates in two regimes:
- At body-check time: smelt.columns_of(t) synthesises as List<ColumnRef> parametrically. Each HOF lambda over the result is type-checked once against ColumnRef. No concrete column list is materialised.
- At expansion time: when the function is inlined at a call site with a concrete TableExpr argument, the call-site schema is resolved, the List<ColumnRef> is materialised, HOF lambdas walk each element, and meta-Text-as-identifier lifts are validated against the surrounding splice context.
This means you get full type-safety at definition time (the type checker knows each lambda parameter is a ColumnRef and that c.name is Text, c.is_numeric is Boolean, etc.) while the concrete expansion is deferred until each call site provides a schema.
Meta-Text-as-identifier lift¶
A ColumnRef's name field has type Text. When a meta-Text value — such as c.name inside a HOF lambda — appears in a position where the SQL grammar expects an unquoted identifier, smelt lifts that value to the identifier rather than treating it as a string literal.
The lift applies in exactly four positions:
| Position | Example |
|---|---|
| Column-reference position inside an expression | COALESCE(c.name, 0), where c.name lifts to a column reference |
| AS alias of a SELECT item | SUM(amount) AS c.name, where c.name lifts to the output column alias |
| ORDER BY column reference | ORDER BY c.name, where c.name lifts to a sort key |
| GROUP BY column reference | GROUP BY c.name, where c.name lifts to a grouping key |
In any other position — function arguments typed Expr<Text>, comparison operands typed Text, string-literal positions, named-argument values — a meta-Text value retains its string-value meaning. The lift is grammar-position-driven; there is no annotation or opt-in marker.
After lifting, the identifier is validated against the surrounding splice context's column-resolution scope using the standard scoping rule. If the lifted identifier names no in-scope column, the existing UnknownColumn diagnostic fires at the meta expression's source span (not at the lifted text). The lift itself produces no additional diagnostic.
The lift applies only to compile-time meta-Text values. A runtime Expr<Text> (for example UPPER('foo')) in an identifier position remains a data-world type error; the meta lift does not extend to evaluated SQL expressions.
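The position-driven rule can be modelled in a few lines of Python. This is a sketch of the idea only: the position labels, the `splice` helper, and the `UnknownColumn` exception class are hypothetical stand-ins, and the real compiler works on a parsed grammar rather than on strings.

```python
# Hypothetical labels for the four grammar positions where the lift applies.
LIFT_POSITIONS = {"column_ref", "select_alias", "order_by", "group_by"}


class UnknownColumn(Exception):
    """Stand-in for the UnknownColumn diagnostic."""


def splice(meta_text, position, scope):
    """Render a compile-time Text value at a given grammar position.

    In a lift position the value becomes a bare identifier and is checked
    against the in-scope column names. Anywhere else it keeps its
    string-value meaning and is rendered as a quoted SQL literal.
    """
    if position in LIFT_POSITIONS:
        if meta_text not in scope:
            raise UnknownColumn(meta_text)
        return meta_text
    return "'" + meta_text.replace("'", "''") + "'"


scope = {"id", "customer_name", "amount", "discount"}
lifted = f"COALESCE({splice('amount', 'column_ref', scope)}, 0)"
literal = splice("amount", "text_operand", scope)
print(lifted)   # COALESCE(amount, 0)
print(literal)  # 'amount'
```

The same value renders two different ways depending only on where it lands, which is why no annotation or opt-in marker is needed.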
Examples¶
-- Column-reference lift: c.name becomes the column identifier
smelt.columns_of(smelt.orders)
|> map(fn c => COALESCE(c.name, 0))
-- c.name (meta-Text) lifts to the column identifier: COALESCE(id, 0), COALESCE(amount, 0), …
-- AS-alias lift: c.name becomes the SELECT alias
smelt.columns_of(smelt.orders)
|> map(fn c => COALESCE(c.name, 0) AS c.name)
-- c.name after AS lifts to alias: COALESCE(id, 0) AS id, COALESCE(amount, 0) AS amount, …
-- GROUP BY lift
smelt.columns_of(smelt.orders)
|> filter(fn c => NOT c.is_numeric)
|> map(fn c => c.name)
-- The spread of this list into GROUP BY: GROUP BY customer_name
Worked example: coalesce_numeric¶
This example mirrors the fixture in examples/meta_columns/.
The upstream model: orders.sql¶
-- examples/meta_columns/models/orders.sql
-- Schema: id INTEGER, customer_name VARCHAR, amount DOUBLE, discount DOUBLE
SELECT
id,
customer_name,
amount,
discount
FROM smelt.sources.raw.orders
The orders model has four columns. Three are numeric (id, amount, and discount); one is non-numeric (customer_name).
The function: coalesce_numeric.sql¶
-- examples/meta_columns/functions/coalesce_numeric.sql
smelt.define coalesce_numeric(t: TableExpr) -> SelectItems<Scalar, t> AS (
smelt.columns_of(t)
|> filter(fn c => c.is_numeric)
|> map(fn c => COALESCE(c.name, 0))
)
What the type checker sees at body-check time:
- smelt.columns_of(t) synthesises List<ColumnRef> parametrically. t is a TableExpr parameter, so no concrete schema is available yet.
- filter(fn c => c.is_numeric): the lambda parameter c is typed ColumnRef; c.is_numeric synthesises Boolean. Type-checks cleanly.
- map(fn c => COALESCE(c.name, 0)): c.name synthesises Text. In the column-reference position inside COALESCE(…), a meta-Text value is lifted to an identifier. The checker records that the lift will be validated at expansion time.
- COALESCE(text_identifier, 0) type-checks against the return sort SelectItems<Scalar, t>.
What happens at expansion time (when called with smelt.orders):
- t is bound to smelt.orders, whose schema is {id: INTEGER, customer_name: VARCHAR, amount: DOUBLE, discount: DOUBLE}.
- smelt.columns_of(smelt.orders) materialises [{name:"id", type:Integer, is_numeric:TRUE}, {name:"customer_name", type:Text, is_numeric:FALSE}, {name:"amount", type:Double, is_numeric:TRUE}, {name:"discount", type:Double, is_numeric:TRUE}].
- filter(fn c => c.is_numeric) keeps id, amount, discount.
- map(fn c => COALESCE(c.name, 0)) lifts each c.name to a column identifier, producing COALESCE(id, 0), COALESCE(amount, 0), COALESCE(discount, 0).
The caller: orders_safe.sql¶
-- examples/meta_columns/models/orders_safe.sql
SELECT
customer_name,
...smelt.functions.coalesce_numeric(smelt.orders)
FROM smelt.orders
What the engine sees after compilation:
SELECT customer_name, COALESCE(id, 0), COALESCE(amount, 0), COALESCE(discount, 0) FROM smelt.orders
The spread ...smelt.functions.coalesce_numeric(smelt.orders) materialises the three SelectItems produced by the function inline into the SELECT list. No column name is ever a string literal in the output — the meta-Text values carried by c.name are lifted to identifiers during expansion.
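The filter, map, and spread steps of this worked example can be traced end to end with a small Python model. It is a sketch, not the compiler: `coalesce_numeric` and `render_orders_safe` here are hypothetical Python stand-ins for the smelt function and its caller, and `NUMERIC_TYPES` is an assumed approximation of the Numeric constraint set.

```python
# Ordered schema of the orders model, as declared in orders.sql.
ORDERS = [("id", "INTEGER"), ("customer_name", "VARCHAR"),
          ("amount", "DOUBLE"), ("discount", "DOUBLE")]

# Assumption: a stand-in for the Numeric constraint set.
NUMERIC_TYPES = {"INTEGER", "DOUBLE"}


def coalesce_numeric(schema):
    """Model of the function body: keep numeric columns, wrap each in COALESCE."""
    numeric = [name for name, dtype in schema if dtype in NUMERIC_TYPES]
    return [f"COALESCE({name}, 0)" for name in numeric]


def render_orders_safe(schema, table):
    """Model of the caller: splice the spread items into the SELECT list."""
    items = ["customer_name", *coalesce_numeric(schema)]
    return f"SELECT {', '.join(items)} FROM {table}"


sql = render_orders_safe(ORDERS, "smelt.orders")
print(sql)
# SELECT customer_name, COALESCE(id, 0), COALESCE(amount, 0), COALESCE(discount, 0) FROM smelt.orders
```

Adding a numeric column to `ORDERS` changes the rendered SELECT without touching either function, which is the property the reflection-based model relies on.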
LSP support¶
Editor support: reflection constructs surface in hover, completion, and diagnostics.
- Hover on smelt.columns_of(t) shows List<ColumnRef>. When t's schema is statically resolvable (for example a direct smelt.<path> reference), the tooltip also shows the resolved column count and the first five column names.
- Hover on a ColumnRef-typed lambda parameter (for example c in fn c => c.name) shows ColumnRef together with the closed field list and each field's type.
- Hover on a field projection (c.name, c.type, c.is_numeric) shows the field's declared type. When the projection is reached at expansion time over a resolvable list, the tooltip shows the concrete value at the current call site.
- Goto-definition on smelt.columns_of resolves to the reference page as a URL hint; clients that do not support URL goto-definition targets no-op gracefully.
- Completion at a field-projection position (c.<cursor>) offers the three closed field names: name, type, is_numeric.
- Completion at the smelt.columns_of(<cursor>) argument position offers in-scope TableExpr-valued names: smelt.<path> references and the enclosing function's TableExpr parameters.
- Diagnostics with frame stacks: a type error inside a HOF lambda body whose source list comes from smelt.columns_of(t) carries an anonymous expansion frame. When the source column is statically traceable, the frame includes a column_origin field pointing to the column's declaration span in the upstream schema, surfaced as a "from column declared at …" trailer.
Diagnostic codes¶
ColumnsOfRequiresTableExpr
When it fires: smelt.columns_of(x) is called and x synthesises to a type that is not assignable to TableExpr.
Message: smelt.columns_of expects TableExpr; found {actual}
Fires at: the argument expression.
Example:
What to fix: Pass a TableExpr-typed value — a smelt.<path> reference to a model, source, or seed, or a TableExpr parameter of the enclosing smelt.define function. Use smelt.columns_of(smelt.orders) rather than passing a non-table expression.
ColumnsOfNamedArgument
When it fires: smelt.columns_of is called with a named argument instead of a positional argument.
Message: smelt.columns_of takes one positional argument; named arguments are not supported
Fires at: the named-argument span.
Example:
What to fix: Remove the => and pass the value positionally: smelt.columns_of(smelt.orders).
ColumnsOfUnresolvableSchema
When it fires: At expansion time, smelt.columns_of(t) is evaluated but the schema for t cannot be statically determined (for example because an upstream model has an unknown schema).
Message: cannot resolve column list for {t}; upstream schema is unknown
Fires at: the smelt.columns_of(…) call site.
What to fix: Ensure the TableExpr argument resolves to a model, source, or seed with a fully declared schema. If the upstream model itself has type errors, fix those first — the schema becomes resolvable once the upstream compiles cleanly. This diagnostic suppresses further errors from the surrounding HOF call; fix the schema resolution first, then re-check.
ColumnRefFieldUnknown
When it fires: Field access on a ColumnRef-typed value uses an identifier that is not one of the three declared fields.
Message: ColumnRef has no field {name}; expected one of: name, type, is_numeric
Fires at: the field name span in the dot-notation expression.
Example:
smelt.columns_of(smelt.orders)
|> map(fn c => c.label) -- ← ColumnRefFieldUnknown: 'label' is not a ColumnRef field
What to fix: Use one of the three valid field names: c.name (the column's identifier as Text), c.type (the column's DataType), or c.is_numeric (Boolean). If you need metadata beyond these three fields, that requires a spec extension.
Wide reflection: workspace introspection¶
Wide reflection gives you a compile-time view of the entire workspace: every model and every declared source. The result is a List<ModelRef> or List<SourceRef> — a sequence of workspace entities you can iterate with HOFs, project fields from, or reduce into a UNION ALL query.
Accessors¶
smelt.models.with_tag(tag: Text) -> List<ModelRef>
smelt.models.all() -> List<ModelRef>
smelt.sources.with_tag(tag: Text) -> List<SourceRef>
smelt.sources.all() -> List<SourceRef>
smelt.models.with_tag(tag) returns every model in the workspace whose merged tag set includes tag. The tag string is matched case-sensitively and must be a compile-time Text literal.
smelt.models.all() returns every model in the workspace with no tag filter.
smelt.sources.with_tag(tag) and smelt.sources.all() are the corresponding accessors for declared sources.
All four accessors return results in path-sorted order — byte-lexicographic on the workspace-relative path string with / separators. This order is deterministic across runs over the same workspace state and stable under edits to model content (only renames change it).
Both tag accessors take exactly one positional argument: a compile-time Text literal. Named arguments and non-literal expressions are rejected.
Both all() accessors take no arguments. Passing any argument is rejected.
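The two rules that matter most in practice, case-sensitive tag matching and byte-lexicographic path order, can be seen in a short Python model. The workspace contents and the helper names `path_sorted` and `models_with_tag` are hypothetical; only the matching and ordering rules come from the text above.

```python
def path_sorted(paths):
    """Sort workspace-relative paths byte-lexicographically."""
    return sorted(paths, key=lambda p: p.encode("utf-8"))


def models_with_tag(models, tag):
    """Return paths of models whose tag list contains `tag` (case-sensitive)."""
    return path_sorted(p for p, tags in models.items() if tag in tags)


# Hypothetical workspace: path -> merged tag list.
workspace = {
    "models/orders.sql": ["cohort"],
    "models/Zebra.sql": ["cohort"],
    "models/alpha.sql": ["Cohort"],  # different case, so it does not match 'cohort'
    "models/staging/users.sql": ["cohort", "staging"],
}

cohort = models_with_tag(workspace, "cohort")
everything = path_sorted(workspace)
print(cohort)
# ['models/Zebra.sql', 'models/orders.sql', 'models/staging/users.sql']
```

Note that `models/Zebra.sql` sorts before `models/alpha.sql` and `models/orders.sql`: uppercase ASCII letters have smaller byte values than lowercase ones, so byte order is not the same as a case-insensitive alphabetical order.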
ModelRef¶
ModelRef is a closed meta-only record type representing a single model in the workspace.
| Field | Type | Meaning |
|---|---|---|
| path | Text | Workspace-relative path, e.g. models/orders.sql |
| name | Text | Short model name, e.g. orders |
| tags | List<Text> | Merged tag set (smelt.yml global tags first, then frontmatter tags not already present) |
| columns | List<ColumnRef> | The model's column list, equivalent to smelt.columns_of(m) |
Access fields using dot-notation inside a HOF lambda: m.path, m.name, m.tags, m.columns. Any other field name emits ModelRefFieldUnknown.
ModelRef is not user-constructible: values originate only from smelt.models.* accessors.
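The merge order of the tags field (global tags first, then frontmatter tags not already present) amounts to an order-preserving union. A minimal Python sketch, where `merge_tags` and the sample tag values are hypothetical:

```python
def merge_tags(global_tags, frontmatter_tags):
    """Global tags first, then frontmatter tags that are not already present."""
    merged = list(global_tags)
    for tag in frontmatter_tags:
        if tag not in merged:
            merged.append(tag)
    return merged


tags = merge_tags(["core", "daily"], ["cohort", "core"])
print(tags)  # ['core', 'daily', 'cohort']
```

Because with_tag only tests membership, the merge order does not affect which models match a tag; it only affects the order seen when iterating m.tags.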
SourceRef¶
SourceRef is the source-side counterpart to ModelRef. It has the same four fields:
| Field | Type | Meaning |
|---|---|---|
| path | Text | Workspace-relative path to the source YAML, e.g. models/sources/raw/orders.yml |
| name | Text | Short source name (last segment of the address), e.g. orders |
| tags | List<Text> | Tags declared in the source YAML's tags: key |
| columns | List<ColumnRef> | The source's column list, equivalent to smelt.columns_of(s) |
Access fields using dot-notation: s.path, s.name, s.tags, s.columns. Any other field name emits SourceRefFieldUnknown.
Subtyping: ModelRef and SourceRef lift to TableExpr¶
ModelRef <: TableExpr and SourceRef <: TableExpr. A ModelRef value can be passed wherever a TableExpr is required — including as an argument to smelt.columns_of and as an element in a list reduced with union_all.
Because of the List<T> covariant subtyping rule, List<ModelRef> also lifts to List<TableExpr>. This means reduce(smelt.models.with_tag('cohort'), union_all) typechecks without any explicit projection step — you do not need |> map(fn m => m.table_expr).
m.columns is equivalent to smelt.columns_of(m)¶
m.columns and smelt.columns_of(m) produce the same List<ColumnRef>. Use whichever reads more naturally. At body-check time both return List<ColumnRef> parametrically; at expansion time both resolve the concrete schema from the model's declared columns.
Determinism and ordering¶
The wide reflection accessors are Salsa-cached: on unchanged workspace state, smelt.models.with_tag(t) returns byte-equal results across invocations. When a model's file content, frontmatter tags, or smelt.yml global tags change, the Salsa query is invalidated and re-evaluated. The LSP updates hover counts and diagnostics automatically.
Results are always path-sorted (byte-lexicographic, / separators). A reduce(smelt.models.with_tag('cohort'), union_all) query renders UNION ALL branches in the same stable order. If downstream queries depend on row order, document this dependency explicitly.
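A Python model of the reduce step shows how the path order fixes the branch order. This is a sketch under assumptions: the cohort model paths are invented, and rendering each branch as `SELECT * FROM <name>` is a placeholder, since the exact SQL smelt emits per branch is not specified here.

```python
from functools import reduce

# Hypothetical cohort models as (path, name) pairs, deliberately unsorted.
cohort_models = [
    ("models/cohorts/q2.sql", "q2"),
    ("models/cohorts/q1.sql", "q1"),
    ("models/cohorts/q10.sql", "q10"),
]


def union_all(left, right):
    """Combine two rendered branches, mirroring a binary union_all reducer."""
    return f"{left} UNION ALL {right}"


# Path-sorted, byte-lexicographic: the order the accessors guarantee.
ordered = sorted(cohort_models, key=lambda m: m[0].encode("utf-8"))

# Assumption: each branch renders as a bare SELECT over the model name.
branches = [f"SELECT * FROM {name}" for _, name in ordered]
query = reduce(union_all, branches)
print(query)
# SELECT * FROM q1 UNION ALL SELECT * FROM q10 UNION ALL SELECT * FROM q2
```

Here `q10` lands between `q1` and `q2` because the comparison is on bytes, not on numeric value. That is the kind of ordering dependency worth documenting if a downstream query relies on row order.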
Example: map model names¶
-- Collect the names of all models tagged 'cohort'
-- smelt.models.with_tag('cohort') returns List<ModelRef> in path order
-- map projects each ModelRef to its name field (a Text value)
SELECT map(smelt.models.with_tag('cohort'), fn m => m.name)
Example: map source names¶
-- Collect the names of all sources tagged 'audit'
SELECT map(smelt.sources.with_tag('audit'), fn s => s.name)
Example: workspace inventory¶
LSP support for wide reflection¶
Hover on smelt.models.with_tag('cohort') shows List<ModelRef> together with the tag string and (when the workspace is resolvable at the cursor) the number of matching models and the first five model names. Hover on smelt.models.all / smelt.sources.all shows the total model/source count.
Hover on a ModelRef-typed or SourceRef-typed lambda parameter shows the record type plus the closed four-field list with each field's type.
Hover on a field projection (m.path, m.name, m.tags, m.columns) shows the field's declared type.
Goto-definition on a ModelRef value at a splice site (including m.path and m.name) resolves to the model's source .sql file. The same rule applies to SourceRef resolving to the source YAML file.
Completion at smelt.models.<cursor> or smelt.sources.<cursor> offers exactly {with_tag, all}. Completion at m.<cursor> where m: ModelRef offers exactly {path, name, tags, columns} — same for SourceRef.
Wide-reflection diagnostic codes¶
WithTagRequiresText
When it fires: smelt.models.with_tag(x) or smelt.sources.with_tag(x) is called and x is not a compile-time Text literal (for example an integer, a function call, or a runtime expression).
Message: with_tag expects a compile-time Text; found {actual}
Fires at: the argument expression span.
Example:
-- ← WithTagRequiresText: 42 is not a Text literal
SELECT map(smelt.models.with_tag(42), fn m => m.name)
What to fix: Use a string literal: smelt.models.with_tag('cohort').
WithTagNamedArgument
When it fires: with_tag is called with a named argument instead of a positional argument.
Message: with_tag takes one positional argument; named arguments are not supported
Fires at: the named-argument span.
Example:
What to fix: Remove the => and pass the tag string positionally: with_tag('cohort').
WideReflectionUnknownAccessor
When it fires: smelt.models.<name> or smelt.sources.<name> uses an accessor name outside the closed set {with_tag, all}.
Message: smelt.{models,sources} has no accessor `{name}`; expected one of: with_tag, all
Fires at: the accessor name token span.
Example:
What to fix: Use with_tag('my-tag') to filter by tag, or all() to get every entity.
WideReflectionUnexpectedArgument
When it fires: smelt.models.all or smelt.sources.all is called with one or more arguments.
Message: {accessor} takes no arguments
Fires at: the argument span.
Example:
What to fix: Remove the argument: smelt.models.all().
ModelRefFieldUnknown
When it fires: Field access on a ModelRef-typed value uses an identifier that is not one of the four declared fields.
Message: ModelRef has no field `{name}`; expected one of: path, name, tags, columns
Fires at: the field name token span.
Example:
-- ← ModelRefFieldUnknown: 'materialization' is not a ModelRef field
SELECT map(smelt.models.with_tag('cohort'), fn m => m.materialization)
What to fix: Use one of the four valid fields: m.path, m.name, m.tags, or m.columns. Additional fields (such as materialization) are not yet in the closed field set.
SourceRefFieldUnknown
When it fires: Field access on a SourceRef-typed value uses an identifier that is not one of the four declared fields.
Message: SourceRef has no field `{name}`; expected one of: path, name, tags, columns
Fires at: the field name token span.
Example:
-- ← SourceRefFieldUnknown: 'schema' is not a SourceRef field
SELECT map(smelt.sources.all(), fn s => s.schema)
What to fix: Use one of the four valid fields: s.path, s.name, s.tags, or s.columns.
Generator-body restriction¶
smelt.models.* accessors (smelt.models.with_tag, smelt.models.all) are not available inside generator file bodies (files whose frontmatter declares generates: models). Calling them emits GeneratorBodyForbidsModelReflection.
Why. Workspace shape — which models exist — is determined by evaluating all generators in a single pass. Admitting smelt.models.* inside a generator would create a circular dependency between generator emissions and the model-reflection they observe.
smelt.sources.* is allowed inside generator bodies. Sources are evaluated before any generator, so there is no circularity.
---
generates: models
---
-- OK: smelt.sources.with_tag works inside generator bodies.
smelt.sources.with_tag('raw')
|> map(fn s => ModelDef {
name: s.name,
body: SELECT * FROM smelt.sources.raw.[s.name]
})
-- NOT OK: smelt.models.with_tag fires GeneratorBodyForbidsModelReflection.
-- smelt.models.with_tag('staging') |> map(fn m => ModelDef { … })
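The ordering argument behind this restriction can be modelled in Python. This is a sketch of the reasoning only: `GeneratorContext`, `build_workspace`, and `staging_generator` are hypothetical names, and the real compiler's phases are more involved than three function calls.

```python
class GeneratorBodyForbidsModelReflection(Exception):
    """Stand-in for the diagnostic of the same name."""


class GeneratorContext:
    """What a generator body may observe: sources only."""

    def __init__(self, sources):
        self._sources = sources  # source name -> tag list

    def sources_with_tag(self, tag):
        return sorted(n for n, tags in self._sources.items() if tag in tags)

    def models_with_tag(self, tag):
        # The model set is not final until every generator has run,
        # so reflecting over it here would be circular.
        raise GeneratorBodyForbidsModelReflection(tag)


def build_workspace(sources, static_models, generators):
    """Sources first, then one pass over generators, then the final model set."""
    ctx = GeneratorContext(sources)
    generated = [name for gen in generators for name in gen(ctx)]
    return sorted(static_models + generated)


def staging_generator(ctx):
    # Emits one hypothetical model name per source tagged 'raw'.
    return [f"stg_{name}" for name in ctx.sources_with_tag("raw")]


models = build_workspace(
    sources={"orders": ["raw"], "audit_log": ["audit"]},
    static_models=["orders_safe"],
    generators=[staging_generator],
)
print(models)  # ['orders_safe', 'stg_orders']
```

The model list only exists as the return value of `build_workspace`, after all generators have finished, which is why a generator cannot be given a view of it.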
For driving a generator from a data file, use smelt.config.load_yaml / load_json instead. See Generator Files and Config Loaders.
Planned but not yet available¶
- Additional ModelRef / SourceRef fields (materialization, backends, description, …): the field set will expand as concrete use cases are identified.
See also¶
- Lists & Spread: List<T> type, list literals, and the spread operator used to materialise reflection results into SELECT lists.
- Higher-Order Functions: filter and map, which are the primary tools for transforming a List<ColumnRef>, List<ModelRef>, or List<SourceRef>.
- Pipe Operator: |> for readable HOF chains over reflection results.
- Reference: alphabetical quick reference including all reflection diagnostic codes.