version:0.1 doc:authoring-reference updated:2026

Authoring Reference

The complete per-attribute reference for every YAML field an author writes in a Semantic Rails package under the v1 contract. For each component you get its purpose, fields, defaults, behavior, the errors it raises, and a runnable example.

The narrative authoring guide lives in Package Authoring. This page is meant to be scanned with Cmd + F while you write YAML. The API Reference documents the wire shape used by the runtime.

Schema version: 1
Examples use: a synthetic shop domain (customer / order / order_item / product)
Contract: v1 (terse authoring + deterministic loader normalization)

Measures vs metrics

Before any syntax, set the frame. Measures and metrics are two distinct surfaces in the catalog with two distinct purposes.

measures: primitives, the building blocks. Columnar facts (sum, count, count-distinct), queryable flexibly by any reachable entity, time grain, dimension, or aggregation function within the measure's allowed set. ARR is a measure: the sum of monthly contributions.

metrics: governed access patterns. Named, stable contracts that codify a specific use of one or more measures with conditions, filters, time alignment, or composition. NRR is a metric: (start_arr + expansion_arr − churn_arr) / start_arr with cohort and time-alignment conditions.

Implications:

  • Not every measure needs a corresponding metric. Measures are queryable as primitives.
  • The catalog lists measures and metrics as distinct surfaces.
  • Measures do not auto-publish to metrics. Author the metric explicitly when an access pattern deserves a stable contract.
What the loader does for you

The v1 contract is terse because the loader fills in identifiers and traversal deterministically. Author the business meaning; the loader handles:

  • Key-derived IDs. Every object's id and name come from package.namespace + key. Override with as: only to preserve a public reference.
  • Auto-created key dimensions. From graph.entities.<x>.key: for every model that exposes the entity.
  • Inferred relationships. Any pair of entities co-declared in model.entities: produces a default RelationshipConfig. Author overrides in graph.relationships: only when needed.
  • Primary entity auto-detect. The entity whose canonical key matches model.grain:. No marker needed.
  • Backing time dimensions. Auto-created from each times: entry's column:.
  • Default aggregations. Accumulation class drives the allowed set; subtract with disallowed_aggregations:.

The full table is in auto-derivations summary.

At a glance
Package directory layout
Path | Required? | Purpose
package.yml | Required | Identity, warehouse target, connection, seeds, schema_strict flag.
graph.yml | Required | Canonical entities + explicit relationship overrides.
models/**/*.yml | At least one | One model per warehouse table or mart. Declares model.entities:, times:, dimensions, measures.
metrics/**/*.yml | Optional | Governed access patterns: ratio, cumulative, derived, conversion, etc.
segments/*.yml | Optional | Reusable filter sets anchored on an entity plus a basis metric.
policies.yml | Optional | Visibility, access, release labels, protected objects.
caveats.yml | Optional | Advisory business, definition, or data-quality context emitted as warnings when relevant.
examples/*.yml | Optional | Named example queries surfaced through discovery and inspect routes.
tests/*.yml | Optional | Package-local regression tests. The runner walks this directory directly.
Component 1 / 7
package.yml

Declares the schema version, identity, warehouse target, seed inputs (DuckDB) or connection (every other warehouse), declared environments, the schema_strict flag, and defaults inherited by dimensions, temporal roles, and measures.

When to edit: once at package creation, then rarely, such as when you add a deployment environment, switch warehouses, or update seed assets.

Top-level fields

Field | Type | Status | Behavior
schema_version | integer | Required | Only accepted value is 1; anything else raises INVALID_CONFIG.
package.id | string | Required | Stable identifier used by the CLI (--package <id>) and the catalog.
package.namespace | string | Auto-derived | Defaults to package.id. Drives every auto-derived ID. Setting it explicitly stabilizes IDs.
package.warehouse | enum | Required | One of duckdb, snowflake, postgres, bigquery, databricks, motherduck, ducklake, athena, clickhouse.
package.default_db | string | Conditional | Required for warehouse: duckdb.
package.connection | object | Conditional | Required for every warehouse except duckdb. See connection-mode.
package.seed | object | Conditional | Required for DuckDB. Sub-fields: kind, source, optional post_sql, null_strings.
package.environments | list<string> | Optional | Declared environment names policies can target.
package.schema_strict | boolean | Optional | Default false. true rejects legacy authoring forms (scalar key:, redundant grain: alongside entities:, etc.) and unknown values for typed fields like kind: (see strict-mode rules). Recommended for new packages. See also "What schema_strict does and does not catch".
defaults | map<string, map> | Optional | Per-row inheritable defaults: defaults.dimension, defaults.time, defaults.measure, defaults.relationship.

Example

For remote warehouses (Snowflake, Postgres, BigQuery, Databricks, and the rest), swap seed for connection — see connection-mode decision.

package.yml — DuckDB
schema_version: 1

package:
  id: shop
  namespace: shop
  warehouse: duckdb
  default_db: data/shop.duckdb
  seed: { kind: sql_script, source: data/seed.sql }
  schema_strict: true

defaults:
  dimension: { groupable: true, filterable: true }
  time:
    timezone: UTC
    supported_grains: [day, week, month, quarter, year]
Component 2 / 7
graph.yml

The canonical entity registry. Each entry declares one business object the package exposes (customer, order, product, …), its key column, and any non-default relationships to other entities.

graph.entities:

Field | Type | Status | Behavior
key | string or list<string> | Required | Canonical column name(s). String for single-key; list for compound (e.g. [customer_id, valid_from]). Single source for the entity's column; models bind to it automatically.
label · description | string | Optional | Display metadata.
allowed_as_root | boolean | Optional | Default true. Set false for snapshot/junction entities.
synonyms | list<string> | Optional | Alternate human terms for resolve / discover. Excessively broad terms produce AMBIGUOUS_ALIAS.
disallowed_names | list<string> | Optional | Explicit anti-pattern guard. Names that may never appear as a column/dimension/measure on any model. The validator rejects them and suggests expr: for intentional renames.

graph.relationships:

Most relationships are inferred from FK references in model.entities: blocks. Author an explicit graph.relationships: entry only when you need a non-default rule: per-direction rollup safety, SCD2 temporal_validity:, custom cardinality:, or allowed_directions: restriction.

Relationships are bidirectional. The entities: field is an unordered pair; cardinality and rollup safety are expressed relative to that pair.

Field | Type | Status | Behavior
graph.relationships.<name>.entities | [string, string] | Required | The unordered pair of entities the relationship connects.
graph.relationships.<name>.cardinality | enum | Auto-derived | one_to_one, many_to_one, one_to_many, or many_to_many. many_to_one means "first entity is many; second is one." The loader infers from key roles; override only when needed.
graph.relationships.<name>.safety | enum | Auto-derived | safe, requires_rewrite, unsafe. Defaults from cardinality (safe for 1:1/N:1; requires_rewrite for 1:N/M:N). unsafe joins are rejected outright.
graph.relationships.<name>.allowed_directions | list<string> | Optional | Default [forward, reverse]. Restricts which traversal directions the planner may pick.
graph.relationships.<name>.rollup_safe | {forward, reverse} | Optional | Per-direction rollup safety. forward: lists aggregations safe when rolling up from the first entity to the second; reverse: lists those safe when rolling up the other way. Either may be empty. Replaces the legacy per-model parent_entity: block.
graph.relationships.<name>.temporal_validity | {valid_from, valid_to} | Required for SCD2 | Names the columns that bound an SCD2 record's validity. The planner automatically appends a validity-range predicate to the join condition. Without it, joins to history entities can return duplicate rows.
target_key_role · source_key_role | enum | Optional | primary, unique, foreign, natural. Disambiguates compound-key joins.
target_key_type | enum | Optional | Default primary. Either primary or identifier.
path_preference | integer | Optional | Default 100. Lower wins when multiple paths are valid; resolves AMBIGUOUS_PATH.

Example

graph.yml — shop
graph:
  entities:
    customer:
      label: Customer
      key: customer_id
      synonyms: [buyer, account]
      disallowed_names: [cust_id, custid, customerid]
    order:
      label: Order
      key: order_id
      disallowed_names: [ord_id, orderid]
    order_item:
      label: Order item
      key: order_item_id
    product:
      label: Product
      key: product_id
    customer_history:
      label: Customer history
      key: [customer_id, valid_from]
      allowed_as_root: false

  relationships:
    customer_history_x_customer:
      entities: [customer_history, customer]
      cardinality: many_to_one
      safety: requires_rewrite
      temporal_validity:
        valid_from: effective_from
        valid_to: effective_to
      rollup_safe:
        forward: [sum, count]
        reverse: []
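
Beyond the SCD2 case above, an override that restricts traversal direction and sets a path preference might look like the sketch below. The relationship name order_x_customer and every value are illustrative assumptions, and placing path_preference directly under the relationship entry is inferred from the field table rather than confirmed.

```yaml
# Hypothetical override: the name order_x_customer and all values are illustrative.
graph:
  relationships:
    order_x_customer:
      entities: [order, customer]
      cardinality: many_to_one         # first entity (order) is many; second (customer) is one
      allowed_directions: [forward]    # the planner may not pick the reverse direction
      path_preference: 50              # lower than the default 100, so this path wins
```

An entry like this is only worth authoring when the inferred defaults are wrong for the pair; otherwise the loader-derived relationship is enough.
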
Component 3 / 7
Models (models/<id>.yml)

A model declares one warehouse table or mart, the entities it exposes, and the dimensions, temporal roles, and measures attached to its grain. The loader merges every YAML file under models/** into the package config.

Top-level model fields

Field | Type | Status | Behavior
model.id | string | Auto-derived | From the YAML filename when omitted.
model.relation | string | Required | Physical relation: warehouse table, view, or seed name.
model.grain | list<string> | Required | The column set that identifies one row. Drives planner fanout safety. Compound grain supported. Under schema_strict, a grain: that merely repeats the primary entity's key alongside entities: is rejected as redundant.
model.label · description | string | Optional | Display metadata.
model.defaults | map | Optional | Per-model overrides for defaults.dimension, defaults.time, etc. Merge: package → model → per-row.
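
The package → model → per-row merge order can be pictured with a small sketch. The model file name, the relation shop_order_item, and the dimension names are illustrative assumptions, and the nesting model.defaults.dimension is inferred from the table above. The two YAML documents are shown in one listing only for comparison.

```yaml
# package.yml (excerpt): package-wide default
defaults:
  dimension: { groupable: true, filterable: true }
---
# models/order_items.yml (hypothetical model)
model:
  relation: shop_order_item
  defaults:
    dimension: { groupable: false }      # model-level override of the package default
  entities:
    order_item: {}
    order: {}
  dimensions:
    gift_wrap: { kind: boolean }         # picks up groupable: false from the model
    fulfillment_channel:
      kind: categorical
      groupable: true                    # per-row value wins over the model default
```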

model.entities:

Required on every model. Declares which entities the model exposes. Each entry binds by default to the entity's canonical column from graph.entities.<x>.key:; override per entity with expr: when the model's column name differs.

The primary entity is auto-detected: whichever entity's canonical key column matches model.grain:. For compound-grain models, all entities whose keys are in the grain are co-primary.

Field | Type | Status | Behavior
model.entities.<entity>.expr | string | Optional | Override the column the entity binds to. Use when the model's column name differs from graph.entities.<entity>.key:.
model.entities.<entity>.label | string | Optional | Per-binding display override; rarely needed.
model.entities.bridge | boolean | Optional | Block-level option (sibling of the entity entries). Default true. Set false when this model should NOT be auto-used as a join path between other entities. Queries within the model still work; the planner just won't route through it.

Use cases for bridge: false include junction/mapping tables (m:n bridges), denormalized snapshots that shouldn't be joined to live data, and partial bridges where the data isn't complete enough for arbitrary multi-hop traversal.

Examples

model.entities patterns
# Default binding — columns auto-resolved from graph
model:
  id: orders
  relation: shop_order
  # grain derived from primary entity's key (graph's order.key)
  entities:
    order: {}                      # primary; column = graph's order_id
    customer: {}                   # FK reference; column = graph's customer_id
    product: {}

# Column rename via expr:
model:
  id: order_renamed_columns
  relation: shop_order
  entities:
    order: { expr: ord_id }        # primary, column renamed
    customer: { expr: cust_id }    # FK, column renamed

# Junction table — not used as a join path. Compound grain comes from
# the bridge's two primary entity keys, not an authored `grain:` field.
model:
  id: customer_segment_membership
  relation: shop_customer_segment
  entities:
    bridge: false
    customer: {}
    segment: {}

times: (temporal roles)

The times: block key IS the temporal role. The backing date/timestamp dimension is auto-created from column:. default: true picks the implicit time axis when a query omits time.temporal_role.

Field | Type | Status | Behavior
times.<key>.column | string | Required | The model column. Auto-creates a backing dimension.
times.<key>.kind | enum | Required | date or timestamp.
times.<key>.class | enum | Required | event_time, calendar_time, as_of_time, or state_time. Distinguishes flow / calendar / snapshot / SCD2-validity semantics.
times.<key>.supported_grains | list<string> | Optional | Default [day, week, month, quarter, year].
times.<key>.default | boolean | Optional | Picks the implicit time axis. At most one default: true per model.
times.<key>.timezone · label | string | Optional | Default timezone UTC; label is display text.
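
A model with more than one temporal role could be sketched as follows. The shipped_at role and its column are illustrative assumptions; only the field names come from the table above.

```yaml
# Hypothetical times: block; shipped_at and its column are illustrative.
model:
  relation: shop_order
  entities:
    order: {}
  times:
    ordered_at:                    # the block key is the temporal role
      column: ordered_at
      kind: timestamp
      class: event_time
      default: true                # the single implicit time axis for this model
    shipped_at:
      column: shipped_at
      kind: date
      class: event_time
      supported_grains: [day, week, month]   # narrower than the default grain list
      timezone: UTC
```

Only ordered_at carries default: true here, which keeps the model within the one-default-per-model rule.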

dimensions:

Behavioral attributes attached to the model's grain. The dimension's kind drives type coercion in SQL lowering and the filter-operator menu surfaced through build-options.

Key dimensions auto-create from graph.entities.<x>.key. Backing date/timestamp dimensions auto-create from times:. You only author behavioral dimensions here.

Field | Type | Status | Behavior
kind | enum | Required | categorical, boolean, integer, continuous, number, percent, currency, date, timestamp.
column | string | Auto-derived | From the key. Override only if the physical column differs.
domain | string or object | Optional | Value-domain ID or inline {values: […]}. Powers valid-values; without a domain, valid-values raises NO_VALID_VALUES_SOURCE.
label · description | string | Optional | Display metadata.
filterable · groupable | boolean | Optional | Default true. false hides the dimension from suggestions.
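
As a minimal sketch of the fields above, the fragment below declares one dimension with an inline value domain and one that is hidden from suggestions. The value list and the internal_note dimension are illustrative assumptions.

```yaml
# Hypothetical dimensions; the value list and internal_note are illustrative.
dimensions:
  status:
    kind: categorical
    label: Order status
    domain: { values: [pending, shipped, cancelled] }   # inline domain backs valid-values
  internal_note:
    kind: categorical
    filterable: false      # hidden from filter suggestions
    groupable: false       # hidden from group-by suggestions
```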

measures:

Each measure declares the explicit triple (kind, accumulation, value_type) plus the aggregation expression. expr: and default_agg: live directly on the measure — no expression: wrapper.

Measures do not auto-publish to metrics. To expose a measure as a governed metric, write the metric explicitly under metrics.

Core fields

Field | Type | Status | Behavior
measures.<key>.kind | enum | Required | aggregate · entity_count. Drives compiler dispatch: entity_count uses COUNT(DISTINCT entity_key); aggregate uses default_agg.
measures.<key>.accumulation | object | Required | Always object form. kind is required and must be one of the enum {flow, stock, event, population}. stock additionally carries snapshot: (start_of_period or end_of_period). Examples: { kind: flow }, { kind: event }, { kind: stock, snapshot: end_of_period }.
measures.<key>.value_type | enum | Required | currency, count, number, percent, boolean. Must be declared explicitly; there is no inferred default.
measures.<key>.expr | string or expression AST | Required for aggregate | The column or scalar expression aggregated by default_agg. String forms parse as Python-like expressions; objects use the expression AST.
measures.<key>.entity_key | string | Required for entity_count | The entity name (e.g. order, customer). The loader resolves the entity's canonical column from graph.entities.<x>.key.
measures.<key>.default_agg | string | Required for aggregate | sum, avg, min, max, count, count_distinct, median, percentile, first_value, last_value. The default aggregation the API uses if the caller doesn't specify one.
measures.<key>.disallowed_aggregations | list<string> | Optional | Subtract from the accumulation-derived allowed set. Effective allowed = derived − disallowed. Use to remove specific aggregations that don't make business sense (e.g. [median] on revenue).
measures.<key>.label · description | string | Optional | Display metadata.
measures.<key>.time | string | Optional | The temporal role this measure can be queried over. When omitted, inherits from the model's default: true entry in times:.
measures.<key>.comparison_family · comparison_mode | string | Optional | Drives same_query vs coordinated_queries selection. Load-bearing for the planner.
measures.<key>.validity_windows | list<{from, to, semantics}> | Optional | Time ranges where the measure is meaningful. Queries outside the window raise MEASURE_VALIDITY_BOUNDARY or surface as caveats depending on cross_window_policy.
measures.<key>.cross_window_policy | enum | Optional | Default caveat. strict turns out-of-window queries into errors.
measures.<key>.external_discontinuities | list<{from, to, what, magnitude_estimate_pct}> | Optional | Documents known external breaks. Surfaces in inspect as caveats.
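
The flow and event measures in the full example have a stock counterpart, which might be sketched as below. The measure name inventory_units, the column units_on_hand, and the temporal role snapshot_date are illustrative assumptions that are not part of the shop examples.

```yaml
# Hypothetical stock measure; inventory_units, units_on_hand, and snapshot_date
# are illustrative names.
measures:
  inventory_units:
    label: Inventory (units)
    kind: aggregate
    expr: units_on_hand
    default_agg: sum
    accumulation: { kind: stock, snapshot: end_of_period }   # stock carries snapshot:
    value_type: count
    time: snapshot_date                  # a temporal role from the model's times: block
    disallowed_aggregations: [median]    # effective allowed = derived set minus [median]
```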

Full model example

models/orders.yml
model:
  id: orders
  label: Orders
  relation: shop_order
  # grain derived from primary entity's key — do NOT author alongside entities:
  entities:
    order: {}                      # primary
    customer: {}                   # FK reference
    product: {}                    # FK reference
  times:
    ordered_at:
      column: ordered_at
      kind: timestamp
      class: event_time
      default: true
  dimensions:
    status: { kind: categorical }
  measures:
    revenue_usd:
      label: Revenue (USD)
      kind: aggregate
      expr: order_total_cents / 100.0
      default_agg: sum
      accumulation: { kind: flow }
      value_type: currency
      disallowed_aggregations: [median]
    order_count:
      kind: entity_count
      entity_key: order
      accumulation: { kind: event }
      value_type: count
Component 4 / 7
Metrics (metrics/**.yml)

Metrics codify governed access patterns. Each metric carries a kind: that determines the required fields. Common kinds use direct named fields; kind: derived and kind: conversion use the full expression AST.

References to other measures/metrics use package-relative keys (revenue_usd), not fully qualified IDs (metric.shop.revenue_usd). The loader resolves keys.

Top-level fields (any kind)

Field | Type | Status | Behavior
metrics.<key>.kind | enum | Required | One of: aggregate, ratio, cumulative, rolling, prior_period, period_to_date, semi_additive, derived, conversion. Selects required fields (see below).
metrics.<key>.value_type | enum | Required | currency, count, number, percent, boolean. The metric's output type may differ from the underlying measure.
metrics.<key>.label · description · examples | string / list | Optional | Display metadata. examples are sample query phrases the runtime can echo back in compile's explain payload.
metrics.<key>.as | string | Optional | Override the auto-derived ID. Same-namespace only. Validator warns if redundant.
metrics.<key>.time | string | Optional | Default time axis. Metrics do NOT inherit the model's default: true entry in times:.
metrics.<key>.comparison_family · comparison_mode | string | Optional | Load-bearing for query plan selection. Same as on measures.

Required fields per kind

kind | Direct named fields
aggregate | measure: <key>
ratio | numerator, denominator, null_behavior (default null_if_zero)
cumulative | measure; optional window
rolling | measure, window: {unit, value}
prior_period | measure, period
period_to_date | measure, period (resets per period; distinct from cumulative)
semi_additive | measure; underlying measure must have accumulation: { kind: stock }
derived | expression: <AST> (long-tail case)
conversion | expression: { kind: conversion, base, converted, entity, window, matching_mode }

Aggregate, ratio, and time-series examples

These kinds use direct named fields (no expression AST): aggregate, ratio, cumulative, rolling, prior_period, period_to_date.

Common mistake: filtering on time inside a cumulative query raises CUMULATIVE_TIME_FILTER_UNSUPPORTED. Use period_to_date for a bounded running total within a period.

metrics — direct-named-field kinds
metrics:
  # kind: aggregate — publish a measure
  revenue_usd:
    label: Revenue (USD)
    kind: aggregate
    measure: revenue_usd
    value_type: currency

  # kind: ratio — direct numerator/denominator
  aov_usd:
    label: Average order value (USD)
    kind: ratio
    numerator: revenue_usd
    denominator: order_count
    null_behavior: null_if_zero
    value_type: currency
    time: ordered_at

  # time-series kinds: same shape, different window/period field
  cumulative_revenue_usd:
    kind: cumulative
    measure: revenue_usd
    value_type: currency

  revenue_28d:
    kind: rolling
    measure: revenue_usd
    window: { unit: day, value: 28 }
    value_type: currency

  revenue_prior_month:
    kind: prior_period
    measure: revenue_usd
    period: month
    value_type: currency

  revenue_mtd:
    kind: period_to_date
    measure: revenue_usd
    period: month
    value_type: currency
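
The listing above covers every direct-field kind except semi_additive. A sketch of that kind follows; it assumes a hypothetical measure inventory_units declared with accumulation: { kind: stock, snapshot: end_of_period } and a hypothetical temporal role snapshot_date, neither of which appears in the shop examples.

```yaml
# Hypothetical: assumes a stock measure inventory_units and a temporal role
# snapshot_date exist in the package.
metrics:
  inventory_units_eop:
    label: Inventory at period end (units)
    kind: semi_additive
    measure: inventory_units     # underlying measure must have accumulation kind stock
    value_type: count
    time: snapshot_date          # set explicitly; metrics do not inherit a default time axis
```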

derived metric (AST escape hatch)

Use kind: derived for arbitrary formulas over multiple metrics. The full expression AST is authored under expression:. References to other metrics use bare keys; the loader resolves them to qualified IDs.

metrics/margin_pct.yml — derived
metrics:
  margin_pct:
    label: Gross Margin (%)
    description: (revenue − cogs) / revenue
    kind: derived
    value_type: percent
    expression:
      kind: arithmetic
      op: divide
      left:
        kind: arithmetic
        op: subtract
        left:  { kind: metric, metric: revenue_usd }
        right: { kind: metric, metric: cogs_usd }
      right: { kind: metric, metric: revenue_usd }
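
A second derived sketch combines the literal, ratio, and arithmetic nodes from the expression AST reference. The metric keys refund_usd and net_of_refund_pct are illustrative assumptions; revenue_usd is the metric defined earlier.

```yaml
# Hypothetical derived metric: 1 − (refund_usd / revenue_usd).
# refund_usd is an assumed metric key.
metrics:
  net_of_refund_pct:
    label: Revenue kept after refunds (%)
    kind: derived
    value_type: percent
    expression:
      kind: arithmetic
      op: subtract
      left: { kind: literal, value: 1 }
      right:
        kind: ratio
        numerator:   { metric: refund_usd }    # shorthand for { kind: metric, metric: … }
        denominator: { metric: revenue_usd }
        null_behavior: null_if_zero            # the documented default, written out
```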

conversion metric

Counts entities where a base event is followed by a converted event within a bounded window. Authored as the AST.

Required fields: entity, window: {unit, value}, matching_mode, and the two event sides (base, converted) are all mandatory. Missing fields raise CONVERSION_ENTITY_REQUIRED, CONVERSION_WINDOW_REQUIRED, or CONVERSION_MATCHING_MODE_REQUIRED.

metrics/signup_to_first_order_7d.yml — conversion
metrics:
  signup_to_first_order_7d:
    label: Signup → first order (7d)
    kind: conversion
    value_type: count
    expression:
      kind: conversion
      base:      { metric: signup_count }
      converted: { metric: order_count }
      entity: customer
      window: { unit: day, value: 7 }
      matching_mode: first_converted_after_base
      constant_properties: [region]
Component 5 / 7
Segments (segments/*.yml)

Reusable membership filters anchored on an entity plus a basis metric. Previewed by POST /api/v1/segment-preview, validated by /segment-validate, and explained by /segment-explain.

Field | Type | Status | Behavior
entity | string | Required | Entity grain at which segment membership is tested.
basis_metric | string | Required | Drives population size and preview rows.
preview_dimensions | list<string> | Optional | Dimensions surfaced in segment-preview.
membership.where | list<{field, op, value}> | Optional | Dimension-level predicates only, not expression AST.
membership.metric_filters | list<{expression, op, value}> | Optional | Expression-level filters on metric values. Full AST; typically {metric: …} or metric_predicate.
membership.time · path_policy | map | Optional | Pin the time axis or join path.

Example

segments/high_value_customers.yml
segments:
  high_value_customers:
    entity: customer
    basis_metric: revenue_usd
    preview_dimensions: [region]
    membership:
      metric_filters:
        - expression:
            metric: lifetime_revenue_usd
          op: ">="
          value: 1000
        - expression:
            kind: metric_predicate
            input: { metric: order_count }
            entity: customer
            op: ">="
            value: 3
            scope_mode: entity_only
            window: { unit: day, value: 90 }
          op: "is_true"
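
The example above uses only metric_filters. A dimension-level counterpart using membership.where might look like the sketch below; the segment name and region values are illustrative assumptions, and the predicate shape follows the {field, op: in, values: [...]} form documented for query.where.

```yaml
# Hypothetical segment; the name and the region values are illustrative.
segments:
  emea_apac_customers:
    entity: customer
    basis_metric: revenue_usd
    preview_dimensions: [region]
    membership:
      where:
        - { field: region, op: in, values: [emea, apac] }   # key is field, not dimension
```
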
Component 6 / 7
Policies (policies.yml)

Optional. Policies live in a single top-level file under semantic_policies:. Four kinds are recognized by the runtime: package_release, object_visibility, object_access, protected_object.

Field | Type | Status | Behavior
id | string | Required | Stable identifier.
kind | enum | Required | package_release, object_visibility, object_access, protected_object.
action | string | Required | Paired with kind: visibility uses hidden/visible; access uses deny/redact.
audiences · environments · object_ids | list<string> | Optional | Filter by audience/environment, or target specific object IDs (empty = package-wide).
config · rationale | map / string | Optional | Kind-specific config (e.g. { mask: "***" }); rationale returned in POLICY_DENIED hints.
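
A policies.yml built from the fields above might be sketched as follows. The list layout under semantic_policies:, the policy ids, the audience and environment names, and the targeted object IDs are all assumptions made for illustration; only the field names and the kind/action pairings come from the table.

```yaml
# Hypothetical policies.yml; ids, audiences, environments, and the list
# layout are assumptions.
semantic_policies:
  - id: hide_cogs_from_external
    kind: object_visibility
    action: hidden                         # visibility kinds use hidden / visible
    audiences: [external]
    object_ids: [metric.shop.cogs_usd]
    rationale: Cost data is internal only.
  - id: redact_revenue_in_staging
    kind: object_access
    action: redact                         # access kinds use deny / redact
    environments: [staging]                # assumed to be declared in package.environments
    object_ids: [measure.shop.revenue_usd]
    config: { mask: "***" }
    rationale: Staging mirrors production figures.
```
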
Component 7 / 7

Caveats (caveats.yml)

Optional. Caveats live under semantic_caveats: and surface as advisory SEMANTIC_CAVEAT_APPLIED warnings on validate, compile, and query. They never change SQL, rows, access, discovery, or policy behavior.

Field | Type | Status | Behavior
id · kind · message | string / enum | Required | kind is one of business_event, definition_change, or data_quality.
object_ids · entity_values | list | Required* | At least one targeting field is required. Entity-value caveats fire only when the matching value is filtered or the dimension is exposed, and only on the declared dimension; declare one row per dimension a value is commonly reached through.
time.at · time.from/to | date | Optional | Point or half-open range trigger. Time-bound caveats require explicit query time or an inferable comparison window.
audiences · environments · severity · owner · references | list / string | Optional | Optional context gates and metadata for the warning payload. severity: info adds definitional framing; warning (the default) means the matched window or slice itself is affected.
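
A caveats.yml using the fields above might be sketched like this. The list layout under semantic_caveats:, the nesting of time:, the ids, dates, and messages are all illustrative assumptions rather than confirmed syntax.

```yaml
# Hypothetical caveats.yml; ids, dates, messages, and the list layout are assumptions.
semantic_caveats:
  - id: checkout_outage
    kind: business_event
    message: Checkout outage reduced order volume during this window.
    object_ids: [metric.shop.revenue_usd]
    time: { from: 2025-11-03, to: 2025-11-05 }   # half-open range trigger
    severity: warning                            # the matched window itself is affected
  - id: revenue_definition_change
    kind: definition_change
    message: Revenue excludes shipping fees from this date onward.
    object_ids: [measure.shop.revenue_usd]
    time: { at: 2025-01-01 }                     # point trigger
    severity: info                               # definitional framing only
```
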
Expression AST reference

Every expression:, expr:, and metric-filter body is a small AST. The kind: field selects the node shape; the runtime rejects unknown kinds with INVALID_EXPRESSION_AST.

Authors may write {measure: …} as shorthand for {kind: measure, measure: …}, and {metric: …} as shorthand for {kind: metric, metric: …}; both expand identically.
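
The shorthand rule can be seen side by side in a metric_filters fragment. The two entries below are intended to be equivalent; the threshold is illustrative.

```yaml
# Two spellings of the same filter expression, per the shorthand rule.
metric_filters:
  - expression: { metric: lifetime_revenue_usd }                 # shorthand
    op: ">="
    value: 1000
  - expression: { kind: metric, metric: lifetime_revenue_usd }   # explicit form
    op: ">="
    value: 1000
```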

kind | Required fields | Allowed in
measure | measure (+ optional aggregation, temporal_role) | measure expr, metric expr, segment metric_filters, query select
aggregate | measure, aggregation | query select; rare in metric expr (use kind: aggregate metric instead)
metric | metric | metric expr (derived/conversion), segment metric_filters, query select
column | column | measure expr, scalar expressions inside metric expr
literal | value | any scalar context
arithmetic | op, left, right (+ optional null_behavior) | metric expr, measure expr
comparison | op, left, right | boolean predicates, case.whens[*].when
boolean | op, args | predicate trees
call · case · in / not_in · nullif · date_add · between / not_between | See the SDK schema for kind-specific required fields. between takes expr, low, high (+ optional negated) and desugars at parse time to BooleanExpr(AND, [Comparison(>=), Comparison(<=)]). | scalar expr inside measure / metric
cumulative | input (+ optional partition_by, window_scope) | metric expr (top of cumulative-kind metric, when named fields aren't enough)
rolling | input, window: {unit, value} | metric expr (top of rolling-kind metric)
prior_period | input, offset: {unit, value} | metric expr (top of prior_period-kind metric)
period_to_date | input, period | metric expr (top of period_to_date-kind metric)
metric_predicate | input, entity, op, scope_mode (in package context); optional value, time_grain, time_alignment, window | segment metric_filters, query metric_filters
scoped_aggregate | measure (+ optional aggregation, temporal_role, predicates[*].metric, where, null_behavior) | metric expr
aggregate_if | aggregation (one of count, count_distinct, sum, avg, min, max, median, percentile), condition; value is required for all aggregations except count. Column refs inside condition / value must specify entity or table, because there is no surrounding measure to inherit from. Compiles natively to COUNT_IF / SUM_IF on Snowflake and to portable <AGG>(CASE WHEN cond THEN value END) elsewhere. | query select[*].expression or metric_filters[*].expression
ratio | numerator, denominator (+ optional null_behavior, default null_if_zero) | metric expr (typically inside kind: derived)
conversion | base, converted, entity, window: {unit, value}, matching_mode | metric expr (top of conversion-kind metric)

metric_predicate alignment matrix

metric_predicate is the most field-heavy AST node; this matrix shows which combinations of scope_mode and time_alignment are legal. Other combinations raise INVALID_METRIC_PREDICATE.

scope_mode | Allowed time_alignment | May declare time_grain? | Use case
contextual | same_query_period (only) | Yes | "Customers whose revenue this period exceeds X."
entity_only | query_window or rolling_window_in_period | No | "Customers with ≥ 3 orders in any 90-day window across history."
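
The two legal rows of the matrix might be written as metric_filters entries like the sketch below. The thresholds and the month grain are illustrative assumptions; the field names and the scope_mode / time_alignment pairings follow the matrix and the AST table.

```yaml
# Hypothetical metric_filters entries; thresholds and the month grain are illustrative.
metric_filters:
  # contextual: only same_query_period is legal; time_grain may be declared
  - expression:
      kind: metric_predicate
      input: { metric: revenue_usd }
      entity: customer
      op: ">="
      value: 500
      scope_mode: contextual
      time_alignment: same_query_period
      time_grain: month
    op: "is_true"
  # entity_only: query_window or rolling_window_in_period; no time_grain
  - expression:
      kind: metric_predicate
      input: { metric: order_count }
      entity: customer
      op: ">="
      value: 3
      scope_mode: entity_only
      time_alignment: rolling_window_in_period
      window: { unit: day, value: 90 }
    op: "is_true"
```
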
Where each expression kind is allowed

The planner enforces context-specific subsets. Putting a metric AST in where: is the most common authoring trap; where: only accepts dimension-level predicates.

Context | Allowed | Not allowed
query.where | Dimension-level predicates only: {field, op, value}, {field, op: in, values: [...]} (the key is field, not dimension). | Expression AST nodes; use metric_filters instead.
query.metric_filters | Full expression AST: metric, aggregate, aggregate_if, metric_predicate, scoped_aggregate, ratio, arithmetic, comparison, boolean, between. | Bare column nodes (the planner cannot resolve a column outside a measure body).
measure.expr | Scalar AST: column, literal, arithmetic, comparison, boolean, call, case, nullif, date_add, in, between. | metric, cumulative, rolling, prior_period, period_to_date, conversion, metric_predicate, scoped_aggregate, aggregate_if (aggregate_if is query-level only; declare a normal measure if you want to bake the conditional aggregate into the catalog).
metric.expression (kind: derived / conversion) | Everything except column at the top. | Bare column at the top (use a measure body for raw column arithmetic).
segment.membership.where[*] | Dimension-level predicates only (same shape as query.where). | Expression AST.
segment.membership.metric_filters[*] | Same as query.metric_filters, plus metric_predicate with required scope_mode. | metric_predicate without scope_mode in package context.
Auto-derivations summary

Every rule below fires only when the field is missing or empty — explicit values are preserved unchanged. package.namespace (default package.id) is the master switch.

What | Derived from
Entity ID | entity.<ns>_<key>
Entity name | <ns>.<TitleKey>, e.g. shop.Customer
Key dimensions | graph.entities.<x>.key for every model that exposes the entity
Backing date/timestamp dimension | Each times: entry's column:
Primary entity | Entity whose canonical key matches model.grain
Inferred relationships | Pairs of entities co-declared in model.entities: (excluding bridge-disabled models)
Cardinality | Key roles: 1:1, N:1, 1:N, M:N
Join safety | safe for 1:1 / N:1; requires_rewrite for 1:N / M:N
Allowed aggregations | kind + accumulation; subtract with disallowed_aggregations
Measure ID | measure.<ns>.<key>
Metric ID | metric.<ns>.<key>; override with as:
Default time on measures | Model's default: true entry in times:; metrics do NOT inherit
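
To make the ID rules concrete, the sketch below annotates authored keys with the IDs the table implies, assuming package.namespace is shop. The derived values in the comments are read off the patterns above for illustration and are not loader output; the metric key orders is an assumption.

```yaml
# Authored YAML with the IDs implied by the table, assuming namespace shop.
# Comments show illustrative derived values, not actual loader output.
graph:
  entities:
    customer: { key: customer_id }     # entity ID entity.shop_customer; name shop.Customer
    order: { key: order_id }
---
model:
  relation: shop_order
  entities: { order: {}, customer: {} }   # key dimensions auto-create from order_id, customer_id
  measures:
    order_count:                       # measure ID measure.shop.order_count
      kind: entity_count
      entity_key: order
      accumulation: { kind: event }
      value_type: count
---
metrics:
  orders:                              # metric ID metric.shop.orders
    kind: aggregate
    measure: order_count
    value_type: count
```
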
Strict-mode validation rules

When package.schema_strict: true is set, the loader rejects every legacy authoring form. This is the recommended setting for new packages.

Removed boilerplate (use auto-derivation instead)

  • id: on semantic objects — auto-derived from namespace + key; use as: only to preserve a public reference.
  • name: on objects — auto-derived from key.
  • kind: id dimensions — key dimensions auto-create from graph.entities.<x>.key.
  • Duplicate date/timestamp dimension when times: covers the same column — the times: entry creates the backing dimension.
  • Authored model.grain: separate from primary — derived from primary entity key.
  • Top-level temporal_role.<id>: separate registration — the times: block key IS the role.
  • Model-level entity: (singular) + keys.foreign: + joins: blocks — use model.entities:; overrides in graph.relationships:.
  • Per-model parent_entity: block — rollup safety lives on graph.relationships.<name>.rollup_safe.
  • entity.key_roles (authored) — auto-derived from key role pairs.

Renames

  • agg_function: → default_agg: on measures.
  • accumulation: stock + sibling snapshot_policy: → nested accumulation: { kind: stock, snapshot: end_of_period }.
  • expression: wrapper around expr: / default_agg: on kind: aggregate measures → un-nest. Write expr: and default_agg: directly.
  • Buried expression: AST on metric kinds with direct named fields (aggregate, ratio, cumulative, rolling, prior_period, period_to_date, semi_additive) → use direct fields. AST authoring is reserved for kind: derived and kind: conversion.
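
The measure-level renames can be sketched as a before/after pair. This is an illustrative fragment only: the measure key inventory_on_hand and the column quantity_on_hand are hypothetical, sum is assumed to be an accepted aggregation value, and placing agg_function: inside the legacy expression: wrapper is a guess at how older packages looked.

```yaml
# Legacy authoring (rejected when package.schema_strict: true)
measures:
  inventory_on_hand:             # hypothetical measure key
    kind: aggregate
    expression:                  # legacy wrapper
      expr: quantity_on_hand     # hypothetical column
      agg_function: sum          # legacy field name
    accumulation: stock          # legacy scalar form
    snapshot_policy: end_of_period
---
# v1 authoring
measures:
  inventory_on_hand:
    kind: aggregate
    expr: quantity_on_hand       # un-nested from expression:
    default_agg: sum             # renamed from agg_function:
    accumulation: { kind: stock, snapshot: end_of_period }
```

Only the renamed fields are shown; other fields on the measure are unchanged by these rules.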

Required additions

  • accumulation always object form; kind must be in {flow, stock, event, population}.
  • Every authored metric must declare value_type: explicitly. The metric output type may differ from the underlying measure.
  • At most one times: entry per model may have default: true.
  • No name on a model (column, dimension, measure, entity binding) may appear in any graph entity's disallowed_names:.

Dropped (no replacement)

  • auto_publish: on measures — auto-publish is gone. Author explicit metrics.
  • package.default_account_id — vestigial, no consumer.
  • dimension.preferred_filter_ops — metadata-only, no planner gating.
  • measure.clock_variants, comparison_peers, preferred_companion_metrics — metadata-only on measures. comparison_family and comparison_mode stay (load-bearing). preferred_companion_metrics is allowed on metrics as advisory governance metadata; companion-metric relationships are too volatile to lock in at the measure layer.
  • topics: on any object — no validation, no scaling pattern.
  • policy.kind: plan_constraint — runtime no-op. The four real kinds are package_release, object_visibility, object_access, protected_object.

Warnings (advisory, not blocking): as: used where the resulting ID matches the auto-computed one (the override is redundant).

What schema_strict does and does not catch

schema_strict: true rejects the legacy authoring forms listed above and unknown values for typed fields (e.g. kind: catagorical is caught with a list of valid kinds). It may not reject unknown top-level field names on every object — a typo such as dimentions: on a model: block can be silently dropped along with the entire mistyped block. Always cross-check authoring outcomes with the check command and confirm the catalog contains every object you authored before trusting a parse-clean result.

Authoring contract version

Every package must declare schema_version: 1 at the top of package.yml. Any other value raises INVALID_CONFIG at load time.

schema_version is a stamp, not a feature toggle. The number bumps on a breaking authoring change — not on additive features. The runtime no longer has a version-dispatch path; 1 is the only contract.

Connection-mode decision

Every warehouse except duckdb declares a package.connection block whose kind selects the connector. For warehouse: snowflake, kind selects between two paths.

Use case | Pick | Why
Local development, snow CLI configured | kind: snowflake_cli | No env vars needed. The runtime shells out to snow connection test and reuses the CLI's named profile.
Production / container deployments | kind: snowflake_native | Pulls account, user, and password from environment variables (or files via password_file_env). Supports query_tag for attribution. Requires the connector extra: uv sync --extra snowflake-native.

Both kinds require name. snowflake_cli reads it as the CLI profile name; snowflake_native reads it as a logical label surfaced in explain and operational metadata.

Native connection options

  • account_env · user_env · password_env — env var names holding the credential. Literal credentials are rejected.
  • password_file_env — env var holding a file path; the file's content is the password.
  • warehouse · role · database · schema — per-session defaults applied at connection time.
  • query_tag — attached to every query; surfaces in Snowflake query history.
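
A minimal sketch of the two Snowflake paths, using only the option names listed above. The profile name, logical label, environment variable names, and session defaults are all placeholders chosen for illustration; substitute the values from your own deployment.

```yaml
# Local development: reuse a named profile from the snow CLI.
package:
  warehouse: snowflake
  connection:
    kind: snowflake_cli
    name: dev_profile                     # placeholder CLI profile name
---
# Production / containers: credentials arrive through environment variables.
package:
  warehouse: snowflake
  connection:
    kind: snowflake_native
    name: shop_prod                       # placeholder logical label
    account_env: SR_SNOWFLAKE_ACCOUNT     # placeholder env var names
    user_env: SR_SNOWFLAKE_USER
    password_env: SR_SNOWFLAKE_PASSWORD   # or password_file_env pointing at a file path
    warehouse: ANALYTICS_WH               # placeholder session defaults
    role: ANALYST
    database: SHOP
    schema: PUBLIC
    query_tag: semantic-rails             # placeholder tag value
```

In both documents the connection block names where credentials live rather than containing them, which is what allows literal credentials to be rejected at load time.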

Other warehouses

The remaining warehouses each expose a single native kind: postgres_native, bigquery_native, databricks_native, motherduck_native, ducklake_native, athena_native, clickhouse_native. The same secrets convention applies everywhere. Credentials are never literals; they arrive through env-var indirection (*_env names) or file paths (*_file). Non-secret locators such as host or database may be literal or env-indirected.

package.yml: Postgres connection
package:
  warehouse: postgres
  connection:
    kind: postgres_native
    host_env: SR_POSTGRES_HOST
    port: 5432
    database: sr_jaffle
    schema: public
    user_env: SR_POSTGRES_USER
    password_env: SR_POSTGRES_PASSWORD

The full option list for each warehouse lives in the per-warehouse *_CONNECTION_OPTIONS tuples in semantic_rails/dialects.py, with canonical env-var names in .env.example at the repo root. Redshift is coming next; until then see docs/ADDING_A_DIALECT.md.

Error code → root cause

Every error code raised at load or plan time maps to a small number of authoring fields. Use this table when an error appears in validate-config, compile, or explain. The full inventory and request / response shapes live in the API reference.

Code | Meaning | Typical authoring fix
INVALID_CONFIG | YAML failed authoring-contract validation | Check schema_version, package.warehouse, and package.seed / package.connection.
STRICT_LEGACY_FORM_REJECTED | Strict mode rejected a legacy authoring form | See strict-mode rules for the full table of rejected forms and replacements.
DISALLOWED_NAME | Column / dimension / measure used a name in graph.entities.<x>.disallowed_names | Use the canonical column from graph.entities.<x>.key or declare an expr: rename in model.entities:.
INVALID_ACCUMULATION_KIND | accumulation.kind not in {flow, stock, event, population} | Use one of the four canonical values.
METRIC_VALUE_TYPE_REQUIRED | Authored metric is missing value_type: | Declare value_type: explicitly. The metric's output type may differ from the underlying measure.
MULTIPLE_DEFAULT_TIME | More than one times: entry on a model has default: true | Pick one default per model.
OBJECT_NOT_FOUND | Referenced ID does not exist | Verify the auto-derived form (auto-derivations).
INVALID_EXPRESSION_AST | Expression node missing required fields or unknown kind | Cross-reference the expression matrix.
AMBIGUOUS_ALIAS | Human token resolves to multiple objects | Tighten graph.entities.<key>.synonyms or use stable IDs.
AMBIGUOUS_PATH · PATH_NOT_FOUND | Join path ambiguous or missing | Pin path_preference / query.path_policy; add an FK in model.entities:.
FANOUT_UNSAFE · ROLLUP_UNSAFE | Join or rollup smears values | Check cardinality, safety, and per-direction rollup_safe.
UNSUPPORTED_AGGREGATION | Aggregation not in the measure's allowed list | Adjust disallowed_aggregations or pick another.
INVALID_TEMPORAL_ROLE · INCOMPATIBLE_TEMPORAL_ROLE | Time grain or role rejected | Check times.<key>.supported_grains and metric time:.
MEASURE_VALIDITY_BOUNDARY | Query reaches outside validity window | Adjust the query or revise validity_windows / cross_window_policy.
MIXED_GRAIN_INVALID · REWRITE_NOT_SUPPORTED | Mixed grains cannot be safely combined | Split the query, align time grains, or expose pre-aggregated measures.
CUMULATIVE_TIME_FILTER_UNSUPPORTED | Time filter inside a cumulative query | Use period_to_date or move filter to partition_by.
CONVERSION_* (entity / window / matching_mode required, not supported) | Conversion metric malformed | See conversion metric kind: entity, window: {unit, value}, matching_mode all required.
PREDICATE_* · INVALID_METRIC_PREDICATE | metric_predicate malformed or context-incompatible | See the metric_predicate row + alignment matrix in expression AST. Package context requires scope_mode.
DUPLICATE_OUTPUT_ALIAS · INVALID_ORDER_BY | Query select / order-by structural issue | Rename with as: ...; reference a select alias or grouped dimension.
NO_VALID_VALUES_SOURCE | valid-values on a dimension without a domain | Add dimensions.<key>.domain.
POLICY_DENIED · OUT_OF_SCOPE | Policy rejected request, or request outside semantic-query planning | Check policies.yml + audience headers. The runtime’s scope classifier runs inline on every discover / plan call and returns out_of_scope / low_relevance blocks with closest matches when applicable.
INVALID_QUERY · QUERY_EXECUTION_ERROR | Query structurally invalid or warehouse rejected SQL | Confirm against the planner contract; read explain for warehouse error.
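
To make one row of the table concrete, here is a hedged sketch of a model fragment that should raise MULTIPLE_DEFAULT_TIME under the rule that at most one times: entry per model may be the default. The shipped_at entry and its column are hypothetical additions to the orders model used elsewhere on this page.

```yaml
# models/orders.yml (fragment): two entries claim default: true
model:
  id: orders
  times:
    ordered_at:
      column: ordered_at
      kind: timestamp
      class: event_time
      default: true
    shipped_at:                # hypothetical second time entry
      column: shipped_at       # hypothetical column
      kind: timestamp
      class: event_time
      default: true            # remove this line to clear the error
```

Dropping the second default: true leaves ordered_at as the model's default time, which measures on the model then pick up.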
Smallest valid package

One entity, one model with one dimension, one time, and one measure, plus the single metric the loader requires; no segment, no policy. Extend section-by-section by referring back to the relevant component above.

shop_min — four files
# package.yml
schema_version: 1
package:
  id: shop_min
  namespace: shop_min
  warehouse: duckdb
  default_db: data/shop_min.duckdb
  seed: { kind: sql_script, source: data/seed.sql }
  schema_strict: true

# graph.yml
graph:
  entities:
    order:
      label: Order
      key: [order_id]                 # list form is required
      model: orders                   # which model declares this entity as primary

# models/orders.yml
model:
  id: orders                          # required; the mapping key alone is not enough
  label: Orders
  relation: shop_min_order
  # grain: is derived from the primary entity's key — do NOT author it
  # alongside `entities:` (strict mode rejects the combination).
  entities:
    order: {}                         # primary; grain comes from graph's order.key
  times:
    ordered_at:
      column: ordered_at
      kind: timestamp
      class: event_time
      default: true
  dimensions:
    status: { kind: categorical }
  measures:
    order_count:
      kind: entity_count
      entity_key: order
      accumulation: { kind: event }
      value_type: count

# metrics/order_count.yml — at least one metric recipe is required
metrics:
  order_count:
    label: Order Count
    description: Count of orders. Codified for stable reference.
    kind: aggregate
    measure: order_count
    value_type: count
Four loader rules that older drafts missed
  • model.id: — required on every model. The mapping key alone is not sufficient.
  • graph.entities.<x>.key: — list form ([order_id]), even for a single column.
  • graph.entities.<x>.model: — binds the entity to the model that declares it as primary.
  • Do not author model.grain: when entities: is present — the loader derives grain from the primary entity's key. Authoring both is rejected.

uv run semantic-rails parse-config --path /path/to/shop_min returns ok: true; the catalog exposes measure.shop_min.order_count and metric.shop_min.order_count. A package must declare at least one metric recipe to load — the metrics/ file above is not optional.

This page documents only what an author types under the v1 contract. Internal-only fields (measure_class, computed catalog fields, runtime decoration) live in semantic_rails/schema.py and are out of scope here.

Next: pair this reference with the narrative Package Authoring guide, Query Planner for how authored objects flow into plans, or the API Reference for the wire shape on every route.