Skip to main content

Extending dbt Models

Step-by-step guide to adding a new Gold model to Konsolidat.

Adding a Gold Model

1. Create the SQL File

Create dbt_project/models/gold/gold_your_model.sql:

{{
config(
materialized='table',
tags=['gold'],
**epm_config(order_by='(data_area_id, fiscal_year, fiscal_period, main_account)')
)
}}

select
tb.data_area_id,
tb.fiscal_year,
tb.fiscal_period,
tb.main_account,
{{ dim_select('tb.') }},
{{ measure_passthrough('tb.') }}
from {{ ref('gold_trial_balance') }} as tb
where tb.fiscal_period between 1 and 12

Key points:

  • Use epm_config() for ClickHouse engine settings
  • Use dim_select() / dim_group_by() for dimension columns
  • Use measure_passthrough() for standard measures
  • Reference upstream models with {{ ref('model_name') }}
Dimensions and measures are registry-driven

dim_select(), dim_group_by(), measure_select() and the other helper macros read from the dimensions and base_measures vars in dbt_project.yml. Those vars are generated from the Dimension and Measure doctypes in Frappe (konsol.dbt_config.regenerate_vars()) — Frappe owns that section of dbt_project.yml. To add or change a dimension or measure, edit the doctype and regenerate; do not hand-edit the vars block.

2. Add YAML Documentation

Add an entry to dbt_project/models/gold/_gold__models.yml:

- name: gold_your_model
description: "What this model does"
columns:
- name: data_area_id
description: "Legal entity identifier"
tests:
- not_null
- name: your_key_column
description: "Description of this column"

3. Add Tests

Create dbt_project/tests/assert_your_condition.sql:

-- Describe what this test checks
select *
from {{ ref('gold_your_model') }}
where some_condition_that_should_never_be_true

dbt tests return rows that violate the assertion. Zero rows = pass.

Use {{ config(severity='warn') }} for non-blocking tests.

4. Build and Test

dbt run --select gold_your_model # Build just this model
dbt test --select gold_your_model # Run tests
dbt build --select gold_your_model # Both in one command

Adding a Seed-Driven Model

If your model needs reference data:

1. Create the Seed CSV

Add dbt_project/seeds/your_reference_data.csv:

column_a,column_b,column_c
value1,value2,value3

2. Configure Column Types (Optional)

In dbt_project.yml, add type overrides:

seeds:
open_epm:
your_reference_data:
+column_types:
column_a: String
column_b: Decimal(18,2)

3. Reference in Your Model

from {{ ref('your_reference_data') }} as ref_data

4. Load the Seed

dbt seed --select your_reference_data

Patterns to Follow

Aggregation Model

select
data_area_id,
fiscal_year,
fiscal_quarter,
main_account,
{{ dim_select() }},
sum(period_net_amount) as quarter_net_amount
from {{ ref('gold_trial_balance') }} as tb
inner join {{ ref('gold_period_hierarchy') }} as ph
on tb.fiscal_period = ph.fiscal_period
group by
data_area_id,
fiscal_year,
fiscal_quarter,
main_account,
{{ dim_group_by() }}

Join Model (e.g., Variance)

select
{{ dim_coalesce('a', 'b') }},
a.amount as actual_amount,
b.amount as budget_amount,
a.amount - b.amount as variance_abs
from {{ ref('gold_trial_balance') }} as a
full outer join {{ ref('gold_spread_budget') }} as b
on a.data_area_id = b.data_area_id
and a.fiscal_year = b.fiscal_year
and a.fiscal_period = b.fiscal_period
and a.main_account = b.main_account
{{ dim_join_on('a', 'b') }}

Window Function Model (e.g., YTD)

select
*,
sum(period_net_amount) over (
partition by data_area_id, fiscal_year, main_account,
{{ dim_partition_by() }}
order by fiscal_period
) as ytd_net_amount
from {{ ref('gold_trial_balance') }}

Consolidation Layer (Empty Dims)

For non-entity layers (IC, CTA), use dim_empty_strings():

select
consolidation_group,
'ic_elimination' as adjustment_type,
'' as data_area_id,
fiscal_year,
fiscal_period,
main_account,
{{ dim_empty_strings() }},
elimination_amount as amount
from {{ ref('gold_ic_eliminations') }}

Making a Model API-Queryable

To expose a new model through the =EPM() function, register it through the data-model registry rather than editing hardcoded lists in api.py:

  1. Create a Fact Table doc in Frappe pointing at your ClickHouse table (clickhouse_table), so _batch_query_clickhouse() can resolve it via _get_fact() / _get_fact_by_scenario().
  2. Measures are validated against the registry (and re-checked with a safe-identifier regex); the table name is likewise validated with _SAFE_TABLE_NAME before it reaches the FROM clause.
  3. Once the Fact Table is published, _batch_query_clickhouse() resolves and queries it automatically — no code change in api.py is required.

See Extending the API for details.

Next Steps