Redshift configurations
Incremental materialization strategies
In dbt-redshift, the following incremental materialization strategies are supported:
- append(default when- unique_keyis not defined)
- merge
- delete+insert(default when- unique_keyis defined)
- microbatch
All of these strategies are inherited from dbt-postgres.
Performance optimizations
Using sortkey and distkey
Tables in Amazon Redshift have two powerful optimizations to improve query performance: distkeys and sortkeys. Supplying these values as model-level configurations apply the corresponding settings in the generated CREATE TABLE DDL. Note that these settings will have no effect on models set to view or ephemeral models.
- distcan have a setting of- all,- even,- auto, or the name of a key.
- sortaccepts a list of sort keys, for example:- ['reporting_day', 'category']. dbt will build the sort key in the same order the fields are supplied.
- sort_typecan have a setting of- interleavedor- compound. if no setting is specified, sort_type defaults to- compound.
When working with sort keys, it's highly recommended you follow Redshift's best practices on sort key effectiveness and cardinality.
Sort and dist keys should be added to the {{ config(...) }} block in model .sql files, eg:
-- Example with one sort key
{{ config(materialized='table', sort='reporting_day', dist='unique_id') }}
select ...
-- Example with multiple sort keys
{{ config(materialized='table', sort=['category', 'region', 'reporting_day'], dist='received_at') }}
select ...
-- Example with interleaved sort keys
{{ config(materialized='table',
          sort_type='interleaved'
          sort=['category', 'region', 'reporting_day'],
          dist='unique_id')
}}
select ...
For more information on distkeys and sortkeys, view Amazon's docs:
- AWS Documentation » Amazon Redshift » Database Developer Guide » Designing Tables » Choosing a Data Distribution Style
- AWS Documentation » Amazon Redshift » Database Developer Guide » Designing Tables » Choosing Sort Keys
Late binding views
Redshift supports views unbound from their dependencies, or late binding views. This DDL option "unbinds" a view from the data it selects from. In practice, this means that if upstream views or tables are dropped with a cascade qualifier, the late-binding view does not get dropped as well.
Using late-binding views in a production deployment of dbt can vastly improve the availability of data in the warehouse, especially for models that are materialized as late-binding views and are queried by end-users, since they won’t be dropped when upstream models are updated. Additionally, late binding views can be used with external tables via Redshift Spectrum.
To materialize a dbt model as a late binding view, use the bind: false configuration option:
{{ config(materialized='view', bind=False) }}
select *
from source.data
To make all views late-binding, configure your dbt_project.yml file like this:
models:
  +bind: false # Materialize all views as late-binding
  project_name:
    ....
Materialized views
The Redshift adapter supports materialized views with the following configuration parameters:
| Parameter | Type | Required | Default | Change Monitoring Support | 
|---|---|---|---|---|
| on_configuration_change | <string> | no | apply | n/a | 
| dist | <string> | no | even | drop/create | 
| sort | [<string>] | no | none | drop/create | 
| sort_type | <string> | no | autoif nosortcompoundifsort | drop/create | 
| auto_refresh | <boolean> | no | false | alter | 
| backup | <string> | no | true | n/a | 
- Project file
- Property file
- Config block
models:
  <resource-path>:
    +materialized: materialized_view
    +on_configuration_change: apply | continue | fail
    +dist: all | auto | even | <field-name>
    +sort: <field-name> | [<field-name>]
    +sort_type: auto | compound | interleaved
    +auto_refresh: true | false
    +backup: true | false
version: 2
models:
  - name: [<model-name>]
    config:
      materialized: materialized_view
      on_configuration_change: apply | continue | fail
      dist: all | auto | even | <field-name>
      sort: <field-name> | [<field-name>]
      sort_type: auto | compound | interleaved
      auto_refresh: true | false
      backup: true | false
{{ config(
    materialized="materialized_view",
    on_configuration_change="apply" | "continue" | "fail",
    dist="all" | "auto" | "even" | "<field-name>",
    sort=["<field-name>"],
    sort_type="auto" | "compound" | "interleaved",
    auto_refresh=true | false,
    backup=true | false,
) }}
Many of these parameters correspond to their table counterparts and have been linked above. The parameters unique to materialized views are the auto-refresh and backup functionality, which are covered below.
Learn more about these parameters in Redshift's docs.
Auto-refresh
| Parameter | Type | Required | Default | Change Monitoring Support | 
|---|---|---|---|---|
| auto_refresh | <boolean> | no | false | alter | 
Redshift supports automatic refresh configuration for materialized views.
By default, a materialized view does not automatically refresh.
dbt monitors this parameter for changes and applies them using an ALTER statement.
Learn more information about the parameters in the Redshift docs.
Backup
| Parameter | Type | Required | Default | Change Monitoring Support | 
|---|---|---|---|---|
| backup | <boolean> | no | true | n/a | 
Redshift supports backup configuration of clusters at the object level.
This parameter identifies if the materialized view should be backed up as part of the cluster snapshot.
By default, a materialized view will be backed up during a cluster snapshot.
dbt cannot monitor this parameter as it is not queryable within Redshift.
If the value changes, the materialized view will need to go through a --full-refresh to set it.
Learn more about these parameters in Redshift's docs.
Limitations
As with most data platforms, there are limitations associated with materialized views. Some worth noting include:
- Materialized views cannot reference views, temporary tables, user-defined functions, or late-binding tables.
- Auto-refresh cannot be used if the materialized view references mutable functions, external schemas, or another materialized view.
Find more information about materialized view limitations in Redshift's docs.