dbt-core, ClickHouse and Dagster
This post briefly covers the usage of dbt-core and its integration with Dagster.
Introduction
dbt Core is an open-source command-line tool that enables data teams to transform data using analytics engineering best practices.
Dagster is an orchestration platform for the development, production, and observation of data assets.
ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time using SQL queries.
Minimum Software Requirements
Installations
Install the dbt ClickHouse plugin.
pip install dbt-clickhouse
Install the dagster-dbt library.
pip install dagster-dbt dagster-webserver
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Initialize the dbt project.
dbt init dbt_data_practitioner
cd dbt_data_practitioner
touch profiles.yml
dbt_data_practitioner:
  target: dev
  outputs:
    dev:
      type: clickhouse
      schema: sakila_db
      host: localhost
      port: 8123
      user: default
      password: root
      secure: False
dbt debug
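If dbt debug reports a connection failure, it can help to check separately whether the ClickHouse HTTP interface is reachable at all. The sketch below is only an illustration, not part of dbt: it assumes the host, port and secure values from the profile above and uses ClickHouse's /ping health-check endpoint. The function names are made up for this example.

```python
from urllib.error import URLError
from urllib.request import urlopen


def build_ping_url(host: str, port: int, secure: bool) -> str:
    """Build the URL of ClickHouse's HTTP health-check endpoint."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{host}:{port}/ping"


def clickhouse_is_reachable(
    host: str = "localhost",
    port: int = 8123,
    secure: bool = False,
    timeout: float = 2.0,
) -> bool:
    """Return True if the HTTP interface answers the ping request with 200."""
    try:
        with urlopen(build_ping_url(host, port, secure), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout, TLS error, and so on.
        return False
```

Calling clickhouse_is_reachable() with no arguments checks the localhost:8123 endpoint used in the dev target. A False result points at the database or the network rather than at the dbt profile.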
dbt docs
dbt docs generate
dbt docs serve
models
cd models
mkdir sakila_db
cd sakila_db
touch actor_film_actor_join.sql
touch point_of_interest_1.sql
Delete the example folder that dbt init creates inside the models folder.
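The touch commands above only create empty files; each one still needs a SELECT statement before dbt can build it. As a rough sketch of what a model file might hold, the script below writes a hypothetical body for actor_film_actor_join.sql. The SQL is an assumption based on the standard Sakila schema (actor and film_actor tables joined on actor_id, loaded into the sakila_db database), not the query from the actual project.

```python
from pathlib import Path

# Hypothetical model body: assumes the Sakila tables `actor` and `film_actor`
# already exist in the ClickHouse database `sakila_db`.
ACTOR_FILM_ACTOR_JOIN_SQL = """\
{{ config(materialized='view') }}

select
    a.actor_id,
    a.first_name,
    a.last_name,
    fa.film_id
from sakila_db.actor as a
inner join sakila_db.film_actor as fa
    on a.actor_id = fa.actor_id
"""


def write_model(project_dir: str, name: str, sql: str) -> Path:
    """Write <project_dir>/models/sakila_db/<name>.sql and return its path."""
    model_dir = Path(project_dir) / "models" / "sakila_db"
    model_dir.mkdir(parents=True, exist_ok=True)
    path = model_dir / f"{name}.sql"
    path.write_text(sql)
    return path
```

Running write_model(".", "actor_film_actor_join", ACTOR_FILM_ACTOR_JOIN_SQL) from the dbt project root would replace the empty file with this view definition.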
cd ..
cd ..
dbt build
dbt build
The tables and views defined by the models are now created in ClickHouse.
dbt docs generate
dbt docs serve
Lineage Graph and other details.
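The lineage shown in the docs site comes from the target/manifest.json artifact that dbt writes. A minimal sketch, assuming the usual manifest layout (a top-level nodes mapping whose entries carry resource_type and depends_on.nodes), lists each model with its upstream dependencies. The helper names are invented for this example.

```python
import json
from pathlib import Path


def load_manifest(project_dir: str = ".") -> dict:
    """Read target/manifest.json from a dbt project directory."""
    return json.loads(Path(project_dir, "target", "manifest.json").read_text())


def model_lineage(manifest: dict) -> dict[str, list[str]]:
    """Map each model's unique_id to the sorted unique_ids it depends on."""
    lineage = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        # Skip tests, seeds, snapshots and other non-model nodes.
        if node.get("resource_type") == "model":
            upstream = node.get("depends_on", {}).get("nodes", [])
            lineage[unique_id] = sorted(upstream)
    return lineage
```

From the project root, model_lineage(load_manifest()) gives a plain dictionary that can be printed or diffed between runs, which is handy when the browser view is not available.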
Dagster
cd dbt_data_practitioner
dagster-dbt project scaffold --project-name dagster_data_practitioner
cd dagster_data_practitioner
DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1 dagster dev
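The VAR=1 command form above is POSIX shell syntax and does not work in every shell (Windows cmd, for example). A small Python launcher can set the variable explicitly instead. This is a sketch under the assumption that the scaffolded project reads DAGSTER_DBT_PARSE_PROJECT_ON_LOAD to decide whether to regenerate the dbt manifest when its code loads; the function names are hypothetical.

```python
import os
import subprocess


def dagster_dev_env(base_env=None) -> dict:
    """Return a copy of the environment with the dbt parse-on-load flag set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DAGSTER_DBT_PARSE_PROJECT_ON_LOAD"] = "1"
    return env


def run_dagster_dev(project_dir: str = "dagster_data_practitioner") -> int:
    """Start `dagster dev` inside the scaffolded project and return its exit code."""
    result = subprocess.run(["dagster", "dev"], cwd=project_dir, env=dagster_dev_env())
    return result.returncode
```

Calling run_dagster_dev() from the dbt project directory mirrors the two shell commands above; it blocks until the Dagster process is stopped.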
To access the Dagster UI from your browser, navigate to http://127.0.0.1:3000
Dagster UI
Click on the black "Materialize all" button.
Sample Project
DataPractitioner is the sample project I've used to illustrate the usage of the aforementioned tools.
Noticed an issue with this Sample Project? Open an issue or a PR on GitHub!