After this updates run the dbt debug command. To make sure the connection is working properly
dbt debug 00:31:58 Running with dbt=1.7.6 00:31:58 dbt version: 1.7.6 00:31:58 python version: 3.11.6 00:31:58 python path: /home/rramos/Development/local/dbt/bin/python 00:31:58 os info: Linux-6.6.10-zen1-1-zen-x86_64-with-glibc2.38 00:31:58 Using profiles dir at /home/rramos/.dbt 00:31:58 Using profiles.yml file at /home/rramos/.dbt/profiles.yml 00:31:58 Using dbt_project.yml file at /home/rramos/Development/local/dbt/imdb/dbt_project.yml 00:31:58 adapter type: clickhouse 00:31:58 adapter version: 1.7.1 00:31:58 Configuration: 00:31:58 profiles.yml file [OK found and valid] 00:31:58 dbt_project.yml file [OK found and valid] 00:31:58 Required dependencies: 00:31:58 - git [OK found] ... 00:31:58 Registered adapter: clickhouse=1.7.1 00:31:58 Connection test: [OK connection ok]
If the connection test passed properly, one just need to create the model via dbt.
dbt run
And you should have a similar output
dbt run 00:38:13 Running with dbt=1.7.6 00:38:13 Registered adapter: clickhouse=1.7.1 00:38:13 Unable to do partial parsing because a project config has changed 00:38:15 Found 1 model, 6 sources, 0 exposures, 0 metrics, 421 macros, 0 groups, 0 semantic models 00:38:15 00:38:15 Concurrency: 1 threads (target='dev') 00:38:15 00:38:15 1 of 1 START sql view model `imdb`.`actor_summary` ............................. [RUN] 00:38:15 1 of 1 OK created sql view model `imdb`.`actor_summary` ........................ [OK in 0.17s] 00:38:15 00:38:15 Finished running 1 view model in 0 hours 0 minutes and 0.27 seconds (0.27s). 00:38:15 00:38:15 Completed successfully 00:38:15 00:38:15 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
Test query the model
SELECT* FROM imdb_dbt.actor_summary WHERE num_movies >5 ORDERBY avg_rank DESC
Conclusion
In this article I’ve went through the process of setup a Clickhouse database and setup dbt to setup the models with IMDB test data for actors, directors, movies, etc.
This two systems work like a charm together. Clickstream shows great performance for analytical queries, and dbt compiles and runs your analytics code against your data platform, enabling you and your team to collaborate on a single source of truth for metrics, insights, and business definitions.
Would like to extend this exercise by incorporating github actions related with dbt test actions before promoting to production.