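dbt seed makes it easy to load data, in particular small amounts of static data and test data.
The base models are set to ephemeral, which is also what the official best practices recommend.
The GitLab data this project depends on can be set up by following https://github.com/rongfengliang/graphql-engine-gitlab
Example project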
- Initialize
dbt init gitlab-data
- Configure the project
# Name your package! Package names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'gitlab'
version: '1.0'
# This setting configures which "profile" dbt uses for this project. Profiles contain
# database connection information, and should be configured in the ~/.dbt/profiles.yml file
profile: 'default'
# These configurations specify where dbt should look for different types of files.
# The `source-paths` config, for example, states that source models can be found
# in the "models/" directory. You probably won't need to change these!
source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"] # seed CSV files go here
macro-paths: ["macros"]
target-path: "target" # directory which will store compiled SQL files
clean-targets: # directories to be removed by `dbt clean`
    - "target"
    - "dbt_modules"
# You can define configurations for models in the `source-paths` directory here.
# Using these configurations, you can enable or disable models, change how they
# are materialized, and more!
# In this example config, we tell dbt to build all models in the example/ directory
# as views (the default). Try changing `view` to `table` below, then re-running dbt
models:
  gitlab:
    gitlab:
      base:
        materialized: ephemeral # ephemeral is recommended for base models
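Because `data-paths` points at `data`, any CSV file placed in that directory is picked up by `dbt seed`. The following is a minimal sketch of creating such a file. The file name `project_visibility.csv` and its columns are hypothetical and not part of the original project; the values assume GitLab's usual numeric visibility levels.

```shell
# Hypothetical seed file: the name and columns are illustrative only.
mkdir -p data
cat > data/project_visibility.csv <<'EOF'
visibility_level,visibility_name
0,private
10,internal
20,public
EOF
```

After `dbt seed`, a file like this should be available as a table named after the file, and a model could reference it with `{{ ref('project_visibility') }}`.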
- Add models
models/gitlab/base/gitlab_projectinfo.sql:
select * from projects
models/gitlab/transform/gitlab_project_counts.sql:
select * from {{ref('gitlab_projectinfo')}}
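The transform model above only passes rows through. As a rough sketch of what a counting model could look like, the snippet below writes a hypothetical model file `gitlab_project_visibility_counts.sql` (not part of the original project). It assumes the `projects` table has a `visibility_level` column. Since the base model is ephemeral, dbt should inline `gitlab_projectinfo` as a CTE in the compiled SQL rather than creating a database object for it.

```shell
# Hypothetical model: assumes projects has a visibility_level column.
mkdir -p models/gitlab/transform
cat > models/gitlab/transform/gitlab_project_visibility_counts.sql <<'EOF'
-- Count projects per visibility level, reading from the ephemeral base model.
select
    visibility_level,
    count(*) as project_count
from {{ ref('gitlab_projectinfo') }}
group by visibility_level
EOF
```

The quoted `'EOF'` keeps the shell from touching the Jinja braces, so the file contains the `ref()` call verbatim.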
Profile configuration
~/.dbt/profiles.yml
default:
  target: dev
  outputs:
    dev:
      type: postgres
      host: 127.0.0.1
      user: postgres
      pass: password
      port: 5432
      dbname: gitlabhq_production
      schema: public
      threads: 3
pg:
  target: dev
  outputs:
    dev:
      type: postgres
      host: 127.0.0.1
      user: postgres
      pass: password
      port: 5433
      dbname: gitlabhq_production
      schema: public
      threads: 3
Run, test, and docs
- Run
dbt run && dbt seed --show && dbt docs generate && dbt docs serve
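The command above does not call `dbt test`, and it loads seeds after the models are built. That ordering works here because no model references a seed, but it would fail once one does. A small sketch of a wrapper that seeds first, then runs and tests, is shown below. The `DBT` variable and the `build_gitlab` function name are hypothetical helpers introduced only for this example.

```shell
#!/bin/sh
# DBT is a hypothetical override for the dbt executable (defaults to "dbt").
DBT="${DBT:-dbt}"

# Load seeds first so models that ref() a seed can build, then run and test.
# Each step only starts if the previous one succeeded.
build_gitlab() {
    "$DBT" seed --show &&
    "$DBT" run &&
    "$DBT" test
}
```

Calling `build_gitlab` from the project directory runs the three steps in order; `dbt docs generate` and `dbt docs serve` can follow separately, since `serve` keeps running until interrupted.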
- Result
References
https://github.com/rongfengliang/graphql-engine-gitlab
https://docs.getdbt.com/docs/configuring-models
https://docs.getdbt.com/docs/best-practices
https://docs.getdbt.com/reference#seed