lakeFS hooks are a fairly flexible capability. We can build data processing on top of them, and they also help make up for lakeFS's lack of
S3-style event handling.
Environment setup
- docker-compose
version: '3'
services:
  lakefs:
    image: "treeverse/lakefs:${VERSION:-latest}"
    ports:
      - "8000:8000"
    depends_on:
      - "postgres"
    environment:
      - LAKEFS_AUTH_ENCRYPT_SECRET_KEY=${LAKEFS_AUTH_ENCRYPT_SECRET_KEY:-some random secret string}
      - LAKEFS_DATABASE_CONNECTION_STRING=${LAKEFS_DATABASE_CONNECTION_STRING:-postgres://lakefs:lakefs@postgres/postgres?sslmode=disable}
      - LAKEFS_BLOCKSTORE_TYPE=${LAKEFS_BLOCKSTORE_TYPE:-local}
      - LAKEFS_BLOCKSTORE_LOCAL_PATH=${LAKEFS_BLOCKSTORE_LOCAL_PATH:-/home/lakefs}
      - LAKEFS_GATEWAYS_S3_DOMAIN_NAME=${LAKEFS_GATEWAYS_S3_DOMAIN_NAME:-s3.local.lakefs.io:8000}
      - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:-}
      - LAKEFS_BLOCKSTORE_S3_CREDENTIALS_ACCESS_SECRET_KEY=${AWS_SECRET_ACCESS_KEY:-}
      - LAKEFS_LOGGING_LEVEL=${LAKEFS_LOGGING_LEVEL:-INFO}
      - LAKEFS_STATS_ENABLED
      - LAKEFS_BLOCKSTORE_S3_ENDPOINT
      - LAKEFS_BLOCKSTORE_S3_FORCE_PATH_STYLE
      - LAKEFS_COMMITTED_LOCAL_CACHE_DIR=${LAKEFS_COMMITTED_LOCAL_CACHE_DIR:-/home/lakefs/.local_tier}
    entrypoint:
      [
        "/app/wait-for",
        "postgres:5432",
        "--",
        "/app/lakefs",
        "run"
      ]
  postgres:
    image: "postgres:${PG_VERSION:-11}"
    command: "-c log_min_messages=FATAL"
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: lakefs
      POSTGRES_PASSWORD: lakefs
    logging:
      driver: none
  dremio:
    build: ./
    ports:
      - "9047:9047"
      - "31010:31010"
      - "8849:8849"
  s3:
    image: minio/minio
    environment:
      - "MINIO_ACCESS_KEY=minio"
      - "MINIO_SECRET_KEY=minio123"
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"
      - "9001:9001"
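- Initialization & preparation
This part is fairly simple; refer to the official documentation or my earlier posts.
Configuring hooks
- Create a folder (inside the lakeFS repo)
lakeFS hooks are prefix-based, and their configuration must be placed in a specific directory (_lakefs_actions)
Result
- Create the hooks
A hook definition is just a YAML file, and the official code also ships quite a few demos.
The hooks below run at the pre-merge stage, and only for one specific branch, main (the same file also registers a pre-commit event).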
name: Good files check
description: set of checks to verify that branch is good
on:
  pre-commit:
  pre-merge:
    branches:
      - main
hooks:
  - id: no_temp
    type: webhook
    description: checking no temporary files found
    properties:
      url: "https://your.domain.io/webhook?notmp=true?t=1za2PbkZK1bd4prMuTDr6BeEQwWYcX2R"
  - id: no_freeze
    type: webhook
    description: check production is not in dev freeze
    properties:
      url: "https://your.domain.io/webhook?nofreeze=true?t=1za2PbkZK1bd4prMuTDr6BeEQwWYcX2R"
Official demo
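The demo URLs point at a placeholder domain, so it may help to see what a receiving endpoint could look like. Below is a minimal sketch, not lakeFS's own code: it assumes lakeFS sends a JSON POST body containing a `branch_id` field and treats any non-2xx response as a failed hook. `FROZEN_BRANCHES`, `decide`, `HookHandler` and `serve` are hypothetical names introduced only for this illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical set of branches that are currently under a dev freeze.
FROZEN_BRANCHES = {"main"}


def decide(query: Dict[str, List[str]], event: dict) -> Tuple[int, str]:
    """Return (HTTP status, message) for a single hook call."""
    # Assumption: the event payload names the target branch as "branch_id".
    branch = event.get("branch_id", "")
    if "nofreeze" in query and branch in FROZEN_BRANCHES:
        return 412, "branch %r is frozen" % branch
    return 200, "ok"


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON body; fall back to an empty event.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length) or b"{}"
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            event = {}
        query = parse_qs(urlparse(self.path).query)
        status, message = decide(query, event)
        body = message.encode()
        # A non-2xx status is what signals failure to the caller.
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, answering hook calls on the given port."""
    HTTPServer(("0.0.0.0", port), HookHandler).serve_forever()
```

Calling `serve(8080)` and pointing the `url` property of a hook at that host would let the `no_freeze` check return a real answer instead of a connection error.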
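- Trigger the hooks
Create a branch, upload some files, then merge it into main.
Note that the run will not succeed, because the webhook URL is unreachable. When a hook fails, lakeFS reports the error directly and the merge does not go through (the operation is atomic).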
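The same branch, upload, commit and merge sequence can also be driven through the lakeFS REST API. The sketch below only builds a description of the calls and sends nothing. The endpoint paths reflect my understanding of the lakeFS OpenAPI and should be checked against your version; `merge_flow`, the repository name and the object path are hypothetical.

```python
from typing import Any, Dict, List

API = "/api/v1"


def merge_flow(repo: str, branch: str, target: str = "main") -> List[Dict[str, Any]]:
    """Describe the REST calls for: create branch, upload, commit, merge."""
    base = "%s/repositories/%s" % (API, repo)
    return [
        # 1. Create the working branch from the target branch.
        {"method": "POST", "path": base + "/branches",
         "json": {"name": branch, "source": target}},
        # 2. Upload one object (assumed to be sent as multipart form data,
        #    so there is no JSON body here).
        {"method": "POST",
         "path": base + "/branches/%s/objects?path=data/sample.csv" % branch,
         "json": None},
        # 3. Commit the upload; pre-commit hooks would fire at this step.
        {"method": "POST", "path": base + "/branches/%s/commits" % branch,
         "json": {"message": "add sample data"}},
        # 4. Merge into the target; pre-merge hooks would fire at this step.
        {"method": "POST", "path": base + "/refs/%s/merge/%s" % (branch, target),
         "json": {}},
    ]


for step in merge_flow("example-repo", "feature-1"):
    print(step["method"], step["path"])
```

With the demo action in place, the last step is the one expected to be rejected while the webhook URL stays unreachable.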
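Notes
lakeFS currently offers two kinds of hooks officially, webhook and airflow (both are quite handy), and it also exposes an API for inspecting hook results.
Used well, hooks can solve a lot of our problems; it is a very powerful feature.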
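As an illustration of the results API mentioned above, here is a minimal sketch that lists action runs for a repository. It assumes the `/api/v1/repositories/{repository}/actions/runs` endpoint, HTTP basic authentication with a lakeFS access key pair, and a JSON response with a `results` list; `runs_url`, `basic_auth` and `list_runs` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def runs_url(base_url: str, repo: str) -> str:
    """Build the URL that lists action runs for one repository."""
    return "%s/api/v1/repositories/%s/actions/runs" % (base_url.rstrip("/"), repo)


def basic_auth(access_key_id: str, secret_access_key: str) -> str:
    """Encode a lakeFS key pair as an HTTP Basic Authorization header value."""
    pair = "%s:%s" % (access_key_id, secret_access_key)
    return "Basic " + base64.b64encode(pair.encode()).decode()


def list_runs(base_url: str, repo: str, access_key_id: str, secret_access_key: str) -> list:
    """Fetch action runs; needs a reachable lakeFS server."""
    request = urllib.request.Request(
        runs_url(base_url, repo),
        headers={"Authorization": basic_auth(access_key_id, secret_access_key)},
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the response wraps the runs in a "results" list.
        return json.load(response).get("results", [])
```

For the compose setup above, `list_runs("http://localhost:8000", "<repo>", "<key id>", "<secret>")` would be the call to try after a failed merge.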