• Singer study notes, part 2: moving data from GitLab to Postgres with Singer


    Singer makes ETL work straightforward. The source data can come from an API, a database, or files.
    Note: this test uses docker-compose to run the database, together with virtualenv and Python 3.5 or later.
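
    To make the tap-to-target pipeline concrete: a Singer tap writes JSON messages to stdout, one per line, and a target reads them from stdin. The sketch below hand-builds the three message types from the Singer specification (SCHEMA, RECORD, STATE). The stream name, fields, and state value are made up for illustration; a real tap such as tap-gitlab emits far richer schemas.

```python
import json


def schema_message(stream, properties, key_properties):
    """Build a SCHEMA message describing the records of one stream."""
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {"type": "object", "properties": properties},
        "key_properties": key_properties,
    }


def record_message(stream, record):
    """Build a RECORD message carrying one row of data."""
    return {"type": "RECORD", "stream": stream, "record": record}


def state_message(value):
    """Build a STATE message (a bookmark the tap can resume from)."""
    return {"type": "STATE", "value": value}


def demo_messages():
    """Return the JSON lines a toy tap would print for a single record."""
    messages = [
        # Illustrative schema only; not the schema tap-gitlab really emits.
        schema_message(
            "projects",
            {"id": {"type": "integer"}, "name": {"type": "string"}},
            ["id"],
        ),
        record_message("projects", {"id": 1, "name": "demoapp"}),
        state_message({"projects": "2010-01-01T00:00:00Z"}),
    ]
    return [json.dumps(message) for message in messages]


if __name__ == "__main__":
    for line in demo_messages():
        print(line)
```

    Because the interface is only lines of JSON on a pipe, any tap can in principle be combined with any target.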

    Environment setup

    • docker-compose file
     
    version: "3"
    services:
      gogs-service:
        image: gogs/gogs
        ports:
          - "10022:22"
          - "10080:3000"
      mysql:
        image: mysql:5.7.16
        ports:
          - 3306:3306
        command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
        environment:
          MYSQL_ROOT_PASSWORD: dalongrong
          MYSQL_DATABASE: gogs
          MYSQL_USER: gogs
          MYSQL_PASSWORD: dalongrong
          TZ: Asia/Shanghai
      postgres:
        image: postgres:9.6.11
        ports:
        - "5432:5432"
        environment:
        - "POSTGRES_PASSWORD=dalong"
     
     
    • Postgres target configuration
      target.json
     
    {
        "host": "localhost",
        "port": 5432,
        "dbname": "postgres",
        "user": "postgres",
        "password": "dalong",
        "schema": "public"
    }
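
    Before running the sync it can help to confirm that the Postgres container is accepting connections (for example after starting the services with docker-compose up -d). The following is a minimal sketch, assuming target.json sits in the current directory; the helper names wait_for_port and wait_for_target are hypothetical. A successful TCP connection only shows that the port is open, not that the credentials are valid.

```python
import json
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again as soon as it succeeds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def wait_for_target(config_path="target.json", timeout=30.0):
    """Read host and port from the target config and wait for that port."""
    with open(config_path) as f:
        config = json.load(f)
    return wait_for_port(config["host"], int(config["port"]), timeout=timeout)
```

    Calling wait_for_target() from the directory that holds target.json returns True once localhost:5432 is reachable.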
     
     
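    • Create the gitlab virtualenv
    virtualenv gitlab
    source ./gitlab/bin/activate
    pip install tap-gitlab
    The run command further down also expects a second virtualenv, ./postgres, that provides the target-postgres executable.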
     
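    • Create an access_token
      Create one on the official GitLab website
    • gitlab tap configuration file (gitlab.json)
      The format is as follows; the real values are withheld for privacy:
     
    {
        "api_url": "https://gitlab.com/api/v4",
        "private_token": "xxxxxxx",
        "groups": "",
        "projects": "",
        "start_date": "2010-01-01T00:00:00Z"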
    }
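
    To avoid pasting the token into a file by hand, the same configuration can be generated. This is a sketch under stated assumptions: the GITLAB_TOKEN environment variable and both helper functions are hypothetical names, while the keys are exactly those shown above. The project path dalongrong/demoapp is taken from the log output in this post.

```python
import json
import os


def build_gitlab_config(token, projects="", groups="",
                        start_date="2010-01-01T00:00:00Z"):
    """Return a tap-gitlab config dict with the keys shown in this post."""
    if not token:
        raise ValueError("a GitLab private token is required")
    return {
        "api_url": "https://gitlab.com/api/v4",
        "private_token": token,
        "groups": groups,
        "projects": projects,
        "start_date": start_date,
    }


def write_gitlab_config(path="gitlab.json", **kwargs):
    """Write the config to path, reading the token from GITLAB_TOKEN (assumed name)."""
    token = os.environ.get("GITLAB_TOKEN", "")
    config = build_gitlab_config(token, **kwargs)
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return config
```

    For example, write_gitlab_config(projects="dalongrong/demoapp") produces a gitlab.json limited to that one project.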
     
     

    Running and results

    • Run
    ./gitlab/bin/tap-gitlab -c gitlab.json | ./postgres/bin/target-postgres -c target.json
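
    A plain shell pipe reports only the exit status of the last command, so a failing tap can go unnoticed. The sketch below runs the same two commands from Python and returns both exit codes. run_pipeline and main are hypothetical helpers; the command lines are the ones used above.

```python
import subprocess
import sys

# The same commands as in the shell pipeline above.
TAP_CMD = ["./gitlab/bin/tap-gitlab", "-c", "gitlab.json"]
TARGET_CMD = ["./postgres/bin/target-postgres", "-c", "target.json"]


def run_pipeline(tap_cmd, target_cmd):
    """Run `tap_cmd | target_cmd` and return (tap_exit_code, target_exit_code)."""
    tap = subprocess.Popen(tap_cmd, stdout=subprocess.PIPE)
    target = subprocess.Popen(target_cmd, stdin=tap.stdout)
    # Close our copy of the pipe so the tap is not kept alive
    # by this process if the target exits early.
    tap.stdout.close()
    target_rc = target.wait()
    tap_rc = tap.wait()
    return tap_rc, target_rc


def main():
    tap_rc, target_rc = run_pipeline(TAP_CMD, TARGET_CMD)
    # Exit non-zero if either side of the pipe failed.
    sys.exit(tap_rc or target_rc)
```

    Calling main() from a script in the project directory behaves like the shell command, except that a tap failure also produces a non-zero exit status.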
     
    • Output
    INFO Starting sync
    INFO GET https://gitlab.com/api/v4/projects/dalongrong%2Fdemoapp?private_token=XXXXXXXXX
    INFO Table 'projects' does not exist. Creating... CREATE TABLE public.projects ("approvals_before_merge" bigint, "archived" boolean, "avatar_url" character varying, "builds_enabled" boolean, "container_registry_enabled" boolean, "created_at" timestamp without time zone, "creator_id" bigint, "default_branch" character varying, "description" character varying, "forks_count" bigint, "http_url_to_repo" character varying, "id" bigint, "issues_enabled" boolean, "last_activity_at" timestamp without time zone, "lfs_enabled" boolean, "merge_requests_enabled" boolean, "name" character varying, "name_with_namespace" character varying, "namespace__id" bigint, "namespace__kind" character varying, "namespace__name" character varying, "namespace__path" character varying, "only_allow_merge_if_all_discussions_are_resolved" boolean, "only_allow_merge_if_build_succeeds" boolean, "open_issues_count" bigint, "owner_id" bigint, "path" character varying, "path_with_namespace" character varying, "public" boolean, "public_builds" boolean, "request_access_enabled" boolean, "shared_runners_enabled" boolean, "shared_with_groups" jsonb, "snippets_enabled" boolean, "ssh_url_to_repo" character varying, "star_count" bigint, "tag_list" jsonb, "visibility_level" bigint, "web_url" character varying, "wiki_enabled" boolean, PRIMARY KEY ("id"))
    INFO Table 'branches' does not exist. Creating... CREATE TABLE public.branches ("commit_id" character varying, "developers_can_merge" boolean, "developers_can_push" boolean, "merged" boolean, "name" character varying, "project_id" bigint, "protected" boolean, PRIMARY KEY ("project_id", "name"))
    INFO Table 'commits' does not exist. Creating... CREATE TABLE public.commits ("allow_failure" boolean, "author_email" character varying, "author_name" character varying, "committer_email" character varying, "committer_name" character varying, "created_at" timestamp without time zone, "id" character varying, "message" character varying, "project_id" bigint, "short_id" character varying, "title" character varying, PRIMARY KEY ("id"))
    INFO Table 'issues' does not exist. Creating... CREATE TABLE public.issues ("assignee_id" bigint, "author_id" bigint, "confidential" boolean, "created_at" timestamp without time zone, "description" character varying, "due_date" character varying, "id" bigint, "iid" bigint, "labels" jsonb, "milestone_id" bigint, "project_id" bigint, "state" character varying, "subscribed" boolean, "title" character varying, "updated_at" timestamp wi
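
    The column names in the CREATE TABLE statements show a pattern: the nested namespace object of a project appears as namespace__id, namespace__kind, and so on, while list fields such as tag_list become jsonb columns. The sketch below illustrates that naming pattern only; it is not the target's actual implementation, and the sample values are made up.

```python
def flatten(record, parent_key="", sep="__"):
    """Flatten nested dicts into one level, joining keys with sep.

    Lists are left untouched; in the log above such fields
    (tag_list, shared_with_groups) end up as jsonb columns.
    """
    flat = {}
    for key, value in record.items():
        column = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Made-up sample shaped like a GitLab project record.
project = {
    "id": 1,
    "name": "demoapp",
    "namespace": {"id": 7, "kind": "user", "name": "dalongrong", "path": "dalongrong"},
    "tag_list": ["demo"],
}
```

    Here flatten(project) yields keys such as namespace__id and namespace__path, matching the column names in the log.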
     
     

    Notes

    Using a similar approach, we can also load data from GitHub, Jira, and other API-based sources

    References

    https://github.com/singer-io/tap-gitlab
    https://github.com/rongfengliang/singer-mysql2postges-demo

  • Original article: https://www.cnblogs.com/rongfengliang/p/10239449.html