• setup airflow on MySQL


    SQLite Database

    https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#setting-up-a-sqlite-database

    用于开发环境,有一些限制,只支持 序列执行器, 不能用作产品环境。

    SQLite database can be used to run Airflow for development purpose as it does not require any database server (the database is stored in a local file). There are a few limitations of using the SQLite database (for example it only works with Sequential Executor) and it should NEVER be used for production.

    MySQL Database

    https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#setting-up-a-mysql-database

    先建立数据库:

    You need to create a database and a database user that Airflow will use to access this database. In the example below, a database airflow_db and user with username airflow_user with password airflow_pass will be created

    CREATE DATABASE airflow_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    CREATE USER 'airflow_user' IDENTIFIED BY 'airflow_pass';
    GRANT ALL PRIVILEGES ON airflow_db.* TO 'airflow_user';
    

    设置数据库配置文件  explicit_defaults_for_timestamp=1

    We rely on more strict ANSI SQL settings for MySQL in order to have sane defaults. Make sure to have specified explicit_defaults_for_timestamp=1 option under [mysqld] section in your my.cnf file. You can also activate these options with the --explicit-defaults-for-timestamp switch passed to mysqld executable

    安装python的客户端驱动库

    pip install mysqlclient

    We recommend using the mysqlclient driver and specifying it in your SqlAlchemy connection string.

    设置DATABASE_URI

    https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#database-uri

    修改 airflow.cfg配置文件中的 sql_alchemy_conn 参数,或者配置环境变量AIRFLOW__CORE__SQL_ALCHEMY_CONN

    Airflow uses SQLAlchemy to connect to the database, which requires you to configure the Database URL. You can do this in option sql_alchemy_conn in section [core]. It is also common to configure this option with AIRFLOW__CORE__SQL_ALCHEMY_CONN environment variable.


    mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>

    初始化数据库

    https://stackoverflow.com/questions/61663681/configure-sql-server-in-airflow-with-sql-alchemy-conn

    airflow initdb

    另外一种客户端驱动

    支持SSL免证书选项。

    But we also support the mysql-connector-python driver, which lets you connect through SSL without any cert options provided.

    mysql+mysqlconnector://<user>:<password>@<host>[:<port>]/<dbname>

    客户端驱动issue

    如果驱动未安装或者安装不正确, 执行 airflow initdb 会报错

    ModuleNotFoundError: No module named 'MySQLdb'

    解决办法就是使用pip安装相应的驱动。

    https://stackoverflow.com/questions/51245121/mysqldb-modulenotfounderror

    Database Urls of SQLAlchemy

    https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls

    The create_engine() function produces an Engine object based on a URL. These URLs follow RFC-1738, and usually can include username, password, hostname, database name as well as optional keyword arguments for additional configuration. In some cases a file path is accepted, and in others a “data source name” replaces the “host” and “database” portions. The typical form of a database URL is:

    dialect+driver://username:password@host:port/database

    Dialect names include the identifying name of the SQLAlchemy dialect, a name such as sqlite, mysql, postgresql, oracle, or mssql. The drivername is the name of the DBAPI to be used to connect to the database using all lowercase letters. If not specified, a “default” DBAPI will be imported if available - this default is typically the most widely known driver available for that backend.

    The MySQL dialect uses mysql-python as the default DBAPI. There are many MySQL DBAPIs available, including MySQL-connector-python and OurSQL:

    # default
    engine = create_engine('mysql://scott:tiger@localhost/foo')
    
    # mysqlclient (a maintained fork of MySQL-Python)
    engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
    
    # PyMySQL
    engine = create_engine('mysql+pymysql://scott:tiger@localhost/foo')

    Excercise

    https://github.com/fanqingsong/machine_learning_workflow_on_airflow

    基于本地的调测环境,将此项目改造为基于MySQL数据库后台, 并在readme中记录过程。

    出处:http://www.cnblogs.com/lightsong/ 本文版权归作者和博客园共有,欢迎转载,但未经作者同意必须保留此段声明,且在文章页面明显位置给出原文连接。
  • 相关阅读:
    Redhat MysqlReport安装配置详解
    asp.net中服务器端控件和客户端控件的交互问题
    关于弹出对话框返回值的分析
    关于父子窗口的参数传递(引用的高手的)
    呵呵!刚刚申请!
    Loadrunner教程
    性能测试常见用语
    如何删除电脑垃圾文件
    内连接和外连接
    酒桌上的规矩
  • 原文地址:https://www.cnblogs.com/lightsong/p/14972493.html
Copyright © 2020-2023  润新知