• Airflow issue: Task received SIGTERM signal


    When running tasks with the Airflow scheduling system, we hit the following problem:

    *** Reading local file: /usr/local/airflow/logs/dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48/mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48/2020-03-18T02:36:05.955649+00:00/1.log
    [2020-03-18 02:36:08,865] {taskinstance.py:630} INFO - Dependencies all met for <TaskInstance: dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48.mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 2020-03-18T02:36:05.955649+00:00 [queued]>
    [2020-03-18 02:36:08,881] {taskinstance.py:630} INFO - Dependencies all met for <TaskInstance: dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48.mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 2020-03-18T02:36:05.955649+00:00 [queued]>
    [2020-03-18 02:36:08,881] {taskinstance.py:841} INFO - 
    --------------------------------------------------------------------------------
    [2020-03-18 02:36:08,881] {taskinstance.py:842} INFO - Starting attempt 1 of 4
    [2020-03-18 02:36:08,881] {taskinstance.py:843} INFO - 
    --------------------------------------------------------------------------------
    [2020-03-18 02:36:08,893] {taskinstance.py:862} INFO - Executing <Task(SSHOperator): mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48> on 2020-03-18T02:36:05.955649+00:00
    [2020-03-18 02:36:08,893] {base_task_runner.py:133} INFO - Running: ['airflow', 'run', 'dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48', 'mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48', '2020-03-18T02:36:05.955649+00:00', '--job_id', '759619', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48.py', '--cfg_path', '/tmp/tmp6jf69h_y']
    [2020-03-18 02:36:09,494] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/configuration.py:226: FutureWarning: The task_runner setting in [core] has the old default value of 'BashTaskRunner'. This value has been changed to 'StandardTaskRunner' in the running config, but please update your config before Apache Airflow 2.0.
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   FutureWarning
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/configuration.py:606: DeprecationWarning: Specifying both AIRFLOW_HOME environment variable and airflow_home in the config file is deprecated. Please use only the AIRFLOW_HOME environment variable and remove the config file entry.
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   warnings.warn(msg, category=DeprecationWarning)
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/config_templates/airflow_local_settings.py:65: DeprecationWarning: The elasticsearch_host option in [elasticsearch] has been renamed to host - the old setting has been used, but please update your config.
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   ELASTICSEARCH_HOST = conf.get('elasticsearch', 'HOST')
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/config_templates/airflow_local_settings.py:67: DeprecationWarning: The elasticsearch_log_id_template option in [elasticsearch] has been renamed to log_id_template - the old setting has been used, but please update your config.
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   ELASTICSEARCH_LOG_ID_TEMPLATE = conf.get('elasticsearch', 'LOG_ID_TEMPLATE')
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/config_templates/airflow_local_settings.py:69: DeprecationWarning: The elasticsearch_end_of_log_mark option in [elasticsearch] has been renamed to end_of_log_mark - the old setting has been used, but please update your config.
    [2020-03-18 02:36:09,495] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   ELASTICSEARCH_END_OF_LOG_MARK = conf.get('elasticsearch', 'END_OF_LOG_MARK')
    [2020-03-18 02:36:09,624] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:09,624] {settings.py:252} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=110780
    [2020-03-18 02:36:10,159] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,158] {configuration.py:299} WARNING - section/key [rest_api_plugin/log_loading] not found in config
    [2020-03-18 02:36:10,159] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,159] {rest_api_plugin.py:60} WARNING - Initializing [rest_api_plugin/LOG_LOADING] with default value = False
    [2020-03-18 02:36:10,159] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,159] {configuration.py:299} WARNING - section/key [rest_api_plugin/filter_loading_messages_in_cli_response] not found in config
    [2020-03-18 02:36:10,159] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,159] {rest_api_plugin.py:60} WARNING - Initializing [rest_api_plugin/FILTER_LOADING_MESSAGES_IN_CLI_RESPONSE] with default value = True
    [2020-03-18 02:36:10,160] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,159] {configuration.py:299} WARNING - section/key [rest_api_plugin/rest_api_plugin_http_token_header_name] not found in config
    [2020-03-18 02:36:10,160] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,160] {rest_api_plugin.py:50} WARNING - Initializing [rest_api_plugin/REST_API_PLUGIN_HTTP_TOKEN_HEADER_NAME] with default value = rest_api_plugin_http_token
    [2020-03-18 02:36:10,160] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,160] {configuration.py:299} WARNING - section/key [rest_api_plugin/rest_api_plugin_expected_http_token] not found in config
    [2020-03-18 02:36:10,160] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,160] {rest_api_plugin.py:50} WARNING - Initializing [rest_api_plugin/REST_API_PLUGIN_EXPECTED_HTTP_TOKEN] with default value = None
    [2020-03-18 02:36:10,301] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,300] {__init__.py:51} INFO - Using executor SequentialExecutor
    [2020-03-18 02:36:10,301] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,301] {dagbag.py:92} INFO - Filling up the DagBag from /usr/local/airflow/dags/dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48.py
    [2020-03-18 02:36:10,333] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/utils/helpers.py:428: DeprecationWarning: Importing 'ExternalTaskSensor' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
    [2020-03-18 02:36:10,334] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   DeprecationWarning)
    [2020-03-18 02:36:10,334] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 /usr/local/lib/python3.7/site-packages/airflow/utils/helpers.py:428: DeprecationWarning: Importing 'EmailOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
    [2020-03-18 02:36:10,334] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   DeprecationWarning)
    [2020-03-18 02:36:10,385] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 [2020-03-18 02:36:10,385] {cli.py:545} INFO - Running <TaskInstance: dagabcfcdeebc7d438db4ec7ef944d19b5a_7d4ce1b2d8d84e82a6421472ce3bef48.mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 2020-03-18T02:36:05.955649+00:00 [running]> on host airflow-5d68dbf958-pcvw5
    [2020-03-18 02:36:10,405] {logging_mixin.py:112} INFO - [2020-03-18 02:36:10,405] {ssh_hook.py:166} WARNING - Remote Identification Change is not verified. This wont protect against Man-In-The-Middle attacks
    [2020-03-18 02:36:10,405] {logging_mixin.py:112} INFO - [2020-03-18 02:36:10,405] {ssh_hook.py:170} WARNING - No Host Key Verification. This wont protect against Man-In-The-Middle attacks
    [2020-03-18 02:36:10,416] {logging_mixin.py:112} INFO - [2020-03-18 02:36:10,416] {transport.py:1819} INFO - Connected (version 2.0, client OpenSSH_8.0)
    [2020-03-18 02:36:10,493] {logging_mixin.py:112} INFO - [2020-03-18 02:36:10,492] {transport.py:1819} INFO - Authentication (password) successful!
    [2020-03-18 02:36:10,606] {ssh_operator.py:139} INFO - DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
    Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
    [2020-03-18 02:36:11,206] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.202 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
    [2020-03-18 02:36:11,214] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.213 [main] INFO  Engine - the machine info  => 
    
        osInfo:    Oracle Corporation 1.8 25.102-b14
        jvmInfo:    Linux amd64 3.10.0-862.14.4.el7.x86_64
        cpu num:    4
    
        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:    -0.00G
        maxFileDescriptorCount:    -1
        currentOpenFileDescriptorCount:    -1
    
        GC Names    [PS MarkSweep, PS Scavenge]
    
        MEMORY_NAME                    | allocation_size                | init_size                      
        PS Eden Space                  | 256.00MB                       | 256.00MB                       
        Code Cache                     | 240.00MB                       | 2.44MB                         
        Compressed Class Space         | 1,024.00MB                     | 0.00MB                         
        PS Survivor Space              | 42.50MB                        | 42.50MB                        
        PS Old Gen                     | 683.00MB                       | 683.00MB                       
        Metaspace                      | -0.00MB                        | 0.00MB                         
    [2020-03-18 02:36:11,241] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.240 [main] INFO  Engine - 
    {
        "content":[
            {
                "reader":{
                    "name":"mongodbreader",
                    "parameter":{
                        "address":[
                            "10.1.119.32:30717"
                        ],
                        "collectionName":"data_collects",
                        "column":[
                            {
                                "name":"id",
                                "type":"string"
                            },
                            {
                                "name":"title",
                                "type":"string"
                            },
                            {
                                "name":"description",
                                "type":"string"
                            },
                            {
                                "name":"bys",
                                "type":"string"
                            },
                            {
                                "name":"url",
                                "type":"string"
                            },
                            {
                                "name":"tags",
                                "type":"Array"
                            },
                            {
                                "name":"likes",
                                "type":"double"
                            }
                        ],
                        "dbName":"xa_dmp",
                        "userName":"root",
                        "userPassword":"*********"
                    }
                },
                "writer":{
                    "name":"hdfswriter",
                    "parameter":{
                        "column":[
                            {
                                "name":"id",
                                "type":"STRING"
                            },
                            {
                                "name":"title",
                                "type":"STRING"
                            },
                            {
                                "name":"description",
                                "type":"STRING"
                            },
                            {
                                "name":"bys",
                                "type":"STRING"
                            },
                            {
                                "name":"url",
                                "type":"STRING"
                            },
                            {
                                "name":"tags",
                                "type":"STRING"
                            },
                            {
                                "name":"likes",
                                "type":"STRING"
                            }
                        ],
                        "compress":"",
                        "defaultFS":"hdfs://10.1.119.13:8020",
                        "fieldDelimiter":"u0001",
                        "fileName":"data_collects7d4ce1b2d8d84e82a6421472ce3bef48",
                        "fileType":"text",
                        "path":"/user/hive/warehouse/ods.db/data_collects",
                        "writeMode":"append"
                    }
                }
            }
        ],
        "setting":{
            "speed":{
                "channel":2
            }
        }
    }
    [2020-03-18 02:36:11,260] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.259 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null
    [2020-03-18 02:36:11,262] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.261 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
    2020-03-18 10:36:11.262 [main] INFO  JobContainer - DataX jobContainer starts job.
    [2020-03-18 02:36:11,264] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.264 [main] INFO  JobContainer - Set jobId = 0
    [2020-03-18 02:36:11,395] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.394 [job-0] INFO  cluster - Cluster created with settings {hosts=[10.1.119.32:30717], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
    [2020-03-18 02:36:11,396] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.394 [job-0] INFO  cluster - Adding discovered server 10.1.119.32:30717 to client view of cluster
    [2020-03-18 02:36:11,663] {ssh_operator.py:139} INFO - 2020-03-18 10:36:11.656 [cluster-ClusterId{value='5e71891b975ee89137c49e53', description='null'}-10.1.119.32:30717] INFO  connection - Opened connection [connectionId{localValue:1, serverValue:76}] to 10.1.119.32:30717
    2020-03-18 10:36:11.658 [cluster-ClusterId{value='5e71891b975ee89137c49e53', description='null'}-10.1.119.32:30717] INFO  cluster - Monitor thread successfully connected to server with description ServerDescription{address=10.1.119.32:30717, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 1, 6]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, roundTripTimeNanos=654449}
    2020-03-18 10:36:11.659 [cluster-ClusterId{value='5e71891b975ee89137c49e53', description='null'}-10.1.119.32:30717] INFO  cluster - Discovered cluster type of STANDALONE
    [2020-03-18 02:36:12,044] {ssh_operator.py:144} WARNING - 三月 18, 2020 10:36:12 上午 org.apache.hadoop.util.NativeCodeLoader <clinit>
    警告: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [2020-03-18 02:36:12,734] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.733 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
    [2020-03-18 02:36:12,735] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.734 [job-0] INFO  JobContainer - DataX Reader.Job [mongodbreader] do prepare work .
    [2020-03-18 02:36:12,736] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.735 [job-0] INFO  JobContainer - DataX Writer.Job [hdfswriter] do prepare work .
    [2020-03-18 02:36:12,854] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.852 [job-0] INFO  HdfsWriter$Job - 由于您配置了writeMode append, 写入前不做清理工作, [/user/hive/warehouse/ods.db/data_collects] 目录下写入相应文件名前缀  [data_collects7d4ce1b2d8d84e82a6421472ce3bef48] 的文件
    2020-03-18 10:36:12.852 [job-0] INFO  JobContainer - jobContainer starts to do split ...
    2020-03-18 10:36:12.852 [job-0] INFO  JobContainer - Job set Channel-Number to 2 channels.
    [2020-03-18 02:36:12,896] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.894 [job-0] INFO  connection - Opened connection [connectionId{localValue:2, serverValue:77}] to 10.1.119.32:30717
    [2020-03-18 02:36:12,910] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.907 [job-0] INFO  JobContainer - DataX Reader.Job [mongodbreader] splits to [2] tasks.
    2020-03-18 10:36:12.908 [job-0] INFO  HdfsWriter$Job - begin do split...
    [2020-03-18 02:36:12,923] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.917 [job-0] INFO  HdfsWriter$Job - splited write file name:[hdfs://10.1.119.13:8020/user/hive/warehouse/ods.db/data_collects__0dc15a85_5060_4ff9_aa5a_cac01adb9eee/data_collects7d4ce1b2d8d84e82a6421472ce3bef48__78d34f42_b4bb_4c9b_88e8_65e6d32e6879]
    2020-03-18 10:36:12.918 [job-0] INFO  HdfsWriter$Job - splited write file name:[hdfs://10.1.119.13:8020/user/hive/warehouse/ods.db/data_collects__0dc15a85_5060_4ff9_aa5a_cac01adb9eee/data_collects7d4ce1b2d8d84e82a6421472ce3bef48__25d4da52_f1c9_478d_b0ad_13ec3dcadc2f]
    2020-03-18 10:36:12.918 [job-0] INFO  HdfsWriter$Job - end do split.
    2020-03-18 10:36:12.918 [job-0] INFO  JobContainer - DataX Writer.Job [hdfswriter] splits to [2] tasks.
    [2020-03-18 02:36:12,947] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.944 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
    [2020-03-18 02:36:12,963] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.953 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
    2020-03-18 10:36:12.956 [job-0] INFO  JobContainer - Running by standalone Mode.
    [2020-03-18 02:36:12,991] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.988 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [2] channels for [2] tasks.
    [2020-03-18 02:36:13,003] {ssh_operator.py:139} INFO - 2020-03-18 10:36:12.997 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
    2020-03-18 10:36:12.998 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
    [2020-03-18 02:36:13,036] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.027 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
    2020-03-18 10:36:13.030 [0-0-0-reader] INFO  cluster - Cluster created with settings {hosts=[10.1.119.32:30717], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
    2020-03-18 10:36:13.031 [0-0-0-reader] INFO  cluster - Adding discovered server 10.1.119.32:30717 to client view of cluster
    [2020-03-18 02:36:13,068] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.064 [cluster-ClusterId{value='5e71891d975ee89137c49e54', description='null'}-10.1.119.32:30717] INFO  connection - Opened connection [connectionId{localValue:3, serverValue:78}] to 10.1.119.32:30717
    2020-03-18 10:36:13.065 [cluster-ClusterId{value='5e71891d975ee89137c49e54', description='null'}-10.1.119.32:30717] INFO  cluster - Monitor thread successfully connected to server with description ServerDescription{address=10.1.119.32:30717, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 1, 6]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, roundTripTimeNanos=644490}
    2020-03-18 10:36:13.066 [cluster-ClusterId{value='5e71891d975ee89137c49e54', description='null'}-10.1.119.32:30717] INFO  cluster - Discovered cluster type of STANDALONE
    [2020-03-18 02:36:13,069] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.069 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[1] attemptCount[1] is started
    [2020-03-18 02:36:13,071] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.071 [0-0-0-writer] INFO  HdfsWriter$Task - begin do write...
    [2020-03-18 02:36:13,072] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.071 [0-0-0-writer] INFO  HdfsWriter$Task - write to file : [hdfs://10.1.119.13:8020/user/hive/warehouse/ods.db/data_collects__0dc15a85_5060_4ff9_aa5a_cac01adb9eee/data_collects7d4ce1b2d8d84e82a6421472ce3bef48__78d34f42_b4bb_4c9b_88e8_65e6d32e6879]
    [2020-03-18 02:36:13,092] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.086 [0-0-1-reader] INFO  cluster - Cluster created with settings {hosts=[10.1.119.32:30717], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
    2020-03-18 10:36:13.088 [0-0-1-reader] INFO  cluster - Adding discovered server 10.1.119.32:30717 to client view of cluster
    [2020-03-18 02:36:13,098] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.094 [0-0-1-writer] INFO  HdfsWriter$Task - begin do write...
    2020-03-18 10:36:13.095 [0-0-1-writer] INFO  HdfsWriter$Task - write to file : [hdfs://10.1.119.13:8020/user/hive/warehouse/ods.db/data_collects__0dc15a85_5060_4ff9_aa5a_cac01adb9eee/data_collects7d4ce1b2d8d84e82a6421472ce3bef48__25d4da52_f1c9_478d_b0ad_13ec3dcadc2f]
    [2020-03-18 02:36:13,162] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.159 [0-0-1-reader] INFO  cluster - No server chosen by ReadPreferenceServerSelector{readPreference=primary} from cluster description ClusterDescription{type=UNKNOWN, connectionMode=MULTIPLE, all=[ServerDescription{address=10.1.119.32:30717, type=UNKNOWN, state=CONNECTING}]}. Waiting for 30000 ms before timing out
    [2020-03-18 02:36:13,231] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.219 [cluster-ClusterId{value='5e71891d975ee89137c49e55', description='null'}-10.1.119.32:30717] INFO  connection - Opened connection [connectionId{localValue:4, serverValue:79}] to 10.1.119.32:30717
    2020-03-18 10:36:13.220 [cluster-ClusterId{value='5e71891d975ee89137c49e55', description='null'}-10.1.119.32:30717] INFO  cluster - Monitor thread successfully connected to server with description ServerDescription{address=10.1.119.32:30717, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[4, 1, 6]}, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, roundTripTimeNanos=658846}
    2020-03-18 10:36:13.221 [cluster-ClusterId{value='5e71891d975ee89137c49e55', description='null'}-10.1.119.32:30717] INFO  cluster - Discovered cluster type of STANDALONE
    [2020-03-18 02:36:13,244] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.243 [0-0-0-writer] INFO  HdfsWriter$Task - end do write
    [2020-03-18 02:36:13,266] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.264 [0-0-1-reader] INFO  connection - Opened connection [connectionId{localValue:5, serverValue:80}] to 10.1.119.32:30717
    [2020-03-18 02:36:13,275] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.273 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[264]ms
    [2020-03-18 02:36:13,438] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.437 [0-0-1-writer] INFO  HdfsWriter$Task - end do write
    [2020-03-18 02:36:13,478] {ssh_operator.py:139} INFO - 2020-03-18 10:36:13.475 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[1] is successed, used[438]ms
    2020-03-18 10:36:13.476 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
    [2020-03-18 02:36:13,862] {logging_mixin.py:112} INFO - [2020-03-18 02:36:13,862] {local_task_job.py:167} WARNING - State of this instance has been externally set to removed. Taking the poison pill.
    [2020-03-18 02:36:13,865] {helpers.py:308} INFO - Sending Signals.SIGTERM to GPID 110780
    [2020-03-18 02:36:13,866] {taskinstance.py:905} ERROR - Received SIGTERM. Terminating subprocesses.
    [2020-03-18 02:36:13,873] {taskinstance.py:1058} ERROR - SSH operator error: Task received SIGTERM signal
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/ssh_operator.py", line 133, in execute
        readq, _, _ = select([channel], [], [], self.timeout)
      File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 907, in signal_handler
        raise AirflowException("Task received SIGTERM signal")
    airflow.exceptions.AirflowException: Task received SIGTERM signal
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 930, in _run_raw_task
        result = task_copy.execute(context=context)
      File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/ssh_operator.py", line 173, in execute
        raise AirflowException("SSH operator error: {0}".format(str(e)))
    airflow.exceptions.AirflowException: SSH operator error: Task received SIGTERM signal
    [2020-03-18 02:36:13,875] {taskinstance.py:1081} INFO - Marking task as UP_FOR_RETRY
    [2020-03-18 02:36:13,892] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 Traceback (most recent call last):
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/ssh_operator.py", line 133, in execute
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     readq, _, _ = select([channel], [], [], self.timeout)
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 907, in signal_handler
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     raise AirflowException("Task received SIGTERM signal")
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 airflow.exceptions.AirflowException: Task received SIGTERM signal
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 During handling of the above exception, another exception occurred:
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 Traceback (most recent call last):
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/bin/airflow", line 37, in <module>
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     args.func(args)
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     return f(*args, **kwargs)
    [2020-03-18 02:36:13,893] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 551, in run
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     _run(args, dag, ti)
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 469, in _run
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     pool=args.pool,
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     return func(*args, **kwargs)
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 930, in _run_raw_task
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     result = task_copy.execute(context=context)
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/ssh_operator.py", line 173, in execute
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48     raise AirflowException("SSH operator error: {0}".format(str(e)))
    [2020-03-18 02:36:13,894] {base_task_runner.py:115} INFO - Job 759619: Subtask mongodb2hdfs_7d4ce1b2d8d84e82a6421472ce3bef48 airflow.exceptions.AirflowException: SSH operator error: Task received SIGTERM signal
    [2020-03-18 02:36:14,279] {helpers.py:286} INFO - Process psutil.Process(pid=110780, status='terminated') (110780) terminated with exit code 1
    [2020-03-18 02:36:14,280] {logging_mixin.py:112} INFO - [2020-03-18 02:36:14,280] {local_task_job.py:124} WARNING - Time since last heartbeat(0.43 s) < heartrate(5.0 s), sleeping for 4.5726189999999995 s
    [2020-03-18 02:36:18,858] {logging_mixin.py:112} INFO - [2020-03-18 02:36:18,857] {local_task_job.py:103} INFO - Task exited with return code 0

    Note the lines near the end of the log: while the remote DataX job was still running, local_task_job warned "State of this instance has been externally set to removed. Taking the poison pill.", SIGTERM was sent to the task's process group, and the SSHOperator surfaced it as "Task received SIGTERM signal". Some posts online suggest working around this by putting the work into threads and joining them; according to those posts, Airflow then waits until all threads have finished before the SIGTERM takes effect:

    from threading import Thread

    def do_work():             # placeholder target; substitute the real task body
        pass

    threads = []               # array for threads

    t = Thread(target=do_work)
    threads.append(t)          # add all threads here

    # Start all threads
    for x in threads:
        x.start()

    # Wait for all of them to finish
    for x in threads:
        x.join()
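
    For reference, here is a minimal sketch of how that threaded pattern could be wired into a PythonOperator callable. This uses Airflow 1.10-style imports; the DAG id, schedule, and do_work function are illustrative assumptions, not the original MongoDB-to-HDFS job:

    from datetime import datetime
    from threading import Thread

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator  # Airflow 1.10 import path

    def do_work(part):
        # hypothetical worker: stands in for one slice of the real transfer
        print("processing part", part)

    def run_in_threads():
        threads = [Thread(target=do_work, args=(part,)) for part in range(2)]
        for t in threads:
            t.start()
        # join() blocks until every thread has finished, so the callable only
        # returns (and the task process only exits) after all the work is done
        for t in threads:
            t.join()

    dag = DAG(
        dag_id="threaded_workaround_example",  # illustrative DAG id
        start_date=datetime(2020, 3, 1),
        schedule_interval=None,
    )

    run_all = PythonOperator(
        task_id="run_in_threads",
        python_callable=run_in_threads,
        dag=dag,
    )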

    I don't fully understand that solution, though. Here I solved it in a more down-to-earth way instead: by changing a parameter in the airflow.cfg file.

    First, open the airflow.cfg file in your Airflow installation directory.

    Find the relevant configuration option (the original post identifies it with a screenshot that is not reproduced here).

    Increase this timeout. The main cause is that the SIGTERM signal was sent before the task had finished running. Then start the scheduled task again; this time the wait for the run is a bit longer, but in the end it succeeds!
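
    Since the screenshot naming the exact option is not included in this copy of the post, the key below is only an assumption used to show the mechanics of the change; substitute whichever timeout the screenshot points to. A value in airflow.cfg can be raised by editing the file, or overridden with an AIRFLOW__SECTION__KEY environment variable (Airflow components need to be restarted to pick up the change):

    # airflow.cfg -- illustration only: dagbag_import_timeout is just an example
    # of a [core] timeout; replace it with the option from the original screenshot
    [core]
    dagbag_import_timeout = 120

    # the same key can be overridden without editing the file:
    #   export AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=120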

  • Original article: https://www.cnblogs.com/braveym/p/12516513.html