• FATE Installation and Deployment Log


    Reference:
    https://blog.csdn.net/qq_42906753/article/details/105138596

    1. While following the "Manage Docker as a non-root user" step, the following problem occurred:

    docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
    See 'docker run --help'.
    

    Cause: the Docker daemon had not been started.
    Fix:

    service  docker start
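The check-then-start step above can be sketched as a small shell function. This is only an illustration: `ensure_docker` is a hypothetical helper name, and the sketch assumes `docker info` exits with a non-zero status when the daemon cannot be reached.

```shell
# ensure_docker is a hypothetical helper, not part of Docker.
# Assumption: "docker info" fails when the daemon is not reachable.
ensure_docker() {
    if docker info >/dev/null 2>&1; then
        echo "docker daemon already running"
    else
        echo "starting docker daemon"
        # Same command as in the log above; needs root privileges.
        service docker start
    fi
}
```

Running `ensure_docker` as root before any `docker run` avoids starting the service when it is already up.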
    

    Deployment machine IP: 192.168.170.142/24; target machine IP: 192.168.170.145/24

    Target machine:

    Deployment machine

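    Reference: https://blog.csdn.net/weixin_44002829/article/details/97619826
    Download Python 3.6 and build it from source.
    Command: wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0.tgz
    (If wget is not installed, install it first: yum install wget)

    Extract: tar -xzvf Python-3.6.0.tgz (extracted in the home directory)

    Enter the source directory: cd Python-3.6.0 (if you do not know where the folder is, use ls to find which directory it is in, then cd into it)

    Configure the build: ./configure --prefix=/usr/local

    If you hit configure: error: no acceptable C compiler found in $PATH
    Fix: yum install gcc

    Then continue: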

    make altinstall

    Repoint the /usr/bin/python symlink

    cd /usr/bin

    mv python python.backup

    ln -s /usr/local/bin/python3.6 /usr/bin/python
    ln -s /usr/local/bin/python3.6 /usr/bin/python3

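    Change the Python dependency of the yum scripts
    (I was not sure what this change was for; presumably yum is written for Python 2 and would break now that /usr/bin/python points to Python 3.6.)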

    ls yum*
    vi /usr/bin/yum
    vi /usr/libexec/urlgrabber-ext-down
    

    (In each file opened by the commands above, change the first line from

    #!/usr/bin/python to #!/usr/bin/python2)
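Instead of editing both files by hand in vi, the same first-line change could be scripted. A minimal sketch, assuming each script starts with exactly `#!/usr/bin/python`; `pin_python2` is a hypothetical helper name, not an existing tool.

```shell
# pin_python2 is a hypothetical helper: rewrite the first line of the
# file given as $1 from "#!/usr/bin/python" to "#!/usr/bin/python2".
# Files whose first line is anything else are left unchanged.
pin_python2() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write back with cat so the file keeps its owner and permissions.
    sed '1s|^#!/usr/bin/python$|#!/usr/bin/python2|' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

As root, this would be called as `pin_python2 /usr/bin/yum` and `pin_python2 /usr/libexec/urlgrabber-ext-down`. Running it twice is harmless because the pattern no longer matches after the first run.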

    With that, the Python 3.6 setup is complete.
    Download FATE

    curl -OL https://github.com/FederatedAI/KubeFATE/releases/download/v1.3.0/kubefate-docker-compose.tar.gz   # download
    tar -xvzf kubefate-docker-compose.tar.gz   # extract
    

    Enter the docker-deploy directory and edit parties.conf.

    While installing the virtual-environment tool (pip install virtualenvwrapper), this error appeared:

    Could not fetch URL https://pypi.org/simple/virtualenvwrapper/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/virtualenvwrapper/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)) - skipping
      Could not find a version that satisfies the requirement virtualenvwrapper (from versions: )
    No matching distribution found for virtualenvwrapper
    pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
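The last line of the output says the interpreter has no `ssl` module, which usually means Python was compiled without the OpenSSL development headers present. A sketch for confirming this follows. `has_ssl` is a hypothetical helper, and the remedy in the comments (installing `openssl-devel` and rebuilding) is an assumption, not something recorded in the original log.

```shell
# has_ssl is a hypothetical helper: succeeds if the interpreter
# given as $1 can import the ssl module, fails otherwise.
has_ssl() {
    "$1" -c 'import ssl' >/dev/null 2>&1
}

# report_ssl prints a one-line diagnosis for the interpreter in $1.
report_ssl() {
    if has_ssl "$1"; then
        echo "$1: ssl module available"
    else
        # Assumed remedy on a yum-based system (not from the original log):
        #   yum install -y openssl-devel
        #   then re-run ./configure and make altinstall in Python-3.6.0
        echo "$1: ssl module missing"
    fi
}
```

For the build described above, the check would be `report_ssl /usr/local/bin/python3.6`.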
    

    Using SSH

    Running ssh root@192.168.170.145 on the deployment machine connects to the target machine.

    On the deployment machine, download and extract the KubeFATE 1.3 package kubefate-docker-compose.tar.gz

    # curl -OL https://github.com/FederatedAI/KubeFATE/releases/download/v1.3.0/kubefate-docker-compose.tar.gz
    
    # tar -xzf kubefate-docker-compose.tar.gz
    

    Define the number of instances to deploy

    Enter the docker-deploy directory
    
    # cd docker-deploy/
    

    Edit parties.conf as follows

    vi parties.conf 
    
    user=root                                   
    dir=/data/projects/fate                     
    partylist=(10000 9999)                      
    partyiplist=(192.168.170.142 192.168.170.145)       
    servingiplist=(192.168.170.142 192.168.170.145)     
    exchangeip=  
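Each party ID in `partylist` is paired by position with an IP in `partyiplist`, so the two lists should have the same length. A small sanity check could look like the sketch below. `count_entries` and `check_parties` are hypothetical helpers, and the sketch assumes each list sits on one line in the `name=(a b c)` form shown above.

```shell
# count_entries is a hypothetical helper: print how many items the
# list named $2 holds in config file $1 (format: name=(a b c)).
count_entries() {
    sed -n "s/^$2=(\(.*\)).*/\1/p" "$1" | wc -w | tr -d ' '
}

# check_parties compares the lengths of partylist and partyiplist
# in the config file $1 and fails when they differ.
check_parties() {
    ids=$(count_entries "$1" partylist)
    ips=$(count_entries "$1" partyiplist)
    if [ "$ids" -eq "$ips" ]; then
        echo "ok: $ids parties"
    else
        echo "mismatch: $ids party IDs but $ips IPs"
        return 1
    fi
}
```

Calling `check_parties parties.conf` before generating the configuration catches a forgotten IP early.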
    

    Run the script that generates the cluster startup files
    # bash generate_config.sh

    Run the cluster startup script

    # bash docker_deploy.sh all
    
    After the command is entered, the root password has to be typed 4 times for each user
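One way to avoid the repeated password prompts, assuming key-based root login is acceptable in this environment, is to install an SSH key on each machine first. The sketch below is not part of KubeFATE. `setup_ssh_key` is a hypothetical helper, and the key path `~/.ssh/id_rsa` with an empty passphrase is an assumption.

```shell
# setup_ssh_key is a hypothetical helper, not part of KubeFATE.
# $1 is the target in user@host form, e.g. root@192.168.170.145.
setup_ssh_key() {
    key="$HOME/.ssh/id_rsa"
    # Generate a key pair only if none exists yet (empty passphrase).
    if [ ! -f "$key" ]; then
        mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
        ssh-keygen -t rsa -N "" -f "$key" || return 1
    fi
    # Copy the public key to the target; this asks for the password once.
    ssh-copy-id -i "$key.pub" "$1"
}
```

With the addresses used in this log, it would be run once per machine, for example `setup_ssh_key root@192.168.170.142` and `setup_ssh_key root@192.168.170.145`.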
    

    Verify the cluster's basic functionality

    # docker exec -it confs-10000_python_1 bash
    

    This then produced the error:
    Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

    Fix: restart Docker

    # systemctl daemon-reload
    
    # systemctl restart docker.service
    

    Another problem appeared:

    Status: Downloaded newer image for federatedai/fateboard:1.3.0-release
    Creating docker-deploy_proxy_1        ... done
    Creating docker-deploy_redis_1      ... done
    Creating docker-deploy_mysql_1      ... done
    Creating docker-deploy_federation_1 ... done
    Creating docker-deploy_egg_1          ... done
    Creating docker-deploy_meta-service_1 ... done
    Creating docker-deploy_roll_1         ... done
    Creating docker-deploy_python_1       ... error
    
    ERROR: for docker-deploy_python_1  Cannot create container for service python: failed to mount local volume: mount /path/to/host/dir/examples:/var/lib/docker/volumes/docker-deploy_shared_dir_examples/_data, flags: 0x1000: no such file or directory
    
    ERROR: for python  Cannot create container for service python: failed to mount local volume: mount /path/to/host/dir/examples:/var/lib/docker/volumes/docker-deploy_shared_dir_examples/_data, flags: 0x1000: no such file or directory
    ERROR: Encountered errors while bringing up the project.
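The error shows Docker trying to mount the literal path `/path/to/host/dir/examples`, which looks like a placeholder that was never replaced with a real host directory. The log does not record how this was resolved. As a first diagnostic step, the sketch below lists the files that still contain the placeholder. `find_placeholder_mounts` is a hypothetical helper, and it assumes a grep that supports `-r`.

```shell
# find_placeholder_mounts is a hypothetical helper: list the files under
# directory $1 that still contain the literal path from the error message.
find_placeholder_mounts() {
    grep -rl '/path/to/host/dir' "$1"
}
```

Run from the directory that holds the compose files, for example `find_placeholder_mounts .`; any file it prints is a candidate for the volume definition that needs a real path.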
    
  • Original post: https://www.cnblogs.com/eosmomo/p/13894813.html