• Container Technology: Docker Swarm


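      In the previous post I covered basic usage of docker machine and how it works; for a refresher see https://www.cnblogs.com/qiuhom-1874/p/13160915.html. Today let's talk about docker swarm, Docker's official cluster management tool. It lets you create and manage a Docker cluster across multiple hosts; its main job is to pool the Docker environments of several node hosts into one large Docker resource pool, and to manage containers on top of that pool. So far we have only created and managed containers on a single host, but in production the containers that fit on one physical machine are often not enough for the workload, so docker swarm provides a clustering solution that makes it easy to create and manage containers across multiple nodes. Let's walk through building a docker swarm cluster.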

      docker swarm is already available once Docker is installed (swarm mode is built into the Docker Engine); we can check with docker info:

    [root@node1 ~]# docker info
    Client:
     Debug Mode: false
    
    Server:
     Containers: 0
      Running: 0
      Paused: 0
      Stopped: 0
     Images: 0
     Server Version: 19.03.11
     Storage Driver: overlay2
      Backing Filesystem: xfs
      Supports d_type: true
      Native Overlay Diff: true
     Logging Driver: json-file
     Cgroup Driver: cgroupfs
     Plugins:
      Volume: local
      Network: bridge host ipvlan macvlan null overlay
      Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
     Swarm: inactive
     Runtimes: runc
     Default Runtime: runc
     Init Binary: docker-init
     containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
     runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
     init version: fec3683
     Security Options:
      seccomp
       Profile: default
     Kernel Version: 3.10.0-693.el7.x86_64
     Operating System: CentOS Linux 7 (Core)
     OSType: linux
     Architecture: x86_64
     CPUs: 4
     Total Memory: 3.686GiB
     Name: docker-node01
     ID: 4HXP:YJ5W:4SM5:NAPM:NXPZ:QFIU:ARVJ:BYDG:KVWU:5AAJ:77GC:X7GQ
     Docker Root Dir: /var/lib/docker
     Debug Mode: false
     Registry: https://index.docker.io/v1/
     Labels:
      provider=generic
     Experimental: false
     Insecure Registries:
      127.0.0.0/8
     Live Restore Enabled: false
    
    [root@node1 ~]# 
    

      Note: the output above shows that swarm is inactive. That is because we have not initialized a cluster yet, so the Swarm field reads inactive.
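
      Rather than scanning the whole docker info output, the same field can be read directly. The following is a minimal sketch; the helper names swarm_state and is_swarm_active are made up for illustration, while .Swarm.LocalNodeState is the template field behind the "Swarm:" line.

```shell
# Print the swarm state of the local engine, e.g. "inactive" or "active".
# .Swarm.LocalNodeState is the field behind the "Swarm:" line of docker info.
swarm_state() {
    docker info --format '{{.Swarm.LocalNodeState}}'
}

# Hypothetical helper: succeed only when this node is part of a swarm.
is_swarm_active() {
    [ "$(swarm_state)" = "active" ]
}
```

      A script could then guard cluster commands with something like: is_swarm_active || echo "run docker swarm init first".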

      Initializing the cluster

    [root@docker-node01 ~]# docker swarm init --advertise-addr 192.168.0.41
    Swarm initialized: current node (ynz304mbltxx10v3i15ldkmj1) is now a manager.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377
    
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
    
    [root@docker-node01 ~]# 
    

      Note: the output tells us the cluster was initialized successfully and that the current node is a manager. To add another node to the cluster as a worker, run docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377 on that node. To join as a manager instead, first run docker swarm join-token manager in the current terminal:

    [root@docker-node01 ~]# docker swarm join-token manager
    To add a manager to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-dqjeh8hp6cp99bksjc03b8yu3 192.168.0.41:2377
    
    [root@docker-node01 ~]# 
    

      Note: docker swarm join-token manager prints the command for adding a manager node; running docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-dqjeh8hp6cp99bksjc03b8yu3 192.168.0.41:2377 on the target node is all it takes.
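
      When scripting node joins, copying the token out of the printed text is error-prone. Here is a small sketch that rebuilds the join command from docker swarm join-token -q, which prints only the token. The function name join_command and its manager_addr parameter are assumptions for illustration, and it has to run on a manager node.

```shell
# Build the join command for a given role ("worker" or "manager").
# Must run on a manager; -q makes join-token print only the token.
# manager_addr is an assumption: the manager's advertise address and port.
join_command() {
    local role="$1" manager_addr="$2" token
    case "$role" in
        worker|manager) ;;
        *) echo "role must be worker or manager" >&2; return 1 ;;
    esac
    token="$(docker swarm join-token -q "$role")" || return 1
    printf 'docker swarm join --token %s %s\n' "$token" "$manager_addr"
}
```

      For the cluster in this post, join_command worker 192.168.0.41:2377 would print the worker join command shown earlier.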

      The docker swarm cluster is now initialized; next we add the other nodes to it.

      Joining docker-node02 to the cluster as a worker

    [root@node2 ~]# docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377
    This node joined a swarm as a worker.
    [root@node2 ~]# 
    

      Note: if no error is reported, the node has joined the cluster. We can use docker info to see the details of the current Docker environment.

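      Note: the docker info output on docker-node02 shows that swarm is now active on that host, along with the address of the manager node. Besides this, we can confirm that docker-node02 has already joined the cluster by running docker node ls on the manager node.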

      Viewing cluster node information

      Note: running docker node ls on a manager node lists the nodes that have successfully joined the cluster.

      Joining docker-node03 to the cluster as a manager

      Note: docker-node03 is now a manager of the cluster, which is why docker node ls can be run on docker-node03 as well. The docker swarm cluster is now set up; next, let's go over the common management tasks for it.

      Node management commands

      docker node ls: list all nodes in the current cluster

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Ready               Active              Reachable           19.03.11
    [root@docker-node01 ~]# 
    

      Note: this command can only be run on a manager node.

      docker node inspect: show detailed information about the specified node

    [root@docker-node01 ~]# docker node inspect docker-node01
    [
        {
            "ID": "ynz304mbltxx10v3i15ldkmj1",
            "Version": {
                "Index": 9
            },
            "CreatedAt": "2020-06-20T05:57:17.57684293Z",
            "UpdatedAt": "2020-06-20T05:57:18.18575648Z",
            "Spec": {
                "Labels": {},
                "Role": "manager",
                "Availability": "active"
            },
            "Description": {
                "Hostname": "docker-node01",
                "Platform": {
                    "Architecture": "x86_64",
                    "OS": "linux"
                },
                "Resources": {
                    "NanoCPUs": 4000000000,
                    "MemoryBytes": 3958075392
                },
                "Engine": {
                    "EngineVersion": "19.03.11",
                    "Labels": {
                        "provider": "generic"
                    },
                    "Plugins": [
                        {
                            "Type": "Log",
                            "Name": "awslogs"
                        },
                        {
                            "Type": "Log",
                            "Name": "fluentd"
                        },
                        {
                            "Type": "Log",
                            "Name": "gcplogs"
                        },
                        {
                            "Type": "Log",
                            "Name": "gelf"
                        },
                        {
                            "Type": "Log",
                            "Name": "journald"
                        },
                        {
                            "Type": "Log",
                            "Name": "json-file"
                        },
                        {
                            "Type": "Log",
                            "Name": "local"
                        },
                        {
                            "Type": "Log",
                            "Name": "logentries"
                        },
                        {
                            "Type": "Log",
                            "Name": "splunk"
                        },
                        {
                            "Type": "Log",
                            "Name": "syslog"
                        },
                        {
                            "Type": "Network",
                            "Name": "bridge"
                        },
                        {
                            "Type": "Network",
                            "Name": "host"
                        },
                        {
                            "Type": "Network",
                            "Name": "ipvlan"
                        },
                        {
                            "Type": "Network",
                            "Name": "macvlan"
                        },
                        {
                            "Type": "Network",
                            "Name": "null"
                        },
                        {
                            "Type": "Network",
                            "Name": "overlay"
                        },
                        {
                            "Type": "Volume",
                            "Name": "local"
                        }
                    ]
                },
                "TLSInfo": {
                    "TrustRoot": "-----BEGIN CERTIFICATE-----
    MIIBaTCCARCgAwIBAgIUeBd/eSZ7WaiyLby9o1yWpjps3gwwCgYIKoZIzj0EAwIw
    EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMjAwNjIwMDU1MjAwWhcNNDAwNjE1MDU1
    MjAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH
    A0IABMsYxnGoPbM4gqb23E1TvOeQcLcY56XysLuF8tYKm56GuKpeD/SqXrUCYqKZ
    HV+WSqcM0fD1g+mgZwlUwFzNxhajQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB
    Af8EBTADAQH/MB0GA1UdDgQWBBTV64kbvS83eRHyI6hdJeEIv3GmrTAKBggqhkjO
    PQQDAgNHADBEAiBBB4hLn0ijybJWH5j5rtMdAoj8l/6M3PXERnRSlhbcawIgLoby
    ewMHCnm8IIrUGe7s4CZ07iHG477punuPMKDgqJ0=
    -----END CERTIFICATE-----
    ",
                    "CertIssuerSubject": "MBMxETAPBgNVBAMTCHN3YXJtLWNh",
                    "CertIssuerPublicKey": "MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEyxjGcag9sziCpvbcTVO855BwtxjnpfKwu4Xy1gqbnoa4ql4P9KpetQJiopkdX5ZKpwzR8PWD6aBnCVTAXM3GFg=="
                }
            },
            "Status": {
                "State": "ready",
                "Addr": "192.168.0.41"
            },
            "ManagerStatus": {
                "Leader": true,
                "Reachability": "reachable",
                "Addr": "192.168.0.41:2377"
            }
        }
    ]
    [root@docker-node01 ~]#
    

      docker node ps: list the tasks (service containers) running on the specified node

    [root@docker-node01 ~]# docker node ps 
    ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE       ERROR               PORTS
    [root@docker-node01 ~]# docker node ps docker-node01
    ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE       ERROR               PORTS
    [root@docker-node01 ~]# 
    

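      Note: this is similar to docker ps, except that it lists swarm service tasks. No service is running yet, so nothing is shown. When no node name is given, it lists the tasks on the current node.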

      docker node rm: remove the specified node

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Ready               Active              Reachable           19.03.11
    [root@docker-node01 ~]# docker node rm docker-node03
    Error response from daemon: rpc error: code = FailedPrecondition desc = node aeo8j7zit9qkoeeft3j0q1h0z is a cluster manager and is a member of the raft cluster. It must be demoted to worker before removal
    [root@docker-node01 ~]# docker node rm docker-node02
    Error response from daemon: rpc error: code = FailedPrecondition desc = node tzkm0ymzjdmc1r8d54snievf1 is not down and can't be removed
    [root@docker-node01 ~]# 
    

      Note: two conditions must hold before a node can be removed: it must not be a manager, and it must be in the Down state.
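
      These two preconditions can be checked up front instead of waiting for the daemon to reject the request. The sketch below reads Spec.Role and Status.State, the same fields that appear in the docker node inspect output earlier; the function name removal_blocker and its messages are hypothetical.

```shell
# Print why `docker node rm` would refuse a node, or "ok" if it would not.
# Reads Spec.Role and Status.State, as shown by docker node inspect.
removal_blocker() {
    local node="$1" info role state
    info="$(docker node inspect --format '{{.Spec.Role}} {{.Status.State}}' "$node")" || return 1
    role="${info%% *}"    # text before the space, e.g. "manager"
    state="${info##* }"   # text after the space, e.g. "ready"
    if [ "$role" = "manager" ]; then
        echo "demote first: docker node demote $node"
    elif [ "$state" != "down" ]; then
        echo "not down: run docker swarm leave on $node first"
    else
        echo "ok"
    fi
}
```

      Run on a manager, removal_blocker docker-node03 would point at the demote step for the cluster state shown above.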

      docker swarm leave: leave the current cluster

    [root@docker-node03 ~]# docker ps 
    CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
    e7958ffa16cd        nginx               "/docker-entrypoint.…"   28 seconds ago      Up 26 seconds       80/tcp              n1
    [root@docker-node03 ~]# docker swarm leave 
    Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing this node leaves 1 managers out of 2. Without a Raft quorum your swarm will be inaccessible. The only way to restore a swarm that has lost consensus is to reinitialize it with `--force-new-cluster`. Use `--force` to suppress this message.
    [root@docker-node03 ~]# docker swarm leave -f
    Node left the swarm.
    [root@docker-node03 ~]# 
    

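      Note: by default a manager node is not allowed to leave the cluster. Forcing it out with -f here leaves only 1 of the 2 managers, so the Raft quorum is lost and the remaining manager can no longer manage the cluster.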
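
      The error message above mentions a Raft quorum. As a quick worked example, a majority of managers is floor(N/2) + 1, so the number of managers that can be lost is floor((N-1)/2). The function names below are made up for illustration.

```shell
# Majority of managers needed for Raft to elect a leader: floor(N/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Managers that can be lost while a majority still remains: floor((N-1)/2).
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

      With 2 managers the quorum is 2 and no failure is tolerated, which matches what happens here once docker-node03 leaves; with 3 managers one can be lost.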

    [root@docker-node01 ~]# docker node ls
    Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
    [root@docker-node01 ~]#
    

      Note: docker node ls no longer works on docker-node01. The fix is to reinitialize the cluster:

    [root@docker-node01 ~]# docker node ls 
    Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
    [root@docker-node01 ~]# docker swarm init --advertise-addr 192.168.0.41
    Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.
    [root@docker-node01 ~]# docker swarm init --force-new-cluster 
    Swarm initialized: current node (ynz304mbltxx10v3i15ldkmj1) is now a manager.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-6difxlq3wc8emlwxzuw95gp8rmvbz2oq62kux3as0e4rbyqhk3-2m9x12n102ca4qlyjpseobzik 192.168.0.41:2377
    
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
    
    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Unknown             Active                                  19.03.11
    aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Down                Active                                  19.03.11
    rm3j7cjvmoa35yy8ckuzoay46     docker-node03       Unknown             Active                                  19.03.11
    [root@docker-node01 ~]# 
    

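      Note: the cluster cannot be reinitialized with docker swarm init --advertise-addr 192.168.0.41; we have to use docker swarm init --force-new-cluster, which forces a new cluster to be created from the current state. Now we can use docker node rm to remove the nodes in the Down state from the cluster.

      Removing nodes in the Down state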

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    aeo8j7zit9qkoeeft3j0q1h0z     docker-node03       Down                Active                                  19.03.11
    rm3j7cjvmoa35yy8ckuzoay46     docker-node03       Down                Active                                  19.03.11
    [root@docker-node01 ~]# docker node rm aeo8j7zit9qkoeeft3j0q1h0z rm3j7cjvmoa35yy8ckuzoay46
    aeo8j7zit9qkoeeft3j0q1h0z
    rm3j7cjvmoa35yy8ckuzoay46
    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    [root@docker-node01 ~]# 
    

      docker node promote: promote the specified node to manager

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    [root@docker-node01 ~]# docker node promote docker-node02
    Node docker-node02 promoted to a manager in the swarm.
    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active              Reachable           19.03.11
    [root@docker-node01 ~]# 
    

      docker node demote: demote the specified node to worker

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active              Reachable           19.03.11
    [root@docker-node01 ~]# docker node demote docker-node02
    Manager docker-node02 demoted in the swarm.
    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    [root@docker-node01 ~]# 
    

      docker node update: update the specified node

    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Active              Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    [root@docker-node01 ~]# docker node update docker-node01 --availability drain 
    docker-node01
    [root@docker-node01 ~]# docker node ls
    ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
    ynz304mbltxx10v3i15ldkmj1 *   docker-node01       Ready               Drain               Leader              19.03.11
    tzkm0ymzjdmc1r8d54snievf1     docker-node02       Ready               Active                                  19.03.11
    [root@docker-node01 ~]# 
    

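      Note: the command above changes the availability of docker-node01 to drain. After that, the scheduler no longer places service tasks on docker-node01, and tasks already running there are rescheduled onto other nodes.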
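
      Draining is reversible: setting the availability back to active makes the node schedulable again. Below is a small wrapper sketch that validates the value before calling docker node update; the function name set_availability is hypothetical, while active, pause and drain are the values the --availability flag accepts.

```shell
# Set a node's availability. The accepted values are active, pause and drain.
set_availability() {
    local node="$1" value="$2"
    case "$value" in
        active|pause|drain)
            docker node update --availability "$value" "$node"
            ;;
        *)
            echo "availability must be active, pause or drain" >&2
            return 1
            ;;
    esac
}
```

      For example, set_availability docker-node01 active would undo the drain shown above.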

      Adding a graphical interface to the docker swarm cluster

    [root@docker-node01 docker]# docker run --name v1 -d -p 8888:8080 -e HOST=192.168.0.41 -e PORT=8080 -v /var/run/docker.sock:/var/run/docker.sock docker-registry.io/test/visualizer
    Unable to find image 'docker-registry.io/test/visualizer:latest' locally
    latest: Pulling from test/visualizer
    cd784148e348: Pull complete 
    f6268ae5d1d7: Pull complete 
    97eb9028b14b: Pull complete 
    9975a7a2a3d1: Pull complete 
    ba903e5e6801: Pull complete 
    7f034edb1086: Pull complete 
    cd5dbf77b483: Pull complete 
    5e7311667ddb: Pull complete 
    687c1072bfcb: Pull complete 
    aa18e5d3472c: Pull complete 
    a3da1957bd6b: Pull complete 
    e42dbf1c67c4: Pull complete 
    5a18b01011d2: Pull complete 
    Digest: sha256:54d65cbcbff52ee7d789cd285fbe68f07a46e3419c8fcded437af4c616915c85
    Status: Downloaded newer image for docker-registry.io/test/visualizer:latest
    3c15b186ff51848130393944e09a427bd40d2504c54614f93e28477a4961f8b6
    [root@docker-node01 docker]# docker ps 
    CONTAINER ID        IMAGE                                COMMAND             CREATED             STATUS                            PORTS                    NAMES
    3c15b186ff51        docker-registry.io/test/visualizer   "npm start"         6 seconds ago       Up 5 seconds (health: starting)   0.0.0.0:8888->8080/tcp   v1
    [root@docker-node01 docker]# 
    

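      Note: the command above pulls the image from a private registry, because downloading over the Internet is too slow; I pulled it ahead of time and pushed it to the private registry. For how to set up and use a private registry, see https://www.cnblogs.com/qiuhom-1874/p/13061984.html or https://www.cnblogs.com/qiuhom-1874/p/13058338.html. Once the visualizer container is running on the manager node, we can open port 8888 on that manager's address to see the current state of the containers.

      Note: the visualizer page shows that the cluster currently has one manager node and two worker nodes, and that no containers are running in the cluster yet.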

      Running a service on docker swarm

    [root@docker-node01 ~]# docker service create --name myweb docker-registry.io/test/nginx:latest
    i0j6wvvtfe1360ibj04jxulmd
    overall progress: 1 out of 1 tasks 
    1/1: running   [==================================================>] 
    verify: Service converged 
    [root@docker-node01 ~]# docker service ls
    ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
    i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
    [root@docker-node01 ~]# docker service ps myweb
    ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
    99y8towew77e        myweb.1             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 1 minutes ago                       
    [root@docker-node01 ~]#
    

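      Note: docker service create creates a service in the current swarm cluster. The command above creates a service named myweb from the docker-registry.io/test/nginx:latest image; by default only one replica is started.

      Note: the cluster is now running one myweb container, on the docker-node03 host.

      Creating a service with multiple replicas on the swarm cluster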

    [root@docker-node01 ~]# docker service create --replicas 3 --name web docker-registry.io/test/nginx:latest
    mbiap412jyugfpi4a38mb5i1k
    overall progress: 3 out of 3 tasks 
    1/3: running   [==================================================>] 
    2/3: running   [==================================================>] 
    3/3: running   [==================================================>] 
    verify: Service converged 
    [root@docker-node01 ~]# docker service ls
    ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
    i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
    mbiap412jyug        web                 replicated          3/3                 docker-registry.io/test/nginx:latest   
    [root@docker-node01 ~]# docker service ps web
    ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
    1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 28 seconds ago                       
    31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 28 seconds ago                       
    l9jtbswl2x22        web.3               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 32 seconds ago                       
    [root@docker-node01 ~]# 
    

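      Note: the --replicas option sets the desired number of replicas. Swarm creates that many replicas in the cluster, and even if a node goes down it keeps that many containers running on the remaining nodes.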
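
      Whether a service has reached its desired replica count can be checked from the REPLICAS column, which is printed as running/desired (for example 3/3). The following is a sketch only; the function name service_converged is made up, and it matches the service name exactly using awk.

```shell
# Succeed when a service's running replica count equals its desired count.
# docker service ls prints REPLICAS as "running/desired", e.g. "3/3".
service_converged() {
    local name="$1" replicas
    replicas="$(docker service ls --format '{{.Name}} {{.Replicas}}' |
        awk -v n="$name" '$1 == n { print $2 }')"
    # Fail for an unknown service; otherwise compare both sides of the slash.
    [ -n "$replicas" ] && [ "${replicas%%/*}" = "${replicas##*/}" ]
}
```

      A script could poll with something like: until service_converged web; do sleep 2; done.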

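      Test: shut down docker-node03 and see whether the services running on it move to node 2.

      Before docker-node03 is shut down

      After docker-node03 is shut down

      Note: the screenshots show that once node 3 goes down, all containers that were running on it are moved to node 2. This is what the --replicas option gives us. In short, for a service in replicated mode, when a node hosting its replicas fails, swarm moves those replicas to other nodes. One thing to keep in mind: as long as the number of replicas in the cluster matches the replicas value we specified, swarm does not move them back even after the failed node recovers.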

    [root@docker-node01 ~]# docker service ps web
    ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE             ERROR               PORTS
    1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 15 minutes ago                        
    31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 15 minutes ago                        
    t3gjvsgtpuql        web.3               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 6 minutes ago                         
    l9jtbswl2x22         \_ web.3           docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 23 seconds ago                       
    [root@docker-node01 ~]# 
    

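      Note: listing the service's tasks on the manager node shows how the move works: the replica on the failed node is shut down and a new replica is created on another node.

      Scaling services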

    [root@docker-node01 ~]# docker service ls
    ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
    i0j6wvvtfe13        myweb               replicated          1/1                 docker-registry.io/test/nginx:latest   
    mbiap412jyug        web                 replicated          3/3                 docker-registry.io/test/nginx:latest   
    [root@docker-node01 ~]# docker service scale myweb=3 web=5
    myweb scaled to 3
    web scaled to 5
    overall progress: 3 out of 3 tasks 
    1/3: running   [==================================================>] 
    2/3: running   [==================================================>] 
    3/3: running   [==================================================>] 
    verify: Service converged 
    overall progress: 5 out of 5 tasks 
    1/5: running   [==================================================>] 
    2/5: running   [==================================================>] 
    3/5: running   [==================================================>] 
    4/5: running   [==================================================>] 
    5/5: running   [==================================================>] 
    verify: Service converged 
    [root@docker-node01 ~]# docker service ls
    ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
    i0j6wvvtfe13        myweb               replicated          3/3                 docker-registry.io/test/nginx:latest   
    mbiap412jyug        web                 replicated          5/5                 docker-registry.io/test/nginx:latest   
    [root@docker-node01 ~]# docker service ps myweb web
    ID                  NAME                IMAGE                                  NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
    j7w490h2lons        myweb.1             docker-registry.io/test/nginx:latest   docker-node02       Running             Running 12 minutes ago                       
    1rt0e7u4senz        web.1               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 21 minutes ago                       
    99y8towew77e        myweb.1             docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 5 minutes ago                       
    en5rk0jf09wu        myweb.2             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
    31ll0zu7udld        web.2               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 21 minutes ago                       
    h1hze7h819ca        myweb.3             docker-registry.io/test/nginx:latest   docker-node03       Running             Running 30 seconds ago                       
    t3gjvsgtpuql        web.3               docker-registry.io/test/nginx:latest   docker-node02       Running             Running 12 minutes ago                       
    l9jtbswl2x22         \_ web.3           docker-registry.io/test/nginx:latest   docker-node03       Shutdown            Shutdown 5 minutes ago                       
    od3ti2ixpsgc        web.4               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
    n1vur8wbmkgz        web.5               docker-registry.io/test/nginx:latest   docker-node03       Running             Running 31 seconds ago                       
    [root@docker-node01 ~]# 
    

      Note: docker service scale sets the number of replicas for a service, which lets us scale it up or down on the fly.

      Publishing a service

    [root@docker-node01 ~]# docker service ls
    ID                  NAME                MODE                REPLICAS            IMAGE                                  PORTS
    i0j6wvvtfe13        myweb               replicated          3/3                 docker-registry.io/test/nginx:latest   
    mbiap412jyug        web                 replicated          5/5                 docker-registry.io/test/nginx:latest   
    [root@docker-node01 ~]# docker service update  --publish-add 80:80 myweb 
    myweb
    overall progress: 3 out of 3 tasks 
    1/3: running   [==================================================>] 
    2/3: running   [==================================================>] 
    3/3: running   [==================================================>] 
    verify: Service converged 
    [root@docker-node01 ~]#
    

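      Note: publishing a service port in a docker swarm cluster works on the same principle as publishing a port in plain docker: both are implemented with iptables rules or LVS rules.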
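
      A port can also be published when the service is created, rather than added afterwards with docker service update --publish-add. Here is a sketch using the long-form --publish syntax; the function name create_published_service and the example service name myweb2 are hypothetical.

```shell
# Create a service with a port published on every node of the swarm.
# Long-form --publish syntax: "published" is the port opened on the nodes,
# "target" is the port the container listens on.
create_published_service() {
    local name="$1" image="$2" published="$3" target="$4"
    docker service create --name "$name" \
        --publish "published=${published},target=${target}" \
        "$image"
}
```

      For example, create_published_service myweb2 docker-registry.io/test/nginx:latest 8080 80 would expose nginx on port 8080 of the nodes.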

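      Note: on the manager node, port 80 is now listening, and the iptables rules contain a new entry that DNATs traffic for local port 80 to 172.18.0.2:80. This is not limited to the manager node; the iptables rules on the worker nodes change in the same way.

      Note: according to these rules, any access to port 80 on a node's address is DNATed to 172.18.0.2:80.

      Note: the output shows that the myweb container running on docker-node02 has the internal address 10.0.0.7. So why does accessing 172.18.0.2 reach the service inside the container?

      Test: on docker-node02, follow the nginx container's access log and see which client IP address it records.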

    [root@docker-node02 ~]# docker ps
    CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
    2134e1b2c689        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   24 minutes ago      Up 24 minutes       80/tcp              nginx.1.ych7y3ugxp6o592pbz5k2i412
    [root@docker-node02 ~]# docker logs -f nginx.1.ych7y3ugxp6o592pbz5k2i412 
    /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
    /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
    /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
    10-listen-on-ipv6-by-default.sh: Getting the checksum of /etc/nginx/conf.d/default.conf
    10-listen-on-ipv6-by-default.sh: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
    /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
    /docker-entrypoint.sh: Configuration complete; ready for start up
    10.0.0.3 - - [21/Jun/2020:02:37:11 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
    172.18.0.1 - - [21/Jun/2020:02:38:35 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
    10.0.0.2 - - [21/Jun/2020:02:53:32 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
    10.0.0.2 - - [21/Jun/2020:02:53:58 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
    ^C
    [root@docker-node02 ~]# 
    

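      Note: when we access 172.18.0.2 from the manager node, the log on node2 shows the request reaching nginx from 10.0.0.2. Why? The reason is that every node has an ingress-sbox container, whose address here is 10.0.0.2. The ingress-sbox address differs from node to node, so the address nginx sees depends on which node's address we access.

      Note: accessing different node addresses results in different IPs being recorded in the nginx log.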

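      Note: the screenshots show that the ingress-sbox container on each node has a different address, but all of them use 10.0.0.1 as their gateway. This means containers on different nodes can communicate through this gateway, so containers in the swarm cluster can talk to each other over the ingress network. One question remains: how does the 172.18.0.0/16 network communicate with the 10.0.0.0/24 network?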

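      Note: the screenshot shows two network namespaces on the manager node. The one with id 0 contains two interfaces, veth0 and vxlan0, both attached to the bridge br0, whose address is 10.0.0.1/24; the VXLAN ID (VNI) of vxlan0 is 4096. Combined with the nginx log above, it is not hard to see that when we access port 80 on the manager node, the iptables rules forward the traffic to the docker_gwbridge network. We do not yet know which namespace the docker_gwbridge network lives in, but we do know that the container has two interfaces, eth0 and eth1, and that eth1 is attached to the docker_gwbridge network; that means the docker_gwbridge network and the container's eth1 are in the same network namespace.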

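      Note: the screenshot shows that the network namespace named 1-u5mwgfq7rb has three interfaces, eth0, eth1 and vxlan0, all attached to the bridge br0. The manager node has a namespace with the same name, 1-u5mwgfq7rb, and vxlan0 has VXLAN ID 4096 on both nodes, so vxlan0 on the manager node can talk directly to vxlan0 on node2 (endpoints with the same VXLAN ID in the same overlay network can communicate directly). Because vxlan0 is attached to br0, the nginx log shows the ingress-sbox container's address as the client; the reason is that the gateway of ingress-sbox is br0. node3 follows the same logic. Traffic between containers on different nodes goes through vxlan0, while traffic to the outside goes out via eth1, is SNATed on docker_gwbridge, and leaves through the physical NIC.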

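      Tip: a container has two networks: eth0 on the ingress network and eth1 on the docker_gwbridge network. Both interfaces belong to the same network namespace inside the container. When we access 172.18.0.2, the ingress-sbox container rewrites the source address to its own address, which is why the nginx log shows 10.0.0.2. In effect, the ingress-sbox container performs SNAT.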
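      The namespaces discussed above can be inspected directly on a node. The sketch below only prints the commands instead of running them. It assumes a default Docker installation on Linux, where the daemon keeps its namespace handles under /var/run/docker/netns (typically including one named ingress_sbox and one per overlay network, such as 1-u5mwgfq7rb from the screenshots). The variable `NETNS_DIR` and the function `netns_commands` are hypothetical names used for illustration.

```shell
# Directory where the Docker daemon keeps network namespace handles (assumed
# default location). It can be overridden, which is only useful for trying
# the helper on a machine without Docker.
NETNS_DIR="${NETNS_DIR:-/var/run/docker/netns}"

# Print one nsenter command per namespace; nothing is executed here.
# "ip -d link show" lists the interfaces with details such as the vxlan id.
netns_commands() {
    for ns in "$NETNS_DIR"/*; do
        [ -e "$ns" ] || continue   # skip the literal glob when nothing matches
        printf 'nsenter --net=%s ip -d link show\n' "$ns"
    done
}

# Possible usage as root on a swarm node:
#   netns_commands          # review the commands first
#   netns_commands | sh     # then run them
```

      Printing the commands first keeps the helper harmless; replacing `ip -d link show` with `ip addr show` would list the addresses (for example 10.0.0.1/24 on br0) instead of the link details.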

      Test: can we reach the page served by nginx by accessing port 80 on the manager node?

    [root@docker-node02 ~]# docker ps
    CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
    b829991d6966        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   About an hour ago   Up About an hour    80/tcp              myweb.1.ilhkslrlnreyo6xx5j2h9isjb
    8c2965fbdc27        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.2.pthe8da2n45i06oee4n7h4krd
    b019d663e48e        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.3.w26gqpoyysgplm7qwhjbgisiv
    a7c1afd76f1f        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.1.ho0d7u3wensl0kah0ioz1lpk5
    [root@docker-node02 ~]# docker exec -it myweb.1.ilhkslrlnreyo6xx5j2h9isjb  bash
    root@b829991d6966:/# cd /usr/share/nginx/html/
    root@b829991d6966:/usr/share/nginx/html# ls
    50x.html  index.html
    root@b829991d6966:/usr/share/nginx/html# echo "this is docker-node02 index page" >index.html
    root@b829991d6966:/usr/share/nginx/html# cat index.html
    this is docker-node02 index page
    root@b829991d6966:/usr/share/nginx/html# 
    

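      Tip: above we changed the home page of the nginx container running on docker-node02. Next we access port 80 on the manager node to see whether requests reach the containers on the worker nodes, and how they are distributed: round-robin, or always to the same container?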

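      Tip: requests to port 80 on the manager node are distributed round-robin across the containers on the worker nodes. A browser may serve cached pages, so testing with curl is more reliable, as follows.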

    [root@docker-node03 ~]# docker ps
    CONTAINER ID        IMAGE                                  COMMAND                  CREATED             STATUS              PORTS               NAMES
    f43fdb9ec7fc        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              myweb.3.pgdjutofb5thlk02aj7387oj0
    4470785f3d00        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              myweb.2.uwxbe182qzq00qgfc7odcmx87
    7493dcac95ba        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.4.rix50fhlmg6m9txw9urk66gvw
    118880d300f4        docker-registry.io/test/nginx:latest   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          80/tcp              web.5.vo7c7vjgpf92b0ryelb7eque0
    [root@docker-node03 ~]# docker exec -it myweb.2.uwxbe182qzq00qgfc7odcmx87 bash
    root@4470785f3d00:/# cd /usr/share/nginx/html/
    root@4470785f3d00:/usr/share/nginx/html# echo "this is myweb.2 index page" > index.html 
    root@4470785f3d00:/usr/share/nginx/html# cat index.html
    this is myweb.2 index page
    root@4470785f3d00:/usr/share/nginx/html# exit
    exit
    [root@docker-node03 ~]# docker exec -it myweb.3.pgdjutofb5thlk02aj7387oj0 bash
    root@f43fdb9ec7fc:/# cd /usr/share/nginx/html/
    root@f43fdb9ec7fc:/usr/share/nginx/html# echo "this is myweb.3 index page" >index.html 
    root@f43fdb9ec7fc:/usr/share/nginx/html# cat index.html
    this is myweb.3 index page
    root@f43fdb9ec7fc:/usr/share/nginx/html# exit
    exit
    [root@docker-node03 ~]# 
    

      Tip: to make the effect easy to see, we also changed the home pages of myweb.2 and myweb.3.

    [root@docker-node01 ~]# for i in {1..10} ; do curl 192.168.0.41; done
    this is myweb.3 index page
    this is docker-node02 index page
    this is myweb.2 index page
    this is myweb.3 index page
    this is docker-node02 index page
    this is myweb.2 index page
    this is myweb.3 index page
    this is docker-node02 index page
    this is myweb.2 index page
    this is myweb.3 index page
    [root@docker-node01 ~]# 
    

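      Tip: the test above shows that exposing a service with --publish-add is equivalent to creating a load balancer on the manager node: requests to the published port are spread across the service's containers.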
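      With a larger number of requests it is easier to read a per-backend tally than the raw responses. The following is a small sketch under the assumption that each container returns a distinct one-line page, as in the curl test above; the function name `tally_responses` is made up for illustration.

```shell
# Read one response per line on stdin and print "<count><TAB><response>".
tally_responses() {
    sort | uniq -c | awk '{ count = $1; $1 = ""; sub(/^ /, ""); print count "\t" $0 }'
}

# Possible usage against the manager node address used in this article:
#   for i in $(seq 1 30); do curl -s 192.168.0.41; done | tally_responses
```

      If the distribution is round-robin, the counts for the three myweb containers should come out equal or differ by at most one.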

  • Original article: https://www.cnblogs.com/qiuhom-1874/p/13169070.html