• Benchmarking Elasticsearch Performance with ESRally; Installing Python 3.7 on CentOS 7.5


    1. Installing Python 3.7 on CentOS 7.5

    1. Install the development tools

    yum -y groupinstall "Development Tools"
    2. Install the Python build dependencies

    yum -y install openssl-devel zlib-devel bzip2-devel sqlite-devel readline-devel libffi-devel systemtap-sdt-devel
    3. Download the source package

    wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tgz
    4. Extract and compile

    tar zvxf Python-3.7.0.tgz
    cd Python-3.7.0
    ./configure --prefix=/usr/local/python3.7 --enable-optimizations
    make && make install

    # After the build, symlink the binaries into a directory on the PATH
    # (the paths must match the --prefix used above):
    ln -s /usr/local/python3.7/bin/python3 /usr/bin/python3
    ln -s /usr/local/python3.7/bin/pip3 /usr/bin/pip3
    # Optionally remove the build outputs and the generated configuration files:
    make clean && make distclean


    5. Configure environment variables

    File: /etc/profile.d/python37.sh

    if [ -z "${PYTHON37_HOME}" ]; then
    export PYTHON37_HOME=/usr/local/python3.7
    export PATH=${PYTHON37_HOME}/bin:${PATH}
    fi
    6. Load the environment variables

    source /etc/profile.d/python37.sh
    7. Test

    python3 -c "import sys; print(sys.version)"

    Bug: the pip command fails
    2.1 Error message
    pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
    Collecting virtualenv
    Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
    Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
    Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
    Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
    Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/virtualenv/
    Could not fetch URL https://pypi.org/simple/virtualenv/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/virtualenv/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
    Could not find a version that satisfies the requirement virtualenv (from versions: )
    No matching distribution found for virtualenv
    pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
    Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
    
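    2.2 Cause
    The system is CentOS 6.5, whose OpenSSL is OpenSSL 1.0.1e-fips 11 Feb 2013, whereas Python 3.7 requires OpenSSL 1.0.2 or 1.1.x. OpenSSL therefore has to be upgraded and Python 3.7.0 rebuilt. The OpenSSL versions that yum installs are all fairly old.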
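The version requirement above can be checked from Python itself. The following is a minimal sketch, assuming the interpreter under test can be started at all; the helper name `openssl_is_new_enough` is made up for this illustration. If `import ssl` fails, the interpreter was built without OpenSSL support, which is the situation the pip error describes.

```python
import ssl  # an ImportError here means Python was built without OpenSSL support


def openssl_is_new_enough(version_info, minimum=(1, 0, 2)):
    """Compare the (major, minor, fix) part of an OpenSSL version tuple.

    The default minimum of 1.0.2 is the requirement quoted in the text above.
    """
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    # Version string and tuple of the OpenSSL library this Python is linked against.
    print(ssl.OPENSSL_VERSION)
    print("new enough:", openssl_is_new_enough(ssl.OPENSSL_VERSION_INFO))
```

Running it with the freshly built `python3` shows which OpenSSL the interpreter actually linked against, which may differ from what `openssl version` reports on the command line.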
    
    2.3 Upgrade OpenSSL
    # 1. Download OpenSSL
    wget https://www.openssl.org/source/openssl-1.1.1a.tar.gz
    tar -zxvf openssl-1.1.1a.tar.gz
    cd openssl-1.1.1a
    # 2. Compile and install
    ./config --prefix=/usr/local/openssl no-zlib # zlib is not needed
    make
    make install
    # 3. Back up the original binary and headers
    mv /usr/bin/openssl /usr/bin/openssl.bak
    mv /usr/include/openssl/ /usr/include/openssl.bak
    # 4. Link in the new version
    ln -s /usr/local/openssl/include/openssl /usr/include/openssl
    ln -s /usr/local/openssl/lib/libssl.so.1.1 /usr/local/lib64/libssl.so
    ln -s /usr/local/openssl/bin/openssl /usr/bin/openssl
    # 5. Update the system configuration
    ## Add the OpenSSL library directory to the dynamic linker search path
    echo "/usr/local/openssl/lib" >> /etc/ld.so.conf
    ## Apply the change to /etc/ld.so.conf
    ldconfig -v
    # 6. Check the OpenSSL version
    openssl version
    
    If openssl version reports:
    
     /usr/local/openssl/bin/openssl: error while loading shared libraries: libssl.so.1.1: cannot open shared object file: No such file or directory
    
    and your libssl.so.1.1 is under /usr/local/openssl/lib/, link the libraries into /usr/lib64:
    
    ln -s /usr/local/openssl/lib/libssl.so.1.1 /usr/lib64/libssl.so.1.1
    
    ln -s /usr/local/openssl/lib/libcrypto.so.1.1 /usr/lib64/libcrypto.so.1.1

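    Then rebuild Python 3.7 against the new OpenSSL:
    ./configure --prefix=/usr/local/python3.7 --with-openssl=/usr/local/openssl
    make && make install
    This may fail with:
    Fatal Python error: initfsencoding: Unable to get the locale encoding
    LookupError: unknown encoding: GB18030
    
    In that case set the locale:
    export LANG=zh_CN.UTF-8
    export LANGUAGE=zh_CN.UTF-8
    That resolved the error.
    Once the installation has finished, unset these variables again.
    An odd "package source not found" problem also came up, No matching distribution found for esrally:
    Installing through the Douban mirror in China works around it (-i selects the index, --trusted-host takes the bare host name):
     pip3 install -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com esrally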

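     2. Installing Git 2

    CentOS 7 ships Git 1.8 by default. That version turned out to be too old for the project build, so Git was upgraded by compiling it from source.

     

    Installation procedure

    1. Remove the existing Git.

    yum remove git
    

    2. Install the dependencies

    yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel asciidoc
    yum install  gcc perl-ExtUtils-MakeMaker
    

    3. Install Git

    The GitHub archive https://github.com/git/git/archive/v2.10.5.tar.gz ships without a configure script, so it cannot be pointed at the upgraded OpenSSL 1.1. Use the kernel.org tarball instead:
     wget https://www.kernel.org/pub/software/scm/git/git-2.11.1.tar.gz

    tar -xzvf git-2.11.1.tar.gz
    cd git-2.11.1
    Compile and install Git (if OpenSSL was upgraded to 1.1, point the build at it with --with-openssl=/usr/local/openssl)

    ./configure --prefix=/usr/local/git --with-openssl=/usr/local/openssl

    make && sudo make install

    Configure the environment variable (single quotes keep $PATH from being expanded when the line is written)

    echo 'export PATH=$PATH:/usr/local/git/bin' >> /etc/profile && source /etc/profile

    Check the Git version

    git --version

    Installation complete.
    Generate an SSH key:
    ssh-keygen -t rsa -C "xxx@gmail.com"
    Log in to GitHub, click Edit your profile -> SSH keys, and add the contents of ~/.ssh/id_rsa.pub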

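    Troubleshooting

    Normally the steps above are all that is needed. The following summarizes a few problems encountered during installation.
    1. Compiling with make prefix=/usr/local/git all fails with the following error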

     LINK git-credential-store
    libgit.a(utf8.o): In function `reencode_string_iconv':
    /usr/src/git-2.8.3/utf8.c:463: undefined reference to `libiconv'
    libgit.a(utf8.o): In function `reencode_string_len':
    /usr/src/git-2.8.3/utf8.c:502: undefined reference to `libiconv_open'
    /usr/src/git-2.8.3/utf8.c:521: undefined reference to `libiconv_close'
    /usr/src/git-2.8.3/utf8.c:515: undefined reference to `libiconv_open'
    collect2: ld returned 1 exit status
    make: *** [git-credential-store] Error 1
    

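    This is caused by the libiconv library missing from the system. Download and build libiconv:

    cd /usr/local/src
    wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.14.tar.gz
    tar -zxvf libiconv-1.14.tar.gz
    cd libiconv-1.14
    Configure
    ./configure --prefix=/usr/local/libiconv
    Compile
    make
    Install
    make install
    Create symlinks (the libraries are installed under the --prefix chosen above)
    ln -s /usr/local/libiconv/lib/libiconv.so /usr/lib
    ln -s /usr/local/libiconv/lib/libiconv.so.2 /usr/lib
    

    libiconv is now installed. Go back to the Git source directory and build as follows

    make configure
    ./configure --prefix=/usr/local --with-iconv=/usr/local/libiconv
    Compile
    make
    Install
    make install
    Add to the PATH (with this prefix the binary is /usr/local/bin/git, so the directory to add is /usr/local/bin)
    export PATH=$PATH:/usr/local/bin
    Check the version
    git --version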
    

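    2. While building libiconv you may hit the error ./stdio.h:1010:1: error: ‘gets’ undeclared here (not in a function). The following fixes it.

    Go to the directory containing the offending file
    cd libiconv-1.14/srclib
    Edit stdio.in.h and find the line around line 698 that reads _GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");
    Comment that line out (the comment must use /* */) and replace it with the following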
    #if defined(__GLIBC__) && !defined(__UCLIBC__) && !__GLIBC_PREREQ(2, 16)
    _GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");
    #endif

    Compiling Git can also fail with this error:

    [root@localhost git-2.4.5]# make
    SUBDIR perl
    /usr/bin/perl Makefile.PL PREFIX='/usr/local/git' INSTALL_BASE='' --localedir='/usr/local/git/share/locale'
    Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at Makefile.PL line 3.
    BEGIN failed--compilation aborted at Makefile.PL line 3.
    make[1]: *** [perl.mak] Error 2
    make: *** [perl/perl.mak] Error 2

    Fix:
    yum install perl-ExtUtils-Embed -y
    Recompile after installing it and the problem is resolved. The missing module itself belongs to perl-ExtUtils-MakeMaker, which is already listed in the dependency step above.



    Author: 一介布衣q
    Link: https://www.imooc.com/article/275738
    Source: imooc (慕课网)
    This article was originally published on imooc; please credit the source when reposting.

    ——————————————————————————————————————————

    4. Benchmarking Elasticsearch performance with ESRally

    ------------------------------------------------------
    _______ __ _____
    / ____(_)___ ____ _/ / / ___/_________ ________
    / /_ / / __ / __ `/ / \__ / ___/ __ / ___/ _
    / __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
    /_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
    ------------------------------------------------------

    | Metric | Task | Value | Unit |
    |-------------------------------:|---------------------:|----------:|-------:|
    | Total indexing time | | 28.0997 | min |
    | Total merge time | | 6.84378 | min |
    | Total refresh time | | 3.06045 | min |
    | Total flush time | | 0.106517 | min |
    | Total merge throttle time | | 1.28193 | min |
    | Median CPU usage | | 471.6 | % |
    | Total Young Gen GC | | 16.237 | s |
    | Total Old Gen GC | | 1.796 | s |
    | Index size | | 2.60124 | GB |
    | Total written | | 11.8144 | GB |
    | Heap used for segments | | 14.7326 | MB |
    | Heap used for doc values | | 0.115917 | MB |
    | Heap used for terms | | 13.3203 | MB |
    | Heap used for norms | | 0.0734253 | MB |
    | Heap used for points | | 0.5793 | MB |
    | Heap used for stored fields | | 0.643608 | MB |
    | Segment count | | 97 | |
    | Min Throughput | index-append | 31925.2 | docs/s |
    | Median Throughput | index-append | 39137.5 | docs/s |
    | Max Throughput | index-append | 39633.6 | docs/s |
    | 50.0th percentile latency | index-append | 872.513 | ms |
    | 90.0th percentile latency | index-append | 1457.13 | ms |
    | 99.0th percentile latency | index-append | 1874.89 | ms |
    | 100th percentile latency | index-append | 2711.71 | ms |
    | 50.0th percentile service time | index-append | 872.513 | ms |
    | 90.0th percentile service time | index-append | 1457.13 | ms |
    | 99.0th percentile service time | index-append | 1874.89 | ms |
    | 100th percentile service time | index-append | 2711.71 | ms |
    | ... | ... | ... | ... |
    | ... | ... | ... | ... |
    | Min Throughput | painless_dynamic | 2.53292 | ops/s |
    | Median Throughput | painless_dynamic | 2.53813 | ops/s |
    | Max Throughput | painless_dynamic | 2.54401 | ops/s |
    | 50.0th percentile latency | painless_dynamic | 172208 | ms |
    | 90.0th percentile latency | painless_dynamic | 310401 | ms |
    | 99.0th percentile latency | painless_dynamic | 341341 | ms |
    | 99.9th percentile latency | painless_dynamic | 344404 | ms |
    | 100th percentile latency | painless_dynamic | 344754 | ms |
    | 50.0th percentile service time | painless_dynamic | 393.02 | ms |
    | 90.0th percentile service time | painless_dynamic | 407.579 | ms |
    | 99.0th percentile service time | painless_dynamic | 430.806 | ms |
    | 99.9th percentile service time | painless_dynamic | 457.352 | ms |
    | 100th percentile service time | painless_dynamic | 459.474 | ms |

    ----------------------------------
    [INFO] SUCCESS (took 2634 seconds)
    ----------------------------------

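    After deploying an ES cluster, we naturally want to know how well it performs, whether it can support future business growth, and whether it has any performance bottlenecks. Answering these questions with evidence requires the results of a load test.

    Introduction

    Rally is the benchmarking framework for Elasticsearch, provided and maintained by Elastic.

    Installation

    1. Install Python 3.5 or later. The system default may be 2.x; if an upgrade is needed, see "Installing Python3 on CentOS7".
    2. Install git 1.9 or later.
    3. Install esrally: pip3 install esrally
    4. Configure esrally: esrally configure. This command creates a .rally directory in the current user's home directory, which can be confirmed with ll ~/.rally.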
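The two version prerequisites in the list above can be checked before running pip3 install esrally. This is a minimal sketch, not part of Rally; the function names are invented for this example, and the thresholds (Python 3.5, git 1.9) are simply the ones stated in the list.

```python
import re
import subprocess
import sys


def parse_git_version(output):
    """Extract (major, minor) from text such as 'git version 2.11.1'."""
    match = re.search(r"git version (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognized git --version output: %r" % output)
    return int(match.group(1)), int(match.group(2))


def meets_rally_prerequisites(python_version, git_version_output):
    """True if Python >= 3.5 and git >= 1.9, the minimums listed above."""
    python_ok = tuple(python_version[:2]) >= (3, 5)
    git_ok = parse_git_version(git_version_output) >= (1, 9)
    return python_ok and git_ok


def installed_git_version_output():
    """Run 'git --version' and return its output (requires git on the PATH)."""
    result = subprocess.run(["git", "--version"], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

A typical call would be `meets_rally_prerequisites(sys.version_info, installed_git_version_output())`. With the stock CentOS 7 Git 1.8 mentioned earlier it returns False.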

    Usage

    Quick start

    To benchmark a single-node ES of a given version on the current machine, run:

    esrally --distribution-version=6.5.3

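    # Likewise, to benchmark another version
    esrally --distribution-version=6.8.1   --car="4gheap" 

    After either command is run, Rally automatically downloads the matching ES release, starts it locally, and then runs the benchmark. Rally calls this process a race, and the track used here is the default one, geonames.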

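    Benchmarking a remote cluster

    The example above cannot benchmark an existing ES cluster. To do that:

    1. Specify the track and the ES cluster addresses, then run the benchmark. The --challenge and --include-tasks options below restrict the run to indexing (writes only).

    esrally --pipeline=benchmark-only \
      --track=http_logs \
      --target-hosts=192.168.1.100:9200,192.168.1.101:9200,192.168.1.102:9200 \
      --report-file=/tmp/report_http_logs.md \
      --track-params="bulk_indexing_clients:96" \
      --include-tasks="index-append" \
      --challenge=append-fast-with-no-conflicts \
      --report-format=csv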
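When the same benchmark is run repeatedly against different clusters, the command line can be assembled in code instead of being edited by hand. The sketch below only uses options that already appear in this article; the function name and its parameters are assumptions made for the example, not a Rally API.

```python
import shlex


def build_benchmark_only_command(track, target_hosts, report_file,
                                 report_format="markdown"):
    """Return the esrally argument list for benchmarking an existing cluster."""
    return [
        "esrally",
        "--pipeline=benchmark-only",
        "--track=" + track,
        # Rally expects a single comma-separated list of host:port pairs.
        "--target-hosts=" + ",".join(target_hosts),
        "--report-file=" + report_file,
        "--report-format=" + report_format,
    ]


if __name__ == "__main__":
    command = build_benchmark_only_command(
        "http_logs",
        ["192.168.1.100:9200", "192.168.1.101:9200", "192.168.1.102:9200"],
        "/tmp/report_http_logs.md",
    )
    # Print a shell-safe version; the list could also be passed to subprocess.run.
    print(" ".join(shlex.quote(part) for part in command))
```

Returning a list rather than a string keeps each option a separate argument, so no shell quoting is needed when the list is handed to `subprocess.run`.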


    Note: the esrally list tracks command lists the available tracks (--track).

    Use --report-file=/path/to/your/report.md to also save the report to a file, and --report-format=csv to save it as CSV instead.
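Once a report has been saved in the default markdown format, its table can be read back for comparison between runs. The following is a minimal sketch that assumes the four-column layout shown in the sample output earlier (Metric, Task, Value, Unit); `parse_report_table` is a name made up for this example.

```python
def parse_report_table(text):
    """Return one dict per data row of a Rally markdown report table."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # banner lines, separators and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4:
            continue
        metric, task, value, unit = cells
        # Skip the header row and the |---:|---:| alignment row.
        if metric == "Metric" or set(metric) <= set("-:"):
            continue
        try:
            number = float(value)
        except ValueError:
            continue  # placeholder rows such as "| ... | ... | ... | ... |"
        rows.append({"metric": metric, "task": task, "value": number, "unit": unit})
    return rows


def metric_value(rows, metric, task=""):
    """Look up a single value, e.g. the median throughput of a task."""
    for row in rows:
        if row["metric"] == metric and row["task"] == task:
            return row["value"]
    raise KeyError((metric, task))
```

For example, `metric_value(rows, "Median Throughput", "index-append")` returns the median indexing throughput, which is usually the first number compared across runs.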

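    Modifying the default track parameters

    Changes made directly to the default tracks are reverted, so the only option is to add a track of your own.

    1. Create a new track directory under .rally/benchmarks/tracks, for example custom

    2. Adjust the track configuration in custom/http_logs/challenges/default.json and save it. For example, the following changes the default operation so that it is issued by 10 clients, each performing 100 operations.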

      {
      "operation": "default",
      "clients": 10,
      "warmup-iterations": 500,
      "iterations": 100,
      "target-throughput": 100
      }
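    3. Select it at startup. Because custom is a directory under tracks/ (a track repository in Rally's terms), it is chosen with --track-repository=custom, while --track still names the track inside it. For example:

      esrally --pipeline=benchmark-only \
        --track-repository=custom \
        --track=http_logs \
        --target-hosts=192.168.1.100:9200,192.168.1.101:9200,192.168.1.102:9200 \
        --report-file=/tmp/report_http_logs.md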
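The schedule values from step 2 can be turned into a rough estimate of how long the task runs. The sketch below rests on two assumptions that should be checked against the Rally documentation for the version in use: that warmup-iterations and iterations are counted per client, and that target-throughput is the total number of operations per second across all clients. The function name is invented for this example.

```python
def schedule_estimate(clients, warmup_iterations, iterations, target_throughput):
    """Estimate operation counts and the minimum run time of one schedule entry.

    Assumes iteration counts are per client and target_throughput is the
    combined rate of all clients in operations per second.
    """
    warmup_ops = clients * warmup_iterations
    measured_ops = clients * iterations
    return {
        "warmup_ops": warmup_ops,
        "measured_ops": measured_ops,
        # The cluster may be slower than the target, so this is a lower bound.
        "min_seconds": (warmup_ops + measured_ops) / target_throughput,
    }


if __name__ == "__main__":
    print(schedule_estimate(clients=10, warmup_iterations=500,
                            iterations=100, target_throughput=100))
```

Under these assumptions the example configuration issues 5000 warmup operations and 1000 measured operations, and at 100 operations per second it cannot finish in under 60 seconds.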

     

    References:

    • https://esrally.readthedocs.io/en/stable/index.html
    • https://github.com/elastic/rally
    • A fairly complete introduction to benchmarking: https://www.jianshu.com/p/c89975b50447
    • Running esrally with Docker (including offline datasets): https://www.jianshu.com/p/3a019c135e2a
    • Explanation of the benchmark parameters: https://www.jianshu.com/p/979f548c233e
    • Discussion of why Elasticsearch was not fully loaded: https://discuss.elastic.co/t/es-benchmark-using-rally-to-stress-a-2-node-setup/150020/6
    • A comparison of benchmark results: https://www.jianshu.com/p/e7de3b24f505

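    Datasets:
    Downloads are slow from within China, so first run
    esrally --distribution-version=6.5.3 --track=geonames
    once. Even if the test data fails to download, the directory structure will have been created. You can then download the bz2 file yourself from

    http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents.json.bz2
    and save it under the file names the track defines by default:
    documents-2.json.bz2
    documents-2.json.offset
    Then copy the files into ~/.rally/benchmarks/data/geonames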

    esrally list tracks

    This will show the following list:

    Name        Description                                          Documents  Compressed Size    Uncompressed Size    Default Challenge        All Challenges
    ----------  -------------------------------------------------  -----------  -----------------  -------------------  -----------------------  ---------------------------
    geonames    POIs from Geonames                                    11396505  252.4 MB           3.3 GB               append-no-conflicts      append-no-conflicts,appe...
    geopoint    Point coordinates from PlanetOSM                      60844404  481.9 MB           2.3 GB               append-no-conflicts      append-no-conflicts,appe...
    http_logs   HTTP server log data                                 247249096  1.2 GB             31.1 GB              append-no-conflicts      append-no-conflicts,appe...
    nested      StackOverflow Q&A stored as nested docs               11203029  663.1 MB           3.4 GB               nested-search-challenge  nested-search-challenge,...
    noaa        Global daily weather measurements from NOAA           33659481  947.3 MB           9.0 GB               append-no-conflicts      append-no-conflicts,appe...
    nyc_taxis   Taxi rides in New York in 2015                       165346692  4.5 GB             74.3 GB              append-no-conflicts      append-no-conflicts,appe...
    percolator  Percolator benchmark based on AOL queries              2000000  102.7 kB           104.9 MB             append-no-conflicts      append-no-conflicts,appe...
    pmc         Full text benchmark with academic papers from PMC       574199  5.5 GB             21.7 GB      

    The subdirectories of https://github.com/elastic/rally-tracks/tree/master/ document the parameters that can be configured for each dataset.

    For example, http_logs:

    Parameters (--track-params="bulk_indexing_clients:96")

    This track allows to overwrite the following parameters with Rally 0.8.0+ using --track-params:

    • bulk_size (default: 5000)
    • bulk_indexing_clients (default: 8): Number of clients that issue bulk indexing requests.
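Several parameters are passed in one --track-params value as comma-separated key:value pairs. A small helper can build that string from a dictionary; this is a sketch with an invented function name, using the two http_logs parameters listed above.

```python
def format_track_params(params):
    """Render a dict as the key:value,key:value string used by --track-params."""
    # Sorting keeps the output stable, which helps when commands are logged.
    return ",".join("%s:%s" % (key, value) for key, value in sorted(params.items()))


if __name__ == "__main__":
    value = format_track_params({"bulk_size": 5000, "bulk_indexing_clients": 96})
    print('--track-params="%s"' % value)
```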

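    5. Defining your own benchmark:

    Reference: https://esrally.readthedocs.io/en/latest/adding_tracks.html

    Main steps:

    1. Create the directory ~/rally-tracks/tutorial

    2. Manually download the geonames data: http://download.geonames.org/export/dump/allCountries.zip . It contains a tab-separated text file that has to be converted to JSON.

    Convert it with this Python code: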

    import json
    
    cols = (("geonameid", "int", True),
            ("name", "string", True),
            ("asciiname", "string", False),
            ("alternatenames", "string", False),
            ("latitude", "double", True),
            ("longitude", "double", True),
            ("feature_class", "string", False),
            ("feature_code", "string", False),
            ("country_code", "string", True),
            ("cc2", "string", False),
            ("admin1_code", "string", False),
            ("admin2_code", "string", False),
            ("admin3_code", "string", False),
            ("admin4_code", "string", False),
            ("population", "long", True),
            ("elevation", "int", False),
            ("dem", "string", False),
            ("timezone", "string", False))
    
    
    def main():
        with open("allCountries.txt", "rt", encoding="UTF-8") as f:
            for line in f:
                # Each input line holds tab-separated columns in the order of cols.
                tup = line.strip().split("\t")
                record = {}
                for i in range(len(cols)):
                    name, type, include = cols[i]
                    # Keep only non-empty columns that are flagged for inclusion.
                    if tup[i] != "" and include:
                        if type in ("int", "long"):
                            record[name] = int(tup[i])
                        elif type == "double":
                            record[name] = float(tup[i])
                        elif type == "string":
                            record[name] = tup[i]
                # Emit one JSON document per line on stdout.
                print(json.dumps(record, ensure_ascii=False))
    
    
    if __name__ == "__main__":
        main()

    Save it as toJSON.py in the ~/rally-tracks/tutorial directory created earlier, then run: python3 toJSON.py > documents.json
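To see what the conversion produces, it helps to run a single line through the same logic. The sketch below is a condensed variant that keeps only the included columns by position; the `KEPT` table, the `convert_line` name, and the sample line are illustrative assumptions rather than part of the original script.

```python
import json

# Column index -> (field name, converter) for the columns the script keeps.
KEPT = {
    0: ("geonameid", int),
    1: ("name", str),
    4: ("latitude", float),
    5: ("longitude", float),
    8: ("country_code", str),
    14: ("population", int),
}


def convert_line(line):
    """Convert one tab-separated geonames line into a JSON document string."""
    tup = line.rstrip("\n").split("\t")
    record = {}
    for index, (name, convert) in sorted(KEPT.items()):
        # Empty columns are omitted from the document entirely.
        if index < len(tup) and tup[index] != "":
            record[name] = convert(tup[index])
    return json.dumps(record, ensure_ascii=False)


# An illustrative line with the 19 columns of allCountries.txt.
SAMPLE = "\t".join([
    "2986043", "Pic de Font Blanca", "Pic de Font Blanca", "", "42.64991",
    "1.53335", "T", "PK", "AD", "", "00", "", "", "", "0", "", "2860",
    "Europe/Andorra", "2014-11-05",
])

if __name__ == "__main__":
    print(convert_line(SAMPLE))
```

Each output line is one self-contained JSON document, which is the format Rally's bulk indexing expects for documents.json.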

    3. For ES versions below 7.0, save the following as index.json

    {
      "settings": {
        "index.number_of_replicas": 0
      },
      "mappings": {
        "docs": {
          "dynamic": "strict",
          "properties": {
            "geonameid": {
              "type": "long"
            },
            "name": {
              "type": "text"
            },
            "latitude": {
              "type": "double"
            },
            "longitude": {
              "type": "double"
            },
            "country_code": {
              "type": "text"
            },
            "population": {
              "type": "long"
            }
          }
        }
      }
    }

    5. Then save the following as track.json

    {
      "version": 2,
      "description": "Tutorial benchmark for Rally",
      "indices": [
        {
          "name": "geonames",
          "body": "index.json",
          "types": [ "docs" ]
        }
      ],
      "corpora": [
        {
          "name": "rally-tutorial",
          "documents": [
            {
              "source-file": "documents.json",
              "document-count": 11658903,
              "uncompressed-bytes": 1544799789
            }
          ]
        }
      ],
      "schedule": [
        {
          "operation": {
            "operation-type": "delete-index"
          }
        },
        {
          "operation": {
            "operation-type": "create-index"
          }
        },
        {
          "operation": {
            "operation-type": "cluster-health",
            "request-params": {
              "wait_for_status": "green"
            }
          }
        },
        {
          "operation": {
            "operation-type": "bulk",
            "bulk-size": 5000
          },
          "warmup-time-period": 120,
          "clients": 8
        },
        {
          "operation": {
            "operation-type": "force-merge"
          }
        },
        {
          "operation": {
            "name": "query-match-all",
            "operation-type": "search",
            "body": {
              "query": {
                "match_all": {}
              }
            }
          },
          "clients": 8,
          "warmup-iterations": 1000,
          "iterations": 1000,
          "target-throughput": 100
        }
      ]
    }

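    The values under documents were obtained as follows:

    wc -l documents.json    (number of JSON documents, used for document-count)

    stat -f "%z" documents.json    (file size in bytes, used for uncompressed-bytes; this is the BSD/macOS syntax, on Linux use stat -c %s documents.json)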
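Because the stat options differ between platforms, the same two numbers can also be computed portably in Python. This is a minimal sketch; `corpus_stats` is a name chosen for this example, and its keys mirror the two track.json fields.

```python
import os


def corpus_stats(path):
    """Return the document-count and uncompressed-bytes values for a corpus file."""
    # Read in binary mode so that counting lines does not depend on the encoding.
    with open(path, "rb") as f:
        count = sum(1 for _ in f)
    return {"document-count": count, "uncompressed-bytes": os.path.getsize(path)}


if __name__ == "__main__":
    import sys
    # Usage: python3 corpus_stats.py documents.json
    print(corpus_stats(sys.argv[1]))
```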

    For ES 7.0 and later, remove types.
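Removing types touches both files: the type level inside mappings in index.json and the types entry in track.json. The sketch below shows one way to derive the 7.x form from the 6.x dictionaries; the function name is an assumption made for this example, and it only handles the single-type layout used in this tutorial.

```python
import copy


def to_typeless(index_body, track):
    """Return copies of index.json and track.json content without mapping types."""
    index_body = copy.deepcopy(index_body)
    track = copy.deepcopy(track)
    mappings = index_body.get("mappings", {})
    if len(mappings) == 1 and "properties" not in mappings:
        # Unwrap {"docs": {...}} so that "dynamic"/"properties" sit directly
        # under "mappings".
        (type_mapping,) = mappings.values()
        if "properties" in type_mapping:
            index_body["mappings"] = type_mapping
    for index in track.get("indices", []):
        index.pop("types", None)
    return index_body, track
```

The inputs are left unmodified, so the 6.x and 7.x variants can both be written out from the same source dictionaries with `json.dump`.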

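    6. Check that the track is recognized: esrally list tracks --track-path=~/rally-tracks/tutorial

    7. Run your own track: esrally --distribution-version=6.0.0 --track-path=~/rally-tracks/tutorial

    Add --test-mode to check whether the configuration files are correct.

    Test mode needs a 1000-document file, which can be generated with: head -n 1000 documents.json > documents-1k.json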


  • Original article: https://www.cnblogs.com/bigben0123/p/11188461.html