• ES cluster migration tools


    Recently, for testing, I needed to copy some data from the production cluster to the test cluster. I found two tools:

    elasticsearch-dump: https://github.com/elasticsearch-dump/elasticsearch-dump

    elasticsearch-migration (based on elasticsearch-dump, with some optimizations): https://github.com/medcl/esm
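    For elasticsearch-dump, a minimal sketch of copying one index between clusters might look like the following. The hostnames and index name are placeholders, and this assumes the standard elasticdump flags (--input, --output, --type, --limit); elasticdump copies one --type per run, so mapping and data are transferred in separate passes:

        # copy the mapping first, then the documents (hosts and index name are placeholders)
        elasticdump --input=http://prod-es:9200/my_index --output=http://test-es:9200/my_index --type=mapping
        elasticdump --input=http://prod-es:9200/my_index --output=http://test-es:9200/my_index --type=data --limit=1000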

    For reference, I migrated 6.2 GB of data with elasticsearch-migration using a 5 MB bulk size and a single worker. Its full option listing is below; an example invocation for this configuration follows the listing.

    Usage:
      esm [OPTIONS]
    
    Application Options:
      -s, --source=                    source elasticsearch instance, ie: http://localhost:9200
      -q, --query=                     query against source elasticsearch instance, filter data before migrate, ie: name:medcl
      -d, --dest=                      destination elasticsearch instance, ie: http://localhost:9201
      -m, --source_auth=               basic auth of source elasticsearch instance, ie: user:pass
      -n, --dest_auth=                 basic auth of target elasticsearch instance, ie: user:pass
      -c, --count=                     number of documents at a time: ie "size" in the scroll request (10000)
          --buffer_count=              number of buffered documents in memory (100000)
      -w, --workers=                   concurrency number for bulk workers (1)
      -b, --bulk_size=                 bulk size in MB (5)
      -t, --time=                      scroll time (1m)
          --sliced_scroll_size=        size of sliced scroll, to make it work, the size should be > 1 (1)
      -f, --force                      delete destination index before copying
      -a, --all                        copy indexes starting with . and _
          --copy_settings              copy index settings from source
          --copy_mappings              copy index mappings from source
          --shards=                    set a number of shards on newly created indexes
      -x, --src_indexes=               indexes name to copy,support regex and comma separated list (_all)
      -y, --dest_index=                indexes name to save, allow only one indexname, original indexname will be used if not specified
      -u, --type_override=             override type name
          --green                      wait for both hosts cluster status to be green before dump. otherwise yellow is okay
      -v, --log=                       setting log level,options:trace,debug,info,warn,error (INFO)
      -o, --output_file=               output documents of source index into local file
      -i, --input_file=                indexing from local dump file
          --input_file_type=           the data type of input file, options: dump, json_line, json_array, log_line (dump)
          --source_proxy=              set proxy to source http connections, ie: http://127.0.0.1:8080
          --dest_proxy=                set proxy to target http connections, ie: http://127.0.0.1:8080
          --refresh                    refresh after migration finished
          --fields=                    filter source fields, comma separated, ie: col1,col2,col3,...
          --rename=                    rename source fields, comma separated, ie: _type:type, name:myname
      -l, --logstash_endpoint=         target logstash tcp endpoint, ie: 127.0.0.1:5055
          --secured_logstash_endpoint  target logstash tcp endpoint was secured by TLS
          --repeat_times=              repeat the data from source N times to dest output, use align with parameter regenerate_id to amplify the data size
      -r, --regenerate_id              regenerate id for documents, this will override the exist document id in data source
          --compress                   use gzip to compress traffic
      -p, --sleep=                     sleep N seconds after finished a bulk request (-1)
    
    Help Options:
      -h, --help                       Show this help message
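
    Based on the options above, a sketch of the configuration used here (5 MB bulk size, a single worker) might look like this; the source/destination addresses and index name are placeholders:

        # migrate one index with a 5 MB bulk size and a single worker (hosts and index name are placeholders)
        esm -s http://prod-es:9200 -d http://test-es:9200 -x my_index -w 1 -b 5 --copy_settings --copy_mappings --refresh

    Since the -q option filters documents on the source side, a subset of the production data can be copied without pulling the whole index.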
    

     

  • Original post: https://www.cnblogs.com/to-here/p/14304730.html