• Configuring a multi-node Cassandra cluster and stress-testing Cassandra 3.11 with Yahoo's YCSB


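    I have spent the last few days setting up a Cassandra cluster and benchmarking it. There are quite a few steps, so I am writing them down here.

    For configuring a multi-node Cassandra cluster on servers, see the following articles: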

    http://blog.csdn.net/cloud_xy/article/details/48091003

    http://blog.csdn.net/cloud_xy/article/details/48107251

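    and the official Cassandra documentation: http://cassandra.apache.org/doc/latest/getting_started/configuring.html

    Note: it is simplest to turn off the firewall on every server in the cluster, or at least to make sure the required ports are open (by default 7000 for inter-node traffic and 9042 for CQL clients), so that the nodes can reach one another.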

    systemctl stop firewalld.service # stop the firewall
    systemctl disable firewalld.service # do not start the firewall at boot
    firewall-cmd --state # show the firewall state ("not running" when stopped, "running" when active)

    For day-to-day Cassandra usage, see "Cassandra日常运维" (Cassandra daily operations): http://zqhxuyuan.github.io/2015/10/15/Cassandra-Daily/

    and the official documentation: http://cassandra.apache.org/doc/latest/getting_started/index.html

    Stress-testing Cassandra 3.11.2 with YCSB:

    Preparation: read the YCSB wiki carefully first: https://github.com/brianfrankcooper/YCSB/wiki

    Environment: three servers, IPs 200.200.172.117-119

       Cassandra 3.11.2

    1. Get the source code from https://github.com/brianfrankcooper/YCSB/ and unpack it into a local directory.

    2. Use Cassandra's cqlsh to create the keyspace and the column family

    (1) Create the keyspace:

    cqlsh> create keyspace usertable with replication = {'class':'SimpleStrategy', 'replication_factor':3};

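    Note: there are two replication strategies:

    • SimpleStrategy: only for a single data center and a single rack. If you plan to use more than one data center, use NetworkTopologyStrategy.
    • NetworkTopologyStrategy: strongly recommended for most deployments, because it makes a later expansion to multiple data centers much easier.

    replication_factor: the number of copies of the data kept across the nodes. It is required when class is SimpleStrategy and is not used otherwise.

    A replication factor of 1 means there is only one copy of each row, on one node. A replication factor of 2 means two copies of each row, each on a different node. All replicas are equally important; there is no primary or master replica. As a general rule, the replication factor should not exceed the number of nodes in the cluster.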
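    The rule of thumb above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the quorum formula (half the replicas, rounded down, plus one) is an addition that the text above does not discuss.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must answer for a QUORUM read or write."""
    return replication_factor // 2 + 1


def check_replication(replication_factor: int, node_count: int) -> None:
    """Raise ValueError if the replication factor does not fit the cluster."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if replication_factor > node_count:
        raise ValueError(
            f"replication_factor {replication_factor} exceeds {node_count} nodes"
        )


# The three-node cluster from this post with replication_factor 3.
check_replication(3, 3)
print(quorum(3))
```

    With three nodes and a replication factor of 3, every node holds a copy of every row, and a quorum is two replicas.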

    (2) Switch to the keyspace:

    cqlsh> USE usertable;

    (3) Create the table (i.e. the column family):

    create table usertable (y_id varchar primary key,field0 varchar,field1 varchar,field2 varchar,field3 varchar,field4 varchar,field5 varchar,field6 varchar,field7 varchar,field8 varchar,field9 varchar);
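    The statement above has one key column plus field0 to field9, which matches a fieldcount of 10. If you change fieldcount, the table needs the same number of columns. A small sketch for generating the statement (the helper name create_table_cql is hypothetical; it only builds a string and does not talk to Cassandra):

```python
def create_table_cql(table: str = "usertable", fieldcount: int = 10) -> str:
    """Build a CREATE TABLE statement with y_id plus field0..field{n-1}."""
    columns = ["y_id varchar primary key"]
    columns += [f"field{i} varchar" for i in range(fieldcount)]
    return f"create table {table} ({', '.join(columns)});"


print(create_table_cql())
```

    The printed statement can be pasted into cqlsh after USE usertable;.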

    3. Check the ycsb command format:

    # cd bin
    # ./ycsb
    usage: ./ycsb command database [options]
    
    
    Commands:
        load           Execute the load phase
        run            Execute the transaction phase
        shell          Interactive mode
    
    
    Databases:
        accumulo       https://github.com/brianfrankcooper/YCSB/tree/master/accumulo
        aerospike      https://github.com/brianfrankcooper/YCSB/tree/master/aerospike
        arangodb       https://github.com/brianfrankcooper/YCSB/tree/master/arangodb
        asynchbase     https://github.com/brianfrankcooper/YCSB/tree/master/asynchbase
        basic          https://github.com/brianfrankcooper/YCSB/tree/master/basic
        cassandra-cql  https://github.com/brianfrankcooper/YCSB/tree/master/cassandra
        cassandra2-cql https://github.com/brianfrankcooper/YCSB/tree/master/cassandra2
        couchbase      https://github.com/brianfrankcooper/YCSB/tree/master/couchbase
        couchbase2     https://github.com/brianfrankcooper/YCSB/tree/master/couchbase2
        dynamodb       https://github.com/brianfrankcooper/YCSB/tree/master/dynamodb
        elasticsearch  https://github.com/brianfrankcooper/YCSB/tree/master/elasticsearch
        geode          https://github.com/brianfrankcooper/YCSB/tree/master/geode
        googlebigtable https://github.com/brianfrankcooper/YCSB/tree/master/googlebigtable
        googledatastore https://github.com/brianfrankcooper/YCSB/tree/master/googledatastore
        hbase094       https://github.com/brianfrankcooper/YCSB/tree/master/hbase094
        hbase098       https://github.com/brianfrankcooper/YCSB/tree/master/hbase098
        hbase10        https://github.com/brianfrankcooper/YCSB/tree/master/hbase10
        hypertable     https://github.com/brianfrankcooper/YCSB/tree/master/hypertable
        infinispan     https://github.com/brianfrankcooper/YCSB/tree/master/infinispan
        infinispan-cs  https://github.com/brianfrankcooper/YCSB/tree/master/infinispan
        jdbc           https://github.com/brianfrankcooper/YCSB/tree/master/jdbc
        kudu           https://github.com/brianfrankcooper/YCSB/tree/master/kudu
        mapkeeper      https://github.com/brianfrankcooper/YCSB/tree/master/mapkeeper
        memcached      https://github.com/brianfrankcooper/YCSB/tree/master/memcached
        mongodb        https://github.com/brianfrankcooper/YCSB/tree/master/mongodb
        mongodb-async  https://github.com/brianfrankcooper/YCSB/tree/master/mongodb
        nosqldb        https://github.com/brianfrankcooper/YCSB/tree/master/nosqldb
        orientdb       https://github.com/brianfrankcooper/YCSB/tree/master/orientdb
        rados          https://github.com/brianfrankcooper/YCSB/tree/master/rados
        redis          https://github.com/brianfrankcooper/YCSB/tree/master/redis
        riak           https://github.com/brianfrankcooper/YCSB/tree/master/riak
        s3             https://github.com/brianfrankcooper/YCSB/tree/master/s3
        solr           https://github.com/brianfrankcooper/YCSB/tree/master/solr
        tarantool      https://github.com/brianfrankcooper/YCSB/tree/master/tarantool
        voldemort      https://github.com/brianfrankcooper/YCSB/tree/master/voldemort
    
    
    Options:
        -P file        Specify workload file
        -cp path       Additional Java classpath entries
        -jvm-args args Additional arguments to the JVM
        -p key=value   Override workload property
        -s             Print status to stderr
        -target n      Target ops/sec (default: unthrottled)
        -threads n     Number of client threads (default: 1)
    
    
    Workload Files:
        There are various predefined workloads under workloads/ directory.
        See https://github.com/brianfrankcooper/YCSB/wiki/Core-Properties
        for the list of workload properties.
    ycsb: error: too few arguments

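    As the usage text shows: -P loads a properties (workload) file, -p overrides a single property as a key=value pair, -s prints status information periodically (to stderr), and -threads sets the number of client threads.

    4. Create the Cassandra connection file (the available properties can be looked up in the source: https://github.com/brianfrankcooper/YCSB/blob/master/cassandra/src/main/java/com/yahoo/ycsb/db/CassandraCQLClient.java)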

    #vim cassandra.properties
    
    
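    # comma-separated list of hosts
    hosts = spark131,spark130,spark129
    port = 9042
    # keyspace used for the test
    cassandra.keyspace = usertable
    # Cassandra username
    cassandra.username = ershixiong
    # Cassandra password
    cassandra.password = 111111
    # ANY is only valid for writes, so reads use ONE
    cassandra.readconsistencylevel = ONE
    cassandra.writeconsistencylevel = ANY
    cassandra.maxconnections = 100
    cassandra.connecttimeoutmillis = 1000000000
    cassandra.readtimeoutmillis = 1000000000

    If cassandra.properties does not exist, create it. Keep comments on their own lines: in a Java properties file a # after a value is read as part of the value, not as a comment.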
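    Two things are easy to get wrong in this file: a # after a value is not a comment in the Java properties format, and Cassandra accepts the consistency level ANY for writes only. The sketch below is a simplified stand-in for the Java parser (it splits on the first = only and trims both ends of the value), written to illustrate both points; the function names are hypothetical.

```python
def load_properties(text: str) -> dict:
    """Parse a simplified subset of the Java .properties format.

    Only whole-line comments (starting with # or !) are skipped. A '#'
    later on a line stays in the value, as java.util.Properties does.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_properties(props: dict) -> list:
    """Return a list of human-readable problems found in the properties."""
    problems = []
    if props.get("cassandra.readconsistencylevel", "").upper() == "ANY":
        problems.append("read consistency ANY is rejected; ANY is write-only")
    for key, value in props.items():
        if "#" in value:
            problems.append(f"{key}: text after '#' is part of the value")
    return problems


sample = """\
hosts = spark131,spark130   # host list
cassandra.readconsistencylevel = ANY
"""
for problem in check_properties(load_properties(sample)):
    print(problem)
```

    Running a check like this before a long benchmark is cheaper than finding out from a failed connection or a rejected read.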

    Note: for the read and write consistency level settings, see the documentation: https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html

    5. Configure the workload

    #vim workloads/workloada
    workload=com.yahoo.ycsb.workloads.CoreWorkload
    readallfields=false
    readproportion=0.5
    updateproportion=0.5
    scanproportion=0
    insertproportion=0
    requestdistribution=zipfian
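
    The main workload properties:
    fieldcount: the number of fields in each record, default 10;
    fieldlength: the length of each field value, default 100;
    readallfields: whether a read fetches all fields, true or false;
    readproportion, updateproportion, scanproportion, insertproportion: the fractions of read, update, scan and insert operations in the workload; the four values add up to 1;
    requestdistribution: how requests are distributed over the records; uniform, zipfian and latest are currently supported, default uniform;
    maxscanlength: used by scans, the maximum number of records to scan, default 1000;
    scanlengthdistribution: also used by scans, the distribution of the scan length, default uniform;
    insertorder: ordered or hashed, default hashed;
    operationcount: the total number of operations;
    maxexecutiontime: the maximum execution time of the workload, in seconds.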
    AverageLatency: in the YCSB output this is the average latency of one operation type (READ, UPDATE and so on), i.e. the mean time between sending a request and receiving its response, reported in microseconds (us). Lower is better.
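    The requirement that the four proportion properties add up to 1 is easy to break when editing a workload file by hand. A minimal check, assuming the properties have already been read into a dict of strings; treating a missing key as 0 is a simplification for this sketch, since YCSB applies its own defaults:

```python
import math

PROPORTION_KEYS = (
    "readproportion",
    "updateproportion",
    "scanproportion",
    "insertproportion",
)


def proportions_sum(props: dict) -> float:
    """Sum the four operation proportions (missing keys count as 0 here)."""
    return sum(float(props.get(key, 0)) for key in PROPORTION_KEYS)


def is_valid_mix(props: dict) -> bool:
    """True if the operation mix adds up to 1 within rounding error."""
    return math.isclose(proportions_sum(props), 1.0, abs_tol=1e-9)


# The values from workloads/workloada shown above.
workloada = {
    "readproportion": "0.5",
    "updateproportion": "0.5",
    "scanproportion": "0",
    "insertproportion": "0",
}
print(is_valid_mix(workloada))
```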

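    Note: there is one more property, zeropadding, which controls the key length by padding the numeric part of the key with leading zeros. For the full set of workload parameters see the YCSB source: https://github.com/brianfrankcooper/YCSB/blob/master/core/src/main/java/com/yahoo/ycsb/workloads/CoreWorkload.java

    For a description of the different workload types (a, b, c, d), see the official documentation: https://github.com/brianfrankcooper/YCSB/wiki/Core-Workloads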


    6. Test

    A YCSB test has two phases: loading the data and running the workload.

    Load phase:

    bin/ycsb load cassandra2-cql -P workloads/workloada -P cassandra.properties -p columnfamily=usertable -s -threads 32 > load_32threads.dat

    Run phase:

    bin/ycsb run cassandra2-cql -P workloads/workloada -P cassandra.properties -p columnfamily=usertable -s -threads 32 > run_32threads.dat

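    The database binding is cassandra2-cql
    The YCSB workload configuration is loaded from workloads/workloada
    The Cassandra configuration is loaded from cassandra.properties
    The column family name is usertable
    The test runs with 32 threads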
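    To repeat the test with several thread counts, the command line can be assembled programmatically. The sketch below only prints the commands and does not run them; the helper name ycsb_command and the thread counts 8 and 16 are illustrative assumptions, while the flags are the ones used in the commands above.

```python
import shlex


def ycsb_command(phase: str, threads: int,
                 binding: str = "cassandra2-cql",
                 workload: str = "workloads/workloada",
                 props: str = "cassandra.properties") -> list:
    """Return the argument list for one YCSB load or run invocation."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    return [
        "bin/ycsb", phase, binding,
        "-P", workload,
        "-P", props,
        "-p", "columnfamily=usertable",
        "-s",
        "-threads", str(threads),
    ]


# Print one run command per thread count, each with its own output file.
for threads in (8, 16, 32):
    command = shlex.join(ycsb_command("run", threads))
    print(f"{command} > run_{threads}threads.dat")
```

    Keeping one output file per thread count makes it straightforward to compare throughput and latency afterwards.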

    For the meaning of the output, see the official documentation: https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload

    Common errors:
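    The result files written by the commands above contain summary lines of the form "[SECTION], Metric, value". A short parser makes it easier to compare runs. This is a sketch; the numbers in the sample are invented for illustration and are not measurements from this cluster.

```python
def parse_ycsb_output(text: str) -> dict:
    """Collect '[SECTION], Metric, value' lines into nested dicts."""
    results = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            continue
        section, metric, raw_value = parts
        if not (section.startswith("[") and section.endswith("]")):
            continue
        try:
            value = float(raw_value)
        except ValueError:
            continue  # not a numeric summary line
        results.setdefault(section[1:-1], {})[metric] = value
    return results


# Invented sample in the YCSB summary format.
sample = """\
[OVERALL], RunTime(ms), 2000.0
[OVERALL], Throughput(ops/sec), 500.0
[READ], Operations, 502
[READ], AverageLatency(us), 850.5
"""
summary = parse_ycsb_output(sample)
print(summary["OVERALL"]["Throughput(ops/sec)"])
```

    Lines that do not match the pattern, such as log messages, are skipped.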

    1.All host(s) tried for query failed (tried: /ip (com.datastax.driver.core.TransportException:[/ip] Cannot connect), /ip (com.datastax.driver.core.TransportException: [/ip] Cannot connect))

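    Fix: if clients need to connect remotely, rpc_address in cassandra.yaml must be changed to the actual IP address of the node; otherwise this error is reported. After changing it, the connection works.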

    2.SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

    SLF4J: Defaulting to no-operation (NOP) logger implementation

    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

    Fix: download SLF4J from https://www.slf4j.org/download.html and copy the slf4j-simple and slf4j-api jars (slf4j-simple-1.7.7.jar and slf4j-api-1.7.7.jar in my case) into YCSB's lib directory.
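    For the first error above (Cannot connect), it helps to confirm from the YCSB machine that the CQL port is reachable before looking any further. A minimal sketch using only the Python standard library; the function name can_connect is hypothetical, and 9042 is the port configured in cassandra.properties:

```python
import socket


def can_connect(host: str, port: int = 9042, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

    For example, can_connect("200.200.172.117") should return True for each node once rpc_address and the firewall are set up correctly. A False result points at the network or at cassandra.yaml rather than at YCSB.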

    
    
    
  • Original article: https://www.cnblogs.com/lijinji/p/8591800.html