• Understanding Cassandra's internal data structures through the cassandra-cli client


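    There are two ways to interact with a Cassandra database: through the Thrift API, which is what the cassandra-cli command uses, or through CQL (Cassandra Query Language), which Cassandra provides.

    Note: the cassandra-cli client was deprecated and is no longer shipped as of Cassandra V2.2, so to use it you must install a version earlier than V2.2. cassandra-cli commands are hard to read and differ considerably from traditional SQL, which makes them painful to learn. The CQL API, by contrast, uses SQL-like syntax; it hides Cassandra's underlying architecture and presents the underlying data structures in SQL form. Even so, when learning Cassandra it is well worth studying the Thrift API closely, because cassandra-cli gives a much deeper view of Cassandra's internal storage structure.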

    1. keyspace

    First, check the default data center of the current cluster:

    [cassandra@sht-sgmhadoopcm-01 bin]$ nodetool status
    Datacenter: EAST
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
    UN  172.16.101.54  51.66 KB   256     100.0%            d821e5e5-e99d-41a9-b11f-a0fd0d3d9b05  RAC1

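    The output shows that the cluster currently has one data center, EAST, and that this Cassandra node sits in rack RAC1. The data center definition involves two configuration files:

    the "endpoint_snitch: PropertyFileSnitch" parameter in $CASSANDRA_HOME/conf/cassandra.yaml

    $CASSANDRA_HOME/conf/cassandra-topology.properties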
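
    To make the role of the second file concrete, here is a minimal Python sketch that reads entries in the "address=DC:RACK" form used by cassandra-topology.properties. The sample content is an assumption chosen to match the nodetool output above, not the author's actual file, and parse_topology is a hypothetical helper, not part of Cassandra.

```python
def parse_topology(text):
    """Parse 'address=DC:RACK' lines into {address: (dc, rack)}."""
    topology = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        address, _, location = line.partition("=")
        dc, _, rack = location.partition(":")
        topology[address.strip()] = (dc.strip(), rack.strip())
    return topology


# Hypothetical file content, chosen to match the nodetool output above.
SAMPLE = """
# Cassandra node IP = data center : rack
172.16.101.54=EAST:RAC1
# Fallback for nodes that are not listed explicitly
default=EAST:RAC1
"""

topology = parse_topology(SAMPLE)
print(topology["172.16.101.54"])
```

    With PropertyFileSnitch configured, it is this mapping that places the node in data center EAST and rack RAC1.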

    View the syntax for creating a keyspace, then create the mytest keyspace:

    [default@unknown] help create keyspace;
    ...................
    Examples:
    create keyspace Keyspace2
        with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
        and strategy_options = {replication_factor:4};
    create keyspace Keyspace3
        with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
        and strategy_options={DC1:2, DC2:2};
    create keyspace Keyspace4
        with placement_strategy = 'org.apache.cassandra.locator.OldNetworkTopologyStrategy'
        and strategy_options = {replication_factor:1};
    
    [default@unknown] create keyspace mytest with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options={EAST:1};
    78c6f3e9-622c-3ae8-86d2-7733518ca7ff
    
    [default@unknown] describe mytest;
    
    WARNING: CQL3 tables are intentionally omitted from 'describe' output.
    See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.
    
    Keyspace: mytest:
      Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
      Durable Writes: true
        Options: [EAST:1]
      Column Families:
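
    With NetworkTopologyStrategy, strategy_options maps each data center name to a replica count, which is why the statement above uses {EAST:1}. The following Python sketch only assembles the same cassandra-cli statement from such a mapping; create_keyspace_statement is a hypothetical helper for illustration and does not talk to Cassandra.

```python
def create_keyspace_statement(name, replicas_per_dc):
    """Build a cassandra-cli 'create keyspace' statement for NetworkTopologyStrategy."""
    # Each entry is rendered as DC:replica_count, e.g. EAST:1.
    options = ", ".join(f"{dc}:{count}" for dc, count in replicas_per_dc.items())
    return (
        f"create keyspace {name} "
        "with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' "
        f"and strategy_options={{{options}}};"
    )


statement = create_keyspace_statement("mytest", {"EAST": 1})
print(statement)
```

    The data center names must match what the snitch reports (EAST here); the sketch does not check that.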

    2. column family

    [default@unknown] use mytest;
    [default@mytest] create column family users with column_type=Standard and comparator=UTF8Type and key_validation_class=UTF8Type and default_validation_class=UTF8Type;
    bd4ffef6-9e93-38ec-a4fc-62b4fc00604c
    [default@mytest] describe users;
    
    WARNING: CQL3 tables are intentionally omitted from 'describe' output.
    See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.
    
        ColumnFamily: users
          Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
          Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
          Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
          GC grace seconds: 864000
          Compaction min/max thresholds: 4/32
          Read repair chance: 0.0
          DC Local Read repair chance: 0.1
          Caching: KEYS_ONLY
          Default time to live: 0
          Bloom Filter FP chance: default
          Index interval: default
          Speculative Retry: NONE
          Built indexes: []
          Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
          Compression Options:
            sstable_compression: org.apache.cassandra.io.compress.LZ4Compressor

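    As the creation of the users column family above shows, we specified only column-family-level attributes (the equivalent of table attributes); the column family itself has no columns yet.

    We can also use the column_metadata option when creating a column family to add predefined columns and index information: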

    [default@mytest] create column family students
    ...    with column_type = 'Standard'
    ...    and comparator = UTF8Type
    ...    and key_validation_class=UTF8Type
    ...    and default_validation_class=UTF8Type
    ...    and column_metadata = [
    ...    {column_name: age, validation_class: UTF8Type}
    ...    {column_name: birthday, validation_class: UTF8Type, index_type: KEYS,index_name: IDXbirthday}
    ...    {column_name: first, validation_class: UTF8Type}
    ...    {column_name: last, validation_class: UTF8Type}
    ...    ];
    [default@mytest] describe students;
    
    WARNING: CQL3 tables are intentionally omitted from 'describe' output.
    See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.
    
        ColumnFamily: students
          Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
          Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
          Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
          GC grace seconds: 864000
          Compaction min/max thresholds: 4/32
          Read repair chance: 0.0
          DC Local Read repair chance: 0.1
          Caching: KEYS_ONLY
          Default time to live: 0
          Bloom Filter FP chance: default
          Index interval: default
          Speculative Retry: NONE
          Built indexes: [students.IDXbirthday]
          Column Metadata:
            Column Name: first
              Validation Class: org.apache.cassandra.db.marshal.UTF8Type
            Column Name: birthday
              Validation Class: org.apache.cassandra.db.marshal.UTF8Type
              Index Name: IDXbirthday
              Index Type: KEYS
            Column Name: age
              Validation Class: org.apache.cassandra.db.marshal.UTF8Type
            Column Name: last
              Validation Class: org.apache.cassandra.db.marshal.UTF8Type
          Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
          Compression Options:
            sstable_compression: org.apache.cassandra.io.compress.LZ4Compressor

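    Where:

    • column_type defines the type of the column; the default is a standard column
    • comparator defines the order in which columns are returned by a query. Note that it applies to column names, not column values. Names can be compared as long, byte, UTF8, or other types; choosing UTF8Type also means that column names are displayed as readable text in the command-line interface. Unlike a relational database, Cassandra cannot sort by value. It sorts by column name so that a single column can be fetched efficiently from a very wide row without reading every column into memory.
    • key_validation_class defines the data type of the row key (primary key)
    • default_validation_class defines the data type of the column values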
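
    The effect of the comparator can be sketched in a few lines of Python. This is only an illustration of ordering by column name under two comparison rules; it approximates UTF8Type with a byte-wise comparison of the encoded names and a long-style comparator with integer comparison, and it does not use any Cassandra API.

```python
def sort_utf8(names):
    """Order column names as text, comparing their UTF-8 bytes."""
    return sorted(names, key=lambda name: name.encode("utf-8"))


def sort_long(names):
    """Order column names as integers, as a long-style comparator would."""
    return sorted(names, key=int)


# The same numeric-looking column names sort differently per comparator.
numeric_names = ["10", "9", "2"]
print(sort_utf8(numeric_names))
print(sort_long(numeric_names))

# Columns come back ordered by name, regardless of insertion order
# and regardless of their values.
inserted = ["first", "last", "age", "sex"]
print(sort_utf8(inserted))
```

    The last line reproduces the order in which cassandra-cli lists the columns of a row in a column family whose comparator is UTF8Type: age, first, last, sex.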

    3. set & get/list & del/count (insert or update, query, delete, count)

    Syntax: set column_family['row key']['column name']='value'

    [default@mytest] set users['zhangpeng']['first']='zhang';
    Value inserted.
    Elapsed time: 3.83 msec(s).
    [default@mytest] set users['zhangpeng']['last']='peng';
    Value inserted.
    Elapsed time: 2.69 msec(s).
    [default@mytest] set users['zhangpeng']['age']='18';
    Value inserted.
    Elapsed time: 1.97 msec(s).
    [default@mytest] set users['wangxing']['first']='wang';
    Value inserted.
    Elapsed time: 1.64 msec(s).
    [default@mytest] set users['wangxing']['last']='xing';
    Value inserted.
    Elapsed time: 1.4 msec(s).
    [default@mytest] set users['wangxing']['age']='19';
    Value inserted.
    Elapsed time: 1.38 msec(s).
    [default@mytest] set users['wangxing']['sex']='male';
    Value inserted.
    Elapsed time: 1.38 msec(s).
    
    [default@mytest] get users['zhangpeng'];
    => (name=age, value=18, timestamp=1527690907938000)
    => (name=first, value=zhang, timestamp=1527690887611000)
    => (name=last, value=peng, timestamp=1527690898435000)
    Returned 3 results.
    Elapsed time: 3.68 msec(s).
    
    [default@mytest] get users['wangxing'];
    => (name=age, value=19, timestamp=1527690946003000)
    => (name=first, value=wang, timestamp=1527690926424000)
    => (name=last, value=xing, timestamp=1527690934083000)
    => (name=sex, value=male, timestamp=1527690962938000)
    Returned 4 results.
    Elapsed time: 3.46 msec(s).
    
    [default@mytest] set users[wangxing]['age']='17';
    Value inserted.
    Elapsed time: 10 msec(s).
    [default@mytest] get users['wangxing'];
    => (name=age, value=17, timestamp=1527691267345000)
    => (name=first, value=wang, timestamp=1527690926424000)
    => (name=last, value=xing, timestamp=1527690934083000)
    => (name=sex, value=male, timestamp=1527690962938000)
    Returned 4 results.
    Elapsed time: 3.67 msec(s).
    [default@mytest] del users['wangxing']['sex'];
    cell removed.
    Elapsed time: 14 msec(s).
    [default@mytest] get users['wangxing'];
    => (name=age, value=17, timestamp=1527691267345000)
    => (name=first, value=wang, timestamp=1527690926424000)
    => (name=last, value=xing, timestamp=1527690934083000)
    Returned 3 results.
    Elapsed time: 4.84 msec(s).
    
    [default@mytest] count users['wangxing'];
    3 cells
    [default@mytest] count users['zhangpeng'];
    3 cells
    [default@mytest] count users['wangxing'];
    3 cells
    [default@mytest] list users;
    Using default limit of 100
    Using default cell limit of 100
    -------------------
    RowKey: wangxing
    => (name=age, value=19, timestamp=1527692441512000)
    => (name=first, value=wang, timestamp=1527692429554000)
    => (name=last, value=xing, timestamp=1527692435344000)
    -------------------
    RowKey: zhangpeng
    => (name=age, value=18, timestamp=1527692421898000)
    => (name=first, value=zhang, timestamp=1527692409752000)
    => (name=last, value=peng, timestamp=1527692416150000)
    
    2 Rows Returned.
    Elapsed time: 6.55 msec(s).
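
    The session above suggests a simple mental model: a column family is a map from row key to a set of columns sorted by name, where each column carries a value and a timestamp, and different rows may hold different columns. The Python sketch below implements that model only. The ColumnFamily class and its methods are hypothetical names for illustration; real Cassandra adds tombstones, replication, and on-disk storage that are deliberately left out.

```python
import time


class ColumnFamily:
    """Toy model: row key -> {column name -> (value, timestamp)}."""

    def __init__(self):
        self.rows = {}

    def set(self, row_key, name, value, timestamp=None):
        # Default to microseconds since the epoch, like the timestamps shown above.
        ts = timestamp if timestamp is not None else time.time_ns() // 1000
        columns = self.rows.setdefault(row_key, {})
        current = columns.get(name)
        # Insert and update are the same operation: the newest timestamp wins.
        if current is None or ts >= current[1]:
            columns[name] = (value, ts)

    def get(self, row_key):
        # Columns are returned sorted by column name, not by value.
        columns = self.rows.get(row_key, {})
        return [(name, *columns[name]) for name in sorted(columns)]

    def delete(self, row_key, name):
        # Simplification: real Cassandra writes a tombstone instead.
        self.rows.get(row_key, {}).pop(name, None)

    def count(self, row_key):
        return len(self.rows.get(row_key, {}))


users = ColumnFamily()
users.set("wangxing", "first", "wang", timestamp=1)
users.set("wangxing", "last", "xing", timestamp=2)
users.set("wangxing", "age", "19", timestamp=3)
users.set("wangxing", "sex", "male", timestamp=4)
users.set("wangxing", "age", "17", timestamp=5)  # update: same call as insert
users.delete("wangxing", "sex")
print(users.get("wangxing"))
print(users.count("wangxing"))
```

    The explicit small timestamps are only there to keep the example deterministic; omitting them falls back to the current time in microseconds.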

    4. index

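    By default we cannot query data with a where condition; doing so produces the error shown below. We can add an index to any column that needs to appear in a where clause, but the index supports only equality conditions.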

    [default@mytest] get users where age='17';
    No indexed columns present in index clause with operator EQ
    [default@mytest] update column family users with column_metadata=[{column_name : age, validation_class : UTF8Type, index_type : KEYS,index_name : IDXage}];
    b2a8c0af-4162-3590-acda-70ccdb7d363c
    [default@mytest] get users where age='17';
    -------------------
    RowKey: wangxing
    => (name=age, value=17, timestamp=1527692751954000)
    => (name=first, value=wang, timestamp=1527692429554000)
    => (name=last, value=xing, timestamp=1527692435344000)
    
    1 Row Returned.
    Elapsed time: 70 msec(s).
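
    Why only equality? One way to picture a KEYS index is as a lookup table from a column value to the row keys that hold that value. The Python sketch below is a conceptual model under that assumption, not Cassandra's implementation; build_keys_index and the sample rows are hypothetical and mirror the data used above.

```python
from collections import defaultdict


def build_keys_index(rows, column_name):
    """Map each value of column_name to the set of row keys that contain it."""
    index = defaultdict(set)
    for row_key, columns in rows.items():
        if column_name in columns:
            index[columns[column_name]].add(row_key)
    return index


# Sample rows mirroring the users column family above.
rows = {
    "wangxing": {"age": "17", "first": "wang", "last": "xing"},
    "zhangpeng": {"age": "18", "first": "zhang", "last": "peng"},
}

age_index = build_keys_index(rows, "age")
# An equality condition is a single dictionary lookup.
print(sorted(age_index.get("17", set())))
```

    A lookup table like this answers "age equals 17" directly, but it has no ordering to exploit for a range condition such as "age greater than 17", which matches the equality-only restriction noted above.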
  • Original article: https://www.cnblogs.com/ilifeilong/p/9097438.html