• Trying Out TokuDB, the Compressed MySQL Storage Engine, on Alibaba Cloud RDS


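      I had previously used two storage engines, MyISAM and InnoDB, on self-hosted MySQL servers (and a little of the Memory engine as well). When I moved to Alibaba Cloud's relational database service, RDS, early this year, I noticed TokuDB among the tunable parameters, but I did not know much about it and left it alone.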

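      Recently a colleague forwarded me an Alibaba Cloud article introducing TokuDB. Its compressed storage is very attractive to us because our databases tend to be large: what we have already moved to Alibaba Cloud amounts to several hundred GB, and what remains to be moved will certainly be on the order of terabytes. We are also still on MyISAM; switching to InnoDB would multiply the size several times, and the storage cost alone would be hard to bear. MyISAM tables, however, are prone to corruption, and fewer and fewer people use the engine: Drupal has defaulted to InnoDB since Drupal 7, and Alibaba Cloud also recommends against MyISAM.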

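      TokuDB is said to behave much like InnoDB, and its compression makes it a good fit for big-data-era applications. Compressing and decompressing data inevitably costs CPU, but that is not a big problem. What I care about is whether IOPS and the number of connections go up; if those two stay roughly stable, trading CPU for storage space is worthwhile, and we have the headroom for it.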

      Even though today is the weekend, I still looked through a few articles and websites on the subject.

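      I then immediately switched most of the tables in the RDS instance behind my blog from InnoDB to TokuDB. The instance's space usage dropped from 1.8 GB to about 800 MB, a reduction of more than half, which is quite noticeable. Larger tables should shrink even more noticeably, although tables that were previously MyISAM would probably gain less.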
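
      Converting many tables by hand gets tedious, so here is a minimal sketch of generating the statements in bulk. It assumes the conversion is done with a plain ALTER TABLE ... ENGINE=TokuDB per table; the helper name build_engine_alters and the schema and table names are hypothetical and only serve as an illustration.

```python
def build_engine_alters(tables, engine="TokuDB"):
    """Return one ALTER TABLE statement per (schema, table) pair."""
    statements = []
    for schema, table in tables:
        # Backtick-quote identifiers and double any embedded backtick.
        q_schema = "`" + schema.replace("`", "``") + "`"
        q_table = "`" + table.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {q_schema}.{q_table} ENGINE={engine};")
    return statements


# Hypothetical schema and table names, for illustration only.
for stmt in build_engine_alters([("blog", "node"), ("blog", "comment")]):
    print(stmt)
```

      The generated statements are only printed, so they can be reviewed and run one at a time; rebuilding a large table can take a while.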

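      In addition, loose_tokudb_buffer_pool_ratio should be set to a suitable value. It is the percentage of the cache shared by TokuDB and InnoDB that is given to TokuDB. The default is 0, for the case where TokuDB is not used; if every table uses TokuDB it can be set to 100. It can also be calculated before converting InnoDB tables to TokuDB with the following formula: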

    select sum(data_length) into @all_size from information_schema.tables where engine='innodb';
    select sum(data_length) into @change_size from information_schema.tables where engine='innodb' and concat(table_schema, '.', table_name) in ('XX.XXXX', 'XX.XXXX', 'XX.XXXX');
    select round(@change_size/@all_size*100);

      Or, after the conversion, with a formula I adapted myself:

    select sum(data_length) into @innodb_size from information_schema.tables where engine='innodb';
    select sum(data_length) into @tokudb_size from information_schema.tables where engine='tokudb';
    select round(@tokudb_size/(@innodb_size+@tokudb_size)*100);
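
      To make the two formulas easier to check, here is the same arithmetic as a small Python sketch. The function names and all sizes are hypothetical; in practice the inputs would be the sum(data_length) values returned by the queries above.

```python
import math


def percent(part, whole):
    """Integer percentage, with exact halves rounded up."""
    if whole == 0:
        return 0  # the SQL version would yield NULL here
    # Python's round() rounds halves to even, so use floor(x + 0.5) instead.
    return math.floor(part / whole * 100 + 0.5)


def ratio_before_conversion(all_innodb_size, to_convert_size):
    """Share of the current InnoDB data that is about to move to TokuDB."""
    return percent(to_convert_size, all_innodb_size)


def ratio_after_conversion(innodb_size, tokudb_size):
    """Share of TokuDB data in the combined InnoDB + TokuDB data."""
    return percent(tokudb_size, innodb_size + tokudb_size)


# Hypothetical sizes in MB, for illustration only.
print(ratio_before_conversion(1500, 1200))
print(ratio_after_conversion(300, 500))
```

      Note that the second formula compares compressed TokuDB data with uncompressed InnoDB data, so for the same set of tables it will tend to give a lower value than the first.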

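      I am still monitoring the various metrics after the change; if the results are good, I will roll it out to other data on other RDS instances.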


      For reference, the TokuDB-related variables:

    mysql>show variables like '%tokudb%';
    +---------------------------------+-----------------+
    | Variable_name | Value |
    +---------------------------------+-----------------+
    | tokudb_alter_print_error | OFF |
    | tokudb_analyze_delete_fraction | 1.000000 |
    | tokudb_analyze_time | 5 |
    | tokudb_block_size | 4194304 |
    | tokudb_bulk_fetch | ON |
    | tokudb_cache_size | 905969664 |
    | tokudb_check_jemalloc | 1 |
    | tokudb_checkpoint_lock | OFF |
    | tokudb_checkpoint_on_flush_logs | OFF |
    | tokudb_checkpointing_period | 60 |
    | tokudb_cleaner_iterations | 5 |
    | tokudb_cleaner_period | 1 |
    | tokudb_commit_sync | ON |
    | tokudb_cpu_nums | 0 |
    | tokudb_create_index_online | ON |
    | tokudb_data_dir | |
    | tokudb_debug | 0 |
    | tokudb_directio | OFF |
    | tokudb_disable_hot_alter | OFF |
    | tokudb_disable_prefetching | OFF |
    | tokudb_disable_slow_alter | OFF |
    | tokudb_disable_slow_update | OFF |
    | tokudb_disable_slow_upsert | OFF |
    | tokudb_empty_scan | rl |
    | tokudb_fs_reserve_percent | 5 |
    | tokudb_fsync_log_period | 0 |
    | tokudb_hide_default_row_format | ON |
    | tokudb_killed_time | 4000 |
    | tokudb_last_lock_timeout | |
    | tokudb_load_save_space | ON |
    | tokudb_loader_memory_size | 100000000 |
    | tokudb_lock_timeout | 4000 |
    | tokudb_lock_timeout_debug | 1 |
    | tokudb_log_dir | |
    | tokudb_max_lock_memory | 113246208 |
    | tokudb_optimize_index_fraction | 1.000000 |
    | tokudb_optimize_index_name | |
    | tokudb_optimize_throttle | 0 |
    | tokudb_pk_insert_mode | 1 |
    | tokudb_prelock_empty | ON |
    | tokudb_read_block_size | 65536 |
    | tokudb_read_buf_size | 131072 |
    | tokudb_read_status_frequency | 10000 |
    | tokudb_row_format | tokudb_zlib |
    | tokudb_rpl_check_readonly | ON |
    | tokudb_rpl_lookup_rows | ON |
    | tokudb_rpl_lookup_rows_delay | 0 |
    | tokudb_rpl_unique_checks | ON |
    | tokudb_rpl_unique_checks_delay | 0 |
    | tokudb_support_xa | ON |
    | tokudb_tmp_dir | |
    | tokudb_version | 7.5.6 |
    | tokudb_write_status_frequency | 1000 |
    +---------------------------------+-----------------+
    53 rows returned in 117.83 ms.
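
      Several of these values are raw byte counts that are hard to read at a glance. A small sketch that converts the size-related ones to MiB, using the numbers from the output above (the choice of which variables to convert is mine):

```python
MIB = 1024 * 1024

# Values copied from the SHOW VARIABLES output above, all in bytes.
variables = {
    "tokudb_cache_size": 905969664,
    "tokudb_max_lock_memory": 113246208,
    "tokudb_block_size": 4194304,
    "tokudb_read_block_size": 65536,
}


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


for name, value in variables.items():
    print(f"{name}: {to_mib(value):g} MiB")

# Fraction of the cache size that the lock memory limit amounts to.
lock_share = variables["tokudb_max_lock_memory"] / variables["tokudb_cache_size"]
print(f"lock memory / cache size = {lock_share}")
```

      On this instance the cache is 864 MiB and the lock memory limit is 108 MiB, exactly one eighth of the cache.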
    

      Checking the TokuDB-related status:

    mysql>show status like '%tokudb%';
    +-----------------------------------------------------------------+--------------------------+
    | Variable_name | Value |
    +-----------------------------------------------------------------+--------------------------+
    | Tokudb_DB_OPENS | 1074 |
    | Tokudb_DB_CLOSES | 17 |
    | Tokudb_DB_OPEN_CURRENT | 1057 |
    | Tokudb_DB_OPEN_MAX | 1057 |
    | Tokudb_CHECKPOINT_PERIOD | 60 |
    | Tokudb_CHECKPOINT_LAST_BEGAN | Mon Jul 13 11:22:52 2015 |
    | Tokudb_CHECKPOINT_LAST_COMPLETE_BEGAN | Mon Jul 13 11:21:52 2015 |
    | Tokudb_CHECKPOINT_LAST_COMPLETE_ENDED | Mon Jul 13 11:22:09 2015 |
    | Tokudb_CHECKPOINT_DURATION | 185 |
    | Tokudb_CHECKPOINT_DURATION_LAST | 17 |
    | Tokudb_CHECKPOINT_TAKEN | 23 |
    | Tokudb_CHECKPOINT_FAILED | 0 |
    | Tokudb_CHECKPOINT_BEGIN_TIME | 120980 |
    | Tokudb_CHECKPOINT_LONG_BEGIN_TIME | 0 |
    | Tokudb_CHECKPOINT_LONG_BEGIN_COUNT | 0 |
    | Tokudb_CACHETABLE_MISS | 4604 |
    | Tokudb_CACHETABLE_MISS_TIME | 4345137 |
    | Tokudb_CACHETABLE_PREFETCHES | 10 |
    | Tokudb_CACHETABLE_SIZE_CURRENT | 907181668 |
    | Tokudb_CACHETABLE_SIZE_LIMIT | 996566630 |
    | Tokudb_CACHETABLE_SIZE_WRITING | 0 |
    | Tokudb_CACHETABLE_SIZE_NONLEAF | 1818161 |
    | Tokudb_CACHETABLE_SIZE_LEAF | 902267903 |
    | Tokudb_CACHETABLE_SIZE_ROLLBACK | 832 |
    | Tokudb_CACHETABLE_SIZE_CACHEPRESSURE | 760812 |
    | Tokudb_CACHETABLE_SIZE_CLONED | 3094772 |
    | Tokudb_CACHETABLE_EVICTIONS | 60 |
    | Tokudb_CACHETABLE_CLEANER_EXECUTIONS | 2156 |
    | Tokudb_CACHETABLE_CLEANER_PERIOD | 1 |
    | Tokudb_CACHETABLE_CLEANER_ITERATIONS | 5 |
    | Tokudb_CACHETABLE_WAIT_PRESSURE_COUNT | 0 |
    | Tokudb_CACHETABLE_WAIT_PRESSURE_TIME | 0 |
    | Tokudb_CACHETABLE_LONG_WAIT_PRESSURE_COUNT | 0 |
    | Tokudb_CACHETABLE_LONG_WAIT_PRESSURE_TIME | 0 |
    | Tokudb_LOCKTREE_MEMORY_SIZE | 0 |
    | Tokudb_LOCKTREE_MEMORY_SIZE_LIMIT | 113246208 |
    | Tokudb_LOCKTREE_ESCALATION_NUM | 0 |
    | Tokudb_LOCKTREE_ESCALATION_SECONDS | 0.000000 |
    | Tokudb_LOCKTREE_LATEST_POST_ESCALATION_MEMORY_SIZE | 0 |
    | Tokudb_LOCKTREE_OPEN_CURRENT | 1059 |
    | Tokudb_LOCKTREE_PENDING_LOCK_REQUESTS | 0 |
    | Tokudb_LOCKTREE_STO_ELIGIBLE_NUM | 0 |
    | Tokudb_LOCKTREE_STO_ENDED_NUM | 84 |
    | Tokudb_LOCKTREE_STO_ENDED_SECONDS | 0.001284 |
    | Tokudb_LOCKTREE_WAIT_COUNT | 0 |
    | Tokudb_LOCKTREE_WAIT_TIME | 0 |
    | Tokudb_LOCKTREE_LONG_WAIT_COUNT | 0 |
    | Tokudb_LOCKTREE_LONG_WAIT_TIME | 0 |
    | Tokudb_LOCKTREE_TIMEOUT_COUNT | 0 |
    | Tokudb_LOCKTREE_WAIT_ESCALATION_COUNT | 0 |
    | Tokudb_LOCKTREE_WAIT_ESCALATION_TIME | 0 |
    | Tokudb_LOCKTREE_LONG_WAIT_ESCALATION_COUNT | 0 |
    | Tokudb_LOCKTREE_LONG_WAIT_ESCALATION_TIME | 0 |
    | Tokudb_DICTIONARY_UPDATES | 0 |
    | Tokudb_DICTIONARY_BROADCAST_UPDATES | 0 |
    | Tokudb_DESCRIPTOR_SET | 0 |
    | Tokudb_MESSAGES_IGNORED_BY_LEAF_DUE_TO_MSN | 207 |
    | Tokudb_LEAF_NODES_FLUSHED_NOT_CHECKPOINT | 2 |
    | Tokudb_LEAF_NODES_FLUSHED_NOT_CHECKPOINT_BYTES | 694272 |
    | Tokudb_LEAF_NODES_FLUSHED_NOT_CHECKPOINT_UNCOMPRESSED_BYTES | 5592182 |
    | Tokudb_LEAF_NODES_FLUSHED_NOT_CHECKPOINT_SECONDS | 0.637258 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_NOT_CHECKPOINT | 0 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_NOT_CHECKPOINT_BYTES | 0 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_NOT_CHECKPOINT_UNCOMPRESSE | 0 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_NOT_CHECKPOINT_SECONDS | 0.000000 |
    | Tokudb_LEAF_NODES_FLUSHED_CHECKPOINT | 1155 |
    | Tokudb_LEAF_NODES_FLUSHED_CHECKPOINT_BYTES | 273844224 |
    | Tokudb_LEAF_NODES_FLUSHED_CHECKPOINT_UNCOMPRESSED_BYTES | 1223284397 |
    | Tokudb_LEAF_NODES_FLUSHED_CHECKPOINT_SECONDS | 6.420306 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_CHECKPOINT | 506 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_CHECKPOINT_BYTES | 484352 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_CHECKPOINT_UNCOMPRESSED_BY | 600902 |
    | Tokudb_NONLEAF_NODES_FLUSHED_TO_DISK_CHECKPOINT_SECONDS | 0.390991 |
    | Tokudb_LEAF_NODE_COMPRESSION_RATIO | 4.476154 |
    | Tokudb_NONLEAF_NODE_COMPRESSION_RATIO | 1.240631 |
    | Tokudb_OVERALL_NODE_COMPRESSION_RATIO | 4.470456 |
    | Tokudb_NONLEAF_NODE_PARTIAL_EVICTIONS | 5839 |
    | Tokudb_NONLEAF_NODE_PARTIAL_EVICTIONS_BYTES | 1831390 |
    | Tokudb_LEAF_NODE_PARTIAL_EVICTIONS | 135986 |
    | Tokudb_LEAF_NODE_PARTIAL_EVICTIONS_BYTES | 12265410353 |
    | Tokudb_LEAF_NODE_FULL_EVICTIONS | 56 |
    | Tokudb_LEAF_NODE_FULL_EVICTIONS_BYTES | 63468429 |
    | Tokudb_NONLEAF_NODE_FULL_EVICTIONS | 4 |
    | Tokudb_NONLEAF_NODE_FULL_EVICTIONS_BYTES | 4662 |
    | Tokudb_LEAF_NODES_CREATED | 87 |
    | Tokudb_NONLEAF_NODES_CREATED | 0 |
    | Tokudb_LEAF_NODES_DESTROYED | 0 |
    | Tokudb_NONLEAF_NODES_DESTROYED | 0 |
    | Tokudb_MESSAGES_INJECTED_AT_ROOT_BYTES | 61828 |
    | Tokudb_MESSAGES_FLUSHED_FROM_H1_TO_LEAVES_BYTES | 60424 |
    | Tokudb_MESSAGES_IN_TREES_ESTIMATE_BYTES | 1404 |
    | Tokudb_MESSAGES_INJECTED_AT_ROOT | 911 |
    | Tokudb_BROADCASE_MESSAGES_INJECTED_AT_ROOT | 0 |
    | Tokudb_BASEMENTS_DECOMPRESSED_TARGET_QUERY | 51 |
    | Tokudb_BASEMENTS_DECOMPRESSED_PRELOCKED_RANGE | 1 |
    | Tokudb_BASEMENTS_DECOMPRESSED_PREFETCH | 0 |
    | Tokudb_BASEMENTS_DECOMPRESSED_FOR_WRITE | 47 |
    | Tokudb_BUFFERS_DECOMPRESSED_TARGET_QUERY | 3167 |
    | Tokudb_BUFFERS_DECOMPRESSED_PRELOCKED_RANGE | 45 |
    | Tokudb_BUFFERS_DECOMPRESSED_PREFETCH | 0 |
    | Tokudb_BUFFERS_DECOMPRESSED_FOR_WRITE | 3428 |
    | Tokudb_PIVOTS_FETCHED_FOR_QUERY | 3789 |
    | Tokudb_PIVOTS_FETCHED_FOR_QUERY_BYTES | 96169472 |
    | Tokudb_PIVOTS_FETCHED_FOR_QUERY_SECONDS | 1.042154 |
    | Tokudb_PIVOTS_FETCHED_FOR_PREFETCH | 10 |
    | Tokudb_PIVOTS_FETCHED_FOR_PREFETCH_BYTES | 327680 |
    | Tokudb_PIVOTS_FETCHED_FOR_PREFETCH_SECONDS | 0.000682 |
    | Tokudb_PIVOTS_FETCHED_FOR_WRITE | 118 |
    | Tokudb_PIVOTS_FETCHED_FOR_WRITE_BYTES | 1263104 |
    | Tokudb_PIVOTS_FETCHED_FOR_WRITE_SECONDS | 0.001498 |
    | Tokudb_BASEMENTS_FETCHED_TARGET_QUERY | 126157 |
    | Tokudb_BASEMENTS_FETCHED_TARGET_QUERY_BYTES | 1243361280 |
    | Tokudb_BASEMENTS_FETCHED_TARGET_QUERY_SECONDS | 3.823805 |
    | Tokudb_BASEMENTS_FETCHED_PRELOCKED_RANGE | 4673 |
    | Tokudb_BASEMENTS_FETCHED_PRELOCKED_RANGE_BYTES | 40569344 |
    | Tokudb_BASEMENTS_FETCHED_PRELOCKED_RANGE_SECONDS | 0.118158 |
    | Tokudb_BASEMENTS_FETCHED_PREFETCH | 4345 |
    | Tokudb_BASEMENTS_FETCHED_PREFETCH_BYTES | 30583808 |
    | Tokudb_BASEMENTS_FETCHED_PREFETCH_SECONDS | 0.045414 |
    | Tokudb_BASEMENTS_FETCHED_FOR_WRITE | 11625 |
    | Tokudb_BASEMENTS_FETCHED_FOR_WRITE_BYTES | 138430464 |
    | Tokudb_BASEMENTS_FETCHED_FOR_WRITE_SECONDS | 0.237772 |
    | Tokudb_BUFFERS_FETCHED_TARGET_QUERY | 337 |
    | Tokudb_BUFFERS_FETCHED_TARGET_QUERY_BYTES | 182784 |
    | Tokudb_BUFFERS_FETCHED_TARGET_QUERY_SECONDS | 0.001289 |
    | Tokudb_BUFFERS_FETCHED_PRELOCKED_RANGE | 28 |
    | Tokudb_BUFFERS_FETCHED_PRELOCKED_RANGE_BYTES | 14848 |
    | Tokudb_BUFFERS_FETCHED_PRELOCKED_RANGE_SECONDS | 0.000033 |
    | Tokudb_BUFFERS_FETCHED_PREFETCH | 0 |
    | Tokudb_BUFFERS_FETCHED_PREFETCH_BYTES | 0 |
    | Tokudb_BUFFERS_FETCHED_PREFETCH_SECONDS | 0.000000 |
    | Tokudb_BUFFERS_FETCHED_FOR_WRITE | 902 |
    | Tokudb_BUFFERS_FETCHED_FOR_WRITE_BYTES | 487936 |
    | Tokudb_BUFFERS_FETCHED_FOR_WRITE_SECONDS | 0.001105 |
    | Tokudb_LEAF_COMPRESSION_TO_MEMORY_SECONDS | 117.437031 |
    | Tokudb_LEAF_SERIALIZATION_TO_MEMORY_SECONDS | 6.225201 |
    | Tokudb_LEAF_DECOMPRESSION_TO_MEMORY_SECONDS | 66.514842 |
    | Tokudb_LEAF_DESERIALIZATION_TO_MEMORY_SECONDS | 29.246856 |
    | Tokudb_NONLEAF_COMPRESSION_TO_MEMORY_SECONDS | 1.533011 |
    | Tokudb_NONLEAF_SERIALIZATION_TO_MEMORY_SECONDS | 0.013500 |
    | Tokudb_NONLEAF_DECOMPRESSION_TO_MEMORY_SECONDS | 0.038646 |
    | Tokudb_NONLEAF_DESERIALIZATION_TO_MEMORY_SECONDS | 0.048904 |
    | Tokudb_PROMOTION_ROOTS_SPLIT | 0 |
    | Tokudb_PROMOTION_LEAF_ROOTS_INJECTED_INTO | 2422 |
    | Tokudb_PROMOTION_H1_ROOTS_INJECTED_INTO | 369 |
    | Tokudb_PROMOTION_INJECTIONS_AT_DEPTH_0 | 97 |
    | Tokudb_PROMOTION_INJECTIONS_AT_DEPTH_1 | 1533 |
    | Tokudb_PROMOTION_INJECTIONS_AT_DEPTH_2 | 645 |
    | Tokudb_PROMOTION_INJECTIONS_AT_DEPTH_3 | 171 |
    | Tokudb_PROMOTION_INJECTIONS_LOWER_THAN_DEPTH_3 | 0 |
    | Tokudb_PROMOTION_STOPPED_NONEMPTY_BUFFER | 420 |
    | Tokudb_PROMOTION_STOPPED_AT_HEIGHT_1 | 83 |
    | Tokudb_PROMOTION_STOPPED_CHILD_LOCKED_OR_NOT_IN_MEMORY | 4 |
    | Tokudb_PROMOTION_STOPPED_CHILD_NOT_FULLY_IN_MEMORY | 14 |
    | Tokudb_PROMOTION_STOPPED_AFTER_LOCKING_CHILD | 10 |
    | Tokudb_BASEMENT_DESERIALIZATION_FIXED_KEY | 100372 |
    | Tokudb_BASEMENT_DESERIALIZATION_VARIABLE_KEY | 46722 |
    | Tokudb_CURSOR_SKIP_DELETED_LEAF_ENTRY | 5454 |
    | Tokudb_TXN_BEGIN | 935566 |
    | Tokudb_TXN_BEGIN_READ_ONLY | 15290 |
    | Tokudb_TXN_COMMITS | 927289 |
    | Tokudb_TXN_ABORTS | 23544 |
    | Tokudb_LOGGER_WRITES | 1732 |
    | Tokudb_LOGGER_WRITES_BYTES | 2942123 |
    | Tokudb_LOGGER_WRITES_UNCOMPRESSED_BYTES | 2942123 |
    | Tokudb_LOGGER_WRITES_SECONDS | 3.539385 |
    | Tokudb_LOGGER_WAIT_LONG | 0 |
    | Tokudb_LOADER_NUM_CREATED | 0 |
    | Tokudb_LOADER_NUM_CURRENT | 0 |
    | Tokudb_LOADER_NUM_MAX | 0 |
    | Tokudb_MEM_ESTIMATED_MAXIMUM_MEMORY_FOOTPRINT | 0 |
    | Tokudb_FILESYSTEM_THREADS_BLOCKED_BY_FULL_DISK | 0 |
    | Tokudb_FILESYSTEM_FSYNC_TIME | 59781155 |
    | Tokudb_FILESYSTEM_FSYNC_NUM | 4056 |
    | Tokudb_FILESYSTEM_LONG_FSYNC_TIME | 0 |
    | Tokudb_FILESYSTEM_LONG_FSYNC_NUM | 0 |
    | Tokudb_rows_inserted | 998 |
    | Tokudb_rows_read | 62354889 |
    | Tokudb_rows_deleted | 169 |
    | Tokudb_rows_updated | 1043 |
    +-----------------------------------------------------------------+--------------------------+
    180 rows returned in 189.37 ms.
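
      The three *_COMPRESSION_RATIO values can be cross-checked against the byte counters in the same output. The sketch below assumes each ratio is uncompressed bytes divided by compressed bytes over all flushed nodes (checkpoint plus non-checkpoint); that is my reading of the counters, verified only against the numbers shown here.

```python
# Counters copied from the SHOW STATUS output above, all in bytes.
# Each sum is checkpoint flushes plus non-checkpoint flushes.
leaf_compressed = 273844224 + 694272
leaf_uncompressed = 1223284397 + 5592182
nonleaf_compressed = 484352 + 0
nonleaf_uncompressed = 600902 + 0


def compression_ratio(uncompressed, compressed):
    """Uncompressed size divided by on-disk (compressed) size."""
    return uncompressed / compressed


leaf_ratio = compression_ratio(leaf_uncompressed, leaf_compressed)
nonleaf_ratio = compression_ratio(nonleaf_uncompressed, nonleaf_compressed)
overall_ratio = compression_ratio(
    leaf_uncompressed + nonleaf_uncompressed,
    leaf_compressed + nonleaf_compressed,
)

print(f"leaf    {leaf_ratio:.6f}")
print(f"nonleaf {nonleaf_ratio:.6f}")
print(f"overall {overall_ratio:.6f}")
```

      Computed this way, the results agree with the reported 4.476154, 1.240631 and 4.470456 to six decimal places, which means the leaf data on this instance takes a little under a quarter of its uncompressed size on disk.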
    

      Some of these parameters still need to be explored and tuned.


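      Addendum, night of July 14, 2015: I was delighted by TokuDB's compression a couple of days ago and converted tables in many databases on several RDS instances overnight. Over the past two days, however, those instances have had one problem after another, mainly a large number of errors like this in the logs: errno: 24 - Too many open files. After some reading I learned that TokuDB creates one file per index (InnoDB and MyISAM create a set of files per table), so with many tables and many indexes it is easy to exceed the RDS limit on open files. Alibaba Cloud support later phoned and suggested: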

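    1. Convert the TokuDB tables back to InnoDB to resolve the current open-file errors;
    2. Test InnoDB compression to see whether it can replace TokuDB's compression with acceptable performance;
    3. Convert the MyISAM tables to the compressed InnoDB format as well, to avoid MyISAM file corruption.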
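
      The open-file problem can be estimated before converting. The sketch below assumes, as described above, roughly one file per index for TokuDB and ignores any additional per-table metadata files, so it is a lower bound. The table counts and the limit of 1024 are hypothetical; the actual RDS limit is not something I have a figure for.

```python
def estimate_tokudb_files(index_counts):
    """Lower-bound file count: one file per index, primary key included."""
    return sum(index_counts)


def exceeds_limit(index_counts, open_files_limit):
    """True if the estimated file count alone is above the limit."""
    return estimate_tokudb_files(index_counts) > open_files_limit


# Hypothetical instance: 300 tables with 4 indexes each.
index_counts = [4] * 300
hypothetical_limit = 1024  # assumption, not the real RDS value

print(estimate_tokudb_files(index_counts))
print(exceeds_limit(index_counts, hypothetical_limit))
```

      The per-table index counts could be taken from information_schema.statistics by counting distinct index names per table.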

      Today I again spent a lot of time on these conversions and tests.

  • Original article: https://www.cnblogs.com/duyinqiang/p/5696322.html