1:分区概述
分区的过程是将一个表或索引分解为多个更小、更可管理的部分。就访问数据库的应用而言,逻辑上只有一个表或者索引,物理上可能由是个物理分区组成。每个分区都是独立的对象,可独自处理,也可以作为一个更大的部分进行处理。
查看MySQL是否支持分区:
mysql> SHOW VARIABLES LIKE '%partition%'G *************************** 1. row *************************** Variable_name: have_partitioning Value: YES 1 row in set (0.14 sec)
也可以:
mysql> SHOW PLUGINS G
*************************** 17. row *************************** Name: partition Status: ACTIVE Type: STORAGE ENGINE Library: NULL License: GPL 17 rows in set (0.00 sec)
PS:分区可以提高某些SQL的语句性能,但是如果一味的使用分区,而不理解分区是如何工作的,也不清楚如何使用分区,那么分区很有可能只会对性能产生负面的影响
2:MySQL的分区类型
RANGE分区:行数据基于某一个给定连续区间的列值存放入分区
LIST分区:行数据基于给定的List散列存放
HASH分区:行数据基于自定义表达式的返回值存放,返回值不能是负数。
KEY分区:根据MySQL提供的散列函数来进行分区。
不管是哪种分区,如果表中存在主键或者唯一索引,分区列必须是唯一索引的一个组成部分。
如果建表的时候没有注定主键或者唯一索引,可以执行任何一个列为分区列。
2.1 RANGE分区
CREATE TABLE testpart1( id INT PRIMARY KEY, name CHAR(8), addr VARCHAR(100), tel BIGINT, hired DATETIME, layoff DATETIME) ENGINE = INNODB DEFAULT CHARSET = UTF8 PARTITION BY RANGE(id)( PARTITION p0 VALUES LESS THAN(10), PARTITION p1 VALUES LESS THAN(20), PARTITION p2 VALUES LESS THAN(30), PARTITION p3 VALUES LESS THAN(40), PARTITION p4 VALUES LESS THAN(50));
mysql> system ls -lh /gechong/mysql/data
INSERT INTO testpart1 VALUES(1,'Hello','world',15849823389,'2015-01-01 09:20:00','2015-02-05 09:20:00'); INSERT INTO testpart1 VALUES(21,'Hello','world',15849823389,'2015-01-01 09:20:00','2015-02-05 09:20:00'); INSERT INTO testpart1 VALUES(31,'Hello','world',15849823389,'2015-01-01 09:20:00','2015-02-05 09:20:00'); INSERT INTO testpart1 VALUES(41,'Hello','world',15849823389,'2015-01-01 09:20:00','2015-02-05 09:20:00');
mysql> SELECT * FROM information_schema.PARTITIONS WHERE table_schema='testpart' AND table_name='testpart1' G *************************** 1. row *************************** TABLE_CATALOG: def TABLE_SCHEMA: testpart TABLE_NAME: testpart1 PARTITION_NAME: p0 SUBPARTITION_NAME: NULL PARTITION_ORDINAL_POSITION: 1 SUBPARTITION_ORDINAL_POSITION: NULL PARTITION_METHOD: RANGE SUBPARTITION_METHOD: NULL PARTITION_EXPRESSION: id SUBPARTITION_EXPRESSION: NULL PARTITION_DESCRIPTION: 10 TABLE_ROWS: 1 AVG_ROW_LENGTH: 16384 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL *************************** 2. row *************************** TABLE_CATALOG: def TABLE_SCHEMA: testpart TABLE_NAME: testpart1 PARTITION_NAME: p1 SUBPARTITION_NAME: NULL PARTITION_ORDINAL_POSITION: 2 SUBPARTITION_ORDINAL_POSITION: NULL PARTITION_METHOD: RANGE SUBPARTITION_METHOD: NULL PARTITION_EXPRESSION: id SUBPARTITION_EXPRESSION: NULL PARTITION_DESCRIPTION: 20 TABLE_ROWS: 0 AVG_ROW_LENGTH: 0 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL *************************** 3. row *************************** TABLE_CATALOG: def TABLE_SCHEMA: testpart TABLE_NAME: testpart1 PARTITION_NAME: p2 SUBPARTITION_NAME: NULL PARTITION_ORDINAL_POSITION: 3 SUBPARTITION_ORDINAL_POSITION: NULL PARTITION_METHOD: RANGE SUBPARTITION_METHOD: NULL PARTITION_EXPRESSION: id SUBPARTITION_EXPRESSION: NULL PARTITION_DESCRIPTION: 30 TABLE_ROWS: 1 AVG_ROW_LENGTH: 16384 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL *************************** 4. row *************************** TABLE_CATALOG: def TABLE_SCHEMA: testpart TABLE_NAME: testpart1 PARTITION_NAME: p3 SUBPARTITION_NAME: NULL PARTITION_ORDINAL_POSITION: 4 SUBPARTITION_ORDINAL_POSITION: NULL PARTITION_METHOD: RANGE SUBPARTITION_METHOD: NULL PARTITION_EXPRESSION: id SUBPARTITION_EXPRESSION: NULL PARTITION_DESCRIPTION: 40 TABLE_ROWS: 1 AVG_ROW_LENGTH: 16384 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL *************************** 5. row *************************** TABLE_CATALOG: def TABLE_SCHEMA: testpart TABLE_NAME: testpart1 PARTITION_NAME: p4 SUBPARTITION_NAME: NULL PARTITION_ORDINAL_POSITION: 5 SUBPARTITION_ORDINAL_POSITION: NULL PARTITION_METHOD: RANGE SUBPARTITION_METHOD: NULL PARTITION_EXPRESSION: id SUBPARTITION_EXPRESSION: NULL PARTITION_DESCRIPTION: 50 TABLE_ROWS: 1 AVG_ROW_LENGTH: 16384 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL 5 rows in set (0.00 sec)
2.1.1 添加分区
此时最大50,当需要插入更大的值时,需要:
ALTER TABLE testpart1 ADD PARTITION( PARTITION p6 VALUES LESS THAN MAXVALUE);
2.1.2 删除分区
如果分区不需要了也可以删除:
ALTER TABLE testpart1 DROP PARTITION p3;
2.1.3 分区的执行计划
mysql> EXPLAIN PARTITIONS SELECT * FROM testpart1 WHERE id=41 G *************************** 1. row ***************************
2.1.4 RANGE分区函数的选择
在分区函数的选择上,优化器只能对YEAR()、TO_DAYS()、TO_SECONDS()、UNIX_TIMESTAMP()这类函数进行优化选择
例如这样建立分区表就是属于没有优化过的
CREATE TABLE testparttime( id INT, hired DATETIME)ENGINE = INNODB PARTITION BY RANGE(YEAR(hired)*100+MONTH(hired))( PARTITION p0 VALUES LESS THAN (201002), PARTITION p1 VALUES LESS THAN (201003), PARTITION p2 VALUES LESS THAN (201004));
mysql> EXPLAIN PARTITIONS SELECT * FROM testparttime G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: testparttime partitions: p0,p1,p2 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 Extra: 1 row in set (0.00 sec)
下属方法就可以的:
CREATE TABLE testparttime1( id INT, hired DATETIME)ENGINE = INNODB PARTITION BY RANGE(TO_DAYS(hired))( PARTITION p0 VALUES LESS THAN (TO_DAYS('2010-02-01')), PARTITION p1 VALUES LESS THAN (TO_DAYS('2010-03-01')), PARTITION p2 VALUES LESS THAN (TO_DAYS('2010-04-01')));
2.2 LIST 分区
list分区和range分区非常相似,只是分区列的值是离散的而非连续的。
CREATE TABLE testpartlist( id int, nul int)ENGINE=INNODB PARTITION BY LIST(id)( PARTITION p0 VALUES IN (1,3,5,7,9), PARTITION p1 VALUES IN (0,2,4,6,8));
2.3 HASH 分区
hash分区的目的是将数据均衡的分在预先定义的各个分区中,保证各分区数据量尽量是一致的。在RANGE和LIST分区中,必须明确指定一个给定的列值或者列值的集合应该保存在哪个分区中。在HASH分区中,MySQL自动完成这些工作,用户要做的只是基于将要被散列的值指定一个列或表达式,以及指定被分区的表将要被分割的分区数量
2.3.1 创建hash分区
CREATE TABLE testparthash( id INT, hired DATETIME)ENGINE = INNODB PARTITION BY HASH (YEAR(hired)) PARTITIONS 4;
INSERT INTO testparthash VALUES(10,'2011-01-01 09:20:00'); INSERT INTO testparthash VALUES(11,'2012-01-01 09:20:00'); INSERT INTO testparthash VALUES(12,'2013-01-01 09:20:00'); INSERT INTO testparthash VALUES(13,'2014-01-01 09:20:00'); INSERT INTO testparthash VALUES(14,'2015-01-01 09:20:00'); INSERT INTO testparthash VALUES(15,'2010-01-01 09:20:00'); INSERT INTO testparthash VALUES(16,'2013-01-01 09:20:00'); INSERT INTO testparthash VALUES(17,'2013-01-01 09:20:00'); INSERT INTO testparthash VALUES(18,'2015-01-01 09:20:00'); INSERT INTO testparthash VALUES(19,'2014-01-01 09:20:00'); INSERT INTO testparthash VALUES(21,'2015-01-01 09:20:00'); INSERT INTO testparthash VALUES(22,'2015-01-01 09:20:00'); INSERT INTO testparthash VALUES(23,'2011-01-01 09:20:00'); INSERT INTO testparthash VALUES(24,'2012-01-01 09:20:00'); INSERT INTO testparthash VALUES(25,'2015-01-01 09:20:00');
2.3.2 查看分区:
SELECT table_name,partition_name,table_rows FROM information_schema.PARTITIONS WHERE table_schema='testpart' AND table_name='testparthash' G
2.3.4 验证分区
mysql> SELECT table_name,partition_name,table_rows -> FROM information_schema.PARTITIONS -> WHERE table_schema='testpart' AND table_name='testparthash' G *************************** 1. row *************************** table_name: testparthash partition_name: p0 table_rows: 2 *************************** 2. row *************************** table_name: testparthash partition_name: p1 table_rows: 3 *************************** 3. row *************************** table_name: testparthash partition_name: p2 table_rows: 3 *************************** 4. row *************************** table_name: testparthash partition_name: p3 table_rows: 7 4 rows in set (0.00 sec)
可以看到如上的hash分区数据并不是很均衡,因为YEAR()函数是离散的,如果对于连续的值,例如主键,HASH可以较好的将数据进行平均分布。
2.4 KEY 分区
key分区和hash分区类似。不同指出在于hash分区通过用户定义的函数进行分区
创建分区需要考虑的几个问题?
1:采用哪种分区类型?一般是hash和range
2:根据哪些分区键分区?分区键有哪些特点?是连续的?还是离散的?