1. mycat分库分表的分片方案
目录1.1. 一. 配置规则
1.1.1. 分片方式
1.1.2. 配置规则
TableRule :
- name:分片规则标识名称(唯一)。
- columns:分片表的列名。
- algorithm:算法的名称。
<tableRule name="auto-sharding-long">
<rule>
<columns>id</columns>
<algorithm>rang-long</algorithm>
</rule>
</tableRule>
Function :
- name:算法名称(唯一)。
- class:具体的分片算法(需要指定算法的具体类)。
- property:属性根据要求指定。
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
<property name="mapFile">autopartition-long.txt</property>
</function>
1.2. 二. 连续分片
1.2.1. 自定义数字范围分片
适用于明确知道分片字段的某个范围属于哪个分片,其字段为数字类型
schema.xml 配置
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
<!--逻辑数据库设置-->
<schema name="luckcat" checkSQLschema="false" sqlMaxLimit="100">
<!--指定切分表和分表规则-->
<table name="auto_sharding_long" primaryKey="id" dataNode="dn0,dn1,dn2" rule="auto-sharding-long"/>
</schema>
<!--逻辑节点与物理主机映射-->
<dataNode name="dn0" dataHost="dh-1" database="db0"/>
<dataNode name="dn1" dataHost="dh-1" database="db1"/>
<dataNode name="dn2" dataHost="dh-1" database="db2"/>
<dataNode name="dn3" dataHost="dh-1" database="db3"/>
<dataNode name="dn4" dataHost="dh-1" database="db4"/>
<dataNode name="dn5" dataHost="dh-1" database="db5"/>
<dataNode name="dn6" dataHost="dh-1" database="db6"/>
<dataNode name="dn7" dataHost="dh-1" database="db7"/>
<dataNode name="dn8" dataHost="dh-1" database="db8"/>
<dataNode name="dn9" dataHost="dh-1" database="db9"/>
<dataNode name="dn10" dataHost="dh-1" database="db10"/>
<!--物理主机-->
<dataHost name="dh-1" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native"
switchType="1" slaveThreshold="100">
<heartbeat>select user()</heartbeat>
<writeHost host="hostM1" url="localhost:3306" user="root" password="123456"/>
</dataHost>
</mycat:schema>
rule.xml 配置
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mycat:rule SYSTEM "rule.dtd">
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="auto-sharding-long">
<rule>
<columns>age</columns> <!--分表字段-->
<algorithm>rang-long</algorithm> <!--算法-->
</rule>
</tableRule>
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
<property name="mapFile">autopartition-long.txt</property>
<property name="defaultNode">0</property>
</function>
</mycat:rule>
autopartition-long.txt 配置
# range start-end ,data node index
# K=1000,M=10000.
0-2M=0
2M-3M=1
3M-6M=2
test.sql
CREATE TABLE auto_sharding_long (`age` int NOT NULL ,`db_nm` varchar(20) NULL);
INSERT INTO auto_sharding_long (age,db_nm) VALUES (20000, database());
INSERT INTO auto_sharding_long (age,db_nm) VALUES (25000, database());
INSERT INTO auto_sharding_long (age,db_nm) VALUES (35000, database());
select * from auto_sharding_long;
1.2.2. 按日期分片
schema.xml
<table name="sharding_by_day" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-day" />
rule.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mycat:rule SYSTEM "rule.dtd">
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-day">
<rule>
<columns>create_time</columns>
<algorithm>part-by-day</algorithm>
</rule>
</tableRule>
<function name="part-by-day"
class="io.mycat.route.function.PartitionByDate">
<property name="dateFormat">yyyy-MM-dd</property>
<property name="sBeginDate">2017-10-01</property>
<!--默认从开始日期算起,每隔10天一个分区。-->
<property name="sPartionDay">10</property>
</function>
</mycat:rule>
test.sql
CREATE TABLE sharding_by_day (create_time timestamp NULL ON UPDATE CURRENT_TIMESTAMP ,`db_nm` varchar(20) NULL);
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-01', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-10', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-11', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-21', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-31', database());
select * from sharding_by_day;
-- 以上测试结果中有个错误,因为配置的dataNode节点数为3,而2017-10-31为2017-10-01后的第四个10天中的第一天,因此需要至少4个dataNode,节点不够就报如上错误了。
1.2.3. 按单月小时分片
schema.xml
<table name="sharding_by_hour" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-hour" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-hour">
<rule>
<columns>sharding_col</columns>
<algorithm>sharding-by-hour</algorithm>
</rule>
</tableRule>
<function name="sharding-by-hour" class="io.mycat.route.function.LatestMonthPartion">
<property name="splitOneDay">3</property>
</function>
</mycat:rule>
test.sql
-- 单月内按照小时拆分,最小粒度是小时,一天最多可以有24个分片,最少1个分片,下个月从头开始循环,每个月末需要手工清理数据。
-- 字段为字符串类型,yyyymmddHH 10位。
CREATE TABLE sharding_by_hour (create_time timestamp NULL ON UPDATE CURRENT_TIMESTAMP ,`db_nm` varchar(20) NULL,sharding_col varchar(10) null);
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100101','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100108','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100109','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100116','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100117','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100123','2017-10-01', database());
select * from sharding_by_hour;
1.2.4. 按自然月分片
schema.xml
<table name="sharding_by_month" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-month" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-month">
<rule>
<columns>create_time</columns>
<algorithm>sharding-by-month</algorithm>
</rule>
</tableRule>
<function name="sharding-by-month" class="io.mycat.route.function.PartitionByMonth">
<property name="dateFormat">yyyy-MM-dd</property>
<property name="sBeginDate">2017-10-01</property>
<property name="sEndDate">2017-12-30</property>
</function>
</mycat:rule>
test.sql
CREATE TABLE sharding_by_month (create_time timestamp NULL ON UPDATE CURRENT_TIMESTAMP ,`db_nm` varchar(20) NULL);
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-10-01', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-10-30', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-11-11', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-11-21', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-12-01', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-12-31', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2018-01-01', database());
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2018-01-31', database());
select * from sharding_by_month
-- 注意
-- * schema里的table的dataNode节点数必须:大于rule的开始时间按照分片数计算到现在的个数
-- * 按照自然月计算(无论是28、30、31天都是一个月的)
-- * 分片节点个数可以后增加,但是必须符合第一点说明。
1.3. 三. 离散分片
1.3.1. 枚举分片
schema.xml
<table name="sharding_by_intfile" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-intfile" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-intfile">
<rule>
<columns>age</columns>
<algorithm>hash-int</algorithm>
</rule>
</tableRule>
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
<property name="mapFile">partition-hash-int.txt</property>
<property name="type">0</property>
<property name="defaultNode">0</property>
</function>
</mycat:rule>
partition-hash-int.txt
# 枚举项=节点标识,枚举项可以是数字和字符(非汉字)
11=1
12=2
test.sql
-- 通过在配置文件中配置可能的枚举id,指定数据分布到不同的物理节点上,本规则适用于按照省份或区县来拆分数据类业务。
CREATE TABLE sharding_by_intfile (`age` int NOT NULL ,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (10, database());
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (11, database());
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (12, database());
select * from `sharding_by_intfile`;
1.3.2. 程序指定分区的分片
schema.xml
<table name="sharding_by_substring" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-substring" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-substring">
<rule>
<columns>user_id</columns>
<algorithm>sharding-by-substring</algorithm>
</rule>
</tableRule>
<function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString">
<property name="startIndex">0</property>
<property name="size">2</property>
<property name="partitionCount">3</property>
<property name="defaultPartition">0</property>
</function>
</mycat:rule>
test.sql
-- 在程序运行阶段,由程序自主决定路由到哪个分片。
CREATE TABLE sharding_by_substring (`user_id` varchar(20) NOT NULL ,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('05-10000', database());
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('02-10001', database());
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('03-10002', database());
select * from `sharding_by_substring`;
1.3.3. 十进制求模分片
schema.xml
<table name="sharding_by_mod" primaryKey="id" dataNode="dn$0-2" rule="sharding-mod-long" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-mod-long">
<rule>
<columns>id</columns>
<algorithm>mod-long</algorithm>
</rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
<property name="count">3</property>
</function>
</mycat:rule>
test.sql
CREATE TABLE `sharding_by_mod` (id int(10) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (1, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (2, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (3, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (10, database());
select * from sharding_by_mod;
1.3.4. 字符串hash解析分片
schema.xml
<table name="sharding_by_stringhash" primaryKey="ord_no" dataNode="dn$0-2" rule="sharding-by-stringhash" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-stringhash">
<rule>
<columns>ord_no</columns>
<algorithm>sharding-by-stringhash</algorithm>
</rule>
</tableRule>
<function name="sharding-by-stringhash" class="io.mycat.route.function.PartitionByString">
<property name="partitionLength">512</property>
<property name="partitionCount">2</property>
<property name="hashSlice">-6:0</property>
</function>
</mycat:rule>
<!-- 配置说明:-->
<!-- * columns标识将要分片的表字段。-->
<!-- * algorithm为分片函数。-->
<!-- * partitionLength为字符串hash的求模基数。-->
<!-- * partitionCount为分区数。其中partitionLength*partitionCount=1024-->
<!-- * hashSlice为预算位,即根据子字符串中的int值进行hash运算。-->
<!-- 0 代表 str.length(), -1 代表 str.length()-1,大于0只代表数字自身-->
<!-- 可以理解为substring(start,end),start为0则只表示0-->
<!-- 例1:值“45abc”,hash预算位0:2 ,取其中45进行计算-->
<!-- 例2:值“aaaabbb2345”,hash预算位-4:0 ,取其中2345进行计算-->
test.sql
-- 此规则是截取字符串中的int数值的hash分片。
CREATE TABLE sharding_by_stringhash (ord_no varchar(20) NULL,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_stringhash` (ord_no,db_nm) VALUES (171022237582, database());
INSERT INTO `sharding_by_stringhash` (ord_no,db_nm) VALUES (171022553756, database());
select * from sharding_by_stringhash;
1.3.5. 一致性哈希分片
schema.xml
<table name="sharding_by_murmurhash" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-murmurhash" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-murmurhash">
<rule>
<columns>id</columns>
<algorithm>sharding-by-murmurhash</algorithm>
</rule>
</tableRule>
<function name="sharding-by-murmurhash" class="io.mycat.route.function.PartitionByMurmurHash">
<property name="seed">0</property><!-- 默认是0 -->
<property name="count">2</property><!-- 要分片的数据库节点数量,必须指定,否则没法分片 -->
<property name="virtualBucketTimes">160</property>
<!-- 一个实际的数据库节点被映射为这么多虚拟节点,默认是160倍,也就是虚拟节点数是物理节点数的160倍-->
<!-- <property name="weightMapFile">weightMapFile</property> 节点的权重,没有指定权重的节点默认是1。
以properties文件的格式填写,以从0开始到count-1的整数值也就是节点索引为key,以节点权重值为值。所有权重值必须是正整数,否则以1代替 -->
<!-- <property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
用于测试时观察各物理节点与虚拟节点的分布情况,如果指定了这个属性,会把虚拟节点的murmur hash值与物理节点的映射按行输出到
这个文件,没有默认值>,如果不指定,就不会输出任何东西 -->
</function>
</mycat:rule>
test.sql
-- 一致性hash算法有效解决了分布式数据的扩容问题。因为此规则优点在于扩容时迁移数据量比较少,前提是分片节点比较多,虚拟节点分配多些。
-- 虚拟节点分配的少就会造成数据分布不够均匀。但如果实际分片数据比较少,迁移量也会比较多。
CREATE TABLE sharding_by_murmurhash (id int(10) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (1, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (2, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (3, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (4, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (5, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (6, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (7, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (8, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (9, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (10, database());
select * from sharding_by_murmurhash;
1.4. 四. 综合分片
1.4.1. 范围求模分片
schema.xml
<table name="sharding_by_rang_mod" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-rang-mod" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-rang-mod">
<rule>
<columns>id</columns>
<algorithm>sharding-by-rang-mod</algorithm>
</rule>
</tableRule>
<function name="sharding-by-rang-mod" class="io.mycat.route.function.PartitionByRangeMod">
<!--指定分片规则的配置文件-->
<property name="mapFile">partition-range-mod.txt</property>
<!--未包含以上规则的数据存储节点,节点从0开始。-->
<property name="defaultNode">0</property>
</function>
</mycat:rule>
partition-range-mod.txt
# range start-end ,data node group size
0-200M=3
200M1-400M=1
400M1-600M=4
600M1-800M=2
test.sql
-- 该算法先进行范围分片,计算出分片组,组内再求模,综合了范围分片和求模分片的优点。分片组内使用求模可以保证组内的数据分布比较均匀,
-- 分片组之间采用范围分片可以兼顾范围分片的特点。事先规定好分片的数量,数据扩容时按分片组扩容,则原有分片组的数据不需要迁移。由于
-- 分片组内的数据分布比较均匀,所以分片组内可以避免热点数据问题。
CREATE TABLE sharding_by_rang_mod (id bigint null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (1000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (1002, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (40000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (40005, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60005, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60008, database());
select * from sharding_by_rang_mod;
1.4.2. 日期范围hash分片
schema.xml
<table name="sharding_by_date_hash" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-range-date-hash" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-range-date-hash">
<rule>
<columns>create_time</columns>
<algorithm>sharding-by-range-date-hash</algorithm>
</rule>
</tableRule>
<function name="sharding-by-range-date-hash" class="io.mycat.route.function.PartitionByRangeDateHash">
<property name="dateFormat">yyyy-MM-dd HH:mm:ss</property>
<property name="sBeginDate">2018-01-31 00:00:00</property>
<property name="sPartionDay">30</property>
<property name="groupPartionSize">10</property>
</function>
</mycat:rule>
<!-- sPartionDay:代表多少天分一个分片;-->
<!-- groupPartionSize:代表分片组的大小。-->
test.sql
-- 其思想与范围求模一致,由于日期取模方法会出现数据热点问题,所以先根据日期分组,再根据时间hash使得短期内数据分布得更均匀。
-- 其优点是可以避免扩容时的数据迁移,又可以在一定程度上避免范围分片的热点问题,要求日期格式尽量精确,不然达不到局部均匀的目的。
CREATE TABLE `sharding_by_date_hash` (`create_time` datetime NOT NULL,`db_nm` varchar(20) NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-12 00:01:02', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-21 01:02:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-28 12:00:12', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-02 11:00:00', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-26 10:00:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-03-01 22:01:02', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-02 17:09:08', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-23 11:00:04', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-24 18:12:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-21 07:12:00', database());
select * from sharding_by_date_hash;
1.4.3. 取模范围约束分片
schema.xml
<table name="sharding_by_pattern" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-pattern" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-pattern">
<rule>
<columns>id</columns>
<algorithm>sharding-by-pattern</algorithm>
</rule>
</tableRule>
<function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern">
<property name="mapFile">partition-pattern.txt</property>
<property name="defaultNode">0</property>
<property name="patternValue">256</property>
</function>
</mycat:rule>
<!-- 配置说明:-->
<!-- * 1-32即代表id%256后分布的范围,如果在1-32则分区在0,其他类推。-->
<!-- * 如果id非数字,则会分配在defaultNode默认节点。-->
partition-pattern.txt
# id patition range start-end, data node index
## first host configuration
1-32=0
33-64=1
65-96=2
97-128=3
## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7
test.sql
CREATE TABLE sharding_by_pattern (id varchar(20) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (1000, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (1002, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (40000, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (40005, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (60005, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES ("abcd0", database());
select * from sharding_by_pattern;
1.4.4. ASCII码求模范围约束(字符串)
schema.xml
<table name="sharding_by_ascii" primaryKey="id" dataNode="dn$0-10" rule="partition-by-ascii" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="partition-by-ascii">
<rule>
<columns>id</columns>
<algorithm>sharding-by-pattern-ascii</algorithm>
</rule>
</tableRule>
<function name="sharding-by-pattern-ascii" class="io.mycat.route.function.PartitionByPrefixPattern">
<property name="mapFile">partition-pattern-ascii.txt</property>
<property name="patternValue">256</property>
<property name="prefixLength">5</property>
</function>
</mycat:rule>
partition-pattern-ascii.txt
# id patition range start-end, data node index
## first host configuration
1-32=0
33-64=1
65-96=2
97-128=3
## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7
test.sql
-- 此种规则类似于取模范围约束,此规则支持数据符号字母取模。
CREATE TABLE sharding_by_ascii (id varchar(20) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("1000a", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("1002A", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("4000B", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("4000b", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (60007, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("abcd0", database());
select * from sharding_by_ascii;
1.4.5. 固定分片hash(二进制)
schema.xml
<table name="sharding_by_long" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-long" />
rule.xml
<mycat:rule xmlns:mycat="http://io.mycat/">
<tableRule name="sharding-by-long">
<rule>
<columns>id</columns>
<algorithm>sharding-by-long</algorithm>
</rule>
</tableRule>
<function name="sharding-by-long" class="io.mycat.route.function.PartitionByLong">
<property name="partitionCount">2,1</property>
<property name="partitionLength">256,512</property>
</function>
</mycat:rule>
<!-- 配置说明:-->
<!-- * partitionCount为指定分片个数列表。-->
<!-- * partitionLength为分片范围列表。-->
<!-- 分区长度:默认为最大2^n=1024 ,即最大支持1024分区-->
<!-- 约束 :-->
<!-- count,length两个数组的长度必须是一致的。-->
<!-- 1024 = sum((count[i]*length[i])). count和length两个向量的点积恒等于1024-->
<!-- 本例的分区策略:希望将数据水平分成3份,前两份各占25%,第三份占50%。(故本例非均匀分区)-->
<!-- // |<———————1024———————————>|-->
<!-- // |<—-256—>|<—-256—>|<———-512————->|-->
<!-- // | partition0 | partition1 | partition2 |-->
<!-- // | 共2份,故count[0]=2 | 共1份,故count[1]=1 |-->
<!-- int[] count = new int[] { 2, 1 };-->
<!-- int[] length = new int[] { 256, 512 };-->
test.sql
-- 本条规则类似于十进制的求模运算,区别在于是二进制的操作,是取id的二进制低10位,即id二进制&1111111111。此算法的优点在
-- 于如果按照十进制取模运算,则在连续插入1~10时,1~10会被分到1~10个分片,增大了插入事务的控制难度。而此算法根据二进制则
-- 可能会分到连接的分片,降低了插入事务的控制难度。
CREATE TABLE sharding_by_long (id int(10) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (1000, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (1002, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (4000, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (4000, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (60007, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (0, database());
select * from sharding_by_long;
1.5. 六. 分片策略
- 根据表数据量判断是否需要切分,确保切分后单分片表数据量为1000W左右。
- 根据业务的情况选择合适的分片字段: 最频繁的或者最重要的查询条件。
- 需要考虑扩容数据迁移问题(范围类,范围取模类不需要迁移,哈希类需要迁移)。
- 有关联关系的表配置相同分片规则(ER思想,为了应用join等复杂sql),一对多对应关系一般按多的那一方切分。
- 如果配置类数据, 更新频率比较少,考虑全局表。
源码链接: https://download.csdn.net/download/Gusand/12102557
参考:
推荐博客:https://blog.csdn.net/ygqygq2/article/details/78390985
在线教程:http://www.mycat.io/document/mycat-definitive-guide.pdf