首页 > 数据库 >13.MongoDB系列之分片简介

13.MongoDB系列之分片简介

时间:2022-10-17 20:23:24浏览次数:80  
标签:username 13 2bffe09ec303 min -- MongoDB shards 分片 setParameter

1. 分片概念

分片是指跨机器拆分数据的过程,有时也会用术语分区。MongoDB既可以手工分片,也支持自动分片

2. 理解集群组件

分片的目标之一是由多个分片组成的集群对应用程序来说就像是一台服务器。为了此实现,需要在分片前面运行一个或多个称为mongos的路由进程。mongos维护着一个“目录”,指明了哪个分片包含哪些数据。
应用程序可以正常连接到此路由器并发出请求。路由服务器哪些数据在哪些分片上,可以将请求转发到适当的分片。如果有对请求的响应,路由服务器会收集他们,并在必要时进行合并,然后在发送给应用程序

3. 在单机集群上进行分片
# mongo --nodb --norc
MongoDB shell version v4.2.6
>st = ShardingTest({
    name: "one-min-shards",
    chunkSize:1,
    shards: 2,
    rs:{
        nodes:3,
        oplogSize:10
    },
    other: {
        enableBalancer:true
    }
})

当ShardingTest完成集群设置后,将启动并运行10个进程:两个副本集(各有3个节点)、一个配置服务器副本集(有3个节点)、一个mongos

在新终端窗口执行ps -ef|grep mongo可以看到如下信息:

00:00:09 /usr/bin/mongod --oplogSize 10 --port 20000 --replSet one-min-shards-rs0 --dbpath /data/db/one-min-shards-rs0-0 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       105    66  1 21:57 pts/0    00:00:10 /usr/bin/mongod --oplogSize 10 --port 20001 --replSet one-min-shards-rs0 --dbpath /data/db/one-min-shards-rs0-1 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       140    66  1 21:57 pts/0    00:00:10 /usr/bin/mongod --oplogSize 10 --port 20002 --replSet one-min-shards-rs0 --dbpath /data/db/one-min-shards-rs0-2 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       293    66  1 21:58 pts/0    00:00:09 /usr/bin/mongod --oplogSize 10 --port 20003 --replSet one-min-shards-rs1 --dbpath /data/db/one-min-shards-rs1-0 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       328    66  1 21:58 pts/0    00:00:09 /usr/bin/mongod --oplogSize 10 --port 20004 --replSet one-min-shards-rs1 --dbpath /data/db/one-min-shards-rs1-1 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       363    66  1 21:58 pts/0    00:00:09 /usr/bin/mongod --oplogSize 10 --port 20005 --replSet one-min-shards-rs1 --dbpath /data/db/one-min-shards-rs1-2 --shardsvr --setParameter migrationLockAcquisitionMaxWaitMS=30000 --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       525    66  2 21:58 pts/0    00:00:11 /usr/bin/mongod --oplogSize 40 --port 20006 --replSet one-min-shards-configRS --dbpath /data/db/one-min-shards-configRS-0 --journal --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       567    66  1 21:58 pts/0    00:00:11 /usr/bin/mongod --oplogSize 40 --port 20007 --replSet one-min-shards-configRS --dbpath /data/db/one-min-shards-configRS-1 --journal --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       609    66  1 21:58 pts/0    00:00:10 /usr/bin/mongod --oplogSize 40 --port 20008 --replSet one-min-shards-configRS --dbpath /data/db/one-min-shards-configRS-2 --journal --configsvr --storageEngine wiredTiger --setParameter writePeriodicNoops=false --setParameter numInitialSyncConnectAttempts=60 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true --setParameter orphanCleanupDelaySecs=1
root       792    66  0 21:58 pts/0    00:00:00 /usr/bin/mongos -v --port 20009 --configdb one-min-shards-configRS/2bffe09ec303:20006,2bffe09ec303:20007,2bffe09ec303:20008 --bind_ip 0.0.0.0 --setParameter enableTestCommands=1 --setParameter disableLogicalSessionCacheRefresh=true

整个集群会将日志存储到当前shell中,因此打开第二个终端窗口,并启动另一个mongo shell:

# mongo --port 20009
mongos> use accounts
switched to db accounts
mongos> for(var i=0; i<100000;i++){db.users.insert({'username': 'user'+i, 'created_at': new Date()})}
mongos> db.users.count()
100000
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("62d2c414601aeb7a88c169f7")
  }
  shards:
        {  "_id" : "one-min-shards-rs0",  "host" : "one-min-shards-rs0/2bffe09ec303:20000,2bffe09ec303:20001,2bffe09ec303:20002",  "state" : 1 }
        {  "_id" : "one-min-shards-rs1",  "host" : "one-min-shards-rs1/2bffe09ec303:20003,2bffe09ec303:20004,2bffe09ec303:20005",  "state" : 1 }
  active mongoses:
        "4.2.6" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "accounts",  "primary" : "one-min-shards-rs1",  "partitioned" : false,  "version" : {  "uuid" : UUID("ba658ebe-bbd1-44cd-b2c9-cb48d3810551"),  "lastMod" : 1 } }
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                one-min-shards-rs0      1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : one-min-shards-rs0 Timestamp(1, 0)

要对一个特定的集合进行分片,首先需要在集合的数据库上启用分片,如下:

mongos> sh.enableSharding("accounts")

在对集合进行分片时,需要选择一个片键。片键是MongoDB用来拆分数据的一个或几个字段。随着集合的增大,片键会成为集合中最重要的索引。只有创建了索引的字段才能够作为片键。

mongos> db.users.createIndex({'username':1})

现在可以通过username片键来对集合进行分片了

mongos> sh.shardCollection("accounts.users", {"username":1})
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("62d2c414601aeb7a88c169f7")
  }
  shards:
        {  "_id" : "one-min-shards-rs0",  "host" : "one-min-shards-rs0/2bffe09ec303:20000,2bffe09ec303:20001,2bffe09ec303:20002",  "state" : 1 }
        {  "_id" : "one-min-shards-rs1",  "host" : "one-min-shards-rs1/2bffe09ec303:20003,2bffe09ec303:20004,2bffe09ec303:20005",  "state" : 1 }
  active mongoses:
        "4.2.6" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                6 : Success
  databases:
        {  "_id" : "accounts",  "primary" : "one-min-shards-rs1",  "partitioned" : true,  "version" : {  "uuid" : UUID("ba658ebe-bbd1-44cd-b2c9-cb48d3810551"),  "lastMod" : 1 } }
                accounts.users
                        shard key: { "username" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                one-min-shards-rs0      6
                                one-min-shards-rs1      7
                        { "username" : { "$minKey" : 1 } } -->> { "username" : "user17256" } on : one-min-shards-rs0 Timestamp(2, 0)
                        { "username" : "user17256" } -->> { "username" : "user24515" } on : one-min-shards-rs0 Timestamp(3, 0)
                        { "username" : "user24515" } -->> { "username" : "user31775" } on : one-min-shards-rs0 Timestamp(4, 0)
                        { "username" : "user31775" } -->> { "username" : "user39034" } on : one-min-shards-rs0 Timestamp(5, 0)
                        { "username" : "user39034" } -->> { "username" : "user46294" } on : one-min-shards-rs0 Timestamp(6, 0)
                        { "username" : "user46294" } -->> { "username" : "user53553" } on : one-min-shards-rs0 Timestamp(7, 0)
                        { "username" : "user53553" } -->> { "username" : "user60812" } on : one-min-shards-rs1 Timestamp(7, 1)
                        { "username" : "user60812" } -->> { "username" : "user68072" } on : one-min-shards-rs1 Timestamp(1, 7)
                        { "username" : "user68072" } -->> { "username" : "user75331" } on : one-min-shards-rs1 Timestamp(1, 8)
                        { "username" : "user75331" } -->> { "username" : "user82591" } on : one-min-shards-rs1 Timestamp(1, 9)
                        { "username" : "user82591" } -->> { "username" : "user89851" } on : one-min-shards-rs1 Timestamp(1, 10)
                        { "username" : "user89851" } -->> { "username" : "user9711" } on : one-min-shards-rs1 Timestamp(1, 11)
                        { "username" : "user9711" } -->> { "username" : { "$maxKey" : 1 } } on : one-min-shards-rs1 Timestamp(1, 12)
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                one-min-shards-rs0      1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : one-min-shards-rs0 Timestamp(1, 0)

可以看到,这个集合被分为了13个块,其中6个块在分片one-min-shards-rs0上,7个块在分片one-min-shards-rs1上。

现在数据已经分布在分片上了,现在我们查询特定用户名:

mongos> db.users.find({username: "user12345"})
{ "_id" : ObjectId("62d2ca068dbf17063ba4d62f"), "username" : "user12345", "created_at" : ISODate("2022-07-16T14:24:06.891Z") }

通过explain可以观察到数据分布在分片one-min-shards-rs0上

mongos> db.users.find({username: "user12345"}).explain()
{
        "queryPlanner" : {
                "mongosPlannerVersion" : 1,
                "winningPlan" : {
                        "stage" : "SINGLE_SHARD",
                        "shards" : [
                                {
                                        "shardName" : "one-min-shards-rs0",
                                        "connectionString" : "one-min-shards-rs0/2bffe09ec303:20000,2bffe09ec303:20001,2bffe09ec303:20002",
                                        "serverInfo" : {
                                                "host" : "2bffe09ec303",
                                                "port" : 20001,
                                                "version" : "4.2.6",
                                                "gitVersion" : "20364840b8f1af16917e4c23c1b5f5efd8b352f8"
                                        },
                                        "plannerVersion" : 1,
                                        "namespace" : "accounts.users",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                                "username" : {
                                                        "$eq" : "user12345"
                                                }
                                        },
                                        "queryHash" : "379E82C5",
                                        "planCacheKey" : "965E0A67",
                                        "winningPlan" : {
                                                "stage" : "FETCH",
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "username" : 1
                                                                },
                                                                "indexName" : "username_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "username" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "username" : [
                                                                                "[\"user12345\", \"user12345\"]"
                                                                        ]
                                                                }
                                                        }
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        ]
                }
        }
}

当查询全部数据时,可以发现查询必须方位两个分片才能找到所有数据。通常来说,如果查询中没有使用片键,mongos就不得不将查询发送到每个分片上。

mongos> db.users.find({}).explain()
{
        "queryPlanner" : {
                "mongosPlannerVersion" : 1,
                "winningPlan" : {
                        "stage" : "SHARD_MERGE",
                        "shards" : [
                                {
                                        "shardName" : "one-min-shards-rs0",
                                        "connectionString" : "one-min-shards-rs0/2bffe09ec303:20000,2bffe09ec303:20001,2bffe09ec303:20002",
                                        "serverInfo" : {
                                                "host" : "2bffe09ec303",
                                                "port" : 20001,
                                                "version" : "4.2.6",
                                                "gitVersion" : "20364840b8f1af16917e4c23c1b5f5efd8b352f8"
                                        },
                                        "plannerVersion" : 1,
                                        "namespace" : "accounts.users",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {

                                        },
                                        "queryHash" : "8B3D4AB8",
                                        "planCacheKey" : "8B3D4AB8",
                                        "winningPlan" : {
                                                "stage" : "SHARDING_FILTER",
                                                "inputStage" : {
                                                        "stage" : "COLLSCAN",
                                                        "direction" : "forward"
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                },
                                {
                                        "shardName" : "one-min-shards-rs1",
                                        "connectionString" : "one-min-shards-rs1/2bffe09ec303:20003,2bffe09ec303:20004,2bffe09ec303:20005",
                                        "serverInfo" : {
                                                "host" : "2bffe09ec303",
                                                "port" : 20005,
                                                "version" : "4.2.6",
                                                "gitVersion" : "20364840b8f1af16917e4c23c1b5f5efd8b352f8"
                                        },
                                        "plannerVersion" : 1,
                                        "namespace" : "accounts.users",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {

                                        },
                                        "queryHash" : "8B3D4AB8",
                                        "planCacheKey" : "8B3D4AB8",
                                        "winningPlan" : {
                                                "stage" : "SHARDING_FILTER",
                                                "inputStage" : {
                                                        "stage" : "COLLSCAN",
                                                        "direction" : "forward"
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        ]
                }
        },
        "serverInfo" : {
                "host" : "2bffe09ec303",
                "port" : 20009,
                "version" : "4.2.6",
                "gitVersion" : "20364840b8f1af16917e4c23c1b5f5efd8b352f8"
        },
        "ok" : 1,
        "operationTime" : Timestamp(1658027749, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1658030075, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

定向查询:包含片键并可以发送到单个分片或分片子集的查询

分散-收集查询:必须发送到所有分片的查询

现在可以在原来的shell,按几次Enter键返回命令行,运行sh.stop()干净的关闭所有服务器

欢迎关注公众号算法小生沈健的技术博客

标签:username,13,2bffe09ec303,min,--,MongoDB,shards,分片,setParameter
From: https://www.cnblogs.com/shenjian-online/p/16800491.html

相关文章

  • 17.MongoDB系列之了解应用程序动态
    1.查看当前操作mongos>db.currentOp(){"inprog":[{"shard":"study","type":"op......
  • 16.MongoDB系列之分片管理
    1.查看当前状态1.1查看配置信息mongos>useconfig//查看分片mongos>db.shards.find(){"_id":"study","host":"study/localhost:27018,localhost:27019,loc......
  • 15. MongoDB系列之选择片键
    1.片键类型1.1升序片键升序片键通常类似于date或ObjectId--随着时间稳步增长的字段。这种模式通常会使MongoDB更难保持块的平衡,因为所有的块都是由一个分片创建的。1......
  • ShardingSphere的分片策略
    ShardingSphere的分片策略本篇文章源码基于4.0.1版本上篇文章我们说到ShardingSphere通过路由引擎根据路由规则获取数据节点,然后生成路由结果,具体路由策略在ShardingStr......
  • 13_爬虫入门
    爬虫1.回顾web服务器知识点:fastapi和uvicorn配合搭建web服务器:0.先安装fastapi和uvicorn1.导包2.创建fastapi对象3.处理web数据(请求和响应)......
  • [IOI2013]robots 机器人
    题目传送门思路简单题,设函数\(f_i\)表示当时间为\(i\)时是否能够收拾好所有玩具,则\(f_i\)显然是单调的。所以我们可以考虑二分。设我们当前二分到\(x\),我们先把......
  • CF1335F Robots on a Grid
    CF1335F:因为每个格子都只向外连一条边,所以网格可感性理解为一个基环树森林。则每个机器人最终都会走到一个环上,那么所占据黑格便也在环上。那么若要使机器人数量最多,且......
  • 数据集 | 以色列高中的巴格鲁特成绩(2013-2016)数据集
    下载数据集请登录爱数科该文件包含2013年至2016年之间1800多个不同学科的学校平均60,000等级的Bagrut成绩。1.字段描述2.数据预览3.字段诊断信息4.数据来源来源于Kaggle......
  • 3133. 串珠子
    题目链接3133.串珠子给定\(M\)种不同颜色的珠子,每种颜色的珠子的个数都足够多。现在要从中挑选\(N\)个珠子,串成一个环形手链。请问一共可以制作出多少种不同的手......
  • 13.流程控制
    流程控制选择分支结构--某用户银行卡号为'6225540436726367',--该用户执行取钱操作,取钱5000元,余额充足则进行取钱操作,--并提示“取钱成功”,否则提示“余额不足”declar......