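Cluster components:
According to the cluster architecture published by MongoDB, a cluster consists of four main components: Config Server, mongos, shard, and replica set:
Config Server: stores the cluster metadata, that is, the configuration data for all shards. A mongos connects to the config servers at first startup to read this data, and when the metadata changes every mongos picks up the update so that its cached copy stays current. For version 3.0.7 the official recommendation is to run three config servers, so that the cluster can keep serving reads and writes when one or two of them are down (the metadata becomes read-only in that state, so no chunk splits or migrations happen until all three are back).
mongos: the request entry point of the MongoDB cluster. It routes each operation to the right shard automatically; in production it is recommended to deploy it on the application servers.
shard: sharding spreads one large table (collection) across several shards, so that the data is stored in a distributed way.
replica set: provides redundancy for each shard. In production a replica set is usually deployed on three nodes: two data-bearing members and one arbiter.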
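Why three nodes with an arbiter rather than just two: a replica set can elect a primary only with a strict majority of its voting members, and an arbiter votes without storing data. A small sketch of that arithmetic (the function names are only for illustration):

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of voting members."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """Voting members that may be down while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (2, 3):
    print(n, "voting members: majority", majority(n),
          "tolerated failures", tolerated_failures(n))
```

With two members a single failure already blocks elections; adding the arbiter as a third voter lets the set survive the loss of one node.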
Environment planning:
Port and installation path planning:
Role | IP | Port | Notes | Install path |
Config Server | 172.16.16.120 | 30001 | | /db/configS |
Config Server | 172.16.16.121 | 30001 | | /db/configS |
Config Server | 172.16.16.122 | 30001 | | /db/configS |
shard1 | 172.16.16.124 | 40001 | shard1 primary | /db/shard1 |
shard1 | 172.16.16.125 | 40001 | shard1 secondary | /db/shard1 |
shard1 | 172.16.16.126 | 40001 | shard1 arbiter | /db/shard1 |
shard2 | 172.16.16.125 | 40002 | shard2 primary | /db/shard2 |
shard2 | 172.16.16.126 | 40002 | shard2 secondary | /db/shard2 |
shard2 | 172.16.16.131 | 40002 | shard2 arbiter | /db/shard2 |
shard3 | 172.16.16.126 | 40003 | shard3 primary | /db/shard3 |
shard3 | 172.16.16.131 | 40003 | shard3 secondary | /db/shard3 |
shard3 | 172.16.16.124 | 40003 | shard3 arbiter | /db/shard3 |
shard4 | 172.16.16.131 | 40004 | shard4 primary | /db/shard4 |
shard4 | 172.16.16.124 | 40004 | shard4 secondary | /db/shard4 |
shard4 | 172.16.16.125 | 40004 | shard4 arbiter | /db/shard4 |
mongos | 172.16.16.124 | 50001 | in production usually deployed directly on the application hosts | /db/mongos |
mongos | 172.16.16.125 | 50001 | | /db/mongos |
mongos | 172.16.16.126 | 50001 | | /db/mongos |
mongos | 172.16.16.131 | 50001 | | /db/mongos |
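Because several roles share the same machines, it can help to write the plan down as data and check it before starting anything. The sketch below is only an illustration: the `LAYOUT` dictionary and helper names are made up here, and the shard4 hosts follow the directory and replica set steps later in this post (131/124/125).

```python
from collections import defaultdict

# role -> (port, hosts); for shards the host order is primary, secondary, arbiter.
LAYOUT = {
    "configsvr": (30001, ["172.16.16.120", "172.16.16.121", "172.16.16.122"]),
    "shard1": (40001, ["172.16.16.124", "172.16.16.125", "172.16.16.126"]),
    "shard2": (40002, ["172.16.16.125", "172.16.16.126", "172.16.16.131"]),
    "shard3": (40003, ["172.16.16.126", "172.16.16.131", "172.16.16.124"]),
    "shard4": (40004, ["172.16.16.131", "172.16.16.124", "172.16.16.125"]),
    "mongos": (50001, ["172.16.16.124", "172.16.16.125",
                       "172.16.16.126", "172.16.16.131"]),
}


def processes_per_host(layout):
    """Group (role, port) pairs by the host they run on."""
    hosts = defaultdict(list)
    for role, (port, ips) in layout.items():
        for ip in ips:
            hosts[ip].append((role, port))
    return dict(hosts)


def port_conflicts(layout):
    """Return every (host, port) pair that more than one role would bind."""
    owners = defaultdict(list)
    for role, (port, ips) in layout.items():
        for ip in ips:
            owners[(ip, port)].append(role)
    return sorted(pair for pair, roles in owners.items() if len(roles) > 1)


per_host = processes_per_host(LAYOUT)
for host in sorted(per_host):
    print(host, per_host[host])
print("conflicts:", port_conflicts(LAYOUT))
```

Each of the four shard hosts ends up running three shard members plus one mongos, which is worth keeping in mind when sizing the WiredTiger cache later.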
Setup steps:
Download MongoDB (https://www.mongodb.org/downloads); the latest version at the time of writing is 3.0.7.
opt]# tar zxvf mongodb-linux-x86_64-rhel55-3.0.7.gz
opt]# mv mongodb-linux-x86_64-rhel55-3.0.7 /usr/local/mongodb
opt]# useradd mongo
opt]# passwd mongo
Changing password for user mongo.
New UNIX password:
BAD PASSWORD: it is too simplistic/systematic
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
opt]# chown -R mongo:mongo /usr/local/mongodb/
opt]# chown -R mongo:mongo /db
Create the directories for each component:
Create the config server directories (172.16.16.120/121/122):
# mkdir -p /db/configS/data && mkdir -p /db/configS/log   (data and logs of the config server)
Create the shard1 directories (172.16.16.124/125/126):
# mkdir -p /db/shard1/data && mkdir -p /db/shard1/log   (data and logs of shard1)
Create the shard2 directories (172.16.16.125/126/131):
# mkdir -p /db/shard2/data && mkdir -p /db/shard2/log   (data and logs of shard2)
Create the shard3 directories (172.16.16.126/131/124):
# mkdir -p /db/shard3/data && mkdir -p /db/shard3/log   (data and logs of shard3)
Create the shard4 directories (172.16.16.131/124/125):
# mkdir -p /db/shard4/data && mkdir -p /db/shard4/log   (data and logs of shard4)
Create the mongos directory (172.16.16.124/125/126/131):
# mkdir -p /db/mongos/log   (mongos only routes requests and stores no data, so a log directory is enough)
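If you prefer to script the directory creation, the same layout can be produced with `pathlib`. This is a rough sketch under my own naming (`SUBDIRS`, `create_dirs`); it does a dry run in a temporary directory rather than touching the real /db.

```python
import tempfile
from pathlib import Path

# Sub-directories per component; mongos keeps no data, so it only gets log/.
SUBDIRS = {
    "configS": ["data", "log"],
    "shard1": ["data", "log"],
    "shard2": ["data", "log"],
    "shard3": ["data", "log"],
    "shard4": ["data", "log"],
    "mongos": ["log"],
}


def create_dirs(root, components):
    """Create <root>/<component>/<subdir> for each component; return the paths."""
    created = []
    for name in components:
        for sub in SUBDIRS[name]:
            path = Path(root) / name / sub
            path.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
            created.append(path)
    return created


# Dry run for the components hosted on 172.16.16.124.
with tempfile.TemporaryDirectory() as tmp:
    for p in create_dirs(tmp, ["shard1", "shard3", "shard4", "mongos"]):
        print(p.relative_to(tmp))
```

On a real host you would pass "/db" as the root and then hand ownership to the mongo user, as in the chown step earlier.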
Component configuration and startup:
Config server (172.16.16.120/121/122) configuration and startup:
Create /usr/local/mongodb/conf/configServer.conf and put all parameters in it:
# vim /usr/local/mongodb/conf/configServer.conf
systemLog:
  destination: file
  path: "/db/configS/log/configServer.log"  # log file location
  logAppend: true
storage:
  journal:  # journal settings
    enabled: true
  dbPath: "/db/configS/data"  # data file location
  directoryPerDB: true  # one directory per database
  engine: wiredTiger  # storage engine
  wiredTiger:  # WiredTiger settings
    engineConfig:
      cacheSizeGB: 6  # set to 6 GB; the default is half of physical RAM
      directoryForIndexes: true  # keep indexes in their own directory as well
      journalCompressor: zlib
    collectionConfig:  # collection compression
      blockCompressor: zlib
    indexConfig:  # index settings
      prefixCompression: true
net:  # network settings
  port: 30001  # per the planning table all three hosts use 30001; change it only if config servers share a host
processManagement:  # how the process is started
  fork: true
sharding:  # sharding settings
  clusterRole: configsvr  # role in the sharded cluster
Start the config server:
conf]$ /usr/local/mongodb/bin/mongod -f /usr/local/mongodb/conf/configServer.conf
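mongos (172.16.16.124/125/126/131) configuration and startup:
Create mongos.conf and put all parameters in it (the file is identical on all four hosts):
# vim /usr/local/mongodb/conf/mongos.conf
systemLog:
  destination: file
  path: "/db/mongos/log/mongos.log"
  logAppend: true
net:
  port: 50001
sharding:
  configDB: 172.16.16.120:30001,172.16.16.121:30001,172.16.16.122:30001
processManagement:
  fork: true
Start mongos: the clocks of all machines in the cluster must be in sync, otherwise mongos reports an error at startup. If they are not, set up an NTP server first.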
conf]$ /usr/local/mongodb/bin/mongos -f /usr/local/mongodb/conf/mongos.conf
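Since every mongos should be given the same config server list in the same order, generating the string once and pasting it into each mongos.conf avoids typos. A tiny sketch; the helper name is made up, and port 30001 is assumed from the planning table:

```python
def config_db_string(hosts, port):
    """Join config server hosts into the host:port list used by sharding.configDB."""
    return ",".join(f"{host}:{port}" for host in hosts)


CONFIG_HOSTS = ["172.16.16.120", "172.16.16.121", "172.16.16.122"]
print("configDB:", config_db_string(CONFIG_HOSTS, 30001))
```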
shard1 (shard + replica set) configuration and startup (172.16.16.124/125/126):
# vim /usr/local/mongodb/conf/shard1.conf
systemLog:
  destination: file
  path: "/db/shard1/log/shard1.log"  # log file location
  logAppend: true
storage:
  journal:  # journal settings
    enabled: true
  dbPath: "/db/shard1/data"  # data file location
  directoryPerDB: true  # one directory per database
  engine: wiredTiger  # storage engine
  wiredTiger:  # WiredTiger settings
    engineConfig:
      cacheSizeGB: 6  # set to 6 GB; the default is half of physical RAM
      directoryForIndexes: true  # keep indexes in their own directory as well
      journalCompressor: zlib
    collectionConfig:  # collection compression
      blockCompressor: zlib
    indexConfig:  # index settings
      prefixCompression: true
net:  # network settings
  port: 40001
processManagement:  # how the process is started
  fork: true
sharding:  # sharding settings
  clusterRole: shardsvr
replication:
  replSetName: shard1  # replica set name
Start the shard1 mongod:
conf]$ /usr/local/mongodb/bin/mongod -f /usr/local/mongodb/conf/shard1.conf
shard2 (shard + replica set) configuration and startup (172.16.16.125/126/131):
# vim /usr/local/mongodb/conf/shard2.conf
systemLog:
  destination: file
  path: "/db/shard2/log/shard2.log"  # log file location
  logAppend: true
storage:
  journal:  # journal settings
    enabled: true
  dbPath: "/db/shard2/data"  # data file location
  directoryPerDB: true  # one directory per database
  engine: wiredTiger  # storage engine
  wiredTiger:  # WiredTiger settings
    engineConfig:
      cacheSizeGB: 6  # set to 6 GB; the default is half of physical RAM
      directoryForIndexes: true  # keep indexes in their own directory as well
      journalCompressor: zlib
    collectionConfig:  # collection compression
      blockCompressor: zlib
    indexConfig:  # index settings
      prefixCompression: true
net:  # network settings
  port: 40002
processManagement:  # how the process is started
  fork: true
sharding:  # sharding settings
  clusterRole: shardsvr
replication:
  #oplogSizeMB:
  replSetName: shard2  # replica set name
Start the shard2 mongod:
conf]$ /usr/local/mongodb/bin/mongod -f /usr/local/mongodb/conf/shard2.conf
shard3 (shard + replica set) configuration and startup (172.16.16.126/131/124):
# vim /usr/local/mongodb/conf/shard3.conf
systemLog:
  destination: file
  path: "/db/shard3/log/shard3.log"  # log file location
  logAppend: true
storage:
  journal:  # journal settings
    enabled: true
  dbPath: "/db/shard3/data"  # data file location
  directoryPerDB: true  # one directory per database
  engine: wiredTiger  # storage engine
  wiredTiger:  # WiredTiger settings
    engineConfig:
      cacheSizeGB: 6  # set to 6 GB; the default is half of physical RAM
      directoryForIndexes: true  # keep indexes in their own directory as well
      journalCompressor: zlib
    collectionConfig:  # collection compression
      blockCompressor: zlib
    indexConfig:  # index settings
      prefixCompression: true
net:  # network settings
  port: 40003
processManagement:  # how the process is started
  fork: true
sharding:  # sharding settings
  clusterRole: shardsvr
replication:
  #oplogSizeMB:
  replSetName: shard3  # replica set name
Start the shard3 mongod:
conf]$ /usr/local/mongodb/bin/mongod -f /usr/local/mongodb/conf/shard3.conf
shard4 (shard + replica set) configuration and startup (172.16.16.131/124/125):
# vim /usr/local/mongodb/conf/shard4.conf
systemLog:
  destination: file
  path: "/db/shard4/log/shard4.log"  # log file location
  logAppend: true
storage:
  journal:  # journal settings
    enabled: true
  dbPath: "/db/shard4/data"  # data file location
  directoryPerDB: true  # one directory per database
  engine: wiredTiger  # storage engine
  wiredTiger:  # WiredTiger settings
    engineConfig:
      cacheSizeGB: 6  # set to 6 GB; the default is half of physical RAM
      directoryForIndexes: true  # keep indexes in their own directory as well
      journalCompressor: zlib
    collectionConfig:  # collection compression
      blockCompressor: zlib
    indexConfig:  # index settings
      prefixCompression: true
net:  # network settings
  port: 40004
processManagement:  # how the process is started
  fork: true
sharding:  # sharding settings
  clusterRole: shardsvr
replication:
  #oplogSizeMB:
  replSetName: shard4  # replica set name
Start the shard4 mongod:
conf]$ /usr/local/mongodb/bin/mongod -f /usr/local/mongodb/conf/shard4.conf
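The four shard files differ only in the shard name and the port, so they can be rendered from one template instead of being edited by hand. A minimal sketch using the standard library; the template mirrors the settings above with a 6 GB cache, and the names `SHARD_CONF` and `render_shard_conf` are my own:

```python
from string import Template

SHARD_CONF = Template("""\
systemLog:
  destination: file
  path: "/db/$name/log/$name.log"
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: "/db/$name/data"
  directoryPerDB: true
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: $cache_gb
      directoryForIndexes: true
      journalCompressor: zlib
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
net:
  port: $port
processManagement:
  fork: true
sharding:
  clusterRole: shardsvr
replication:
  replSetName: $name
""")


def render_shard_conf(name, port, cache_gb=6):
    """Return the YAML text of one shard's mongod configuration file."""
    return SHARD_CONF.substitute(name=name, port=port, cache_gb=cache_gb)


# Shard name -> port, taken from the planning table.
SHARDS = {"shard1": 40001, "shard2": 40002, "shard3": 40003, "shard4": 40004}
for name, port in SHARDS.items():
    text = render_shard_conf(name, port)
    print(f"{name}.conf: {len(text.splitlines())} lines, port {port}")
```

Writing each rendered string to /usr/local/mongodb/conf/<name>.conf on the right hosts is left out here on purpose.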
Cluster configuration:
Replica set configuration (run the configuration and initialization on the node that is to be each shard's primary, otherwise the initialization fails):
Replica set configuration for shard1 (primary, secondary, arbiter):
bin]$ ./mongo 172.16.16.124:40001
MongoDB shell version: 3.0.7
connecting to: 172.16.16.124:40001/test
> use admin
switched to db admin
> config = { _id:"shard1", members:[
    {_id:0,host:"172.16.16.124:40001"},
    {_id:1,host:"172.16.16.125:40001"},
    {_id:2,host:"172.16.16.126:40001",arbiterOnly:true}]
  }  // the shell echoes the document:
{
    "_id" : "shard1",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.16.16.124:40001"
        },
        {
            "_id" : 1,
            "host" : "172.16.16.125:40001"
        },
        {
            "_id" : 2,
            "host" : "172.16.16.126:40001",
            "arbiterOnly" : true
        }
    ]
}
> rs.initiate(config);  // initialize the replica set
{ "ok" : 1 }
Replica set configuration for shard2 (primary, secondary, arbiter):
bin]$ ./mongo 172.16.16.125:40002
MongoDB shell version: 3.0.7
connecting to: 172.16.16.125:40002/test
> use admin
switched to db admin
> config = { _id:"shard2", members:[
    {_id:0,host:"172.16.16.125:40002"},
    {_id:1,host:"172.16.16.126:40002"},
    {_id:2,host:"172.16.16.131:40002",arbiterOnly:true}]
  }  // the shell echoes the document:
{
    "_id" : "shard2",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.16.16.125:40002"
        },
        {
            "_id" : 1,
            "host" : "172.16.16.126:40002"
        },
        {
            "_id" : 2,
            "host" : "172.16.16.131:40002",
            "arbiterOnly" : true
        }
    ]
}
> rs.initiate(config);  // initialize the replica set
{ "ok" : 1 }
Replica set configuration for shard3 (primary, secondary, arbiter):
bin]$ ./mongo 172.16.16.126:40003
MongoDB shell version: 3.0.7
connecting to: 172.16.16.126:40003/test
> use admin
switched to db admin
> config = { _id:"shard3", members:[
    {_id:0,host:"172.16.16.126:40003"},
    {_id:1,host:"172.16.16.131:40003"},
    {_id:2,host:"172.16.16.124:40003",arbiterOnly:true}]
  }  // the shell echoes the document:
{
    "_id" : "shard3",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.16.16.126:40003"
        },
        {
            "_id" : 1,
            "host" : "172.16.16.131:40003"
        },
        {
            "_id" : 2,
            "host" : "172.16.16.124:40003",
            "arbiterOnly" : true
        }
    ]
}
> rs.initiate(config);  // initialize the replica set
{ "ok" : 1 }
Replica set configuration for shard4 (primary, secondary, arbiter):
bin]$ ./mongo 172.16.16.131:40004
MongoDB shell version: 3.0.7
connecting to: 172.16.16.131:40004/test
> use admin
switched to db admin
> config = { _id:"shard4", members:[
    {_id:0,host:"172.16.16.131:40004"},
    {_id:1,host:"172.16.16.124:40004"},
    {_id:2,host:"172.16.16.125:40004",arbiterOnly:true}]
  }  // the shell echoes the document:
{
    "_id" : "shard4",
    "members" : [
        {
            "_id" : 0,
            "host" : "172.16.16.131:40004"
        },
        {
            "_id" : 1,
            "host" : "172.16.16.124:40004"
        },
        {
            "_id" : 2,
            "host" : "172.16.16.125:40004",
            "arbiterOnly" : true
        }
    ]
}
> rs.initiate(config);  // initialize the replica set
{ "ok" : 1 }
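The four replica set documents follow one pattern, so they can be generated rather than typed. The sketch below assumes the host order primary, secondary, arbiter used in this post; `replset_config` is a hypothetical helper, and the printed JSON can be pasted into the mongo shell as the `config` value.

```python
import json


def replset_config(name, port, hosts):
    """Build the rs.initiate() document; the last host is treated as the arbiter."""
    members = []
    for i, host in enumerate(hosts):
        member = {"_id": i, "host": f"{host}:{port}"}
        if i == len(hosts) - 1:  # arbiter comes last in this layout
            member["arbiterOnly"] = True
        members.append(member)
    return {"_id": name, "members": members}


# Shard name -> (port, [primary, secondary, arbiter]).
SHARDS = {
    "shard1": (40001, ["172.16.16.124", "172.16.16.125", "172.16.16.126"]),
    "shard2": (40002, ["172.16.16.125", "172.16.16.126", "172.16.16.131"]),
    "shard3": (40003, ["172.16.16.126", "172.16.16.131", "172.16.16.124"]),
    "shard4": (40004, ["172.16.16.131", "172.16.16.124", "172.16.16.125"]),
}

for name, (port, hosts) in SHARDS.items():
    print("config =", json.dumps(replset_config(name, port, hosts)))
```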
Shard configuration:
bin]$ ./mongo 172.16.16.124:50001
mongos> use admin
switched to db admin
mongos> db.runCommand({addshard:"shard1/172.16.16.124:40001,172.16.16.125:40001,172.16.16.126:40001"});
{ "shardAdded" : "shard1", "ok" : 1 }
mongos> db.runCommand({addshard:"shard2/172.16.16.125:40002,172.16.16.126:40002,172.16.16.131:40002"});
{ "shardAdded" : "shard2", "ok" : 1 }
mongos> db.runCommand({addshard:"shard3/172.16.16.126:40003,172.16.16.131:40003,172.16.16.124:40003"});
{ "shardAdded" : "shard3", "ok" : 1 }
mongos> db.runCommand({addshard:"shard4/172.16.16.131:40004,172.16.16.124:40004,172.16.16.125:40004"});
{ "shardAdded" : "shard4", "ok" : 1 }
Check that the configuration took effect (arbiters are not listed):
mongos> db.runCommand( { listshards : 1 } );
{
    "shards" : [
        {
            "_id" : "shard1",
            "host" : "shard1/172.16.16.124:40001,172.16.16.125:40001"
        },
        {
            "_id" : "shard2",
            "host" : "shard2/172.16.16.125:40002,172.16.16.126:40002"
        },
        {
            "_id" : "shard3",
            "host" : "shard3/172.16.16.126:40003,172.16.16.131:40003"
        },
        {
            "_id" : "shard4",
            "host" : "shard4/172.16.16.124:40004,172.16.16.131:40004"
        }
    ],
    "ok" : 1
}
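To double-check the listing against the plan, the seed strings passed to addshard and the host strings expected back can be derived from the same data. This is a sketch with made-up helper names; the sorted order in `listed_hosts` simply matches what the output above shows and is not something I can claim the server guarantees.

```python
def add_shard_seed(name, port, hosts):
    """Seed string for addshard: <replSetName>/<host:port>,<host:port>,..."""
    return name + "/" + ",".join(f"{h}:{port}" for h in hosts)


def listed_hosts(name, port, hosts):
    """Host string as seen in listshards: data-bearing members only, sorted.

    hosts is ordered primary, secondary, arbiter; the arbiter is dropped.
    """
    data_members = sorted(f"{h}:{port}" for h in hosts[:-1])
    return name + "/" + ",".join(data_members)


shard4 = ("shard4", 40004, ["172.16.16.131", "172.16.16.124", "172.16.16.125"])
print(add_shard_seed(*shard4))
print(listed_hosts(*shard4))
```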
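That completes the setup of the MongoDB shard + replica set cluster; next comes functional testing.
Cluster test
By default databases and collections are not sharded automatically, so any data written goes to a single shard only. A quick test to verify this: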
bin]$ ./mongo 172.16.16.131:50001
MongoDB shell version: 3.0.7
connecting to: 172.16.16.131:50001/test
mongos> use ljaidb
switched to db ljaidb
mongos> for (var i=1;i<=10000;i++) db.ljaitable.save({"name":"ljai","age":27,"addr":"fuzhou"})
WriteResult({ "nInserted" : 1 })
mongos> db.ljaitable.stats()
{
"sharded" : false,
"primary" : "shard1",
"ns" : "ljaidb.ljaitable",
"count" : 10000,
"size" : ,
"avgObjSize" : ,
"storageSize" : ,
"capped" : false,
"wiredTiger" : {
"metadata" : {
"formatVersion" :
}
mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: {
"_id" : ,
"minCompatibleVersion" : ,
"currentVersion" : ,
"clusterId" : ObjectId("5625fc29e3c17fdff8517b73")
}
shards:
{ "_id" : "shard1", "host" : "shard1/172.16.16.124:40001,172.16.16.125:40001" }
{ "_id" : "shard2", "host" : "shard2/172.16.16.125:40002,172.16.16.126:40002" }
{ "_id" : "shard3", "host" : "shard3/172.16.16.126:40003,172.16.16.131:40003" }
{ "_id" : "shard4", "host" : "shard4/172.16.16.124:40004,172.16.16.131:40004" }
balancer:
Currently enabled: yes
Currently running: yes
Balancer lock taken at Tue Oct :: GMT+ (CST) by DataServer-::::Balancer:
Failed balancer rounds in last 5 attempts:
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard1" }
{ "_id" : "ljaidb", "partitioned" : false, "primary" : "shard1" }
As shown, the ljaidb database is not sharded and all of its data sits on shard1. Log in to shard1 to check:
bin]$ ./mongo 172.16.16.124:40001
MongoDB shell version: 3.0.7
connecting to: 172.16.16.124:40001/test
shard1:PRIMARY> show dbs
ljaidb  0.000GB
local   0.000GB
shard1:PRIMARY> use ljaidb
switched to db ljaidb
shard1:PRIMARY> show tables
ljaitable
shard1:PRIMARY> db.ljaitable.find().count()
10000
Verify that shard2, shard3 and shard4 do not have the ljaidb database:
bin]$ ./mongo 172.16.16.125:40002
MongoDB shell version: 3.0.7
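connecting to: 172.16.16.125:40002/test
shard2:PRIMARY> show dbs
local  0.000GB
Sharding a specific database and collection:
To make automatic sharding take effect for a database and a collection, enable sharding on the database (lymdb) and on the collection (lymtable) inside it: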
bin]$ ./mongo 172.16.16.124:50001
MongoDB shell version: 3.0.7
connecting to: 172.16.16.124:50001/test
mongos> use admin
switched to db admin
mongos> db.runCommand( { enablesharding :"lymdb"});
{ "ok" : 1 }
mongos> db.runCommand( { shardcollection : "lymdb.lymtable",key : {_id: 1} } )
{ "collectionsharded" : "lymdb.lymtable", "ok" : 1 }
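A key of {_id: 1} means range-based sharding: the key space is cut into chunks, each chunk covers a half-open range [lower, upper), and each chunk belongs to one shard. The toy model below shows how a key is routed under that scheme. The split points and chunk owners are invented for illustration and integer keys stand in for real _id values; this is not how the cluster above is actually split.

```python
from bisect import bisect_right

# Assumed split points: chunk 0 is (-inf, 250), chunk 1 is [250, 500), and so on.
SPLIT_POINTS = [250, 500, 750]
# Assumed owner of each chunk, in chunk order.
CHUNK_OWNER = ["shard1", "shard2", "shard3", "shard4"]


def route(key):
    """Return the shard that owns the chunk whose range contains key."""
    return CHUNK_OWNER[bisect_right(SPLIT_POINTS, key)]


for key in (0, 249, 250, 749, 750, 10**6):
    print(key, "->", route(key))
```

One consequence of this model is that a key which only ever increases keeps landing in the last chunk, so new writes concentrate on one shard until that chunk is split and migrated.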
Connect to the cluster through the Java or Python driver to test it:
Java client code:
import java.util.ArrayList;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.ServerAddress;

public class TestMongoDBShards {
    public static void main(String[] args) {
        try {
            // List all four mongos routers so the driver can use any of them.
            List<ServerAddress> addresses = new ArrayList<ServerAddress>();
            addresses.add(new ServerAddress("172.16.16.124", 50001));
            addresses.add(new ServerAddress("172.16.16.125", 50001));
            addresses.add(new ServerAddress("172.16.16.126", 50001));
            addresses.add(new ServerAddress("172.16.16.131", 50001));

            MongoClient client = new MongoClient(addresses);
            DB db = client.getDB("lymdb");
            DBCollection coll = db.getCollection("lymtable");

            // Insert one million test documents into the sharded collection.
            for (int i = 1; i <= 1000000; i++) {
                DBObject saveData = new BasicDBObject();
                saveData.put("id", i);
                saveData.put("userName", "baiwan" + i);
                saveData.put("age", "26");
                saveData.put("gender", "m");
                coll.save(saveData);
            }
            client.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Python client code:
# -*- coding: utf-8 -*-
import datetime

from pymongo import MongoClient

conn = MongoClient("172.16.16.124", 50001)
db = conn.funodb


def dateDiffInSeconds(date1, date2):
    """Elapsed whole seconds between two datetimes."""
    timedelta = date2 - date1
    return timedelta.days * 24 * 3600 + timedelta.seconds


db.funotable.drop()
date1 = datetime.datetime.now()
for i in range(0, 1000000):
    # insert_one replaces the deprecated insert() (PyMongo 3.0+)
    db.funotable.insert_one({"name": "ljai", "age": i, "addr": "fuzhou"})
# count_documents replaces the deprecated cursor count() (PyMongo 3.7+)
c = db.funotable.count_documents({})
print("count is ", c)
date2 = datetime.datetime.now()
print(date1)
print(date2)
print("elapsed:", dateDiffInSeconds(date1, date2), "seconds")
conn.close()
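Inserting one document per round trip makes a million-row load slow. A common alternative is to send documents in batches through PyMongo's `insert_many`. The sketch below shows only the batching logic; `batched`, `load` and `FakeCollection` are names I made up, and the fake collection exists so the example runs without a cluster. In real use you would pass `db.funotable` instead.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from iterable."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(collection, total, batch_size=1000):
    """Insert `total` test documents via insert_many; return the number of calls."""
    docs = ({"name": "ljai", "age": i, "addr": "fuzhou"} for i in range(total))
    calls = 0
    for batch in batched(docs, batch_size):
        collection.insert_many(batch)
        calls += 1
    return calls


class FakeCollection:
    """Stand-in with the one method the sketch needs; not part of PyMongo."""

    def __init__(self):
        self.docs = []

    def insert_many(self, docs):
        self.docs.extend(docs)


fake = FakeCollection()
print(load(fake, 2500), "calls,", len(fake.docs), "documents")
```

The batch size of 1000 is an arbitrary starting point; the best value depends on document size and network latency.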
Check whether the data was sharded automatically:
mongos> db.lymtable.getShardDistribution()

Shard shard1 at shard1/172.16.16.124:40001,172.16.16.125:40001
 data : 96.46MiB docs : 1216064 chunks : 4
 estimated data per chunk : 24.11MiB
 estimated docs per chunk : 304016

Shard shard2 at shard2/172.16.16.125:40002,172.16.16.126:40002
 data : 44.9MiB docs : 565289 chunks : 4
 estimated data per chunk : 11.22MiB
 estimated docs per chunk : 141322

Shard shard3 at shard3/172.16.16.126:40003,172.16.16.131:40003
 data : 99.39MiB docs : 1259979 chunks : 4
 estimated data per chunk : 24.84MiB
 estimated docs per chunk : 314994

Shard shard4 at shard4/172.16.16.124:40004,172.16.16.131:40004
 data : 76.46MiB docs : 958668 chunks : 4
 estimated data per chunk : 19.11MiB
 estimated docs per chunk : 239667

Totals
 data : 317.22MiB docs : 4000000 chunks : 16
 Shard shard1 contains 30.4% data, 30.4% docs in cluster, avg obj size on shard : 83B
 Shard shard2 contains 14.15% data, 14.13% docs in cluster, avg obj size on shard : 83B
 Shard shard3 contains 31.33% data, 31.49% docs in cluster, avg obj size on shard : 82B
 Shard shard4 contains 24.1% data, 23.96% docs in cluster, avg obj size on shard : 83B
As shown, after inserting 4 million documents every shard holds data, but the distribution is not very even, so the sharding configuration needs further study.
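To put a number on "not very even", the document counts from the output above can be turned into shares and a max/min ratio. A small sketch; the helper names are mine, and the truncation to two decimals is chosen only because it reproduces the percentages printed by the shell.

```python
# Document counts copied from the getShardDistribution() output above.
DOCS = {"shard1": 1216064, "shard2": 565289, "shard3": 1259979, "shard4": 958668}


def shares(docs):
    """Percentage of all documents on each shard, truncated to two decimals."""
    total = sum(docs.values())
    return {name: (n * 10000 // total) / 100 for name, n in docs.items()}


def imbalance(docs):
    """Ratio between the fullest and the emptiest shard."""
    return max(docs.values()) / min(docs.values())


print(shares(DOCS))
print("max/min ratio: %.2f" % imbalance(DOCS))
```

Note that every shard holds exactly 4 chunks. As far as I understand the 3.0 balancer, it evens out chunk counts rather than document counts, so the skew here comes from chunks of different sizes, which points at the shard key and chunk splitting as the things to look into.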