We now have a general idea of what Ceph is, so let's install a Ceph cluster by hand.
Before installing, we should first plan the server environment and decide which role each cluster node will play.
Architecture design
A Ceph distributed storage cluster is built from three main components: Ceph Monitor, Ceph OSD and Ceph MDS. When only object storage and block storage are used, MDS is not required; it is only needed for the CephFS file store. We will not install MDS for now.
A Ceph cluster needs at least one MON node and two OSD nodes, so we need at least three nodes.
We have four servers, all running CentOS.
Their IPs are:
192.168.1.80
192.168.1.81
192.168.1.82
192.168.1.83
| Hostname  | Role               | IP           |
| --------- | ------------------ | ------------ |
| cephAdmin | ceph-deploy+client | 192.168.1.80 |
| ceph1     | mon+osd            | 192.168.1.81 |
| ceph2     | mon+osd            | 192.168.1.82 |
| ceph3     | mon+osd            | 192.168.1.83 |
Ceph Monitors communicate with each other on port 6789 by default, and OSDs talk to each other on ports in the 6800-7300 range.
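If you would rather keep the firewall running than disable it (the Preparation section below simply turns it off), one option on CentOS 7 is to open only these ports with firewalld; a rough sketch, assuming the default public zone:
sudo firewall-cmd --zone=public --add-port=6789/tcp --permanent #Ceph monitors
sudo firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent #Ceph OSDs
sudo firewall-cmd --reload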
Check the current disk layout with:
sudo df -h
sda1 is the system disk.
/dev/mapper/vg_localhost-lv_root contains the system files, so we cannot use that volume directly as an OSD disk.
We need to add a new disk and create a new partition; the new sdb disk will be used for the OSD.
In a production environment you should likewise use a dedicated, previously unused disk or partition for the OSDs.
For how to add the disk and create the partition, see the reference below (be careful on production machines: if you partition the wrong device and damage system files, the system will no longer boot):
VMware虚拟机添加新硬盘以及对磁盘进行分区挂载 (adding a new disk to a VMware VM and partitioning/mounting it)
Once the new partition exists, running sudo df -h again should show /dev/sdb1 alongside the system disk.
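For reference, creating the partition and filesystem could look roughly like the commands below. This is only a sketch: it assumes the new, empty disk shows up as /dev/sdb, and the first two commands destroy whatever is on that disk, so double-check the device name.
sudo parted /dev/sdb mklabel gpt #write a new partition table (wipes the disk)
sudo parted -a optimal /dev/sdb mkpart primary 0% 100% #one partition covering the whole disk
sudo mkfs.ext4 /dev/sdb1 #format it; ceph-disk will re-create the disk later anyway
sudo mkdir -p /ceph
sudo mount /dev/sdb1 /ceph #optional; remember to umount it before "osd prepare" below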
Preparation (all nodes)
Non-root users need to prefix these commands with sudo.
Disable the firewall
CentOS 7:
sudo systemctl stop firewalld.service #stop the firewall
sudo systemctl disable firewalld.service #do not start the firewall at boot
sudo firewall-cmd --state #check the firewall status
CentOS 6:
sudo service iptables stop #stop the firewall
sudo chkconfig iptables off #do not start the firewall at boot
sudo service iptables status #check the firewall status
Switch the yum repositories
The default CentOS yum mirrors are not always located in China, which can make installing and updating with yum quite slow. In that case switch the yum repositories to a domestic mirror; the main public mirror sites in China are NetEase (163) and Aliyun.
More mirrors are listed at:
https://www.centos.org/download/mirrors/
If you get -bash: wget: command not found, install wget first:
sudo yum -y install wget
or
install it from an RPM.
RPM download directory: http://vault.centos.org/6.4/os/x86_64/Packages/
wget RPM package: http://vault.centos.org/6.4/os/x86_64/Packages/wget-1.12-1.8.el6.x86_64.rpm
Upload it to the server with xftp (or any file transfer tool) and install it with:
sudo rpm -i wget-1.12-1.8.el6.x86_64.rpm
NetEase (163)
cd /etc/yum.repos.d
sudo mv CentOS-Base.repo CentOS-Base.repo.bk
sudo wget http://mirrors.163.com/.help/CentOS6-Base-163.repo
sudo yum makecache
Note that many of the 6.4 point-release directories on http://mirrors.163.com no longer carry packages; use the repositories for the 6 branch instead.
Aliyun
cd /etc/yum.repos.d
sudo mv CentOS-Base.repo CentOS-Base.repo.bk
sudo wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-6.repo
sudo yum makecache
Set the time zone and synchronize the time
CentOS 7
sudo cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
sudo yum -y install ntp
sudo systemctl enable ntpd
sudo systemctl start ntpd
sudo ntpstat
CentOS 6
sudo cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
sudo yum -y install ntp
sudo /etc/init.d/ntpd stop
sudo /etc/init.d/ntpd start
sudo ntpstat
Edit /etc/hosts
sudo vim /etc/hosts
and add the following entries:
192.168.1.80 admin
192.168.1.81 ceph1
192.168.1.82 ceph2
192.168.1.83 ceph3
Then set the hostname of the three Ceph machines to ceph1, ceph2 and ceph3 respectively.
To change the hostname of the running system (no reboot required), use the hostname command:
sudo hostname ceph1
The new name takes effect immediately, but it is lost after a reboot; to change the hostname permanently you have to edit the corresponding configuration file.
Changing the hostname permanently
On RedHat-style systems, edit /etc/sysconfig/network and change the HOSTNAME line to HOSTNAME=NEWNAME, where NEWNAME is the hostname you want.
On Debian-based distributions the hostname is configured in /etc/hostname.
Here we use:
sudo vi /etc/sysconfig/network
After the file has been changed, the new hostname is read from it the next time the system boots.
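On ceph1, for example, the file would then look roughly like this (CentOS 6; only the HOSTNAME line matters for this step):
NETWORKING=yes
HOSTNAME=ceph1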
Install the EPEL repository, add the Ceph yum repository, and update the package index
Install the EPEL repository
Use:
sudo yum install epel-release -y
or
sudo rpm -i http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo rpm -i http://rpms.famillecollet.com/enterprise/remi-release-6.rpm
Add the Ceph yum repository
sudo vi /etc/yum.repos.d/ceph.repo
Paste in the following, replacing {ceph-release} with the name of the latest stable Ceph release (e.g. firefly) and {distro} with your distribution (el6 for CentOS 6, el7 for CentOS 7, rhel6 for Red Hat 6.5, rhel7 for Red Hat 7, fc19 for Fedora 19, fc20 for Fedora 20). Save the result as /etc/yum.repos.d/ceph.repo.
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-{ceph-release}/{distro}/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
For example, my machines run CentOS 6, so I have to use el6. Get this right, or the install later fails with lots of version mismatches and "You could try using --skip-broken to work around the problem".
The repository configuration I used to install ceph:
[ceph]
name=Ceph x86_64 packages
baseurl=http://mirrors.163.com/ceph/rpm-hammer/el6/x86_64/
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://mirrors.163.com/ceph/keys/release.asc
The repository configuration I used to install ceph-deploy (note the different section name, so the two sections can live in the same repo file):
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.163.com/ceph/rpm-hammer/el6/noarch/
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://mirrors.163.com/ceph/keys/release.asc
The principle is simple: yum just looks under the baseurl link to see whether the packages it needs are there.
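For example, you can list the directory from the shell to confirm that a package really is there (assuming curl is installed; opening the URL in a browser works just as well):
curl -s http://mirrors.163.com/ceph/rpm-hammer/el6/noarch/ | grep ceph-deploy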
Fix the EPEL repository
Because many packages are now missing from the original EPEL repository locations, change the URL:
sudo vi /etc/yum.repos.d/epel.repo
and set baseurl to:
baseurl=https://mirrors.tuna.tsinghua.edu.cn/epel/6//$basearch
and comment out the mirrorlist line.
Install ceph-deploy and ceph
ceph has to be installed on every Ceph node; ceph-deploy is only needed on the admin node. The hammer release is already pinned by the baseurl in ceph.repo, so a plain yum install is enough (yum itself has no --release option):
sudo yum -y update
sudo yum -y install ceph
and
sudo yum -y update
sudo yum -y install ceph-deploy
If a node needs both:
sudo yum -y update
sudo yum -y install ceph ceph-deploy
If you hit the problem:
No package ceph-deploy available
check whether the baseurl in ceph.repo actually contains ceph-deploy, and fix the baseurl if it does not.
sudo vi /etc/yum.repos.d/ceph.repo
yum clean all
sudo yum -y install ceph-deploy
Or find a baseurl that does contain ceph-deploy and use it instead, for example:
sudo rm /etc/yum.repos.d/ceph.repo
sudo rpm -Uvh http://mirrors.163.com/ceph/rpm-hammer/el6/noarch/ceph-release-1-0.el6.noarch.rpm
yum clean all
yum makecache
yum install ceph-deploy -y
If you hit the problem
Not using downloaded repomd.xml because it is older than what we have
clear out everything under the /var/cache/yum/ directory
with:
sudo rm -rf /var/cache/yum/
If you hit the problem:
https://mirrors.tuna.tsinghua.edu.cn/epel/6//x86_64/repodata/repomd.xml: [Errno 14] problem making ssl connection
use
sudo vi /etc/yum.repos.d/epel.repo
and temporarily change enabled=1 to enabled=0,
then run
sudo yum install ca-certificates
Once that has installed successfully, change enabled back to 1; after that
sudo yum install -y XXXX works normally again.
Allow passwordless SSH login (admin node)
Generate an SSH key pair; when prompted "Enter passphrase", just press Enter so the passphrase is empty:
sudo ssh-keygen
Copy the public key to every node:
sudo ssh-copy-id root@ceph1
sudo ssh-copy-id root@ceph2
sudo ssh-copy-id root@ceph3
You may run into "ssh-copy-id permission denied".
This usually means root is not allowed to log in over SSH.
A quick way to test is to connect as root from xshell (or any SSH client) and see whether it works.
If the server rejects the password, the SSH configuration for root is the problem.
To fix it, two things need to change: the PermitRootLogin option, and an extra root entry in AllowUsers.
Use:
sudo vi /etc/ssh/sshd_config
Find the PermitRootLogin option and set it to yes so that root may log in remotely.
Then append the following to the end of the file, with the (IP) address matching your own machine:
AllowUsers [email protected] root
Restart the SSH service (if restarting sshd does not help, try rebooting the server):
sudo service sshd restart
Verify that you can now SSH in without a password:
sudo ssh ceph1
sudo ssh ceph2
sudo ssh ceph3
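Optionally, you can also add entries like the following to ~/.ssh/config on the admin node, so that ceph-deploy and plain ssh always log in with the right user without it being typed every time (a sketch; change User if you deploy with a non-root account):
Host ceph1
    Hostname ceph1
    User root
Host ceph2
    Hostname ceph2
    User root
Host ceph3
    Hostname ceph3
    User root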
Create the monitors (admin node)
Create monitors on ceph1, ceph2 and ceph3:
sudo mkdir myceph
cd myceph
sudo ceph-deploy new ceph1 ceph2 ceph3
A successful run produces output like this:
[zzq@localhost myceph]$ sudo ceph-deploy new ceph1 ceph2 ceph3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy new ceph1 ceph2 ceph3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] func : <function new at 0xa66e60>
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fb7f161e638>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] mon : ['ceph1', 'ceph2', 'ceph3']
[ceph_deploy.cli][INFO ] public_network : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster_network : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] fsid : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph1][DEBUG ] connected to host: localhost.mongodb0
[ceph1][INFO ] Running command: ssh -CT -o BatchMode=yes ceph1
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /sbin/ip link show
[ceph1][INFO ] Running command: /sbin/ip addr show
[ceph1][DEBUG ] IP addresses found: [u'192.168.199.81']
[ceph_deploy.new][DEBUG ] Resolving host ceph1
[ceph_deploy.new][DEBUG ] Monitor ceph1 at 192.168.199.81
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph2][DEBUG ] connected to host: localhost.mongodb0
[ceph2][INFO ] Running command: ssh -CT -o BatchMode=yes ceph2
[ceph2][DEBUG ] connected to host: ceph2
[ceph2][DEBUG ] detect platform information from remote host
[ceph2][DEBUG ] detect machine type
[ceph2][DEBUG ] find the location of an executable
[ceph2][INFO ] Running command: /sbin/ip link show
[ceph2][INFO ] Running command: /sbin/ip addr show
[ceph2][DEBUG ] IP addresses found: [u'192.168.199.82']
[ceph_deploy.new][DEBUG ] Resolving host ceph2
[ceph_deploy.new][DEBUG ] Monitor ceph2 at 192.168.199.82
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph3][DEBUG ] connected to host: localhost.mongodb0
[ceph3][INFO ] Running command: ssh -CT -o BatchMode=yes ceph3
[ceph3][DEBUG ] connected to host: ceph3
[ceph3][DEBUG ] detect platform information from remote host
[ceph3][DEBUG ] detect machine type
[ceph3][DEBUG ] find the location of an executable
[ceph3][INFO ] Running command: /sbin/ip link show
[ceph3][INFO ] Running command: /sbin/ip addr show
[ceph3][DEBUG ] IP addresses found: [u'192.168.199.83']
[ceph_deploy.new][DEBUG ] Resolving host ceph3
[ceph_deploy.new][DEBUG ] Monitor ceph3 at 192.168.199.83
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph1', 'ceph2', 'ceph3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.199.81', '192.168.199.82', '192.168.199.83']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
Error in sys.exitfunc:
[zzq@localhost myceph]$ ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
If you see the error:
Error in sys.exitfunc
Solution
A workaround found online is to set an environment variable on the admin node and then re-run the deploy command, although it did not seem to make any difference when I tried it:
export CEPH_DEPLOY_TEST=YES
sudo ceph-deploy new ceph1 ceph2 ceph3
If ls in the current directory shows the three configuration files below, you can ignore this error and continue with the installation.
[zzq@localhost myceph]$ ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
[zzq@localhost myceph]$
Set the OSD replica count
Add osd pool default size = 2 to the end of the ceph.conf that ceph-deploy new just generated (run this inside the myceph working directory):
sudo vi ceph.conf
and add the line:
osd pool default size = 2
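After the edit, the [global] section of the generated ceph.conf will look roughly like this (the fsid and addresses below are simply the values from the run above; yours will differ):
[global]
fsid = 93cfef7e-98a5-440f-af59-1a950e283b74
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.199.81,192.168.199.82,192.168.199.83
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2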
Configure the initial monitor(s) and gather all the keys
sudo ceph-deploy mon create-initial
A successful run produces output like this:
[zzq@localhost myceph]$ sudo ceph-deploy mon create-initial
[sudo] password for zzq:
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create-initial
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f2a6bd232d8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mon at 0x7f2a6bd15578>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] keyrings : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1 ceph2 ceph3
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS 6.9 Final
[ceph1][DEBUG ] determining if provided host has same hostname in remote
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] deploying mon to ceph1
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] remote hostname: ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph1][DEBUG ] create the mon path if it does not exist
[ceph1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph1][DEBUG ] create the init path if it does not exist
[ceph1][DEBUG ] locating the `service` executable...
[ceph1][INFO ] Running command: /sbin/service ceph -c /etc/ceph/ceph.conf start mon.ceph1
[ceph1][DEBUG ] === mon.ceph1 ===
[ceph1][DEBUG ] Starting Ceph mon.ceph1 on ceph1...already running
[ceph1][INFO ] Running command: chkconfig ceph on
[ceph1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][DEBUG ] ********************************************************************************
[ceph1][DEBUG ] status for monitor: mon.ceph1
[ceph1][DEBUG ] {
[ceph1][DEBUG ] "election_epoch": 0,
[ceph1][DEBUG ] "extra_probe_peers": [
[ceph1][DEBUG ] "192.168.199.82:6789/0",
[ceph1][DEBUG ] "192.168.199.83:6789/0"
[ceph1][DEBUG ] ],
[ceph1][DEBUG ] "monmap": {
[ceph1][DEBUG ] "created": "0.000000",
[ceph1][DEBUG ] "epoch": 0,
[ceph1][DEBUG ] "fsid": "93cfef7e-98a5-440f-af59-1a950e283b74",
[ceph1][DEBUG ] "modified": "0.000000",
[ceph1][DEBUG ] "mons": [
[ceph1][DEBUG ] {
[ceph1][DEBUG ] "addr": "192.168.199.81:6789/0",
[ceph1][DEBUG ] "name": "ceph1",
[ceph1][DEBUG ] "rank": 0
[ceph1][DEBUG ] },
[ceph1][DEBUG ] {
[ceph1][DEBUG ] "addr": "0.0.0.0:0/1",
[ceph1][DEBUG ] "name": "ceph2",
[ceph1][DEBUG ] "rank": 1
[ceph1][DEBUG ] },
[ceph1][DEBUG ] {
[ceph1][DEBUG ] "addr": "0.0.0.0:0/2",
[ceph1][DEBUG ] "name": "ceph3",
[ceph1][DEBUG ] "rank": 2
[ceph1][DEBUG ] }
[ceph1][DEBUG ] ]
[ceph1][DEBUG ] },
[ceph1][DEBUG ] "name": "ceph1",
[ceph1][DEBUG ] "outside_quorum": [
[ceph1][DEBUG ] "ceph1"
[ceph1][DEBUG ] ],
[ceph1][DEBUG ] "quorum": [],
[ceph1][DEBUG ] "rank": 0,
[ceph1][DEBUG ] "state": "probing",
[ceph1][DEBUG ] "sync_provider": []
[ceph1][DEBUG ] }
[ceph1][DEBUG ] ********************************************************************************
[ceph1][INFO ] monitor: mon.ceph1 is running
[ceph1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph2 ...
[ceph2][DEBUG ] connected to host: ceph2
[ceph2][DEBUG ] detect platform information from remote host
[ceph2][DEBUG ] detect machine type
[ceph2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS 6.9 Final
[ceph2][DEBUG ] determining if provided host has same hostname in remote
[ceph2][DEBUG ] get remote short hostname
[ceph2][DEBUG ] deploying mon to ceph2
[ceph2][DEBUG ] get remote short hostname
[ceph2][DEBUG ] remote hostname: ceph2
[ceph2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph2][DEBUG ] create the mon path if it does not exist
[ceph2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph2/done
[ceph2][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph2][DEBUG ] create the init path if it does not exist
[ceph2][DEBUG ] locating the `service` executable...
[ceph2][INFO ] Running command: /sbin/service ceph -c /etc/ceph/ceph.conf start mon.ceph2
[ceph2][DEBUG ] === mon.ceph2 ===
[ceph2][DEBUG ] Starting Ceph mon.ceph2 on ceph2...
[ceph2][DEBUG ] Starting ceph-create-keys on ceph2...
[ceph2][INFO ] Running command: chkconfig ceph on
[ceph2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][DEBUG ] ********************************************************************************
[ceph2][DEBUG ] status for monitor: mon.ceph2
[ceph2][DEBUG ] {
[ceph2][DEBUG ] "election_epoch": 1,
[ceph2][DEBUG ] "extra_probe_peers": [
[ceph2][DEBUG ] "192.168.199.81:6789/0",
[ceph2][DEBUG ] "192.168.199.83:6789/0"
[ceph2][DEBUG ] ],
[ceph2][DEBUG ] "monmap": {
[ceph2][DEBUG ] "created": "0.000000",
[ceph2][DEBUG ] "epoch": 0,
[ceph2][DEBUG ] "fsid": "93cfef7e-98a5-440f-af59-1a950e283b74",
[ceph2][DEBUG ] "modified": "0.000000",
[ceph2][DEBUG ] "mons": [
[ceph2][DEBUG ] {
[ceph2][DEBUG ] "addr": "192.168.199.81:6789/0",
[ceph2][DEBUG ] "name": "ceph1",
[ceph2][DEBUG ] "rank": 0
[ceph2][DEBUG ] },
[ceph2][DEBUG ] {
[ceph2][DEBUG ] "addr": "192.168.199.82:6789/0",
[ceph2][DEBUG ] "name": "ceph2",
[ceph2][DEBUG ] "rank": 1
[ceph2][DEBUG ] },
[ceph2][DEBUG ] {
[ceph2][DEBUG ] "addr": "0.0.0.0:0/2",
[ceph2][DEBUG ] "name": "ceph3",
[ceph2][DEBUG ] "rank": 2
[ceph2][DEBUG ] }
[ceph2][DEBUG ] ]
[ceph2][DEBUG ] },
[ceph2][DEBUG ] "name": "ceph2",
[ceph2][DEBUG ] "outside_quorum": [],
[ceph2][DEBUG ] "quorum": [],
[ceph2][DEBUG ] "rank": 1,
[ceph2][DEBUG ] "state": "electing",
[ceph2][DEBUG ] "sync_provider": []
[ceph2][DEBUG ] }
[ceph2][DEBUG ] ********************************************************************************
[ceph2][INFO ] monitor: mon.ceph2 is running
[ceph2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph3 ...
[ceph3][DEBUG ] connected to host: ceph3
[ceph3][DEBUG ] detect platform information from remote host
[ceph3][DEBUG ] detect machine type
[ceph3][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS 6.9 Final
[ceph3][DEBUG ] determining if provided host has same hostname in remote
[ceph3][DEBUG ] get remote short hostname
[ceph3][DEBUG ] deploying mon to ceph3
[ceph3][DEBUG ] get remote short hostname
[ceph3][DEBUG ] remote hostname: ceph3
[ceph3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph3][DEBUG ] create the mon path if it does not exist
[ceph3][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph3/done
[ceph3][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph3][DEBUG ] create the init path if it does not exist
[ceph3][DEBUG ] locating the `service` executable...
[ceph3][INFO ] Running command: /sbin/service ceph -c /etc/ceph/ceph.conf start mon.ceph3
[ceph3][DEBUG ] === mon.ceph3 ===
[ceph3][DEBUG ] Starting Ceph mon.ceph3 on ceph3...
[ceph3][DEBUG ] Starting ceph-create-keys on ceph3...
[ceph3][INFO ] Running command: chkconfig ceph on
[ceph3][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph3][DEBUG ] ********************************************************************************
[ceph3][DEBUG ] status for monitor: mon.ceph3
[ceph3][DEBUG ] {
[ceph3][DEBUG ] "election_epoch": 1,
[ceph3][DEBUG ] "extra_probe_peers": [
[ceph3][DEBUG ] "192.168.199.81:6789/0",
[ceph3][DEBUG ] "192.168.199.82:6789/0"
[ceph3][DEBUG ] ],
[ceph3][DEBUG ] "monmap": {
[ceph3][DEBUG ] "created": "0.000000",
[ceph3][DEBUG ] "epoch": 0,
[ceph3][DEBUG ] "fsid": "93cfef7e-98a5-440f-af59-1a950e283b74",
[ceph3][DEBUG ] "modified": "0.000000",
[ceph3][DEBUG ] "mons": [
[ceph3][DEBUG ] {
[ceph3][DEBUG ] "addr": "192.168.199.81:6789/0",
[ceph3][DEBUG ] "name": "ceph1",
[ceph3][DEBUG ] "rank": 0
[ceph3][DEBUG ] },
[ceph3][DEBUG ] {
[ceph3][DEBUG ] "addr": "192.168.199.82:6789/0",
[ceph3][DEBUG ] "name": "ceph2",
[ceph3][DEBUG ] "rank": 1
[ceph3][DEBUG ] },
[ceph3][DEBUG ] {
[ceph3][DEBUG ] "addr": "192.168.199.83:6789/0",
[ceph3][DEBUG ] "name": "ceph3",
[ceph3][DEBUG ] "rank": 2
[ceph3][DEBUG ] }
[ceph3][DEBUG ] ]
[ceph3][DEBUG ] },
[ceph3][DEBUG ] "name": "ceph3",
[ceph3][DEBUG ] "outside_quorum": [],
[ceph3][DEBUG ] "quorum": [],
[ceph3][DEBUG ] "rank": 2,
[ceph3][DEBUG ] "state": "electing",
[ceph3][DEBUG ] "sync_provider": []
[ceph3][DEBUG ] }
[ceph3][DEBUG ] ********************************************************************************
[ceph3][INFO ] monitor: mon.ceph3 is running
[ceph3][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph_deploy.mon][INFO ] processing monitor mon.ceph1
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph1][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph1 monitor has reached quorum!
[ceph_deploy.mon][INFO ] processing monitor mon.ceph2
[ceph2][DEBUG ] connected to host: ceph2
[ceph2][DEBUG ] detect platform information from remote host
[ceph2][DEBUG ] detect machine type
[ceph2][DEBUG ] find the location of an executable
[ceph2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph2 monitor has reached quorum!
[ceph_deploy.mon][INFO ] processing monitor mon.ceph3
[ceph3][DEBUG ] connected to host: ceph3
[ceph3][DEBUG ] detect platform information from remote host
[ceph3][DEBUG ] detect machine type
[ceph3][DEBUG ] find the location of an executable
[ceph3][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph_deploy.mon][WARNIN] mon.ceph3 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph3][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph_deploy.mon][WARNIN] mon.ceph3 monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[ceph3][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph_deploy.mon][INFO ] mon.ceph3 monitor has reached quorum!
[ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpaZC0NP
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] fetch remote file
[ceph1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph1/keyring auth get client.admin
[ceph1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph1/keyring auth get client.bootstrap-mds
[ceph1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph1/keyring auth get client.bootstrap-osd
[ceph1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph1/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpaZC0NP
Error in sys.exitfunc:
[zzq@localhost myceph]$
If you get the error
[ceph_deploy.mon][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
then run:
sudo ceph-deploy --overwrite-conf mon create-initial
You may also run into
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
This error has many possible causes, so read the surrounding output carefully. One of them is reinstalling a different version on top of an old one, e.g. installing ceph-10.2.5 in an environment that previously had ceph-0.94.6, where only the packages were removed and the old ceph configuration files were left behind.
In that case clean up and redo it:
rm -rf /etc/ceph/*
rm -rf /var/lib/ceph/*/*
rm -rf /var/log/ceph/*
rm -rf /var/run/ceph/*
#then re-run
sudo ceph-deploy new ceph1 ceph2 ceph3
sudo ceph-deploy mon create-initial
The error I actually hit was:
[ceph3][DEBUG ] determining if provided host has same hostname in remote
[ceph3][DEBUG ] get remote short hostname
[ceph3][WARNIN] ********************************************************************************
[ceph3][WARNIN] provided hostname must match remote hostname
[ceph3][WARNIN] provided hostname: ceph3
[ceph3][WARNIN] remote hostname: localhost
[ceph3][WARNIN] monitors may not reach quorum and create-keys will not complete
This means the hostname does not match the name given to ceph-deploy.
Fix it the same way as described in the Preparation section: set the running hostname with sudo hostname ceph3, make it permanent in /etc/sysconfig/network (HOSTNAME=ceph3), and do the same on the other nodes with their own names, then reboot or re-run the deployment.
Another problem you may run into:
[ceph3][WARNIN] 2018-05-14 23:10:43.131747 7f7a3135e7a0 -1 accepter.accepter.bind unable to bind to 192.168.199.83:6789: (98) Address already in use
Solution
This happens when the port is still bound by a process that has not released it.
Use netstat -lp, or ps aux|grep ceph, to find the programs that are listening and their process information,
then inspect the process with ps <pid> and stop it with kill <pid>.
The commands and their output are shown below:
netstat -lp
ps aux|grep ceph
ps pid
kill pid
The output looks like this:
[zzq@ceph2 ~]$ sudo netstat -lp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 ceph2:smc-https *:* LISTEN 1256/ceph-mon
tcp 0 0 localhost:27017 *:* LISTEN 1513/mongod
tcp 0 0 *:ssh *:* LISTEN 1154/sshd
tcp 0 0 localhost:smtp *:* LISTEN 1487/master
tcp 0 0 *:ssh *:* LISTEN 1154/sshd
tcp 0 0 localhost:smtp *:* LISTEN 1487/master
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 12680 1513/mongod /tmp/mongodb-27017.sock
unix 2 [ ACC ] STREAM LISTENING 12028 1256/ceph-mon /var/run/ceph/ceph-mon.localhost.asok
unix 2 [ ACC ] STREAM LISTENING 12524 1487/master public/cleanup
unix 2 [ ACC ] STREAM LISTENING 12531 1487/master private/tlsmgr
unix 2 [ ACC ] STREAM LISTENING 12535 1487/master private/rewrite
unix 2 [ ACC ] STREAM LISTENING 12539 1487/master private/bounce
unix 2 [ ACC ] STREAM LISTENING 12543 1487/master private/defer
unix 2 [ ACC ] STREAM LISTENING 12547 1487/master private/trace
unix 2 [ ACC ] STREAM LISTENING 12551 1487/master private/verify
unix 2 [ ACC ] STREAM LISTENING 12555 1487/master public/flush
unix 2 [ ACC ] STREAM LISTENING 12559 1487/master private/proxymap
unix 2 [ ACC ] STREAM LISTENING 12563 1487/master private/proxywrite
unix 2 [ ACC ] STREAM LISTENING 12567 1487/master private/smtp
unix 2 [ ACC ] STREAM LISTENING 12571 1487/master private/relay
unix 2 [ ACC ] STREAM LISTENING 12575 1487/master public/showq
unix 2 [ ACC ] STREAM LISTENING 12579 1487/master private/error
unix 2 [ ACC ] STREAM LISTENING 12583 1487/master private/retry
unix 2 [ ACC ] STREAM LISTENING 12587 1487/master private/discard
unix 2 [ ACC ] STREAM LISTENING 12591 1487/master private/local
unix 2 [ ACC ] STREAM LISTENING 12595 1487/master private/virtual
unix 2 [ ACC ] STREAM LISTENING 12599 1487/master private/lmtp
unix 2 [ ACC ] STREAM LISTENING 12603 1487/master private/anvil
unix 2 [ ACC ] STREAM LISTENING 12607 1487/master private/scache
unix 2 [ ACC ] STREAM LISTENING 9512 1/init @/com/ubuntu/upstart
[zzq@ceph2 ~]$ ps aux|grep ceph
root 1256 0.3 2.5 235060 25536 ? Sl 23:04 0:08 /usr/bin/ceph-mon -i localhost --pid-file /var/run/ceph/mon.localhost.pid -c /etc/ceph/ceph.conf --cluster ceph
zzq 4871 0.0 0.0 103320 880 pts/0 S+ 23:41 0:00 grep ceph
[zzq@ceph2 ~]$ ps 1256
PID TTY STAT TIME COMMAND
1256 ? Sl 0:08 /usr/bin/ceph-mon -i localhost --pid-file /var/run/ceph/mon.localhost.pid -c /etc/ceph/ceph.conf --cluster ceph
[zzq@ceph2 ~]$ kill 1256
-bash: kill: (1256) - Operation not permitted
[zzq@ceph2 ~]$ sudo kill 1256
[zzq@ceph2 ~]$
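If you only care about the monitor port, filtering on 6789 gets to the same PID faster (numeric output, assuming net-tools is installed):
sudo netstat -lpn | grep 6789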
If one node keeps reporting the error
monitor is not yet in quorum
Solution
First, make sure the firewall on that node is stopped.
Use:
sudo service iptables stop #stop the firewall
sudo chkconfig iptables off #do not start the firewall at boot
sudo service iptables status #check the firewall status
Second, check that the hostname and the /etc/hosts entries match.
Use:
sudo hostname ceph1
sudo vi /etc/hosts
Reboot that node:
sudo reboot
Then re-run on the admin node:
sudo ceph-deploy --overwrite-conf mon create-initial
If it still fails, or you know that the node's configuration has been changed before, the usual cause is leftover configuration from an earlier deployment, which prevents the new deployment from generating the authentication keys. Go through every node being deployed, clear out the directories under /etc/ceph and /var/lib/ceph, and deploy again; that normally solves it.
Run the following on the problematic node:
sudo rm -rf /etc/ceph/*
sudo rm -rf /var/lib/ceph/*/*
sudo rm -rf /var/log/ceph/*
sudo rm -rf /var/run/ceph/*
#then re-run on the admin node
sudo ceph-deploy new ceph1 ceph2 ceph3
sudo ceph-deploy mon create-initial
Create the OSDs (admin node)
Ceph disk command reference:
http://docs.ceph.org.cn/man/8/ceph-disk/
List the disks
sudo ceph-deploy disk list ceph1
sudo ceph-deploy disk list ceph2
sudo ceph-deploy disk list ceph3
The output looks like this:
[zzq@localhost ~]$ sudo ceph-deploy disk list ceph1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy disk list ceph1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : list
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d53fc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function disk at 0x1d44de8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : [('ceph1', None, None)]
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS 6.9 Final
[ceph_deploy.osd][DEBUG ] Listing disks on ceph1...
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/sbin/ceph-disk list
[ceph1][DEBUG ] /dev/sda :
[ceph1][DEBUG ] /dev/sda1 other, ext4, mounted on /boot
[ceph1][DEBUG ] /dev/sda2 other, LVM2_member
[ceph1][DEBUG ] /dev/sdb :
[ceph1][DEBUG ] /dev/sdb1 other, ext4, mounted on /ceph
[ceph1][DEBUG ] /dev/sr0 other, unknown
Error in sys.exitfunc:
Zap (wipe) the disks
sudo ceph-deploy disk zap ceph1:/dev/sdb
sudo ceph-deploy disk zap ceph2:/dev/sdb
sudo ceph-deploy disk zap ceph3:/dev/sdb
A successful run produces output like this:
[zzq@localhost ~]$ sudo ceph-deploy disk zap ceph1:/dev/sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy disk zap ceph1:/dev/sdb
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : zap
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x276efc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function disk at 0x275fde8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : [('ceph1', '/dev/sdb', None)]
[ceph_deploy.osd][DEBUG ] zapping /dev/sdb on ceph1
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS 6.9 Final
[ceph1][DEBUG ] zeroing last few blocks of device
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/sbin/ceph-disk zap /dev/sdb
[ceph1][DEBUG ]
[ceph1][DEBUG ] ***************************************************************
[ceph1][DEBUG ] Found invalid GPT and valid MBR; converting MBR to GPT format
[ceph1][DEBUG ] in memory.
[ceph1][DEBUG ] ***************************************************************
[ceph1][DEBUG ]
[ceph1][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph1][DEBUG ] The new table will be used at the next reboot.
[ceph1][DEBUG ] GPT data structures destroyed! You may now partition the disk using fdisk or
[ceph1][DEBUG ] other utilities.
[ceph1][DEBUG ] Creating new GPT entries.
[ceph1][DEBUG ] Warning: The kernel is still using the old partition table.
[ceph1][DEBUG ] The new table will be used at the next reboot.
[ceph1][DEBUG ] The operation has completed successfully.
[ceph1][WARNIN] error deleting partition 1: BLKPG: Device or resource busy
[ceph1][WARNIN] error deleting partitions 2-256: BLKPG: No such device or address
Error in sys.exitfunc:
Unhandled exception in thread started by
[zzq@localhost ~]$
Prepare the OSDs
Note that it is best to give prepare a whole, dedicated disk; using only a partition is more fiddly, and the device must not be mounted.
So first unmount it on every node:
sudo umount /dev/sdb1
Then run on the admin node:
sudo ceph-deploy osd prepare ceph1:/dev/sdb
sudo ceph-deploy osd prepare ceph2:/dev/sdb
sudo ceph-deploy osd prepare ceph3:/dev/sdb
A successful run produces output like this:
[zzq@localhost ~]$ sudo ceph-deploy osd prepare ceph1:/dev/sdb
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy osd prepare ceph1:/dev/sdb
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] disk : [('ceph1', '/dev/sdb', None)]
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : prepare
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0xe145a8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] func : <function osd at 0xe02d70>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph1:/dev/sdb:
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS 6.9 Final
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host ceph1 disk /dev/sdb journal None activate False
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_type
[ceph1][WARNIN] INFO:ceph-disk:Will colocate journal with data on /dev/sdb
[ceph1][WARNIN] DEBUG:ceph-disk:Creating journal partition num 2 size 5120 on /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/sbin/sgdisk --new=2:0:5120M --change-name=2:ceph journal --partition-guid=2:2f50f047-1d68-4b0f-afb7-0eb8ae9e9d22 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sdb
[ceph1][DEBUG ] Setting name!
[ceph1][DEBUG ] partNum is 1
[ceph1][DEBUG ] REALLY setting name!
[ceph1][DEBUG ] The operation has completed successfully.
[ceph1][WARNIN] INFO:ceph-disk:calling partx on prepared device /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:re-reading known partitions will display errors
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/partx -a /dev/sdb
[ceph1][WARNIN] BLKPG: Device or resource busy
[ceph1][WARNIN] error adding partition 2
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/udevadm settle
[ceph1][WARNIN] DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/2f50f047-1d68-4b0f-afb7-0eb8ae9e9d22
[ceph1][WARNIN] DEBUG:ceph-disk:Journal is GPT partition /dev/disk/by-partuuid/2f50f047-1d68-4b0f-afb7-0eb8ae9e9d22
[ceph1][WARNIN] DEBUG:ceph-disk:Creating osd partition on /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:1f27c6e7-7f8e-4250-a48a-4e85915a575c --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/sdb
[ceph1][DEBUG ] Setting name!
[ceph1][DEBUG ] partNum is 0
[ceph1][DEBUG ] REALLY setting name!
[ceph1][DEBUG ] The operation has completed successfully.
[ceph1][WARNIN] INFO:ceph-disk:calling partx on created device /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:re-reading known partitions will display errors
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/partx -a /dev/sdb
[ceph1][WARNIN] BLKPG: Device or resource busy
[ceph1][WARNIN] error adding partition 1
[ceph1][WARNIN] BLKPG: Device or resource busy
[ceph1][WARNIN] error adding partition 2
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/udevadm settle
[ceph1][WARNIN] DEBUG:ceph-disk:Creating xfs fs on /dev/sdb1
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdb1
[ceph1][DEBUG ] meta-data=/dev/sdb1 isize=2048 agcount=4, agsize=327615 blks
[ceph1][DEBUG ] = sectsz=512 attr=2, projid32bit=0
[ceph1][DEBUG ] data = bsize=4096 blocks=1310459, imaxpct=25
[ceph1][DEBUG ] = sunit=0 swidth=0 blks
[ceph1][DEBUG ] naming =version 2 bsize=4096 ascii-ci=0
[ceph1][DEBUG ] log =internal log bsize=4096 blocks=2560, version=2
[ceph1][DEBUG ] = sectsz=512 sunit=0 blks, lazy-count=1
[ceph1][DEBUG ] realtime =none extsz=4096 blocks=0, rtextents=0
[ceph1][WARNIN] DEBUG:ceph-disk:Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.QLTqXz with options noatime,inode64
[ceph1][WARNIN] INFO:ceph-disk:Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.QLTqXz
[ceph1][WARNIN] DEBUG:ceph-disk:Preparing osd data dir /var/lib/ceph/tmp/mnt.QLTqXz
[ceph1][WARNIN] DEBUG:ceph-disk:Creating symlink /var/lib/ceph/tmp/mnt.QLTqXz/journal -> /dev/disk/by-partuuid/2f50f047-1d68-4b0f-afb7-0eb8ae9e9d22
[ceph1][WARNIN] DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.QLTqXz
[ceph1][WARNIN] INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.QLTqXz
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdb
[ceph1][DEBUG ] The operation has completed successfully.
[ceph1][WARNIN] INFO:ceph-disk:calling partx on prepared device /dev/sdb
[ceph1][WARNIN] INFO:ceph-disk:re-reading known partitions will display errors
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/partx -a /dev/sdb
[ceph1][WARNIN] BLKPG: Device or resource busy
[ceph1][WARNIN] error adding partition 1
[ceph1][WARNIN] BLKPG: Device or resource busy
[ceph1][WARNIN] error adding partition 2
[ceph1][INFO ] checking OSD status...
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host ceph1 is now ready for osd use.
Error in sys.exitfunc:
[zzq@localhost ~]$
Activate the OSDs
Activation uses the partition (sdb1), not the whole disk; otherwise you get:
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/sdb
[ceph1][WARNIN] ceph-disk: Cannot discover filesystem type: device /dev/sdb: Line is truncated:
[ceph1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init sysvinit --mount /dev/sdb
The activation commands are:
sudo ceph-deploy osd activate ceph1:/dev/sdb1
sudo ceph-deploy osd activate ceph2:/dev/sdb1
sudo ceph-deploy osd activate ceph3:/dev/sdb1
A successful activation produces output like this:
[zzq@localhost ~]$ sudo ceph-deploy osd activate ceph1:/dev/sdb1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy osd activate ceph1:/dev/sdb1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : activate
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x175a5a8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function osd at 0x1748d70>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] disk : [('ceph1', '/dev/sdb1', None)]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks ceph1:/dev/sdb1:
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS 6.9 Final
[ceph_deploy.osd][DEBUG ] activating host ceph1 disk /dev/sdb1
[ceph_deploy.osd][DEBUG ] will use init type: sysvinit
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/sbin/ceph-disk -v activate --mark-init sysvinit --mount /dev/sdb1
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/sdb1
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[ceph1][WARNIN] DEBUG:ceph-disk:Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.J1G4hm with options noatime,inode64
[ceph1][WARNIN] INFO:ceph-disk:Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.J1G4hm
[ceph1][WARNIN] DEBUG:ceph-disk:Cluster uuid is 60cfd9c4-e6c1-4d0d-a2f4-5a0213326909
[ceph1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[ceph1][WARNIN] DEBUG:ceph-disk:Cluster name is ceph
[ceph1][WARNIN] DEBUG:ceph-disk:OSD uuid is 1f27c6e7-7f8e-4250-a48a-4e85915a575c
[ceph1][WARNIN] DEBUG:ceph-disk:OSD id is 0
[ceph1][WARNIN] DEBUG:ceph-disk:Marking with init system sysvinit
[ceph1][WARNIN] DEBUG:ceph-disk:ceph osd.0 data dir is ready at /var/lib/ceph/tmp/mnt.J1G4hm
[ceph1][WARNIN] INFO:ceph-disk:ceph osd.0 already mounted in position; unmounting ours.
[ceph1][WARNIN] DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.J1G4hm
[ceph1][WARNIN] INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.J1G4hm
[ceph1][WARNIN] DEBUG:ceph-disk:Starting ceph osd.0...
[ceph1][WARNIN] INFO:ceph-disk:Running command: /sbin/service ceph --cluster ceph start osd.0
[ceph1][DEBUG ] === osd.0 ===
[ceph1][DEBUG ] Starting Ceph osd.0 on ceph1...already running
[ceph1][INFO ] checking OSD status...
[ceph1][DEBUG ] find the location of an executable
[ceph1][INFO ] Running command: /usr/bin/ceph --cluster=ceph osd stat --format=json
[ceph1][INFO ] Running command: chkconfig ceph on
Error in sys.exitfunc:
[zzq@localhost ~]$
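Once all three OSDs have been activated, a quick sanity check from any node that has the admin keyring (see the sections below on copying it) is to look at the OSD tree and the cluster status; all three OSDs should show up as up and in:
sudo ceph osd tree
sudo ceph -s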
Removing an OSD
(osd.3 here is just an example id; check sudo ceph osd tree to see which id you actually want to remove.)
sudo ceph osd out osd.3
sudo ssh ceph1 service ceph stop osd.3
sudo ceph osd crush remove osd.3
sudo ceph auth del osd.3 #remove its authentication key
sudo ceph osd rm 3 #remove the OSD from the cluster
Note: an OSD can also be backed by a plain directory instead of a whole disk:
sudo ssh ceph1
sudo mkdir /var/local/osd
exit
sudo ssh ceph2
sudo mkdir /var/local/osd
exit
sudo ssh ceph3
sudo mkdir /var/local/osd
exit
sudo ceph-deploy osd prepare ceph1:/var/local/osd ceph2:/var/local/osd ceph3:/var/local/osd
sudo ceph-deploy osd activate ceph1:/var/local/osd ceph2:/var/local/osd ceph3:/var/local/osd
Copy the configuration file and admin key to every node
so that you do not have to specify the monitor address and ceph.client.admin.keyring every time you run a Ceph command:
sudo ceph-deploy admin ceph1 ceph2 ceph3
A successful run produces output like this:
[zzq@localhost ~]$ sudo ceph-deploy admin ceph1 ceph2 ceph3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy admin ceph1 ceph2 ceph3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1f49f38>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['ceph1', 'ceph2', 'ceph3']
[ceph_deploy.cli][INFO ] func : <function admin at 0x7f50770a49b0>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph1
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph2
[ceph2][DEBUG ] connected to host: ceph2
[ceph2][DEBUG ] detect platform information from remote host
[ceph2][DEBUG ] detect machine type
[ceph2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph3
[ceph3][DEBUG ] connected to host: ceph3
[ceph3][DEBUG ] detect platform information from remote host
[ceph3][DEBUG ] detect machine type
[ceph3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
Error in sys.exitfunc:
[zzq@localhost ~]$
Check the cluster health
On any node that has ceph installed, run:
sudo ceph health
Example output:
[zzq@ceph1 ~]$ sudo ceph health
[sudo] password for zzq:
HEALTH_WARN clock skew detected on mon.ceph2, mon.ceph3; 64 pgs degraded; 64 pgs stuck degraded; 64 pgs stuck inactive; 64 pgs stuck unclean; 64 pgs stuck undersized; 64 pgs undersized; too few PGs per OSD (21 < min 30); Monitor clock skew detected
[zzq@ceph1 ~]$
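For reference, the "too few PGs per OSD" part of the warning above can usually be cleared by raising the placement-group count of the default rbd pool; a sketch, with 128 chosen for this small cluster (pg_num can only ever be increased, so pick the value carefully):
sudo ceph osd pool set rbd pg_num 128
sudo ceph osd pool set rbd pgp_num 128
The clock skew warning goes away once ntpd on the monitor nodes has actually synchronized, as set up in the Preparation section.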
Using Ceph
Configure a block device (client node)
Note that the client node must also have ceph installed, otherwise rbd fails with "sudo: rbd: command not found".
The client node also needs to be able to talk to the Ceph cluster, which means it needs the /etc/ceph/ceph.client.admin.keyring file; without it you get "ceph monclient(hunting) error missing keyring cannot use cephx for authentication".
The /etc/ceph/ceph.client.admin.keyring file is generated when you run:
ceph-deploy new NodeA NodeB NodeC
ceph-deploy mon create-initial
or
you can copy ceph.client.admin.keyring and ceph.conf from another node in the cluster (the /etc/ceph directory on the admin/monitor node) into /etc/ceph on the client.
For example, on the client node:
scp ceph1:/etc/ceph/ceph.client.admin.keyring /etc/ceph
scp ceph1:/etc/ceph/ceph.conf /etc/ceph
Without the conf file you get the error
-1 did not load config file, using default settings
Before running any rbd command, check with ceph -s that the cluster is healthy, for example:
[zzq@localhost ceph]$ ceph -s
cluster 60cfd9c4-e6c1-4d0d-a2f4-5a0213326909
health HEALTH_OK
monmap e1: 3 mons at {ceph1=192.168.199.81:6789/0,ceph2=192.168.199.82:6789/0,ceph3=192.168.199.83:6789/0}
election epoch 4, quorum 0,1,2 ceph1,ceph2,ceph3
osdmap e29: 3 osds: 3 up, 3 in
pgmap v88: 120 pgs, 1 pools, 0 bytes data, 0 objects
100 MB used, 15226 MB / 15326 MB avail
120 active+clean
[zzq@localhost ceph]$
Otherwise rbd commands may simply hang, because I/O is blocked while the cluster is unhealthy; see the separate troubleshooting note "遇到问题—ceph—ceph的rbd命令没反应卡住" (ceph rbd commands hang with no response).
Create an image
The command format is:
sudo rbd create foo --size 4096 [-m {mon-IP}] [-k /path/to/ceph.client.admin.keyring]
foo is the image name,
--size is the image size in MB,
-m is the monitor host name or IP, and
-k is the client's keyring file (ls /etc/ceph shows what is available).
For example:
sudo rbd create foo --size 4096 -m ceph1 -k /etc/ceph/ceph.client.admin.keyring
You can then check that the image was created with:
sudo rbd list -k /etc/ceph/ceph.client.admin.keyring
The output looks like this:
[zzq@localhost ceph]$ sudo rbd create foo --size 4096 -m ceph1 -k /etc/ceph/ceph.client.admin.keyring
[zzq@localhost ceph]$ sudo rbd list -k /etc/ceph/ceph.client.admin.keyring
foo
[zzq@localhost ceph]$
If you get the error
[zzq@localhost ~]$ sudo rbd create foo --size 4096 -m ceph1 -k /etc/ceph/ceph.client.admin.keyring
ERROR: modinfo: could not find module rbd
FATAL: Module rbd not found.
rbd: failed to load rbd kernel module (1)
rbd: sysfs write failed
rbd: map failed: (2) No such file or directory
Cause
The Ceph RBD kernel module was first introduced in kernel 2.6.34, while the stock RHEL/CentOS 6 kernel is 2.6.32, so a newer kernel is needed.
Upgrade the kernel with the following commands:
sudo rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
sudo rpm -Uvh http://www.elrepo.org/elrepo-release-6-8.el6.elrepo.noarch.rpm
sudo yum --enablerepo=elrepo-kernel install kernel-ml
Then edit /etc/grub.conf, changing default=1 to default=0 so that the newly installed kernel is booted, and reboot:
sudo vi /etc/grub.conf
sudo reboot
Map the image to a block device
The command to map an image to a block device has the following format:
sudo rbd map foo --name client.admin [-m {mon-IP}] [-k /path/to/ceph.client.admin.keyring]
For example:
sudo rbd map foo --name client.admin -m ceph1 -k /etc/ceph/ceph.client.admin.keyring
The output looks like this:
[zzq@localhost ceph]$ sudo rbd map foo --name client.admin -m ceph1 -k /etc/ceph/ceph.client.admin.keyring
/dev/rbd0
[zzq@localhost ceph]$
Check the mapping and create a filesystem
rbd showmapped
The output looks like this:
[zzq@localhost ceph]$ rbd showmapped
id pool image snap device
0 rbd foo - /dev/rbd0
Then create a filesystem on /dev/rbd0:
sudo mkfs.ext4 -m0 /dev/rbd0
The output looks like this:
[zzq@localhost ceph]$ sudo mkfs.ext4 -m0 /dev/rbd0
mke2fs 1.41.12 (17-May-2010)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
262144 inodes, 1048576 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[zzq@localhost ceph]$
Mount the filesystem and write a file
sudo mkdir /cephAll
sudo mount /dev/rbd0 /cephAll
cd /cephAll
sudo vi helloCeph.txt
ls
df -h
With this, the client can read from and write to the Ceph cluster storage directly.
The output looks like this:
[zzq@localhost ceph]$ sudo mkdir /cephAll
[zzq@localhost ceph]$ sudo mount /dev/rbd0 /cephAll
[zzq@localhost ceph]$ cd /cephAll
[zzq@localhost cephAll]$ sudo vi helloCeph.txt
[zzq@localhost cephAll]$ ls
helloCeph.txt lost+found
[zzq@localhost cephAll]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_localhost-lv_root
18G 2.2G 15G 13% /
tmpfs 482M 0 482M 0% /dev/shm
/dev/sda1 477M 86M 362M 20% /boot
/dev/rbd0 3.9G 8.0M 3.8G 1% /cephAll
[zzq@localhost cephAll]$
Other block device commands
Map a block device
Format:
sudo rbd map {pool-name}/{image-name} --id {user-name}
Example:
sudo rbd map test/foo2 --id admin
If cephx authentication is enabled, you also need to specify the keyring:
sudo rbd map test/foo2 --id admin --keyring /etc/ceph/ceph.client.admin.keyring
List mapped devices
rbd showmapped
Unmap a block device
Format:
sudo rbd unmap /dev/rbd/{poolname}/{imagename}
Example:
sudo rbd unmap /dev/rbd/test/foo2
List block device images
sudo rbd ls
Show image information
Format:
sudo rbd info {image-name}
Example:
sudo rbd info foo
or
Format:
sudo rbd info {pool-name}/{image-name}
Example:
sudo rbd info test/foo
Resize a block device image
The commands are:
sudo rbd resize --size 512 test/foo --allow-shrink #shrink
sudo rbd resize --size 4096 test/foo #grow
Delete a block device
sudo rbd rm test/foo
OSD and pool management commands
List the pools
sudo ceph osd lspools
Create a pool
sudo ceph osd pool create pool-name pg-num pgp-num
sudo ceph osd pool create test 512 512
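A common rule of thumb for pg-num is (number of OSDs × 100) / replica count, rounded to a power of two: with 3 OSDs and 2 replicas that is 3 × 100 / 2 = 150, i.e. 128 or 256; the upstream quick-start simply recommends 128 for clusters with fewer than 5 OSDs, so 512 above is on the generous side for a cluster this small.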
Delete a pool
sudo ceph osd pool delete test test --yes-i-really-really-mean-it
Rename a pool
sudo ceph osd pool rename current-pool-name new-pool-name
sudo ceph osd pool rename test test2
Show pool usage statistics
sudo rados df
Set a pool option
sudo ceph osd pool set test size 3
#set the number of object replicas
Get a pool option
sudo ceph osd pool get test size
#get the number of object replicas
References:
http://docs.ceph.org.cn/start/quick-ceph-deploy/
http://blog.51cto.com/kaiyuandiantang/1784429