What to Do When an IP Change Causes Ceph Monitor Failures and an OSD Disk Crash
Published: 2025-12-01 Author: 千家信息网 editor
千家信息网, last updated 2025-12-01
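The company moved offices, and the IP addresses of all servers changed. After configuring the new IPs on the Ceph servers and starting them, the monitor processes failed to start: each monitor kept trying to bind to its old IP address, which could not succeed. At first I suspected the servers' IP settings, but changing the hostname, ceph.conf, and so on did not help. Step-by-step analysis showed that the monmap still held the old IP addresses. Ceph reads the monmap to start the monitor process, so the monmap has to be modified. The method is as follows: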
#Add the new monitor locations
# monmaptool --create --add mon0 192.168.32.2:6789 --add osd1 192.168.32.3:6789 \
    --add osd2 192.168.32.4:6789 --fsid 61a520db-317b-41f1-9752-30cedc5ffb9a \
    --clobber monmap.bin
#Retrieve the monitor map (an alternative to --create above; it needs a monitor quorum, so it fails while the monitors cannot start)
# ceph mon getmap -o monmap.bin
#Check new contents
# monmaptool --print monmap.bin
#Inject the monmap (with the monitor daemons stopped)
# ceph-mon -i mon0 --inject-monmap monmap.bin
# ceph-mon -i osd1 --inject-monmap monmap.bin
# ceph-mon -i osd2 --inject-monmap monmap.bin
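After that, start the monitors again and everything works normally.
However, the problem described in the previous article appeared: one OSD disk went down. After searching around, I only found the Ceph website saying that it is a Ceph bug. Unable to repair it, I deleted this OSD and reinstalled it: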
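The monmap rebuild above can be wrapped in a small helper so that the monitor list is written only once. This is a minimal sketch, not the author's script: the function name `build_monmap_cmd` and its `name=ip` argument format are hypothetical, and it only prints the `monmaptool` command instead of running it, so the result can be reviewed before it touches a real cluster. Port 6789 is taken from the commands in this article.

```shell
# Hypothetical helper: print (do not run) the monmaptool command that
# rebuilds a monmap from an fsid, an output file, and name=ip pairs.
build_monmap_cmd() {
    fsid="$1"
    out="$2"
    shift 2
    cmd="monmaptool --create --fsid ${fsid} --clobber"
    for pair in "$@"; do
        # ${pair%%=*} is the monitor name, ${pair#*=} is its new IP
        cmd="${cmd} --add ${pair%%=*} ${pair#*=}:6789"
    done
    printf '%s %s\n' "$cmd" "$out"
}

# Example with the fsid and addresses used in this article
build_monmap_cmd 61a520db-317b-41f1-9752-30cedc5ffb9a monmap.bin \
    mon0=192.168.32.2 osd1=192.168.32.3 osd2=192.168.32.4
```

Printing the command rather than executing it is deliberate: injecting a wrong monmap can leave the monitors unable to form a quorum, so it seems safer to inspect the generated line first.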
# service ceph stop osd.4
#No need to run ceph osd crush remove osd.4
# ceph auth del osd.4
# ceph osd rm 4
# umount /cephmp1
# mkfs.xfs -f /dev/sdc
# mount /dev/sdc /cephmp1
#Running create here does not install the osd correctly
# ceph-deploy osd prepare osd2:/cephmp1:/dev/sdf1
# ceph-deploy osd activate osd2:/cephmp1:/dev/sdf1
After this, I restarted the OSD and it ran successfully. Ceph rebalances the data automatically. The final status is:
[root@osd2 ~]# ceph -s
    cluster 61a520db-317b-41f1-9752-30cedc5ffb9a
     health HEALTH_WARN 9 pgs incomplete; 9 pgs stuck inactive; 9 pgs stuck unclean; 3 requests are blocked > 32 sec
     monmap e3: 3 mons at {mon0=192.168.32.2:6789/0,osd1=192.168.32.3:6789/0,osd2=192.168.32.4:6789/0}, election epoch 76, quorum 0,1,2 mon0,osd1,osd2
     osdmap e689: 6 osds: 6 up, 6 in
      pgmap v189608: 704 pgs, 5 pools, 34983 MB data, 8966 objects
            69349 MB used, 11104 GB / 11172 GB avail
                 695 active+clean
                   9 incomplete
Nine PGs are in the incomplete state.
[root@osd2 ~]# ceph health detail
HEALTH_WARN 9 pgs incomplete; 9 pgs stuck inactive; 9 pgs stuck unclean; 3 requests are blocked > 32 sec; 1 osds have slow requests
pg 5.95 is stuck inactive for 838842.634721, current state incomplete, last acting [1,4]
pg 5.66 is stuck inactive since forever, current state incomplete, last acting [4,0]
pg 5.de is stuck inactive for 808270.105968, current state incomplete, last acting [0,4]
pg 5.f5 is stuck inactive for 496137.708887, current state incomplete, last acting [0,4]
pg 5.11 is stuck inactive since forever, current state incomplete, last acting [4,1]
pg 5.30 is stuck inactive for 507062.828403, current state incomplete, last acting [0,4]
pg 5.bc is stuck inactive since forever, current state incomplete, last acting [4,1]
pg 5.a7 is stuck inactive for 499713.993372, current state incomplete, last acting [1,4]
pg 5.22 is stuck inactive for 496125.831204, current state incomplete, last acting [0,4]
pg 5.95 is stuck unclean for 838842.634796, current state incomplete, last acting [1,4]
pg 5.66 is stuck unclean since forever, current state incomplete, last acting [4,0]
pg 5.de is stuck unclean for 808270.106039, current state incomplete, last acting [0,4]
pg 5.f5 is stuck unclean for 496137.708958, current state incomplete, last acting [0,4]
pg 5.11 is stuck unclean since forever, current state incomplete, last acting [4,1]
pg 5.30 is stuck unclean for 507062.828475, current state incomplete, last acting [0,4]
pg 5.bc is stuck unclean since forever, current state incomplete, last acting [4,1]
pg 5.a7 is stuck unclean for 499713.993443, current state incomplete, last acting [1,4]
pg 5.22 is stuck unclean for 496125.831274, current state incomplete, last acting [0,4]
pg 5.de is incomplete, acting [0,4]
pg 5.bc is incomplete, acting [4,1]
pg 5.a7 is incomplete, acting [1,4]
pg 5.95 is incomplete, acting [1,4]
pg 5.66 is incomplete, acting [4,0]
pg 5.30 is incomplete, acting [0,4]
pg 5.22 is incomplete, acting [0,4]
pg 5.11 is incomplete, acting [4,1]
pg 5.f5 is incomplete, acting [0,4]
2 ops are blocked > 8388.61 sec
1 ops are blocked > 4194.3 sec
2 ops are blocked > 8388.61 sec on osd.0
1 ops are blocked > 4194.3 sec on osd.0
1 osds have slow requests
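With output this long, it can help to reduce it to one line per incomplete PG. The following is a minimal sketch, assuming the `pg <id> is incomplete, acting [...]` line format shown above (other Ceph releases may print it differently). The function name `list_incomplete_pgs` is hypothetical. It reads `ceph health detail` text on stdin and prints each incomplete PG with its acting set.

```shell
# Hypothetical helper: keep only the summary lines of the form
#   pg <id> is incomplete, acting [a,b]
# and print "<id> [a,b]" for each of them.
list_incomplete_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 == "incomplete," && $5 == "acting" { print $2, $6 }'
}

# Sample lines copied from the output above. On a live cluster, use:
#   ceph health detail | list_incomplete_pgs
list_incomplete_pgs <<'EOF'
pg 5.de is stuck inactive for 808270.105968, current state incomplete, last acting [0,4]
pg 5.de is incomplete, acting [0,4]
pg 5.bc is incomplete, acting [4,1]
EOF
```

Applied to the full output above, the nine acting sets all contain osd 4, the OSD that was deleted and rebuilt.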
More searching turned up nothing. Here is what someone who ran into the same problem wrote:
I already tried "ceph pg repair 4.77", stop/start OSDs, "ceph osd lost", "ceph pg force_create_pg 4.77". Most scary thing is "force_create_pg" does not work. At least it should be a way to wipe out a incomplete PG without destroying a whole pool.
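I tried the methods above and none of them worked. The problem remains unsolved for now, which is rather frustrating.
PS: common pg operations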
[root@osd2 ~]# ceph pg map 5.de
osdmap e689 pg 5.de (5.de) -> up [0,4] acting [0,4]
[root@osd2 ~]# ceph pg 5.de query
[root@osd2 ~]# ceph pg scrub 5.de
instructing pg 5.de on osd.0 to scrub
[root@osd2 ~]# ceph pg 5.de mark_unfound_lost revert
pg has no unfound objects
#ceph pg dump_stuck stale
#ceph pg dump_stuck inactive
#ceph pg dump_stuck unclean
[root@osd2 ~]# ceph osd lost 1
Error EPERM: are you SURE? this might mean real, permanent data loss. pass --yes-i-really-mean-it if you really do.
[root@osd2 ~]#
[root@osd2 ~]# ceph osd lost 4 --yes-i-really-mean-it
osd.4 is not down or doesn't exist
[root@osd2 ~]# service ceph stop osd.4
=== osd.4 ===
Stopping Ceph osd.4 on osd2...kill 22287...kill 22287...done
[root@osd2 ~]# ceph osd lost 4 --yes-i-really-mean-it
marked osd lost in epoch 690
[root@osd1 mnt]# ceph pg repair 5.de
instructing pg 5.de on osd.0 to repair
[root@osd1 mnt]# ceph pg repair 5.de
instructing pg 5.de on osd.0 to repair