Recovering an ASM disk that was added to another disk group --- 惜分飞

Date: 2023-09-05 14:33:13

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

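Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]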

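A friend expanded an ASM disk group on one RAC cluster in an AIX environment by adding a disk.
[Image: add_disk]

Immediately afterwards, a disk group on a different RAC cluster dismounted. The ASM alert log on that cluster shows: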

 

Wed Aug 23 12:44:02 2023
NOTE: SMON starting instance recovery for group DATA domain 2 (mounted)
NOTE: F1X0 found on disk 0 au 2 fcn 0.128808679
NOTE: SMON skipping disk 7 - no header
NOTE: cache initiating offline of disk 7 group DATA
NOTE: process _smon_+asm1 (1770932) initiating offline of disk 7.3422955792 (DATA_0007) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 7/0xcc062910, mask = 0x6a, op = clear
Wed Aug 23 12:44:02 2023
GMON updating disk modes for group 2 at 7 for pid 17, osid 1770932
ERROR: Disk 7 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Wed Aug 23 12:44:02 2023
NOTE: cache dismounting (not clean) group 2/0x7FE6D808 (DATA)
WARNING: Offline for disk DATA_0007 in mode 0x7f failed.
Wed Aug 23 12:44:02 2023
NOTE: halting all I/Os to diskgroup 2 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 2 due to diskgroup dismount
NOTE: SMON skipping lock domain (2) validation because diskgroup being dismounted
Abort recovery for domain 2
Wed Aug 23 12:44:02 2023
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x7fe6d808 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_2360526.trc:
ORA-15130: diskgroup "DATA" is being dismounted
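When the alert log is long, it helps to extract just the disks that ASM tried to take offline before the dismount. The following is a minimal sketch, not part of the original recovery procedure. The function name `find_offlined_disks` and the regular expression are illustrative assumptions, based only on the message format shown in the log above.

```python
import re

# Matches the alert-log line that names the disk ASM tried to offline, e.g.
# "... initiating offline of disk 7.3422955792 (DATA_0007) with mask 0x7e in group 2"
OFFLINE_RE = re.compile(
    r"initiating offline of disk (\d+)\.\d+ \((\w+)\) with mask \S+ in group (\d+)"
)


def find_offlined_disks(alert_log_text):
    """Return (disk_number, disk_name, group_number) tuples found in the log text."""
    return [
        (int(disk), name, int(group))
        for disk, name, group in OFFLINE_RE.findall(alert_log_text)
    ]


# Two lines copied from the alert log above
sample = (
    "NOTE: SMON skipping disk 7 - no header\n"
    "NOTE: process _smon_+asm1 (1770932) initiating offline of disk "
    "7.3422955792 (DATA_0007) with mask 0x7e in group 2\n"
)
print(find_offlined_disks(sample))  # [(7, 'DATA_0007', 2)]
```

Here the result points at DATA_0007 in group 2, the disk whose header SMON could no longer find.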

A further attempt to mount the disk group failed with ORA-15042 and ORA-15038:

SQL> alter diskgroup data mount
NOTE: cache registered group DATA number=2 incarn=0x79e6d861
NOTE: cache began mount (first) of group DATA number=2 incarn=0x79e6d861
NOTE: Assigning number (2,0) to disk (/dev/rhdisk31)
NOTE: Assigning number (2,3) to disk (/dev/rhdisk33)
NOTE: Assigning number (2,4) to disk (/dev/rhdisk34)
NOTE: Assigning number (2,5) to disk (/dev/rhdisk35)
NOTE: Assigning number (2,6) to disk (/dev/rhdisk36)
NOTE: Assigning number (2,9) to disk (/dev/rhdisk39)
NOTE: Assigning number (2,1) to disk (/dev/rhdisk8)
NOTE: Assigning number (2,2) to disk (/dev/rhdisk9)
Wed Aug 23 12:58:46 2023
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 11 for pid 27, osid 3736034
NOTE: Assigning number (2,7) to disk ()
NOTE: Assigning number (2,8) to disk ()
GMON querying group 2 at 12 for pid 27, osid 3736034
NOTE: cache dismounting (clean) group 2/0x79E6D861 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 3736034, image: oracle@hbbz01 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 2/0x79E6D861 (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x79e6d861
NOTE: cache deleting context for group DATA 2/0x79e6d861
GMON dismounting group 2 at 13 for pid 27, osid 3736034
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0005 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0006 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0009 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "8" is missing from group number "2"
ORA-15042: ASM disk "7" is missing from group number "2"
ORA-15038: disk '/dev/rhdisk37' mismatch on 'Time Stamp' with target disk group [2129689239] [2062898314]
ERROR: alter diskgroup data mount
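The useful facts in the failed mount are which disk numbers ASM reports as missing and which device has a header that no longer matches the disk group. The following is a rough sketch of extracting them from the error stack. The helper name `summarize_mount_errors` and the patterns are assumptions based only on the ORA-15042 and ORA-15038 message text shown above.

```python
import re

# Patterns derived from the ORA-15042 / ORA-15038 messages in the mount output
MISSING_RE = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"'
)
MISMATCH_RE = re.compile(r"ORA-15038: disk '([^']+)' mismatch on '([^']+)'")


def summarize_mount_errors(text):
    """Collect missing disk numbers and header-mismatched devices from mount errors."""
    # Collapse whitespace so a message wrapped across lines still matches
    flat = " ".join(text.split())
    missing = sorted(int(disk) for disk, _group in MISSING_RE.findall(flat))
    mismatched = MISMATCH_RE.findall(flat)  # list of (device, header field) pairs
    return {"missing_disks": missing, "mismatched": mismatched}


errors = """
ORA-15042:
ASM disk "8" is missing from group number "2"
ORA-15042: ASM disk "7" is missing from group number "2"
ORA-15038: disk '/dev/rhdisk37' mismatch on 'Time Stamp' with target disk group [2129689239] [2062898314]
"""
print(summarize_mount_errors(errors))
# {'missing_disks': [7, 8], 'mismatched': [('/dev/rhdisk37', 'Time Stamp')]}
```

The mismatched device, /dev/rhdisk37, is the disk to check at the operating system level.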

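This raised the suspicion that rhdisk37, which belongs to the failing disk group, had been added to the ASM of the other RAC cluster (in other words, the two ASM installations were using the same physical disk). Analysis at the AIX operating system level confirmed it: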

--- On the host where the ASM disk group was expanded
# lscfg -vpl hdisk15
  hdisk15          U78C5.001.DQD076A-P2-C4-T1-W200C00A098BC9A83-L0  MPIO NetApp FCP Default PCM Disk

        Manufacturer................NETAPP
        Machine Type and Model......LUN C-Mode
        ROS Level and ID............9000
        Serial Number...............80DYz]L/OpCA
        Device Specific.(Z0)........FAS8020

  PLATFORM SPECIFIC

  Name:  disk
    Node:  disk
    Device Type:  block

--- On the host where the disk group dismounted
# lscfg -vpl hdisk37
  hdisk37          U5802.001.9K87776-P1-C1-T1-W200500A098BC9A83-L0  MPIO NetApp FCP Default PCM Disk

        Manufacturer................NETAPP
        Machine Type and Model......LUN C-Mode
        ROS Level and ID............9000
        Serial Number...............80DYz]L/OpCA
        Device Specific.(Z0)........FAS8020

  PLATFORM SPECIFIC

  Name:  disk
    Node:  disk
    Device Type:  block
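The comparison above comes down to matching the Serial Number field in the two lscfg outputs. As a hedged illustration, the check could be scripted as follows. The names `serial_number` and `same_lun` are hypothetical, and the sketch assumes the dotted `Serial Number....value` layout seen in the output above.

```python
import re

# lscfg pads the field label with dots before the value, e.g.
# "Serial Number...............80DYz]L/OpCA"
SERIAL_RE = re.compile(r"Serial Number\.+(\S+)")


def serial_number(lscfg_output):
    """Return the serial number from `lscfg -vpl` output, or None if absent."""
    match = SERIAL_RE.search(lscfg_output)
    return match.group(1) if match else None


def same_lun(output_a, output_b):
    """True when both outputs carry the same, non-empty serial number."""
    serial_a = serial_number(output_a)
    serial_b = serial_number(output_b)
    return serial_a is not None and serial_a == serial_b


# Fragments of the two lscfg outputs shown above
host_a = "Manufacturer................NETAPP\nSerial Number...............80DYz]L/OpCA\n"
host_b = "ROS Level and ID............9000\nSerial Number...............80DYz]L/OpCA\n"
print(same_lun(host_a, host_b))  # True
```

A match only suggests that both hosts see the same LUN; the author confirmed it from the full lscfg output on each host.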

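The lscfg output confirms that both RAC clusters were using the same disk, which is what broke one of the disk groups. A check on the cluster that had just added the disk showed how much of it had been overwritten: the rebalance operation had already written about 380 GB to the newly added disk, which means at least 380 GB of the data this disk held for the old disk group is lost.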


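In this situation the dismounted disk group uses external redundancy, so it cannot be mounted directly. The only option is an approach similar to those used in earlier cases:
Recovery from a completely corrupted ASM disk header
Recovery after an ASM disk was added to a VG
Recovery after an ASM disk was damaged by dd
Recovery after part of an ASM disk was wiped
Another case: recovery after an ASM disk was mistakenly added to a VG and the LV was extended
Database recovery after fdisk partitioning damaged an ASM disk
Another case: recovery after an ASM disk was formatted as an ext3 file system
Oracle ASM disk format recovery: formatted as an ext4 file system
Oracle ASM disk format recovery: formatted as an NTFS file system
Recovery from ORA-15063: ASM discovered an insufficient number of disks for diskgroup
Low-level processing was used to recover the data in the blocks that had not been overwritten.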

DUL was then used to recover that data, which completed the recovery of the core data for this incident.

 

From: https://www.cnblogs.com/xifenfei/p/17679465.html
