Oracle 19c: Instance Startup / Resource Reconfiguration Takes Too Long

Date: 2024-09-26 14:50:17
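Our group is planning a centralized purchase of storage, and the product under test this time is Dell EMC PowerFlex. We had a set of stress-test targets planned, but in one scenario, after a compute node was cut off, bringing the database instance back up took an extremely long time. I did some analysis, and the rough steps are below. Even a good memory is no match for written notes, so I am recording it here.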

2024-09-24T14:31:42.636014+08:00
Starting ORACLE instance (normal) (OS id: 80696)

2024-09-24T14:31:42.746937+08:00
PAGESIZE AVAILABLE_PAGES EXPECTED_PAGES ALLOCATED_PAGES ERROR(s)
2024-09-24T14:31:42.746964+08:00
4K Configured 39 39 NONE
2024-09-24T14:31:42.747012+08:00
2048K 266246 262145 262145 NONE
2024-09-24T14:31:42.747038+08:00
**********************************************************************

2024-09-24T14:32:06.212071+08:00
Increasing priority of 26 RS
* Setting GES domain 0
Attached to domain 0 (addr: 0xe3498eeb8)
Reconfiguration started (old inc 0, new inc 12)
Dynamic remastering is disabled
List of instances (total 2) :
1 2
My inst 1 (I'm a new instance)
Global Resource Directory frozen
Enabling Dynamic Remastering: NONE->NORM switch
Communication channels reestablished
2024-09-24T14:32:06.259101+08:00
* domain 0 valid = 1 (flags x8820, pdb flags x8000) according to instance 2
2024-09-24T14:32:06.287062+08:00
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
2024-09-24T14:32:06.291798+08:00
LMS 15: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291808+08:00
LMS 17: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291810+08:00
LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291814+08:00
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291818+08:00
LMS 19: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291821+08:00
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291826+08:00
LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291833+08:00
LMS 22: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291856+08:00
LMS 7: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291863+08:00
LMS 20: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291919+08:00
LMS 25: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291923+08:00
LMS 14: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291942+08:00
LMS 12: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291945+08:00
LMS 6: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291966+08:00
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291975+08:00
LMS 24: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291976+08:00
LMS 16: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291980+08:00
LMS 11: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291983+08:00
LMS 21: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291988+08:00
LMS 23: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291990+08:00
LMS 9: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291993+08:00
LMS 10: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291994+08:00
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.291996+08:00
LMS 18: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.292004+08:00
LMS 13: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2024-09-24T14:32:06.292009+08:00
LMS 8: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
2024-09-24T14:37:55.980243+08:00
Reconfiguration complete (total time 349.8 secs) <<<<<<<<<<<<<<<<<<<<
Decreasing priority of 26 RS
2024-09-24T14:37:55.980646+08:00
Starting background process LCK0

As you can see, the alert log reports "Reconfiguration complete (total time 349.8 secs)", something we have never seen on our standardized systems.
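
To spot this kind of outlier without reading the whole alert log, the reconfiguration lines can be pulled out together with the timestamp line that precedes them. This is a minimal sketch, not part of the original analysis: the function name `rcfg_times` and the 60-second default threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
# List every "Reconfiguration complete" entry with the timestamp line that
# precedes it, and flag entries above a threshold (seconds).
# Usage: rcfg_times [threshold_secs] < alert_<SID>.log
# The default threshold of 60 is an illustrative assumption, not an Oracle value.
rcfg_times() {
    awk -v limit="${1:-60}" '
        # Alert log timestamp lines look like 2024-09-24T14:37:55.980243+08:00
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ { ts = $1; next }
        /Reconfiguration complete \(total time/ {
            # The field right after the word "time" is the duration in seconds
            for (i = 1; i < NF; i++) if ($i == "time") secs = $(i + 1)
            printf "%s %s %s\n", ts, secs, (secs + 0 > limit + 0 ? "SLOW" : "ok")
        }
    '
}
```

For example, `rcfg_times 60 < alert_flexdbt1.log` would print one line per reconfiguration; the alert log file name here is only a guess based on the instance name shown in the trace header.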

From the alert log, the LMON process (OS id) at the time was 81255, so we can look for the cause in its trace file.

2024-09-24T14:31:57.273227+08:00

LMON started with pid=22, OS id=81255 
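
The trace file name follows from that line: in this case it is the instance name, `_lmon_`, and the OS id. A small sketch that derives the name from an alert log is shown here. The helper name `lmon_trace_name` is hypothetical, and the naming pattern is taken only from the trace header in this post.

```shell
#!/bin/sh
# Print the expected LMON trace file name for each LMON start found in an
# alert log read from stdin. $1 is the instance name (e.g. flexdbt1).
lmon_trace_name() {
    awk -v inst="$1" '
        /LMON started with pid=/ {
            # The last field looks like "id=81255"
            n = split($NF, kv, "=")
            if (n == 2) printf "%s_lmon_%s.trc\n", inst, kv[2]
        }
    '
}
```

The result can then be looked up under the instance's `trace` directory.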

Trace file /u01/app/oracle/diag/rdbms/flexdbt/flexdbt1/trace/flexdbt1_lmon_81255.trc
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0
Build label: RDBMS_19.24.0.0.0DBRU_LINUX.X64_240627
ORACLE_HOME: /u01/app/oracle/product/19.3.0/db
System name: Linux
Node name: flexdbt1
Release: 4.18.0-477.10.1.el8_8.x86_64
Version: #1 SMP Wed Apr 5 13:35:01 EDT 2023
Machine: x86_64
CLID: P
Instance name: flexdbt1
Instance number: 1
Database name: flexdbt
Database unique name: flexdbt
Database id: N/A
Redo thread mounted by this instance: 0 <none>
Oracle process number: 22
Unix process pid: 81255, image: oracle@flexdbt1 (LMON)

2024-09-24 14:32:06.226 : * Begin lmon rcfg step KJGA_RCFG_BEGIN (kjidomena 0, rcfginfo x0)
2024-09-24 14:32:06.229 : * Begin lmon rcfg step KJGA_RCFG_FREEZE
2024-09-24 14:32:06.230 : * Begin lmon rcfg step KJGA_RCFG_COMM
2024-09-24 14:32:06.238 : * Begin lmon rcfg step KJGA_RCFG_EXCHANGE (kjidomena 0, hvmaster 2)
2024-09-24 14:32:06.287 : * Begin lmon rcfg step KJGA_RCFG_ENQCLEANUP
2024-09-24 14:32:06.288 : * Begin lmon rcfg step KJGA_RCFG_CLEANUP
2024-09-24 14:32:06.363 : * Begin lmon rcfg step KJGA_RCFG_TIMERQ
2024-09-24 14:32:06.363 : * Begin lmon rcfg step KJGA_RCFG_DDQ
2024-09-24 14:32:06.363 : * Begin lmon rcfg step KJGA_RCFG_SETMASTER
2024-09-24 14:32:06.737 : * Begin lmon rcfg step KJGA_RCFG_ENQREPLAY
2024-09-24 14:32:06.754 : * Begin lmon rcfg step KJGA_RCFG_ENQDUBIOUS
2024-09-24 14:32:06.765 : * Begin lmon rcfg step KJGA_RCFG_ENQGRANT
2024-09-24 14:32:06.773 : * Begin lmon rcfg step KJGA_RCFG_PCMREPLAY <<<<<
2024-09-24 14:37:55.697 : * Begin lmon rcfg step KJGA_RCFG_FIXWRITES
2024-09-24 14:37:55.979 : * Begin lmon rcfg step KJGA_RCFG_END (kjidomena 0)

Total dlm rcfg time (inc 12): 341.557 secs (1681687, 2023244)
Begin step .........: 0.002 secs (1681687, 1681689)
Freeze step ........: 0.000 secs (1681690, 1681690)
Remap step .........: 0.001 secs (1681690, 1681691)
Comm step ..........: 0.005 secs (1681691, 1681696)
Sync 1 step ........: 0.003 secs (1681696, 1681699)
Exchange step ......: 0.000 secs (1681699, 1681699)
Sync 2 step ........: 0.046 secs (1681700, 1681746)
Enqueue cleanup step: 0.002 secs (1681746, 1681748)
Sync pcm1 step .....: 0.000 secs (1681748, 1681748)
Cleanup step .......: 0.073 secs (1681748, 1681821)
Timerq step ........: 0.000 secs (1681821, 1681821)
Ddq step ...........: 0.000 secs (1681821, 1681821)
Set master step ....: 0.002 secs (1681821, 1681823)
Sync 3 step ........: 0.364 secs (1681823, 1682187)
Enqueue replay step : 0.002 secs (1682187, 1682189)
Sync 4 step ........: 0.014 secs (1682189, 1682203)
Enqueue dubious step: 0.002 secs (1682203, 1682205)
Sync 5 step ........: 0.009 secs (1682205, 1682214)
Enqueue grant step .: 0.005 secs (1682214, 1682219)
Sync 6 step ........: 0.003 secs (1682219, 1682222)
PCM replay step ....: 0.098 secs (1682222, 1682320)
Sync 7 step ........: 340.647 secs (1682320, 2022967)
Fixwrt replay step .: 0.276 secs (2022967, 2023243)
Sync 8 step ........: 0.000 secs (2023244, 2023244)
End step ...........: 0.000 secs (2023244, 2023244)
Number of replayed enqueues sent / received .......: 0 / 2232
Number of replayed fusion locks sent / received ...: 0 / 13601328
Number of enqueues mastered before / after rcfg ...: 0 / 2002
Number of fusion locks mastered after rcfg: 13601328
Number of affinity locks expanded remote / local ..: 0 / 0
Number of kjbr resources scanned / kjbr buckets skipped in cleanup step ..: 13601328 / 1664

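About 340 seconds of the reconfiguration went to the KJGA_RCFG_PCMREPLAY phase: the trace shows that step beginning at 14:32:06.773 and the next one (KJGA_RCFG_FIXWRITES) not beginning until 14:37:55.697, and the summary attributes 340.647 secs to the Sync 7 step listed right after PCM replay. This step means: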

Transfer of the local lock information to the new master.

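In other words, after the new instance (instance 1) starts, the existing instance (instance 2) ships the resource master information it currently holds to the new instance.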
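
The per-step summary in the LMON trace can also be ranked mechanically, which makes the dominant step obvious at a glance. The following is a sketch under the assumption that the summary lines keep the "<name> step ...: <secs> secs" layout seen above; `rcfg_top_steps` is a made-up helper name.

```shell
#!/bin/sh
# Rank the "... step ...: N secs" lines of an LMON trace (stdin) by duration.
# $1 = how many of the slowest steps to print (default 3).
rcfg_top_steps() {
    awk '
        /step[ .]*: *[0-9.]+ secs/ {
            name = $0
            sub(/[ .]*:.*/, "", name)      # drop dot padding, colon and the rest
            rest = $0
            sub(/^[^:]*: */, "", rest)     # keep what follows the first colon
            split(rest, f, " ")            # f[1] is the duration in seconds
            printf "%s\t%s\n", f[1], name
        }
    ' | LC_ALL=C sort -rn | head -n "${1:-3}"
}
```

Run against the trace above (`rcfg_top_steps 3 < flexdbt1_lmon_81255.trc`), the first line printed should be the Sync 7 step with 340.647 secs.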

According to OSWatcher, IpReasmFails grew sharply on the system between 14:32 and 14:37.

$ awk '/zzz/{t=$5}/IpReasmFails/{print t,$1,$2}' flexdbt1_netstat_24.09.24.1400.dat

14:31:02 IpReasmFails 1703436
14:31:32 IpReasmFails 1703436
14:32:02 IpReasmFails 1703436
14:32:32 IpReasmFails 1868043
14:33:02 IpReasmFails 2069652
14:33:32 IpReasmFails 2269260
14:34:02 IpReasmFails 2468103
14:34:32 IpReasmFails 2667326
14:35:02 IpReasmFails 2866252
14:35:32 IpReasmFails 3065929
14:36:02 IpReasmFails 3265551
14:36:32 IpReasmFails 3465252
14:37:02 IpReasmFails 3665142
14:37:32 IpReasmFails 3861132
14:38:02 IpReasmFails 3975386
14:38:32 IpReasmFails 3975394
14:39:02 IpReasmFails 3975395
14:39:32 IpReasmFails 3975395
14:40:02 IpReasmFails 3975395
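
Since IpReasmFails is a cumulative counter, the growth per sample is easier to read than the raw values. Below is a minimal sketch that turns the output of the awk command above into per-sample deltas; the function name `reasm_deltas` is an assumption for illustration.

```shell
#!/bin/sh
# Read "HH:MM:SS IpReasmFails <count>" lines on stdin and print the increase
# since the previous sample. The first sample has no predecessor and is
# skipped; a decreasing counter is reported as a reset rather than a delta.
reasm_deltas() {
    awk '
        $2 == "IpReasmFails" {
            if (seen) {
                if ($3 + 0 < prev + 0) printf "%s counter reset\n", $1
                else printf "%s +%d\n", $1, $3 - prev
            }
            prev = $3; seen = 1
        }
    '
}
```

Piping the earlier command into it (`awk '/zzz/{t=$5}/IpReasmFails/{print t,$1,$2}' flexdbt1_netstat_24.09.24.1400.dat | reasm_deltas`) shows the counter climbing by roughly 200,000 per 30-second sample for the whole duration of the stalled reconfiguration.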
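
Jumbo Frames reduce how often UDP datagrams have to be fragmented and reassembled, which effectively avoids IpReasmFails. The private network device (eno145) was running with an MTU of 1500: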


eno145: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::2eea:7fff:feed:e6c8  prefixlen 64  scopeid 0x20<link>
        ether 2c:ea:7f:ed:e6:c8  txqueuelen 1000  (Ethernet)
        RX packets 67856043  bytes 123641995882 (115.1 GiB)
        RX errors 0  dropped 6  overruns 0  frame 0
        TX packets 77358134  bytes 96528195964 (89.8 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 70  
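
To see why the MTU matters, it helps to count how many IP fragments a single UDP datagram needs. The sketch below assumes IPv4 with a 20-byte header and an 8-byte UDP header. The 8192-byte payload in the usage note is a hypothetical figure (one 8 KB block), since the post does not state the actual message sizes on the interconnect.

```shell
#!/bin/sh
# Number of IP fragments needed for one UDP datagram.
# $1 = UDP payload bytes, $2 = interface MTU.
# Assumes IPv4 with a 20-byte header (no options) and an 8-byte UDP header.
udp_fragments() {
    payload=$1
    mtu=$2
    ip_data=$((payload + 8))             # the UDP header travels inside the IP payload
    per_frag=$(( (mtu - 20) / 8 * 8 ))   # fragment data must be a multiple of 8 bytes
    echo $(( (ip_data + per_frag - 1) / per_frag ))
}
```

Under these assumptions, `udp_fragments 8192 1500` gives 6 and `udp_fragments 8192 9000` gives 1. With an MTU of 1500, losing any one of the six fragments makes the whole datagram fail reassembly. With an MTU of 9000, the datagram fits in a single frame and no reassembly is needed.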

To implement the solution, please execute the following steps:

Enable Jumbo Frames on the private network device (eno145) on all nodes. See:

Recommendation for the Real Application Cluster Interconnect and Jumbo Frames ( Doc ID 341788.1 )

Note that enabling Jumbo Frames requires support from the network devices and the switches.

ifconfig eno145 mtu 9000    # takes effect immediately, but is lost on reboot
vi /etc/sysconfig/network-scripts/ifcfg-eno145    # to persist it, add the line below
MTU=9000
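
Because every device on the path has to accept the larger frames, it is worth confirming the new MTU end to end before restarting the instance. The following is a hedged sketch that only prints the check commands. The helper names are made up, and the peer address is a placeholder for the other node's private IP, which the post does not give. `ping -M do` (Linux iputils) forbids fragmentation, so the ping only succeeds if a full-size frame really gets through.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus the 20-byte IP header minus the 8-byte ICMP header.
max_ping_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) the check commands for a device, an MTU and a peer.
# $3 is a placeholder for the peer node's private address (an assumption).
mtu_check_cmds() {
    dev=$1
    mtu=$2
    peer=$3
    echo "ip link show dev $dev"
    echo "ping -M do -c 3 -s $(max_ping_payload "$mtu") $peer"
}
```

For example, `mtu_check_cmds eno145 9000 PEER_PRIVATE_IP` prints an `ip link show` command and a `ping` with a payload of 8972 bytes. If that ping fails while a smaller size works, some device on the path is still limited to a smaller MTU.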

After changing the MTU, let's look at the netstat records again:

[root@flexdbt1:/u01/app/grid/oracle.ahf/data/repository/suptools/flexdbt1/oswbb/grid/archive/oswnetstat]$ awk '/zzz/{t=$5}/IpReasmFails/{print t,$1,$2}' flexdbt1_netstat_24.09.25.1500.dat
15:00:28 IpReasmFails 2875075
15:00:58 IpReasmFails 2875075
15:01:29 IpReasmFails 2875075
15:01:59 IpReasmFails 2875075
15:02:29 IpReasmFails 2875075
15:02:59 IpReasmFails 2875075
15:03:29 IpReasmFails 2875075
15:03:59 IpReasmFails 2875075
15:04:29 IpReasmFails 2875075
15:04:59 IpReasmFails 2875075
15:05:29 IpReasmFails 2875075
15:05:59 IpReasmFails 2875075
15:06:29 IpReasmFails 2875075
15:06:59 IpReasmFails 2875075
15:07:29 IpReasmFails 2875075
15:07:59 IpReasmFails 2875075
15:08:29 IpReasmFails 2875075
15:08:59 IpReasmFails 2875075
15:09:29 IpReasmFails 2875075
15:09:59 IpReasmFails 2875075
15:10:29 IpReasmFails 2875075
15:10:59 IpReasmFails 2875075
15:11:29 IpReasmFails 2875075
15:11:59 IpReasmFails 2875075
15:12:29 IpReasmFails 2875075
15:12:59 IpReasmFails 2875075
15:13:29 IpReasmFails 2875075
15:13:59 IpReasmFails 2875075
15:14:29 IpReasmFails 2875075
15:14:59 IpReasmFails 2875075
15:15:29 IpReasmFails 2875075
15:15:59 IpReasmFails 2875075
15:16:29 IpReasmFails 2875075
15:16:59 IpReasmFails 2875075
15:17:29 IpReasmFails 2875075
15:17:59 IpReasmFails 2875075
15:18:29 IpReasmFails 2875075
15:18:59 IpReasmFails 2875075
15:19:29 IpReasmFails 2875075
15:19:59 IpReasmFails 2875075
15:20:30 IpReasmFails 2875075
15:21:00 IpReasmFails 2875075
15:21:30 IpReasmFails 2875075
15:22:00 IpReasmFails 2875075
15:22:30 IpReasmFails 2875075
15:23:00 IpReasmFails 2875075
15:23:30 IpReasmFails 2875075
15:24:00 IpReasmFails 2875075
15:30:20 IpReasmFails 0
15:30:50 IpReasmFails 0
15:31:20 IpReasmFails 0
15:31:50 IpReasmFails 0
15:32:20 IpReasmFails 0
15:32:50 IpReasmFails 0
15:33:20 IpReasmFails 0
15:33:50 IpReasmFails 0
15:34:20 IpReasmFails 0
15:34:50 IpReasmFails 0
15:35:20 IpReasmFails 0
15:35:50 IpReasmFails 0
15:36:20 IpReasmFails 0
15:36:50 IpReasmFails 0
15:37:20 IpReasmFails 0
15:37:50 IpReasmFails 0
15:38:20 IpReasmFails 0
15:38:50 IpReasmFails 0
15:39:20 IpReasmFails 0
15:39:50 IpReasmFails 0
15:40:20 IpReasmFails 0
15:40:50 IpReasmFails 0
15:41:20 IpReasmFails 0
15:41:50 IpReasmFails 0
15:42:20 IpReasmFails 0
15:42:50 IpReasmFails 0
15:43:20 IpReasmFails 0
15:43:50 IpReasmFails 0
15:44:20 IpReasmFails 0
15:44:50 IpReasmFails 0
15:45:20 IpReasmFails 0
15:45:50 IpReasmFails 0
15:46:20 IpReasmFails 0
15:46:51 IpReasmFails 0
15:47:21 IpReasmFails 0
15:53:53 IpReasmFails 0
15:54:23 IpReasmFails 0
15:54:53 IpReasmFails 0
15:55:23 IpReasmFails 0
15:55:53 IpReasmFails 0
15:56:23 IpReasmFails 0
15:56:53 IpReasmFails 0
15:57:23 IpReasmFails 0
15:57:53 IpReasmFails 0
15:58:23 IpReasmFails 0
15:58:53 IpReasmFails 0
15:59:23 IpReasmFails 0
15:59:53 IpReasmFails 0

The resource reconfiguration time came down as well, and instance startup became much faster:

2024-09-25T15:56:17.445405+08:00

Reconfiguration complete (total time 5.9 secs) 

From: https://blog.51cto.com/yangjunfeng/12119426
