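1. Problem description
While working on an 11.2.0.4 RAC test environment, I noticed by chance that the ora.oc4j resource could not be started. This post records how the problem was resolved.
2. Troubleshooting
(1). Start the oc4j resource manually: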
[root@11grac1 log]# crsctl start resource ora.oc4j
The command hung and produced no output.
(2). Check the scriptagent_grid.log log:
2024-03-01 11:19:37.458: [ora.oc4j][1870296832]{1:56992:319} [start] Executing action script: /u01/app/11.2.0.4/grid/bin/oc4jctl[start]
2024-03-01 11:19:37.509: [ora.oc4j][1870296832]{1:56992:319} [start] Start OC4J
2024-03-01 11:19:54.039: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
2024-03-01 11:19:54.996: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
2024-03-01 11:19:56.006: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
2024-03-01 11:19:57.015: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
2024-03-01 11:19:58.022: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
2024-03-01 11:19:59.030: [ora.oc4j][1870296832]{1:56992:319} [start] /u01/app/11.2.0.4/grid/bin/oc4jctl.pl: Could not fetch http://localhost:8888/
......
The "Could not fetch http://localhost:8888/" message keeps repeating, roughly once per second.
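To get a quick sense of how long the agent has been retrying, the repeated message can be counted in the log. The following is a minimal sketch; the function name count_fetch_failures is hypothetical, and the log path has to be supplied by the reader.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle Grid Infrastructure):
# count the "Could not fetch" lines in an agent log.
# $1 = path to the log file, for example scriptagent_grid.log
count_fetch_failures() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status clean.
    grep -c 'Could not fetch http://localhost:8888/' "$1" || true
}

# Usage (the path is a placeholder):
#   count_fetch_failures /path/to/scriptagent_grid.log
```

A count that keeps growing between two runs suggests the start action is still polling port 8888 without success.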
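(3). I searched MOS and went through a number of related troubleshooting notes. The note "Fail to Start oc4j Resource On 11gR2 RAC as Content of Config File server.xml is Gone" (Doc ID 2016052.1) matches the current failure very closely.
(4). Following Doc ID 2016052.1, check the server.xml configuration file on node 1: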
[root@11grac1 OC4J_DBWLM_config]# cd /u01/app/11.2.0.4/grid/oc4j/j2ee/home/OC4J_DBWLM_config
[root@11grac1 OC4J_DBWLM_config]# ll
total 168
-rw-r--r-- 1 grid oinstall  1710 Aug 16  2014 application.xml
drwxr-xr-x 2 grid oinstall   125 Apr 27  2023 database-schemas
-rw-r--r-- 1 grid oinstall  2668 Aug 16  2014 data-sources.xml
-rw-r--r-- 1 grid oinstall  1453 Aug 16  2014 default-web-site.xml
-rw-r--r-- 1 grid oinstall 31373 Aug 16  2014 entity-resolver-config.xml
-rw-r--r-- 1 grid oinstall  9917 Aug 16  2014 global-web-application.xml
-rw-r--r-- 1 grid oinstall  1356 Aug 16  2014 http-web-site.xml
-rw-r--r-- 1 grid oinstall  1014 Aug 16  2014 internal-settings.xml
-rw-r--r-- 1 grid oinstall  4824 Aug 16  2014 j2ee-logging.xml
-rw-r--r-- 1 grid oinstall 21860 Aug 16  2014 java2.policy
-rw-r--r-- 1 grid oinstall   764 Aug 16  2014 javacache.xml
-rw-r--r-- 1 grid oinstall   434 Aug 16  2014 jazn-data.xml
-rw-r--r-- 1 grid oinstall   325 Aug 16  2014 jazn.security.props
-rw-r--r-- 1 grid oinstall  1362 Aug 16  2014 jazn.xml
-rw-r--r-- 1 grid oinstall  1246 Aug 16  2014 jms.xml
-rw-r--r-- 1 grid oinstall  2810 Aug 16  2014 mime.types
-rw-r--r-- 1 grid oinstall   207 Aug 16  2014 oc4jclient.policy
-rw-r--r-- 1 grid oinstall  2059 Aug 16  2014 oc4j-connectors.xml
-rw-r--r-- 1 grid oinstall    32 Aug 16  2014 oc4j.properties
-rw-r--r-- 1 grid oinstall   287 Aug 16  2014 orb-config.xml
-rw-r--r-- 1 grid oinstall  2045 Aug 16  2014 principals.xml
-rw-r--r-- 1 grid oinstall  1324 Aug 16  2014 rmi.xml
-rw-r--r-- 1 grid oinstall     0 Feb 29 13:15 server.xml
-rw-r--r-- 1 grid oinstall  3190 Aug 16  2014 system-application.xml
-rw-r--r-- 1 grid oinstall 13377 Mar  1 11:19 system-jazn-data.xml
-rw-r--r-- 1 grid oinstall  2588 Aug 16  2014 transaction-manager.xml
[root@11grac1 OC4J_DBWLM_config]#
The server.xml file on node 1 is indeed abnormal: its size is 0 bytes, with a last-modified time of Feb 29 13:15.
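Rather than scanning the listing by eye, zero-byte files in the configuration directory can be listed directly. This is a minimal sketch, assuming GNU find as shipped with typical Linux distributions; the function name find_empty_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list zero-byte regular files directly under a directory.
# $1 = directory to inspect
find_empty_files() {
    # -maxdepth 1 : do not descend into subdirectories such as database-schemas
    # -type f     : regular files only
    # -size 0     : files whose size is exactly 0 bytes
    find "$1" -maxdepth 1 -type f -size 0
}

# Usage, with the directory taken from the listing above:
#   find_empty_files /u01/app/11.2.0.4/grid/oc4j/j2ee/home/OC4J_DBWLM_config
```

On the node 1 listing shown here, only server.xml would be expected to show up, since it is the only file with size 0.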
(5). Check the server.xml configuration file on node 2:
[root@11grac2 OC4J_DBWLM_config]# ll
total 172
-rw-r--r-- 1 grid oinstall  1710 Aug 16  2014 application.xml
drwxr-xr-x 2 grid oinstall   125 Apr 27  2023 database-schemas
-rw-r--r-- 1 grid oinstall  2668 Aug 16  2014 data-sources.xml
-rw-r--r-- 1 grid oinstall  1453 Aug 16  2014 default-web-site.xml
-rw-r--r-- 1 grid oinstall 31373 Aug 16  2014 entity-resolver-config.xml
-rw-r--r-- 1 grid oinstall  9917 Aug 16  2014 global-web-application.xml
-rw-r--r-- 1 grid oinstall  1356 Aug 16  2014 http-web-site.xml
-rw-r--r-- 1 grid oinstall  1014 Aug 16  2014 internal-settings.xml
-rw-r--r-- 1 grid oinstall  4824 Aug 16  2014 j2ee-logging.xml
-rw-r--r-- 1 grid oinstall 21860 Aug 16  2014 java2.policy
-rw-r--r-- 1 grid oinstall   764 Aug 16  2014 javacache.xml
-rw-r--r-- 1 grid oinstall   434 Aug 16  2014 jazn-data.xml
-rw-r--r-- 1 grid oinstall   325 Aug 16  2014 jazn.security.props
-rw-r--r-- 1 grid oinstall  1362 Aug 16  2014 jazn.xml
-rw-r--r-- 1 grid oinstall  1246 Aug 16  2014 jms.xml
-rw-r--r-- 1 grid oinstall  2810 Aug 16  2014 mime.types
-rw-r--r-- 1 grid oinstall   207 Aug 16  2014 oc4jclient.policy
-rw-r--r-- 1 grid oinstall  2059 Aug 16  2014 oc4j-connectors.xml
-rw-r--r-- 1 grid oinstall    32 Aug 16  2014 oc4j.properties
-rw-r--r-- 1 grid oinstall   287 Aug 16  2014 orb-config.xml
-rw-r--r-- 1 grid oinstall  2045 Aug 16  2014 principals.xml
-rw-r--r-- 1 grid oinstall  1324 Aug 16  2014 rmi.xml
-rw-r--r-- 1 grid oinstall  2506 Mar  1 11:26 server.xml
-rw-r--r-- 1 grid oinstall  3190 Aug 16  2014 system-application.xml
-rw-r--r-- 1 grid oinstall 13365 Mar  1 11:26 system-jazn-data.xml
-rw-r--r-- 1 grid oinstall  2588 Aug 16  2014 transaction-manager.xml
The server.xml file on node 2 is 2506 bytes, so it appears to be intact.
(6). Copy server.xml from node 2 to node 1:
[root@11grac2 OC4J_DBWLM_config]# pwd
/u01/app/11.2.0.4/grid/oc4j/j2ee/home/OC4J_DBWLM_config
[root@11grac2 OC4J_DBWLM_config]# scp server.xml 11grac1:/u01/app/11.2.0.4/grid/oc4j/j2ee/home/OC4J_DBWLM_config
server.xml                                    100% 2506     1.1MB/s   00:00
[root@11grac2 OC4J_DBWLM_config]#
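The copy above overwrites the damaged file directly. A slightly more cautious variant, not what was done here, is to keep a timestamped copy of the old file and verify the result afterwards. The sketch below assumes the replacement has already been transferred to a staging path on the target node; the function name install_with_backup and the staging path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: back up the existing file, install the replacement,
# then verify that the installed copy is byte-identical to the source.
# $1 = replacement file (already present on this node)
# $2 = target path to overwrite
install_with_backup() {
    src=$1
    dst=$2
    # Keep the old file, even if it is empty, so the change can be reverted.
    if [ -e "$dst" ]; then
        cp -p "$dst" "$dst.bak.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # Writing into the existing file keeps its current owner and mode.
    cp "$src" "$dst" || return 1
    # cmp -s is silent and returns 0 only when both files are identical.
    cmp -s "$src" "$dst"
}

# Usage (the staging path /tmp/server.xml is a placeholder):
#   install_with_backup /tmp/server.xml \
#       /u01/app/11.2.0.4/grid/oc4j/j2ee/home/OC4J_DBWLM_config/server.xml
```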
(7). Go back to the oc4j start command that was still running on node 1:
[root@11grac1 log]# crsctl start resource ora.oc4j
CRS-2672: Attempting to start 'ora.oc4j' on '11grac2'
CRS-2674: Start of 'ora.oc4j' on '11grac2' failed
CRS-2679: Attempting to clean 'ora.oc4j' on '11grac2'
CRS-2681: Clean of 'ora.oc4j' on '11grac2' succeeded
CRS-2563: Attempt to start resource 'ora.oc4j' on '11grac2' has failed. Will re-retry on '11grac1' now.
CRS-2672: Attempting to start 'ora.oc4j' on '11grac1'
CRS-2676: Start of 'ora.oc4j' on '11grac1' succeeded
[root@11grac1 log]#
As the output shows, oc4j has now started successfully on node 1.
(8). Check the cluster status:
[root@11grac1 log]# crsctl status resource -t
......
ora.cvu
      1        ONLINE  ONLINE       11grac2
ora.oc4j
      1        ONLINE  ONLINE       11grac1
ora.racdb.db
      1        ONLINE  ONLINE       11grac2                  Open
      2        ONLINE  ONLINE       11grac1                  Open
ora.scan1.vip
      1        ONLINE  ONLINE       11grac1
[root@11grac1 log]#
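When only one resource is of interest, the tabular output can be filtered instead of read in full. The following is a minimal sketch that assumes the cluster-resource layout shown above, where a resource name sits on its own line and is followed by one line per instance. The function name resource_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl status resource -t` output on stdin and
# print the instance lines that follow the given resource name.
# $1 = resource name, for example ora.oc4j
resource_state() {
    awk -v res="$1" '
        # Header line of the wanted resource: start collecting.
        $1 == res         { found = 1; next }
        # Header line of another resource: stop collecting.
        found && /^ora\./ { found = 0 }
        # Instance line: print instance number, target, state and server.
        found             { print $1, $2, $3, $4 }
    '
}

# Usage:
#   crsctl status resource -t | resource_state ora.oc4j
```

For the output above this should reduce the ora.oc4j entry to a single line giving the instance number, target, state and hosting node.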