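After our company's platform was upgraded from 11g to 19c, Oracle 19c on Linux began logging an ORA-00800 error in the alert log at startup. The error can be traced to the startup of the VKTM process.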
Environment:
Operating system: Red Hat Enterprise Linux release 8.8
Database: 19.24.0.0.0 Enterprise Edition
Problem description:
When Oracle 19c starts, the following message appears in the Oracle alert log:
Errors in file /oracle/oracle/diag/rdbms/prod/trace/gsp_vktm_1900.trc (incident=51251) (PDBNAME=CDB$ROOT):
ORA-00800: soft external error, arguments: [Set Priority Failed], [VKTM], [Check traces and OS configuration], [Check Oracle document and MOS notes], []
Incident details in: /oracle/prod/.../incdir_51251/gsp_vktm_1900_i51251.trc
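To confirm whether the error comes back after each restart, the alert log can be scanned for the VKTM variant of ORA-00800. The following is only a minimal sketch: the function name count_vktm_ora800 is hypothetical, and the alert log path has to be supplied by the caller.

```shell
#!/bin/bash
# Hypothetical helper (not an Oracle tool): count the alert-log lines that
# report ORA-00800 with [Set Priority Failed] for VKTM.
# $1 = path to the alert log
count_vktm_ora800() {
    local alert_log="$1"
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps the function usable under "set -e".
    grep -c 'ORA-00800: .*\[Set Priority Failed\], \[VKTM\]' "$alert_log" || true
}
```

Called as count_vktm_ora800 followed by the alert log path, it prints a count that can be compared before and after a fix.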
Analysis and resolution:
$ oerr ora 00800
00800, 00000, "soft external error, arguments: [%s], [%s], [%s], [%s], [%s]"
// *Cause: An improper system configuration or setting resulted in failure.
// This failure is not fatal to the instance at the moment, however, this might result
// in an unexpected behavior during query execution.
// *Action: Check the database trace files and rectify system settings or the configuration.
// For additional information, refer to Oracle database documentation or refer to
// My Oracle Support (MOS) notes.
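As the output shows, the error is caused by an improper system configuration or database setting. The failure is not currently fatal to the instance, but it may lead to unexpected behavior during query execution, so it is best to resolve it.
Reference (official support document): https://support.oracle.com/epmos/faces/DocumentDisplay?id=2718971
First, check the permissions of the oradism file. This is the key point: on my database, the permissions on this file were wrong.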
$ cd $ORACLE_HOME/bin
$ ls -lrt oradism
-rwsr-x--- 1 root oinstall 147848 Apr 17 2019 oradism
The file should have the setuid (+s) bit set. Oddly, though, changing the permissions had no effect at first.
chown root $ORACLE_HOME/bin/oradism
chmod 4750 $ORACLE_HOME/bin/oradism
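Because the chmod above appeared not to stick, it helps to verify the result explicitly rather than read it off the ls output. The following is a minimal sketch, assuming GNU stat as shipped on Linux. The function name check_setuid is hypothetical, and it reports only the setuid bit and the octal mode, so root ownership still has to be confirmed with ls -l.

```shell
#!/bin/bash
# Hypothetical helper: report whether the file in $1 carries the setuid bit,
# together with its octal mode.
check_setuid() {
    local f="$1" mode
    if [ ! -e "$f" ]; then
        echo "missing"
        return 2
    fi
    mode=$(stat -c '%a' "$f")   # GNU stat: octal permissions, e.g. 4750
    if [ -u "$f" ]; then        # -u is true when the setuid bit is set
        echo "setuid mode=$mode"
    else
        echo "no-setuid mode=$mode"
        return 1
    fi
}
```

For example, check_setuid "$ORACLE_HOME/bin/oradism" should print a line starting with "setuid" once the chown and chmod have taken effect.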
Next, check which processes the database gives the highest priority: VKTM or LMS*
SQL> set linesize 680
SQL> col Parameter for a30
SQL> col "Session Value" for a16
SQL> col "Instance Value" for a16
SQL> col "Description" for a30
SQL> select a.ksppinm "Parameter", b.ksppstvl "Session Value", c.ksppstvl "Instance Value", a.KSPPDESC "Description"
2 from x$ksppi a, x$ksppcv b, x$ksppsv c
3 where a.indx = b.indx and a.indx = c.indx and a.ksppinm like '_%' and a.ksppinm like '_highest_priority_process%';
Parameter Session Value Instance Value Description
------------------------------ ---------------- ---------------- ------------------------------
_highest_priority_processes VKTM VKTM Highest Priority Process Name
Mask
As shown above, the parameter is set correctly. If it were not, the priority would have to be set to VKTM:
alter system set "_high_priority_processes"='VKTM' scope=spfile;
Next, check the cgroup configuration:
$ ps -eaf|grep -i vktm |grep -v grep
oracle 1900 1 0 13:53 ? 00:00:00 ora_vktm_gsp
$ cat /proc/1900/cgroup | grep cpu
6:cpu,cpuacct:/user.slice
2:cpuset:/
$ ps -eaf|grep -i pmon|grep -v grep
oracle 1888 1 0 13:53 ? 00:00:00 ora_pmon_gsp
$ cat /proc/1888/cgroup | grep cpu
6:cpu,cpuacct:/user.slice
2:cpuset:/
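The manual ps and cat steps above can be condensed into a small filter that pulls the cpu controller path out of a /proc/PID/cgroup listing. This is a sketch under the assumption that the listing uses the cgroup v1 "id:controllers:path" layout shown above. The function name cpu_cgroup_path is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read /proc/<pid>/cgroup-style lines on stdin and print
# the path of the hierarchy that includes the "cpu" controller.
cpu_cgroup_path() {
    awk -F: '
        {
            # $2 is a comma-separated controller list such as "cpu,cpuacct"
            n = split($2, ctrl, ",")
            for (i = 1; i <= n; i++)
                if (ctrl[i] == "cpu") { print $3; exit }
        }'
}
```

For the VKTM process above this would be run as cpu_cgroup_path < /proc/1900/cgroup. A result other than / means the cpu.rt_runtime_us of that slice applies to the process.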
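The check shows that the processes sit under a path other than the root (/user.slice here), so check the value of cpu.rt_runtime_us, as shown below: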
# cat /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
0
# cat /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us
0
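According to the official document, the values should be 0 (system.slice) and 950000 (user.slice). They can be changed with the commands below, but the setting is lost after the system reboots.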
echo 0 > /sys/fs/cgroup/cpu,cpuacct/system.slice/cpu.rt_runtime_us
echo 950000 > /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us
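The two echo commands can also be wrapped so that the value is written only when it differs and the outcome is reported. This is a minimal sketch, not part of the official procedure. The function name ensure_rt_runtime is hypothetical, and the cpu controller root is passed as an argument so the function can be tried against a scratch directory first.

```shell
#!/bin/bash
# Hypothetical helper: make sure <cpu_root>/<slice>/cpu.rt_runtime_us holds
# the wanted value, writing it only when the current value differs.
# $1 = cpu controller root (for example /sys/fs/cgroup/cpu,cpuacct)
# $2 = slice name (for example user.slice)
# $3 = wanted value in microseconds
ensure_rt_runtime() {
    local file="$1/$2/cpu.rt_runtime_us" want="$3" current
    if [ ! -f "$file" ]; then
        echo "missing"
        return 2
    fi
    current=$(cat "$file")
    if [ "$current" = "$want" ]; then
        echo "unchanged"
    else
        # Writing under /sys/fs/cgroup requires root privileges.
        echo "$want" > "$file" || return 1
        echo "updated"
    fi
}
```

Run as root, ensure_rt_runtime /sys/fs/cgroup/cpu,cpuacct user.slice 950000 has the same effect as the second echo above. Like the echo commands, it does not survive a reboot.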
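Strangely enough, once these values were changed, the permission change on the oradism file took effect. After the database was restarted again, the error no longer appeared.
To make the setting permanent, it must be configured in the cgconfig.conf file. The procedure is simple, and the official document describes it in detail, as follows: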
Install libcgroup-tools* on the system. (You can find this package on OL7 latest repository)
# yum install libcgroup-tools
/etc/cgconfig.conf will be created automatically when you start cgconfig service
# systemctl start cgconfig
Edit /etc/cgconfig.conf with user.slice parameter below.
group user.slice {
cpu {
cpu.rt_period_us = 1000000;
cpu.rt_runtime_us = 950000;
}
}
Restart cgconfig service so the value will take effect.
# systemctl restart cgconfig
Enable cgconfig so it will take effect during reboot.
# systemctl enable cgconfig
Reboot the server and check the value if it is now persistent.
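Before rebooting, it may be worth checking that the edit to /etc/cgconfig.conf contains the intended value. The following is a rough sketch with a hypothetical function name, cgconfig_rt_runtime. It assumes the file is laid out as in the fragment above, with "group NAME {" on one line and spaces around the "=" sign, and it is not a full parser for the cgconfig.conf syntax.

```shell
#!/bin/bash
# Hypothetical helper: print the cpu.rt_runtime_us value configured for one
# group in a cgconfig-style file.
# $1 = path to the config file, $2 = group name (for example user.slice)
cgconfig_rt_runtime() {
    awk -v grp="$2" '
        $1 == "group" && $2 == grp { in_group = 1; next }
        $1 == "group"              { in_group = 0 }
        in_group && $1 == "cpu.rt_runtime_us" {
            val = $3             # fields are: name, "=", value followed by ";"
            sub(/;$/, "", val)   # drop the trailing semicolon
            print val
            exit
        }' "$1"
}
```

With the configuration above, cgconfig_rt_runtime /etc/cgconfig.conf user.slice would be expected to print 950000. After the reboot, the live value can be read back from /sys/fs/cgroup/cpu,cpuacct/user.slice/cpu.rt_runtime_us as shown earlier.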
From: https://www.cnblogs.com/lndt/p/18502332