normal数据库关闭hang的问题

标签：22 normal process hang 数据库 Jun Mon shutdown 2015

今晚办公楼停电维护，需要提前关闭服务器，为防止异常关闭导致的各种问题，有个测试库，使用shutdown normal停库，结果就是很常见的hang住了。

操作顺序：

1. shutdown normal，然后关闭了当前sqlplus窗口。

从alert日志中看：

Mon Jun 22 16:50:22 2015

Shutting down instance (normal)

Stopping background process SMCO

Shutting down instance: further logons disabled

这里涉及到shutdown normal的原理，稍后引述。

2. 此时重新登录，sqlplus / as sysdba，执行startup或shutdown immediate命令都提示失败，

SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 22 17:03:06 2015

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected.
SQL> startup
ORA-01012: not logged on
SQL> shutdown immediate
ORA-24324: service handle not initialized
ORA-24323: value not allowed
ORA-01090: shutdown in progress - connection is not permitted
Mon Jun 22 16:50:24 2015
Stopping background process CJQ0
Stopping background process QMNC
Stopping background process MMNL
Stopping background process MMON
License high water mark = 125

ORA-01090提示说正在执行关闭操作，不允许其他连接的操作。3. 其实这涉及到normal关闭的原理，他需要等待所有已连接用户中断连接，换句话说，如果仍有连接到库的用户，shutdown的操作就一直等待。这是最完全的关闭方式，但同时是变数最大的，因为可能你不知其他用户什么时候中断。首先尝试查找出所有连接用户，用kill -9直接杀进程。可以使用ps -ef查找所有(LOCAL=NO)的进程，LOCAL=NO表示连接不是本地，而是远程。

ps -ef|grep ora|grep -v grep|grep -v ora_|grep LOCAL=NO|awk '{print $2}'，然后kill -9 进程号
或者
ps -ef|grep ora|grep -v grep|grep -v ora_|grep LOCAL=NO|awk '{print $2}'|xargs kill

从alert日志看：

Mon Jun 22 16:55:26 2015
Active process 27446 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27402 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27555 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11697 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 14942 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27559 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27513 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 26911 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 31993 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 30810 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27557 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11684 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11666 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27510 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11688 user 'oracle11g' program 'oracle@dcsopenNode1'
SHUTDOWN: waiting for logins to complete.
Mon Jun 22 17:01:29 2015
All dispatchers and shared servers shutdown

是提示了所有dispatcher和共享服务关闭，但sqlplus登录后仍是上面的提示。

4. 尝试关闭监听服务，lsnrctl stop。

问题依旧。

5. 重登陆执行shutdown abort，强制关闭。

从alert日志看：

USER (ospid: 28558): terminating the instance

Instance terminated by USER, pid = 28558

看样子是关闭了实例。

重新执行sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 22 17:43:25 2015

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

再次执行startup-shutdown normal，

SQL> startup
ORACLE instance started.

Total System Global Area 3290345472 bytes
Fixed Size                  2217832 bytes
Variable Size            2499807384 bytes
Database Buffers          771751936 bytes
Redo Buffers               16568320 bytes
Database mounted.
Database opened.
SQL> shutdown normal
Database closed.
Database dismounted.
ORACLE instance shut down.

由于现在已经没有连接的用户了，正常启动，正常关闭了。

从alert日志看，

Mon Jun 22 17:46:01 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as /oracle/ora11gR2/product/11.2.0/dbhome_1/dbs/arch
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile /oracle/ora11gR2/product/11.2.0/dbhome_1/dbs/spfiledcsopen.ora
System parameters with non-default values:
  processes                = 150
  memory_target            = 3152M
  control_files            = "/oracle/ora11gR2/oradata/dcsopen/control01.ctl"
  control_files            = "/oracle/ora11gR2/oradata/dcsopen/control02.ctl"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=dcsopenXDB)"
  audit_file_dest          = "/oracle/ora11gR2/admin/dcsopen/adump"
  audit_trail              = "DB"
  db_name                  = "dcsopen"
  open_cursors             = 300
  diagnostic_dest          = "/oracle/ora11gR2"
Mon Jun 22 17:46:03 2015
PMON started with pid=2, OS id=30699 
Mon Jun 22 17:46:03 2015
VKTM started with pid=3, OS id=30701 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Mon Jun 22 17:46:03 2015
GEN0 started with pid=4, OS id=30705 
Mon Jun 22 17:46:03 2015
DIAG started with pid=5, OS id=30707 
Mon Jun 22 17:46:03 2015
DBRM started with pid=6, OS id=30709 
Mon Jun 22 17:46:03 2015
PSP0 started with pid=7, OS id=30711 
Mon Jun 22 17:46:03 2015
DIA0 started with pid=8, OS id=30713 
Mon Jun 22 17:46:03 2015
MMAN started with pid=9, OS id=30715 
Mon Jun 22 17:46:03 2015
DBW0 started with pid=10, OS id=30717 
Mon Jun 22 17:46:03 2015
LGWR started with pid=11, OS id=30721 
Mon Jun 22 17:46:03 2015
CKPT started with pid=12, OS id=30723 
Mon Jun 22 17:46:03 2015
SMON started with pid=13, OS id=30725 
Mon Jun 22 17:46:03 2015
RECO started with pid=14, OS id=30727 
Mon Jun 22 17:46:03 2015
MMON started with pid=15, OS id=30729 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Mon Jun 22 17:46:03 2015
MMNL started with pid=16, OS id=30731 
starting up 1 shared server(s) ...
ORACLE_BASE from environment = /oracle/ora11gR2
Mon Jun 22 17:46:04 2015
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 2809595100
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Mon Jun 22 17:46:08 2015
ALTER DATABASE OPEN
Thread 1 opened at log sequence 1279
  Current log# 1 seq# 1279 mem# 0: /oracle/ora11gR2/oradata/dcsopen/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Mon Jun 22 17:46:09 2015
QMNC started with pid=20, OS id=30789 
Completed: ALTER DATABASE OPEN
Starting background process CJQ0
Mon Jun 22 17:46:11 2015
CJQ0 started with pid=22, OS id=30806 
Mon Jun 22 17:46:18 2015
Shutting down instance (normal)
Shutting down instance: further logons disabled
Stopping background process QMNC
Stopping background process CJQ0
Stopping background process MMNL
Stopping background process MMON
License high water mark = 5
All dispatchers and shared servers shutdown
ALTER DATABASE CLOSE NORMAL
Mon Jun 22 17:46:22 2015
SMON: disabling tx recovery
SMON: disabling cache recovery
Mon Jun 22 17:46:22 2015
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Thread 1 closed at log sequence 1279
Successful close of redo thread 1
Completed: ALTER DATABASE CLOSE NORMAL
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due to shutdown: 1090
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Mon Jun 22 17:46:23 2015
Stopping background process VKTM: 
ARCH: Archival disabled due to shutdown: 1090
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Mon Jun 22 17:46:25 2015
Instance shutdown complete

借鉴《Concept》，一些知识点：

1. 如果用户试图访问一个正在关闭的数据库，会得到错误提示：ORA-01090: shutdown in progress - connection is not permitted.

2. 关闭数据库，必须用SYSOPER或SYSDBA的角色。

3. 关闭数据库，是有超时时间的，如果用户未中断连接，或者交易未完成，超过一小时，则shutdown命令会取消，提示错误：ORA-01013: user requested cancel of current operation.

4. 几种关闭库的参数，

shutdown normal：

默认的关闭参数，需要两个条件：

(1) 执行语句后，不允许新的连接。

(2) 数据库关闭之前，数据库会等待所有已连接用户中断连接。

下一次启动时不需要实例恢复。

shutdown immediate：

使用场景：

(1) 初始化一个自动，无人值守的备份。

(2) 马上就要断电。

(3) 数据库或应用工作不正常，你不能马上联系到用户退出登录或他们无法退出登录。

条件：

(1) 不允许新的连接，不允许新的交易。

(2) 任何未提交的事务会回滚(如果此时有个长交易，未提交，那么不会像这种关闭名称immediate那样迅速地关闭)。

(3) 不会等待已连接用户退出登录。数据库会隐式回滚活动事务，中断连接用户。

下一次启动时不需要实例恢复。

shutdown transactional：

适用于计划停机，允许活动交易处理完成后再停止实例的场景。

条件：

(1) 不允许新的连接，不允许新的交易。

(2) 所有交易完成后，会中断所有和库的连接。

(3) 在这个时间点，关闭实例就像执行了shutdown immediate。

下一次启动时不需要实例恢复。

transactional参数主要会防止用户丢失交易，同时不需要所有用户退出登录。

shutdown abort：

适用场景：

数据库或应用不能正常工作，并且没有其它类型的关闭操作正在进行。

(1) 需要立即关闭数据库(例如，一分钟后电源会被关闭)。

(2) 启动实例时碰到了问题。

条件：

(1) 不允许新的连接，不允许新的交易。

(2) 正在被Oracle处理的客户端SQL语句会被立即中断。

(3) 未提交事务不会回滚。

(4) Oracle不会等待正保持连接的客户端退出登录。数据库会隐式地中断所有连接。

下一次启动时需要进行实例恢复。

总结：

以上四种参数会适合于不同的场景，简单讲，shutdown normal是默认的关闭方式，最完整的关闭方式，缺点是需要被动等待所有交易完成，所有用户退出登录。shutdown immediate只要不存在较长的需要回滚的事务，其关闭时间会快。shutdown transactional会最大限度地保证交易的完成。前三种都不需要实例恢复。shutdown abort则是最暴力的关闭，关闭时间最快，但代价是启动需要实例恢复，因为关闭时存在未回滚未提交的事务。

标签：22,normal,process,hang,数据库,Jun,Mon,shutdown,2015
From： https://blog.51cto.com/u_13950417/6512496

normal数据库关闭hang的问题

相关文章

赞助商

阅读排行