实例2上语句长时间等待某个块
故障现象:相同的语句,在实例1上运行正常,实例二上长时间的“buffer busy waits”在等待某个块。
故障现象:相同的语句,在实例1上运行正常,实例二上长时间的“buffer busy waits”在等待某个块。
由于业务逻辑的改变,需要增加分区,分区键想由VARCHAR2(1)改为VARCHAR2(2)。报了个:
ORA-14060: data type or length of a table partitioning column may not be changed
解决方法挺简单的,就是使用exchange partition。
总结一个去年由于Paging Space不足导致系统挂起的案例。
监控报警,数据库挂起,无法连接,检查主机告警时,发现大量Paging Space不足的告警。
检查数据库发现有不少的row cache lock:
SYS@DB1> select Inst_id, event, username, sql_hash_value, count(*) 2 from gv$session 3 where Wait_class <> 'Idle' 4 group by Inst_id, event, username, sql_hash_value 5 order by 5; INST_ID EVENT USERNAME SQL_HASH_VALUE COUNT(*) ------- --------------------------------------- --------------- --------------- ---------- 1 latch free USER 2266733057 1 1 direct path write USER 3348906585 1 1 log file parallel write 0 1 1 SQL*Net message to client SYS 2334880858 1 1 log file sync USER 0 1 1 latch free USER 3115510059 4 1 latch: cache buffers chains USER 2266733057 5 1 row cache lock USER 24588433 22 |
10G开始,引入mutexes机制用以代替library cache pin
关于mutexes的一些解释:
转自http://space.itpub.net/?uid-756652-action-viewspace-itemid-348176
Mutexes are new thing in 10.2 and they enable shared access to objects in somewhat similar manner than shared latches, that every successful get of particular mutex will increment its value and release will decrement. When the count is zero, no-one has the mutex and it is safe to get it in exclusive mode too. However they are more fine grained than kgl latches and provide better waiting mechanism as far as I understand.
Enqueue等待事件中,TYPE为TX,MODE为6的等待,应该是最为常见,解决也相对较为容易的一种。
出现这种类型的等待,一般都是通过找出堵住其它会话的进程,通知应用提交或回滚事物,或是强制杀掉进程来解决。
下面是一个的生产案例:
查看等待事件,发现有10几个enqueue
查看锁资源情况,发现SID为984的会话堵住了其它会话
但984会话update是A表,而被锁住的是B表。
在执行一个存储过程,报了个ORA 600错误:
ORA-00600: internal error code, arguments: [kcbgcur_6], [8], [], [], [], [], [], []
而且在其它29个库都执行成功了,只在DB27上报错,alter log内容:
Errors in file /app/oracle/admin/DB27/udump/DB27_ora_10140.trc:
ORA-00600: internal error code, arguments: [kcbgcur_6], [8], [], [], [], [], [], []
站内搜索