latch cache buffers chains的解决步骤是什么

108次阅读

共计 4671 个字符，预计需要花费 12 分钟才能阅读完成。

今天就跟大家聊聊有关 latch cache buffers chains 的解决步骤是什么，可能很多人都不太了解，为了让大家更加了解，丸趣 TV 小编给大家总结了以下内容，希望大家根据这篇文章可以有所收获。

latch：cache buffers chains 解决步骤

问题产生原因：
某天查看 v$session_wait 时发现有很多 cache buffer chains，但是情况紧急，所以就只是杀了几个执行时间较长的 sql，然后就发现等待事件渐渐消失了。

找到为何引起此等待事件：

1. 先找到出问题时段的 ash
SQL create table mao_ash as select * from dba_hist_active_sess_history where SAMPLE_TIME between TO_TIMESTAMP (2013-12-27 10:00:00 , YYYY-MM-DD HH24:MI:SS) and TO_TIMESTAMP (2013-12-27 12:00:00 , YYYY-MM-DD HH24:MI:SS

2. Verify the issue time frame:

select /*+ parallel 8 */ instance_number,sample_id, sample_time, count(*) from mao_ash t
group by instance_number,sample_id, sample_time
order by 3;

INSTANCE_NUMBER SAMPLE_ID SAMPLE_TIME COUNT(*)
2 72736930 2013-12-27 11:14:48.374 1
1 72762620 2013-12-27 11:14:51.059 11 Begin–active session 突然变为双数，并且持续了一段时间
1 72762630 2013-12-27 11:15:01.161 11

1 72762970 2013-12-27 11:20:44.756 10
1 72762980 2013-12-27 11:20:54.856 11
1 72762990 2013-12-27 11:21:04.956 15
1 72763000 2013-12-27 11:21:15.056 16
……
1 72763940 2013-12-27 11:37:04.830 11
1 72763950 2013-12-27 11:37:14.930 11
1 72763960 2013-12-27 11:37:25.032 11
1 72763970 2013-12-27 11:37:35.142 12
1 72763980 2013-12-27 11:37:45.242 9 End—acive session 变为单数
1 72763990 2013-12-27 11:37:55.342 8
以上可以定位问题出现的时间段。

3. Verify the wait events:

select t.instance_number,
t.sample_id,
t.sample_time,
t.event,
t.session_state,
–t.r,
t.c
from (select t.*,
–row_number() over(partition by instance_number, sample_id order by c desc) r
rank() over(partition by instance_number, sample_id order by c desc) r
from (select /*+ parallel 8 */ t.*,
count(*) over(partition by instance_number, sample_id, event) c,
row_number() over(partition by instance_number, sample_id, event order by 1) r1
from mao_ash t) t
where r1 = 1) t
where r 3
order by sample_time, r;

INSTANCE_NUMBER SAMPLE_ID SAMPLE_TIME EVENT SESSION_STATE C
2 72736930 2013-12-27 11:14:48.374 ON CPU 1— 在这个时间点，有一个 sql 在 on cpu
1 72762620 2013-12-27 11:14:51.059 ON CPU 9— 在这个时间点，有九个 sql 在 on cpu
1 72762620 2013-12-27 11:14:51.059 library cache lock WAITING 1— 在这个时间点，有一个 library cache lock WAITING 等待事件
1 72762620 2013-12-27 11:14:51.059 cursor: pin S wait on X WAITING 1
……
1 72763100 2013-12-27 11:22:56.079 ON CPU 7
1 72763100 2013-12-27 11:22:56.079 library cache lock WAITING 4
……
1 72763290 2013-12-27 11:26:08.193 ON CPU 10
1 72763300 2013-12-27 11:26:18.291 ON CPU 12
2 72737620 2013-12-27 11:26:25.403 ON CPU 1
1 72763310 2013-12-27 11:26:28.391 ON CPU 11
……
1 72763720 2013-12-27 11:33:22.568 ON CPU 17
1 72763730 2013-12-27 11:33:32.689 ON CPU 18
1 72763740 2013-12-27 11:33:42.788 ON CPU 18
…..;.
备注：等待事件是 cache buffers chains，但这里是有 library cache lock 引起的，所以给我们的感觉是 cache buffer chains，这里并不能通过 p1，p2 来定位问题。

4. Find out the holders：
select t.instance_number,
t.sample_time,
t.sample_id,
t.session_id,
t.sql_id,
t.session_type,
t.event,
t.session_state,
–t.blocking_session,
–t.blocking_inst_id,
–t.blocking_session_status,
–t.lv,
–t.r,
t.c
from (select t.*,
row_number() over(partition by instance_number, sample_id order by c desc) r
–rank() over(partition by instance_number, sample_id order by c desc) r
from (select t.*,
count(*) over(partition by instance_number, sample_id, session_id) c,
row_number() over(partition by instance_number, sample_id, session_id order by 1) r1
from (select /*+ parallel 8 */
level lv, connect_by_isleaf isleaf, t.*
from mao_ash t
start with blocking_session is not null
connect by nocycle
prior blocking_session = session_id
and prior t.blocking_session_serial# =
session_serial#
and ((prior sample_time) – sample_time between
interval -3 second and interval 3 second)) t
where t.isleaf = 1) t
where r1 = 1) t
where r 3
order by sample_time, r;

INSTANCE_NUMBER SAMPLE_TIME SAMPLE_ID SESSION_ID SQL_ID SESSION_TYPE EVENT SESSION_STATE C
1 2013-12-27 11:09:47.982 72762320 2697 62h7yux977dmw FOREGROUND db file parallel read WAITING 1
1 2013-12-27 11:09:58.082 72762330 2697 62h7yux977dmw FOREGROUND gc cr multi block request WAITING 1
1 2013-12-27 11:10:08.183 72762340 2697 62h7yux977dmw FOREGROUND ON CPU 1
1 2013-12-27 11:10:18.282 72762350 2697 62h7yux977dmw FOREGROUND ON CPU 1
1 2013-12-27 11:10:28.382 72762360 2697 62h7yux977dmw FOREGROUND gc current block 2-way WAITING 1
1 2013-12-27 11:10:38.482 72762370 2697 62h7yux977dmw FOREGROUND ON CPU 1
……sid 为 2697 正在执行 62h7yux977dmw 的 sql，与此同时在 11:09:47 这个时间点，有一个 session 正在等待

1 2013-12-27 11:28:39.723 72763440 2720 dts1t1fjha4m2 FOREGROUND gc current block 2-way WAITING 1
1 2013-12-27 11:43:18.608 72764310 2753 BACKGROUND log file parallel write WAITING 1
备注：这个 sql 很强大，可以找出是罪魁祸首的那个 sql，这里就是 62h7yux977dmw 了，因为它引起的 session 等待最多。

5. Find out the which SQL cause the most CPU usage:

select sql_id,count(*)
from mao_ash t
where sample_time
to_timestamp(2013-12-27 11:30:40 , yyyy-mm-dd hh34:mi:ss)
and session_state = ON CPU
group by sql_id order by 2 desc;

SQL_ID COUNT(*)
58xvzzydq83f1 350
4fk8mz3jx2898 63
6zwy49juu8wxa 52
ayvngp9bb3dum 48
a3v2gkv5r4gj6 47
451xth7g96cx7 35

结果：
1. 调整 58xvzzydq83f1，让 sql 尽快执行完毕，而不是一直执行着，消耗着 cpu
2. 找出 62h7yux977dmw 的 sql_text，再做调整。其实这里 62h7yux977dmw 可能只执行了一次，可能由于 shared_pool 比较忙，所以很有可能在 v$sql 里找不到。

看完上述内容，你们对 latch cache buffers chains 的解决步骤是什么有进一步的了解吗？如果还想了解更多知识或者相关内容，请关注丸趣 TV 行业资讯频道，感谢大家的支持。

正文完