千家信息网

oracle rac 单个实例不能生成awr报告的问题

发表于:2025-11-08 作者:千家信息网编辑
千家信息网最后更新 2025年11月08日,同事对rac集群生成性能报告时发现rac集群有一个实例没有生成awr快照,另一个实例快照正常。下面是具体处理步骤。1号实例没有生成awr快照SQL> select SNAP_ID,END_INTERV
千家信息网最后更新 2025年11月08日oracle rac 单个实例不能生成awr报告的问题

同事对rac集群生成性能报告时发现rac集群有一个实例没有生成awr快照,另一个实例快照正常。下面是具体处理步骤。
1号实例没有生成awr快照

SQL> select SNAP_ID,END_INTERVAL_TIME,instance_number from dba_hist_snapshot where instance_number=1;no rows selected

2号实快照正常

SQL> set long 200   SQL> set linesize 200SQL> select * from ( select SNAP_ID,END_INTERVAL_TIME,instance_number from dba_hist_snapshot where instance_number=2 order by SNAP_ID desc) where rownum < =10;   SNAP_ID END_INTERVAL_TIME                                                           INSTANCE_NUMBER---------- --------------------------------------------------------------------------- ---------------     24405 17-AUG-19 07.00.47.595 PM                                                                 2     24404 17-AUG-19 06.00.42.150 PM                                                                 2     24403 17-AUG-19 05.00.37.041 PM                                                                 2     24402 17-AUG-19 04.00.31.774 PM                                                                 2     24401 17-AUG-19 03.00.26.414 PM                                                                 2     24400 17-AUG-19 02.00.21.176 PM                                                                 2     24399 17-AUG-19 01.00.16.316 PM                                                                 2     24398 17-AUG-19 12.00.10.997 PM                                                                 2     24397 17-AUG-19 11.00.05.446 AM                                                                 2     24396 17-AUG-19 10.00.59.801 AM                                                                 210 rows selected.

mmon进程与awr快照相关,mmnl与ash相关,如是查看两个实例的mmon与mmnl进程
2号实例

[root@db2 ~]# ps -ef | grep mmonroot     128329 127956  0 18:11 pts/2    00:00:00 grep mmonoracle   201527      1  0  2018 ?        17:17:11 ora_mmon_RLZY2[root@db2 ~]# ps -ef | grep mmnlroot     131772 127956  0 18:17 pts/2    00:00:00 grep mmnloracle   201531      1  0  2018 ?        1-06:06:24 ora_mmnl_RLZY2

1号实例

[root@db1 ~]# ps -ef | grep mmonroot     239020 238963  0 18:52 pts/2    00:00:00 grep mmon[root@db1 ~]# ps -ef | grep mmnlroot     239052 238963  0 18:52 pts/2    00:00:00 grep mmnl

可以看到1号实例没有mmon与mmnl进程了。

如是查看1号实例的mmon进程的跟踪文件

[root@db1 trace]# ls -lrt *mmon*.trc-rw-r----- 1 oracle asmadmin 1351052 Jan 19  2018 RLZY1_mmon_20073.trc-rw-r----- 1 oracle asmadmin  173031 Jan 22  2018 RLZY1_mmon_49119.trc[root@db1 trace]# more RLZY1_mmon_49119.trcTrace file /u01/app/oracle/diag/rdbms/rlzy/RLZY1/trace/RLZY1_mmon_49119.trcOracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionWith the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,Data Mining and Real Application Testing optionsORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1System name:    LinuxNode name:      db1Release:        3.8.13-68.3.4.el6uek.x86_64Version:        #2 SMP Tue Jul 14 15:03:36 PDT 2015Machine:        x86_64Instance name: RLZY1Redo thread mounted by this instance: 1Oracle process number: 36Unix process pid: 49119, image: oracle@db1 (MMON)*** 2018-01-19 13:55:20.030*** SESSION ID:(1369.1) 2018-01-19 13:55:20.030*** CLIENT ID:() 2018-01-19 13:55:20.030*** SERVICE NAME:() 2018-01-19 13:55:20.030*** MODULE NAME:() 2018-01-19 13:55:20.030*** ACTION NAME:() 2018-01-19 13:55:20.030 minact-scn slave-status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000*** 2018-01-19 14:00:20.643minact-scn master-status: grec-scn:0x0e0e.55ad96ef gmin-scn:0x0e0e.55abf256 gcalc-scn:0x0e0e.55ac2a0a..........KEBM: MMON action policy violation. 'Block Cleanout Optim, Undo Segment Scan' viol=1; err=12751minact-scn master-status: grec-scn:0x0e0e.5f0ebf8c gmin-scn:0x0e0e.5f0eac2e gcalc-scn:0x0e0e.5f0ead91DDE rules only execution for: ORA 12751*** 2018-01-22 07:06:04.060----- START Event Driven Actions Dump -------- END Event Driven Actions Dump --------- START DDE Actions Dump -----Executing SYNC actionsExecuting ASYNC actions----- START DDE Action: 'ORA_12751_DUMP' (Sync) -----Runtime exceeded 300 secondsTime limit violation detected at:ksedsts()+465< -kspol_12751_dump()+145<-dbgdaExecuteAction()+1065<-dbgerRunAction()+109<-dbgerRunActions()+4134<-dbgexPhaseII()+1873<-dbgexProcessError()+2680<-dbgeExecuteForError()+88<-dbgePostErrorKGE()+2136<-dbkePostKGE_kgsf()+71<-kgeselv()+276<-ksesecl0()+162<-ksucin()+147<-kcbzwb()+2727<-kcbgtcr()+31325<-ktucloUsMinScn()+539<-ktucloUsegScan()+992<-ksb_run_managed_action()+384<-ksbcti()+2490<-ksbabs()+1735<-kebm_mmon_main()+209<-ksbrdp()+1045<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+250<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253Current Wait Stack: 0: waiting for 'gc buffer busy acquire' file#=0x5, block#=0x278, class#=0x49 wait_id=255378 seq_num=59358 snap_id=1 wait times: snap=5 min 5 sec, exc=5 min 5 sec, total=5 min 5 sec wait times: max=infinite, heur=5 min 5 sec wait counts: calls=358 os=358 in_wait=1 iflags=0x15a2 There is at least one session blocking this session. Dumping 1 direct blocker(s): inst: 1, sid: 990, ser: 1 Dumping final blocker: inst: 1, sid: 990, ser: 1 Wait State: fixed_waits=0 flags=0x22 boundary=(nil)/-1 Session Wait History: elapsed time of 0.000061 sec since current wait 0: waited for 'gc cr block 2-way' =0x5, =0x258, =0x47 wait_id=255377 seq_num=59357 snap_id=1 wait times: snap=0.000478 sec, exc=0.000478 sec, total=0.000478 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000122 sec of elapsed time 1: waited for 'gc cr block 2-way' =0x5, =0x228, =0x45 wait_id=255376 seq_num=59356 snap_id=1 wait times: snap=0.000741 sec, exc=0.000741 sec, total=0.000741 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000120 sec of elapsed time 2: waited for 'gc cr block 2-way' =0x5, =0x138, =0x43 wait_id=255375 seq_num=59355 snap_id=1 wait times: snap=0.000528 sec, exc=0.000528 sec, total=0.000528 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000111 sec of elapsed time 3: waited for 'gc cr block 2-way' =0x5, =0xb8, =0x41 wait_id=255374 seq_num=59354 snap_id=1 wait times: snap=0.000583 sec, exc=0.000583 sec, total=0.000583 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000139 sec of elapsed time 4: waited for 'gc cr block 2-way' =0x5, =0x110, =0x37 wait_id=255373 seq_num=59353 snap_id=1 wait times: snap=0.000541 sec, exc=0.000541 sec, total=0.000541 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000110 sec of elapsed time 5: waited for 'gc cr block 2-way' =0x5, =0x100, =0x35 wait_id=255372 seq_num=59352 snap_id=1 wait times: snap=0.000629 sec, exc=0.000629 sec, total=0.000629 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000158 sec of elapsed time 6: waited for 'gc cr block 2-way' =0x5, =0xf0, =0x33 wait_id=255371 seq_num=59351 snap_id=1 wait times: snap=0.000617 sec, exc=0.000617 sec, total=0.000617 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000128 sec of elapsed time 7: waited for 'gc cr block 2-way' =0x5, =0xe0, =0x31 wait_id=255370 seq_num=59350 snap_id=1 wait times: snap=0.000561 sec, exc=0.000561 sec, total=0.000561 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000124 sec of elapsed time 8: waited for 'gc cr block 2-way' =0x5, =0xd0, =0x2f wait_id=255369 seq_num=59349 snap_id=1 wait times: snap=0.000565 sec, exc=0.000565 sec, total=0.000565 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000128 sec of elapsed time 9: waited for 'gc cr block 2-way' =0x5, =0xc0, =0x2d wait_id=255368 seq_num=59348 snap_id=1 wait times: snap=0.000555 sec, exc=0.000555 sec, total=0.000555 sec wait times: max=infinite wait counts: calls=1 os=1 occurred after 0.000125 sec of elapsed time Sampled Session History of session 1369 serial 1 --------------------------------------------------- The sampled session history is constructed by sampling the target session every 1 second. The sampling process captures at each sample if the session is in a non-idle wait, an idle wait, or not in a wait. If the session is in a non-idle wait then one interval is shown for all the samples the session was in the same non-idle wait. If the session is in an idle wait or not in a wait for consecutive samples then one interval is shown for all the consecutive samples. Though we display these consecutive samples in a single interval the session may NOT be continuously idle or not in a wait (the sampling process does not know). The history is displayed in reverse chronological order. sample interval: 1 sec, max history 120 sec --------------------------------------------------- [121 samples, 07:04:03 - 07:06:03] waited for 'gc buffer busy acquire', seq_num: 59358 p1: 'file#'=0x5 p2: 'block#'=0x278 p3: 'class#'=0x49 time_waited: >= 120 sec (still in wait)---------------------------------------------------Sampled Session History Summary:  longest_non_idle_wait: 'gc buffer busy acquire'  [121 samples, 07:04:03 - 07:06:03]      time_waited: >= 120 sec (still in wait)-------------------------------------------------------- END DDE Action: 'ORA_12751_DUMP' (SUCCESS, 1 csec) ---------- END DDE Actions Dump (total 1 csec) -----KEBM: MMON action policy violation. 'Block Cleanout Optim, Undo Segment Scan' viol=1; err=12751minact-scn master-status: grec-scn:0x0e0e.5f0ec4ce gmin-scn:0x0e0e.5f0eac2e gcalc-scn:0x0e0e.5f0ead91*** 2018-01-22 07:11:11.071DDE rules only execution for: ORA 12751----- START Event Driven Actions Dump -------- END Event Driven Actions Dump ----

ORA12751的错误原因是陈旧的SYS对象统计数据会导致生成次优执行计划,从而使AWR自动刷新从操作的语句运行更长时间和超时。

解决方法就是收集新的SYS对象统计信息,为优化器提供更好的统计信息,并生成更高效的执行计划

SQL> EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;PL/SQL procedure successfully completed.SQL> EXEC DBMS_STATS.GATHER_SCHEMA_STATS ('SYS');PL/SQL procedure successfully completed.

下面就是重启mmon和mmnl进程

SQL> alter system enable restricted session;System altered.SQL> alter system disable restricted session;System altered.

查看alert日志可以看到mmon和mmnl进程已经重启了

Sat Aug 17 19:18:22 2019Starting background process MMONSat Aug 17 19:18:22 2019Starting background process MMNLMMON started with pid=399, OS id=10373 Sat Aug 17 19:18:22 2019MMNL started with pid=405, OS id=10375 ALTER SYSTEM enable restricted session;Sat Aug 17 19:18:25 2019Some DDE async actions failed or were cancelledSat Aug 17 19:18:25 2019Sweep [inc][48021]: completedSweep [inc][48011]: completedSweep [inc][48002]: completedSweep [inc][35010]: completedSweep [inc][34706]: completedSweep [inc][34242]: completedSweep [inc][33546]: completedSweep [inc][33394]: completedSweep [inc2][48021]: completedSweep [inc2][48011]: completedSweep [inc2][48002]: completedSweep [inc2][35010]: completedSweep [inc2][34706]: completedSweep [inc2][34242]: completedSweep [inc2][33546]: completedSweep [inc2][33394]: completedminact-scn: Inst 1 is a slave inc#:30 mmon proc-id:10373 status:0x2minact-scn status: grec-scn:0x0e0e.61cb2e9c gmin-scn:0x0e0e.5f0eac2e gcalc-scn:0x0e0e.5f0ead91Sat Aug 17 19:18:29 2019db_recovery_file_dest_size of 10240 MB is 20.87% used. This is auser-specified limit on the amount of space that will be used by thisdatabase for recovery-related files, and does not reflect the amount ofspace available in the underlying filesystem or ASM diskgroup.Sat Aug 17 19:18:42 2019ALTER SYSTEM disable restricted session;

再查看1号实例的mmon与mmnl进程状态

[root@db1 ~]# ps -ef | grep mmnloracle    10375      1  0 19:18 ?        00:00:00 ora_mmnl_RLZY1root      10611 238963  0 19:18 pts/2    00:00:00 grep mmnl[root@db1 ~]# ps -ef | grep mmonoracle    10373      1  7 19:18 ?        00:00:02 ora_mmon_RLZY1root      10630 238963  0 19:18 pts/2    00:00:00 grep mmon
过了两个小时去查看1号实例已经生成了两条快照信息
SQL> set long 200SQL> set linesize 200SQL> select * from ( select SNAP_ID,END_INTERVAL_TIME,instance_number from dba_hist_snapshot where instance_number=1 order by SNAP_ID desc) where rownum < =10;   SNAP_ID END_INTERVAL_TIME                                                           INSTANCE_NUMBER---------- --------------------------------------------------------------------------- ---------------     24407 17-AUG-19 09.00.58.595 PM                                                                 1     24406 17-AUG-19 08.00.40.244 PM                                                                 1

到此问题解决了。


0