The customer in this case is a provincial telecom operator. A large number of ORA-00600 [4187] errors in the alert log were disrupting normal business operation.
Fri Nov 19 16:07:09 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_smon_5811.trc (incident=184182):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 54671, block 287204 to scn 17162371413499
Recovery of Online Redo Log: Thread 1 Group 5 Seq 54671 Reading mem 0
  Mem# 0: +DATA/lcfa/onlinelog/group_5.287.904064243
Block recovery stopped at EOT rba 54671.287388.16
Block recovery completed at rba 54671.287388.16, scn 3995.3977065979
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Nov 19 16:07:10 2021
Sweep [inc][184182]: completed
Fri Nov 19 16:07:10 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_ora_1734.trc (incident=190317):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Nov 19 16:09:04 2021
Block recovery from logseq 54671, block 287204 to scn 17162371413499
Recovery of Online Redo Log: Thread 1 Group 5 Seq 54671 Reading mem 0
  Mem# 0: +DATA/lcfa/onlinelog/group_5.287.904064243
Block recovery completed at rba 54671.287388.16, scn 3995.3977065982
Fri Nov 19 16:10:30 2021
Errors in file /u01/oracle/app/oracle/diag/rdbms/lcfa/LCFA1/trace/LCFA1_ora_6392.trc (incident=184485):
ORA-00600: internal error code, arguments: [4187], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
The log shows that each ORA-00600 [4187] is accompanied by block recovery. ORA-00600 [4xxx] errors usually originate in the undo layer and trigger block recovery (BRR). Here SMON has hit one non-fatal internal error; once it hits 100, the instance will be restarted. That limit is controlled by the hidden parameter _smon_internal_errlimit.
SQL> @sp smon_inter
-- show parameter by sp
-- show hidden parameter by sp
old 3: where x.indx=y.indx and ksppinm like '_%&p%'
new 3: where x.indx=y.indx and ksppinm like '_%smon_inter%'

NAME                      VALUE  DESC
------------------------- ------ ------------------------------
_smon_internal_errlimit   100    limit of SMON internal errors

ORA-00600 [4187] is clearly explained in MOS Doc ID 19700135.8:

Description
ORA-600 [4187] can occur for undo segments where wrap# is close to the max value of 0xffffffff (KSQNMAXVAL). This normally affects databases with high transaction rate that have existed for a relatively long time.
In an environment that has sustained a high TPS for a long time, wrap# is incremented every time a new transaction is bound to a slot in an undo segment. If the incremented wrap# would exceed the maximum, KSQNMAXVAL (0xffffffff), ORA-00600 [4187] is raised.
Reading further into the trace file, we find the dump of the problematic undo segment header:
TRN CTL:: seq: 0xd14f chd: 0x0009 ctl: 0x0004 inc: 0x00000000 nfb: 0x0000
          mgc: 0xb000 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
          uba: 0x00c06434.d14f.2c scn: 0x0f9b.ed0bc826
Version: 0x01
FREE BLOCK POOL::
  uba: 0x00000000.d14f.2b ext: 0x2  spc: 0x1dc
  uba: 0x00000000.d14f.2a ext: 0x2  spc: 0x70e
  uba: 0x00000000.d14b.02 ext: 0x1e spc: 0x1f02
  uba: 0x00000000.ce3f.02 ext: 0x12 spc: 0x14da
  uba: 0x00000000.3226.02 ext: 0x32 spc: 0x14ae
TRN TBL::
index  state  cflags  wrap#       uel     scn              dba         parent-xid           nub         stmt_num    cmt
------------------------------------------------------------------------------------------------------------------------------
0x00   9      0x00    0xfffffa0c  0x000c  0x0f9b.ed0bc8f1  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x01   9      0x00    0xfffff4ab  0x0016  0x0f9b.ed0bc84e  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x02   9      0x00    0xfffff3aa  0x0008  0x0f9b.ed0bc934  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x03   9      0x00    0xfffff8d9  0x000e  0x0f9b.ed0bc985  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x04   9      0x00    0xfffffce8  0xffff  0x0f9b.ed0bc9e7  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x05   9      0x00    0xfffff627  0x001a  0x0f9b.ed0bc833  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x06   9      0x00    0xfffff4e6  0x0004  0x0f9b.ed0bc9cb  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x07   9      0x00    0xffffece5  0x000b  0x0f9b.ed0bc85c  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x08   9      0x00    0xfffff724  0x0021  0x0f9b.ed0bc93a  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x09   9      0x00    0xfffffff3  0x0015  0x0f9b.ed0bc828  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x0a   9      0x00    0xfffffaf2  0x0018  0x0f9b.ed0bc90c  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x0b   9      0x00    0xfffff671  0x0010  0x0f9b.ed0bc867  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x0c   9      0x00    0xfffffec0  0x001e  0x0f9b.ed0bc900  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x0d   9      0x00    0xfffff8bf  0x0020  0x0f9b.ed0bc889  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x0e   9      0x00    0xfffff4ce  0x0013  0x0f9b.ed0bc9ab  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x0f   9      0x00    0xfffff64d  0x000d  0x0f9b.ed0bc875  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x10   9      0x00    0xfffff5ec  0x000f  0x0f9b.ed0bc86b  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x11   9      0x00    0xfffffccb  0x001c  0x0f9b.ed0bc950  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x12   9      0x00    0xfffff55a  0x001f  0x0f9b.ed0bc976  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x13   9      0x00    0xfffff659  0x0014  0x0f9b.ed0bc9b1  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x14   9      0x00    0xffffefb8  0x0006  0x0f9b.ed0bc9c2  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x15   9      0x00    0xffffed27  0x0005  0x0f9b.ed0bc82e  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x16   9      0x00    0xfffffd66  0x0007  0x0f9b.ed0bc854  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x17   9      0x00    0xfffffdd5  0x0000  0x0f9b.ed0bc8e6  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x18   9      0x00    0xfffff1f4  0x001d  0x0f9b.ed0bc917  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x19   9      0x00    0xfffff303  0x0002  0x0f9b.ed0bc927  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x1a   9      0x00    0xfffff592  0x0001  0x0f9b.ed0bc83b  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305927
0x1b   9      0x00    0xfffff9f1  0x0017  0x0f9b.ed0bc8df  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x1c   9      0x00    0xffffeee0  0x0012  0x0f9b.ed0bc95b  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x1d   9      0x00    0xfffff23f  0x0019  0x0f9b.ed0bc91e  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x1e   9      0x00    0xfffff67e  0x000a  0x0f9b.ed0bc908  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x1f   9      0x00    0xfffff1ad  0x0003  0x0f9b.ed0bc982  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x20   9      0x00    0xfffffb0c  0x001b  0x0f9b.ed0bc8ba  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
0x21   9      0x00    0xfffff1eb  0x0011  0x0f9b.ed0bc943  0x00c06434  0x0000.000.00000000  0x00000001  0x00000000  1637305928
The dump of the problematic undo segment header shows that wrap# is very high in every slot. The chd field in the transaction control section (ktuxc) is 0x0009, meaning that slot 9 will be used by the next transaction, and the wrap# of slot 9 is 0xfffffff3, which is very close to KSQNMAXVAL. But if each reuse of a slot only raised wrap# by 1, the result would still be below KSQNMAXVAL. So why is ORA-00600 [4187] reported?
The reason is that the simple wrap# + 1 scheme for slot reuse is outdated. Today, when the ktubnd function binds a transaction to an undo segment, it calls kjqghd to compute an increment (delta) for the reused slot. The delta is bounded: it must be less than 16 (defined by KTU_MAX_KSQN_DELTA). Even so, 0xfffffff3 + delta can exceed KSQNMAXVAL.
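The arithmetic for slot 9 can be checked with a small sketch. It is only an illustration: the constants are the values quoted in the text, and the assumptions that delta ranges from 1 to 15 and that the error fires when wrap# + delta is strictly greater than KSQNMAXVAL are mine, not taken from Oracle's code.

```python
KSQNMAXVAL = 0xFFFFFFFF      # maximum wrap# value quoted in the MOS note
KTU_MAX_KSQN_DELTA = 16      # per the text, delta must be less than this


def overflowing_deltas(wrap):
    """Return the deltas (assumed range 1..15) for which wrap + delta exceeds KSQNMAXVAL."""
    return [d for d in range(1, KTU_MAX_KSQN_DELTA) if wrap + d > KSQNMAXVAL]


slot9_wrap = 0xFFFFFFF3                  # wrap# of slot 0x09 in the dump
headroom = KSQNMAXVAL - slot9_wrap       # how far wrap# can still grow

print(headroom)                          # 12
print(overflowing_deltas(slot9_wrap))    # [13, 14, 15]
print(overflowing_deltas(0xFFFFFA0C))    # [] (slot 0x00 still has room)
```

Under these assumptions slot 9 has a headroom of only 12, so any delta of 13 or more pushes it past the maximum, while a plain +1 increment would not.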
Once the cause is known, the fix is simple: drop the affected undo segment or rebuild the undo tablespace. If the undo segment cannot be dropped, for example because it still has active transactions, you can use _corrupted_rollback_segments to mask it. MOS also provides a script to check which undo segments are at risk:
select b.segment_name,
       b.tablespace_name,
       a.ktuxeusn "Undo Segment Number",
       a.ktuxeslt "Slot",
       a.ktuxesqn "Wrap#"
  from x$ktuxe a, dba_rollback_segs b
 where a.ktuxesqn > -429496730
   and a.ktuxesqn < 0
   and a.ktuxeusn = b.segment_id;
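The negative bounds in the query suggest that ktuxesqn is exposed as a signed 32-bit number, so wrap# values near 0xffffffff show up as small negative numbers. Assuming that interpretation (it is inferred from the predicate, not stated in the text), the following sketch mirrors the filter; the helper names are hypothetical.

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def flagged_by_query(wrap):
    """Mirror the predicate: ktuxesqn > -429496730 and ktuxesqn < 0."""
    return -429496730 < to_signed32(wrap) < 0


print(to_signed32(0xFFFFFFF3))        # -13
print(flagged_by_query(0xFFFFFFF3))   # True  (slot 0x09 in the dump)
print(flagged_by_query(0xFFFFECE5))   # True  (lowest wrap# in the dump)
print(flagged_by_query(0x7FFFFFFE))   # False (positive when read as signed)
```

Since 429496730 is about one tenth of 2^32, the query under this reading reports slots whose wrap# lies in roughly the top 10% of the 32-bit range.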
One more question is worth asking: why is wrap# so large? Is high TPS the only reason? When binding transactions to undo segments, Oracle tries to spread active transactions as evenly as possible across the undo segments. The algorithm is as follows:
1. Search the online undo segments in the current undo tablespace for one whose transaction table has no active transactions.
2. If none is found, look for an offline undo segment in the current undo tablespace and bring it online.
3. If none is found, try to create a new undo segment in the current undo tablespace and bring it online.
4. If one cannot be created, fall back to the least recently used undo segment.
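The binding steps above can be sketched as a toy model. Everything here is hypothetical: the UndoSegment class, the logical clock, the NEW_n naming and the can_create flag are illustration devices, and the final fallback is modeled as least recently used, which is an assumption of this sketch rather than a statement about Oracle internals.

```python
from dataclasses import dataclass


@dataclass
class UndoSegment:
    name: str
    online: bool
    active_txns: int = 0
    last_used: int = 0   # hypothetical logical clock of the last binding


def _bind(seg, clock):
    """Record a new active transaction on the chosen segment."""
    seg.active_txns += 1
    seg.last_used = clock
    return seg


def bind_undo_segment(segments, can_create, clock):
    # Step 1: an online segment with no active transactions.
    for seg in segments:
        if seg.online and seg.active_txns == 0:
            return _bind(seg, clock)
    # Step 2: bring an offline segment online.
    for seg in segments:
        if not seg.online:
            seg.online = True
            return _bind(seg, clock)
    # Step 3: create a new segment if the tablespace has room (can_create).
    if can_create:
        seg = UndoSegment(name=f"NEW_{len(segments) + 1}", online=True)
        segments.append(seg)
        return _bind(seg, clock)
    # Step 4 (assumption): reuse the least recently used segment.
    return _bind(min(segments, key=lambda s: s.last_used), clock)


# Two segments, no room to create more: bindings alternate between them.
segs = [UndoSegment("U1", online=True), UndoSegment("U2", online=False)]
order = [bind_undo_segment(segs, can_create=False, clock=t).name
         for t in range(1, 5)]
print(order)   # ['U1', 'U2', 'U1', 'U2']
```

In this model, when no new segment can be created, every further transaction lands on one of the few existing segments, so their slots are reused, and their wrap# values grow, much faster than they would with more segments available.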
The most likely cause is that too few undo segments could be brought online. A check of the instance showed that the undo tablespace is only 1.5 GB and cannot autoextend, so transactions keep reusing the same few transaction tables, which drives up the wrap# of every slot.
Therefore, the additional recommendation for this case is to size the undo tablespace appropriately and to set _rollback_segment_count to a reasonable value.
Original article on Motianlun (modb.pro): https://www.modb.pro/db/17494...
About the author
Li Xiangyu is a delivery technical consultant in the West District of Yunhe Enmo. He has long served customers in the mobile operator industry and is experienced in Oracle performance tuning, fault diagnosis, and special recovery.