A 10.2.0.1 database running on SPARC Solaris logged an ORA-07445: [_memcmp()+88] [SIGSEGV] internal error in its alert log. The details are as follows:
Errors in file /global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc:
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] []
/global/oracle1/centDB/admin/centDB/udump/centdb_ora_8749.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /global/oracle1/ORAHOME1/product/10.2/db_1
System name: SunOS
Node name: ora03ud-us
Release: 5.10
Version: Generic_142900-13
Machine: sun4u
Instance name: centDB
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 8749, image: oraclecentDB@ora03ud-us
*** SERVICE NAME:(SYS$USERS) 2011-03-08 04:54:18.528
*** SESSION ID:(1226.58882) 2011-03-08 04:54:18.528
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x10, PC: [0xffffffff7d600ca4, _memcmp()+88]
*** 2011-03-08 04:54:18.533
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [_memcmp()+88] [SIGSEGV] [Address not mapped to object] [0x000000010] [] []
----- Call Stack Trace -----
ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler
<- memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon
<- opiodr <- ttcpip <- opitsk <- opiino <- opiodr
<- opidrv <- sou2o <- opimai_real <- main <- start
(session) sid: 1226 trans: 0, creator: 3bd522c00, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
txn branch: 0
oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
O/S info: user: root, term: unknown, ospid: , machine: 7cta-031-eqism
program: JDBC Thin Client
application name: JDBC Thin Client, hash value=0
last wait for 'SQL*Net message from client' blocking sess=0x0 seq=2 wait_time=5705 seconds since wait started=0
driver id=74637000, #bytes=1, =0
Dumping Session Wait History
for 'SQL*Net message from client' count=1 wait_time=5705
driver id=74637000, #bytes=1, =0
for 'SQL*Net message to client' count=1 wait_time=5
driver id=74637000, #bytes=1, =0
temporary object counter: 0
----------------------------------------
Virtual Thread:
kgskvt: 3a6bc1be8, sess: 3bcdc71e8, vc: 0, proc: 0
consumer group cur: (upd? 0), mapped: , orig:
vt_state: 0x0, vt_flags: 0x20, blkrun: 0
is_assigned: 0, in_sched: 0 (0)
vt_active: 0 (pending: 0)
used quanta: 0 (cg: 0)
cpu start time: 0, quantum status: 0x0
quantum checks to skip: 0, check thresh: 0
idle time: 0, active time: 0 (cg: 0)
cpu yields: 0 (cg: 0), waits: 0 (cg: 0), wait time: 0 (cg: 0)
queued time outs: 0, time: 0 (cur 0, cg 0)
calls aborted: 0, num est exec limit hit: 0
undo current: 0k max: 0k
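The ORA-07445 internal error above did not cause the instance to crash. Its most recent call stack is: memcmp kpzgkvl kziaia kpolnb kpolon opiodr ttcpip opitsk opiino opiodr opidrv sou2o opimai_real main start. A Metalink search shows that this matches the call stack of Bug 5292883. The Bug Note is as follows: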
Bug 5292883 Dump from OCI client using OCI7 olog() call
Affects:
Product (Component) Oracle Server (Rdbms)
Range of versions believed to be affected Versions < 11
Versions confirmed as being affected
10.2.0.3
Platforms affected Generic (all / most platforms affected)
Fixed:
This issue is fixed in
10.2.0.2 Patch 10 on Windows Platforms
10.2.0.3 Patch 7 on Windows Platforms
10.2.0.4 (Server Patch Set)
11.1.0.6 (Base Release)
Symptoms:
Related To:
Process May Dump (ORA-7445) / Abend / Abort
Dump in or under kpzgkvl / kziaia
OCI
Description
On a 64 bit machines a dump can occur with the following stack
if the client uses the olog() OCI call to connect.
memcmp()<-kpzgkvl()<-kziaia()<--kpolnb()<-kpolon()
Workaround
Use OCIServerAttach and OCISessionBegin instead of olog()
Hdr: 5292883 10.2.0.2.0 RDBMS 10.2.0.2.0 SECURITY PRODID-5 PORTID-197 ORA-7445
Abstract: ORA-7445: EXCEPTION ENCOUNTERED: CORE DUMP [_MEMCMP()+160] WHEN CONNECTING TO I
PROBLEM:
--------
1. Clear description of the problem encountered
OCI client connection fails with ORA-7445 [_memcmp()+160]. At the
time of error occurrence no connection to database could be established
by OCI clients. Only sqlplus sessions were able to connect. The alertlog
confirms a lot of ORA-7445 which were logged. Client version is 9.2.0.6.
ALERT LOG
---------
Tue Jun 6 18:59:14 2006
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Tue Jun 6 19:02:32 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV]
[Address not mapped to object] [0x2000000000] [] []
Tue Jun 6 19:02:32 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_16756.trc:
ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not
readable
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV]
[Address not mapped to object] [0x2000000000] [] []
Tue Jun 6 19:04:42 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV]
[Address not mapped to object] [0x2A00000000] [] []
Tue Jun 6 19:04:42 2006
Errors in file /u01/app/oracle/admin/XAL/udump/xal1_ora_20524.trc:
ORA-81: address range [0x60000000000A7D70, 0x60000000000A7D74) is not
readable
ORA-7445: exception encountered: core dump [_memcmp()+160] [SIGSEGV]
[Address not mapped to object] [0x2A00000000] [] []
...
2. Pertinent configuration information (MTS/OPS/distributed/etc)
3 instance RAC database.
Errors were logged for 2 of these instances.
3. Indication of the frequency and predictability of the problem
Intermittent occurrence.
Not reproducible at will.
4. Technical impact on the customer. Include persistent after effects.
No connections possible from ERP application which fails with ORA-7445.
More than 500 ERP users affected.
STACK TRACE:
------------
_memcmp()+160 call
kpzgkvl()+192 call _memcmp() 2000000002 ?
4000000001229FF0 ?
000000011 ?
kziaia()+480 call kpzgkvl() 9FFFFFFFFFFFADB0 ?
9FFFFFFFFFFFAE18 ?
4000000001229FF0 ?
000000011 ? 000000000 ?
9FFFFFFFFFFF6ED8 ?
9FFFFFFFFFFF6EE0 ?
9FFFFFFFFFFF6ED0 ?
kpolnb()+1344 call kziaia() 9FFFFFFFFFFF8040 ?
9FFFFFFFFFFF6EE0 ?
9FFFFFFFFFFF6ED8 ?
9FFFFFFFFFFF81E0 ?
9FFFFFFFFFFF81D8 ?
9FFFFFFFFFFF81E8 ?
000000000 ?
400000000233B440 ?
kpolon()+336 call kpolnb() 9FFFFFFFFFFF8030 ?
4000000003F47E10 ?
9FFFFFFFFFFF6F80 ?
600000000009DB00 ?
00000820D ?
opiodr()+2064 call kpolon() 000000051 ?
60000000000219A8 ?
9FFFFFFFFFFF81F0 ?
9FFFFFFFFFFF81B0 ?
60000000000AAA50 ?
40000000030BE570 ?
ttcpip()+1824 call opiodr() 60000000000AA3B0 ?
6000000000015DD0 ?
9FFFFFFFFFFFA9A0 ?
6000000000015DD0 ?
9FFFFFFFFFFF82D0 ?
600000000009DB00 ?
00000001A ?
6000000000021838 ?
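The stack comparison used to identify Bug 5292883 can also be done mechanically. The following is a minimal Python sketch, not an Oracle tool: it assumes the call stack is available as `<-`-separated text, as in the trace file and the bug description, and the names `parse_call_stack`, `contains_signature` and `BUG_5292883_SIGNATURE` are hypothetical helpers introduced here for illustration.

```python
import re

# Frames listed in the Bug 5292883 description, innermost frame first.
BUG_5292883_SIGNATURE = ["memcmp", "kpzgkvl", "kziaia", "kpolnb", "kpolon"]


def parse_call_stack(stack_text):
    """Split a '<-'-separated call stack into function names, innermost first."""
    frames = []
    # The bug note mixes '<-' and '<--', so accept one or more dashes.
    for token in re.split(r"<-+", stack_text):
        name = token.strip().lstrip("_")
        # Drop a trailing "()" and an optional "+offset", e.g. "_memcmp()+88".
        name = re.sub(r"\(\)(\+\d+)?$", "", name)
        if name:
            frames.append(name)
    return frames


def contains_signature(frames, signature):
    """Return True if signature appears in frames as a contiguous run."""
    n = len(signature)
    return any(frames[i:i + n] == signature for i in range(len(frames) - n + 1))


# Call stack copied from the trace file in this post.
trace_stack = """ksedmp <- ssexhd <- sighndlr <- call_user_handler <- sigacthandler
<- memcmp <- kpzgkvl <- kziaia <- kpolnb <- kpolon
<- opiodr <- ttcpip <- opitsk <- opiino <- opiodr
<- opidrv <- sou2o <- opimai_real <- main <- start"""

print(contains_signature(parse_call_stack(trace_stack), BUG_5292883_SIGNATURE))  # True
```

A match on function names alone is only a hint. The affected versions and the symptoms in the bug note still need to be checked by hand, as was done here.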
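No one-off patch for Bug 5292883 is available on 10.2.0.1; the bug is fixed in 11g and in the 10.2.0.4 patch set. If patching is not possible, the problem can generally be worked around in one of two ways:
1) Limit the length of usernames and passwords to within 9 characters
2) If using OCI, log in with the OCIServerAttach and OCISessionBegin functions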
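For the first workaround, a small pre-check on the application side can flag credentials that exceed the limit before they are used to connect. This is only a sketch under stated assumptions: "within 9 characters" is read here as at most 9, the length is counted in characters rather than bytes, and `MAX_LEN` and `over_limit` are hypothetical names, not part of any Oracle API.

```python
# Assumption: "within 9 characters" means a length of at most 9.
MAX_LEN = 9


def over_limit(credentials, max_len=MAX_LEN):
    """Return the names of the fields whose value is longer than max_len."""
    return [field for field, value in credentials.items() if len(value) > max_len]


# Hypothetical credentials used only to show the check.
too_long = over_limit({"username": "scott", "password": "averylongpassword"})
print(too_long)  # ['password']
```

If the stricter reading (fewer than 9 characters) is the correct one, passing `max_len=8` gives that behavior.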