Goldengate Extract Abended – Recovery Record Is Missing [ID 1094767.1]

Modified 29-SEP-2011 Type PROBLEM Status PUBLISHED

Applies to:
Oracle GoldenGate – Version: 10.4.0.19 to 11.1.1.1.0 – Release: 10.4.0 to 11.1.1
Information in this document applies to any platform.
Symptoms
Extract abended with:
2010-05-04 15:42:23 GGS ERROR 190 Recovery record is missing from log with seqno 14496 when extract has reached log with seqno 14497, block size 512, and next_checkpoint RBA at 149274704.

In later OGG versions, the error looks like the following:
OGG-01028 Recovery record is missing from log with seqno 1934 when extract has reached log with seqno 1935, block size 512, and next_checkpoint RBA at 83962348.

Archive logs 14497 and 14498 exist, and restarting Extract fails with the same error.
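
Both message variants end with the same four numbers, and the workaround in this note needs two of them (the block size and the next_checkpoint RBA). The following is a minimal sketch of pulling them out of a log line. The function name and the dictionary keys are hypothetical, chosen for illustration only, and the pattern is based solely on the two sample messages quoted above.

```python
import re

# Matches the shared tail of both sample messages (GGS ERROR 190 and OGG-01028).
# The group names are illustrative; they are not GoldenGate terminology.
RECOVERY_MISSING = re.compile(
    r"Recovery record is missing from log with seqno (?P<missing_seqno>\d+) "
    r"when extract has reached log with seqno (?P<reached_seqno>\d+), "
    r"block size (?P<block_size>\d+), "
    r"and next_checkpoint RBA at (?P<next_rba>\d+)"
)


def parse_recovery_missing(line):
    """Return the four numeric fields as a dict of ints, or None if no match."""
    match = RECOVERY_MISSING.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


old_style = (
    "2010-05-04 15:42:23 GGS ERROR 190 Recovery record is missing from log "
    "with seqno 14496 when extract has reached log with seqno 14497, "
    "block size 512, and next_checkpoint RBA at 149274704."
)
fields = parse_recovery_missing(old_style)
print(fields["block_size"], fields["next_rba"])
```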
Cause
Extract was stopped manually while its current/next checkpoint pointed at a zero-length record at the end of a redo log. This condition is very rare. On restart, Extract skips the zero-length record because it carries no useful data; the recovery code then concludes that the record is missing and raises the error.

This is bug 12693183.
Solution
This bug is fixed in v11.1.1.1 with the patch for bug 12693183.

The following workaround can also be used without data loss.
Workaround: move the current checkpoint RBA to the next block boundary and keep the recovery checkpoint unchanged.

1. Get the block size of your platform.
The block size is shown in the error message:
2010-05-04 15:42:23 GGS ERROR 190 Recovery record is missing from log with seqno 14496 when extract has reached log with seqno 14497, block size 512, and next_checkpoint RBA at 149274704.

For reference the following lists the block size of major platforms:
AIX, Linux, Sun, Windows & VMS: 512 Bytes
HP-UX, Tru64: 1024 Bytes
S390, MVS: 4096 Bytes.

2. Back up the checkpoint file in the dirchk directory.

3. info extract <extract_name>, showch
Get the current and recovery checkpoints. Example:

EXTRACT EIDLD Last Started 2010-05-04 16:39 Status ABENDED
Checkpoint Lag 01:14:54 (updated 01:14:41 ago)
Log Read Checkpoint Oracle Redo Logs
2010-05-04 15:24:21 Seqno 14496, RBA 149274704
Current Checkpoint Detail:
Read Checkpoint #1
Oracle Redo Log
Startup Checkpoint (starting position in the data source):
Sequence #: 14496
RBA: 149273104
……………….
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Sequence #: 14496
RBA: 149273104
……………….
Current Checkpoint (position of last record read in the data source):
Sequence #: 14496
RBA: 149274704
………………

4. Alter both the recovery checkpoint and the current checkpoint to the start of the next block.
alter extract <extract_name>, extseqno <seqno>, extrba <rba>

In the above example:
SQL> select ceil (149274704/512) * 512 from dual;
CEIL(149274704/512)*512
-----------------------
149275136

The sequence number stays the same as in the showch output.

ggsci> alter extract <extract_name>, extseqno 14496, extrba 149275136
If the new RBA is larger than the current size of the archived log file, alter both checkpoints to start from RBA 0 in the file with the next sequence number instead.
For example, if the archived log with seqno 14496 is only 149275000 bytes, which is smaller than the calculated RBA of 149275136, issue the following command:
ggsci> alter extract <extract_name>, extseqno 14497, extrba 0

5. Alter the recovery checkpoint back to its original position using ioextseqno and ioextrba.
In this example:
ggsci> alter extract <extract_name>, ioextseqno 14496, ioextrba 149273104

6. info extract <extract_name>, showch
Confirm the change.

7. start extract <extract_name>

Note: The example above is for an Extract without the threads option. For an Extract with the threads option, “thread <thread_number>” must be added to the “alter extract” command.
For example, “ggsci> alter extract <extract_name>, extseqno 14497, extrba 0” becomes “ggsci> alter extract <extract_name>, thread <thread_number>, extseqno 14497, extrba 0” (where <thread_number> is the OGG thread number).
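
The arithmetic in steps 4 and 5 can be sketched as a small helper that rounds the current checkpoint RBA up to the next block boundary, falls back to RBA 0 of the next sequence number when the rounded value lies beyond the end of the archived log, and prints the two GGSCI commands. This is an illustrative sketch only: the function names and the log_size parameter are assumptions, the command strings simply mirror the ones used in this note, and the non-threaded case is assumed.

```python
def next_block_boundary(rba, block_size):
    """Round rba up to a multiple of block_size, like CEIL(rba/block_size)*block_size."""
    return -(-rba // block_size) * block_size


def workaround_commands(group, seqno, current_rba, block_size,
                        recovery_seqno, recovery_rba, log_size=None):
    """Build the step 4 and step 5 commands for a non-threaded Extract.

    log_size is the size in bytes of the archived log for seqno, if known.
    """
    new_seqno = seqno
    new_rba = next_block_boundary(current_rba, block_size)
    # If the rounded RBA is past the end of the log, start at RBA 0 of the next log.
    if log_size is not None and new_rba > log_size:
        new_seqno, new_rba = seqno + 1, 0
    return [
        f"alter extract {group}, extseqno {new_seqno}, extrba {new_rba}",
        f"alter extract {group}, ioextseqno {recovery_seqno}, ioextrba {recovery_rba}",
    ]


# Values taken from the showch example in this note (Extract EIDLD).
for command in workaround_commands("EIDLD", 14496, 149274704, 512,
                                   14496, 149273104):
    print(command)
```

Passing log_size=149275000 for the same input yields extseqno 14497, extrba 0 in the first command, which matches the fallback case described in step 4.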
References
BUG:12693183 – EXTRACT ERRORS WITH: RECOVERY RECORD IS MISSING FROM LOG WITH SEQNO 26743
BUG:12693291 – EXTRACT ABENDED WITH RECOVERY RECORD IS MISSING FROM LOG

OGG Extract abends with OGG-01028 Record position is beyond end of recovery [ID 1335470.1]

Modified 07-OCT-2011 Type PROBLEM Status PUBLISHED

Applies to:

Oracle GoldenGate – Version: 11.1.1.1.0 and later [Release: 11.1.1 and later]
Information in this document applies to any platform.

Symptoms

Extract abends with the following error:

ERROR OGG-01028 Record position (SeqNo: 336, RBA: 1040, SCN: 0.40418944 (40418944)) is beyond end of recovery (SeqNo: 335, RBA: 9016952, SCN: 0.40255602 (40255602), Timestamp: 2011-06-21 10:57:03.000000).
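
To see why the message says "beyond", the two positions can be compared as (SeqNo, RBA) pairs: a higher sequence number always sorts later, whatever the RBA. The sketch below, with hypothetical names, extracts both pairs from the sample message and compares them. It only illustrates the ordering and is not how Extract itself is implemented.

```python
import re

# Each position in the sample message starts with "SeqNo: <n>, RBA: <n>".
POSITION = re.compile(r"SeqNo: (\d+), RBA: (\d+)")


def positions(message):
    """Return (record_position, end_of_recovery) as (seqno, rba) tuples."""
    found = [(int(seqno), int(rba)) for seqno, rba in POSITION.findall(message)]
    if len(found) != 2:
        raise ValueError("expected exactly two SeqNo/RBA pairs")
    return found[0], found[1]


sample = (
    "ERROR OGG-01028 Record position (SeqNo: 336, RBA: 1040, SCN: 0.40418944 "
    "(40418944)) is beyond end of recovery (SeqNo: 335, RBA: 9016952, "
    "SCN: 0.40255602 (40255602), Timestamp: 2011-06-21 10:57:03.000000)."
)
record, recovery_end = positions(sample)
# Tuples compare element by element, so the seqno decides before the RBA does.
print(record > recovery_end)
```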

Cause

Extract may have hit a bug that can occur when it processes a zero-length record while Bounded Recovery is enabled.

Solution

As a workaround, add the parameter below to the Extract parameter file and start Extract; this turns off Bounded Recovery (BR). With BR off, Extract performs a normal recovery, so it needs all archive log files starting from the Extract's recovery checkpoint.

BR BROFF

BR is used only when Extract restarts, and only if long-running transactions were persisted. If every transaction in the workload is short (less than the BR interval, 4 hours by default), standard recovery is used even on restart.

Only transactions that modify data matter here. Batch jobs that run for a long time may fall into the long-running category mentioned above, but even then, as long as the redo logs/archive logs are available for standard recovery, the restart should not take much longer with standard recovery (SR) than with BR.

Once Extract has passed the problematic RBA, this parameter can be removed. The permanent fix is planned for OGG v11.2.1.
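
The restart behaviour described in this note can be summarised as a small decision function. This is a deliberately simplified model of the note's wording, not GoldenGate's actual logic: the function name, its parameters, and the idea of passing transaction ages in hours are all assumptions made for illustration.

```python
def restart_recovery_mode(open_txn_ages_hours, br_enabled=True,
                          br_interval_hours=4.0):
    """Return "bounded" or "standard" for an Extract restart.

    open_txn_ages_hours: ages of open data-modifying transactions, in hours.
    br_enabled: False models running with BR BROFF.
    br_interval_hours: the BR interval; the note gives 4 hours as the default.
    """
    if not br_enabled:
        return "standard"
    # Per the note, only transactions at least as old as the BR interval count.
    has_long_running = any(age >= br_interval_hours
                           for age in open_txn_ages_hours)
    return "bounded" if has_long_running else "standard"


print(restart_recovery_mode([0.2, 1.5]))                    # short transactions only
print(restart_recovery_mode([0.2, 6.0]))                    # one long-running job
print(restart_recovery_mode([0.2, 6.0], br_enabled=False))  # with BR BROFF
```

In this model, BR BROFF changes the outcome only when a long-running transaction is open at restart, which is why the note expects little impact for workloads made up of short transactions.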

ORA-600/ORA-7445 Error Look-up Tool [ID 153788.1]

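SHUTDOWN IMMEDIATE Does Not Respond

Posted: March 4, 2011 in bugs, oracle

I ran into this problem today while shutting down a problematic database.

After I issued SHUTDOWN IMMEDIATE as SYS, the database did not respond.

Logging in as SYSDBA from another session, I was able to shut the database down with SHUTDOWN ABORT, and restarting it caused no problems. A Metalink search showed that this is an Oracle 10g bug, described in Doc ID: Note:5057695.8. Oracle fixed the bug in 10.2.0.4 and 11.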