For nearly a year, a scheduled, scripted PDB hot clone worked flawlessly, and then ‘suddenly’ it failed with the error below in the target CDB alert log.
CDB1900A/PROD is the source cdb/pdb on machine PRODUCTION
CDB1900B/CLONER is the target cdb/pdb on machine CLONE
2021-03-08 15:50:49.421000 +01:00
CREATE PLUGGABLE DATABASE CLONER FROM PROD@clone_pdb file_name_convert=('/ora1/oradata/cdb1900a/prod','/ora1/oradata/cdb1900b/cloner')
2021-03-08 16:14:53.480000 +01:00
Warning: VKTM detected a forward time drift.
Please see the VKTM trace file for more details:
/ora0/app/diag/rdbms/cdb1900b/cdb1900b/trace/cdb1900b_vktm_7467.trc
2021-03-08 16:38:16.590000 +01:00
Endian type of dictionary set to little
2021-03-08 16:38:19.135000 +01:00
****************************************************************
Pluggable Database CLONER with pdb id - 3 is created as UNUSABLE.
If any errors are encountered before the pdb is marked as NEW,
then the pdb must be dropped
local undo-1, localundoscn-0x0000000000000115
****************************************************************
2021-03-08 16:38:20.262000 +01:00
Applying media recovery for pdb-4099 from SCN 501140373 to SCN 501213500
Remote log information: count-1
thr-1,seq-36143,logfile-/arch/cdb1900a/parlog_1_36143_f6e8ef45_1036597637.arc,los-501203026,nxs-18446744073709551615,maxblks-39425
Media Recovery Start
Serial Media Recovery started
max_pdb is 4
Refresh recovery failed to find archive log for thr-1, scn-501140373
Media Recovery failed with error 65345
ORA-283 signalled during: CREATE PLUGGABLE DATABASE CLONER FROM PROD@clone_pdb file_name_convert=('/ora1/oradata/cdb1900a/prod','/ora1/oradata/cdb1900b/cloner')...
Asking the source PDB prod where the SCN should be found:
set pages 100 lines 100
col name for a70
col first_change# for 9999999999
col next_change# for 9999999999
alter session set nls_date_format='DD-MON-RRRR HH24:MI:SS';
select name, sequence#, status, FIRST_CHANGE#,NEXT_CHANGE# from v$archived_log where 501140373 between FIRST_CHANGE# and NEXT_CHANGE#;

NAME                                                                    SEQUENCE# S FIRST_CHANGE#
---------------------------------------------------------------------- ---------- - -------------
NEXT_CHANGE#
------------
/arch/cdb1900a/cdb1900a_0001_36136_1036597637.arc                           36136 A     501140362
   501164651
So when I asked the source PDB PROD which archive log holds the missing SCN 501140373, the answer was that it is there, in the archive log with sequence 36136.
But as you can see in the alert log above, the recovery searches in a parlog file, which begins at a higher sequence number: 36143.
So there is an archive log with the right sequence to ‘get’ the right SCN 501140373, but the recovery looks in a parlog file that begins 7 sequences later.
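To see the gap between the two sequences at a glance, a query along these lines could be run on the source CDB. It is only a sketch: the thread number and the sequence range are taken from the alert log and query output above, and it assumes the relevant rows are still present in v$archived_log.

```sql
-- Run on the source CDB (CDB1900A).
-- Lists every archived log from the sequence that holds SCN 501140373 (36136)
-- up to the sequence the parlog file starts at (36143).
col name for a70
select sequence#, name, first_change#, next_change#, deleted, status
from   v$archived_log
where  thread# = 1
and    sequence# between 36136 and 36143
order  by sequence#, name;
```

If all sequences in that range are listed and not deleted, the redo the clone needs exists on the source, which supports the idea that the recovery is simply looking in the wrong file.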
But is this a symptom or is it the cause? After logging an SR with Oracle, I got the feeling that this is not the cause but a symptom of something else. To make a long story short, the cause is a bug, probably introduced in 19.10. What also seems to matter (or really adds to the problem) is high CPU load on either the source server or the target server.
I found out that a heavy and important scheduler job was scheduled in the source database at the exact same time as the clone. This caused quite a load on the server and on the source database itself, noticeably in the form of high CPU load.
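One way to check for such an overlap is sketched below. It assumes the heavy job is a DBMS_SCHEDULER job whose run history is still retained, and it uses the clone window from the alert log above (roughly 15:50 to 16:39 on 8 March 2021) as an approximation.

```sql
-- Run in the source PDB (PROD).
-- Lists scheduler job runs that overlapped the clone window.
-- The window boundaries are an approximation taken from the target alert log.
select owner, job_name, status, actual_start_date, run_duration, cpu_used
from   dba_scheduler_job_run_details
where  actual_start_date < timestamp '2021-03-08 16:39:00 +01:00'
and    actual_start_date + run_duration > timestamp '2021-03-08 15:50:00 +01:00'
order  by actual_start_date;
```

A job started from cron or another external scheduler would not show up here, so an empty result does not rule out an overlap.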
Now my take is that, if you have patched to 19.10 and you take a PDB clone from a source database that is too busy, the SCNs get out of sync during the clone, resulting in a failed clone.
Now, didn’t I begin this story by saying that everything worked fine for almost a year…?
So what changed? Now I really can’t tell.
Update 16th May 2021: In the meantime Oracle has proposed a one-off patch to resolve this issue (it is already solved in Oracle DB 23.1). We are waiting for the one-off patch merge request for 19.x.
Update 1st July 2021: Still no news from Oracle about the one-off patch, except that for now it will only come out for 19.10. I also saw a tweet from Oraclebase (Tim Hall) about parlog; maybe it is related to this problem. See Bug 32690109 : PDB HOT CLONE FAILS WITH ORA-283 DUE TO INCORRECT PARLOG FILE GENERATION
Update 25th September 2021: A few weeks ago I got notice from Oracle Support on my SR that (unpublished) bug 32631551 has been fixed and that patches for different patch levels have come out. The patch has the same number as the bug, 32631551, and is available at this moment on top of RU 19.10, 19.11 and 19.12. I asked what I should patch: the source database, the target database, or both. According to the answer, this patch seems relevant only to the target of the hot PDB clone. So within a few weeks I will have permission to patch the 19.12 home with patch 32631551 and can do some testing. I will update here as soon as I have results.
Update 12th October 2021: On the 5th of October I got permission to patch the target home, and on the 12th I was able to test the PDB hot clone. To my disappointment I got the same errors and the hot clone didn’t succeed. So I went back to the SR to ask for confirmation whether the patch should be applied to both the source and the target home. I have been waiting a very long time for a response from MOS.
Update 11th November 2021: On the 1st of November I performed a new test with the hot PDB clone. The clone itself succeeded, but the PDB was opened in restricted mode, so only admins could access it. After looking in the alert log and the pdb_plugin_violations view, I found out there was a discrepancy in patch level between my CDB (which I had backed up long ago and restored for the occasion) and the Oracle software home. The CDB was at 19.10, the home at 19.12.
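For reference, checks along these lines can surface that kind of mismatch. This is a sketch, not the exact queries I ran; it assumes you are connected to the root container of the target CDB and that the clone is named CLONER as above.

```sql
-- Run in CDB$ROOT of the target CDB (CDB1900B).

-- Is the cloned PDB open, and is it restricted?
select name, open_mode, restricted
from   v$pdbs;

-- Which unresolved violations were recorded for the clone?
select name, cause, type, status, message, action
from   pdb_plugin_violations
where  name = 'CLONER'
and    status <> 'RESOLVED'
order  by time;

-- Which SQL patches has datapatch registered in this container?
select patch_id, action, status, action_time, description
from   dba_registry_sqlpatch
order  by action_time;
```

A violation about SQL patches, combined with a patch history that stops at an older release update than the software home, points to a missing datapatch run.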
So on the 8th of November I gave it another try, this time after running datapatch on the CDB. The hot PDB clone succeeded and the PDB was opened normally. The clone was performed on a 1.2 TB PDB and was ready in 30 minutes (the source and target servers had no high database load from other processes, so that has to be taken into account).
This fix will be incorporated in the next patch round, 19.13, so the one-off patch will be rolled back before applying 19.13.
This finally concludes this endeavour. Cheers!