The error occurred when switching the Data Guard protection mode from MAXIMUM PERFORMANCE to MAXIMUM AVAILABILITY on an Exadata system. The primary RAC database, running version 19c, crashed with an ORA-600 error. The error message reported that the redo log for group 45, sequence 509, was incorrectly located on DAX storage. The incident details showed the internal error argument [kfk_iodone_invalid_buffer], indicating that an I/O buffer did not pass HARD checks. The error was raised by the LGWR background process.
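For orientation, a mode change like the one described above is typically issued on the primary with a statement of the following form, and the result can be read back from V$DATABASE. This is a minimal sketch, not taken from the incident itself. It assumes a Data Guard configuration that is not managed by the broker and a standby destination already set up for synchronous redo transport.

```sql
-- Sketch only: raise the protection mode on the primary database.
-- Assumes the standby destination already uses SYNC redo transport.
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

-- Confirm the configured mode and the level currently in effect.
SELECT protection_mode, protection_level
  FROM v$database;
```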

Changes
The issue arose after changing the Data Guard mode to MAXIMUM AVAILABILITY; switching to MAXIMUM PROTECTION did not trigger the ORA-600 error. A 12c database in the same Exadata rack did not experience the ORA-600 error either. The problem appeared when the online or standby redo log files were located on Extreme Flash Cell or High Capacity Cell nodes. The following relevant patches had been applied to the system: Patch 30165493 (fixing log file fast sync parameters for PMEMLOG), Bug 31119057 (associated with the ORA-600 error), and Bug 31305624 (linked to instance crashes).

Solution
When transitioning to MAXIMUM AVAILABILITY Data Guard mode on Exadata, specific steps must be taken to resolve the ORA-600 [kfk_iodone_invalid_buffer] error.

First, set the dynamic parameter ‘_smart_log_threshold_usec’ to 0 on all database instances.

For Exadata systems with PMEM, ‘_exa_pmemlog_threshold_usec’ must also be set to 0.

Furthermore, download patch 31305624 from support.oracle.com and apply it.

Finally, update the database to version 19.6.0.0.200114DBRU or a later release update that includes the bug fix.
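The two parameter changes in the steps above could be applied roughly as follows. This is a sketch under stated assumptions: the parameter names and the value 0 come from the text, while SCOPE=BOTH (which presumes an spfile) and SID='*' (to reach every RAC instance) are choices made here for illustration.

```sql
-- Hidden (underscore) parameter names must be enclosed in double quotes.
-- SID='*' applies the change to all instances; SCOPE=BOTH is an assumption.
ALTER SYSTEM SET "_smart_log_threshold_usec" = 0 SCOPE=BOTH SID='*';

-- Only on Exadata systems with PMEM:
ALTER SYSTEM SET "_exa_pmemlog_threshold_usec" = 0 SCOPE=BOTH SID='*';
```

Hidden parameters are normally changed only on the advice of Oracle Support, so these statements belong in the context of the patch and upgrade steps rather than as a standalone tuning change.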

In a separate problem, the symptoms include ‘ALTER DATABASE OPEN’ not completing, as logged in the alert log file. DIA0 (Hang Manager) reports sessions blocked while waiting for ‘gc freelist’. It also reports instances waiting for ‘cursor: pin S wait on X’ and ‘gc freelist’, which can lead to extended wait times and block other sessions.
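One way to see the waits named above across all RAC instances is to query GV$SESSION for those two events. This is a diagnostic sketch only; the event names are taken from the text, and the column selection and ordering are choices made here.

```sql
-- List sessions currently waiting on the two events named in the symptoms,
-- together with the session (if any) that is blocking them.
SELECT inst_id, sid, serial#, event, seconds_in_wait,
       blocking_instance, blocking_session
  FROM gv$session
 WHERE event IN ('gc freelist', 'cursor: pin S wait on X')
 ORDER BY seconds_in_wait DESC;
```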

Diagnosis by MMAN and Development

MMAN (Memory Manager) reports an ORA-4031 error related to the shared pool, potentially caused by BUG 31459369. This bug leads to multiple incidents of ORA-00600 [15709], [29] during parallel execution. Development has confirmed that using SGA_TARGET can result in an imbalance in the number of Lock Elements (LE) assigned to LMS processes on NUMA machines, and that a minimum size should be set for the buffer cache because of this bug.

Workaround Suggested

The recommended workaround is to set a minimum size for both the database buffer cache and the shared pool. This mitigates the effects of BUG 31459369, which triggers ORA-00600 [15709], [29] incidents during parallel execution.
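When SGA_TARGET is in use, explicitly setting DB_CACHE_SIZE and SHARED_POOL_SIZE establishes lower bounds for those components, which is one way to implement the workaround. The sketch below first inspects the current sizes and then sets the floors. The 8G and 4G values are placeholders, not recommendations from the source, and SCOPE=BOTH and SID='*' are assumptions.

```sql
-- Inspect current, minimum, and user-specified sizes on every instance.
SELECT inst_id, component, current_size, min_size, user_specified_size
  FROM gv$sga_dynamic_components
 WHERE component IN ('DEFAULT buffer cache', 'shared pool');

-- Placeholder values: size these for the actual workload and SGA_TARGET.
ALTER SYSTEM SET db_cache_size    = 8G SCOPE=BOTH SID='*';
ALTER SYSTEM SET shared_pool_size = 4G SCOPE=BOTH SID='*';
```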

The ORA-00600 [qeselupdpre_20] error is triggered when updating a column in a table that is also referenced in a correlated subquery of the same statement. This issue was encountered in Oracle Database versions 12.1 and later but below 12.2, with 12.1.0.2 being the confirmed affected version.

Description:
The error was caused by a regression that prevented the correct marking of the compare column in cases where there is a correlation column from the same table but a different view. This regression led to the ORA-00600 [qeselupdpre_20] error when executing certain update statements within the specified version range of Oracle databases.
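To make the statement shape concrete, the sketch below updates a table while a correlated subquery reads the same table through a view. The table and view names are hypothetical and invented for illustration; this is not a confirmed reproducer of the bug, only an example of the pattern the description refers to.

```sql
-- Hypothetical objects for illustration only.
CREATE TABLE orders (
  order_id    NUMBER PRIMARY KEY,
  customer_id NUMBER,
  amount      NUMBER
);

CREATE VIEW orders_v AS
  SELECT order_id, customer_id, amount FROM orders;

-- The updated table (orders) is also read through a view (orders_v)
-- in a subquery correlated on o.customer_id.
UPDATE orders o
   SET o.amount = (SELECT MAX(v.amount)
                     FROM orders_v v
                    WHERE v.customer_id = o.customer_id)
 WHERE EXISTS (SELECT 1
                 FROM orders_v v
                WHERE v.customer_id = o.customer_id
                  AND v.order_id <> o.order_id);
```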

Version Affected:
The problem affects Oracle Database versions 12.1 and later but below 12.2, with 12.1.0.2 (Server Patch Set) confirmed as affected. Versions 12.2 and above are not affected.

Workaround:
Unfortunately, there is no available workaround for the ORA-00600 [qeselupdpre_20] error in the affected versions of the Oracle database. Users encountering this issue are advised to proceed directly to the fix provided by Oracle.

Fixed:
The fix for the bug causing the ORA-00600 [qeselupdpre_20] error is first included in Oracle database version 12.2.0.1 (Base Release). Users experiencing this problem should consider upgrading their Oracle database to version 12.2.0.1 or later to resolve this issue.
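To check whether a given instance falls in the affected range, the running version can be read from V$INSTANCE. This is a minimal sketch; a result starting with 12.1 indicates a release below the 12.2.0.1 fix level.

```sql
-- Show the version of the instance the session is connected to.
SELECT instance_name, version
  FROM v$instance;
```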