OCI – iSCSI bug 30711156

If you run Oracle DB on OCI compute and use iSCSI block volume attachments, beware of iSCSI bug 30711156.

We hit this bug a while ago and, as a consequence, we were no longer able to read from or write to the block volume.

Fix: kill all Oracle processes using the volume, then remount the filesystem.
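
For illustration only, the recovery could look roughly like this (a sketch; /u02 is a hypothetical mount point for the affected block volume, and killing the processes this way is of course disruptive):

fuser -km /u02    # kill every process with open files on the affected filesystem
umount /u02
mount /u02        # remount using the existing /etc/fstab entry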

If you see errors in /var/log/messages like the ones below, you most likely hit the same issue:

Aug 10 00:29:30 host iscsid: iscsid: Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3) 
Aug 10 00:29:30 host iscsid: iscsid: re-opening session 1 (reopen_cnt 0) 
Aug 10 00:29:30 host iscsid: iscsid: disconnecting conn 0x563c2f155068, fd 7 
Aug 10 00:33:01 host kernel: session1: iscsi_eh_cmd_timed_out scsi cmd ffff9c3622aea948 timedout 
Aug 10 00:33:01 host kernel: session2: iscsi_eh_cmd_timed_out scsi cmd ffff9c3622ae8d48 timedout 
Aug 10 00:33:01 host kernel: session1: iscsi_eh_cmd_timed_out return timer reset 
Aug 10 00:33:01 host kernel: session2: iscsi_eh_cmd_timed_out return shutdown or nh 
Aug 10 00:33:01 host kernel: session1: iscsi_eh_cmd_timed_out scsi cmd ffff9c3622aec148 timedout 
Aug 10 00:33:01 host kernel: session1: iscsi_eh_cmd_timed_out return timer reset 

Nice, right ?

A new iscsi-initiator-utils package is available for download, so go ahead and update your server.
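
For example, on Oracle Linux something along these lines should do it (assuming your yum repositories or ULN channels already carry the fixed build):

rpm -q iscsi-initiator-utils        # check the currently installed version
yum update iscsi-initiator-utils    # update to the fixed package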

Linux Errata available here.

Good patching !

OCI – Patching a DB System

In this blog post I will go through detailed steps on how to apply patches to bare metal/virtual machine DB systems and database homes by using dbcli.

Yes, I enjoy the black screen 🙂

No, this is not the only option. You can also use the Console and APIs.

This procedure is not applicable for Exadata DB systems.

Prerequisites

1) Access to the Oracle Cloud Infrastructure Object Storage service, including connectivity to the applicable Swift endpoint for Object Storage. Oracle recommends using a Service Gateway to enable this access.

2) /u01 filesystem with at least 15 GB of free space

3) Clusterware running

4) All DB system nodes running

Backup

Back up your database prior to the patch event.
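
If you rely on RMAN, a minimal sketch could be something like the following (adjust to your own backup strategy; on OCI DB systems you may of course use the managed backups instead):

rman target / <<'EOF'
BACKUP DATABASE PLUS ARCHIVELOG;
EOF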

Non Prod first

Test this patch on a non-prod (or test) server first.

Patching

1) Update the CLI

 cliadm update-dbcli

2) Wait for the job to complete

dbcli list-jobs

3) Check for installed and available patches

dbcli describe-component

4) Display the latest patch versions available

dbcli describe-latestpatch

5) Run the pre-check

dbcli update-server --precheck

Run describe-job to check the job status:

dbcli describe-job -i <jobId>

6) Update the server components. This step will patch GI.

dbcli update-server

Once it has completed successfully, proceed with DB home patching.

7) List db homes

dbcli list-dbhomes

8) Run update-dbhome on the selected Oracle home

dbcli update-dbhome -i <Id>

Then check the status with the describe-job command shown earlier.

Logs

You can find the DCS Logs at:

/opt/oracle/dcs/log/

Under the hood dbcli relies on opatchauto so you can also check $ORACLE_HOME/cfgtoollogs/opatchauto directory for logs.
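
For example, to spot the most recent files in both locations (plain ls, nothing dbcli-specific):

ls -ltr /opt/oracle/dcs/log/ | tail                    # newest DCS logs show up last
ls -ltr $ORACLE_HOME/cfgtoollogs/opatchauto/ | tail    # newest opatchauto logs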

There is also a nice doc about troubleshooting in case something goes wrong.

DB system – OS patching

When using Oracle Cloud Database Service VM or Bare Metal DB systems, the customer is responsible for OS updates and GI/DB patches, so let's first go through the steps to update an OL7 server.

This procedure is not applicable for Exadata DB systems.

General Recommendations

1) Don’t touch oraenv or .bash_profile

2) Don’t touch default local firewall rules

Backup

Back up your database prior to the OS update.

NonProd first

Test this procedure on a non-prod server first.

OS Update

1) Check the kernel version. If it is 4.1.12-124.27.1.el7uek, you need to change the bootefi label before updating the OS.

uname -r

To change the bootefi label:

Edit /etc/fstab and change the label bootefi to BOOTEFI (uppercase), reboot the server, then run ls -l /etc/grub2-efi.cfg to verify the required link was created.
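
For illustration, the label change could be scripted like this (a sketch; keep a backup of /etc/fstab and double-check the result before rebooting):

cp /etc/fstab /etc/fstab.bak-$(date +%Y%m%d)
sed -i 's/bootefi/BOOTEFI/' /etc/fstab    # change the label to uppercase
grep -i bootefi /etc/fstab                # verify the change before rebooting
reboot
ls -l /etc/grub2-efi.cfg                  # after the reboot, the link should exist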

2) Run the command below to identify the region the server is running in:

curl -s http://169.254.169.254/opc/v1/instance/ |grep region
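
If jq happens to be installed, you can also pull the value out directly; treat this as a sketch, since the exact field names (region and canonicalRegionName) depend on the metadata version:

curl -s http://169.254.169.254/opc/v1/instance/ | jq '{region, canonicalRegionName}'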

3) Download the repo file:

wget https://swiftobjectstorage.<region>.oraclecloud.com/v1/dbaaspatchstore/DBaaSOSPatches/oci_dbaas_ol7repo -O /tmp/oci_dbaas_ol7repo

and copy it to the yum directory:

cp /tmp/oci_dbaas_ol7repo /etc/yum.repos.d/ol7.repo

4) Download the version lock file, back up the existing one, and overwrite it:

wget https://swiftobjectstorage.<region>.oraclecloud.com/v1/dbaaspatchstore/DBaaSOSPatches/versionlock_ol7.list -O /tmp/versionlock.list

cp /etc/yum/pluginconf.d/versionlock.list /etc/yum/pluginconf.d/versionlock.list-`date +%Y%m%d`
cp /tmp/versionlock.list /etc/yum/pluginconf.d/versionlock.list

5) Run yum update

yum update

6) Reboot the server

7) Check the new kernel version

uname -r

Hopefully I can rely on OS Management to update this type of server next time 🙂 So stay tuned !

Oracle Gold Image

A gold image is a copy of a software-only, installed Oracle home. It is used to copy an image of an Oracle home to a new host on a new file system to serve as an active, usable Oracle home.

Why should you use it ?

It will allow you to have a “perfect” version of the OH you need to deploy.

Until now, most DBAs have typically installed the base Oracle release and then applied the required (or latest) RU on top.

From now on you can just build an Oracle Gold Image and create your Oracle homes from it.

Very simple example: Creating a Gold Image for Oracle 19.7

First you need to install Oracle 19.3, then apply patch 30783556.
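
For illustration, applying the RU on the freshly installed software-only home could look roughly like this (a sketch: it assumes the combo patch 30783556 was unzipped under /u01/stage and that OPatch in the home already meets the minimum version required by the patch readme; always check the readme for the exact steps):

cd /u01/stage/30783556/30869156 && $ORACLE_HOME/OPatch/opatch apply -silent    # Database RU 19.7.0.0.200414
cd /u01/stage/30783556/30894985 && $ORACLE_HOME/OPatch/opatch apply -silent    # OCW RU 19.7.0.0.0

After that, opatch lspatches should list both components: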

[oracle@server ~]$ $ORACLE_HOME/OPatch/opatch lspatches
30894985;OCW RELEASE UPDATE 19.7.0.0.0 (30894985)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)

Finally to create the Gold Image:

[oracle@server ~]$ $ORACLE_HOME/runInstaller -silent -createGoldImage -destinationLocation /u01/soft
Launching Oracle Database Setup Wizard…
Successfully Setup Software.
Gold Image location: /u01/soft/db_home_2020-05-22_08-08-26PM.zip

So this zip file db_home_2020-05-22_08-08-26PM.zip will allow you to deploy Oracle 19.7

How ?

Just unzip it and execute runInstaller 🙂

Example:

[oracle@server ~]$ mkdir -p /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1

[oracle@server ~]$ unzip -q /u01/soft/db_home_2020-05-22_08-08-26PM.zip -d /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1

export ORACLE_HOME=/u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1
export ORA_INVENTORY=/u01/c0s6/app/oraInventory
export ORACLE_BASE=/u01/c0s6/app/oracle

[oracle@server ~]$ ${ORACLE_HOME}/runInstaller -ignorePrereq -waitforcompletion -silent \
-responseFile ${ORACLE_HOME}/install/response/db_install.rsp \
oracle.install.option=INSTALL_DB_SWONLY \
ORACLE_HOSTNAME=${ORACLE_HOSTNAME} \
UNIX_GROUP_NAME=oinstall \
INVENTORY_LOCATION=${ORA_INVENTORY} \
SELECTED_LANGUAGES=en \
ORACLE_HOME=${ORACLE_HOME} \
ORACLE_BASE=${ORACLE_BASE} \
oracle.install.db.InstallEdition=EE \
oracle.install.db.OSDBA_GROUP=dba \
oracle.install.db.OSOPER_GROUP=dba \
oracle.install.db.OSBACKUPDBA_GROUP=dba \
oracle.install.db.OSDGDBA_GROUP=dba \
oracle.install.db.OSKMDBA_GROUP=dba \
oracle.install.db.OSRACDBA_GROUP=dba \
oracle.install.db.ConfigureAsContainerDB=false \
SECURITY_UPDATES_VIA_MYORACLESUPPORT=false \
DECLINE_SECURITY_UPDATES=true
Launching Oracle Database Setup Wizard…

The response file for this session can be found at:
/u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1/install/response/db_2020-05-22_08-54-24PM.rsp
You can find the log of this install session at:
/u01/c0s6/app/oraInventory/logs/InstallActions2020-05-22_08-54-24PM/installActions2020-05-22_08-54-24PM.log
As a root user, execute the following script(s):
1. /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1/root.sh
Execute /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1/root.sh on the following nodes:
[server]
Successfully Setup Software.

[root@server tmp]# /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1/root.sh
Check /u01/c0s6/app/oracle/product/19.7.0.0/dbhome_1/install/root_c0s64150_2020-05-22_20-56-10-578402868.log for the output of root script

[oracle@server ~]$ $ORACLE_HOME/OPatch/opatch lspatches
30894985;OCW RELEASE UPDATE 19.7.0.0.0 (30894985)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)

How cool is that ?

And it gets better with Fleet Provisioning.

OCI – Unknown error when running DB system patch pre-check

This week I’ve been working on a DB system patch for an Oracle 12.2 system.

The task prerequisites are as follows:

1) Access to the Oracle Cloud Infrastructure Object Storage service, including connectivity to the applicable Swift endpoint for Object Storage.

2) /u01 filesystem with at least 15 GB of free space

3) Clusterware running

4) All DB system nodes running

With all prerequisites checked, I then ran the pre-check from the Console to ensure the patch could be applied successfully, but I got an unknown error on the work requests page.

Such an error does not help at all. However, if you run the same pre-check command from dbcli, you get a different, and actually meaningful, error:

[root@node ~]# dbcli update-server --precheck
DCS-10214:Minimum Kernel version required is: 4.1.12-124.20.3.el6uek.x86_64 Current Kernel version is: 4.1.12-112.16.7.el6uek.x86_64.

Much better, right ?

So in order to apply the DB system patch I will need to first update the OS kernel.

This information should be on the prerequisites list but it is not 😦

So from now on I would suggest running the pre-check from dbcli instead of the OCI Console by default, or at least when facing unexpected behaviour.

Full documentation on patching DB systems is here, and to update the kernel you can check the OCI docs here.

Hope this helps !

-ocmrf is no longer needed

I was patching an Oracle 12.1 Restart installation with the latest bundle patch this week and realized that the -ocmrf option is no longer needed.

This enhancement started with OPatch 12.2.0.1.5, so now you have one more reason to update your OPatch 🙂

Don’t even bother looking for the emocmrsp binary … it is no longer there !

So things just got easier ! All you have to do is run:

opatchauto apply <UNZIPPED_PATCH_LOCATION>/<BUG NO> 

Example for 12.1 BP:

export PATH=$PATH:/u01/app/12.1.0.2/grid/OPatch/
opatchauto apply /home/oracle/stage/29176139/29141038

References: MOS notes 2161861.1 and 1591616.1, and the readme.html for patch 29176139.

OPatch update for EMGC

Download the latest version from here.

For EMGC 13.1 and above, select version “13.9.0.0.0” (Release: OPatch 13.9.0.0.0). Unzip the file into a staging directory such as /u01/stage/.

Stop the OMS with $ORACLE_HOME/bin/emctl stop oms -all, where ORACLE_HOME is the middleware home.

Run: java -jar <STAGE_DIR>/6880880/opatch_generic.jar -silent oracle_home=$ORACLE_HOME

To validate the installation run $ORACLE_HOME/OPatch/opatch version

The output should match the value from the readme file.
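
Putting it all together, a minimal sketch (the middleware home and staging directory below are placeholders for your own paths; restart the OMS afterwards as usual):

export ORACLE_HOME=/u01/app/oracle/mw13c      # assumption: your EM middleware home
STAGE_DIR=/u01/stage                          # assumption: where the OPatch zip was unpacked

$ORACLE_HOME/bin/emctl stop oms -all
java -jar $STAGE_DIR/6880880/opatch_generic.jar -silent oracle_home=$ORACLE_HOME
$ORACLE_HOME/OPatch/opatch version            # should match the readme
$ORACLE_HOME/bin/emctl start oms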



CRS crash after upgrade to Oracle 12.2

I hit an interesting bug this week after upgrading to Oracle 12.2.

MOS note “Bug 28298447: cluster crashed due to mellanox driver related issue (Doc ID 2460394.1)” has the details, but basically CRS will bounce your database from time to time.

So if you are running an Exadata, upgrade the kernel to version 4.1.12-94.8.5 before upgrading the database to 12.2.

This kernel fix is also included in image version 12.2.1.1.7 (or higher) or 18.1.5.0.0 (or higher), i.e. the April 2018 QFSDP.

ORA-20003 – Configuring job Load_opatch_inventory_1 on node and on instance failed

Just after a DB upgrade to Oracle RAC 12.2.0.1, the alert.log reported the error below:

Unable to obtain current patch information due to error: 20003, ORA-20003: Configuring job Load_opatch_inventory_1on node and on instancefailed
ORA-06512: at "SYS.DBMS_QOPATCH", line 777
ORA-06512: at "SYS.DBMS_QOPATCH", line 479
ORA-06512: at "SYS.DBMS_QOPATCH", line 455
ORA-06512: at "SYS.DBMS_QOPATCH", line 574
ORA-06512: at "SYS.DBMS_QOPATCH", line 2247

===========================================================
Dumping current patch information
===========================================================
Unable to obtain current patch information due to error: 20003
===========================================================

Time to open MOS and start researching !

And just a few minutes later I found another Oracle bug 😦

12.2 RAC Database Alert.log reports Unable to obtain current patch information due to error: 20003, ORA-20003: Configuring job Load_opatch_ inventory_1on node and on instancefailed (Doc ID 2364768.1)

This error is harmless and can be ignored, or patch 23333567 can be applied to fix it.

Control File and Server Parameter File Autobackups

I found an interesting situation in my environment today: control file and spfile autobackups were piling up in the FRA.

My backup scripts purge obsolete backupsets to free space, and these autobackups should fall under that rule, respecting the retention policy of course.

I then started searching MOS and found two technical notes about this issue: 

1) autobackup of Spfile+controlfile was not reported as obsolete as expected (Doc ID 2365626.1)

2) Bug 25943271 – rman report obsolete does not report controlfile backup as obsolete (Doc ID 25943271.8)

Fix: apply patch 25943271. A backport is available for several RDBMS versions.

In my case I could successfully work around this issue by running the following RMAN command:

DELETE OBSOLETE RECOVERY WINDOW OF 10 DAYS;
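
To preview what would be removed before actually deleting anything, you can run REPORT OBSOLETE first (standard RMAN commands, shown here wrapped in a shell heredoc):

rman target / <<'EOF'
REPORT OBSOLETE RECOVERY WINDOW OF 10 DAYS;
DELETE NOPROMPT OBSOLETE RECOVERY WINDOW OF 10 DAYS;
EOF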

Useful links:

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/bradv/rman-backup-concepts.html#GUID-95840C84-1595-49AC-923D-310DA750676B

https://blog.dbi-services.com/oracle-12c-automatic-control-file-backups/