OCI – File storage snapshot management

OCI File Storage snapshots are not managed automatically by OCI, unlike block volume backups, which can be handled by policy-based backups.

This means you have to create and delete the snapshots yourself.

In this blog post I will share a shell script to accomplish this task using the OCI CLI.

Here is an example to create daily snapshots: fssnap.sh

#!/bin/bash
export CLI=/home/oracle/ocicli
export FSSOCID="<file storage ocid>"
export TSTAMP=$(date +%Y%m%d)
export ENV=prod

# create a snapshot named <environment>.<YYYYMMDD>
"$CLI/oci" fs snapshot create --file-system-id="$FSSOCID" --name="$ENV.$TSTAMP"

You can get the file storage OCID in the OCI console.

Create a group for your user and assign the proper privileges, for example:

Allow group StorageAdmins to manage file-family in compartment PROD

Schedule this shell script to run at a regular interval that fits your needs.
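With cron, for example, the schedule might be set up as sketched below. This is only an illustration: the script path /home/oracle/fssnap.sh, the log path and the 01:00 run time are assumptions, not part of my actual setup.

```shell
#!/bin/bash
# Sketch only: SCRIPT, LOG and the schedule are assumed values, adjust them.
SCRIPT=/home/oracle/fssnap.sh
LOG=/tmp/fssnap.log

# Print a crontab line that runs the snapshot script every day at 01:00.
cron_entry() {
    printf '0 1 * * * %s >> %s 2>&1\n' "$SCRIPT" "$LOG"
}

# Show the line; to install it, append it to the current crontab:
#   ( crontab -l 2>/dev/null; cron_entry ) | crontab -
cron_entry
```

Because the snapshot name contains only the date, running the script more than once a day would try to create a second snapshot with the same name, so a daily schedule matches the script as written.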

In my case, however, I had to keep a limited number of backups based on some information such as the environment (prod or dev) and the customer retention policy (bronze, silver or gold).

So I wrote another simple shell script to accomplish this: fscleanup.sh

#!/bin/bash
export CI=/home/oracle/ocicli
export FSSOCID="<file storage ocid>"
export TSTAMP=$(date +%Y%m%d)
export KEEP=42

# dump snapshot ids to a temp file, one per line
"$CI/oci" fs snapshot list --file-system-id="$FSSOCID" | grep '"id"' | awk '{print $2}' | sed 's/[",]//g' > /tmp/fss.bkp

# count snapshots
CT=$(wc -l < /tmp/fss.bkp)

# remove snapshots beyond $KEEP
# (tail takes the end of the list, so this assumes the newest snapshots are listed first)
if [ "$CT" -gt "$KEEP" ]; then
    DIFF=$((CT - KEEP))
    for id in $(tail -n "$DIFF" /tmp/fss.bkp)
    do
       "$CI/oci" fs snapshot delete --force --snapshot-id "$id"
    done
else
    echo "nothing to delete: $CT snapshots found, KEEP=$KEEP"
fi
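
The retention policy mentioned above can be turned into a KEEP value with a small lookup. The sketch below is a minimal illustration: the tier names come from this post, but the counts per tier and the function names are made up for the example. It also assumes the id file lists the newest snapshot first.

```shell
#!/bin/bash
# Sketch only: tier names are from the post, the counts are example values.
keep_for_policy() {
    case "$1" in
        gold)   echo 42 ;;
        silver) echo 14 ;;
        bronze) echo 7 ;;
        *)      echo "unknown policy: $1" >&2; return 1 ;;
    esac
}

# Print the snapshot ids that fall outside the retention window.
# $1 = number of snapshots to keep, $2 = file with one id per line.
# Assumes the file lists the newest snapshot first, so everything after
# line $1 is older and can be deleted.
expired_ids() {
    tail -n +"$(( $1 + 1 ))" "$2"
}

# Usage (hypothetical), with /tmp/fss.bkp produced as in fscleanup.sh:
#   KEEP=$(keep_for_policy silver) || exit 1
#   for id in $(expired_ids "$KEEP" /tmp/fss.bkp); do
#       "$CI/oci" fs snapshot delete --force --snapshot-id "$id"
#   done
```

Skipping the first KEEP lines with tail -n +N means no count comparison is needed: when there are fewer snapshots than KEEP, nothing is printed and nothing is deleted.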

Please check the OCI documentation on managing snapshots for more information.

Until OCI offers a native, automated way of doing this, this is the workaround.

Crash package on Oracle UEK kernel

Kdump is the Linux kernel crash-dump mechanism, and Oracle recommends that you enable it. You can configure Kdump using the system-config-kdump utility.

So, if you are running Oracle Enterprise Linux with the UEK kernel and have hit a kernel panic, follow the steps below to install the crash utility and analyse the vmcore file.

To use crash, the kernel-uek-debuginfo and kernel-uek-debuginfo-common packages must also be installed.

To install packages on OL7 run:

yum install crash -y

export DLP="https://oss.oracle.com/ol7/debuginfo"

wget ${DLP}/kernel-uek-debuginfo-`uname -r`.rpm
wget ${DLP}/kernel-uek-debuginfo-common-`uname -r`.rpm

yum install kernel-uek-debuginfo*

Running crash:

syntax: crash <vmcore> <vmlinux>

crash vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux

You will find vmcore files under /var/crash/<127.0.0.1-TIMESTAMP>
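
When several dumps have piled up, a small helper can pick the most recent one. This is a minimal sketch: the function name latest_vmcore is made up, and it relies on the directory naming shown above sorting chronologically.

```shell
#!/bin/bash
# Print the path of the most recent vmcore under a crash directory.
# $1 = crash directory (defaults to /var/crash).
# Relies on the <address>-<YYYY-MM-DD-HH:MM:SS> directory naming, which
# sorts chronologically when all dumps come from the same address.
latest_vmcore() {
    dir=${1:-/var/crash}
    for f in "$dir"/*/vmcore; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort | tail -n 1
}

# Usage (hypothetical):
#   crash "$(latest_vmcore)" "/usr/lib/debug/lib/modules/$(uname -r)/vmlinux"
```

Note that uname -r only gives the right vmlinux if the machine rebooted into the same kernel version that panicked; otherwise use the release of the crashed kernel.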

Output example:

[root@machine 127.0.0.1-2020-01-26-20:29:08]# crash vmcore /usr/lib/debug/lib/modules/4.1.12-124.35.2.el7uek.x86_64/vmlinux
crash 7.2.3-10.el7
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"…

KERNEL: /usr/lib/debug/lib/modules/4.1.12-124.35.2.el7uek.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
LOAD AVERAGE: 0.88, 0.26, 0.18
TASKS: 366
NODENAME: machine
RELEASE: 4.1.12-124.35.2.el7uek.x86_64
VERSION: #2 SMP Mon Jan 13 16:42:30 PST 2020
MACHINE: x86_64 (3400 Mhz)
MEMORY: 15.7 GB
PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000030"
PID: 24893
COMMAND: "oracle_24893_ic"
TASK: ffff880168877000 [THREAD_INFO: ffff8801a6eb8000]
CPU: 0
STATE: TASK_RUNNING (PANIC)

OCI VCN – Don’t forget DNS

I’ve found an interesting situation when using different VCN configurations.

Let’s get started.

I’ve created two VCNs:

VCN1:

oci network vcn create --cidr-block 10.2.0.0/16 --compartment-id ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa --display-name vcn1
 {
   "data": {
     "cidr-block": "10.2.0.0/16",
     "compartment-id": "ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa",
     "default-dhcp-options-id": "ocid1.dhcpoptions.oc1.phx.aaaaaaaanuue6l7pre4vtyu6pp2ygucjbcwnvejmezuyfxxxmft76lroemmq",
     "default-route-table-id": "ocid1.routetable.oc1.phx.aaaaaaaaqybsv6xnwe6gbjo74zom4jtxequtnk5bwbe7qvkwyjjfwbusi7fq",
     "default-security-list-id": "ocid1.securitylist.oc1.phx.aaaaaaaactpod3l5kgukj7dkuq4gi2nhi4jojng4eetvhk5googboy3l5poq",
     "defined-tags": {},
     "display-name": "vcn1",
     "dns-label": null,
     "freeform-tags": {},
     "id": "ocid1.vcn.oc1.phx.aaaaaaaaklvs2bjyw3tx5fzfd76n2ab2fhbx2v4afgckksniqidudntoegwq",
     "ipv6-cidr-block": null,
     "ipv6-public-cidr-block": null,
     "lifecycle-state": "AVAILABLE",
     "time-created": "2019-11-13T14:50:57.323000+00:00",
     "vcn-domain-name": null
   },
   "etag": "8d3a4408"
 }

VCN2:

oci network vcn create --cidr-block 10.3.0.0/20 --compartment-id ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa --display-name vcn2 --dns-label vcn2
 {
   "data": {
     "cidr-block": "10.3.0.0/20",
     "compartment-id": "ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa",
     "default-dhcp-options-id": "ocid1.dhcpoptions.oc1.phx.aaaaaaaaka2ja2efstff2el3pw4t46co6jyjx3cq2xl46zr4cstle7s6mlya",
     "default-route-table-id": "ocid1.routetable.oc1.phx.aaaaaaaa2wqwmsiu32n3tc33pylycv7u75b66xuycn7ij2ilwwjeju5hofkq",
     "default-security-list-id": "ocid1.securitylist.oc1.phx.aaaaaaaaas3yhwpvilophj2nshwfdyp2g3o5vykgooj27xt2kqaqsghc5rjq",
     "defined-tags": {},
     "display-name": "vcn2",
     "dns-label": "vcn2",
     "freeform-tags": {},
     "id": "ocid1.vcn.oc1.phx.aaaaaaaauzosk3jx4mhxwwxngnvx5wco3ckoylqu4nioudm5zgrb5o6w6a7a",
     "ipv6-cidr-block": null,
     "ipv6-public-cidr-block": null,
     "lifecycle-state": "AVAILABLE",
     "time-created": "2019-11-13T15:47:09.602000+00:00",
     "vcn-domain-name": "vcn2.oraclevcn.com"
   },
   "etag": "d8b160ad"
 }

As you can see, for VCN2 I specified the --dns-label parameter, so it got a domain name (vcn2.oraclevcn.com), while VCN1 has none.

In the OCI console the VCNs look like this:

Let’s now create one subnet on each VCN.

oci network subnet create --cidr-block 10.2.1.0/24 --compartment-id ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa --vcn-id=ocid1.vcn.oc1.phx.aaaaaaaa7gr26rfluz43crfypp7qp3dscuqsibfrfq6iai7z5sxz5uhel4va --display-name=sub1pub --availability-domain="xbee:PHX-AD-1"
 {
   "data": {
     "availability-domain": "xbee:PHX-AD-1",
     "cidr-block": "10.2.1.0/24",
     "compartment-id": "ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa",
     "defined-tags": {},
     "dhcp-options-id": "ocid1.dhcpoptions.oc1.phx.aaaaaaaaba6d7pjjlkj6ectw4facyrxvsdfs5ppxdfpliuqgkzpz5q6j4z2a",
     "display-name": "sub1pub",
     "dns-label": null,
     "freeform-tags": {},
     "id": "ocid1.subnet.oc1.phx.aaaaaaaas3fh55h46kadophdob7tj3o26pbcjogxyyqeh2jisrhpqfrrbnqa",
     "ipv6-cidr-block": null,
     "ipv6-public-cidr-block": null,
     "ipv6-virtual-router-ip": null,
     "lifecycle-state": "AVAILABLE",
     "prohibit-public-ip-on-vnic": false,
     "route-table-id": "ocid1.routetable.oc1.phx.aaaaaaaaoycbxt5tkp3e5jei2bu74qnm7x2h3hvydwdtqzc4lciayal466wq",
     "security-list-ids": [
       "ocid1.securitylist.oc1.phx.aaaaaaaadjzgagusqdrppvtp3hrhl4coocyayjozwthirnpjpv2vnfyl4laq"
     ],
     "subnet-domain-name": null,
     "time-created": "2019-11-13T16:07:42.192000+00:00",
     "vcn-id": "ocid1.vcn.oc1.phx.aaaaaaaa7gr26rfluz43crfypp7qp3dscuqsibfrfq6iai7z5sxz5uhel4va",
     "virtual-router-ip": "10.2.1.1",
     "virtual-router-mac": "00:00:17:11:DA:D5"
   },
   "etag": "da7f5e26"
 }

oci network subnet create --cidr-block 10.3.1.0/24 --compartment-id ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa --vcn-id=ocid1.vcn.oc1.phx.aaaaaaaauzosk3jx4mhxwwxngnvx5wco3ckoylqu4nioudm5zgrb5o6w6a7a --display-name=sub1pub --availability-domain="xbee:PHX-AD-1"
 {
   "data": {
     "availability-domain": "xbee:PHX-AD-1",
     "cidr-block": "10.3.1.0/24",
     "compartment-id": "ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa",
     "defined-tags": {},
     "dhcp-options-id": "ocid1.dhcpoptions.oc1.phx.aaaaaaaaka2ja2efstff2el3pw4t46co6jyjx3cq2xl46zr4cstle7s6mlya",
     "display-name": "sub1pub",
     "dns-label": null,
     "freeform-tags": {},
     "id": "ocid1.subnet.oc1.phx.aaaaaaaa5jfcmbumzpebx6svtfp75yqsjtw2r34g3qnfbwvyxo6jps3fytea",
     "ipv6-cidr-block": null,
     "ipv6-public-cidr-block": null,
     "ipv6-virtual-router-ip": null,
     "lifecycle-state": "AVAILABLE",
     "prohibit-public-ip-on-vnic": false,
     "route-table-id": "ocid1.routetable.oc1.phx.aaaaaaaa2wqwmsiu32n3tc33pylycv7u75b66xuycn7ij2ilwwjeju5hofkq",
     "security-list-ids": [
       "ocid1.securitylist.oc1.phx.aaaaaaaaas3yhwpvilophj2nshwfdyp2g3o5vykgooj27xt2kqaqsghc5rjq"
     ],
     "subnet-domain-name": null,
     "time-created": "2019-11-13T16:08:31.031000+00:00",
     "vcn-id": "ocid1.vcn.oc1.phx.aaaaaaaauzosk3jx4mhxwwxngnvx5wco3ckoylqu4nioudm5zgrb5o6w6a7a",
     "virtual-router-ip": "10.3.1.1",
     "virtual-router-mac": "00:00:17:BB:57:17"
   },
   "etag": "16e1d3e4"
 }

Great. Now the interesting part: let’s try to launch a DB system on each VCN.

You have to supply a lot of parameters, so I’m using a JSON file.

The attempt to launch a DB system on VCN1 fails with the error below:

oci db system launch --from-json file://db_19c_vcn1.json
 ServiceError:
 {
   "code": "InvalidParameter",
   "message": "domain name cannot be an empty string.",
   "opc-request-id": "6B793936B1BB4D33BC0DDE6399C21B8B/6049601F5F54B75274E10E48542D39AD/D048ADC38DCDE42EA972F2D3E65FB9AE",
   "status": 400
 }

On VCN2, the DB system launch works (output truncated):

reguchi@macpro bin % ./oci db system launch --from-json file://db_19c_vcn2.json
 {
   "data": {
     "availability-domain": "xbee:PHX-AD-1",
     "backup-network-nsg-ids": null,
     "backup-subnet-id": null,
     "cluster-name": "db19c",
     "compartment-id": "ocid1.compartment.oc1..aaaaaaaayjazkpkwmzys6xolc4kwncsj3p54iluporxw2iens4qutkjxatpa",
     "cpu-core-count": 1,
     "data-storage-percentage": 80,
     "data-storage-size-in-gbs": 256,
     "database-edition": "ENTERPRISE_EDITION_EXTREME_PERFORMANCE",
     "db-system-options": {
       "storage-management": "ASM"
     },
     "defined-tags": {},
     "disk-redundancy": "HIGH",
     "display-name": "myTestDB",
     "domain": "sub1pub.vcn2.oraclevcn.com",
     "fault-domains": [
       "FAULT-DOMAIN-1"
     ],
     "freeform-tags": {},
     "hostname": "db1",
     "id": "ocid1.dbsystem.oc1.phx.abyhqljtpqwao42mvywxggjk5xhk6sg6oez65vjopetg5fjcgvtpychuvcma",
     "iorm-config-cache": null,
     "last-patch-history-entry-id": null,
     "license-model": "BRING_YOUR_OWN_LICENSE",
     "lifecycle-details": null,
     "lifecycle-state": "PROVISIONING",
     "listener-port": 1521,
     "node-count": 1,
     "nsg-ids": null,
     "reco-storage-size-in-gb": 256,
     "scan-dns-record-id": null,
     "scan-ip-ids": null,
     "shape": "VM.Standard1.1",

The bottom line: don’t forget to define a DNS label when provisioning a VCN with the OCI CLI. It is not required, but you may miss it later (as the failed DB system launch shows), and the DNS label cannot be changed after the VCN is created.

Reference: https://docs.cloud.oracle.com/iaas/tools/oci-cli/latest/oci_cli_docs/cmdref/network/vcn/create.html
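
A quick way to check an existing VCN is to look at the dns-label field in its JSON description. The sketch below is a rough illustration that uses plain grep rather than a real JSON parser; the function name has_dns_label and the VCN_OCID variable are made up for the example.

```shell
#!/bin/bash
# Succeeds (exit 0) when the JSON on stdin contains a non-null "dns-label".
# Plain grep, in the same spirit as the other scripts here; not a JSON parser.
has_dns_label() {
    grep -Eq '"dns-label":[[:space:]]*"[^"]+"'
}

# Usage (hypothetical, VCN_OCID is a placeholder variable):
#   oci network vcn get --vcn-id "$VCN_OCID" | has_dns_label \
#       || echo "VCN has no DNS label: a DB system launch may fail"
```

Fed the VCN1 output shown earlier ("dns-label": null) the check fails, while the VCN2 output ("dns-label": "vcn2") passes.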

If you rely on the OCI console, the DNS checkbox is checked and you are asked to define the DNS label, just like in the image below 🙂

OGB Appreciation Day: parallel ops #ThanksOGB

Time is money so let’s get things done faster with parallel operations like:

  • Expdp/Impdp:
expdp scott/tiger@db10g schemas=SCOTT directory=TEST_DIR parallel=4 dumpfile=SCOTT_%U.dmp logfile=expdpSCOTT.log

impdp scott/tiger@db10g schemas=SCOTT directory=TEST_DIR parallel=4 dumpfile=SCOTT_%U.dmp logfile=impdpSCOTT.log

Must read note here.

  • RMAN
CONFIGURE DEVICE TYPE disk PARALLELISM 2;
CONFIGURE DEVICE TYPE sbt PARALLELISM 2;


RMAN always allocates the number of channels specified in PARALLELISM, using specifically configured channels if you have configured them and generic channels if you have not. Note that if you configure specific channels with numbers higher than the parallelism setting, RMAN will not use these channels.

  • Parallel Execution of SQL statements

Nice Oracle Paper here and of course Oracle docs.

  • Parallel Upgrade Utility (catctl.pl/dbupgrade)
$ORACLE_HOME/bin/dbupgrade -n 4

The -n option specifies the number of processes to use for parallel operations.

Non-CDBs: The -n parameter specifies the number of SQL processes to use when upgrading the database.

Multitenant architecture databases (CDBs): The number of PDBs upgraded concurrently is controlled by the value of the -n parameter. Multiple PDB upgrades are processed together. Starting in Oracle Database 12c, the default value for multitenant architecture databases is the number of CPUs on your system. A cpu_count equal to 24 equates to a default value of 24 for -n.

Values for the -n parameter:

Non-CDBs: The maximum value for -n is 8. The minimum value is 1. The default value is 4.

Multitenant architecture databases (CDBs): The maximum value for -n is unlimited. The minimum value is 4. The maximum PDB upgrades running concurrently is the value of -n divided by the value of -N.

-N option specifies the number of SQL processors to use when upgrading PDBs.

For non-CDBs, this parameter is ignored.
For CDBs, the maximum value is 8. The minimum value is 1. The default value is 2.

So, without specifying values for -n and -N (that is, accepting the default value of -N, which is 2, and the default value of -n, which is the CPU_COUNT parameter value, 24 in this example), the following parallel processing occurs:

12 PDBs are upgraded in parallel (CPU_COUNT divided by 2)
2 parallel processes run for each PDB
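
The arithmetic is simply -n divided by -N. A tiny sketch (the function name pdb_concurrency is made up for illustration):

```shell
#!/bin/bash
# Number of PDBs upgraded concurrently: -n divided by -N (integer division).
# $1 = value of -n, $2 = value of -N (defaults to 2, the documented default).
pdb_concurrency() {
    echo $(( $1 / ${2:-2} ))
}

pdb_concurrency 24 2   # prints 12, matching the example above
pdb_concurrency 8 4    # prints 2
```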

That is what came off the top of my head, so there are surely more parallel options 🙂

Thanks, Tim, for getting the community together every year! #ThanksOGB

GUOB TECH DAY / Oracle Groundbreakers LAD Tour 2019 – about the event

In my opinion the event was, in every respect, better than last year's. The organizers take continuous improvement very seriously, which is why it is so important that everyone fills in the evaluation form so that next year's event is even better!

The location, the facilities, the speakers and the audience were incredible!

In every talk I attended (and in the one I gave) the room was full of people interested in learning and exchanging experiences.

We had Oracle Product Managers bringing exclusive content and making themselves very available to answer questions. I asked Ricardo Gonzalez (PM for Oracle Cloud Migration, ACFS & RHP) a lot of questions myself and ended up winning an event mug 🙂

The last talk was by Alex Gorbachev (CTO of Pythian), about how his career has evolved and how he sees the future of the Oracle DBA. Yes, we will have to change and evolve a lot!

In short: the best Oracle technology event in the country, better than Oracle Open World Brasil by far.

Missed this year's event? Sign up for free at guob.com.br and watch the available talks.

But don't miss next year's 🙂

ORA-19563 on rman restore

I hit error ORA-19563 during rman restore last week.

I was restoring a database backup from Oracle 11.2 on ASM to an Oracle 11.2 database on a filesystem.

Although it describes a different procedure, MOS note RMAN Duplicate Fails With RMAN-06136 ORA-19563 (Doc ID 453123.1) helped me with this issue by mentioning that I could have duplicate filenames.

There were indeed duplicate filenames!

My restore script was:

run {
     set newname for database to '/oradata/%b';
     restore database;
     switch datafile all;
 }

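The %b variable expands to the file name without its directory path, so two datafiles that share a name in different source directories are restored to the same path. This means that datafile 55 was overwritten by datafile 56, hence the RMAN error at the switch step.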

Quick fix:

  • remove datafile 56
  • restore only datafiles 55 and 56, using the set newname clause with a different filename for each:
run {
         set newname for datafile 55 to '/oradata/svm910t02.dbf';
         set newname for datafile 56 to '/oradata/svm910t03.dbf';
         restore datafile 55;
         restore datafile 56;
 }
  • run the original script again; this time it will only perform the switch step

How to prevent this from happening again ?

  • Check if you have duplicate filenames before restoring:
select substr(file_name, instr(file_name, '/', -1)) file_name, count(*) from dba_data_files group by substr(file_name, instr(file_name, '/', -1)) having count(*) > 1
/
  • If there are, run the RMAN restore like this, giving one of the duplicates its own name:
run {
         set newname for datafile 56 to '/oradata/svm910t03.dbf';
         set newname for database to '/oradata/%b';
         restore database;
         switch datafile all;
 }
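
The same duplicate check can also be run outside the database on a plain list of datafile paths. The sketch below is only an illustration of why '%b' collides: the function name and the sample ASM paths are made up.

```shell
#!/bin/bash
# Read datafile paths on stdin (one per line) and print every base name
# that occurs more than once. These are the names that would collide
# when '%b' maps all files into a single directory.
duplicate_basenames() {
    awk -F/ '{ print $NF }' | sort | uniq -d
}

# Example with made-up ASM paths:
printf '%s\n' \
    '+DATA/prod/datafile/svm910t02.dbf' \
    '+DATA2/prod/datafile/svm910t02.dbf' \
    '+DATA/prod/datafile/users01.dbf' | duplicate_basenames
# prints: svm910t02.dbf
```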

More information can be found at Oracle 11.2 docs here.

Restart from a failed Oracle upgrade

Interesting situation this morning.

I was performing a database upgrade from 11.2 to 12.2 when my VPN crashed.

I realized that I had forgotten to start the upgrade inside a Linux screen session, which meant that my upgrade process was lost.

Well, Oracle 12.2 has the ability to resume a failed upgrade process from the failed step automatically !

From the Oracle Docs:

Oracle Database 12c release 2 (12.2) includes a new Resume option for Parallel Upgrade Utility. This option is available for both CDBs and Non-CDBs. You are not required to identify failed or incomplete phases when you rerun or restart the upgrade. When you use the Parallel Upgrade Utility using the resume option (-R), the utility automatically detects phases from the previous upgrade that are not completed successfully. The Parallel Upgrade Utility then reruns or restarts just these phases that did not complete successfully, so that the upgrade is completed. Bypassing steps that already completed successfully reduces the amount of time it takes to rerun the upgrade. 

So I just ran $ORACLE_HOME/bin/dbupgrade -n 4 -R -l $ORACLE_HOME/diagnostics and the upgrade process was restarted.
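
To avoid losing the process to a dropped connection again, the command can be started detached from the terminal. The helper below is a minimal sketch: the function name upgrade_cmd and the output file /tmp/dbupgrade.out are assumptions, and it only assembles the same flags used above.

```shell
#!/bin/bash
# Build the dbupgrade command line used in this post.
# $1 = Oracle home, $2 = parallelism for -n, $3 = log directory,
# $4 = "resume" to add -R (optional).
upgrade_cmd() {
    cmd="$1/bin/dbupgrade -n $2 -l $3"
    [ "$4" = "resume" ] && cmd="$cmd -R"
    printf '%s\n' "$cmd"
}

# Usage (hypothetical; assumes the paths contain no spaces):
#   nohup $(upgrade_cmd "$ORACLE_HOME" 4 "$ORACLE_HOME/diagnostics" resume) \
#       > /tmp/dbupgrade.out 2>&1 &
```

Running the upgrade inside a screen session (screen -S upgrade, then screen -r upgrade to reattach) achieves the same protection and keeps the output interactive.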

Cool new feature, right ?

More info about it here.