I recently had to install an Oracle Database Appliance X11 HA, and the deployment failed when creating the appliance:
[root@oak0 ~]# odacli create-appliance -r /u01/patch/my_new_oda.json
...
[root@mynewoda ~]# odacli describe-job -i 88e4b5e3-3a73-4c18-9d9f-960151abc45e
Job details
----------------------------------------------------------------
ID: 88e4b5e3-3a73-4c18-9d9f-960151abc45e
Description: Provisioning service creation
Status: Failure (To view Error Correlation report, run "odacli describe-job -i 88e4b5e3-3a73-4c18-9d9f-960151abc45e --ecr" command)
Created: April 23, 2025 16:15:35 CEST
Message: DCS-10001:Internal error encountered: Failed to provision GI with RHP at the home: /u01/app/19.26.0.0/grid: DCS-10001:Internal error encountered: PRGH-1002 : Failed to copy files from /opt/oracle/rhp/RHPCheckpoints/rhptemp/grid8631129022929485455.rsp to /opt/oracle/rhp/RHPCheckpoints/wOraGrid192600
PRKC-1191 : Remote command execution setup check for node mynewoda using shell /usr/bin/ssh failed.
No ECDSA host key is known for mynewoda and you have requested strict checking. Host key verification failed...
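The root cause is SSH strict host-key checking: root on the node has no (or a stale) ECDSA entry for mynewoda in its known_hosts when the passwordless SSH setup runs. Purely for diagnosis (this is not part of the documented fix), you can check and refresh the entry with standard OpenSSH tools:
[root@mynewoda ~]# ssh-keygen -F mynewoda         # is there a known_hosts entry for the host?
[root@mynewoda ~]# ssh-keygen -R mynewoda         # remove a stale entry
[root@mynewoda ~]# ssh-keyscan -t ecdsa mynewoda  # print the current ECDSA host key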
In the past, this "host key verification failed" error occurred sporadically, and simply rerunning the "odacli create-appliance" command was enough. However, this time restarting was not possible:
[root@mynewoda ~]# odacli create-appliance -r /u01/patch/my_new_oda.json
DCS-10047:Same job is already running: Provisioning FAILED in different request.
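If you want to see the job that is blocking the new attempt, the standard job listing shows it (output omitted here):
[root@mynewoda ~]# odacli list-jobs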
Following MOS note "ODA Provisioning Fails to Create Appliance w/ Error: DCS-10047:Same Job is already running : Provisioning FAILED in different request. (Doc ID 2809836.1)", I cleaned up the ODA and updated the repository with the Grid Infrastructure and DB clones:
Stop the DCS agent on both nodes:
# systemctl stop initdcsagent
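To make sure the agent is really down on both nodes before starting the cleanup, a quick status check helps (standard systemd commands):
# systemctl status initdcsagent
# systemctl is-active initdcsagent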
Then run cleanup.pl on ODA node 0:
# /opt/oracle/oak/onecmd/cleanup.pl -f
...
If you get warnings that the cleanup cannot transfer the public key to node 1 or cannot set up SSH equivalence, then run the cleanup on node 1 as well (same command, shown below).
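The invocation on node 1 is the same as on node 0:
# /opt/oracle/oak/onecmd/cleanup.pl -f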
At the end of the cleanup output, you get these messages:
WARNING: After system reboot, please re-run "odacli update-repository" for GI/DB clones,
WARNING: before running "odacli create-appliance".
So, after the reboot, I updated the repository with the GI and DB clones:
[root@oak0 patch]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/patch/odacli-dcs-19.26.0.0.0-250127-GI-19.26.0.0.zip
...
[root@oak0 patch]# odacli describe-job -i 674f7c66-1615-450f-be27-4e4734abca97
Job details
----------------------------------------------------------------
ID: 674f7c66-1615-450f-be27-4e4734abca97
Description: Repository Update
Status: Success
Created: April 23, 2025 14:37:29 UTC
Message: /u01/patch/odacli-dcs-19.26.0.0.0-250127-GI-19.26.0.0.zip
...
[root@oak0 patch]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/patch/odacli-dcs-19.26.0.0.0-250127-DB-19.26.0.0.zip
...
[root@oak0 patch]# odacli describe-job -i 4299b124-1c93-4d22-bac4-44a65cbaac67
Job details
----------------------------------------------------------------
ID: 4299b124-1c93-4d22-bac4-44a65cbaac67
Description: Repository Update
Status: Success
Created: April 23, 2025 14:39:34 UTC
Message: /u01/patch/odacli-dcs-19.26.0.0.0-250127-DB-19.26.0.0.zip
...
Then I checked that the clones are available on node 0:
[root@oak0 patch]# ls -ltrh /opt/oracle/oak/pkgrepos/orapkgs/clones
total 12G
-rwxr-xr-x 1 root root 6.0G Jan 28 03:33 grid19.250121.tar.gz
-rwxr-xr-x 1 root root 21 Jan 28 03:34 grid19.250121.tar.gz.info
-r-xr-xr-x 1 root root 5.4G Jan 28 03:42 db19.250121.tar.gz
-rw-rw-r-- 1 root root 19K Jan 28 03:42 clonemetadata.xml
-rw-rw-r-- 1 root root 21 Jan 28 03:43 db19.250121.tar.gz.info
[root@oak0 patch]#
The same on node 1:
[root@oak1 ~]# ls -ltrh /opt/oracle/oak/pkgrepos/orapkgs/clones
total 12G
-rwxr-xr-x 1 root root 6.0G Jan 28 03:33 grid19.250121.tar.gz
-rwxr-xr-x 1 root root 21 Jan 28 03:34 grid19.250121.tar.gz.info
-r-xr-xr-x 1 root root 5.4G Jan 28 03:42 db19.250121.tar.gz
-rw-rw-r-- 1 root root 19K Jan 28 03:42 clonemetadata.xml
-rw-rw-r-- 1 root root 21 Jan 28 03:43 db19.250121.tar.gz.info
[root@oak1 ~]#
Before running create-appliance again, you should first validate the storage topology on both nodes:
[root@oak0 ~]# odacli validate-storagetopology
INFO : ODA Topology Verification
INFO : Running on Node0
INFO : Check hardware type
INFO : Check for Environment(Bare Metal or Virtual Machine)
SUCCESS : Type of environment found : Bare Metal
INFO : Check number of Controllers
SUCCESS : Number of onboard OS disk found : 2
SUCCESS : Number of External SCSI controllers found : 2
INFO : Check for Controllers correct PCIe slot address
SUCCESS : Internal RAID controller :
SUCCESS : External LSI SAS controller 0 : 61:00.0
SUCCESS : External LSI SAS controller 1 : e1:00.0
INFO : Check for Controller Type in the System
SUCCESS : There are 2 SAS 38xx controller in the system
INFO : Check if JBOD powered on
SUCCESS : 1JBOD : Powered-on
INFO : Check for correct number of EBODS(2 or 4)
SUCCESS : EBOD found : 2
INFO : Check for External Controller 0
SUCCESS : Controller connected to correct EBOD number
SUCCESS : Controller port connected to correct EBOD port
SUCCESS : Overall Cable check for controller 0
INFO : Check for External Controller 1
SUCCESS : Controller connected to correct EBOD number
SUCCESS : Controller port connected to correct EBOD port
SUCCESS : Overall Cable check for Controller 1
INFO : Check for overall status of cable validation on Node0
SUCCESS : Overall Cable Validation on Node0
INFO : Check Node Identification status
SUCCESS : Node Identification
SUCCESS : Node name based on cable configuration found : NODE0
INFO : The details for Storage Topology Validation can also be found in the log file=/opt/oracle/oak/diag/oak0/oak/storagetopology/StorageTopology-2025-04-23-14:42:34_70809_7141.log
[root@oak0 ~]#
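The same check must be run on node 1; the command is identical, only the prompt changes (output omitted):
[root@oak1 ~]# odacli validate-storagetopology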
Not validating the storage topology on both nodes may lead to the following error when creating the appliance again:
OAK-10011:Failure while running storage setup on the system. Cause: Node number set on host not matching node number returned by storage topology tool. Action: Node number on host not set correctly. For default storage shelf node number needs to be set by storage topology tool itself.
Afterwards, "odacli create-appliance" should complete successfully.
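For completeness, rerunning and monitoring the provisioning uses the same commands as at the beginning of this post (the job ID is a placeholder for the one returned by create-appliance):
[root@oak0 ~]# odacli create-appliance -r /u01/patch/my_new_oda.json
[root@oak0 ~]# odacli list-jobs
[root@oak0 ~]# odacli describe-job -i <job-id>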
Summary
If your "odacli create-appliance" fails on an ODA HA environment and you cannot restart it, run a cleanup, update the repository with the Grid Infrastructure and DB clones, and validate the storage topology before running create-appliance again.
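Condensed to the commands used in this post (the zip and JSON file names are from this example and will differ in your environment):
# systemctl stop initdcsagent                      (on both nodes)
# /opt/oracle/oak/onecmd/cleanup.pl -f             (on node 0, and on node 1 if needed)
# (reboot as requested by the cleanup output)
# odacli update-repository -f /u01/patch/odacli-dcs-19.26.0.0.0-250127-GI-19.26.0.0.zip
# odacli update-repository -f /u01/patch/odacli-dcs-19.26.0.0.0-250127-DB-19.26.0.0.zip
# odacli validate-storagetopology                  (on both nodes)
# odacli create-appliance -r /u01/patch/my_new_oda.json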