The standard Flash GUI of the current Oracle Support site is driving me crazy. For a guy like me, used to working mostly with console applications, addicted to VIM and to browsing with Vimperator, having this Flash site in front of me really hurts. It hurts when I have to move my hand to click something with the mouse, it hurts when I cannot go back to the previous page, it hurts... it hurts... it hurts! Oracle Support is not a game; it doesn't need animation, video or the other things Flash is good at.
Okay, enough with all these complaints! If you don't like Flash, or your browser simply doesn't support it, you may switch to the plain HTML interface: https://supporthtml.oracle.com. Isn't it cool? Just have a look:
Now I can use my beloved Vimperator and, of course, Flashblock.
However, I have to point out the limitations. The HTML option does not include the following functionality, which is only available in the Flash version of My Oracle Support:
* Systems
* Projects
* Healthchecks
* Patch Advice & Recommendations
* Inventory Reporting
* OnDemand Portal, Service Request and RFC Functionality
* CRM OnDemand Service Requests & Knowledge
As far as I'm concerned, I didn't use those features anyway, so I'm good!
Monday, January 24, 2011
crs_stat pretty print on RAC 11.2
I really don't like the way
crs_stat -t
displays RAC status information on my 11.2.0.2 database. For example, one of my instances is down. Can you figure this out by looking at the output below?

[grid@owl ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    hen
ora....ER.lsnr ora....er.type ONLINE    ONLINE    hen
ora....N1.lsnr ora....er.type ONLINE    ONLINE    hen
ora....N2.lsnr ora....er.type ONLINE    ONLINE    hen
ora....N3.lsnr ora....er.type ONLINE    ONLINE    owl
ora.asm        ora.asm.type   ONLINE    ONLINE    hen
ora.cvu        ora.cvu.type   ONLINE    ONLINE    hen
ora.gns        ora.gns.type   ONLINE    ONLINE    hen
ora.gns.vip    ora....ip.type ONLINE    ONLINE    hen
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....SM2.asm application    ONLINE    ONLINE    hen
ora....EN.lsnr application    ONLINE    ONLINE    hen
ora.hen.gsd    application    OFFLINE   OFFLINE
ora.hen.ons    application    ONLINE    ONLINE    hen
ora.hen.vip    ora....t1.type ONLINE    ONLINE    hen
ora....network ora....rk.type ONLINE    ONLINE    hen
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    hen
ora.ons        ora.ons.type   ONLINE    ONLINE    hen
ora....SM1.asm application    ONLINE    ONLINE    owl
ora....WL.lsnr application    ONLINE    ONLINE    owl
ora.owl.gsd    application    OFFLINE   OFFLINE
ora.owl.ons    application    ONLINE    ONLINE    owl
ora.owl.vip    ora....t1.type ONLINE    ONLINE    owl
ora.poc.db     ora....se.type ONLINE    ONLINE    hen
ora....uci.svc ora....ce.type ONLINE    ONLINE    hen
ora....ry.acfs ora....fs.type ONLINE    ONLINE    hen
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    hen
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    hen
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    owl

The resource "ora.poc.db" is ONLINE, therefore no red flags, you might say. Well, bad luck: the database is up & running, but with only one instance; the other one is dead. The database is policy managed, but I want to be aware if all instances from the pool are running. How can we figure this out? Not a very big deal: just issue "crsctl status resource". You'll get something like this:
NAME=ora.poc.db
TYPE=ora.database.type
TARGET=OFFLINE, ONLINE
STATE=OFFLINE, ONLINE on hen
Looking at TARGET and STATE, it is clear that something is OFFLINE, and the red flag pops up right away. Of course, the "crsctl status resource" command has the same problem as crs_stat when it comes to pretty-printing the result. It also has a tabular format (see the -t switch), but it's a little bit verbose, as it also displays the status of the local resources. But hey, do you remember Note 259301.1? It was about an awk script used for parsing the crs_stat output and displaying it in a nicer way. Okay, let's take that script and change it to consume the output of the "crsctl status resource" command. I'm not an awk expert, but the following script works pretty well:
#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
#   - Returns formatted version of crs_stat -t, in tabular
#     format, with the complete rsc names and filtering keywords
#   - The argument, $RSC_KEY, is optional and if passed to the script, will
#     limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
#   - $ORA_CRS_HOME should be set in your environment

RSC_KEY=$1
QSTAT=-u
AWK=/bin/awk    # if not available use /usr/bin/awk

# Table header:
echo ""
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Table body:
$ORACLE_HOME/bin/crsctl status resource | $AWK \
 'function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
  function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
  function trim(s)  { return rtrim(ltrim(s)); }
  BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; split(apptarget, atarget, ","); state=2;}
  $1~/STATE/  && state == 2 {appstate = $2; split(appstate, astate, ","); state=3;}
  state == 3 { split(appname, a, ",");
               for (i = 1; i <= length(atarget); i++) {
                 printf "%-45s %-10s %-18s\n", appname, trim(atarget[i]), trim(astate[i])
               };
               state=0; }'

And the output is:
[grid@owl ~]$ cs
HA Resource                                   Target     State
-----------                                   ------     -----
ora.DATA.dg                                   ONLINE     ONLINE on hen
ora.DATA.dg                                   ONLINE     ONLINE on owl
ora.LISTENER.lsnr                             ONLINE     ONLINE on hen
ora.LISTENER.lsnr                             ONLINE     ONLINE on owl
ora.LISTENER_SCAN1.lsnr                       ONLINE     ONLINE on hen
ora.LISTENER_SCAN2.lsnr                       ONLINE     ONLINE on hen
ora.LISTENER_SCAN3.lsnr                       ONLINE     ONLINE on owl
ora.asm                                       ONLINE     ONLINE on hen
ora.asm                                       ONLINE     ONLINE on owl
ora.cvu                                       ONLINE     ONLINE on hen
ora.gns                                       ONLINE     ONLINE on hen
ora.gns.vip                                   ONLINE     ONLINE on hen
ora.gsd                                       OFFLINE    OFFLINE
ora.gsd                                       OFFLINE    OFFLINE
ora.hen.vip                                   ONLINE     ONLINE on hen
ora.net1.network                              ONLINE     ONLINE on hen
ora.net1.network                              ONLINE     ONLINE on owl
ora.oc4j                                      ONLINE     ONLINE on hen
ora.ons                                       ONLINE     ONLINE on hen
ora.ons                                       ONLINE     ONLINE on owl
ora.owl.vip                                   ONLINE     ONLINE on owl
ora.poc.db                                    OFFLINE    OFFLINE
ora.poc.db                                    ONLINE     ONLINE on hen
ora.poc.muci.svc                              ONLINE     ONLINE on hen
ora.poc.muci.svc                              ONLINE     OFFLINE
ora.registry.acfs                             ONLINE     ONLINE on hen
ora.registry.acfs                             ONLINE     ONLINE on owl
ora.scan1.vip                                 ONLINE     ONLINE on hen
ora.scan2.vip                                 ONLINE     ONLINE on hen
ora.scan3.vip                                 ONLINE     ONLINE on owl
Looking at the above output, I can clearly see the partial OFFLINE status of my database. From my point of view, this is much better.
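One practical note: "cs" in the listing above is simply the name under which I saved the script, somewhere in the grid user's PATH; that name (and the ~/bin location below) is just my choice. The optional argument filters by resource name:

chmod +x ~/bin/cs      # assuming the script was saved as ~/bin/cs
cs                     # full listing, as shown above
cs ora.poc             # only resources whose names match "ora.poc"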
Tags:
RAC
Tuesday, December 14, 2010
Kill a Session From Any Node
I really like this new 11g feature which allows the DBA to kill a session even when his session is on a different instance than the one where the session to be killed resides. The ALTER SYSTEM KILL SESSION statement has been improved and now allows specifying the instance number where the target session is located:
ALTER SYSTEM KILL SESSION 'sid, serial#, @inst_no';
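For example, a hypothetical kill (the SID 42 and serial# 1234 are made up) of a session living on instance 2, issued from a session on any other node:

-- session 42,1234 runs on instance 2; we can kill it from any instance
ALTER SYSTEM KILL SESSION '42, 1234, @2';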
Great!
Tags:
RAC
Saturday, December 11, 2010
Extending my RAC with a new node
I have an 11.2.0.2 database comprised of one node. I created it with one node on purpose, just to have the chance to add another node later. Why? Because I wanted to play with this new GPnP feature. So, even though my RAC was comprised of one node, it was actually a fully functional environment, with GNS, IPMI, CTSS and a policy-managed database. Okay, the process should be straightforward: run some CVU checks to see if the node to be added is ready, and then run the addNode.sh script from the GI home of the existing RAC node. In my case, the existing node was named "owl" and the node to be added was "hen".
First of all, I ran:
[grid@owl bin]$ cluvfy stage -pre nodeadd -n hen

Performing pre-checks for node addition

Checking node reachability...
Node reachability check passed from node "owl"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"
Node connectivity check passed

Checking CRS integrity...
CRS integrity check passed

Checking shared resources...
Checking CRS home location...
The location "/u01/app/11.2.0.2/grid" is not shared but is present/creatable on all nodes
Shared resources check for node addition passed

Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"
Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "owl:/tmp"
Free disk space check passed for "hen:/tmp"
Check for multiple users with UID value 1100 passed
User existence check passed for "grid"
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Checking OCR integrity...
OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...
Oracle Cluster Voting Disk configuration check passed
Time zone consistency check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
No NTP Daemons or Services were found to be running
Clock synchronization check using Network Time Protocol(NTP) passed

User "grid" is not part of "root" group. Check passed

Checking consistency of file "/etc/resolv.conf" across nodes
File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes
File "/etc/resolv.conf" is consistent across nodes

Checking GNS integrity...
The GNS subdomain name "vmrac.fits.ro" is a valid domain name
GNS VIP "poc-gns-vip.vmrac.fits.ro" resolves to a valid IP address
PRVF-5229 : GNS VIP is active before Clusterware installation
PRVF-5232 : The GNS subdomain qualified host name "hen.vmrac.fits.ro" was resolved into an IP address
GNS integrity check failed

Pre-check for node addition was unsuccessful on all the nodes.

PRVF-5229 is really a strange error: of course the GNS VIP is active, because I already have my RAC installed. It makes sense when installing a brand new RAC, where the GNS VIP should still be unallocated, but otherwise I don't get it. So, I decided to go on even though CVU was complaining.
The next step would be to run the addNode.sh script from the [GI_HOME]/oui/bin location. I ran the script and found that it does nothing if the CVU checks do not pass. You can see this if you run the script with debugging:
[grid@owl bin]$ sh -x ./addNode.sh -silent "CLUSTER_NEW_NODES={hen}"
+ OHOME=/u01/app/11.2.0.2/grid
+ INVPTRLOC=/u01/app/11.2.0.2/grid/oraInst.loc
+ ADDNODE='/u01/app/11.2.0.2/grid/oui/bin/runInstaller -addNode -invPtrLoc /u01/app/11.2.0.2/grid/oraInst.loc ORACLE_HOME=/u01/app/11.2.0.2/grid -silent CLUSTER_NEW_NODES={hen}'
+ '[' '' = Y -o '!' -f /u01/app/11.2.0.2/grid/cv/cvutl/check_nodeadd.pl ']'
+ CHECK_NODEADD='/u01/app/11.2.0.2/grid/perl/bin/perl /u01/app/11.2.0.2/grid/cv/cvutl/check_nodeadd.pl -pre -silent CLUSTER_NEW_NODES={hen}'
+ /u01/app/11.2.0.2/grid/perl/bin/perl /u01/app/11.2.0.2/grid/cv/cvutl/check_nodeadd.pl -pre -silent 'CLUSTER_NEW_NODES={hen}'
+ '[' 1 -eq 0 ']'
As you can see, the check_nodeadd.pl script ends with a non-zero exit code, which means an error (this perl script actually runs the cluvfy utility, so it fails because of the GNS check). The only workaround I found was to skip this check using:
export IGNORE_PREADDNODE_CHECKS=Y
After that, I was able to successfully run the addNode.sh script:
[grid@owl bin]$ ./addNode.sh -silent "CLUSTER_NEW_NODES={hen}"
Starting Oracle Universal Installer...

... output truncated ...

Saving inventory on nodes (Friday, December 10, 2010 8:49:27 PM EET)
.                                                           100% Done.
Save inventory complete
WARNING:A new inventory has been created on one or more nodes in this session. However, it has not yet been registered as the central inventory of this system.
To register the new inventory please run the script at '/u01/app/oraInventory/orainstRoot.sh' with root privileges on nodes 'hen'.
If you do not register the inventory, you may not be able to update or patch the products you installed.
The following configuration scripts need to be executed as the "root" user in each cluster node.
/u01/app/oraInventory/orainstRoot.sh #On nodes hen
/u01/app/11.2.0.2/grid/root.sh #On nodes hen
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node

The Cluster Node Addition of /u01/app/11.2.0.2/grid was successful.
Please check '/tmp/silentInstall.log' for more details.
Okay, GREAT! Let's run those scripts on the new node:
[root@hen app]# /u01/app/oraInventory/orainstRoot.sh
Creating the Oracle inventory pointer file (/etc/oraInst.loc)
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.

[root@hen app]# /u01/app/11.2.0.2/grid/root.sh
Running Oracle 11g root script...

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME= /u01/app/11.2.0.2/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0.2/grid/crs/install/crsconfig_params
Creating trace directory
PROTL-16: Internal Error
Failed to create or upgrade OLR
Failed to create or upgrade OLR at /u01/app/11.2.0.2/grid/crs/install/crsconfig_lib.pm line 6740.
/u01/app/11.2.0.2/grid/perl/bin/perl -I/u01/app/11.2.0.2/grid/perl/lib -I/u01/app/11.2.0.2/grid/crs/install /u01/app/11.2.0.2/grid/crs/install/rootcrs.pl execution failed
Oops! I did not see that coming! First of all, OLR?! Yeah, it's like the OCR, but local. The only note I found about this error was 1123453.1, and it advises double-checking that all install prerequisites pass using cluvfy. In my case, the only problem I had was with the GNS check. Does GNS have anything to do with my error? As it turned out, no, it doesn't! The big mistake I made (and cluvfy didn't notice it) was that the SSH setup between the nodes was wrong: connecting from owl to hen was fine, but not vice-versa. After I fixed the SSH configuration, the root.sh script executed without any problems. Great!
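For the record, this is the sanity check I should have run up front; a trivial sketch, as the grid user, in both directions:

[grid@owl ~]$ ssh hen hostname    # must print "hen" without a password prompt
[grid@hen ~]$ ssh owl hostname    # must print "owl" without a password prompt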
The next step was to clone the database Oracle home. That was really easy: just run addNode.sh the same way I did for the GI. So far so good... at this point I was expecting a little magic to happen. Look at what the documentation says:
If you store your policy-managed database on Oracle Automatic Storage Management (Oracle ASM), Oracle Managed Files (OMF) is enabled, and if there is space in a server pool for node2, then crsd adds the Oracle RAC instance to node2 and no further action is necessary. If OMF is not enabled, then you must manually add undo and redo logs.
Hey, that's my case! Unfortunately, the new instance didn't show up, even though the pool configuration was clearly asking for a new node:
[oracle@hen oracle]$ srvctl config srvpool -g poc
Server pool name: poc
Importance: 10, Min: 2, Max: -1
Candidate server names:

Look, I had increased the importance level and set the "Min" property to 2. Damn it! I don't know why the new server was not automatically picked up; maybe it's also my lack of experience with this new server pools concept. In the end, I launched dbca from the newly added node, hoping that some new magic options had been added. But no... even the "Instance Management" option was disabled. However, if you choose "Configure database" and go next, next, next until the SYSDBA credentials are requested, then dbca will try to connect to the local instance and will actually create the new instance. I'm sure this is not the way it was supposed to work but, at least, I could see some results. However, there was another interesting thing. Looking into the alert log of the newly created instance, I found:
Could not open audit file: /u01/app/oracle/admin/poc/adump/poc_2_ora_18197_1.aud
Retry Iteration No: 1   OS Error: 2
Retry Iteration No: 2   OS Error: 2
Retry Iteration No: 3   OS Error: 2
Retry Iteration No: 4   OS Error: 2
Retry Iteration No: 5   OS Error: 2
OS Audit file could not be created; failing after 5 retries

I hadn't created the /u01/app/oracle/admin/poc/adump folder on the new node, and that was causing the error. So, this is another thing to remember: the addNode.sh cloning process does not create the "adump" location automatically. The fix is shown below.
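A one-liner sketch of that fix (the path simply mirrors the audit_file_dest used by the existing instance):

[oracle@hen ~]$ mkdir -p /u01/app/oracle/admin/poc/adump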
And that's all! Now my fancy RAC has a new baby node.
Tags:
RAC
Wednesday, December 08, 2010
Upgrade GI to 11.2.0.2: Simply Surprising...
I never thought I'd write a post about such a trivial task... Well, if you are going to upgrade from 11.2.0.1 to 11.2.0.2 be prepared for surprises.
The first surprise comes from the download page on the Oracle support site (formerly known as Oracle Metalink). The 11.2.0.2 patch set weighs 4.8G! WTF?! Furthermore, it is split into 7 pieces... Despite this huge size, the good thing is that, unlike previous releases, this patch set can be used as a self-contained Oracle installer. This means you don't have to install a base 11.2.0.1 release and then apply the 11.2.0.2 patch set on top of it: you may simply install the 11.2.0.2 release directly. There's one more catch: if you want to upgrade just the Grid Infrastructure, you don't need all 7 pieces of the patch set. This is not clearly mentioned on the download page, but if you have the curiosity to open the README (and you should!) you'll find the following:
Great! So, to begin with, we need just the 3rd piece in order to upgrade our Grid Infrastructure.
The second surprise is the fact that the GI cannot be upgraded in place. In previous releases we used to patch into an existing home location; starting with 11.2.0.2, in-place upgrades of the GI are no longer supported. According to the "Upgrade" guide:
As of Oracle Database 11g release 2 (11.2), the Oracle Clusterware software must be upgraded to a new home location in the Oracle grid infrastructure home. Additionally, Oracle ASM and Oracle Clusterware (and Oracle Restart for single-instance databases) must run in the same Oracle grid infrastructure home. When upgrading Oracle Clusterware to release 11.2, OUI automatically calls Oracle ASM Cluster Assistant (ASMCA) to perform the upgrade into the grid infrastructure home.
Okay, good to know! Let's start the GI upgrade process. The wizard provided by the OUI is quite intuitive, therefore I will not bother you with screenshots and other obvious things. However, the next surprise comes when you run the
rootupgrade.sh
script. The error is:

Failed to add (property/value):('OLD_OCR_ID/'-1') for checkpoint:ROOTCRS_OLDHOMEINFO.Error code is 256
The fixes for bug 9413827 are not present in the 11.2.0.1 crs home
Apply the patches for these bugs in the 11.2.0.1 crs home and then run rootupgrade.sh
/oragi/perl/bin/perl -I/oragi/perl/lib -I/oragi/crs/install /oragi/crs/install/rootcrs.pl execution failed

WTF? You cannot patch if you don't have another stupid patch already there. Okay, as an Oracle DBA you have to be a patient guy... take a deep breath and start looking for bug 9413827. First of all there is note 10036834.8, which basically says that you might still get this error even if you apply the patch for bug 9413827. As a workaround, they suggest also applying the patch for bug 9655006. That's madness! In the end, it turns out that patch 9655006 is actually the July 11.2.0.1.2 PSU. Okay, just download the appropriate version for your platform. Now, another surprise... you need an updated version of the OPatch utility. Damn it! Back to metalink: search for patch 6880880 and download the 11.2.0.0.0 version for your platform. (Take care not to download the wrong version. By the way, did you notice that you can download a wget script to fetch the patch without using a browser? Yeah, finally something good on that shitty Flash GUI.) The README suggests unzipping the updated OPatch utility directly into your CRS home, using something like:
unzip [p6880880...zip] -d [your GI home]

... which I did!
Now, you have to unzip the PSU patch into an empty folder, let's say /u01/stage, and run the following command as root:
[your GI home]/OPatch/opatch auto /u01/stage/ -och [your GI home]

In my case, the output was:
Executing /usr/bin/perl /u01/app/11.2.0.1/grid/OPatch/crs/patch112.pl -patchdir /u01 -patchn stage -och /u01/app/11.2.0.1/grid/ -paramfile /u01/app/11.2.0.1/grid/crs/install/crsconfig_params
2010-12-08 12:32:19: Parsing the host name
2010-12-08 12:32:19: Checking for super user privileges
2010-12-08 12:32:19: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0.1/grid/crs/install/crsconfig_params
The opatch Component check failed. This patch is not applicable for /u01/app/11.2.0.1/grid/
The opatch Component check failed. This patch is not applicable for /u01/app/11.2.0.1/grid/
Patch Component/Conflict check failed for /u01/app/11.2.0.1/grid/

Oops! Another surprise! This patch is not applicable for bla bla bla? Are you serious? Let's check the logs; they should be in your $CRS_HOME/cfgtoollogs directory. Search for a log file named like
opatchauto[timestamp].log
. The important part of the log:

2010-12-08 12:32:19: The component check failed with following error
2010-12-08 12:32:19: bash: /u01/app/11.2.0.1/grid/OPatch/opatch: Permission denied

Huh? But I'm root! Ahh... okay! Apparently it tries to run the OPatch tool under the grid user. Okay, let's fix the permissions:
chown root:oinstall /u01/app/11.2.0.1/grid/OPatch -R
chmod g+r /u01/app/11.2.0.1/grid/OPatch/opatch

Now try again! Yep... now it's working.
After applying the patch we are ready for our
rootupgrade.sh
. It's interesting that the output still contains the Failed to add (property/value):('OLD_OCR_ID/'-1')
message, but the upgrade continues without any other complaints. Okay, let's perform a quick check:

srvctl config asm
ASM home: /u01/app/11.2.0.2/grid
ASM listener: LISTENER

srvctl config listener -a
Name: LISTENER
Network: 1, Owner: grid
Home: /u01/app/11.2.0.2/grid on node(s) owl
End points: TCP:1521

Great, the ASM instance and the listeners were relocated to the new GI home. The next logical thing to do is to uninstall the old GI home, right? It's as simple as:

[old GI home]/deinstall/deinstall

Oookey, meet SURPRISE number 6:
ERROR: You must delete or downgrade the Oracle RAC databases and de-install the Oracle RAC homes before attempting to remove the Oracle Clusterware homes.

Isn't it great? On metalink I found Bug 10332736 and, in the WORKAROUND section, it says something about writing a note with a manual uninstall procedure. However, at the time of this writing, the note hadn't been published yet. Yeah... all I can say is that I'm tired of these stupid issues. What happened to the Oracle testing department? They encourage us to patch frequently but, as far as I'm concerned, I always get this creepy feeling before doing it.
Tags:
RAC
Sunday, November 28, 2010
My first 11gR2 RAC on VirtualBox - Learned Lessons
Oracle 11gR2 comes with many RAC goodies and, of course, I wanted to see them in action. But the first step is to actually have an 11gR2 cluster available, so I decided to install one on my poor Dell PowerEdge 1850 server. As it turned out, even the installation process has changed, with the introduction of SCAN, GNS, server pools etc. But that just makes the task more challenging, doesn't it?
Because this RAC was not intended for any production purposes, my first choice was a virtualized environment. I tried Oracle VM in the first place and was quite disappointed with the results:
1. my virtual machines were rebooting unexpectedly, even when they were idle. I didn't manage to find the cause of this.
2. during heavy loads on my virtual machines I was constantly getting:
INFO: task kjournal:337 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_sec" disables this message.
I guess Oracle has already fixed that, but I don't have any ULN subscriptions, so no updates for me.
My next option was VirtualBox, which is a nice choice and is also provided by Oracle. VirtualBox now supports shared disks, which makes it a very appealing solution for RAC testing. In addition, there's also a well-written guide about how to install a RAC on VirtualBox here.
To summarize, below are the main lessons I learned out of this RAC installation process:
1. High CPU load on my virtual hosts: after I created the hosts which were supposed to act as RAC nodes, I noticed that the CPUs on the host server were at 100% even though the guests were idle. My host server has 8G of RAM and 2 physical CPUs at 3.4GHz, so this high CPU consumption didn't feel right at all. The solution was to boot my virtual hosts with the divider=10 kernel option (see the sketch right below). Even with this tweak the whole installation process was slow, so be prepared to wait...
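For reference, a minimal sketch of where that option goes on a RHEL/OEL 5 guest; the kernel version and device paths below are generic placeholders, not necessarily my exact setup:

# /boot/grub/menu.lst on the guest VM: append divider=10 to the kernel line
title Enterprise Linux (2.6.18-194.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/VolGroup00/LogVol00 divider=10
        initrd /initrd-2.6.18-194.el5.img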
2. Pay attention to group membership for the oracle/grid users: I made a stupid mistake and forgot to add the oracle user to the asmdba group. The prerequisite checks didn't complain, and I successfully installed the Grid Infrastructure and the Oracle database software. However, when I reached the database installation phase using dbca, I noticed that no ASM diskgroups were available, even though they were accessible to the "grid" user. So, in order to save precious debugging time on such tricky issues, double-check these group membership requirements.
3. The time synchronization issue: because I wanted to use new stuff for my RAC, I decided to get rid of the ntpd synchronization and use the Oracle CTSS implementation instead. However, be careful here: Oracle is picky when it comes to detecting whether other synchronization software is installed. Even if your ntpd daemon is stopped, you also have to remove/rename the /etc/ntp.conf file; otherwise, the time synchronization check will fail. And another thing: if you configure your NIC interfaces via DHCP, you may find this /etc/ntp.conf re-created after every node reboot. To prevent this, you may use static address configuration, or you may add PEERNTP=no to your ifcfg-ethX scripts (see the sketch below).
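A hedged example of such an interface script (the device name and addressing mode are hypothetical):

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
# keep dhclient from (re)writing the NTP configuration at every lease renewal
PEERNTP=no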
4. GNS preparations: this GNS (Grid Naming Service) component is new in 11gR2 and not a very tasty concept for those DBAs (like me) who don't have much knowledge in the network administration field. So, if you are going to use GNS, make sure you have an experienced system administrator around to support you in configuring it. However, you still need to know what to ask him to do. Basically, you have to agree on a new DNS zone. If your company domain is gigel.ro, you might choose rac.gigel.ro for your RAC. Then you need to ask him to delegate the requests for *.rac.gigel.ro to an unallocated IP address from the same subnet as your future RAC public interface. This IP is the VIP of your GNS, and it will become available only when your RAC installation has successfully finished. Then your system administrator will ask you under which name to "glue" the new rac.gigel.ro zone: he actually wants to know under which DNS name to register this GNS VIP address. "Glue" is a well-known concept in DNS terminology. As far as I noticed, Oracle uses <cluster_name>-gns-vip.<gns_zone>. So, for our hypothetical example, assuming the RAC name is "muci", the GNS glue would be: muci-gns-vip.rac.gigel.ro (see the sketch below).
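A minimal BIND-style sketch of that delegation in the gigel.ro zone (the IP address is, of course, made up):

; delegate the rac.gigel.ro subdomain to the GNS
rac.gigel.ro.                 IN NS  muci-gns-vip.rac.gigel.ro.
; glue record: the address of the GNS VIP itself
muci-gns-vip.rac.gigel.ro.    IN A   192.168.1.50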
5. ORA-15081: I think this had to do with the group membership mistake. DBCA was reporting ORA-15081, complaining that it could not create files in the ASM diskgroups. Metalink note 1084186.1 provides the solution.
Okay, that would be all. Happy (but slow) RAC on VirtualBox!
Tags:
RAC
Sunday, October 31, 2010
SHARED remote_login_passwordfile
When talking about the SHARED option of the REMOTE_LOGIN_PASSWORDFILE parameter, the official 11.2 documentation states:
One or more databases can use the password file. The password file can contain SYS as well as non-SYS users.
While that's true, it is important to mention that, as soon as you set this parameter to SHARED, you are no longer allowed to add SYSDBA users or to change their passwords. A shared password file may contain non-SYS users only if they were granted the SYSDBA privilege earlier, while the password file was still in EXCLUSIVE mode. A quick sketch of what to expect is shown below.
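(The username is made up, and I'm paraphrasing the error rather than quoting an exact message:)

SQL> alter system set remote_login_passwordfile=SHARED scope=spfile;
SQL> -- bounce the instance so the change takes effect, then:
SQL> grant sysdba to scott;
-- this fails with an ORA- error complaining that the password file
-- cannot be updated while it is in SHARED mode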
Wednesday, September 22, 2010
Statistics on Client Result Cache
I've just noticed that the client result cache statistics are not very accurate on my 11.2.0.1 Oracle server. I have the following Java code:
package test;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import oracle.jdbc.pool.OracleDataSource;
public class ClientResultCache {
public static void main(String[] args) throws SQLException, IOException, InterruptedException {
OracleDataSource ods = new OracleDataSource();
ods.setDriverType("oci");
ods.setTNSEntryName("owldb");
ods.setUser("talek");
ods.setPassword("muci");
Connection conn = ods.getConnection();
String query = "select /*+ result_cache */ * from xxx";
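// enable implicit statement caching; as far as I understand the OCI
// driver, the client result cache relies on statements being cached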
((oracle.jdbc.OracleConnection)conn).setImplicitCachingEnabled(true);
((oracle.jdbc.OracleConnection)conn).setStatementCacheSize(10);
PreparedStatement pstmt;
ResultSet rs;
for (int j = 0 ; j < 1000 ; j++ ) {
System.out.println(j);
pstmt = conn.prepareStatement (query);
rs = pstmt.executeQuery();
while (rs.next( ) ) {
}
rs.close();
pstmt.close();
Thread.sleep(100);
}
System.in.read();
}
}
While the above code is running, I'm monitoring the CLIENT_RESULT_CACHE_STATS$ view. And this is what I've got:
STAT_ID NAME VALUE
---------- ------------------------------ ----------
1 Block Size 256
2 Block Count Max 128
3 Block Count Current 128
4 Hash Bucket Count 1024
5 Create Count Success 1
6 Create Count Failure 0
7 Find Count 812
8 Invalidation Count 0
9 Delete Count Invalid 0
10 Delete Count Valid 0
The "Find Count" should be 999, right? My test program is still running (see the System.in.read at the end) therefore I expect my client result cache to be still there. My first guess was a delay in computing the statistics but even after 15 minutes of waiting I didn't get the right figures. Hmm... am I miss something?
Thursday, July 15, 2010
Oracle IDE for Geeks
Let's be honest, guys... how many times have you found yourself googling for “the best Oracle IDE”? If you are like me, then the answer is “too many times”... Why is this? Well, partly, I guess, because we are not satisfied with what the market offers us in this area.
If we take a look at what's available now, the most well-known Oracle IDEs are:
- Toad from Quest Software
- PLSQL Developer offered by Allround Automations
- SQL Developer from Oracle
So, what don't I like about these tools? Let's see:
- They are heavy... some of them take a lot of time just to start up.
- Most of them are not cross-platform.
- They are closed-source software. You don't have access to the code.
- Limited editing features. I know they offer templates, auto-complete and such, but they look so small in comparison with what VIM provides.
- They are not suitable for server environments. I mean... what if you have to connect to the database on a remote Unix server, over ssh, within a "friendly" console? I guess sqlplus is all you have there, and it's not a very pleasant experience.
- A lot of the oh-so-useful sqlplus commands don't work in these tools. PLSQL Developer does a good job emulating many of them, but I still miss AUTOTRACE, sub-totals and all the other cool features sqlplus provides.
Wednesday, May 26, 2010
SqlPlus Injection
Despite the fact that, at first sight, it might look stupid, you may be hacked by a colleague in a very rude way. Suppose a developer asks you to create a new user for an upcoming system. Because he's a nice guy, he also hands you a simple script which creates this user along with all the required grants. Of course, even though you like your colleague and appreciate his effort, you carefully inspect the script before running it. Let's see a preview of this script in a plain vim window:
Oookey! The script has nice comments, nothing unusual... You run it in your sqlplus SYS session and... BANG! Your SYSTEM user is compromised and you won't even know it. If you still have the WTF face, then look again.
The catch is in the last comment. We used to think that, in sqlplus, a multiline comment starts with /* (and, because sqlplus is quite picky, it has to be followed by a space or CR), and then everything up to the closing */ is taken as a comment. This assumption is wrong because, in sqlplus, a # at the very beginning of a line means "execute the command on that line". In fact, it doesn't have to be #: that's just the symbol configured by default for the sqlprefix setting. Just check it out:
SQL> show sqlprefix
sqlprefix "#" (hex 23)

However, we are simply fooled by our editor which, with its nice code-highlighting feature, just marks our comments accordingly. Of course, it doesn't know anything about the sqlplus "sqlprefix" setting. So, before running any third-party scripts, look carefully at them, even at the comments.
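Since the vim screenshot isn't reproduced here, a hypothetical reconstruction of such a booby-trapped comment (the password change is the attacker's payload):

/* Grants required by the reporting module.
#alter user system identified by gotcha;
   Nothing suspicious here, move along. */

Run this in sqlplus as SYS and the middle line executes right away: the leading # makes sqlplus run the command, even though, visually, it sits inside the comment.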
Tags:
sqlplus
Sunday, May 02, 2010
Autobackup CF with Flash Recovery Area
In our office we have a 10g RAC database. It has a flash recovery area enabled, which points to an ASM diskgroup. Nothing special, I would say... However, from time to time, our nightly backup script simply fails, complaining that it can't find some obsolete backups which should be deleted:
RMAN-06207: WARNING: 4 objects could not be deleted for DISK channel(s) due
RMAN-06208: to mismatched status. Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212: Object Type Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100427-00
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100427-01
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100428-00
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100428-01
That's weird! All those backup pieces are controlfile autobackups. RMAN looks for them on a local filesystem and, this being a RAC database, those files are obviously accessible from just one node. But how? They were supposed to be placed on our shared storage; in the FRA, to be more precise. Well, let's look once again at our settings:
SQL> show parameter recov
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string +DG1
db_recovery_file_dest_size big integer 150000M
recovery_parallelism integer 0
Okay, it's clear we have a FRA! What about the RMAN settings?
RMAN> show all;
using target database control file instead of recovery catalog
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 2 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO COMPRESSED BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_fd1.f'; # default
It looks good... the autobackup format for the controlfile is '%F', which is the default one, right? The documentation confirms that:
The default location for the autobackup on disk is the flash recovery area (if configured) or a platform-specific location (if not configured). RMAN automatically backs up the current control file using the default format of %F.
Okay, we have a flash recovery area and a %F default autobackup format... WTF? Well, the answer is given by metalink note 338483.1. Apparently, there is a big difference between having the autobackup format explicitly set to its default value and having it reset to its default... Interesting, huh? It is... So, if you explicitly set the autobackup format to %F, the autobackup file will go to an OS-specific location, which on Linux is $ORACLE_HOME/dbs. But if you have the autobackup format on its default (explicitly reset, or never set at all) and you have a FRA configured, then the autobackup file will actually go to the FRA.
So, in my case the solution was simple (please notice the "# default" marker):
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK CLEAR;
old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';
RMAN configuration parameters are successfully reset to default value
RMAN> show controlfile autobackup format;
RMAN configuration parameters are:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
Ooookey, really, really unintuitive... I think the Oracle documentation should be more precise about this.
Tags:
RMAN
Friday, March 19, 2010
When having an RMAN retention policy based on REDUNDANCY is a bad idea...
Suppose you have an RMAN retention policy of "REDUNDANCY 2". This means that, as long as you have at least two backups of the same datafile, controlfile/spfile or archivelog, the older backups become obsolete and RMAN is allowed to safely remove them.
Now, let's also suppose that every night you back up your database using the following script:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
run {
  backup database plus archivelog;
  delete noprompt obsolete redundancy 2;
}
The backup task is quite simple: first it ensures that the controlfile autobackup feature is on, then it backs up the database and the archivelogs and, at the end, it deletes all obsolete backups according to the REDUNDANCY 2 retention policy.
Using the above approach, you might think that you can restore your database as it was two days ago, right? For example, if you have a backup taken on Monday and another one taken on Tuesday, you may restore your database to any point within the (Monday_last_backup - Today) interval. Well, that's wrong!
Consider the following scenario:
1. On Monday night you back up the database using the above script;
2. On Tuesday, during the day, you drop a tablespace. Because this is a structural database change, a controlfile autobackup is triggered. Yay, you have a new controlfile backup!
3. On Tuesday night you back up the database again... nothing unusual, right?
Well, the tricky part concerns the DELETE OBSOLETE command. When the backup script runs it, RMAN finds three controlfile backups: one originating from the Monday backup, one from the structural change, and a third from the just-finished Tuesday backup. According to the "REDUNDANCY 2" retention policy, RMAN assumes it is safe to delete the controlfile backup taken during the Monday night backup, because it is outside the retention policy and it is the oldest one. Oops... this means we are going to have a big problem restoring the database as it was before the structural change, because we no longer have a controlfile backup from that time.
So, if you intend to perform an incomplete recovery of your database to a previous point in time, it's really a good idea to switch to a retention policy based on a "RECOVERY WINDOW" instead. In our case, a RECOVERY WINDOW OF 2 DAYS would be more appropriate.
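The switch itself is a one-liner (a sketch of the configure command):

RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 2 DAYS;

With a recovery window, a backup stays non-obsolete as long as it is needed to restore the database to any point within the last 2 days, no matter how many newer backups of the same file exist.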
Tags:
RMAN
Sunday, February 28, 2010
PLSQL "All or Nothing" Pitfall
Transactions are such a common thing when working with databases. They act on an "all or nothing" basis, that is, they either succeed or fail, but they should always leave the database in a consistent state. Of course, in Oracle databases the rules are the same, but the interesting part I want to look at concerns PL/SQL modules (procedures, functions or packages).
A PL/SQL module is some kind of "all or nothing" component: if the procedure fails, it rolls back the uncommitted work it has done. Suppose we have the following procedure:
CREATE OR REPLACE PROCEDURE test AS
BEGIN
insert into yyy values (1);
raise_application_error(-20000, 'I am a cute error!');
END test;
Let's see what happens:
SQL> truncate table yyy;
Table truncated.
SQL> exec test;
BEGIN test; END;
*
ERROR at line 1:
ORA-20000: I am a cute error!
ORA-06512: at "TALEK.TEST", line 4
ORA-06512: at line 1
SQL> select * from yyy;
no rows selected
Nice... we didn't explicitly roll back, but Oracle was smart enough to do the cleanup job for us. This makes sense and proves that PL/SQL modules are, in a way, "all or nothing" components.
Now, let's say we have an Oracle job which calls our "test" procedure and, if an error occurs, has to log it into another table. A possible implementation of the job's PL/SQL caller block may be:
begin
test;
exception
when others then
insert into log values (dbms_utility.format_error_stack);
commit;
raise;
end;
/
The above code may seem harmless: the test procedure is called and, if it raises an error, the exception part of the PL/SQL caller block is executed, which inserts the error into our log table. Of course, we commit the log entry we just inserted and we re-raise the original error. We know that if the test procedure fails it rolls back its uncommitted work, as we saw above. After all, it's an "all or nothing" piece, right? Well, here's the pitfall: if you catch the exception, the procedure which raised the error will not clean up anything as long as you are within the EXCEPTION section. Even though the whole anonymous block will fail because of re-raising the original error, the COMMIT statement from the EXCEPTION section will actually commit the incomplete work done by our "TEST" procedure. So, in most cases you should look twice at such EXCEPTION WHEN ... THEN ... COMMIT constructs... otherwise you may end up with nasty bugs. In the above example, in order to avoid this problem, a ROLLBACK should be performed before logging the error. Of course, there are smarter logging solutions which use autonomous transactions but, anyway, the goal here was just to reveal the pitfall.
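For the record, a sketch of the corrected caller block (same hypothetical LOG table as above), with the ROLLBACK placed before the logging:
begin
test;
exception
when others then
-- undo the uncommitted work done by TEST before committing the log entry
rollback;
insert into log values (dbms_utility.format_error_stack);
commit;
raise;
end;
/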
Tags:
RDBMS
Wednesday, February 24, 2010
INS-32018 Warning for Standalone Server
When it comes to installing Oracle you should always follow the procedures written in the installation guides. As you already know, Oracle 11.2 packages ASM within a new, separate component called Oracle Grid Infrastructure. So, if you want to place the database files into ASM, you must install Grid Infrastructure. As a good practice, Oracle recommends installing it under a different user, typically named "grid".
As far as the OFA directory structure is concerned, the installation guide recommends:
- to create an "/u01/app/grid" directory to be used as an ORACLE_BASE for this "grid" user;
- to create an "/u01/app/11.2.0/grid" directory to be used as an ORACLE_HOME for this "grid" user.
If you're like me, the above configuration may look a little bit weird, because I used to think that the ORACLE_HOME should be somewhere under the ORACLE_BASE directory. Nevertheless, the documentation clearly states the following:
Caution:
For grid infrastructure for a cluster installations, the Grid home must not be placed under one of the Oracle base directories, or under Oracle home directories of Oracle Database installation owners, or in the home directory of an installation owner. During installation, ownership of the path to the Grid home is changed to root. This change causes permission errors for other installations.
However, the above applies only to cluster installations. If you just want ASM installed for a single-instance database then it's fine (and recommended) to place the ORACLE_HOME under the ORACLE_BASE. If you don't, you'll get the following warning:
[Screenshot: the INS-32018 warning raised by the installer]
So, to sum up the above ideas: if you are going to install RAC, you need to create the grid ORACLE_HOME outside the ORACLE_BASE of any Oracle software owner. If you choose to install Oracle Grid Infrastructure for a standalone server, the ORACLE_HOME of the grid user should be under its ORACLE_BASE.
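As an illustration only (a sketch assuming a typical OFA layout; the exact home path and the oinstall group name are assumptions, so adapt them to your environment), the directories for a standalone-server installation could be prepared like this:
# as root
mkdir -p /u01/app/grid/product/11.2.0/grid   # Grid home under the grid user's ORACLE_BASE
chown -R grid:oinstall /u01/app/grid
chmod -R 775 /u01/app/grid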
Wednesday, February 17, 2010
ALL_TABLES versus ALL_ALL_TABLES
If you ever wondered what the difference between ALL_TABLES and ALL_ALL_TABLES is, here's the answer: both views list the tables to which the current user has access but, in addition to the tables returned by ALL_TABLES, ALL_ALL_TABLES also returns all object tables (system generated or not) accessible to the current user.
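If you want to see exactly what ALL_ALL_TABLES adds, here's a quick sketch (the MINUS isolates just the object tables):
SQL> select owner, table_name from all_all_tables
  2  minus
  3  select owner, table_name from all_tables;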
Pay attention, this may be an interview question (e.g. "How can you get all the tables you have access to?") and you may leave a good impression if you reply with another question: "Do you also want object tables to be included?". :)
Tags:
RDBMS
Wednesday, December 16, 2009
A DDL statement may fire a DML trigger
Maybe you know this, maybe you don't. Because it's not quite obvious, it deserves a little attention. We all know about DML triggers. Remember? Yeah, yeah... the before/after insert/update/delete each-row triggers. We tend to think that only INSERT, UPDATE or DELETE statements fire the corresponding triggers (of course, if any are defined). That's true, with one (as far as I know) important exception: a DDL statement which adds a new column with a default value will also fire the UPDATE trigger.
For example, let's create a dummy table:
SQL> create table muc (col1 integer primary key, modify_date timestamp);
Table created.
Then, the corresponding trigger:
SQL> create or replace trigger trg_muc_mod_dt before update on muc for each row
2 begin
3 :new.modify_date := systimestamp;
4 end;
5 /
Add some records:
SQL> insert into muc values (1, systimestamp);
1 row created.
SQL> insert into muc values (2, systimestamp);
1 row created.
SQL> commit;
We end up having:
SQL> select * from muc;
COL1 MODIFY_DATE
---------- ------------------------------
1 16-DEC-09 09.54.03.804223 PM
2 16-DEC-09 09.54.41.815575 PM
Now, the moment of truth:
SQL> alter table muc add (active integer default '0');
Table altered.
SQL> select * from muc;
COL1 MODIFY_DATE ACTIVE
---------- ------------------------------ ----------
1 16-DEC-09 09.55.53.836113 PM 0
2 16-DEC-09 09.55.53.840896 PM 0
Take a look at MODIFY_DATE and see the new timestamps. The update trigger was invoked in response to our DDL statement. This is important to know. Think of a deposit table which has a column named LAST_UPDATED and a trigger which updates it whenever something within a deposit changes. Now, suppose the business logic dictates that a new column must be added with a default value. You run the DDL statement to add that column and... suddenly, all the information about when a particular deposit was last updated is lost. Oops. So, I should write down one hundred times: "Think twice before adding new columns with default values on a table with UPDATE triggers".
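One defensive sketch, using the objects above (weigh it carefully on a busy system, since regular updates running while the trigger is disabled won't maintain MODIFY_DATE either):
SQL> alter trigger trg_muc_mod_dt disable;
SQL> alter table muc add (active integer default 0);
SQL> alter trigger trg_muc_mod_dt enable;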
Tags:
RDBMS
Sunday, November 29, 2009
Strange RMAN snapshot controlfile issue
A strange thing happened today. I executed a DELETE OBSOLETE command at the RMAN prompt and it reported the snapshot controlfile as obsolete. I don't know under which circumstances this problem occurs and I couldn't find any relevant information on forums or Metalink (oh! sorry, "My Oracle Support") about it.
Below is the output of the DELETE OBSOLETE command:
RMAN> delete obsolete;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
using channel ORA_DISK_2
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Control File Copy 36 29-11-2009 12:35:33 /u01/app/oracle/product/11.2.0/
dbhome_1/dbs/snapcf_tetris.f
Do you really want to delete the above objects (enter YES or NO)? y
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_2 channel at 11/29/2009 21:11:16
ORA-19606: Cannot copy or restore to snapshot control file
Indeed, this is the default configured snapshot controlfile:
RMAN> show snapshot controlfile name;
RMAN configuration parameters for database with db_unique_name TETRIS are:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/
dbhome_1/dbs/snapcf_tetris.f';
It seems I'm in a kind of deadlock here. The snapshot controlfile is reported as obsolete but it can't be deleted, as it is used by RMAN. The only solution I found was to change the RMAN configuration to use another snapshot controlfile, then remove the reported obsolete one and switch back to the default. However, the question remains: why is the snapshot controlfile reported as obsolete?
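In RMAN terms, that workaround looks roughly like this (a sketch; the temporary file name is just an illustration):
RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris_tmp.f';
RMAN> delete noprompt obsolete;
RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f';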
PS: This happened on an 11gR2 database installed on a Linux x86 platform.
Update: Apparently this is encountered after executing a DUPLICATE database from ACTIVE DATABASE. Furthermore, the snapshot controlfile is reported as a "datafile copy" when a CROSSCHECK is suggested. See below:
RMAN> delete obsolete;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
using channel ORA_DISK_2
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Control File Copy 40 30-11-2009 18:41:15 /u01/app/oracle/product/11.2.0/dbhome_1
/dbs/snapcf_tetris.f
Do you really want to delete the above objects (enter YES or NO)? y
RMAN-06207: WARNING: 1 objects could not be deleted for DISK channel(s) due
RMAN-06208: to mismatched status. Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212: Object Type Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Datafile Copy /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
Obviously, that can't be a datafile copy. So, let's try a crosscheck as suggested:
RMAN> crosscheck datafilecopy '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f';
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=148 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of crosscheck command at 11/30/2009 19:09:43
RMAN-20230: datafile copy not found in the repository
RMAN-06015: error while looking up datafile copy name: /u01/app/oracle/product/11.2.0
/dbhome_1/dbs/snapcf_tetris.f
Okey, this was expected, as I don't have any datafile copy with that name despite what RMAN says. So, let's try a crosscheck for the controlfile copy:
RMAN> crosscheck controlfilecopy '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f';
released channel: ORA_DISK_1
released channel: ORA_DISK_2
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=148 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
validation failed for control file copy
control file copy file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
RECID=40 STAMP=704313675
Crosschecked 1 objects
As can be seen, the validation fails although the file exists at that location:
$ ls -al /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
-rw-r----- 1 oracle oinstall 10436608 Nov 30 18:57 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
I don't know if this is documented somewhere but it looks to me like a bug. No idea why the snapshot control file is messed up after a DUPLICATE TARGET DATABASE ... FROM ACTIVE DATABASE.
Tags:
RMAN
Friday, November 27, 2009
TSPITR to recover a dropped tablespace
A nice feature of Oracle 11gR2 is the ability to recover a dropped tablespace using TSPITR. Of course, in order to pull this off you need valid backups. Let's test this! First of all, just to be on the safe side, take a fresh backup of the database:
BACKUP DATABASE PLUS ARCHIVELOG;
Then supposing you have a "MUCI" tablespace, simply drop it:
drop tablespace MUCI including contents;
Let's try to recover the "MUCI" tablespace. You'll need the nearest timestamp or SCN from before the tablespace was dropped.
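If you only noted the wall-clock time, here's a sketch for mapping it to an SCN (the timestamp below is hypothetical, and TIMESTAMP_TO_SCN only works for fairly recent times):
SQL> select timestamp_to_scn(to_timestamp('27-11-2009 21:00:00', 'DD-MM-YYYY HH24:MI:SS')) scn
  2  from dual;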
If you are tempted to use fully automatic TSPITR then be prepared for trouble. This is what happened to me when I tried it:
RMAN> recover tablespace muci until scn 2240386 auxiliary destination '/u01/app/backup';
...
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 11/27/2009 21:57:13
RMAN-06965: Datapump job has stopped
RMAN-06961: IMPDP> Job "SYS"."TSPITR_IMP_hilc" stopped due to fatal error at 21:57:09
RMAN-06961: IMPDP> ORA-39123: Data Pump transportable tablespace job aborted
ORA-01565: error in identifying file '/u01/app/oracle/oradata/TETRIS/datafile/o1_mf_muci_5k0bwdmb_.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
I googled it and found this post, which recommends dropping the tablespace without "AND DATAFILES" but, as far as I'm concerned, it didn't work.
Nevertheless, setting a new name for the datafile which belongs to the dropped tablespace did the job:
RMAN> run {
2> set newname for datafile 6 to new;
3> recover tablespace muci until scn 2240386 auxiliary destination '/u01/app/backup';
4> }
A direct consequence of this in 11gR2 is that you can apply TSPITR multiple times to the same tablespace without using a recovery catalog. If you chose a wrong SCN and have already brought the recovered tablespace ONLINE, you can simply drop it and try again with another SCN.
Awesome!
Tags:
RMAN
Annoying Tablespace Quotas
There's one thing about tablespace quotas which I really don't like. If I grant a user quota on a tablespace and then drop that tablespace, the quota is not automatically revoked. It can still be seen in the DBA_TS_QUOTAS view, but with the DROPPED column set to YES. However, if I afterwards create a tablespace with the same name as the one previously dropped, the old quota is auto-magically reactivated on this new tablespace, which might not be my intention. Let's see it in action:
1. first of all, let's create a dummy tablespace:
SQL> create tablespace test_tbs datafile size 20M;
Tablespace created.
2. let's also create a user and grant quota on the TEST_TBS tablespace:
SQL> create user gogu identified by xxx quota unlimited on users;
User created.
SQL> alter user gogu quota unlimited on test_tbs;
User altered.
3. take a look at quotas:
SQL> select * from dba_ts_quotas where username='GOGU';
TABLESPACE_NAME USERNAME BYTES MAX_BYTES BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS GOGU 0 -1 0 -1 NO
TEST_TBS GOGU 0 -1 0 -1 NO
4. now drop the TEST_TBS tablespace and look again at quotas:
SQL> drop tablespace test_tbs including contents and datafiles;
Tablespace dropped.
SQL> select * from dba_ts_quotas where username='GOGU';
TABLESPACE_NAME USERNAME BYTES MAX_BYTES BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS GOGU 0 -1 0 -1 NO
TEST_TBS GOGU 0 -1 0 -1 YES
Just notice that the DROPPED column is now set to YES for the TEST_TBS tablespace. This I don't like, and if I try to revoke the quota, Oracle complains that it doesn't know anything about the TEST_TBS tablespace:
SQL> alter user gogu quota 0 on test_tbs;
alter user gogu quota 0 on test_tbs
*
ERROR at line 1:
ORA-00959: tablespace 'TEST_TBS' does not exist
Obvious, but then why preserve that quota in DBA_TS_QUOTAS anyway?
5. Let's recreate the TEST_TBS tablespace and then look at quotas:
SQL> create tablespace test_tbs datafile size 20M;
Tablespace created.
SQL> select * from dba_ts_quotas where username='GOGU';
TABLESPACE_NAME USERNAME BYTES MAX_BYTES BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS GOGU 0 -1 0 -1 NO
TEST_TBS GOGU 0 -1 0 -1 NO
See how the "DROPPED" column is now back on "NO". But wait... this TEST_TBS tablespace is a new tablespace which just happen to be named like an old dropped tbs. Bleah... ugly!
So, this boils down to the conclusion that when you are about to drop a tablespace it's a good idea to check the quotas allocated to users and revoke them before dropping the tablespace. Otherwise they will remain in DBA_TS_QUOTAS and will be reactivated when a tablespace with the same name is created. Furthermore, I don't know how you can get rid of them once the tablespace no longer exists. Of course, you can create a dummy tablespace with the same name, revoke the quotas and then drop the dummy tablespace, but that's an awful workaround.
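As a pre-drop checklist, something like this sketch would do (using the objects from the example above):
SQL> select username from dba_ts_quotas
  2  where tablespace_name = 'TEST_TBS' and dropped = 'NO';
SQL> alter user gogu quota 0 on test_tbs;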
Update: Yet, I see an advantage of the above behaviour. In 11gR2 you can recover a dropped tablespace with TSPITR. After the TSPITR successfully completes and the dropped tablespace is recovered, the old quotas are also reactivated which is a good thing for the users who had objects in that tablespace.
Tags:
RDBMS
Wednesday, November 18, 2009
Do archivelogs become obsolete if they contain blocks from a BEGIN BACKUP operation?
Of course, not every possible case is described in the docs, so some of them simply have to be tried. Today I was wondering what would happen if I left a tablespace in BEGIN BACKUP mode and continued to back up the database using:
RUN {
BACKUP DATABASE PLUS ARCHIVELOG;
DELETE NOPROMPT OBSOLETE;
}
As you already know, if a tablespace is put in BEGIN BACKUP mode then the first change to each block is logged as a whole block image into the redologs, which will eventually be archived. My main concern here was with the DELETE OBSOLETE command: is RMAN smart enough to know that those archivelogs must not become obsolete as long as the BEGIN BACKUP status is in place? After some tests I can conclude: RMAN knows this and will NOT consider those archivelogs obsolete. This was kind of obvious but, you know... it's always good to try and see with your own eyes.
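By the way, here's a quick sketch to check whether any datafile was accidentally left in backup mode:
SQL> select d.name, b.status
  2  from v$backup b join v$datafile d on b.file# = d.file#
  3  where b.status = 'ACTIVE';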
Tags:
RMAN