
Sunday, October 31, 2010

SHARED remote_login_password_file

When talking about the shared option of the remote_login_password_file parameter, the official 11.2 documentation states:

One or more databases can use the password file. The password file can contain SYS as well as non-SYS users.

While that's true, it is important to mention that, as soon as you set this parameter to SHARED, you are no longer allowed to add SYSDBA users or to change their passwords. A shared password file may contain non-SYS users only if they were granted the SYSDBA privilege earlier, while the password file was in EXCLUSIVE mode.
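A quick sketch of the restriction (the grantee name is made up; remote_login_passwordfile is a static parameter, hence the instance bounce):

SQL> alter system set remote_login_passwordfile = shared scope = spfile;
SQL> -- bounce the instance, then try:
SQL> grant sysdba to talek;
SQL> -- the grant fails, because a SHARED password file can no longer be
SQL> -- updated (expect an error along the lines of ORA-01999)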

Wednesday, September 22, 2010

Statistics on Client Result Cache

I've just noticed that the client result cache statistics are not very accurate on my 11.2.0.1 Oracle server. I have the following Java code:

package test;

import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import oracle.jdbc.pool.OracleDataSource;

public class ClientResultCache {

    public static void main(String[] args) throws SQLException, IOException, InterruptedException {
        OracleDataSource ods = new OracleDataSource();
        ods.setDriverType("oci");   // the client result cache is an OCI feature
        ods.setTNSEntryName("owldb");
        ods.setUser("talek");
        ods.setPassword("muci");
        Connection conn = ods.getConnection();
        String query = "select /*+ result_cache */ * from xxx";
        // enable implicit statement caching on this connection
        ((oracle.jdbc.OracleConnection) conn).setImplicitCachingEnabled(true);
        ((oracle.jdbc.OracleConnection) conn).setStatementCacheSize(10);
        PreparedStatement pstmt;
        ResultSet rs;
        for (int j = 0; j < 1000; j++) {
            System.out.println(j);
            pstmt = conn.prepareStatement(query);
            rs = pstmt.executeQuery();
            while (rs.next()) {
                // just fetch all the rows
            }
            rs.close();
            pstmt.close();
            Thread.sleep(100);
        }
        System.in.read();  // keep the process (and its client cache) alive
    }
}

While the above code is running, I'm monitoring CLIENT_RESULT_CACHE_STATS$ on the server.
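The monitoring query is nothing fancy (a sketch; the view belongs to SYS, so query it from a suitably privileged session):

SQL> select stat_id, name, value
  2    from client_result_cache_stats$
  3   order by stat_id;

And this is what I've got: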
   
   STAT_ID NAME                                VALUE
---------- ------------------------------ ----------
         1 Block Size                            256
         2 Block Count Max                       128
         3 Block Count Current                   128
         4 Hash Bucket Count                    1024
         5 Create Count Success                    1
         6 Create Count Failure                    0
         7 Find Count                            812
         8 Invalidation Count                      0
         9 Delete Count Invalid                    0
        10 Delete Count Valid                      0


The "Find Count" should be 999, right? My test program is still running (see the System.in.read at the end) therefore I expect my client result cache to be still there. My first guess was a delay in computing the statistics but even after 15 minutes of waiting I didn't get the right figures. Hmm... am I miss something?

Thursday, July 15, 2010

Oracle IDE for Geeks

Let's be honest, guys... how many times have you found yourself googling for “the best Oracle IDE”? If you are like me, then the answer is “too many times”... Why is that? Well, partly, I guess, because we are not satisfied with what the market offers us in this area.
If we take a look at what we have now, the most well-known Oracle IDEs are:
  1. Toad from Quest Software
  2. PLSQL Developer offered by Allround Automations
  3. SQL Developer from Oracle
If you ask me, my favorite choice would be PLSQL Developer. It has a lot of cool features at a fair price. Toad is also nice, but it's expensive. As far as SQL Developer is concerned, I simply can't get used to it... In other words, I don't like it: it's slow (maybe Java has something to do with this) and it uses Java Swing for its GUI, bypassing my OS look & feel. However, there are some pros which might count: it's free and it's cross-platform. In fact, if you are on a Unix-based OS you don't have many choices besides SQL Developer. Of course, there is/was Tora... but I would rather mention it in the past tense.
So, what don't I like about these tools? Let's see:
  1. They are heavy... some of them take a lot of time just to start up.
  2. Most of them are not cross-platform.
  3. They are closed software. You don't have access to the code.
  4. Limited editing features. I know they offer templates, auto-complete and stuff, but they look so small in comparison with what VIM provides.
  5. They are not suitable for server environments. I mean... what if you have to connect to the database on a remote Unix server, over ssh, within a "friendly" console? I guess sqlplus is all you have there, and it's not a very pleasant experience.
  6. Many of the oh-so-useful sqlplus commands don't work in these tools. PLSQL Developer does a good job emulating many of them, but I still miss AUTOTRACE, sub-totals and all the other cool features sqlplus provides.
So, taking into consideration the above limitations, I decided to create (why not?) my own Oracle IDE. It may sound stupid or too ambitious, but it's not (well, maybe ambitious it is). The new Oracle IDE I'm working on is called VoraX and is hosted on GoogleCode. VoraX stands for Vim ORAcle eXtension and yes... you're right: it's a VIM plugin. If you are a big fan of the VIM editor, then you have to give VoraX a try. Of course, there is also the dbext plugin, which provides support for Oracle databases, but the main problem with it is that it doesn't maintain a persistent connection to the database. In dbext, when you execute something, sqlplus is launched, a new connection is made, the statement is handed over, the results are fetched into VIM and then sqlplus is closed. This cycle is restarted on every statement execution, which is expensive, slow, and does not preserve your transactional context. Anyway, dbext is a tool designed for many databases, not just Oracle and, personally, I don't like generic database tools.

Wednesday, May 26, 2010

SqlPlus Injection

Despite the fact that at first sight it might look stupid, you may be hacked by a colleague in a very rude way. Suppose a developer asks you to create a new user for an upcoming system. Because he's a nice guy, he also hands you a simple script which creates this user along with all the required grants. Of course, even though you like your colleague and appreciate his effort, you carefully inspect the script before running it. Let's see a preview of this script in a plain vim window:
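It went something along these lines (a made-up illustration; user names and passwords are, of course, hypothetical):

/* Create the application user for the upcoming system. */
create user app_owner identified by app_pwd;

/* The required grants. */
grant connect, resource to app_owner;

/* 
   More grants will be added here as the system evolves.
#exec execute immediate 'alter user system identified by gotcha'
*/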
Oookey! The script has nice comments, nothing unusual... You run it in your sqlplus SYS session and... BANG! Your SYSTEM user is compromised and you won't even know it. If you still have the WTF face, then look again.
The catch is in the last comment. We are used to thinking that in sqlplus a multiline comment starts with /* (and, because sqlplus is quite picky, it has to be followed by a space or CR) and then everything up to the closing */ is taken as a comment. This assumption is wrong because, in sqlplus, a # at the very beginning of a line means "execute the command on that line". In fact, it doesn't have to be #, but this is the symbol configured by default for the sqlprefix setting. Just check it out:

SQL> show sqlprefix
sqlprefix "#" (hex 23)
However, we are simply fooled by our editor which, with its nice syntax highlighting feature, marked our comments accordingly. Of course, it doesn't know anything about the sqlplus "sqlprefix" setting. So, before running any third-party script you should look carefully at it, even at the comments.

Sunday, May 02, 2010

Autobackup CF with Flash Recovery Area

In our office we have a 10g RAC database. It has a flash recovery area enabled, which points to an ASM disk group. Nothing special, I would say... However, from time to time, our nightly backup script simply fails, complaining that it can't find some obsolete backups which should be deleted:

RMAN-06207: WARNING: 4 objects could not be deleted for DISK channel(s) due
RMAN-06208: to mismatched status. Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212: Object Type Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100427-00
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100427-01
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100428-00
RMAN-06214: Backup Piece /u01/app/oracle/product/10.2.0/db_1/dbs/c-24173594-20100428-01

That's weird! All those backup pieces are controlfile autobackups. RMAN looks for them on a local filesystem and, this being a RAC database, those files are accessible, obviously, from just one node. But how? They were supposed to be placed in our shared storage, in the FRA, to be more precise. Well, let's look once again at our settings:

SQL> show parameter recov

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      +DG1
db_recovery_file_dest_size           big integer 150000M
recovery_parallelism                 integer     0

Okey, it's clear we have a FRA! What about RMAN settings?

RMAN> show all;

using target database control file instead of recovery catalog
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 2 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO COMPRESSED BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_fd1.f'; # default

It looks good... the autobackup format for the controlfile is '%F', which is the default one, right? The documentation confirms that:

The default location for the autobackup on disk is the flash recovery area (if configured) or a platform-specific location (if not configured). RMAN automatically backs up the current control file using the default format of %F.

Okey, we have a flash recovery area and a %F default autobackup format... WTF? Well, the answer is given by metalink note 338483.1. Apparently, there is a big difference between having the autobackup format set to its default value and having it reset to its default... Interesting, huh? It is... So, if you explicitly set the autobackup format to %F, the autobackup file will go to an OS-specific location, which on Linux is $ORACLE_HOME/dbs. But if you have the autobackup format on its default (explicitly reset, or never set at all) and you have a FRA configured, then the autobackup file will actually go to the FRA.
So, in my case the solution was simple (notice the "# default" marker in the output below):

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK CLEAR;

old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';
RMAN configuration parameters are successfully reset to default value

RMAN> show controlfile autobackup format;

RMAN configuration parameters are:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default

Ooookey, really really unintuitive... I think the Oracle documentation should be more precise regarding this.

Friday, March 19, 2010

When having an RMAN retention policy based on REDUNDANCY is a bad idea...

Suppose you have an RMAN retention policy of "REDUNDANCY 2". This means that as long as you have at least two backups of the same datafile, controlfile/spfile or archivelog, the older backups become obsolete and RMAN is allowed to safely remove them.

Now, let's also suppose that every night you back up your database using the following script:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
run {
  backup database plus archivelog;
  delete noprompt obsolete redundancy 2;
}

The backup task is quite simple: first it ensures that the controlfile autobackup feature is on, then it backs up the database and the archivelogs and, at the end, it deletes all obsolete backups according to the REDUNDANCY 2 retention policy.
Using the above approach you might think that you can restore your database as it was two days ago, right? For example, if you have a backup taken on Monday and another one taken on Tuesday, you may restore your database as it was at any point within the (Monday_last_backup - Today) time interval. Well, that's wrong!

Consider the following scenario:
1. On Monday night you backup the database using the above script;
2. On Tuesday, during the day, you drop a tablespace. Because this is a structural database change, a controlfile autobackup is triggered. Yay, you have a new controlfile backup!
3. On Tuesday night you back up the database again... nothing unusual, right?

Well, the tricky part concerns the DELETE OBSOLETE command. When the backup script runs this command, RMAN finds three controlfile backups: one originating from the Monday backup, one from the structural change and a third from our just-finished Tuesday backup. Now, according to the retention policy of "REDUNDANCY 2", RMAN will assume that it is safe to delete the controlfile backup taken on Monday night, because it's out of our retention policy and it is the oldest one. Uuups... this means we're going to have a big problem restoring the database as it was before our structural change, because we don't have a controlfile backup from that time.

So, if you intend to perform incomplete recovery of your database to a previous point in time, it's really a good idea to switch to a retention policy based on a "RECOVERY WINDOW" instead. In our case, a RECOVERY WINDOW OF 2 DAYS would be more appropriate.
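The switch itself is a one-liner:

RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 2 DAYS;

With this policy, RMAN has to keep every backup needed to recover to any point within the last two days, which should include the controlfile autobackup triggered by the structural change.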

Sunday, February 28, 2010

PLSQL "All or Nothing" Pitfall

Transactions are such a common thing when working with databases. They act on an "all or nothing" basis, that is, they either succeed or fail, but they should always leave the database in a consistent state. Of course, in Oracle databases the rules are the same, but the interesting part I want to refer to is connected with PL/SQL modules (procedures, functions or packages).

A PL/SQL module is a kind of "all or nothing" component: if the procedure fails, it rolls back the uncommitted work it has done. Suppose we have the following procedure:
CREATE OR REPLACE PROCEDURE test AS
BEGIN
  insert into yyy values (1);
  raise_application_error(-20000, 'I am a cute error!');
END test;
/

Let's see what happens:
SQL> truncate table yyy;

Table truncated.

SQL> exec test;
BEGIN test; END;

*
ERROR at line 1:
ORA-20000: I am a cute error!
ORA-06512: at "TALEK.TEST", line 4
ORA-06512: at line 1


SQL> select * from yyy;

no rows selected


Nice... we didn't explicitly roll back, but Oracle was smart enough to do the cleanup job for us. This makes sense and shows that PL/SQL modules are, in a way, "all or nothing" components.

Now, let's say we have an Oracle job which calls our "test" procedure and, if an error occurs, has to log it into another table. A possible implementation of the job's PL/SQL caller block may be:

begin
  test;
exception
  when others then
    insert into log values (dbms_utility.format_error_stack);
    commit;
    raise;
end;
/


The above code may seem harmless: the test procedure is called and, if it raises an error, the exception part of the PL/SQL caller block is executed, which inserts the error into our log table. Of course, we commit the log entry we just inserted and we re-raise the originating error. We know that if the test procedure fails it rolls back its uncommitted work, as we saw above. After all, it's an "all or nothing" piece, right? Well, here's the pitfall: if you catch the exception, then the procedure which raised the error will not clean up anything as long as you are within the EXCEPTION section. Even though the whole anonymous block will fail because of re-raising the original error, the COMMIT statement from the EXCEPTION section will actually commit the incomplete work done by our TEST procedure. So, in most cases you should look twice at such EXCEPTION WHEN ... THEN COMMIT definitions... otherwise you may end up with nasty bugs. In the above example, in order to avoid this problem, a ROLLBACK should be performed before logging the error. Of course, there are smarter logging solutions which use autonomous transactions but, anyway, the goal here was just to reveal the pitfall.
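For the record, the fixed caller block differs in just one line (a sketch):

begin
  test;
exception
  when others then
    rollback;  -- undo TEST's uncommitted work before logging
    insert into log values (dbms_utility.format_error_stack);
    commit;
    raise;
end;
/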

Wednesday, February 24, 2010

INS-32018 Warning for Standalone Server

When it comes to installing Oracle, you should always follow the procedures written in the installation guides. As you already know, Oracle 11.2 packages ASM within a new, separate component called Oracle Grid Infrastructure. So, if you want to place database files into ASM, then you must install Grid Infrastructure. As a good practice, Oracle recommends installing it under a different user, typically named "grid".
As far as the OFA directory structure is concerned, the installation guide recommends:

  • to create a "/u01/app/grid" directory to be used as the ORACLE_BASE for this "grid" user;

  • to create a "/u01/app/11.2.0/grid" directory to be used as the ORACLE_HOME for this "grid" user.


If you're like me, the above configuration looks a little bit weird, because I used to think that the ORACLE_HOME should be somewhere under the ORACLE_BASE directory. Nevertheless, the documentation clearly states the following:

Caution:

For grid infrastructure for a cluster installations, the Grid home must not be placed under one of the Oracle base directories, or under Oracle home directories of Oracle Database installation owners, or in the home directory of an installation owner. During installation, ownership of the path to the Grid home is changed to root. This change causes permission errors for other installations.

However, the above applies just to cluster installations. If you just want ASM for a single-instance database, then it's fine (and recommended) to place the ORACLE_HOME under the ORACLE_BASE. If you don't do so, you'll get the following warning:

[screenshot: the INS-32018 warning raised by the installer]
So, to sum up the above ideas: remember that if you are going to install a RAC, then you need to create the grid ORACLE_HOME outside the ORACLE_BASE of any Oracle software owner. If you choose to install the Oracle Grid Infrastructure for a standalone server, then the ORACLE_HOME of the grid user should be under its ORACLE_BASE.

Wednesday, February 17, 2010

ALL_TABLES versus ALL_ALL_TABLES

If you ever wondered what the difference is between ALL_TABLES and ALL_ALL_TABLES, then here's the answer: both views list the tables to which the current user has access but, in addition to the tables returned by ALL_TABLES, ALL_ALL_TABLES also returns all object tables (system generated or not) accessible by the current user.
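A quick way to see the difference for yourself (a sketch; the rows returned are precisely the object tables):

SQL> select owner, table_name from all_all_tables
  2  minus
  3  select owner, table_name from all_tables;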

Pay attention: this may be an interview question (e.g. "how can you get all the tables you have access to?") and you may leave a good impression if you respond with another question: "Do you also want the object tables to be included?". :)

Wednesday, December 16, 2009

A DDL statement may fire a DML trigger

Maybe you know this, maybe you don't. Because it's not quite obvious, it deserves a little attention. We all know about DML triggers. Remember? Yea, yea... the before/after insert/update/delete for each row triggers. We tend to think that only the INSERT, UPDATE or DELETE statements fire the corresponding triggers (of course, if any are defined). That's true, with one (as far as I know) important exception: a DDL statement which adds a new column with a default value will also fire the UPDATE trigger.

For example, let's create a dummy table:
SQL> create table muc (col1 integer primary key, modify_date timestamp);

Table created.

Then, the corresponding trigger:
SQL> create or replace trigger trg_muc_mod_dt before update on muc for each row
2 begin
3 :new.modify_date := systimestamp;
4 end;
5 /

Add some records:
SQL> insert into muc values (1, systimestamp);

1 row created.

SQL> insert into muc values (2, systimestamp);

1 row created.

SQL> commit;

We end up having:
SQL> select * from muc;

      COL1 MODIFY_DATE
---------- ------------------------------
         1 16-DEC-09 09.54.03.804223 PM
         2 16-DEC-09 09.54.41.815575 PM

Now, the moment of truth:
SQL> alter table muc add (active integer default '0');

Table altered.

SQL> select * from muc;

      COL1 MODIFY_DATE                        ACTIVE
---------- ------------------------------ ----------
         1 16-DEC-09 09.55.53.836113 PM            0
         2 16-DEC-09 09.55.53.840896 PM            0

Take a look at MODIFY_DATE and notice the new timestamps. The update trigger was invoked in response to our DDL statement. This is important to know. Think of a deposit table which has a column named LAST_UPDATED and a trigger which sets it whenever something within a deposit changes. Now, suppose the business logic dictates that a new column must be added with a default value. You run the DDL statement to add that column and... suddenly, all information regarding when a particular deposit was last updated is lost. Oops. So, I should write down one hundred times: "Think twice before adding new columns with default values on a table with UPDATE triggers".
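By the way, if you do have to add such a column and want MODIFY_DATE kept intact, one option (a sketch, using our dummy table) is to disable the trigger around the DDL:

SQL> alter trigger trg_muc_mod_dt disable;
SQL> alter table muc add (active integer default '0');
SQL> alter trigger trg_muc_mod_dt enable;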

Sunday, November 29, 2009

Strange RMAN snapshot controlfile issue

A strange thing happened today. I executed a DELETE OBSOLETE command at my RMAN prompt and it reported the snapshot controlfile as obsolete. I don't know under which circumstances this problem occurs and I couldn't find any relevant information on forums or metalink (oh! sorry, "my oracle support") about it.

Below is the output of the DELETE OBSOLETE command:
RMAN> delete obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
using channel ORA_DISK_2
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Control File Copy 36 29-11-2009 12:35:33 /u01/app/oracle/product/11.2.0/
dbhome_1/dbs/snapcf_tetris.f

Do you really want to delete the above objects (enter YES or NO)? y
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_2 channel at 11/29/2009 21:11:16
ORA-19606: Cannot copy or restore to snapshot control file


Indeed, this is the default configured snapshot controlfile:
RMAN> show snapshot controlfile name;               

RMAN configuration parameters for database with db_unique_name TETRIS are:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/
dbhome_1/dbs/snapcf_tetris.f';

It seems I'm in a kind of deadlock here. The snapshot controlfile is reported as obsolete, but it can't be deleted because it is in use by RMAN. The only solution I found was to change the RMAN configuration to use another snapshot controlfile, then delete the reported obsolete one, and finally switch back to the default.
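Something along these lines (the temporary name is just an example):

RMAN> configure snapshot controlfile name to '/tmp/snapcf_tetris_tmp.f';
RMAN> delete noprompt obsolete;
RMAN> configure snapshot controlfile name clear;

However, the question remains: why is the snapshot controlfile reported as obsolete in the first place?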

PS: This happened on an 11gR2 database installed on a Linux x86 platform.

Update: Apparently, this is encountered after executing a DUPLICATE database FROM ACTIVE DATABASE. Furthermore, the snapshot controlfile is reported as a "datafile copy" when a CROSSCHECK is suggested. See below:
RMAN> delete obsolete;                                                                                                                                                                                           

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
using channel ORA_DISK_2
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Control File Copy 40 30-11-2009 18:41:15 /u01/app/oracle/product/11.2.0/dbhome_1
/dbs/snapcf_tetris.f

Do you really want to delete the above objects (enter YES or NO)? y

RMAN-06207: WARNING: 1 objects could not be deleted for DISK channel(s) due
RMAN-06208: to mismatched status. Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212: Object Type Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Datafile Copy /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f

Obviously, that can't be a datafile copy. So, let's try a crosscheck as suggested:
RMAN> crosscheck datafilecopy '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f';                                                                                                                     

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=148 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of crosscheck command at 11/30/2009 19:09:43
RMAN-20230: datafile copy not found in the repository
RMAN-06015: error while looking up datafile copy name: /u01/app/oracle/product/11.2.0
/dbhome_1/dbs/snapcf_tetris.f

Okey, this was expected, as I don't have any datafile copy with that name, despite what RMAN says. So, let's try a crosscheck for the controlfile copy:
RMAN> crosscheck controlfilecopy '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f';                                                                                                                  

released channel: ORA_DISK_1
released channel: ORA_DISK_2
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=148 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=140 device type=DISK
validation failed for control file copy
control file copy file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
RECID=40 STAMP=704313675
Crosschecked 1 objects

As can be seen, the validation fails, although the file exists at that location:
$ ls -al /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f
-rw-r----- 1 oracle oinstall 10436608 Nov 30 18:57 /u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_tetris.f

I don't know if this is documented somewhere, but it looks to me like a bug. No idea why the snapshot controlfile is messed up after a DUPLICATE TARGET DATABASE ... FROM ACTIVE DATABASE.

Friday, November 27, 2009

TSPITR to recover a dropped tablespace

A nice feature of Oracle 11gR2 is the ability to recover a dropped tablespace using TSPITR. Of course, in order to pull this off, you need valid backups. Let's test this! First of all, just to be on the safe side, take a fresh backup of the database:
BACKUP DATABASE PLUS ARCHIVELOG;

Then, supposing you have a "MUCI" tablespace, simply drop it:
drop tablespace MUCI including contents;

Let's try to recover the "MUCI" tablespace. You'll need the nearest timestamp or SCN from just before the tablespace was dropped.

If you are tempted to use the fully automatic TSPITR, be prepared for trouble. This is what happened to me when I tried it:
RMAN> recover tablespace muci until scn 2240386 auxiliary destination '/u01/app/backup';

...

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 11/27/2009 21:57:13
RMAN-06965: Datapump job has stopped
RMAN-06961: IMPDP> Job "SYS"."TSPITR_IMP_hilc" stopped due to fatal error at 21:57:09
RMAN-06961: IMPDP> ORA-39123: Data Pump transportable tablespace job aborted
ORA-01565: error in identifying file '/u01/app/oracle/oradata/TETRIS/datafile/o1_mf_muci_5k0bwdmb_.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3


I googled it and found this post, which recommends dropping the tablespace without "AND DATAFILES" but, as far as I'm concerned, it didn't work.
Nevertheless, setting a new name for the datafile which belonged to the dropped tablespace did the job:
RMAN> run {
2> set newname for datafile 6 to new;
3> recover tablespace muci until scn 2240386 auxiliary destination '/u01/app/backup';
4> }

A direct consequence of this in 11gR2 is that you can perform TSPITR multiple times for the same tablespace without using a recovery catalog. If you chose a wrong SCN and have already brought the recovered tablespace ONLINE, you can simply drop it and try again with another SCN.

Awesome!

Annoying Tablespace Quotas

There's one thing about tablespace quotas which I really don't like. If I allocate quota on a tablespace to a user and then I drop that tablespace, the quota is not automatically revoked. It can still be seen in the DBA_TS_QUOTAS view, but with the DROPPED column set to YES. However, if I afterwards create a tablespace with the same name as the one previously dropped, the old quota is auto-magically reactivated on this new tablespace, which might not be my intention. Let's see it in action:

1. first of all, let's create a dummy tablespace:
SQL> create tablespace test_tbs datafile size 20M;                                                                               

Tablespace created.

2. let's also create a user and grant quota on the TEST_TBS tablespace:
SQL> create user gogu identified by xxx quota unlimited on users;                                                                

User created.

SQL> alter user gogu quota unlimited on test_tbs;

User altered.

3. take a look at quotas:
SQL> select * from dba_ts_quotas where username='GOGU';                                                                          

TABLESPACE_NAME USERNAME             BYTES  MAX_BYTES     BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS           GOGU                     0         -1          0         -1 NO
TEST_TBS        GOGU                     0         -1          0         -1 NO

4. now drop the TEST_TBS tablespace and look again at quotas:
SQL> drop tablespace test_tbs including contents and datafiles;                                                                  

Tablespace dropped.

SQL> select * from dba_ts_quotas where username='GOGU';

TABLESPACE_NAME USERNAME             BYTES  MAX_BYTES     BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS           GOGU                     0         -1          0         -1 NO
TEST_TBS        GOGU                     0         -1          0         -1 YES

Just notice that the DROPPED column is now set to YES for the TEST_TBS tablespace. This I don't like, and if I want to revoke the quota, Oracle complains that it doesn't know anything about the TEST_TBS tablespace:
SQL> alter user gogu quota 0 on test_tbs;                                                                                        
alter user gogu quota 0 on test_tbs
*
ERROR at line 1:
ORA-00959: tablespace 'TEST_TBS' does not exist

Obvious... but then why preserve that quota in DBA_TS_QUOTAS anyway?

5. Let's recreate the TEST_TBS tablespace and then look at quotas:
SQL> create tablespace test_tbs datafile size 20M;                                                                               

Tablespace created.

SQL> select * from dba_ts_quotas where username='GOGU';

TABLESPACE_NAME USERNAME             BYTES  MAX_BYTES     BLOCKS MAX_BLOCKS DRO
--------------- --------------- ---------- ---------- ---------- ---------- ---
USERS           GOGU                     0         -1          0         -1 NO
TEST_TBS        GOGU                     0         -1          0         -1 NO

See how the "DROPPED" column is now back on "NO". But wait... this TEST_TBS tablespace is a new tablespace which just happen to be named like an old dropped tbs. Bleah... ugly!

So, this boils down to the conclusion that, when you are about to drop a tablespace, it is a good idea to check the quotas allocated to users and revoke them before dropping the tablespace. Otherwise they will remain in DBA_TS_QUOTAS and will be reactivated when a tablespace with the same name is created. Furthermore, I don't know how you can get rid of them if the tablespace no longer exists. Of course, you can create a dummy tablespace with the same name, revoke the quotas and, after that, drop the dummy tablespace. But this is an awful workaround.
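For our example, the safe drop sequence would have been simply (a sketch):

SQL> alter user gogu quota 0 on test_tbs;
SQL> drop tablespace test_tbs including contents and datafiles;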

Update: Still, I see an advantage to the above behaviour. In 11gR2 you can recover a dropped tablespace with TSPITR. After the TSPITR successfully completes and the dropped tablespace is recovered, the old quotas are also reactivated, which is a good thing for the users who had objects in that tablespace.

Wednesday, November 18, 2009

Do archivelogs become obsolete if they contain blocks from a BEGIN BACKUP operation?

Of course, not every possible case is described in the docs, therefore some of them simply have to be tried. So, today I was wondering what would happen if I left a tablespace in BEGIN BACKUP mode and continued to back up the database using:
RUN {
  BACKUP DATABASE PLUS ARCHIVELOG;
  DELETE NOPROMPT OBSOLETE;
}

As you already know, if a tablespace is put in BEGIN BACKUP mode, then the first subsequent change to each of its blocks forces the entire block image to be written into the redologs, which will eventually be archived. My main concern here was regarding the DELETE OBSOLETE command: is RMAN smart enough to know that those archives are not going to become obsolete as long as the BEGIN BACKUP status is in place? After some tests I can conclude: RMAN knows this and will NOT consider those archives obsolete. This was kind of obvious but, you know... it's always good to try and see with your own eyes.
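For reference, the test boils down to something like this (the tablespace name is just an example). Put a tablespace in backup mode:

SQL> alter tablespace users begin backup;

Then generate some changes, run the backup script above a couple of times, and check what RMAN thinks:

RMAN> report obsolete;

The archivelogs produced while the tablespace stays in backup mode never show up in the obsolete list.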

Sunday, November 15, 2009

High Availability Guaranteed Restore Points

I like the flashback database feature introduced in Oracle 10g and especially the guaranteed restore points. We intended to use them on our 10g production database to create guaranteed restore points before major upgrades, but without enabling flashback logging, because we wanted to affect overall database performance as little as possible. What really bothered me at the time was the fact that I couldn't create my first guaranteed restore point while the database was open which, in my opinion, hurts the high availability goal. The workaround was to already have at least one guaranteed restore point before creating the next ones with the database open, but this always felt ugly.
So, the question is: what do you choose? The overhead of always having flashback logging enabled, or the downtime produced by the creation of a guaranteed restore point? Hmmm...
Like many new Oracle features, this one seemed a little unpolished when first launched, but these features tend to get better and better. The same happened here. I've just had the nice surprise of finding that in 11gR2 (I don't know if this also applies to R1) you can create guaranteed restore points with the database open, without flashback logging enabled and without any previous guaranteed restore points:

SQL> select name from v$restore_point;

no rows selected

SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
NO

SQL> select status from v$instance;

STATUS
------------
OPEN

SQL> create restore point before_upgrade guarantee flashback database;

Restore point created.

SQL> select name from v$restore_point;

NAME
---------------------------------------------------
BEFORE_UPGRADE


Nice and good to know.

Sunday, November 01, 2009

RMAN Retention Policy with Corrupted Backups

I always assumed that RMAN is smart enough to take care of my database's obsolete backups. I give it the retention policy and that's it: whenever I invoke the DELETE OBSOLETE command, RMAN will identify the backups out of the scope of my retention policy and will safely delete them. Nevertheless, there is at least one big exception: when the backup taken is corrupted.

The following is quite self-explanatory. Let's assume we have a retention policy of REDUNDANCY 1 and we take a new backup of the database:

RMAN> backup database;

Starting backup at 01-11-2009 11:20:53
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set

...
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:26
Finished backup at 01-11-2009 11:22:20


Now we have two backups and, according to the configured retention policy, the previous one becomes obsolete. However, let's suppose that the backup we've just taken is corrupted. We can simulate this using dd (we're zeroing 1MB somewhere in the middle of the backup piece):

dd if=/dev/zero of=o1_mf_nnndf_TAG20091101T232053_5gvyxpwt_.bkp bs=1M seek=10 count=1


Okey! As a good practice, it's nice to validate the backup using the "RESTORE VALIDATE" command, so let's do it:

RMAN> restore validate database;

Starting restore at 01-11-2009 11:30:10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=42 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=37 device type=DISK

channel ORA_DISK_1: starting validation of datafile backup set
channel ORA_DISK_2: starting validation of datafile backup set

...

ORA-19599: block number 1280 is corrupt in backup piece
/opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/backupset/2009_11_01/o1_mf_nnndf_TAG20091101T232053_5gvyxpwt_.bkp

channel ORA_DISK_2: piece handle=/opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/backupset/2009_11_01/o1_mf_nnndf_TAG20091101T232053_5gvyxp3o_.bkp
tag=TAG20091101T232053
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: validation complete, elapsed time: 00:00:35
failover to previous backup

...
Finished restore at 01-11-2009 11:31:13


As you can see, the RESTORE VALIDATE worked as expected. It identified the corrupted backupset and failed over to the previous valid one. However, what if at the end of the backup script there's a "delete noprompt obsolete" command?

RMAN> delete noprompt obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
using channel ORA_DISK_1
using channel ORA_DISK_2
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Archive Log 2 01-11-2009 10:40:27 /opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/archivelog/2009_11_01/o1_mf_1_6_5gvwkv55_.arc
Backup Set 10 01-11-2009 11:19:57
Backup Piece 10 01-11-2009 11:19:57 /opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/backupset/2009_11_01/o1_mf_nnndf_TAG20091101T231814_5gvyrqdc_.bkp
Backup Set 9 01-11-2009 11:19:53
Backup Piece 9 01-11-2009 11:19:53 /opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/backupset/2009_11_01/o1_mf_nnndf_TAG20091101T231814_5gvyrqtm_.bkp
Backup Set 11 01-11-2009 11:20:04
Backup Piece 11 01-11-2009 11:20:04 /opt/oracle/app/oracle/flash_recovery_area
/VENUSDB/autobackup/2009_11_01/o1_mf_s_701824802_5gvyw3h1_.bkp
deleted archived log
archived log file name=/opt/oracle/app/oracle/flash_recovery_area/VENUSDB/archivelog/
2009_11_01/o1_mf_1_6_5gvwkv55_.arc RECID=2 STAMP=701822427
deleted backup piece
backup piece handle=/opt/oracle/app/oracle/flash_recovery_area/VENUSDB/backupset/
2009_11_01/o1_mf_nnndf_TAG20091101T231814_5gvyrqdc_.bkp RECID=10 STAMP=701824695
deleted backup piece
backup piece handle=/opt/oracle/app/oracle/flash_recovery_area/VENUSDB/backupset/
2009_11_01/o1_mf_nnndf_TAG20091101T231814_5gvyrqtm_.bkp RECID=9 STAMP=701824695
deleted backup piece
backup piece handle=/opt/oracle/app/oracle/flash_recovery_area/VENUSDB/autobackup/
2009_11_01/o1_mf_s_701824802_5gvyw3h1_.bkp RECID=11 STAMP=701824803
Deleted 4 objects


Uuups! It just deleted our valid backupset. The proof:

RMAN> restore validate database;

Starting restore at 01-11-2009 11:35:03
using channel ORA_DISK_1
using channel ORA_DISK_2

channel ORA_DISK_1: starting validation of datafile backup set
channel ORA_DISK_2: starting validation of datafile backup set

...

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 11/01/2009 23:35:40
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 5 found to restore
RMAN-06023: no backup or copy of datafile 3 found to restore
RMAN-06023: no backup or copy of datafile 2 found to restore


I don't know if the above behavior is clearly mentioned in the Oracle backup and recovery documentation, but it should be taken into account when defining the backup and recovery strategy. Of course, a retention policy of REDUNDANCY 1 is not a setting to be used in production systems but, anyway, I'd expect trouble even if the retention policy is set to a higher redundancy. In my opinion, it would be great if RMAN could somehow label the corrupted backups at the time RESTORE VALIDATE is invoked and then take this into account when the retention policy is applied.

Meanwhile, in order to avoid the above scenario in your backup scripts, it's advisable to group the RESTORE VALIDATE and DELETE NOPROMPT OBSOLETE commands within a RUN { ... } block: if the first command fails, then the DELETE command will never be executed.
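For example (a sketch):

RUN {
  RESTORE VALIDATE DATABASE;
  DELETE NOPROMPT OBSOLETE;
}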

Tuesday, April 28, 2009

CREATE VIEW with FORCE does not work

Yesterday I loaded an Oracle dump into our 10.2.0.4 database... and guess what? Not all the views were created. I took a look into the impdp log and I saw some errors complaining that: ORA-00980: synonym translation is no longer valid. So what? The CREATE VIEW statements were issued with the FORCE clause, therefore the views should have been created, right?

Well, after some digging on metalink I found this. It basically says that there is a(nother) bug and, according to its description, "create force view using a synonym for a table fails to create the view if the synonym is invalid". The 10.2.0.3 and 10.2.0.4 databases are confirmed to be affected and this bug is supposed to be fixed in 10.2.0.5 and 11.2.

In my case, the solution was to fix the synonym problem and after that to re-import just the views using the INCLUDE parameter of the impdp utility.
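Something along these lines (credentials, directory and dump file name are made up):

impdp talek/pwd directory=DUMP_DIR dumpfile=app.dmp include=VIEW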

Thursday, February 12, 2009

WTF is that? (ep. 2)

Today, the next episode of the Oracle WTF stories. One of my colleagues brought to my attention the fact that the DECODE function doesn't work as expected when used with dates. He had a very simple test case:

create table muci (my_date date);

insert into muci
  select decode(to_date('30/12/2099', 'dd/mm/yyyy'),
                sysdate, null,
                to_date('30/12/2099', 'dd/mm/yyyy'))
    from dual;

He asked me: what will we have in the MUCI table after running the statements above? I didn't think too much about it. I realized that SYSDATE is not likely to be 30/12/2099 (even though the possibility of a wrong OS clock setting couldn't be excluded), so I simply said that the final result should be 30/12/2099.

Let's take a look:

SQL> select to_char(my_date, 'dd/mm/yyyy') from muci;

TO_CHAR(MY_DATE,'DD/MM/YYYY')
-----------------------------
30/12/1999


Well, this was unexpected... WTF? What's wrong with the YEAR? Even with a wrong OS clock setting this shouldn't happen. The reason must be somewhere else. Because I remembered that the result of DECODE depends on the types of its arguments, I said: let's look into the docs! Yep, the answer was there: "if the first result is null, then Oracle converts the return value to the datatype VARCHAR2". How does this apply to our test case? It's simple: in fact, the whole result of the DECODE is a VARCHAR2 and not a DATE, as one might think. The VARCHAR2 representation of a date value depends on NLS_DATE_FORMAT, which on our server was:

SQL> select value from nls_session_parameters
where parameter='NLS_DATE_FORMAT';

VALUE
----------------------------------------
DD-MON-RR

So, when the INSERT was done, the inserted value was the string '30-DEC-99', which was then automatically cast back to a DATE according to the NLS_DATE_FORMAT setting, and with the RR format the year 99 became 1999. That's how we ended up with a “wrong” year in the final result. Lovely!
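By the way, rewriting the test with CASE avoids the trap, because here the return type is dictated by the DATE expression rather than by the NULL (a sketch):

insert into muci
  select case
           when to_date('30/12/2099', 'dd/mm/yyyy') = sysdate then null
           else to_date('30/12/2099', 'dd/mm/yyyy')
         end
    from dual;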

Tuesday, October 07, 2008

Remotely Connect to a RESTRICT Opened Database

Lately, I have this subconscious mantra which basically says: don't believe everything the Oracle official docs say, but try and prove those facts! For example, one thing to try is starting an instance in restricted mode and then checking what Oracle says in the Administration Guide 11g, "Starting Up a Database" chapter:

when the instance is in restricted mode, a database administrator cannot access the instance remotely through an Oracle Net listener, but can only access the instance locally from the machine that the instance is running on.

Let's try! On the server:

SQL> startup restrict
ORACLE instance started.

Total System Global Area 835104768 bytes
Fixed Size 2149000 bytes
Variable Size 595592568 bytes
Database Buffers 230686720 bytes
Redo Buffers 6676480 bytes
Database mounted.
Database opened.


On the client, using an admin user:

Enter user-name: admin@tbag
Enter password:
ERROR:
ORA-12526: TNS:listener: all appropriate instances are in restricted mode


What they forget to say here is that this behavior applies only to dynamically registered services. If I explicitly specify the SID_LIST in my listener.ora file, then I can connect remotely without any problem.
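A static registration along these lines in listener.ora (a sketch; the SID and the ORACLE_HOME are illustrative) is enough to let an admin in while the instance is in restricted mode:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = tbag)
      (ORACLE_HOME = /u01/app/oracle/product/11.1.0/db_1)
    )
  )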

Thursday, June 12, 2008

Oracle Linux Date

If you ever need to get the current Linux time from Oracle, then you might be interested in the following solution. First of all, Linux epoch time is expressed as the number of seconds since 1970-01-01 00:00:00 UTC and can be obtained by using the date +'%s' command. For example:


oracle@oxg:~$ date +'%s'
1213261534

From Oracle you can use the following custom function:


create or replace function current_linux_date return integer is
  -- the current time, normalized to UTC
  l_crr_date timestamp(9) := sys_extract_utc(systimestamp);
  -- the Linux epoch: 1970-01-01 00:00:00
  l_ref_date timestamp(9) := to_date('01011970', 'ddmmyyyy');
  l_seconds  integer;
begin
  -- convert the DAY TO SECOND interval into plain seconds
  l_seconds := extract(day from (l_crr_date - l_ref_date)) * 24 * 3600 +
               extract(hour from (l_crr_date - l_ref_date)) * 3600 +
               extract(minute from (l_crr_date - l_ref_date)) * 60 +
               extract(second from (l_crr_date - l_ref_date));
  return l_seconds;
end current_linux_date;
/

Now, you should get the same result from Oracle:


SQL> select current_linux_date from dual;

CURRENT_LINUX_DATE
------------------
1213261993

oracle@oxg:~$ date +'%s'
1213261993

Have fun!