
Tuesday, April 28, 2009

CREATE VIEW with FORCE does not work

Yesterday I loaded an Oracle dump into our 10.2.0.4 database... and guess what? Not all the views were created. I took a look into the impdp log and saw several errors complaining that: ORA-00980: synonym translation is no longer valid. So what? The CREATE VIEW statements were issued with the FORCE clause, therefore the views should have been created, right?

Well, after some digging on Metalink I found this note. It basically says that there is a(nother) bug and, according to its description: create force view using a synonym for a table fails to create the view if the synonym is invalid. The 10.2.0.3 and 10.2.0.4 databases are confirmed to be affected, and the bug is supposed to be fixed in 10.2.0.5 and 11.2.

In my case, the solution was to fix the synonym problem and, after that, to reimport just the views using the INCLUDE parameter of the impdp utility.
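
For the record, re-importing only the views can be done with something along these lines (a sketch; the dump file, directory and schema names are placeholders to be adapted):

impdp system@testdb directory=DATA_PUMP_DIR dumpfile=full_dump.dmp schemas=MY_SCHEMA include=VIEW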

Thursday, February 12, 2009

WTF is that? (ep. 2)

Today, the next episode of the Oracle WTF stories. One of my colleagues brought to my attention the fact that the DECODE function doesn't work as expected when used with dates. He had a very simple test case:

create table muci (my_date date);

insert into muci
  select decode(to_date('30/12/2099', 'dd/mm/yyyy'),
                sysdate,
                null,
                to_date('30/12/2099', 'dd/mm/yyyy'))
    from dual;

He asked me: what will we end up with in the MUCI table after running the statements above? I didn't think too much. I realized that SYSDATE is not likely to be 30/12/2099 and, even though the possibility of a wrong OS clock setting couldn't be entirely excluded, I simply said that the final result should be 30/12/2099.

Let's take a look:

SQL> select to_char(my_date, 'dd/mm/yyyy') from muci;

TO_CHAR(MY_DATE,'DD/MM/YYYY')
-----------------------------
30/12/1999


Well, this was unexpected... WTF? What's wrong with the YEAR? Even with a wrong OS clock setting this shouldn't happen; the reason must be somewhere else. Because I remembered that the result of DECODE depends on the type of its arguments, I said: let's look into the docs! Yep, the answer was there: "if the first result is null, then Oracle converts the return value to the datatype VARCHAR2". How does this apply to our test case? It's simple: the whole result of the DECODE is in fact a VARCHAR2 and not a DATE, as one might think. The VARCHAR2 representation of a plain date value depends on the NLS_DATE_FORMAT, which on our server was:

SQL> select value from nls_session_parameters
where parameter='NLS_DATE_FORMAT';

VALUE
----------------------------------------
DD-MON-RR

So, when the INSERT was done, the inserted value was actually the string '30-DEC-99', which was then implicitly converted back to a DATE according to the NLS_DATE_FORMAT setting, and because of the RR format mask we ended up with a "wrong" year in the final result. Lovely!
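
A simple way to avoid the implicit VARCHAR2 conversion (a sketch based on the same test case) is to make sure the first result of the DECODE is already a DATE, for example by using a typed NULL:

insert into muci
  select decode(to_date('30/12/2099', 'dd/mm/yyyy'),
                sysdate, to_date(null),
                to_date('30/12/2099', 'dd/mm/yyyy'))
    from dual;

Because the first result expression now has the DATE datatype, the whole DECODE returns a DATE and no NLS_DATE_FORMAT round trip takes place.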

Tuesday, October 07, 2008

Remotely Connect to a RESTRICT Opened Database

Lately I have had this subconscious mantra which basically says: don't believe everything the official Oracle docs say, but try to prove those facts yourself! For example, one thing to try is starting an instance in restricted mode and then verifying what Oracle says in the Administration Guide 11g, "Starting Up a Database" chapter:

when the instance is in restricted mode, a database administrator cannot access the instance remotely through an Oracle Net listener, but can only access the instance locally from the machine that the instance is running on.

Let's try! On the server:

SQL> startup restrict
ORACLE instance started.

Total System Global Area 835104768 bytes
Fixed Size 2149000 bytes
Variable Size 595592568 bytes
Database Buffers 230686720 bytes
Redo Buffers 6676480 bytes
Database mounted.
Database opened.


On the client, using an admin user:

Enter user-name: admin@tbag
Enter password:
ERROR:
ORA-12526: TNS:listener: all appropriate instances are in restricted mode


What they forget to say here is that this behavior applies only to dynamic listener registration. If I explicitly specify the SID_LIST within my listener.ora file, then I can connect remotely without problems.
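
For reference, a static registration entry in listener.ora looks roughly like this (a sketch; the SID name and the ORACLE_HOME path are placeholders which must match your environment):

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = tbag)
      (ORACLE_HOME = /opt/oracle/product/11.1.0/db_1)
    )
  )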

Thursday, June 12, 2008

Oracle Linux Date

If you ever need to get the current Linux time from Oracle then you might be interested in the following solution. First of all, the Linux epoch time is expressed as the number of seconds since 1970-01-01 00:00:00 UTC and can be obtained by using the date +'%s' command. For example:


oracle@oxg:~$ date +'%s'
1213261534

From Oracle you can use the following custom function:


create or replace function current_linux_date return integer is
  l_crr_date timestamp(9) := SYS_EXTRACT_UTC(systimestamp);
  l_ref_date timestamp(9) := to_date('01011970', 'ddmmyyyy');
  l_seconds  integer;
begin
  l_seconds := extract(day from (l_crr_date - l_ref_date)) * 24 * 3600 +
               extract(hour from (l_crr_date - l_ref_date)) * 3600 +
               extract(minute from (l_crr_date - l_ref_date)) * 60 +
               extract(second from (l_crr_date - l_ref_date));
  return(l_seconds);
end current_linux_date;
/

Now, you should get the same result from Oracle:


SQL> select current_linux_date from dual;

CURRENT_LINUX_DATE
------------------
1213261993

oracle@oxg:~$ date +'%s'
1213261993
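
Just as a side note, the reverse conversion is plain date arithmetic; a small sketch using the epoch value from above (the result is a DATE expressed in UTC):

select to_date('01011970', 'ddmmyyyy') + 1213261993 / 86400 as utc_date
  from dual;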

Have fun!

Monday, May 12, 2008

Profiling the new SIMPLE_INTEGER type

Oracle 11g comes with a new PL/SQL type called SIMPLE_INTEGER. The official documentation says that this type yields significant performance gains compared to the PLS_INTEGER type. Because I wanted to see this with my own eyes, I decided to test it using another new 11g component, the hierarchical profiler, which I also wanted to see at work.

First of all, let’s setup the environment:

1. on the database server create a new directory to be used for creating profiler trace files:

oracle@obi:oracle$ mkdir profiler
oracle@obi:oracle$ chmod o-rx profiler/


2. create the DIRECTORY object in the database too, and grant read/write privileges to the testing user (in our case TALEK user):

SQL> create directory profiler_dir as '/opt/oracle/profiler';

Directory created.

SQL> grant read, write on directory profiler_dir to talek;

Grant succeeded.


3. grant execute privilege for DBMS_HPROF package to the TALEK user:

SQL> grant execute on dbms_hprof to talek;

Grant succeeded.


4. connect as the TALEK user and create the following package (the only difference between the first and the second approach is the type of the l_count variable):

create or replace package trash is

  procedure approach_1;

  procedure approach_2;

end trash;
/

create or replace package body trash is

  procedure approach_1 as
    l_count pls_integer := 0;
  begin
    for i in 1..10000 loop
      l_count := l_count + 1;
    end loop;
    dbms_output.put_line(l_count);
  end;

  procedure approach_2 as
    l_count simple_integer := 0;
  begin
    for i in 1..10000 loop
      l_count := l_count + 1;
    end loop;
    dbms_output.put_line(l_count);
  end;

end trash;
/


5. Profile the approaches:

SQL> exec dbms_hprof.start_profiling(location => 'PROFILER_DIR', filename => 'test.trc');

PL/SQL procedure successfully completed

SQL> exec trash.approach_1;

PL/SQL procedure successfully completed

SQL> exec trash.approach_2;

PL/SQL procedure successfully completed

SQL> exec dbms_hprof.stop_profiling;

PL/SQL procedure successfully completed


6. Analyze the generated trace file. For this we’ll use the "plshprof" command line utility.

oracle@obi:profiler$ plshprof -output report test.trc
PLSHPROF: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
[8 symbols processed]
[Report written to 'report.html']


Aaaaand, the WINNER is:

TALEK.TRASH.APPROACH_1 -> 5713 (microseconds)
TALEK.TRASH.APPROACH_2 -> 100706 (microseconds)


Well… this is unexpected. According to the Oracle docs, SIMPLE_INTEGER should be faster. OK, back to the official doc: "The new PL/SQL SIMPLE_INTEGER data type is a binary integer for use with native compilation which is neither null checked nor overflow checked". Ahaaa… native compilation! Let's check this:

SQL> show parameter plsql_code_type

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
plsql_code_type string INTERPRETED


So, we have a first clue and a first conclusion: if the database doesn't use NATIVE compilation, the SIMPLE_INTEGER type is actually much slower.

Let’s switch to native compilation. This can be easily done because the "plsql_code_type" parameter is dynamic:

SQL> alter system set plsql_code_type=native scope=both;

System altered.


It is important to recompile the package, because otherwise the old PL/SQL byte code will still be used (you can use "alter package trash compile plsql_code_type=native;"), then repeat the profiler tests.
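
To double-check that the package was really recompiled natively, the compiler settings recorded in the data dictionary can be queried (a quick sketch; TRASH is the test package from above):

alter package trash compile plsql_code_type=native;

select name, type, plsql_code_type
  from user_plsql_object_settings
 where name = 'TRASH';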

The new results are:

TALEK.TRASH.APPROACH_2 -> 3927 (microseconds)
TALEK.TRASH.APPROACH_1 -> 12556 (microseconds)


Now, the second approach with SIMPLE_INTEGER is much faster and, interestingly, the PLS_INTEGER approach is actually slower under native compilation than the same approach was in the initial interpreted environment.

Okay, one more thing: I really enjoy the new 11g hierarchical profiler. From my point of view it is a big step forward compared with the old DBMS_PROFILER, and the HTML reports produced by "plshprof" are quite lovely.

Sunday, May 11, 2008

WTF is that? (ep. 1)

I've just decided to put here some posts about the (...well, you know) Oracle WTF stories: those moments (...hours, and sometimes days) when, sitting at my desk, I'm just staring at that stupid SQLPLUS> prompt, unable to figure out what the hell is happening.

Today, episode 1:

The scenario is very simple. I have two nice users: TALEK and SIM. TALEK has a table and grants UPDATE rights on it to SIM.

SQL> connect talek
Enter password:
Connected.

SQL> create table muci (col1 varchar2(10));

Table created.

SQL> insert into muci values ('abc')
2 /

1 row created.

SQL> commit;

Commit complete.

SQL> grant update on muci to sim;

Grant succeeded.

SQL> connect sim
Enter password:
Connected.

SQL> update talek.muci set col1='xyz' where col1='abc';
update talek.muci set col1='xyz' where col1='abc'
*
ERROR at line 1:
ORA-01031: insufficient privileges


Of course, this is the WTF moment. Why did the UPDATE fail? The first thing to do is to check the DBA_TAB_PRIVS view for confirmation that the UPDATE privilege is still there. (I'm pretty sure nobody was fast enough to revoke the granted privilege in the meantime but, just in case...)

SQL> select grantee, owner, table_name, privilege 
from dba_tab_privs where table_name='MUCI' and owner='TALEK';

GRANT OWNER TABLE_NAM PRIVILEGE
----- ----- --------- ----------
SIM TALEK MUCI UPDATE


And yes, the privilege is there. Hmmm... what's next? Usually the next thought is that another Oracle bug is making fun of me. But this sounds too scary to be true. Finally, the stupid answer comes to light.

SQL> show parameter sql92_security

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
sql92_security boolean TRUE


The database reference documentation says the following:
"The SQL92 standards specify that security administrators should be able to require that users have SELECT privilege on a table when executing an UPDATE or DELETE statement that references table column values in a WHERE or SET clause. SQL92_SECURITY specifies whether users must have been granted the SELECT object privilege in order to execute such UPDATE or DELETE statements."

With the above sql92_security parameter set, it is actually the "where col1='abc'" filter of the UPDATE statement that triggers the "insufficient privileges" error, not the UPDATE itself. Without a filter the update executes as expected:

SQL> update talek.muci set col1='xyz';

1 row updated.
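
With SQL92_SECURITY set to TRUE, the clean fix is to grant the SELECT privilege on the table as well; after that the filtered UPDATE goes through (a sketch following the same scenario):

-- as TALEK
grant select on muci to sim;

-- as SIM: the filter can now be evaluated, no more ORA-01031
update talek.muci set col1 = 'abc' where col1 = 'xyz';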


Ok, another lesson has been learned!

Friday, May 09, 2008

Obsolete Policy Violation

This is quite funny. Starting with 10g, the LOG_ARCHIVE_START parameter is obsolete. Nevertheless, in the "Database Control" of a newly installed 11g database I have a policy rule violation called "Use of Automatic Log Archival Features". The description of this policy rule says: "This policy ensures that archiving of redo logs is done automatically and prevents suspension of instance operations when redo logs fill. Only applicable if database is in archivelog mode".

Well, my database is already configured to run in archivelog mode:
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 15
Next log sequence to archive 17
Current log sequence 17


So, where is the problem? The action suggested by the "Database Control" console is "Set the value of the LOG_ARCHIVE_START initialization parameter to TRUE". Hmmm, that means setting an obsolete parameter... no, thanks!

Update: Actually, I discovered that there are a lot of irrelevant/obsolete policies for 11g, like "SGA_TARGET is not used" or "PGA_AGGREGATE_TARGET should be used", etc. It seems that many default settings in "Database Control" do not take into consideration the actual version of the managed database, therefore you have to manually suppress these policy/rule violations.

Thursday, March 27, 2008

TKPROF Everywhere

Lately, I was searching for a solution that lets developers "tkprof" their statements in the most non-intrusive way possible. A quite appealing solution I found was the one suggested by Tom Kyte in his "Effective Oracle by Design" book. It involves reading the user trace file from the server and returning the content of that trace through a pipelined table function. The result may be spooled into a file on the client side and, after that, "tkprof" may be run against this local file. However, the problem is that I personally use Oracle Instant Client and don't have tkprof on my local machine. Furthermore, I don't use sqlplus all the time, therefore I would really like to get the formatted tkprof output directly from a simple SQL SELECT statement.

The solution is quite simple: instead of returning the raw trace file it's enough to run tkprof utility on the server with the user dump trace file as a parameter and, eventually, to return the final output.

In order to setup the needed environment we'll have to:

1.create the user which will own the “tkprof” framework:

grant create session, alter session, create procedure to tools identified by xxx;

2.grant additional privileges for this user:

grant select on sys.v_$process to tools;
grant select on sys.v_$session to tools;
exec dbms_java.grant_permission( 'TOOLS',
'SYS:java.lang.RuntimePermission', 'writeFileDescriptor', '' );
exec dbms_java.grant_permission( 'TOOLS',
'SYS:java.lang.RuntimePermission', 'readFileDescriptor', '' );

3.create the following java source object. This is needed in order to execute an external program in the OS environment (thanks again, Tom, for this; see the "Java procedure for host calls on Unix environment" question):

create or replace and compile java source named util as
import java.io.*;
import java.lang.*;

public class Util extends Object
{

public static int RunThis(String args)
{
Runtime rt = Runtime.getRuntime();
int rc = -1;

try
{
Process p = rt.exec(args);

int bufSize = 4096;
BufferedInputStream bis =
new BufferedInputStream(p.getInputStream(), bufSize);
int len;
byte buffer[] = new byte[bufSize];

// Echo back what the program spit out
while ((len = bis.read(buffer, 0, bufSize)) != -1)
System.out.write(buffer, 0, len);

rc = p.waitFor();
}
catch (Exception e)
{
e.printStackTrace();
rc = -1;
}
finally
{
return rc;
}
}
}

4.Create the package which embeds the whole tracing logic:

create or replace package tkprof
/**
Provides the tkprof tracing feature to developers. This package is
supposed to be used in developing environments only.

Required rights (the owner of this package is supposed to be TOOLS):

grant alter session to tools;
grant select on sys.v_$process to tools;
grant select on sys.v_$session to tools;

exec dbms_java.grant_permission( 'TOOLS', 'SYS:java.lang.RuntimePermission', 'writeFileDescriptor', '' );
exec dbms_java.grant_permission( 'TOOLS', 'SYS:java.lang.RuntimePermission', 'readFileDescriptor', '' );

*/
is

/**
The complete path along the name of the tkprof utility. Change this constant to fit to your
environment. After that the following right is required:

exec dbms_java.grant_permission( 'TOOLS', 'SYS:java.io.FilePermission', TKPROF.TKPROF_EXECUTABLE, 'execute' );

If a RAC configuration is used then the path should be the same across all nodes. If the ORACLE_HOME
env variable is different between the RAC nodes then the value of ORACLE_HOME should be obtained dynamically
by using, for example, the DBMS_SYSTEM.get_env procedure, or symbolic links may be created in the OS
environment of every RAC node.
*/
TKPROF_EXECUTABLE constant varchar2(300) := '/opt/oracle/product/10.2.0/db_1/bin/tkprof';

/**
The directory where the user trace files are stored. May be found out using:

show parameter user_dump_dest

If a RAC configuration is used then the path should be the same across all nodes. If not, then the value
should be fetched dynamically from the v$parameter view. Another approach would be to create symbolic links in the
OS environment of every RAC node.
*/
UDUMP_PATH constant varchar2(300) := '/opt/oracle/admin/rfd/udump';

/**
The name of the oracle directory object which points out to the above
path. The owner of this package must have read privileges on this
directory:

create directory UDUMP_DIR as '/opt/oracle/admin/rfd/udump';
grant read on directory UDUMP_DIR to tools;
*/
UDUMP_DIR constant varchar2(30) := 'UDUMP_DIR';

/**
A simple type used to return the tkprof_output.
*/
type tkprof_output is table of varchar2(4000);

/**
Enable tracing for the current session.
*/
procedure enable;

/**
Disable tracing for the current session.
*/
procedure disable;

/**
Get the status of the tracing for the current session.

#return 'TRUE' if the trace is enabled, 'FALSE' otherwise.
*/
function is_enabled return varchar2;

/**
Get the tkprof content through a pipelined table function.

#pi_tkprof_params additional parameters to tkprof
*/
function output(pi_tkprof_params varchar2 := '') return tkprof_output pipelined;

end tkprof;
/

create or replace package body tkprof is

  g_unique_session_id varchar2(100);
  g_trace_file_name   varchar2(4000);
  g_is_enabled        boolean;

  function run(pi_cmd in varchar2) return number as
    language java name 'Util.RunThis(java.lang.String) return integer';

  procedure enable as
  begin
    execute immediate 'alter session set tracefile_identifier=''' ||
                      g_unique_session_id || '''';
    execute immediate 'alter session set timed_statistics=true';
    execute immediate 'alter session set max_dump_file_size=unlimited';
    execute immediate 'alter session set events ''10046 trace name context forever, level 12''';
    g_is_enabled := true;
    dbms_output.put_line('session tracing enabled under ' ||
                         g_unique_session_id || ' identifier.');
  end;

  function is_enabled return varchar2 as
  begin
    if g_is_enabled then
      return 'TRUE';
    else
      return 'FALSE';
    end if;
  end;

  procedure disable as
  begin
    execute immediate 'alter session set events ''10046 trace name context off''';
    g_is_enabled := false;
    dbms_output.put_line('session tracing disabled');
  end;

  function output(pi_tkprof_params varchar2 := '') return tkprof_output
    pipelined as
    l_status  pls_integer;
    l_bfile   bfile := bfilename(UDUMP_DIR, g_trace_file_name || '.prf');
    l_last    pls_integer := 1;
    l_current pls_integer;
  begin
    dbms_java.set_output(10000);
    l_status := run(TKPROF_EXECUTABLE || ' ' || UDUMP_PATH || '/' ||
                    g_trace_file_name || ' ' || UDUMP_PATH || '/' ||
                    g_trace_file_name || '.prf ' || pi_tkprof_params);
    if l_status != 0 then
      dbms_output.put_line('status: ' || l_status);
      dbms_output.put_line('ERROR: cannot produce the tkprof trace!');
      return;
    else
      dbms_lob.fileopen(l_bfile);
      loop
        l_current := dbms_lob.instr(l_bfile, '0A', l_last, 1);
        exit when (nvl(l_current, 0) = 0);
        pipe row(utl_raw.cast_to_varchar2(dbms_lob.substr(l_bfile,
                                                          l_current - l_last + 1,
                                                          l_last)));
        l_last := l_current + 1;
      end loop;
      dbms_lob.fileclose(l_bfile);
    end if;
  exception
    when others then
      if dbms_lob.isopen(l_bfile) != 0 then
        dbms_lob.fileclose(l_bfile);
      end if;
      raise;
  end;

begin
  g_unique_session_id := sys_context('userenv', 'sessionid');
  select sys_context('userenv', 'instance_name') || '_ora_' ||
         ltrim(to_char(a.spid)) || '_' || g_unique_session_id || '.trc'
    into g_trace_file_name
    from v$process a, v$session b
   where a.addr = b.paddr
     and b.audsid = sys_context('userenv', 'sessionid');
end tkprof;
/

5.Change the TKPROF_EXECUTABLE and UDUMP_PATH constants to fit your environment.
6.Create the UDUMP_DIR directory:

create directory UDUMP_DIR as '/opt/oracle/admin/rfd/udump';
grant read on directory UDUMP_DIR to tools;

7.Grant execute rights for tkprof executable:

exec dbms_java.grant_permission( 'TOOLS', 'SYS:java.io.FilePermission',
TKPROF.TKPROF_EXECUTABLE, 'execute' );

8.Make the TKPROF package available to everybody:

grant execute on tools.tkprof to public;
create public synonym tkprof for tools.tkprof;

Okay, that's it! Now, let's test it:

fits_cr@RFD> exec tkprof.enable

session tracing enabled under 7154134 identifier.

PL/SQL procedure successfully completed.

fits_cr@RFD> select count(1) from rolmda;

COUNT(1)
----------
95

fits_cr@RFD> exec tkprof.disable;
session tracing disabled

PL/SQL procedure successfully completed.


fits_cr@RFD> select * from table(tkprof.output);

COLUMN_VALUE
--------------------------------------------------------------------------------------
TKPROF: Release 10.2.0.3.0 - Production on Thu Mar 27 15:13:06 2008
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Trace file: /opt/oracle/admin/rfd/udump/rfd3_ora_26618_7154134.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

select /*+ index(idl_sb4$ i_idl_sb41) +*/ piece#,length,piece
from
idl_sb4$ where obj#=:1 and part=:2 and version=:3 order by piece#

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 2 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 0 2 0 0

... and so on ...
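
Since the pi_tkprof_params argument is appended as-is to the tkprof command line, the usual tkprof switches can be passed through; for example (a sketch using standard tkprof options):

select * from table(tkprof.output('sys=no sort=prsela,exeela,fchela'));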

I'm sure this package can be improved but, as a proof of concept, I think it's more than enough. Likewise, I have to mention that this was tested on a 10gR2 database, therefore I don't know if you can use it without any changes on a different Oracle database version.

Saturday, March 22, 2008

What's Wrong with Oracle TIMESTAMP WITH TIME ZONE Data-type?

First of all, let's start by recalling what a TIMESTAMP WITH TIME ZONE data-type is. According to the Oracle official documentation "a TIMESTAMP WITH TIME ZONE is a variant of TIMESTAMP that includes a time zone region name or time zone offset in its value. The time zone offset is the difference (in hours and minutes) between local time and UTC (Coordinated Universal Time, formerly Greenwich Mean Time)".

Well, very nice... this is a data-type which might be very useful in a worldwide distributed environment where, unlike with its counterpart the TIMESTAMP WITH LOCAL TIME ZONE data-type, it is also important to record the time zone from which a transaction, or any other piece of information, was entered into the system. A suitable use case for this time zone information might be producing a report with all peak hours of activity in the context of a specific time zone area.

So far so good. What I really don't like about this data-type is that I cannot create a primary/unique key constraint on such a column. The Oracle documentation clearly states that "none of the columns in the unique/primary key can be of LOB, LONG, LONG RAW, VARRAY, NESTED TABLE, OBJECT, REF, TIMESTAMP WITH TIME ZONE, or user-defined type. However, the unique/primary key can contain a column of TIMESTAMP WITH LOCAL TIME ZONE".

Furthermore, any index created on such a column is actually converted into a function-based index. Behind the scenes, Oracle automatically normalizes all the values of a TIMESTAMP WITH TIME ZONE column using the SYS_EXTRACT_UTC() function. In a way, this makes sense. Suppose we have to order the result set of a SQL SELECT using a TIMESTAMP WITH TIME ZONE column. Having, for example:

TIMESTAMP '2003-01-01 2:00:00 -08:00'
TIMESTAMP '2003-01-01 2:00:00 -02:00'

which record should come first? According to what point of reference? The solution is to normalize all values and only after that to compare them. Oracle does this transparently.
Now, what if I want the values in a TIMESTAMP WITH TIME ZONE column to be unique? Well, you can't create a unique constraint but you can create a unique index.

SQL> create table muci_test (
2 col1 timestamp with time zone,
3 constraint uk_muci_test_col1 unique(col1)
4 );

create table muci_test (
col1 timestamp with time zone,
constraint uk_muci_test_col1 unique(col1)
)

ORA-02329: column of datatype TIME/TIMESTAMP WITH TIME ZONE
cannot be unique or a primary key
SQL> create table muci_test (
2 col1 timestamp with time zone
3 );

Table created

Executed in 0,204 seconds

SQL> create unique index ix_muci_text_col1 on muci_test(col1);

Index created

Executed in 0,25 seconds

SQL> insert into muci_test values (to_timestamp('22-MAR-08 11.25.39.457137'));

1 row inserted

Executed in 0,031 seconds

SQL> insert into muci_test values (to_timestamp('22-MAR-08 11.25.39.457137'));

insert into muci_test values (to_timestamp('22-MAR-08 11.25.39.457137'))

ORA-00001: unique constraint (ADMIN.IX_MUCI_TEXT_COL1) violated

As I already said, Oracle has actually created a function-based index, and this can be easily checked:

SQL> select index_type from user_indexes where index_name = 'IX_MUCI_TEXT_COL1';

INDEX_TYPE
---------------------------
FUNCTION-BASED NORMAL

If you don't feel very comfortable with function-based indexes, or you really want a primary/unique key constraint, then the application design should take this limitation into consideration. Personally, I would choose a TIMESTAMP WITH LOCAL TIME ZONE data-type plus an additional VARCHAR2 column to store the originating time zone offset. Using this approach it is possible to enforce a multi-column primary/unique key.
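
Just to illustrate the idea, a minimal sketch of such a design (the table and column names are made up):

create table muci_test2 (
  col1        timestamp with local time zone,
  col1_tz_off varchar2(6), -- originating time zone offset, e.g. '+02:00'
  constraint uk_muci_test2 unique (col1, col1_tz_off)
);

insert into muci_test2
  values (systimestamp, to_char(systimestamp, 'TZH:TZM'));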

Sunday, February 10, 2008

ORA_ROWSCN pitfall for optimistic locking

I can see lately that there's a big enthusiasm around the new 10g pseudo-column called ORA_ROWSCN, one of its great benefits being the implementation of an optimistic locking mechanism. From a simplistic point of view, this mechanism consists of:

1. on the application server or on the client side we have an entity class which is mapped onto a particular table. Instances of that class (objects) reflect rows of that particular table.
2. every entity object also has a version field which maps to the ORA_ROWSCN pseudo-column.
3. at the very beginning, we load our entities using SELECT T.*, ORA_ROWSCN FROM TABLE T, so that every entity has the corresponding values and version.
4. now, suppose that the user changes some attributes of the entity through the application front-end. These changes have to be propagated into the database, most likely using an UPDATE statement, but taking care of the version field. That is, we need to be sure that, in the meantime, the version of the entity we are about to persist hasn't changed within the database. So, it all narrows down to performing something like: UPDATE TABLE set COLUMN = [value from the entity] where PK_COLUMN = [id from the entity] and ORA_ROWSCN = [version from the entity], as sketched below. If this UPDATE returns "0 rows updated" we'll know for sure that somebody else has changed the record in the database, and the application will throw an "Optimistic Lock Exception".
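
A minimal sketch of steps 3 and 4 (the INVOICES table, its columns and the bind variables are made up; note also that ORA_ROWSCN is tracked at block level unless the table was created with the ROWDEPENDENCIES option):

-- load the entity together with its version
select t.*, ora_rowscn as version
  from invoices t
 where t.inv_id = :id;

-- later, persist the changes only if the version is still the same
update invoices
   set inv_value = :new_value
 where inv_id = :id
   and ora_rowscn = :version;
-- "0 rows updated" means somebody else changed the row => optimistic lock exception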

So, everything seems to work nicely: no more timestamp version columns, no more OLD and NEW column value comparisons, no more row hashes. In 10g, it seems that Oracle has offered us a bullet-proof solution. BUUUT, there's a pitfall here. Every decent application works with transactions, therefore every DML statement is nicely encapsulated into an Oracle transaction. What happens if, on the 4th step, our update succeeds? Well, in this case the version of the entity will have to be updated accordingly. But our update is not yet committed, therefore we don't have access to the new value of ORA_ROWSCN, which becomes available only after COMMIT. This means that, in order to refresh our entity, we'll have to issue a SQL SELECT after the COMMIT, something like: SELECT T.*, ORA_ROWSCN FROM TABLE T WHERE PK_COLUMN=[id from the entity]. The main problem here is that between our COMMIT and this SQL SELECT there is a tiny time frame within which a lot of nasty things may happen: somebody else may delete our record or may update it once again. Furthermore, the fact that I have to issue a SQL SELECT after the UPDATE doesn't sound very appealing to me, as it involves additional work for the database and might affect the whole scalability. In addition, I would really like to use the RETURNING clause of the INSERT/UPDATE statements and refresh my entity in one step.

In conclusion, if you are designing a web application which doesn't reuse entities across web pages, then using ORA_ROWSCN for implementing optimistic locking is fine as long as the working flow is the following:

1. the user selects the record he/she wants to change from a grid;
2. the application loads the entity and assigns the ORA_ROWSCN value as the version;
3. the user performs the required changes and presses the SAVE button;
4. the application performs the UPDATE in the database taking care of the ORA_ROWSCN value (WHERE ... and ORA_ROWSCN=[version from the entity]) and COMMITs the changes;
5. the application displays the initial grid once again, re-querying the database.

However, if you are designing an application which is required to take advantage of caching (most likely a stand-alone rich-client application), then ORA_ROWSCN is not the right solution. For example:

1. the application loads a grid and every grid record has the corresponding entity attached;
2. the user selects the record he/she wants to change from the grid;
3. the application relies on the data from the corresponding entity, therefore it doesn't query the database;
4. the user performs the changes and presses the SAVE button;
5. the application performs the UPDATE in the database taking care of the ORA_ROWSCN value (WHERE ... and ORA_ROWSCN=[version from the entity]) and COMMITs the changes;
6. if the UPDATE returns "1 record updated" then our entity is stale, because the new version from the database doesn't reflect the old one from the entity. A new SELECT, outside of the UPDATE transaction, is required in order to refresh the entity which, of course, is not convenient at all.

So, for this second kind of application I would use a regular version column or hashing techniques, which allow the use of the RETURNING clause of the INSERT/UPDATE statements, or refreshing the entity within the same modifying transaction.

Thursday, December 06, 2007

DEFAULT and NOT NULL become friends in 11g

If somebody asked me to add a new column to a production table, the first thing I would check is how many active transactions there are on that table or, in other words, how hot it is from the concurrency point of view. This is because, if there are any active transactions on that table, my attempt to change its structure will simply fail. Of course, an online redefinition could be the right solution, but it is far more complicated than a simple ALTER TABLE. However, with Oracle 11g things change. My decision to use an ALTER TABLE statement instead of a DBMS_REDEFINITION operation when I am about to add new columns should be driven by the answers to the following questions:
1. Is it appropriate for the new column to be NOT NULL and to have a DEFAULT clause?
OR
2. Is it appropriate for the new column to allow NULLs without having a DEFAULT clause?
OR
3. Is it about a virtual column?
If the answer is YES, you can safely go on with the ALTER TABLE ADD COLUMN statement.

Below is the proof:

1. create a dummy table:
create table muci (col1 number, col2 number);


2. from another sqlplus session run the following anonymous block, which simulates a mixed workload of INSERTs, UPDATEs and DELETEs:
-- simulate a workload with a mix of INSERTs, UPDATEs and DELETEs
-- close the sqlplus window in order to terminate
declare
  l_option pls_integer;
begin
  loop
    l_option := round(dbms_random.value(1, 3));
    if l_option = 1 then
      update muci set col2 = dbms_random.value
       where rowid in (select rowid from muci where rownum <= 1);
    elsif l_option = 2 then
      insert into muci (col1, col2) values (dbms_random.value, dbms_random.value);
    else
      delete from muci where rowid in (select rowid from muci where rownum <= 1);
    end if;
    commit;
  end loop;
end;
/


3. now, run the following statements:
SQL> set timing on
SQL> alter table muci add (col3 number default 0 not null);

Table altered.

Elapsed: 00:00:01.32
SQL> alter table muci add (col4 number);

Table altered.

Elapsed: 00:00:03.36
SQL> alter table muci add (col5 number as (col1+col2) virtual);

Table altered.

Elapsed: 00:00:00.21
SQL> alter table muci add(col6 number default 0);
alter table muci add(col6 number default 0)
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired


Elapsed: 00:00:00.10


Very nice, except for the last case, when trying to add a column which allows NULLs and also has a DEFAULT clause. Why didn't it work? The answer lies, for sure, in the way ALTER TABLE ADD COLUMN was implemented.

If the new column allows NULLs and has no DEFAULT clause then the value of the newly added column is NULL anyway and, being the last one, it is not stored, therefore no "data reorganization" is required within the table blocks. The rows from the MUCI table continue to have the same structure as before.



However, Oracle knows that there is a third column but its NULL value is not stored within the row.

When a new column is added having the NOT NULL constraint with a DEFAULT clause, the default value for this new column is stored just once, as table metadata. In a way, it is similar to the table compression feature where, for repeated values of a column, just one single value is stored and further referenced when the table is queried. In this case, the rows of the MUCI table will still have the same structure as before.



The nice part is that Oracle knows that there is a third column which, at the row structure level, looks like a "last column with NULLs". But how can this be true? The column was defined with a NOT NULL constraint. Well, if the row representation is like a "last column with NULLs" but the column was defined as NOT NULL, then Oracle will simply take the default value stored in the metadata of the table and return that value.

We can easily test that the row layout remains the same:

1. create a dummy table:

SQL> create table my_table as select * from all_objects;

Table created.


2. check the corresponding segment size:

SQL> select bytes/1024 from user_segments where segment_name='MY_TABLE';

BYTES/1024
----------
8192


3. add a new "last column with NULLs":

SQL> alter table my_table add (my_column1 char(2000));

Table altered.


4. check the segment size:

SQL> select bytes/1024 from user_segments where segment_name='MY_TABLE';

BYTES/1024
----------
8192


Please notice that the size is the same.

5. add a new NOT NULL column with a default value:

SQL> alter table my_table add (my_column2 char(2000) default 'xxx' not null);

Table altered.


6. check the segment size:

SQL> select bytes/1024 from user_segments where segment_name='MY_TABLE';

BYTES/1024
----------
8192


Please notice that the segment size is unchanged.

So far so good... but what happens when you try to add a column which allows NULLs and also has a DEFAULT clause? In this case Oracle cannot decide whether the value of the newly added column should be NULL or the default value. For the previous two cases it was a black & white decision, but allowing DEFAULTs together with NULLs turns it into a gray area which cannot be handled the same way. I mean, how could Oracle ever figure out whether a missing value stands for the DEFAULT or for a NULL? There's no way, therefore the old approach is used, which actually implies reorganization within the table blocks, that is, allocating extra storage within each block belonging to the target table in order to accommodate space for the new column with the specified default value.

Let's check:

SQL> alter table my_table add (my_column3 char(10) default 'xxx');

Table altered.

SQL> select bytes/1024 from user_segments where segment_name='MY_TABLE';

BYTES/1024
----------
9216


Yes, the segment size has increased. Lovely!

Wednesday, November 07, 2007

Playing with Flashback Archives

Today I've decided to play a little bit with a new Oracle 11g feature called "Flashback Archive". The "Flashback" term is not new at all; I heard about it for the very first time in Oracle 9i. At that time the "flashback" feature was in its very beginning and allowed just querying a table "AS OF" a previous moment in time, being constrained, especially, by the UNDO segments. As far as I remember there was no way to guarantee that the UNDO would really deliver the requested information, as there was no GUARANTEE option for the UNDO tablespaces. However, Oracle 10g continued to refine this feature and brought plenty of improvements: the guarantee option for UNDOs, flashback database and flashback table, flashback transactions and others. Now, Oracle 11g adds a new component to the "flashback suite", which is this "flashback archive".
The "flashback archive" is nothing more than a repository which, of course, requires storage provided through one or more tablespaces, and it has a well-defined retention policy. This archive may be used to keep track of all changes performed on several configured tables. It is important to notice that you can set up more than one flashback archive and, interestingly, they may share the same underlying tablespaces.

For example:
SQL> create tablespace test_arch_tbs datafile size 100M;

Tablespace created

SQL> create flashback archive test_archive tablespace test_arch_tbs retention 1 month;

Flashback archive created.

SQL> create flashback archive test_archive2 tablespace test_arch_tbs retention 1 month;

Flashback archive created.

Please notice that both "test_archive" and "test_archive2" share the same "test_arch_tbs" tablespace. In addition, you can configure a quota for every flashback archive you create; otherwise UNLIMITED is the default.
Another important thing to notice is that you can configure a default flashback archive. You can do this at archive creation time by specifying the "DEFAULT" option, or afterwards using "ALTER FLASHBACK ARCHIVE... SET DEFAULT". For both operations you need to be logged in as SYSDBA. Of course, you cannot have more than one default flashback archive.
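For example, as SYSDBA (a sketch; the archive name below is made up):
-- create a new archive and make it the default one
create flashback archive default test_archive3
  tablespace test_arch_tbs
  quota 10m
  retention 1 month;

-- or promote an already existing archive afterwards
alter flashback archive test_archive set default;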
Now, let's create a table with the flashback archive option enabled:
SQL> create table customers (
2 cust_id integer,
3 cust_name varchar2(100),
4 country_id varchar2(2)
5 ) flashback archive test_archive;

Table created

If the archive name is not explicitly specified then the default archive (if previously set up) will be used. It's time now to insert some records into our customers table. The question which arises is which timestamp is used when flashback querying the table: the one corresponding to the INSERT itself, or the one corresponding to the COMMIT statement? We'll figure it out shortly:
SQL> insert into customers values (1, 'synapstec', 'ro');

1 row inserted

SQL> select systimestamp from dual;

SYSTIMESTAMP
-------------------------------------------------
07-NOV-07 05.50.01.383321 PM +02:00

SQL> commit;

Commit complete

SQL> select systimestamp from dual;

SYSTIMESTAMP
-------------------------------------------------
07-NOV-07 05.50.15.130301 PM +02:00

SQL> select /* INSERT TIME */ count(1)
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 05.50.01.383321 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNT(1)
----------
0

SQL> select /* COMMIT TIME */ count(1)
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 05.50.15.130301 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNT(1)
----------
1

As you can see, the COMMIT time is taken into consideration when the table is flashback queried, which is the expected behavior. This means that, for example, if the same column is updated several times within a transaction, only the last committed value will be stored in the archive.
Likewise, you may encounter problems if you need to flashback query at very precise timestamps. Please have a look at the example below:
6:21:38 PM SQL> update customers set country_id = 'fr';

1 row updated

6:21:38 PM SQL> commit;

Commit complete

6:21:38 PM SQL> select systimestamp from dual;

SYSTIMESTAMP
-------------------------------------------------
07-NOV-07 06.16.16.506341 PM +02:00

6:21:38 PM SQL> update customers set country_id = 'en';

1 row updated

6:21:38 PM SQL> commit;

Commit complete

6:21:38 PM SQL> select systimestamp from dual;

SYSTIMESTAMP
-------------------------------------------------
07-NOV-07 06.16.16.669808 PM +02:00

6:23:09 PM SQL> select /* FIRST COMMIT */ country_id
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 06.16.16.506341 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNTRY_ID
----------
ro

6:23:10 PM SQL>
6:24:18 PM SQL> select /* SECOND COMMIT */ country_id
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 06.16.16.669808 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNTRY_ID
----------
ro

6:27:56 PM SQL> select /* AFTER 3 SECONDS */ country_id
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 06.16.19.669808 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNTRY_ID
----------
ro

6:29:27 PM SQL> select /* AFTER SOME SECONDS */ country_id
2 from customers
3 as of timestamp to_timestamp('07-NOV-07 06.16.21.669808 PM',
4 'dd-mon-yy hh.mi.ss.ff PM');

COUNTRY_ID
----------
en

As you can see, the rough time of each of the above commits doesn't return the correct flashback data. There is a delay between the COMMIT and the relative time at which the flashback archive may be queried. This drawback can be alleviated by using SCNs. So, we can replace the timestamps with their corresponding, more precise, SCNs:
8:58:50 PM SQL> update customers set country_id = 'fr';

1 row updated

8:58:50 PM SQL> commit;

Commit complete

8:58:50 PM SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
611944

8:58:51 PM SQL> update customers set country_id = 'en';

1 row updated

8:58:51 PM SQL> commit;

Commit complete

8:58:51 PM SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
611947

8:58:51 PM SQL>
9:00:13 PM SQL> select /* FIRST COMMIT */ country_id
2 from customers
3 as of scn 611944;

COUNTRY_ID
----------
fr

9:00:13 PM SQL> select /* SECOND COMMIT */ country_id
2 from customers
3 as of scn 611947;

COUNTRY_ID
----------
en

Nice!
There are, however, some limitations as far as this feature is concerned. The ugliest ones, at least from my point of view, are:
1. the fact that I cannot modify columns of a table which has the flashback archive option enabled, along with
2. the fact that we cannot perform partition operations on those tables.
These, put together with a long retention period, become a little bit scary, because it is quite likely that, in 5 years for example, the business logic will change, leading, for instance, to a change of a column's type. Such changes are difficult to predict, therefore extreme care should be given to the design of those tables which require flashback archiving. Again, if you have a big transaction table which needs flashback archiving but is also suitable for partitioning, which feature should we choose? Hard choice, right?
If you ask me, I would at least prefer to have the possibility to temporarily store the flashback archive records of a particular table into a staging table, with something like "create table temp_stage as select * from {table} ... as of ...", then to disable the flashback archive, do the maintenance stuff and, afterwards, reload the archive from the staging table. Well, it's just a thought... of course I have no idea about the internals and what this might imply.

Okay, that's it for the moment. The feature is nice and fits a lot of projects.

Tuesday, November 06, 2007

Preparing Parameters for 12g?

In every Oracle version there were some features about which I heard a lot of statements like: "well, this feature is not quite ready now but it will be within the next Oracle version". This was the case for the "Oracle Resource Manager" in 9i, or for the "Oracle Managed Files" feature. Both of them were made available in 9i but many DBAs, including me, felt that they were not really ready for production. Only starting with Oracle 10g could I figure out that the strength of OMF lay especially in connection with ASM (Automatic Storage Management) and, only beginning with that version, the "Resource Manager" component had been improved enough to become appealing for a production environment.


Now, I have the same feeling regarding the named and mixed notation for user-defined PL/SQL functions when they are invoked in a SQL context. For me, it was a common error in previous Oracle releases to write something like:


select dbms_random.string(opt => 'L', len => 10) from dual;

Of course, an error was raised because the SQL context didn't know how to handle named parameters. But it is also true that when you are using an IDE like PL/SQL Developer, or other powerful tools which provide code completion, it is almost impossible not to be tempted by such a feature, and sometimes hard to avoid the above mistake. However, the above code is perfectly valid in Oracle 11g. What really bothers me is that I cannot use this approach for all functions. For example:


SQL> select regexp_count(srcstr => 'tralala', pattern => 'a') from dual;

select regexp_count(srcstr => 'tralala', pattern => 'a') from dual

ORA-00907: missing right parenthesis

Bleah, ugly! At least it works for the DBMS packages, which is a great plus. But, anyway, for me it's a feature which is not "quite ready", though I hope it will be in Oracle 12g... maybe.

Monday, November 05, 2007

Describe REFCURSORs

In Oracle 11g, I really like the interoperability between DBMS_SQL cursors and REF CURSORs. Thank God, they are finally friends and can "talk" to each other.
I remember that a few months ago I needed a way to describe a weak ref cursor, that is, to know how many columns are in the result set along with the corresponding type of every column. Of course, I knew that the DBMS_SQL package provides such a feature, but it didn't apply to ref cursors; therefore, bad luck!
Now, with the new Oracle 11g, it is possible to convert a ref cursor into a DBMS_SQL cursor and vice-versa so, virtually, this opens new gates, right?... including for my problem with describing ref cursors.
Let's take a very simple example. Suppose we have a basic procedure which receives a ref cursor as a parameter and has to know the structure of that cursor. Well, not a big deal in 11g. Just see below how it works:
SQL> create or replace procedure desc_refcursor(pi_cursor sys_refcursor) is
2 l_ref sys_refcursor;
3 l_cursor integer;
4 l_cols_count integer;
5 l_desc_tab dbms_sql.desc_tab;
6
7 begin
8 l_ref := pi_cursor;
9 l_cursor := dbms_sql.to_cursor_number(l_ref);
10 dbms_sql.describe_columns(l_cursor, l_cols_count, l_desc_tab);
11 dbms_output.put_line('number of columns = ' || l_cols_count);
12 dbms_output.new_line;
13 dbms_output.put_line('DESCRIBE CURSOR');
14 dbms_output.put_line('===========================');
15 for i in 1..l_cols_count
16 loop
17 dbms_output.put_line(l_desc_tab(i).col_name);
18 end loop;
19 dbms_sql.close_cursor(l_cursor);
20 exception
21 when others then
22 if dbms_sql.is_open(l_cursor) then
23 dbms_sql.close_cursor(l_cursor);
24 end if;
25 raise;
26 end desc_refcursor;
27 /

Procedure created

The whole trick consists in using the new DBMS_SQL.TO_CURSOR_NUMBER function (see line 9). Maybe you wonder why this function was not used directly against the pi_cursor parameter and why there is that additional assignment at line 8. That's because the TO_CURSOR_NUMBER function expects an IN OUT parameter and, because pi_cursor is read-only, the compiler would complain.

Now, let's test the above procedure:
SQL> set serveroutput on
SQL> declare
2 l_cursor sys_refcursor;
3 begin
4 open l_cursor for select * from user_tables;
5 desc_refcursor(l_cursor);
6 end;
7 /

number of columns = 50

DESCRIBE CURSOR
===========================
TABLE_NAME
TABLESPACE_NAME
CLUSTER_NAME
IOT_NAME
STATUS
PCT_FREE
PCT_USED
INI_TRANS
MAX_TRANS
INITIAL_EXTENT
NEXT_EXTENT
MIN_EXTENTS
MAX_EXTENTS
PCT_INCREASE
FREELISTS
FREELIST_GROUPS
LOGGING
BACKED_UP
NUM_ROWS
BLOCKS
EMPTY_BLOCKS
AVG_SPACE
CHAIN_CNT
AVG_ROW_LEN
AVG_SPACE_FREELIST_BLOCKS
NUM_FREELIST_BLOCKS
DEGREE
INSTANCES
CACHE
TABLE_LOCK
SAMPLE_SIZE
LAST_ANALYZED
PARTITIONED
IOT_TYPE
TEMPORARY
SECONDARY
NESTED
BUFFER_POOL
ROW_MOVEMENT
GLOBAL_STATS
USER_STATS
DURATION
SKIP_CORRUPT
MONITORING
CLUSTER_OWNER
DEPENDENCIES
COMPRESSION
COMPRESS_FOR
DROPPED
READ_ONLY

PL/SQL procedure successfully completed

As you can see, we were able to obtain the structure of the REF CURSOR with minimum effort. However, there are some pseudo-limitations when using this feature, and the most important one is that, as soon as you convert the REF CURSOR into its DBMS_SQL cursor representation, you cannot switch back to the REF CURSOR without an explicit conversion. This means that in a scenario where you'd want to describe the REF CURSOR and then continue with REF CURSOR specific operations such as FETCH, CLOSE etc., those operations will simply not work: you'll have to do all further cursor handling using the DBMS_SQL package. However, you can re-convert to a REF CURSOR using DBMS_SQL.TO_REFCURSOR. So, the DESC_REFCURSOR procedure can be modified like this:
SQL> create or replace procedure desc_refcursor(pio_cursor in out sys_refcursor) is
2 l_cursor integer;
3 l_cols_count integer;
4 l_desc_tab dbms_sql.desc_tab;
5 begin
6 l_cursor := dbms_sql.to_cursor_number(pio_cursor);
7 dbms_sql.describe_columns(l_cursor, l_cols_count, l_desc_tab);
8 dbms_output.put_line('number of columns = ' || l_cols_count);
9 dbms_output.new_line;
10 dbms_output.put_line('DESCRIBE CURSOR');
11 dbms_output.put_line('===========================');
12 for i in 1..l_cols_count
13 loop
14 dbms_output.put_line(l_desc_tab(i).col_name);
15 end loop;
16 pio_cursor := dbms_sql.to_refcursor(l_cursor);
17 end desc_refcursor;
18 /

Now the invoking anonymous block can continue using ref cursor operations:
SQL> set serveroutput on
SQL> declare
2 l_cursor sys_refcursor;
3 begin
4 open l_cursor for select * from cat;
5 desc_refcursor(l_cursor);
6 close l_cursor; -- REFCURSOR operation
7 end;
8 /

number of columns = 2

DESCRIBE CURSOR
===========================
TABLE_NAME
TABLE_TYPE

PL/SQL procedure successfully completed

Yes, that's it! Nice feature!

Sunday, November 04, 2007

DBMS_XA and Four Eyes Principle

I am simply amazed by the new DBMS_XA package available in Oracle 11g. It offers the possibility to define global transactions and to manage them from within different sessions using a 2PC (Two-Phase Commit) approach. One of the great benefits I see now is for those systems which require the implementation of a "four eyes principle", that is, somebody enters a transaction into the system but somebody else checks it and finally approves it. Without this approval the transaction is not visible which, from the business point of view, means it is not committed.

Let's walk through a very simple and basic example.
First of all we'll need an Oracle user:
grant connect, resource, force any transaction to fep identified by xxx;

Pay attention to the special right "force any transaction". It is needed in order to be able to manage transactions created by other users.
Now, let's create a simple table which is supposed to store invoices.

SQL> connect fep/xxx@testdb
Connected to Oracle Database 11g Enterprise Edition Release 11.1.0.6.0
Connected as fep

SQL>
SQL> create table invoices (
2 inv_number integer,
3 inv_text varchar2(100),
4 inv_date date,
5 inv_value number(20, 6)
6 );

Table created

Now, assume that somebody books a new invoice. This means that a new record has to be added to the INVOICES table but, being subject to the "four eyes principle", it must not be committed until somebody else checks it. In the light of the DBMS_XA package this should be done like this:

SQL> DECLARE
2 l_rc PLS_INTEGER;
3 l_oer PLS_INTEGER;
4 e_xae EXCEPTION;
5 BEGIN
6 l_rc := DBMS_XA.XA_START(DBMS_XA_XID(123), DBMS_XA.TMNOFLAGS);
7
8 IF l_rc != DBMS_XA.XA_OK THEN
9 RAISE e_xae;
10 END IF;
11
12 insert into invoices values (1, 'buy a TV', sysdate, 1000);
13
14 l_rc := DBMS_XA.XA_END(DBMS_XA_XID(123), DBMS_XA.TMSUSPEND);
15
16 IF l_rc != DBMS_XA.XA_OK THEN
17 RAISE e_xae;
18 END IF;
19
20 EXCEPTION
21 WHEN OTHERS THEN
22 l_rc := DBMS_XA.XA_END(DBMS_XA_XID(123), DBMS_XA.TMSUCCESS);
23 l_rc := DBMS_XA.XA_ROLLBACK(DBMS_XA_XID(123));
24
25 IF l_rc != DBMS_XA.XA_OK THEN
26 l_oer := DBMS_XA.XA_GETLASTOER();
27 raise_application_error(-20001,
28 'ORA-' || l_oer ||
29 ' error in rolling back a failed transaction');
30 END IF;
31
32 raise_application_error(-20002,
33 'error in transaction processing, transaction rolled back');
34 END;
35 /

PL/SQL procedure successfully completed.

SQL> disconnect
Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning and Real Application Testing options

Let's have a look at the above piece of code. At the 6th line a new global transaction is started. The 123 value is the global transaction identifier; it should be provided by the application and is required to be unique. If the transaction has been successfully started then the returned value must be DBMS_XA.XA_OK, otherwise an error has occurred. Then, the insert into our INVOICES table is made and the global transaction is simply suspended. Pay attention that no COMMIT is issued. Of course, it is important to note how the status of each DBMS_XA operation is checked and how the exception handler is written. Finally, the session is ended by effectively disconnecting.

Now, suppose that the second person must take over the previous transaction, check it and approve it. This can be done as shown below:

SQL> connect fep/xxx@testdb
Connected.
SQL> select * from invoices;

no rows selected

SQL> commit;

Commit complete.

SQL> set serveroutput on
SQL>
SQL> DECLARE
2 l_rc PLS_INTEGER;
3 l_oer PLS_INTEGER;
4 e_xae EXCEPTION;
5 l_inv_no invoices.inv_number%type;
6 l_inv_dat invoices.inv_date%type;
7 l_inv_txt invoices.inv_text%type;
8 l_inv_val invoices.inv_value%type;
9 BEGIN
10 -- resume global transaction
11 l_rc := DBMS_XA.XA_START(DBMS_XA_XID(123), DBMS_XA.TMRESUME);
12 IF l_rc != DBMS_XA.XA_OK THEN
13 RAISE e_xae;
14 END IF;
15
16 select *
17 into l_inv_no, l_inv_txt, l_inv_dat, l_inv_val
18 from invoices i
19 where i.inv_number = 1;
20
21 dbms_output.put_line('invoice number = ' || l_inv_no);
22 dbms_output.put_line('invoice text = ' || l_inv_txt);
23 dbms_output.put_line('invoice date = ' || l_inv_dat);
24 dbms_output.put_line('invoice value = ' || l_inv_val);
25
26 -- end global transaction
27 l_rc := DBMS_XA.XA_END(DBMS_XA_XID(123), DBMS_XA.TMSUCCESS);
28 IF l_rc != DBMS_XA.XA_OK THEN
29 RAISE e_xae;
30 END IF;
31
32 -- commit global transaction
33 l_rc := DBMS_XA.XA_COMMIT(DBMS_XA_XID(123), TRUE);
34 IF l_rc != DBMS_XA.XA_OK THEN
35 RAISE e_xae;
36 END IF;
37
38 EXCEPTION
39 WHEN OTHERS THEN
40 l_rc := DBMS_XA.XA_END(DBMS_XA_XID(123), DBMS_XA.TMSUCCESS);
41 l_rc := DBMS_XA.XA_ROLLBACK(DBMS_XA_XID(123));
42
43 IF l_rc != DBMS_XA.XA_OK THEN
44 l_oer := DBMS_XA.XA_GETLASTOER();
45 raise_application_error(-20001,
46 'ORA-' || l_oer ||
47 ' error in rolling back a failed transaction');
48 END IF;
49
50 raise_application_error(-20002,
51 'error in transaction processing, transaction rolled back');
52 END;
53 /
invoice number = 1
invoice text = buy a TV
invoice date = 04-NOV-07
invoice value = 1000

PL/SQL procedure successfully completed.

SQL> disconnect
Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning and Real Application Testing options

Pay attention to the COMMIT statement just above the anonymous block. It is needed so that the anonymous block starts within a fresh transaction; otherwise an error may occur.

So, using the DBMS_XA package we were able to resume an uncommitted transaction started from another session, to see its data and to commit the changes. Of course, we can connect once again to check that the data is really committed.

SQL> connect fep/xxx@testdb
Connected.
SQL> set linesize 120
SQL> column inv_text format a10
SQL> select * from invoices;

INV_NUMBER INV_TEXT INV_DATE INV_VALUE
---------- ---------- --------- ----------
1 buy a TV 04-NOV-07 1000

Lovely! The data is there!

However, there are some additional areas which I still have to investigate before validating this model for a possible "four eyes principle" system:
1) where can I see all suspended XA transactions? It seems that DBA_PENDING_TRANSACTIONS, V$GLOBAL_TRANSACTION, DBA_2PC_PENDING and DBA_2PC_NEIGHBORS do not show any suspended transaction.
2) are suspended global transactions persistent across database restarts? Apparently, as far as I could test, they are not.
3) what about the global transaction timeout for a session? If a global transaction should never expire then we have a problem, as I don't see any special value for the timeout parameter of the DBMS_XA.XA_SETTIMEOUT function (e.g. zero meaning forever). So the maximum timeout we can set is the upper bound of the PLS_INTEGER type, which is 2147483647 seconds, that is roughly 24855 days (see the sketch after this list). However, without persistence across database restarts it is useless anyway.
4) no idea how this component behaves in a RAC configuration, for example starting a transaction on one instance and resuming it on another one.
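
Regarding point 3, here is a minimal sketch of setting the largest possible timeout; it assumes that DBMS_XA.XA_SETTIMEOUT takes the new timeout in seconds and returns a BOOLEAN success flag:

DECLARE
  l_ok BOOLEAN;
BEGIN
  -- 2147483647 seconds is the upper bound of PLS_INTEGER (roughly 68 years)
  l_ok := DBMS_XA.XA_SETTIMEOUT(2147483647);
  IF NOT l_ok THEN
    raise_application_error(-20003, 'could not set the XA session timeout');
  END IF;
END;
/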

Monday, August 06, 2007

PGA Diagnostics

Do not forget a very important Oracle memory buffer: the PGA (Program Global Area)! Most of us are tempted to look first at the SGA, especially at the library cache, whenever users complain that their SQLs run poorly. Of course, the library cache is a proper place to look for problems, as are disk activity, CPU load and many others. However, do NOT forget the PGA, because an inappropriate configuration can lead to a lot of performance issues.
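
For a quick first look, a handful of rows from V$PGASTAT already tell a lot, even before running the full report below; a minimal sketch (the list of statistic names is just the selection I usually glance at first):

select name, value, unit
  from v$pgastat
 where name in ('aggregate PGA target parameter',
                'total PGA allocated',
                'over allocation count',
                'cache hit percentage');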


So, reading through the "Database Performance Tuning Guide" I have found the section which talk about the tuning of the PGA. I am reading it quite often but it's still difficult to remember all the related fixed views and queries which are in connection with this topic. Therefore I have decided to put all the nice queries into a script which should give me a rough image of the PGA configuration.


Below is the script, which I would like to share with anyone who might be interested:



-- PGA Report
-- Displays various statistics regarding the PGA usage

set linesize 140
set pagesize 9999
set heading off
set feedback off

-- general statistics
select 'SECTION 1: GENERAL STATISTICS FOR THE PGA' from dual;
select '=================================================================='
|| '=========================' from dual;

set heading on
select name,
decode(unit, 'bytes', round(value / 1024 / 1024, 2), value) value,
decode(unit, 'bytes', 'MB', '') unit
from v$pgastat;

set heading off
select 'Check the following:' from dual;
select ' - "aggregate PGA auto target" should not be too small in comparison '
|| 'with the "aggregate PGA target parameter"' from dual;
select ' - "global memory bound" should not be lower than 1M' from dual;
select ' - "over allocation count" should be near 0 or should not increasing in time' from dual;
select ' - the bigger "cache hit percentage", the better' from dual;
select '' from dual;

set heading on
SELECT name profile, cnt, decode(total, 0, 0, round(cnt*100/total)) percentage
FROM (SELECT name, value cnt, (sum(value) over ()) total
FROM V$SYSSTAT
WHERE name like 'workarea exec%');
set heading off

-- PGA used by all DB processes
select 'SECTION 2: PGA USED BY CURRENT DB PROCESSES (IN MB), ORDER BY PGA_ALLOC_MEM' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on

break on report;
compute sum label '--> TOTAL' of pga_used_mem on report;
compute sum label '--> TOTAL' of pga_alloc_mem on report;
compute sum label '--> TOTAL' of pga_freeable_mem on report;
compute sum label '--> TOTAL' of pga_max_mem on report;

SELECT PROGRAM,
round(PGA_USED_MEM / 1024 / 1024, 2) pga_used_mem,
round(PGA_ALLOC_MEM / 1024 / 1024, 2) pga_alloc_mem,
round(PGA_FREEABLE_MEM / 1024 / 1024, 2) pga_freeable_mem,
round(PGA_MAX_MEM / 1024 / 1024, 2) pga_max_mem
FROM V$PROCESS
order by pga_alloc_mem desc;

set heading off
select 'The columns have the following meaning:' from dual;
select ' - PGA_USED_MEM = PGA memory currently used by the process' from dual;
select ' - PGA_ALLOC_MEM = PGA memory currently allocated by the process (including free '
|| 'PGA memory not yet released to the operating system by the server process)' from dual;
select ' - PGA_FREEABLE_MEM = Allocated PGA memory which can be freed' from dual;
select ' - PGA_MAX_MEM = Maximum PGA memory ever allocated by the process' from dual;
select '' from dual;

set feedback off;
select 'SECTION 3: USED PGA MEMORY BY CATEGORY (VALUES IN MB, ORDERED DESC BY ALLOCATED_MB).' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on
select category,
round(sum(allocated) / 1024 / 1024, 2) allocated_mb,
round(sum(used) / 1024 / 1024, 2) used_mb,
round(sum(max_allocated) / 1024 / 1024, 2) max_allocated_mb
from v$process_memory
group by category
order by 2 desc;

set heading off
select '' from dual;

-- workareas histogram
select 'SECTION 4: SQL WORKAREAS HISTOGRAM' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on
SELECT LOW_OPTIMAL_SIZE/1024 low_kb,
(HIGH_OPTIMAL_SIZE+1)/1024 high_kb,
OPTIMAL_EXECUTIONS, ONEPASS_EXECUTIONS, MULTIPASSES_EXECUTIONS
FROM V$SQL_WORKAREA_HISTOGRAM
WHERE TOTAL_EXECUTIONS != 0;

set heading off
select '' from dual;

-- active workareas
select 'SECTION 5: CURRENTLY ACTIVE WORKAREAS' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on
set feedback on
SELECT to_number(decode(SID, 65535, NULL, SID)) sid,
operation_type OPERATION,
trunc(EXPECTED_SIZE/1024) ESIZE,
trunc(ACTUAL_MEM_USED/1024) MEM,
trunc(MAX_MEM_USED/1024) "MAX MEM",
NUMBER_PASSES PASS,
trunc(TEMPSEG_SIZE/1024/1024) TSIZE_MB
FROM V$SQL_WORKAREA_ACTIVE
ORDER BY 1,2;
set heading off
set feedback off
select 'The meaning of the above columns is:' from dual;
select ' - SID = the active session identifier' from dual;
select ' - OPERATION = the type of the operation' from dual;
select ' - ESIZE = the expected size for the sql workarea' from dual;
select ' - MEM = Amount of PGA memory (in KB) currently allocated on behalf of this work area.' from dual;
select ' - MAX MEM = Maximum memory amount (in KB) used by this work area' from dual;
select ' - PASS = Number of passes corresponding to this work area (0 if running in OPTIMAL mode)' from dual;
select ' - TSIZE_MB = Size (in megabytes) of the temporary segment used on behalf of this work area. '
|| 'This column is NULL if this work area has not (yet) spilled to disk.' from dual;
select '' from dual;

-- top 10 work areas requiring most cache memory
select 'SECTION 6: TOP 10 WORK AREAS REQUIRING MOST CACHE MEMORY' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on
SELECT *
FROM (SELECT distinct s.sql_id,
substr(s.SQL_TEXT, 1, 50) sql_text,
operation_type,
estimated_optimal_size,
max_tempseg_size
FROM V$SQL_WORKAREA a, V$SQLSTATS s
WHERE a.SQL_ID = s.SQL_ID
ORDER BY estimated_optimal_size DESC)
WHERE ROWNUM <= 10;

set heading off
select 'SECTION 7: SQLs WITH WORK AREAS THAT HAVE BEEN EXECUTED IN ONE OR EVEN MULTIPLE PASSES' from dual;
select '=================================================================='
|| '=========================' from dual;
set heading on
set feedback on
SELECT sql_text, sum(ONEPASS_EXECUTIONS) onepass_cnt,
sum(MULTIPASSES_EXECUTIONS) mpass_cnt
FROM V$SQL s, V$SQL_WORKAREA wa
WHERE s.address = wa.address
GROUP BY sql_text
HAVING sum(ONEPASS_EXECUTIONS+MULTIPASSES_EXECUTIONS)>0;
set feedback off
set heading off

select 'SECTION 8: PGA TARGET ADVICE' from dual;
select '=================================================================='
|| '=========================' from dual;
show parameter pga_aggregate_target;
set heading on
SELECT round(PGA_TARGET_FOR_ESTIMATE/1024/1024) target_mb,
ESTD_PGA_CACHE_HIT_PERCENTAGE cache_hit_perc,
ESTD_OVERALLOC_COUNT
FROM V$PGA_TARGET_ADVICE;


set heading off
select '' from dual;
pause press any key to exit...
exit
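
To use it, save the whole listing into a file and run it from SQL*Plus as a user who is allowed to query the V$ views; the file name below is just an example:

SQL> connect / as sysdba
SQL> @pga_report.sql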

Have fun!

Thursday, August 02, 2007

Automatic Shared Memory Management

Well, it's been a while since the last post... However, I'm back and I am going to dig a little bit into the “Automatic Shared Memory Management” 10g feature.


As you already (probably) know, every Oracle instance has a big shared buffer called the SGA (System Global Area). This big memory buffer is further divided into several slices, each of them being used for a special kind of memory allocation. Up to Oracle 9i, the only way to adjust their sizes was to manually set a couple of initialization parameters like DB_CACHE_SIZE, SHARED_POOL_SIZE, JAVA_POOL_SIZE etc. However, in 10g some buffers from the SGA were given a nice capability to resize themselves according to the current workload and other parameters.

The figure below depicts a regular section through the SGA.



As you can see, the maximum size to which the whole SGA may extend is given by the SGA_MAX_SIZE parameter. However, the SGA_TARGET parameter is the one which actually sets the amount of memory to be used by all SGA buffers.
The correlation between the values of the SGA_MAX_SIZE and the SGA_TARGET parameters matters mostly when you query V$SGA_DYNAMIC_FREE_MEMORY.
Let's have a look at the following output:



SQL> show parameter sga_target;

NAME TYPE VALUE
------------------------------------ -------------------------------- ------------------------------
sga_target big integer 1504M

SQL> select current_size/1024/1024 size_M from v$sga_dynamic_free_memory;

SIZE_M
----------
544

SQL> show parameter sga_max_size

NAME TYPE VALUE
------------------------------------ -------------------------------- ------------------------------
sga_max_size big integer 2G

At first sight you may think that out of the SGA_TARGET of 1504M about 544M are free, which, of course, is not true, because the Oracle server will always try to expand the SGA buffers so that the whole SGA_TARGET amount gets allocated. The free memory reported by the V$SGA_DYNAMIC_FREE_MEMORY view is the difference between SGA_MAX_SIZE and SGA_TARGET (2048M - 1504M = 544M), which should be read as: “I still have 544M free from the whole SGA_MAX_SIZE which I can use for further expansions of SGA_TARGET”.
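
A quick way to cross-check this is to compute the difference directly; a minimal sketch, assuming both parameters are exposed in bytes through V$PARAMETER:

select (select to_number(value) from v$parameter where name = 'sga_max_size')
     - (select to_number(value) from v$parameter where name = 'sga_target') free_bytes
  from dual;

The result should match the 544M (570425344 bytes) reported above by V$SGA_DYNAMIC_FREE_MEMORY.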


A nice view which I also like to query when I want a brief summary of the SGA composition is V$SGAINFO.



SQL> select * from v$sgainfo;

NAME BYTES RES
----------------------------------- ---------- ---
Fixed SGA Size 2074152 No
Redo Buffers 14700544 No
Buffer Cache Size 671088640 Yes
Shared Pool Size 838860800 Yes
Large Pool Size 16777216 Yes
Java Pool Size 16777216 Yes
Streams Pool Size 16777216 Yes
Granule Size 16777216 No
Maximum SGA Size 2147483648 No
Startup overhead in Shared Pool 100663296 No
Free SGA Memory Available 570425344

11 rows selected.

Please notice that I can also look at the size of the so-called “granules”, which are the allocation units for the SGA memory. The size of one granule varies according to the total size of the SGA, the operating system and the platform. You can also find the granule size by querying V$SGA_DYNAMIC_COMPONENTS.



SQL> select granule_size from v$sga_dynamic_components where rownum <= 1;

GRANULE_SIZE
------------
16777216

Another important thing to remember is that if you set values for the parameters which control the sizes of the dynamic buffers within the SGA, those values will be taken as minimum thresholds, which means that the Oracle server will adjust the size of the buffers when necessary but will never shrink them below the manually set value.
Of course, there is also the possibility to disable the “Automatic Shared Memory Management” feature by simply setting SGA_TARGET to zero. Likewise, this feature is disabled if the STATISTICS_LEVEL parameter is set to BASIC, but that is not recommended at all as it disables other Oracle features too (ADDM, for example).
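
As an illustration of both points, a minimal sketch (the sizes are arbitrary and assume an instance with ASMM enabled, like the one above):

-- keep ASMM enabled but guarantee at least 256M for the buffer cache
alter system set db_cache_size = 256M scope=both;

-- disable ASMM altogether; the automatically tuned buffers become static
alter system set sga_target = 0 scope=both;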


Well, that's it... not very new information but just a short review.

Thursday, June 21, 2007

/etc/oratab or /etc/rc.d/rc3.d/S99gcstartup?


A few days ago I upgraded Oracle Grid Control and its repository database, located on a RedHat AS4 server. Everything worked fine until the server was rebooted and I had the “nice” surprise of seeing the repository database restarted, but using the old ORACLE_HOME. At this point it is important to review the steps I had followed as part of the upgrade:


1. stop agent

10ag
emctl stop agent

2. stop opmn stack

10gr
$ORACLE_HOME/opmn/bin/opmnctl stopall

3. stop listener

10db
lsnrctl stop

4. stop database

sqlplus / as sysdba
shutdown immediate

5. backup agent, database, oms homes

cp -r /opt/oracle/product/10.2.0/agent /opt/oracle/backup/
cp -r /opt/oracle/product/10.2.0/db10g /opt/oracle/backup/
cp -r /opt/oracle/product/10.2.0/oms10g /opt/oracle/backup/

6. apply patch for 4329444

oracle@aut-vie-racman:kits$ unzip p4329444_10104_LINUX.zip
Archive: p4329444_10104_LINUX.zip
creating: 4329444/
creating: 4329444/files/
creating: 4329444/files/lib/
creating: 4329444/files/lib/libserver10.a/
inflating: 4329444/files/lib/libserver10.a/qerix.o
creating: 4329444/etc/
creating: 4329444/etc/config/
inflating: 4329444/etc/config/inventory
inflating: 4329444/etc/config/actions
creating: 4329444/etc/xml/
inflating: 4329444/etc/xml/GenericActions.xml
inflating: 4329444/etc/xml/ShiphomeDirectoryStructure.xml
inflating: 4329444/README.txt
oracle@aut-vie-racman:kits$ cd 4329444/
oracle@aut-vie-racman:4329444$ /opt/oracle/product/10.2.0/db10g/OPatch/opatch apply
Invoking OPatch 10.2.0.1.0
Oracle interim Patch Installer version 10.2.0.1.0
Copyright (c) 2005, Oracle Corporation. All rights reserved..
Oracle Home : /opt/oracle/product/10.2.0/db10g
Central Inventory : /opt/oracle/oraInventory
from : /opt/oracle/product/10.2.0/db10g/oraInst.loc
OPatch version : 10.2.0.1.0
OUI version : 10.2.0.1.0
OUI location : /opt/oracle/product/10.2.0/db10g/oui
Log file location : /opt/oracle/product/10.2.0/db10g/cfgtoollogs/opatch/opatch-
2007_Jun_19_16-16-19-CEST_Tue.log
ApplySession applying interim patch '4329444' to OH
'/opt/oracle/product/10.2.0/db10g'
Invoking fuser to check for active processes.
Invoking fuser on "/opt/oracle/product/10.2.0/db10g/bin/oracle"
OPatch detected non-cluster Oracle Home from the inventory and will patch the local
system only.
Please shutdown Oracle instances running out of this ORACLE_HOME on the local
system.
(Oracle Home = '/opt/oracle/product/10.2.0/db10g')
Is the local system ready for patching?
Do you want to proceed? [y|n]
y
User Responded with: Y
Backing up files and inventory (not for auto-rollback) for the Oracle Home
Backing up files affected by the patch '4329444' for restore. This might take a
while...
Backing up files affected by the patch '4329444' for rollback. This might take a
while...
Patching component oracle.rdbms, 10.1.0.4.0...
Updating archive file "/opt/oracle/product/10.2.0/db10g/lib/libserver10.a" with
"lib/libserver10.a/qerix.o"
Running make for target ioracle
ApplySession adding interim patch '4329444' to inventory
The local system has been patched and can be restarted.
OPatch succeeded.

7. Start db listener

oracle@aut-vie-racman:4329444$ 10db
oracle@aut-vie-racman:4329444$ lsnrctl start
LSNRCTL for Linux: Version 10.1.0.4.0 - Production on 19-JUN-2007 16:18:42
Copyright (c) 1991, 2004, Oracle. All rights reserved.
Starting /opt/oracle/product/10.2.0/db10g/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 10.1.0.4.0 - Production
System parameter file is /opt/oracle/product/10.2.0/db10g/network/admin/listener.ora
Log messages written to /opt/oracle/product/10.2.0/db10g/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=aut-vieracman.
bankgutmann.co.at)(PORT=1521)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 10.1.0.4.0 - Production
Start Date 19-JUN-2007 16:18:43
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File
/opt/oracle/product/10.2.0/db10g/network/admin/listener.ora
Listener Log File /opt/oracle/product/10.2.0/db10g/network/log/listener.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=aut-vieracman.
bankgutmann.co.at)(PORT=1521)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

8. start emrep database

oracle@aut-vie-racman:4329444$ sqlplus / as sysdba
SQL*Plus: Release 10.1.0.4.0 - Production on Tue Jun 19 16:20:27 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 536870912 bytes
Fixed Size 780056 bytes
Variable Size 275781864 bytes
Database Buffers 260046848 bytes
Redo Buffers 262144 bytes
Database mounted.
Database opened.

9. Patch OMS.
10. Patch agent
11. Start agent

10ag
emctl start agent

12. Start oms

10gr
$ORACLE_HOME/opmn/bin/opmnctl startall


The Grid Control was patched and it worked just fine. The only problem was that the database remained at version 10.1.0.4.0. So I decided to upgrade the repository database too, following the steps below:


1. stop agent
2. stop oms
3. stop database
4. stop listener
5. install 10.2.0.1 Oracle software into a new home
6. upgrade to 10.2.0.3 the Oracle software
7. upgrade the database using “dbua”
8. start oms
9. start agent


Everything was normal until today, when we had to shut down the server for maintenance reasons. Unfortunately, on reboot the repository database was again started under the previous 10.1.0.4 Oracle home and, because of this, the OEM couldn't be used. I had a look into the /etc/oratab file and the entries were fine: the database was supposed to be started from the new, upgraded ORACLE_HOME! Furthermore, changing the /etc/oratab file seemed to have no effect. In the end I found the reason. Under the /etc/rc.d/rc3.d/ directory there is a script called S99gcstartup, which calls the startup scripts for the repository database, the OMS and the agent. It is enough to copy the database startup script under the new ORACLE_HOME directory and to change the ORACLE_HOME variable declared in that script. After these changes the repository database is started with the correct ORACLE_HOME on every reboot.

Friday, June 08, 2007

Losing One Voting Disk

Voting disks are used in a RAC configuration for maintaining node membership. They are critical pieces of a cluster configuration. Starting with Oracle 10gR2 it is possible to mirror the OCR and the voting disks. Using the default mirroring template, the minimum number of voting disks necessary for normal functioning is two.

Scenario Setup

This scenario simulates the crash of one voting disk using the following steps:

  1. identify the voting disks:

crsctl query css votedisk

0. 0 /dev/raw/raw1

1. 0 /dev/raw/raw2

2. 0 /dev/raw/raw3

  2. corrupt one of the voting disks (as root):

    dd if=/dev/zero of=/dev/raw/raw3 bs=1M

Recoverability Steps

  1. check the “$CRS_HOME/log/[hostname]/alert[hostname].log” file. A message like the following should be written there, which allows us to determine which voting disk became corrupted:

    [cssd(9120)]CRS-1604:CSSD voting file is offline: /opt/oracle/product/10.2.0/crs_1/Voting1. Details in /opt/oracle/product/10.2.0/crs_1/log/aut-arz-ractest1/cssd/ocssd.log.

  2. According to the above listing, Voting1 is the corrupted disk. Shut down the CRS stack:

    srvctl stop database -d fitstest -o immediate

    srvctl stop asm -n aut-vie-ractest1

    srvctl stop asm -n aut-arz-ractest1

    srvctl stop nodeapps -n aut-vie-ractest1

    srvctl stop nodeapps -n aut-arz-ractest1

    crs_stat -t

    On every node as root:

    crsctl stop crs

  3. Pick a good voting disk from the remaining ones and copy it over the corrupted one:

    dd if=/dev/raw/raw2 of=/dev/raw/raw3 bs=1M

  4. Start CRS (on every node as root):

      crsctl start crs

  5. Check the “$CRS_HOME/log/[hostname]/alert[hostname].log” file. It should look like the one shown below:

    [cssd(14463)]CRS-1601:CSSD Reconfiguration complete. Active nodes are aut-vie-ractest1 aut-arz-ractest1 .

    2007-05-31 15:19:53.954

    [crsd(14268)]CRS-1012:The OCR service started on node aut-vie-ractest1.

    2007-05-31 15:19:53.987

    [evmd(14228)]CRS-1401:EVMD started on node aut-vie-ractest1.

    2007-05-31 15:19:55.861 [crsd(14268)]CRS-1201:CRSD started on node aut-vie-ractest1.

  6. After a couple of minutes check the status of the whole CRS stack:

    [oracle@aut-vie-ractest1 ~]$ crs_stat -t

    Name Type Target State Host

    ------------------------------------------------------------

    ora....SM2.asm application ONLINE ONLINE aut-...est1

    ora....T1.lsnr application ONLINE ONLINE aut-...est1

    ora....st1.gsd application ONLINE ONLINE aut-...est1

    ora....st1.ons application ONLINE ONLINE aut-...est1

    ora....st1.vip application ONLINE ONLINE aut-...est1

    ora....SM1.asm application ONLINE ONLINE aut-...est1

    ora....T1.lsnr application ONLINE ONLINE aut-...est1

    ora....st1.gsd application ONLINE ONLINE aut-...est1

    ora....st1.ons application ONLINE ONLINE aut-...est1

    ora....st1.vip application ONLINE ONLINE aut-...est1

    ora....test.db application ONLINE ONLINE aut-...est1

    ora....t1.inst application ONLINE ONLINE aut-...est1

    ora....t2.inst application ONLINE ONLINE aut-...est1


Note: It is also possible to recover a lost voting disk from an old voting disk backup, and the “dd” command can be performed without shutting down the CRS stack.