Ribald tales of swashbuckling geeks!

The Geeky DBA

May 11, 2007

Oracle Clone Wars

Filed under: Administration, Backups — admin @ 3:31 pm

So I’ve been trying to clone our production database to this new NAS device. It’s not as fast as a SAN of course but it’s decent. My steps were to copy the last hot backup, recover to current, then open it up. I had two iterations of this and the second one is what worked. Here’s what I ended up doing…

I found that Oracle (no matter what I tried security-wise) couldn’t work with a permanent drive mapped to my NAS share. So I used UNC paths. That’s all good.

I tried copying my hot backup across the network first. This took several days (from one older NAS to the new one). After I recovered it for a week I can only guess it got corrupted somehow and I was unable to open it up after recovery was well past the end of the hot backup.

In my final attempt, I took a fresh hot backup directly to the new NAS. This worked like a charm and finished in roughly 30 hours. I immediately started recovery which started out well but then I began having the recovery issues. The database would recover four or five archive logs then fail with an OS error saying the network name was no longer available and that the asynchronous I/O could not be queued. This was very frustrating as it was taking hours to recover two to three archive logs. Finally I created a batch file to keep the database looping into recovery. This worked out. By Wednesday of this week I was recovered and was able to open up the database.

The NAS is performing adequately and all seems to be ok. I’ve got load on it today and will post back with updates.

Sphere: Related Content

• • •
 

May 1, 2007

UNC Paths and DBA Time Travel

Filed under: Administration — admin @ 5:00 pm

Here’s my set up. Per my last post, I’ve created a hot backup of my production database. After copying it to our newly acquired NAS device (which blew a drive already), I started the recovery process. This was started several days ago. The recovery puked about 20 times over the weekend because of network hiccups or possibly the blown RAID (from the bad drive). The drive was replaced yesterday and the process has only puked once. That’s the good news.

The bad news is that it’s taking approximately 25-30 minutes per archive log file. Each file is 250 MB in size. Therefore the recovery is taking longer than real-time. I feel like I’m traveling backwards in time!

Anyway, let’s hope the performance of this test system is adequate for our needs otherwise I’m not sure what our next options are.

Sphere: Related Content

• • •
 

April 26, 2007

Oracle and UNC Paths

Filed under: Administration — admin @ 9:54 pm

My latest project is to move our test instance off of our primary SAN and move it to a cheaper NAS device. This instance is used for testing and quality assurance. We purchased a RaidWeb NAS device last week. It has 2TB when formatted RAID5.

I mounted it up in two shares to drives P: and O: on my test box. I copied all the data file to the two drives. When I tried to create the control database I found that oracle could not authenticate, therefore read the datafiles on the mapped drives. So I next tried UNC paths (i.e. \\hostname\share). This too did not work. I changed my service to run under a privileged user. Still no love.

As a last ditch effort, I changed the hostname in the path to the IP address of the NAS device. Guess what, it worked! Woot! Control file created, on to recovery.

Our system generates quite a bit of redo, about 100GB a day. I don’t want to copy all this locally to recover this test instance. I set the archive log destination to a public share on our hot standby and started recovery. Sure enough, Oracle picked up the files and started blasting through the recovery. I’ll post tomorrow on the results.

I’m curious to see how well the database performs, even though it’s a test instance, over an ethernet connected NAS versus a fiber connected SAN.

Sphere: Related Content

• • •
 

April 25, 2007

Oracle Hot Backups: The Geeky DBA Way!

Filed under: Maintenance, Backups — admin @ 11:02 am

As part of our maintenance schedule, I perform a weekly hot backup of our site. This is also copied elsewhere for DR purposes. I also maintain a hot physical standby that is never more than 6 hours behind for quick recovery. I’ve found recently that our previous DBA’s hot backup script was extremely (and I mean really really) convoluted. Obviously coded by someone who had no development experience.

So I took it upon myself last weekend to write an application to generate a “proper” hot backup script. My current development environment of choice is Visual Studio 2005 and C#. This application is a Windows console application and the source code can be downloaded here.

My goal was simple. I wanted a set of scripts that would distribute my hot backup across n database nodes and multiple channels per node. This will minimize the backup time. Since I’m backing up over a terabyte of data, I want it done quickly and efficiently.

I first created two classes to hold a copy of all the tablespaces and datafiles in my instance:

/// 
/// An Oracle tablespace
/// 
class OracleTablespace
{
	public long m_lID = 0;
	public string m_sName = “”;
	public long m_lSize = 0;
	public long m_lAssignedThread = -1;
}

/// 
/// An Oracle data file
/// 
class OracleDataFile
{
	public long m_lID = 0;
	public long m_lTSID = 0;
	public string m_sName = “”;
	public long m_lSize = 0;
}

I then open a connection to my instance and retrieve a list of all the tablespaces and fill up an array of OracleTablespace objects:

	//	Create a list of tablespaces
	OracleCommand _cmd = new OracleCommand( "select ts#, name from v$tablespace order by ts#",
		m_dbConn );

	OracleDataReader _dr = _cmd.ExecuteReader();

	while ( _dr.Read() )
	{
		OracleTablespace _ts = new OracleTablespace();
		_ts.m_lID = _dr.GetInt32( 0 );
		_ts.m_sName = _dr.GetString( 1 );
		m_lstTablespace.Add( _ts );
	}

	_dr.Close();
	_cmd.Dispose();

I then get a list of all datafiles in the instance:

	//	Create a list of data files
	_cmd = new OracleCommand( "select file#, ts#, name, bytes from v$datafile order by file#",
		m_dbConn );

	_dr = _cmd.ExecuteReader();

	while ( _dr.Read() )
	{
		OracleDataFile _df = new OracleDataFile();
		_df.m_lID = _dr.GetInt32( 0 );
		_df.m_lTSID = _dr.GetInt32( 1 );
		_df.m_sName = _dr.GetString( 2 );
		_df.m_lSize = _dr.GetInt64( 3 );
		m_lstDatafile.Add( _df );
	}

	_dr.Close();
	_cmd.Dispose();

Next, we calculate the size of each tablespace based on data file size:

	//	Calculate size of tablespaces...
	foreach ( OracleDataFile _df in m_lstDatafile )
	{
		m_lSearchID = _df.m_lTSID;
		OracleTablespace _ts = m_lstTablespace.Find( ContainsID );
		_ts.m_lSize += _df.m_lSize;
		m_lTotalSize += _df.m_lSize;
	}

	//	Sort by size...
	m_lstTablespace.Sort(
		delegate( OracleTablespace x, OracleTablespace y )
		{
			return( Comparer.Default.Compare( x.m_lSize, y.m_lSize ) );
		}
	);

I then figure out the average amount of data to distribute to each node in the cluster for the backup. There is a fudge factor built in which you can play with. m_iNodeCount and m_iChannelsPerNode are configurable of course.

	//	Calculate target size of each bucket...
	long _lTargetSize = ( m_lTotalSize / ( m_iNodeCount * m_iChannelsPerNode ) ) / 2;

Next I assign a channel/thread number to each tablespace and attempt (very crudely I might add) to balance the load based on my target size. Any unassigned tablespaces are spread across the cluster incrementally.

	//	Ok, evenly distributed now (somewhat).
	i = 0;
	foreach ( OracleTablespace _ts in m_lstTablespace )
	{
		if ( _ts.m_lAssignedThread == -1 )
			i++;
	}

	//	Throw unassigned stuff into the last thread...
	if ( i > 0 )
	{
		int _j = 0;

		foreach ( OracleTablespace _ts in m_lstTablespace )
		{
			if ( _ts.m_lAssignedThread == -1 )
			{
				_ts.m_lAssignedThread = _j++;
				_j %= ( m_iChannelsPerNode * m_iNodeCount );
			}
		}
	}

Finally, based on all the information gathered and calculated, a set of SQL scripts are generated, along with batch files to run the scripts. The resultant files are written to the directory from which the program was run. After they are generated, they are “pushed” off to the nodes in the cluster.

I have a scheduled task that runs this generator an hour before my hot backup is to begin. I have another scheduled task that calls the generated batch file on each node to kick off the hot backup. It’s been a big help and works quite well. Hope you enjoy it!

Download the HBupGen Source Code

Sphere: Related Content

• • •
 

April 23, 2007

Welcome to The Geeky DBA!

Filed under: General — admin @ 5:30 pm

I was sitting around last night and decided that the web needed a good place for geeky DBAs to hang out and post their war stories. I’ve just finished up a neat little program to generate hot backup scripts for my Oracle system and wanted to share. That’s going to be my first post.

I thought other DBAs would care to share their scripts and code as well. This is where I’d love to see it happen.

Sphere: Related Content

• • •