Beaulieu Group LLC, based in Dalton, Georgia, paid huge monthly fees for a hot site subscription. At best, this service could provide only a two- to three-day recovery time, and the recovery point extended only to the last tape save, which could be up to 24 hours old.
The last system failure that occurred just prior to the implementation of Assure iTERA HA left Beaulieu Group down for two days. Among other things, the fix required a full reload of applications and data from tape.
To be able to recover from a problem severe enough to take core business applications offline, Beaulieu Group LLC paid inflated subscription fees to a hot site vendor.
This service could provide only a two- to three-day recovery time. Its recovery point also depended on the date and time of the last tape save, which could be up to 24 hours old.
Tommy Barge, V.P. of Information Systems, knew that his company needed a self-sustaining, internal disaster recovery plan that included a high availability (HA) solution in order to provide fast and complete recovery.
In an unfortunate twist of fate, the company experienced a system failure on its production IBM i server before HA could be implemented. Among other things, the fix required a full reload of applications and data from tape. “Given the uncertainty of the fix, we struggled with whether it was worth the time and expense to travel to the hot site and attempt to restore data or just wait it out,” according to Barge. Ultimately, he decided to wait and hope for a quick resolution.
Within two days, system access had been restored and all data reloaded. Still, two days of downtime was painful and costly, and it pushed the company to expedite its HA initiative.
Beaulieu Group decided to evaluate Assure iTERA HA. Says Barge, “I spoke with several references for Syncsort and all of them started out with varying needs and objectives. To me, this was sufficient evidence to prove that Assure iTERA HA was flexible and configurable, and that it would conform nicely to our environment.”
When Assure iTERA HA was first installed, Beaulieu Group had an iSeries model 840 as its production server. This machine was situated right next to a model 740 that acted as its backup. With the installation of Assure iTERA HA, the company immediately eliminated nearly 14 hours of downtime from month-end tape save procedures. “That probably paid for the solution,” said Barge. “Prior to the implementation of HA, we used manual processes to track input transactions during downtime. Eliminating this extra work, as well as the errors, nearly paid for Assure iTERA HA.”
When the lease expired on its production iSeries model 840, Barge saw an opportunity to move to a new System i with more processing power and a smaller processor-group rating. This meant thousands of dollars would be saved each year on software maintenance costs; in fact, the savings would nearly pay for the new machine. To ensure that the backup environment had the same resources as production, Beaulieu Group purchased two model 550s.
Syncsort offered Barge a special licensing arrangement so he could use Assure iTERA HA to migrate data from his older machines to the new 550s and save several hours of downtime. Barge says the last time he migrated applications to a new iSeries, nearly 36 hours of downtime had accrued.
Using Assure iTERA HA, Barge and his team migrated the production 840 and two smaller model 720s to the new model 550 production machine. The old model 740 backup machine was also migrated to the second new 550 backup server.
The entire migration and consolidation of servers was accomplished with only 3.5 hours of downtime. “Had Assure iTERA HA not been used, it would have likely required 36 or more hours of downtime to complete these migrations,” Barge said. “Most of the downtime was due to tasks that should have been done hours before the migration completed and had nothing to do with Assure iTERA HA. If our ducks were in a row, I believe our total downtime would have only been 30 to 45 minutes.”
Barge added, “We test the switchover once per quarter, running the company on the backup system for 20 hours or longer before rolling back. This allows testing during shift changes, testing our time and attendance systems, as well as discovering any recent changes to the network.”
“The expectation is that if the production machine is disabled for any reason, users and processes need to be back online within 30 to 45 minutes,” Barge said. “We are in the process of creating a formal DR plan that spells out exactly when we trigger a failover (a switchover when the production system is down) in the event of an unplanned outage. Of course, we don’t want to execute a failover if it looks like the outage will last under 30 minutes.” Barge continued, “The decision of how long to wait before moving to a backup machine is certainly far easier with HA than with a hot site.”