Article : Saving The Day! Developing A Disaster Recovery Plan For Your Contact Center
Disaster recovery is a topic on many of our minds today. If you do not have a comprehensive, well-tested plan for your contact center in place, now's the time to get started!
We find that most organizations have a carefully devised and tested plan to address the data processing and IT risks and recovery process. But less than half of these same organizations have thought about the unique risks involved in the contact center – the front door to the company! We also find that many plans deal with only the recovery from a catastrophic disaster and do not address the greater likelihood of partial failures, or the techniques that can prevent them.
This article addresses the steps of creating a comprehensive contingency and disaster recovery plan for the contact center with some examples of the types of issues to consider along the way.
Step 1: Assembling the Team
As your specific plans develop, it will be crucial to bring in your call center technology vendors to help you understand the full capabilities and options of the equipment and systems in place. It's important that you fully understand how far the systems can be "stretched" in the event of a disaster, and what it would cost to add backup or duplicate systems to keep you operating during an outage. Knowing what your vendor can do to replace a system if needed will help you identify the timing of plans and the need for interim measures if a system is completely destroyed.
Step 2: Identifying the Risks
Another set of steps must be in place to minimize the impact when prevention fails. The plan must define the precise steps to take to recover as quickly as possible. Various studies show that companies that experience a disaster have a high likelihood of going out of business if a major outage lasts longer than a few days. A Gartner Group study suggests 2 out of 5 companies that experience a computer disaster will be out of business in 5 years. A similar University of Minnesota study suggests only 6% of such companies would survive. These studies focused primarily on IT and data system problems. Far less research has been done on what the business impact would be if the contact center as the main customer access point were to experience a significant outage.
So what are the risks you face in the contact center? While terrorist attacks and bomb threats may be the incidents that you think about first in light of the recent terrorist activities in New York and Washington, they're actually not the most likely events you need to worry about in your planning effort. Studies by Contingency Planning Research suggest that power outages/surges and weather are far more likely to cause damage than man-made disasters that are so much in our thoughts since September 11.
Identifying the potential risks involves taking a comprehensive inventory of all the components of your contact center. Then begin to "think negatively" about all the things that could fail and what the impact would be if they did.
Consider each of the following components and what could go wrong: your physical facility, data systems, call center systems and applications, communications networks, electrical service, customer access points, partners, procedures, and staff.
As an example, let's take a look at just one risk in each of these areas:
Facility: It doesn't take a major hurricane or tornado to cause a weather disaster. Several areas of the US experienced soaking rains in November and many businesses were affected. A simple heavy rain can cause flooding in a localized or widespread area. What can you do to prevent damage to your facility from a storm? Can you detect water seeping into the equipment rooms before it becomes a major problem? Do you have tarps readily available to cover equipment, and flashlights to aid a process that must often take place in the dark? Does the main driving route to your center cross a flood plain or an underpass where the street may become impassable?
Data Systems: September 11 wiped out major data centers and huge amounts of end-user equipment. Do you have a hot standby center with a current set of customer data? Can you shift your personnel to laptops or other systems, or do you have a large-quantity replacement plan in place with your vendors? Writing down the detailed procedures for backup restoration is equally important, since the backups may have to be installed by someone unfamiliar with normal operations. The lack of skilled personnel who knew those operations caused more problems after September 11 than the lost equipment.
Call Center Systems and Applications: If your IVR goes down, could you provide a backup quickly or deal with the increased influx of calls to your call center staff? Or in the event of an ACD failure, how would calls be routed to the staff? Keep in mind that what you'll be missing during the downtime is not just call routing (where a simple hunt distribution will work in the short term), but valuable management reports. In any disaster, you'll want to get the ACD up and working as soon as possible. Often ACD reports serve as evidence that's submitted for "loss of business" insurance.
Communications Networks: What if major storms leave your building unaffected, but take out local telephone lines? Or a cellular tower? Do you have an alternative service provider or a secondary cable route to back up your primary service? Some centers have chosen to have cable to a second local telephone central office (CO) in their city in case of a cable cut or CO failure. In most cases, this backup is enough to ensure continued communications.
Note: Sometimes the backup plan needs a backup. In the September 11 catastrophe, Verizon lost five of its COs around the World Trade Center. And companies that chose to back up their wire-line service with cellular saw 14 cell sites vanish from the area. One company dependent upon web communications had 9 internet providers and lost 8 of them! A backup that is in close proximity to your main service provider may experience downtime too if the cause is weather or highly localized trauma.
Electrical: Lightning is one of the most serious threats to electrical equipment, and it can enter through a variety of pathways. Have you identified all the paths and put the necessary protection in place? We've seen thousands of dollars of damage to PBX/ACDs from an indirect hit, and a direct hit can destroy a PBX/ACD or another call center system completely. Grounding of equipment is essential to its proper operation during normal periods, and during a lightning strike it may spell the difference between a failure and a crisis averted. Have a qualified electrician ensure that your systems are properly connected to the electrical feed and backup systems, and make sure the appropriate levels of lightning protection are in place and are grounded thoroughly. This is probably the lowest cost and most effective preventative measure you can take to guard your expensive equipment.
Customer Access Points: The percentage of customer interactions handled over the Internet is growing. If your web site is down, can you handle the calls that may come into your call center staff as an alternative? Any point where customers access your business is a potential risk and could "dump" increased workload on the other channels. But more importantly, can you effectively communicate with your customers who normally use the affected channel to offer them other ways to contact you rather than contacting your competitor?
Partners: Do you have a contract with another party, either another call center in the area or an outsourcer, to take overflow calls? More and more call center executives are identifying other call centers in their area that could serve as a temporary "home" should their own facility or systems become unavailable. It's critical to make sure your network provider has an updated call routing plan in place should this event occur, and that the agreements are updated regularly to reflect the growth in both your organizations. And if your partner center has a disaster of its own, what's your backup option?
Procedures: Several recent disasters have forced call centers to do things "the old-fashioned way" when their computer systems were down for an extended period of time. Some were able to conduct business by manually taking orders and looking up information in paper-based documentation. Others simply couldn't function without automated systems in place. Does your center have a procedure in place for carrying on business as usual if the computers are down? Does your staff know how to handle manual operations?
Staff: Have you considered what would happen to service if a large percentage of your staff suddenly became unavailable -- something more serious than typical Monday morning absenteeism? What if a major flu epidemic hit the community that affected half your workforce or their families? Or a major storm prevented your staff from reaching the center? How would you handle the calls?
The above list outlines just one risk in each of the categories. It's critical that you and your team brainstorm all the negative possibilities in each of the areas, determine what can be done to prevent the bad events from occurring, and formulate a plan of recovery if the worst happens.
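To keep the brainstorm organized, some teams record the results in a simple risk register. The sketch below is one possible shape for it, not a prescribed format: the component names come from the list above, while the `Risk` class, its field names, and the sample entries are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Component categories taken from the article's list of areas to review.
COMPONENTS = [
    "facility", "data systems", "call center systems and applications",
    "communications networks", "electrical", "customer access points",
    "partners", "procedures", "staff",
]

@dataclass
class Risk:
    component: str   # one of COMPONENTS
    event: str       # what could go wrong
    prevention: str  # how to keep it from happening ("" if undecided)
    recovery: str    # what to do if it happens anyway ("" if undecided)

def uncovered_components(risks):
    """Components for which no risk has been recorded yet."""
    seen = {r.component for r in risks}
    return [c for c in COMPONENTS if c not in seen]

def incomplete_risks(risks):
    """Risks that still lack a prevention step or a recovery step."""
    return [r for r in risks if not r.prevention or not r.recovery]

# Hypothetical sample entries.
register = [
    Risk("facility", "heavy rain floods the equipment room",
         "water sensors in equipment rooms", "cover equipment with tarps"),
    Risk("staff", "flu epidemic affects half the workforce", "", ""),
]

print(uncovered_components(register))
print([r.event for r in incomplete_risks(register)])
```

Running the two checks after each brainstorming session shows which areas have not been discussed at all and which risks still need a prevention or recovery answer.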
Step 3: Calculating the Costs
Then you'll want to calculate the cost of each measure of prevention to see if you can realistically afford each one. Sometimes the cost of prevention is much higher than what the cost of recovery or lost business would be. But more than likely, the cost of an ounce of prevention is less than the pound of cure you'll pay for in the end.
Look at the cost of recovery from each kind of disaster that you have identified. Be sure to include temporary personnel, rental of equipment or facilities, overtime, and losses not covered by insurance. And even for those losses that are covered by insurance, take into account the time value of money while you wait for reimbursement.
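The arithmetic can be sketched in a few lines. This is only an illustration under stated assumptions: every figure is invented, the carrying cost of insured losses uses simple interest, and weighing prevention against an "expected loss" (probability times cost) is one common simplification rather than a method taken from this article.

```python
def recovery_cost(temp_staff, rentals, overtime, uninsured_losses,
                  insured_losses, annual_rate, days_to_reimbursement):
    """Out-of-pocket cost of one incident, including the time value of
    money tied up in insured losses until the insurer pays (simple interest)."""
    carrying_cost = insured_losses * annual_rate * days_to_reimbursement / 365
    return temp_staff + rentals + overtime + uninsured_losses + carrying_cost

def prevention_pays_off(prevention_cost_per_year,
                        incident_probability_per_year, cost_per_incident):
    """True when yearly prevention costs less than the expected yearly loss."""
    expected_loss = incident_probability_per_year * cost_per_incident
    return prevention_cost_per_year < expected_loss

# Hypothetical numbers for a single flood scenario.
flood = recovery_cost(temp_staff=20_000, rentals=15_000, overtime=5_000,
                      uninsured_losses=10_000, insured_losses=100_000,
                      annual_rate=0.073, days_to_reimbursement=90)
print(round(flood))
print(prevention_pays_off(4_000, 0.10, flood))
```

In this made-up case, waiting 90 days for a 100,000 reimbursement adds about 1,800 to the direct costs of 50,000, and a 4,000-per-year preventive measure is cheaper than the expected annual loss of roughly 5,180.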
Step 4: Selling and Writing the Plan
In most cases, call center professionals find themselves with a limited budget and cannot implement everything in the plan at once. So be sure that you have ranked the risks, options, and preventive measures in their order of importance to carrying on effective operations. Senior management appreciates a presentation that lays out the choices in a concise format sorted to identify the most critical issues. This comes under the heading of "don't bring me a problem, bring me a potential solution".
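A ranked list like this is easy to turn into a funding proposal. The sketch below is a hypothetical illustration: the measures, importance scores, and costs are invented, and funding strictly in order of importance is just one simple rule a team might choose.

```python
def fund_in_priority_order(measures, budget):
    """measures: list of (name, importance, cost); higher importance means
    more critical. Walks the list from most to least important and funds
    whatever still fits in the remaining budget.
    Returns (funded_names, deferred_names)."""
    funded, deferred = [], []
    remaining = budget
    for name, importance, cost in sorted(measures, key=lambda m: m[1], reverse=True):
        if cost <= remaining:
            funded.append(name)
            remaining -= cost
        else:
            deferred.append(name)
    return funded, deferred

# Hypothetical measures and costs.
measures = [
    ("lightning protection and grounding", 9, 3_000),
    ("cable to a second central office", 8, 40_000),
    ("manual order-taking forms and training", 6, 500),
    ("hot standby data center", 10, 250_000),
]
funded, deferred = fund_in_priority_order(measures, budget=50_000)
print("Fund now:", funded)
print("Defer:", deferred)
```

Presenting both lists gives senior management the choices in one view: what the current budget covers, and which critical items need a separate funding decision.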
Once funding priorities have been established, it is time to begin the plan documentation. When writing the plan, be certain to include all of the details. List all equipment, make and model, circuit IDs, IP addresses, and any specific item that would be needed to build the systems from scratch. Note each vendor and the point of contact for each (with a backup) and the points of contact within your own organization. Identify team leaders for each group of employees who are responsible for that group's safety, and establish an off-site gathering place where folks assemble if they must leave the building. And set up a communications plan for employees and their families so that they know where to go to get information on the situation as it develops. Include a communications plan for customers as well so that they are kept informed of the situation and your plans for recovery. Most customers are very forgiving of temporary service failures if they are kept informed of the situation and the efforts the company is making to restore normal operations.
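If the equipment inventory is kept in structured form, gaps can be spotted automatically. The following is a minimal sketch, assuming a record layout of our own invention: the `EquipmentRecord` and `Contact` classes, their fields, and the sample data are all hypothetical, and the IP addresses use a range reserved for documentation.

```python
from dataclasses import dataclass, field
from typing import Optional
import ipaddress

@dataclass
class Contact:
    name: str
    phone: str

@dataclass
class EquipmentRecord:
    description: str
    make: str
    model: str
    vendor: str
    vendor_contact: Contact
    vendor_backup_contact: Optional[Contact] = None
    circuit_ids: list = field(default_factory=list)
    ip_addresses: list = field(default_factory=list)

def problems(record):
    """Return a list of human-readable gaps in one inventory record."""
    issues = []
    if not record.make or not record.model:
        issues.append("missing make or model")
    if record.vendor_backup_contact is None:
        issues.append("no backup vendor contact")
    for addr in record.ip_addresses:
        try:
            ipaddress.ip_address(addr)  # raises ValueError if malformed
        except ValueError:
            issues.append(f"invalid IP address: {addr}")
    return issues

# Hypothetical record with two deliberate gaps.
acd = EquipmentRecord(
    description="ACD", make="ExampleMake", model="X-100",
    vendor="Example Vendor", vendor_contact=Contact("A. Primary", "555-0100"),
    circuit_ids=["CKT-0001"], ip_addresses=["192.0.2.10", "192.0.2.999"],
)
for issue in problems(acd):
    print(issue)
```

A check like this will not tell you whether the details are current, only whether they are present and well formed, so it complements rather than replaces a periodic review.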
A detailed step-by-step plan is ideal, assuming that the people who will put it in place have no familiarity with your systems or procedures. While this is a brutal task initially, it will pay huge dividends not only in a disaster, but in training for new hires along the way.
Make the plan a "living document". That means changes that happen in the environment must make it into the plan. Keeping the plan linked to normally used on-line IT documentation will help keep up with changing networks, IP addresses, circuit numbers, etc. But there is a need for printed documents too so that if the systems fail, the plan documents can be accessed. Keep a set of plans in the homes of a few key people as well as at the office.
Step 5: Testing
Test the plan regularly, but at irregular intervals. Test something different each time. And include some auditors in the process who will take a critical look at ways to improve the reaction and the plan document. Success is having such great prevention that you never have to react to a disaster, but should one occur, success is measured in quickness of recovery.
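One way to get tests that recur but are hard to anticipate is to randomize both the gap between tests and the order of the areas tested. The sketch below assumes gap limits of 30 to 120 days and a handful of area names chosen purely for illustration; neither comes from this article.

```python
import random
from datetime import date, timedelta

def plan_tests(areas, start, min_gap_days=30, max_gap_days=120, seed=None):
    """Schedule one test per area, in shuffled order, with a random gap
    between consecutive tests. Returns a list of (date, area) tuples."""
    rng = random.Random(seed)
    order = list(areas)
    rng.shuffle(order)
    schedule = []
    when = start
    for area in order:
        when += timedelta(days=rng.randint(min_gap_days, max_gap_days))
        schedule.append((when, area))
    return schedule

# Hypothetical areas to exercise over the coming cycle.
areas = ["ACD failover", "IVR outage", "manual order taking",
         "overflow to partner center", "backup restoration"]
for when, area in plan_tests(areas, start=date(2024, 1, 1), seed=7):
    print(when.isoformat(), area)
```

Because each area appears exactly once per cycle, every test covers something different, while the shuffled order and uneven spacing keep the staff from rehearsing for a known date.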
Remember, you are trying to prevent injury to employees and customers, and to recover the business, not just the systems.
About the Authors
Pam Trickey and Maggie Klenke are co-founders and Senior Partners with The Call Center School, a Nashville, Tennessee based consulting and education company. The company provides a wide range of educational offerings for call center professionals, including traditional classroom courses, web-based seminars, and self-paced e-learning programs.
Published: Tuesday, August 05, 2003