DIRECTIONS: Read chapters 4, 5 & 6. The week 2 individual assignment focuses on writing the plan. Using the week 1 assignment as a base document and the ‘I still don’t know what to write’ outline on pages 108 & 109 in your text as a guide – outline (using specific information – not general information) what your departmental plan may look like. In general, an outline will be approximately 3 pages in length.
CHAPTER 5
BUILD AN INTERIM PLAN
Don’t Just Sit There, Do Something
Build it and they will come.
—Field of Dreams
INTRODUCTION
Building an effective business continuity plan can take a great deal of time and
resources. By this point, you have identified the processes critical to your business
in the Business Impact Analysis (Chapter 2), identified the risks to these processes
in your risk assessment (Chapter 3), and determined your strategy for building a
comprehensive plan (Chapter 4). Until the primary disaster plan begins coming
together (Chapters 14 to 20), there are 11 steps you can take right now to provide
some initial protection. The steps you follow in this chapter will be expanded in
great detail in later chapters. Even if your disaster planning stops after this chapter,
you will be noticeably better prepared.
Create an Interim Plan Notebook to organize your information. It
should contain:
1. Access to People. Include organization charts that show who is assigned
which areas of responsibility and who their assistants are. Include contact
information for each key person: work phone, home phone, cell phone
number, pager number, and home address.
2. Access to the Facility. A set of keys must be available for every door, cabinet,
and closet that holds equipment you support, all maintained in a secure key
locker. This includes copies of any special system passwords.
3. Service Contracts. Be sure you have the name, address, telephone number
(day and night), contact name (day and night), serial numbers of equipment
on the contract, contract number, and expiration date. This section may also
include a copy of the service agreement renewal calendar.
Copyright © 2011. AMACOM. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.
EBSCO : eBook Collection (EBSCOhost) – printed on 1/11/2018 8:16 PM via AMERICAN PUBLIC UNIV SYSTEM
AN: 349248 ; Wallace, Michael, Webber, Larry.; The Disaster Recovery Handbook : A Step-by-Step Plan to
Ensure Business Continuity and Protect Vital Operations, Facilities, and Assets
Account: s7348467.main.ehost
4. Vendor List. A list of companies where you have accounts set up for quickly
buying emergency supplies. This includes contact information.
5. Walk-Around Asset Inventory. This is necessary to properly build a plan. A
thorough asset inventory will come later. What assets might you need to
recover or to restore to service right away?
6. Software Asset List. What software are you protecting, insuring against loss,
and supporting?
7. Critical Business Functions. What are the business functions you are trying to
protect, to keep running with minimal disruption?
8. Operations Restoration Priorities. What do you fix first, and in what order do
you restore functions to service?
9. Toxic Material Storage. Locations of toxic material anywhere on the
company grounds.
10. Emergency Equipment List. Where are the equipment and materials you
need to help clean up a mess?
11. Trained First Responders. Do you have any volunteer firefighters or
Emergency Medical Technicians (EMTs) on your staff? Does anyone have
critical skills you can use in a crisis until emergency crews arrive?
In most emergencies, there are several keys to a successful recovery: key
people, keys to the doors, and key support account information. For your interim
plan, you will pull together basic contact information on the people you would
call on in a disaster, the service contracts you would invoke, and keys/passwords
necessary to gain entry into where you need to be.
A quick way to gauge your current state of disaster readiness is to make
unannounced visits asking for critical support information. Watch the people as
you ask for this information, and you will see how organized some are. See who
can quickly provide a copy of their list and who has everything scattered about in
a “sticky-note file.” As you watch them fumbling through folders of documents,
aren’t you glad this isn’t happening during a real crisis? How high would the quality
of their hurriedly gathered information be? How quickly could they provide the
correct answers? Rapid availability of this information is very useful even if the
computer room is not on fire. Imagine the same people doing this during an
emergency, and with the office illuminated only by emergency lighting.
As the information flows in, take time to carefully organize it. Label each item
as to who sent it to you and the date you received it. If you later need an explanation
about their information, you’ll know who to call. The date indicates when you
received it, not how old this information actually is. It never hurts to validate
critical information like telephone numbers and contract agreements. Set up a
tabbed three-ring binder to hold all the information. Later, you will consolidate
this information into your own lists and they will take a lot less space.
Remember that what you collect for an interim plan must be useful to anyone
involved in disaster recovery. Readability, accuracy, and clarity are important. The
various documents must be accumulated in a single binder and presented to your
various managers. Place a date on each document to show when it was created.
This will also act as a built-in reminder to call for updated documents if you feel
they are too old. Be sure to keep a copy of these documents at home; emergencies
don’t always happen during normal business hours.
Keep track of who has a copy of the binder. Then, as updates are created, you
know whom to pass them on to. Ensure all binders are tabbed for quick reference
and clearly marked as Company Confidential in accordance with your company’s
document guidelines (remember, you have home telephone numbers in here).
At a minimum, each of the following people should have an up-to-date copy
of this interim plan.
➤ Business Continuity Manager (you—the person writing the plan).
➤ Disaster Recovery Manager (to be kept at home).
➤ Information Technology department’s Help Desk (useful for providing support).
➤ Facility Security Manager (in a place the after-hours guard on duty can reach it).
ACCESS TO PEOPLE
Reaching key people is a two-step task. The first step is to know whom to notify.
The second step is to know how to reach them. You should know not only how to
reach your boss, but also the head of the purchasing department, the public
relations manager, the custodians’ office—many more than just the people in
your department.
Start with an organization chart that shows who works in what department,
from the top person down to the night-shift custodians. Current charts are often
hard to come by. Organization charts reflect the formal lines of authority within
an organization, not the actual day-to-day flow of authority.
The organization chart will help you identify who is responsible for what areas
and who you might need to call if a disaster occurs. Think of the ways this will be
useful. If the accounts payable system crashes over a weekend, you might need to
call the accounting clerk who uses that system to test your fix before work starts
on Monday. If there is a fire in the Quality Assurance office overnight, you need to
know which manager to notify.
The second piece is a complete telephone list for all employees that includes
home telephone numbers, cell phone numbers, pager numbers, etc. In most
cases, you will only notify the department managers, but by having a complete
list, you should always be able to call in the “resident expert.”
A funny thing about a telephone recall list is that some people lie to the
company about their home telephone number. Imagine that! Others “forget” to
pick up their pager every night before they go home. We can’t change all the bad
habits of the world, but for those key people who are critical to your disaster
recovery efforts, ensure their numbers are correct even if you have to call
them yourself.
An easy way to check this list: whenever it is used to call someone off-hours,
make a small notation of the date next to that person’s name. If the call went
through, that is validation enough. Once every several months, take time to call
any of the unchecked phone numbers just to verify them.
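The date-notation habit described above can also be kept in a small script. What follows is only a minimal sketch, not something from the handbook: the Contact record, the function names, the sample names and numbers, and the 180-day threshold (one reading of “every several months”) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Contact:
    name: str
    phone: str
    last_verified: Optional[date] = None  # date of the last call that went through


def record_successful_call(contact: Contact, when: date) -> None:
    """Note the date next to the name; a completed call counts as validation."""
    contact.last_verified = when


def needs_verification(contacts, today: date, max_age_days: int = 180):
    """Return contacts never verified, or not verified within max_age_days (assumed)."""
    cutoff = today - timedelta(days=max_age_days)
    return [c for c in contacts
            if c.last_verified is None or c.last_verified < cutoff]


# Sample data is invented for the example.
contacts = [
    Contact("A. Example", "555-0100"),
    Contact("B. Example", "555-0101", last_verified=date(2024, 1, 15)),
    Contact("C. Example", "555-0102", last_verified=date(2024, 5, 20)),
]

record_successful_call(contacts[0], date(2024, 6, 1))
stale = needs_verification(contacts, today=date(2024, 9, 1))
print([c.name for c in stale])
```

Anyone the function returns is a candidate for a short verification call during normal working hours.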
Try never to call people after hours unless it is necessary (such as checking the
list). Check with your human resources manager to see how calling people after
hours affects compensation for hourly and salaried workers.
If your company has multiple sites, you will also need the telephone numbers
for their key technical, support, and management people. In an emergency, it is
sometimes quicker to borrow material from a sister company than to buy it. Also,
instead of hiring unknown consultants to assist in your recovery, it is far better to
borrow skilled people from sister companies. They are already familiar with your
company’s procedures, and they should have already had a security screening
(something your emergency consultants may not have). All around, it is preferable
to call on your fellow employees to supplement your recovery staff than it is to hire
someone on the spur of the moment.
ACCESS TO THE FACILITY
Limiting access to the company’s assets is not an optional activity. It is something
your auditors will be checking. All sensitive areas must be secured, such as
computer rooms, telephone switch room, vital records storage, and personnel
files. There may be other areas unique to your company that must also be
safeguarded. If in doubt, ask the auditors. They are a valuable source of information
for disaster planning.
Don’t be shy about asking detailed questions of your company’s security force
concerning the arrangements protecting your area of responsibility. Do not
take for granted that the force provides the proper protection for your
equipment. Review their after-hours entry policy to ensure it meets your
emergency needs.
Physical Keys
Murphy’s law says that problems will happen in the worst possible places.
Wherever the problem occurs, you will need to get into the location. Imagine a
network problem. You may need keys to access a number of equipment closets
checking data hubs until you find the defective equipment, another key to gain
entry to the hub’s cabinet, and still a third key to enter the secure area where spare
hub cards are maintained.
In most facilities, the security force maintains copies of the physical keys to all
doors and locks. If this is the case in your company, then you should review their
key management policies and key locker procedure. Here is what to look for, and
what to put in place if it is missing:
➤ There should be a formal request form for requesting a key. Each request
should be properly authorized before a key is issued. People who feel
accountable for a key will treat it more like a valued object. If someone keeps
losing their keys, then they should not be given any more of them. Note how
often their car keys turn up missing and you’ll see that it is only your key that
they don’t care about.
➤ There should be a “Key Log” of who has what keys. Verify that people who
work for you only have what is needed. Use this list to recover keys when
people leave the company. If a theft of company property is detected, this list
will be a valuable starting point for the investigation. If locks must be
changed, this will tell you how many keys are needed for the new setting.
Review this list at least quarterly to recover keys from people who no longer
need them.
➤ There should be a locked cabinet where copies of all keys are maintained.
Sometimes paranoid people attach their own locks to cabinets to keep others
away from their equipment. You may not even be aware there is an unauthorized
lock on a door or know whom to ask for the combination. Personal locks on
company doors and cabinets must be vigorously discouraged, as they will hinder
your recovery at a time when you can ill afford it.
Even if your facility security force has a “Key Locker,” you might want to have
one just for your department in a place that you can get to quickly. It should
hold a copy of every key to every door and cabinet in your facility. Then whoever
is on-site during a disaster can quickly enter the room or cabinet and begin
containing the damage until the expert support team arrives.
For security reasons, only a few people should have access to this cabinet.
Otherwise, you would be surprised how fast these keys will disappear. A sign-out
sheet in the cabinet can be used to track what has been loaned out, to whom,
when, and by whom it was authorized.
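The Key Log and sign-out sheet described above amount to a small table of who holds which key. As a rough sketch only, here is one way such a log might be kept and queried. The KeyIssue record, its field names, the key labels, and the sample people are hypothetical and are not taken from any form in this book.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class KeyIssue:
    key_id: str                         # hypothetical label for the key
    holder: str                         # person the key was issued to
    authorized_by: str                  # who approved the request
    issued_on: date
    returned_on: Optional[date] = None  # None means the key is still out


def outstanding(log):
    """Keys that have been issued and not yet returned."""
    return [entry for entry in log if entry.returned_on is None]


def keys_to_recover(log, current_staff):
    """Outstanding keys held by people who are no longer on the staff list."""
    return [entry for entry in outstanding(log) if entry.holder not in current_staff]


def holders_of(log, key_id):
    """Everyone who needs a replacement key if this lock is changed."""
    return sorted({entry.holder for entry in outstanding(log) if entry.key_id == key_id})


# Sample data is invented for the example.
key_log = [
    KeyIssue("TELECOM-CLOSET-2", "A. Example", "Facility Security Manager", date(2024, 1, 10)),
    KeyIssue("TELECOM-CLOSET-2", "B. Example", "Facility Security Manager", date(2024, 2, 3)),
    KeyIssue("SPARES-CABINET", "C. Example", "Facility Security Manager", date(2024, 3, 1),
             returned_on=date(2024, 4, 1)),
]
current_staff = {"A. Example", "C. Example"}

print([e.holder for e in keys_to_recover(key_log, current_staff)])
print(holders_of(key_log, "TELECOM-CLOSET-2"))
```

Running the first query during the quarterly review gives a list of keys to recover; the second tells you how many keys a lock change would require.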
Note the phone numbers of local locksmiths who are available around the
clock, every day of the week. If you are depending on the building security folks to
provide this service, inspect their operation to ensure it includes all keys and that
they are available 24/7. Whoever maintains the key locker should also have a large
set of bolt cutters. This “master key” will open most locks by slicing through them.
Use it liberally on all noncompany locks you encounter.
Master keys are keys that open more than one door. The way that door locks
are keyed is such that security zones are created. This allows a master key to open
all the doors in a given department and not in other departments. Master keys
must be closely guarded and issued sparingly.
Electronic Keys
An excellent solution to the problem of propagating keys is electronic locks.
Electronic locks are expensive to install but provide a wide range of benefits. An
electronic lock not only opens the door but it tells you who tried to open a door,
when they opened the door, and how long it was open.
You see electronic door locks in most modern hotels. Hotels had a problem.
Customers often lost their keys or continued their journey without turning them
in. The hotel had to assume that someone was walking around with the key to a
room that they might use to break in later. This forced hotels into an expensive
rekeying of the doors. Rekeying cost their customers money and was a constant
problem for the innkeeper. Now, if someone checks out of a hotel with an electronic
key, that key is disabled. Door locks no longer need changing and customers are
no longer billed for rekeying.
Another problem with physical keys is that they get lost, they get copied (with
and without permission), and the people holding them may pass them on to
less-trusted individuals. You can never be sure if a key is truly lost or has been
intentionally stolen so that some miscreant can gain access to a particular area.
This forces an expensive lock change. It also means that anyone else with one of
these keys (those who are entitled to have one) must exchange their key for a
current version.
In general, there is no law against copying ordinary keys. Even keys stamped with
an admonition of “Do Not Copy” typically have no legal standing, and many key
makers will copy them anyway. So anytime a key is provided to someone, assume
that person can easily make a copy. Even after a key is surrendered, you cannot
be sure that door is still safe. For all anyone knows, the employee made a copy
of that key for a friend in another department.
An electronic lock uses the digital number on a key, or a key code, to determine
who has access to what area. All information is kept in a master database. When
you try to open a door, the badge’s number or the key code you entered is read
and sent to the database. The database checks whether you are authorized to open
that door. If you are, the door latch releases. If not, usually nothing
happens and the lock ignores the key. The database saves a record of which key
number tried to open which door, when, and whether the door opened.
In this way, people can be given access or denied access via the database
without issuing or recovering keys to each door. If a key is lost, it can be disabled
at the database and be worthless to a thief. Anyone who finds it has a useless piece
of plastic. This of course depends on the individual to report the lost key and have
it promptly disabled. If an employee leaves the company, you can disable the
key quickly.
Whether you are allowed in or not, each attempt is recorded with a date and
time for tracking who went where. Denied access can be used to see who is testing
your security system. If something is missing from an area, you can see who
entered each room. This log must be reviewed daily to see which unauthorized
cards are attempting to get through which doors.
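The check-then-log behavior described above can be illustrated with a short sketch. This is not how any particular lock vendor implements it. The AccessDatabase class, its method names, and the sample key number and door names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AccessDatabase:
    permissions: dict = field(default_factory=dict)  # key number -> set of doors
    disabled: set = field(default_factory=set)       # lost or revoked key numbers
    log: list = field(default_factory=list)          # (when, key number, door, opened)

    def grant(self, key_number, door):
        """Give a key access to a door without issuing anything physical."""
        self.permissions.setdefault(key_number, set()).add(door)

    def disable(self, key_number):
        """Make a lost key, or a departed employee's key, worthless."""
        self.disabled.add(key_number)

    def try_open(self, key_number, door, when):
        """Decide whether the latch releases, and record the attempt either way."""
        allowed = (key_number not in self.disabled
                   and door in self.permissions.get(key_number, set()))
        self.log.append((when, key_number, door, allowed))
        return allowed

    def denied_attempts(self):
        """Entries for the daily review of who is testing the system."""
        return [entry for entry in self.log if not entry[3]]


# Sample data is invented for the example.
db = AccessDatabase()
db.grant("1001", "telephone room")
first = db.try_open("1001", "telephone room", datetime(2024, 6, 3, 9, 0))
second = db.try_open("1001", "computer room", datetime(2024, 6, 3, 9, 5))
db.disable("1001")  # the key is reported lost
third = db.try_open("1001", "telephone room", datetime(2024, 6, 3, 9, 10))

print(first, second, third)
print(len(db.denied_attempts()))
```

Note that every attempt is logged, whether or not the door opened, so the same log serves both the daily review of denied attempts and any later investigation of who entered a room.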
An electronic lock can also track doors that are propped open. Depending on
how your system is configured, this may trigger a security alarm so that someone
can check whether this is a legitimate activity or whether someone wanted a door
left ajar. Since the log also tells you who opened the door, it is a good time to
find out why they did this. If you wanted the door left open, you would save
money and remove the lock!
Electronic locks also allow for master keys and security zones. For example,
this lets you set up the electronic key for the telephone systems technician to
open all telephone room doors but none of the computer room doors. Electronic
keys are nice because it is easy to enable various levels of security at any time.
Just as you would manually with physical keys, you should use the electronic
lock software to generate a quarterly key access report to review which employees
have access to what areas. This will catch those cases where someone’s project once
needed access to an area that is no longer required. It may also highlight more than
one card issued to a person (they lost one, were issued a “temporary” replacement,
and then kept it and the original). Usually this list is circulated among your managers
to ensure people have the proper access. Keep in mind your after-hours support
requirements or you’ll be making some late-night trips to open doors!
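Lock software from your vendor will have its own reporting screens, so treat the following only as a sketch of what the quarterly access report contains. The card list layout, the function names, and the sample cards are assumptions for illustration.

```python
from collections import defaultdict

# (card number, employee, areas the card opens); all sample data is invented
cards = [
    ("1001", "A. Example", {"telephone room"}),
    ("1002", "B. Example", {"telephone room", "computer room"}),
    ("1003", "B. Example", {"computer room"}),
]


def access_by_area(cards):
    """List, for each area, the employees whose cards open it."""
    report = defaultdict(set)
    for _number, employee, areas in cards:
        for area in areas:
            report[area].add(employee)
    return {area: sorted(people) for area, people in sorted(report.items())}


def employees_with_multiple_cards(cards):
    """Flag anyone holding more than one card, such as a kept 'temporary' card."""
    numbers = defaultdict(list)
    for number, employee, _areas in cards:
        numbers[employee].append(number)
    return {employee: sorted(held) for employee, held in numbers.items() if len(held) > 1}


print(access_by_area(cards))
print(employees_with_multiple_cards(cards))
```

The first report is what you would circulate among your managers; the second highlights cards that should be collected or disabled.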
System Passwords
A system password is like a master key. Usually tied to an administrator-level
user ID, it provides unrestricted access to every feature on a computer system.
In our case, we may need these passwords to perform an emergency shutdown of
main computer systems. For this reason only, we need them kept in the key locker.
These passwords have an unlimited potential for mischief, so they must be
closely guarded.
Establish a secure area to store system passwords. They can all fit on a sheet of
paper, which must include all administrator-level accounts. Keep this sheet in a
sealed envelope near your equipment in the event that a rapid system shutdown is
required. Another place to store it is inside your key locker. Check the seal on
the envelope from time to time to ensure it has not been tampered with.
You will need this information if you ever need to shut down or restart your
computer system when the systems experts are not available. This might be due
to a fire in an adjacent room where the loss of electrical power [and
Uninterruptible Power Supply (UPS) power] is imminent.
SERVICE CONTRACTS
How could downtime on a single piece of machinery qualify as a disaster?
It would if that was the only printer that could print paychecks and today is
payday! Paychecks normally must be distributed at a given time, and you may not
be able to wait another 4 hours for a staff member to come in just to look at the
printer. It might be critical if your primary data communications hub began
emitting blue smoke. It might be critical if . . . but I think you get the picture.
A service contract isn’t much good to you if you can’t call for help when you
need it. Round-the-clock service coverage is very useful for maximizing system
uptime. This is especially true for critical hardware and software. Unfortunately,
24/7 service can easily double the cost of a service contract. So if you are paying
out this large premium every month, take steps to ensure it is available when
needed. People cannot call for it if they don’t know how.
Obtain a list of all service providers you have service agreements with. Cross-
check this list with a walk-around to ensure that all your major equipment is
accounted for on the list. We will need to include all these service provider names
later when we build our vendor contact list.
There are four basic types of contracts with endless variations:
1. 24/7. They provide unlimited around-the-clock coverage for time and
materials. Pay one price per month and leave your worries behind. This is
necessary for mission-critical equipment and is the most expensive approach.
2. 8 to 5. They will work on equipment problems during the business day and
usually supply any parts that are needed.
3. Time and Materials. They will work on the problem and charge you by the
hour for the repair technician’s time. The costs of any parts required are also
included on the bill. This is good for nonessential equipment that rarely breaks.
4. Exchange. Send them your broken equipment and they will either send you a
refurbished replacement or repair it and send it back. This is good for devices
where you have on-site spares, such as monitors, terminals, scanners, and
printers.
Begin building your list of service agreements. This list will be very useful in
many areas of your company—the help desk, the late-shift operators, the
security guards, and many other places. Use Form 5-1 on the CD-ROM to develop
your list. The essential information to gather from each of your service
agreements includes:
➤ Contact Names. Whom do I call? There may be multiple people involved with
your account. There may be a sales representative, a dedicated technician,
and even an after-hours contact name and number. When time is short, you
need to know whom to talk to for the fastest service.
➤ Company Address. Look at the city to see how far away they are. You can
gauge an approximate response time for the technician. Any spare parts the
technician may need will probably be that far away also. If the service company
is too far away, make a note to look for someone closer to home. On the other
hand, some companies use a work-from-home field workforce, so using the
company’s address is only a starting point for this inquiry.
➤ Telephone Numbers. This could be a rather long list. You may have a separate
number for normal hours, their fax machine, the technicians’ direct line, and
an after-hours number. You need them all clearly identified. Like your recall
list, when you use one, pencil in the date next to it so you know the last time
that telephone number was validated.
➤ E-mail Address. Many companies use e-mail to pass noncritical information
to their customers. This might also help if your sales representative was away
from the office on a business trip and was checking for messages.
➤ Customer Number. The identification code number by which this contract is
known to the vendor. You will need this when you call the problem in. Service
centers normally will not budge until they verify that you are paid up and
eligible for this service.
➤ Hours of Support Under Contract. This is VERY important. It determines
whether you will be billed for the service call. If you are paying for 8:00 AM
until 5:00 PM service and then demand that a technician come out late at night,
you will be billed a hefty hourly fee. This may be acceptable, so long as you are
aware of the potential costs. Paying for 8 to 5 service means that if the repair
isn’t finished at 5:00, the repair technician is going home and will be back
tomorrow. Otherwise, you will again be paying a large overtime hourly fee.
➤ When Does the Agreement Expire? Some equipment inconveniently breaks
on the wrong side of the deadline. All service contract expiration dates should
be placed on a calendar so that you can see this coming and negotiate a new
agreement before the old one expires. This information can also feed into your
annual budget process.
➤ Description of What You Buy from Them. This could be a wide range of
things. Some contracts provide everything, including materials and labor, for a
fee. Service companies you don’t often need may be contracted under a
time-and-materials scheme for all repairs. Whatever you buy from them, very
briefly describe it here.
➤ Your Internal Designated Contact Persons. Many contracts require that
several persons be designated as the company’s representatives for contacting
the service company, to prevent its lines from being flooded by minor calls. Even
though specific people are named in the contract, if you declare an emergency the
service company should begin assistance until the named parties arrive.
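Two of the items above, hours of support and the expiration date, lend themselves to a quick automated check. The following is a minimal sketch under stated assumptions: the ServiceContract record and its field names are hypothetical (they are not the layout of Form 5-1), “8 to 5” is assumed to mean Monday through Friday, and the 60-day warning window is an arbitrary choice.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta


@dataclass
class ServiceContract:
    vendor: str
    customer_number: str
    expires: date
    round_the_clock: bool = False  # True for a 24/7 agreement
    start: time = time(8, 0)       # assumed weekday coverage window
    end: time = time(17, 0)

    def covers(self, when: datetime) -> bool:
        """True if a service call at this moment falls under the contract."""
        if when.date() > self.expires:
            return False
        if self.round_the_clock:
            return True
        # Assumption: business-day coverage means Monday (0) through Friday (4).
        return when.weekday() < 5 and self.start <= when.time() < self.end


def expiring_soon(contracts, today: date, within_days: int = 60):
    """Contracts to renegotiate before they lapse (window length is an assumption)."""
    limit = today + timedelta(days=within_days)
    return [c for c in contracts if today <= c.expires <= limit]


# Sample data is invented for the example.
printer = ServiceContract("Example Printer Service", "CUST-0001", date(2024, 12, 31))
hub = ServiceContract("Example Network Service", "CUST-0002", date(2025, 6, 30),
                      round_the_clock=True)

print(printer.covers(datetime(2024, 6, 4, 10, 0)))   # Tuesday morning
print(printer.covers(datetime(2024, 6, 4, 22, 30)))  # Tuesday late night
print(hub.covers(datetime(2024, 6, 8, 3, 0)))        # Saturday, early morning
print([c.vendor for c in expiring_soon([printer, hub], today=date(2024, 11, 15))])
```

A call that falls outside the covered hours is not forbidden; it simply means you should expect to be billed, which is worth knowing before you pick up the telephone.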
Now that you have this list, assign someone to make up small cards for each
machine covered by a service agreement. On this card, print all the essential
information you have gathered. Firmly attach this information to the machine or
inside of its cover. This is the ideal—information available at the point it is needed.
Now if that device quakes, shakes, and begins to moan, the information on whom
to call is immediately at hand.
When attaching these cards, check the machine over for advertising stickers.
Some service companies attach them to whatever they repair. This is OK except
when you change service companies and some well-meaning soul calls the number
on the sticker to repair the machine. Without a service agreement, they may come
out and send you an expensive bill. So when you see these stickers, remove them.
People will get into the habit of depending on the cards you tape to the machines.
If you have a lot of equipment in the same room, make up an information
station with a notebook attached to the wall that contains all the same information.
Keep track of wherever you place this information for the times when you need to
update your service providers or hours of coverage.
VENDOR LIST
Now that you know whom to call for service, make up a list of the other
companies you routinely deal with using Form 5-2 (see CD). Since we have the
major equipment covered, we can now focus on the companies that provide
your routine supplies. Why is that important?
Have you ever run out of something seemingly mundane, like a special toner
cartridge, while your usual purchasing agent was on vacation? The company kept
moving along, but there was someone out there who was very vocally upset. As you
collect vendor contact information, you will quickly see how useful it can be.
You want vendor contact information for the companies that supply your
support materials such as custom cables, preprinted forms, backup tapes, any
number of things you need to keep your operation flowing smoothly. Most suppliers
don’t list an after-hours number. They have one but it is not published. Try to get
it from the salesperson. If they don’t have it, get the salesperson’s home number.
Often when you really need something, a salesperson will go the extra mile to
build customer loyalty.
Obtain a list of all support materials suppliers. This includes companies that
provide off-site storage of your backup tapes, courier services, companies that
provide preprinted forms, and companies that sell or lease you equipment as well
as companies that repair it. Mandate that it be kept current. Essential data
elements include:
➤ Contact names.
➤ Company address.
➤ Telephone numbers: normal hours, fax, and after-hours number.
➤ E-mail address.
➤ Your internal vendor number.
➤ Description of what you buy from them.
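If you keep the vendor list electronically rather than on paper, the data elements above map naturally onto a simple record. The following is a minimal Python sketch, not a copy of Form 5-2. The class name `VendorContact`, its field names, the helper `missing_elements`, and the sample vendor are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class VendorContact:
    """One vendor list entry; field names are illustrative, not from Form 5-2."""
    company: str
    contact_names: list[str]
    address: str
    phone_normal_hours: str
    fax: str = ""
    phone_after_hours: str = ""       # often unpublished; ask the salesperson
    email: str = ""
    internal_vendor_number: str = ""
    supplies: str = ""                # description of what you buy from them


def missing_elements(vendor: VendorContact) -> list[str]:
    """Return the names of data elements that are still blank."""
    return [f.name for f in fields(vendor) if not getattr(vendor, f.name)]


# Hypothetical sample entry.
toner = VendorContact(
    company="Example Toner Supply",
    contact_names=["Pat Example"],
    address="100 Sample Street",
    phone_normal_hours="555-0100",
    email="orders@example.com",
    supplies="Special toner cartridges",
)
print(missing_elements(toner))
```

Running a check like this over the whole list is one way to see which vendors still lack an after-hours number before you need it.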
Public utilities are another set of vendors you need to know about. Loss of
service from telephone, Internet, electric, gas, and water companies can shut
down your operations in the blink of an eye. These companies all have 24-hour
service support numbers. They may also have a special trouble reporting number
for companies and major customers. For each utility, you will need:
➤ Contact names for sales, technical support, after-hours dispatch.
➤ Telephone numbers for each contact to include their normal hours number,
fax number, and after-hours number.
➤ E-mail address for handling routine issues.
Public safety telephone numbers must also be prominent on your list. The
ubiquitous 911 is always a good starting point, but you may find the normal
telephone numbers for police, fire, ambulance, and the local hospital are all
handy to have in a crisis. Use Form 5-3 (see CD) to start your list of whom to call
in an emergency.
WALK-AROUND ASSET INVENTORY
Most companies have a lot of equipment to keep track of. We’ll get to that later.
Start by doing a walk-through of your areas of responsibility (do not trust this to
memory). Draft a list with key information on all your major equipment. A major
piece of equipment is one that costs a lot of money, takes a long time to replace,
or is the only one of its kind and your operation depends on it. This will usually
be your larger or shared pieces of equipment.
As you walk around, be sure to open all closet doors and look into boxes. You
would be surprised what you will find stashed away by people for emergencies.
Note the location of any spare equipment. Arrange to have it picked up later. It
should all be collected into one central point to cut down on the number of
duplicate spares. Computers and computer component parts are like fresh fish;
they lose value quickly with age. If everyone is hiding something like a spare
printer in case they have system problems, you could be paying for many more
spares than you need. Consolidate and lock up all your spares in one location to
minimize costs and to ensure they will be available to whoever needs them. This
may even free some equipment for use elsewhere.
When you examine each machine, look for indications of who sold or maintains
the device. Sometimes repair services place large stickers with their telephone
number on devices they service. Note these in case you cannot locate the service
contract for this device.
Another sticker often found somewhere on the inside is a notice of the last
time this device received preventative maintenance. Some equipment such as a
network server may need as little as an occasional shakeout of the fan filter. Other
devices, such as your UPS system, need their batteries checked every 6 months.
The frequency that preventative maintenance is required can be found inside the
manual that accompanies the equipment. Therefore, also begin locating the
manuals you need. All preventative maintenance must be recorded in a log for
that device. Note what was done and by whom. If the service was improperly done,
and then the equipment fails, you may have a claim against the service company.
To continue with the thought on hardware manuals, the books should either
be prominently displayed adjacent to the equipment or collected into a central
place. This reduces the amount of time lost looking for answers. As you walk
around, make a note next to each piece of equipment on your list as to whether
the manual could be located.
In each room note the following information. Again, this is for critical systems,
not a wall-to-wall inventory. Be sure to include all the equipment in your computer
room, telephone switch room, and network closets (a chain is only as strong as its
weakest link). The list should include:
➤ Manufacturer’s Name.
➤ Model Number.
➤ Serial Number.
➤ Warranty Expiration Date. Tracking this will save on service costs and help you
to know when to add that item to a service contract. Be sure to add it to your
service contract renewal calendar.
➤ Location. You may need to work up your own notation for this if everything is
not conveniently set up in an easily identifiable room.
➤ Serviced by. This may be a sticker right on the device.
➤ Connected to. This will take some asking around but will be very useful. Out
of this, you may uncover the weak link in a chain.
➤ Feeds into What. Same benefits as “Connected to.”
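One way to look for the weak link is to count how many devices on your list name the same upstream device in their "Connected to" entry. The sketch below is illustrative only. The function name `shared_dependencies`, the dictionary keys, and the sample devices are assumptions, not part of any form in this book.

```python
from collections import Counter


def shared_dependencies(inventory):
    """Return (upstream device, count) pairs named by more than one device."""
    counts = Counter(
        upstream
        for item in inventory
        for upstream in item["connected_to"]
    )
    # most_common() lists the most heavily shared devices first.
    return [(device, n) for device, n in counts.most_common() if n > 1]


# Hypothetical walk-around notes.
inventory = [
    {"device": "File server", "connected_to": ["Core switch", "Computer room UPS"]},
    {"device": "E-mail server", "connected_to": ["Core switch", "Computer room UPS"]},
    {"device": "Telephone switch", "connected_to": ["Core switch"]},
]

for device, n in shared_dependencies(inventory):
    print(f"{device}: {n} devices depend on it")
```

A device that shows up at the top of this list but is missing from your inventory or service contracts deserves a closer look.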
Think back to previous problems. Are there any other critical or unique
devices around your facility that should be on the list? How about the UPS in your
computer room? I bet there is another one on your telephone switch. Both rooms
require climate control for the equipment to operate safely, so the HVAC repair
number must be on there also.
Now that you have a list of your critical equipment, take time to cross-reference
the equipment list to the vendor service agreement list. Are any of your critical
devices lacking service coverage? Be sure to check the serial numbers because that
is how service companies determine what is covered. Note the type of service
agreement that each item has.
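The cross-reference described above amounts to matching serial numbers between two lists. As a rough sketch, assuming both lists are kept as simple records (the keys, the function name `uncovered_assets`, and all sample data are hypothetical), it could look like this in Python:

```python
def uncovered_assets(assets, agreements):
    """Return assets whose serial number appears on no service agreement."""
    # Normalize serial numbers so case and stray spaces do not hide a match.
    covered = {
        serial.strip().upper()
        for agreement in agreements
        for serial in agreement["serial_numbers"]
    }
    return [a for a in assets if a["serial"].strip().upper() not in covered]


# Hypothetical asset list and service agreement list.
assets = [
    {"name": "File server", "serial": "SN-1001"},
    {"name": "Telephone switch", "serial": "SN-2002"},
    {"name": "Computer room UPS", "serial": "SN-3003"},
]
agreements = [
    {"vendor": "Example Service Co.", "coverage": "24x7",
     "serial_numbers": ["SN-1001"]},
    {"vendor": "Sample Telecom Repair", "coverage": "8:00 AM to 5:00 PM",
     "serial_numbers": ["sn-2002"]},
]

gaps = uncovered_assets(assets, agreements)
for asset in gaps:
    print(f"No service coverage found for {asset['name']} ({asset['serial']})")
```

Matching on the serial number rather than the device name mirrors how the service companies themselves decide what is covered.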
Consider each item on the asset list separately. Based on your experience,
should any of the coverage be increased to include after-hours support? Should
any of the items be reduced to 8:00 AM to 5:00 PM (or whatever they offer)?
With all this information at hand, draft a vendor list of whom to call and the
normal billing method (time and materials, flat rate, etc.). Consider making a
matrix that allows you to quickly check to see who supplies services or materials
per device, such as every vendor that supports your AS/400. Use Form 5-4 (see
CD) as a starting point for creating your list.
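If the vendor list is kept electronically, the matrix can be generated rather than maintained by hand. The following Python sketch is one possible layout, not the layout of Form 5-4. The function name `build_device_matrix`, the record keys, and the vendor names are assumptions for illustration.

```python
from collections import defaultdict


def build_device_matrix(vendors):
    """Map each device to every vendor that supplies services or materials for it."""
    matrix = defaultdict(list)
    for vendor in vendors:
        for device in vendor["devices"]:
            matrix[device].append(
                f'{vendor["name"]} ({vendor["provides"]}, {vendor["billing"]})'
            )
    return dict(matrix)


# Hypothetical vendor list entries.
vendors = [
    {"name": "Example Hardware Service", "provides": "hardware repair",
     "billing": "flat rate", "devices": ["AS/400", "File server"]},
    {"name": "Sample Software Support", "provides": "operating system support",
     "billing": "time and materials", "devices": ["AS/400"]},
]

matrix = build_device_matrix(vendors)
for entry in matrix["AS/400"]:
    print(entry)
```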
SOFTWARE ASSET LIST
If you lose a server, a critical PC, or a shop floor controller to a fire, you need to
know what to replace it with. There is much more to a computer than what you
see on its outside; there is all the very important software inside of it. Replacing
the hardware without loading all the appropriate software (and data) will only
result in a dark monitor staring back at you.
For each of the critical systems you have previously identified, you need to
make a list of any software they require to drive them. This includes copies of
custom software, any nonstandard driver programs, or operating system settings.
(Cross-check this against your vendor list!)
This sometimes creates a problem if the machine that dies is old and only
the latest hardware is available. You can reload the software from a data
backup (you hope), but the hardware and existing operating system might
create some conflicts.
In many cases, you can recover the software for that machine by reloading its
full disk image backup. If you must reload the software from the original media,
you need to be able to locate it. Once purchased software has been loaded onto a
server, the original media should be stored along with your backup tapes at an off-site location.
CRITICAL BUSINESS FUNCTIONS
A key driver to your disaster planning is a clear identification of the critical business
functions performed at your facility. You cannot protect everything equally, so you
need to concentrate your recovery plans on the most important functions.
Identifying them is a top management function.
Every company has a few essential things it does. Everything else can be
delayed for a short time, whereas losing a critical function brings progress to a halt.
Critical items must be recovered before all other areas. If you must draw up your
own list, be sure to discuss them with your accounting manager or controller.
Don’t be surprised if they cannot rattle off a list to you. They probably never
worked a list up either.
For each critical function you identify, explain why it is important. Does it
involve cash flow? Does it fulfill a regulatory requirement? If you have a broad
understanding of your business, your list may be quite long—too long. Try to
narrow it down to 10 or fewer items. The longer list is still very useful, but what we
are after is a guideline.
OPERATIONS RESTORATION PRIORITIES
If three things break at once, which one do you fix first? That is a restoration
priority. Based on the critical business functions identified in the previous step,
you now take your asset list and identify restoration priorities for every asset.
Some of these are easy. If there is a file server used by many departments
across the company, it will have a high priority for service restoration. A telephone
switch is the same high importance. But how important is your e-mail server? Is it
more important than the Materials department’s warehouse server? Probably not,
unless it is the conduit for e-mailed and faxed orders from customers.
Consider this from another angle. If the electric company called and said they
were shutting off two thirds of the power to your building, which equipment
would you shut down, which would you ensure stayed up, and which would you
stand by to start as soon as the outage was over?
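One simple way to turn the critical business functions into restoration priorities is to rank the functions and let each asset inherit the rank of the most critical function it supports. The Python sketch below shows that idea under stated assumptions: the ranking scheme, the function name `restoration_order`, and the sample functions and assets are hypothetical, and your own priorities may weigh other factors.

```python
def restoration_order(assets, function_rank):
    """Sort assets so those supporting the most critical function come first."""
    lowest_priority = len(function_rank) + 1

    def priority(asset):
        ranks = [function_rank[f] for f in asset["supports"] if f in function_rank]
        # An asset that supports no critical function goes to the end of the line.
        return min(ranks) if ranks else lowest_priority

    return sorted(assets, key=priority)


# Hypothetical critical functions, 1 = most critical.
function_rank = {"Customer order entry": 1, "Shipping": 2, "Payroll": 3}

# Hypothetical asset list.
assets = [
    {"name": "E-mail server", "supports": ["Internal memos"]},
    {"name": "Warehouse server", "supports": ["Shipping"]},
    {"name": "File server", "supports": ["Customer order entry", "Payroll"]},
    {"name": "Telephone switch", "supports": ["Customer order entry"]},
]

ordered = restoration_order(assets, function_rank)
for position, asset in enumerate(ordered, start=1):
    print(position, asset["name"])
```

Note that if the e-mail server were the conduit for customer orders, adding "Customer order entry" to its `supports` list would move it to the front, which matches the exception described above.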
TOXIC MATERIAL STORAGE
For the safety of all concerned, you should know if there is any toxic material
stored on the premises and where it is. If there is a fire, building collapse, or flood,
you will want to help warn people away from that area.
Use a map of the facility to indicate where this material is stored and what it
is. If it is flammable, be sure to note that also. This is an important part of your
plan so ensure everyone is aware of it.
Everyone on the recovery team must know where these dangerous materials
are located and how to identify if they are leaking. They should know what to do if
they encounter them.
EMERGENCY EQUIPMENT LIST
When a disaster occurs, you’re going to want to know where things are to help
reduce the amount of damage to equipment and the facility. This includes things
such as electrical shutoff, water valves, gas shutoff, sprinkler system controls, etc.
You also want to know where any special equipment such as portable pumps,
wet/dry vacuums, and special fire extinguishers are kept so that damage can be
kept to a minimum.
Everyone on the recovery team must know where these items are located and
how to use them in the event of an emergency. See Form 5-5 (on CD) to start your
list of emergency equipment. And yes! Don’t forget the keys to the doors!
TRAINED FIRST RESPONDERS
Many rural communities depend on volunteer fire departments and ambulance
crews to support their towns. If any of your employees are EMT qualified, this is
important to note. The same goes for anyone who is a trained volunteer firefighter,
a ham radio operator, or a homebuilder; any number of useful skills might show up.
Also ask whether employees have any hobbies or outside interests that would be
of use in a crisis.
If you have any military Reserve or National Guard personnel, the training for
their military job classification may be useful. They may be military police, hospital
workers, or a wide range of things. It is not unusual to work in a military field that
is entirely different from your civilian job. A possible downside to having these
people on staff is that, in a wide-area emergency, these people may be called to
government service and not be available to assist in your recovery.
Anyone that you identify with additional skills should be added to your recall
roster. You need to indicate what skills each has, along with details on how to
contact them during and after work hours. This list will have the same format (and
be a continuation of ) the emergency notification and recall list drawn up in the
“Access to People” section of this chapter. You will need their work telephone
number, home telephone number, cell phone number, pager number, home
address, etc.
You should check this list with your Human Resources department to ensure
you are not violating any company rules by calling on these people in an emergency.
CONCLUSION
Once you have finished the steps outlined in this chapter, you’ll have created a
basic interim plan that will drastically improve your ability to handle any disaster
that occurs. This interim plan will provide the recovery team the critical information
they need to:
1. Get access to key people who can get the recovery process started as soon
as possible.
2. Get access to facilities and computer systems to get them back up and running.
3. Have the service contracts they’ll need to get the vendors you’ve contracted
with for outside support busy as quickly as possible.
4. Order emergency supplies quickly from critical vendors.
5. Document assets damaged using the walk-around asset list.
6. Order replacement copies of important software.
7. Identify the critical business functions that must continue during restoration.
8. Restore the operational functions in the best order.
9. Identify the location of toxic materials to cleanup crews for their protection.
10. Locate onsite emergency equipment and materials you need to help clean up
the mess.
11. Ask for assistance from any volunteer firefighters or EMTs who are on staff.
If you have followed these steps and collected this information, you have the
material for a basic business continuity plan. If your project stopped right now,
your company would be noticeably better prepared for a crisis than it was before. But
don’t stop now! There is much more important information to gather and mitigation
actions to identify. What you have now is a good starting point. Continue reading
to get the information you need to develop a complete plan.
C H A P T E R 4
SELECTING A STRATEGY
Setting the Direction
However beautiful the strategy,
you should occasionally look at the results.
—Sir Winston Churchill
INTRODUCTION
With the results of the Business Impact Analysis and risk assessment in hand, it is
time to select a recovery strategy. The recovery strategy is the overall direction for
planning your recovery. It provides the “what” of your recovery plan. Individual
plans are the “how” it will be done. An approved strategy keeps the company’s
recovery plans in sync and avoids working at cross purposes.
A recovery strategy is not for restoring things to the way they were before. It is
for restoring vital business functions to a minimally acceptable level of service.
This minimal level of service enables the company to provide a flow of goods and
services to its customers and buys time for planning a permanent recovery. There
will be a separate recovery strategy for different parts of the company.
Disaster recovery planning is all defensive. Like insurance, you pay year after
year so that if something does occur, you are covered. If nothing happens, then the
money is spent with nothing tangible to show for it. The business benefit of disaster
recovery planning (a subset of business continuity planning) is that it reduces the
risk that a major company catastrophe will close the doors forever.
Another strategy deals with business continuity: keeping the company operating
by overcoming “in-process disasters.” Facility-destroying disasters
are rare. More common are the many local disasters that occur in a process.
Business continuity returns value to the company by developing contingency
plans in case of a vital business function interruption. It also forces the company
to examine its critical processes and to simplify them for easy recovery. Simpler
processes are cheaper to operate, more efficient and more reliable. The business
continuity strategy is in addition to and complementary to the disaster
recovery strategy.
SELECTING A RECOVERY STRATEGY
Your recovery strategy determines the future costs and capability of your overall
program. All subsequent plans will be written to fulfill the required recovery time
and the solution selected. A poorly selected strategy will require all plans to be
rewritten when it is replaced.
Companies have long struggled with how much money to spend on a quick
recovery that may never be used. A recovery strategy is a tradeoff between time
and money. The faster the ability to recover (up to near instantaneous), the higher
the expense. The maximum length of time that a company can tolerate an outage
is its recovery time objective (RTO). This was identified by its Business Impact
Analysis (BIA). Rapid recoveries are often favored until the initial and ongoing
costs are detailed. However, a rapid recovery may also become a marketplace
advantage by providing a more reliable product delivery.
The RTO is measured from the time when the incident occurs. Hours spent
dithering over whether to declare a disaster are hours lost toward your
recovery time goal.
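Because the clock starts at the incident and not at the declaration, it can help to work out how much of the RTO is left once a disaster is finally declared. The following Python sketch shows the arithmetic. The 24-hour RTO, the dates and times, and the function name `time_remaining` are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def time_remaining(incident_at, rto, now):
    """Return how much of the RTO is left, measured from the incident itself."""
    deadline = incident_at + rto
    return deadline - now


# Hypothetical example: incident at 3:00 PM, disaster declared at 7:30 PM.
incident = datetime(2024, 3, 4, 15, 0)
declared = datetime(2024, 3, 4, 19, 30)
rto = timedelta(hours=24)

remaining = time_remaining(incident, rto, declared)
print(remaining)  # 19:30:00
```

In this example, four and a half hours of indecision leave the recovery team with 19.5 hours instead of 24.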
The classic error is to recover the data center which then sits idle because the
various departments that use the IT systems were not recovered. Companies must
craft a separate recovery strategy for each significant area:
➤ Information Technology. Recovering a data center, internal and external
network connections, and telecommunications.
➤ Work Area Recovery. Recovering a place for office workers along with a
personal computer, telephone, printer access, etc., all securely connected to
the recovered data center.
➤ Pandemic. Maintaining business during a public health emergency that may
run for 18 months or more.
➤ Business Continuity. Keeping the flow of products and services to customers
despite significant failures in company processes.
➤ Manufacturing. Recovering the flow of products after a crisis.
➤ Call Centers. Maintaining customer contact throughout the crisis.
Whatever is decided, the recovery strategy must be communicated throughout
the recovery project. All team members must understand the company’s timeframe
for recovering and the budgeted way to achieve it. It is the starting point for each
recovery plan.
Recovery Point Objective
Another important factor is your recovery point objective (RPO). This is the
amount of data that may be lost since your last backup. If your IT systems recover
to the point of their last backup, perhaps from the night before, and the incident
occurred at 3:00 PM the next day, then all of the data changes from the time of the
last backup up to that 3:00 PM incident must be re-created after the data center is
recovered. If not, the information is lost. Consider how many people take orders
over the telephone and enter them directly into the order processing system. How
many orders are shipped to customers with only online documentation? How
many bank transfers are received in a day? In the past, this data might be reentered
from paper documents. However, most of the paper products have been
discontinued. Where will this data come from?
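The exposure in the example above can be worked out directly: it is the time between the last backup and the incident. The Python sketch below assumes the nightly backup finished at 11:00 PM and that the company has set a four-hour RPO. Both figures, the dates, and the function names are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def exposure_window(last_backup, incident):
    """Return the span of data changes that would have to be re-created."""
    return incident - last_backup


def within_rpo(last_backup, incident, rpo):
    """Return True if the data at risk fits inside the recovery point objective."""
    return exposure_window(last_backup, incident) <= rpo


# Hypothetical times: backup the night before, incident at 3:00 PM the next day.
last_backup = datetime(2024, 3, 3, 23, 0)
incident = datetime(2024, 3, 4, 15, 0)

window = exposure_window(last_backup, incident)
print(window)  # 16:00:00
print(within_rpo(last_backup, incident, timedelta(hours=4)))  # False
```

Sixteen hours of orders, shipments, and bank transfers would have to be re-created here, which is why a nightly backup alone cannot satisfy a short RPO.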
Time
We live in a “right now” world. Will the company’s customers wait a week while
someone cobbles together a data center to restore the data or for someone to
answer customer service questions? The amount of time required to recover a
company’s vital business functions is the first question. Can your company survive
if it loses a day’s worth of data? The BIA identified your RTO. The recovery strategy
for all plans must meet this time goal. The RTO typically drives the cost of the
entire program.
Distance
The distance between the primary and backup recovery sites depends on the risk
assessment. Wherever you go, the recovery site must be far enough away so that
the same catastrophe does not strike both sites. Wide-area disasters, such as
floods, earthquakes, and hurricanes, can impact hundreds of square miles. Use
your personal experience and that of the BCP team to identify areas that are not
likely to be affected by the same risks.
The farther away your recovery site is, the more likely that the team must stay
there overnight. This requires additional expense for hotel rooms, catered food,
etc. However, there is a point where a recovery site is too far away. It is not unusual
for a company to depend on a critical employee who is also a single parent. These
people cannot stay away from home for extended periods.
In many cases, the distance is determined by the type of local threats from
nature. If your company is located on a seacoast that is susceptible to hurricanes,
then the recovery site may be hundreds of miles inland to avoid the same storm
disabling both sites. The same would be true in a floodplain such as along the
Mississippi River. However, if you are located in the Midwest, then a one-hour
distance for a recovery site may suffice.
You cannot foresee everything that might go wrong. After the terrorist attack of
September 11, 2001, many New York companies activated their disaster recovery
plans. Since their recovery sites were hundreds of miles away, they had planned
to fly to them. Who would have predicted that all of the country’s civil air fleet
would be ordered to remain grounded for so many days? In the end, driving was
the only way to get there, delaying most recoveries by at least a day.
Recovery Options
Recovering a data center is different from recovering a warehouse, which in turn
is different from recovering a call center. In the end, all strategies come together
to restore a minimal level of service to the company within the RTO.
The primary recovery strategies are to:
➤ Recover in a Different Company Site. This provides maximum control of the
recovery, of testing, and of employees. Some companies split operations so
that each facility can cover the essential functions of the other in a crisis. The
enemy of this approach is an executive’s desire to consolidate everything into
one large building to eliminate redundancies.
➤ Subscribe to a Recovery Site. This leaves all of the work of building and
maintaining the recovery site to others. However, in a wide-area disaster (such
as a hurricane), the nearest available recovery site may be hundreds of miles
away since other subscribers may have already occupied the nearest
recovery sites.
➤ Wait Until the Disaster Strikes and Then Find Some Empty Space. This
approach requires lots of empty office and warehouse space that is already
wired, etc. All we need to do is to keep tabs on availability and when needed,
take out a lease on short notice. This approach results in a long recovery time
but is the least expensive.
IT RECOVERY STRATEGY
IT systems were early adopters of disaster recovery planning. However, as
technology has evolved, so have expectations for how quickly they must recover.
Today’s companies keep almost all of their data in their computer systems.
Without this information, they stop working altogether. The time and expense to
completely re-create it is unacceptable. Companies examining their alternatives
must face up to the high cost of immediate recovery versus the lower cost of
slowly rebuilding in a new site. IT recovery steps (even for a temporary facility)
include rebuilding:
➤ Environmental. IT equipment must stay within a specific temperature and
humidity range.
➤ Infrastructure. External network connection into the data center of the local
service provider, and throughout the recovered data center; critical servers
used by application servers such as a domain controller, DNS, DHCP, etc.
➤ Applications. Company specific software used by the business to address
customer and internal administrative requirements.
➤ Data. The information needed by the company’s business departments to
support the flow of products and services.
In the past, the issue was to have a standby recovery site ready to go when
needed. This model is based on reloading software and data from backup media
(typically magnetic tape). However, this recovery strategy takes days. At best,
when company data is loaded onto backup media, vital data is separated from
nonvital data. Few companies bother to do this. The result is shuffling media in
and out of a loader to load critical files while the company waits for a recovery.
Refer to Figure 4-1 for a list of IT disaster recovery solutions; these fall into several
general categories from slowest to fastest.
How much can these solutions cost? A hot-site contract will cost about as
much per month as leasing your existing data center equipment. In a crisis, you
must pay the monthly fee for each day of use. So, if you use a hot site in a disaster
for 12 days, you might pay the same as you would for a year of disaster
recovery coverage.
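The arithmetic behind that comparison is short enough to check for yourself. The Python sketch below assumes the pricing model described above, in which each day of actual use is billed at the monthly subscription fee. The fee of 10,000 and the function names are hypothetical, and real contracts will differ.

```python
def hot_site_usage_cost(monthly_fee, days_used):
    """Cost of occupying the hot site, billed at one monthly fee per day of use."""
    return monthly_fee * days_used


def annual_subscription_cost(monthly_fee):
    """Cost of a full year of standby coverage."""
    return monthly_fee * 12


monthly_fee = 10_000  # hypothetical figure, in whatever currency the contract uses

usage = hot_site_usage_cost(monthly_fee, days_used=12)
yearly = annual_subscription_cost(monthly_fee)
print(usage, yearly)  # 120000 120000
```

Under this assumption, every additional day at the hot site costs as much as another month of coverage, so the length of the stay matters as much as the subscription itself.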
Recovery solutions, such as hot sites, are expensive. A popular solution is for
a company to establish a second company data center about one hour’s drive
from the main data center. This location should use a different power grid and
telecommunications company link than the main facility. A one-hour drive enables
workers to sleep at home every night. (Remember that some of the employees will
live in the opposite direction from the recovery site and the drive might be two
hours each way.) This is especially important for single parents. Hopefully this is
far enough away that the same wide-area disaster cannot strike both.
To prepare the recovery site, move to the second data center all of the test
servers for the critical IT systems. Also move servers for the noncritical systems.
Include adequate disk and network support. This provides equipment that is
ready in a disaster, but not sitting idle. To save more time on recovery, mirror the
critical data between the data center and the recovery site. Data replication
requires a high-speed data connection with replication equipment at each end.
The costs include data replication controllers at each end and a significant set of
disk drives at the recovery site.
Let someone else do it. Application Service Providers (ASPs) provide data
processing equipment, software licenses, and services to companies. Instead of
operating your own data center, you run on their equipment at their site. Require
that they maintain a Business Continuity Program. If this is your strategy, you
must witness and audit their tests to ensure they provide the level of protection
that you expect. The advantage is that this is their line of business and they will be
more efficient at writing these plans and recovering at a different site. Ensure that
Copyright © 2011. AMACOM. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.
EBSCO : eBook Collection (EBSCOhost) – printed on 1/11/2018 8:16 PM via AMERICAN PUBLIC UNIV SYSTEM
AN: 349248 ; Wallace, Michael, Webber, Larry.; The Disaster Recovery Handbook : A Step-by-Step Plan to
Ensure Business Continuity and Protect Vital Operations, Facilities, and Assets
Account: s7348467.main.ehost
Cold Site – an empty computer room without equipment
   Pros: Inexpensive
   Cons: Long recovery time (weeks); all data lost since last backup
   Good for: Companies with a low reliance on data center through
   dispersed processing

Hot Site – adequate equipment installed and loaded with an operating
system; internal and external network is active
   Pros: Recovery in days (infrastructure is in place)
   Cons: As expensive as a second data center; all data lost since
   last backup
   Good for: Companies that can wait days to recover

Hot Site with Data Replication
   Pros: Recovery in hours as everything is already on disk; little
   data loss
   Cons: Very expensive
   Good for: A quick recovery where a 1/2 day outage is tolerable

Failover – mirrors data and CPU operation; when the primary fails, the
secondary site automatically takes over with minimal data loss
   Pros: Outage measured in minutes or less
   Cons: Very, very expensive to purchase and maintain; duplicate
   hardware to become obsolete; duplicate software licenses and
   hardware to pay maintenance fees
   Good for: Online companies, hospitals, banks, and any company with
   a low tolerance to IT outages

Application Service Provider (ASP)
   Pros: The ASP provides for disaster recovery planning and testing
   Cons: Expensive; lack of control – must ensure ASP regularly tests
   plans at their recovery site
   Good for: Companies that do not manage their own applications
   recovery

FIGURE 4-1: IT disaster recovery solutions.
the ASP is contractually required to meet your RTO irrespective of its commitments
to other customers.
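Because Figure 4-1 runs from slowest and cheapest to fastest and most expensive, choosing among the first four options comes down to finding the cheapest one that still meets your RTO. The sketch below is one way to express that. The hour values are assumptions standing in for the figure's rough terms ("weeks," "days," "hours," "minutes"), the function name is hypothetical, and the ASP option is left out because its recovery time depends on the contract.

```python
# Assumed typical recovery times in hours, cheapest solution first.
# Figure 4-1 gives only rough magnitudes; these numbers are placeholders.
SOLUTIONS = [
    ("Cold site", 14 * 24),                  # "weeks"
    ("Hot site", 3 * 24),                    # "days"
    ("Hot site with data replication", 12),  # "hours" (a half-day outage)
    ("Failover", 0.25),                      # "minutes or less"
]


def cheapest_solution(rto_hours: float) -> str:
    """Return the least expensive solution that recovers within the RTO."""
    for name, recovery_hours in SOLUTIONS:
        if recovery_hours <= rto_hours:
            return name
    raise ValueError("No listed solution meets this RTO")


print(cheapest_solution(96))  # Hot site
print(cheapest_solution(1))   # Failover
```

Replace the placeholder hours with figures from your own vendors before relying on the result.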
Recommended IT Recovery Strategy
Establish a second company site at least a one-hour drive away in a place that is
on a different power grid and data network. In this site, operate the company’s
primary production data center. Ensure this satellite office has telecommunications
and network capacity to provide for a 25% surge in employees. Place the “Test” IT
systems and noncritical IT equipment in the company headquarters building.
The reasons for recommending this option include:
➤ If the headquarters offices are destroyed, the data center is safe, or vice versa.
Then we only have to recover from one disaster at a time.
➤ We can continue telephone contact with our customers if either office fails
and our customers will see only a slight drop in service.
➤ Using test servers as a backup data center avoids expensive “just-in-case”
machines sitting idle. In an emergency, the test servers become the production
machines for the applications they already support. Noncritical servers are
repurposed for critical systems support.
➤ The company’s application software is already on disk; we only need to load
the current version.
➤ We know the alternate data center is connected to a live network because we
use it daily.
➤ The company controls security access and facility maintenance of both sites.
➤ Backup media can be maintained in the headquarters facility (except for
archive copies), which means savings on third-party storage for short periods.
➤ Data replication can be added later to avoid time lost loading data from tape
and to minimize data losses.
➤ Recovery tests can be scheduled whenever we wish.
Example IT Recovery Strategy
The myCompany Data Center Disaster Recovery Strategy provides general
guidance for critical system recovery after an incident renders the myCompany
Data Center unusable. A recovery site has been prepared at our Shangrila data
center that is about a one-hour commute from the existing work site. This recovery
site is on a separate power grid and telecommunication connection. This site is
also furnished and equipped to accommodate 75 office workers.
To facilitate this recovery, myCompany has located all test servers (and
adequate disk storage) at the backup Data Center and keeps production IT
equipment at myCompany. The underlying assumption is that the test system
hardware is an adequate substitute for the critical systems (CPU & RAM), and that
each critical system has a corresponding set of test servers. In this way,
myCompany has an operational hot site that is proven to work (idle sites tend to
develop unnoticed problems).
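The underlying assumption is worth checking rather than trusting. The sketch below compares each critical system's production server with its test counterpart on CPU and RAM and reports any system that has no test server or an undersized one. The class, function, and server names are hypothetical, and a real inventory would probably compare more than two attributes.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cpu_cores: int
    ram_gb: int


def inadequate_substitutes(
    production: dict[str, Server], test: dict[str, Server]
) -> list[str]:
    """List critical systems whose test server is missing or too small."""
    problems = []
    for system, prod in production.items():
        standby = test.get(system)
        if standby is None:
            problems.append(f"{system}: no test server")
        elif standby.cpu_cores < prod.cpu_cores or standby.ram_gb < prod.ram_gb:
            problems.append(f"{system}: test server {standby.name} is undersized")
    return problems


# Hypothetical inventory, keyed by critical system.
production = {
    "payroll": Server("prod-pay", 8, 64),
    "orders": Server("prod-ord", 16, 128),
    "email": Server("prod-mail", 4, 32),
}
test = {
    "payroll": Server("test-pay", 8, 64),
    "orders": Server("test-ord", 8, 128),
}

for problem in inadequate_substitutes(production, test):
    print(problem)
```

Running a check like this whenever production hardware is upgraded keeps the assumption from going stale without anyone noticing.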
Under this approach, servers in the backup Data Center are already loaded
with the necessary version and patch level of the operating system. During a
disaster, the test system is offloaded to tape or removable media. The equipment
is then loaded with the current production version of the application (which
should already be present on its local disk drives).
All critical data is mirrored between the operational data center and the
backup site. The estimated recovery time is in seconds with minimal data loss.
Reasons for this selection are:
➤ Quick recovery at the lowest cost.
➤ The recovery site is under myCompany control.
➤ Segregating test servers facilitates testing of DR plans.
➤ Keeps production data in myCompany for easier backups.
WORK AREA RECOVERY STRATEGY
The general term for recovering damaged offices is “work area recovery.” A common
disaster recovery error is to focus solely on the IT recovery without providing a
place from which to access it.
On September 14, 2008, the remnants of Hurricane Ike swept through the Ohio
Valley with sustained winds equal to a Category 1 hurricane. This resulted in
widespread power outages that lasted for many days. The author worked for an
organization whose generator promptly roared to life and kept the data center in
operation even though none of the offices were wired for backup power. The
people arrived at work to hear the generator running but no lights inside or
power for desktop PCs. No one had any place to work, so they were all sent
home. After several (expensive) days of running the generator, power was
restored to that portion of the city.
Just like your IT recovery strategy, the work area recovery strategy must execute
in a prepared site. It does not take that long to run electrical connections down
the middle of a conference center, string some network wiring, and erect work
tables and chairs. The longest delay is the time required to add adequate bandwidth
to the outside world (which includes the data center recovery site). Without
this external connection of adequate size, the recovery is hobbled or delayed. If
the disaster covers a wide area, it may be weeks before the telecom connection
is ready.
In a crisis, only the personnel essential to operating the critical IT systems,
required to answer customer calls, or necessary to fulfill legal requirements must
be recovered immediately. The rest of the offices can be recovered over time.
Employees equipped with Virtual Private Network (VPN) authentication may
connect to the data center through secure connections. Scarce work area can be
maximized by adding a second shift for staff who do not directly work with
customers (such as the Accounting department).
One option for recovering offices is through the use of specially equipped
office trailers. These units come with work surfaces, chairs, generators for creating
their own electrical power, a telephone switch, and a satellite connection to
bypass downed lines. When on site, these trailers are typically parked in the
company parking lot to use any surviving services—and then provide the rest.
Beware of counting on hotels as large-scale work area recovery sites for
offices. Like everyone else, they watch their costs and do not want a monthly bill
for data capabilities far in excess of what is normally used. A T-1 provides sufficient
bandwidth for a hotel and its guests but not enough to support 100 office workers
filling the conference rooms. Also the hotel switchboard will lack capacity for
busy offices.
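A quick calculation shows why. A T-1 has a nominal line rate of 1.544 Mbit/s; the sketch below divides that evenly among 100 workers. It ignores hotel guests and protocol overhead, so the real share would be smaller still. The function name is hypothetical.

```python
T1_KBPS = 1_544  # nominal T-1 line rate in kilobits per second


def per_worker_kbps(link_kbps: float, workers: int) -> float:
    """Average share of the link if every worker is active at once."""
    return link_kbps / workers


share = per_worker_kbps(T1_KBPS, workers=100)
print(f"{share:.2f} kbit/s per worker")  # 15.44 kbit/s per worker
```

That is well below dial-up modem speed for each person, which is why the outside connection should be sized before a site is counted on.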
Setting up a recovery site requires:
➤ A location far enough away that it is not affected by the same disaster.
➤ Chairs to sit on and tables for work surfaces.
➤ Locating together any business teams that frequently interact or exchange
documents during business. Otherwise, there may be multiple work area
recovery locations.
➤ Desktop equipment, such as computers and telephones. Loading the
company software image on PCs takes time. Also, people will miss their
personal data.
➤ Alternative communications, such as fax and modem.
➤ Historical documents that must be checked during the course of business.
➤ Preprinted forms required for legal or other business reasons.
Refer to Figure 4-2 for a list of work area disaster recovery solutions; these fall
into several general categories from slowest to fastest.
Example Work Area Recovery Strategy
myCompany’s Work Area Recovery strategy is to use the company’s IT training
rooms adjacent to the backup data center as temporary offices in an emergency.
These classrooms are equipped with workstations on every table. A telephone
switch is online and wires are run to each workstation. In an emergency, telephones
can be quickly installed.
This recovery site, approximately 60 minutes of travel from the main office,
is used as an off-site conference center and IT training facility. It can accommodate
enough of the critical office workers to keep the company operating until permanent
facilities have been prepared.
IT staff not involved with the IT recovery plan will work from home via VPN.
Executive staff is to meet in the Sleepy-Head motel conference center until a
local office is ready.
The Work Area Recovery Manager is also responsible for the ongoing
maintenance of the office recovery site. The recovery site must be tested semi-
annually to ensure that the network and telecom connections are functional and
available when needed.
Cold Site
   Pros: Inexpensive. This can be an empty warehouse. Hotel conference
   rooms may also be booked.
   Cons: Long recovery time (a week or more); low data speed and
   limited telecom capabilities will hinder operations

Hot Site
   Pros: Recovery in a day.
   Cons: As expensive as a second call center; pay on declaration;
   nearest available site may be hundreds of miles away
   Good for: Companies with a single office site and a short recovery
   time

Hot Site – Trailers in your company parking lot
   Pros: Recovery in a day. Everyone sleeps at home.
   Cons: Expensive; local units may be taken in a regional disaster;
   this is a subscription service
   Good for: Companies with a strong desire to recover as close to
   home as possible

FIGURE 4-2: Work area disaster recovery solutions.
PANDEMIC STRATEGY
The goal of the Pandemic Emergency Plan is for the company to continue operations
at a level that permits it to remain in business. This requires steps to prevent the
spread of disease into and within the organization. Actions to minimize the
spread of infection represent an additional cost for the company which must be
borne until the danger passes. Unlike other contingency plans, a Pandemic
Recovery Plan will be in operation for 18 to 24 months.
In 2003, in response to a local outbreak of severe acute respiratory syndrome
(SARS), the World Health Organization urged postponement of nonessential
travel to Toronto. Some conferences scheduled for the city were canceled and
hotel occupancy rates sank to half of normal. Although reported SARS cases were
few, the financial impact was significant.
Pandemic emergency steps require different strategies for major stakeholders:
➤ Employees
a. Employees who can work from home should use a VPN connection to
minimize the amount of time that they spend in the office.
b. The company sick policy must be relaxed so that sick people are not forced
to come into the workplace. Anyone who is sick is encouraged to stay home.
They should also stay home if they have a sick family member.
c. Areas used by company workers must be periodically cleaned thoroughly to
address any infection brought in from the outside.
d. Employees who travel into areas with a high rate of pandemic infection
should work from home for the first week of their return.
➤ Customers
a. Areas where customers enter the facility must be cleaned thoroughly to
address any infection brought in from the outside.
b. Provide complimentary hand sanitation at all store entrances.
c. It may be necessary to bring in individual sanitation supplies for an
extended period of time.
d. All returned products should be sanitized before examination.
➤ Vendors
a. Use video conferencing and other electronic tools to meet with vendors.
b. Carefully select meeting places with a low incidence of pandemic.
Example Pandemic Strategy
myCompany’s Pandemic Emergency Plan is designed to contain the potential
spread of illness within the company. It is initiated when the public health
authorities of the state where the headquarters building is located declare a
pandemic. The limits that the company’s sick leave policy places on the number
of sick days for each employee are suspended. Employees are encouraged to stay
home with sick family members.
All company areas where employees are in close physical contact with
customers or vendors must be thoroughly sanitized every day. Each employee is
provided with personal sanitation gloves and face masks.
Any employee returning from a business trip will work from home via VPN
for seven days before entering the office.
BUSINESS CONTINUITY STRATEGY
A business continuity strategy is successful when your customers never notice
an interruption in service. It is a proactive plan to identify and prevent
problems from occurring. To implement this in a company, begin with the list
of critical processes identified by the BIA. Develop a process map for each vital
process that shows each step along the way. Identify areas of risk, such as
bottlenecks at a single person or device, limited resources, or legal
compliance issues. Mitigate each point of risk by implementing standby
equipment, trained backup personnel, etc.
A severe blizzard in Minnesota is not visible to a customer in Arizona who is
waiting on a rush order. An instant failover for IT systems is essential for online
companies, banking, hospitals, vital government service offices, public utilities,
etc. The dollar loss of customer impact is so high that it justifies the high cost.
Other companies regret the interruption but are not so real-time with their
customers. As a result, they have several days to recover with minimal customer
interruption. An example might be a health spa where a one-week interruption in
service is overshadowed by the strong customer relationship.
A business continuity strategy deals with your company’s vital processes. It
might be anything whose absence disrupts the normal flow of work. For example,
many companies have eliminated their company telephone operator and
replaced that person with an automated telephone directory. Key in the person’s
name, and you are connected. However, if that device fails, the rest of the company
is still creating and shipping products to the customer, but no one can call into the
facility. A Business Continuity Plan provides information on how to recover that
device or quickly replace it.
The strategy is to begin with your list of critical processes. Assign someone to
develop a process map of each to identify potential single points of failure or
places where the flow of products and services is constrained. Then reduce the
likelihood of failure by providing duplicate (or backup) equipment for
single-threaded devices, trained backup personnel, etc.
Therefore, a business continuity strategy deals with processes. It might include:
➤ Identification of vital processes (this list is updated quarterly).
➤ Drafting a process map to examine each step for single threading or weakness,
such as unstable equipment or operators.
➤ Identification of steps to eliminate (simple processes are easiest to recover).
➤ Drafting a risk assessment for the process.
➤ Drafting an end-to-end recovery plan for each remaining step in the process.
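A process map can be as simple as a list of steps with a count of the people and devices able to perform each one. The sketch below flags any step that is single threaded. The class, field names, and sample process are hypothetical, and a device count of zero is taken here to mean a purely manual step.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    people_trained: int  # staff able to perform the step
    devices: int         # devices able to perform it (0 = manual step)


def single_threaded_steps(process_map: list[Step]) -> list[str]:
    """Return the steps that depend on exactly one person or one device."""
    return [
        step.name
        for step in process_map
        if step.people_trained == 1 or step.devices == 1
    ]


# Hypothetical order-entry process.
order_entry = [
    Step("Route incoming call", people_trained=6, devices=1),
    Step("Check customer credit", people_trained=1, devices=0),
    Step("Pick and pack order", people_trained=12, devices=0),
]

print(single_threaded_steps(order_entry))
# ['Route incoming call', 'Check customer credit']
```

Each flagged step is a candidate for standby equipment or a trained backup person.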
CONCLUSION
Selecting a recovery strategy is an important step. Its boundaries are determined
by how quickly the company must recover in order to survive. Another factor is
the amount of data it can afford to lose. When the risk from natural disasters is
evaluated, a recovery strategy can be created.
The strategy selected will drive the cost of the company’s recovery plans.
Therefore it must be based on the data gathered by the Business Impact Analysis.
This focuses efforts on the “vital few” processes. Each strategy selected must be
approved by the project’s executive sponsor. Otherwise, most work will be lost
when a revised strategy is issued.
A separate strategy must be developed for each plan. The primary plan is for
recovering the data center. Next, the strategy for the work area recovery must be
based on when the data center will be ready for use. The pandemic plan is different
in that the crisis comes on slowly, eventually hits a peak and then gradually
fades away.
In the end it comes down to how much security the company can afford.
Where possible, try to meet disaster recovery requirements with existing assets
(such as using “test” IT servers to recover the data center) to reduce the
program’s ongoing cost.
C H A P T E R 6
WRITING THE PLAN
Getting It Down on Paper
No one plans to fail; they just simply fail to plan.
—Disaster Recovery Journal
INTRODUCTION
Writing a plan is not difficult. It is as simple as telling a story to someone. It is the
story of what to do. It addresses the basic concepts of who, what, where, when,
why, and how of a process. Although you cannot predict exactly what will happen
where, upon reflection, you can identify the basic steps that must be done in
any emergency.
Throughout your plan writing process, keep in mind that emergencies affect
people in different ways. Some will panic, others will sit and wait for the expert
(but many are really waiting for someone else to take responsibility for any
recovery errors), and some will make excuses and leave. The goal of your plan is
to minimize this chaos by providing some direction to the people onsite so they
can get started on containment and recovery. Once team members are in motion,
the chaos lessens and their professional training will kick in.
It is impossible to write a specific recovery plan for every possible situation.
Instead, the plans provide a set of guidelines to reduce the chaos at the point of
incident and to position the company for a recovery once adequate facts become
available. Whether you are rebuilding a data center due to a fire or due to a roof
collapse, it is the same set of steps.
Business continuity plans come in many forms according to local requirements
and the preferences of the person writing them. The CD contains four separate
plans. Each plan is executed by a different team, based on the circumstances of
the incident. They are:
➤ Administrative Plan. Contains reference information common to all plans,
such as vendor call lists, recovery strategy, risk assessment, etc.
➤ Technical Recovery Plans. Many independent plans, each containing the
step-by-step actions to recover a specific process or IT system. These plans
assume that the process must be rebuilt from nothing and are often referred to
when addressing local emergencies.
➤ Work Area Recovery Plan. The details for relocating the critical company
office workers to another site.
➤ Pandemic Management Plan. Actions the company will take to minimize the
impact of a pandemic. Unlike a data center disaster, whose recovery can be
completed in a few hours or days, a pandemic can easily run for 18 months or more.
The essential elements of a business continuity plan are that it is:
➤ Flexible to accommodate a variety of challenges.
➤ Understandable to whoever may read it (assuming they know the technology).
➤ Testable to ensure that it completely addresses interfaces to other processes
(in and out).
Writing your plan is simply documenting before the fact what should be done
when a disaster strikes. The basic steps to follow are:
1. Lay the Groundwork. Here the basic decisions are made about who will execute
the plan, what processes need a plan, the format of the plan, etc.
2. Develop Departmental Plans. Departments are the basic structure around
which organizations are built; they are a good place to start developing
your plans.
3. Combine your Departmental Plans into an Overall Corporate Plan. Here
you check to ensure that departmental recovery activities do not conflict with
one another and that any interdependencies are considered.
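One conflict that is easy to overlook in step 3 is two departments counting on the same recovery resource. The sketch below gathers the resources each departmental plan claims and reports any claimed more than once. The department names, resource names, and function name are hypothetical, and a shared resource is not always a problem; it is simply something to review.

```python
from collections import defaultdict


def resource_conflicts(plans: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each resource claimed by more than one department to its claimants."""
    claims = defaultdict(list)
    for department, resources in plans.items():
        for resource in resources:
            claims[resource].append(department)
    return {r: depts for r, depts in claims.items() if len(depts) > 1}


# Hypothetical departmental plans and the resources each one relies on.
plans = {
    "Accounting": ["Training room A", "Generator 1"],
    "Customer Service": ["Training room A"],
    "IT": ["Generator 1", "Test servers"],
}

print(resource_conflicts(plans))
# {'Training room A': ['Accounting', 'Customer Service'],
#  'Generator 1': ['Accounting', 'IT']}
```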
LAY THE GROUNDWORK
Your first step in developing continuity plans is to establish a standard format.
This will give at least the first few pages of each plan the same “look and feel.”
When drafting your plan, consider the following:
➤ Who will execute it? If you are the local expert on that process, then why do
you need a plan? The odds are you don’t. In a crisis, you would know what to
do. But if you like to take days off, occasionally get sick, or even take a vacation,
then whoever is on the spot when the emergency occurs must be able to stand
in for you and address the problem. A plan must consider who may be called
on in an emergency if the expert is not available. Another consideration is that
if you are the manager over an area, and you want to be able to recover a
process in case the “expert” is promoted, transferred, quits, or is discharged, a
written plan is essential. You should especially look for highly stable processes
that never break and that no one has experience working on. They must have a
plan on file since whoever worked on them may have already left the company.
➤ How obvious is the problem? Some problems, like magnetic damage to backup
tapes, are invisible until you try to read them. Other problems, such as the
entire building shaking in a massive earthquake, are easier to recognize. If a
problem is hard to detect, then step-by-step troubleshooting instructions
are necessary.
➤ How much warning will there be? Is a severe thunderstorm in your area often
a prelude to a power outage? Will the weather forecast indicate a blizzard is
imminent? However, if a local building contractor cuts your connection to the
telephone company’s central office, there is no warning of an impending
problem at all. Emergencies that provide a warning, such as a weather bulletin,
often trigger automatic containment actions. This might be to purchase extra
flashlight batteries, install sandbags, or to have essential technical personnel
pitch camp within the building in case they are needed.
➤ How long must they continue running with this plan before help arrives?
Should they have enough information to contain the problem for 10 minutes
or 2 hours?
➤ How soon must the process be restored before the company suffers serious
damage? This is called the recovery time objective for this process.
➤ Are there any manual workaround actions that can be used until the process
is restored? For example, if your payroll computer system dies at the very
worst moment, can you write 40-hour paychecks for everyone? This makes a
mess for the Accounting department to clean up later, but in an organized
labor facility, the workers’ contract may allow them to walk off the job if their
paychecks are late.
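The questions above can double as a checklist for each plan. The sketch below records the answers for one process and lists whichever questions are still open. The class and field names are hypothetical, and the 24-hour recovery time objective for payroll is an invented value used only to show the idea.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PlanGroundwork:
    process: str
    executed_by: Optional[str] = None        # who acts if the expert is away
    how_detected: Optional[str] = None       # how obvious the problem is
    warning: Optional[str] = None            # how much warning to expect
    hold_time_minutes: Optional[int] = None  # how long until help arrives
    rto_hours: Optional[float] = None        # recovery time objective
    manual_workaround: Optional[str] = None  # stopgap until restored


def unanswered(plan: PlanGroundwork) -> list[str]:
    """Names of the groundwork questions that still have no answer."""
    return [f.name for f in fields(plan) if getattr(plan, f.name) is None]


payroll = PlanGroundwork(
    process="Payroll",
    rto_hours=24,  # invented value
    manual_workaround="Write 40-hour paychecks for everyone",
)

print(unanswered(payroll))
# ['executed_by', 'how_detected', 'warning', 'hold_time_minutes']
```

A plan with open questions is not ready to hand to whoever is on the spot when the emergency occurs.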
What Needs Its Own Plan?
Is the answer anything that could break? Some processes are like links in a chain,
where the failure of any single item brings the entire process to a stop, such as a
data network. In this case, any number of items along a chain of equipment could
be at fault. The plan would step you through the basic fault-location steps and tell
you what to do to address the problem you find. Some problems are isolated to
one or a few devices, such as a Web server failure. In this case, you would focus all
efforts on the server and its connections to the network.
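For a chain-like process, the fault-location steps amount to testing each link in order until one fails. The sketch below shows that loop. The link names are hypothetical, and the health check is a stand-in dictionary; a real plan would name the actual test for each device.

```python
from typing import Callable, Optional


def first_failed_link(
    chain: list[str], is_working: Callable[[str], bool]
) -> Optional[str]:
    """Check each link in order; return the first one that fails, if any."""
    for link in chain:
        if not is_working(link):
            return link
    return None


# Hypothetical chain, listed from the user's desk outward.
chain = ["desktop network card", "floor switch", "core router", "firewall"]

# Stand-in for real health checks.
status = {
    "desktop network card": True,
    "floor switch": True,
    "core router": False,
    "firewall": True,
}

print(first_failed_link(chain, lambda link: status[link]))  # core router
```

Links beyond the first failure are not tested, so after repairing it the check should be run again from the start.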
You must have a plan for every critical business function identified by the
Business Impact Analysis (BIA). This includes manual processes and every piece
of critical equipment that supports the facility. For each critical business function,
explain the steps necessary to restore the minimal acceptable level of service. This
level of service might be achieved by performing machine functions with manual
labor. It might be achieved by shifting the work to another company site or even
Copyright © 2011. AMACOM. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.
EBSCO : eBook Collection (EBSCOhost) – printed on 1/11/2018 8:16 PM via AMERICAN PUBLIC UNIV SYSTEM
AN: 349248 ; Wallace, Michael, Webber, Larry.; The Disaster Recovery Handbook : A Step-by-Step Plan to
Ensure Business Continuity and Protect Vital Operations, Facilities, and Assets
Account: s7348467.main.ehost
paying a competitor to machine parts for you. The goal is to keep your company
going. Optional plans may be written to support those functions essential to your
own department (and peace of mind), but that are not essential to the facility’s
critical business functions.
Consider how a plan will be used when you write it. Your goal is not a single
large soups-to-nuts document. Usually a department has an overall plan for
recovering its main processes or machinery and then specific action plans for
individual problems. For example, Vital Records may have detailed plans for
recovering documents based on the media on which they are stored. This
information should be readily available to the department. But some specific
action should be kept on laminated cards and provided to the security guards (for
after-hours emergency action) and posted on the walls of the rooms it affects.
Examples of these laminated pages might be immediate actions to take for a water
leak, for an electrical outage, for a fire, etc. See Form 6-1, Sample Business
Continuity Action Plan (on the CD), for an example.
Another example would be an electrical outage in the computer room. The
overall plan will contain information on calling the power company and who to
call for emergency generators, etc. But, a notice on the wall of the computer room
will provide specific power shedding instructions and indicate immediate steps to
take to monitor and potentially reduce the load on the UPS.
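To see why power-shedding instructions matter, here is a rough Python sketch of how shedding load stretches UPS battery time. The battery capacity, the equipment names, and the wattages are invented for illustration, and the simple watt-hours divided by watts formula is an assumption; real batteries do not discharge linearly, so treat the result as a planning estimate only.

```python
def ups_runtime_minutes(battery_wh: float, load_w: float) -> float:
    """Rough runtime estimate: stored energy divided by the load drawn."""
    if load_w <= 0:
        raise ValueError("load must be positive")
    return battery_wh / load_w * 60


# Hypothetical equipment loads in watts.
loads_w = {"database server": 800, "mail server": 500,
           "test server": 400, "print server": 300}
nonessential = {"test server", "print server"}  # assumed shedding list
battery_wh = 2000  # assumed usable battery capacity in watt-hours

full_runtime = ups_runtime_minutes(battery_wh, sum(loads_w.values()))
essential_load = sum(w for name, w in loads_w.items() if name not in nonessential)
shed_runtime = ups_runtime_minutes(battery_wh, essential_load)

print(f"All equipment on: {full_runtime:.0f} minutes")
print(f"After shedding:   {shed_runtime:.0f} minutes")
```

With these made-up numbers, turning off the two nonessential devices extends the estimate from about an hour to about an hour and a half, which is the kind of figure the wall notice could state.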
Word Processing Guidelines
Your company may have some specific guidelines in place for important documents
like this. If not, consider these guidelines for the plans.
PAGE LAYOUT
➤ Set your word processor to default to 12 point, Arial font (don’t make me
search for a pair of glasses in the midst of a crisis!).
➤ Set the page footers to include a page number in the center and the current
date in the lower left-hand corner. This date will help to indicate which copy of the
plan is the latest. The footers should also include the phrase “Company
Confidential” on every page.
➤ Each document should read from major topic to minor topic—or broad view
to narrow view. The beginning of the document deals with actions that would
affect the entire process and, as you move further into the document, more
specific issues would be addressed.
DOCUMENT FORMAT
➤ On the first page, include a brief narrative (one paragraph) of the business
function of the equipment that this particular plan supports.
➤ The name of the primary support person.
➤ The name of the secondary support person.
➤ The name of the primary customer for this process (Accounting, Manufacturing,
Sales, etc.). It is better that you tell them what is wrong than they find out there
is a problem the hard way.
➤ Immediate action steps to contain the problem.
➤ Known manual workaround steps to maintain minimal service.
➤ In the case of telecommunications, data networks, or data processing services
outages, include the names of other technical employees in sister companies
with expertise in this area who can be called onsite in a crisis.
DEPARTMENTAL PLANS
A departmental recovery plan has several components. The main component is
the plan itself, a narrative that explains the assets involved, the threats being
addressed, the mitigation steps taken, and what to do in the event of a disaster. This
sounds simple enough, but such a plan could easily fill notebooks. Instead, base your
plans on a primary scenario with specific threats addressed in attached appendices.
In addition, more abbreviated instructions for security guards, computer operators,
etc., should be included as part of the departmental recovery plan.
The main part of the plan has three major components:
1. Immediate Actions. Steps that anyone can take to contain the damage (similar
to applying first aid to an injured person). This involves simple tasks, such as
shutting off the water main to stop a leak, evacuating people if there is a toxic
spill, or opening the computer room doors if the air conditioning fails. Once
people are safe, an early action in “Immediate Action” is to alert the appropriate
people for help. It takes time for them to drive to the disaster, so the earlier you
call, the sooner they will arrive.
2. Detailed Containment Actions. Steps to reduce the spread or depth of damage.
What else can be done until the “experts” get there? What actions should the
“experts” take after they arrive to stop the damage from spreading?
3. Recovery Actions. Steps to return the process to a minimal level of service.
This is the third component of every plan, and it is the part that most people
think about when considering disaster recovery planning.
There are four inputs into building your plan. First, begin with the Critical
Process Impact Matrix you developed in your BIA (Form 3-3). This lists the critical
processes and the time of day that they are essential. This list was further broken
down in the Critical Process Breakdown matrix (Form 3-4). These two tools can
provide the essential information for building your plans. Add to these lists your
risk assessment and your process restoration priority list. With these items, you
have everything necessary to write your plans. Write your primary plan for the
worst case scenario—complete replacement of the process.
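As a rough illustration of how these inputs combine, the Python sketch below models a tiny Critical Process Impact Matrix and a restoration priority list. The process names, essential hours, and priorities are hypothetical, and the field layout is an assumption rather than the actual layout of Form 3-3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalProcess:
    name: str
    essential_hours: range  # hours of the day (0-23) when the process is essential
    priority: int           # restoration priority, 1 = restore first


def processes_to_restore(processes, hour):
    """Return the processes essential at the given hour, highest priority first."""
    active = [p for p in processes if hour in p.essential_hours]
    return [p.name for p in sorted(active, key=lambda p: p.priority)]


# Hypothetical entries for illustration only.
processes = [
    CriticalProcess("Payroll", range(8, 17), 3),
    CriticalProcess("Order entry", range(6, 22), 1),
    CriticalProcess("Shipping", range(0, 24), 2),
]

print(processes_to_restore(processes, 3))   # overnight
print(processes_to_restore(processes, 10))  # mid-morning
```

The same outage calls for a different recovery order depending on when it strikes, which is why the time-of-day column belongs in the plan.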
In many cases, the damage is caused by multiple threats, but their associated
recovery steps are the same. Therefore, a plan that details what to do in one disaster
situation is probably applicable to most other situations. For example, the loss of
a critical computer server due to a fire, physical sabotage, or a broken water pipe
would have essentially the same recovery steps. Separate plans are not necessary,
although the mitigation steps for each threat in the example would be
quite different.
Begin by drafting your plan to address this central situation. Add to the central
plan an appendix for any other specific threats or recovery actions you think are
appropriate. All together, this is your department’s (or critical processes’) disaster
recovery plan and should be available in your office, with a printed copy at your
home and your assistant’s home. In addition, the plan administrator must maintain
both a printed copy and an electronic copy. (Recovery plans contain information
useful to people with bad intentions, so keep them in a secure location.)
Looking at your department’s main plan, you still have a document that is too
unwieldy to use in the first few moments of the crisis. Remember, emergencies are
characterized by chaos. Some people are prone to act, and others are prone to run
in circles. You need to have something quick and easy to follow in the hands of
those who will act. These terse instructions must detail basic disaster steps to
safeguard people and to contain the damage. They are usually laminated and
posted on the wall. Include them as an appendix to your plan identified with their
own tab.
As you write your plan, consider the following:
➤ Who Will Execute This Plan? A minimum of three people must be able to
execute a plan: the primary support person, the backup support person, and
the supervisor. Usually, the weak link is the supervisor. If that person cannot
understand the plan, then it is not sufficiently detailed or it lacks clarity.
Most facilities operate during extended first-shift hours, from Monday
through Friday. However, if this plan is for a major grocery store, it might be
open 24 hours a day, 7 days a week. Problems occur in their own good time. If
they occur during normal working hours, and your key people are already
onsite, then the emergency plan is to summon these key people to resolve the
problem. Referring to the written plan will also speed recovery, since time is
not wasted identifying initial actions.
However, if the problem arises at 3:00 AM on a Sunday and is discovered
by the security guard, he needs to know the few essential containment
actions to take until help arrives. Because this is the worst case scenario—
someone unfamiliar with an area tasked to contain a problem—this is
the level of detail to which you must write. One of the first action steps
is always to notify the appropriate person of the problem. This gets help in
motion. Then, the person on the spot works on containment until that
help arrives.
This approach works well with crises that are common knowledge or are
basically understood by the general population, such as the sounding of fire
alarms, burst water pipes, or power outages. But for some of the technical
areas, such as data processing, writing such a level of detail would make a
volume of instructions so thick that the computer room would have long
since burned down while the containment team struggled through the text. In
those cases, the level of detail should be sufficient for someone familiar with
the technology, but unfamiliar with this particular piece of equipment, to
work through the steps. In addition, specific containment actions should be
posted on the wall so that the vital first few minutes are not wasted looking for
a misfiled disaster plan book.
➤ How Obvious Is the Problem? Standing in an office with water lapping over
your shoe tops is a sure sign of a problem. Smoke pouring out of a room is
likewise a sign that immediate action is needed. When drafting your plan,
consider how obvious the problem might be to the typical person. Obvious
problems are usually of the on/off type, such as electrical service, air
conditioning, machine-works-or-it-doesn’t type of situations.
Problems that are difficult to pinpoint require step-by-step troubleshooting
instructions. In these cases, something stops functioning, but the cause isn’t
obvious. In these instances, the call for help goes out first, but if there is
anything that the person on the spot can do, then he or she should have
detailed instructions on how to do it. For example, if a critical piece of shop
floor machinery stops working, yet everything else in the factory is working
fine, your immediate action troubleshooting steps would include tracing the
data communications line back to the controller and back to the computer
room to look for a break in the line. The plan should identify all the system
interdependencies so they can be checked.
➤ How Much Warning Will They Have Before the Problem Erupts? Most
weather-related problems are forecast by local news services. Flood warnings,
severe thunderstorm warnings, and tornado watches are all forewarnings of
problems. If your facility is susceptible to problems from these causes, then
you can prepare for the problem before it strikes. However, the first indication
of many problems does not appear until the problem hits, such as a vital
machine that stops working or the loss of electrical power.
➤ How Long Must They Continue Running with This Plan Before Help Arrives?
Begin with immediate action steps, much like first aid. There are always
some basic actions that can be taken to contain the damage and prepare for
the recovery once the “experts” appear. Detail these in your plan.
Some plans have a short duration. For example, in the case of a computer
room power outage, only so much electrical power is available in the UPS
before the batteries run dry. By turning off nonessential equipment, this battery
time can be extended in the hopes that power will be restored soon. This
assumes the person standing in the computer room knows which equipment
is not essential or has a way to identify these devices. In this case, the time
horizon for the containment plan is the maximum time that battery power
remains available, or until the computer operations manager arrives to begin
shutting down noncritical servers.
A different example is in the case of a broken water pipe. Shutting the
water main to that portion of the building is the immediate action to stop the
damage, at which time you switch over to containment efforts to keep the
water from spreading and to prevent the growth of mold. Your immediate action steps
would list the facility maintenance emergency telephone number or tell the
person the location of the water shutoff valve.
In any case, if people in the affected room or adjacent rooms are in danger,
the first step is always to notify and evacuate them. Safeguarding human life
is always the number one immediate action step!
➤ Manual Workaround. Most automated processes have a manual workaround
plan. Unfortunately, this plan is rarely written down. If you know that one
exists, put it on paper immediately. If you don’t know about this, ask the
process owner. Manual workaround processes may not have the same quality,
they may require many more workers, and they may require substantial
overtime work just to keep up, but they may quickly restore your process to a
minimal level of operation (the least that a disaster plan should provide).
Manual workarounds may allow you to go directly to the recovery phase with
minimal containment actions.
Some manual workaround processes for computer systems will require a
data resynchronization action when the computer system returns to service.
In those cases, work logs must be maintained of the items processed manually
so that the data files can return to accuracy.
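The work-log idea in the last point can be sketched in Python. Everything here is hypothetical: the entry fields, the inventory example, and the rule that each manual shipment reduces stock are assumptions chosen only to show how a log kept during the outage is replayed once the system returns.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ManualEntry:
    when: datetime  # when the item was processed by hand
    item: str       # what was shipped
    quantity: int   # how many units left the shelf


def resynchronize(inventory, work_log):
    """Replay the manual work log, oldest entry first, onto a copy of the data."""
    updated = dict(inventory)
    for entry in sorted(work_log, key=lambda e: e.when):
        updated[entry.item] = updated.get(entry.item, 0) - entry.quantity
    return updated


# Hypothetical stock counts as of the moment the system went down.
inventory = {"widget": 100, "gadget": 40}

# Hypothetical shipments written up by hand during the outage.
work_log = [
    ManualEntry(datetime(2018, 1, 11, 10, 30), "widget", 20),
    ManualEntry(datetime(2018, 1, 11, 9, 15), "widget", 10),
    ManualEntry(datetime(2018, 1, 11, 9, 45), "gadget", 5),
]

print(resynchronize(inventory, work_log))
```

Working on a copy leaves the pre-outage figures untouched, so the replay can be checked against the paper log before the data files are updated.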
I Still Don’t Know What to Write!
Write your plan in the same way as if you were explaining it to someone standing
in front of you. Overall, you start with the overview and then drill down to the
details. For example, if you were writing a plan to recover the e-mail server, you
would state what the system does, its major components, and any information
about them. Then you would have a section explaining each major component
in detail.
Imagine that you are standing in a room when an emergency occurs. Also
imagine several other people in the room who work for you and will follow your
directions. Now imagine that you can speak, but cannot move or point. What
would you tell them to do? Where are your emergency containment materials?
Whom should they call, and what should they say? Write your plan in the same
conversational tone that you use when telling someone what to do.
Include pictures and drawings in your plan (for example, floor plans showing
the location of critical devices in a building). Digital cameras can be used to create
pictures that can be easily imported into a word-processing program.
It is also very important to include references to the names of the service
companies that have support contracts for your equipment. In the back of the
notebook, include a copy of the vendor contact list, so they know whom to call
with what information (such as the contract identification number).
So, the plan for your department will include:
1. Immediate Actions.
a. Whom to call right away.
b. Appendices: specific threats.
◆ Loss of electricity.
◆ Loss of telephone.
◆ Loss of heating, air conditioning, and humidity control.
◆ Severe weather and low employee attendance (perhaps due to a blizzard
or flood; how can you maintain minimal production?).
2. Detailed Containment Actions.
a. What to do to reduce further damage.
b. First things the recovery team does once onsite.
3. Recovery Actions.
a. Basic actions.
b. Critical functions.
c. Restoration priorities.
4. Foundation Documents.
a. Asset List.
b. Risk Assessment.
c. Critical Process Impact Matrix.
d. Critical Process Breakdown Matrix.
5. Employee Recall List.
6. Vendor List.
7. Manual Workaround Processes.
8. Relocating Operations.
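If you keep your plan skeleton electronically, the outline above can be held as a simple data structure and printed as a starting template. The Python sketch below is one possible approach, not part of the book's forms; the names `PLAN_OUTLINE` and `render_outline` are made up for illustration, and the specific-threat bullets are left out for brevity.

```python
# Section titles and sub-items copied from the outline above.
PLAN_OUTLINE = [
    ("Immediate Actions", ["Whom to call right away",
                           "Appendices: specific threats"]),
    ("Detailed Containment Actions", ["What to do to reduce further damage",
                                      "First things the recovery team does once onsite"]),
    ("Recovery Actions", ["Basic actions", "Critical functions",
                          "Restoration priorities"]),
    ("Foundation Documents", ["Asset List", "Risk Assessment",
                              "Critical Process Impact Matrix",
                              "Critical Process Breakdown Matrix"]),
    ("Employee Recall List", []),
    ("Vendor List", []),
    ("Manual Workaround Processes", []),
    ("Relocating Operations", []),
]


def render_outline(outline):
    """Return the outline as numbered sections with lettered sub-items."""
    lines = []
    for number, (section, items) in enumerate(outline, start=1):
        lines.append(f"{number}. {section}.")
        for offset, item in enumerate(items):
            letter = chr(ord("a") + offset)
            lines.append(f"   {letter}. {item}.")
    return "\n".join(lines)


print(render_outline(PLAN_OUTLINE))
```

Each department can then fill in its own specifics under the same headings, which keeps every departmental plan in one format.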
How Do I Know When to Stop Writing?
Your primary plan only needs to contain enough explanation for someone to
restore service to minimal acceptable levels. Once you have established that, your
normal approach for handling projects can kick in. Some plans only cover the first
48 hours. As an alternative to setting a time guideline, link it to the function the
plan is intended to protect and then it takes however long it takes.
Provide as much detail as necessary to explain to someone what they need to
do. For the Immediate Action pages, assume they are unfamiliar with the details
of the function and keep your instructions simple and to the point. For your primary
plan, assume they are familiar with the function and understand basically how
it works.
To be useful, your plan must be clear to others and include all pertinent
details. The best way to know if your plan is sufficient is to ask someone to read it.
Hand it to someone and then leave the room. See if they can understand and
would be able to act on it. What is clear as day to you may be clear as mud to
someone else. Then test it again without the involvement of your key staff members.
RECOVERY PLANNING CONSIDERATIONS
Prompt recovery is important to a company. It is also important to you because if
the company has a hard time recovering, the owners may simply close your office
and absorb the loss. For the sake of yourself and your fellow employees, include
recovery considerations in your plan.
1. Planning. Each of these steps can provide valuable information for your
plan development.
➤ Before an emergency arises, contact disaster recovery organizations that
support your type of department. For example, if you are in charge of the
company’s Vital Records department, you might meet with and negotiate
an on-demand contract for document preservation and recovery. Then you
would know whom to contact and what to expect from them. They might
offer some free advice for inclusion in your plan.
➤ Every department must have a plan for relocating its operations within the
facility. A classic example of this is an office fire where the rest of the facility
is intact. Your offices would be moved into another part of the facility until
the damage is repaired, but the company’s business can continue.
➤ Meet with your insurance carriers to discuss their requirements for damage
documentation, their response time, and any limitations on your policies.
This is a good time to review the company’s business disruption insurance
policy to see what it does and does not cover. Different parts of the facility
may have different insurance specific to their type of work.
➤ Meet with vendors of your key equipment to understand how they can help
in an emergency. Some equipment suppliers will, in the case of a serious
emergency, provide you with the next device off of their assembly line. (Of
course, you must pay full retail price and take it however it is configured.) If
this is something you wish to take advantage of, then you must clearly
understand any preconditions.
➤ Meet with the local fire, police, and ambulance services. Determine what
sort of response time you should expect in an emergency from each.
Identify any specific information they want to know from you in an
emergency. Understanding how long it will take for the civil authorities to
arrive may indicate how long the containment effort must allow for, such as
for a fire or for first aid in a medical emergency.
➤ Consider shifting business functions to other sites in case of an emergency
such as specific data processing systems, the sales call center, and customer
billing. The effort is not trivial and may require considerable expense in
travel and accommodations, but again, the goal is to promptly restore service.
2. Continuity of Leadership. When time is short, there is no time for introductions
and turf battles. Plan for the worst case and hope for the best. Assume that
many key people will not be available in the early hours of an emergency.
➤ Ensure that your employees know who their managers are, and who their
managers’ managers are. A good way to approach this is to schedule
luncheons with the staff and these managers to discuss portions of the plan.
➤ If you plan to use employees from a different company site in your recovery
operations, bring them around to tour the site and meet with the people.
Although an introduction is a good start, the longer the visit the better the
visual recognition later during an emergency.
➤ When exercising your plan, include scenarios where key people are
not available.
3. Insurance. Cash to get back on your feet again. Evaluating your current
insurance and selecting additional coverage should involve insurance
professionals to sift through the details. In light of that, consider:
➤ What sort of documentation does the insurance company require to pay a
claim? Does it need copies of receipts for major equipment? If I show a
burned-out lump of metal, will the insurer believe me that it used to be an
expensive server?
➤ If the structure is damaged, will the insurer pay to repair the damage? What
about any additional expense (beyond the damage repair) required for
mandatory structural upgrades to meet new building codes?
➤ In the event of a loss, exactly what do my policies require me to do?
➤ What do my policies cover? How does this compare to my risk assessment?
➤ Am I covered if my facility is closed by order of civil authority?
➤ If attacked by terrorists, does the company still have a claim or is that
excluded under an “acts-of-war” clause?
➤ Can I begin salvage operations before an adjuster arrives? How long will it
take them to get here? What about a wide-area emergency? How long must
I wait for an adjuster then?
4. Recovery Operations.
➤ Establish and maintain security at the site at all times. Prevent looting and
stop people from reentering the structure before it is declared to be safe.
➤ During recovery operations, keep detailed records of decisions, expenses,
damage, areas of destruction, and where damaged materials were sent. Use
video and still cameras to photograph major damage areas from multiple
angles.
➤ Plan for a separate damage containment team and a disaster recovery
team. The containment team focuses on limiting the damage and is very
much “today” focused. The recovery team starts from the present and
focuses on restarting operations. Its goal is to restore the minimal acceptable
level of service.
➤ Keep employees informed about your recovery operations. They have a lot
at stake in a recovery (their continuing employment) and are your
staunchest allies.
➤ Protect undamaged materials from such things as water, smoke, or the
weather by closing up building openings.
➤ Keep damaged materials onsite until the insurance adjuster releases them.
PREPARE A DOCUMENT REPOSITORY
A business continuity program generates a lot of documents. Recovery plans,
Business Impact Analyses, risk assessments, and test results are just a few examples of
the many things that must be kept handy. Further, many people contribute and
maintain these documents. A central place is necessary to store everything so that
it can be found when needed. There are several popular options:
➤ Establish a file share with subdirectories to separate the technical plans from
the public areas. This is inexpensive and access permissions are controlled by
the Business Continuity Manager.
➤ Use a document management product, such as Microsoft’s SharePoint™. This
also tracks who has which document checked out for updates.
➤ Another alternative is to purchase a purpose-built product such as Strohl’s™
LDRPS (Living Disaster Recovery Planning System) which can be used to build
an automated DR plan.
The challenge is to control access to plans so that the Business Continuity
Manager ensures the quality and accuracy of anything accepted for storage. Some
people will write little and call it enough. They will want to store it and declare the
job complete. Other well-meaning people may want to use their unique recovery
plan format, which will also cause confusion. Whatever tool you use, set aside a
submissions area to receive proposed plans that will be reviewed before they are accepted.
To be useful in a crisis, the repository must be available at the recovery site.
This may mean that it runs on a server at a third-party site or at the recovery site.
This introduces other issues such as ensuring the network connection to the
server is secured.
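For the file-share option, a minimal Python sketch of the folder layout might look like the following. The folder names are hypothetical, and access permissions are not handled here because they depend on your file server; the sketch only shows the separation of public material, technical plans, and the submissions area.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical folder names: public material, technical plans, and a
# submissions area for proposed plans awaiting review.
LAYOUT = ["public", "technical_plans", "submissions"]


def create_repository(root: Path) -> List[Path]:
    """Create the repository folders under root and return their paths."""
    created = []
    for name in LAYOUT:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demonstrate in a throwaway directory rather than on a real file share.
with tempfile.TemporaryDirectory() as tmp:
    folders = create_repository(Path(tmp) / "bc_repository")
    names = sorted(folder.name for folder in folders)

print(names)
```

Keeping submissions in their own folder lets the Business Continuity Manager review each plan before moving it into the area that people rely on during a crisis.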
CONCLUSION
Writing a business continuity plan seems like a big project. As with any big project,
break it down into a series of smaller projects that are not quite so intimidating.
Starting at the department level, work up the organization, combining department
plans as you build toward an organization-wide business continuity plan.
Developing the plan is an iterative process, and you won’t get everything right
the first time. Testing the plan, discussed in a later chapter, will help to verify what
you’ve written and point out gaps in the plan. Your plan should become a living
document, never finally done, but changing as the organization grows and changes.
Disaster Recovery Plan: Information and Documentation for IBM Company
NAME
American Military University
ISSC490
A disaster recovery plan is a documented, structured set of instructions detailing the steps a business will take to recover from an unplanned catastrophic event. IBM relies heavily on information technology to process information quickly and effectively, and most of its operations are computerized. An IT disaster recovery plan for IBM should therefore be closely aligned with the business continuity plan. Building the plan starts with identifying what is at risk, a step commonly known as risk assessment or threat analysis. Below are the resources to document in a disaster recovery plan for IBM's information technology infrastructure.
Hardware and Peripheral devices
Peripheral devices are auxiliary devices that connect to and work in conjunction with a computer, such as printers and scanners. When evaluating hardware, one should determine the risk of losing a machine entirely and the risk of damage through hardware failure. The company's computer systems may also be exposed to viruses if employees are allowed to take laptops home or if consultants and vendors are allowed to plug their personal computers into IBM systems.
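One way to document the hardware evaluation described above is a simple inventory in which each asset is checked against the risks named in the text. The sketch below is illustrative only: the class name, field names, and sample assets are assumptions made for this example and do not describe IBM's actual equipment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardwareAsset:
    """One row of a hypothetical hardware inventory for the recovery plan."""
    name: str
    leaves_premises: bool          # e.g. a laptop that employees take home
    accepts_outside_devices: bool  # e.g. vendors may plug in their own PCs
    has_spare: bool                # a replacement unit is already on hand


def risk_notes(asset: HardwareAsset) -> List[str]:
    """Return the risks named in the text that apply to this asset."""
    notes = []
    if not asset.has_spare:
        notes.append("total loss or hardware failure with no spare")
    if asset.leaves_premises or asset.accepts_outside_devices:
        notes.append("virus exposure from outside use or outside devices")
    return notes


# Sample entries are placeholders, not real company assets.
inventory = [
    HardwareAsset("office printer", False, False, True),
    HardwareAsset("sales laptop", True, False, False),
]

for asset in inventory:
    print(asset.name, "->", risk_notes(asset) or ["no flagged risks"])
```

A list like this gives the plan a concrete record of which machines need spares or tighter usage rules.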
Email and Data exchanges
IBM uses shared computers and a local area network (LAN), a network of computers that share a communication line or wireless link to a server. This puts the company at risk of losing shared applications and information, such as inventory control and payroll. Sharing files over the LAN may also spread computer viruses and slow down the entire company network, interrupting business. Email exchanged between computers in the facility must also be evaluated when determining the risk.
Software Applications
IBM uses end-user programs designed to perform groups of coordinated functions so that operations run quickly and effectively. These programs include word processors, spreadsheets, database programs, and web browsers. All of these programs are sources of vital information when developing a disaster recovery plan. Theft of software from the facility could be detrimental to the company and may even lead to lawsuits.
IP Addresses
The company's Internet Protocol (IP) addresses identify its hosts and network interfaces. Despite the proxies and anonymity tools that exist to protect IP addresses, careless setups and gaps in the company's security firewall could invite unwanted guests. Hackers may use a company IP address to send information to, or retrieve it from, IBM computers.
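As a rough illustration of how an address inventory might be recorded for the plan, the sketch below uses Python's standard `ipaddress` module to label each listed address as internal or internet-facing. The host names and addresses are placeholders chosen for this example, not IBM's, and the two-way labeling is a simplifying assumption.

```python
import ipaddress

# Hypothetical inventory. 8.8.8.8 is a well-known public address used
# here only as a stand-in for an internet-facing host.
address_inventory = {
    "file server": "10.0.0.5",
    "public web gateway": "8.8.8.8",
}


def exposure(address: str) -> str:
    """Label an address as internal or internet-facing."""
    ip = ipaddress.ip_address(address)
    return "internal (private range)" if ip.is_private else "internet-facing"


for host, address in address_inventory.items():
    print(f"{host}: {address} is {exposure(address)}")
```

Internet-facing entries are the ones whose firewall rules deserve the closest review when the plan is written.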
VPN and Server Access
An evaluation of virtual private networks (VPNs) is necessary to ensure the protection of private and confidential data, because hackers may be able to spot weaknesses and steal such company data. A good disaster recovery plan ought to include a clear understanding of IBM server access and the security of confidential data.
The Facility Telecommunication systems
IBM's telephone system involves a modern private branch exchange (PBX) optimized for switching calls. A PBX failure could bring company operations to a standstill, hence the need to evaluate the system. Wireless communication failures, moreover, are caused not only by lack of maintenance but also by natural causes such as weather. It is therefore necessary to include a backup in the disaster recovery plan in case of failure.
References
Wallace, M., & Webber, L. (2011). The Disaster Recovery Handbook: A Step-by-Step Plan to Ensure Business Continuity and Protect Vital Operations, Facilities, and Assets. New York: AMACOM.