
The Show Must Go On
ComputerWorld TechGuide Security Part 2 ( 14 March 2003 )
By Melanie Liew

Imagine this: You, a purchasing manager at a retail company, find that your stocks are low. You ring the sales office of your supplier but the phone rings without reply. You send a fax but this also brings no response. Your annoyance turns to concern as you wonder if the supplier has gone into liquidation.

This scenario is not entirely fictitious, as similar events have happened to many others.

When such an incident occurs and the business operation is interrupted, it will be only a matter of time before an organization's bottom line is affected. Over the longer term, if sales and marketing activities cannot be continued then there will be no new business. Consequently, this may impact the cash flow.

So, it is necessary for all organizations to put in place a contingency plan to ensure that, after a disaster or during a crisis, it will be "business as usual."

The contingency plan should provide a number of well thought-out options that will allow an organization to retain all critical functions following a disaster or a crisis. This contingency plan should be based on a clearly defined policy.

"Business Continuity requires the alignment of business expectations and technology viability. It covers a comprehensive and consultative approach to minimize the potential impact of down time to critical business functions in the event of a service interruption from an outage to a full-scale disaster," said Oliver Tian, director of Solutions and Portfolio Management at HP Services, Hewlett-Packard.

To be effective, the policy statement should define the contingency objectives and establish the organizational framework and responsibilities for IT contingency planning.

There is a range of reference architectures and solutions that companies can typically consider when building a business continuity framework. Said Francis Fong, country manager, Strategic Outsourcing, Global Services, IBM Singapore, "An organization's infrastructure can be quite complex. It consists of delivery mechanisms necessary for business continuity. In many cases, this infrastructure extends beyond the enterprise walls through a dynamic value chain of suppliers, distributors, partners and even customers."

"To reduce complexity and improve management insight into potential risks and exposures, a business and its value chain can be viewed through the lens of six unique 'solution layers'."

These layers are strategy, organization, business and IT processes, data and applications, technology and facilities and security.

Said Fong, "Viewing a business or value chain in this manner enables the identification of crucial interdependencies between business process and the information technology that enable them. Understanding these interdependencies gives management the required context to prioritize the business continuity framework."

By and large, e-businesses face this challenge more than most. They are transaction-oriented enterprises whose systems have to handle sudden peaks and millions of simultaneous users, either of which could crash a system.

Therefore, top management such as the chief information officer must be included in the process to develop the program policy, structure, objectives, and roles and responsibilities.

But before starting on the process of continuity planning, the company must establish the business environment in which it will operate. This means producing a policy that clearly defines the business needs, the strategy to be adopted, any perceived priorities and the key responsibilities.

In emergency conditions, the allocation of business priorities and the control and distribution of corporate resources may be very different from normal circumstances. Managers at all levels of a business must be prepared to have a flexible approach and to work to different rules. Therefore, the emergency policy carries the highest level of authority.

In any case, senior management will set the policy and provide the impetus for contingency planning. In the event of a crisis, senior management hands over authority to the Crisis Manager, sets any new priorities which must be considered and resolves potential conflicts between different parts of the business. The Crisis Manager has the responsibility of coordinating and directing activities during a crisis.

The Crisis Manager is also given executive authority to authorize expenditure and to direct business and support managers as appropriate.

To assist the Crisis Manager, there will be a Crisis Management Team that manages the various coordination activities required for the survival and recovery of the business.

In a crisis, line managers are required to manage the implementation of their own plans under the overall direction of the Crisis Manager. In cases where they have particular knowledge or skills, they may be seconded to the Crisis Management Team.

The Premises or Facilities Management Department is given the task of providing alternative building facilities if current facilities become unusable. Each affected business will require suitable new facilities. The department is also responsible for managing the damage clearance, repair and restoration activities for the company's buildings. This is usually a major activity because of the pervasive nature of an organization's premises.

The management of information systems is a complex task. Therefore, the computer centre will need to have its own contingency plans, which will be carried out by a separate crisis management team under corporate crisis management.

Crises specific to IT include the immediate or gradual loss of communications systems, or technical problems such as hardware or software failures.

After a disaster, communications with the media need to be managed carefully so as to ensure that customers' and other third parties' confidence in the organization is maintained.

Tian of HP added, "Business Continuity represents more than disaster recovery. There is a difference between protecting data and rescuing it. There is also a difference between business critical and non-business critical information. A full business assessment (business impact analysis) needs to be undertaken before putting in place a business continuity programme."

Tian cited the example of the airline industry where down time for a reservation system will drive customers to alternative carriers, impacting revenues immediately and directly. But should the airline's payroll system be affected, there would be no impact on business revenues.

The business impact analysis (BIA) enables the Contingency Planning Team Leader to fully characterize the system requirements, processes and interdependencies and to use this information to determine the contingency requirements and priorities. The purpose is to correlate specific system components with the critical services that they provide and, based on that information, to characterize the consequences of a disruption to the system components.
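The correlation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Component` record, the `prioritize` and `affected_services` helpers, and every figure in the inventory are hypothetical, loosely modelled on the airline example rather than taken from any real BIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    services: tuple                 # critical services that depend on this component
    allowable_outage_hours: float   # hypothetical figure gathered during the BIA
    revenue_loss_per_hour: float    # hypothetical figure, in any single currency


def prioritize(components):
    """Order components for recovery: shortest allowable outage first,
    then highest hourly revenue loss."""
    return sorted(
        components,
        key=lambda c: (c.allowable_outage_hours, -c.revenue_loss_per_hour),
    )


def affected_services(components, failed_names):
    """Return the set of critical services disrupted when the named components fail."""
    return {s for c in components if c.name in failed_names for s in c.services}


# Illustrative entries only; all names and numbers are invented for the example.
inventory = [
    Component("payroll-db", ("staff payroll",), 72.0, 0.0),
    Component("reservation-db", ("online booking", "call-centre booking"), 1.0, 50000.0),
    Component("web-frontend", ("online booking",), 1.0, 20000.0),
]

recovery_order = [c.name for c in prioritize(inventory)]
print(recovery_order)
print(sorted(affected_services(inventory, {"reservation-db"})))
```

With these invented figures the reservation database is recovered first and payroll last, which mirrors the airline example: a reservation outage hits revenue at once, while a payroll outage does not.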

Recovery strategies provide a means to restore IT operations quickly and effectively following a service disruption. The strategies should address disruption impact and allowable outage times. When developing the strategy, weigh the alternatives against factors including cost, allowable outage time, security and integration with larger, organization-level contingency plans.

In all, a wide variety of recovery approaches may be considered. The right choice depends on the incident, the type of system and its operational requirements.

The contingency plan should take into account major disruptions with long term effects. Therefore, the plan must include a strategy to recover and perform system operations at an alternative facility for an extended period. There are three types of alternate sites available - a dedicated site, a reciprocal agreement with an internal or external party and a commercially leased facility. These may be categorized in terms of their operational readiness. Based on this factor, sites may be identified as cold sites, warm sites, hot sites, mobile sites and mirrored sites.

Cold sites - A facility with adequate space and infrastructure to support the IT system. The space may have raised floor and other attributes suited for IT operations. The site has no IT equipment and usually does not contain office automation equipment such as telephones, facsimiles or copiers. By far the least expensive option, this may require substantial time to acquire and install the necessary equipment.

Warm sites - These are partially equipped office spaces that contain some or all of the hardware, software, telecommunications and power sources. A warm site is maintained in an operational status ready to receive the relocated systems. In many cases, it may serve as a normal operational facility for another system or function.

Hot sites - These are office spaces sized to support system requirements and configured with the necessary system hardware, supporting infrastructure and support staff. Hot sites are staffed 24x7.

Mobile sites - These are self-contained, mobile, custom-fitted shells with specific telecommunications and IT equipment necessary to meet system requirements. The facility may be driven to and set up at a desired alternate location. In most cases, mobile sites should be designed with the vendor and a service level agreement (SLA) signed between the two parties. This is necessary because the time taken to customize the mobile site may be extensive and without prior coordination, the time to deliver the mobile site may exceed the system's allowable outage time.

Mirrored sites - These are identical to the primary site and provide the highest degree of availability because the data is processed and stored at the primary and alternate site simultaneously. By far the most expensive option, this type of facility ensures virtually 100 per cent availability. Typically, a mirrored site is designed, built, operated and maintained by the organization.
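One way to read the site categories above is as a trade-off between cost and the time a site needs to become operational. The Python sketch below picks the cheapest category that is ready within a system's allowable outage time. All names (`SiteOption`, `OPTIONS`, `cheapest_site`) and all hour figures are assumptions for illustration; real values would come from vendor quotes and the organization's own BIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteOption:
    kind: str
    hours_to_operational: float  # assumed planning estimate, not a vendor figure
    relative_cost: int           # assumed rank, 1 = cheapest


# Placeholder estimates. Cold as the cheapest and mirrored as the most expensive
# follows the descriptions above; the ranks in between and every hour figure
# are assumptions made for this example.
OPTIONS = [
    SiteOption("cold", 336.0, 1),
    SiteOption("warm", 48.0, 2),
    SiteOption("mobile", 72.0, 3),
    SiteOption("hot", 4.0, 4),
    SiteOption("mirrored", 0.0, 5),
]


def cheapest_site(allowable_outage_hours, options=OPTIONS):
    """Return the lowest-cost option ready within the allowable outage time,
    or None if no option is fast enough."""
    feasible = [o for o in options if o.hours_to_operational <= allowable_outage_hours]
    return min(feasible, key=lambda o: o.relative_cost, default=None)


for hours in (1.0, 60.0, 500.0):
    choice = cheapest_site(hours)
    print(hours, choice.kind if choice else "no feasible site")
```

The same comparison applies to a mobile site: if the assumed delivery and set-up time exceeds the allowable outage time, the option simply drops out of the feasible list.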

Two or more organizations with similar IT configurations and backup technologies may enter a formal agreement to serve as alternate sites for each other or enter into a joint contract for an alternate site. This can be set up using a reciprocal agreement or memorandum of understanding (MOU). A reciprocal agreement should be entered into carefully because in the event of a disaster, each site must be able to support the other in addition to its own workload. This type of agreement requires the recovery sequence for the applications from both organizations to be prioritized from a joint perspective. This should be a win-win partnership for both parties.
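The capacity condition behind a reciprocal agreement can be checked with simple arithmetic. The sketch below is a minimal illustration, assuming each site's capacity and workloads can be expressed in one common unit (for example, transactions per second). The function names, dictionary keys and numbers are all hypothetical.

```python
def can_host(host_capacity, host_own_load, partner_critical_load):
    """True if the host can carry its own workload plus the partner's critical workload."""
    return host_own_load + partner_critical_load <= host_capacity


def reciprocal_agreement_viable(site_a, site_b):
    """Each site is a dict with 'capacity', 'own_load' and 'critical_load',
    all in the same (assumed) unit. The agreement is viable only if each
    site can support the other in addition to its own workload."""
    a_hosts_b = can_host(site_a["capacity"], site_a["own_load"], site_b["critical_load"])
    b_hosts_a = can_host(site_b["capacity"], site_b["own_load"], site_a["critical_load"])
    return a_hosts_b and b_hosts_a


# Invented figures for two partner organizations.
site_a = {"capacity": 1000, "own_load": 600, "critical_load": 300}
site_b = {"capacity": 800, "own_load": 500, "critical_load": 350}

print(reciprocal_agreement_viable(site_a, site_b))

# If site B's everyday workload grows, it can no longer absorb site A's critical load.
site_b_grown = dict(site_b, own_load=550)
print(reciprocal_agreement_viable(site_a, site_b_grown))
```

Only the critical workload of the partner is counted here, on the assumption that the jointly prioritized recovery sequence defers everything else.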

Said Fong of IBM, "Companies should turn to reputable managed services provider who provide standard SLAs that balance affordability and reasonable service level requirements. SLAs identified should be consistent with the scope of services required."

Added Quah Chin Yong, vice president, Infrastructure, Crimson Logic, "The cost of delivering a given level of service that meets the regulatory and business obligations, contributes to building the metrics and objectives for the SLA. The elements include items such as response time, availability, liquidated damages and other requirements."

"To mitigate the risk of non-performance of the SLA, the business continuity plan must be reviewed and tested regularly to ensure that it is working according to expectations."

The SLA should specify how quickly the vendor must respond after being notified. The agreement should also give the organization priority status for the shipment of replacement equipment over equipment being purchased for normal operations.

The agreement should further discuss the priority status the organization will receive in the event of a disaster involving multiple clients of the same vendor. The details should be documented in the SLA, which should be maintained with the contingency plan.

The Contingency Planning Coordinator should consider that purchasing equipment when needed is cost effective, but can add significant overhead time to recovery while waiting for shipment and set up. Alternatively, based on impact discovered through the BIA, organizations should consider that a widespread disaster would require mass equipment replacement and transportation delays would extend the recovery period.

In either case, the Contingency Planning Coordinator should ensure that the strategy chosen can be implemented effectively with available personnel and financial resources. The cost of the alternate site should likewise be weighed against budget limitations.

A critical element of a viable contingency capability is plan testing, which enables plan deficiencies to be identified and addressed.

Testing can also help to evaluate the ability of the recovery staff to implement the plan quickly and effectively. Each IT contingency plan element should be tested to confirm the accuracy of individual recovery procedures and the overall effectiveness of the plan.

The contingency test should also address areas such as system recovery on an alternate platform from backup media; coordination among recovery teams; internal and external connectivity; system performance using alternate equipment; and notification procedures.

 




 




Copyright 2005 All rights reserved.