Departmental Business Continuity Planning for Information Technology

An IT department typically provides essential services to every other department within an organization. In most cases, only IT personnel know their environment well enough to identify dependencies and critical systems, estimate the time needed to repair them in various scenarios, establish recovery time objectives (RTO), and plan and test failover procedures. IT team members also know which of their teammates should be designated to respond when specific systems go down. For these reasons, information technology departments should prepare their own continuity plans that are supported, tested, and maintained by members of that department, whether they are standalone documents or are incorporated into an organization’s overall plan.

Mapping dependencies

Identifying and mapping the dependencies associated with each service and application within the environment is a good place to begin when developing a continuity plan. A single business process could be supported by multiple servers, possibly residing on more than one network segment and in different physical locations. Consider as an example a law enforcement system that supports multiple police agencies within a county and pulls information from records in local databases as well as state and federal systems. Dependencies could include routers, switches, and cabling providing network connectivity between data centers, intermediate distribution frame (IDF) and main distribution frame (MDF) locations across the county, servers on the internal network, and external servers. A visual representation of this critical service and its dependencies would help responders see what is affected should an outage occur.

Once dependencies supporting business services are identified, the network maps illustrating those dependencies should be included in the IT plan and should show all applicable internal and cloud resources.
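Before a network map is drawn, the same dependency information can be captured as data. The following is a minimal sketch in Python, assuming a simple dictionary in which each item lists what it depends on directly. The component names are hypothetical and only loosely modeled on the law enforcement example above.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the items in its list.
DEPENDS_ON = {
    "records-query-service": ["app-server", "local-records-db", "state-system-link"],
    "app-server": ["core-switch"],
    "local-records-db": ["core-switch"],
    "state-system-link": ["edge-router"],
    "edge-router": ["core-switch"],
    "core-switch": [],
}


def all_dependencies(item, depends_on):
    """Return every direct and indirect dependency of item."""
    seen = set()
    queue = deque(depends_on.get(item, []))
    while queue:
        dep = queue.popleft()
        if dep not in seen:
            seen.add(dep)
            queue.extend(depends_on.get(dep, []))
    return seen


def impacted_by(failed, depends_on):
    """Return every item that relies, directly or indirectly, on the failed one."""
    return {
        item for item in depends_on
        if failed in all_dependencies(item, depends_on)
    }


if __name__ == "__main__":
    # Which items are affected if the (hypothetical) edge router fails?
    print(sorted(impacted_by("edge-router", DEPENDS_ON)))
```

A structure like this does not replace the visual map, but it makes it easy to answer "what is affected if this component fails?" during an outage.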

Business impact analyses (BIA)

Consider an organization with multiple departments relying on numerous services provided by IT. A disastrous event occurs and many of those services go down. Which are more critical than others? Which, if not restored quickly, will have the most negative impact on the business? To answer these questions, those preparing the IT continuity plan will need input from other stakeholders. That input can be collected using BIA questionnaires. These questionnaires should be prepared for each service and provided to those who utilize those services or applications in the course of their duties.

In addition to gathering information to identify and rank critical services and applications, BIA questionnaires should also identify which employees and technology resources are required for their operation and how much downtime stakeholders believe would be acceptable based on negative impacts associated with service or application unavailability.

More specific guidance about what to include in a BIA questionnaire is readily available online and can be tailored to individual organizations.

Combining mapping and BIA data

The dependencies mapping and BIA data provide the foundation for a large portion of the IT departmental continuity plan.

Dependencies mapping should provide information regarding technology resources required for each critical application and service to function. The next step is identifying the IT personnel responsible for supporting each one. From there, a determination should be made as to which of these team members need to be included as part of a business continuity incident response team. If staffing levels allow it, designating primary and backup team members for each service and application is recommended.

BIA data usually, but not always, indicates which applications and services should receive priority treatment during the restoration process. While it would be logical to assume that those identified in a BIA as being most critical and as having the most significant impact on operations should get top priority, dependencies mapping may indicate otherwise by revealing that the critical system is dependent on another that was previously thought to be low-priority. Putting all of this information together allows for the development of an accurate list of most-to-least critical services and applications. That list should be incorporated into the continuity plan.
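One way to combine the two data sets is to let each system inherit the priority of the most critical service that relies on it. The sketch below assumes BIA results are expressed as ranks where 1 is most critical. The service names, ranks, and the `effective_ranks` helper are hypothetical illustrations, not a prescribed method.

```python
def effective_ranks(bia_rank, depends_on):
    """Give each item the best (lowest) rank among itself and everything relying on it."""
    effective = dict(bia_rank)

    def push_down(item, rank, visiting):
        for dep in depends_on.get(item, []):
            if dep in visiting:  # guard against circular dependencies
                continue
            if rank < effective.get(dep, float("inf")):
                effective[dep] = rank
            push_down(dep, rank, visiting | {dep})

    for item, rank in bia_rank.items():
        push_down(item, rank, {item})
    return effective


# Hypothetical BIA ranks (1 = most critical) and direct dependencies.
BIA_RANK = {"dispatch-app": 1, "payroll": 3, "file-server": 4, "intranet-wiki": 5}
DEPENDS_ON = {"dispatch-app": ["file-server"], "payroll": ["file-server"]}

if __name__ == "__main__":
    ranks = effective_ranks(BIA_RANK, DEPENDS_ON)
    for name in sorted(ranks, key=lambda n: (ranks[n], n)):
        print(ranks[name], name)
```

In this example the file server was ranked fourth by stakeholders, but it moves to the top tier because the most critical application cannot run without it. Within a tier, dependencies still need to be restored before the services that rely on them.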

Some additional action may be required to address the restoration time data gathered in the BIA process. For example, stakeholders who respond to a BIA questionnaire may indicate that a service must be restored in two hours or less to prevent its outage from doing serious damage. IT personnel, however, may know that, due to resource limitations, two hours is simply not enough time. This discrepancy should be addressed in the continuity plan. Perhaps additional resources should be requested from management in order to meet the recovery time objective (RTO) collected in the BIA process. Whether management plans to take action to address the issue or considers the inability to meet this RTO an acceptable risk, recording that decision in the continuity plan could prove valuable should an outage occur.
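Discrepancies of this kind can be found systematically by comparing the RTO requested by stakeholders with IT's own recovery estimate for each service. The following is a minimal sketch, with hypothetical service names and durations.

```python
from datetime import timedelta


def rto_gaps(requested_rto, estimated_recovery):
    """Return services whose estimated recovery time exceeds the requested RTO.

    The value for each service is the shortfall (estimate minus requested RTO).
    Services without an estimate are skipped rather than guessed at.
    """
    gaps = {}
    for service, rto in requested_rto.items():
        estimate = estimated_recovery.get(service)
        if estimate is not None and estimate > rto:
            gaps[service] = estimate - rto
    return gaps


# Hypothetical figures: stakeholders' requests versus IT's estimates.
REQUESTED = {"email": timedelta(hours=2), "payroll": timedelta(hours=24)}
ESTIMATED = {"email": timedelta(hours=6), "payroll": timedelta(hours=8)}

if __name__ == "__main__":
    for service, shortfall in rto_gaps(REQUESTED, ESTIMATED).items():
        print(f"{service}: estimate exceeds requested RTO by {shortfall}")
```

Each entry in the result is a candidate for a resource request or a documented risk acceptance.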

The incident response team

As stated, once the IT personnel responsible for supporting applications and services have been identified, an incident response team can be formed and be prepared to act should disruptive events occur. Primary responders and backups, if available, should be designated and listed in the continuity plan along with information about what systems, applications, and services they support. Personal contact information should be obtained for these team members since the need may arise to contact them during off hours.

The formal plan document itself should include only job titles, with personal contact information maintained separately. People come and go. They also change jobs within an organization. Using only job titles reduces the need for frequent plan revisions. Excluding personal contact information also indicates to employees that their privacy is valued.

An incident response team leader and alternate should be designated. Organizational management should be notified as to who will be leading the team so that they know who to contact should an incident occur.

Communications

Alternate communications strategies should be included in the continuity plan. When major disasters like hurricanes occur, power and landline outages are common. Cell towers and systems can become so overburdened with traffic that phone calls and even emails sent via mobile device apps will not go through. Because they require less data to be transferred, text messages often get through when no other forms of communication will. Commercially available emergency text alert systems are designed specifically to facilitate communications during such an event. Email-to-text alert systems can also be built in-house using the gateway information for cell service providers along with cell phone numbers to create contact lists for team members.
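As a rough illustration of the in-house approach, the sketch below builds email-to-text addresses and a short alert message using Python's standard library. The gateway domains, carrier names, and phone numbers are placeholders, not real provider values. Gateway availability and addresses vary by carrier and should be confirmed with each provider. The message is only constructed here; sending it would require an SMTP server, which is outside this sketch.

```python
from email.message import EmailMessage

# Placeholder gateway domains; real values must come from each carrier.
GATEWAYS = {
    "carrier-a": "sms.carrier-a.example",
    "carrier-b": "text.carrier-b.example",
}


def gateway_address(phone, carrier):
    """Build an email-to-text address from a phone number and carrier key."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return f"{digits}@{GATEWAYS[carrier]}"


def build_alert(team, text, sender):
    """Create one short message addressed to every team member's gateway address."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(gateway_address(phone, carrier) for phone, carrier in team)
    msg.set_content(text[:160])  # keep within a single standard text message
    return msg


# Hypothetical contact list: (phone number, carrier key).
TEAM = [("(555) 010-0001", "carrier-a"), ("555-010-0002", "carrier-b")]

if __name__ == "__main__":
    alert = build_alert(TEAM, "Continuity plan activated. Check in with team lead.",
                        "alerts@example.org")
    print(alert)
```

Keeping the contact list as data makes it straightforward to update when team membership changes.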

When disasters occur, the incident response team may be required to remain in the impacted area to provide support while others in the IT department evacuate. It is important to maintain communications with all personnel to verify that they are safe, provide them with updates, and let them know when they can return to work. For those who evacuate, organizations may wish to create evacuation questionnaires where these employees can provide alternate contact information for the location where they will be staying during the event.

Hardware, software, and vendors

If an application needs to be reinstalled because it has been corrupted or the server where it resided failed, establishing a secure location to maintain copies of all application software will allow team members to quickly locate what they need when they need it. Identifying the location in the continuity plan is recommended.

If the budget allows, maintaining a supply of spare parts is also a good idea. These could include network cables, VoIP phones, and wireless access points. A simple cable replacement could be all that is required to restore a critical service. Having a spare on hand expedites the process. Again, noting in the continuity plan where these supplies are stored is recommended.

The continuity plan should also include the name and contact information of every vendor providing support services to the organization, along with the applications each vendor supports.

Alternate work sites

Some organizations have multiple locations. For example, a local government may have annex buildings, road maintenance offices, and meeting locations that may not be in use when events like hurricanes occur. If this is the case, examining those locations, determining which would make good alternate work sites for essential employees “riding out” the event, and listing them, along with their resources, in the continuity plan could mean the difference between maintaining operations and shutting down.

When evaluating a possible alternate work site, consider factors such as security or safety issues, network connectivity and connection speed, work spaces, capacity, whether office furniture is available, and whether the site has backup power capabilities in the form of a large uninterruptible power supply (UPS) and generator.
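The factors above can be turned into a simple checklist so that every candidate site is assessed the same way. This is a minimal sketch. The factor keys and the sample site are hypothetical, and a real evaluation would likely record details (for example, measured connection speed) rather than yes/no answers.

```python
# Checklist keys derived from the evaluation factors listed above.
SITE_FACTORS = [
    "secure_and_safe",
    "network_connectivity",
    "adequate_connection_speed",
    "work_spaces",
    "sufficient_capacity",
    "office_furniture",
    "ups",
    "generator",
]


def unmet_factors(site):
    """Return the checklist factors a site does not satisfy, in checklist order."""
    return [factor for factor in SITE_FACTORS if not site.get(factor, False)]


# Hypothetical assessment of an annex building.
ANNEX = {factor: True for factor in SITE_FACTORS}
ANNEX["generator"] = False

if __name__ == "__main__":
    print(unmet_factors(ANNEX))
```

Unanswered factors are treated as unmet, so an incomplete assessment is never mistaken for a passing one.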

Backup and failover

An IT continuity plan should list backup policies and procedures along with locations where backup media is stored, if applicable. If the organization has failover capabilities, procedures for failover and failback should be included in the plan. The job titles of those responsible for performing backup, restoration, and failover operations should be included as well.

Plan activation

The plan should list events that trigger its activation and how team members and organizational management will be notified once activation has occurred. Events that warrant activation could vary according to the organization’s location and other risk factors. If, for instance, it is located in a hurricane zone, proactive plan activation could occur a certain number of days before expected landfall to ensure that time is available beforehand to test failover procedures and run full backups. Activation may also be reactive. Earthquakes, for example, occur without warning and could certainly be the basis for plan activation.
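A proactive trigger of this kind amounts to simple date arithmetic. The sketch below assumes the plan specifies a lead time in days before expected landfall. The three-day lead time and the dates are illustrative only.

```python
from datetime import date, timedelta


def activation_date(expected_landfall, lead_days):
    """Return the date on which proactive activation should begin."""
    return expected_landfall - timedelta(days=lead_days)


def should_activate(today, expected_landfall, lead_days):
    """Return True once today has reached the proactive activation date."""
    return today >= activation_date(expected_landfall, lead_days)


if __name__ == "__main__":
    landfall = date(2025, 9, 10)  # hypothetical forecast
    print(activation_date(landfall, lead_days=3))
    print(should_activate(date(2025, 9, 6), landfall, lead_days=3))
```

Because forecasts change, the check would need to be rerun whenever the expected landfall date is revised.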

Testing, after-action reports, and continuous improvement

Business continuity plans should undergo complete reviews at least once per year and be updated based on those reviews. As changes to processes, services, and applications occur, minor edits may periodically be necessary.

Simulations and tabletop exercises should be conducted regularly to ensure that IT team members are familiar with the plan and how it works. Quarterly tabletops and at least one major simulation annually are recommended. If possible, some exercises should also include department heads and representatives of other departments responsible for operating critical systems.

After each simulation or disruptive event, lessons learned should be detailed in an after-action report. These reports can then be used to make needed updates to the plan to ensure continuous improvement.

Conclusion

Enterprise-level business continuity plans will not, in all likelihood, sufficiently address continuity of operations at the IT departmental level. This is why it is necessary to create an IT department-specific plan. IT, unlike some other departments, typically provides services critical to the operational continuity of all others within the organization. Only IT team members possess the knowledge and environmental familiarity necessary to create an effective departmental plan that will best protect the organization’s technology resources and, as a result, preserve overall business continuity.
