His disaster recovery (DR) plan was simple and, he thought, effective.
With a smaller office in Baton Rouge, LA, and backup tapes stored in an offsite location, the law office's DR plan called for restoring data and applications in Baton Rouge and conducting business with users connected remotely. Underpowered servers running Microsoft Corp. Unable to procure servers due to high post-hurricane demand for hardware, Zeller rushed to purchase and configure high-end desktops to run his apps.
After a strenuous week, business returned to something approaching normal with one cruel lesson learned: An untested DR plan isn't worth the paper it's printed on. Chaffe McCall is hardly alone in its experience. DR planning and testing is costly, and the contingency nature of DR makes it susceptible to budget cuts and testing shortcuts. To control costs, some companies perform several smaller component tests during the year in addition to a full annual test.
Others extend the testing cycle across several years.
What to test?
Only a limited number of services and applications can be realistically included in a DR test plan. Business criticality of services, exposure to loss, risk tolerance, and an assessment of threats and vulnerabilities determine what needs to be included, the priorities, and the timeframe within which services need to be restored, as defined by recovery time objectives (RTOs) and recovery point objectives (RPOs). DR planning and testing is a continual balancing act among costs, budget and the potential business loss if a disaster occurs.
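As a rough illustration of how RTOs and RPOs turn into pass/fail criteria for a test, the sketch below compares measured results against the objectives. The class names, field names and figures are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objective:
    service: str
    rto: timedelta  # maximum tolerable time to restore the service
    rpo: timedelta  # maximum tolerable window of lost data


@dataclass
class TestResult:
    service: str
    recovery_time: timedelta  # time the restore actually took in the test
    data_loss: timedelta      # age of the newest data that was recovered


def missed_objectives(objective, result):
    """Return the names of the objectives the test result failed to meet."""
    missed = []
    if result.recovery_time > objective.rto:
        missed.append("RTO")
    if result.data_loss > objective.rpo:
        missed.append("RPO")
    return missed


# Hypothetical figures: a 4-hour RTO and 1-hour RPO for an ERP system.
erp_objective = Objective("erp", rto=timedelta(hours=4), rpo=timedelta(hours=1))
erp_result = TestResult("erp", recovery_time=timedelta(hours=6),
                        data_loss=timedelta(minutes=30))
print(missed_objectives(erp_objective, erp_result))
```

Recording results this way after each test makes it easier to see whether a shortfall calls for more redundancy or for a revised objective.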
While prioritization identifies mission-critical applications and services, seemingly low-priority services that high-priority apps depend on are sometimes excluded from the DR test. With IT services and infrastructure tightly intertwined, the risk of neglecting low-priority dependent services is high. This is exactly what happened to Jim Burgard, assistant vice chancellor for university computing and communication at the University of New Orleans (UNO).
The university's inability to get to its backup tapes after Hurricane Katrina prevented Burgard from recovering Active Directory, requiring a total Active Directory rebuild from scratch.
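One way to avoid overlooking a dependency such as Active Directory is to derive the test scope from a dependency map instead of from the priority list alone. The following is a minimal sketch; the service names and the `depends_on` map are hypothetical.

```python
def services_to_test(critical, depends_on):
    """Return the critical services plus everything they depend on,
    directly or indirectly."""
    to_visit = list(critical)
    in_scope = set()
    while to_visit:
        service = to_visit.pop()
        if service in in_scope:
            continue  # already handled; also guards against cycles
        in_scope.add(service)
        to_visit.extend(depends_on.get(service, ()))
    return in_scope


# Hypothetical dependency map: each service lists what it needs to run.
depends_on = {
    "erp": ["database", "active_directory"],
    "email": ["active_directory", "dns"],
    "database": ["storage"],
}

print(sorted(services_to_test(["erp"], depends_on)))
```

In this example only the ERP system is rated mission-critical, yet the resulting scope also pulls in the database, storage and Active Directory.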
With the interdependency of IT services, it's pertinent to consider and exercise all aspects that impact the recovery of mission-critical services, including the areas covered below. There isn't a single "right" way to perform DR testing because it depends on the specific situation, defined priorities and the DR plan at hand.
The level of redundancy in place has a huge impact on the DR exercise. For instance, the effort to rehearse failing over to a continuously updated redundant storage array in a secondary data center is relatively simple; in contrast, having no secondary array to fail over to requires restoring terabytes of data and rehearsing the loss of the data center itself. The DR testing efforts and costs associated with the two scenarios differ greatly, and companies need to do a thorough analysis before deciding whether to invest in redundancy or to pour money into a more elaborate rehearsal.
The real payoff of redundancy comes into play when an actual disaster strikes. Loss of business productivity, caused by long recovery times, can easily exceed the cost of redundancy.
The ability to recover depends on accurate documentation, and a DR test needs to execute the documented recovery steps meticulously. DR documentation needs to be updated continuously and reviewed periodically. The single greatest risk in keeping up with changes is the lack of a solid change management and verification process to ensure that changes are performed according to procedure. Change management tools like Finisar Corp. Besides third-party tools and free tools such as Syslog, monitoring and element managers like Cisco Systems Inc. Unless prohibited by budgetary restrictions, all mission-critical network connections should be designed redundantly.
Testing redundant network connection failover is relatively straightforward and can be as simple as forcing a manually induced failure of the primary link by disabling a port on a switch or router. Unless flawed, dynamic routing should redirect traffic through the redundant circuit without any disruption.
Due to the complex nature of routing, it's highly recommended to verify failover beyond simply accessing resources on the remote site. Tools like traceroute, network mappings, and topology graphing tools in storage and network management apps verify proper failover.
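A traceroute-based check can be scripted so the test records which path traffic actually took. The sketch below parses traceroute output that has already been captured as text and checks whether the failover gateway appears among the hops. The sample output and all addresses are hypothetical (taken from the documentation address ranges), and real traceroute output varies by platform, so the parsing is an assumption to adapt.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def hops(traceroute_output):
    """Extract the first IPv4 address of each hop line, in hop order."""
    found = []
    for line in traceroute_output.splitlines():
        fields = line.split()
        # Hop lines start with the hop number; skip headers and blank lines.
        if not fields or not fields[0].isdigit():
            continue
        match = IPV4.search(line)
        if match:  # hops that timed out ("* * *") carry no address
            found.append(match.group())
    return found


def uses_gateway(traceroute_output, gateway_ip):
    """True if the traced path passes through the given gateway."""
    return gateway_ip in hops(traceroute_output)


# Hypothetical output captured after the primary link was disabled.
SAMPLE_AFTER_FAILOVER = """\
traceroute to 203.0.113.10 (203.0.113.10), 30 hops max, 60 byte packets
 1  192.0.2.1 (192.0.2.1)  0.412 ms  0.398 ms  0.377 ms
 2  198.51.100.254 (198.51.100.254)  8.101 ms  8.090 ms  8.075 ms
 3  203.0.113.10 (203.0.113.10)  9.532 ms  9.498 ms  9.470 ms
"""

FAILOVER_GATEWAY = "198.51.100.254"  # assumed address of the redundant circuit
print(uses_gateway(SAMPLE_AFTER_FAILOVER, FAILOVER_GATEWAY))
```

Running the same check before and after the induced failure shows whether traffic really moved to the redundant circuit rather than merely remaining reachable.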
Redundancy can also breed complacency and create a false sense of safety.
A DR rehearsal of the network needs to contemplate different scenarios so you don't fall into common traps. An important aspect of redundant network connections is the dependency between primary and failover connections. Commonalities need to be clearly identified, as they present a single point of failure. For instance, the resilience of two Internet connections from two different carriers using BGP-4 routing for automatic failover is compromised if the wiring for both connections enters the building through the same minimum point of entry.
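If each connection's physical and logical path is inventoried, commonalities can be found mechanically. This is a minimal sketch that treats each path as a list of component labels; the labels are hypothetical and would come from your own network documentation.

```python
def shared_components(primary_path, failover_path):
    """Return the components that both connections rely on."""
    return sorted(set(primary_path) & set(failover_path))


# Hypothetical inventories of the two Internet connections.
primary = ["carrier_a", "conduit_north", "mpoe_room_1", "edge_router_1"]
failover = ["carrier_b", "conduit_south", "mpoe_room_1", "edge_router_2"]

print(shared_components(primary, failover))
```

Any component the function reports, here the shared point of entry, is a candidate single point of failure that the rehearsal should cover.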
Similarly, having the primary and failover connection from the same carrier is problematic in case the carrier experiences difficulties. DR testing of network connections without redundancy is more difficult, and comprises a combination of component testing and process, documentation and service-level agreement (SLA) verifications. It begins with verifying the process for replacing defunct switches and routers, including testing the configuration and proper operation of spare equipment, as well as the procedure for getting replacement hardware.
Vendor agreements, contact information and SLAs need to be verified and at least partially tested. Finally, the network DR rehearsal needs to account for changes in network load in case of a disaster.
Typically, a failover circuit is a lower-cost alternative to the primary connection, such as a VPN or frame-relay circuit. The DR test must ensure that the network connections can deal with the increased network load during a disaster. Under normal circumstances, only a relatively small fraction of the employee community connects via VPN; that number will grow substantially during a disaster.
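A back-of-the-envelope estimate helps size the surge before the test. The sketch below is only an approximation under stated assumptions: every parameter value is hypothetical, and the per-user bandwidth figure in particular should come from your own measurements.

```python
def vpn_capacity_check(employees, remote_percent, per_user_kbps,
                       vpn_max_sessions, internet_kbps):
    """Estimate VPN sessions and bandwidth when remote_percent of staff
    connect at once, and compare both against the available capacity."""
    # Ceiling division so a fractional user still counts as one session.
    sessions = -(-employees * remote_percent // 100)
    required_kbps = sessions * per_user_kbps
    return {
        "sessions": sessions,
        "required_kbps": required_kbps,
        "sessions_ok": sessions <= vpn_max_sessions,
        "bandwidth_ok": required_kbps <= internet_kbps,
    }


# Hypothetical disaster scenario: 90% of 400 employees working remotely.
estimate = vpn_capacity_check(
    employees=400,
    remote_percent=90,
    per_user_kbps=500,
    vpn_max_sessions=250,
    internet_kbps=100_000,
)
print(estimate)
```

With these figures neither the VPN server's session limit nor the Internet link would cope, which is the kind of gap the test is meant to expose before a real event.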
The DR test must ensure the VPN server can deal with the increased load and that there's plenty of Internet bandwidth to cope with the bandwidth surge caused by the increase in VPN usage.
Data storage
A solid recovery-from-tape DR strategy requires several tape sets to be stored offsite.
Meticulously testing the process of retrieving tapes from the offsite location is imperative. The DR exercise needs to include a verification of the offsite contact information, the list of users who are permitted to request tapes, and an assessment of how many and which tape sets are kept offsite. The DR test needs to challenge and verify policies to avoid unpleasant surprises like the one experienced by Bill Bremerman, global services manager at Cookson Electronics, Providence, RI, who lost two of three tape sets that were in transit and never made it to the offsite location.
You also need to assess to what extent the offsite location may be impacted by a disaster. As continuous data protection (CDP) products become more common and affordable, companies are beginning to use them to protect all tiers of storage. It isn't uncommon for companies to implement a replication or CDP product after a DR test, or after an actual disaster, to improve their ability to recover. Verification of the consistency and completeness of the replicated data is imperative during a DR test of data protected via replication or CDP.
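One product-independent way to verify replicated data is to compare checksums of the primary and secondary copies. The following is a minimal sketch, assuming both copies are reachable as directory trees and are quiescent while the check runs; the function names are illustrative, not part of any replication product.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare(primary_root, secondary_root):
    """Report files that are missing from, extra on, or different on
    the secondary copy."""
    primary = snapshot(primary_root)
    secondary = snapshot(secondary_root)
    common = primary.keys() & secondary.keys()
    return {
        "missing": sorted(primary.keys() - secondary.keys()),
        "extra": sorted(secondary.keys() - primary.keys()),
        "mismatched": sorted(k for k in common if primary[k] != secondary[k]),
    }
```

Calling `compare("/mnt/primary", "/mnt/secondary")` (hypothetical mount points) returns three lists; all three being empty indicates the trees match at the file-content level. It does not replace business-user verification of application consistency.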
In addition to business-user verification, custom scripts or third-party file-verification tools should be run to verify that the secondary data is in sync with primary data. Although most CDP and replication solutions perform consistency checks, the DR test needs to challenge and confirm independent of the replication or CDP product that data is replicated completely and consistently.
Applications
Most DR plans are constructed around recovering applications. The list typically starts with critical business applications like the enterprise resource planning, CRM or manufacturing execution system, followed by lower priority applications like the company's public Web site.
The DR test of redundantly run apps consists of three steps: initiating the failover, business-user verification of transactions and application consistency, and switching back to the primary application instance. The easiest failover scenarios to test are symmetrically load-balanced applications like a Web site or grid clusters; the only impact of disabling nodes, as long as the failover works, is an increased load on the remaining nodes. A performance impact analysis should be part of the test.
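For a first-order performance impact estimate, the load on the surviving nodes can be projected before any node is disabled. This sketch assumes the total load stays constant and is spread evenly across the remaining nodes, which real load balancers only approximate; the figures are hypothetical.

```python
def utilization_after_failure(nodes, failed, utilization_pct):
    """Project per-node utilization (in percent) after `failed` of
    `nodes` evenly loaded nodes are taken out of service."""
    if not 0 <= failed < nodes:
        raise ValueError("at least one node must remain in service")
    return utilization_pct * nodes / (nodes - failed)


# Hypothetical cluster: 4 nodes at 60% utilization, 1 node disabled.
print(utilization_after_failure(nodes=4, failed=1, utilization_pct=60))
```

A projection above 100 means the remaining nodes cannot carry the load, so the test should expect degraded service rather than a transparent failover.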
These products continuously replicate transactions and changes to the failover system, monitor availability of the primary application and perform an automatic failover in case the primary instance fails. DR testing of clustered applications is disruptive, as the primary instance can't be used while the test is performed. Recovering applications without a failover solution depends on restoring the app from backups. To avoid Zeller's experience during Hurricane Katrina, your DR test needs to ensure that the correct hardware, operating system and application software are available and that the recovery can be successfully performed on the designated recovery systems.
The test needs to ensure that all required components, especially the application software, have a valid maintenance contract and valid software protection codes (SPCs) if required by an application. SPCs bind an application to specific hardware. Furthermore, vendor contact information needs to be up to date and you must ensure that the hours of support are in line with RTOs.
The data center
Data center recovery testing prepares for the worst-case scenario: losing a complete data center. Companies have taken two approaches to resume operations after a data center loss. The first and least-expensive option is to resume data center operations in another branch office.