Category Archives: Backup and Disaster Recovery

Protecting Your Company Against Ransomware Attacks

Ransomware attacks are the latest security breach incidents grabbing the headlines these days. Last month, major organizations including Britain’s National Health Service, Spain’s Telefónica, and FedEx were victims of the WannaCry ransomware attack. Ransomware infects your computer and encrypts your important documents; the attackers then demand a ransom to decrypt your data and make it usable again.

Ransomware operations have become more sophisticated, in some cases even running full helpdesk support for their victims.

While the latest operating system patches and anti-malware programs can defend against these attacks to a point, they are usually reactive and, on their own, ineffective. For instance, the WannaCry malware relied heavily on social engineering (phishing) to spread, counting on end users to open malicious email attachments or click on malicious websites.

The best defense against ransomware is a good data protection strategy built on backup and disaster recovery. When ransomware hits, you can simply remove the infected, encrypted files and restore the good copies. Surprisingly, a lot of companies and end users still do not properly back up their data, even though there is plenty of backup software and cloud backup services available. A periodic disaster recovery test is also necessary to make sure you can restore data when needed.
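One way to make a periodic restore test concrete is to verify restored files against checksums recorded at backup time. Here is a minimal sketch in Python; it uses in-memory path-to-content maps in place of a real filesystem walk, and the function names are mine, not from any backup product:

```python
import hashlib

def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Record a SHA-256 digest for every file at backup time."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def failed_restores(manifest: dict[str, str], restored: dict[str, bytes]) -> list[str]:
    """Paths missing from the restored copy or whose content no longer matches."""
    return sorted(
        path for path, digest in manifest.items()
        if path not in restored
        or hashlib.sha256(restored[path]).hexdigest() != digest
    )
```

An empty result from `failed_restores` means the test restore matches what was backed up; a ransomware-encrypted file would show up immediately as a mismatch.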

A sound backup and disaster recovery plan will help mitigate ransomware attacks.

Replicating Massive NAS Data to a Disaster Recovery Site

Replicating Network Attached Storage (NAS) data to a Disaster Recovery (DR) site is quite easy when using big-name NAS appliances such as NetApp or Isilon. Replication software is already built into these appliances – SnapMirror for NetApp and SyncIQ for Isilon. It just needs to be licensed to work.

But how do you replicate terabytes, even petabytes of data, to a DR site when the Wide Area Network (WAN) bandwidth is a limiting factor? Replicating a petabyte of data may take weeks, if not months to complete even on a 622 Mbps WAN link, leaving the company’s DR plan vulnerable.
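The back-of-envelope arithmetic behind that estimate can be sketched as follows, assuming 1 PB = 10^15 bytes and ignoring protocol overhead:

```python
def transfer_days(data_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link rated at link_bps bits/second."""
    seconds = (data_bytes * 8) / (link_bps * efficiency)
    return seconds / 86400

# One petabyte over a 622 Mbps link, at full line rate:
days = transfer_days(1e15, 622e6)
```

At full line rate the transfer takes roughly 149 days, about five months, and real-world efficiency only makes it worse: at 50% link utilization it roughly doubles.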

One way to accomplish this is to use a temporary swing array: (1) replicate data from the source array to the swing array locally, (2) ship the swing array to the DR site, (3) copy the data to the destination array, and finally (4) resync the source array with the destination array.

On NetApp, this is accomplished using the SnapMirror resync command. On Isilon, this is accomplished by turning on the “target-compare-initial” option in SyncIQ, which compares the files between the source and destination arrays and sends only the differences over the wire.
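The compare-before-send idea is simple: hash the data on both sides and transfer only the mismatches. A simplified Python sketch of the concept, using in-memory path-to-content maps (this is illustrative only, not Isilon’s actual implementation, which compares at a finer granularity):

```python
import hashlib

def changed_files(source: dict[str, bytes], target: dict[str, bytes]) -> list[str]:
    """Paths whose content is missing or different on the target side."""
    def digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()
    return sorted(
        path for path, data in source.items()
        if path not in target or digest(target[path]) != digest(data)
    )
```

After the swing-array copy, the vast majority of files hash identically on both sides, so only the data that changed during shipping crosses the WAN.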

With this technique, massive company data sets sitting on NAS devices can be protected at the DR site right away.

Protecting Data Located at Remote Sites

One of the challenges at remote offices with limited bandwidth and plenty of data is how to protect that data. Building a local backup infrastructure can be cost-prohibitive, so the best option is usually to back up the data to the company’s data center or to a cloud provider.

But how do you initially bring the data to the backup server without impacting the business users who share the wide area network (WAN)?

There are three options:

1. “Seed” the initial backup. Start the backup locally to a USB drive, ship the drive to the data center, copy the data, then perform subsequent backups over the WAN to the data center.

2. Use the WAN to back up the data, but throttle the bandwidth until it completes. WAN utilization will be low, but the backup may take some time to complete.

3. Use the WAN to back up the data, but divvy it up into smaller chunks. So that users are not affected during business hours, run the backup jobs only during off-hours and on weekends. This may also take some time to complete.
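The throttling in option 2 boils down to pacing the sender so its average rate stays under a cap. A minimal Python sketch of such a pacing loop; `send` is a stand-in for whatever transport the backup tool actually uses:

```python
import time

def throttled_send(data: bytes, send, max_bps: int, chunk_size: int = 64 * 1024) -> None:
    """Send data in chunks, sleeping so the average rate stays under max_bps."""
    start = time.monotonic()
    sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        send(chunk)
        sent += len(chunk)
        # Time the transfer *should* have taken at the cap; sleep off any surplus.
        expected = (sent * 8) / max_bps
        elapsed = time.monotonic() - start
        if expected > elapsed:
            time.sleep(expected - elapsed)
```

Real backup products implement throttling inside the agent or at the network layer, but the principle is the same: trade elapsed time for WAN headroom.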

How to Restore from Replicated Data

When the primary backup server goes down due to hardware error, a site disaster, or other causes, the only way to restore is via the replicated data, assuming the backup server was configured to replicate to a DR (Disaster Recovery) or secondary site.

In Avamar, replicated data is restored from the REPLICATE domain of the target Avamar server. All restores of replicated data are directed restores, because from the point of view of the Avamar target server, the restore destination is a different machine from the original one.

The procedure to restore files and directories is:

  1. Re-register and activate the client with the Avamar replication target server.
  2. Perform file/directory restore.
    • Select the data that you want to restore from the replicated backups for the clients within the REPLICATE domain
    • Select Actions > Restore Now
    • On the Restore Options window, notice that the destination choice is blank; a new client must be selected
    • Click Browse and select a client and destination from among the listed clients. Note that the listed clients are activated with the target server and are not under the REPLICATE domain.

If the Windows or UNIX/Linux server was itself lost in the disaster, build a new server first, then follow the procedure above to restore files and directories to that server. Alternatively, perform a bare metal restore, which Avamar supports on Windows 2008 and above.

Backing Up Virtual Machines Using Avamar Image-Level Backup

Avamar can back up virtual machines using guest-level backup or image-level backup.

The advantages of VMware guest backup are that it lets backup administrators use identical backup methods for physical and virtual machines, which reduces administrative complexity, and that it provides the highest level of data deduplication, which reduces the amount of backup data across the virtual machines.

The second way to back up virtual machines is via the Avamar image-level backup. It is faster and more efficient, and it also supports file-level restores.

Avamar integrates with VMware VADP (vStorage API for Data Protection) to provide image-level backups. Integration is achieved through the Avamar VMware Image plug-in. Simply put, the VMware Image backup creates a temporary snapshot of the virtual machine and uses a virtual machine proxy to perform the image backup.

Backup can occur while the virtual machines are powered on or off. Since the backup is handled by a proxy, CPU cycles of the target virtual machines are not used.

Avamar provides two ways for restoring virtual machine data: image restores, which can restore an entire image or selected drives; and file-level restores, which can restore specific folders or files.

However, file-level restores are only supported on Windows and Linux. In addition, it has the following limitations:

1. File-level restores are more resource intensive and are best used to restore relatively small amounts of data. In fact, you cannot restore more than 5,000 folders or files.

2. The latest VMware Tools must be installed on the target virtual machine in order to successfully restore files and folders.

3. Certain virtual disk configurations are not supported: dynamic disks, GPT disks, deduplicated NTFS, ReFS, extended partitions, bootloaders, and encrypted or compressed partitions.

4. ACLs are not restored.

5. Symbolic links cannot be restored.

6. When restoring files or folders to the original virtual machine, only SCSI disks are supported; IDE disks are not supported.

If you must restore folders or files but run into the limitations mentioned above, you can restore an entire image or selected drives to a temporary location (for example, a new temporary virtual machine), then copy those files and folders to the desired location after the restore.

Data Protection Best Practices

Data protection is the process of safeguarding information from threats to data integrity and availability.  These threats include hardware errors, software bugs, operator errors, hardware loss, user errors, security breaches, and acts of God.

Data protection is crucial to the operation of any company, and a sound data protection strategy must be in place. Following is my checklist for a good data protection strategy, covering implementation and operation:

1. Backup and disaster recovery (DR) should be a part of the overall design of the IT infrastructure.  Network, storage and compute resources must be allocated in the planning process. Small and inexperienced companies usually employ backup and DR as an afterthought.

2. Classify data and applications according to importance.  It is more cost-effective and easier to apply the necessary protection when data are classified properly.

3. As for which backup technology to use – tape, disk, or cloud – the answer depends on several factors, including the size of the company and the budget.  For companies with budget constraints, tape backup with off-site storage generally provides the most affordable option for general data protection.  For medium-sized companies, a cloud backup service can provide a disk-based backup target via an Internet connection or can be used as a replication target.  For large companies with multiple sites, on-premises disk-based backup with WAN-based replication to another company site or a cloud service may provide the best option.

4. Use snapshot technology that comes with the storage array. Snapshots are the fastest way to restore data.

5. Use disk mirroring, array mirroring, and WAN-based array replication technology that come with the storage array to protect against hardware / site failures.

6. Use continuous data protection (CDP) when granular rollback is required.

7.  Perform disaster recovery tests at least once a year to make sure the data can be restored within planned time frames and that the right data is being protected and replicated.

8. Document backup and restore policies – including how often the backup occurs (e.g. daily), the backup method (e.g. full, incremental, synthetic full, etc), and the retention period (e.g. 3 months).  Policies must be approved by upper management and communicated to users.  Document as well all disaster recovery procedures and processes.

9. Monitor all backup and replication jobs on a daily basis and address the ones that failed right away.

10.  Processes must be in place to ensure that newly provisioned machines are being backed up.  Too often, users assume that data and applications are backed up automatically.

11. Encrypt data at rest and data in motion.

12. Employ third party auditors to check data integrity and to check if the technology and processes work as advertised.
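A retention policy like the one in item 8 can also be expressed in code, which makes it testable. Below is a hypothetical grandfather-father-son style pruning sketch; the parameter defaults are illustrative, not a recommendation:

```python
from datetime import date

def backups_to_keep(backup_dates: list[date], today: date,
                    daily_days: int = 14, weekly_weeks: int = 8,
                    monthly_months: int = 3) -> set[date]:
    """Keep every backup from the last daily_days days, then one per ISO week,
    then one per calendar month; drop anything older than monthly_months months."""
    keep: set[date] = set()
    for d in sorted(backup_dates, reverse=True):   # newest first
        age = (today - d).days
        if age <= daily_days:
            keep.add(d)
        elif age <= weekly_weeks * 7:
            # Keep only the newest backup of each ISO week.
            if not any(k.isocalendar()[:2] == d.isocalendar()[:2] for k in keep):
                keep.add(d)
        elif age <= monthly_months * 30:
            # Keep only the newest backup of each calendar month.
            if not any((k.year, k.month) == (d.year, d.month) for k in keep):
                keep.add(d)
    return keep
```

Encoding the policy this way makes it easy to dry-run against the backup catalogue before anything is actually expired.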

A good data protection strategy consists of using the right tools, well trained personnel to do the job, and effective processes and techniques to safeguard data.

Integrating Riverbed Steelfusion with EMC VNX

SteelFusion is an appliance-based IT infrastructure for remote offices. SteelFusion eliminates the need for physical servers, storage, and backup infrastructure at remote offices by consolidating them into the data center. Virtual servers located at the data center are projected to the branch offices, giving branch office users access to servers and data with LAN-like performance.

SteelFusion uses VMware to project virtual servers and data to the branch office. A robust VMware infrastructure usually rests on Fibre Channel block-based storage such as EMC VNX. The advantage of using EMC VNX or any robust storage platform is its data protection features, such as redundancy and snapshots.

In order to protect data via EMC VNX array-based snapshots, and so that data can be backed up and restored using third-party backup software, the following guidelines must be followed:

1. When configuring storage and LUNs, use RAID Groups instead of Storage Pools. Storage Pool snapshots do not currently integrate well with SteelFusion.

2. Create reserved LUNs to be used for snapshots.

3. When adding the VNX storage array information to the SteelFusion Core appliance, make sure to select ‘Type: EMC CLARiiON’, not EMC VNX.

For more information, consult the Riverbed documentation.

D2D2T vs. D2D2C

Disk-to-disk-to-tape (D2D2T) is a type of computer storage backup in which data is copied first to backup storage on disk and then copied again to tape media. This two-tiered approach provides a quick short-term recovery option, since backup data can be easily retrieved from disk, as well as a more reliable long-term archive and disaster recovery tier on tape, since tape media can be stored off-site.

But using tapes has its drawbacks, and tape handling is one of them. Since it is usually a manual process, the possibility of human error is ever-present – tapes getting misplaced, tapes getting lost or damaged in transit to the off-site location, personnel forgetting to run the tape backup, backups failing because of tape device errors or insufficient space on tape, and so on.

One alternative to D2D2T which is gaining popularity these days is disk-to-disk-to-cloud (D2D2C). With a D2D2C approach, the tape tier of D2D2T is simply replaced with cloud storage. A D2D2C backup involves backing up server drives to disk-based storage, and then running scheduled backups to archive backup data off to a cloud-based location. For short-term recovery operations, backups from disk are used for restoration.

The advantages of using D2D2C are: no more manual handling of tape media to send off-site, which eliminates human tape-handling errors; easier and faster restore options (a tape restore can be a step-by-step manual process: retrieve the tape from the off-site location, mount the tape, search the backup catalogue, restore the data); the ability to restore data anywhere; data transfer to the cloud can occur during off-hours, so it does not impact the business; and cloud backups are usually incremental in nature, which reduces the amount of data sent over the network.

However, there are also some disadvantages of using D2D2C. For small offices especially, sometimes the WAN or Internet bandwidth can be a limiting factor. Also, backup to the cloud is still relatively expensive.
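The tiering decision in a D2D2C scheme is easy to picture in code: restore points still within the disk tier’s retention window come from disk, while older ones must come back from the cloud. A tiny illustrative sketch (the 30-day retention default is an assumption, not a product setting):

```python
from datetime import date

def restore_source(backup_date: date, today: date, disk_retention_days: int = 30) -> str:
    """Recent restore points still live on the disk tier; older ones
    have been archived off to cloud storage."""
    age_days = (today - backup_date).days
    return "disk" if age_days <= disk_retention_days else "cloud"
```

This is also where the bandwidth caveat bites: a "cloud" restore of a large data set is gated by the same WAN link discussed above.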

Bare Metal Restore

One important capability of a disaster recovery plan is the ability to do bare metal restore (BMR). A BMR is a restore of your entire server to new or original hardware after a catastrophic failure. A BMR can either be done manually – by reformatting the computer from scratch, reinstalling the operating system, reinstalling software applications, and restoring data and settings; or automatically – by using BMR tools to facilitate the bare metal restore process. The manual process, however, takes time and can be error prone, while BMR tools can be fast and easy.

With the majority of servers being virtualized, what’s the use of BMR? With virtualization, especially when using image-level backup, there is no need for specialized BMR tools. However, there are still servers that cannot be virtualized (such as applications requiring a hardware dongle, systems requiring extreme performance, and applications or databases whose license agreements do not permit virtualization). For these systems requiring physical servers, BMR is critical to recovery.

Backup vendors usually have a bare metal solution integrated into their packages, but it is often not enough. There are also software vendors that specialize in bare metal recovery.

Typically, a bare metal restore process involves:
1. Generating an ISO recovery image
2. Using the ISO image to boot the system to be recovered
3. Once in the restore environment, setting up the network connection (IP address, netmask, etc.), so it can connect to the backup server to restore the image.
4. Verifying the disk partitions and mapping.
5. Stepping through the restore wizard – such as choosing the image file you want to restore (point in time), and the partition (or unallocated space) to which you want to restore.
6. Performing any post recovery tasks – such as checking the original IP address, checking that the application services are running, etc.

Bare metal restore is essential to a server disaster recovery plan.

Avamar Backup Solution Review

I recently completed hands-on training on Avamar management and was impressed by its deduplication technology. Deduplication happens on the client side, which means less bandwidth consumed on the network and less space used for storage. Backup jobs run very fast, especially the subsequent backups after the initial baseline backup.
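The space and bandwidth savings come from storing each unique chunk of data exactly once. Here is a much simplified fixed-size chunking sketch of the idea; Avamar actually uses variable-size chunking and compression, which this ignores:

```python
import hashlib

def dedup_store(files: dict[str, bytes], chunk_size: int = 4096):
    """Split each file into fixed-size chunks and keep one copy per unique chunk.
    Returns (store, recipes): store maps chunk digest -> chunk bytes, and each
    recipe lists the digests needed to rebuild a file."""
    store: dict[str, bytes] = {}
    recipes: dict[str, list[str]] = {}
    for name, data in files.items():
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)   # duplicate chunks are stored once
            recipe.append(digest)
        recipes[name] = recipe
    return store, recipes

def rebuild(store: dict[str, bytes], recipe: list[str]) -> bytes:
    """Reassemble a file from its chunk recipe."""
    return b"".join(store[d] for d in recipe)
```

Because the client computes the digests, it only has to ship chunks the server has never seen, which is why subsequent backups after the baseline run so fast.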

The Avamar disk-based backup appliance is based on the Linux operating system, so it comes with excellent Command Line Interface (CLI) tooling. Its Redundant Array of Independent Nodes (RAIN) architecture provides failover and fault tolerance across its storage nodes. It can also integrate with Data Domain as its backend storage. It has extensive support for VMware, and for NAS appliances via the NDMP accelerator device. Avamar servers can be replicated to other Avamar servers located at a disaster recovery site. The management GUI is intuitive for the most part, and it’s very easy to do backups and restores.

However, I also found several shortcomings which, if addressed, could improve the product. First, the management GUI does not have an integrated tool to push the agent software to the clients. If you have hundreds of clients, you need to rely on third-party tools such as Microsoft SMS to push the agent software. Second, there is no single integrated management GUI. You have to run several tools to perform management tasks – the Avamar Administrator Console, Avamar Client Manager, Enterprise Manager, and Backup Recovery Manager. Third, there is no extensive support for Bare Metal Restore (BMR); only Windows 2008 and later are supported. Finally, the system requires a daily maintenance window to perform its HFS checks and other tasks, during which very few backup jobs are allowed to run. This should not be a big deal, though, since a short backup window is usually enough – as mentioned earlier, backups run very fast.

Overall, I consider Avamar coupled with the Data Domain appliance as the leading backup solution out there.