
Every organisation knows how important data protection is. Yet most organisations never test their backups. Why? Because it is a complex issue. But if you do not test, how do you know that you can really survive a disaster?
Modern Approach for Data Protection
The first step in data protection is, of course, to think about it in a modern way. Even though most restores from backups are still individual files and/or folders, your organisation has to prepare for bigger disasters. What if half of your 500-VM production environment is hit by a ransomware attack? How do you really survive this type of disaster? Restore times with so-called legacy backups might be days, even weeks. Can your business survive without access to these VMs for weeks? Probably not.
Data protection depends upon timely backups, fast global search, and rapid recovery. Cohesity DataProtect reduces your recovery point objectives to minutes. Unique SnapTree technology eliminates chain-based backups and delivers instantaneous application or file-level recovery, with a full catalog of always-ready snapshots even at full capacity. This approach can dramatically reduce recovery time.
However, even modern data protection is not enough if you don’t know whether you really have something to recover. Most modern technologies handle data integrity at the file system level, but there is still no way to really know that your backups are fully recoverable without testing them.
From Data Protection to Recovery Testing
Typically, organisations approach recovery testing by just recovering a single virtual machine (or several). This of course makes sure that you can recover individual VMs, but it doesn’t ensure that you can recover something that is truly working. Some backup vendors implement recovery testing, but it is still mostly just VMs or some basic uptime testing.
Another way is to restore application setups manually and test them by hand. This is very costly because it requires lots of manual work, and it also introduces several risks. However, it enables your organisation to really test application workflows properly. Do you really get a response from your three-tier web application? Do your DB queries return answers? What if you could run this kind of complex testing without any manual labour?
Automating Recovery Testing
Because modern hypervisor platforms are API-driven, it is pretty easy to automate things at the VM level. When you add an API-driven data protection platform, like Cohesity, you can automate full recovery testing, including very complex tests. This is a question I hear from most of my service provider customers, but also from bigger enterprise customers: how do you automate complex recovery testing? Let’s see.
Cohesity Backup Validation Toolkit
To make things simpler, you can download the Cohesity Backup Validation Toolkit, and with minimal scripting knowledge it is easy to automate the validation process.
After downloading it, it is time to create some config files. Let’s start with the environment.json file. This file contains connection information for both the Cohesity and VMware vSphere environments. Create the file with the following content:
{
    "cohesityCluster": "cohesity-01.organisation.com",
    "cohesityCred": "./cohesity_cred.xml",
    "vmwareServer": "vcenter-01.organisation.com",
    "vmwareResourcePool": "Resources",
    "vmwareCred": "./vmware_cred.xml"
}
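A missing quote in this file is easy to overlook, so it can help to check the file before the first run. The toolkit itself is PowerShell; the following is only a minimal standalone sketch in Python, and the function name check_environment is made up for illustration. It assumes that all five keys from the example above are required.

```python
import json

# Keys taken from the environment.json example above; treating all of them
# as mandatory is an assumption of this sketch.
REQUIRED_KEYS = (
    "cohesityCluster",
    "cohesityCred",
    "vmwareServer",
    "vmwareResourcePool",
    "vmwareCred",
)


def check_environment(text):
    """Parse environment.json content and return the missing or empty keys."""
    env = json.loads(text)  # raises json.JSONDecodeError on a syntax error
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    sample = """
    {
        "cohesityCluster": "cohesity-01.organisation.com",
        "cohesityCred": "./cohesity_cred.xml",
        "vmwareServer": "vcenter-01.organisation.com",
        "vmwareResourcePool": "Resources",
        "vmwareCred": "./vmware_cred.xml"
    }
    """
    print(check_environment(sample))  # an empty list means nothing is missing
```

A syntax error such as an unquoted host name surfaces as a json.JSONDecodeError with the line and column of the problem.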
After this, we need to create the actual config.json file, which contains information about each virtual machine we are going to test.
This file also defines the tests per VM, so it is very easy to define multiple tests but use only selected ones for each VM. The script also lets you attach each VM to the required test network and change its IP address for testing purposes, so you don’t need to test with overlapping production IPs or create siloed networking in VMware.
Note that the VMs don’t need to be protected by the same protection job. This makes the approach more scalable, since you probably have different jobs for web front ends and the actual back-end databases.
{
    "virtualMachines": [
        {
            "name": "Win2012",
            "guestOS": "Windows",
            "backupJobName": "VM_Job",
            "guestCred": "./guestvm_cred.xml",
            "VmNamePrefix": "0210-",
            "testIp": "10.99.1.222",
            "testNetwork": "VM Network",
            "testSubnet": "24",
            "testGateway": "10.99.1.1",
            "tasks": ["Ping","getWindowsServicesStatus"]
        },
        {
            "name": "mysql",
            "guestOS": "Linux",
            "linuxNetDev": "eth0",
            "backupJobName": "VM_Job",
            "guestCred": "./guestvm_cred_linux.xml",
            "VmNamePrefix": "0310-",
            "testIp": "10.99.1.223",
            "testNetwork": "VM Network",
            "testSubnet": "24",
            "testGateway": "10.99.1.1",
            "tasks": ["Ping","MySQLStatus"]
        }
    ]
}
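Network settings like these are easy to mistype, and a wrong gateway only shows up once the clone is already running. As a rough pre-flight check, the sketch below (plain Python, not part of the toolkit; check_vm and check_config are hypothetical names) verifies that each entry uses a supported guest OS, that the test gateway lies inside the subnet implied by testIp and testSubnet, and that no two VMs share a test IP. The duplicate-IP rule is my own sanity check, not something the toolkit is known to enforce.

```python
import ipaddress
import json


def check_vm(vm):
    """Return a list of human-readable problems found in one VM entry."""
    name = vm.get("name")
    problems = []
    if vm.get("guestOS") not in ("Windows", "Linux"):
        problems.append(f"{name}: unsupported guestOS {vm.get('guestOS')!r}")
    try:
        # strict=False builds the network from a host address plus prefix,
        # e.g. 10.99.1.222/24 becomes 10.99.1.0/24.
        network = ipaddress.ip_network(
            f"{vm['testIp']}/{vm['testSubnet']}", strict=False
        )
        gateway = ipaddress.ip_address(vm["testGateway"])
    except (KeyError, ValueError) as err:
        problems.append(f"{name}: bad network settings ({err})")
        return problems
    if gateway not in network:
        problems.append(f"{name}: gateway {gateway} is outside {network}")
    return problems


def check_config(text):
    """Parse config.json content and return all problems across its VMs."""
    config = json.loads(text)
    problems = []
    seen_ips = {}  # test IP -> name of the first VM that uses it
    for vm in config["virtualMachines"]:
        problems.extend(check_vm(vm))
        ip = vm.get("testIp")
        if ip is None:
            continue
        if ip in seen_ips:
            problems.append(
                f"{vm.get('name')}: testIp {ip} already used by {seen_ips[ip]}"
            )
        else:
            seen_ips[ip] = vm.get("name")
    return problems
```

Calling check_config on the contents of config.json returns an empty list when nothing looks wrong, so it can be used as a gate before any clone is started.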
The final step is to create the actual credential files. To avoid having usernames and passwords in plaintext in the configuration files, we can use a simple PowerShell script to create these. You can have one shared credential file for all VMs, or you can have one per VM. Note that these users must have administrator-level access to the VMs in order to change the IP configuration to the test network.
To create the credential files, you can use the included createCredentials.ps1 script. It creates only one guestvm_cred.xml file, so if you need more, you can simply run this PowerShell command:
Get-Credential | Export-Clixml -Path guestvm_more.xml
Since the password in this file is encrypted for the Windows user account that created it, the file can only be read by that same user. Make sure that you create the credential files with the same user you use for running the testing scripts.
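With several credential files in play, a quick check that every file referenced in the two config files actually exists can save a failed run. This is a small Python sketch with hypothetical function names, separate from the PowerShell toolkit. It assumes the credential paths are relative to one base directory and only checks that the files are present, not that the current user can decrypt them.

```python
import json
from pathlib import Path


def referenced_credentials(environment, config):
    """Collect every credential file path mentioned in the two parsed configs."""
    paths = {environment["cohesityCred"], environment["vmwareCred"]}
    paths.update(vm["guestCred"] for vm in config["virtualMachines"])
    return sorted(paths)


def missing_credentials(environment, config, base_dir="."):
    """Return the referenced credential files that do not exist under base_dir."""
    base = Path(base_dir)
    return [
        path
        for path in referenced_credentials(environment, config)
        if not (base / path).is_file()
    ]


if __name__ == "__main__":
    # Assumes environment.json and config.json are in the current directory.
    env = json.loads(Path("environment.json").read_text())
    cfg = json.loads(Path("config.json").read_text())
    for path in missing_credentials(env, cfg):
        print(f"missing credential file: {path}")
```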
So How Does it Work?
Here is an example run that clones two virtual machines (one Linux and one Windows) and runs a different set of tests on each VM.
First, the script reads the configuration files and connects to the Cohesity cluster and the VMware vCenter environment. Then it starts the clone process for the VMs.
After the clone process is done, it moves to the actual validation phase, where it first checks that the clone task is in a success state and that the cloned VMware VMs are powered on with VMware Tools running.
When the VMs are up and VMware Tools is running, it runs a test per VM to ensure that scripts can be pushed through VMware Tools. The next task is to move each VM to the correct VM network and then change its IP configuration.
After moving the VMs to the correct network, it runs the tests for each VM.
After the tests have run, it automatically cleans up the clones from the Cohesity and VMware environments.
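The order of these phases can be summarised in a few lines of code. The sketch below is in Python and is not how the PowerShell toolkit is implemented: the steps object and all of its method names (clone, wait_ready, move_network, set_ip, run_task, cleanup) are hypothetical placeholders for the real Cohesity and vSphere calls. Running the cleanup in a finally block, so that clones are removed even when a step fails, is a design choice of this sketch rather than documented toolkit behaviour.

```python
def validate_backups(vms, steps):
    """Run the workflow for every VM and return {vm name: {task: passed}}."""
    results = {}
    clones = []
    try:
        # Phase 1: start a clone for every VM in the config.
        for vm in vms:
            clones.append((vm, steps.clone(vm)))
        for vm, clone in clones:
            # Phase 2: clone task succeeded, VM powered on, tools running.
            steps.wait_ready(clone)
            # Phase 3: attach to the test network and re-address the VM.
            steps.move_network(clone, vm["testNetwork"])
            steps.set_ip(clone, vm["testIp"], vm["testSubnet"], vm["testGateway"])
            # Phase 4: run only the tasks selected for this VM.
            results[vm["name"]] = {
                task: steps.run_task(clone, task) for task in vm["tasks"]
            }
    finally:
        # Phase 5: remove every clone that was created, even after a failure.
        for _, clone in clones:
            steps.cleanup(clone)
    return results


class DryRunSteps:
    """Stand-in that records calls instead of touching any real system."""

    def __init__(self):
        self.log = []

    def clone(self, vm):
        clone = f"clone-of-{vm['name']}"  # placeholder clone identifier
        self.log.append(("clone", clone))
        return clone

    def wait_ready(self, clone):
        self.log.append(("wait_ready", clone))

    def move_network(self, clone, network):
        self.log.append(("move_network", clone, network))

    def set_ip(self, clone, ip, prefix, gateway):
        self.log.append(("set_ip", clone, ip, prefix, gateway))

    def run_task(self, clone, task):
        self.log.append(("run_task", clone, task))
        return True  # a real implementation would report the test outcome

    def cleanup(self, clone):
        self.log.append(("cleanup", clone))
```

Passing the virtualMachines list from config.json together with a DryRunSteps instance shows the call order without cloning anything.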
Notes
This automation toolkit is not officially provided by Cohesity. Any bugs can be reported to me directly. There are some limitations with this toolkit:
You can use it only for VMware environments running vCenter.
You can run tests only against Linux and Windows virtual machines, and Windows machines need to have PowerShell installed.
I hope that this simple toolkit helps you to automate your organisation’s backup validation!