Your Mission Critical Applications Deserve Real Backup Validation!


Photo by Pixabay on Pexels.com

Every organisation knows how important data protection is. Still, most organisations never test their backups. Why? Because it is a complex issue. But if you never test, how do you know you can really survive a disaster?

A Modern Approach to Data Protection

The first step in data protection is, of course, thinking about it in a modern way. Even though most restores from backups are still individual files and/or folders, your organisation has to prepare for bigger disasters. What if half of your 500-VM production environment is hit by a ransomware attack? How do you survive that type of disaster? Restore times with so-called legacy backups might take days, even weeks. Can your business survive without access to these VMs for weeks? Probably not.

Data protection depends on timely backups, fast global search, and rapid recovery. Cohesity DataProtect reduces your recovery point objectives to minutes. Its unique SnapTree technology eliminates chain-based backups and delivers instantaneous application- or file-level recovery, with a full catalog of always-ready snapshots even at full capacity. This approach can dramatically reduce recovery time.

However, even modern data protection is not enough if you don't know that you really have something to recover to. Most modern technologies handle data integrity at the file-system level, but there is still no way to know that your backups are fully recoverable without testing them.

From Data Protection to Recovery Testing

Typically, organisations approach recovery testing by simply recovering a single virtual machine (or a few of them). This ensures that you can recover individual VMs, but it doesn't ensure that you can recover something that actually works. Some backup vendors implement recovery testing, but it is mostly limited to booting VMs or basic uptime checks.

The other way to do this is to manually restore application setups and test them by hand. This is very costly because it requires a lot of manual work, and it also introduces several risks. On the other hand, it lets your organisation really exercise application workflows with proper testing: do you actually get an answer from your three-tier web application, do your database queries return results, and so on. What if you could keep this method of running complex tests but without any need for manual labour?

Automating Recovery Testing

Because modern hypervisor platforms are API-driven, it is fairly easy to automate things at the VM level. When you add an API-driven data protection platform like Cohesity, you can automate full recovery testing, including very complex tests. This is an issue I hear about from most of my service provider customers – and from bigger enterprise customers as well. How do you automate complex recovery testing? Let's see.

Cohesity Backup Validation Toolkit

To make things simpler, you can download the Cohesity Backup Validation toolkit from here, and with minimal scripting knowledge it is easy to automate the validation process.

After downloading it, it is time to create some config files. Let's start with the environment.json file. This file contains connection information for both the Cohesity and VMware vSphere environments. Create the file with this content:

{
        "cohesityCluster": "cohesity-01.organisation.com",
        "cohesityCred": "./cohesity_cred.xml",
        "vmwareServer": "vcenter-01.organisation.com",
        "vmwareResourcePool": "Resources",
        "vmwareCred": "./vmware_cred.xml"
}

After this we need to create the actual config.json file, containing information about each virtual machine we are about to test.

This file also defines the tests per VM, so it is easy to define multiple tests but run only selected ones for each VM. The script also lets you attach a VM to the required test network and change its IP address for testing, so you don't need to test with overlapping production IPs or build siloed networking in VMware.

Note that the VMs don't need to be protected by the same protection job, which makes this more scalable, since you probably have different jobs for web frontends and backend databases.

{
    "virtualMachines": [
        {
            "name": "Win2012",
            "guestOS": "Windows",
            "backupJobName": "VM_Job",
            "guestCred": "./guestvm_cred.xml",
            "VmNamePrefix": "0210-",
            "testIp": "10.99.1.222",
            "testNetwork": "VM Network",
            "testSubnet": "24",
            "testGateway": "10.99.1.1",
            "tasks": ["Ping","getWindowsServicesStatus"]
        },
        {
            "name": "mysql",
            "guestOS": "Linux",
            "linuxNetDev": "eth0",
            "backupJobName": "VM_Job",
            "guestCred": "./guestvm_cred_linux.xml",
            "VmNamePrefix": "0310-",
            "testIp": "10.99.1.223",
            "testNetwork": "VM Network",
            "testSubnet": "24",
            "testGateway": "10.99.1.1",
            "tasks": ["Ping","MySQLStatus"]
        }
    ]
}

The final step is to create the actual credential files. To avoid having usernames and passwords in the configuration files in plaintext, we can use a simple PowerShell script to create them. You can have one shared credential file for all VMs, or one per VM. Note that these users must have administrator-level access to the VMs in order to change the IP configuration to the test network.

To create credential files you can use the included createCredentials.ps1 script, which creates only a single guestvm_cred.xml file. If you want more, simply run:

Get-Credential | Export-Clixml -Path guestvm_more.xml

Since this file is encrypted, it can only be accessed by the same user who created it, so make sure you create the credential files with the same user you use to run the testing scripts.
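
A quick way to verify a credential file is to import it back as that same user; this minimal sketch (using the guestvm_cred.xml example above) just confirms that the file decrypts:

# Run as the same user that will execute the validation scripts;
# Import-Clixml can only decrypt the file for the account that exported it.
$guestCred = Import-Clixml -Path ./guestvm_cred.xml
Write-Host "Credential file decrypts for user:" $guestCred.UserName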

So How Does it Work?

Here is an example run that clones two virtual machines (one Linux and one Windows) and runs a different set of tests on each VM.

First the script reads the configuration files and connects to the Cohesity cluster and the VMware vSphere vCenter environment. Then it starts the clone process for the VMs.
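
Conceptually the start of a run looks roughly like the sketch below. Connect-CohesityCluster comes from the Cohesity PowerShell module and Connect-VIServer from VMware PowerCLI; the file names match the examples above, and the clone kick-off itself is only indicated as a comment because the toolkit drives it internally through the Cohesity API.

# Read the environment and VM configuration created earlier
$environment = Get-Content -Raw ./environment.json | ConvertFrom-Json
$config      = Get-Content -Raw ./config.json | ConvertFrom-Json

# Connect to the Cohesity cluster and to vCenter using the stored credentials
Connect-CohesityCluster -Server $environment.cohesityCluster -Credential (Import-Clixml $environment.cohesityCred)
Connect-VIServer -Server $environment.vmwareServer -Credential (Import-Clixml $environment.vmwareCred)

# The toolkit then starts a Cohesity clone task for every VM defined in config.json.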


After the clone process is done, the script moves to the actual validation phase, where it first checks that the clone task is in a success state and that the cloned VMware VMs are powered on with VMware Tools running.
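
With PowerCLI that check can be sketched roughly like this; the clone name below (0210-Win2012) simply follows the VmNamePrefix convention from config.json.

# Wait until the cloned VM is powered on and VMware Tools responds
$cloneName = "0210-Win2012"   # VmNamePrefix + original VM name
$vm = Get-VM -Name $cloneName
if ($vm.PowerState -eq "PoweredOn") {
    # Wait-Tools blocks until VMware Tools is running in the guest (or the timeout expires)
    Wait-Tools -VM $vm -TimeoutSeconds 300
}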


When the VMs are up and VMware Tools is running, we run a test per VM to ensure that we can push scripts through VMware Tools. The next task is to move each VM to the correct VM network and then change its IP configuration.
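
A rough PowerCLI sketch of this step for the Windows VM in the example config; $vm is the cloned VM object from the previous step, and the interface alias Ethernet0 is an assumption you would adjust for your template.

# Move the cloned VM's network adapter to the test network
Get-NetworkAdapter -VM $vm | Set-NetworkAdapter -NetworkName "VM Network" -Confirm:$false

# Change the guest IP through VMware Tools (Windows guest; Linux guests get a shell script instead)
$ipScript = 'New-NetIPAddress -InterfaceAlias "Ethernet0" -IPAddress 10.99.1.222 -PrefixLength 24 -DefaultGateway 10.99.1.1'
Invoke-VMScript -VM $vm -GuestCredential (Import-Clixml ./guestvm_cred.xml) -ScriptType PowerShell -ScriptText $ipScript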


After moving the VMs to the correct network, we run the configured tests for each VM.
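
The tests defined in config.json (Ping, getWindowsServicesStatus, MySQLStatus) could be expressed roughly as below; $vm and $mysqlVm are the cloned VM objects and $winCred/$linuxCred the guest credentials loaded earlier, and the MySQL check assumes a systemd service named mysqld.

# Ping test from the management station
Test-Connection -ComputerName 10.99.1.222 -Count 2 -Quiet

# Windows: list automatic services that are not running (requires PowerShell 5+ in the guest)
Invoke-VMScript -VM $vm -GuestCredential $winCred -ScriptType PowerShell `
    -ScriptText 'Get-Service | Where-Object { $_.StartType -eq "Automatic" -and $_.Status -ne "Running" }'

# Linux: check MySQL status through VMware Tools
Invoke-VMScript -VM $mysqlVm -GuestCredential $linuxCred -ScriptType Bash `
    -ScriptText 'systemctl is-active mysqld'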


After the tests have run, the clones are cleaned up automatically from both the Cohesity and VMware environments.
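
The teardown can be sketched like this. Remove-CohesityClone is assumed here from the Cohesity PowerShell module (check the cmdlet name and parameters against your module version), and $cloneTask stands for the clone task object returned when the clone was started.

# Destroying the Cohesity clone task also removes the cloned VMs from vCenter
Remove-CohesityClone -TaskId $cloneTask.Id -Confirm:$false

# Finally disconnect from vCenter
Disconnect-VIServer -Server $environment.vmwareServer -Confirm:$false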


Notes

This automation toolkit is not officially provided by Cohesity. Any bugs can be reported to me directly. There are some limitations with this toolkit:

  • You can use it only for VMware environments running vCenter
  • You can run tests only against Linux and Windows virtual machines, and Windows machines need to have PowerShell installed

I hope this simple toolkit helps you automate your organisation's backup validation!

Building a Modern Data Platform by Exploiting the True Possibilities of Public Cloud


Photo by Pixabay on Pexels.com

Building a modern next-generation datacenter requires a specific approach and an understanding of automation. When we design a modern datacenter, we have to understand that data is the central element of today's business, and that automation not only saves time but also dramatically reduces the human error factor. An on-premises datacenter, even a next-generation one, is still only one element of a data platform, and no modern data platform would be complete without the option to use the public cloud. In fact, the public cloud plays a significant role in building a modern data platform, providing capabilities we just couldn't get any other way.

In this post we look at the benefits of the public cloud, and at how to overcome the challenges of cloud adoption so we can embrace the cloud as a key functional element of our platform.

Why does our data need the public cloud?

Modern storage systems are very good and have evolved a lot over the past couple of years, but a data-centric approach and fast-changing business landscapes require flexibility, scalability, data movement, and a different commercial model. It quickly becomes clear that the cloud can potentially answer all of these challenges.

While these business challenges are common to pretty much all traditional systems, they are exactly the area where the public cloud is strongest. The cloud can, in theory, scale infinitely, and it provides a consumption model where organisations can move CAPEX investments to OPEX, paying only for what they need while keeping the flexibility to go bigger or smaller based on current business requirements. But the cloud can do much more. We can easily take a copy of our data and do pretty interesting things with it once it is copied or moved to the public cloud. Organisations typically start with the low-hanging fruit, backups, since they are easy to move from on-premises to the cloud: pretty much every modern backup software supports extension to the cloud (if yours doesn't, it may be a very good time to look for something better). Once we back up our data to the public cloud, we can get more value from it. We can use this cold data for business analytics or artificial intelligence, or it can serve as disaster recovery, which with proper design can be far cheaper than building a dedicated disaster recovery site. In the end, flexibility is the most compelling reason for any organisation to consider leveraging the public cloud.

But if these benefits are so clear, why do so many organisations fail to realise them by not moving to the cloud?

Why do organisations resist moving to the cloud?

It's not about what the public cloud can do; it is more about what it doesn't do that tends to stop organisations from wholeheartedly embracing the cloud when it comes to their most valuable asset: data.

As we've worked through the different areas of building a modern data platform, it has become clear that our approach to data is about much more than just storage. It is about insight, protection, security, availability, and privacy, and these are things not normally associated with native cloud storage. Traditionally, native cloud storage is not built to handle these kinds of needs; it is built to be easily scalable and cheap. And because organisations have got so used to these requirements, they don't want to move their data to the cloud if it means losing those capabilities, or having to implement and learn a new set of tools to deliver them.

Of course, there is also the "data gravity" problem: we can't have our cloud-based data siloed away from the rest of our platform; it has to be part of it. We need to be able to move data into the cloud, but also to move it back on-premises and even between cloud providers, while still retaining the key elements that enterprise organisations require – control and management.

So is there really a way to overcome these challenges and make the cloud a fundamental part of a modern data platform? Yes, there is.

Making the cloud part of the enterprise data platform

There are dozens and dozens of companies trying to solve this issue. Most of them start from the top without really looking at the real problem: data mobility. If you look at the AWS Marketplace storage category, you will see almost 300 different options, so the question is how anyone is supposed to know which one really gives an organisation the full potential of true hybrid cloud. The answer is that, without deep knowledge, one really can't. I won't point at any single vendor, but quite a few claim they can give you data mobility and let you leverage your data to its full potential, while only a few can really deliver.

There are two things that make this very hard.

First is data movement between on-premises and the cloud. It's pretty easy to copy data from point A to point B, but how do you make it cost-efficient and fast? Moving huge amounts of data takes time even over very fast internet connections, so built-in capabilities for moving only the blocks that are needed can make a significant difference, not only in migration and movement times but also in cost: since pretty much every cloud vendor charges for egress traffic, moving data back on-premises or to another cloud vendor can otherwise mean a huge difference in costs.

Second is the ability to use the migrated or moved data for several purposes. Using the cloud as a backup target is quite inefficient if you cannot use the same data as a source for DR, analytics, AI, or test & dev. Cloud storage doesn't cost that much, but if you can use it efficiently for more than one use case, you reduce the total cost considerably.

Both of these are the foundation of enterprise capabilities. And while added enterprise capabilities are great, the idea of a modern data platform relies on having our data in the location we need it, when we need it, while maintaining management and control. This is where efficient technology provides a real advantage. You can achieve this in many ways, one of them being NetApp's ONTAP storage system as a consistent endpoint, which allows organisations to use the same tools, policies, and procedures at the core of the data platform and extend them to the organisation's data in the public cloud. This is possible when the vendor has a modern software-defined approach.

NetApp's integrated SnapMirror provides the data movement capability, so one can simply move data into, out of, and between clouds. Replicating data in this way means that while the on-premises version can be the authoritative copy, it doesn't have to be the only one. Replicating a copy of data to a location for a one-off task, and destroying that copy once the task is complete, is a powerful capability and an important element in simplifying the extension of an organisation's data platform into the cloud.

So does the technology matter?

The short answer is no: you don't need technology from vendor X to deliver a true hybrid cloud service. You do not need to use NetApp; I have used it as an example because it has nice cloud integration features built in and can therefore deliver a modern data platform easily, providing consistent data services across multiple locations (on-premises and cloud) while still maintaining all critical enterprise controls. Of course, this means you need to have NetApp both on-premises and in the cloud.

When you evaluate vendor Y for your next-generation datacenter, it is critical to think about how you can build your enterprise data platform so that you have the option to expand your business into the cloud. While other data service providers have somewhat similar offerings, I think NetApp's story and capabilities are well in line with the requirements of a modern data platform. There are other solutions that can achieve something similar, and even go a bit further, and I will cover one of them in my next post.

In the end, the most important thing in your design strategy, if it is to include the public cloud, is to ensure that you have appropriate access to data services, integration, control, and data management. It is crucial that you don't put your organisation's most valuable asset, data, at risk or diminish the capabilities of your data platform by using the cloud. The cloud will play a huge role in future data platforms, so make sure you have an easy option to move workloads to the cloud – and back.

Red Hat acquires Gluster

Red Hat has announced that it will acquire Gluster, the company behind the cluster/cloud filesystem GlusterFS. Red Hat says it considers the technologies used in Gluster a good fit with its cloud computing strategy. Red Hat will continue to support Gluster customers and will integrate Gluster products into its portfolio over the coming months. It hopes to continue to involve the GlusterFS community in developing the filesystem.

GlusterFS is released under the GPLv3 and is described by the developers as a stackable modular cluster file system, which runs in user space and hooks into the kernel via Fuse (Filesystem in Userspace). Communication between storage nodes and clients is via Infiniband or TCP/IP. GlusterFS scales to the petabyte level and, rather than storing data onto storage media directly, uses proven file systems such as Ext3, Ext4 and XFS.

Seven rules to help you prepare for problems on storage area networks

I wrote this article because every SAN administrator should know how to be better prepared for problems in their storage area network. As centralised storage is nowadays the rule rather than the exception, storage area networking plays a really important role in everyday computing.

Note that these seven rules are my suggestions rather than generic rules that must be followed strictly. Try to find at least something you can implement in your environment, and please let me know if there are more relevant things than these, or something to add.

Rule #1 – Implement NTP

In my opinion this is the most important thing. Anyone who has had to work out what happened in an environment where every clock shows a different time understands the relevance of NTP. When all your devices are on the same time, finding the root cause is usually much easier. Implement an NTP server on your management network and sync all your devices from it. You can keep your management network's NTP on proper time by syncing it from the internet, but it is more important that all devices share the same time than that they match the world clock to the exact second.

Rule #2 – Implement a good naming scheme

In the past, servers usually got their names from action heroes and stars, and that might be nice, but if you want easy-to-remember names, use them as CNAMEs rather than as the proper names. In a problem situation it is nice to see from the name exactly where the device or server is located, so I suggest using something like Helsinki-DC1-BC01-BL01-spiderman rather than just spiderman. In this example you can easily see that the server is located in Helsinki, in datacenter 1, in blade chassis one, as blade number one.

Use consistent naming for zoning. I usually name zones ZA_host1_host2, which immediately shows that it is a zone in fabric A between host1 and host2. On the SAN I also prefer aliases named with the same kind of scheme: AA_host1 is the alias in fabric A for host1.

In storage area networking, the domain ID is like a phone number: domain IDs should always be unique. This is not usually a problem if you have separate SANs, but if you ever move something between SANs, unique IDs are crucial, so use unique IDs from the beginning. This information is also used for several other things, such as the Fibre Channel addresses of device ports.

Rule #3 – Create a generic SAN management station

This is usually done in all bigger environments, but every now and then I see environments with no generic SAN management station at all. Almost every company has implemented virtualization at least at some level, so creating a generic SAN management station should not be a problem. You can easily go with a virtualized Windows Server, or maybe even a virtualized Windows 7 with RDP enabled, but I would go with a server so that more than one admin can be on the station at a time.

This station should have at least the following:

  • An SSH and telnet tool that lets you log session output to a text file; on Windows I usually go with PuTTY
  • An FTP server (and maybe TFTP as well). I usually go with FileZilla Server, which is really easy to configure and use
  • An NTP server for your SAN environment
  • Management tools for your SAN (Fabric Manager for Cisco and DCFM for Brocade) – really important for troubleshooting in larger environments
  • Enough disk space to store all firmware images and log files from the switches (see Rules #4 and #6)
  • Internet access for the cases where you need to download something new, or just to use Google when everything is on fire 😉

Rule #4 – Implement monitoring in your SAN environment

You can do this with the same software you use for monitoring your server environment, but I would go with Fabric Manager on Cisco SANs and DCFM on Brocade SANs, because they include other useful features and become really valuable as your environment grows. Configure your management software to send email/SMS when something happens – don't just trust your eyes!

You should also implement automatic log collection for your environment. It helps a lot, for example, when you are trying to find physical link problems or slow-drain devices. Configure your management station to pull all logs from the switches daily or weekly and then clear all counters, so the next log starts from zero. This can be implemented with a few lines of Perl and an SSH library, and there are plenty of existing scripts on Google if you don't want to write your own.
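
A few lines really are enough. Here is a minimal PowerShell sketch of the idea, assuming key-based SSH access from the management station and Brocade FOS commands (errdump, porterrshow, statsclear); adjust the commands for Cisco MDS or for your firmware version.

# Switches to collect from (adjust to your environment)
$switches = @("san-sw-a1", "san-sw-b1")
$logDir   = "C:\SANlogs\{0:yyyy-MM-dd}" -f (Get-Date)
New-Item -ItemType Directory -Path $logDir -Force | Out-Null

foreach ($switch in $switches) {
    # Collect the error log and port error counters over SSH
    ssh admin@$switch "errdump"     | Out-File "$logDir\$switch-errdump.txt"
    ssh admin@$switch "porterrshow" | Out-File "$logDir\$switch-porterrshow.txt"
    # Clear the counters so the next collection starts from zero
    ssh admin@$switch "statsclear"  | Out-Null
}

Schedule this daily or weekly with Task Scheduler on the management station from Rule #3.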

Rule #5 – Design your SAN layout properly

This is really easy to achieve and doesn't take much time to keep up to date. Create a layout sketch of your SAN – even in smaller environments – and share it with all admins. You don't need to have every server in the sketch; include just your SAN switches and storage systems. You can include servers too, but that usually makes the sketch big and unreadable. In a dual-SAN environment (having two separate SANs should be the de facto standard!) plug your servers and storage into the same ports on both fabrics: if you connect your storage system to ports 1-4 on switch one in fabric A, connect it to ports 1-4 on the corresponding switch in fabric B.

Rule #6 – Update your firmware

Don't just cling to a firmware version because it works. No software is absolutely free of bugs, and that is why you should update your firmware regularly. I am not saying you should install a new release as soon as it appears in the downloads, but try to stay on as recent a version as you can. Many storage systems set requirements for switch firmware levels, so always follow your manufacturer's advice. If your manufacturer doesn't support anything newer than a release from a year ago, it might be time to change vendors!

If you have a properly designed SAN with two separate fabrics, you can do firmware upgrades without any break in production, and most enterprise-class SAN switches (usually called SAN directors) have two redundant controllers, so you can update them on the fly without any interruption to your production!

Rule #7 – Do backups!!!

Take this seriously. Taking backups is not hard. You can implement it in your daily statistics collection scripts or do it periodically by hand – whichever way you choose, take your backups regularly. I have seen plenty of cases where there was no backup of a switch, and after a crash the admins had to recreate everything from scratch. Do the same for your storage systems if possible; at least IBM's high-end storage systems have features that let you back up their configuration. Config files are usually really small, and there should never be a place without enough disk or tape space for backups of something as important as SAN switches and storage systems. From the SAN switches you might also want to keep a backup of your license files, as getting new license files from Cisco/Brocade can take a while.
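
As an example, on Brocade switches configupload can push the configuration to an FTP server non-interactively; the sketch below runs it over SSH from the management station. The configupload parameter syntax varies between FOS versions, so treat it as an assumption and verify it against your switches.

# Switches to back up and the FTP server on the management station (e.g. FileZilla Server from Rule #3)
$switches = @("san-sw-a1", "san-sw-b1")
$ftpHost  = "10.0.0.5"
$ftpUser  = "sanbackup"
$ftpPass  = "secret"
$stamp    = Get-Date -Format "yyyyMMdd"

foreach ($switch in $switches) {
    # Upload the switch configuration to the FTP server
    ssh admin@$switch "configupload -all -p ftp $ftpHost,$ftpUser,$switch-config-$stamp.txt,$ftpPass"
}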

Simple high availability with PostgreSQL and open source

Most SQL databases support point-in-time recovery, which in practice simply means that the database produces transaction logs that allow you to roll back very precisely to a specific moment in time. The open-source PostgreSQL also supports this feature (at the time of writing, MySQL at least does not yet), and it enables several interesting use cases, for example taking an exact backup of the database from which you can easily return to a specific point in time.

Virtualization has made it easy, fast, and cost-effective to provision new machines. When machines can be provisioned this efficiently, disaster recovery also becomes considerably easier. A second datacenter can run its own production plus a few spare machines onto which services from the first datacenter, such as PostgreSQL databases, can be started in a disaster. PostgreSQL's point-in-time recovery makes running a standby machine very easy: transaction logging is enabled on the production database, after which the database writes WAL files into an archive directory, from where they can easily be shipped to the other location and applied on the standby machine used for disaster recovery. This is straightforward because the standby database can be put into a mode where it automatically picks up transaction logs from a given directory and applies the changes to its own data. The diagram below illustrates the setup in more detail.

This feature is really easy to set up, and it is also very efficient and fast. In practice the copy in the second datacenter lags behind by one or two transaction logs, as long as the link between the datacenters is a normal connection that is not fully saturated. Even in the worst case you lose only a brief moment of data, which makes the solution very cost-effective. Setup starts by enabling log archiving on the production database by editing postgresql.conf:

archive_mode = on
archive_command = 'cp -v %p /var/lib/pgsql/data/pgsql/archives/%f'
archive_timeout = 300

The next step is to ship the logs to the standby machine. The easiest way is to add an rsync job to the standby machine's cron; in the example below the logs are fetched every five minutes, but you can tune this to whatever suits your environment best. In practice this interval defines the maximum data loss in a disaster, so if the database receives a lot of changes you may want to tighten it a bit.

*/5 * * * * rsync -avz --remove-sent-files prod-sql:/var/lib/pgsql/data/pgsql/archives/ /var/lib/pgsql/data/pgsql/archives/ > /dev/null

The example assumes you have set up SSH keys between the machines so the transfer works automatically. The final step is to configure the standby database to fetch transaction logs from the archive directory and apply the changes to the database. This uses the pg_standby command, which is enabled by editing recovery.conf on the standby machine:

restore_command = '/usr/bin/pg_standby -l -d -s 2 -t /tmp/pgsql.trigger /var/lib/pgsql/data/pgsql/archives %f %p %r 2>>standby.log'

PostgreSQL's documentation describes this warm standby feature very well, so it is worth reading if the topic interests you. This is a good example of how an open-source database can offer features that are usually found only in expensive enterprise databases. There is also a supported PostgreSQL Plus version available, which offers many additional features, including very good Oracle compatibility that makes it easy to migrate existing Oracle databases to a much more cost-effective Postgres database.

Towards higher availability with replication

In today's virtualized environments we often run into a challenging situation where a particular virtual machine is so important that its availability needs to be improved. In addition to HA, VMware's new vSphere environment introduced Fault Tolerance (FT), which ensures that a given virtual machine is always available. The problem is that this method can only protect a machine inside a single cluster, and customers often want machines protected to another datacenter or location. VMware has its own product for this called Site Recovery Manager, but it places certain requirements on the underlying infrastructure, such as mirroring at the storage array level (which naturally adds cost). It is an excellent solution for protecting large datacenter environments, and as this need has grown, several third-party products have appeared on the market. In this article I focus on Vizioncore's product called vReplicator (Quest has acquired Vizioncore, so the product is also known as Quest vReplicator), which can be bought either standalone or as part of the vEssentials bundle, which also includes, among other things, a product for backing up virtual machines.

What is vReplicator and what can it do?

Imagine a situation where you want to protect a virtual machine by replicating it from an existing ESX/ESXi host to another ESX/ESXi host located in a different city. One way to do this is, for example, to take a snapshot of the machine and copy everything to the other server. If the link between the datacenters is not a fast fibre connection, this process takes an enormous amount of time, even if the machine is only a few gigabytes (and high-availability machines are of course usually bigger, which only increases the time). With that technique the machine could well be copied, say, once a night, but what if you want to keep the copy in the other location as up to date as possible, for example only an hour old? In that case the technique simply doesn't work, because the snapshot plus transfer alone takes considerably longer than that.

There are several ways to approach the problem, and at this point the obvious question is how vReplicator solves it more efficiently. Vizioncore's vReplicator is an application installed into the ESX/ESXi environment that efficiently replicates a virtual machine's virtual disk to another location. On the first run it takes a full replica of the machine (which naturally takes time), and once that has completed, subsequent runs transfer only the changed data. This saves both network capacity and transfer time, which keeps RTO/RPO values as low as possible.

vReplicator is very easy and quick to install, and it does not require separate agents or clients on any ESX/ESXi host or on the virtual machines. Once vReplicator is installed, no other installations are needed, and all virtual machines in the environment can be brought under the service. The product can also replicate multiple virtual machines from several ESX/ESXi hosts to one centralized target, which is an excellent way to implement disaster recovery. In addition, it makes disaster recovery easier to adopt because the source and target environments do not need to run on identical platforms.

What does vReplicator look like?

Vizioncore's vReplicator is a clean Windows application that is very simple to set up and operate. After installation you define the ESX/ESXi servers found in the environment on the vReplicator server, after which replication is enabled simply by selecting a source and a target.

Below is a screenshot of the product's user interface with a replication job running:

How does vReplicator work in practice?

Above I described the product's functionality at a general level. The diagram below explains its operating principle in more detail:

As the diagram shows, the most important part of the replication is the vReplicator server, which has connections to both the source and target environments. The vReplicator server drives the vSphere environment's vCenter server and uses snapshot functionality for replication. Once a snapshot has been taken, the vReplicator server compares the differences between the source and target machines at the file level and automatically replicates the changed blocks to the target virtual machine's virtual disk.

Summary: does the product work as promised?

In short, I can definitely say it does. vReplicator was very easy to deploy, and it offers an excellent way to implement replication cost-effectively without mirrored storage arrays and expensive fibre links between datacenters. A free trial version is available, and I recommend installing it without reservation so you can see for yourself whether the product meets your company's needs or whether VMware's own Site Recovery Manager is a better fit. I will not compare the products against each other, however, because their starting points and intended use cases target quite different types of protection, and each clearly has its own place in the market.