Not so long ago, Forbes published a very interesting article by Lance Smith, CEO of Primary Data, titled "Migrations Are Dead, Long Live Migrations". It got me thinking about the whole data migration business. For over eight years at NetApp, I spent the majority of my time designing, scoping, and calculating levels of effort for data migration projects of varying complexity.
Back then, data migrations were "just" for file shares, MS Exchange, MS SQL, Oracle DB, and the occasional UNIX server. As time went on, the software capabilities of enterprise storage arrays became more sophisticated, applications became more storage-aware, and dataset sizes grew exponentially. As a result, moving data from one place to another became more challenging as well.
While the complexity of data migration is well understood, what is less straightforward is understanding how to successfully migrate data without losing your mind.
#1 Data migration challenge: Size and Complexity
Data migration has changed dramatically with the explosive growth of data generation: more data was created in the last two years than in the previous 5,000 years of human history. We now have datasets so large that a migration may include physically transporting data from one location to another. Did you know that Amazon Web Services now offers an actual truck (AWS Snowmobile) to migrate data for customers? According to the website, it is a "data transfer service used to move extremely large amounts of data to AWS". Google has a similar approach called Google Transfer Appliance.
According to Google, "if you have a typical network bandwidth of 100 Mbps, a petabyte of data takes about three years to upload". Hence their suggestion of a Transfer Appliance, which makes that data accessible within weeks.
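Google's figure is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch in Python (the function name is mine, and it assumes a perfect line rate with no protocol overhead):

```python
def transfer_days(data_bytes, bandwidth_mbps):
    """Days to push a dataset over a link at full line rate (no overhead)."""
    bits = data_bytes * 8                      # payload in bits
    seconds = bits / (bandwidth_mbps * 1e6)    # ideal wire time
    return seconds / 86400

# One petabyte over a 100 Mbps link: roughly 926 days, about 2.5 years
# at perfect line rate; with real-world overhead, "about three years" holds.
print(round(transfer_days(1e15, 100)))
```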
Regardless of how you move data, it’s likely you’ll be responsible for sizeable datasets in any migration.
Keep on trucking? Amazon offers a truck to migrate data for customers with large datasets.
#2 Data migration challenge: Cutover Duration & Scheduling
Cutover events, where users can't access the data and the applications are taken out of service, take time. The duration of a cutover event is roughly determined by how much data must still be synchronized (the changes accumulated since the baseline copy) and the available network bandwidth, following this math:

Cutover Duration = Data Changed Since Baseline / Network Throughput
For example, syncing one terabyte of data over a 1 Gbps network will take you about three hours in practice. The larger the dataset, the longer the migration takes and the harder it is to ensure that the baseline is copied over in time.
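That three-hour figure can be sketched in a few lines. The 80 percent link-efficiency factor below is an assumption, not a measurement; real throughput depends on protocol overhead and contention:

```python
def cutover_hours(data_gib, link_gbps, efficiency=0.8):
    """Hours needed to sync `data_gib` GiB over a `link_gbps` Gbps link,
    derated by an assumed protocol/contention efficiency factor."""
    bits = data_gib * 2**30 * 8
    return bits / (link_gbps * 1e9 * efficiency) / 3600

# 1 TiB (1024 GiB) over 1 Gbps at 80% efficiency: about 3 hours
print(round(cutover_hours(1024, 1), 1))
```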
Then there's the timing of the cutover events themselves. Since the servers can't be pointed at the target dataset until after the cutover event, cutovers are often scheduled on weekends and holidays, the least convenient times for storage administrators, backup managers, and systems managers to be in the office, away from their families and friends.
#3 Data migration challenge: Costs
A vendor data migration proposal typically includes something like this:
Dear Valued Partner,
Our consulting services team has extensive, global experience with large-scale data migrations utilizing a rigorous ITIL approach with proven delivery methodologies in transforming core storage services.
The best way to achieve cost and service delivery objectives is to rapidly migrate to your new infrastructure in order to realize immediate value. Our proposed solution includes multiple resource pools, necessary hardware, and software tools with overarching governance and program management to provide flexibility, quality and cost benefits.
Simply translated: This is going to be a major headache, and it’s going to cost you a lot of money.
And that's just the estimate. Issues can quickly arise that make a data migration take longer and cost more. For example, if the cutover duration exceeds the cutover window approved by the business, you now need multiple cutover events, complicating things further and introducing additional risk. That means coordinating with business units and worrying about application dependencies. Believe me when I say that no one documents all dependencies.
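The window arithmetic is simple but unforgiving: once the sync time exceeds a single approved window, the event count (and the coordination effort) multiplies. A quick sketch, with illustrative names and numbers:

```python
import math

def cutover_events_needed(total_sync_hours, window_hours):
    """Number of approved outage windows required to fit the full sync."""
    return math.ceil(total_sync_hours / window_hours)

# A 10-hour sync against a 4-hour approved window means three separate
# cutover events: three rounds of scheduling, coordination, and risk.
print(cutover_events_needed(10, 4))
```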
Addressing the challenges for a successful data migration
While data migration comes with a set of challenges, there are also ways to overcome them. Below is a step-by-step guide for evaluating your needs and tools to ensure a successful migration.
Step 1: Ask the right questions
Before starting any data migration project, it's wise to ask yourself, your team, and your vendor some important questions. You don't want surprises later, so here are a few discovery questions to answer before kicking off any migration.
Are you changing your storage vendor?
If you are staying with the same storage vendor and changing only the platform, most good vendors have their own data migration tools and utilities to move the data with little risk and minimal disruption.
If you are switching vendors, the story becomes more complex. Your native tools will typically not help unless you’re using a unique suite of products (more on that below).
What kind of data is being moved?
Is it unstructured data or a database? Block or file? SAN or NAS? The tools you use will differ depending on the data type.
Are your hosts physical or virtual?
If your hosts are virtual, you can use tools such as Storage vMotion from VMware, which allows live migration of a running virtual machine (VM) from one storage system to another. There's no downtime for the VM and no service disruption for end users, and the migration maintains data integrity.
How is the data backed up?
Or is it backed up? How long does a backup usually take? What is the daily rate of change?
With many migration tools, you can copy the baseline of the dataset without an outage and take the servers down only for the cutover event.
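This is why the daily rate of change matters: with an online baseline copy, each incremental pass only has to re-copy what changed while the previous pass ran, so the outstanding delta shrinks as long as you can copy faster than the data changes. A simplified model (all names and numbers are illustrative):

```python
def precopy_passes(dataset_gib, change_gib_per_hour, copy_gib_per_hour, cutover_target_gib):
    """Simulate iterative pre-copy: each pass re-copies the data that changed
    during the previous pass. Returns (passes, final outstanding delta in GiB)."""
    remaining = dataset_gib
    passes = 0
    while remaining > cutover_target_gib and passes < 50:
        hours = remaining / copy_gib_per_hour       # time this pass takes
        remaining = change_gib_per_hour * hours     # churn accumulated meanwhile
        passes += 1
    return passes, remaining

# 10 TiB dataset, 5 GiB/h of churn, 100 GiB/h of copy throughput:
# three passes shrink the outstanding delta to ~1.3 GiB for the cutover.
print(precopy_passes(10240, 5, 100, 10))
```

If the change rate outruns the copy rate, the loop never converges, which is exactly the situation where you need a longer outage window or a faster link.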
Step 2: Use the right migration tool for the job
There is no way to cover every migration tool available, but below are a few common ones to know about.
As discussed above, vMotion is a great tool that allows you to change compute and storage for a virtual machine without ever shutting it down. It’s a licensed product and is included in an enterprise level license, not in essentials. If you don’t have the enterprise license, the guest OS needs to be shut down for vMotion. Bulk migration is available as well. It takes time, but avoiding an outage makes it attractive.
File Data Migrations (NAS)
Remember the discovery question about block or file and SAN or NAS? If the answer is NAS or file, here are few common migration tools.
Old but tried and true, XCOPY is a command on PC DOS, MS-DOS, OS/2, Microsoft Windows, and related operating systems for copying multiple files or entire directory trees from one directory to another. For the most part, it's single-threaded and slow.
rsync is a software application for Unix that synchronizes files and directories from one location to another while minimizing data transfer using delta encoding. It works well most of the time.
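In practice, rsync is usually driven from a script so the pre-cutover syncs are repeatable. A minimal sketch of wrapping it from Python; the paths are hypothetical, and the flag set shown is one common choice rather than the only one:

```python
import subprocess

def build_rsync_cmd(src, dest, dry_run=False):
    """Build an rsync command for a pre-cutover sync.
    -a  archive mode: preserves permissions, timestamps, symlinks
    -z  compress data in transit
    --delete  remove files on the target that no longer exist on the source"""
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        cmd.append("--dry-run")    # report what would change without copying
    return cmd + [src, dest]

# Example with hypothetical paths; uncomment to actually run:
# subprocess.run(build_rsync_cmd("/data/share/", "target:/data/share/", dry_run=True), check=True)
print(build_rsync_cmd("/src/", "/dst/"))
```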
Robocopy, or "Robust File Copy", is a command-line directory and file replication command that functionally replaces XCOPY with more options. It has been available as part of the Windows Resource Kit since Windows NT 4.0 and became a standard feature in Windows Vista and Windows Server 2008. One of its limitations is that it can't copy open files, which complicates the planning and coordination process.
This commercial tool has changed hands multiple times and was most recently owned by Dell and then Quest. It isn't free and is licensed based on the number of hosts and the duration of the migration. For example, if you are moving from a four-node cluster to a high-availability storage array with two controllers, you will have to buy six licenses. It's a great tool that takes care of permissions and allows you to consolidate or split directories.
SAN Data Migration Tools
If you have a SAN/block storage device, you need to migrate the whole LUN (the disk presented to the application server). The tools are very different because the host owns the file system, so you need to copy it over block for block.
Built-in Application Tools
Using the built-in application tools is always an option. They are slow, but they work and maintain application consistency. Not a bad option for smaller environments.
Restore to a new SAN
You can always take a full backup and restore it to a new SAN. It works most of the time, but it can be slow.
Volume Management Software
If volume management software is in use (e.g., Veritas VxVM or Linux LVM), it can be used to mirror datasets over the WAN. The Vicom Systems appliance and software is one option for SAN migrations. It works really well, but there is a cost. There's still a cutover, but it tends to be shorter. They now offer both offline and online options.
Storage vendors also have data replication tools that can be used for migrations; SnapMirror™ from NetApp is a good example.
Step 3: Consider a different approach to data migration - use a time machine with TimeOS™ from Reduxio
Reduxio is a leader in the SAN storage arena, offering a suite of tools in its TimeOS™ software that help you migrate data instantly, without the need for a cutover event.
NoMigrate™ is a migration vehicle that lets you migrate data from another SAN to Reduxio. You can "drag and drop" a LUN from another storage array and serve production data instantly. The data migration happens in the background, and there is no need to wait for the copy to complete.
NoRestore® offers the ability to restore data instantly. Let's say you have your Reduxio HX550 SAN flash storage array happily humming in production, allowing you to create an offsite backup repository. Think of it as a bucket that backs up all of your primary data every five minutes. Your target can be another Reduxio HX550, any legacy iSCSI storage array, or an S3 bucket in the cloud.
Now, if disaster hits your primary data center, you can bring a new Reduxio HX550, “drag and drop” from the backup repository and serve the data instantly. Yes, instantly, while it’s still migrating. No need to wait for restore to finish.
BackDating™ acts as a time machine for your data, eliminating snapshots and letting you recover your data to any second in the past. Clone any volume to any point in time in the history of the system, for data recovery or application-testing purposes. Hundreds of our customers use this capability every day: some to recover from ransomware or virus attacks, others for test and QA with instant, zero-space cloning.
Hopefully, this information can guide you the next time you are planning a data migration project. We believe data storage is unnecessarily complex and that data migration projects are tedious, requiring lots of time and energy. Why not try a different approach?
Want to learn more?
Connect with our team to learn more about how you can instantly migrate data.