Moving to cloud-native: Top 5 problems with persistent data storage (and how to fix them!)

Let’s face it: We’re living in a cloud-native world. These days, you can’t read a blog or attend a conference without being exposed to all the benefits of going cloud-native. Yet amidst all of the excitement about the cloud-native landscape, it can be easy to overlook the cloud data storage challenges that arise when migrating a slew of monolithic legacy systems to the cloud.

So let’s start with the obvious. What exactly does “cloud-native” mean? Well, it depends (as do many things in life) on who’s talking about it. We feel the most real and relevant definition of cloud-native comes from the Cloud Native Computing Foundation (CNCF):

Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.
These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.
The Cloud Native Computing Foundation seeks to drive adoption of this paradigm by fostering and sustaining an ecosystem of open source, vendor-neutral projects. We democratize state-of-the-art patterns to make these innovations accessible for everyone.

Now that we’ve defined the cloud-native landscape, how do we go about getting there—and what does this mean for cloud data storage? Many approach going cloud-native as an isolated process, assuming that they’ll set up the infrastructure and the platform and solve migration and data issues afterward. Big mistake.

The most drama-free way to move data from legacy systems to a cloud-native storage environment is by including the migration process as part of your overall cloud strategy and building issue resolution around data into the process from Day One. To help troubleshoot the process, let’s take a look at five of the most common persistent data/cloud-native challenges, along with strategies for overcoming them.

First things first: About persistent data storage

Persistent data is data that has to outlive the process that created it, such as database records, user uploads, and application state. Persistent (or non-volatile) storage is a data storage device that continues to hold data after the power source for that device has been disconnected or turned off.

Problem #1: Storage

Containers are the modern trend for building infrastructure across the IT industry, and Docker is the leading company that provides container infrastructure and tools. Docker containers are API-driven, which means that you can integrate them into your platforms and infrastructure as needed. There are significant benefits to be gained with containers, but there are also some gotchas that come with their use. A big one? Storage.

One common challenge with many cloud-native technologies is persistent data storage. Containers, serverless functions, and apps deployed using an immutable infrastructure model don’t typically offer a way to store data permanently because all internal data is destroyed when the application shuts down.

The Fix: Decoupling storage
Solving this challenge requires rethinking your approach to data storage by decoupling it from apps and host environments. Instead of storing data within the app environment, cloud-native workflows store it externally and offer the data as a service. Workloads that need to access the data then simply connect to it, just as they would connect to any other service.
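As a rough illustration of the decoupling idea, here is a minimal Python sketch. The names KeyValueStore, InMemoryStore, and VisitCounter are hypothetical and invented for this example; the in-memory class only stands in for a real external storage service so that the snippet runs on its own.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Interface of the external storage service (hypothetical)."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, value: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a real external service, used only for this demo."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


class VisitCounter:
    """A stateless workload: all durable state lives behind `store`."""

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store

    def record_visit(self, page: str) -> int:
        raw = self._store.get(page)
        count = int(raw.decode()) + 1 if raw is not None else 1
        self._store.put(page, str(count).encode())
        return count


store = InMemoryStore()

first = VisitCounter(store)
first.record_visit("/home")
del first  # the first "container" goes away

# A replacement instance connects to the same service and picks up the state.
second = VisitCounter(store)
count_after_restart = second.record_visit("/home")
print(count_after_restart)
```

Because the workload keeps nothing of its own, any instance can be destroyed and replaced without losing data, as long as the service behind the interface is durable.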

Problem #2: Container mobility

Containers are meant to be small and lightweight, quickly spinning up and down and moving around the cluster. Your data, on the other hand, is large and difficult to move around.

The Fix: Keep data agile
Whichever orchestrator you use, your data has to follow your container wherever it moves in the cluster. Don’t pin containers to specific hosts just because the data lives there, because then you lose the mobility and the portability of containers.
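One way this can look in practice, assuming Kubernetes as the orchestrator, is a pod that asks for storage through a PersistentVolumeClaim rather than naming a host. The sketch below builds the two manifests as Python dictionaries. The resource names, the image, and the "network-ssd" storage class are made-up placeholders, and the data only follows the pod if the storage class behind the claim is network-attached.

```python
import json


def pvc_manifest(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim: a request for storage, not a host path."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,  # placeholder class name
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_manifest(name: str, image: str, claim: str, mount_path: str) -> dict:
    """Build a Pod that mounts the claim by name."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # No nodeName or nodeSelector here: the scheduler stays free
            # to place the pod on any suitable node.
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim}}
            ],
        },
    }


claim = pvc_manifest("orders-data", "10Gi", "network-ssd")
pod = pod_manifest("orders-db", "postgres:16", "orders-data", "/var/lib/postgresql/data")

print(json.dumps(claim, indent=2))
print(json.dumps(pod, indent=2))
```

The pod refers to its data only through the claim name, so nothing in its definition ties it to a particular machine.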

Problem #3: The people problem

Human error is inevitable. If you rely on an operator to run through a playbook manually, you have a much higher chance of something going wrong.

The Fix: Integration.
For storage, ensure that everything is API-driven, and as integrated with Docker and Kubernetes as possible.

Problem #4: Containers aren’t designed to be stateful

Docker containers are made up of a layered image and a writable ‘container layer’. When a container is deleted, the writable layer is removed, leaving only the underlying image layers. This is good because sharing layers makes images smaller and more agile, but it’s also bad because without the writable layer, the app is useless!
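The layering behavior can be modeled in a few lines of Python. This is a toy model only, not how Docker's storage drivers are implemented: plain dictionaries play the role of image layers, and collections.ChainMap plays the role of the stacked file system. The file paths and the start_container helper are invented for the illustration.

```python
from collections import ChainMap

# Read-only image layers, shared by every container started from the image.
base_layer = {"/etc/os-release": "debian"}
app_layer = {"/app/server.py": "print('hello')"}


def start_container(*image_layers: dict) -> ChainMap:
    """Stack a fresh, empty writable layer on top of the image layers."""
    # ChainMap looks keys up from the first map to the last, and sends
    # every write to the first map, so the writable layer goes first.
    return ChainMap({}, *image_layers)


container = start_container(app_layer, base_layer)
container["/data/orders.db"] = "order #1"   # lands in the writable layer only
container["/etc/os-release"] = "patched"    # shadows the image file

written_paths = sorted(container.maps[0])   # what the writable layer holds
del container                               # deleting the container discards it

# A replacement container starts from the same image with an empty top layer.
replacement = start_container(app_layer, base_layer)
data_survived = "/data/orders.db" in replacement

print(written_paths)
print(data_survived)
print(replacement["/etc/os-release"])
```

The image layers come through untouched, but everything the first container wrote is gone, which is exactly the problem for an app that keeps state.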

The Fix: Local volumes.
In a case like this, we mount a directory from the host into the container, which can then read from and write to it. Because the data goes straight to the host’s local disk, access is fast. Because that volume is tied to a specific host, however, your data is inaccessible if the host goes down. Be very careful with consistency when more than one container writes to the same volume.
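To make the consistency warning concrete, here is one possible approach, sketched in Python: each writer takes an exclusive advisory lock on the shared file before appending to it. This assumes Linux or another Unix-like host, that all writers run on the same host and share the same mounted directory, and that every writer cooperates by taking the lock. The append_record helper and the file names are hypothetical, and a temporary directory stands in for the mounted host directory.

```python
import fcntl
import os
import tempfile


def append_record(path: str, record: str) -> None:
    """Append one line to a shared file while holding an exclusive lock."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks while another writer holds the lock
        try:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())       # push the write to disk before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


# Demo: a temporary directory stands in for the mounted host directory.
shared_dir = tempfile.mkdtemp()
log_path = os.path.join(shared_dir, "events.log")

append_record(log_path, "container-a: started")
append_record(log_path, "container-b: started")

with open(log_path, encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)
```

Advisory locks only protect against writers that also use them, so this is a convention every container sharing the volume has to follow, not a guarantee enforced by the volume itself.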

Problem #5: Quality of service controls

Because local volumes come with no QoS controls, you’ll end up with the “noisy neighbor” problem if some containers take more than their fair share of the IOPS.

The Fix: Volume plugins.
To resolve these limitations, Docker came up with a new way to integrate external storage: Volume plugins. Volume plugins are a virtualization layer that runs on top of any commodity or cloud storage. From the point of view of the app container, volumes are accessible exactly the same way, across the entire cluster, and the storage is always highly available. They are designed to scale horizontally by adding more nodes: new nodes contribute their storage to the storage pool. If nodes don’t have storage themselves, they can easily access storage on other nodes.
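As a small sketch of how a plugin-backed volume might be requested, the Python below assembles the Docker CLI commands rather than running them. The flags used (docker volume create with --driver and --opt, docker run with --volume) are standard Docker CLI options, but the plugin name "example/storage-plugin", the "size" option, the volume name, and the image are placeholders; real option names depend entirely on the plugin you install.

```python
import shlex
from typing import Dict, List


def volume_create_cmd(name: str, driver: str, opts: Dict[str, str]) -> List[str]:
    """Build `docker volume create` for a volume served by a plugin driver."""
    cmd = ["docker", "volume", "create", "--driver", driver]
    for key, value in opts.items():
        cmd += ["--opt", f"{key}={value}"]  # plugin-specific options
    return cmd + [name]


def run_with_volume_cmd(image: str, volume: str, mount_path: str) -> List[str]:
    """Build `docker run` that mounts the named volume into the container."""
    return ["docker", "run", "--detach", "--volume", f"{volume}:{mount_path}", image]


create = volume_create_cmd(
    "orders-data",
    driver="example/storage-plugin",  # placeholder plugin name
    opts={"size": "20GB"},            # placeholder, plugin-specific option
)
run = run_with_volume_cmd("postgres:16", "orders-data", "/var/lib/postgresql/data")

print(shlex.join(create))
print(shlex.join(run))
```

The container only ever sees a volume name and a mount path. Which nodes actually hold the bytes is the plugin's concern, which is what lets the same command work anywhere in the cluster.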

Your move from legacy to cloud-native may have a few surprises along the way, but the integrity of your persistent data storage doesn’t need to be one of them. With a little advice and planning upfront, you can ensure that your persistent data arrives at its destination intact and doesn’t cause unnecessary downtime for your business.
