Multi-Cloud Infrastructure: Misconceptions, Benefits, Best Practices

What does infrastructure as code look like when you are supporting multiple cloud providers?

Multi-cloud is a term that is making the rounds in companies small and large. Many companies that currently use a cloud provider are also thinking about a multi-cloud strategy: scenarios where they run services in more than one provider. Why? The reasons vary.

There is the classic: avoiding vendor lock-in. If you are a retail business running your infrastructure in AWS, you may want a plan for how to replicate that infrastructure to GCP or Azure. So if Amazon decides to eat into your space, you can switch to a cloud provider that isn’t a competitor.

Or there is the more expensive, less likely, and a bit out there reasoning. AWS goes down. Like, completely offline kind of down. Think the recent Facebook outage kind of down. In this scenario, you want a “hot-hot” failover to another cloud. Meaning your entire infrastructure is not only replicated in another provider, but it can be ready to accept traffic at a moment’s notice. A very, very, very hard problem to solve. Not to mention very expensive.

That said, there are reasons and use cases between these two extremes. A simple one is you want to offer your technology to clients that only use one cloud or another. Customers want to use your technology, via your platform as a service (PaaS). But they want to be able to say, “provision your technology in this specific cloud provider”.

This is a very real use case that bleeds into the multi-cloud conversation. When multi-cloud is thrown around, the first two use cases are often thought of. But this third use case is actually much more likely. With the right tools and processes, it’s a lot more practical as well. 

In this post, we will dig into this world. What does it look like when I want to provide my technology in more than one cloud provider? What does it mean when I need to provision and manage production infrastructure in more than one cloud provider?

Should you even consider Multi-Cloud Infrastructure?

It depends on the problem you are in the business of solving. If a major cloud provider dipping their toes into your industry is a major threat, maybe it makes sense. Or if complete fault tolerance is necessary for your business, replicating your infrastructure in a “hot-hot” fashion to multiple cloud providers might be of interest to you.

It all comes down to your business, the product you are offering, and the expectations of your users. Those three factors determine whether multi-cloud matters to you or not.

But, at a high level, what are the real advantages and disadvantages of a multi-cloud strategy?

Benefits

Pro 1: Support different use cases and customer demands

Customers like to have choices. Not so many choices that it’s overwhelming, but enough to feel they have flexibility. This is very prevalent if you are building any kind of platform or platform as a service offering. Your users will eventually ask if you can run your technology in AWS, GCP, or Azure.

Why? Maybe they view Amazon as a competitor and want to avoid directly or indirectly giving money to their competition via AWS. Perhaps they have credits in GCP that would make using your technology over there more cost-effective. It’s possible their entire engineering team is only familiar with Azure. So they don’t want to introduce a learning curve by adding another provider.

So, if you provide your technology in more than one cloud provider you can solve these problems for your customer.

Pro 2: Cloud providers iterate rapidly

Cloud providers are in constant competition with one another. They are in a steady state of developing new services and offerings. They are in the business of helping customers (you) solve problems more efficiently. Competition is good for you, a multi-cloud thinker.

You can iterate on your own product or offering to leverage the latest cloud development that improves your services. With multi-cloud, you leverage that development in whatever cloud provider introduces it first. Are you going to be able to introduce that new feature or performance boost in all cloud providers? No, but one out of three isn’t bad until the other providers catch up.

Disadvantages

Con 1: More cloud providers means a larger footprint

When you support a second cloud provider you roughly double your infrastructure footprint. Support a third one? Now you have tripled it. You get the idea. For each provider you support, you increase the footprint of your infrastructure. Yes, some things will overlap, so it may be a bit less. But the point remains true: more cloud providers means a lot more infrastructure. This means you need to provision, monitor, and maintain this footprint forever. We will talk more about how to combat this later.

Con 2: It’s expensive

Much like the con above, more infrastructure means more money. This depends on your architecture and the services you use in each cloud provider. But in general, it is safe to say that maintaining a product or offering across more than one cloud provider is going to be more expensive than a single cloud provider. It’s expensive to replicate your infrastructure from one provider to another. It’s also expensive to hire talent that has the skillsets to build and maintain infrastructure in the other cloud provider.

The crux of the story here is that there are unique business advantages to providing a multi-cloud offering. You may be able to win business that you otherwise would have been locked out of because you chose one provider or another. But it’s not free money and it’s a non-trivial amount of work. Running production infrastructure in more than one cloud provider doubles many things, including your infrastructure, the context your team has to carry, and the money you spend. This is a serious cost that should be weighed against the benefits.

Common Misconceptions

There are many misconceptions surrounding multi-cloud infrastructure. Some of them we have already touched on. For example, multi-cloud for extreme fault tolerance is not the same as supporting another cloud provider in your PaaS offering.

But there are a few more that are worth touching on here.

1. Infrastructure-as-code (IaC) makes multi-cloud easy

Infrastructure as code and tools around it, like Terraform, help make multi-cloud possible. But it’s wrong to think you can swap one cloud provider for another in your Terraform code. That’s not how any IaC tool works today.

A tool like Terraform or Pulumi represents all of your infrastructure in code. So you can clearly see what infrastructure supports your entire architecture. But those tools define resources specific to the cloud provider.

So to replicate your infrastructure into another cloud provider you must define all of that infrastructure as resources in that cloud provider. This includes all of your identity management, networking, compute, and storage.

An IaC tool gives you a head start because all your infrastructure is right there in code. But you still have to rewrite the code that is tied to AWS as the equivalent resources in GCP.
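
To make this concrete, here is a minimal Python sketch that is not tied to any real IaC library. It models one logical need, an object storage bucket, and shows that each provider still needs its own hand-written definition. The type names `aws_s3_bucket` and `google_storage_bucket` are the Terraform resource types for the two providers; `BucketSpec`, `render_bucket`, and the dictionary layout are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketSpec:
    """Provider-neutral description of what we need (hypothetical type)."""
    name: str
    location: str


def render_bucket(spec: BucketSpec, provider: str) -> dict:
    """Translate the neutral spec into a provider-specific resource.

    Each branch is written by hand. Nothing here is derived
    automatically from the other provider's definition.
    """
    if provider == "aws":
        # Assumption for this sketch: the region comes from provider-level
        # configuration, so only the bucket name is passed.
        return {"type": "aws_s3_bucket", "args": {"bucket": spec.name}}
    if provider == "gcp":
        # The Google resource uses different argument names and takes
        # the location on the bucket itself.
        return {
            "type": "google_storage_bucket",
            "args": {"name": spec.name, "location": spec.location},
        }
    # Any provider without a hand-written branch is simply unsupported.
    raise ValueError(f"no bucket definition written for provider {provider!r}")
```

Even for something as simple as a bucket, the resource type and argument names differ, and a third provider means writing a third branch.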

2. Cloud providers are all very similar

Conceptually, this is true. AWS and GCP are both cloud providers that maintain data centers and provide APIs to allow you to access a wide range of services to facilitate the development and running of your technology. It’s even true, generally speaking, that they offer similar services.

If you leverage services that have equivalents in other providers you avoid major roadblocks. But when it comes to the implementation of a multi-cloud strategy, similar is not as simple as it sounds. Even something as basic as the terminology from one provider to the next can vary greatly. This means the process of finding the equivalent services for porting your IaC from one provider to another can be a slog.

But it doesn’t stop at terminology. APIs, IaC resource definitions, authentication mechanisms, networking, and even storage can have differences. Differences that need resolution when it comes to moving your IaC setup to another provider.

Again, infrastructure as code gives you a head start, but it doesn’t solve the differences between cloud providers for you.
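
One way to keep the terminology slog manageable is to write the equivalents down once. The sketch below is a hypothetical lookup table and helper, not part of any tool. The function-as-a-service row uses the services named in this post; the other rows are common pairings, and a real table would need to record feature differences too.

```python
from typing import Dict, List, Tuple

# Rough service equivalents per capability (illustrative, not exhaustive).
EQUIVALENTS: Dict[str, Dict[str, str]] = {
    "functions": {"aws": "Lambda", "gcp": "Cloud Functions", "azure": "Azure Functions"},
    "object_storage": {"aws": "S3", "gcp": "Cloud Storage", "azure": "Blob Storage"},
    "kubernetes": {"aws": "EKS", "gcp": "GKE", "azure": "AKS"},
}


def porting_plan(
    capabilities: List[str], source: str, target: str
) -> Tuple[Dict[str, Tuple[str, str]], List[str]]:
    """Split capabilities into those with a known equivalent and those without."""
    mapped: Dict[str, Tuple[str, str]] = {}
    unmapped: List[str] = []
    for capability in capabilities:
        row = EQUIVALENTS.get(capability, {})
        if source in row and target in row:
            mapped[capability] = (row[source], row[target])
        else:
            # No known equivalent: this needs manual research before porting.
            unmapped.append(capability)
    return mapped, unmapped
```

Calling `porting_plan(["functions", "managed_queue"], "aws", "gcp")` would map `functions` to the Lambda and Cloud Functions pair and flag `managed_queue` as unmapped, because the table has no row for it.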

3. All clouds are equal

The idea of offering another cloud provider to your customer is enticing. It’s business you can win by porting over some work you have already done if you are using infrastructure as code. But it often misses one obvious fact.

Not all cloud providers are equal.

I don’t mean service for service kind of equal. That is true, any cloud provider may have a service that another doesn’t at any given moment. But bigger picture, they aren’t all equal in terms of their scale. Meaning the resiliency, availability, and security of the providers are not equal. Are they all better than running a data center yourself? Probably. But to say that GCP or Azure is more available than AWS is false.

This is a critical factor to consider when thinking about multi-cloud infrastructure. Your footprint doubles and so does your blast radius for all the things that can fail both in your control and outside of your control.

Best Practices when using Multi-Cloud Infrastructure

We know some of the pros, cons, and common misconceptions surrounding multi-cloud infrastructure. Now it’s worth chatting about some high-level best practices. These are practices that can be leveraged to make multi-cloud possible, easier to work with, implement, or all of the above.

1. Choose the right tool and set up processes

First, it’s all about choosing good tooling. The tools you choose to manage your infrastructure are going to make or break your multi-cloud adventure. We could even say that multi-cloud is impossible if you’re not using some kind of infrastructure as code.

It’s important to choose an IaC tool that supports multi-cloud. If you are all in on AWS and you represent all your infrastructure in CloudFormation, you are going to have a very difficult transition to multi-cloud. So choosing the right tools that keep the door open to multi-cloud is critical.

Beyond choosing good tools, you need processes that enforce their use. This means if you represent all infrastructure as code, don’t allow manual creation or modification. Developers shouldn’t be able to create or change infrastructure in the console. Changes made by hand cause anomalies in your production environment: anomalies that are not represented in code.
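
The anomalies described above are often called drift. As a rough illustration, the hypothetical `find_drift` helper below compares what the code declares with what actually exists. Both inputs are plain dictionaries keyed by a resource identifier; how you would obtain the "actual" side from a provider is left out, and real tooling does far more than this.

```python
from typing import Dict, List


def find_drift(declared: Dict[str, dict], actual: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare resources declared in code with what exists in the cloud."""
    declared_ids = set(declared)
    actual_ids = set(actual)
    return {
        # Exists in the cloud but not in code: likely created by hand.
        "unmanaged": sorted(actual_ids - declared_ids),
        # Declared in code but absent in the cloud: likely deleted by hand.
        "missing": sorted(declared_ids - actual_ids),
        # Present on both sides but with different settings: changed by hand.
        "modified": sorted(
            rid for rid in declared_ids & actual_ids if declared[rid] != actual[rid]
        ),
    }
```

Running a check like this per provider on a schedule is one way to notice console changes early, before they turn into surprises during a deploy.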

2. Modularize the core components in IaC

Second, modularize the core components of your architecture in your IaC. Say your technology runs in a Kubernetes cluster provided by AWS. You configure `cert-manager` to be deployed to that cluster so you can issue TLS certificates for your deployments. With Terraform you should have a module that represents all the resources for the Kubernetes cluster. Then a second module for `cert-manager` that gets installed into the cluster.

If you have a clear module system in your IaC, multi-cloud becomes a bit easier. Why? Because you know what modules need to be recreated to support another cloud provider. When you want to run this workload in GCP, you need to create an equivalent Kubernetes module. But you should be able to plug in the `cert-manager` module with no need to rework it, as it just needs a Kubernetes cluster.
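
The split between modules you must recreate and modules you can reuse can be sketched in a few lines of Python. The `Module` class and the module names below are hypothetical stand-ins for the Kubernetes cluster and `cert-manager` modules from the example above, not real Terraform constructs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Module:
    """A unit of infrastructure code (hypothetical model of an IaC module)."""
    name: str
    provider_specific: bool
    requires: Tuple[str, ...] = ()


MODULES = [
    # Tied to one provider's managed Kubernetes service.
    Module("kubernetes_cluster", provider_specific=True),
    # Only needs *a* Kubernetes cluster, whichever provider runs it.
    Module("cert_manager", provider_specific=False, requires=("kubernetes_cluster",)),
]


def modules_to_port(modules: List[Module]) -> List[str]:
    """Modules that need an equivalent written for each new provider."""
    return [m.name for m in modules if m.provider_specific]


def reusable_modules(modules: List[Module]) -> List[str]:
    """Modules that can be plugged in unchanged once their requirements exist."""
    return [m.name for m in modules if not m.provider_specific]
```

With this inventory, supporting a new provider starts from a short, explicit list (`modules_to_port(MODULES)`) rather than from a read-through of the entire codebase.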

3. Make architecture decisions that facilitate multi-cloud

Finally, make architecture decisions that facilitate multi-cloud. This is rather obvious when you stop and think about it but it’s often overlooked. If your architecture relies on a specific service in a specific cloud provider that operates in a particular way, it may not be possible to use another cloud provider.

The core cloud providers all have similar services. For example, AWS has Lambda, GCP has Cloud Functions, and Azure has Azure Functions. So if your architecture needs some function as a service offering, you should be alright at a high level. But if you need specific features or performance from a given cloud provider’s service, be careful: those details are not always similar and can be very different.

So as a best practice you should aim to make your architectures as generic as possible. This makes it easier to port them across providers. Containerization is a great tool for making this possible.

Tools for Multi-Cloud Infrastructure

Here are some good tools and processes for supporting infrastructure in multiple clouds:

  • Terraform, Pulumi, etc. for declaring your multi-cloud infrastructure in code. Choosing a cloud provider-specific tool is still better than no IaC tool, but it doesn’t facilitate multi-cloud.
  • Continuous integration and deployment for your IaC. Spacelift is a fantastic tool for this. It allows you to automate, audit, secure, and continuously deliver your infrastructure.
  • Containers provide a cross-cloud packaging and deployment mechanism. You can use Docker or one of the other many tools out there for building containers like `buildah` or `podman`. Containers unlock the ability to have a single way to package your application and deploy it to any given cloud provider.
  • Okta or some kind of central identity solution. No one should be able to access cloud provider consoles via provider-specific logins. Instead, it’s better if they come in through some kind of SSO provider like Okta. This makes it easier to map your developers to roles or authentication mechanisms in different cloud providers.
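
The mapping from SSO groups to provider roles mentioned in the last bullet can be kept as one small table. The sketch below is hypothetical: the group names are invented, and the role names are examples of built-in roles and policies in each provider, chosen only for illustration. A real setup would be configured through the identity provider and each cloud’s own federation mechanism.

```python
from typing import Dict, List

# SSO group -> role per provider (illustrative values, adjust to your own setup).
ROLE_MAP: Dict[str, Dict[str, str]] = {
    "engineering-readonly": {
        "aws": "ReadOnlyAccess",
        "gcp": "roles/viewer",
        "azure": "Reader",
    },
    "platform-admins": {
        "aws": "AdministratorAccess",
        "gcp": "roles/owner",
        "azure": "Owner",
    },
}


def roles_for(groups: List[str], provider: str) -> List[str]:
    """Resolve a user's SSO groups to roles in one provider.

    Groups without an entry are ignored, so unknown groups grant nothing.
    """
    return sorted({ROLE_MAP[g][provider] for g in groups if g in ROLE_MAP})
```

Keeping this table in one place means adding a provider is a matter of adding one more key per group, instead of auditing access separately in each console.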

Conclusion

Multi-cloud is not a small project. It requires careful planning from the beginning of your product or technology. If a multi-cloud strategy is something you hope to tack on two years down the road, it likely won’t work.

It requires avoiding architectures, tools, and processes that lock you into a specific cloud provider. So if multi-cloud infrastructure is something you may need in the future, keep it in mind when you make decisions around all these things.

I hope this post has helped you learn some of the reasons why multi-cloud is gaining momentum. There are many pros and cons but there are also many misconceptions surrounding it. With all these things in mind, we can establish some high-level best practices to make working across cloud providers simpler. Additionally, we can leverage great tooling to make the implementation of multi-cloud a bit easier.
