Architects are buzzword slayers: when we hear buzzwords, we are compelled to translate them into one or more meaningful decisions with clearly understood trade-offs. One decision that just about every large enterprise is grappling with these days is whether or not to use multiple clouds. Using decision models and structured thinking helps us tone down the buzz and debunk some common multi-cloud myths along the way.
The best part of my job is meeting smart people. Not only do they challenge me to sharpen my thinking, but they usually know more about their business than I do. So, rather than tell them what to do, I prefer to share mental models that allow them to make a better decision.
I have long asserted that architecture is defined by meaningful decisions. This doesn’t imply that architects need to make all those decisions. Quite the opposite: they help others make better decisions. Or, as I opined on Twitter:
Architects aren’t the smartest people in the room. Their primary job is to make everyone else smarter, for example, by getting folks to see more dimensions.
Decision making is a fascinating and broad topic and unsurprisingly has inspired several past blog posts, e.g. when I cautioned that you might be unaware of making major decisions and wondered whether IT’s biggest decisions are its worst.
Start with “Why”
Architects tend to ask a lot of questions, and the most useful question of all is “why”. It is the basis for the famous “five whys” method of root cause analysis (more on the benefits of asking questions in the chapter Question Everything in The Software Architect Elevator). So, when the discussion comes to multi-cloud, it’s only apt to ask why you’d want to consider that option. Asking “why” doesn’t mean questioning the customer’s initiative. Rather, you can only help them find a better solution if you understand the problem they are looking to solve. Too often the problem is portrayed as “we need to implement this solution”. Bringing architecture decision models won’t do much good after the decision has already been made and folks are merely looking for justification.
My starting point for multi-cloud discussions is a decision model I shared several years ago:
The model provides a neutral vocabulary and helps unearth a variety of motivations:
- Using the best suited (or lowest cost) cloud service for a given problem (Segmented)
- Providing freedom of choice to business units (Choice)—it’s one way of providing options; we’ll get to that soon
- Increasing application availability (Parallel)
- Reducing switching costs, which can be used to increase negotiation power with a credible threat to “pack up and leave” (Portable)
Frequently, there are additional motivations:
- Having inherited multiple clouds due to acquisitions (Arbitrary)
- Balancing IT spend across vendors (more likely than you might think)
- Protecting against vendor default or product discontinuation
- Fulfilling regulatory guidelines or customer expectations
- Assuming cloud platforms will still drastically evolve and it’s too early to make a pick
- The fear of betting on the wrong horse, or being blamed for it
The list goes on. All of those have come up in customer conversations, and although they are worthwhile motivations, they all entail significant trade-offs (my 9 poor ways to select a cloud provider might provide useful background reading). When a lot is at stake (cloud vendor decisions can reach into the double-digit millions) and the options space is nuanced, architectural thinking helps organizations see both the cloud trees and the multi-cloud forest.
Architects sell options
One of the best insights into architects and decisions stems from Martin Fowler’s classic article Who needs an architect?, published in IEEE Software Magazine back in 2003:
One of an architect’s most important tasks is to eliminate irreversibility in software designs.
So, architects aren’t there to make all decisions up front. Rather, they are there to allow us to change our mind later (that’s also a key reason why Agile and Architecture go together very well).
Martin’s article inspired me to draw the analogy between doing architecture and selling options (which evolved into a chapter in The Software Architect Elevator). The “make all the big decisions up front” trap can be avoided in two ways: by making decisions that are easy to undo (“two-way doors” in Amazon parlance) or by deferring decisions until we know more about the problem. The financial world has a structured instrument for the latter: options.
An option is the right, but not the obligation, to execute a financial transaction at fixed parameters in the future.
The magic of options is that they make time travel possible: buying a stock for $100 today is a difficult decision to make. Who knows what the stock will trade for in one year? If you can defer the buying decision into the future, it becomes trivial: if the stock trades for more than $100 in one year, buying it for $100 at that time is a sure deal. The magic here is that you deferred the decision at fixed parameters until you know more.
My favorite IT example for architecture options is sizing infrastructure. Who knows how much hardware will be needed to run an application? In many cases, we don’t even know how many users we will serve or how fast the data will grow over time. Architects can provide a nifty option to defer that decision until it becomes trivial: it’s called scale-out architecture and elastic infrastructure (yup, exactly how cloud platforms provide it). With that option, you can defer the sizing decision and add or remove hardware as the need arises.
It’s intuitive that these options have value, but you don’t need to rely on intuition. Fischer Black and Myron Scholes worked out how to calculate the value of (European style) financial options, work that earned Scholes and Robert Merton the 1997 Nobel prize in economics (Black had died in 1995):
The value of the option depends on several parameters: the strike price (K, how much you will need to pay to exercise your option), the time frame (T - t), the volatility (σ), the risk-free interest rate (r), and the current stock price (S). We’ll look at the impact that some of the parameters have later.
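For reference, the standard Black-Scholes value of a European call option can be written with the symbols named above. The names C (the option value) and N (the standard normal cumulative distribution function) are notation assumed for this sketch.

```latex
C(S, t) = S\,N(d_1) - K\,e^{-r(T - t)}\,N(d_2)

d_1 = \frac{\ln(S/K) + \left(r + \tfrac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}},
\qquad
d_2 = d_1 - \sigma\sqrt{T - t}
```

The second term is the discounted strike price: it is what you pay when you exercise the option, which is why it enters with a minus sign.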
Luckily, you won’t need a Nobel prize to benefit from this proven model to make better and more nuanced IT decisions. It also shows that architects offering up options brings proven value—it’s a great time to be an architect!
Cloud options have value
Let’s assume you or your customers are eyeing the option to be able to switch cloud providers sometime in the future. As cited above, you could have multiple motivations for this option, including commercial or operational aspects. Falling back onto the financial options theory we just examined, we can clearly state:
The option to switch providers in the future has a positive value.
It’s no secret that I work for one vendor, and although I do not earn a sales commission, my salary can only be paid if customers use our service. Arguing against multi-cloud when you’re the preferred vendor and arguing for it when you’re not is unlikely to build trust with your customers. Sharing established decision models does.
A well-established decision model builds a common baseline for customer conversations.
Falling back to a mathematical decision model like options theory builds a common baseline for customer conversations, thereby building trust and defusing a potentially controversial topic.
“Lock-in”: Switching Costs ✕ Likelihood
It’s rare to have a conversation about multi-cloud without the topic of “lock-in” coming up. Being drawn to controversial topics, I warned about getting locked up into avoiding lock-in several years ago (a revised version of that chapter is featured in my book Cloud Strategy). When customers speak about lock-in, they usually refer to the cost of switching providers, as my colleague Mark Schwartz aptly pointed out. In our options model, the cost of making a switch is the “strike price”, i.e. the price to pay when exercising the option to switch providers. Even if math wasn’t your favorite topic in school, you’ll be able to derive from the Nobel-studded formula that a lower strike price K increases the value of the option. So, cloud options have value and reducing their strike price (switching costs) is even better.
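The effect of the strike price can be checked numerically. The following is a minimal sketch of the standard Black-Scholes call formula. The function names and all input numbers are illustrative assumptions, not figures from a real cloud decision.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_value(S: float, K: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes value of a European call; tau is the time to expiry (T - t) in years."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)


# Illustrative numbers only: same asset and horizon, three different strike prices.
values_by_strike = {
    K: call_value(S=100.0, K=K, r=0.03, sigma=0.3, tau=1.0)
    for K in (80.0, 100.0, 120.0)
}
for K, value in values_by_strike.items():
    print(f"strike {K:6.1f} -> option value {value:6.2f}")
```

With everything else held constant, the printed option value shrinks as the strike price grows, which is the formal version of "lower switching costs make the option more valuable".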
Naturally, the switching cost is a potential cost: you may end up switching or you might not. Therefore, the “lock-in” to be considered is the product of the potential switching cost multiplied by the likelihood that it’ll occur. This gives you two levers to reduce “lock-in”:
You have two levers to reduce “lock-in”: reducing switching cost or reducing the likelihood of having to switch.
For example, you could reduce the likelihood of a switch by selecting a well-established vendor who is evolving their platform based on customer needs or by negotiating a particularly good deal.
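The two levers can be made concrete with a few lines of arithmetic. This is a minimal sketch with hypothetical dollar figures and probabilities; the function name `expected_lock_in` is an assumption for illustration.

```python
def expected_lock_in(switching_cost: float, likelihood: float) -> float:
    """Potential switching cost weighted by the likelihood of having to switch."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be a probability between 0 and 1")
    return switching_cost * likelihood


# Hypothetical figures for illustration only.
baseline = expected_lock_in(switching_cost=5_000_000, likelihood=0.20)
cheaper_switch = expected_lock_in(switching_cost=2_500_000, likelihood=0.20)  # lever 1
less_likely = expected_lock_in(switching_cost=5_000_000, likelihood=0.10)     # lever 2

print(f"baseline:           {baseline:>12,.0f}")
print(f"halve the cost:     {cheaper_switch:>12,.0f}")
print(f"halve the odds:     {less_likely:>12,.0f}")
```

In this toy example, halving the switching cost and halving the likelihood of a switch reduce the expected lock-in by the same amount, so the cheaper of the two levers is the better investment.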
Switching Costs vs. Option Cost
Things that have a value are likely to also have a cost. Architecture options are no different. That’s why I say that “architects sell options”; they aren’t donating them. Now, you don’t pay the architect for the option; you pay in different ways. As elaborated in Cloud Strategy, multi-cloud options come with a broad spectrum of costs:
- Effort, e.g. to build abstractions or integrations between cloud platforms
- Expense, e.g. to train your staff not in one but in multiple new technologies
- Underutilization because you shy away from using advanced features that aren’t available in competing clouds (also called the “lowest common denominator” problem)
- Complexity, e.g. due to additional abstraction layers or tools
- New lock-in, e.g. by using an additional vendor’s multi-cloud framework
Out of these, I have observed underutilization and complexity being the most significant costs, and ironically the ones that are least often considered. The relationship between not making decisions, i.e. wanting all options, and complexity is captured in Gregor’s Law:
Excessive complexity is nature’s punishment for organizations that are unable to make decisions.
Architects translate buzzwords into trade-offs: how much up-front investment do I want to make across these dimensions to gain more options or a lower strike price in the future? In my original article on lock-in, I shared the following curve to visualize the decision spectrum (it’s also a great example of how architects see shades of gray):
The curve shows that the total cost is the sum of the needed up-front investment (the option price) and the expected switching cost (the strike price weighted by the likelihood of having to switch). This (qualitative) graph enables a nuanced discussion about the trade-offs, and the red curve (the total cost) also highlights a valuable architecture insight:
The extrema of an architectural decision spectrum are rarely the optimum.
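A toy model makes the shape of the total cost curve tangible. The sketch below assumes made-up cost functions: the up-front investment grows with portability effort, and the switching cost shrinks with it. The functions, the constants, and the likelihood are all hypothetical; only the structure (up-front investment plus likelihood-weighted switching cost) follows the text.

```python
LIKELIHOOD_OF_SWITCH = 0.3  # assumed probability of ever switching providers


def upfront_investment(x: float) -> float:
    """Option price in $M for portability effort x (0 = none, 1 = fully portable)."""
    return 4.0 * x**2


def switching_cost(x: float) -> float:
    """Strike price in $M of an actual switch; shrinks as portability effort grows."""
    return 10.0 * (1.0 - x) ** 2


def total_cost(x: float) -> float:
    """Up-front investment plus the switching cost weighted by its likelihood."""
    return upfront_investment(x) + LIKELIHOOD_OF_SWITCH * switching_cost(x)


spectrum = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
best_x = min(spectrum, key=total_cost)

print(f"no portability investment: {total_cost(0.0):.2f}")
print(f"maximum portability:       {total_cost(1.0):.2f}")
print(f"lowest total cost at x={best_x:.2f}: {total_cost(best_x):.2f}")
```

Under these assumptions the minimum sits in the interior of the spectrum: both "invest nothing" and "eliminate all lock-in" cost more than a moderate investment.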
This means that buzzword-driven slogans like “lock-in must be avoided” are unlikely to lead the organization down a suitable path because they imply that the less lock-in you have, the better, ignoring the costs associated with reducing it. At best the folks uttering such “wisdoms” are only looking at the black curve (the switching cost), but more likely they prefer to repeat slogans over engaging in architectural decision making. It goes without saying that I don’t consider these folks architects.
Lock-in is as much about you as it is about the vendor
Armed with an underlying theory (options pricing), and a visually intuitive decision model (the cost graph), we can engage in two of my favorite architect activities: seeing more dimensions and seeing many shades of gray.
Seeing new dimensions of reducing lock-in means realizing that the abstraction layers and architectural frameworks that are being proposed originate from a rather static view onto an inherently dynamic problem. I have posited that architects live in the first derivative, so let’s look at the problem from another dimension: a dynamic view.
Organizations that are slow moving, the ones that take a long time to make decisions and even longer to implement them, will always have high switching costs. It routinely costs them many millions of dollars to upgrade from one version of a product they use to the next one. How much do you think it’ll cost them to switch cloud providers? They’ll do that with the push of a button with this awesome framework they put in place? I am not so sure! These organizations are notorious for lack of automation, lack of testing, low appetite for change, excessive complexity (see above), and an abundance of internal politics.
In comparison, consider fast-moving, lean organizations that push software many times a day thanks to high levels of deployment and test automation and rigorous feedback cycles. How long will it take them to switch cloud providers? Several weeks, a few months, maybe?
A key determinant of your switching cost is your own velocity, i.e., the speed with which you can make changes.
Thinking like an architect reveals whole new approaches. Rather than spending another 18 months putting a multi-cloud framework into place, perhaps you should invest in increasing your velocity. This’ll reduce your switching cost and make the multi-cloud option less valuable because your potential downside of not having the option is much reduced.
The higher your velocity, the lower the value of the multi-cloud option.
Furthermore, increasing your velocity pays off immediately and allows you to better respond to any kind of change, not just switching cloud providers.
Option Pricing: Uncertainty
But we can gain an even deeper insight by considering additional parameters that feed into the Black-Scholes formula. Remember sigma (σ), the volatility? Increasing volatility increases the value of an option. That’s intuitive because the less one knows about the future, the more valuable deferring a decision is going to be. If nothing ever changes, I might as well make a decision now. But if things are all over the map, I’ll be much better off making the decision later.
This insight easily translates into the server sizing example from the beginning. If I build an internal application for a handful of users, uncertainty is low and the “horizontal scaling” option won’t be as valuable. If I am building a mobile application that might have 100 users or 100 million, the option is almost a given—more likely than not I even go a step further and build a serverless architecture that provides elastic scaling at the function level.
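The sizing example can be sketched with a deliberately crude model. It assumes that sizing up front means provisioning for the largest plausible demand, that elastic capacity costs a premium per unit, and that all demand scenarios are equally likely. The premium, the scenarios, and the function names are hypothetical.

```python
ELASTIC_PREMIUM = 1.2  # assumed: elastic capacity costs 20% more per unit than fixed capacity


def fixed_cost(demand_scenarios: list[float]) -> float:
    """Size up front for the largest plausible demand, at a unit cost of 1.0."""
    return max(demand_scenarios)


def elastic_cost(demand_scenarios: list[float]) -> float:
    """Pay a premium, but only for the demand that materializes (scenarios equally likely)."""
    return ELASTIC_PREMIUM * sum(demand_scenarios) / len(demand_scenarios)


def scaling_option_value(demand_scenarios: list[float]) -> float:
    """What deferring the sizing decision saves compared with sizing up front."""
    return fixed_cost(demand_scenarios) - elastic_cost(demand_scenarios)


internal_app = [90, 100, 110]  # low uncertainty: demand is well understood
mobile_app = [10, 100, 1000]   # high uncertainty: demand could be almost anything

low = scaling_option_value(internal_app)
high = scaling_option_value(mobile_app)
print(f"option value, low uncertainty:  {low:8.1f}")
print(f"option value, high uncertainty: {high:8.1f}")
```

In this sketch the option is not worth its premium when demand is predictable, but it pays off handsomely once the spread of possible outcomes widens.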
Uncertainty driving the value of options has a positive outcome for option-selling architects:
Our uncertain times increase the value of options and therefore the value of architecture.
How does volatility impact multi-cloud decisions? Cloud platforms are constantly evolving and so are your needs. At the same time, cloud platforms are composed of building blocks that are designed to support a virtually unlimited number of implementation scenarios. So, the chances that you’ll “outgrow” a cloud provider are limited. Also, all major providers are extremely well funded (the rumors that one provider might eventually cut their continuing losses have largely disappeared).
When dealing with cloud provider uncertainty, my advice is twofold:
1) Don’t just look at today’s offering but at the provider’s product strategy and philosophy because that’ll determine whether the platform will still be a fit in future years.
2) Don’t just look at the technology but also at the organization: you’ll be working with these guys for a long time to come, so better make sure they think along the same lines as you do.
Here we learn another important lesson about thinking like an architect: to see the whole picture, you need to zoom in and out. Not everything that architects do can be packaged into a single formula.
Multi-cloud isn’t the next level of cloud maturity
This leaves us with one more key parameter, r, the risk-free rate of return (Investopedia). The Black-Scholes model has been criticized for assuming the rate remains constant, but since we’re not looking to calculate an exact dollar value, that doesn’t bother us. Rather, we find that for our architecture options the rate doesn’t have the same influence as it does in the Black-Scholes formula. When dealing with financial options, buying the option avoids having to buy the stock today, saving you cash that you can invest otherwise. Hence, a higher rate of return increases the value of the option (consider the negative exponent in the second part of the equation).
Our so-called “real option” of deciding on a cloud strategy doesn’t have the same mechanism—there is no stock to buy. Rather it has the opposite effect. The interest rate is often used as a discount rate, indicating that a dollar earned tomorrow is worth less than a dollar earned (or spent) today. That’s intuitive because I can invest today’s dollar and earn a return while inflation makes tomorrow’s dollar less valuable. A discount rate thus determines the time value of money (Investopedia), and hence the value generated from a future option, e.g. the option to move cloud providers.
Organizations routinely use discount rates to make investment decisions. Successful businesses tend to apply higher discount rates, expressed as the Internal Rate of Return (IRR, Investopedia) that a project is supposed to deliver. Consider a fast-moving organization in a highly competitive market. This company will use a high discount rate for their investment decisions to acknowledge that investing in an option today carries a high cost: spending 18 months to build a multi-cloud abstraction layer is likely to put them out of business.
Likewise, the potential return of the option to change cloud providers in many years is comparatively small: they might have pivoted many times and changed their application landscape anyway. What good is a container-based abstraction framework if in a few years you’ll be all serverless or focused on 5G-enabled edge computing or moved from compute to training AI models? This logic explains why very few “digital native” organizations have elaborate multi-cloud strategies.
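The effect of the discount rate on a far-off payoff is easy to quantify with the standard present value calculation. The payoff, the horizon, and the two discount rates below are hypothetical numbers chosen for illustration.

```python
def present_value(future_value: float, discount_rate: float, years: float) -> float:
    """Value today of a payoff received `years` from now, discounted annually."""
    return future_value / (1.0 + discount_rate) ** years


# Hypothetical: being able to switch providers saves $10M, but only in five years.
payoff = 10_000_000
horizon_years = 5

pv_traditional = present_value(payoff, discount_rate=0.05, years=horizon_years)
pv_fast_mover = present_value(payoff, discount_rate=0.30, years=horizon_years)

print(f"present value at  5% discount rate: {pv_traditional:>12,.0f}")
print(f"present value at 30% discount rate: {pv_fast_mover:>12,.0f}")
```

The same future payoff is worth far less today to the organization that applies the higher discount rate, while the up-front cost of building the option is paid in today's dollars by both.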
In very large, traditional organizations, where the large clock hand appears to advance in 18-month increments, three years will seem very soon. So, they might be more inclined to want to have the option to switch providers. This leads us to an interesting insight:
Fast-moving, “digital” organizations, the ones that traditional enterprises aspire to emulate, are less likely to have a multi-cloud strategy because today’s cost to them is very high and the future payoff uncertain.
This insight helps us debunk a major multi-cloud fallacy, which purports multi-cloud as the next maturity level of being in the cloud, the “logic” being that one goes from no cloud (on premises) to a single cloud and then to multiple clouds. Following a proven decision model and the rigor of financial analysis shows that exactly the opposite is the case.
Multi-cloud isn’t the next level of cloud maturity. Rather the opposite.
When you work with high-velocity organizations, you’ll find them favoring managed services, full automation, and advanced services like serverless and machine learning. The immediate benefit they gain from these capabilities far outweighs the potential benefit of being able to switch providers some time in the distant future. Traditional organizations might attempt to explain this preference away with smaller companies “having less to lose”. I’d say the opposite is true: smaller companies have more at stake, specifically their very survival. That’s why they favor investing in product innovation and gaining market share over buying options that they might never use.
It’s a dialog
Using decision models allows me to avoid telling my customers what they should be doing. That’s not because I have been a consultant for too long and all I can say is “it depends”, but because I am honestly convinced my customers can ultimately make a better decision. I am just there to help them think about their problem in new ways.
What excites me most about my work is that I find every single business I have interacted with genuinely interesting. This also means that there’s more to a decision than a model or formula can capture. Customer conversations are always a dialog: afterwards we are both smarter.