Some 15 years ago I wrote a chapter for the book 97 Things Every Architect Should Know titled, Convenience is not an -ility. The idea for the chapter originated from someone requesting to augment an API with a convenience function or a convenience parameter to an existing function, a change which I considered to break the existing abstraction. Some 15 years later, I find that good abstractions are still difficult to come by.
Cloud Automation Abstraction
I’ve been spending my recent engine room time with serverless automation, specifically with AWS CDK, the Cloud Development Kit. I consider automation an essential element of a cloud operating model, going as far as having publicly stated that:
Cloud - Automation = Yet Another Data Center
Alas, much automation is encoded in YAML or JSON documents, which are easily parseable, but not the languages we use for our everyday programming (you might enjoy the related chapter “Infrastructure as actual Code” in my book Cloud Strategy). CDK is a code library that allows you to program cloud automation in common programming languages like Python, Java, or TypeScript. Those languages come with features like inheritance and polymorphism, which are great tools to create (you guessed it) abstractions.
The CDK library conveniently includes several levels of constructs, which are intended to provide “higher-level abstractions consisting of multiple related AWS resources”. Level 1 constructs denote familiar AWS resources like an EC2 instance or a Lambda function, and map straight to the vocabulary that’s used in CloudFormation and the Command Line Interface (CLI).
Level 2 constructs aim to take some of the tedium out of the Level 1 constructs (“make them more intent-based” per the docs). Level 3 constructs become more interesting. Denoted as “patterns” (in a loose meaning of the term), they are intended to provide higher-level abstractions over the base resources. A representative example is the ApplicationLoadBalancedFargateService, which creates a Fargate service running on an ECS cluster with an application-level load balancer in front. This construct is enormously convenient as it replaces 270 lines of CloudFormation code with an almost-one-liner. But is it really an abstraction? My 15-year-old article reminds us that convenience alone isn’t sufficient.
Whole or Sum of Parts?
That construct’s name might hint that finding a solid abstraction proved challenging: it essentially just concatenates the names of the parts that it creates. This effect is also pronounced in the Solution Constructs library and the Serverless Pattern Catalog, which largely combine base constructs into larger elements. The documentation states:
“Composition is the key pattern for defining higher-level abstractions through constructs.”
That’s a bit of a mouthful as we are now digesting patterns, composition, abstraction, and constructs. One could likely write a whole book about composition vs. abstraction, so I’ll just point to an excellent article by Eric Elliott, who also did write a whole book about composition (within the context of object and functional languages).
Sidestepping semantics for a moment, I’d suggest that construct names like Lambda-SQS-Lambda, although convenient, aren’t abstractions. (On a side note, this construct doesn’t actually send messages from Lambda to SQS as the name might suggest. The referenced Lambda-SQS construct (see source code) merely creates a queue and passes its URL as an environment variable to the function’s code. You’ll still have to write the code that sends a message to the queue.)
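The gap between wiring and sending can be sketched in a few lines of plain Python. This is a toy model, not the CDK or the Solution Constructs API: Queue, Function, FunctionToQueue, QUEUES, the QUEUE_URL variable, and the queues.example address are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Queue:
    """Toy stand-in for a message queue (not an AWS type)."""
    url: str
    messages: List[str] = field(default_factory=list)


@dataclass
class Function:
    """Toy stand-in for a serverless function: a handler plus environment variables."""
    handler: Callable[[dict, Dict[str, str]], None]
    environment: Dict[str, str] = field(default_factory=dict)

    def invoke(self, event: dict) -> None:
        self.handler(event, self.environment)


# Lets the toy handler resolve a queue from its URL, as a client library would.
QUEUES: Dict[str, Queue] = {}


class FunctionToQueue:
    """Hypothetical composition: creates a queue and hands its URL to the function."""

    def __init__(self, function: Function, queue_name: str) -> None:
        self.function = function
        self.queue = Queue(url=f"https://queues.example/{queue_name}")
        QUEUES[self.queue.url] = self.queue
        # The wiring ends here: nothing in this class ever sends a message.
        function.environment["QUEUE_URL"] = self.queue.url


def order_handler(event: dict, environment: Dict[str, str]) -> None:
    # Still the developer's job: find the queue and send to it.
    QUEUES[environment["QUEUE_URL"]].messages.append(event["order_id"])


wiring = FunctionToQueue(Function(handler=order_handler), "orders")
messages_after_wiring = list(wiring.queue.messages)  # empty: wiring sends nothing
wiring.function.invoke({"order_id": "42"})
```

In this model the composition saves the developer from plumbing the URL by hand, yet the handler still has to know that there is a queue, where its address lives, and how to send to it.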
That doesn’t mean that these constructs aren’t useful. They help developers avoid mistakes by taking care of many cumbersome aspects of composing serverless applications, such as creating and assigning Dead Letter Queues, setting IAM permissions, handling VPCs, and enabling distributed tracing. So, they are a huge convenience. I just don’t consider them an abstraction as the functionality they provide can only make sense if you are already intimately familiar with the underlying constructs—the details that should have been abstracted away.
Good abstractions speak a simpler language
Having a well-known knack for car analogies, I once tweeted:
If engineers had named the automobile, it’d be called “engine-transmission-wheel-assembly”.
Most of us would consider that rather silly as we like to call things by their purpose or intent, in this case something that moves on its own, i.e. without horses pulling it, hence an “auto-mobile”.
When Mike Roberts, an old friend and serverless aficionado, was looking for a meaningful “Hello World” project to take CDK for a spin, he actually built a useful abstraction. The few dozen lines of CDK code combined AWS services like CloudFront CDN, Route 53 DNS, and S3 storage to host a static web site. Having used the very same components to manually set up my eaipatterns.com site (yes, a poor practice), I found it immediately useful. Being a seasoned developer, he didn’t call his creation CloudFrontRoute53S3Helper but named the wrapping class simply:
To highlight why I like that name so much better, I rely on the following, admittedly non-scientific, test:
Does the construct provide a higher-level vocabulary that shields me from the underlying complexity?
One of the biggest challenges in today’s programming environments is cognitive load. Modern systems can do amazing things, but they aren’t simple. Abstractions reduce cognitive load because they allow us to use our brain cells more effectively. TCP sockets aren’t actually streams of characters just as assembly code doesn’t understand the notion of inheritance. Those are just two examples of powerful abstractions that we have grown fond of.
Constructs that are simply compositions of lower-level elements don’t measure up to this bar. They are more like an assembly language macro: hugely convenient—for those who program in assembly language.
Good abstractions are utterly obvious
It’d be easy to dismiss Mike’s naming choice as obvious. However, the simplest choice for someone building such a construct would have been to name it by the pieces that it’s made from. Instead he chose to name the composite element by its purpose, that of setting up a static web site. It just so happens that the pages are stored in S3, mapped to a URL via Route 53, and cached via CloudFront.
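A rough sketch of what naming by purpose looks like in code, using plain Python rather than CDK. Everything here is an assumption made for illustration: StaticSite is a hypothetical name (not necessarily the one Mike chose), and Bucket, Cdn, and DnsRecord are toy stand-ins for S3, CloudFront, and Route 53.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """Toy stand-in for object storage."""
    name: str


@dataclass(frozen=True)
class Cdn:
    """Toy stand-in for a content delivery network in front of a bucket."""
    origin_bucket: str
    endpoint: str


@dataclass(frozen=True)
class DnsRecord:
    """Toy stand-in for a DNS alias record."""
    domain: str
    target: str


class StaticSite:
    """Hypothetical intent-named wrapper: callers state what they want, not the parts."""

    def __init__(self, domain: str, content_dir: str) -> None:
        self.domain = domain
        self.content_dir = content_dir
        # Implementation details, kept private so that they can change freely.
        self._bucket = Bucket(name=f"{domain}-content")
        self._cdn = Cdn(origin_bucket=self._bucket.name, endpoint=f"{domain}.cdn.example")
        self._dns = DnsRecord(domain=domain, target=self._cdn.endpoint)

    @property
    def url(self) -> str:
        return f"https://{self.domain}"


site = StaticSite(domain="www.example.com", content_dir="./public")
public_vocabulary = [name for name in vars(site) if not name.startswith("_")]
```

The public vocabulary consists of a domain and a content directory. Storage, caching, and DNS appear only behind the underscore, which is where the naming test suggests they belong.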
Good abstractions don’t need fancy names. Quite the opposite: they tend to rely on simple but evocative names. Competing Consumers is a very aptly named abstraction for multiple functions consuming from a Point-to-Point Channel. That channel might well be an SQS queue with multiple Lambda functions or an Azure Service Bus coupled to Azure Functions (Azure actually documented this use case as a cloud pattern). In either case, as the name suggests, multiple function instances compete for each message so that only one function is invoked per message. If you feel that such expressive abstractions can help reduce service dependencies, you are indeed spot on.
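The behavior behind the name fits in a short, product-neutral sketch. This one assumes an in-process queue.Queue as the Point-to-Point Channel and threads as the competing function instances; run_competing_consumers and its parameters are names made up for this illustration.

```python
import queue
import threading
from collections import Counter
from typing import List, Tuple


def run_competing_consumers(messages: List[str], consumer_count: int = 3) -> List[Tuple[int, str]]:
    """Let several consumers drain one channel; return (consumer_id, message) pairs."""
    channel: "queue.Queue[str]" = queue.Queue()
    for message in messages:
        channel.put(message)

    handled: List[Tuple[int, str]] = []
    lock = threading.Lock()

    def consume(consumer_id: int) -> None:
        while True:
            try:
                # Taking a message removes it from the channel, so no other
                # consumer can receive the same one.
                message = channel.get_nowait()
            except queue.Empty:
                return  # channel drained, this consumer stops
            with lock:
                handled.append((consumer_id, message))

    workers = [threading.Thread(target=consume, args=(i,)) for i in range(consumer_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return handled


handled = run_competing_consumers([f"msg-{i}" for i in range(20)])
deliveries = Counter(message for _, message in handled)
```

Which consumer ends up with which message varies from run to run, but every message is handled exactly once. Nothing in the sketch mentions SQS, Lambda, or Service Bus, which is the point of the name.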
How tight do abstractions need to be?
The base CDK library has some well-named examples as well, like the HttpsRedirect construct. This one wires up a CloudFront distribution and an S3 bucket setting to reroute HTTPS requests directed at one domain name towards another. HTTPS Redirect perfectly expresses the intent of this construct, making this a good abstraction candidate in my book.
Joel Spolsky has posited that all non-trivial abstractions are leaky, and HttpsRedirect isn’t immune to this effect, as we can quickly notice from the required parameters:
Hosted zone of the domain which will be used to create alias record(s) from domain names in the hosted zone to the target domain. The hosted zone must contain entries for the domain name(s) supplied through recordNames that will redirect to the target domain.
Er… well, if you work a lot with DNS and redirects, this likely makes a lot of sense to you. Otherwise, leave it for another day. The example itself suggests retrieving the correct parameter via the static route53.HostedZone.fromHostedZoneAttributes method. Oops, that’s what we could call a major leak: you aren’t able to use the construct without detailed knowledge of the underlying implementation.
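To make the leak tangible, here is a toy model in plain Python. It is not the CDK API: HostedZone, https_redirect, the field names, and the Z-EXAMPLE identifier are hypothetical stand-ins that only mirror the shape of the required parameter described above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HostedZone:
    """Toy stand-in for a DNS hosted zone; field names are illustrative."""
    hosted_zone_id: str
    zone_name: str


def https_redirect(zone: HostedZone, record_names: List[str], target_domain: str) -> Dict[str, str]:
    """Map each record name to an HTTPS URL on the target domain."""
    for name in record_names:
        # The leak: the caller has to know which zone holds each record.
        if name != zone.zone_name and not name.endswith("." + zone.zone_name):
            raise ValueError(f"{name} is not part of zone {zone.zone_name}")
    return {name: f"https://{target_domain}" for name in record_names}


# Stating the intent takes two domain names, but the call also demands a zone object.
zone = HostedZone(hosted_zone_id="Z-EXAMPLE", zone_name="example.com")
redirects = https_redirect(zone, ["example.com", "www.example.com"], "example.org")
```

The intent is fully described by the record names and the target domain. The zone object, along with its identifier, is a detail of the implementation that the caller is nonetheless forced to understand and supply.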
On another side note, Joel’s examples of leaky abstractions mainly refer to physical properties, which one could consider impossible to abstract away. For example, lost TCP packets require retransmissions and therefore cause latency spikes, just like the SQL language can’t shield you from performance optimization. I refer to this aspect as Failure doesn’t respect abstraction, which I might have abstracted (hah!) to “physics doesn’t respect abstraction”.
Climbing up the stack is hard. And so is naming.
Why is it so difficult to come up with good abstractions? The Software Architect Elevator cites the Stack Fallacy as one reason for this widespread phenomenon. A vendor who is delivering a set of products and services is bound to think in those very products: that’s how products are pitched, teams structured, usage measured, and customers billed.
Similarly, IT’s history is littered with top-down attempts to define business frameworks, so the answer is not as simple as inverting the direction. I still have a copy of IBM’s San Francisco Framework on my shelf, which defined common building blocks for business applications, including process management and basic accounting. If you have never heard of it, don’t worry.
Meaningful abstractions won’t be found bottom-up or top-down; they need to be found outside-in, by understanding customers’ intents. You might rebut this with the famous “If I had asked people what they wanted, they would have said faster horses” quote, but keep in mind that Henry Ford apparently never said it and that not listening to customer needs was the main cause of his market share dropping from 65% to 15% in less than 6 years (see HBR article).
On the upside, understanding common needs and encoding them in an abstract language can be immensely powerful. The Enterprise Integration Patterns that Bobby and I documented almost 20 years ago spurred the creation of a whole genre of open-source Enterprise Service Bus projects, most of which are still in wide use today. These patterns originated from the insight that a customer combining a queue and multiple consumers is likely looking to scale a system while avoiding duplication. The product selection is just a specific implementation of the more general need of Competing Consumers. Neither Bobby nor I worked for a vendor at that time, so we based the patterns on our hands-on project experience plus input from the community. Still, some patterns started out with names that merely concatenated their components. For example, Scatter-Gather was initially called “Broadcast Aggregate”. So, we know how easy it is to fall into that trap.
Scatter-Gather was initially called “Broadcast Aggregate”, named after components, not intent.
We routinely comment that our industry struggles with naming things. Part of the reason likely lies in the complexity that we deal with and the lack of suitable abstractions. Coming back up from the engine room, you need to clear your head of the details and consider how the combined pieces create meaning in a specific context. Once you understand the meaning, naming will be much more natural.