An independent guide to building modern software for serverless and native cloud

Today's Solution Landscape Pt 1

This is the first in a multi-part series that looks at how serverless and native cloud fit within larger software and IT industry trends.

To show the value that serverless technologies can bring to custom solutions, the first two posts of this series zoom out, look at the big picture, and explore what today's solutions look like.

There's been a lot of focus on practice-oriented areas of software development, like test automation and continuous integration. Without minimizing their importance, I would suggest that the most impactful change in recent years is actually our growing ability to compose larger, custom solutions out of smaller, off-the-shelf systems and services.

This is the biggest solution multiplier we have, I believe, and it comes from three long-running trends that together are redefining what modern solutions look like. So, in this post and the next, we're going to take a look at these three trends: the evolution of reuse; the shift to cloud-hosted SaaS products; and the emergence of serverless technologies as the best implementation platform for the cloud.

The Evolution of Reuse

The most important and longest-running trend concerns reuse. I'm going to start by framing this discussion with an observation that has always stuck with me, made years ago by Robert Glass in his book Facts and Fallacies of Software Engineering. In that book, Glass distinguished between what he called "reuse-in-the-small", which describes things like libraries and frameworks, and "reuse-in-the-large", which describes the integration between more substantial, and separately deployed, units of software. At the time he was writing, Glass maintained that while reuse-in-the-small was widely practiced, reuse-in-the-large was, as yet, an unsolved problem.

Now, of course, large enterprises have long had multiple systems with some element of integration between them, if sometimes only batch-driven data transfers. But the reuse-in-the-large that I have in mind is near-real-time functional integration across separately deployed systems or services, such that they work together, seamlessly, as a unified solution. Achieving this level of integration means elevating it, as a design approach and a practice, from the periphery to something that's central to how we build solutions.

When Glass was writing, integration like this was uncharted territory. But today, reuse-in-the-large, as defined above and built on modern architectures, standards, and technologies, is a practical reality. To fully understand where we are, I think it's worth retracing our journey and highlighting the lessons we've learned along the way.


The first, and to my mind most important, lesson to highlight is the value of bounded context in large code bases.

To illustrate this, I'll share the first large project I worked on, more than twenty years ago: a rewrite of a large university student system. Decades earlier, this system had been written in COBOL to run on mainframes. To get over the Y2K hurdle, it had been machine-converted to run in an emulator on Unix, connecting to a database that had also been machine-converted from a hierarchical to a relational design.

I joined the project to help lead the first couple of years of what would be a decade-long effort to rewrite this system in Java. For this, we designed a three-tier application with an extensible UX framework, which facilitated an incremental build-out of functionality that could coexist with the legacy system until the latter could be decommissioned.

The code for this rewritten application contained a lot of reuse-in-the-small, incorporating libraries and frameworks from third parties, as well as those developed in-house. Reflecting the period in which it was built, it ended up being a large, functionally complete system, because it had to be: there were no reuse-in-the-large options available at the time for it to integrate with.

Although this system implemented what you would now call a service architecture, and its code was packaged into domain-oriented modules, it didn't have any notion of bounded context. Nothing stopped the code in one module from reusing classes from another, or from interacting directly with data in different domains. The consequence of this approach, which we now see in retrospect across a lot of legacy systems from this era, is monolithic application code in which functional change can be difficult to implement.

Nobody twenty years ago would have written a procedure that took the global state of an application as its input. It was understood that well-written procedure code should be small, should do the proverbial one thing well, and should rely on only the minimum required context. What we've learned since then with microservice architecture is to apply these same design principles at the level of a service component.

Specifically, we've established that small, cohesive services should have bounded context, meaning they should only have access to the minimum inputs required to perform their work. There may be integrated views and orchestrated updates in a system, and they might span multiple domains. But these can be implemented in an application layer above the domain services.
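
As a rough sketch of that layering (in TypeScript, with hypothetical Registration and Billing domains and invented type and function names), each domain service sees only the minimal inputs it needs, and the only code that spans both domains lives in an application-layer function above them:

```typescript
// A minimal sketch (hypothetical domains and names): each domain service
// has bounded context and sees only the inputs it needs to do its work.
interface RegistrationService {
  enroll(studentId: string, courseId: string): Promise<{ enrollmentId: string }>;
}

interface BillingService {
  charge(studentId: string, amountCents: number): Promise<{ invoiceId: string }>;
}

// Application layer: the only place that spans both domains. It orchestrates
// the domain services; they never call each other or touch each other's data.
async function enrollAndBill(
  registration: RegistrationService,
  billing: BillingService,
  studentId: string,
  courseId: string,
  feeCents: number
): Promise<{ enrollmentId: string; invoiceId: string }> {
  const { enrollmentId } = await registration.enroll(studentId, courseId);
  const { invoiceId } = await billing.charge(studentId, feeCents);
  return { enrollmentId, invoiceId };
}
```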


The second lesson we've learned, collectively, is what works for large-scale service integration.

I was fortunate to get some early exposure to the challenges involved in integration of this kind. After the university project, I worked on an ADSL and IPTV order automation system at a large telecom company. This time I was responsible for the overall solution architecture, which included a frontend application for order capture and an assortment of backend orchestrations that had to interface with a dozen different systems. To give an idea of the scope of this project, there were, all told, twenty distinct process flows, including in-flight cancellation scenarios, that all had to be mapped to operations spanning those dozen systems. The cancellation scenarios required defining compensating transactions to unwind orders that were already in flight.

The distributed nature of this architecture was entirely different from anything I had encountered previously. As a team, however, we made a success of the project, and along the way I learned some specific practices and technologies that I think are important for large-scale service integration.

First, I learned that it matters not only that you package your code into small, cohesive services, but also that you have the right kind of interfaces to support scalability, cross-system integration, and service composition. To achieve this, I defined service interfaces that were stateless, used simple request/response patterns, clearly separated commands from queries and idempotent updates from non-idempotent ones, and were implemented over lightweight web protocols. (You'll note that this more or less corresponds to what REST APIs look like today.) The lightweight web protocols made it easier to interoperate between systems running different code on different platforms. The stateless request/response design gave us scalable services, without side effects, that were also easier to orchestrate.
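
To make those interface principles concrete in today's REST terms, here's a minimal TypeScript sketch, with invented resource and field names, where queries map to GET, idempotent updates to PUT, and non-idempotent commands to POST, and every request carries all the context the service needs:

```typescript
import { randomUUID } from "node:crypto";

// Stateless request/response shapes: every call carries all the context the
// service needs, so any instance can handle any request.
type ApiRequest = { method: "GET" | "PUT" | "POST"; path: string; body?: unknown };
type ApiResponse = { status: number; body?: unknown };

async function handle(req: ApiRequest): Promise<ApiResponse> {
  // Query: GET has no side effects, so it's safe to cache and retry.
  if (req.method === "GET" && /^\/orders\/[^/]+$/.test(req.path)) {
    return { status: 200, body: { id: req.path.split("/")[2], status: "PLACED" } };
  }
  // Idempotent command: repeating the same PUT yields the same end state.
  if (req.method === "PUT" && /^\/orders\/[^/]+\/address$/.test(req.path)) {
    return { status: 200, body: { updated: true } };
  }
  // Non-idempotent command: each POST creates a new resource, so retries need care.
  if (req.method === "POST" && req.path === "/orders") {
    return { status: 201, body: { id: randomUUID() } };
  }
  return { status: 404 };
}

// Example call: a simple query.
handle({ method: "GET", path: "/orders/42" }).then((res) => console.log(res));
```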

Second, I learned the central role that middleware services play as platforms for reliable messaging and orchestration. In large integrated solutions you frequently need to incorporate messaging to uncouple large transactions that would otherwise span too many systems. If, let's say, you were to order a laptop online, the many operations that are processed in different systems over an extended period cannot all be a single transaction. Instead, these operations have to be uncoupled from one another and turned into a series of smaller transactions within a longer-running orchestration. In addition to messaging and orchestration, you also often need middleware to support things like publish/subscribe, message filtering, and content-based routing. You really can't build industrial-strength solutions without these features.
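
Here's a minimal TypeScript sketch of that uncoupling, using an in-memory array as a stand-in for real messaging middleware and invented message names; each step runs as its own small transaction, and a failure triggers a compensating message rather than a global rollback:

```typescript
// A minimal sketch: an in-memory queue stands in for real messaging middleware,
// which would add durability, pub/sub, filtering, and content-based routing.
type Message = { type: string; orderId: string };
const queue: Message[] = [];
const publish = (msg: Message) => { queue.push(msg); };

// Each handler is its own small transaction against one system.
const handlers: Record<string, (m: Message) => void> = {
  OrderPlaced: (m) => {
    console.log(`reserve stock for ${m.orderId}`);
    publish({ type: "StockReserved", orderId: m.orderId });
  },
  StockReserved: (m) => {
    console.log(`charge payment for ${m.orderId}`);
    // For illustration, payment always fails so the compensation path runs.
    publish({ type: "PaymentFailed", orderId: m.orderId });
  },
  // Compensating transaction: unwind the stock reservation, not a global rollback.
  PaymentFailed: (m) => {
    console.log(`release stock for ${m.orderId}`);
  },
};

// Simple orchestration loop: drain the queue, dispatching each message in turn.
publish({ type: "OrderPlaced", orderId: "order-123" });
for (let msg = queue.shift(); msg; msg = queue.shift()) {
  handlers[msg.type]?.(msg);
}
```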

In the end, what I find interesting, looking back at a large integration project like this, is how it demonstrates the value of reuse. It did take a lot of work to design all those system APIs, the orchestrations, and the various compensating transactions. But bear in mind that, despite these design complications, integrating the existing systems was a much shorter path to a working solution than reproducing all the equivalent functionality in a new application would have been.


The third lesson I've learned over the years is the importance of what I would call the "it just works" factor. By this, I mean that the journey from an architect's design intention through to the deployment of a working solution should involve a reasonable number of reasonably predictable steps. This means working with fully realized technology standards that have the necessary framework and tool support. The industry also took a couple of evolutionary detours in prior years that, in my experience, worked against this goal.

One of these detours was the unfortunately complicated and abstract set of web service standards from the early 2000s. I had some exposure to these standards after the telecom project, when I was tasked with building an XML-first code pipeline that would generate working web services based on XML Schemas, WSDL, and Java code-binding mechanisms. Using the WS-* standards was then, and continues to be, a more difficult way to define and build APIs than it ought to be.

Another evolutionary detour was the centralization of middleware services on ESB platforms. This reflects the era in which ESB products were built, and the on-premises environments in which they had to be installed. In an ideal world, middleware services would be distributed, to support independent solution configurations and deployments. But instead, ESBs generally run on centralized, dedicated hardware, operated by a dedicated team of people. Being expensive to license and requiring a lot of server capacity, ESBs end up being shared resources whose services carry a lot of management overhead. This makes them a huge impediment to developing and testing integrated solutions with any degree of agility.

Today, thankfully, there are better options. Instead of the WS-* stack, we can now work with mature API standards and principles that have evolved from real-world use in web development. These include REST, OAuth 2.0, and OpenAPI, all of which build on HTTP, MIME types, JSON, and YAML. Used together, these make it practical to follow an API-first approach like the one I was previously trying, not entirely successfully, to work out with the WS-* stack.

Practicing API-first with these newer standards, and deploying working code to the cloud on serverless, now looks like this:

  • You declaratively define your endpoints and schemas, including field-level constraints, in OpenAPI. This can also include OAuth 2.0-based authorization.
  • You generate skeleton code for your service and data classes and implement using any modern programming language.
  • You wrap this service code with your preferred protocol wrapper, whether that be serverless functions or containerized Web API endpoints (a sketch of one such serverless function follows this list).
  • You declaratively define your cloud resources, including API management, serverless functions, an identity provider and perhaps also a database service.
  • You deploy this secure, scalable API and supporting services to the cloud.
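
As a rough illustration of the middle steps, here's a minimal TypeScript sketch of a serverless function wrapping the service code for a hypothetical OpenAPI-defined POST /orders operation, assuming the aws-lambda type definitions and an API Gateway front end that has already applied request validation and OAuth 2.0 authorization:

```typescript
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

// Data class corresponding to the (hypothetical) OpenAPI request schema.
interface CreateOrderRequest {
  sku: string;
  quantity: number;
}

// Serverless protocol wrapper around the service code. In this setup, API
// Gateway applies the OpenAPI request validation and OAuth 2.0 authorization
// before the function is invoked.
export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const order = JSON.parse(event.body ?? "{}") as CreateOrderRequest;

  // The service code proper would go here: persist the order, publish an event, etc.
  const orderId = `order-${Date.now()}`;

  return {
    statusCode: 201,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ orderId, sku: order.sku, quantity: order.quantity }),
  };
};
```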

Admittedly, we're simplifying things to a degree with this list. See the Learning AWS Serverless course on this site for all the details.

But the point is that each of these discrete steps is reasonably easy and predictable. Things that were once challenging, like getting object serialization working reliably across different code bases and platforms, now just work. The technology stacks and tooling are all mature, and there's less risk of running up against a blocker.

There's also less team coordination overhead and resource contention from sharing ESB infrastructure when you deploy serverless APIs and integration services to the cloud. I invite you to look at the Unlocking Speed and Agility post in the Beginner's Overview series for more on this topic.

In summary, a team that's experienced at working with these modern standards and technologies can make the journey from design to deployment very rapidly, even while iterating through multiple prototypes.


So, given this evolutionary journey, where are we today with reuse-in-the-large?

Integration used to be an exercise in frustration. At various points we lacked a consensus on good service architecture, lacked the necessary middleware services, or lacked mature standards and technologies on which to build solutions. These accidental complexities, as Brooks called them, are now far less of an issue. We have practical API standards that are widely supported in modern programming languages, in development tools, and in cloud services. We also have serverless middleware options that, at least for cloud-hosted solutions, sidestep the centralization problems touched on above.

Reuse-in-the-large is, indeed, a solved problem. Software is a different world from machinery, but we are, in broad strokes, at the same point now that industry was when it moved from craft work to interchangeable parts. Modern APIs make it possible to integrate different services and systems. They also make microservices practical, so that large systems can be built and operated as collections of smaller parts.

Next

In the next post, we'll look at the other two long-running industry trends that I believe shape today's solution landscape. We'll see how SaaS systems and services are expanding the opportunities for reuse in the cloud. We'll also look at how, in a cloud-centered world, serverless technologies represent a nearly ideal development platform.