Over the past 10 years, distributed systems have become more fine-grained. From the large multi-million line long monolithic applications, we are now seeing the benefits of smaller self-contained services. Rather than heavy-weight, hard to change Service Oriented Architectures, we are now seeing systems consisting of collaborating microservices. Easier to change, deploy, and if required retire, organizations which are in the right position to take advantage of them are yielding significant benefits. This book takes an holistic view of the things you need to be cognizant of in order to pull this off. It covers just enough understanding of technology, architecture, operations and organization to show you how to move towards finer-grained systems.
What is the standard pattern of orchestrating microservices?
If a microservice only knows about its own domain, but there is a flow of data that requires multiple services to interact in some manner, what's the way to go about it?
Let's say we have something like this:
And for the sake of argument, let's say that once an order has been shipped, the invoice should be created.
Somewhere, someone presses a button in a GUI: "I'm done, let's do this!" In a classic monolithic service architecture, I'd say that there is either an ESB handling this, or the Shipment service has knowledge of the Invoice service and just calls that.
But what is the way people deal with this in this brave new world of microservices?
I do get that this could be considered highly opinion-based, but there is a concrete side to it, as microservices are not supposed to do the above. So there has to be a "what should it by definition do instead", which is not opinion-based.
The book Building Microservices describes in detail the styles mentioned by @RogerAlsing in his answer.
On page 43 under Orchestration vs Choreography the book says:
As we start to model more and more complex logic, we have to deal with the problem of managing business processes that stretch across the boundary of individual services. And with microservices, we’ll hit this limit sooner than usual. [...] When it comes to actually implementing this flow, there are two styles of architecture we could follow. With orchestration, we rely on a central brain to guide and drive the process, much like the conductor in an orchestra. With choreography, we inform each part of the system of its job, and let it work out the details, like dancers all finding their way and reacting to others around them in a ballet.
The book then proceeds to explain the two styles. The orchestration style corresponds more to the SOA idea of orchestration/task services, whereas the choreography style corresponds to the smart endpoints and dumb pipes mentioned in Martin Fowler's article.
Under the orchestration style, the book says:
Let’s think about what an orchestration solution would look like for this flow. Here, probably the simplest thing to do would be to have our customer service act as the central brain. On creation, it talks to the loyalty points bank, email service, and postal service [...], through a series of request/response calls. The customer service itself can then track where a customer is in this process. It can check to see if the customer’s account has been set up, or the email sent, or the post delivered. We get to take the flowchart [...] and model it directly into code. We could even use tooling that implements this for us, perhaps using an appropriate rules engine. Commercial tools exist for this very purpose in the form of business process modeling software. Assuming we use synchronous request/response, we could even know if each stage has worked [...] The downside to this orchestration approach is that the customer service can become too much of a central governing authority. It can become the hub in the middle of a web, and a central point where logic starts to live. I have seen this approach result in a small number of smart “god” services telling anemic CRUD-based services what to do.
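To make the orchestration style concrete, here is a minimal sketch of a "central brain" in Python. It is not code from the book: the class and method names (`CustomerOrchestrator`, `create_account`, `send_welcome`, `send_welcome_pack`) are made up for illustration, and the collaborators are in-process stand-ins for what would be remote request/response calls.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    step: str
    ok: bool


# Hypothetical collaborators; in a real system these would be remote services.
class LoyaltyPointsBank:
    def create_account(self, customer_id: str) -> bool:
        return True


class EmailService:
    def send_welcome(self, customer_id: str) -> bool:
        return True


class PostService:
    def send_welcome_pack(self, customer_id: str) -> bool:
        return True


@dataclass
class CustomerOrchestrator:
    """Central brain: calls each collaborator in turn and records progress."""
    loyalty: LoyaltyPointsBank
    email: EmailService
    post: PostService
    progress: dict = field(default_factory=dict)

    def create_customer(self, customer_id: str) -> list:
        steps = [
            ("loyalty_account", self.loyalty.create_account),
            ("welcome_email", self.email.send_welcome),
            ("welcome_pack", self.post.send_welcome_pack),
        ]
        results = []
        for name, call in steps:
            ok = call(customer_id)  # synchronous request/response
            results.append(StepResult(name, ok))
            if not ok:
                break  # the orchestrator knows exactly which stage failed
        self.progress[customer_id] = results
        return results


orchestrator = CustomerOrchestrator(LoyaltyPointsBank(), EmailService(), PostService())
results = orchestrator.create_customer("c-1")
```

The flowchart lives in one place (`steps`), which is what makes the process easy to follow, and also why all the logic tends to pile up in this one service.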
Under choreography style the author says:
With a choreographed approach, we could instead just have the customer service emit an event in an asynchronous manner, saying Customer created. The email service, postal service, and loyalty points bank then just subscribe to these events and react accordingly [...] This approach is significantly more decoupled. If some other service needed to reach to the creation of a customer, it just needs to subscribe to the events and do its job when needed. The downside is that the explicit view of the business process we see in [the workflow] is now only implicitly reflected in our system [...] This means additional work is needed to ensure that you can monitor and track that the right things have happened. For example, would you know if the loyalty points bank had a bug and for some reason didn’t set up the correct account? One approach I like for dealing with this is to build a monitoring system that explicitly matches the view of the business process in [the workflow], but then tracks what each of the services does as independent entities, letting you see odd exceptions mapped onto the more explicit process flow. The [flowchart] [...] isn’t the driving force, but just one lens through which we can see how the system is behaving. In general, I have found that systems that tend more toward the choreographed approach are more loosely coupled, and are more flexible and amenable to change. You do need to do extra work to monitor and track the processes across system boundaries, however. I have found most heavily orchestrated implementations to be extremely brittle, with a higher cost of change. With that in mind, I strongly prefer aiming for a choreographed system, where each service is smart enough to understand its role in the whole dance.
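For comparison, the choreographed version might look roughly like the sketch below. Again this is my own illustration, not the book's code: `EventBus` is a hypothetical in-process stand-in for a real message broker (which would deliver events asynchronously), and the handler names and the `customer_created` event name are assumptions.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """In-process stand-in for a message broker (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # A real broker would deliver asynchronously; here handlers run inline.
        for handler in self._subscribers[event_type]:
            handler(payload)


handled = []  # shared log, only so the example can show who reacted


def loyalty_handler(event: dict) -> None:
    handled.append(("loyalty", event["customer_id"]))


def email_handler(event: dict) -> None:
    handled.append(("email", event["customer_id"]))


def post_handler(event: dict) -> None:
    handled.append(("post", event["customer_id"]))


def create_customer(bus: EventBus, customer_id: str) -> None:
    """The customer service only announces the fact; it calls nobody directly."""
    bus.publish("customer_created", {"customer_id": customer_id})


bus = EventBus()
bus.subscribe("customer_created", loyalty_handler)
bus.subscribe("customer_created", email_handler)
bus.subscribe("customer_created", post_handler)
create_customer(bus, "c-1")
```

Note that `create_customer` has no idea who is listening. Adding a fourth reaction is just another `subscribe` call, but nothing in the code states the overall process any more, which is the monitoring problem the quote describes.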
Now, after this comes the fun. The Microservices book does not assume microservices are going to be implemented with REST. As a matter of fact, the next section of the book goes on to consider RPC and SOA-based solutions and finally REST, which is what it recommends in the end. The important point here is that microservices do not imply REST.
So, What About HATEOAS?
Now, if we want to follow the RESTful approach we cannot ignore HATEOAS, or Roy Fielding will be very much pleased to say in his blog that our solution is not truly REST. See his blog post, REST APIs must be hypertext-driven:
I am getting frustrated by the number of people calling any HTTP-based interface a REST API. What needs to be done to make the REST architectural style clear on the notion that hypertext is a constraint? In other words, if the engine of application state (and hence the API) is not being driven by hypertext, then it cannot be RESTful and cannot be a REST API. Period. Is there some broken manual somewhere that needs to be fixed?
So, as you can see, Fielding thinks that without HATEOAS you are not truly building RESTful applications. For Fielding, HATEOAS is the way to go when it comes to orchestrating services. I am just learning all this, but to me HATEOAS does not clearly define who or what is the driving force behind actually following the links. In a UI that could be the user, but in computer-to-computer interactions, I suppose that needs to be done by a higher-level service.
According to HATEOAS, the only link the API consumer truly needs to know is the one that initiates the communication with the server (e.g. POST /order). From this point on, REST is going to conduct the flow, because in the response of this endpoint, the resource returned will contain the links to the next possible states. The API consumer then decides what link to follow and moves the application to the next state.
Despite how cool that sounds, the client still needs to know if the link must be POSTed, PUTed, GETed, PATCHed, etc. And the client still needs to decide what payload to pass. The client still needs to be aware of what to do if that fails (retry, compensate, cancel, etc.).
I am fairly new to all this, but for me, from the HATEOAS perspective, this client or API consumer is a higher-order service. If we think of it from the perspective of a human, you can imagine an end user on a web page deciding what links to follow, but the programmer of the web page still had to decide what method to use to invoke the links, and what payload to pass. So, to my point, in a computer-to-computer interaction, the computer takes the role of the end user. Once more, this is what we call an orchestration service.
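A small sketch may help show where the "driving force" ends up. Everything here is hypothetical: `fake_server` is an in-memory stand-in for an API, the routes are invented, and the link format (a `rel`, `method` and `href` per link) is just one assumed convention; many hypermedia formats do not advertise the method at all.

```python
from typing import Optional


def fake_server(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Hypothetical in-memory API; routes and link format are invented."""
    if (method, path) == ("POST", "/orders"):
        return {
            "status": "created",
            "links": [
                {"rel": "self", "method": "GET", "href": "/orders/1"},
                {"rel": "payment", "method": "POST", "href": "/orders/1/payment"},
                {"rel": "cancel", "method": "DELETE", "href": "/orders/1"},
            ],
        }
    if (method, path) == ("POST", "/orders/1/payment"):
        return {
            "status": "paid",
            "links": [{"rel": "self", "method": "GET", "href": "/orders/1"}],
        }
    raise ValueError(f"no route for {method} {path}")


def follow(resource: dict, rel: str, body: Optional[dict] = None) -> dict:
    """Follow the link with the given relation, using the advertised method."""
    for link in resource["links"]:
        if link["rel"] == rel:
            return fake_server(link["method"], link["href"], body)
    raise KeyError(f"link {rel!r} not offered in the current state")


# The client hard-codes only the entry point...
order = fake_server("POST", "/orders", {"item": "book"})
# ...but it still has to choose which relation to follow and what payload to send.
paid = follow(order, "payment", {"amount": 10})
```

The server tells the client what is possible next, but the last two lines are still a little script written by someone who knows the business flow. That script is the orchestrating part.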
I suppose we can use HATEOAS with either orchestration or choreography.
In his article, Fowler also mentions the special care you must take with endpoint versioning. With a HATEOAS approach, if you do a version upgrade, how easily can that be propagated?
The API Gateway Pattern
In an approach that resembles SOA principles more than REST ones, Chris Richardson proposed what he calls the API Gateway pattern.
In a monolithic architecture, clients of the application, such as web browsers and native applications, make HTTP requests via a load balancer to one of N identical instances of the application. But in a microservice architecture, the monolith has been replaced by a collection of services. Consequently, a key question we need to answer is what do the clients interact with?
An application client, such as a native mobile application, could make RESTful HTTP requests to the individual services [...] On the surface this might seem attractive. However, there is likely to be a significant mismatch in granularity between the APIs of the individual services and data required by the clients. For example, displaying one web page could potentially require calls to large numbers of services. Amazon.com, for example, describes how some pages require calls to 100+ services. Making that many requests, even over a high-speed internet connection, let alone a lower-bandwidth, higher-latency mobile network, would be very inefficient and result in a poor user experience.
A much better approach is for clients to make a small number of requests per-page, perhaps as few as one, over the Internet to a front-end server known as an API gateway.
The API gateway sits between the application’s clients and the microservices. It provides APIs that are tailored to the client. The API gateway provides a coarse-grained API to mobile clients and a finer-grained API to desktop clients that use a high-performance network. In this example, the desktop clients makes multiple requests to retrieve information about a product, where as a mobile client makes a single request.
The API gateway handles incoming requests by making requests to some number of microservices over the high-performance LAN. Netflix, for example, describes how each request fans out to on average six backend services. In this example, fine-grained requests from a desktop client are simply proxied to the corresponding service, whereas each coarse-grained request from a mobile client is handled by aggregating the results of calling multiple services.
Not only does the API gateway optimize communication between clients and the application, but it also encapsulates the details of the microservices. This enables the microservices to evolve without impacting the clients. For examples, two microservices might be merged. Another microservice might be partitioned into two or more services. Only the API gateway needs to be updated to reflect these changes. The clients are unaffected.
Now that we have looked at how the API gateway mediates between the application and its clients, let’s now look at how to implement communication between microservices.
This sounds pretty similar to the orchestration style mentioned above, just with a slightly different intent: in this case it seems to be all about performance and simplification of interactions.
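As a rough sketch of the gateway idea described in the quote, assuming three made-up backend services (`product_info_service`, `pricing_service`, `review_service`) represented as plain functions instead of network calls:

```python
# Hypothetical backend services; in practice these are calls over the LAN.
def product_info_service(product_id: str) -> dict:
    return {"id": product_id, "name": "Example product"}


def pricing_service(product_id: str) -> dict:
    return {"price_cents": 999}


def review_service(product_id: str) -> dict:
    return {"reviews": ["Great", "Okay"]}


class ApiGateway:
    """Single entry point that hides how many services sit behind it."""

    def __init__(self, services: dict) -> None:
        self._services = services

    def proxy(self, service_name: str, product_id: str) -> dict:
        # Fine-grained request (e.g. desktop client): pass straight through.
        return self._services[service_name](product_id)

    def product_page(self, product_id: str) -> dict:
        # Coarse-grained request (e.g. mobile client): fan out and aggregate.
        page = {}
        for service in self._services.values():
            page.update(service(product_id))
        return page


gateway = ApiGateway({
    "info": product_info_service,
    "pricing": pricing_service,
    "reviews": review_service,
})
mobile_response = gateway.product_page("p-1")
desktop_response = gateway.proxy("pricing", "p-1")
```

If two of the backend services were merged later, only the dictionary passed to `ApiGateway` would change; callers of `product_page` would not notice.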
I'm building off of a previous discussion I had with Jon Skeet.
The gist of my scenario is as follows:
At this point, I understand that it would be bad practice to use the UUID generated by the client as the object's PK in my database. The reason for this is that a malicious user could modify the generated UUID and force PK collisions on my DB.
To mitigate any damages which would be incurred from forcing a PK collision on PlaylistItem, I chose to define the PK as a composite of two IDs - the client-generated UUID and a server-generated GUID. The server-generated GUID is the PlaylistItem's Playlist's ID.
Now, I have been using this solution for a while, but I don't understand why, or even believe that, my solution is any better than simply trusting the client ID. If the user is able to force a PK collision with another user's PlaylistItem objects, then I think I should assume they could also provide that user's PlaylistId. They could still force collisions.
So... yeah. What's the proper way of doing something like this? Allow the client to create a UUID, and have the server give a thumbs up/down once it has tried to save? If a collision is found, revert the client changes and notify that a collision was detected?
A nice solution would be the following. To quote Sam Newman's "Building Microservices":
The calling system would POST a BatchRequest, perhaps passing in a location where a file can be placed with all the data. The Customer service would return a HTTP 202 response code, indicating that the request was accepted, but has not yet been processed. The calling system could then poll the resource waiting until it retrieves a 201 Created indicating that the request has been fulfilled.
So in your case, you could POST to the server and immediately get a response like "I will save the PlaylistItem and I promise its Id will be this one". The client (and user) can then continue while the server (maybe not even the API, but some background processor that got a message from the API) takes its time to process, validate and do other, possibly heavy logic until it saves the entity. As previously stated, the API can provide a GET endpoint for the status of that request, and the client can poll it and act accordingly in case of an error.
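Here is a minimal in-memory sketch of that accept-then-poll flow. It is only an illustration under assumptions: `PlaylistItemApi`, its method names and the `/requests/...` status URL are invented, and `process_pending` stands in for whatever background processor actually does the work.

```python
import uuid
from http import HTTPStatus


class PlaylistItemApi:
    """Hypothetical API: accepts work immediately, saves it later."""

    def __init__(self) -> None:
        self._pending = {}  # item id -> payload awaiting processing
        self._saved = {}    # item id -> saved entity

    def post_item(self, payload: dict):
        item_id = str(uuid.uuid4())  # server-generated id, promised up front
        self._pending[item_id] = payload
        return HTTPStatus.ACCEPTED, {"id": item_id, "status_url": f"/requests/{item_id}"}

    def process_pending(self) -> None:
        """Stand-in for a background worker that validates and saves items."""
        while self._pending:
            item_id, payload = self._pending.popitem()
            self._saved[item_id] = {"id": item_id, **payload}

    def get_status(self, item_id: str):
        if item_id in self._saved:
            return HTTPStatus.CREATED, self._saved[item_id]
        if item_id in self._pending:
            return HTTPStatus.ACCEPTED, {"id": item_id}
        return HTTPStatus.NOT_FOUND, {}


api = PlaylistItemApi()
status, body = api.post_item({"title": "Some song"})   # 202, id already known
status_before, _ = api.get_status(body["id"])          # still 202
api.process_pending()                                   # worker catches up
status_after, saved = api.get_status(body["id"])       # now 201
```

Because the id is generated on the server, the client never gets a chance to force a collision, yet it can still carry on working with the promised id straight after the POST.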
What are pros and cons of using microservices in comparison with alternative architectures? Is there a rule of thumb when microservices should be used?
Sam Newman, in Building Microservices, enumerates the key benefits of microservices as follows:
With a system composed of multiple, collaborating services, we can decide to use different technologies inside each one. This allows us to pick the right tool for each job, rather than having to select a more standardized, one-size-fits-all approach that often ends up being the lowest common denominator.
A key concept in resilience engineering is the bulkhead. If one component of a system fails, but that failure doesn’t cascade, you can isolate the problem and the rest of the system can carry on working. Service boundaries become your obvious bulkheads. In a monolithic service, if the service fails, everything stops working. With a monolithic system, we can run on multiple machines to reduce our chance of failure, but with microservices, we can build systems that handle the total failure of services and degrade functionality accordingly.
With a large, monolithic service, we have to scale everything together. One small part of our overall system is constrained in performance, but if that behavior is locked up in a giant monolithic application, we have to handle scaling everything as a piece. With smaller services, we can just scale those services that need scaling, allowing us to run other parts of the system on smaller, less powerful hardware.
A one-line change to a million-line-long monolithic application requires the whole application to be deployed in order to release the change. That could be a large-impact, high-risk deployment. In practice, large-impact, high-risk deployments end up happening infrequently due to understandable fear.
With microservices, we can make a change to a single service and deploy it independently of the rest of the system. This allows us to get our code deployed faster. If a problem does occur, it can be isolated quickly to an individual service, making fast rollback easy to achieve.
Microservices allow us to better align our architecture to our organization, helping us minimize the number of people working on any one codebase to hit the sweet spot of team size and productivity. We can also shift ownership of services between teams to try to keep people working on one service colocated.
One of the key promises of distributed systems and service-oriented architectures is that we open up opportunities for reuse of functionality. With microservices, we allow for our functionality to be consumed in different ways for different purposes. This can be especially important when we think about how our consumers use our software.
If you work at a medium-size or bigger organization, chances are you are aware of some big, nasty legacy system sitting in the corner. The one no one wants to touch. The one that is vital to how your company runs, but that happens to be written in some odd Fortran variant and runs only on hardware that reached end of life 25 years ago. Why hasn’t it been replaced? You know why: it’s too big and risky a job.
With our individual services being small in size, the cost to replace them with a better implementation, or even delete them altogether, is much easier to manage.
The most important disadvantage of Microservices is that they have all the associated complexities of distributed systems, and while we have learned a lot about how to manage distributed systems well it is still hard. If you’re coming from a monolithic system point of view, you’ll have to get much better at handling deployment, testing, and monitoring to unlock the benefits. You’ll also need to think differently about how you scale your systems and ensure that they are resilient. Don’t also be surprised if things like distributed transactions or CAP theorem start giving you headaches, either!
Just quoting from Martin Fowler:
One reasonable argument we've heard is that you shouldn't start with a microservices architecture. Instead begin with a monolith, keep it modular, and split it into microservices once the monolith becomes a problem.