Micro services is an architectural style in which a system is broken into a number of distinct services, each typically running on its own host and using lightweight integrations to talk to other services.
A micro services architecture helps address some of the limitations described above by splitting the system into autonomous services that can be deployed independently. Services are typically deployed on a dedicated host and communicate with one another using lightweight integrations such as REST or RPC. There’s nothing to stop multiple services running on the same host, but it makes sense to deploy them separately so that in the event of a host failure only one service is affected.
The key benefit of splitting a system into distinct services is that it allows us to develop, deploy and manage the life cycle of each service independently. Code changes to a single service result in the deployment of just that service, while the rest of the application remains unaffected. This allows a team to make changes to its service and deploy new functionality quickly, in contrast to the large coordinated effort required for changes to a monolith.
Fig 1.1 – Micro Services Architecture
In order to achieve a high level of autonomy, loose coupling between services is essential. If services are tightly coupled, a change in one service may directly impact a consuming service and result in both services having to be updated and deployed together. A micro service architecture seeks to avoid this kind of lock-step deployment by ensuring services are as loosely coupled as possible. Loose coupling can be difficult to achieve and requires careful consideration by both service providers and consumers. Below are a number of important considerations when it comes to loose coupling:
- Databases
- Services should avoid sharing database schemas with other services outside of their domain. A shared schema means a shared data model and usually results in tightly coupled components. If we decide to evolve the data model we’ll impact every other service that uses it. This means we can no longer evolve a service independently of other parts of the system and forces us to coordinate change with the owners of other services.
- A dedicated database doesn’t mean that each service needs its own database server. You can use a single database with each service having a dedicated schema.
- Each service having its own data store provides greater flexibility and allows teams to choose the technologies that are right for them. For some teams that may be a traditional relational database, while others might choose a NoSQL data store.
- Service Interface
- Service interfaces should expose only the parts of the data model that are required. Exposing some form of data model to the client is essential, but it’s important that it remains lean and does not contain any more information than is absolutely necessary.
- Ensure that the data model you expose is decoupled from the internal data model. Exposing the internal data model means exposing unnecessary implementation detail. For example, if a service deals with Customer information, the entity that represents a customer inside your service should not be exposed to clients. Instead you could expose a model with just the data required by the client and no more. This way you can evolve your internal data model without breaking the client. Keeping internal implementation detail hidden is an important part of loose coupling.
- Integration Technology
- Choose an integration technology that lends itself to loosely coupled integration.
- REST integration uses well defined web standards and is a popular choice for loosely coupled integration. REST is platform agnostic, allowing services written in different technologies to easily talk to one another. It doesn’t mandate specific messaging formats, giving you the flexibility to choose the format that suits you best.
- Avoid integration technologies that tightly couple client and service through a shared model. Exposing services using WSDL, for example, typically requires the consumer to generate a client-side stub based on the exposed interface. This can make it more difficult to evolve the service without breaking the client. Changes to the service WSDL often require clients to regenerate their client-side code in order to realign with the new service interface.
- RESTful integration with plain XML or JSON over HTTP allows services to evolve their interface without necessarily breaking consumers.
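The point above about decoupling the exposed model from the internal one can be sketched in a few lines. This is a minimal Python illustration with hypothetical field names: only a lean view of the internal entity is mapped onto the wire format, so internal fields can change without breaking the contract.

```python
from dataclasses import dataclass

# Internal entity: an implementation detail, never exposed directly.
# (The fields here are hypothetical, for illustration only.)
@dataclass
class Customer:
    customer_id: int
    name: str
    email: str
    credit_score: int      # internal only
    internal_notes: str    # internal only

def to_customer_view(customer: Customer) -> dict:
    """Map the internal entity to the lean model exposed to clients."""
    return {
        "id": customer.customer_id,
        "name": customer.name,
        "email": customer.email,
    }
```

Because clients only ever see the view, we can rename or restructure the internal fields without coordinating a release with every consumer.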
Achieving loose coupling requires discipline from service clients too. Obviously we won’t always have control of the client applications calling our services, but where we do, the following points are worth considering.
- Clients should consume services in a way that is tolerant to change, implementing what is known as tolerant readers.
- It’s preferable that clients apply minimal validation and extract only the data they need from the service response, ignoring the rest.
- If a service interface evolves and adds two new fields to an XML response, the client code can simply ignore the extra data. If required, the client code can be updated at some point in the future to read the new fields.
- Clients that implement tolerant readers allow the services they consume to evolve without breaking changes.
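The tolerant reader idea can be sketched as follows. This is a minimal Python example with hypothetical field names, using a JSON response rather than XML: the client extracts only the fields it needs and silently ignores anything else the service chooses to include later.

```python
import json

def read_order(payload: str) -> dict:
    """Tolerant reader: pull out only the fields this client needs,
    ignoring any other data present in the service response."""
    data = json.loads(payload)
    return {
        "order_id": data["orderId"],
        "total": data["total"],
    }

# A response that has gained extra fields still parses fine:
response = '{"orderId": 42, "total": 19.99, "currency": "EUR", "promoCode": null}'
```

The service added `currency` and `promoCode` to its response, but the client above carries on working unchanged.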
Modelling Services on Business Concepts
Splitting a system into loosely coupled services and defining the responsibilities and boundaries of those services is a fundamental step in implementing a micro services architecture. You should start by identifying natural boundaries in the business domain. Most organisations are split into distinct business areas, each responsible for performing specific functions. Consider an online retailer, for example. You could break this type of business into the following areas (obviously such a business could have many more distinct areas, but for the sake of simplicity we’ll go with the list below):
- public facing web app where customers can browse products and place orders
- payments processing
- warehouse order processing
- sales and marketing department
- finance department
While each of these areas is responsible for performing a specific business task they cannot exist in isolation, and rely on interactions with other parts of the business. The point at which one business area interfaces with another can be thought of as a domain boundary. By mapping out distinct business areas and their boundaries with other parts of the system, we start to get a picture of how we might model the services in our micro service architecture.
Warehouse order processing, from the list above, is an example of a business domain that could be modelled with a set of dedicated services. This would allow the services to evolve independently of services in other business domains. The development team could build, test and deploy new functionality for this business domain without disrupting the wider application.
Dealing with Change
The type of change is what ultimately dictates the level of disruption to the wider application. If service boundaries/interfaces don’t change and the service updates are internal to the business domain, the rest of the system should be insulated from changes in one business domain. For example, an architect may decide to change the service persistence layer. While this may involve considerable change within the service, it remains an internal implementation detail and shouldn’t impact other parts of the application.
Service boundary changes, on the other hand, involve altering the way a service integrates with external components and typically mean a change to existing interfaces. This type of change has the potential to be more disruptive because changes to an exposed contract may break service clients. An example might be adding new fields to a REST endpoint. Such a change will require coordination with other teams, and the new interface will have to be tested to ensure it hasn’t broken consumers. If an updated interface can’t be handled gracefully by all clients, time will need to be set aside to allow client applications to make the required changes. Interface changes that break clients are more painful because they require components from multiple business domains to be tested and deployed together. This type of coupling is what we’re trying to minimise with a micro services architecture.
Robust Integration
An application consisting of many distinct services poses a number of challenges when it comes to component integration. The more granular we make our services, the more integration points we have to deal with, so it’s important our service integrations are as robust as possible. The techniques mentioned below are applicable to any distributed system, but are particularly important for micro services where we’re potentially dealing with a large number of remote components.
- Retry failed remote calls
- Network glitches are common even on robust cloud infrastructure. We must assume that remote calls will fail from time to time and put measures in place to deal with these failures.
- Implementing a retry mechanism allows us to re-execute remote calls in the event of a network failure. This is especially useful for dealing with short term network glitches. A typical approach is to retry a call 3 times (configurable) with a short back-off period between each call.
- Circuit breakers
- Circuit breakers limit the number of times we attempt to call a slow or unresponsive service by monitoring previous failed attempts.
- After a predefined threshold has been reached the circuit breaker will trip; any further attempts to call the service will result in the circuit breaker skipping the remote call and immediately returning an error response.
- From a client’s perspective this means reduced latency, as we no longer have to wait for multiple attempts to call a service that will likely fail. An immediate error response from the circuit breaker allows the client to deal with the error right away.
- It also benefits the target service by reducing the number of incoming requests and may provide an opportunity for the service to recover if struggling under heavy load.
- Connection Pooling
- A single connection pool can be quickly exhausted by multiple calls to a slow or unresponsive service.
- Exhausting the connection pool will result in other processes being unable to make remote calls at the same time.
- A sensible approach is to have dedicated connection pools for outbound service calls. In the event of one set of service calls running slowly, all available connections won’t be monopolised and other remote calls can proceed.
- Connection Timeouts
- Sensible timeouts are important to ensure we manage slow or unresponsive remote calls.
- Timeouts should be fine-tuned for each remote call depending on expected performance and response times.
- Using timeout values that are too long, or worse still no timeouts at all, can lead to unacceptable latency for client applications.
- Timeout values that are too short result in failed calls that might have otherwise succeeded had the target service been given more time to process the request.
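The retry and circuit breaker techniques above can be sketched in a few lines. This is a minimal, single-threaded Python illustration rather than a production implementation (real systems typically reach for a dedicated resilience library and add thread safety, half-open states and per-call timeouts):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: trips after `threshold` consecutive failures,
    after which remote calls are skipped and an error is raised immediately."""
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    def call(self, remote_call):
        if self.failures >= self.threshold:
            # Circuit is open: fail fast instead of making the remote call.
            raise RuntimeError("circuit open: skipping remote call")
        try:
            result = remote_call()
            self.failures = 0  # success resets the failure count
            return result
        except Exception:
            self.failures += 1
            raise

def call_with_retries(remote_call, retries: int = 3, backoff: float = 0.1):
    """Retry a remote call with a short back-off between attempts,
    to ride out short-term network glitches."""
    for attempt in range(retries):
        try:
            return remote_call()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the failure
            time.sleep(backoff)
```

A transient glitch is absorbed by the retries, while a persistently failing service eventually trips the breaker and callers get an immediate error instead of waiting on doomed calls.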
Monitoring and Metrics
We know from experience that things can and will go wrong in a production environment. It’s important to proactively monitor application health so that we can identify issues as soon as they happen and react accordingly. Once we’ve identified that there’s an issue, we need access to application metrics so that we can identify the root cause and do something about it. Application metrics are important in any production environment but are of particular significance in a micro services architecture. A system made up of many distinct services has the potential to fail at many points, so it’s important we have a fine-grained view of each component’s health so that we can quickly identify and resolve issues.
- Health Checks
- A common approach is to have a load balancer periodically ping the application to check its health. If the load balancer doesn’t get a successful response code (HTTP 200) within a certain period, it may take the service out of action by no longer routing traffic to it. For example, this is implemented on AWS by setting up health checks using an Elastic Load Balancer. You can configure the Elastic Load Balancer to send periodic HTTP requests to a health endpoint on the application. If the service responds successfully within a predefined period (say 3 seconds), the ELB will assume that the server instance is healthy.
- We can set different tolerances for different services or environments. For example on a pre-production environment we might configure a response time of 8 seconds for health checks before deciding an instance is unresponsive. On a production environment we may have lower tolerances and decide an instance is unresponsive after 3 seconds.
- Real time application and infrastructure metrics are key to fault finding in a production environment and should be available for every service.
- Below are some metrics that I’ve found useful when troubleshooting issues in the past:
- JVM Metrics
- Threadpool metrics
- Database connection pool metrics
- Remote service call times
- DB query times
- Cache metrics
- Host CPU usage
- Host memory usage
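The health check behaviour described above boils down to a simple decision. Here is a sketch, assuming the ELB-style rule that an instance is healthy only if it returns HTTP 200 within the configured response window (the timeout values are the examples used above):

```python
def is_healthy(status_code: int, response_time: float, timeout: float = 3.0) -> bool:
    """Mimic a load balancer health check: an instance is considered healthy
    only if it returns HTTP 200 within `timeout` seconds."""
    return status_code == 200 and response_time <= timeout
```

The same function expresses the per-environment tolerances: production might use the default 3 seconds, while a pre-production environment could pass `timeout=8.0`.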
Scaling Micro Services
- Scaling Vertically
- Vertical scaling is where we increase host resources such as CPU, memory or disk storage. In the cloud this is typically the easiest way to scale a service to handle increased demand.
- Scaling vertically is useful but is ultimately limited by the resources available on a single host. To achieve real scalability we need to look beyond single instances and run components across multiple hosts.
- Scaling Horizontally
- Scaling horizontally is where we deploy service instances across multiple hosts.
- Load balancers are used to distribute HTTP requests across the various service instances.
- In the event of an instance failure, the load balancer will stop routing traffic to the unhealthy instance. It will continue to route requests to healthy instances. The load balancer will use health checks (discussed earlier) to decide whether or not an instance is healthy enough to accept requests.
- Auto Scaling
- Auto scaling is where we use events or infrastructure metrics to trigger a change in infrastructure.
- A failed health check is an example of an event that can trigger the provisioning of a new server instance, in this case to replace an instance that is no longer responsive.
- CPU and memory metrics can also be used to trigger the provisioning of new instances.
- Alarms can be created that, if triggered, will result in new server instances being started and registered with the load balancer. This is a great way to respond automatically to increased load on a service.
- We can also use auto scaling to scale in when load decreases. We could trigger a scale in event by responding to server CPU or memory usage dropping below a predefined threshold. The ability to scale back in is important for managing costs.
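The scale-out/scale-in decision described above can be sketched as a simple threshold check over recent CPU samples. The thresholds here are hypothetical, and in practice the policy lives in the infrastructure (for example CloudWatch alarms driving an auto scaling group on AWS) rather than in application code:

```python
def scaling_decision(cpu_samples, scale_out_at: float = 70.0, scale_in_at: float = 30.0) -> str:
    """Decide a scaling action from recent CPU utilisation samples (percent).
    Scale out when average utilisation breaches the upper threshold,
    scale in when it drops below the lower one, otherwise do nothing."""
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg >= scale_out_at:
        return "scale-out"
    if avg <= scale_in_at:
        return "scale-in"
    return "no-change"
```

Keeping the two thresholds well apart avoids flapping, where instances are repeatedly started and terminated as load hovers around a single threshold.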
In some ways micro services are not dissimilar to the SOAs I’ve worked on in the past. The style takes things further by encouraging a greater level of service granularity and therefore a larger number of distinct components. This introduces new complexity in terms of testing, deployment, integration and monitoring. This complexity can be offset to some degree with automation, especially around testing and the provisioning of infrastructure.
As well as greater service granularity, micro services encourage splitting services by business domain and decoupling those domains as much as possible. This allows teams to develop, test and deploy new functionality independently of other parts of the system. This autonomy should translate to faster release cycles and allow teams to deliver new features quickly, without being locked into a fixed release cycle with other dev teams. I suspect this level of autonomy doesn’t come cheap though, and to get it right would require considerable discipline during design and implementation.