I was involved in a recent discussion regarding how to improve a reporting microservice that aggregates data from numerous services. There were several issues with this reporting service, but the major ones that needed addressing were:
- This service’s data gets corrupted, and
- New representations of the consumed data are requested.
A so-called “take-on” must be performed to fix these issues. This “take-on” is a process through which the reporting service’s data is purged; new data is then fetched from the relevant services’ databases, transformed, and stored afresh.
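The take-on described above can be sketched as follows. This is a minimal illustration, not the real system: the service names, row shapes, and transform are all assumptions.

```python
# Hypothetical sketch of the "take-on" process: purge, fetch from each
# service's internal store, transform, and store afresh.

def transform(service, row):
    # Flatten each source row into the reporting service's own representation.
    return {"source": service, **row}

def take_on(reporting_db, service_dbs):
    """Purge the reporting store, then rebuild it from each service's database."""
    reporting_db.clear()                  # 1. purge the corrupted data
    for name, rows in service_dbs.items():  # 2. read each service's internal store
        for row in rows:                     #    (direct access: implementation coupling)
            reporting_db.append(transform(name, row))  # 3. transform and store

# Usage with in-memory stand-ins for the real databases:
reporting = []
sources = {
    "orders":  [{"id": 1, "total": 90}],
    "billing": [{"id": 7, "paid": True}],
}
take_on(reporting, sources)
```

Note how the `take_on` function must know every participating service and the layout of its rows, which is exactly the coupling discussed below.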
There are myriad issues with this approach, not least the intimate knowledge this aggregator service has of each service’s internal datastores and models. This is a form of implementation coupling.
The suggestion was that APIs could remedy this by allowing the reporting service to pull the required data. This, however, does not resolve implementation coupling if the types are shared. Furthermore, retrieving data this way, much like the direct database access above, degrades the participating services’ performance.
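The shared-type problem can be made concrete with a small sketch. Here, purely as an assumed example, the reporting service deserialises API responses into the order service’s own internal model, so any change to that model is a breaking change for the consumer.

```python
from dataclasses import dataclass

@dataclass
class OrderRecord:
    # The order service's internal model. Renaming or re-typing a field
    # here silently breaks every consumer that shares this type.
    id: int
    total_cents: int

def report_row(order: OrderRecord) -> dict:
    # The reporting service is coupled to OrderRecord's exact field names,
    # even though it only talks to the order service via an API.
    return {"order_id": order.id, "total": order.total_cents / 100}
```

Sharing the API is not the issue; sharing the internal type is, because it turns an internal model into an external contract.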
Additionally, this does not tackle behavioural and temporal coupling. Behavioural coupling exists because the reporting service knows where each piece of data comes from and which service is responsible for it. This means that if the responsibilities of any service change, the reporting service must change too.
Temporal coupling is present, too: all the services must be available to kick off the “take-on” process, and if any one of them is down, the take-on cannot succeed. On-call engineers ran the process overnight to increase its likelihood of success; this was also a way to minimise the risk of degrading the performance of participating services.
So, to recap: the reporting service cannot work without these “dependent” services, and any change to them could potentially impact the aggregator. Furthermore, the coupling to internal models adds both knowledge and implementation coupling, making it impossible to evolve the services safely, as the internal models are, in effect, external contracts.
Getting data to an aggregator is a well-known problem in SOA systems, one that has long been addressed via “data pumps”. This was one of the first solutions to be suggested.
Data was to be pumped through a queue to the reporting service. The reporting service processes the events and stores the relevant information in its local read models/materialized views.
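The pump can be sketched with an in-process queue standing in for the real message broker. The event shapes and field names below are assumptions for illustration: each service publishes change events, and the reporting service folds them into its local read model.

```python
import queue

# Services publish change events to a queue (here, an in-process stand-in
# for a real broker); the reporting service consumes them and maintains
# its own materialised view, with no access to the services' databases.
events = queue.Queue()
events.put({"entity": "order", "id": 1, "total": 90})
events.put({"entity": "order", "id": 1, "total": 95})  # later update to the same order

read_model = {}  # the reporting service's local view, keyed by (entity, id)

while not events.empty():
    event = events.get()
    # Upsert: the latest event seen for an entity wins in the read model.
    read_model[(event["entity"], event["id"])] = event
```

The reporting service now depends only on the event contract, not on any service’s internal datastore.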
This all seems like a very reasonable solution; however, message queues are not ideal for this scenario. A reporting service may be considered a form of read model — albeit one that gets its data from many services. Message queues are not a good fit for maintaining read models for the following reasons:
- Messages are gone once consumed, and
- Ordering of messages is not guaranteed.
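The first limitation is easy to demonstrate with Python’s standard-library queue, which here stands in for a typical broker’s consume-and-acknowledge behaviour:

```python
import queue

# Once a consumer takes a message off the queue, the queue retains no copy.
# A read model therefore cannot be rebuilt later by replaying the queue.
q = queue.Queue()
q.put("order-created")

first_read = q.get()   # the reporting service consumes the message...
assert first_read == "order-created"
assert q.empty()       # ...and nothing is left to replay
```

If the read model is ever corrupted or a new representation is needed, the events that built it are gone, which is precisely the situation the take-on was invented to handle.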
In the next post in this series, I will delve deeper into these issues and how they impact an aggregation service such as this.