This is the final part of the article series. In the previous parts, we looked into what DevOps is and covered its core principles and practices. In this part, I would like to look closely at one practice in particular, one which can easily be overlooked, with grave consequences.
Continuous monitoring
Things do not always go according to plan. Things break. This is a fact of life, and, although it is not technically impossible to make error-free software, there is almost always something which goes wrong. Monitoring software is the first place to start, but it goes beyond finding problems. It includes analyzing current performance and characteristics with the goal of improving. Instead of telling you why it is important straight out, I would like to tell you about an experience I had.
I started on a project where the client needed help fixing some integrations which had broken about a month prior. After an introduction to the platform and an alarming lack of documentation, I began the witch hunt for the broken connection. There was a variety of code smells and subpar choices in the code, but what really frustrated me above all else was the fact that there was next to no logging in the code. Scattered were a few logs indicating certain activities had been started, but when a problem arose, the only handling ever was a call to a function which would send an email containing the problem to a hard-coded IT support email address. As you can imagine, this frustrated me because I could not trace the issue and attempt to reproduce the problem while finding what specifically went wrong. I brought this to the attention of the customer who was also alarmed by the daily-growing number of emails in the inbox and understood that this was a very inefficient way of monitoring and solving issues. We agreed that I would use some time to set up proper connections from all services to a monitoring platform and add logging in the code where it would give value for future use.
The principle of continuous monitoring is to alert the support team and allow them to efficiently investigate what went wrong and where. The fundamentals include proper logging throughout the application code and a mechanism for accessing those logs. Depending on the type of system, the exact setup may look very different. A setup which I have used (and thoroughly enjoyed once I got the hang of it) was built on the OpenTelemetry suite of tools to collect logs from services and store them to be accessed by Grafana (a data visualization and query platform) along with a number of other fantastic monitoring tools from the same ecosystem. This setup allowed us to keep tabs on what our software was doing, as well as locate errors in real time. Figure 1 shows a view of Grafana being used to filter through logs (label filters), visualize the number of logs (and their levels) over time, and see the content of those logs. The tool includes powerful query abilities and useful features such as built-in JSON parsing for log contents.
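To make use of JSON parsing like that, the services have to emit logs in a parseable shape. The following is a minimal sketch, not the setup from the project described above: it uses only Python's standard `logging` module to write one JSON object per log line. The `JsonFormatter` class, the `build_logger` helper, the `context` field, and the service name are all hypothetical names chosen for illustration, and shipping the lines to a collector is left out.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            # getMessage() merges the template with its arguments.
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("orders-service", sys.stdout)
    log.info(
        "Order %s accepted",
        "A-1001",
        extra={"context": {"customer_tier": "gold"}},
    )
```

Writing to standard output keeps the service unaware of where the logs end up, so the collection side can change without touching the application code.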
Continuous monitoring will not solve the issues that arise, but it will at least give you the awareness and the tools needed to locate the source of a problem. The other, equally important part is making sure those fundamentals actually produce value through good logging practices. As this topic deserves an article of its own, I will quickly name a few good practices to whet your appetite.
- Log where necessary to fill in the gaps where state is important: A simple deterministic function probably does not need to be logged, but a function which executes differently based on user input deserves a log containing the determining value in order to rebuild the state when debugging.
- Use string templating: Loggers will usually have a way to log a string template with supplementary arguments to populate the values. This is extremely beneficial when querying logs (e.g. filtering all logs by a given template).
- Write good log messages: There is nothing more frustrating than an ambiguous log message which is not very helpful in the first place and matches multiple places in the code.
- Never log sensitive or secret data: This should be obvious, but it is worth mentioning nonetheless. Because logs are generally persisted, saving sensitive personal data may breach compliance regulations (e.g. GDPR) and saving secrets can result in inadvertent data breaches.
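The practices above can be sketched in a few lines of Python using the standard `logging` module. Everything here is an assumption made for illustration: the `sync_customer` function, the sync modes, the logger name, and the set of sensitive keys are hypothetical and not taken from a real project.

```python
import logging

logger = logging.getLogger("integration-sync")

# Hypothetical list of keys whose values must never reach the logs.
SENSITIVE_KEYS = {"password", "token", "api_key", "ssn"}


def redact(data: dict) -> dict:
    """Return a copy of data with sensitive values masked."""
    return {
        key: ("***" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in data.items()
    }


def sync_customer(customer_id: str, mode: str, payload: dict) -> str:
    # Log the value that decides the code path, so the state can be
    # rebuilt when debugging. The message is a template with arguments,
    # not a pre-formatted string, which keeps the template searchable.
    logger.info("Starting customer sync customer_id=%s mode=%s", customer_id, mode)

    if mode == "full":
        result = "replaced"
    elif mode == "delta":
        result = "patched"
    else:
        # A specific message that appears in exactly one place in the code,
        # with the payload redacted before it is logged.
        logger.error(
            "Unknown sync mode customer_id=%s mode=%s payload=%s",
            customer_id,
            mode,
            redact(payload),
        )
        raise ValueError(f"unknown sync mode: {mode}")

    logger.info("Finished customer sync customer_id=%s result=%s", customer_id, result)
    return result
```

Because each message is a fixed template, a query such as "all logs matching `Unknown sync mode`" points to a single place in the code.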
In this section, I focused on the logging aspect of continuous monitoring. While it is a vital part, collecting and analyzing metrics is also quite useful. There is more to continuous monitoring than logging, but it is a good start.
Architecture
This article series has focused primarily on what DevOps is, but there are more aspects to consider when using DevOps than DevOps itself. The architecture of the software is one of these aspects. Two very common software architectures are monolithic and microservice.
As the name suggests, a monolithic architecture involves a single codebase for a large software system, generally containing many components which are tightly coupled together. While there is certainly a time and place for monoliths, there are both advantages and disadvantages to this approach. The main upside is the simplicity of the architecture. There is generally not a lot of overhead or engineering required for the components to communicate and work together. However, due to its nature, even the smallest of changes requires the entire application to be rebuilt and deployed. As you may begin to suspect, this architecture is not always the best suited to the task when the goal is to utilize efficient DevOps practices.
The other architecture I mentioned, the microservice architecture, has some advantages over the monolithic architecture when it comes to DevOps practices. A microservice architecture relies on many small, isolated services which are individually deployed and communicate with each other to form the entire system. The advantage over the monolithic architecture may be immediately obvious. Given that each individual service has its own codebase and is deployed individually, developers can make changes to a single service, test it, and deploy it without ever touching any of the other services. This gives development teams the ability to iterate more quickly on individual parts of the system, as each part is independent of the others. It is also true, however, that this architecture comes with the drawback of requiring a resilient communication system among services to facilitate the business logic of the system.
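One small building block of resilient communication between services is retrying a failed call with an increasing delay. The sketch below is only an illustration of that idea under stated assumptions: `call_with_retry` is a hypothetical helper, the operation is assumed to signal a transient failure by raising `ConnectionError`, and a real system would likely add timeouts, jitter, and a limit on which errors are retried.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The sleep function is injectable so the helper can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: let the caller see the failure
            sleep(base_delay * 2 ** (attempt - 1))

    raise AssertionError("unreachable")
```

A caller would wrap the request to another service in a function and pass it in, for example `call_with_retry(lambda: fetch_inventory("sku-42"))`, where `fetch_inventory` is again a hypothetical function standing in for the real call.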
Culture
This would not be an article about DevOps if I did not talk about the cultural aspect. It is not surprising that DevOps relies heavily on a culture which will support such an “agile” approach to the processes I have covered in this article. The continuous nature of each of the principles may require team members to change how they are used to working. To reap the benefits of continuous integration and not suffer the consequences of merge hell, team members should merge their code changes to the master branch regularly. Some (notably, Mark Seemann in Code That Fits In Your Head) would argue that this should happen at least every 4 hours. Personally, I have not worked on a project with a team where I feel it would have made sense to merge at least every 4 hours, but I do believe there are cases where it is applicable, and I can appreciate the point it tries to convey. Continuous delivery/deployment should mean regular releases to avoid overly large deployments requiring much oversight and manual input (or testing).
Given the variety of individual use cases, DevOps is not “one size fits all.” Rather, it is a way of looking at the individual processes and deciding how best to structure software, projects, teams, and processes to fit the goals the team has set.
This is the final part of my article series in which we have learned what DevOps is, why it is important to us, and how we can use it in our daily software development. More than anything else, I hope this article conveyed the importance of the right mindset and willingness to make use of good practices.