
MAKING MONITORING PERVASIVE

Historically, monitoring was something you could manually deploy to a set of physical or virtual infrastructure, and once it was in place, you were good to go. Alerts and events would fire if something went wrong, and the monitoring tool would simply notify you with details about the incident. IT operations teams would nurse their infrastructure back to health, and all was good and right in the world again.

Fast-forward to today, however, and you'll find a much different situation: one where enterprises are increasingly embracing a culture of DevOps and the world of monitoring is reckoning with the need to evolve and adapt. Production environments are no longer static or perpetual places where monitoring is a one-and-done activity. Rather, concepts like observability, where teams take a holistic approach to measuring and understanding digital experiences, are being baked into the release management pipeline to ensure every application and every bit of infrastructure being deployed has monitoring attached to it. But how do you get there? What tools and technologies are ushering in this new age of monitoring and observability? The not-so-surprising answer: an intelligent combination of automation and data.

A Call to Automation

As the velocity of software release cycles increases, the manual approach to deploying monitoring agents or instrumenting everything in the release for proper observability becomes impractical and counterproductive. Can you imagine a developer checking in code to a source code management repository, creating a ticket in an ITSM system, and requiring someone within IT operations to manually deploy monitoring to that release? That flies in the face of speed, agility, and what a culture of DevOps is trying to accomplish (releasing better software, faster).

Leveraging automation tools, such as Ansible, Puppet, or Chef, is a great first step toward making monitoring pervasive with every release. This may involve the automated deployment and installation of a monitoring agent, the customization of a configuration file to define what gets monitored, or API calls to the control plane of a monitoring system to register new systems for observation. Fortunately, many continuous integration (CI) tools either have these automation capabilities out of the box or integrate well with third-party automation tools.
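To make the API-call path concrete, here is a minimal sketch of a CI step that registers a newly deployed host with a monitoring system's control plane. The endpoint URL, payload shape, and check names are all hypothetical; every vendor defines its own registration API, so treat this as an illustration of the pattern rather than any real product's interface.

```python
import json
import urllib.request

# Hypothetical control-plane endpoint; real monitoring vendors each
# expose their own registration API with their own schema.
MONITORING_API = "https://monitoring.example.com/api/v1/hosts"


def build_registration(hostname, environment, checks):
    """Build the JSON payload describing what to observe on a
    newly deployed system."""
    return {
        "hostname": hostname,
        "environment": environment,
        # Each check names a probe the agent or backend should run.
        "checks": [{"name": c, "enabled": True} for c in checks],
    }


def register_host(payload, api_url=MONITORING_API, token="CI_SECRET"):
    """POST the registration from a CI pipeline step (network call;
    shown for shape only, not executed here)."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)


payload = build_registration("web-01", "staging", ["cpu", "http_latency"])
```

Because the payload builder is separated from the network call, the pipeline can also log or diff the registration before sending it.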

Another growing trend we're seeing in the quest toward making monitoring pervasive is "observability as code." In the same way that infrastructure as code seeks to codify infrastructure components as configuration files, using JSON or YAML, for consistent and repeatable use, we can apply the same idea to monitoring and observability. Increasingly, we're seeing observability vendors develop Terraform providers to do just that. Head over to the Terraform Registry and search for your favorite observability vendor to see if they have something listed.
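As a sketch of what "observability as code" can look like, the snippet below renders a monitor definition in Terraform's JSON configuration syntax. The provider name (`examplemon`) and its attribute schema are invented for illustration; each vendor's real Terraform provider defines its own resource types and fields.

```python
import json

# Hypothetical Terraform-JSON monitor resource. The "examplemon"
# provider and its attributes are made up for illustration only.
monitor_config = {
    "resource": {
        "examplemon_monitor": {
            "checkout_latency": {
                "name": "checkout p95 latency",
                "query": "p95(service:checkout.request_ms)",
                "threshold_ms": 500,
                "notify": ["team-payments"],
            }
        }
    }
}

# Serializing the definition to a .tf.json file puts the monitor under
# version control, so the same review and CI workflow used for
# application code applies to the monitoring itself.
rendered = json.dumps(monitor_config, indent=2)
```

Checking this file into the same repository as the application means a code review that changes a service can also review the monitors that watch it.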

A Data-Driven Approach

Once a repeatable structure has been established to deploy observability in an automated fashion, we can take things a step further in the use and maturity of observability data. If the goal of DevOps is to release better software more quickly, the question of how we do that naturally follows. Observability provides one answer: by leveraging data about the health, performance, and availability of the application while it runs in a pre-production, lower-tier environment. You might be asking yourself, "does that mean monitoring and observability should be deployed outside of just production?" The short answer is yes. There are a host of benefits to having the telemetry data to make better-informed decisions about whether or not to promote code from a pre-production environment to a production environment. Many in the marketplace have begun to call this process "release verification."

Among the largest data sources driving this automated decision-making are the metrics, logs, and traces from our observability stack. If our APM tool identifies too much latency, the logs contain too many errors, or our applications or infrastructure emit anomalous metrics, then our tooling, integrated into our delivery pipelines, should throw up a red flag and say, "Stop! Don't release or promote this code." In addition to taking automated action, that integration also creates a quick feedback loop to the development team, providing insight into why the release failed so developers can spend more time on value-add activities and less time troubleshooting.
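A release-verification gate like the one described above can be sketched as a small function a pipeline calls before promotion. The thresholds and telemetry field names here are illustrative assumptions, not any vendor's API; the point is the shape: telemetry in, a pass/fail decision plus human-readable reasons out.

```python
# Minimal sketch of a "release verification" gate for a delivery
# pipeline. Thresholds and telemetry field names are assumptions
# chosen for illustration.

def verify_release(telemetry, max_p95_latency_ms=300, max_error_rate=0.01):
    """Return (ok, reasons). ok=False blocks promotion; reasons feed
    the feedback loop back to the development team."""
    reasons = []
    if telemetry["p95_latency_ms"] > max_p95_latency_ms:
        reasons.append(
            f"p95 latency {telemetry['p95_latency_ms']}ms exceeds "
            f"{max_p95_latency_ms}ms"
        )
    if telemetry["error_rate"] > max_error_rate:
        reasons.append(
            f"error rate {telemetry['error_rate']:.2%} exceeds "
            f"{max_error_rate:.2%}"
        )
    if telemetry.get("anomalous_metrics"):
        reasons.append(
            "anomalous metrics: " + ", ".join(telemetry["anomalous_metrics"])
        )
    return (not reasons), reasons


# Example run: latency is over threshold, so promotion is blocked.
ok, reasons = verify_release({
    "p95_latency_ms": 420,
    "error_rate": 0.002,
    "anomalous_metrics": [],
})
```

In practice the `telemetry` dict would be populated by querying the observability backend's API for the pre-production environment, and the `reasons` list would land in the pipeline output or a pull-request comment.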

Conclusion

Monitoring maturity has come a long way in a short period of time, especially as the ways infrastructure and applications are architected and released to market have evolved. Learning to become proactive in the use of monitoring and observability data as part of a CI/CD pipeline can be new to a lot of traditional operations teams (particularly for those who have historically focused on incident response activities). However, employing an automated and data-driven approach to the use of observability will result in better software, differentiated products and services, happier customers, and outpaced competition.

Contributing Author: Johnny Hatch
