Using end-to-end observability for cyber, CX improvements
Brian Mikkelsen, the vice president for US public sector at Datadog, said reducing tool complexity helps agencies understand how their systems are working.
The Office of Management and Budget’s 2022 IT Operating Plan highlighted the need to reduce the complexity of systems to bring down costs. And, of course, it promoted the idea of using data to drive better decisions.
Over the years, agencies and vendors have made their technology environments increasingly complex through too many bespoke tools and a lack of data integration. With all the challenges that have accumulated over the last 20-25 years, OMB has pushed agencies toward enterprise services as one way to overcome many of these IT modernization obstacles.
There are other opportunities for agencies to become more efficient, more secure and improve how they deliver services. One is the use of end-to-end observability tools, which can help agencies innovate by consolidating the tools they use, reducing the complexity of those tools and, of course, giving them visibility across the entire technology stack.
Brian Mikkelsen, the vice president and general manager for US public sector at Datadog, said end-to-end observability gives organizations an opportunity to observe or monitor any application and any infrastructure, anywhere, whether it runs on premises or in the cloud.
“The first of the three pillars of observability is infrastructure metrics. This is understanding the health of my operating systems, my virtual machines, my containers, all the way up into cloud native serverless functions,” Mikkelsen said during the discussion Innovation in Government, sponsored by Carahsoft. “It’s infrastructure metrics paired with application traces, so now I’m starting to think about, on top of that infrastructure, where am I running my applications, whether it’s on-premise or in the cloud, but what can I actually see in terms of how my applications are performing? What are they doing from a memory constraints perspective? What’s their overall performance? How much lag time is there between requests and actions? The third pillar of observability is logs. The end-to-end observability part is really this idea that we’re creating context for the users of these systems.”
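Those three pillars become most useful when metrics, traces and logs share common context, such as a request identifier an operator can pivot on. The sketch below is a minimal, hypothetical illustration in plain Python, not Datadog’s or any vendor’s API; the emit_metric helper, the span context manager and the service name are invented for the example.

```python
import logging
import time
import uuid
from contextlib import contextmanager

logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s", level=logging.INFO)
log = logging.getLogger("checkout-service")  # hypothetical service name

METRICS = []  # stand-in for a metrics backend


def emit_metric(name, value, tags):
    """Hypothetical metric emitter: records a named measurement with tags."""
    METRICS.append({"metric": name, "value": value, "tags": tags})


@contextmanager
def span(operation, request_id):
    """Hypothetical trace span: times an operation and tags it with the request ID."""
    start = time.monotonic()
    try:
        yield
    finally:
        duration_ms = (time.monotonic() - start) * 1000
        emit_metric("app.request.duration_ms", duration_ms,
                    {"operation": operation, "request_id": request_id})


def handle_request(payload):
    # One request ID carried through the metric, the trace span and the log line
    # is what lets an operator pivot between the three pillars.
    request_id = str(uuid.uuid4())
    with span("checkout", request_id):
        log.info("processing checkout request_id=%s items=%d", request_id, len(payload))
        emit_metric("app.items.count", len(payload), {"request_id": request_id})
        time.sleep(0.01)  # placeholder for real work
    return request_id


if __name__ == "__main__":
    handle_request(["widget", "gadget"])
    print(METRICS)
```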
Reducing time to solve problems
One of the biggest benefits of this approach is reducing the number of tools required to monitor networks and mitigate risks, while creating context across infrastructure, applications and logs.
“The real benefit is to try and reduce the time to know when I have a problem. And reducing the time to solving that problem comes from correlating all that information and not having separate teams working in separate tools, all with a separate perspective,” Mikkelsen said. “One of the key characteristics of a more modern observability and security solution, and we talk about this cultural change all the time, is getting people out of individual tools and individual contexts and giving everybody the same view of the same information. I don’t want to have five tools and five teams looking at it from a different perspective. I want one tool with all the teams in that same tool, folks having the same context so we’re not arguing about what’s happening. We’re observing what’s happening, and we’re solving for it.”
The need to solve problems more quickly is as much about the evolving nature of the cyber threat as it is about meeting the growing expectations of an organization’s customers.
A recent Government Accountability Office report found agencies are struggling to meet the cybersecurity logging requirements mandated by President Joe Biden’s May 2021 executive order.
“What it’s really asking you to be able to do is track issues in real time and hold those logs in storage for, I think, a minimum of 12 months in hot storage, and I think 30 months in total,” Mikkelsen said. “The benefit of an end-to-end observability and security company is that we think about logs from multiple perspectives. We can talk about IT infrastructure and applications. But here, from a cybersecurity perspective, we’re really talking about cloud security management.”
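A tiered retention policy is one way to picture the requirement Mikkelsen describes. The following sketch uses the figures from his quote, roughly 12 months in hot storage and 30 months of total retention, purely as illustrative thresholds; the tier names and the storage_tier function are hypothetical, not a vendor or OMB-prescribed configuration.

```python
from datetime import datetime, timedelta, timezone

# Retention windows as described in the quote: roughly 12 months hot, 30 months total.
HOT_RETENTION = timedelta(days=365)
TOTAL_RETENTION = timedelta(days=913)  # about 30 months


def storage_tier(log_timestamp, now=None):
    """Return which storage tier a log record belongs in, based on its age."""
    now = now or datetime.now(timezone.utc)
    age = now - log_timestamp
    if age <= HOT_RETENTION:
        return "hot"      # indexed and searchable in near real time
    if age <= TOTAL_RETENTION:
        return "cold"     # archived, rehydrated on demand for investigations
    return "expired"      # eligible for deletion under the retention policy


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(storage_tier(now - timedelta(days=30)))    # hot
    print(storage_tier(now - timedelta(days=500)))   # cold
    print(storage_tier(now - timedelta(days=1000)))  # expired
```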
Solving mission problems
From a customer experience perspective, end-to-end observability also includes tools that provide digital experience monitoring.
Mikkelsen said the tools help organizations understand the user’s experience, from login through every front-end interaction.
“They can generally understand what’s working and where the bottlenecks are. What are the challenges with that customer’s front-end experience?” he said. “If you think about this from a synthetics [data] point of view, what synthetics allows you to do is proactively understand ‘is that system up, and is that front-end application up and running the way I want it to? Is it handling requests from various operating systems? Is it working with various browsers?’ And we can actually set up proactive tests, so even more important than knowing when you have an issue and fixing it is knowing you have it before it’s a real issue and resolving it before you have a negative customer experience or citizen experience. This all boils down to the real drive for a lot of our IT mission owners across government: They’re in the business of solving for the mission. A lot of times the mission is improving the citizen’s experience with government.”
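The kind of proactive check Mikkelsen describes can be pictured as a scripted probe that hits a front end, measures availability and latency, and varies the client it pretends to be. The sketch below is a simplified illustration, not Datadog’s synthetics product; the URL, latency budget and user-agent strings are placeholders.

```python
import time

import requests  # third-party HTTP client: pip install requests

# Placeholder target and thresholds; a real synthetic test would run on a schedule
# from multiple locations, browsers and operating systems.
TARGET_URL = "https://example.gov/login"
LATENCY_BUDGET_SECONDS = 2.0
USER_AGENTS = {
    "chrome-like": "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0",
    "safari-like": "Mozilla/5.0 (Macintosh) Safari/605.1.15",
}


def run_synthetic_check(url, user_agent):
    """Probe the front end and report availability and response time."""
    start = time.monotonic()
    try:
        response = requests.get(url, headers={"User-Agent": user_agent}, timeout=10)
        elapsed = time.monotonic() - start
        healthy = response.status_code == 200 and elapsed <= LATENCY_BUDGET_SECONDS
        return {"url": url, "status": response.status_code,
                "latency_s": round(elapsed, 3), "healthy": healthy}
    except requests.RequestException as exc:
        return {"url": url, "status": None, "latency_s": None,
                "healthy": False, "error": str(exc)}


if __name__ == "__main__":
    for name, agent in USER_AGENTS.items():
        print(name, run_synthetic_check(TARGET_URL, agent))
```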
Mikkelsen said the Agriculture Department’s Digital Infrastructure Services Center (DISC) took advantage of end-to-end observability tools and saw immediate improvements.
“They had one ongoing problem with memory utilization. The way I think about it, it was an executable loop, and every time it fired up, it was causing memory depletion. That same systematic set of tickets had popped up something in the neighborhood of 700 times in a short period of time,” he said. “They’ve taken that memory utilization challenge from 700-plus tickets down to zero tickets relatively quickly because we were able to show them what the challenge was. On top of that, they were able to bring, I think, 95% of their target infrastructure up and running with monitors and dashboards from an observability point of view within 75 days. I think that includes over 4,000 containers as part of that infrastructure setup.”
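The monitors Mikkelsen mentions typically work by watching a memory metric over a rolling window and flagging sustained growth instead of opening a new ticket for every spike. The sketch below is a hypothetical, simplified leak check with invented thresholds and hand-fed samples; it is not DISC’s or Datadog’s actual monitor logic.

```python
from collections import deque

# Illustrative thresholds; a real monitor would read container memory metrics
# from an observability backend rather than a hand-fed list of samples.
WINDOW = 6                 # number of recent samples to evaluate
GROWTH_THRESHOLD_MB = 50   # sustained growth across the window that triggers an alert


def leak_monitor(samples_mb, window=WINDOW, threshold_mb=GROWTH_THRESHOLD_MB):
    """Flag a container whose memory usage keeps rising across the window."""
    recent = deque(samples_mb, maxlen=window)
    if len(recent) < window:
        return False
    values = list(recent)
    monotonic_rise = all(b >= a for a, b in zip(values, values[1:]))
    total_growth = values[-1] - values[0]
    return monotonic_rise and total_growth >= threshold_mb


if __name__ == "__main__":
    # A loop that allocates on every pass shows up as steadily climbing memory.
    leaking = [210, 240, 275, 300, 330, 365]
    healthy = [210, 205, 212, 208, 215, 209]
    print("leaking container:", leak_monitor(leaking))   # True: one alert, not 700 tickets
    print("healthy container:", leak_monitor(healthy))   # False
```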