What You Need to Know About the New Era of IT Operations Management Software

IT Operations Management (ITOM) software is a large space consisting of thousands of technology vendors across several categories; Gartner tracks over 15 categories for the space, writes Neil Pearson, Principal Product Manager, OpsRamp. This makes for a crowded marketplace for buyers to navigate, and one that is also in considerable flux. We see new terms like digital operations management, observability, digital experience monitoring and artificial intelligence for IT operations (AIOps) taking precedence to accommodate changing enterprise needs.

The relevance of ITOM technologies, or whatever you may call them, in today’s digital world cannot be overstated. The category spans a wide swath of monitoring and management solutions, including network, infrastructure, and application performance monitoring, cloud management, infrastructure automation, and ITSM.

These are the tools needed to keep organizations running smoothly by managing the provisioning, capacity, performance and availability of the ever-changing IT environment.

Fast, reliable performance of applications and websites is table stakes today when it comes to business expectations of IT. In this article, I’ve taken a crack at describing how things have changed and where the ITOM sector is headed.

New Infrastructure, New Requirements

As enterprise technology has evolved, so has the ITOM market. Once upon a time, monolithic legacy tools from huge vendors like IBM and BMC could adeptly handle the job of monitoring servers, applications, storage devices and network components inside the corporate firewall.

But then the cloud blew everything up and infrastructure became distributed and ephemeral: private clouds, public clouds, containers, edge devices, sensors, SaaS and mobile apps created a cascading snowball of alerts and events that were hard to parse. Shadow IT has resulted in even less control and visibility for IT managers. As a result, technology vendors have been developing a new breed of agile, cloud-ready applications over the last few years, resulting in a start-up ecosystem of hundreds of new vendors.

How IT leaders analyze and select these solutions will depend upon specific needs (such as cloud and compliance) along with the available state of technology and IT budgets. Yet, there are some core characteristics which have become priorities when selecting new ITOM tools in this space today.

Here are five things to keep in mind:

1: The ability to quickly merge and correlate event data from anywhere to show the big picture and point to next steps. What poses a real threat to the business and what is just noise or a low-level issue? The right approach to AIOps is a key enabler here to make sense of the madness.

2: The ability to be proactive. Prevention is the name of the game: a customer website going down for even an hour or two can cause significant financial loss, especially if it is the only source of revenue. As above, this depends on intelligent event triage.

3: Alignment with cloud-native development methods and tools that emphasize automation, adaptability and speed. IT operations teams are adopting tools and practices from DevOps, such as CI/CD and Observability, to keep pace with modern development trends. One of the fastest-growing careers in IT these days is the site reliability engineer (SRE), a role that was created to help bridge the gap between development and operations groups.

4: Drift detection: Changes to application and infrastructure configurations still account for the majority of outages, so real-time monitoring for configuration drift is imperative. The ability to automatically fix known, repeatable configuration issues with AIOps is even better.

5: Monitoring-as-code: This is about deploying standard monitoring blueprints in a codified and controlled way and without the need to manually discover resources. That saves time and eliminates dangerous visibility gaps, common in our shadow IT world. It also aligns with the recent trend toward infrastructure-as-code.
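
To make item 4 a little more concrete, here is a minimal sketch of drift detection in Python. It assumes configuration can be captured as flat key/value settings; the setting names, the ABSENT placeholder and the safe_keys allow-list are all hypothetical, invented for illustration rather than taken from any particular product. Real tools work against far richer configuration models.

```python
ABSENT = "<absent>"  # placeholder for a setting that exists on only one side


def detect_drift(baseline, live):
    """Return {setting: (expected, actual)} for every setting that differs."""
    drift = {}
    for key in sorted(set(baseline) | set(live)):
        expected = baseline.get(key, ABSENT)
        actual = live.get(key, ABSENT)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


def remediate(baseline, live, safe_keys):
    """Return a copy of `live` with only the allow-listed settings reset."""
    fixed = dict(live)
    for key in detect_drift(baseline, live):
        if key in safe_keys and key in baseline:
            fixed[key] = baseline[key]
    return fixed


# Hypothetical settings, invented for illustration.
baseline = {"max_connections": 200, "tls_min_version": "1.2", "log_level": "info"}
live = {
    "max_connections": 200,
    "tls_min_version": "1.0",
    "log_level": "debug",
    "debug_port": 9000,
}

for setting, (expected, actual) in detect_drift(baseline, live).items():
    print(f"{setting}: expected {expected!r}, found {actual!r}")

# Only the known, repeatable issue is fixed automatically; the rest is left for a human.
repaired = remediate(baseline, live, safe_keys={"tls_min_version"})
```

The allow-list is the design choice that matters here: everything that drifts is reported, but only changes known to be safe and repeatable are reverted without a person in the loop.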

How the Marketplace is Evolving to Meet New ITOM Needs

As ITOM software vendors have evolved their offerings to meet the demands of dynamic, cloud-based workloads and software-defined infrastructure, one theme has become pervasive: intelligent automation.

Gartner is credited with coining the term AIOps, which is now mainstream in the user community. By incorporating machine learning into monitoring software, IT teams can save a lot of time and effort when the software does the dirty work of filtering and correlating alerts, analyzing events according to business priority, routing events to the right person every time, doing faster root cause analysis and automatically fixing commonplace issues.
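
The filtering and correlating step can be sketched in a few lines of Python. To be clear, this is a simple rule-based stand-in, not machine learning: it groups repeated alerts for the same resource and check into one incident when they arrive within a time window, then ranks incidents by severity. The Alert fields, the five-minute window and the sample data are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    resource: str
    check: str
    severity: int  # higher means more severe


def correlate(alerts, window_seconds=300):
    """Collapse repeated alerts into incidents, most severe first."""
    incidents = []
    latest = {}  # (resource, check) -> most recent incident for that pair
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.resource, alert.check)
        incident = latest.get(key)
        if incident and alert.timestamp - incident["last_seen"] <= window_seconds:
            # Same problem, still ongoing: fold the alert into the open incident.
            incident["count"] += 1
            incident["last_seen"] = alert.timestamp
            incident["severity"] = max(incident["severity"], alert.severity)
        else:
            incident = {
                "resource": alert.resource,
                "check": alert.check,
                "first_seen": alert.timestamp,
                "last_seen": alert.timestamp,
                "count": 1,
                "severity": alert.severity,
            }
            latest[key] = incident
            incidents.append(incident)
    return sorted(incidents, key=lambda i: i["severity"], reverse=True)


# Hypothetical alert stream, invented for illustration.
alerts = [
    Alert(0, "web-1", "http", 2),
    Alert(30, "db-1", "disk", 1),
    Alert(60, "web-1", "http", 3),
    Alert(120, "web-1", "http", 2),
    Alert(1000, "web-1", "http", 2),
]

incidents = correlate(alerts)
print(f"{len(alerts)} alerts collapsed into {len(incidents)} incidents")
```

An AIOps product would learn these groupings from historical data rather than rely on a fixed key and window, but the goal is the same: fewer, better-ranked items for a person to look at.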

AIOps, which combines historical and real-time analysis, has been shown to reduce alert volumes by up to 80% so that IT can focus on preventing and resolving business-critical problems. Automation will continue to evolve, with robotic process automation, business process management, iPaaS and IT process automation converging into what Gartner refers to as hyperautomation. Running trained machine learning models at the edge can detect anomalies faster than pushing data to a central data store, which in turn aids preventative measures.

While AIOps addresses many of the areas of need on the short list above, another trend in the works is cloud-native incubated projects such as cloud events, service meshes and open tracing. These projects are important because they bring the latest Silicon Valley innovations to the average enterprise in an organic, community-driven manner.

Knowledge management is another area ripe for innovation. This has always been seen as a rather dull and difficult topic, but if you’re a senior executive who has just laid off hundreds or thousands of people and the tribal knowledge has been lost, neglecting knowledge management might be seen as high risk. Addressing it means sharing best practices, insights, and data across functional silos, point tools, and enterprise services.

The Future of Best of Breed Point Tools

Customers know that there’s no single tool across all 15 categories that can do everything they need. Furthermore, enterprise IT organizations are still set up with a data center, a network engineering team, an applications team, and so on. These realities suggest that a best-of-breed monitoring environment will persist for some time.

Still, the promise of a single pane of glass remains a goal for many ITOM professionals. Today, with open standards and APIs, we are getting much closer to realizing the dream. The ability to manage infrastructure autonomously is a shared vision, which means moving away from environments where organizations have 20 or 30 different point tools to monitor, manage and optimize their environments. People need one place to go: a single source of truth.

Market Consolidation

M&A activity in the sector has been hot over the past few years: Cisco’s recent announcement of its pending acquisition of ThousandEyes (for a reported nearly $1 billion) shows the price tag of a resilient network.

There have been more than 80 ITOM tool acquisitions since 2015 and 11 AIOps tool exits, and there’s still ample room for consolidation. IT vendors with the cash on hand to purchase a promising startup this year may come out stronger following the Covid-19 crisis. A portion of smaller vendors will likely disappear altogether unless they have strong balance sheets and funding models in place to ride out the storm.

Given marketplace shifts and a tenuous business climate, what is a VP of IT infrastructure to do right now? Many organizations will choose to make do for the moment, maintaining patched-together and legacy systems until the economy looks stronger. Yet a more progressive approach could be updating the legacy tools environment now, to support ongoing digital transformation plans of the business and to prevent customer defections that could occur from outages and poor user experiences.

The starting point for any ITOM initiative has not changed over the years. It is to ensure that key internal and customer-facing applications are working correctly and meeting SLAs all the time. The hope is that innovation will allow this job to become easier and more meaningful for frontline IT employees – even as infrastructure complexity grows.