How AI Will Help Tomorrow’s IT Operations

The year 2025 started out with a bang, with the AI community collectively deciding to use the MCP format to build agentic systems, and moving on to a focus on building operational AI frameworks.
For IT operations, however, getting useful work out of AI agents still takes considerable planning, and pushing code changes into production, for the time being, still requires human oversight.
In a TNS survey, Kevin Reeuwijk, a distinguished architect for Spectro Cloud, said: “We plan to support AI operations insights more than pure AI operations. Even if an AIOps agent is right 90% of the time, the actions it takes during the other 10% could be disastrous. But AI agents will be great at parsing logs and correlating events to provide meaningful insights into system health, as well as recommended actions. I would just not trust an agent to run unsupervised yet.”
Nonetheless, there are many ways to start down the path of using AI in production, even in a supervised manner.
Here are a few posts TNS ran in 2025 that best reflected this progress.
True End-to-End Incident Management
One of the biggest challenges IT faces today is losing context across tools and observational platforms. PagerDuty’s Cristina Dias writes that one of the must-haves for operations would be a “unified AI and automation layer that supports the full incident life cycle, from detection through continuous learning and prevention.”
“When incidents strike, teams can’t afford to jump between disconnected tools or lose context … This layer eliminates the need for piecemeal integrations, where AI or automation only address part of the chain or require heavy integration to work effectively.”
Welcome to ModelOps
The team that builds and fine-tunes models for your organization may have been a bunch of data scientists, but eventually that team will have to be integrated into the rest of the software development lifecycle. Hence a new discipline, ModelOps, predicted GitLab’s David DeSanto.
ModelOps would be the combination of today’s DataOps, which prepares and manages data, with MLOps, which develops, trains, deploys and versions the necessary AI models. The resulting “ModelOps will provide a comprehensive framework for ensuring the successful integration of AI into the software development workflows,” DeSanto wrote.
Better Documentation
One area of IT operations that sorely needs improvement is documentation. Enterprise IT documentation has rarely been kept current, but that will need to change, given AI’s voracious appetite for documentation. A chatbot overlay for internal information would thrive on up-to-date data about internal systems.
TNS reporter Jennifer Riggins reported on how a 25% increase in AI adoption triggered a 7.5% increase in documentation quality, according to Google’s annual DevOps Research and Assessment (DORA) report. The industry is “focusing too much on efficiency of producing [greenfield] code, which isn’t actually the problem,” said Thoughtworks CTO Rachel Laycock. Instead, look to how AI can help existing systems.
Multimodal AI
Administrators love working with the command line. It’s fast, and once you memorize a few crucial keystrokes, it’s pretty easy to use. But AI promises to widen the conduits with which you can communicate, noted Asperitas’ Derek Ashmore.
“Imagine an engineer holding up a smartphone to a noisy pump, describing a strange vibration. The AI doesn’t merely parse the voice; it recognizes the hardware visually, listens to the pattern, consults historical sensor logs and instantly pulls up the correct maintenance playbook,” he wrote.
This is the promise of multimodal AI in enterprise workflows, where systems will fuse text, image, audio, video, and even sensor input, giving your diagnostic tools human-like context awareness.
AI Will Operationalize AI
If all of this AI assistance seems like it may be a headache to set up and maintain, you’re in luck, in that you can get some help … from AI.
“The good news is that we can actually use AI to operationalize AI. In fact, it’s a necessity, given the complexity of the LLM app stack,” wrote PagerDuty’s Dormain Drewitz, in a post that envisions a future where the first job of tomorrow’s AI is to set up other AI tools.
An ML system, after all, can automate all the steps that are needed to run a monitoring system, with its ability to filter out noise from the vital bits of data for making a diagnosis.
“Not only is machine learning able to help with the three Cs of contextualization, correlation and compression, it’s better at it than humans,” Drewitz wrote. “By connecting the event processing to conditional logic to apply predefined tasks, we can accelerate the resolution of an incident in complex systems. Even if the service can’t be fully restored in a self-healing way, the teams that have to step in can have better context and starting points to troubleshoot.”
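To make the idea of connecting event processing to conditional logic more concrete, here is a minimal Python sketch. It is an illustration only, not PagerDuty's implementation: the `Event` shape, the 60-second window, the `RULES` table and the task names are all hypothetical, and plain counting and substring matching stand in for the machine learning Drewitz describes. It covers compression and correlation but not contextualization, and it only recommends tasks rather than running them, in keeping with the human oversight discussed earlier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds; the unit is an assumption for this sketch
    service: str
    message: str


def compress(events):
    """Collapse repeated (service, message) pairs into one entry with a count."""
    counts = defaultdict(int)
    for event in events:
        counts[(event.service, event.message)] += 1
    return dict(counts)


def correlate(events, window_seconds=60):
    """Group events that occur within window_seconds of a group's first event."""
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if groups and event.timestamp - groups[-1][0].timestamp <= window_seconds:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


# Hypothetical conditional logic: message substring -> predefined task name.
RULES = [
    ("disk full", "rotate_logs"),
    ("connection refused", "restart_service"),
]


def recommend(group):
    """Return sorted (service, task) recommendations for one correlated group."""
    recommendations = set()
    for (service, message), _count in compress(group).items():
        for needle, task in RULES:
            if needle in message.lower():
                recommendations.add((service, task))
    return sorted(recommendations)
```

Feeding a list of `Event` objects through `correlate` and then calling `recommend` on each group yields a short list of suggested tasks that a responder can accept or reject, which is the "better context and starting points" half of the quote rather than full self-healing.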
Operationalizing AI: Accelerating Automation, DataOps, AIOps
The Agentic Enterprise Operating System
In one contributed post, Angie Jones, vice president of engineering at Block, described how the financial service provider built out an MCP-based agent operational framework, called Goose, that allowed agents to communicate with one another, automating and eliminating many of the manual hand-offs needed to complete tasks. Today, over 6,000 Block employees use Goose.
“Just months ago, it would take days of manual labor to read Snowflake dashboards, pull context from recent Slack chatter and generate a weekly Google Doc with insights and flagged anomalies,” Jones wrote. “Now humans orchestrate this process in minutes, directing the goose to the relevant data while applying judgment about what matters most.”
AI Agents Are Morphing Into the ‘Enterprise Operating System’
TNS analyst Lawrence Hecht contributed to this post.
The post How AI Will Help Tomorrow’s IT Operations appeared first on The New Stack.
