IT’s role in providing business services is more critical than ever. After all, these are the services that drive the customer experience, foster innovation, boost efficiency and create competitive advantage. Downtime must be avoided at all costs. Any issues that impair performance or threaten availability, therefore, must be identified, investigated and resolved in a timely manner.
But finding the root cause of an issue is far from simple, in part because many different systems typically support each business service. A breakdown or glitch in any one of those systems can cause the entire service to go down. As a result, identifying exactly what is causing the problem, and how to fix it, can take hours or even days.
Platforms such as ServiceNow gather data from multiple monitoring systems and make it possible for IT to drill down to find out what is going on. ServiceNow Event Management applies artificial intelligence (AI) powered by machine learning to sift through thousands of alerts across dozens of systems and isolate the most likely reason for a slowdown. This capability cuts through the noise to help IT find the underlying cause of service issues far faster than could be done manually. Its platform-agnostic nature is what makes this possible.
This works well for almost all IT systems. But there is one vital area that traditionally has been omitted. ServiceNow Event Management does not have native access to mainframe and IBM i systems – the systems that most large enterprises rely on for their mission-critical operations. Even enterprises that consider themselves cloud-based often find that their business services touch a mainframe or IBM i at some point along their journey. Without visibility into these systems, IT service views are incomplete or must be manually stitched together, putting root cause analysis on shaky ground.
Organizations can overcome this challenge with Syncsort’s Ironstream solution for ServiceNow Event Management. Ironstream seamlessly integrates with ServiceNow to incorporate mainframe and IBM i systems into the platform. This enables IT to guarantee high availability, to become more responsive to the needs of the business, and to avoid service degradation or outages.
Service availability and performance issues are the bane of today’s IT departments. A deluge of events and alerts fills the screens of IT operations, yet there is little context concerning how these relate to ongoing issues with business services. IT teams waste valuable time jumping from screen to screen as they attempt to distinguish the important from the routine. Meanwhile, business services continue to be sluggish and outage times lengthen.
Unfortunately, the monitoring tools at IT’s disposal are disconnected and generate siloed streams of data. Multiple tools often report events having to do with the same issue. Manual correlation is required to get to the bottom of what’s really going on, but it is slow and inefficient. No wonder errors occur. Some issues are misdiagnosed, while others may be completely missed.
ServiceNow Event Management has largely resolved these challenges. Its cloud-based approach provides the means to monitor IT operations, manage services throughout the enterprise, put together a single system of record for all IT assets, automate tasks, and conduct fast and effective root cause analysis.
Root cause analysis is all about understanding the many relationships and dependencies that exist between infrastructure, applications and services. It is essential to be able to correlate a multitude of performance metrics, events and logs to find the most likely causes of problems. Whether the culprit is a network switch that has gone down, an application failure, latency in a server or a storage bottleneck, an accurate analysis is needed to pinpoint it.
ServiceNow Event Management applies artificial intelligence (AI), powered by algorithms and machine learning, to IT operations, an approach known as AIOps. This makes it possible to analyze and process a large volume of events in real time, automatically identify patterns and drill down to the underlying reasons for service interruptions or slowdowns. The result is a dramatic reduction in noise, rapid identification of multiple symptoms of a single underlying incident, the ability to spot issues similar to those that happened in the past and the detection of unusual patterns that may indicate trouble. By quickly determining the potential root cause of a problem, IT can move at the speed of the business.
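To make the idea of grouping multiple symptoms into a single incident concrete, here is a minimal sketch in Python. It is a simple rule-based stand-in (same node, within a fixed time window), not the machine learning that ServiceNow Event Management uses, and every tool name, node name, field name and the 300-second window are assumptions made up for illustration.

```python
# Hypothetical alerts from three monitoring tools. All names, fields and
# timestamps (in seconds) are illustrative, not ServiceNow data structures.
alerts = [
    {"tool": "net-monitor", "node": "db-server-01", "time": 100, "message": "high latency"},
    {"tool": "app-monitor", "node": "db-server-01", "time": 130, "message": "slow queries"},
    {"tool": "infra-monitor", "node": "db-server-01", "time": 150, "message": "disk queue length high"},
    {"tool": "net-monitor", "node": "web-01", "time": 900, "message": "packet loss"},
]


def group_alerts(alerts, window_seconds=300):
    """Group alerts for the same node that arrive within window_seconds
    of the first alert in that node's current group."""
    groups = []
    open_groups = {}  # node -> most recent group started for that node
    for alert in sorted(alerts, key=lambda a: a["time"]):
        group = open_groups.get(alert["node"])
        if group is not None and alert["time"] - group[0]["time"] <= window_seconds:
            group.append(alert)  # another symptom of the same incident
        else:
            group = [alert]  # start a new candidate incident
            groups.append(group)
            open_groups[alert["node"]] = group
    return groups


groups = group_alerts(alerts)
for group in groups:
    tools = sorted({a["tool"] for a in group})
    print(group[0]["node"], len(group), "alert(s) from", tools)
```

In this sketch the four raw alerts collapse into two candidate incidents, which is the kind of noise reduction described above, although a production system would correlate on far richer signals than node and time.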
In addition to identifying the fundamental source of errors and anomalies, root cause analysis helps IT avoid repeat incidents and offers learning experiences that ultimately bring about greater resiliency.
Root cause analysis is sometimes straightforward and, at other times, complex. Investigation is greatly aided by the ability of ServiceNow Event Management to automatically correlate data from multiple monitoring tools and across IT domains. A root cause can be traced even when a service snakes its way through multiple systems. Perhaps it begins in the public cloud, which then feeds a back-end private cloud server, and links to an on-premises database. Regardless of the complexity of the business service, ServiceNow Event Management can rapidly and efficiently get to the bottom of the issue – most of the time.
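The following Python sketch illustrates, under simplifying assumptions, how a dependency map can narrow several alerts down to a likely root cause. The component names mirror the public cloud, private cloud and on-premises database example above but are hypothetical, and the rule used here (an alerting component whose direct dependencies are all healthy) is an illustrative heuristic rather than the method ServiceNow applies.

```python
# Hypothetical dependency map: each component lists what it depends on.
depends_on = {
    "checkout-service": ["public-cloud-frontend"],
    "public-cloud-frontend": ["private-cloud-server"],
    "private-cloud-server": ["on-prem-database"],
    "on-prem-database": [],
}


def likely_root_causes(service, depends_on, alerting):
    """Walk the dependencies of `service` and return the alerting components
    whose own direct dependencies are not alerting."""
    causes = []
    seen = set()
    stack = [service]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        children = depends_on.get(node, [])
        # An alerting node with no alerting dependency is a candidate cause;
        # alerting nodes above it are treated as downstream symptoms.
        if node in alerting and not any(child in alerting for child in children):
            causes.append(node)
        stack.extend(children)
    return causes


# Three components are raising alerts at the same time.
alerting = {"public-cloud-frontend", "private-cloud-server", "on-prem-database"}
result = likely_root_causes("checkout-service", depends_on, alerting)
print(result)
```

With these inputs the function returns only the on-premises database, since the two cloud tiers depend on it. Note that the answer is only as good as the map: a component missing from `depends_on` can never be identified as the cause.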
But the efforts of IT to utilize ServiceNow Event Management to resolve issues rapidly can be thwarted if all critical systems are not considered in the analysis.
Many of today’s IT professionals don’t realize that mainframes and IBM i systems continue to play a central role within enterprise IT. More than 2.5 billion business transactions run on IBM mainframe systems every day. Many organizations in healthcare, financial services, government, telecom and other verticals operate their most mission-critical business services on these platforms.
Similarly, more than 100,000 enterprises rely on IBM i systems. Continually updated by IBM, these systems are cloud-enabled, fully virtualized and operate some of the most demanding databases and applications on the planet.
With so many organizations running these systems, omitting them from root cause analysis makes service slowdowns and outages difficult to resolve. IT operations personnel typically try to compensate with a variety of workarounds. In some cases, they call their local mainframe/IBM i guru to ask what systems and applications are operating. In other cases, there are two different IT analytics platforms: one for mainframe/IBM i and another for the rest of the enterprise. Alternatively, spreadsheets and Visio diagrams might be consulted, even though these documents are usually badly outdated. But whatever workaround is attempted, results are rarely satisfactory.
Failure to consolidate mainframe and IBM i event and alert data with other enterprise systems can have serious consequences. Sluggish application behavior may be incorrectly blamed on the network, for example, when the real fault lies on the mainframe/IBM i side. What isn’t understood well in IT is how many enterprise processes interface with IBM i and mainframe services. As a result, troubleshooting efforts may be rendered ineffective, upgrade efforts may not have the desired impact and event management actions may be incorrectly targeted.
Take the case of an e-commerce transaction. A cursory look may make it seem that the entire workflow travels through Windows-based front-end and back-end systems. Yet, a more thorough look reveals that the transaction takes a detour through a mainframe or IBM i system. In large companies, for example, ATM transactions often visit the mainframe to check account balances, detect fraudulent activity or verify confidential information.
If the help desk is dealing with a problem with a credit card or financial transaction, it will struggle to achieve resolution without a complete view of the systems that support the business service. Those troubleshooting the issue might decide to contact someone in the applications team. After a couple of hours without result, they may call the mainframe team, who review the IBM Db2 performance monitor. However, everything looks fine to them because the database is sustaining a performance level of 10,000 transactions a second. What is missed is the chain of dependencies as a workload moves through various IT systems. A day might be wasted trying to come to terms with what went wrong on that one transaction.
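A small Python sketch can show why an aggregate figure such as transactions per second may look healthy while a single transaction is still slow. The hop names and timings below are invented for illustration and do not come from any real trace or from Ironstream or ServiceNow output.

```python
# Hypothetical per-hop timings, in milliseconds, for one slow transaction.
trace = [
    ("web-frontend", 40),
    ("payment-api", 55),
    ("mainframe-db2-lookup", 4200),
    ("fraud-check", 120),
]


def slowest_hop(trace):
    """Return the (hop, duration_ms) pair with the longest duration."""
    return max(trace, key=lambda hop: hop[1])


def share_of_total(trace):
    """Return each hop's fraction of the end-to-end elapsed time."""
    total = sum(duration for _, duration in trace)
    return {name: duration / total for name, duration in trace}


hop, duration_ms = slowest_hop(trace)
shares = share_of_total(trace)
print(f"Slowest hop: {hop} ({duration_ms} ms, {shares[hop]:.0%} of total)")
```

Looking at the transaction hop by hop exposes the step that dominates the elapsed time, something an overall throughput number cannot reveal. This only works, of course, when every hop, including the mainframe or IBM i step, reports into the same view.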
To make matters worse, faulty root cause analysis can lead to the same issues having to be resolved time and time again. IT applies a quick fix that provides a temporary improvement. But the next week, or even the next day, the business complains about the same transactional slowdown.
This state of affairs can keep IT up at night or coming in on weekends. Or worse, it can create a situation where root cause analysis is maligned as lost time in one endless troubleshooting operation – unless ServiceNow can incorporate events from IBM mainframe and IBM i systems in its monitoring and analysis.
Syncsort’s Ironstream for ServiceNow solution enables full discovery and event management capabilities for mainframe and IBM i systems on the Now Platform®. By seamlessly integrating these traditional systems with the rest of their IT infrastructure in this cloud-based platform, organizations can automate tasks and gain insights that were not previously possible.
Ironstream for ServiceNow helps IT operations staff take immediate and effective corrective actions based upon centrally deployed policies for important or critical messages or events. This enables IT optimization initiatives to encompass mission-critical IBM mainframe and IBM i environments as part of a complete end-to-end enterprise solution.