Visa has blamed a component failure in a switch at its primary data centre for an outage earlier this month that resulted in 5.2 million card payments failing – and made a frank admission that software to automatically detect such failures was not in place.
In a letter to Parliament’s Treasury Committee published today (June 19), the company gave a detailed description of the IT meltdown on June 1.
The company also said that an ongoing migration of its European processing onto its global processing system, VisaNet – due to wrap up by the end of 2018 – was not to blame, and that the new system would prove “more resilient” to such malfunctions.
Primary Data Centre Switch Failed
Visa Europe CEO Charlotte Hogg, a former deputy governor of the Bank of England, wrote: “A component within a switch in our primary data centre suffered a very rare partial failure which prevented the backup switch from activating.”
“As a result, it took far longer than it normally would to isolate the system at the primary data centre; in the interim, the malfunctioning system at the primary data centre continued to try to synchronise messages with the secondary site. This created a backlog of messages at the secondary data centre, which, in turn, slowed down that site’s ability to process incoming transactions.”
“Due to this complexity and the very rare partial failure of the switch, a number of key steps were taken throughout the afternoon – including turning off software applications at the primary site and cleaning up message backlogs at the secondary site by both manual and automatic means.”
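The backlog effect described in the letter can be illustrated with a toy queue model. This is only a sketch: the function name and every number below are hypothetical and are not taken from Visa's letter. It assumes the secondary site can process a fixed number of messages per time step, and that synchronisation messages from the malfunctioning primary site compete with payment messages for that capacity.

```python
def simulate_secondary(capacity_per_tick, txn_per_tick, sync_per_tick, ticks):
    """Return the unprocessed backlog at the secondary site after each tick.

    Each tick the site receives txn_per_tick payment messages plus
    sync_per_tick synchronisation messages, and can process at most
    capacity_per_tick messages in total. All figures are hypothetical.
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        # New work arrives: payments plus sync traffic from the primary site.
        backlog += txn_per_tick + sync_per_tick
        # The site clears as much as its capacity allows.
        backlog -= min(backlog, capacity_per_tick)
        history.append(backlog)
    return history


# No sync traffic: the site keeps up and nothing queues.
healthy = simulate_secondary(capacity_per_tick=100, txn_per_tick=80,
                             sync_per_tick=0, ticks=5)

# Sync traffic from a primary site that has not been isolated:
# arrivals exceed capacity, so the backlog grows every tick.
faulty = simulate_secondary(capacity_per_tick=100, txn_per_tick=80,
                            sync_per_tick=50, ticks=5)

print(healthy)  # [0, 0, 0, 0, 0]
print(faulty)   # [30, 60, 90, 120, 150]
```

In this simplified model the backlog only stops growing once the extra synchronisation traffic is cut off, which is consistent with the letter's account of isolating the primary site and then clearing the queued messages.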
Software to Detect Failure Was Not In Place
The company said it does not know why the switch failed and is working with the manufacturer to conduct a “forensic analysis”.
It said the ongoing migration to VisaNet, expected to be complete by the end of 2018, was not to blame and will help prevent similar issues in future.
“VisaNet is based on a different technical architecture from the European system, has multiple data centres, and serves multiple geographies. VisaNet has four active-active images that work in tandem, and has significantly more capacity and scale compared to the European system.”
In a frank admission that its data centre was not equipped with software to automatically monitor and shut down a failing switch, Hogg wrote: “The manufacturer has provided us with recommendations on software for automating the monitoring and shutdown of the switch in the event of a similar type of malfunction… We are working internally to develop and install other new capabilities that would allow us to isolate and remove a failing component from the processing environment in a more automated and timely manner.”
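As a rough illustration of what automated isolation of a partially failed component might involve, the sketch below checks an error rate as well as a simple heartbeat. It is not Visa's or the manufacturer's software: the `SwitchStatus` fields, the 5% threshold and the function names are all assumptions made for the example. The point is that a switch which still answers a basic "are you alive?" check can nonetheless be failing badly enough that traffic should move to the backup.

```python
from dataclasses import dataclass


@dataclass
class SwitchStatus:
    """Hypothetical health snapshot for one network switch."""
    name: str
    responds_to_heartbeat: bool
    error_rate: float  # fraction of messages failing, 0.0 to 1.0


ERROR_RATE_THRESHOLD = 0.05  # hypothetical cut-off, chosen for illustration


def should_isolate(status, threshold=ERROR_RATE_THRESHOLD):
    """Return True if the switch should be removed from service.

    A switch that is completely down fails the heartbeat check. A
    partially failed switch may still respond, so the error rate is
    checked as well.
    """
    if not status.responds_to_heartbeat:
        return True
    return status.error_rate > threshold


def select_active(primary, backup):
    """Return the name of the switch that should carry traffic."""
    if not should_isolate(primary):
        return primary.name
    if should_isolate(backup):
        raise RuntimeError("no healthy switch available")
    return backup.name


# A partial failure: the primary still answers heartbeats but drops
# 40% of messages, so a heartbeat-only check would leave it in service.
primary = SwitchStatus("primary", responds_to_heartbeat=True, error_rate=0.40)
backup = SwitchStatus("backup", responds_to_heartbeat=True, error_rate=0.0)

print(select_active(primary, backup))  # backup
```

A real deployment would need to decide how error rates are measured and how to avoid flapping between switches; those choices are outside the scope of this sketch.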
MPs Satisfied – but Becoming “Less Tolerant”
Rt Hon. Nicky Morgan MP, Chair of the Treasury Committee, said she was satisfied with the explanation in the 11-page letter.
She said: “The Treasury Committee is satisfied with Visa’s answers regarding its system failure earlier this month, which lasted just over 10 hours and saw 2.4 million transactions in the UK fail to process.”
“It appears that the problems have been fully resolved. The news that debit card payments have overtaken cash use for the first time shows that the reliability of IT systems is becoming ever-more important.”
She added: “The detriment caused to consumers by IT failures is greater than ever, so the Committee will become less tolerant of them.”
“The Committee expects to see the findings of the independent review, which will examine the lessons to be learned from the incident, in full.”