One February morning in 2011, 40,000 users of Google’s Gmail service awoke to find that they were, in fact, no longer users of Google’s Gmail service. Their emails had, in politer language than many of the search giant’s customers probably used at the time, completely vanished, thanks to a misconfigured software upgrade. Not to worry, Google assured this confused multitude: it had a backup plan. Hidden away in a far-flung data centre rested hundreds of magnetic tape cartridges containing facsimiles of all the lost accounts. It took a little while, but eventually each and every account was restored – using essentially the same technology your parents used to make mix tapes, or record last week’s episode of ‘Coronation Street.’
It’s a story all the more astonishing for the fact that, even now, magnetic tape drives serve exactly the same purpose for a growing number of companies, in spite of multiple predictions that the technology should have died a death a long time ago. “My first experience with tape was in the beginning of my career – that was in 1981,” recalls Phil Goodwin, a research director at IDC and an expert in digital storage. Even then, says Goodwin, people were saying tape was not long for this world. Those critics appear to have been silenced by recent sales figures, which show that year-on-year shipments of hard disk drives (HDDs) sank by 34% in 2022, while consignments of magnetic tape drives rose by 14%, to a total of 79.3 exabytes, roughly equivalent to the entirety of data created on the internet every 32 days.
This is in spite of the fact that HDDs still boast formidable storage capacities and retrieve data much faster than tape drives ever could. But the priorities of cloud providers have changed in recent years. Front of mind for hyperscalers, explains Goodwin, is the cost of storage, and when approximately 60% of all data is the kind of information that doesn’t need to be accessed with any urgency, how quickly you can reach the first byte of it suddenly matters a lot less.
Tape also requires much less power to run than HDDs, chiming with the sustainability priorities of the likes of AWS or Azure (although tape libraries do still tend to be housed in climate-controlled data centres). And for those worried about cybersecurity, tape libraries are almost always air-gapped, and extremely difficult to tamper with. “The whole idea is that you can take data on magnetic tape, remove it from a library, put it into a vault or on a shelf or whatever, and it’s effectively saved from any external threats,” says Goodwin.
Magnetic tape storage drives, HDDs walk
Another reason for tape’s renewed popularity is that the pace of innovation in HDD technology is slackening. Areal density in hard disks is now growing by no more than 8% a year – a far cry from the glory days of the medium, when doubling capacity meant you only had to add an extra disk and two heads to each unit. Now, though, “there’s no space left in the HDD form factor to squeeze in more disks,” explains Mark Lantz, IBM’s manager for advanced tape technologies.
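To put that 8% figure in perspective, here is a minimal sketch of the compound-growth arithmetic. It assumes growth stays at a constant 8% a year, which is the upper bound quoted above and not a forecast; the function name is purely illustrative.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for a quantity to double at a fixed compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


# Assumption: HDD areal density keeps growing at the quoted upper bound of 8% a year.
hdd_doubling = doubling_time_years(0.08)
print(f"HDD areal density doubles roughly every {hdd_doubling:.1f} years")
```

At that rate a doubling takes about nine years, which is the gap the article contrasts with the earlier era of simply adding a disk and two heads.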
Capacity is the least of magnetic tape’s problems. “We’re basically doubling capacity every generation,” says Lantz. Specifically, that’s down to the fact that data tends to be written using larger bits on tape than in HDDs. This means that researchers can continue to innovate in the space by progressively reducing the size of the bits without compromising on the size of the tape – squeezing more out of an individual cartridge for longer.
That strategy, says Lantz, gives magnetic tape huge long-term potential as an archival storage medium. Researchers, he says, can “continue scaling areal density and capacity, probably, for about 15 to 20 more years before we run into the same fundamental physical challenges that HDD currently faces.”
For now, argues Lantz, no fundamental physical limit has been found for the storage capacity of tape. “There’s a huge potential to scale the capacity of these systems,” he says. “Today, enterprise cartridges [containing] 20 terabytes, if we recorded 317 gigabits per square inch? That’s a potential cartridge capacity of 580 terabytes. So, half a petabyte in a single cartridge.”
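A rough way to read those figures is to ask how many capacity doublings separate a 20-terabyte cartridge from a 580-terabyte one. The sketch below uses only the two capacities quoted by Lantz and his remark that capacity roughly doubles each generation; treating every generation as an exact doubling is a simplifying assumption, and the variable names are illustrative.

```python
import math

current_tb = 20     # enterprise cartridge capacity quoted by Lantz
projected_tb = 580  # projected capacity at 317 gigabits per square inch

# How many times larger the projected cartridge is than today's.
scale = projected_tb / current_tb  # 29.0

# Assumption: each generation exactly doubles capacity.
doublings = math.log2(scale)
generations_needed = math.ceil(doublings)

print(f"Scale factor: {scale:.0f}x")
print(f"Doublings required: {doublings:.2f} (about {generations_needed} generations)")
```

Under that assumption, the jump is a little under five doublings, so around five generations of cartridge.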
The simplicity of the technology is also a key attribute, argues Lantz. “It’s a serial write technology,” he says. “If you wanted to re-encrypt the tape, or to delete all of the data on tape, it takes a long time. And so, if somebody starts trying to interfere with the data in your tape library, basically overriding it all, it takes much longer to destroy the data on tape than on HDD.”
Even so, explains Dr Ioan Stefanovici of Microsoft Research, the truly determined attacker will persist in their efforts, despite these difficulties. “In the absence of proper cybersecurity defences, tape libraries are still potentially liable to malware or ransomware attacks,” says Stefanovici, “where the robot mechanism for tape delivery can be hijacked and made to insert specific tapes into drives for malicious access.”
Ceilings of innovation
Tape also lasts an incredibly long time – increasingly important in an age where more and more data is required by law to be squirrelled away for a rainy day, even if it doesn’t need to be immediately used. Data written onto cartridges up to 40 years ago can still be read back, explains Lantz, albeit using ageing technology that’s light-years behind what’s currently available. One such case involved the retrieval of data from the first lunar landings hitherto assumed lost, but actually broadcast to an Australian radio telescope during the mission on 14-track tapes. The result was the emergence of video footage of the mission of much higher quality than had ever been seen before, obscuring somewhat that the equipment used to extract it was borrowed from museums and used by individuals who had previously been happily retired.
While that’s a nice story, it also illustrates a long-term problem with magnetic tape: the uneven pace of innovation when it comes to building the machines capable of reading it. Such ‘device orphaning,’ says Stefanovici, combined with the inevitability of data decay, “can ultimately result in datasets sitting in long-term storage, potentially inaccessible, and at high risk of becoming lost.”
Current magnetic tape storage technology would probably last just as long, explains Lantz, provided it was stored properly, though he recommends that interested companies migrate their data every couple of years to newer cartridges to harness the growing storage capacity of the medium. The cost of these upgrades is something Goodwin recommends that companies weigh carefully when considering an investment in tape. “It really is best practice to take the media that’s becoming obsolete and re-read and write it out to current generations of tape,” he says.
And while capacity is expected to increase, slower progress has been made in extracting data. While the pace of streaming data has been continuously scaling, explains Lantz, the ‘time to first byte’ is still in the tens of seconds, rather than the tens of milliseconds experienced with HDD. “For what we call really hot data that’s being accessed a lot, we would recommend putting it on flash, because the IOPS performance is so much better than anything else,” says Lantz. As that data cools, it should then be moved to HDD and, eventually, to tape libraries.
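The hot-to-cold hierarchy Lantz describes can be sketched as a simple placement rule. Everything specific here is an assumption made for illustration: the `choose_tier` function, the use of reads per day as the measure of how hot data is, and both thresholds are hypothetical, not figures from IBM or any vendor. Only the ordering (flash, then HDD, then tape) and the rough latencies come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    time_to_first_byte: str  # rough figure, as described in the article


FLASH = Tier("flash", "fastest; best IOPS")
HDD = Tier("hdd", "tens of milliseconds")
TAPE = Tier("tape", "tens of seconds")


def choose_tier(reads_per_day: float,
                hot_threshold: float = 100.0,   # hypothetical cut-off
                cold_threshold: float = 1.0     # hypothetical cut-off
                ) -> Tier:
    """Pick a storage tier from how often the data is read."""
    if reads_per_day >= hot_threshold:
        return FLASH
    if reads_per_day >= cold_threshold:
        return HDD
    return TAPE


# Example: data cooling over time moves down the hierarchy.
for reads in (500, 10, 0.01):
    tier = choose_tier(reads)
    print(f"{reads} reads/day -> {tier.name} ({tier.time_to_first_byte})")
```

A real policy would also weigh cost per terabyte and retention rules, which this sketch leaves out.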
Might breakthroughs in HDD technology push tape out of its place in the hierarchy of storage media? In a recent interview with TechWireAsia, AWS’s vice president for storage, edge and data services Wayne Duso expressed disdain for tape’s long-term prospects. “The need for deep data storage has not disappeared, but the solution needs to be simpler, easier, more cost-effective, and more efficient than tapes,” he said, touting the capabilities of AWS’s latest S3 Glacier solution. “I do not believe tapes are dead, and if someone wants to use tapes for their solution, that is fine. But the solution that tapes initially provided is no longer sufficient.”
More experimental methods might also knock magnetic tape storage off its pedestal. DNA storage, for example, is predicted by some to reach a dollar per terabyte by the end of this decade, while Stefanovici pinpoints glass – specifically, silica – as an alternative that has practically zero power or environmental demands, is tamper-proof thanks to its write-once-read-many (WORM) nature, and could potentially continue working for hundreds of thousands of years, provided nobody decides to get swing-happy with any baseball bats in the data centre during that period.
Goodwin is more sceptical. Simply put, he argues, many of these candidates aren’t yet as cost-effective as magnetic tape. Indeed, when Goodwin hears predictions that tape is about to be superseded by another storage technique, his mind drifts back to the marketing campaigns for holographic storage, the ‘tape killer’ of the 2000s that couldn’t attract enough venture capital to catch on.
There’s no reason to believe magnetic tape drives won’t, eventually, stay retired rather than keep getting pulled back into the line of fire for one last job. But for that to happen, says Goodwin, experimental media ultimately has to “exceed the advantages of tape – in terms of speed, and reliability, and cost.”