Cerabyte: ceramic storage poised to usher in 'yottabyte era' (tomshardware.com)
155 points by CharlesW on Sept 8, 2023 | 77 comments


> CeraMemory cartridges (2025-30) storing between 10 PB and 100 PB, and its CeraTape (2030-35) with up to 1 EB capacity per tape.

> Cerabyte says its technology can read and write data at GB/s class speeds.

1EB = 1,000 PB = 1,000,000 TB = 1,000,000,000 GB

To read the 10PB cartridge at:

1GB/sec = 115.74 days

10GB/sec = 277.78 hours

100GB/sec = 27.78 hours

To read the 1EB tape at:

1GB/sec = 31.7 years

10GB/sec = 3.17 years

100GB/sec = 115.7 days

1TB/sec = 11.57 days

10TB/sec = 1.2 days
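These figures follow directly from capacity divided by throughput. A quick sketch of the arithmetic in Python, using the article's decimal (SI) units:

```python
# Rough read/write-time calculator for the capacities quoted above.
# Decimal (SI) units: 1 EB = 1e18 bytes, 1 GB/s = 1e9 bytes/s.

PB, EB = 1e15, 1e18
GBps, TBps = 1e9, 1e12
SECONDS_PER_DAY = 86_400

def transfer_time_days(capacity_bytes: float, throughput_bytes_per_s: float) -> float:
    """Days needed to stream the whole medium at a constant rate."""
    return capacity_bytes / throughput_bytes_per_s / SECONDS_PER_DAY

print(f"10 PB @ 1 GB/s : {transfer_time_days(10 * PB, 1 * GBps):.1f} days")        # ~115.7
print(f"1 EB  @ 1 GB/s : {transfer_time_days(1 * EB, 1 * GBps) / 365:.1f} years")  # ~31.7
print(f"1 EB  @ 1 TB/s : {transfer_time_days(1 * EB, 1 * TBps):.2f} days")         # ~11.57
```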


Throughput is usually a workable problem. Throw in a bunch of these devices and stripe your writes across them - now when you read, you get nice improvements in throughput. Look at ZFS VDEVs for example.

If they are inexpensive and durable enough to be considered archival quality, they will find their place somewhere between NVMe drives and magnetic tapes.


So let's think this through: you have ten 1EB tape drives in a RAID0. Each drive has a max throughput of 100GB/sec, so the array has a max throughput of 1TB/sec. Congratulations, now you can write 1EB of data to your array in about 11.6 days!

But what about the other 9 exabytes? It will still take another ~104 days to even touch those; and that's assuming "GB/s class speeds" means 100GB/s, and not less.

You may as well just get 10 100PB disks. There is no point getting extra storage that you will never have time to write to.
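A quick sketch of the striping trade-off (the drive count, per-drive capacity, and throughput here are assumptions, since no tape specs have been published): striping multiplies aggregate throughput, but it also multiplies capacity, so the time to fill the whole array never improves.

```python
# Hypothetical RAID0-style stripe: N drives, each 1EB at 100GB/s (assumed).
# Aggregate throughput scales with N, but so does total capacity,
# so the time to FILL the entire array stays constant.

EB, GBps = 1e18, 1e9  # decimal (SI) units
SECONDS_PER_DAY = 86_400

def fill_time_days(n_drives: int, drive_capacity: float, drive_throughput: float) -> float:
    total_capacity = n_drives * drive_capacity
    total_throughput = n_drives * drive_throughput
    return total_capacity / total_throughput / SECONDS_PER_DAY

one_drive = fill_time_days(1, 1 * EB, 100 * GBps)
ten_drives = fill_time_days(10, 1 * EB, 100 * GBps)
print(f"{one_drive:.1f} vs {ten_drives:.1f} days")  # identical: ~115.7 days either way
```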


True if the full 'tape' is read sequentially.

These kinds of storage are indexed; you can jump directly to the relevant sections and access the data.

Higher storage density also tends to improve the efficiency of mounting and dismounting media.


The ultimate way to conceal data: just remember which index has your information; the rest is randomness.


Just hope someone doesn't have a good magnifying glass (I don't know if it's true of this, but many storage media leave a physical pattern that you can see, and randomness stands out).


Encrypted data produced by any decent algorithm will be statistically indistinguishable from random noise.
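As an illustration (not real cryptography — a toy SHA-256 counter-mode keystream standing in for a proper cipher), ciphertext bytes look near-uniform even when the plaintext is all zeros:

```python
import hashlib
import math
from collections import Counter

def keystream_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy counter-mode stream cipher: XOR plaintext with SHA-256(key || counter)."""
    out = bytearray()
    for offset in range(0, len(plaintext), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = plaintext[offset:offset + 32]
        out.extend(p ^ k for p, k in zip(chunk, pad))
    return bytes(out)

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 = indistinguishable from uniform)."""
    n = len(data)
    counts = Counter(data)
    return sum(c / n * math.log2(n / c) for c in counts.values())

zeros = bytes(64 * 1024)                       # highly structured plaintext
ct = keystream_encrypt(b"secret key", zeros)
print(byte_entropy(zeros))  # 0.0 bits/byte
print(byte_entropy(ct))     # ~7.997 bits/byte, very close to true randomness
```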


Even without encryption, https://en.wikipedia.org/wiki/Erasure_code is common in archiving, and that tends to look pretty random. Obvious patterns don’t maximize the smallest differences between any possible messages.
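A minimal example of the idea: even single-parity erasure coding (the XOR trick used in RAID 5, far simpler than the Reed-Solomon codes real archives use) produces a parity block that mixes all data blocks together:

```python
# Single-parity erasure code: store k data blocks plus one XOR-parity block;
# any ONE lost block can be rebuilt by XOR-ing the survivors together.

def xor_blocks(blocks: list[bytes]) -> bytes:
    acc = bytearray(len(blocks[0]))  # all blocks must be the same length
    for block in blocks:
        for i, b in enumerate(block):
            acc[i] ^= b
    return bytes(acc)

data = [b"block-A!", b"block-B!", b"block-C!"]   # k = 3 equal-size data blocks
parity = xor_blocks(data)

# Simulate losing block 1, then recover it from the survivors plus parity.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
print(recovered)  # b'block-B!'
```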


Looks true

> Data reading can be done with equipment using high-resolution microscopic imaging techniques or electron beam microscopy


In other words: you don't have to wait for slow read times if you don't bother reading the data at all!

You still need to live with the slow write times, though. What's the point of even building a single unit of tape storage if it will take you a minimum of 30 years to write your data to it? Why not just make 1 million smaller ones?


You’ve now repeated this “30 years to write” thing in several places. It’s frankly obvious that the 1EB tape would have faster write speeds than that, and repeatedly pretending otherwise doesn’t make a compelling discussion. Taking decades to read or write a single storage device would make that storage device completely worthless for obvious reasons.

CeraTape is further into the future compared to CeraMemory, so it’s completely understandable that they wouldn’t be ready to talk about the read/write performance of CeraTape yet.

However, it is fun to imagine an audit logging server that is able to operate for 30 years continuously without having to swap the storage. Just write, write, write. The odds of an audit log ever needing to be read are usually quite low, but if it is ever called upon, you’d be able to quickly jump to the point in time of interest and recover the relevant audit logs from potentially decades of logs.


> It’s frankly obvious that the 1EB tape would have faster write speeds than that, and pretending otherwise doesn’t make a compelling discussion.

OK, so I should assume the max of "GB/s speeds"? 999GB/sec would still take 11-12 days!

---

Even if we do limit it to 1GB/sec, that's a lot of data to be logging, even if it's raw video. It would be neat to have that as a sort of infinite throwaway cache, though, but I can't imagine really using more than a petabyte or two.


12 days would be adequate. This is archival storage, not high performance storage.

However, I will emphasize again that CeraTape is further in the future, so it makes sense that Cerabyte would not be ready to discuss performance figures for that. What you’re continuously quoting is certainly referring to CeraMemory, the stuff they’re planning to release sooner in smaller capacities.

10PB at 10GB/s would be about 12 days. Even if it were 5GB/s, it would still be less than a month. A major trade off, to be sure, but I can definitely imagine applications where that would be acceptable if the cost were low and the durability were high.

The real question is whether they can deliver what they’re promising at a reasonable price.


Can I have a tiny fraction of it? I want a consumer tape drive, or a revival of the Zip disk / MO disk, in the TB range and lasting a few hundred years.


Just curious, but wouldn't you call an NVMe enclosure with TB3 or TB4 a sufficient Zip drive reincarnation?


We are looking at $149 for a 4TB SSD. That is close to affordable. The only problem is that current NAND prices are abnormal and I don't expect them to last. So that is the first problem: price/GB isn't quite there yet.

The second is that NAND quality varies a lot, and it isn't battle-tested as a long-term storage solution. Would I still need ZFS storage for my SSDs just in case? If so, why not use HDDs instead?

Maybe NAND will get there. But judging from the industry roadmaps I have seen, it won't happen for another 5-6 years.


No. Flash memory loses its data after just a few years unless regularly scrubbed and rewritten (which in turn burns through your write cycles).


My reading of the abstract was that the “initial 10 PB systems” would be racks. You can build or buy a 10PB rack today, based on conventional HDDs. So these ceramic cartridges would have similar density to existing HDDs, and presumably stuffing a lot of them into a rack brings the opportunity for some I/O parallelism.

The extraordinary claim here is that the technology can scale by 10x over 5 years, and then also be applied to tape, all while being much cheaper. And you know what they say about extraordinary claims…


Even at 1MB/s it would be amazing (if it ever made it to consumer-level stuff): you could just have it auto-archive, in the background, anything on a cache disk that hadn't been touched in a year.


PCI-E might approach that 1TB/sec number by the time CeraMemory ships. And 10ish days isn’t unreasonably long for model training (a bit long for first epoch though).

https://www.tomshardware.com/news/pcie-70-to-reach-512-gbs-a...


Might be good enough for archival storage, though.


An archive that takes 30 years to write... What would you write it from?


Any constant stream of data, from seismographs and radio telescopes to surveillance cameras.


You have 1GB/sec of that in one place?


CHIME telescope has a raw data rate of 1PB/day[1], or roughly 11GB/s. They use three levels of on-site processing since they can't store that much data.

The LHC had a raw data rate of over 40TB/s, and over 100GB/s even after the low-level triggers, back in 2016[2]. Again, they're forced to throw away most of it due to lack of storage.

This means someone looking for a novel signal can't just scan through the archives. I'm sure science instrument groups like the ones mentioned would have loved to be able to store more data.

[1]: https://iopscience.iop.org/article/10.3847/1538-4357/aad188

[2]: https://www.annualreviews.org/doi/full/10.1146/annurev-nucl-...


Would be useful if you wanted to archive many tv stations at once (and never actually go back to watch them)


Or maybe go back to them in a rare case, like a criminal case investigation.


That's what I was thinking. Like the BBC archives. Who really cares if it takes 6 months to unarchive an entire decade of TV history?


radio telescope arrays can probably pull that much.


Same argument applies when I hear people say “we can just do an ETL job.” And why storage is so cheap in the cloud. Once you get a lot, it’s hard and expensive to move. And definitely why things like AWS Snowmobile exist.


If it's that durable, they could do RAID 0 across 50 tapes. Could they?


> “Data reading can be done with equipment using high-resolution microscopic imaging techniques or electron beam microscopy.”

What this says to me is that it's going to be incredibly slow to read and write. They claim that in order to achieve the headline densities they will use 3nm spot sizes for writing bits; no way can that be done by an optical microscope, so electron beam it is. There's a reason we don't use e-beams to make computer chips: it's incredibly slow. That's why ASML sells a $200m machine to project the extreme UV mask image down to the required size, and it can do all the writing in parallel instead of serially.

I can still see this tech having use for long term archiving purposes, especially if it doesn’t end up being too expensive. But that’s a big if. Even if the ceramic storage tapes are free, running a massive high throughput electron microscope is unlikely to ever be anything close to inexpensive. My intuition is that it will probably not be able to compete with just buying some traditional HDDs or tapes and keeping backups. Maybe someone with more knowledge of the area can chime in?


High resolution microscopy and electron beam microscopy are both industrial production technologies that cost far less than extreme chip-making. That is: there is already a business that supplies these technologies off the shelf. It's still not truly cheap, but it makes sense in the context of warehouse computing, in the same way that massive tape robots do.

(I still think it's unlikely the economics of developing an entirely new storage tech like this will ever beat the established players, but the cost of the readers and writers is the least challenging aspect).


They're claiming that read/write speeds are at gigabytes per second using "low power" technologies.


That seems insane.


Why?

In high-throughput optics you are basically scanning one full 4K frame (3840x2160) per second (or higher!). If each pixel stores one data bit but the sensor reads out 16-bit values, that's 10-20MB/sec of raw data. You would likely move the field of view to reach 10FPS, so ~200MB/sec per reader. However, it would be trivial to parallelize this (with more objectives, mirrors, higher-resolution cameras, or whole devices), so it's not impossible to imagine an optimized production version of this generating terabytes/second of raw data.
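To sanity-check that arithmetic (the frame size, bits per pixel, and frame rate here are my assumptions for illustration, not Cerabyte's figures):

```python
# Rough throughput estimate for camera-based optical readout.
# One 4K sensor; frame rate and bits-per-pixel are assumed values.

def reader_throughput_MBps(width: int, height: int, fps: float,
                           bits_per_pixel: int = 1) -> float:
    """Decimal megabytes per second produced by one reader."""
    bits_per_second = width * height * bits_per_pixel * fps
    return bits_per_second / 8 / 1e6

decoded = reader_throughput_MBps(3840, 2160, fps=10)                     # 1 data bit/pixel
raw = reader_throughput_MBps(3840, 2160, fps=10, bits_per_pixel=16)      # 16-bit sensor readout
print(f"decoded bits: {decoded:.1f} MB/s, raw sensor data: {raw:.1f} MB/s")

# Parallelism scales linearly: reaching terabytes/second of raw readout
# would take thousands of independent optical paths.
print(f"{1000 * decoded / 1000:.1f} GB/s of decoded data with 1000 readers")
```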


Because positioning a 4K camera at nanometer scale is hard, and moving it that precisely 10 times/sec is even harder. Reading each pixel at that scale and speed without substantial error correction seems crazy to me. You'll probably want at least 3x3 pixels per bit, and a higher-resolution camera that moves less.


The miniaturization of what are in some senses fired clay tablets for information storage would be a fascinating development.


Full circle Mesopotamia


Company's website: https://www.cerabyte.com

Abstract: https://storagedeveloper.org/events/agenda/session/527 - To be presented on September 18 at the Storage Developer Conference in Fremont, California.


The claim of 75% lower cost is pretty useless. It seems to be based on a very particular multi-decade storage scenario, and it doesn't seem to recognize that new drives are denser than old drives.

Please just say drive cost and media cost.


> Please just say drive cost and media cost.

Either they don't know, or it's the ol' industry smokescreen to charge more opportunistically.


The problem, as always, is that if this is just a single company, then they are going to die a quiet death from not being able to break into the market. If they're a single company selling licenses to their patents and production line, then maybe it'll work.


I think they could absolutely make it work as a single company. The hurdle is proving the technology actually works and manufacturing it.

If they can actually produce the product they claim, I don't think failing to break into the market will be much of a concern. They'd be in a league of their own. However, if the tech doesn't actually work the way they claim, or if competing tech can do the job better, then they'll go nowhere.

Realistically, if they deliver on the product they claim to have they’re not going to last long but not because the company fails. They’ll be swept up by a bigger player in the industry.


I think AWS, Azure, and GCP would be very interested in such a technology. It could serve as a glacier replacement.


All three of those would prefer the cheapest, immediately replaceable option of off the shelf drives rather than small run, low availability drives. Even if the dollar per petabyte figure is lower, "we can't make that many" is a problem if you want to sell to AWS and friends. It's going to be a long time before these drives hit AWS scale, so they'll need to win over the much smaller fish first.

And of course these days with AWS, the other risk is Amazon going "yeah we're just going to literally steal your idea and make our own, and there's nothing you can do about it because no lawsuit you bring would fine us an amount we can't just write off as a business expense" if it'll save them enough money.


I'm not sure that's entirely true. AWS, at least, is OK with exotic technologies if they're an order of magnitude or more beneficial. The scales discussed are not on magnetic roadmaps and are compelling enough. Glacier today is a fairly custom hardware kit built around tape infrastructure, and it used to be based on spinning disks. They've redeveloped it at least once and would again.


And we're back to chiseling on stone tablets.


“And we’ll be saying a big hello to all intelligent life forms everywhere. And to everyone else out there, the secret is to bang the rocks together, guys.”

—Hitchhiker’s Guide


Am I the only one who wonders how long until diffusion randomizes things in a 10 nanometer thick film?

Obviously, error correction will be part of the system. It seems to me that there will always be a measurable raw error rate. It would be quite useful to have those raw and corrected error counts exposed in order to manage this new class of storage proactively.

I don't care if there are correctable errors, I just care if they suddenly increase, or change in nature.


I don't think you will see a lot of diffusion (or any kind of chemical degradation) in a 10 nm (100 angstroms, say ~100 layers of atoms) thick slab. Unless you bombard it with a molecular beam, but that'd be really weird and deliberate.

Maybe you can get some light atoms (hydrogen) diffusing through it... but not displacing whatever is in the ceramic.


Ceramics are notoriously brittle so it will be interesting to see how they make ceramic tapes.


> The first Cerabyte solution, CeraMemory, will come as a cartridge that contains sheets with ceramic coatings. If you looked closely at the data stored, it would look like "quasi-punched cards in nano-scale."


Their roadmap to denser storage, TB/cm vs GB/cm, mentions tapes. The cartridges make sense since they're essentially CD-ROMs; CeraTape, on the other hand, with a 10 nanometer coating, seems like magic.

> Meanwhile, CeraTape (2030-35) gives away the storage medium type in its name. These data tapes will have a 5 µm thick substrate with a 10 nm thick ceramic coating.


Ceramic film is used in window tinting and seems durable enough. It can stand the rollers and the elements.

I assume the film for storage will be in a protective housing, but even exposed I don't see a problem.


Probably the same way Samsung makes a foldable screen with a layer of glass: make it thin.


Assuming their claim is true that they will soon(-ish?) be writing TB/cm^2 and reading at GB/sec rates using electron beams, this seems like a very cool breakthrough in electron-beam microscopy, in both size and price, and there would be many other uses for the technology. I could SO much use a portable/tabletop electron beam microscope!


I’ve wondered if extremely dense storage like this would give rise to a new “tape trading” or bootleg culture - if you could export Spotify’s entire catalog or every movie made so far to a single storage tape and duplicate it all, what would that be like?


I believe the main bottleneck for that would be finding seeders for a collection that large, not a lack of storage.


In that case why would it need to be online at all? If it's not too big just share the physical media..


Seriously: just buy a station wagon ;)


Sounds like WORM (write once read many)


My own usage would be satisfied with that sort of storage, and it doesn't even need to scale past single digit petabytes, I don't think. I probably even only need single digit terabytes for rewritable, everything I put on the Synology I expect to keep for decades all the way up to forever.


640 terabytes should be enough for anyone. ;)


Realistically, I am probably never going to have more than about 20,000 movies. Right now I'm above 5000, but as they become more obscure, less B-movies and more Z-movies, I would lose interest.

Currently I'm getting them at low bitrates, only about 3 gigs each in H.265. But if I were to get them as lossless 4K Blu-ray remuxes, they'd go up towards 100 gigs each, never beyond that (I think). So, x30 for the increase in quality, and x4 for quantity.

I'd probably still get the low bitrate version for streaming, but it's inconsequential to the math.

I currently have about 120 terabytes capacity (maybe more if I get the Synology expansion chassis, but I think I can only do another 5 bays that way).

This means that (for movies), 10 petabytes would suffice for the rest of my life, and likely for the next couple generations. Even adding television into the mix, it looks like that's currently at half what my movie collection is taking up, so 15 petabytes. Music probably isn't worth factoring in. And my 5000 ebook library, while small, is only a few gigs. If I somehow raise that to a few million, still wouldn't notice.

Anything that gets me into the middle hundreds of terabytes cheaply (or with good redundancy and affordable backups) is a godsend. Your 640 terabytes figure really is enough for me, to almost within an order of magnitude.


Maybe we’ll start publishing movies as 3D environments instead, so you can walk around them and see the film from any angle. I’m sure we can eat up lots of drive space with photorealistic models and textures for things that are normally hidden.


Maybe, but right now your average movie isn't even going for 4K.


Hi Bill :)


This sounds like an 8K TV in a flying car.

Talking to the press before having a product, instead of staying in stealth mode until there's a functional prototype, typically means they're going nowhere.


Always interesting to see new ideas being tried, but if it needs an electron beam microscope to read the data back that might cause some problems for miniaturisation.


I wonder how dust/scratches might affect the supposed spectacular data density. A single scratch can wipe out terabytes of data... (even with FEC)


At that scale, devoting a significant portion of the storage space to parity or redundant data seems plausible if not required. A scratch doesn't have to mean wiping any data completely.


People in this thread seem to miss the comparison against tape storage which caps out at 1GB/s today. If this thing can also do random access, it is legitimately a totally new ball game for archival storage and how storage products are built / conceived.


The article did well to call out the extraordinary claim that it is stable in a huge temperature range, starting close to absolute zero.


We have such wonderful things to store for you.


It is PB per RACK!! Not per cartridge!


huge if true



