Rest in Peace, Optane (specbranch.com)
194 points by PaulHoule | 2023-12-04 | 122 comments




My HP laptop came with Optane memory; as far as I could tell, it was nothing special.

I eventually pulled it to upgrade to a terabyte SSD.


That was probably a hybrid Optane setup, like the H10 SSD: a tiny Optane SSD and a normal NAND SSD on the same M.2 card. Not very good compared to real Optane, but still a really noticeable improvement when running apt, dnf, or any Windows program install.

Silly hack of a product: 2x PCIe lanes for Optane, 2x for NAND, which meant compromised bandwidth for both! The fact that it actually had value in some cases highlights how cool it could have been as a proper hybrid design.

In the use case they were pushing for consumers, accelerating a hard drive, it was pretty much indistinguishable from cheap flash.

Optane had great potential for big databases, but you're not running a big database.

It also had very interesting potential for supplementing memory, as a very low latency place to store bulk data. But then the Optane DIMMs ended up being just as expensive as, if not more expensive than, actual RAM. So outside of certain niches, and especially given how narrow the compatibility was, it made more sense to just spend the money on more RAM.


No, your laptop did not come with a pure Optane SSD; it came with a 32GB Optane cache. The former's retail price was thousands of dollars, while the latter was a joke.

I also had one: a 1TB drive plus Optane cache.

It actually was a net decrease in performance! The reason is that first writes would saturate the cache SSD, and then the Optane driver software had to manually flush the cache to the main SSD.

What happened was that the laptop CPU actually heated up faster doing this, and reached throttle temps sooner.

I disabled the Optane function and the performance increased to something typical.


Optane SSDs are really special and still have a place. SLC is 30 microseconds at best; fio puts my 905p Optane drive at 5-11 microseconds. That means for a single-threaded, latency-bound task, a 905p may finish it 6 times faster than an SLC SSD, and 20 times faster than MLC!

Edit - wrong prefix sorry!


Maybe your fio is hitting RAM? fio on a 905p / P5800X is usually in the low microseconds: an order of magnitude faster than other NVMe drives, but not as fast as RAM.

Even hitting RAM takes the best part of 100ns. They probably mean 5-10us given the ‘6x faster’ thing.

50ns is typical with overclocked desktop RAM

Do you have your fio benchmark / args handy? I would love to get sub-microsecond out of Optane, even if it's only on paper.

It's possible to get <1us with Optane PMem (the DIMMs) in fio; you just need to use the memory-specific engines so that it avoids the fs/block layer.
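
A minimal sketch of that kind of job (not anyone's actual benchmark), assuming a file on a DAX-mounted (fsdax) ext4/XFS filesystem; the path, size, and runtime below are made up:

    [pmem-randread]
    ; the libpmem engine maps the file and issues loads/stores, bypassing the block layer
    ioengine=libpmem
    filename=/mnt/pmem0/fio.testfile
    size=4g
    rw=randread
    bs=64
    iodepth=1
    numjobs=1
    runtime=30
    time_based

With a block-device engine like psync you will never see sub-microsecond numbers; the memory-mapped engines (libpmem, dev-dax, or plain mmap) are what avoid the fs/block layer overhead.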

I use an Optane drive for boot/pagefile and for my VR simulators (two drives) for exactly this reason. It doesn't help peak FPS, but it has a noticeable effect on the 1% and 0.1% cases, to the point where I notice it and can tune around the difference.

This is a 905p vs SK Hynix Platinum for reference.


What was interesting about Optane was that it was kind of an attempt to get rid of the disk/RAM distinction.

You had a single device (an Optane stick) which served as both RAM and disk.

After decades of the current approach, we finally could have physically gotten rid of disks - HDDs, SSDs, or even NVMe sticks.

https://www.theregister.com/2022/08/01/optane_intel_cancella...

>Optane presented a radical, transformative technology but because of this legacy view, this technical debt, few in the industry realized just how radical Optane was. And so it bombed.


Still waiting for the Memristor...

Oh jeez I was so excited for that!

At first I was resistant but now I am charged for it.

Optane was probably a memristor technology under the hood.

So the winning move now is for Apple to announce iOptane and sweep the data storage and cache industries by storm?

Apple may already be headed in that direction. They already have unified CPU and GPU RAM. It doesn’t seem far-fetched to imagine that they could unify persistent storage and memory.

Knowing Apple's marketing moves, they could definitely do that: just use a single number to describe memory, and then pretend it's a big number.

I can already see it: base 128GB of total system unified memory for your files and data.

Keep all other files in iCloud for big money. Genius

Well technically, Intel and AMD both use the regular system RAM for their integrated graphics VRAM, but I see what you mean.

Intel and AMD also do support unified memory for their integrated graphics. It’s been a while since you needed a statically cordoned-off area of main memory (“shared memory”) for the iGPU to work.

Consoles have been using unified memory since the 8th gen (PS4/XB1/Switch, kinda sorta even WiiU).

And NVidia has CUDA unified-memory slide decks going back at least 5 years.


The Xbox 360 already had unified memory too. That gave it a slight edge over the PS3 in the long term, because it was more flexible than a fixed 50:50 split.

The original Xbox before it, too.

And the N64 before them. I think it was the first console with unified memory.

I think they'd rather call it iMemory and jack up the price of the iDevices using it by 25%.

Yeah - the article was talking about mmap... but what I wanted was to not have to define a persistence boundary. I wanted the entire in-memory state of my program to be persisted - perhaps even duplicated and moved elsewhere.

But then you can't just reset the state when crashes happen and data gets corrupted.

This shouldn't be voted down. This problem, in the more general case, was inherent at the system level in PARC's persistent, immersive Smalltalk and Interlisp environments on the D-machines. It was much better to have full source you could reload into a fresh environment.

mmap on Optane with direct-access-aware (DAX) filesystems like ext4/XFS is not like mmap on block devices, where the OS gets in your way, pages stuff in from disk, and (maybe) later syncs it back to persistent storage. Optane is the persistent storage; it's just usable/addressable as regular RAM since it's plugged into DIMM slots.

And in the later Xeon architecture (Xeon Scalable 3rd gen, I think), Intel expanded the persistence domain to the CPU caches too. So you didn't even have to bother with CLFLUSH and CLWB instructions to manually ensure that some cache lines (not 512B blocks, but 64B cache lines) got persisted. You could operate in the CPU cache, and in the event of power loss, the CPU, memory controllers, and the capacitors on the Optane DCPMMs ensured that the dirty cache lines got persisted to Optane before the CPU lights went off. But all this coolness came a bit too late...
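
For anyone curious what that looks like in code, here's a minimal sketch (function names are made up; this is the pre-eADR path where you still flush manually): after storing into a DAX-mmap'ed Optane range, persistence is a loop of CLWB over the dirty cache lines plus a single SFENCE, with no syscall anywhere.

    /* Sketch only, not production code. Assumes dst points into a DAX-mmap'ed
     * Optane range. Compile with -mclwb. On eADR platforms (the 3rd-gen Xeon
     * case described above) the flush loop can be skipped entirely. */
    #include <immintrin.h>
    #include <stdint.h>
    #include <string.h>

    #define CACHE_LINE 64

    static void persist_range(const void *addr, size_t len)
    {
        uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
        for (; p < (uintptr_t)addr + len; p += CACHE_LINE)
            _mm_clwb((void *)p);   /* write back the dirty line, keep it cached */
        _mm_sfence();              /* order the flushes before later stores */
    }

    void store_record(char *dst, const char *src, size_t len)
    {
        memcpy(dst, src, len);     /* plain stores into persistent memory */
        persist_range(dst, len);   /* durable once ADR drains the write queues */
    }

Libraries like PMDK's libpmem wrap exactly this (pmem_persist and friends), so you don't have to hand-roll it.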

Another note: Intel's marketing had terrible naming for Optane stuff. Optane DCPMMs are the ones that go into DIMM slots and have all the cool features. Optane Memory SSDs (like Optane H10) are just NAND SSDs with some Optane cache in front of them. These are flash disks, installed in PCIe slots but Intel decided to call these disks "Optane Memory" ...


There are also fully Optane-based SSDs, such as the 905P or the P5800X.

Yes, which made it even more confusing: why call the Optane-cached consumer NAND disks "memory"? But perhaps they thought it was easier to fool the consumer segment.

> mmap on Optane direct-access-aware (DAX) filesystems like EXT4/XFS now, is not like mmap on block devices where the OS gets in your way and pages stuff in from disk

Yes, the real case for Optane memory is that, supposedly, you don't have to fsync(). And insisting on proper fsync() tends to tank the performance of even the fastest NVMe SSDs. So the argument for a real, transformative performance improvement is there.


Why would you not have to fsync? The fsync is a memory barrier that is just as useful with Optane to ensure integrity.

Do you mean the latency of ensuring fsync safety is lower?


No you don't have to fsync. Think of it like RAM. You don't fsync RAM.

You do, in fact. It's called a memory write barrier. It ensures consistency of data structures as needed. And it can stall the CPU pipeline, so there's a nontrivial cost involved.

The point is that on PMem that is simply "sfence", and not a potentially super-expensive "fsync" syscall... Fsync is an fsync, not a memory barrier...

They both involve flushing cache to backing stores and waiting for confirmation of the write. It's literally the same thing. It's just that writing a cache line to RAM is orders of magnitude faster than writing a disk sector to storage, even with NVMe SSDs. Optane is/was somewhere in the middle.

> They both involve flushing cache to backing stores, and waiting for confirmation of the write.

No they don't. A fence only imposes ordering. It's instant. It can increase the chance of a stall when it forbids certain optimizations, but it won't cause a stall by itself.

CLWB is a small flush, but as tanelpoder explained the more recent CPUs did not need CLWB.


memverge.com does some cool work around making that happen.

I'll admit this sounded cool when I first heard about it; but it's actually a lot harder to program if you want to be able to recover from sudden power outages (which would be the main reason for having persistence in the first place).

That's an easy way to accumulate data corruption.

It's better to design for unexpected restarts than design for a golden in-memory image which needs to be carefully ported around, have all its connections wired back up, and so on.

You're going to get unexpected restarts anyway. The faster and more reliable you can make recovery from those, the more it benefits you in the migration use case too. The kinds of things you might want to do to enable reliable restart - like retry mechanisms for incoming requests - make migration work too.


You can design for that. For example, when building a persistent-memory native database engine, you probably need some sort of data versioning anyway - either Postgres style multi-version rows (or some other memory structures) that later need to be vacuumed or Oracle/InnoDB style rollback segments that hold previous values of some modified objects. Then you probably want WAL for efficient replication to other machines and point in time recovery (in case things go wrong or just DB snapshots for dev/test).

Transient & disposable memory structures, like keeping track of who's logged in or compiled SQL execution plans that facilitate access to the persistent "business data", can mostly stay in RAM/HBM/CPU cache anyway, for performance reasons and because these things do not necessarily need to persist across a crash/reboot. The data (and likely indexes, etc.) does. But you won't need a buffer cache manager that copies entire blocks around from storage to different places in memory and vice versa. Your giant index or graph could rely just on direct memory pointers instead of physical disk block addresses that need to get read to somewhere in memory and then accessed via various hashtable lookups & indirect pointers. And you don't have to ship entire 512B-8kB blocks around just to access the next index/graph pointer; you access only the relevant cache line, etc.

With proper design, you'd still have layers of code that take care of coherency, consistency and recovery...
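
A tiny sketch of that last point (a hypothetical layout, not from any real engine): a pmem-resident index node that stores mapping-relative offsets instead of disk block addresses, so a lookup is plain pointer chasing that touches only the cache lines it needs rather than pulling 8kB blocks through a buffer cache.

    /* Hypothetical persistent B-tree node living directly in a DAX mapping.
     * Offsets are relative to the mapping base, so the structure survives
     * being mapped at a different virtual address after a restart. */
    #include <stdint.h>
    #include <stddef.h>

    typedef uint64_t pmem_off_t;          /* 0 means "null" */

    struct node {
        uint32_t   nkeys;
        uint64_t   keys[15];
        pmem_off_t children[16];          /* persistent "pointers" */
    };

    static inline struct node *deref(void *base, pmem_off_t off)
    {
        /* No buffer cache and no block I/O: just arithmetic and a load. */
        return off ? (struct node *)((char *)base + off) : NULL;
    }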


Look at developing for MSP430s with FRAM -- these microcontrollers have a decent amount of FRAM with full persistence, full XIP etc, up to 256 kB; but only 8 kB or less of traditional SRAM. Even in this world, where you /could/ have everything persisted, you still end up aware of the persistence boundary and using SRAM both for the absolutely highest-performance code (e.g. interrupt handlers; FRAM has more wait states than SRAM in this implementation), but more interestingly for things that specifically /should not/ be persisted (e.g. the bytes storing whether your POST has completed, hardware initialized, etc). You can come close to persistence-oblivious, especially at a conceptual "application layer", but the overall implementation still ends up persistence-aware.

I had a calculator from the early 90s like that, the HP 48GX: 128K of RAM, but it kept its storage when you shut it off and on again. Certainly very convenient!

Like Itanium, it was one of those Intel projects that was simultaneously too ambitious and not ambitious enough.

If Intel really wanted to redesign the Von Neumann architecture, they would have had to be prepared to absorb losses for much longer, way north of a decade.

The alternative might have been to focus exclusively on providing SSDs using the new technology and maybe try to segue into this new memory architecture 10 years later. Like Itanium should have initially focused on beating competing x86_32 chips of the era in benchmarks and ship the new ISA as an afterthought.

Thank you, Intel, for trying to push the envelope, though.


I think it will be basically impossible to move away from von Neumann unless you control the entire stack, including OS and software. I don't even think Apple could do it with the Mac, because Macs are general-purpose machines (for now at least). Maybe Apple could do it with iOS devices. Nintendo might be able to pull it off, though people trying to port titles from other platforms may no longer bother. Because so many AAA titles go between Xbox and PS, I don't think Sony would try.

I'm not sure if Sony would want to make a system with a very unique architecture again. Devs complained about how hard it was to program PS2 games, and again with PS3. PS4 and 5 are practically PCs by comparison.

I don’t want to say that we’ll never see big architecture changes again but I think a company like Sony would want more confidence that they’d get real advantages. Cell wasn’t just unpopular with developers but also never delivered compelling performance; I suspect if they’d had a PC CPU and a Blu-ray player at the same price it would’ve sold identically.

Anyone trying this needs to figure out a decade-long schedule with points where something would be worth using for some reason so they don’t have to run the whole thing in a vacuum hoping it’ll be worth it at the end.


The other similarity was that they needed to treat developers like VIPs: Itanium failed in part because almost nobody was interested in paying a premium for a slow chip and a licensed compiler, and then spending their time optimizing code just to match competing chips' out-of-the-box performance. In both cases, they really needed to flood developers with free hardware & help - especially open source developers working on things like databases, where you could see the most wins.

People might have bought Optane if the pitch was “Postgres/MySQL runs twice as fast” rather than hoping someone else would make your purchase cost-effective later.


At the very least, wait for CXL to become commonplace before you get ready to cancel, since that's the kind of interface that is a perfect fit for Optane.

The problem is Optane wasn't as fast as DRAM, nor as cheap as disk. So you still needed the conceptual split anyway, and there wasn't a compelling reason to really use it. OK, I can get a lot of "RAM", but it's 1/10th the speed of my normal RAM, which kills both latency-bound and bandwidth-bound applications - which is basically any workload that wants lots of RAM. Or I can get super fast persistent storage with high durability but much higher cost, which basically makes it good for a persistent cache but not much else, considering SSDs are now similarly fast even though their durability is not as good. It wasn't sunk by a stuck-in-the-mud way of thinking about memory versus disk; it was sunk because the tech never got good enough to actually achieve its ambitions.

Hmm, I've seen a report by Fujitsu that it was fine if used alongside DRAM:

>Intel Optane persistent memory is blurring the line between DRAM and persistent storage for in-memory computing. Unlike DRAM, Intel Optane persistent memory retains its data if power to the server is lost or the server reboots, but it still provides *near-DRAM performance*. In SAP HANA operation this results in tangible benefits

>Speeds up shutdowns, starts and restarts many times over – significantly reduce system downtime and lower operational costs

> Process more data in real-time with increased memory capacity

> Lower total cost of ownership by transforming the data storage hierarchy

> Improve business continuity with persistent memory and fast data loads at startup

https://sp.ts.fujitsu.com/dmsp/Publications/public/wp-perfor...


Where Optane could've been used: distributed storage. The whole game there is about how fast you can ack the writes while writing as far away as you can. The company I worked for in this specific area had used pmem, when it was available, for that specific reason.

I'm no expert on the costs of these things and especially wouldn't dare to predict how those costs could've behaved in the future, but my guess would be not that the technology had no use, but rather that it was too expensive for what it was good at, and the second-best option was cheap enough to warrant choosing it over Optane.


You're right and it did get used for that. Unfortunately that's nowhere large enough a use case to support the entire product/business/r&d.

> What was interesting about Optane was that it was kinda an attempt to get rid of disk and ram distinction.

That might be the promise, but that premise is fundamentally flawed. What drives the need to classify memory devices is performance characteristics, and even if your starting point is an idealized world where Optane was ubiquitous, all it would take is an ephemeral memory technology that significantly outperforms Optane to recreate the need to support a performance-oriented, non-persistent memory type.


I was just thinking about Optane now that I'm dealing with AWS Lambda cold starts.

It would be a great use for that technology to decrease function startup time.


The problem is that if you want to keep the server state around, there's already a cheap and easy solution: write a persistent server.

Optane is dead. Long live Optane!

PCIe 3.0 Optane recently sold so well it got backordered and the price popped. (OK, maybe it was LevelOneTechs' Wendell? Maybe it was organic? https://www.newegg.com/intel-optane-905p-1-5tb/p/N82E1682016... ) Today PCIe 4.0 prices have fallen so fast that you can RAID0 a few M.2s and get about the same IOPS as a single Optane drive for a similar $ per GB, though not exactly the same durability / DWPD. Chia is no longer a driving force for drive durability, but then again Optane was designed for databases / SANs and not proof-of-space (e.g. PNY https://www.pny.com/lx3030-m2-nvme-ssd ).

Optane as a consumer product might have been a good play when laptops mostly had spinning disks, but the market timing was too late. In hindsight those consumer Optane drives were, yeah, a mistake, but a cool part for a homelab.

Optane as an enterprise / workstation product, though, is king for:
- databases / ML datasets that don't fit in memory
- large SANs that host critical VM persistent storage

It's a small niche, but when IOPS and durability matter, there's nothing close. However, if Optane really ends at PCIe 4.0, then top-tier PCIe 5.0 NVMe will probably meet or beat Optane in 2024.


Well, if Optane weren't dead, it would not be stalling: as SSD speeds keep improving, Optane speeds would have kept improving too.

I've been using them on SBCs for that reason - on PCIe 3 you still get the latency benefits even if the throughput isn't there.

It's possible the price pop is happening because of people trying to find spare parts.

I can't speak to these specific devices because I haven't been doing infrastructure work since before 2015, but I recall the price of certain enterprise products would routinely skyrocket a few years after it became hard to find them. I recall having to source a replacement drive for a server at my Dad's company and discovering that finding any drive at least 4.5GB in capacity for this 15 year old server was challenging. I ended up having to buy a much larger drive at about twice the cost of a modern (at the time) enterprise-grade SAS drive.

I doubt my specific example is all that similar to what might be happening. In my case, it was a confluence of the usual suspects: RAID controllers that are very particular about the specific drive models/firmware versions they work reliably with (and which everyone else replacing their drives needs too!), coupled with the server using a no-longer-common interface (SCA) and a vendor-specific drive tray from a vendor that was no longer popular (and went bankrupt)[0].

I suspect it's too soon for it to be that simple but I wonder if there are specific circumstances where "My Optane drive failed and replacing this drive at any cost is the only fix that gets me access to my data, again".

[0] Not important, but it was a Gateway brand server if you can believe it. And I ended up kludging the tray situation, anyway.


The low latencies at lower queue depths made it great for ZFS write cache devices (SLOG). It was the perfect match for that purpose.

If I had a database smaller than 400GiB, I could have made it screaming fast, while staying safe, with an Optane drive.

Indeed one of a kind technology.



Wrote a big rant about this. https://twitter.com/jaredhulbert/status/1553183032797933568

TL;DR: Optane didn't work because Intel overestimated how fast and easy market acceptance would be, and how valuable a monopoly on a new tech would be. As a result, they failed at rule 1 of memory: keep the fabs full. Costs were high, so demand was low, and it led to a death spiral. By the time they realized it (if they ever did), contract terms and technical decisions left them no options to pull up.


I feel that Optane is a missed opportunity due to economics - the cost of adapting our algorithms to multi-layer storage (breaking the abstraction) is much higher than the economic benefit of nonvolatile "RAM". Perhaps the cloud was the Optane killer - who needs NVRAM when you can just up the number of machines?

Yeah, I'm sad to see Optane go. I bought a few DCPMMs myself (and had to upgrade my workstation CPUs to support them) to test out the capabilities and maybe even write a little toy OS/KV-store engine that runs as a multithreaded process in Linux, using mmap'ed Optane memory as its persistent datastore (never got to the latter part). The "do not use mmap for databases" argument would not apply here, as you wouldn't be caching persistent block devices in RAM; the "RAM" itself is persistent, so no writeback or block disk I/O is needed.
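
For reference, the plumbing for that kind of toy is small. A sketch, assuming an fsdax namespace mounted with -o dax (the path and size are invented): map the file with MAP_SHARED_VALIDATE | MAP_SYNC, and from then on plain loads and stores hit the Optane media directly.

    /* Sketch only: open a file on a DAX-mounted filesystem and map it so
     * stores go straight to persistent media, with no page cache writeback. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t len = 1UL << 30;                       /* 1 GiB store file */
        int fd = open("/mnt/pmem0/kvstore.img", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* base..base+len is now byte-addressable persistent storage;
         * durability is a cache-line flush + sfence, not fsync/msync. */

        munmap(base, len);
        close(fd);
        return 0;
    }

MAP_SYNC fails at mmap() time if the filesystem can't guarantee synchronous page faults, which is a handy way to verify you're actually on DAX.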

Intel discontinued/deprecated Optane, before I could do anything really cool with it. But Intel can probably still reuse lots of their cache coherency logic for external CXL.mem device access.

One serious early adopter and enterprise user of Optane tech was Oracle. More specifically, Oracle's Exadata clusters (where database compute nodes are disaggregated from storage nodes that contained Optane), connected via InfiniBand or RoCE. And since Optane is memory (memory-addressable), they could skip OS involvement and some of the interrupt handling when doing RDMA ops directly to/from Optane memory located inside different nodes of the Exadata cluster. I think they could do 19-microsecond 8kB block reads from storage cells, and WAL sync/commit times were also measurable in microseconds (if you didn't hit other bottlenecks). They could send concurrent RDMA write ops (to remote Optane memory) for WAL writes to multiple different storage nodes, so you could get very short commit times (of course, when you need sync commits for disaster recovery, you'd have to pay higher latency).

With my Optane kit I tested out Oracle's Optane use in a single-node DB mode for local commit latency (measured in a handful of microseconds, where the actual log file write "I/O" writes were sometimes even sub-microsecond). But again, if you need sync commits across buildings/regions for disaster recovery and have to pay 1+ ms latency anyway, then the local commit latency advantage somewhat diminishes. I have written a couple of articles about this, if anyone cares: [1][2].

Fast forward a few years, Oracle is not selling Exadata clusters with PMEM anymore, but with XMEM, which is just RAM in remote cluster nodes and by using syscall-less RDMA ops, you can access it with even lower latency.

[1] - https://tanelpoder.com/posts/testing-oracles-use-of-optane-p... [2] - https://tanelpoder.com/posts/testing-oracles-use-of-optane-p...


> But Intel can probably still reuse lots of their cache coherency logic for external CXL.mem device access.

Holy cow, you nailed it: Intel can't get Optane fabrication cost down fast enough, and everyone is moving to CXL anyway, which presents its own latency challenges that tend to hide Optane performance advantage.

I wonder if you could leverage Optane's bit-level addressability in a shared memory pool scenario.

Sorry to see this tech go...


Why was everyone so obsessed with the idea of Optane as universal memory?

From a consumer perspective, it was a fast, small SSD that doesn't wear out. That was enough to justify its value at the prices it went for (or at least it seems like it; I never personally used it). It didn't need any radical rethink of computing to be worth it.

I'm not even a fan of the nonvolatile RAM idea. Rebooting is often the first thing we do to fix stuff. Modern computing trends towards regenerating from descriptions, not adding more persistent state. Persistent state can get messed up, better to be able to wipe and start over.


I think the idea is to have large persistent storage with the speeds of RAM. Everything would open instantly.

That would be useful, but it wouldn't really replace RAM, or, if it did, we'd probably still wipe it every boot, to be in a known state.

Outside of some specialized database applications, having an entire program running right in persistent memory seems like a bad idea.

And even then, I'm assuming databases would still want strong separation between persistent stuff and stuff that can be restarted any moment, to minimize the surface area for problems to stay persistent in.


Even if you don't use it for state, there's still a lot of code and read-only data that could benefit from being executed directly out of Optane.

As it's part of the memory domain, not some external block device, the CPU caches can cache it (64B cache line sizes) and maintain cache coherency across CPUs, NUMA nodes (and soon with CXL 3.x ?) across memory pools shared by many servers. So your programming model will be just using CPU loads & stores to access the individual bytes you need directly from the Optane storage, instead of having to do some 512B or 4kB DMA or memcpy to RAM first and then access what you needed. And it's persistent, so you won't need to have a separate layer of code for persisting your "cache" of data... One can ditch the "database block I/O" and "buffer cache" notion and just treat Optane as byte-addressable RAM on a server that will never reboot or crash... Of course in reality your server can always catch fire or have a hardware malfunction, so you'd need to replicate your data to some distant enough backup or DR instance... which (at least in mission-critical database context) negates some of the value of persistent low-latency I/O... If you need to wait for 1+ms for the remote WAL write to be acknowledged from a different AZ or region anyway, then having an 1 us local Optane write is not gonna change the whole commit latency that radically.

There's not much here that requires Optane as the underlying storage tech, though. CPUs can already treat PCIe devices as memory and do the same caching work underneath it. Most modern SSDs are already using PCIe; it's just that the protocol on top of it doesn't support mapping the contents of the disk to memory like this, but there's no real reason it couldn't.

The reason is physics. NAND just isn't fast enough (and requires firmware wear-leveling, which can mess up latency) for this to be practical.

> Rebooting is often the first thing we do to fix stuff.

Depends on the environment. That used to be the common approach for Windows, less so for *nix systems.


Linux is pretty good now about not having the whole system lock up, but restarting programs still seems to be the #1 troubleshooting action.

Yeah. There's a large difference with restarting an individual application and the whole system though. Sometimes several minutes difference. ;)

I've always thought this would be perfect for something like Redis to run on top of.

You wouldn't have to necessarily worry about DB dumps since the backing storage would still be very fast but could survive reboots.


RAM in the 2010s was more expensive - or rather, more capped than it is today. Consumer desktop platforms always had 4 unbuffered DIMM slots, servers had multiples of four RDIMM slots, and typical UDIMMs were up to 8GB per slot, although there were some 16GB modules towards the end of the DDR3 era. SSDs were less advanced but were already exceeding 128GB in size when Optane appeared.

So there was a significant hurdle to going beyond 16GB per machine, and the sales pitch for Optane was a DRAM substitute at near-NAND density and cost. Which, if taken at face value, implied >64GB of RAM without committing to server/workstation platforms. That was appealing, at least to me back then.

Nowadays you can buy 4 sticks of 32GB DDR4 or DDR5 modules for ~$350, which makes a hypothetical 64GB Optane DDR3 module a rather moot point.


Optane suffered the same fate as every other storage technology over the decades with a similar value proposition. Broadly, the notional value of storage technology like Optane is that it offers a drop-in improvement in storage performance without requiring any software modification, albeit at a significant cost premium. The theory that this market exists in any significant way never pans out in practice.

On one hand, you have a large market of people that are not sensitive enough to storage performance for the extra cost to be worth it. Ordinary storage is perfectly adequate for their needs and they would see little benefit in transparently making the storage their software was designed for faster.

On the other side, there are people who care about storage performance a lot. So much so that they will design and modify their software to take full advantage of the storage and system characteristics. They use the storage so well that the gains from putting something in the middle will be marginal or non-existent, and certainly not worth the extra cost; they would gain just as much by adding more storage devices.

The effective target market always ends up being “people who care a lot about storage performance but use their storage in the most naive way possible”, which isn’t that big of a market in practice. Using good software design to achieve similar results on commodity hardware is almost always the better option.


"The theory that this market exists in any significant way never pans out in practice."

The transition from HDD to SSD technology is a counter example to your claim. It was a drop-in replacement (same SATA interface) and the tech significantly improved performance without any other software modification needed.


Didn't SSDs get their start with laptops and mobile devices? There the advantage of being shock resistant is massive and unsolvable with any other solution.

For a superior but new tech to stick, it has to find a viable market to be first self-sustaining, only then can it attack larger and more lucrative markets (Back in those days, PC was probably more lucrative than mobile). Only when SSDs matured enough, did they start eating up the entire storage market.


I wouldn't really say they started with mobile/laptops

They came in 2.5 inch form factor from the very beginning, the laptop shape... which fits pretty much anything. Desktops and servers use it.

The packaging was trivial. They were just cost prohibitive for anything more than an operating system drive.

This worked out to our benefit: the OS is where random-access performance shines.

It might be fair to say flash used in SSDs started in other applications... but it's important to remember that's also different flash.

It wasn't until the flash changed significantly that we got SSDs (durability), changing everything we once knew. Price, applications, etc.


With SSDs you really feel the performance difference (especially latency) in everyday life. Someone who had experienced an SSD never wanted to go back to a HDD for storing anything but data, since program start-up times are just so much faster when there is no mechanical seeking involved.

The difference between Optane and an SSD was never so striking.


At least for consumer devices, I think the 'primordial soup' spark of motivation was the original lean and mean netbooks (with the initial push to web apps over running full clients locally, before they became low-spec Windows laptops), with CompactFlash ATA adapters displacing HDDs. Power consumption was a factor as well, IIRC.

SSDs are orders of magnitude faster than HDD, enough to justify the cost differential and investments to rethink the software stack.

Optane was an incremental improvement in performance when compared to SSDs, and an incremental improvement in cost when compared to RAM. An awkward spot to be in.


>Optane was an incremental improvement in performance when compared to SSDs

Optane PM was byte-addressable and had latency of ~300 ns. It rendered the entire block storage abstraction obsolete. It was so radical that using it effectively would require throwing out the assumptions all OSes have made about IO for the past fifty years.

The fact that it offered an incremental improvement even when its unique capabilities were completely ignored shows how phenomenally capable the technology was.


What about the 1-10 billion dollar company whose business isn't technology - say, some mid-tier insurance company running MySQL with a combination of off-the-shelf specialist software and in-house software? They would benefit from faster storage, and they can't tune their stack like Facebook can.

Obviously it failed as a product, but I am not so sure that the cause is as you lay it out.


>Broadly, the notional value of storage technology like Optane is that it offers a drop-in improvement in storage performance without requiring any software modification, albeit at a significant cost premium.

I don't think so at all. The value of Optane was that it was dramatically different, offering byte-addressable storage with latency of hundreds of nanoseconds. Getting the most out of this requires radically redesigning how modern systems store data, including ditching the idea of block storage.

Optane was the fastest traditional SSD on the market by a pretty wide margin, but that's not all it could have been.

>They use the storage so well that the gains by putting something in the middle will be marginal or non-existent, and certainly not worth the extra cost.

I don't agree with this either. How do you propose to get latency down to match Optane using ordinary SSDs? It doesn't matter how much hardware you throw at the problem; parallelism cannot reduce latency. You need another storage system to hit those targets. Maybe that can be a more traditional DRAM cache with battery backup, but that's only adequate in bursty workloads.

Putting "something in the middle" isn't worth it if the rest of the architecture stays the same. But it could bring about tremendous speedups if the new capabilities were used.


It says here latency for Optane read is about 10 µs.[1].

It says here an SLC drive has the same latency for sequential read (random read is 5x higher though). [2]

[1]https://www.intel.com/content/www/us/en/products/docs/memory...

[2]https://www.solidigm.com/products/data-center/d7/p5810.html


I worked at a telco when Optane was announced and launched; we had an Intel account manager and quarterly business meetings with them.

I lobbied hard for some samples, but they simply would not provide them. If we were representative of how they acted, I think this may be the reason they never got buy-in. If we had had those parts, maybe we would have implemented the POCs we were planning, and those POCs might have made it to production and into the portfolio of the time. Perhaps there would have been a good market ready for 2015.


Yeah, this. Intel was all talk and zero samples, not even a demo. So back then we just shrugged and moved on.

Literally unoptaneium?

Pretty much.

I remember sitting through those pitches and thinking "OK, you don't have any idea how and where to apply this new thing of yours, which is understandable, but then you are unwilling to let anybody have a shot at it. What am I doing here, then?"

IDK what they were thinking, but the general mood was "oh, another Itanic".


2015 was peak arrogance years for Intel. Even lifelong fanboys started to hate them, it was just too much.

In the end Optane was a "cool, nice to know and play around with" product, with a real market way smaller than Intel was prepared to support.


BTW, who else remembers the flap about memristors and HP Labs' "The Machine" at the time? Disk and memory were definitely going to be abolished by 2020!

What if we improve PCIe and NVMe speeds? Can't we then run SSDs in RAID and approach RAM speeds?

Throughput yes, latency no.

Great article that gets to the key points. The "persistent memory bus" problem is a very bad sign by itself: it means no existing software/systems can benefit from the feature without writing some new code across the volatile/persistent boundary. That's too much effort for a "not much faster" device. RIP.

The 1960s memory models of current programming languages were as much a problem as the OSes, since the benefits of persistent programming require slightly different thinking.

This reminds me of running Unix on the Cray X-MP. It worked, but was very slow since calling a function stalled these superpipelined machines designed to run FORTRAN really fast.


“very slow” compared to what?

Interactively, it felt like they were slower than even a sun 3. I know this isn't scientific, but I never metered it.

Seymour Cray's CDC machines were fast computers, but IMHO those Cray Corp machines were essentially high performance vector units that happened also to be able to run a bit of control code. I guess it wasn't sexy enough to build a coprocessor, or maybe by then there was enough choice in mainframes that interconnect (which was far from standardized in those days) would have been a barrier.


We weren’t building them to be interactive dedicated mostly-idle personal computers with snappy response times, and that’s not why they were purchased.

I’m not disagreeing with you! I don’t really understand why people wanted it — it wasn’t what the machine was designed for.

I mean, I do understand (but disagree with) the specific use case at NASA where I encountered it: the CFD would run all night (or longer) and when done the files went to some Irises for visualization, which also took forever. So running the same source code (basically rsync iirc) sounds easier in theory. But at what cost? I think it would have been better to just write the transfer code to run under Cray OS.

And indeed, once unicos was on the machine people did want to run it interactively.


It's not just Optane. There have been other NVRAM technologies that were hyped as "fast as SRAM, persistent as HDDs, cheap and dense as DRAM" over the years such as FeRAM and MRAM. MRAM exists as a commercial product but it still fails to deliver on the cheap/dense part.

FRAM and MRAM have found a place in embedded markets doing weird things - MRAM is a low-cost alternative to space-grade flash right now, and both FRAM and MRAM are used for small, fast persistent "caching" with good wear-out. Optane was the only one of these products that was really targeted at the server space, and had much better density than FRAM, MRAM, PCM, and all the other weird alternatives.

I have two 905p SSDs for ZFS cache devices. A two-way redundant point of failure, but these devices are way more durable than NAND flash storage.

A different form factor are the NVMe sticks. Given the relatively small capacity, the 118GB NVMe SSDs were not very expensive, and make ideal system drives for server applications.

But brute force seems to have defeated Optane: Intel's enterprise flash NAND SSDs are just super over-provisioned, retaining gobs of spare capacity that result in 8TB devices with one complete drive write per day durability, every day for five years guaranteed.


What I'm missing the most is byte-granular reading (in the case of Optane it was 256 bytes, because of checksums, but that's fine-grained enough).

It means it would be possible to read/write data in much finer-grained chunks (potentially saving a lot of storage space in some cases).


It's a real shame we can't force Intel to license it, and instead have to wait probably another decade for the patents to expire.

It's worse than that. As far as I am aware, the real problem with Optane is manufacturing cost, and lack of scaling up. So even if you have the patent rights, you probably still can't make it.

I'd like to offer my experience with the consumer-targeted Optane devices. I had a laptop, replaced about 2 or 3 years ago, that had hybrid 512GB HDD + 32GB Optane storage.

It was a massive headache for me. One fine day my laptop ran into the common Windows issue of 100% disk utilization. I tried all the common fixes to no avail, and at some point I remembered my disk had some funky new tech called Optane. I disabled Optane through its software and was able to directly access the underlying HDD. I checked the fragmentation level for the HDD, lo and behold it was fragmented to oblivion.

Turns out that because Windows treats Optane volumes as SSDs, even though I actually had an underlying HDD, my HDD was simply never defragmented by the OS. After a few rounds of installing and uninstalling large games, the HDD was in an unusable state with regard to fragmentation.

I did a short write-up PSA on r/Windows10, and apparently the issue was widespread enough that my post helped about 10 people in the comments. Thinking back, this whole series of events is partially the reason why I moved from being a non-technical person to a (somewhat) technical one. Good times.


Ugh, a friend bought a similar device due to a misleading description of what the drive was. He'd tried to change the partitioning without knowing it was there; turns out if you had anything but a single Windows partition (with custom drivers injected in the right undocumented way at install time), you ended up with a sans-Optane, uncached 5400 RPM HDD, which even at that time was completely unfeasible as a system disk.

Absolutely horrible product. And not even cheap before having to turf it out in favour of a proper SSD.


I think one of the main problems was that the marketing was especially terrible: they branded both the consumer SSDs with an Optane cache and the "real" Optane DC Persistent Memory, which goes into DIMM slots, as "Optane".

Another issue may be that Optane DC Persistent Memory simply was not fast enough to replace non-persistent RAM.

Still, I hope that another technology will arise that is byte-addressable and persistent/durable. I think it could radically change the design of database systems again. You wouldn't have to have a page cache / buffer manager which retrieves same-sized blocks (or multiples of blocks), for instance. You probably wouldn't even need serialization/deserialization to disk.

It would be great if it were possible to, for instance, read 512-byte blocks from disk with current SSDs, but I guess the per-block metadata overhead might be too big.


What do you think of KV SSDs?

I'm also bummed by this, and I think it's a sad example of how poor measurables can really pervert a market (granted, in combination with proprietariness and network effects). SSDs have long been lumped together and measured almost entirely by a few high-level sticker numbers that companies loved to trumpet because they are high, but that are primarily relevant to specific DC/server/(certain) workstation workloads, not the common consumer case: essentially sequential read, write, and then some random r/w, all performed at high queue depths with a single block size. Typical users aren't exactly running at QD32 all the time. Storage performance is an area where use cases and edge cases matter, and the measures the industry settled on didn't do a good job of conveying that. Only a few places (like Anandtech back in the day) did a solid, regular job of checking 90/95/99/99.9[n] percentile latency, low queue depths, and varied block sizes, and you still had to dig into it.

So as a result, Optane didn't look any different in the marketing or even in typical reviews from competing NAND-based solutions - heck, it would even look slower. And lore developed on the internet, even among tech folks, that it "only made a difference on servers". But I was fortunate enough to grab a few to use for core storage, and wow is it noticeable. I've built lots of big arrays of regular SSDs, and they look great on simple patterns with large blocks; then one gets into regular workloads and they absolutely tank, with occasional noticeable blips when garbage collection or the like kicks in. Better than spinning rust overall, but surprisingly not by much sometimes. Whereas Optane is rock solid consistent no matter what.

Intel and Micron were really dumb in how they tried to use it and push it, but I think it's also too bad it never really got much popular recognition of how it differs from NAND. A lot of the focus is on "closer to RAM", but it was also simply a better SSD.


For dang - this should probably be marked as being from 2022 (source: I wrote it). I still need to change the blog template to put dates on everything.

Ugh. Good riddance. As someone who buys millions in servers from Dell year over year, I cannot begin to describe how hard Optane was pushed and pushed and pushed on us for ~6 years. It was all they wanted to talk about every other month, and we had absolutely _zero_ desire for it.

I was so sick of empty suit sales reps looking at us funny saying "why for you not want optane!?" every other week (read that like an Idiocracy character, please).

I don't give a shit how much Intel is incentivizing you to sell it to me. I. Don't. Want. It.

