>Inferior endurance is the "dirty secret" of the consumer-level SSD industry. You and I know better, but the average consumer doesn't pay attention to write endurance or under-provisioning.
The hope is that capacity increases outrun the loss in endurance by changing the way the SSD is used. If you have a large-capacity SSD, the assumption is that you often just write a large file onto it once and it stays there for a long time, resulting in a net gain in endurance.
If you have the opposite, like an SSD cache integrated into an HDD where the cache is being overwritten all the time, you run into a problem, and your best choice would be a smaller-capacity SLC SSD with more endurance.
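Rough back-of-envelope of that trade-off, with made-up but plausible numbers for P/E cycles and write amplification (not taken from any specific drive):

```python
# Rated endurance (TBW) scales with capacity at a fixed per-cell P/E
# cycle count, so a bigger drive absorbs the same daily write volume
# for much longer. All numbers here are illustrative assumptions.

def lifetime_years(capacity_tb, pe_cycles, daily_writes_gb, write_amp=2.0):
    """Years until the assumed per-cell P/E budget is exhausted."""
    host_tbw = capacity_tb * pe_cycles / write_amp   # usable TB written by the host
    return host_tbw * 1000 / daily_writes_gb / 365

# Same assumed cell endurance (1000 P/E cycles), same 50 GB/day of writes:
print(lifetime_years(0.5, 1000, 50))   # ~13.7 years on a 500 GB drive
print(lifetime_years(4.0, 1000, 50))   # ~110 years on a 4 TB drive
```

The same per-cell wear budget spread over eight times the capacity buys roughly eight times the host writes, as long as the data mostly sits still.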
Often that's not the case, which is why the general consensus for SSDs used as ZFS SLOG devices is to go with ones that advertise end-to-end power-loss protection, unless sufficient testing has been done to guarantee the drive protects against partial writes and doesn't lie about a flush being complete.
Samsung has bigger issues with writes, though: almost all of their current drives are TLC NAND, which has extremely poor write performance. Short bursty writes can be fast (because a chunk of NAND is used as an SLC write cache), but sustained writes are terribly slow. Optane doesn't typically match that peak write performance, but it will beat most (all?) TLC drives on the market in sustained workloads.
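If you want to see that SLC-cache cliff yourself, a crude probe like this is enough (the path and sizes are placeholders, and a real tool like fio controls caching and alignment much better). You'll usually see full speed for the first tens of GB and then a sharp drop once the cache fills:

```python
# Crude sustained-write probe: write a large file in 1 GiB chunks with an
# fsync after each one, and print per-chunk throughput. On TLC/QLC drives
# the early chunks usually land in the SLC cache at full speed, then
# throughput drops sharply once the cache is exhausted.
import os
import time

PATH = "/mnt/testdrive/sustained_write.bin"   # placeholder: a file on the drive under test
CHUNK = 1 << 30                               # 1 GiB per chunk
TOTAL_CHUNKS = 64                             # 64 GiB total, enough to overflow most SLC caches
block = os.urandom(1 << 20) * 1024            # 1 GiB buffer built from pseudo-random data

with open(PATH, "wb") as f:
    for i in range(TOTAL_CHUNKS):
        start = time.monotonic()
        f.write(block)
        f.flush()
        os.fsync(f.fileno())                  # force the chunk out to the device
        elapsed = time.monotonic() - start
        print(f"chunk {i:2d}: {CHUNK / elapsed / 2**20:.0f} MiB/s")

os.remove(PATH)
```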
I think it isn't helped by the fact that SSDs fail in a more quantifiable way: you can write X bytes before the drive should fail, so using one feels like a countdown to failure, even though you could probably never write that much in your lifetime under normal loads.
The most important part, which I missed at first because I rushed to the graphs:
>> We started using SSDs as boot drives beginning in Q4 of 2018. Since that time, all new storage servers and any with failed HDD boot drives have had SSDs installed. Boot drives in our environment do much more than boot the storage servers. Each day they also read, write, and delete log files and temporary files produced by the storage server itself
IMO that's a very light load, which usually doesn't wear the drive much. [0]
It would be more interesting if they provided wear (as reported by the drive) and TBW values.
[0] Anecdata: a pair of Samsung 860 Pro 512 GB SSDs, apparently running LUKS, still reported 45% endurance remaining after 4 years (of unknown load, though).
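For what it's worth, the drive-reported wear and total-writes figures are sitting in SMART; something like this pulls them out of smartctl's JSON output (smartmontools 7+, run as root). Wear_Leveling_Count and Total_LBAs_Written are what Samsung SATA drives typically expose; names and IDs vary by vendor, so treat it as a starting point rather than a universal tool:

```python
# Read drive-reported wear and total host writes from SMART via smartctl's
# JSON output (needs smartmontools >= 7 and root). Attribute names below
# are typical for Samsung SATA SSDs and vary by vendor.
import json
import subprocess

DEVICE = "/dev/sda"   # placeholder device

raw = subprocess.run(
    ["smartctl", "--json", "-A", DEVICE],
    capture_output=True, text=True,
).stdout
data = json.loads(raw)

for attr in data.get("ata_smart_attributes", {}).get("table", []):
    if attr["name"] in ("Wear_Leveling_Count", "Total_LBAs_Written"):
        print(f'{attr["name"]}: normalized={attr["value"]}, raw={attr["raw"]["value"]}')

# Total_LBAs_Written * 512 bytes gives a rough host-TBW figure to compare
# against the drive's rated endurance.
```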
>I think if you're doing LOTS of writes to something you don't care a lot about - a ramdisk might be a nice way to prevent you from killing your SSD.
>Say, most things you put in /var
Even a bottom-of-the-barrel QLC SSD can handle 100 GB of writes per day without problems, so it's unlikely that moving the stuff in /var (random log/temporary files) off the SSD is going to change its lifespan by much.
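Rough math, assuming a rating in the ballpark of what 1 TB QLC drives actually advertise (a couple hundred TBW):

```python
# Even a modest QLC endurance rating outlasts 100 GB/day of /var traffic.
rated_tbw = 200          # assumed rating, roughly typical for a 1 TB QLC drive
daily_writes_gb = 100

days = rated_tbw * 1000 / daily_writes_gb
print(f"{days / 365:.1f} years to exhaust the rating")   # ~5.5 years
```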
This deals with performance, but not durability. The system would pump data into swap at a fast rate during a thrashing situation, wearing out the SSD.
> But SSDs can write so fast that you can burn out the drive in months (maybe even weeks?)
You can burn out a modern consumer drive in 2 days if you want to: write performance of ~6 GB/s against a rated endurance of ~700 TB written (TBW) on a 1 TB drive. TLC/QLC cells have very poor endurance IMO.
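The arithmetic behind the "2 days" figure, taking the quoted peak speed at face value (sustained speeds drop well below it once the SLC cache fills, so in practice it would take longer):

```python
# How long continuous writing at peak speed takes to hit the endurance rating.
rated_tbw = 700        # TB written, the rating quoted above
write_speed_gbs = 6    # GB/s peak sequential write, quoted above

seconds = rated_tbw * 1000 / write_speed_gbs
print(f"{seconds / 3600:.0f} hours")   # ~32 hours, i.e. well under 2 days
```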
That hasn't been our experience. While we've optimized our file system to minimize wear, we do an extremely high volume of reads and writes on our SSDs. We have many SSDs (previous generations) that have been running full steam for 3 years in production. We've been pleasantly surprised with the number of write cycles they can endure without failure.
Didn't someone say something about this putting more burden on the SSD, possibly causing it to age more quickly because of the additional writes required to coordinate between the two?