I'm thinking the Gamers Nexus approach might work well here: they buy failed hardware, do an autopsy on it, and then publish the results on YouTube, as they recently did for the ASUS high-end motherboards that cook AMD chips.
It gives the media companies access to the failed hardware to do their own autopsy, and it saves users from a painful RMA process, complicated by companies not willing to admit fault.
Yep, ASUS's main issue was the miscommunication: the text of "update the BIOS to not hurt your chip, but btw if you do this you lose your warranty" was the icing on the cake for bad PR. But apparently it's more of a 7000 X3D series chip issue.
According to Gamers Nexus, their main issue was a shitty BIOS, which was a different problem from the X3D issues. Their horrible response was just another failure on their part.
I feel like the backlash over this was a bit too aggressive. I think they should be allowed to release a beta BIOS without warranty support. I get that it's very enticing if the old BIOS may fry your system, but technically it seems better in every way:
- Gets it out early for third-party verification such as these YouTube channels that are doing testing.
- Keeps it transparent that they are still working on the issue.
- Allows them to experiment with more exotic BIOS (in the general case).
Maybe what is missing is some sort of message like "This is an early preview; we are still doing internal verification before we can provide a stable release with a full warranty"?
Also, IIRC the phrasing was "installing this will void your warranty", which is not true. It would have been better if they had been clear that "damage done to your hardware by a non-stable BIOS is not covered by warranty".
The "high end" PC parts market comprises such a horrendous pit of garbage.
The only way to know if anything even works to begin with is to read all the (poorly written) manuals front to back while taking notes, then procure the rest of the parts and rigorously test them yourself within their 30-day return windows. And even then you're virtually guaranteed to miss some glaring issue.
Just last week, an obscure forum post from someone who already went through the tech support/RMA gamut saved me from wasting a month + $5K on a build with a motherboard that doesn't support sleep mode, which the manufacturer ASRock doesn't mention anywhere.
Yeah, sleep is a mess with new hardware (and MS does not help either, with the new OS sleep). I have a "creator"-targeted motherboard from Gigabyte and it does sleep, but if it wakes up immediately after I put it to sleep (because of a mouse move), it goes through a series of 7-8 BIOS initializations/restarts and resets the full BIOS in the process.
Consider the marketing and support (likely none) of retail parts like these: manufacturers don't have enterprise customers who will scream at them if some or many of the parts fail, so they can play fast and loose, pushing the boundaries to grab market share. At worst, big-name vloggers and tech reporters may complain.
OTOH, enterprise parts are built and supported with conservatism and reliability in mind.
There is crossover and a spectrum between the two, but this case isn't a complete surprise.
I've had equally bad luck with SanDisk's MicroSD cards. Samsung cards rarely ever go out to lunch in my Raspberry Pi systems, but I've never had a SanDisk card last more than six months.
As (sometimes six-year-old) Samsung cards get retired, I've gone to... Samsung. This time, I'm getting their cards intended for surveillance cameras and other write-intensive duty.
I bought 3 Samsung EVO 256GB cards last year for a dash cam; all of them failed at around 36 days, past the return window, so my only recourse was a warranty claim with Samsung. Samsung's customer service for storage products is so bad that there isn't even a website to submit a warranty claim.
I switched to SanDisk and it hasn't failed after a year; I'll probably never buy Samsung cards again.
> cards that were packaged with official Pi 400 kits were counterfeit
Why not? The company needed SD cards, so they bought them and packaged them with their kits. They could be from literally any source, even with an official SanDisk partnership.
Strange, I've had the exact opposite experience. Every single one of my Samsung cards has failed, but not a single SanDisk. As my Samsungs failed, I replaced them with SanDisks, which have all been going for quite a number of years now.
I had a phantom issue of my PC not booting (black lit screen) unless it had been cold for about 30 minutes, which followed me across 2 different builds. I eventually - after going through every other component - figured out it was a bad SanDisk SSD that wasn't even a boot drive, simply not responding to ATA commands, and the MSI BIOS had no timeout during early boot disk enumeration. It was extra weird because that drive worked fine if it initialized correctly.
I have not bought a WD or SanDisk drive since. I'm still very pissed that I spent days debugging this issue, decided I needed to scrap the entire machine, and then still had the issue. Who thinks of a bad drive as the reason you can't even boot into the BIOS?!
Jeesh, I bought a Crucial drive last year and then the firmware bug thing happened, so instead of buying another Crucial drive, I bought a SanDisk (Ultra SATA) drive, and now this happens. None of the first-party NAND makers seem to be immune to these NAND-killing firmware bugs, not even Samsung.
Actually, I take that back. Maybe Solidigm/SK Hynix?
Pretty much any brand has had some issues here and there.
If you actually want a durable SSD, your best bet is probably to buy "new old stock" of a model that is 1-3 years old and has proven to be reliable.
Of course, it's not always easy to gauge the reliability of a product based on forum posts. A popular product that is 99.9% reliable will... still have thousands of unhappy people crying on forums if they sell millions of units.
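Back-of-the-envelope, with made-up but plausible numbers:

```python
# Hypothetical figures: 2 million units sold, 99.9% of them working fine
units_sold = 2_000_000
failure_rate = 0.001  # the remaining 0.1%

print(f"{units_sold * failure_rate:,.0f} unhappy owners posting on forums")  # 2,000
```

Even a genuinely good product generates a steady stream of horror stories at that volume.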
In the end, you should probably just have some kind of daily (or better) backup system so that it's not a huge deal if your drive kicks the bucket. That probably makes more sense than obsessing over reliability in a world in which we as consumers don't have much insight into actual failure rates.
Wow... huh? I don't know about that. Been into personal computers since the 80s and it seems like storage has always failed orders of magnitude more often than anything else.
Just my experience, but I have a 1TB SanDisk Extreme V2 which was made several years ago. It routinely gets so hot that it nearly burns my skin to the touch, and when it does get that hot, it randomly unmounts from my computer. Last week, my SuperDuper backup to the drive failed due to underlying corruption, which strangely passed Disk Utility's health checks. I'm so done with this product.
My bad - I should've looked up the model rather than assume you were talking about a naked NVMe drive - but to put my comment another way, I think we should all be treating modern SSDs as devices that need proper cooling. So next time you buy a portable drive, consider either putting one together yourself (buy a metal enclosure, with cooling fins if possible, and put an SSD in it) or finding a pre-packaged drive with a metal case.
Indeed, anything that has a "gamer" vibe to it has become suspect for me. I'm not interested in eking out that last smidgen of speed if it comes at the cost of reliability. With Asus, I'd go for the corporate stable motherboards rather than the gamer models.
This is the result of an industry optimising for profit and not longevity. That's why SLC NAND has become almost extinct and priced beyond reason. I don't care how fast or large a storage device is if it isn't reliable.
SLC requires very little in the way of firmware, since the endurance is so high and the BER so low that simple ECC and wear leveling are sufficient. TLC offers only 3x the capacity but theoretically lasts 1/8th as long as SLC NAND manufactured on an otherwise identical process; it also requires much more complex (hence more bug-prone) ECC and wear-leveling algorithms, which affects speed and power consumption. QLC is 4x the capacity for 1/16th the endurance.
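As a rough sketch of that tradeoff (the endurance ratios here are just the ballpark figures quoted above, not datasheet numbers):

```python
# Bits per cell determines capacity; voltage levels scale as 2**bits;
# the relative endurance column is a rough assumption (SLC = 1.0 baseline).
cell_types = {
    "SLC": (1, 1.0),
    "TLC": (3, 1 / 8),
    "QLC": (4, 1 / 16),
}

for name, (bits, endurance) in cell_types.items():
    levels = 2 ** bits
    print(f"{name}: {bits}x capacity, {levels} voltage levels, ~{endurance:.3g}x endurance")
```

More levels means tighter voltage margins per cell, which is where the extra ECC and firmware complexity comes from.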
Industry marketing (and accompanying irrational pricing) has basically persuaded consumers to choose an inferior product.
Why not? There's certainly a group of consumers who will pay more for higher quality. I'm one of them. 2-3x current SSD costs for a better quality SSD would be completely fine with me. You're not paying more than you have to for the same product, you're purchasing a different product.
Intel tried that with Optane with disastrous results (from a financial perspective). SLC doesn't require much separate R&D and manufacturing infrastructure beyond what already exists to serve the markets for TLC and QLC. But that lower barrier to entry still hasn't led to many attempts to serve this niche. Apparently the people with real sales volume data are convinced there's less of a market for expensive and small consumer SLC SSDs than there is for consumer 8TB TLC or QLC SSDs that cost as much as a decent laptop.
> Apparently the people with real sales volume data are convinced there's less of a market for expensive and small consumer SLC SSDs than there is for consumer 8TB TLC or QLC SSDs that cost as much as a decent laptop.
They realised it's easier to keep making a profit when drives keep "wearing out" (i.e. failing to be a data storage device) on a consistent and short(ening) schedule. Just like SLC, Optane was too good.
"small" is relative. 8TB of QLC is 2TB of SLC. They will both cost the same (if anything, the SLC might even be cheaper from a firmware/controller development perspective) yet the former might last a few years, and the latter several decades.
A 2TB SLC drive is going to fill up before an 8TB QLC drive wears out, so I don't buy the planned obsolescence argument. And in reality, the kind of consumer who would spend $1k on an SSD is going to move on from it within two or three years anyways in favor of a newer drive with a faster interface.
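Quick sanity check on the fill-up-first point, with an assumed (not datasheet) cycle count for the QLC:

```python
# Assumed QLC program/erase endurance; real parts vary widely
qlc_capacity_tb = 8
qlc_pe_cycles = 500

slc_capacity_tb = 2

writes_before_wearout_tb = qlc_capacity_tb * qlc_pe_cycles
print(f"QLC drive absorbs roughly {writes_before_wearout_tb:,} TB of writes before wearing out")
print(f"SLC drive is simply full after storing {slc_capacity_tb} TB")
```

Unless your workload rewrites the whole drive constantly, capacity runs out long before endurance does.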
It will fill up, and more importantly, the data will stay intact. The endurance and retention of SLC is high enough that you can trust it for more than a few years.
> And in reality, the kind of consumer who would spend $1k on an SSD is going to move on from it within two or three years anyways in favor of a newer drive with a faster interface.
...or expect that it will last much longer than a cheaper one.
Physically this should be possible but I've never seen an SSD that allows it. I think very very few customers would use it, but I'm still a little surprised no vendor offers it.
I suspect it's heavily firmware-dependent, but I do wonder if taking a TLC drive and partitioning it to 1/3 of its advertised capacity would keep all the blocks in SLC mode. If the firmware is behaving reasonably, it should try to use SLC as much as possible just for the speed benefit that brings, and only start converting blocks to MLC/TLC once all the blocks have already been used in SLC mode.
Coincidentally I did some research on this topic the other day and I couldn't find any SSD model with unlimited SLC cache. They all have fairly low limits like 10% of overall capacity.
This is correct. "TLC" is a misnomer. 2^3 is 8, and TLC stores 3 bits per NAND cell, or 8 voltage levels. I suppose when they came up with "MLC" for 2-bit cells (4 voltage levels), which is now also a misnomer as all multi-bit cells are technically MLC, they did not expect to put more than 2 bits in one cell.
The downvotes make me wonder how much useful discussion on HN is actually lost because of ignorant people who prevent valid/factual information from surfacing.
Even if you get this data, it is not uncommon for SSD manufacturers to change the original NAND chips for inferior chips without changing the SSD’s brand name or model number.
Yes, but you're not going to like the answer. Despite the many industry professionals on this site, there are a large number who, by most definitions, have never had a real job. Nothing wrong with that, but they're real loud, and they like to put down proven reliable solutions because they cost too much, and then slap on random terms like "zfs" that magically fix all problems.
The short answer is: "look at what the enterprise storage vendors are putting in their arrays, at 10x markup." (No, shiny things like Pure/Rubrik/Cohesity are not enterprise storage.)
It all depends on what you're putting on things. If you buy 5 drives from Newegg for your house and double-parity them or do ZFS checksums, etc., you're going to have a bad time when a bunch fail at around the same time because of an issue with the drive model itself. Yet you do kinda want all the same/alike drives, because the stripe is only as fast as the slowest drive.
So look at what all the vendors picked after they tested the crap out of thousands of them. Me, I personally just mirror everything between two machines with different-brand drives, and hope they won't fail at the same time. Once a year I dump an image of everything onto an offline big-ass drive - the cheapest big spinning rust that I can buy - and call that my "airgapped vault."
They use them as boot/system drives without a big load. And when you have more than 500 drives, some will die just by pure luck (or lack of it): mishandling, static charge, etc.
The sellers are but one part of the larger market system.
If SLC NAND went extinct, that's because both the sellers and the buyers (read: customers, aka end users) didn't see value in reliability as much as other factors like storage density and price-per-bit.
You, as someone who does want reliability above all else, are an outlier.
I think this is somewhat misdirected, because most end users don't know to think about reliability. When the storage fails, "the computer broke": they take it in somewhere and the tech gives them back a fixed system with the data gone. If the CPU had burned out instead, they would be just as accepting of the data being gone in that case too, with a "sorry, couldn't save it".
The marketing might include an x-million r/w cycles figure, but it's going to be way underemphasized compared to the speed.
It's more likely because the buyers have been persuaded by the marketing and attempts at deception. When 10k-cycle MLC (2-bit cells) came out, offering only twice the capacity of SLC for 1/10th the endurance of the 100k-cycle SLC that was the norm at the time, they already had to keep SLC prices artificially high (>2x) to try to force people to MLC, and I remember the beginning of the efforts to hide the poor endurance. Old NAND datasheets proudly proclaimed their 100k or even 10k cycle endurance. Now it's basically impossible to find a TLC or QLC NAND datasheet that isn't behind an NDA, and even the rare few that get leaked are extremely vague about endurance. Some parts will let you choose between SLC/MLC/TLC mode for each block, and some SSDs use this for some stupid "cache" feature, but the behaviour of that is not easily configurable, at least not without hacking the firmware.
The article doesn't mention "pseudo", so I guess you're implying that these are just their existing flash that's capable of TLC/QLC, used permanently in SLC mode? 60DWPD for 5 years is basically 100K endurance, the same as true SLC.
Either way, that's great news - and the ~$0.32/GB they mention (only $600 for 1.92TB!?) for Micron SLC is absolutely amazing value, if you consider that this other article I submitted not long ago mentions ultra-cheap TLC SSDs with NAND from an unknown manufacturer costing $0.10/GB (I even have a comment there lamenting the lack of logically-priced $0.30/GB SLC SSDs!): https://news.ycombinator.com/item?id=35382252
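The endurance figure roughly checks out, and so does the quoted price:

```python
# 60 drive-writes-per-day sustained for 5 years, as full-drive write cycles
dwpd, years = 60, 5
print(f"~{dwpd * 365 * years:,} cycles")     # 109,500, i.e. roughly 100K

# Sanity check on the quoted Micron SLC pricing
price_usd, capacity_gb = 600, 1920
print(f"${price_usd / capacity_gb:.2f}/GB")  # ~0.31, close to the ~$0.32/GB figure
```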
There are SSDs on the market that use all of their TLC flash as SLC cache, so you can almost use them as SLC drives if you partition them to leave 2/3 empty.
E.g. the ADATA XPG SX8200. Look for whole-drive fill speed benchmarks: if they use the whole drive as cache, the first third is fast (usually the SLC area is much smaller).
Random anecdote but I usually run Samsung (8X0 Pro) or Western Digital (Blue/Black) when I need cheap consumer NVMe drives. Otherwise I run used Intel enterprise SSDs with lots of life. Any component can fail but I have had good luck so far with these (fingers crossed). Of course, take frequent backups of any important storage.
I've had issues with WD Blue recently. macOS intermittently wouldn't boot on a 3-month-old drive. I wonder how much is shared between this product and the SanDisk one, given that WD now owns SanDisk?
If anyone wants a recommendation for an alternative, I've had a good experience with the Crucial X8 4TB external SSD. Not quite as fast as the SanDisk Extreme Pro but pretty decent. Wouldn't trust it 100% though. (Also after writing ~300GB continuously the write speed falls from ~900MB/s to 90MB/s).
this is more common than you may think at all levels of the industry. some failures get more publicity than others. all the hand wringing over vendor silicon reliability data and TRIM and RAID topology and i/o pattern optimization doesn't matter one bit when the firmware just decides to delete itself along with all the data for no reason at all.
this one was a real pain in the ass to deal with. always make testable backups, people. backups are not the same as redundancy. a 1, 2, 3, even 7 or 14 day recovery point is far better than poof it's gone.
On a related note, I had an OCZ SSD back in the day which bricked itself all of a sudden. I was running nightly full-disk cloning of my disk via TrueImage, in addition to "realtime" backup of my user directory via Crashplan (back when they were good), so I was back up and running in less than 30 minutes with no important data loss.
I know there are tools like Restic that can do what Crashplan did, but what's the TrueImage equivalent for Linux? I.e., something I can use to clone my primary disk nightly in the background, including the boot partition, incrementally with periodic full clones, and that supports resizing partitions (both up and down) in case the replacement disk isn't of equal size?
I know of Clonezilla, but from their front page it can't do incrementals, which is a showstopper. With TrueImage each incremental image takes only about 2-3GB per day on average; a full one is 500GB. It also seems to only support resizing partitions up, which isn't great, as it means I can't easily use old disks as emergency restore targets like I did when my OCZ died.
I know ZFS root + send/receive is an option but as much as I like ZFS, I'm not comfortable running it on root yet.
I had an OCZ Vertex brick itself like that as well. Fortunately, it was a ZFS cache drive, so it was just a performance hit, but I never bought an OCZ product again.