Data Storage Linux

TRIM and Linux: Tread Cautiously, and Keep Backups Handy 182

An anonymous reader writes: Algolia is a buzzword-compliant ("Hosted Search API that delivers instant and relevant results") start-up that uses a lot of open-source software (including various strains of Linux) and a lot of solid-state disk, and as such sometimes runs into problems with each of these. Their blog this week features a fascinating look at troubles that they faced with ext4 filesystems mysteriously flipping to read-only mode: not such a good thing for machines processing a search index, not just dishing it out. "The NGINX daemon serving all the HTTP(S) communication of our API was up and ready to serve the search queries but the indexing process crashed. Since the indexing process is guarded by supervise, crashing in a loop would have been understandable but a complete crash was not. As it turned out the filesystem was in a read-only mode. All right, let's assume it was a cosmic ray :) The filesystem got fixed, files were restored from another healthy server and everything looked fine again. The next day another server ended with filesystem in read-only, two hours after another one and then next hour another one. Something was going on. After restoring the filesystem and the files, it was time for serious analysis since this was not a one time thing."

The rest of the story explains how they isolated the problem and worked around it; it turns out that the culprit was TRIM, or rather TRIM's interaction with certain SSDs: "The system was issuing a TRIM to erase empty blocks, the command got misinterpreted by the drive and the controller erased blocks it was not supposed to. Therefore our files ended-up with 512 bytes of zeroes, files smaller than 512 bytes were completely zeroed. When we were lucky enough, the misbehaving TRIM hit the super-block of the filesystem and caused a corruption."
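
The symptom quoted above (runs of exactly 512 zeroed bytes inside otherwise healthy files) is something one can scan for. What follows is only a minimal sketch, not the tool Algolia used: the names `zeroed_sectors` and `demo` are made up for illustration, and the 512-byte sector size is taken from the quote. Sparse files and files that legitimately contain zero padding will show up as false positives.

```python
import os
import tempfile

SECTOR = 512  # block size mentioned in the quoted post


def zeroed_sectors(path, sector=SECTOR):
    """Return byte offsets of sector-sized chunks that consist only of NUL bytes."""
    offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(sector)
            if not chunk:
                break
            # A chunk counts as zeroed when every byte in it is 0
            # (this includes a short final chunk).
            if chunk.count(0) == len(chunk):
                offsets.append(offset)
            offset += len(chunk)
    return offsets


def demo():
    """Hypothetical demo: write a file with one zeroed sector and scan it."""
    data = b"A" * SECTOR + b"\x00" * SECTOR + b"B" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        name = tmp.name
    try:
        return zeroed_sectors(name)
    finally:
        os.unlink(name)
```

A hit only says where to look. Comparing against a known-good copy on another server is still needed to tell corruption from legitimate zeroes.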

Since SSDs are becoming the norm outside the data center as well as within, some of the problems that their analysis exposed for one company probably would be good to test for elsewhere. One upshot: "As a result, we informed our server provider about the affected SSDs and they informed the manufacturer. Our new deployments were switched to different SSD drives and we don't recommend anyone to use any SSD that is anyhow mentioned in a bad way by the Linux kernel."
This discussion has been archived. No new comments can be posted.

  • by Rinikusu ( 28164 ) on Tuesday June 16, 2015 @04:48PM (#49924977)

    I suggest we call it SNATCH.

  • by msobkow ( 48369 ) on Tuesday June 16, 2015 @04:48PM (#49924987) Homepage Journal

    I'll Google in a moment, but I was wondering if anyone knew of any good sites that maintain lists of good/bad SSDs for Linux. With the number of vendors out there nowadays, having to scan the source seems like a poor way to track the information.

    • by Ken_g6 ( 775014 ) on Tuesday June 16, 2015 @05:21PM (#49925201)

      It takes a couple of links and searching through source code [github.com] to get there. So here's the list of problematic drives, better formatted but still in regular expression format:

      /* devices that don't properly handle queued TRIM commands */
      Micron_M500*
      Crucial_CT*M500*
      Micron_M5[15]0*
      Crucial_CT*M550*
      Crucial_CT*MX100*
      Samsung SSD 8*

      So, basically, all the ones I thought were the best. The list of whitelisted drives after it only includes those brands, Intel, and ST-something. So other brands may be unknowns.

      • by Anonymous Coward on Tuesday June 16, 2015 @05:54PM (#49925383)

        The Crucial MX100 with the latest MU02 firmware is now whitelisted by the Linux Kernel, and has its TRIM ability re-enabled.

      • by idontgno ( 624372 ) on Tuesday June 16, 2015 @06:11PM (#49925465) Journal

        ObPedant: those aren't regexes, they're globs. Otherwise (for instance), the Samsung entry would match

        Samsung SSD<space>
        Samsung SSD<space>8
        Samsung SSD<space>88
        Samsung SSD<space>888
        .
        .
        .

        ad nauseam: the "*" regex operator means "zero or more occurrences of the previous pattern", which in this case is the character "8".

        At least, I hope they're not supposed to be regexes. Otherwise, the kernel blacklist itself will have some serious issues matching known-bad SSDs because someone never learned how to create a regular expression.
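
The glob-versus-regex point can be checked directly. This is a rough sketch in which Python's `fnmatch` stands in for whatever glob matcher the kernel actually uses (an assumption), and the model strings in the usage note are invented examples rather than real drive identifiers. `re.fullmatch` is used so that the regex reading reproduces the "Samsung SSD 8, 88, 888" set listed above.

```python
import fnmatch
import re

# Patterns copied from the list quoted earlier in the thread.
PATTERNS = [
    "Micron_M500*",
    "Crucial_CT*M500*",
    "Micron_M5[15]0*",
    "Crucial_CT*M550*",
    "Crucial_CT*MX100*",
    "Samsung SSD 8*",
]


def glob_matches(model):
    """Return the patterns that match `model` when read as shell-style globs."""
    return [p for p in PATTERNS if fnmatch.fnmatchcase(model, p)]


def regex_fullmatch(model, pattern):
    """Return True if `pattern`, read as a regular expression, matches all of `model`."""
    return re.fullmatch(pattern, model) is not None
```

Read as a glob, `"Samsung SSD 8*"` matches a hypothetical model string such as `"Samsung SSD 850 PRO 256GB"`. Read as a regex with `regex_fullmatch`, it matches `"Samsung SSD 888"` but not the 850 string, because `8*` only means "zero or more 8s".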

      • by Anonymous Coward

        You will only find SSDs from the very best vendors there... because the crap ones don't claim to support queued TRIM in the first place.

        It is interesting that the Micron M500, *which is an enterprise datacenter SSD*, is listed. Rather bad PR for Micron, that: an enterprise datacenter SSD that corrupts data and has not been fixed?!

        As usual, good PR for Intel... too bad their SSDs self-destruct based on a timer, instead of trying to soldier on until things actually get really broken (and only *then* self-des

    • Comment removed (Score:5, Interesting)

      by account_deleted ( 4530225 ) on Tuesday June 16, 2015 @06:18PM (#49925497)
      Comment removed based on user account deletion
      • by PlusFiveTroll ( 754249 ) on Tuesday June 16, 2015 @09:38PM (#49926489) Homepage

        Because Windows doesn't do queued TRIM.

        TRIM in Windows and Linux before now worked more like this. -DATA- -DATA- -FLUSH ALL COMMANDS TO DRIVE- -WAIT- -TRIM- -DATA- -DATA- When a drive was doing the trim thing it could not do anything else, there could be no other in flight commands to the drive.

        This is different. -DATA- -DATA- -TRIM- -DATA- -TRIM- -DATA- -DATA- -DATA-

        TRIM is part of the NCQ and is an operation occurring with other instructions in the SATA queue. Problem is some disk manufacturers have pissed this up. It seems likely that a firmware update will be able to fix this issue.

        https://en.wikipedia.org/wiki/... [wikipedia.org]
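
The two command streams described above can be modelled with a toy function. This is purely illustrative: `issue_order` and its string tokens are invented here, and real ATA command handling is far more involved. It only shows where the drain barrier appears when TRIM cannot sit in the queue.

```python
def issue_order(commands, queued_trim):
    """Return the order in which commands reach the drive in this toy model.

    commands    -- list of "DATA" / "TRIM" tokens in submission order
    queued_trim -- True if TRIM may sit in the queue next to other commands
    """
    out = []
    in_flight = 0  # commands issued since the queue was last drained
    for cmd in commands:
        if cmd == "TRIM" and not queued_trim:
            # Non-queued TRIM: drain everything in flight first,
            # then run the TRIM with nothing else outstanding.
            if in_flight:
                out += ["FLUSH", "WAIT"]
                in_flight = 0
            out.append("TRIM")
        else:
            out.append(cmd)
            in_flight += 1
    return out
```

With `queued_trim=False`, the stream `DATA DATA TRIM DATA DATA` comes out with `FLUSH` and `WAIT` inserted before the `TRIM`, as in the first diagram. With `queued_trim=True` it passes through unchanged, as in the second.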

      • Yep freebsd is fine here with 840 pros.

        I find it hysterical that, on this site so heavily biased towards Linux, the attitude is that it must be the drives.

  • Apple (Score:2, Insightful)

    by Anonymous Coward

    This is why Apple doesn't support TRIM in third-party SSDs...

    • by Lumpy ( 12016 )

      And it's trivial to get around.

      • Not really trivial as of Yosemite. You have to disable kernel extension signing. Luckily, there appears to be a command line tool for force enabling TRIM in 10.11.

        • by Lumpy ( 12016 )

          Still trivial even for computer n00bs.

          Download and install Trim Enabler from the app store.

  • by m.dillon ( 147925 ) on Tuesday June 16, 2015 @04:53PM (#49925023) Homepage

    The only TRIM use I recommend is running it on an entire partition, e.g. like the swap partition, at boot, or before initializing a new filesystem. And that's it. It's an EXTREMELY dangerous command which results in non-deterministic operation. Not only do SSDs have bugs in handling TRIM, but filesystem implementations almost certainly also have ordering and concurrency bugs in handling TRIM. It's the least well-tested part of the firmware and the least well-tested part of the filesystem implementation. And due to cache effects, it's almost impossible to test it in a deterministic manner.

    You can get close to the same performance and life out of your SSD without using TRIM by doing two simple things. First, use a filesystem with at least a 4KB block size so the SSD doesn't have to write-combine stuff on 512-byte boundaries. Second, simply leave a part of the SSD unused. 5% is plenty. In fact, if you have swap space configured on your SSD, that's usually enough on its own (since swap is not usually filled up during normal operation), as long as you TRIM it on boot.

    -Matt
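
The arithmetic behind the "leave 5% unused, keep 4KB alignment" advice is simple enough to sketch. The helper name `usable_bytes` and its parameters are hypothetical. Integer math is used so the result does not depend on floating-point rounding.

```python
ALIGN = 4096  # 4 KB block size suggested in the comment above


def usable_bytes(capacity_bytes, reserve_percent=5, align=ALIGN):
    """Return how many bytes to partition, leaving `reserve_percent` unused.

    The result is rounded down to a multiple of `align`, so the partitioned
    area ends on a 4 KB boundary.
    """
    reserved = capacity_bytes * reserve_percent // 100
    usable = capacity_bytes - reserved
    return usable - (usable % align)
```

For a nominal 1,000,000,000-byte device this leaves 50,000,000 bytes unpartitioned before the alignment round-down. The unused area only helps if it has never been written, or was trimmed once beforehand.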

    • by jez9999 ( 618189 )

      Isn't TRIM support disabled by default in Linux? They must have set the "discard" mount option.
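
One way to check whether any filesystem is mounted with the `discard` option is to read `/proc/mounts`. The parser below is a small sketch under that assumption. `mounts_with_discard` and the `SAMPLE` text are illustrative, and the sample device names are made up.

```python
def mounts_with_discard(mounts_text):
    """Return (device, mountpoint, fstype) for mounts using a discard option.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        for opt in options.split(","):
            # Accept plain "discard" as well as "discard=<mode>" variants.
            if opt == "discard" or opt.startswith("discard="):
                hits.append((device, mountpoint, fstype))
                break
    return hits


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime,discard 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
"""
```

On a live Linux system the same function can be fed `open("/proc/mounts").read()`. An empty result means no mount is doing online discard, though a periodic `fstrim` job could still be issuing TRIM.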

  • Name and shame (Score:2, Informative)

    by Anonymous Coward

    see ata_blacklist_entry [github.com]

    (reformatted to get past Slashdot's 'junk' filter)

    static const struct ata_blacklist_entry ata_device_blacklist [] =
    /* Devices with DMA related problems under Linux */
    WDC AC11000H, NULL, ATA_HORKAGE_NODMA ,
    WDC AC22100H, NULL, ATA_HORKAGE_NODMA ,
    WDC AC32500H, NULL, ATA_HORKAGE_NODMA ,
    WDC AC33100H, NULL, ATA_HORKAGE_NODMA ,
    WDC AC31600H, NULL, ATA_HORKAGE_NOD

    • Re:Name and shame (Score:5, Interesting)

      by Rockoon ( 1252108 ) on Tuesday June 16, 2015 @05:43PM (#49925329)
      The way I am reading the comments, the issue is that the buggy SSDs are flagging physical blocks as RZAT or DRAT when a trim request on a logical block is ignored. The bug presents itself later if the SSD performs wear leveling that swaps out the logical block with another, the bug being that the physical block is left tagged RZAT or DRAT.
  • TLDR (Score:2, Insightful)

    by Anonymous Coward

    Don't buy Samsung SSDs.

    • by Yunzil ( 181064 )

      Yeah, don't buy arguably the best SSDs on the market because your OS can't be bothered to work around their foibles.

  • Algolia is a buzzword-compliant ("Hosted Search API that delivers instant and relevant results") start-up

    It sounds like a kind of infection. The kind you get, you know, down there

  • Or if you're going to use consumer ones vet the hell out of them.

    • I suspect that what we see here is a problem that is very common in the consumer hardware industry: manufacturers don't bother testing under any OS other than Windows, which means bugs that do not manifest under Windows go undetected. It's a problem most often seen in ACPI interfaces, where Windows has a very loose interpretation of the standards. So long as it runs fine on Windows, it's considered good enough to ship.

  • by LDAPMAN ( 930041 ) on Tuesday June 16, 2015 @05:06PM (#49925119)

    I wonder if this issue has anything to do with why Apple only supports TRIM on specific drives they OEM?

    • Re: (Score:2, Insightful)

      by Anonymous Coward
      Yes. Apple also has custom firmware to support temperature sensors (instead of just using the standard SMART commands), which is why the fans in their iMacs/Macbooks will run at full speed if you swap out their drive for a 3rd party one... OS X assumes that the sensor is broken and goes into safe mode to avoid an overtemp burnout.
      • by Algan ( 20532 )

        I will have to call BS on this one. Both my MBP and my old school MP run 3rd party drives, including SSDs (Crucial and OCZ - yeah, I know). No problems whatsoever so far, fans are spinning at their normal rpm.

    • by Anonymous Coward on Tuesday June 16, 2015 @05:32PM (#49925271)

      It's a good bet.

      As Apple is probably quite aware, being probably the biggest seller of non-Windows PCs, there is an endemic problem with a whole lot of hardware shipped claiming to be "compliant" with any given standard.

      Most vendors' testing methodology pretty much comes down to "Works on Windows? Ship it"

      Linux has been dealing with this problem for decades. Power management implementations in laptops (and some desktop motherboards) are often outright broken and don't behave anything close to what the "standard" dictates. (It's so bad in laptops that Microsoft's power management maintains a hardware checklist with custom hacks for laptops with known bad implementations. On many systems it does not even /attempt/ to use standard calls)

      Linux developers attempt to access hardware in a manner according to how documented standards state and end up tripping all sorts of bugs from mild to hardware-bricking. Flabbergasted hardware vendors often respond with "It works in windows!"

      (Fortunately this shit doesn't fly in the server space where Linux is now pretty much King.. Well, at least in theory)

      So yes, I'd be willing to bet that Apple found that enabling trim in any old SSD led to an unacceptable chance of filesystem corruption and decided to implement a white list. So, you know, they don't catch shit for someone else's broken hardware.

    • by AmiMoJo ( 196126 )

      If that was the reason they would just make sure to never send queued TRIM commands like Windows does for all drives and Linux does for known bad drives. The performance loss is minimal, especially on Windows where TRIM commands are sent when the drive is idle anyway for performance reasons.

      Instead they just disabled it for all but their own drives. Seems more like a way to discourage people from buying non-Apple SSDs (which are rather expensive) by crippling performance for no good reason.

  • I have an 850 Pro at home and an 850 EVO at work, and haven't experienced any corruption. I know that Windows uses TRIM. Why am I not seeing any problems?

    I doubt EXT4 or whatever part of Linux issuing TRIM commands is doing it wrong, but they're clearly doing it different, and maybe it can be worked around or at the very least reported to the manufacturer to fix broken firmware.

    • Comment removed based on user account deletion
      • I can add an anecdote. 2x 840 evo pros at home and 1 840 evo running windows 7, 8.1 and 7 respectively. Never had any kind of corruption issue. Mind you none of these drives are under any serious load like search indexing.

        • by guruevi ( 827432 )

          Does your OS actually do TRIM or is it merely reporting that it supports TRIM? Or does it require a binary to execute TRIM? I've noticed, most devices are simply ignoring TRIM commands. Also, do you actually continuously verify that your data is written and stored correctly? Unless you have ZFS or BTRFS, you most likely are accumulating errors across your data.

          • How is ZFS or BTRFS going to help you?

            Sounds like the corruption occurs due to the trim command erasing data that has already been successfully written.

            Sure, ZFS will tell you if you're reading corrupt data, but by then it's too late, your data is gone.
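
For filesystems without built-in checksums, detection can be approximated in user space with a digest manifest. The sketch below uses SHA-256 from Python's standard library. The function names and the demo file are hypothetical. As the comment above notes, this only detects damage after the fact and cannot bring the data back.

```python
import hashlib
import os
import tempfile


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def build_manifest(paths):
    """Record a digest for each path while the data is known to be good."""
    return {p: sha256_of(p) for p in paths}


def changed_files(manifest):
    """Return the paths whose current digest differs from the recorded one."""
    return [p for p, digest in manifest.items() if sha256_of(p) != digest]


def demo():
    """Hypothetical demo: zero one 512-byte sector and watch the check fail."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "index.bin")
        with open(path, "wb") as f:
            f.write(b"x" * 2048)
        manifest = build_manifest([path])
        before = changed_files(manifest)
        with open(path, "r+b") as f:
            f.seek(512)
            f.write(b"\x00" * 512)  # simulate a wrongly trimmed sector
        after = changed_files(manifest)
        return before, [os.path.basename(p) for p in after]
```

A manifest like this has to be rebuilt whenever files change legitimately, which makes it a better fit for mostly static data than for a busy search index.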

          • Yes I've checked it on the windows 7 machines, not on the windows 8.1 (just assumed it was on as it was with windows 7).

            Also no I don't continuously verify data. These drives store only program info on them. If I were accumulating errors on the data at any kind of problematic rate then I would have started seeing random crashes / bluescreens sometime over the past few years, which I haven't.

            Errors are errors, and there's ways of noticing them other than continuous checksumming.

          • Comment removed based on user account deletion
        • by LDAPMAN ( 930041 )

          These guys are running a very out of the ordinary usage profile and they have also managed to identify the root cause. It's possible it happens rarely with more normal usage and that no one has bothered to find the root cause.

    • I have an 850 Pro at home and an 850 EVO at work, and haven't experienced any corruption. I know that Windows uses TRIM. Why am I not seeing any problems?

      You're shielded from the problem because of 2 different things:

      - Samsung 850 aren't as much affected by speed decay as Samsung 840. Thus a firmware fixing the speed problem was only shipped for 840s, not for 850s - and it's that firmware which had the problem. Your drive simply didn't get the problematic firmware.

      - That newest firmware falsely advertises that the drive supports TRIM together with NCQ. But the drive actually doesn't. Re-ordering should happen while TRIM is used.
      Linux follows the standards: it

  • by idontgno ( 624372 ) on Tuesday June 16, 2015 @06:02PM (#49925429) Journal

    Correct title: "TRIM and Any Fucking Operating System: Don't Buy Defective SSDs"

    It's not as if Windows or MacOS has any magic that makes queued TRIM work with non-compliant and poorly-coded hardware, right?

    Seriously, WTF, over?

    • Re: (Score:3, Informative)

      by Anonymous Coward

      Windows and MacOS do not issue Queued TRIM in the first place. They only issue the regular TRIM command, which has to stop all data in flight and quiesce the entire submission queue (all tags, etc).

      Linux is ultra-high-IO-load optimized, queued TRIM is a must when dealing with high-performance storage (not just SSDs). Maybe it should stop trusting devices that are neither attached to a SAS or FC transport by default when they claim to actually implement advanced features, though.

      • "queued TRIM is a must when dealing with high-performance storage (not just SSDs)"??

        TRIM is not used by anything other than flash storage.
        • by jabuzz ( 182671 )

          Wrong, it is used with thin provisioning in enterprise storage products. That is, I can thinly provision a volume on my storage array and it will use the TRIM commands to "reuse" blocks that are no longer needed in exactly the same way a flash drive would.

      • by AmiMoJo ( 196126 )

        https://en.wikipedia.org/wiki/... [wikipedia.org]

        TRIM is explicitly a non-queued command. Linux attempting to queue it is out of spec. It works most of the time, but it isn't fair to lay all the blame for failures with the SSD manufacturers. They should reject queued TRIM commands if they don't work, but equally Linux should not be sending them.

        • Perhaps you should update your lovely Wikipedia page, because it is outdated.

          SATA 3.1 standard note at techreport [techreport.com]
          Webopedia info on SATA 3.x [webopedia.com]
          Wikipedia's own entry on SATA 3.1 [wikipedia.org]
          TechPowerUp article about SATA 3.1 [techpowerup.com]

          Here's a press release from sata-io about it: in PDF format [sata-io.org]

          Not only does TRIM via NCQ exist, it is in the recent specifications. You see, the thing about computer technology is that it keeps being improved. Outdated information doesn't stop that. It just becomes outdated.

        • by Agripa ( 139780 )

          https://en.wikipedia.org/wiki/... [wikipedia.org]

          TRIM is explicitly a non-queued command. Linux attempting to queue it is out of spec. It works most of the time, but it isn't fair to lay all the blame for failures with the SSD manufacturers. They should reject queued TRIM commands if they don't work, but equally Linux should not be sending them.

          From your link:

          This Trim shortcoming has been overcome in Serial ATA revision 3.1 with the introduction of the Queued Trim Command.

          If the drives were not reporti

    • by Trogre ( 513942 ) on Tuesday June 16, 2015 @07:55PM (#49926055) Homepage

      Dear Microsoft,

      Thank you for your generous donation to our staff social club. As promised, please find attached drivers that utilise the *real* TRIM commands for our SSDs.

      Sincerely yours,
      A. Manufacturer

    • by tlhIngan ( 30335 )

      It's not as if Windows or MacOS has any magic that makes queued TRIM work with non-compliant and poorly-coded hardware, right?

      OS X disables TRIM on third-party SSDs by default. On Mavericks and below, there's an app called TRIM Enabler that enables TRIM on third-party SSDs. On Yosemite, kernel signing prevents this from happening, resulting in a really sketchy method to get TRIM operational.

      El Capitan is supposed to come with a way to enable TRIM on third-party SSDs, but it requires special modes and using

  • Even when implemented correctly, TRIM slows down regular I/O that happens around the time it's done. On top of that, you are risking OS and drive bugs that can vary with every incremental revision. You may not notice corruption until all your backups are overwritten, and just think of the hassle of restoring even once. Is it really worth potential minor performance benefits that are often realized by the drive itself anyway?

    I can think of exceptions like building a supercomputer with monolithic array of drives us

    • by fnj ( 64210 )

      Use of TRIM fights the deleterious effect of write amplification on lifespan, as well as reducing degradation of performance over time. Why does that "make no sense" for individual users?

      There are two strategies for using TRIM.

      The first one is "discard" in the mount options, which causes the drive to be informed via the TRIM command at the time a block is freed (file erased). The second strategy runs a utility (fstrim) periodically - for example, once a day - to TRIM all the blocks freed since the last time

  • I only have experience with consumer-grade SSDs and not with enterprise ones. As for consumer SSDs, most of the ones I've used or maintained caused no problems. But I recall one HP-made drive that used to crash after about a year - total data loss after a year of usage. Reformat and the drive was ok - another year passed, then another crash and data loss. As it turned out, the disk had some encryption procedures in firmware which were faulty - a firmware upgrade (hopefully) fixed it but also said firmware updat

  • FTFS:

    Our new deployments were switched to different SSD drives

    Is "SSD drive" grammatically anything like "PIN number"?
