What Linus Torvalds Gets Wrong About ZFS (arstechnica.com)
Ars Technica recently ran a rebuttal by author, podcaster, coder, and "mercenary sysadmin" Jim Salter to some comments Linus Torvalds made last week about ZFS.
While it's reasonable for Torvalds to oppose integrating the CDDL-licensed ZFS into the kernel, Salter argues, he believes Torvalds' characterization of the filesystem was "inaccurate and damaging."
Torvalds dips into his own impressions of ZFS itself, both as a project and a filesystem. This is where things go badly off the rails, as Torvalds states, "Don't use ZFS. It's that simple. It was always more of a buzzword than anything else, I feel... [the] benchmarks I've seen do not make ZFS look all that great. And as far as I can tell, it has no real maintenance behind it any more..."
This jaw-dropping statement makes me wonder whether Torvalds has ever actually used or seriously investigated ZFS. Keep in mind, he's not merely making this statement about ZFS now, he's making it about ZFS for the last 15 years -- and is relegating everything from atomic snapshots to rapid replication to on-disk compression to per-block checksumming to automatic data repair and more to the status of "just buzzwords."
[The 2,300-word article goes on to describe ZFS features like per-block checksumming, automatic data repair, rapid replication and atomic snapshots -- as well as "performance wins" including its Adaptive Replacement caching algorithm and its inline compression (which allows datasets to be live-compressed with algorithms).]
The TL;DR here is that it's not really accurate to make blanket statements about ZFS performance, absent a very particular, well-understood workload to measure that performance on. But more importantly, quibbling about the fastest possible benchmark rather loses the main point of ZFS. This filesystem is meant to provide an eminently scalable filesystem that's extremely resistant to data loss; those are points Torvalds notably never so much as touches on....
Meanwhile, OpenZFS is actively consumed, developed, and in some cases commercially supported by organizations ranging from the Lawrence Livermore National Laboratory (where OpenZFS is the underpinning of some of the world's largest supercomputers) through Datto, Delphix, Joyent, ixSystems, Proxmox, Canonical, and more...
It's possible to not have a personal need for ZFS. But to write it off as "more of a buzzword than anything else" seems to expose massive ignorance on the subject... Torvalds' status within the Linux community grants his words an impact that can be entirely out of proportion to Torvalds' own knowledge of a given topic -- and this was clearly one of those topics.
Who actually uses ZFS? (Score:3)
I don't mean like you've got a desktop and you use it because you heard it's cool. I mean you're actually meaningfully using "atomic snapshots to rapid replication to on-disk compression to per-block checksumming to automatic data repair and more" in a way that makes existing other solutions obsolete because it's just so much better.
Re:Who actually uses ZFS? (Score:5, Interesting)
I'm one of the many users of it, and specifically for those reasons. Data integrity is critical to what I do. RAID controller failures and disk drive controller failures have caused a number of issues for me in the past two decades. ZFS virtually eliminates these issues, or at least detects them early enough to resolve them by replacing defective hardware before data corruption happens.
Snapshotting is absolutely amazing. Think git commit history of files, but for everything! Except that instead of every single commit, it has only certain point-in-time snapshots (advantages and disadvantages to both approaches). Samba now supports ZFS snapshots and exports shares as shadow volume copies that Windows can directly read. Someone accidentally saves junk to a file on a network share? They can easily restore it themselves, or have IT restore the file for them.
Replication? This is our primary backup strategy. With snapshots, we push them across the internet to remote data centers to other live servers. If anything happens, all of our network shares just need to point to a different destination, and we're back in business as if nothing happened. We also use this to replicate OS ISOs across all locations, we have one central repository that is replicated out everywhere. And ZFS is smart enough to do incremental transfers, so only new content is pushed over the wire.
Transparent compression is amazing. Some of our data sets are getting a 2-to-1 compression ratio. This means less disk I/O, so faster access to data, as well as less overall storage required for large objects.
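The "less disk I/O" point can be illustrated with a minimal sketch. It assumes a made-up, highly repetitive log sample, and it uses Python's zlib purely as a stand-in for whatever algorithm a dataset is configured with; the sample data and the function name are hypothetical, and the ratio will vary widely with real data.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Logical size divided by compressed size for one buffer."""
    return len(data) / len(zlib.compress(data, level=6))


# Hypothetical sample: repetitive log lines stand in for a compressible dataset.
sample = b"2020-01-13 host=web01 status=200 path=/index.html\n" * 2048

ratio = compression_ratio(sample)
physical = len(zlib.compress(sample, level=6))

# Fewer physical bytes means fewer bytes to read back for the same logical data.
print(f"logical bytes:  {len(sample)}")
print(f"physical bytes: {physical}")
print(f"ratio: {ratio:.1f}:1")
```

Already-compressed data such as JPEGs or video will show a ratio close to 1:1, which is why the benefit depends so heavily on the workload.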
Also, ZFS offers transparent on-disk encryption, which for some of my work is a regulatory compliance thing that must be enabled. Applications don't have to worry about it, as the file system does it for them.
There are countless other features that we're using across multiple different projects where ZFS has replaced entire suites of utilities, both hardware and software, in a single, concise, and unified package.
Re: (Score:3)
Thanks. These all seem like useful features.
It seems like its biggest impact will be realized in businesses wanting to build a custom SAN. Is ZFS licensed in a way that you're not worried about Oracle's litigious reputation, or are people hoping to fly under the radar?
Re:Who actually uses ZFS? (Score:5, Informative)
CDDL is a viral copyleft license that prohibits adding additional restrictions on redistribution. In that way, it's a lot like GPL. It varies in details, for example it leans on patents as a way to go after infringers while GPL relies on copyright alone. The issue is that CDDL and GPL contain different restrictions, and both prohibit adding additional restrictions. Therefore, you can't distribute a work that combines CDDL and GPL code. (GPL is an "end of the line" license - it's only "compatible" with other licenses if the other license is less restrictive and allows code to be redistributed under the terms of the GPL. The same is true of CDDL.)
The ZFS-on-Linux kernel module is distributed under the terms of the CDDL, being derived from the OpenSolaris ZFS code. The uncertainty is over whether it constitutes a derived work of the Linux kernel, which is distributed under the terms of the GPL. Some people argue that by using Linux kernel interfaces and being loaded into the Linux kernel, it is a derived work of the Linux kernel. (It will be interesting to see what effect the result of Google vs Oracle on whether APIs qualify for copyright protection has on this argument.)
If ZFS-on-Linux is in fact a derived work of the Linux kernel, it is unlawful to distribute. Combining CDDL and GPL code in itself is not unlawful - it's the act of distributing the combined work that's a license violation. If this is the case, anyone distributing or using ZFS-on-Linux will be exposed to two kinds of litigation: Linux kernel contributors, Oracle, and other OpenZFS contributors will be able to sue anyone distributing ZFS-on-Linux for copyright infringement; and anyone using ZFS-on-Linux will lose the patent license provided by CDDL, and potentially be required to pay royalties to parties holding patents applicable to ZFS.
Re: (Score:3)
It's rude OR stupid. You have to choose (Score:2)
We all say dumb things from time to time.
Sometimes we can get away with being a bit of a jerk. For example, Linus can say "For the millionth time - We don't break userspace!" or "Don't fucking send me crap that hasn't been tested AT ALL. This shit doesn't even compile!" People don't tell him to fuck off because he's done the work and knows what he's talking about, so he's earned some respect, and can get some grace when he's less than polite.
You managed to be extremely rude AND really stupid at the same ti
Re: (Score:2)
Also, ZFS offers transparent on-disk encryption, which for some of my work is a regulatory compliance thing that must be enabled. Applications don't have to worry about it, as the file system does it for them.
That sounds about the least interesting part: Linux already offers transparent encryption for any FS by encrypting the block device.
Danger! (Going off on a tangent) (Score:3)
You mentioned pushing file shares to another system as backup. It's not 100% clear exactly what you're doing, but it sounds like you may have a dangerous situation that isn't nearly as safe as you thought.
If the primary writes a copy of itself to the backup, you're going to be hosed when you get ransomware. Most ransomware encrypts any shares it has access to, so the main server will encrypt its backup if it can.
Because encryption changes every block, that's going to use up all your snapshot space if the dr
Re: (Score:2)
Re: (Score:2)
If a client ends up encrypting your backup share, you can easily fix that just by rolling back to a previous snapshot. It's true you might run out of space in the meanwhile, but these aren't LVM snapshots -- the pool running out of space won't corrupt your snapshots. Quotas and reservations can limit how much storage a given client uses too, so this doesn't even have to affect other clients.
(I'm not arguing against pull backups in general, just pointing out that this particular situation isn't nearly as dan
Re:Who actually uses ZFS? (Score:4, Insightful)
I've been a user since Sun Solaris, and I agree: as far as the filesystem is concerned, it is very stable. However, over the last decade, there hasn't really been much development. Sure, some minor improvements were made, but BP rewrite has been on the roadmap since before Sun got taken over by Oracle, and performance, especially with RAIDZx and its resource requirements (especially RAM but also CPU cycles), really has been atrocious.
I had to invest in 512GB RAM and 2TB worth of enterprise SSD to get 40Gbps out of an 80+ disk array. Another major feature promised but missing is HA and clustering, dedup hasn't ever been feasible and the license is a problem, at one point even Apple considered it for their systems but couldn't trust the license. My primary problem with ZFS is that it has required me to do forklift upgrades and over-provision all the time just to maintain performance, which is due to BP rewrite being missing.
So I agree with Torvalds on the technical side, even though I've been an advocate for ZFS and hope nobody gets sued by Oracle. Although in my case, recently, both Nexenta and iXsystems were beat by Dell Isilon and StorageCraft both in performance, feature set and price (this is enterprise-level, HA stuff) for a 5 year high growth (500TB and 30%+/y growth) budget. Ceph actually beats ZFS in my case, but both support and implementation costs are still high and CephFS isn't stable enough for my liking. I think ZFS is on its way out in favor of other open source file systems like Ceph once they mature.
Re: (Score:3)
No, this was highly tuned in collaboration with engineers from Nexenta, some of our workload was the reason for performance improvements Nexenta developed for OpenZFS. The problem is our workload isn't straight streaming some videos, we're dealing with millions of files (images) that are 300TB worth of ~100-500kB each with 200+ SMB clients requiring random read and write access over multiple 1-10Gb links.
We did eventually use mirrored vdevs to get the performance and RAIDZ2 on the backup, but the mirrors a
Re: (Score:2)
Re:Who actually uses ZFS? (Score:5, Interesting)
At my previous job we built all of our data clusters on top of ZFS. Why? Because of being in a highly regulated industry (for security) we ran everything on bare metal in our own datacentres. That and budget constraints meant we only had a finite/small number of machines we could allocate to this. We were able to reach mind-blowing performance out of our ElasticSearch and MongoDB clusters by fine-tuning ZFS's ARC and L2ARC (our workloads were very read heavy). In fact, ZFS's ARC got us out of trouble more than once. In one instance we ran into a pretty crappy performance issue with MongoDB's own cache flushing logic; ZFS's ARC, despite being completely oblivious to what it was storing, performed better/as expected from the hardware until the MongoDB bug was fixed.
We also ran extensive performance tests for our workloads for months prior to going live (we needed a guaranteed filesystem latency for this platform). We tested FreeBSD and OpenZFS (circa 0.6.8). There was no comparison, the FreeBSD implementation was much more predictable with a much lower latency standard deviation.
So there, we did all that for a fraction of what the equivalent cloud infrastructure would have cost, and we got all the data integrity perks to go with it.
Re:Who actually uses ZFS? (Score:5, Informative)
Re: (Score:2)
His point about the CDDL license is completely valid.
His comments about the technical features and benefits of ZFS are complete nonsense.
Re:Who actually uses ZFS? (Score:4, Insightful)
As someone who actually uses ZFS (which apparently he hasn't), it's more like someone who's never actually looked out the window of an airplane at 30,000 ft saying "the land below us is green because land has grass and trees on it", while anyone who looks out the window can tell you the plane is flying over the ocean (which incidentally covers more than twice the surface area as land). His technical criticisms of ZFS are for the most part completely off base.
His caution about the licensing issue is spot on. Though from what I understand the incompatibility is due to Linux's GPL being more restrictive than the CDDL, not the other way around as many are assuming (since ZFS is now owned by Oracle).
Re: (Score:3)
No, Linus' 30,000ft view is that he's flying over a ZFS wasteland with no redeeming features, and the person responding is saying that actually it's mostly green arable land down there.
Two completely different views, not nitpicking as you suggest.
Re: (Score:2)
It's more like the view from the airplane is, "nothing down there, it's just fly over country", as he flies over 150 million people including huge cities like Chicago.
Well, with places like Highland Park and Cal City I'd say "nothing down there, just fly over country" might not be far off the mark in this case...
Re:Who actually uses ZFS? (Score:4, Interesting)
I run several file servers in the 300-500 TB range. All using ZFS.
My big one was on an open Solaris variant until the OS drive ate itself. I replaced the OS drive, installed Debian 10, and was able to re-import the ZFS pools, and was back up and running in just a few hours. No data lost or corrupted.
Sorry Linus - but you're talking out your ass on this one.
Re: (Score:2)
Re: Who actually uses ZFS? (Score:3)
Re: (Score:2)
This container-based webhosting provider [antagonist.nl] does, to the extent that they even dug into the source code to figure out a performance issue (there's an English summary near the end of the article).
Re:Who actually uses ZFS? (Score:4, Informative)
It's clearly very important to the Lawrence Livermore National Laboratory because they're doing most of the work on the Linux version.
Myself, I've lost data at rest before, some source code of mine got turned into random garbage at some point back in the DOS days, and of course that's what got into the backups, as I discovered a year or so later when I tried to revisit that project. So the ZFS checksum and repair system is of great interest to me as I now have large amounts of video project, audio multitrack and artwork multilayers that can't be replaced, and I'd like some assurance that what's being backed up is still correct.
Re: (Score:3)
I do. Snapshots are great for backups - keeping the last one on the server lets me transfer only the differences - no need to do a full backup more than once. It also lets me roll back to the last snapshot almost instantly if I need to, without restoring from backups.
Data repair is useful when disks start acting up - I managed to recover a RAID1 where both drives had lots of bad sectors -
I can't say that zfs is the best there is (to do that I would have to try everything), but its snapshots are bet
Re: (Score:2)
Re: (Score:2)
Re:Who actually uses ZFS? (Score:4, Informative)
What if the file you want to back up is in use, say, it's a disk of a virtual machine? Backing it up with rsync will result in a file that has a "newer" end than beginning. In other words - corrupt. With a snapshot you get a file that is all of the same "age"; I think this is called a "crash-consistent" backup.
Also, rsync has to read the file on both ends to calculate the differences. This means that the backup performance is limited by reads, especially if you are backing up a few servers with, say, 20TB of data on them (but probably 100GB of changes since last backup); rsync will take a long time to read the 20TB.
OTOH, if I keep the last snapshot on the vm host, when I create a new snapshot, zfs already knows the difference, so the backup speed is limited by the network or the write performance of the backup server, since only the changes are read and transferred. I can then delete the older snapshot from my vm host.
Also, since the backup server also uses zfs, I use snapshots to keep older backups, not just the last one, and all the snapshots are accessible. I can mount any snapshot and get the file I need very quickly, no need to assemble the data from one full backup and 20 incrementals. It's already done. I can then delete an old snapshot after its retention period expires and do not need to do a full backup more than once.
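Why an incremental send does not need to re-read the whole dataset can be sketched with a toy model. This is not how ZFS is implemented: the class, its fields, and the snapshot names are hypothetical, and the only idea borrowed is that every block records the transaction in which it was written, so "what changed since the last snapshot" is a metadata comparison. The toy also skips snapshot immutability and simply sends the current state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Toy copy-on-write dataset: each block remembers when it was written."""
    txg: int = 0                                    # current transaction counter
    blocks: dict = field(default_factory=dict)      # block id -> (birth_txg, data)
    snapshots: dict = field(default_factory=dict)   # snapshot name -> txg at creation

    def write(self, block_id, data):
        self.txg += 1
        self.blocks[block_id] = (self.txg, data)

    def snapshot(self, name):
        self.snapshots[name] = self.txg

    def incremental_send(self, from_snap):
        """Select blocks born after from_snap by comparing birth times only."""
        since = self.snapshots[from_snap]
        return {bid: data for bid, (birth, data) in self.blocks.items()
                if birth > since}


vm_host = ToyDataset()
for i in range(1000):
    vm_host.write(i, b"old")          # the bulk of the data, already backed up
vm_host.snapshot("monday")

vm_host.write(7, b"new")              # one block overwritten
vm_host.write(1000, b"new")           # one block added
vm_host.snapshot("tuesday")

delta = vm_host.incremental_send("monday")
print(sorted(delta))                  # ids of the blocks that have to travel
```

An rsync-style comparison would instead have to read and checksum all 1001 blocks on both sides before discovering that only two of them differ.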
Re: (Score:2)
Re: (Score:2)
You still haven't got around the requirement to read all the data at both ends. rsync is, by its very nature, going to be orders of magnitude slower than ZFS snapshot send/recv. I've used both. rsync is slow, really, really slow. And it's not atomic. If the collection of files and directories you are transferring changes during the transfer, it's not going to be self-consistent. With ZFS you are replicating a point-in-time snapshot of the entire dataset safely and with minimal overhead.
Re: (Score:2)
vmware is a good product, but it also is expensive. kvm + zfs is free.
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
I use it every day, and take advantage of all of the features you've mentioned.
I'm a homelabber, meaning I like to run server-class hardware at home and build out test environments where I can learn. Busy teaching myself Ansible at the moment on the lab, but I digress.
The main storage in my lab is a 48TB raw ZFS array attached to a Dell R710 that I picked up dirt cheap. It has a single Westmere CPU and 72GB of RAM that I also picked up really cheap (the second socket has a bent pin so I can't double up the
Re: (Score:2)
Re: (Score:2)
I am sure Larry Ellison has a few he could sell you.
Re: Who actually uses ZFS? (Score:4, Informative)
It would've been fine without dedup. The dedup metadata tables are fairly large and are read randomly, so if they're on spinning rust and not cached they're painfully slow to read. Importantly, turning dedup off only affects newly-written data. The dedup tables won't magically go away; they're still there and still need to be consulted when modifying or removing blocks that were written with dedup enabled. Blocks written with dedup will also use sha256 as the checksum algorithm by default, which is the slowest choice, but CPU time isn't going to be the limiting factor.
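The point that turning dedup off does not make the table go away can be sketched with a toy store. Everything here is hypothetical (class name, the "was deduplicated" flag on the returned pointer, the lookup counter); it only models the behaviour described above, where blocks written with dedup enabled still require a table lookup when they are later freed.

```python
import hashlib


class ToyDedupStore:
    """Toy model of a dedup table keyed by block checksum."""

    def __init__(self):
        self.dedup_enabled = True
        self.ddt = {}            # checksum -> reference count
        self.ddt_lookups = 0     # each lookup stands for a random metadata read

    def write(self, data):
        key = hashlib.sha256(data).hexdigest()
        if self.dedup_enabled:
            self.ddt_lookups += 1
            self.ddt[key] = self.ddt.get(key, 0) + 1
            return (key, True)   # pointer flagged as written with dedup
        return (key, False)

    def free(self, pointer):
        key, was_deduped = pointer
        if was_deduped:
            # Old blocks still need the table, even if dedup is now off.
            self.ddt_lookups += 1
            self.ddt[key] -= 1
            if self.ddt[key] == 0:
                del self.ddt[key]


store = ToyDedupStore()
a = store.write(b"x" * 4096)
b = store.write(b"x" * 4096)     # duplicate: same entry, refcount 2
store.dedup_enabled = False      # "turning dedup off"
c = store.write(b"y" * 4096)     # new data no longer touches the table

entries_after_disable = len(store.ddt)
store.free(c)                    # no lookup needed
store.free(a)                    # lookup needed, refcount drops to 1
lookups_so_far = store.ddt_lookups
store.free(b)                    # lookup needed, entry finally removed
print(entries_after_disable, lookups_so_far, len(store.ddt))
```

If each of those lookups is a random read from a spinning disk rather than a cache hit, the slowdown described above follows directly.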
Re: Who actually uses ZFS? (Score:3)
Think of it as an enterprise filesystem. It was designed to make Sun Solaris storage servers competitive with enterprise NAS appliances, like NetApp's. Solaris was ported to X86, it was open sourced, blah blah blah, now ZFS has trickled down to the masses.
You can run it on smallish systems, but it wasn't designed for that. Yours should have been able to run it, but I don't know why you turned dedup on. It may be easy to enable, but you should know what you're doing because all dedup systems, software or h
What Instead (Score:2)
Linus makes the mistake of negatively criticizing ZFS but does not say what to use instead.
Simple reason: (Score:5, Insightful)
There is nothing else. Nothing can even remotely substitute what ZFS has to offer.
Doesn't mean it doesn't have its own problems.
Reality is mostly not a dichotomy, dear DemocratRepublican voters. ;)
There’s btrfs (Score:2, Interesting)
btrfs may not be as battle-proven as zfs is, but for most of the use cases of zfs, it’s the more modern alternative that’s actively developed within the Linux kernel community.
In case you’re worried about its maturity: You can get enterprise support for it from SUSE and Oracle (!)
Re: (Score:3)
In case you’re worried about its maturity: You can get enterprise support for it from SUSE and Oracle (!)
In my experience, btrfs has been more of an attractive nuisance than anything. While SUSE will field support cases about it in SLES, the level of support is along the lines of "try btrfs check --repair", and when that destroys the filesystem entirely, to recommend restoring from backup.
ZFS on Solaris was a solid production-grade filesystem. btrfs sounds like it offers similar functionality on paper, but has not been production quality on SLES 15. File checksumming is nice, but it's hard to really get
Re: (Score:2)
Do you realise where btrfs came from? It was originally developed by Oracle because they wanted an alternative to ZFS. When they acquired Sun (and by extension ZFS), they stopped caring about btrfs, and it's stagnated ever since.
Re: There’s btrfs (Score:2)
Re: (Score:2)
Stop. You can repeat it until you're blue in the face, but Btrfs is still not production ready. SuSE don't offer support for anything but a minimal subset of its total features. It's still slow, it still suffers from unbalancing, and it still suffers from dataloss. It is not a replacement for ZFS or even a real competitor. To compete with ZFS, it would have to be as performant, as resilient, and as functional. It's none of these things.
In all the years of its "active development", it's yet to challenge
MD (Linux storage): all the same stuff, with layers (Score:3)
The standard Linux storage stack, based around multi-device (MD), can do all the same stuff that ZFS can do. The difference is that the standard system has distinct layers, ZFS is all-in-one.
The standard system uses the "lvm" commands to manage volumes and snapshots. It uses mdadm to manage duplicating or checksumming blocks across disks, aka raid. LVM can actually do raid too, but mdadm is the preferred tool. In the standard Linux model, the filesystem layer is provided by - any filesystem you want, diffe
Re: (Score:2)
Internally, ZFS has all the same layers, plus some additional ones. Seriously, go and look at its design.
mdraid and LVM do not match the integrity checking and self-repair which ZFS provides. When you can resilver a degraded array like ZFS does, then I'll believe you.
Re: (Score:3)
That's one failure scenario out of many different potential failure scenarios. mdraid can't help you if all the drives report success but return different data. It has no means by which to determine which is correct. ZFS knows up front which is the good copy. What if all copies are the same but incorrect? Again, ZFS will know up front that there's a problem, and will look for an alternate replica (copies=n) or will record the failure and return a read error.
You're right in that drive checksumming and e
Looks like... (Score:3)
... NIH syndrome !
Re: (Score:3)
Well, ZFS is more or less the NIH solution... Linux wants to have layers, if you want compression it should be compressionFS layer and encryption an encryptFS layer and you should use file system agnostic tools like mdadm and rsync to do RAID and replication. Then along comes ZFS and wants to do everything like one monolithic solution. Now because of the license the ideological discussion never really took off, ZFS didn't have much choice if it wanted feature parity on all the supported platforms but if it
Re:Looks like... (Score:4, Interesting)
Well, ZFS is more or less the NIH solution... Linux wants to have layers, if you want compression it should be compressionFS layer and encryption an encryptFS layer and you should use file system agnostic tools like mdadm and rsync to do RAID and replication. Then along comes ZFS and wants to do everything like one monolithic solution. Now because of the license the ideological discussion never really took off, ZFS didn't have much choice if it wanted feature parity on all the supported platforms but if it had been GPL licensed it'd set off a bunch of sysv vs systemd style debates.
It's nothing so petty. The layering system gets in the way of what ZFS and BTRFS are doing, which is why BTRFS also has to reinvent its own wheels as well, e.g. it has its own RAID implementation. The bitrot protection scheme on both systems works by checksumming the data and metadata, which requires it to understand the filesystem disk format. If the checksum mismatches the data on one disk, it then has to pull another copy from the mirror, which it can determine is correct because the checksum will match. That requires interacting with the RAID layer in ways which a generalised RAID system won't have.
Basically, more advanced filesystems have more advanced requirements that Linux' layering system can't provide at this point.
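The checksum-then-pull-from-mirror behaviour described above can be sketched in a few lines. This is a simplified illustration, not ZFS or btrfs code: the function names are hypothetical, SHA-256 is just one possible checksum, and the "mirror" is a plain Python list. The key assumption is that the expected checksum is stored separately from the data (for example in parent metadata), which is what lets the reader decide which copy is good.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def mirror_read(copies: list, expected: bytes):
    """Return (data, repaired_indices); rewrite any copy that fails its checksum."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # Every replica is bad: report an error instead of returning garbage.
        raise IOError("all copies fail checksum")
    repaired = []
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good          # self-heal the bad replica
            repaired.append(i)
    return good, repaired


block = b"payroll records"
expected = checksum(block)            # kept apart from the data itself

# Both "disks" report a successful read, but one returns silently corrupted data.
copies = [b"payroll rec0rds", block]
data, repaired = mirror_read(copies, expected)
print(data, repaired)
```

A mirror layer that only sees two successfully read but differing copies has no `expected` value to compare against, so it cannot tell which one to trust; that is the cross-layer knowledge the comment refers to.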
Re: (Score:2)
Well, ZFS is more or less the NIH solution... Linux wants to have layers, if you want compression it should be compressionFS layer and encryption an encryptFS layer and you should use file system agnostic tools like mdadm and rsync to do RAID and replication. Then along comes ZFS and wants to do everything like one monolithic solution. Now because of the license the ideological discussion never really took off, ZFS didn't have much choice if it wanted feature parity on all the supported platforms but if it had been GPL licensed it'd set off a bunch of sysv vs systemd style debates.
And those are the guys who embraced systemd!
Re: (Score:2)
It's a European thing. You don't understand. (Score:5, Insightful)
This whole drama Americans have around Torvalds saying things like this, strikes me as distinctly American.
See, over here, we have no problem with negativity. To us, you seem obsessed with positivity in an unnatural way. So to you we might look obsessed with negativity. For us it's simply not the end of the world, if somebody is negative.
Analogous to our vs your reaction to tits. (Seeing them on large billboards, is nothing unusual around here.)
Over here, you simply speak your mind. Even if frank and direct. Even if it turns out to be wrong, later on, that is nothing special. We're humans. :)
But to Americans, that's kinda the perfect storm. He was negative, AND wrong. *gasp*
It's just a different culture. Yes, in your culture it's a WTF moment. I won't devalue your customs.
In ours, and Torvalds', we just tell him he's wrong, and why, and move on. Without that big drama around it that everybody hates.
And you can't expect him to adhere to your customs of all possible ones in the world.
So relax. Everything's alright. :)
And yes, he might bark back if you tell him he's wrong. But that won't mean it isn't appreciated and normal or that he won't think about it.
It's an (northern?) European thing. Works the same in Germany too. :)
You would understand, if you would live in a land with as little sun (!!) and religious doctrines and as much alcohol as where Torvalds is from.
Re: (Score:2)
Re: (Score:2)
Pretty much this. Also, few Europeans feel the need for grand-standing, while in the US it seems to be kind of expected. That is how tiny things get inflated to huge size and actually important things get overlooked.
Re: (Score:3)
"few Europeans feel the need for grand-standing"
Never been to Italy, have we?
Re: (Score:3)
Who is "we"? Not me. While I do classify as a European and a German.
Your pointing to "America" and crying "drama", "obsession" and "tits" on the other hand, although this is just a civilized technical discussion about the properties and merits of ZFS and how they relate to a public statement from the maintainer of the world's most important operating system's kernel, strikes me as distinctly German.
Re:It's a European thing. You don't understand. (Score:4, Insightful)
this is just a civilized technical discussion about the properties and merits of ZFS and how they relate to a public statement from the maintainer of the world's most important operating system's kernel
No, this is cherry-picking, because Linus Torvalds further explained his points in the following comments. He makes distinction between ZFS and OpenZFS, something that Salter never mentioned:
I'm talking about small details like the fact that Oracle owns the copyrights, but turned things closed-source, so the "other" ZFS project is a fork of an old code base.
If you are talking about ZFS, you're talking about the Oracle version. Do you think it has a lot of development going on? I don't know.
And if you're talking about OpenZFS, then yes, there's clearly maintenance there, but it has all the questions about what happens if Oracle ever decides - again - that "copyright" means something different than anybody else thinks it means.
Re: (Score:2)
As an American of German descent, who lived in Germany for six years, I completely disagree with your assessment of the cultural differences...other than the tits.
Re: (Score:2)
Why is Torvalds' bluntness and criticism something to be accepted, but his critic's here isn't?
In any event, the thing is northern Europe is an outlier on this and most cultures have developed codes for criticism that are not blunt. I think when dealing with a multinational community you don't cling to your own provincial cultural norms if they are an outlier, especially when you are, like Torvalds, living in a country where the code is different. I certainly wouldn't move to Finland and not modify my appro
Re: (Score:2)
Why is Torvalds' bluntness and criticism something to be accepted, but his critic's here isn't?
Funny, isn't it? "Cultural differences" is a common defense, but sadly, never for certain cultures.
The hierarchy is also funny. European must be better than American. But if it were a clash between, say, Middle Eastern and European, then Middle Eastern must be better than European. Oh, unless the Middle Eastern is a certain small bit of Middle Eastern, in which case the whole world must be better than that Middle Eastern ...
Re: (Score:2)
Europe is a big place; I don't think you can generalize culture and communication style like that, between Scandinavia, Spain, (Western) Turkey, and Poland. What part of Europe did you have in mind?
I would not use ZFS, (Score:2, Troll)
Re:I would not use ZFS, (Score:5, Informative)
Go and look at the details of how ZFS is actually implemented. You'll quickly find out that internally it's composed of many different layers. More than Linux has with RAID+LVM+filesystem, in fact. It's not "clobbered together", it's a replacement for the whole storage stack.
datasets are filesystems
ZVOLs are logical volumes
VDEVs are RAID sets
It's different, yes, but it's even more finely layered, and those layers provide features which the standard storage stack can not provide.
Look into the SPA (Storage Pool Allocator), the Meta-Object and Object layers, the ZPL and ZAP layers, and the several intervening layers, and you might change your mind about it. The design is very well done, and no facilities provided by Linux come anywhere close to matching it. Not even Btrfs.
Re: (Score:2)
My question is what commercial vendors of enterprise storage products are building them around ZFS.
My sense was it was probably most useful if you ran a bare metal server and wanted it to deliver native unix filesystems, but a lot less useful than conventional SAN systems for heterogeneous storage environments.
Re: (Score:2)
The difference is, raid will tell you there is a problem with your data in some cases, zfs will in many cases fix it for you.
Raid has a write hole problem, zfs does not.
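To make the "tell you" versus "fix it for you" distinction concrete, here is a minimal Python sketch of a checksummed mirror that repairs a bad copy on read. It is a toy model, not ZFS code: the class name MirroredStore, the in-memory dicts standing in for disks, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class MirroredStore:
    """Toy mirror that keeps a checksum for every block (illustrative only)."""

    def __init__(self, copies=2):
        # Each "disk" is just a dict mapping block id -> bytes.
        self.disks = [dict() for _ in range(copies)]
        # Checksums live apart from the data they describe.
        self.checksums = {}

    def write(self, block_id, data):
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.checksums[block_id]
        good = None
        bad_disks = []
        for disk in self.disks:
            data = disk.get(block_id)
            if data is not None and hashlib.sha256(data).hexdigest() == expected:
                if good is None:
                    good = data
            else:
                bad_disks.append(disk)
        if good is None:
            # Every copy failed verification: detection without repair.
            raise IOError(f"block {block_id!r}: no copy matches its checksum")
        # Self-heal: overwrite any bad copy with the verified one.
        for disk in bad_disks:
            disk[block_id] = good
        return good


if __name__ == "__main__":
    store = MirroredStore()
    store.write("b0", b"important data")
    store.disks[0]["b0"] = b"important dat4"  # simulate silent corruption
    print(store.read("b0"))
    print(store.disks[0]["b0"] == store.disks[1]["b0"])
```

The point of the sketch is that a plain mirror without checksums can see that two copies differ but cannot tell which one is right; storing a checksum separately from the data is what lets the reader pick the good copy and rewrite the bad one.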
I Get It... (Score:4, Interesting)
Newsflash: none of that helps.
If Jim wants to produce an efficient, effective response to Torvalds, then the ONLY way to do so is to take each of the elements of Torvalds' arguments and factually rebut them, item by item.
For example, the OP provides the following quote, apparently from Torvalds: "And as far as I can tell, it has no real maintenance behind it any more..."
OK, to respond to this, Jim Salter needs to go to the ZFS Git repository and pull some analysis:-
1. How many versions/patches released in the last 12 months/2 years?
2. How many individual contributors in the last 12 months/2 years?
3. How many individual commits in the last 12 months/2 years?
4. How many operating systems have added support for ZFS in the last 12 months/2 years?
Follow the same pattern and principles for every single concern that Torvalds raises and take that to print. *Then* you can say that Salter made a reasonable response.
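As a rough illustration of points 2 and 3, the commit and contributor counts can be pulled from any Git checkout with a few lines of Python. This is only a sketch: the function names summarize_log and repo_activity are made up here, the repository path is whatever local clone you point it at, and counting distinct author emails is a crude proxy for "individual contributors".

```python
import subprocess
from collections import Counter


def summarize_log(lines):
    """Summarize `git log --format=%ae` output (one author email per commit)."""
    authors = Counter(line.strip().lower() for line in lines if line.strip())
    return {
        "commits": sum(authors.values()),
        "contributors": len(authors),
        "top": authors.most_common(3),
    }


def repo_activity(repo_path, since="12 months ago"):
    """Run git in repo_path (a local clone you supply) and summarize activity."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}", "--format=%ae"],
        check=True,
        capture_output=True,
        text=True,
    )
    return summarize_log(result.stdout.splitlines())


if __name__ == "__main__":
    # Hypothetical sample input; replace with repo_activity("/path/to/clone").
    sample = ["alice@example.org", "bob@example.org", "Alice@example.org"]
    print(summarize_log(sample))
```

Numbers like these only answer the "is anybody maintaining it" question; they say nothing about the quality of the work, which is the caveat raised in the reply below.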
So far, I've read the Ars article and this piece - and in both cases I've come away with the distinct impression that Salter is whiny and pathetic and needs to grow up or go home. I'm sure he's extremely nice in person, but the way he has gone about responding to Linus, on a subject he clearly feels passionately about, is all wrong.
The correct way to respond to someone who is perpetuating myths, inaccuracies or outright falsehoods is to provide verifiable fact in response, not just complain about the bits you don't like.
What we've had so far is, sorry to say, pathetic.
Re: (Score:2)
Indeed. Some people seem to confuse engineering with religion. In Engineering you actually can deliver facts if you are right and the other side is not.
Re:I Get It... (Score:5, Informative)
Pathetic, you say? Apparently you haven't read the full article, because if you had you would know that Jim Salter went into some detail about the number of commits. Not that commit counts mean a great deal in a product as mature as ZFS that is used by many large and user-critical sites because it is so resistant to data corruption, but there have been, and continue to be, a substantial number. And there are regular releases both to support updated hardware and new features.
Your comment about how many OSes have added ZFS support in the last two years makes little sense. OpenZFS has been available on commonly used OSes for a lot longer than 2 years - Linux, MacOS, FreeBSD, Windows, and several OSes with smaller user bases (OmniOS, SmartOS, Illumos, DilOS, TrueOS, etc.)
As a happy user of ZFS for the last 7 years, you would have to pry it away from my cold, dead hands. It reduces my workload, but more importantly, gives me peace of mind that petabytes of data is very, very unlikely to disappear due to a data center disaster or be corrupted beyond recovery by hardware failure.
In short, read Jim's full article. It might change your mind about his response being pathetic.
Linus has an outsize loudspeaker - perhaps he should be a little more careful about how he uses his greatly amplified voice to badmouth an excellent product and state categorically that it should not be used. He (and some other insiders in Linux kernel development) definitely seem to have something against ZFS - much more than just a licensing issue. I don't understand it.
Re: (Score:2)
Linus has an outsize loudspeaker - perhaps he should be a little more careful about how he uses his greatly amplified voice to badmouth an excellent product and state categorically that it should not be used. He (and some other insiders in Linux kernel development) definitely seem to have something against ZFS - much more than just a licensing issue. I don't understand it.
On the other hand, after Linus' clear comments of "don't use it" alongside dismissing its overall value, it would appear difficult to make the case that he was contributing in any way to any type of infringement. I much prefer for Linus to err on the side of not overvaluing the worth of something with a dodgy license. Rather than criticizing Linus, perhaps that energy would be better spent lobbying for a usable license for ZFS.
For the typical enterprise use case, ZFS was nice on Solaris to avoid paying Ve
Re: (Score:3)
The licence isn't "dodgy", it's a perfectly valid copyleft based upon the MPL. The GPL is incompatible with it. So what? It's still free software.
I don't like the self-centred sense of entitlement which assumes that everything has to be done to satisfy the GPL and Linux developers' wishes. The free software world is not wholly centred around their requirements. It's been adopted by several other open source operating systems without fuss. It's only Linux where there is such a song and dance about it, and
Re: (Score:3)
Doesn't Torvalds have the same obligation?
Re: (Score:3)
ZFS does have quotas, as well as reservations to guarantee minimum amounts. If these features were missing, how long ago was this? On Linux or some other platform?
As for data at rest being unchecked, that's what "zpool scrub" is for. When it comes to DIF/DIX, I fail to see how this could be an improvement. ZFS can retrieve a good copy from elsewhere, including a second copy on the same VDEV if you set the copies=n property. I don't see an extended checksum being anything close to as flexible as the data
primary concern is license and litigation (Score:3)
The way I read Linus' post, 80% of it is about concern over a litigious company and licensing. Then there is only a throwaway sentence at the end about usability. How that suddenly became a rebuttal of 2,300 words about usability and zero words about licensing kind of misses the point...
Re:primary concern is license and litigation (Score:5, Informative)
How?
Because Jim Salter doesn't dispute Linus Torvalds' comments about licensing at all. He stated that several times in his article, and he was correct to do so - Torvalds is absolutely right about the licensing issue. The CDDL and GPL are incompatible, and the only entity who can fix that is Oracle because only Oracle have the right to change the CDDL license on ZFS.
He disputes Linus's dismissal of ZFS's features as clueless nonsense. Because it is.
You'd know this if you actually read the article, instead of nit-picking your way through it.
Parthian Shot (Score:2)
Over 90% of Linus' post had to do with whether or not ZFS should be integrated into the Linux kernel, focusing mainly on the litigious nature of Oracle, etc. Given the many suits that Oracle has saddled the industry with, his concern may be valid. I don't know how 'free and clear' the legal position of ZFS is at this point. The unsubstantiated allegation re: maintenance was thrown in as some sort of Parthian shot at ZFS while Linus was already galloping off to his next snarky kernel denial.
That said, th
Torvalds just went full retard (Score:2, Interesting)
Linus has lost a lot of people's respect over this. It's a really pathetic NIH attitude. The license issue is valid, but the lack of respect shown to ZFS is really where he loses, and loses bigtime. Linux really has nothing equivalent, and that probably eats him up inside, and he's acting like a little man-child over it.
Lately, I've had to ditch Ubuntu's ZoL (ZFS on Linux) due to a nasty ZoL data corruption bug, and go back to the more-stable FreeBSD implementation of ZFS. FreeBS
Re: (Score:2)
Re: (Score:3)
Snapshots are only the start of ZFS' features. People are using ZFS not because it's "cool", but because it provides features which are completely unmatched by any Linux-native storage solutions. It's a tool, like any other, but it's unique in its capabilities. Until Linux can provide a replacement with equivalent capabilities, it's going to continue to be highly valued.
You've mentioned rsync several times in this and other threads. Don't you understand that it's orders of magnitude slower and much less
ZFS features and data loss (Score:3)
I agree that ZFS is full of cool features, but those are also buzzwords! While they are interesting to some, they are useless to others, and all those features have a hidden cost. ZFS with deduplication and compression on a desktop will mostly kill the desktop.
>filesystem that's extremely resistant to data loss
hey, right!
All filesystems can use RAID, but ZFS uses its own internal implementation (with its own set of limitations). Auto-repair is limited, and a checksum is good for knowing that a file is bad, but it will not recover it. The reality is that ZFS fails just like other filesystems, with a bigger problem: as it is so complex, it is harder to repair or recover than filesystems like ext4 or XFS. I already lost one ZFS filesystem with dedupe and compression; all attempts to remount it failed, even with dev support.
I was not impressed (Score:2)
A total surprise (Score:2)
hardly (Score:2)
"author, podcaster, coder, and "mercenary sysadmin"
IOW, almost everybody.
Re: (Score:3)
Whenever I see "coder," I always wonder, does this mean that before becoming a writer they were a low level medical insurance bureaucrat?
Certainly "mercenary sysadmin" tends to mean, "I'm not a sysadmin but I get stuck administering the network printer." At least, when it isn't a shield used to deflect criticisms of having not read the manual.
Correct about the license, wrong about performance (Score:4, Insightful)
OpenZFS is better outside of Linux kernel (Score:3, Informative)
Another take on that, is there are more OSes that use OpenZFS than Linux. For example, FreeBSD / FreeNAS. There are some weirdly new implementations, OSv, (that's not MacOS), which uses OpenZFS as a critical component, and ZFS on Windows. This last one I am not certain it's going to succeed, but it's getting pretty far along.
As for me, I use OpenZFS on my NAS, (FreeNAS), and all my home Linux computers, (desktop, media server, laptop). It's for the features, not the performance. There are studies that show you can get a bit more performance out of something else. Which something else depends on your workload, (so XFS, EXT4, or BTRFS). Since my workload varies, it's not important to me.
Before I moved to OpenZFS at home, I used BTRFS. Never lost data, (as far as I knew), but it just never stabilized. Even today, 6 years after abandoning BTRFS, it's still not where USERS need it to be. Having it tied to the kernel, when most distros don't follow the cutting edge, (or even bleeding edge), means that users of those distros won't get the BTRFS bug fixes easily. Let alone features. The old EXT2/3/4 were not too bad, because you could ignore the newer one until it was supported everywhere and had a good reputation for reliability. Thus, I am ignoring BTRFS, (like RedHat has for the last 2 years), until it's widely used.
To be clear, BTRFS solved 1 of my problems, alternate boot environments. Basically I could snapshot my running image, create a new grub entry for the snapshot and boot off it. Then perform updates. If the updates went south, I simply rebooted to the prior boot environment. This worked well, and neither EXT2/3/4 nor XFS, (nor most other FSes), supported this feature. And don't drag out LVM snapshots. Those need to be considered temporary and are required to be removed or rolled back in a reasonable amount of time.
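For anyone wondering why a snapshot on a copy-on-write filesystem is cheap enough to use as a boot environment, here is a toy Python model of the snapshot, update, rollback cycle. It is not how BTRFS or ZFS are implemented; the class CowVolume and its methods are invented for illustration, and a real filesystem shares tree nodes instead of copying a whole mapping on every write.

```python
class CowVolume:
    """Toy copy-on-write volume: data blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}       # address -> immutable bytes
        self.next_addr = 0
        self.live = {}         # file name -> block address (the live tree)
        self.snapshots = {}    # snapshot name -> an older mapping

    def write(self, name, data):
        # Always allocate a fresh block, then publish a NEW mapping.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = bytes(data)
        self.live = {**self.live, name: addr}

    def read(self, name):
        return self.blocks[self.live[name]]

    def snapshot(self, snap_name):
        # Cheap: keep a reference to the current mapping; no data is copied.
        self.snapshots[snap_name] = self.live

    def rollback(self, snap_name):
        # The old blocks are still there, so rollback is a pointer swap.
        self.live = self.snapshots[snap_name]


if __name__ == "__main__":
    vol = CowVolume()
    vol.write("kernel", b"v1")
    vol.snapshot("before-update")
    vol.write("kernel", b"v2-broken")
    vol.rollback("before-update")
    print(vol.read("kernel"))
```

Because the snapshot only pins blocks that already exist, it can be kept around indefinitely, which is the contrast with the temporary LVM snapshots mentioned above.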
OpenZFS is simply more stable than BTRFS, and more feature rich. Like the compression algorithms. You can change it on the fly and new data will use your new choice. BTRFS has some choices, but last I looked, (and to be fair that was a while ago), there were problems. One feature that Sun built into OpenZFS, (and Oracle did NOT build into BTRFS), is that pool & file system attributes are both stored in the pool, (or dataset), and easily manipulated. Most of BTRFS' attributes are mount time options. Meaning if you need to change one, you may very well have to un-mount / mount, (or even reboot), to make the change effective.
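The "change it on the fly and new data uses the new choice" behaviour falls out naturally if every stored block remembers which algorithm wrote it. Below is a small Python sketch of that idea. The Dataset class, its method names and the codec table are assumptions for illustration; the codecs are the ones in Python's standard library, not the algorithms OpenZFS actually ships.

```python
import lzma
import zlib

# Codec name -> (compress, decompress). Standard-library stand-ins only.
CODECS = {
    "off": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


class Dataset:
    """Toy dataset whose compression setting is stored with the dataset."""

    def __init__(self, compression="off"):
        self.compression = compression
        self.blocks = []  # list of (codec name, stored bytes)

    def set_compression(self, name):
        if name not in CODECS:
            raise ValueError(f"unknown codec: {name}")
        # Only affects blocks written from now on; nothing is rewritten.
        self.compression = name

    def append(self, data):
        compress, _ = CODECS[self.compression]
        self.blocks.append((self.compression, compress(data)))
        return len(self.blocks) - 1

    def read(self, index):
        # Each block is decoded with the codec recorded when it was written.
        name, stored = self.blocks[index]
        _, decompress = CODECS[name]
        return decompress(stored)


if __name__ == "__main__":
    ds = Dataset("zlib")
    first = ds.append(b"a" * 4096)
    ds.set_compression("lzma")   # no unmount, no rewrite of old blocks
    second = ds.append(b"b" * 4096)
    print([name for name, _ in ds.blocks])
    print(ds.read(first) == b"a" * 4096, ds.read(second) == b"b" * 4096)
```

Since the setting lives with the dataset and each block is self-describing, old and new blocks can coexist, and no remount is needed for the change to take effect.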
Final thoughts:
Now it would be nice if OpenZFS could be redistributed with Linux distros, (other than Ubuntu). During an update, if one or the other, (kernel or OpenZFS), was updated, no problem. Today, we OpenZFS on Linux users do have a bit more effort to make it work, but it's worth it. Better OpenZFS than the alternatives. If it became a violation to use OpenZFS on Linux, I'd seriously consider migration to FreeBSD. And not just because of OpenZFS. Many Linux distros are starting to go off into wonderland, implementing poorly thought out software, (which ends in D). Or forcing their brand of desktop on you, (looking at you Gnome 3!).
Re: (Score:2)
someone as feral as Larry Ellison.
Laaaaaaaaaaaaaaaaaaaaarrrrrrrrrrrrrrrrrrrryyyyyyyyy[shakes fist at sky]
moot (Score:3)
The whole drama-storm about Linus' opinion is moot. Until the licensing changes it has no place in the Linux kernel.
Re:About as relevant as the "criticism" in TFA (Score:5, Insightful)
What? I mean, the story itself has Linus attacking its technical merits; that's what this dude is pushing back on. Why are you ignoring Torvalds' non-licensing remarks?
Re:About as relevant as the "criticism" in TFA (Score:5, Informative)
The story may have Linus attacking "technical merits". This, however, was not what Linus said. What he said is this (arguments against ZFS in Linux kernel bolded for those lacking patience, it is such a long text, after all):
Note that "we don't break users" is literally about user-space applications, and about the kernel I maintain.
If somebody adds a kernel module like ZFS, they are on their own. I can't maintain it, and I can not be bound by other peoples kernel changes.
And honestly, there is no way I can merge any of the ZFS efforts until I get an official letter from Oracle that is signed by their main legal counsel or preferably by Larry Ellison himself that says that yes, it's ok to do so and treat the end result as GPL'd.
Other people think it can be ok to merge ZFS code into the kernel and that the module interface makes it ok, and that's their decision. But considering Oracle's litigious nature, and the questions over licensing, there's no way I can feel safe in ever doing so.
And I'm not at all interested in some "ZFS shim layer" thing either that some people seem to think would isolate the two projects. That adds no value to our side, and given Oracle's interface copyright suits (see Java), I don't think it's any real licensing win either.
Don't use ZFS. It's that simple.
That is, there are no "technical merit" considerations that influence the decision not to bother with ZFS. The remaining THREE LINES are
It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me.
The benchmarks I've seen do not make ZFS look all that great. And as far as I can tell, it has no real maintenance behind it either any more, so from a long-term stability standpoint, why would you ever want to use it in the first place?
This, which that dude has his panties in a twist about, is just the personal opinion of Linus, which does not go into the factors for the decision.
So, yeah, it is all FUD and lack of reading comprehension.
Re: (Score:3)
"These three lines are wrong."
"That's because you're only focusing on those three lines!"
Re:About as relevant as the "criticism" in TFA (Score:5, Insightful)
No. Let's not make shit up. Let's get a real quote:
Linus Torvalds says “Don’t use ZFS”—but doesn’t seem to understand it
Linus should avoid authoritative statements about projects he's unfamiliar with.
Linus Torvalds says "Don't use ZFS, because its license is fucked". Then he adds "In my opinion, it sucks too". The bolded part being in all those three lines. So, no authoritative statements, just his opinion.
So, to recap: TFA is about some dude and his rant against Linus, in which the dude misrepresents what Linus says.
The dude's ranting just because ZFS didn't make it into the Linux kernel. Not because of Linus. Because of Larry.
But the dude doesn't rant against Larry.
Sad, really.
Re: (Score:3)
OHHHH so that's how you can escape backing up your words? Just label it as your opinion? That will revolutionize modern discourse!
Buzzwords (Score:5, Informative)
And if you squint at it, ...
It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me.
The benchmarks I've seen do not make ZFS look all that great.
You kind of see where he can get that opinion from:
- The CoW, corruption resilience, snapshotting, checksumming, rapid replication, self-healing, transparent compression, etc. listed by TFA are far from exclusive to ZFS.
- CoW especially (and the survivability of sudden power failure) is a feature shared by *many* file systems. A lot of the filesystems currently supported in the Linux kernel either are copy-on-write (ZFS, yes, but also BTRFS, BCacheFS once it gets mainlined, etc.) or are log-structured (F2FS, UDF, etc.), and all have the same "can survive sudden power loss or kernel crash while mid-writing" feature.
- Snapshots and some corruption handling are features which have already been available at the block level with subsystems such as LVM and MDADM. Yup, those aren't filesystem-level and miss some advantages (they can't leverage filesystem-level knowledge like ZFS, Btrfs and BCacheFS do), but it's not exclusive to ZFS.
- Similarly, transparent compression has been supported by several filesystems.
The combination of all the "ZFS features" is actually a feature set that is common to all the "new gen" file systems. So that's not only ZFS, but also the current work done on BCacheFS and the work done by RedHat as part of Stratis. And, the elephant in the room: BTRFS.
The set of features that are considered "stable" in BTRFS and the set of features for which ZFS is touted are virtually the same, with one single exception: RAID5/6 isn't considered stable in BTRFS, whereas RAID-Z1/Z2 in ZFS is (well, and BTRFS doesn't have a production-ready FSCK, but being a CoW filesystem, FSCK is the absolute last resort anyway. That's not how you maintain a CoW file system. Scrub is how you do it, and that one is stable in BTRFS).
Everything else: CoW, snapshots, checksumming, rapid replication, self-healing, transparent compression, etc. listed in TFA applies just as well to BTRFS (bar RAID5/6; you need to restrict yourself to RAID-1/dup, or delegate the RAID5/6 to a lower layer like MDADM).
There's nothing extra (except RAID-Z1/Z2) that ZFS can bring to the table.
On the other hand, BTRFS is developed *in* the kernel, can thus be debugged and avoid breakage on new internal kernel API/ABI, and has the advantage of sharing more code with the rest of the kernel facilities (compression routines are shared with the rest of the kernel, RAID basic building blocks are the same as those used by device mapper and mdadm to build their own capabilities, etc.)
There are companies (Suse, Facebook) paying kernel devs to maintain the in-kernel BTRFS filesystem module.
ZFS, on its side, brings very little that is new and has licensing issues. I can see why Linus would have a personal opinion considering it as "buzzwords".
And regarding performance:
- It's a new-gen file system. No matter how much you like them, features like checksums are still going to require extra cycles to compute, and CoW (or log-structuring, depending on your FS) is still going to be problematic with large files that see lots of random reads/writes (databases, VMs, torrents, etc.). A simple direct-write filesystem like EXT4 isn't going to have those problems.
- Also, some "stable" features of ZFS are giant memory hogs. You can't use ZFS in embedded setups. You can use BTRFS (well, as long as your storage is large enough to make sense. I'm looking at you, Jolla 1 smartphone).
And again, this is the personal opinion of Linus (the last three lines of the post).
Not the kernel's policy about ZFS (the core of the post)
The kernel policy about ZFS remains:
- license is not GPL compatible (done on purpose by Sun back then) and can't be integrated.
- kernel's "don't break userland" only applies to USERLAND.
- you're using some out-o
Re: (Score:3)
Re: (Score:2)
If only the original post hadn't included that quote from Linus saying that it lacked support, was slow, etc...
Re: (Score:3)
ZFS on Linux is getting to be just about on a par with FreeBSD (after using them side-by-side for about 5 years now). A recent release added allow/unallow which made things much more usable for me. Still not there with NFSv4 ACLs but the Linux VFS maintainers have something against it, and will only support obsolete POSIX .1e DRAFT ACLs, a standard that was never ratified and is very primitive in comparison.
I'm still using it on FreeBSD where I really care about data integrity. Though I haven't lost any
Re: (Score:2)
According to wikipedia [wikipedia.org]
As of 2019, the Reiser4 patch set is still being maintained, but according to Phoronix, it is unlikely to be merged into mainline Linux without corporate backing
This is what really bothers me about Linux Development these days, you seem to need a big name sponsor, in the old days it would have been merged long ago. Granted they should rename it, but the name would not have mattered in the 90s.
Re: (Score:2)
For sure there are paths to losing data with ZFS, as I have learnt painfully too. By the same token, when I sat down and honestly reviewed my own actions, I was also part of the problem... I will continue to rely on ZFS to keep my data safe, I have upgraded all volumes with irreplaceable data to triple raid. I check the system logs for drive errors and run regular scrubs on the volumes. ZFS is a very reliable car, but like any car still needs an oil check and someone paying attention to the warning lights.