Linus Torvalds: Avoid Oracle's ZFS Kernel Code Until 'Litigious' Larry Signs Off (zdnet.com) 247
"Linux kernel head Linus Torvalds has warned engineers against adding a module for the ZFS filesystem that was designed by Sun Microsystems -- and now owned by Oracle -- due to licensing issues," reports ZDNet:
As reported by Phoronix, Torvalds has warned kernel developers against using ZFS on Linux, an implementation of OpenZFS, and refuses to merge any ZFS code until Oracle changes the open-source license it uses.
ZFS has long been licensed under Sun's Common Development and Distribution License as opposed to the Linux kernel, which is licensed under GNU General Public License (GPL). Torvalds aired his opinion on the matter in response to a developer who argued that a recent kernel change "broke an important third-party module: ZFS". The Linux kernel creator says he refuses to merge the ZFS module into the kernel because he can't risk a lawsuit from "litigious" Oracle -- which is still trying to sue Google for copyright violations over its use of Java APIs in Android -- and Torvalds won't do so until Oracle founder Larry Ellison signs off on its use in the Linux kernel.
"If somebody adds a kernel module like ZFS, they are on their own. I can't maintain it and I cannot be bound by other people's kernel changes," explained Torvalds. "And honestly, there is no way I can merge any of the ZFS efforts until I get an official letter from Oracle that is signed by their main legal counsel or preferably by Larry Ellison himself that says that yes, it's OK to do so and treat the end result as GPL'd," Torvalds continued.
"Other people think it can be OK to merge ZFS code into the kernel and that the module interface makes it OK, and that's their decision. But considering Oracle's litigious nature, and the questions over licensing, there's no way I can feel safe in ever doing so."
And it breaks Samba (Score:5, Informative)
The ZFS code in Samba has proven extremely destabilizing. Don't use it.
ZFS code?? (Score:2)
How does Samba even know what file system the OS uses below it?
It should use the VFS interface, and nothing else.
If what you say is true, that is as bad as MS putting parts of IE into the kernel for speed.
Re:ZFS code?? (Score:5, Informative)
It should use the VFS interface, and nothing else.
Samba has its own VFS layer ( https://wiki.samba.org/index.p... [samba.org] ) that allows it to use data sources other than the native filesystem.
Re:ZFS code?? (Score:5, Informative)
We do have a ZFS VFS module, but it's only about exposing the native ZFS ACLs as Windows ACLs.
It certainly doesn't destabilize Samba.
Re: (Score:3)
Re:ZFS code?? (Score:4, Informative)
fopen() is a library call. Samba uses the underlying system calls like openat().
We don't use the Gnome filesystem layer. The Gnome filesystem layer uses the Samba libsmbclient to expose remote SMB1/2/3 servers are local filesystems.
Re:ZFS code?? (Score:4, Informative)
Argh. Typo.
s/are/as/
Bloody /. comments with no comment editing system :-). Join the 19th Century guys..
Linus is a good engineer. DUH. (Score:5, Insightful)
Re: (Score:2)
Re:Linus is a good engineer. DUH. (Score:4, Informative)
ZFS on Linux is open source, so there's no danger of being sued if you install the binary module or compile it from source. The licensing incompatibility is an issue for the kernel developers if they wanted to merge the ZFS code into the kernel code, because they can't just change ZFS's license and then redistribute it under the GPL.
Re:Linus is a good engineer. DUH. (Score:4, Interesting)
Oracle could do all kinds of helpful things, but they never will. They'd much rather screw over anyone they can.
Consider FreeNAS (Score:4, Informative)
FreeNAS is reliable with ZFS, and doesn't have the GPL issue, as it's FreeBSD based
Re: (Score:3)
No, it's not.
FreeBSD currently uses the OpenZFS repo for their ZFS implementation, which is based on the code from Illumos (which is a continuation of OpenSolaris). That's what was released with FreeBSD 7 through 12, and is still in the source tree for what will become FreeBSD 13.
The OpenZFS repo is in the process of migrating away from the Illumos repo and rebasing on the ZFS-on-Linux repo. OpenZFS has not "died out", nor is it going anywhere. Once the rebase is done, it'll still be OpenZFS.
Re: Linus is a good engineer. DUH. (Score:3)
I dunno about that.
After I saw Chef Gordon Ramsay LAND a multi-engine WWII (B-25?) airplane on one of his cooking competition shows, I'm pretty sure he could rebuild the engine in his car if he wanted to...
Re:Linus is a good engineer. DUH. (Score:5, Insightful)
Actually, he probably has more experience with what causes legal problems and what doesn't, in the context of the linux kernel, than will be evident in the publicly shared work product of Canonical's lawyers.
Their lawyers may be good, they may be bad, but they're not Linus' lawyers, so their opinion means about as much as if they were a random person on the street; actually less, because somebody else's lawyer might intentionally be giving you bad advice. Lawyers do not have any professional ethical requirement to be honest with the general public! Any expectation of honesty that they have is honesty to their client, or honesty to the judge, which is measured very narrowly compared to other examples of the word "honesty."
Trusting somebody else's lawyer about the law is like trusting a competing chef to change the recipe at your restaurant. It isn't just you'd have to be stupid to let them, you'd have to be stupid to even consider it.
If he wanted a free legal opinion, he could ask the EFF, and he'd get a good answer because they're on the same "side" here. Very different than the lawyer for a for-profit company that has a history of community conflict!
You're as good at understanding administration as a fry cook. You even point out that Canonical's lawyers are well-paid, so "where do they get their money" even made it onto your radar, without you even noticing, "Oh, wait, that's somebody else's lawyer, he'd be an idiot to listen to them."
And the fact is, any lawyer you ask is going to say the same damn thing that Linus said; even if you think your usage is allowed, you might still end up in a protracted, risky lawsuit. Because Oracle. It doesn't actually even matter here what the law is; Oracle is big enough and has enough experience at suing people that you'd have to be ready to set aside a giant war chest just to deal with the lawsuit. Even if you win big and Oracle pays your legal fees, you didn't get that far on a contingency arrangement; you'd have had to have paid your lawyers in the first place. If you're not ready to fight them in court, you don't do the thing. It is really that simple. Nobody needs ZFS anyways. There are people who like the flavor, but who cares? There are other good flavors.
Re:Linus is a good engineer. DUH. (Score:5, Funny)
Ms. Vito: I'm an out-of-work hairdresser and home chef.
Trotter: Out of work hairdresser. And home chef. Now, in what way does that qualify you as an expert in automobiles?
Ms. Vito: It doesn't.
Trotter: Well in what way are you qualified?
Ms. Vito: Well my father was a mechanic. His father was a mechanic. My mother's father was a mechanic. My three brothers are mechanics. Four uncles on my father's side are mechanics —
Trotter: Miss Vito, your family is obviously qualified. But have you ever worked as a mechanic?
Ms. Vito: Yeah, in my father's garage, yeah.
Trotter: As a mechanic. What did you do in your father's garage?
Ms. Vito: Tune-ups, oil changes, brake relining, engine rebuilds; rebuilt some tranny's, rear end —
Trotter: Ok, ok. But does being an ex-mechanic necessarily qualify you as being an expert on tire marks?
Ms. Vito: No. Not for that reason.
Trotter: Do tell.
Ms. Vito: Well my father was gunned down on the street by Larry Ellison. His father was a stunt double on Death Race 2000. My mother's father was a senior lieutenant for Boss Hogg. My three brothers were all killed when Larry Ellison rode up onto the sidewalk in a Humvee and began spinning donuts with the parking brake. Four uncles on my father's side got caught up in an extended series of police chases that escalated for days until it involved every police car in Chicago and half the police cars in rural Illinois —
Trotter: — tell me you're making this up —
Ms. Vito: no, not at all; everyone knows the story of the three-year-old alleged arsonist who built an XJ-6 Airspeeder in his basement —
Trotter: — of course we do, what was his name? —
Ms. Vito: — I forget his birth name; when he finally went to juvie, he suppressed his real name by some mind trick no-one understands, and so everyone called him 'Larry' and it seemed to stick.
Trotter: The three-year-old who built his very own XJ-6 Airspeeder in his mother's basement was finally caught after long eluding the combined pursuit of the whole of Illinois? And how did that happen?
Ms. Vito: Unpaid parking tickets —
Trotter: — assigned to juvie for unpaid parking tickets? —
Ms. Vito: — let me finish!—after he was nabbed for unpaid parking tickets, he was additionally charged with tax evasion, antitrust activities, additional FTC violations, FCC violations, and several RICO predicates, though none of the predicates were finally proved despite mountains of circumstantial evidence, and all the trade-related charges were dropped by the various prosecutorial offices for weird reasons after some creepy guy named Jeffrey Epstein blew into town and spent a month hosting unusually lavish, exclusive parties.
Trotter: Ahhhh, that's quite the web you weave. So he was finally sent to juvie for tax evasion?
Ms. Vito: Oh, no. He settled his tax problem with pocket change. He went to juvie for hacking SMODS and reassigning all his unpaid parking tickets to Richard Daley's homosexual pool boy.
Trotter: What an unbelievable yarn! That's all very interesting, but setting that aside for a moment, if you recall, a moment ago I asked you how the boy who built the speedracer in his mother's basement was finally apprehended over his unpaid parking tickets.
Ms. Vito: He was caught by a rogue Meter Maid riding hands-free on a Solo Scooter with a large dog in the front basket, who proved freakishly agile in a firefight.
Trotter: A large dog?
Ms. Vito: Yeah, an almost impossibly large dog—a bull mastiff crossed with a
Re: (Score:2)
Somehow I left the C out of SCMODS. Damn. I should be better than that, though on the flip side, a self-awarded +1 for the subtle suggestion that Marisa Tomei would one day grow up to be more than knee-high to a chopped Hog.
Sounds like a good idea. (Score:4, Insightful)
If you want people to respect your rules(license) then respect theirs and get it in writing that its ok if its questionable.
Why use ZFS in Linux anyway (Score:2, Insightful)
Unless you need to mount Solaris partitions, there's zero reason to use it in Linux, as there are many other just-as-good filesystem options
Re:Why use ZFS in Linux anyway (Score:5, Interesting)
good filesystem type options
Such as?
Debian's wiki lists issues with btrfs's RAID5 & 6 implementations.
Snapshots are free, replication to another backup is done at the file system layer. I've used and abused ZFS on multiple platforms over the past decade and never lost a pool's data. Regular scrubs also caught my Seagate drives dying.
The native Linux standard is called MD (mdadm, LVM) (Score:4, Informative)
The native Linux storage stack is based on MD, the Multi Device manager. Anyone who has worked with Linux servers has probably used mdadm to manage raid. The admin tool for volumes and snapshots is called LVM. Of course there are a hundred different filesystems to use on those volumes - distributed filesystems like GFS, filesystems for large files (think DVR, video editing), filesystems for lots of small files (mail server), all different ones to choose from.
Linux provides all of the same capabilities natively. The difference is Linux doesn't confuse raid with filesystems, or filesystems with volumes, etc. With MD, you can do raid across whole physical disks, or raid across partitions like you normally would, or raid across volumes, or raid across files (disk images), you can make volumes on raid or raid on volumes, etc. You can do whatever you want. ZFS combines them all together in one reasonable pattern and that's the pattern you get. Some people find it easier to use - in the same way that Windows is easier to use - because it does it for you in the way it wants to do it. Linux follows the Unix philosophy of smaller parts that you can assemble any way you want to.
Re:The native Linux standard is called MD (madm, L (Score:5, Informative)
When you have a redundant array such as two mirrored disks, you have the question of "which one is wrong?" when data corruption occurs.
The magic of ZFS is that it is able to checksum the data and metadata, so when you have a mismatch it knows which one is correct and is able to repair it from the mirror. AFAIK this is not something which MD is capable of, because it needs cooperation from both the RAID stack and the filesystem itself.
BTRFS has to do the same thing (manage RAID itself) for the same reason.
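The self-healing trick the parent describes can be sketched in a few lines of plain shell: two ordinary files stand in for mirrored disks, and a checksum recorded at write time decides which copy to trust (the file names and data here are made up for illustration):

```shell
# Toy sketch of checksum-directed mirror repair, ZFS-style.
dir=$(mktemp -d)
echo "important data" > "$dir/mirror_a"
cp "$dir/mirror_a" "$dir/mirror_b"
# The "filesystem" records a checksum of the data at write time.
sum=$(sha256sum < "$dir/mirror_a" | cut -d' ' -f1)
# Simulate silent corruption on one mirror.
echo "imp0rtant data" > "$dir/mirror_b"
# On read, verify each copy against the stored checksum...
good=""; bad=""; status=""
for m in "$dir/mirror_a" "$dir/mirror_b"; do
  if [ "$(sha256sum < "$m" | cut -d' ' -f1)" = "$sum" ]; then
    good="$m"
  else
    bad="$m"
  fi
done
# ...and repair the bad copy from the good one.
cp "$good" "$bad"
cmp -s "$dir/mirror_a" "$dir/mirror_b" && status="mirrors repaired"
echo "$status"
```

This is of course a toy: real ZFS checksums every block in its tree of metadata rather than whole files, but the decision logic is the same — the copy matching the stored checksum wins, and plain mirroring without checksums can't make that call.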
Re: The native Linux standard is called MD (madm, (Score:3, Funny)
Re: (Score:2, Insightful)
Often, they're both wrong. Disks that are used precisely equally, such as RAID 1 or RAID 5 disks, often fail together, whether from age or from a manufacturing error. Controller faults or cooling issues can contribute to one drive failing before another.
But RAID is not backup, and should not be considered as such. If the data is vital, that's what good backup is for. If the backup is critical, it should be backup that gets much less read and write than the original medium.
Re:The native Linux standard is called MD (madm, L (Score:4, Insightful)
But RAID is not backup, and should not be considered as such. If the data is vital, that's what good backup is for. If the backup is critical, it should be backup that gets much less read and write than the original medium.
That's true, but misses the point. ZFS checksums everything because it's designed to detect silent corruption of online storage, e.g. the media dropping a bit. A backup won't help you there - you'll just back up the corruption.
Re: (Score:2, Redundant)
If you're really shitty at backing shit up, sure, you can "have a backup" and have only backed up the corrupted data.
But if you have good backups, that isn't a problem.
Here is an article from 2003 that talks about basic backup strategies, and note that the strategies described will allow you to restore your corrupted file from backup. https://www.computerworld.com/... [computerworld.com]
When you say, "That's true but misses the point," I think you really mean, "That's true but I'm ignoring it because it shows I'm wrong."
You li
Re:The native Linux standard is called MD (madm, L (Score:5, Informative)
None of that matters if you don't KNOW the data is corrupted. Backups are important, but you're going to have down time while you restore (and you'll still need to figure out how far back to go to get un-corrupted data).
Having the filesystem maintain checksums and alert you immediately to corrupted data helps a lot. Even better if it can find its own mirror of the file that isn't corrupt and maintain availability.
If the problem is a failing drive, you can replace it and have the filesystem scrub the volume to restore redundancy, all while still online. You can even install a new drive and evacuate the failing one so that you never lose redundancy.
I'm not a fan of layer violations, but there are really good technical reasons to let the filesystem handle the redundancy.
There are similarly good reasons to do snapshotting at the filesystem level.
Re: (Score:3)
Sorry, mate... you clearly aren't as educated about ZFS as you think you are. ZFS can both detect and correct corruption due to "bit rot". The only caveat is that you need to have your VDEVs set up correctly. A RAIDZ1 or a straight mirror can detect bit-rot but can't correct it because you don't have a quorum of correct data and checksums. A RAIDZ2 or above can both detect and correct a failed or corrupted block because you have a quorum of 3 out of 4 devices within the VDEV. And since it's both a filesyste
Re: (Score:2)
Often?
On the same block?
I find that hard to believe.
Re: (Score:2)
On the same block, I'd agree. Is a degrading block your biggest risk? Or is "drop database;" or "rm -rf /" your biggest risk?
Re: (Score:3)
Disks that are used precisely equally, such as RAID 1 or RAID 5 disks, often fail together.
You're not trying to catch failure of the disc (which don't happen together, but rather at a similar time), you're trying to catch random errors which happen all the time and are as the name implies random. The odds of two large drives flipping the same bit at the same time are ... can you come up with a bigger word than astronomical?
But RAID is not backup, and should not be considered as such. If the data is vital, that's what good backup is for.
Do you checksum your backup? How often do you re-do an entire backup verifying checksums for differences? Nothing is bullet proof if you lack the features to analyse how well b
Re: (Score:3)
ZFS scrubs (and resilvers) got enormously faster when they implemented sequential scrubs. The reason that scrubs were so incredibly slow is that they scrub the data based on the order it was written and not the location on the disk, since it was basically walking through the transaction history. This was very random IO. Sequential scrubs, IIRC, break the scrub up into a series of chunks, and then sort the reads within each chunk based on disk location. This feature saw my scrub times go from days to hours,
CAN, it's just a bad idea. (That's RAID 2), smartd (Score:5, Informative)
MD *can* use an error-correction code. That's called RAID 2. Nobody (except ZFS users) does that because it's a bad idea. It kills performance, with no advantage other than covering up the fact that somebody didn't know how hard drives work since 1986-1992. Maybe they learned prior to 1986 and never updated their knowledge.
The thing is, since at least 1992, and probably before, hard drives have used cross-interleaved Reed-Solomon codes, or other codes, to often correct, but at least detect, errors. So the drive itself tells you the sector is bad. All you have to do is listen to what it's telling you. Linux raid knows to listen to "bad sector" errors from reads, so it knows which drive is correct and which one is bad for that sector.
It's also a good idea to have a cron job or daemon which tells SMART to kick off a background self-check once a week or once a month to proactively detect weak sectors. The daemon is creatively named "smartd". The drive knows how to run checks in the background when it's not busy handling data requests and how to remap failing sectors to spare ones. Drives learned that trick in 1986, as I recall.
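As a sketch, a smartd.conf entry along these lines does what the parent describes; the device name and schedule are just examples, so check smartd.conf(5) on your system before copying:

```shell
# Hypothetical /etc/smartd.conf entry (device name is an example):
#   -a               monitor all SMART attributes
#   -o on            enable the drive's automatic offline data collection
#   -s L/../../7/03  start a long background self-test every Sunday at 03:00
/dev/sda -a -o on -s L/../../7/03
```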
Re: (Score:3)
And yet, I have seen btrfs catch file corruption in the wild. The ECC on the drive won't catch errors introduced in the controller or drive cable.
SMART can be useful, but it's far from iron clad.
Re: (Score:3)
I personally encountered corruption, that's one of the reasons why I switched from MD to ZFS. One disk was particularly bad, and caused a lot of silent corruption.
And I've seen checksum errors several times on ZFS.
In case you are wondering, yes I use shitty hardware. Except for the Supermicro motherboard and ECC RAM, it is all consumer grade hardware bought from the cheapest seller on Amazon. But that's exactly what RAID is supposed to compensate for; after all, the "I" in RAID stood for "inexpensive".
Cables (Score:3)
You didn't say what level of raid you were using, if you had scrubs enabled, or other info that would help me comment on your particular setup. You did, however, mention "the cheapest seller on Amazon".
I used to have a backup company similar to Backblaze, but much smaller. It was value-focused, we managed costs. We discovered that very cheap CABLES were a primary source of problems, and one that could be avoided for an extra $2 per drive or whatever.
In hindsight, that makes perfect sense. At 3Ghz or 6Ghz
Re: The native Linux standard is called MD (madm, (Score:4, Interesting)
Comparing mdraid, LVM, and filesystem of choice to ZFS is like comparing a moon buggy, lunar lander, and that thing the lander came from that's still in orbit to a Jeep. One will technically take you anywhere*, the other will.
Can grub boot an LVM snapshot? Like snapshot /, install OS patches to it, and boot it, preserving the original /.
Can we sync incremental LVM snapshots to another system? Snapshot a file system, send a full, then schedule hourly snapshots, deleting the Nth oldest one, and ship the snapshots to a remote system that can mount and recover from them?
Can we mount lv_opt and let it use everything available in the VG unless we wanted it to have a limit, which you can set, change, unset anytime?
It's frustrating that so many experienced Linux users don't ask for more, or look at what other systems can do, or think about how things could be better.
Yes, it can. Did you think the answers were "no"? (Score:5, Interesting)
Good questions.
Yes, you can do those things with MD.
> Can grub boot an LVM snapshot?
Yep. It's easiest to use "boom" to write the grub entry for you, but if you want to write it manually the grub syntax is:
options root=/dev/rhel_rhel7/root_snapshot_071018 ro rd.lvm.lv=rhel_rhel7/root_snapshot_071018
> Can we sync incremental LVM snapshots to another system?
Yeah bro this is Linux - everything is a file. Including snapshots, a snapshot is a file. Go ahead and send it over with a dd | ssh pipe or however you like to transfer data, then merge it.
> Can we mount lv_opt and let it use everything available in the VG unless we wanted it to have a limit, which you can set, change, unset anytime?
Yeah that's called sparse volumes. Check the --virtualsize and --size options to start.
You're welcome.
> It's frustrating that so many experienced Linux users don't ask for more, or look at what other systems can do
Yeah it's frustrating when people don't look at what other systems can do, and instead post FUD about the other systems, making a series of statements that are all incorrect because they didn't bother to check.
Thanks for asking these questions. They were good questions.
Re: (Score:2)
PS, if you WANT to root for ZFS in the same way that fans root for their favorite sports team, here is something you can say about ZFS which some people consider an advantage:
With ZFS, you can use raid, volumes, and snapshots without understanding how they all fit together, or understanding what a filesystem is. ZFS packages them all up neatly together so it "just works" without the user needing to know the difference between a volume and a filesystem.
Re: (Score:3, Interesting)
You could also point out that ZFS on Linux was started at Lawrence Livermore National Laboratory [medium.com]
The Lawrence Livermore National Laboratory (LLNL), a research facility answering to the Department of Energy (DOE), originally created and open sourced the ZFS on Linux project. The need for this project was a simple one: LLNL wanted a modified version of this data management system that ran on Linux. To offer scope on how essential ZFS is to LLNL, this lab manages Sequoia, one of the largest supercomputers in the world. Sequoia is comprised of 30,000 hard drives transferring at 1TB/sec while using 500,000 memory cores for processing. At LLNL, over 1000 systems are using ZFS, managing over 100 Petabytes a day. To offer additional scale, a Petabyte is 1,000 Terabytes or 1,000,000 Gigabytes. The amount of data LLNL manages illustrates the draw of ZFS, especially in its resilience to data corruption. ZFS offers enough redundancies in order to deal with potential data loss. With this Linux-version of ZFS, LLNL has experienced only one instance of catastrophic software failure (in 2007), but no data has been lost since implementing ZFS on Linux.
Since OpenSolaris died, OpenIndiana stalled(?) and OpenZFS is in the breeze, ZFS on Linux is 'the' FOSS ZFS. FreeBSD is re-basing their ZFS code on the ZoL project [freebsd.org].
The code is more tested, developed, and used.
RedHat has deprecated btrfs, which itself is an Oracle project.
Re: (Score:2)
Yeah bro this is Linux - everything is a file. Including snapshots, a snapshot is a file. Go ahead and send it over with a dd | ssh pipe or however you like to transfer data, then merge it.
If I take a snapshot of a 100GB volume. Change 1kB and take another snapshot how much data is sent over ssh?
Re: (Score:2)
You can choose the extent size based on your application. You'd want small extents if you make a lot of small changes, such as a database or mail server. It would be best to use larger extents for storing videos. The default is 4MB. 4MB is reasonable for general use.
What that means for snapshots is the volume is divided into 4MB chunks. 1,000,000 changes to the same 4MB region would take 4MB in the snapshot; if there is only one change, it's still 4MB.
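The parent's per-chunk accounting can be sanity-checked with plain shell arithmetic (no LVM involved; the write offsets are made-up examples):

```shell
# Back-of-envelope copy-on-write accounting with 4MiB extents.
extent=$((4 * 1024 * 1024))              # 4 MiB extent size
# Snapshot growth is per distinct extent touched, not per byte changed.
touched=""
for off in 0 1024 100000 5000000; do     # byte offsets of four small writes
  e=$((off / extent))
  case " $touched " in
    *" $e "*) ;;                         # extent already counted
    *) touched="$touched $e" ;;
  esac
done
count=$(echo $touched | wc -w)
growth=$((count * extent))
echo "extents touched: $count"
echo "snapshot grows by $growth bytes"
```

Three of the four small writes land in extent 0, so the snapshot grows by exactly two extents (8MiB) no matter how few bytes actually changed.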
Re: (Score:3)
That's where something like rsync comes in. It's file-based, so it still needs to read every file in the filesystem and compare them to every file in the destination (so it can take awhile to run), but it only sends the data that changed within the file.
It's much, much, much slower than a ZFS snapshot send, as ZFS knows exactly which blocks are different (no read/compare needed) and only sends those blocks.
Both rely on an external transfer mechanism that can be tuned to the network (SSH if you need encrypt
Re: (Score:2)
You can have volume management built into the filesystem using btrfs. There is nothing at all wrong with integrating volume and raid functions into the filesystem layer. If you don't like this, you don't have to use btrfs. This gives you the choice of doing it that way, or using lvm+xfs. So there is nothing wrong with having a choice between the approaches.
Also, btrfs probably leads to simpler on disk data structures than a lvm+filesystem approach. Complexity here can be the enemy of reliability. btrfs hav
Re: (Score:2)
Absolutely there is nothing wrong with having choices.
If you want EXACTLY what btrfs does, using btrfs is a reasonable way to do it. If you want almost the same thing, but you want something slightly different, you can do whatever you want with MD and any filesystem you want.
Personally, I find this:
5+1
6÷3
2+1
Simpler to do with confidence that there are no errors than this:
((5+1)/3+1)
Jamming them all together is more COMPACT, separating them out is more SIMPLE, IMHO. I actually just checked with my 5yo
F-ing Slashdot (Score:2)
Stupid Slashdot can't handle the *ASCII* division symbol.
Personally, I find this:
5+1
6 / 3
2+1
Simpler to do with confidence that there are no errors than this:
((5+1)/3+1)
Jamming
Re: (Score:3)
I can see why you'd find it easier, you've left ambiguity in your 'compressed' version.
temporary_variable = 5 + 1
other_temporary_variable = temporary_variable / 3
result = other_temporary_variable + 1
Yeah, that's just shitty programming, and inefficient to execute, and anyway, people don't think like that.
They think like 5 + 1 / 3 + 1 but writing that down forces assumptions on precedence.
((5 + 1) / 3) + 1 is easy to read and obvious in its execution order.
It's also a simple piece of code.
result = ((5 + 1) /
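For what it's worth, the shell itself settles the precedence question the same way; a quick integer-arithmetic check in plain POSIX shell:

```shell
# How the shell parses the two spellings (integer arithmetic):
a=$(( 5 + 1 / 3 + 1 ))        # '/' binds tighter: 5 + 0 + 1
b=$(( ((5 + 1) / 3) + 1 ))    # explicit grouping: (6 / 3) + 1
echo "$a $b"                  # prints: 6 3
```

Without the parentheses you silently get a different answer, which is exactly the ambiguity the parent is complaining about.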
Re: (Score:3)
The btrfs developers warned against using RAID 5 and 6 since the feature was still being worked on, so people who did that did so against the warnings they were given. The feature is fixed now, and it is certainly not fair to use this red herring to attack btrfs with the false idea that there was an unknown bug; they very prominently stated the feature should not be used in production. btrfs is an excellent filesystem choice for Linux, and it's absurd to besmirch it because features were still being worked on.
BTRFS (Score:5, Informative)
The only analog to ZFS is BTRFS, which isn't *quite* as mature as ZFS. This is a relatively important factor, as anyone using either filesystem is concerned with data integrity and stability, so a mature, stable filesystem is critical.
I tested out both systems on similar hardware a couple of years ago, and Debian+BTRFS had a random kernel panic that wiped out an entire volume within a couple of weeks, where FreeNAS (FreeBSD+ZFS) transferred multiple TB of data over several months without a hiccup. Right now I have a dedicated Dell file server running FreeNAS with a ZFS+Z2 array with zero issues for over two years.
I'm sure BTRFS is more stable now, but ZFS is proportionally more stable as well.
btrfs is a trainwreck. (Score:3, Interesting)
They developed it into a corner and now can't get out anymore, and try to patch it up left and right because they can't face the fact that it is misdesigned at its core and should be trashed and redesigned from scratch. (I don't think the developers are incompetent. They would create a nice FS, if only they accepted it, and learned from it.)
A key example is how the subvolumes are implemented as separate entry points into the same file system. Aka as separate root directories. This means you cannot just ch
Re: (Score:2)
The only analog to ZFS is BTRFS
Woah, woah, slow down there cowboy.
Are you sure about that, or are you maybe engaging in a No True Scotsman fallacy?
Are you conflating your favorite marginal feature with the actual normal functionality provided by the tool?
Re: (Score:3)
Isn't a large part of the motivation behind a high-reliability file system to defend you (insofar as possible) against bad hardware?
Go on, make me list of FSes that ... (Score:4, Interesting)
* offer full checksumming
* scrubbing
* flexible partitions that use the free space together, but guarantee a minimum and quota.
* trivially show you the space a partition uses for its data. (No cookie for you, btrfs!)
* with raid functionality too.
* merge partitioning, disk management and file systems into one system that allows emergent possibilities not possible when keeping them separate.
I don't know of a single one.
I've used ext2, ext3, jfs, reiser3, reiser4 briefly, ext4, btrfs, (f2fs, fat32, ntfs), with and without LVM2.
Re: (Score:2)
It's not often I agree with you, but your post deserves to be modded up. People who claim you should just use alternatives don't seem to understand filesystems.
Re: (Score:3)
ZFS has been in production at scale for over 14 years at this point. It's absolutely mature enough for production, and patently silly to claim otherwise.
It doesn't use "insane" amounts of memory. That's an often-repeated myth, but a myth nonetheless. You can tune it down to a few hundred megabytes if you really want to. It will have poor performance, but it will work just fine. You only need silly amounts of memory if you enable on-line deduplication, which few people do.
As for performance, ZFS isn't a
Re: (Score:3)
I like some features of ZFS over hardware RAID (and MD raid):
1. When rebuilding the array, zfs only copies the data, so if the array has a lot of free space, the rebuild finishes quickly.
2. If a drive is failing, but not completely dead and there are empty slots in the server, I can add the new drive and start the replace procedure. If the new drive fails before the rebuild is completed, the old drive is still there. This is especially useful if I am replacing good drives with bigger ones as I do not need t
Re: (Score:2)
Good question, actually. From what I've heard it's not used a lot in actual production. A massive storage pool is nice, but without redundant controllers and pathways it's inefficient and failure-prone.
No news here (Score:3)
This is what, 15 years old? Long before Oracle bought Sun, Sun created a new copyleft-ish license for OpenSolaris to be "open" while preventing all the good stuff from being stripped out and integrated into Linux. I see a lot of retconning, but Sun never did anything to dispel that impression back then, and Oracle scuttled the whole project and wants all their precious IP to be proprietary again. Well, if the GPL is viral then anything owned by Oracle is toxic. Good move, Linus; continue to keep toxic code out.
I'd even say (Score:2)
Avoid Oracle. Period!
Damn commies! (Score:2)
ZFS anyone? (Score:2)
I don't get what you don't get. (Score:2)
It is not about deduplication.
I am a home user with a few hard disks. I want complete reliability and flexibility in my data storage.
That requires something like live LVM2, plus triplicates of everything, plus snapshots for versioning / "time machine", plus checksums, plus scrubbing. Only ZFS can do that. (No, btrfs can't. It can't even not corrupt itself after enough usage.)
Only problem: ZFS gobbles up ALL your RAM. :)
Re: (Score:2)
Re: (Score:2)
Buying proprietary hardware is a fool's game. Look at all the pain that using hardware RAID has caused. It's great because it's easy, until it fails and makes things extra hard. So nobody who knows what they are doing uses HW raid any more. That's why.
Re: (Score:2)
Re:ZFS anyone? (Score:5, Informative)
I don't get what the big deal is with ZFS. If you can afford an environment with enough resources to do dedup and things like that, then your money would be better spent on entry hardware that does that.
There is a dirty secret about hardware-based RAID systems which surfaced in the last data center I decommissioned a few years back. There was one Dell hardware RAID array which was apparently working OK. I think it was a 5-drive RAID-6. But when I tried to archive the file system there were tons of files that could not be retrieved. Long story short, at some point bad blocks were written to the media on one (or more) drives. Later, when a bad drive was swapped out, the RAID system faithfully preserved the garbage data by creating redundant parity bits for it on the new drive. The owner had no idea -- the affected files hadn't been read for years. Now beyond retrieval. There were no backups because RAID is assumed to be so safe.
Right now I maintain a 12x4TB array with ZFS. The hardware is just JBOD. In ZFS each data block is stored with a checksum that was originally calculated outside the storage system. So when the data is read back it can be verified that it came back as intended. It doesn't matter where or when in the hardware data structure it gets corrupted: on the cable, on the drive circuit board, glitch in the SAS adapter.
Every week I get a verification:
Downside: ZFS ain't fast. Reading is OK but writing has lots of overhead. Snapshots are very fast and very useful. Upside: It is very stable storage.
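The end-to-end verification described in this comment can be sketched in a few lines of Python. This is purely illustrative (real ZFS stores a fletcher4 or sha256 checksum in the *parent* block pointer, not alongside the data), but it shows why corruption anywhere below the filesystem layer is caught on read:

```python
import hashlib

# Illustrative sketch of ZFS-style end-to-end checksumming: the checksum is
# computed above the storage layer, so corruption introduced anywhere below
# (cable, drive firmware, controller) is detected when the block is read back.
# Class and method names are hypothetical, not ZFS internals.
class ChecksummedStore:
    def __init__(self):
        self.blocks = {}   # block_id -> raw bytes (simulated disk)
        self.sums = {}     # block_id -> checksum kept outside the data path

    def write(self, block_id, data):
        self.sums[block_id] = hashlib.sha256(data).hexdigest()
        self.blocks[block_id] = data

    def read(self, block_id):
        data = self.blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self.sums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data

store = ChecksummedStore()
store.write(1, b"important data")
store.blocks[1] = b"important dataX"  # simulate silent on-disk corruption
try:
    store.read(1)
except IOError as e:
    print("detected:", e)
```

In a redundant pool, ZFS goes one step further than this sketch: on a mismatch it reads the other mirror/parity copy and repairs the bad one, which is what a weekly scrub exercises across the whole pool.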
Re: (Score:3)
Moral of the story is... clears throat "RAID IS NOT BACKUP! ALSO, LIVE SNAPSHOTS ARE NOT BACKUP!"
With that out of the way: I am not a Unix admin, but judging by the competence of others, I should take their jobs for extra $$$. I myself would have tape backups and a rotation plan even if I did have ZFS. Fires, earthquakes, and ransomware are all real shit that can cost you and your employer dearly.
Re: (Score:2)
Dedup? You don't understand because you somehow think Dedup is ZFS's most important feature. You are woefully underinformed.
Re: (Score:2)
Re: (Score:2)
ZFS snapshots don't need a lot of resources. They are nothing more than a pool transaction number recording the point in time the snapshot occurred, plus a deadlist of blocks. Creation is instantaneous and requires no storage overhead. As the original data is modified, the old blocks are moved onto the deadlist. This is super cheap, because nothing is being rewritten unnecessarily--the copy-on-write behaviour would require a new block for each changed block in any case, so it's nothing more than writing
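The transaction-number-plus-deadlist mechanism described above can be sketched as a toy copy-on-write store. All names here are illustrative, not ZFS APIs; the point is that creating a snapshot records a single number, and old blocks only migrate to the deadlist when the live data is later rewritten:

```python
# Toy sketch of copy-on-write snapshots with a deadlist, loosely modelled on
# the ZFS description above. Hypothetical names; not real ZFS structures.
class CowFs:
    def __init__(self):
        self.txg = 0        # pool-wide transaction group counter
        self.live = {}      # name -> (birth_txg, data)
        self.snapshots = {} # snap_name -> txg at snapshot time
        self.deadlists = {} # snap_name -> blocks the snapshot still references

    def snapshot(self, snap_name):
        # Creation just records the current txg: O(1), no data copied.
        self.snapshots[snap_name] = self.txg
        self.deadlists[snap_name] = []

    def write(self, name, data):
        self.txg += 1
        old = self.live.get(name)
        # Copy-on-write: never overwrite in place. If a snapshot was taken
        # after the old block was born, the snapshot still references it,
        # so park it on that snapshot's deadlist instead of freeing it.
        if old is not None:
            for snap, snap_txg in self.snapshots.items():
                if old[0] <= snap_txg:
                    self.deadlists[snap].append((name, old))
                    break
        self.live[name] = (self.txg, data)

fs = CowFs()
fs.write("a", b"v1")
fs.snapshot("snap1")   # instantaneous, zero storage overhead
fs.write("a", b"v2")   # old block moves onto snap1's deadlist
```

Destroying the snapshot would then just free its deadlist, which is why snapshot deletion is cheap too.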
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
zfs send and recv are orders of magnitude more efficient than rsync. rsync needs to do a full scan of each file at both the source and destination to compute the delta to send. This is expensive in time, memory and disc bandwidth, and with large datasets it's prohibitively expensive. You might need to scan a terabyte of data at each end to work out that you need to send a few megabytes. ZFS doesn't need to do that. It knows exactly what changed without needing to look into each file, because the snapsh
Re: (Score:3)
Snapshots are the basis for zfs send/recv actions. Once you've created a snapshot, you can send it off to some other host. You can delegate permissions to allow end users to do this, or you can do it manually as an admin, or you can script it with cron to do minute/hourly/daily/weekly snapshots and automatically ship them off to other systems. Since it's sending the delta between two point-in-time snapshots, it's very lightweight, equivalent to incremental backups. But you can do your incremental backup
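The reason `zfs send -i old_snap new_snap` avoids rsync's full scan can be shown with a one-function sketch. Since every block records the transaction group in which it was born, the delta between two snapshots is simply "blocks born after the older snapshot's txg" -- no file contents need to be read. This is a conceptual illustration, not the real send-stream format:

```python
# Illustrative sketch of why incremental zfs send is cheap: block birth times
# identify the delta directly, with no per-file scanning (unlike rsync).
def incremental_delta(blocks, from_txg, to_txg):
    """blocks: dict of name -> (birth_txg, data). Returns blocks changed
    strictly after from_txg and no later than to_txg."""
    return {name: data
            for name, (birth, data) in blocks.items()
            if from_txg < birth <= to_txg}

blocks = {"a": (3, b"old"), "b": (7, b"new"), "c": (9, b"newer")}
print(incremental_delta(blocks, from_txg=5, to_txg=10))
# → {'b': b'new', 'c': b'newer'}
```

The real pipeline is just `zfs send -i pool/fs@snap1 pool/fs@snap2 | ssh host zfs recv backup/fs`, and the cost scales with the size of the delta, not the size of the dataset.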
It's not that Oracle is litigious. (Score:5, Informative)
It's that it's predatory in general. Litigiousness is just something it can afford in monetary and reputation terms that most companies can't.
Oracle is so secure in its market niche that it doesn't care what anyone thinks about it, even its customers. I worked for a small developer that targeted local and state governments, which meant we had to be an official Oracle reseller. When I found out nobody had gotten certified in Oracle's licensing model, I took the exam myself, because having somebody who is certified in Oracle's pricing policies is mandatory. Those policies are extremely complicated, so much so that I doubt more than a handful of customers understand them, particularly all the implications of the per-seat models, which *look* like a bargain but usually aren't.
I had many conversations with customers who were in violation of their per-seat license without realizing it. They'd say, well, it's just a minor technical violation, I'll fix it in next year's budget (because that's how government works). Then I'd have to explain: no, you have to fix this *now*. You think Oracle will be nice and reasonable with you because you're a customer, but if they find out they *will* come after you. It's their *business model*. They want you to switch to a per-processor license when you figure out how expensive complying with the per-seat model is, because you don't get any credit for your existing license. It's a whole new sale for Oracle, and you have no choice because you're locked into their proprietary SQL, tools and drivers now.
Re: (Score:2)
> all the implications of the per seat models which *look* like a bargain but usually aren't.
Isn't it a general rule in modern business that anything that looks like a bargain, isn't? If it actually was a bargain, it wouldn't be offered as it would reduce income compared to just offering the "standard" options.
"license" and "proprietary" ... (Score:2)
Two huge red flags that will make any sane person run like hell.
I wish there was a ZFS for normal PCs. (Score:2)
Unfortunately, ZFS takes about 1GB of RAM per TB of disk space.
There are hacks, to reduce it, but then the file system either becomes very slow, or you lose reliability (and with it, the whole point).
I tried btrfs for a few years, and it is an utter trainwreck.
You can't even find out how much of the total storage one volume's contents use, since it is implemented as merely different directory trees... so, hardlinks. You need to traverse that entire tree. Which isn't even built in. Their suggestion is, in
Re: (Score:2)
XFS has a similar problem, though it doesn't use quite as much memory. I used to keep my data on XFS but then I started using pogoplugs for NAS to cut power consumption and I had to convert back to ext4 because while you can use large XFS volumes on low memory systems, you can't recover them if you have a power failure or panic. I was having to connect the disks to my PC and run recovery, then I could put them back on the pogo.
Re: (Score:3)
I believe the "1GB per TB" rule is at this point just a myth.
https://linustechtips.com/main... [linustechtips.com]
The more RAM you have, the more gets cached in ARC. So read-access is faster the more RAM you have.
Re: (Score:2)
This is a complete myth. It's only true if you have deduplication enabled, which isn't the default. In most cases, it's vastly less than this, and you can tune the ARC and other caches to be minimal size (at the expense of performance, obviously).
DDR3 ecc is cheap as are older 2/4u rackmount (Score:2)
DDR3 ECC is cheap, as are older 2/4U rackmount systems.
Re: (Score:3)
Unfortunately, ZFS takes about 1GB of RAM per TB of disk space.
There are hacks, to reduce it, but then the file system either becomes very slow, or you lose reliability (and with it, the whole point).
It does no such thing. Reliability is not dependent on your RAM and you do not need a lot of RAM to have full native performance of ZFS. You can happily run ZFS on a system with 256MB of RAM (just don't try and startx ... for non ZFS related reasons).
Now ZFS has *features* which do require additional RAM. Specifically you need at least 1GB of RAM per 1TB of disk space to store the data deduplication tables, but only if you want data deduplication enabled.
You can also enable the ARC which is nothing more tha
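The "1GB of RAM per 1TB of disk, but only with dedup" claim above can be sanity-checked with back-of-the-envelope arithmetic. The ~320 bytes per dedup-table entry used here is a commonly cited rule of thumb, not a specification, and the real footprint depends heavily on recordsize:

```python
# Rough estimate of ZFS deduplication-table (DDT) RAM requirements.
# Assumption: ~320 bytes per DDT entry (a widely quoted rule of thumb).
def ddt_ram_bytes(pool_tib, recordsize_kib=128, bytes_per_entry=320):
    # One DDT entry per unique block; block count = pool size / recordsize.
    blocks = pool_tib * 2**40 // (recordsize_kib * 2**10)
    return blocks * bytes_per_entry

gib = ddt_ram_bytes(pool_tib=1) / 2**30
print(f"~{gib:.1f} GiB of DDT per TiB at 128 KiB records")  # ~2.5 GiB
```

At the default 128 KiB recordsize this lands in the low single-digit GiB per TiB, consistent with "at least 1GB per TB"; shrink the recordsize to 8 KiB and the same pool needs 16x as much, which is why dedup is the feature that actually eats RAM.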
Legit question... (Score:2)
Re:Legit question... (Score:4, Informative)
The author killed his wife and everyone stopped caring.
Re: (Score:3)
Worse, all the fanboys of the software spent long months defending the guy, saying it was a conspiracy and she was still alive and all this crazy stuff.
And then he agreed to lead investigators to the body.
If they'd just accepted that writing popular software doesn't say anything at all about your character or if you did something bad, then it would have been possible to just rename it and move on. But the whole "circle the wagons" thing that the fanbois did made it even more toxic, and made all of them toxi
Re: (Score:3)
Oh damn that asshole is up for parole this month.
Re: (Score:2)
btrfs (Score:2)
I have tried ZFS, and BTRFS for my home / personal NAS setup. They both actually worked fine, but BTRFS was the one that gave me a few scares. Fortunately they have very good recovery tools, even though the issue should not have happened in the first place.
On the other hand, precisely due to license issues I decided to stick with BTRFS. I can handle the occasional issues, and I have proper backups. I do not care for RAID5/6 (I do RAID1+0), and it has better availability (RAID5 is no longer safe for 2019, an
Re: (Score:3)
Glad Apple decided against ZFS (Score:2)
They were all in on developing ZFS for OS X around the 10.5 days, but then they just dropped the project. Many people thought they were daft for leaving their ancient HFS(+) in place, but their homegrown COW filesystem has proven to be rock-solid from day one.
And best of all, it is unencumbered by licensing fees, iffy FOSS Developers, and nicely scales from watches to anything above that.
Put ZFS on a watch, or a phone. Fuck, it will hardly run on a laptop!
And before you trot out your Apple hatred; keep in min
Re: (Score:2)
Steve could have worked it out with "Uncle" Larry.
But he wasn't around anymore at that point...
Apple also has their own implementation of TLS, called ATS (Apple Transport Security, IIRC).
In retrospect, both APFS and ATS were pretty good moves. Though, having an advanced filesystem that can be read and written on multiple platforms would be a really nice thing to have.
Re: (Score:2)
Bear in mind that APFS was designed to be lightweight enough to run on those watches and phones first and foremost. The desktop was not the primary consideration, and it's lacking the most basic data integrity features offered by ZFS. No Merkle hashes or other data checksumming or hashing to guard against data corruption.
Absolutely right to be concerned (Score:3)
The CDDL was created by Sun to be explicitly GPL-incompatible (to prevent Solaris code from being taken by Linux and eroding any advantages Solaris may have had).
Re: (Score:2)
I don't see fear of Oracle's litigious nature as a reason to be concerned. Oracle is going after Google's use of Java APIs in Android because, given how far and wide Android is used and Google's deep pockets, the potential payout is huge. Putting ZFS in the Linux repo does not offer the same kind of payout opportunity.
The reason that I see for concern is license incompatibility. CDDL is not compatible with GPL. You can't just take CDDL code and relicense it. If CDDL-licensed code were checked into the Linux
Re: (Score:2)
> CDDL is not compatible with GPL.
Incorrect. It's the other way around. The GPL is incompatible with the CDDL. The CDDL is compatible with any other licence, including proprietary licences, so long as you comply with its terms.
Re: (Score:3)
> Also, the last update to ZFS released under CDDL is pretty old now and lots of fixes have likely been made to the ZFS code since Oracle shutdown OpenSolaris and closed the code.
Oracle have made changes since they closed their code, that's true. However, they also fired all the main ZFS developers with the rest of the Solaris teams, and they went and joined or formed the companies that are now working on OpenZFS. Oracle's fork is no longer where the interesting development is happening, with Solaris b
Re: (Score:2)
What advantages does Solaris still have over Linux?
Re: (Score:2)
At the time Solaris absolutely had advantages (including ZFS)
That is not how it works. (Score:2)
I know GP is trolling or deluded, but you don't know how this works either.
Why would they waste resources *excluding* anyone?
It is much easier to grep *all* the data, than to code up exceptions for some.
They have had the resources to just scan all of it for at least a decade. The NSA leaks specifically talked about how that is finally possible for spying agencies "now".