EXT4 Is Coming
ah admin writes "A series of patches was proposed earlier on the Linux kernel mailing list by a team of engineers from Red Hat, ClusterFS, IBM and Bull to extend the Ext3 filesystem with support for very large filesystems. After a long-winded discussion, the developers came forward with a plan to roll these changes into a new version: Ext4."
Sounds like a good idea. (Score:5, Funny)
Comment removed (Score:5, Interesting)
Re:Sounds like a good idea. (Score:3, Interesting)
Re:Sounds like a good idea. (Score:3, Interesting)
Right off the bat: you can't install the bootloader in an XFS partition, since XFS uses the first 512-byte block on the partition. Of course, most people install the bootloader in the MBR, but for some it's an issue.
GRUB had a bug with XFS. When you tried to use an XFS partition as
For a considerable period of time, ext3's code was more stable than XFS.
ext3 has an ordered data mode (which is the default). Other journaled file systems only support write
Re:Sounds like a good idea. (Score:2, Insightful)
It's ugly and annoying, especially for people like me who rely on ReiserFS in production. I'd love to see ReiserFS 4 in the standard kernel; it'd make my life a lot easier.
I can't use EXT2/3, it's too slow and just kills the machine for the number of files
Re:Sounds like a good idea. (Score:2)
Re:Sounds like a good idea. (Score:5, Informative)
Re:Sounds like a good idea. (Score:3, Interesting)
Re:Sounds like a good idea. (Score:2)
Re: (Score:2)
Re:But... (Score:2)
However in this case we need to flip it:
"Will it run Vista, and will it come out before Duke Nukem Forever?"
Oh, and for a momentary instance of reason: is it GPL-compatible? If so, then I'm sure that it will support Hurd or, the reverse, Hurd will support it.
Re:Sounds like a good idea. (Score:2)
Re:Sounds like a good idea. (Score:2)
While ext3 definitely started as a fork of ext2, I'm pretty sure that it's been totally rewritten by now.
Re:Sounds like a good idea. (Score:2)
Re:Sounds like a good idea. (Score:2, Interesting)
Re:Sounds like a good idea. (Score:2)
If you mount it as ext2... (Score:2)
ext2 won't mount unless the filesystem is marked clean, so you would have already suffered an fsck scan anyway, as opposed to a fast journal resync if it were ext3.
BTW, ext3 just "starts from the beginning" at each mount. There's nothing to keep in sync.
Yeah, ext3 is great. I've recovered from _very bad_ hardware-related situations where recovery might not have been possible with any other FS.
Re:Sounds like a good idea. (Score:3, Interesting)
I've just converted my main partition (non-/boot) on a notebook from XFS to reiser3, mainly because I work with huge svn working copies and svn loves to keep small files around, as well as create lots of small files (lock files, etc) during routine svn work. xfs is just considerably slower than reiserfs for svn status, update, commit, cleanup. Besides, reiser3's tail feature means svn's penchant for small files uses less space ov
Yes but (Score:5, Interesting)
Interesting bit from wiki/ZFS:
Re:Yes but (Score:4, Informative)
Re:Yes but (Score:2, Funny)
Re:Yes but (Score:2)
That is, until next week, when some guy in Peoria manages to do just that by trying to create a single mirror of all the pr0n on the Internet.
Re:Yes but (Score:2)
128 bits? (Score:3, Funny)
"128 bits should be enough for anyone." - Scott G. McNealy (retired).
/me ducks.
Re:128 bits? (Score:3, Funny)
Now let's assume we want to store every bit in a single carbon atom. Carbon has a molar mass of 12 g/mol, and 1 mol is about 6.022*10^23 atoms. So 2.7*10^39 bits would translate to 4.5*10^15 mol, or 5.4*10^16 g, which is 54 gigatonnes of carbon.
I doubt hard drives will get larger than that any time soon.
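The estimate above can be re-derived in a few lines. This is only a rough check, and it assumes the 2.7*10^39 figure means 2^128 bytes at 8 bits per byte, which the comment does not state outright; the function and constant names are made up for the example.

```python
# Rough check of the "one bit per carbon atom" estimate.
AVOGADRO = 6.022e23          # atoms per mole
CARBON_MOLAR_MASS_G = 12.0   # grams per mole


def carbon_mass_grams(bits: float) -> float:
    """Mass of carbon needed to hold `bits` at one bit per atom."""
    moles = bits / AVOGADRO
    return moles * CARBON_MOLAR_MASS_G


# Assumption: 2**128 bytes of addressable storage, 8 bits per byte.
total_bits = (2 ** 128) * 8
moles = total_bits / AVOGADRO
grams = carbon_mass_grams(total_bits)
gigatonnes = grams / 1e15    # 1 gigatonne = 1e9 tonnes = 1e15 g

print(f"{total_bits:.1e} bits")
print(f"{moles:.1e} mol")
print(f"{grams:.1e} g")
print(f"about {gigatonnes:.0f} gigatonnes")
```

The numbers land on the same orders of magnitude as the comment, so the 54 gigatonne figure holds up under that assumption.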
and 640 K... (Score:2)
Every time I hear someone say "there is no way we would ever use that much data", I laugh out loud! HD cameras are coming, bandwidth is getting faster and cheaper (DSL is like $12 here in Indiana) and let's face it, people want to save EVERYTHING...whether this is good or bad is a different topic, but the fact is, if you give people the storage, they will use it...Remember when you asked yourself "How will I ever fill this 500MB HDD?" I do...
Re:and 640 K... (Score:2)
Re:and 640 K... (Score:2)
In addition to knowing who my friends are, whom I call, and that I'm interested in aviation and fast cars, they'll be able to track which brand of mouthwash I buy (and where I buy it), who my dentist is, how much I've spent on my teeth, which brand of toilet tissue I buy (because only terrorists buy Scotts, right?), how many times I've read 1984 over the year
Re:and 640 K... (Score:2)
Blah blah blah I hate Bush blah blah Bush is Evil blah blah blah I have no independent thoughts blah blah blah.
Build the storage, and SOMEBODY will fill it up, probably the government (not necessarily just the current administration by the way) with tracking every inane detail of our lives
Man, where have you been the past 10 years? Credit card companies, banks, insurance companies, big retailers (Wal-Mart, Blockbuster, NetFlix, Amazon, etc, etc, e
Re:and 640 K... (Score:2)
That should put 2^128 in perspective when it comes to addressing. Also see the posts about how much energy it would take to store this amount of data.
There comes a time when "... enough for anyone" really is enough for anyone. Unless you're building Deep Thought or similar computers.
LWN article on ext4 (Score:5, Informative)
Modularizable filesystem (Score:2, Interesting)
Re:Modularizable filesystem (Score:5, Informative)
Reiser4 does this [namesys.com].
Re:Modularizable filesystem (Score:3, Insightful)
Anyone have a "more technical" link without dancing trees and with a bit about how to recover your filesystem when something goes weird with the hardware even if the filesystem is perfect?
Re:Modularizable filesystem (Score:5, Insightful)
It's dishonest to put something in quotes when it's not a direct quote. The exact quote is:
There's a substantial difference between saying that something is more stable "as a result" of something and more stable "because" of something. He's not claiming that being out longer intrinsically makes it more stable as your misquote suggests, he's claiming that it led to reiserfs becoming more stable - because of the practices he mentioned.
In short - something being out longer == more stable? No. Something being exposed to lots of real-world use and receiving only bugfixes == more stable? Yes.
He didn't quote Adam Smith, he drew an analogy between what he was saying and the network effect. It's an entirely reasonable analogy.
What ridicule? He's actually supporting that approach. For example:
Would you care to point out where you thought he was ridiculing the UNIX approach?
Yeah, they look dumb, don't they?
I can only assume you mean something other than "technical".
Dancing trees are a fundamental part of the design. How are you meant to understand the filesystem without understanding dancing trees?
Ah, you don't mean technical at all, you mean practical for somebody who is entirely uninterested in the way the filesystem works. Perhaps Reiser4 Transaction Design Document [namesys.com] is what you are after, but I doubt it.
Re:Modularizable filesystem (Score:2)
The context is supplied in the article and the portion above - do you see what I am getting at here and why I do not agree that it is a definition of stability? It certainly gives me no confidence to the questions of stability and recovery, which I'm sure are answered elsewhere, but no - I didn't think much of the article - perhaps the annoying graph
Re:Modularizable filesystem (Score:2)
Re:Modularizable filesystem (Score:2)
Re:Modularizable filesystem (Score:2)
Ext2 and its descendants have been less ambitious and thus considerably more robust.
Re:Modularizable filesystem (Score:2)
Arbitrary block device layering is the way forward.
Re:Modularizable filesystem (Score:3, Informative)
ClusterFS (Score:5, Funny)
OK, hands up - who wants to run ClusterFS so that they can say they needed to do a "clusterfsck"?
How Lustre's ClusterFS works (Score:2)
Does that mean that the "filesystem" is broken into chunks and spread across all the nodes in the cluster?
define very large (Score:3, Insightful)
Re:define very large (Score:5, Insightful)
ext3: 8TB total, 4TB files
ext4: 32 zettabyte (1024*1024*1024 TB), 1 exabyte files (1024*1024 TB)
Beyond that, it doesn't seem to actually change much.
Re:define very large (Score:3, Funny)
Re:define very large (Score:2)
We're talking about a dozen full 750GB hdds (with no redundancy) to hit 8TiB. That's over four grand just for the disks without controllers, never mind the broadband you need. Do tell where you can get that on unemployment benefits...
Re:define very large (Score:2)
Well, "he" lives in his parents' basement, and earns some cash doing errands for them...
Re:define very large (Score:2, Interesting)
For example, if we have 20-bit indexes (2^20 clusters max) and use 4-kilobyte clusters, to increase the maximum space we'll either have to add one bit to the indexes to double the maximum space or we'll have to increase the cluster size and have problems storin
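The trade-off described above (more index bits versus bigger clusters) is easy to put into numbers. The sketch below is illustrative only: the helper names are invented, and the 32-bit and 48-bit rows simply reuse 4 KiB clusters as an assumption rather than describing any particular on-disk format.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4
EIB = 1024 ** 6


def max_capacity_bytes(index_bits: int, cluster_bytes: int) -> int:
    """Largest addressable size: one cluster per possible index value."""
    return (2 ** index_bits) * cluster_bytes


def slack_bytes(file_bytes: int, cluster_bytes: int) -> int:
    """Space wasted in the last, partly filled cluster of a file."""
    return (-file_bytes) % cluster_bytes


# The example from the comment: 20-bit indexes, 4 KiB clusters.
print(max_capacity_bytes(20, 4 * KIB) // GIB, "GiB")

# Option 1: one more index bit doubles the limit.
print(max_capacity_bytes(21, 4 * KIB) // GIB, "GiB")

# Option 2: doubling the cluster size also doubles the limit,
# but a 100-byte file now wastes roughly twice as much space.
print(max_capacity_bytes(20, 8 * KIB) // GIB, "GiB")
print(slack_bytes(100, 4 * KIB), "vs", slack_bytes(100, 8 * KIB), "bytes of slack")

# Assumed wider indexes with the same 4 KiB clusters.
print(max_capacity_bytes(32, 4 * KIB) // TIB, "TiB")
print(max_capacity_bytes(48, 4 * KIB) // EIB, "EiB")
```

Widening the index keeps small files cheap at the cost of larger metadata, which is the direction the 48-bit discussion elsewhere in the thread takes.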
Re:define very large (Score:4, Informative)
Re:define very large (Score:4, Insightful)
ext4: 32 zettabyte (1024*1024*1024 TB), 1 exabyte files (1024*1024 TB)
Are they just going to work on improving the 8TB paper limitation, or are they actually trying to improve on ext3 scalability? Which currently tends to suck the big one, especially on a significant number of disks (eg: http://scalability.gelato.org/DiskScalability/Res
I also seem to keep coming up against a pretty hard 2TB block device limit in Linux (eg LVM2 lv size, LUN size for fibre attached SAN, etc). I don't really know what the reasons for it are; does anyone know what technologies allow for larger single partitions?
Anyway, I've long ago settled on reiserfs (3) for speedy random access to small files, and XFS for file server type applications; though I still wonder why RedHat doesn't include any "enterprise" filesystems by default in their "enterprise" products (I know, I know, you can enable it - I did say "by default").
What about directories? (Score:2)
I realize this is irrelevant for most people, but for some of us it's crucial.
Re:What about directories? (Score:2)
Something's wrong with your data layout if you need to put 32,767 directories at a single level.
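For what it's worth, one common way to avoid a huge flat level is to spread entries over nested bucket directories derived from a hash of the name. A minimal sketch, assuming SHA-1 and two levels of two hex characters each; the function name, the root path, and the parameters are all made up for the example.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, name: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Place `name` under `levels` nested bucket directories taken from its hash."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *buckets, name)


# Each level has at most 16**width entries (256 here), so two levels
# give 65,536 leaf buckets instead of one enormous directory.
print(sharded_path("/srv/data", "customer-000123"))
```

The mapping is deterministic, so lookups need no index: recompute the hash and you have the path. The downside is that listing "everything" now means walking the bucket tree.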
LKML Message (Score:3, Informative)
Subject: Proposal and plan for ext2/3 future development work
From: "Theodore Ts'o"
Date: Wed, 28 Jun 2006 19:55:39 -0400
Given the recent discussion on LKML two weeks ago, it is clear that many
people feel they have a stake in the future development plans of the
ext2/ext3 filesystem, as it is one of the most popular and commonly used
filesystems, particularly amongst the kernel development community. For
this reason, the stakes are higher than they would be for other
filesystems. The concerns that were expressed can be summarized in the
following points:
* Stability. There is a concern that while we are adding new
features, bugs might cause developers to lose work.
This is particularly a concern given that 2.6 is a
"stable" kernel series, but traditionally ext2/3
developers have been very careful even during
development series since kernel developers tend to get
cranky when all of their filesystems get trashed.
* Compatibility confusion. While the ext2/3 superblock does
have a very flexible and powerful system for
indicating forwards and backwards compatibility, the
possibility of user confusion has caused concern by
some, to the point where there has been one proposal
to deliberately break forwards compatibility in order
to remove possible confusion about backwards
compatibility. This seems to be going too far,
although we do need to warn against kernel and
distribution-level code from blindly upgrading users'
filesystems and removing the ability for those
filesystems to be mounted on older systems without an
explicit user approval step, preferably with tools
that allow for easy upgrading and downgrading.
* Code complexity. There is a concern that unless the code is
properly factored, that it may become difficult to
read due to a lot of conditionals to support older
filesystem formats.
Unfortunately, these various concerns were sometimes mixed together in
the discussion two weeks ago, and so it was hard to make progress.
Linus's concern seems to have been primarily the first point, with
perhaps a minor consideration of the 3rd. Others dwelled very heavily
on the second point.
To address these issues, after discussing the matter amongst ourselves,
the ext2/3 developers would like to propose the following path forward.
1) The creation of a new filesystem codebase in the 2.6 kernel tree in
"ext3dev" filesystem. This will be explicitly marked as a
CONFIG_EXPERIMENTAL filesystem, and will in effect be a "development
f
Why EXT4 ? (Score:4, Interesting)
There are many factors that influence filesystems: not just "how fast it can write", but also how it breaks when it does.
While the fanboys of XFS, JFS and ZFS may promise that their filesystems are faster, have no problems, are secure and will not eat your data, these filesystems are simply not as proven as ext2 and ext3.
Scream fanboys scream, someone will listen, but the problem is that these filesystems are not proven in the field, or in some circumstances even in the kernel itself.
Re:Why EXT4 ? (Score:5, Informative)
With IBM's know-how in the mix, EXT4 may be able to join the above three, but it would seem to be time better spent fixing XFS/JFS support in Linux first, rather than worrying about backwards compatibility with EXT2.
Re:Why EXT4 ? (Score:4, Informative)
Um, I have yet to see a production installation of ZFS in an enterprise environment, and it hasn't been out as an actual release for even a year yet. You probably mean UFS. HTH.
Re:Why EXT4 ? (Score:2)
Then you haven't been looking very hard. SUN has been using ZFS internally in their enterprise environment for a while. In addition, there are several special customers that were using ZFS in production working closely with SUN engineers. Not only that, I know of a hosting company that posted about using ZFS already for their production environment. In addition, ZFS is now officially supported and part of Solaris 10 as of
Re:Why EXT4 ? (Score:3, Insightful)
Most people would not consider that to be "proven in the field"
By your logic, Windows Vista should have been released a year ago because it's long been "proven" stable via widespread deployment at Microsoft.
Internally, Sun has Sun software running mostly on Sun hardware, not the mishmash of SANs, external and internal third-party hard drives, and custom RAIDs that many enterprises will have. When it's used and stable across a vari
Re:Why EXT4 ? (Score:3, Informative)
There is no way in hell that ZFS is even _remotely_ proven in the field. And since we're still fighting with a bug in Sun Disksuite where you can't boot off the second disk when a disk in a mirror breaks, I'd be VERY loath to mention Sun, Filesystems and Disk management as being stable right now.
fsck quality (Score:5, Informative)
The e2fsck program has a huge test suite that it must pass before a release. A set of corrupted filesystems must be correctly repaired to be bit-for-bit identical to the desired result.
A typical fsck has a good chance of crashing (SIGSEGV, the "segmentation violation") when the going gets tough.
While FreeBSD's UFS developers were messing around with sync writes to avoid testing a fsck that would often crash, the ext2 developers ran full async and wrote a damn fine fsck to put things back in order. Now you can choose from three different levels of journalling, and you still get the ass-kicking fsck program.
There basically is no fsck for XFS, Reiserfs, or Reiser4. JFS doesn't have much AFAIK, and ZFS is a newborn.
What are you going to do when your fancy filesystem gets trashed? I hope you keep excellent backups, very recent and tested to be readable.
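The "repair must match the expected image bit for bit" idea can be sketched as a tiny regression harness. This is not the e2fsprogs test suite, just an illustration of the principle: the toy image format (last byte is a checksum of the rest) and every name below are invented for the example.

```python
from typing import Callable, Iterable, List, NamedTuple


class Case(NamedTuple):
    name: str
    corrupted: bytes   # damaged image handed to the repair tool
    expected: bytes    # exact image the repair must produce


def run_suite(repair: Callable[[bytes], bytes], cases: Iterable[Case]) -> List[str]:
    """Return the names of cases whose repaired image is not byte-identical."""
    failures = []
    for case in cases:
        try:
            repaired = repair(case.corrupted)
        except Exception:
            failures.append(case.name)   # a crashing checker fails the case
            continue
        if repaired != case.expected:    # bit-for-bit comparison
            failures.append(case.name)
    return failures


def toy_repair(image: bytes) -> bytes:
    """Hypothetical checker: rewrite the trailing checksum byte."""
    body = image[:-1]
    return body + bytes([sum(body) % 256])


cases = [
    Case("bad-checksum", b"\x01\x02\xff", b"\x01\x02\x03"),
    Case("already-clean", b"\x05\x05\x0a", b"\x05\x05\x0a"),
]
print(run_suite(toy_repair, cases))
```

Counting a crash as a failure matters here: a checker that segfaults on a badly damaged image is exactly the behaviour such a suite is meant to catch before release.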
Re:fsck quality (Score:2)
Re:fsck quality (Score:2)
The fact that HFS+ is so unreliable is a bit worrying. While lower reliability is to be expected, failures should still be rare. Perhaps your hardware has some minor (or not so minor) memory problems.
Re:fsck quality (Score:2)
Unfortunately, I no longer have Linux installed on it, and I rebuilt the filesystem anyway.
Re: (Score:2)
Re:fsck quality (Score:2)
Hint: they barely do anything, if you are lucky.
Re: (Score:3, Funny)
Re:fsck quality (Score:3, Interesting)
All of the major filesystems have a decent fsck, and all of them are by now stable to the point that you should worry about your hardware and backu
Re:fsck quality (Score:2)
Kirby
that is not fsck (Score:2)
Re:Why EXT4 ? (Score:2)
Please complete the following;
We need another enterprise file system;
- like we need another web browser.
- like we need another Window manager.
- like we need another Bourne shell derivative.
- more than we need improved network filesystem support.
- more than we need Hans Reiser to rip out the limitations of the VFS from the kernel.
- because the other Enterprise level filesystems just don't support big enough filesystems/files.
- because the other Enter
Re:Why EXT4 ? (Score:2)
Re:Why EXT4 ? (Score:2)
Re:Why EXT4 ? (Score:2)
I am sorry, but you got this quite the wrong way around.
XFS and JFS have been used in enterprise environments far longer than EXT2 (not to mention EXT3) has been in existence.
Re:Why EXT4 ? (Score:3, Interesting)
Note that servers with extensive mirroring and other hardware error-handling rarely need error-recovery from the filesystem. Filesystem errors happen on ordinary people's hard drives when they grow old, and ext* have a million times more experience in handling those than any enterprise FS.
Re:Why EXT4 ? (Score:2)
Last week on a Debian sarge box running 2.6.8 while setting up a file system on a box that built, ran and read from ext3 just fine. A format and reinstall of the 400GB array got XFS working on the second try.
This was minutes after creating the partitio
Re:Why EXT4 ? (Score:2)
Why only 48 bits? (Score:3, Insightful)
I guess we'll be on to ext5 or 6 by then, though.
Re:Why only 48 bits? (Score:5, Interesting)
We'll need to adjust other things if filesystems ever get so huge. The whole design probably needs a rethink, but we can't do it now. We don't know what the future holds in terms of seek times, transfer rates, sector sizes, etc.
Re:Why only 48 bits? (Score:2)
Pattern (Score:5, Funny)
Wait... I think I can detect a pattern. The next number has to be Ext7½!
Re:Pattern (Score:2)
Then extxp.
Linux and other Unix FSes (Score:4, Insightful)
But I'm amazed at how quickly these features are being integrated. There's functionality in Linux that allows me to easily create file-backed volumes, remote volumes, SAN LUNs, etc.. The "resize in a single command" is not fully there yet, but within 6 months I'd expect it to be.
Re:Linux and other Unix FSes (Score:4, Insightful)
I'm pretty certain that Linux would have better filesystem tools if the developers could resist adding a new filesystem every few months.
OT: Metadata on the file system? (Score:2)
As an example of when I would like to annotate files: sometimes I download a file --let's say it's a program for my Palm, called "VP2.pdb". Now, that filename could mean just about anything; let's say it was some image viewer named "ViewPicture II", so I would like to rename it "ViewPi
Re:OT: Metadata on the file system? (Score:2)
[root@flathat example]# ls -l
total 4
-rw-r--r-- 1 root root 2692 Jul 2 10:03 VP2.pdb
[root@flathat example]# ln -s VP2.pdb ViewPicture2.pdb
[root@flathat example]# ls -l
total 4
lrwxrwxrwx 1 root root 7 Jul 2 10:03 ViewPicture2.pdb -> VP2.pdb
-rw-r--r-- 1 root root 2692 Jul 2 10:03 VP2.pdb
A real O/S filesystem needs defrag! (Score:2, Interesting)
This is based not only on the need for a larger maximum file system, but a recognition that there is significant performance advantage to reducing read/write head movement and initiating large reads from consecutive blocks that can take advantage of the high tran
can i upgrade from ext3 to ext4? (Score:2)
Add Access Control Lists!! (Score:2)
melissa
Re:How does it compare to zfs? (Score:3, Insightful)
Well, how does a Honda Civic ... (Score:3, Insightful)
Re:Well, how does a Honda Civic ... (Score:3, Informative)
I'd suggest reading through these links before spreading more misinformation:
http://unixconsult.org/zfs_vs_lvm.html [unixconsult.org] - ZFS vs. Linux R
Re:Well, how does a Honda Civic ... (Score:2)
Yes, ZFS has awesome volume-management features. If you have a big fileserver with a dozen drives, ZFS is a godsend. However, considering that most laptops, d
Re:Well, how does a Honda Civic ... (Score:4, Informative)
Assuming we still want mirroring or volume management on our two drives:
The overhead is still greater for SVM or for Linux md and Sistina LVM. Both require more administration knowledge, time, and commands to accomplish the same tasks that ZFS can do in a couple of commands. (Yes, I'm aware that mdadm helps the process a *bit*, but it's still obtuse.) Anyone who has set up either knows how annoying anything is with either choice. (having to micromanage partitions, etc.)
The biggest thing for ZFS in a ``small'' 1-2 drive usage case is, in my opinion, the pooling: ZFS doesn't require one to set volume sizes in advance. Since everything pulls out of a common pool, the size of volumes can grow or shrink accordingly. (Affected by free pool space or volume quotas.) So, that means that one can just create their volumes, and not have to worry about making them the wrong size.
I'd also argue that fault tolerance is important anywhere, large or small.
Another thing is on-disk, low-overhead compression that can be enabled just by toggling one filesystem parameter, live. For a lot of things that people store, this compression would save a lot of space.
They really put a lot of thought into ZFS. It scales amazingly well, from small to large. I'm not really doing it justice explaining it here, so I'd encourage you to look at the documentation with an open mind before just writing it off as an ``enterprise only'' thing.
dks
(I have no affiliation with Sun in any way.)
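To make the pooling point concrete, here is a toy model of volumes drawing from one shared pool, limited only by free pool space and an optional quota. It is not ZFS code or the ZFS API; the class, its methods, and the unit-less sizes are all assumptions made for illustration.

```python
from typing import Dict, Optional


class Pool:
    """Toy model: datasets share one pool, optionally capped by a quota."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used: Dict[str, int] = {}               # dataset -> units in use
        self.quota: Dict[str, Optional[int]] = {}    # dataset -> optional cap

    def create(self, name: str, quota: Optional[int] = None) -> None:
        # No size is chosen up front; the dataset starts empty.
        self.used[name] = 0
        self.quota[name] = quota

    def free(self) -> int:
        return self.capacity - sum(self.used.values())

    def write(self, name: str, amount: int) -> bool:
        """Grow a dataset if both the pool and its quota allow it."""
        cap = self.quota[name]
        if amount > self.free():
            return False
        if cap is not None and self.used[name] + amount > cap:
            return False
        self.used[name] += amount
        return True


pool = Pool(capacity=100)
pool.create("home")
pool.create("var", quota=30)
print(pool.write("home", 80))   # home grows as far as the pool allows
print(pool.write("var", 30))    # refused: only 20 units are left in the pool
print(pool.write("var", 20))    # fits both the pool and the quota
```

With fixed-size partitions the same workload would have required guessing the home/var split in advance, which is the "wrong size" problem mentioned above.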
design (Score:2)
ext2 and ext3 are very high performance file systems that have no trouble moving large amounts of data. ext4 appears to be a market-driven extension of ext3, in which users effectively pay for the minimum number of changes necessary to get the job done.
ZFS, on the other hand, is a typical Sun design, in which their kernel engineers throw in every feature they can think of and Sun is marketing the hell out of it. But a lot of features also means a lot of
Re:How does it compare to zfs? (Score:2)
Your sig. (Score:2)
Yes. (Score:3, Informative)
Re:My take on current filesystems (Score:4, Insightful)
I have had my
(I just keep adding on)
I lose power a lot where I live (glitches) and XFS has been utterly bulletproof.
(This filesystem has been through 3 motherboards, several linux distros (1 mb dead/2 upgrades), 2 cases, and so on)
If Reiser4 is about as stable as XFS, I'll gladly switch everything over tomorrow on my MythTV box.
Re:My take on current filesystems (Score:2)