Ask Slashdot: Best File System For Web Hosting?

An anonymous reader writes "I'm hoping for a discussion about the best file system for a web hosting server. The server would serve as mail, database, and web hosting. Running CPanel. Likely CentOS. I was thinking that most hosts use ext3, but with all of the reading/writing to log files, a constant flow of email in and out, not to mention all of the DB reads/writes, I'm wondering if there is a more effective FS. What do you fine folks think?"

  • ZFS (Score:5, Informative)

    by Anonymous Coward on Thursday November 29, 2012 @06:31PM (#42136119)

    Or maybe XFS.

    • Re: (Score:2, Funny)

      by Anonymous Coward
      FAT
    • Re: (Score:3, Informative)

      by Anonymous Coward
      "Maybe" XFS? XFS.

      ZFS is funky and all, but you don't need the extra features, and the additional CPU overhead is just wasteful. The only real things to care about are the lack of fsck on unclean reboot and fast reads. XFS+LVM2+mdraid (although a proper RAID controller is preferable) is perfect.
    • Re:ZFS (Score:5, Funny)

      by sjames ( 1099 ) on Thursday November 29, 2012 @08:30PM (#42137369) Homepage Journal

      Depending on the type of web content, XXXFS might be appropriate.

    • Why not btrfs and backups?

      • by Dante ( 3418 )

        Why not btrfs and backups?

        BTRFS is not stable! I just lost my /home and all its snapshots, two days ago.

        "You should keep and test backups of your data, and be prepared to use them."

        Yes I know about the latest tools. In the end I had to do a btrfs-restore.

        https://btrfs.wiki.kernel.org/index.php/Restore [kernel.org]

    • As I recall, XFS is particularly good with large files. I use it for a media volume on my NAS. Its major downside from my point of view is lack of TRIM support, which only matters if it's on an SSD of course. The other thing to consider would be the occasional defrag.
      • Ext4 is also a lot better than ext3 was for very large files, and has the larger market share / acceptance / eyeballs. So not sure I'd bother to use XFS just for large file support.

        In ext3, when you would delete a multi-gigabyte file, it would take up to a few minutes for it to happen. In ext4, that process is measured in fractions of a second.
    • Re:ZFS (Score:4, Informative)

      by joaommp ( 685612 ) on Friday November 30, 2012 @07:42AM (#42140469) Homepage Journal

      Why hasn't anybody mentioned JFS?

      Since the demise of ReiserFS, that's what I've been using everywhere. It's fast, really stable and has the lowest CPU usage of all. So, why not JFS?

  • ext3 (Score:5, Insightful)

    by Anonymous Coward on Thursday November 29, 2012 @06:33PM (#42136141)

    if you have to ask you should stick with ext3

    • by SuperQ ( 431 ) *

      +1 to this.

      Unless you have a business case where you know you need something different, stick to what's simple and what works.

      ext4 is also a nice option over ext3. It uses extents instead of bitmap block allocation, which improves metadata efficiency with no downside.

    • by antdude ( 79039 )

      Why not ext4?

    • Re:ext3 (Score:4, Funny)

      by Admiral Llama ( 2826 ) on Friday November 30, 2012 @01:08AM (#42139031)

      I don't quite trust ext4 for writes.

      app: Hey, can you write this data out to
      ext4: DONE!
      app: Uhh, that wasn't long enough to actually write the data.
      ext4: Sure it was, I'm super faGRRRRRRRRRRRRRst at writing too.
      app: wait, did you just cache that write and report it written but then not actually write it to disk until 30 seconds later?
      ext4: Yeah, what about it?

      That being said, use ext3 and mount it with the noatime flag. If you're on a web server you don't want to be hammering it with writes to update the last access time. That's just silly.

      • ext4 and almost all other filesystems, in chorus: You want a guarantee that the data you just wrote is permanently stored even in case of a power failure? Use fsync() you lazy bastard.
        • fsync() is waaaay too slow. You could have at least recommended fdatasync(), which is less slow. Or even better: opening files with the O_SYNC/O_DSYNC flag.

          The experimental nature of Linux IO subsystem, its unpredictability, is one of the reasons why some actually pick *BSD instead. OK, disk IO is slower than that of Linux, but at least one has sensible IO guarantees: data are written probably not right away, but without any great delay. (The only major problem of *BSD is the lack of drivers for the sto
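The fsync()/fdatasync() exchange above can be made concrete with a minimal sketch, assuming a POSIX system and Python's `os` module. The helper name `durable_write` is hypothetical, not something from the thread. It returns only after the kernel reports the file's contents flushed, and then syncs the parent directory so the new directory entry is flushed as well.

```python
import os


def durable_write(path, data, data_only=False):
    """Write bytes to path and return only after the kernel reports them flushed.

    data_only=True uses fdatasync(), which may skip flushing metadata that is
    not needed to read the data back; it falls back to fsync() where the
    platform's os module does not provide fdatasync().
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        use_fdatasync = data_only and hasattr(os, "fdatasync")
        sync = os.fdatasync if use_fdatasync else os.fsync
        sync(fd)
    finally:
        os.close(fd)

    # Flush the parent directory too, so the directory entry for a newly
    # created file is also on stable storage (assumes a POSIX system where
    # a directory can be opened read-only and fsync'ed).
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Opening the file with `os.O_SYNC` or `os.O_DSYNC` instead, as the reply suggests, moves the same cost into every `write()` call rather than one explicit flush at the end.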

    • At this point, you can change that to "If you have to ask, use ext4". It's been around long enough at this point that it's ready for production use (and has been for a year or two). Especially if you have situations of multi-gigabyte files that take a long time to delete under ext3, or you want the faster fsck of ext4.

      I plan on waiting until at least late next year before I'd test btrfs for production. Let others be the pioneers in that, because ext4 handles our workload just fine.
      • The main point for me to use ext4 over ext3 is that ext3 has broken fsync() behaviour. If you fsync() a single file descriptor on ext3 it will flush the whole filesystem buffer instead of just the dirty blocks of that file descriptor. Terrible for write concurrency, especially with databases.
    • I am running a CentOS 6.3 box with ext4, and it works well for me; I have no issues with the file system. ext4 has been available since Linux kernel 2.6.19. It supports huge individual file sizes and overall file system sizes: the maximum individual file size can be from 16 GB to 16 TB, and the overall maximum ext4 file system size is 1 EB (exabyte), where 1 EB = 1024 PB (petabyte) and 1 PB = 1024 TB (terabyte). A directory can contain a maximum of 64,000 subdirectories (as opposed to 32,000 in ext3). You can also mount an existing ext3 fs a
  • The best server? (Score:3, Insightful)

    by Anonymous Coward on Thursday November 29, 2012 @06:35PM (#42136167)

    The best file system would be one not running: mail, database, web hosting, and CPanel.

  • by millwoodtwo ( 517215 ) on Thursday November 29, 2012 @06:37PM (#42136179) Homepage

    The obvious argument for ext4, the current ext version, is that it's been around a long time and is very solid. I'd only use something else if I knew the performance of ext4 would be an issue.

  • by Anonymous Coward on Thursday November 29, 2012 @06:38PM (#42136199)

    It will kill your innocent files to save some space....

  • by MindCheese ( 592005 ) on Thursday November 29, 2012 @06:39PM (#42136215) Homepage
    The inefficiencies and handicaps introduced by that bloated turd of a platform will far outweigh the sub-percentage point gains you might see from using ReiserFS or any other alternative filesystem.
  • by mikeken ( 907710 ) on Thursday November 29, 2012 @06:41PM (#42136237)
    Typically, that is the default file system. That is how you will get the best support when there is an issue. It will also be the most stable with your OS because the developers focus on that FS. So personally, I would use whatever is the default FS for whatever OS you decide to use. To get off topic a bit, IMHO that OS should be Debian because it is just too awesome and Debian-based OSes have the largest community. Also, it should be running on Linode.com ;)
  • WinFS (Score:3, Funny)

    by jfdavis668 ( 1414919 ) on Thursday November 29, 2012 @06:43PM (#42136273)
    It will be released someday
  • by Anonymous Coward

    Especially if you decide to use an SSD. Even if there's not a lot of data writing going on, the constant rewriting of the directory entries to update the last accessed time stamp would wear an SSD and slow a regular hard drive.

    • by Lehk228 ( 705449 )
      it won't significantly wear any modern SSD, but shutting it off will save you wasted I/O for a function that is not terribly important for a web server, especially a web server that keeps logs storing far more complete information than most recent
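To check whether a filesystem is actually mounted with noatime, as recommended above, here is a minimal sketch that parses text in the format Linux exposes at /proc/mounts. The function names and the sample mount table are illustrative assumptions, not taken from the thread.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point.

    mounts_text uses the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    options = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point shadows earlier ones.
            options = set(fields[3].split(","))
    return options


def has_noatime(mounts_text, mount_point="/"):
    """True if mount_point is mounted with the noatime option."""
    return "noatime" in mount_options(mounts_text, mount_point)


# Illustrative mount table, not from a real server.
SAMPLE = """\
/dev/sda1 / ext4 rw,noatime,data=ordered 0 0
/dev/sdb1 /var/mail xfs rw,relatime,attr2 0 0
"""
```

On a Linux box the same check could be run against the live table with `has_noatime(open("/proc/mounts").read(), "/")`.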
  • From memory (I've been out of that business for 6 months) CPanel stores mail as maildirs. If you have gazillions of small files (that's a lot of email) then XFS handles it a lot better than ext3 - I've never benchmarked XFS against ext4. Back in the day, it also dealt with quotas more efficiently than ext2/3, but I really doubt that is a problem nowadays.

    If you aren't handling gazillions of files, I'd be tempted to stick to ext3 or ext4 - just because it's more common and well known, not because it is neces

    • From memory (I've been out of that business for 6 months) CPanel stores mail as maildirs. If you have gazillions of small files (that's a lot of email) then XFS handles it a lot better than ext3 - I've never benchmarked XFS against ext4. Back in the day, it also dealt with quotas more efficiently than ext2/3, but I really doubt that is a problem nowadays.

      If you aren't handling gazillions of files, I'd be tempted to stick to ext3 or ext4 - just because it's more common and well known, not because it is necessarily the most efficient. When your server goes down, you'll quickly find advice on how to restore ext3 filesystems because gazillions of people have done it before. You will find less info about xfs (although it may be higher quality), just because it isn't as common.

      XFS is probably better for large maildirs, but ext3 in recent kernels has much better performance on large directories starting in the late 2.6 kernels. It doesn't provide for infinite # of files per directory, but it doesn't take a huge hit listing e.g. 4k files in a directory anymore.

      • by ivoras ( 455934 )

        it doesn't take a huge hit listing e.g. 4k files in a directory anymore.

        Umm, maildirs store each message in its own file. I clean up (archive) emails from each past year in a separate folder and still easily have 8k files in each... and that is not my busiest mailbox.

        After a few thousand items of anything, the proper tool for the job is a database, not a file system. Though a file system can be described as a kind of database, and in any case there are problems common to both, such as fragmentation, a specialized data storage always beats generic ones. Personally, I like what Dovecot
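Before choosing a filesystem on the strength of the large-directory argument above, it may help to measure how big the maildirs really are. The sketch below is one hypothetical way to do that in Python; the function name and the 4000-entry default threshold (echoing the "4k files" figure in the thread) are assumptions.

```python
import os


def large_directories(root, threshold=4000):
    """Yield (path, entry_count) for each directory under root that directly
    contains more than threshold entries (files plus subdirectories)."""
    for dirpath, dirnames, filenames in os.walk(root):
        count = len(dirnames) + len(filenames)
        if count > threshold:
            yield dirpath, count
```

For example, `sorted(large_directories("/home"), key=lambda item: -item[1])` would list the heaviest directories first; the path is only an example.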

  • by 93 Escort Wagon ( 326346 ) on Thursday November 29, 2012 @06:47PM (#42136321)

    You're not going to be there forever, and all using a non-standard filesystem is going to accomplish is to cause headaches down the road for whoever is unfortunate enough to follow you. Use whatever comes with the OS you've decided to run - that'll make it a lot more likely the server will be kept patched and up to date.

    Trust me - I've been the person who's had to follow a guy that decided he was going to do the sort of thing you're considering. Not just with filesystems - kernels too. It was quite annoying to run across grsec kernels that were two years out of date on some of our servers, because apparently he got bored with having to constantly do manual updates on the servers and so just stopped doing it...

  • by axehind ( 518047 )
    If you need large filesystems, then go with XFS. RHEL only supports up to 16TB filesystems with ext4 and up to 100TB with XFS. I'm not sure at this point where the limitation comes from, as it is limited even with X86-64.
  • by dokebi ( 624663 ) on Thursday November 29, 2012 @06:52PM (#42136367)

    Unless you want the special features of other file systems (say ZFS), the default (ext3 or ext4) should be fine. They are capable of handling high I/O loads.

    If you want even more I/O performance, then use SSDs.

  • by glassware ( 195317 ) on Thursday November 29, 2012 @06:54PM (#42136389) Homepage Journal

    This isn't 1999. You have no reason to host your web server, email server, and database server on the same operating system.

    You would be well advised to run your web server on one machine, your email server on another machine, and your database server on a third machine. In fact, this is pretty much mandatory. Many standards, such as PCI compliance, require that you separate all of your units.

    Take advantage of the technology that has been created over the past 15 years and use a virtualized server environment. Run CentOS with Apache on one instance - and nothing else. Keep it completely pure, clean, and separate from all other features. Do not EVER be tempted to install any software on any server that is not directly required by its primary function.

    Keep the database server similarly clean. Keep the email server similarly clean. Six months from now, when the email server dies, and you have to take the server offline to recover things, or when you decide to test an upgrade, you will suddenly be glad that you can tinker with your email server as much as you want without harming your web server.

    • Re: (Score:2, Insightful)

      by Anonymous Coward

      After having worked for companies that do both, I honestly disagree. If you host your dbs and web servers on different machines, you wind up with a really heavy latency bottleneck which makes LAMP applications load even slower. It doesn't really make a difference in the "how many users can I fit into a machine" category. CPanel in particular is a very one-machine-centric piece of software; while you could link it to a remote database, it's really a better idea to put everything on one machine.

      • Thank you for the polite response. I did get a bit carried away in my post, so allow me to clarify.

        The basic principle I'm approaching here is that you should design your environment for simplicity of maintenance. Keeping your machines separate makes maintenance easier, it makes disaster recovery easier, it makes documentation easier, it makes upgrades easier, and it makes downgrades easier. The gains just keep on going.

        When I managed hundreds of separate machines - or even when I manage only three or fo

    • by leandrod ( 17766 )

      use a virtualized server environment

      And ðere goes I/O thru ðe drain.

    • by Hatta ( 162192 )

      Run CentOS with Apache on one instance - and nothing else. Keep it completely pure, clean, and separate from all other features. Do not EVER be tempted to install any software on any server that is not directly required by its primary function.

      Why is this required? Shouldn't we expect our operating systems to multitask?

      • by timeOday ( 582209 ) on Thursday November 29, 2012 @08:11PM (#42137221)
        It's a shame, isn't it? We have all these layers and layers of security (such as user separation, private memory address space for processes, java virtual machine...) which we do not trust and are therefore essentially nothing but configuration and performance cruft. If we're really just running one application on each (virtual) machine, that machine might as well be running DOS.
      • Why is this required? Shouldn't we expect our operating systems to multitask?

        We should expect our servers to be secure. But they're buggy.
        We should expect defense in depth to be unnecessary. But people screw up.
        We should expect OS tunables to be variable on a per-process basis, but they're not (with Linux anyhow).

  • If you are concerned about performance and expect constant email stream you should host mail, database and web-servers on separate computers. There is a reason any reputable host does it this way. Plus increased load on one component doesn't affect others.

    I think file picking system should be the least of your worries.

    • * I think picking file system should be the least of your worries.

      • by gagol ( 583737 )
        I use an old netbook for movie watching while trying to sleep (not a pervert, only insomniac). I let the computer die by itself many times a week and so far XFS never gave up on me. Never had to fsck at boot, never complained, so far no FS corruption I can see. I would recommend XFS based on its reliability, and I trust its stability, being almost 2 decades old. Bonus point, it performs beautifully and does not require gigabytes of memory just to run. I mostly pick FS based on reliability, and disks based o
  • Based on the topology you have described, the last thing you need to worry about is what file system to choose, since you have decided to host ALL tasks on a single server. If performance was an issue, you would separate them all to dedicated "farms", and if security is a factor (which it should be), none of them would be in the DMZ; only your proxy(s) would live there.
  • by e065c8515d206cb0e190 ( 1785896 ) on Thursday November 29, 2012 @07:04PM (#42136513)
    Whether your focus is on performance, reliability or both, you have other areas that require much more attention than the FS.
  • Go old school (Score:4, Informative)

    by RedLeg ( 22564 ) on Thursday November 29, 2012 @07:12PM (#42136629) Journal

    What do you fine folks think?"

    I think you're not a very well trained sysadmin.

    There is no reason to not have various parts of the filesystem mounted from different disks or partitions on the same disk. If you do this, you can run part of the system on one filesystem, other parts on others as appropriate for their intended usage. This is commonly done on large servers for performance reasons, quite like the one you are asking about. It's also why SCSI ruled in the server world for so long since it made it easy to have multiple discs in a system.

    So run most of your system on something stable, reliable and with good read performance, and the portions that are going to take a read/write beating on a separate partition/disc with the filesystem which has better read or write, whichever is needed, performance. If you segregate your filesystem like this correctly, an added benefit is that you can mount security critical portions of the filesystem readonly, making it more difficult for an attacker.

    Red

    • Actually, there is a reason not to have different apps using different filesystems in partitions on one disk. If those apps just use subdirectories within one filesystem, that filesystem can do a pretty good job of linearizing I/O across them all, minimizing head motion (XFS is especially good at this). If those apps use separate partitions, you'll thrash the disk head mercilessly between them if more than one is busy. Your advice is good in the multiple-disk case, but terrible in the single-disk case, a

  • by Anonymous Coward on Thursday November 29, 2012 @07:23PM (#42136757)

    Contrary to the majority of the people replying to this post, I emphatically DO NOT recommend ext3. ext3 by default wants to fsck every 60 or 90 days; you can disable this, but if you forget to, in a hosting environment it can be pure hell if one of your servers reboots. Usually shared hosting web servers are not redundant, for cost reasons; if one of your shared hosting boxes reboots you thus get to enjoy up to an hour of customers on the phone screaming at you while the fsck completes
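The periodic check described above is governed by two settings in the ext2/ext3/ext4 superblock, which `tune2fs -l <device>` prints as `Maximum mount count` and `Check interval`; `tune2fs -c 0 -i 0 <device>` disables both. As a rough sketch, the hypothetical helper below parses saved `tune2fs -l` output and reports whether either trigger is still active. The sample values are illustrative, not taken from a real server.

```python
def periodic_fsck_enabled(tune2fs_output):
    """Return True if the output of `tune2fs -l` shows a mount-count or
    time-based fsck trigger that is still active."""
    fields = {}
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # A maximum mount count of 0 or -1 means mount counting is ignored.
    max_mounts = int(fields.get("Maximum mount count", "-1"))
    # The interval is printed in seconds, followed by a human-readable
    # form in parentheses, e.g. "15552000 (6 months)" or "0 (<none>)".
    interval = int(fields.get("Check interval", "0").split()[0])
    return max_mounts > 0 or interval > 0


# Illustrative excerpts of `tune2fs -l` output (assumed values).
ENABLED = """\
Mount count:              12
Maximum mount count:      36
Check interval:           15552000 (6 months)
"""
DISABLED = """\
Mount count:              12
Maximum mount count:      -1
Check interval:           0 (<none>)
"""
```

Running this over every ext filesystem on a fleet before a maintenance reboot is one way to avoid the surprise hour-long fsck the comment describes.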

    XFS is a very good filesystem for hosting operations. It has superior performance to ext3, which really helps, as it means your XFS-running server can host more websites and respond to higher volumes of requests than an ext3-running equivalent. It also has a feature called Project Quotas, which allows you to define quotas not linked to a specific user or group account; this can be extremely useful for hosting environments, both for single-homed customers and for multi-homed systems where individual customer websites are not tied to UNIX user accounts. The oft-circulated myth that XFS is prone to data loss is just that; there was a bug in its very early Linux releases that was fixed ages ago, and now it's no worse than ext4 in this respect.

    Ext4 is also a good option, and a better option than ext3; it is faster and more modern than ext3 and is being more actively developed. Ext4 is also more widely used than XFS, and is less likely to get you into trouble in the unlikely event that you get bit by an unusual bug with either filesystem.

    Btrfs will be a great option when it is officially declared stable, but that hasn't happened yet. The main advantages for btrfs will be for hosting virtual machines and VPSes, as Btrfs's excellent copy on write capabilities will facilitate rapid cloning of VMs.

    This is already a reality in the world of FreeBSD, Solaris and the various Illumos/OpenSolaris clones, thanks to ZFS. ZFS is stable and reliable, and if you are on a platform that features it, you should avail yourself of it. I would advise you steer clear of ZFS on Linux.

    Finally, for clustered applications, i.e. if you want to buck the trend and implement a high availability system with multiple redundant webservers, the only Linux clustering filesystem I've found to be worth the trouble is Oracle's open source OCFS2 filesystem (avoid OCFS1; it's deprecated and non-POSIX compliant). OCFS2 lets you have multiple Linux boxes share the same filesystem; if one of them goes offline, the others still have access to it. You can easily implement a redundant iSCSI backend for it using mpio. It's somewhat easier to do this than to set up a high availability NFS cluster, without buying a proprietary filer such as a NetApp.

    Reiserfs was at one time popular for mail servers, in particular for maildirs, due to its competence at handling large numbers of small files and small I/O transactions, but in the wake of Hans Reiser's murder conviction, it is no longer being actively developed and should be avoided. JFS likewise is a very good filesystem, on a par with ext4 in terms of featureset, but for various reasons the Linux version of it has failed to become popular, and you should avoid it on a hosting box for that reason (unless your box is running AIX).

    Speaking of older proprietary UNIX systems; on these you should have no qualms about using the standard UFS, which is a tried and true filesystem analogous to ext2 in terms of functionality. This is the standard on OpenBSD. NetBSD features a variant with journaling called WAPBL, developed by the now defunct Wasabi Systems. DragonFlyBSD features an innovative clustering FS called HammerFS, which has received some favorable reviews, but I haven't seen anyone using that platform in hosting yet. The main headache with hosting is the extreme cruelty you will experience in response to downtime, even when that downtime is short, scheduled or inevitable. Thus, it pays to avoid using unconventional systems that customers will use as a vector for claiming incomp

    • by CAIMLAS ( 41445 )

      While I agree with what you say, mostly, I've got contention with a couple key points.

      Btrfs will be a great option when it is officially declared stable, but that hasn't happened yet.

      On the contrary, btrfs will not be a good option 'when it's officially declared stable'. It'll be a good option when it's vetted as stable without too much regressive or destructive behavior, in the wild. Until then, it's still immature and best suited for closed environments.

      The main advantages for btrfs will be for hosting virtual machines and VPSes, as Btrfs's excellent copy on write capabilities will facilitate rapid cloning of VMs.

      This is already a reality in the world of FreeBSD, Solaris and the various Illumos/OpenSolaris clones, thanks to ZFS. ZFS is stable and reliable, and if you are on a platform that features it, you should avail yourself of it.

      I agree, but a word of caution... FreeBSD lacks the necessary stable storage controller support to make ZFS fully stable on FreeBSD on all but a hand

  • ... in the year 2012, people are seriously suggesting others use filesystems that can (and eventually will) lose data on an unclean shutdown. C'mon people, this isn't stone age anymore.

  • Go for PostgreSQL-backed services whenever feasible. For example, ðere is a quite competent IMAP server called Archiveopteryx, you can run Mediawiki on PostgreSQL, as well as Zope and whatnot.

  • by hendersj ( 720767 ) on Thursday November 29, 2012 @08:42PM (#42137505)

    I spent some time late last year and earlier this year working very closely with the developers of BetterLinux, and in the work I did, I did stress testing (on a limited scale) to see how the product performed. It has some OSS components and some closed-source components, but the I/O leveling they do is pretty amazing.

    http://www.betterlinux.com/ [betterlinux.com]

  • by jonadab ( 583620 ) on Thursday November 29, 2012 @08:50PM (#42137565) Homepage Journal
    There are arguments to be made in favor of FAT16 or even FAT32, but I think I'd go with FAT12, just because it's simpler. You don't need LFNs for web hosting, do you?
  • Google Apps (Score:2, Insightful)

    by nrozema ( 317031 )

    People still run their own email servers?

    • by sdw ( 6809 )

      Yes, I have been running my own mail / web server since 1992. As soon as something is more reliable than that, I might consider switching to it. ;-)

      My email archive is about 30GB last I checked. Fully backed up. Very fast to search.

      Maildirs are dumb. Imap to mbox folders are the way to go. I roll them over at 200MB. With Thunderbird caching and a good Imap server indexing, it is faster than any available email service.

      Of course Thunderbird is great with Gmail, AOL, and Outlook.com too.

  • I'm planning to race a Yugo kitted out with cast iron spoilers and wooden tires.
    Which type of decals will make me go fastest?

    Ontopic; the choice of filesystem will have far less impact than the choice of programming language, database, webserver application and how you use those. The choice to go with CPanel (or any *Panel) means the impact of the filesystem will be unnoticeable. Nothing wrong with those panels; they drive down human cost, but if you need the absolute best performance, panels won't let you g

  • I used JFS on all my machines from around 2007-2011, including laptops. I had many unclean shutdowns (especially on laptops) and JFS rarely had any problems, except that one time briefly in 2009 where I did actually lose a bunch of data, but then so did my ext4 reinstall a few weeks later (bad hardware).

    JFS was much, much better than ext3. Especially in low-CPU situations/hardware.

    I can't remember why I went back to ext4, I guess I wanted to see if it still sucked compared to JFS. With noatime I decided I c

  • Go with tmpfs. It has the highest performance of any of the "standard kernel" filesystems, and if you use it for your personal webserver/blogserver/mailserver/etc, it will never lose any valuable data if the server reboots unexpectedly.

    --Joe
