Ask Slashdot: Best File System For Web Hosting?
An anonymous reader writes "I'm hoping for a discussion about the best file system for a web hosting server. The server would handle mail, database, and web hosting, running CPanel, likely on CentOS. I was thinking that most hosts use ext3, but with all of the reading/writing to log files, a constant flow of email in and out, not to mention all of the DB reads/writes, I'm wondering if there is a more effective FS. What do you fine folks think?"
ext3 (Score:5, Insightful)
If you have to ask, you should stick with ext3.
The best server? (Score:3, Insightful)
The best file system would be one not running: mail, database, web hosting, and CPanel.
ext4 unless there's a good reason not to. (Score:3, Insightful)
The obvious argument for ext4, the current ext version, is that it's been around a long time and is very solid. I'd only use something else if I knew the performance of ext4 would be an issue.
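For the log-heavy workload the question describes, a couple of mount options matter nearly as much as the filesystem choice itself. A minimal sketch, assuming the data lives on a hypothetical /dev/sdb1 mounted at /var/www:

```shell
# Format and label the data partition (device name is an assumption).
mkfs.ext4 -L webdata /dev/sdb1

# noatime skips the metadata write that normally happens on every read,
# which helps when Apache is serving lots of small static files.
# Equivalent /etc/fstab line:  LABEL=webdata  /var/www  ext4  defaults,noatime  0 2
mount -o noatime LABEL=webdata /var/www
```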
CPanel will be your problem (Score:5, Insightful)
Is it for work? Don't roll a custom solution (Score:5, Insightful)
You're not going to be there forever, and all a non-standard filesystem is going to accomplish is causing headaches down the road for whoever is unfortunate enough to follow you. Use whatever comes with the OS you've decided to run - that'll make it a lot more likely the server will be kept patched and up to date.
Trust me - I've been the person who's had to follow a guy that decided he was going to do the sort of thing you're considering. Not just with filesystems - kernels too. It was quite annoying to run across grsec kernels that were two years out of date on some of our servers, because apparently he got bored with having to constantly do manual updates on the servers and so just stopped doing it...
Re:Just (Score:3, Insightful)
Even with an SSD you still need a filesystem on it for it to be usable.
I'm all for ZFS; it's very reliable over long periods of time.
Re:Just (Score:5, Insightful)
Yeah - it's especially good for your log files; after all, an SSD is just like a big RAM drive...
In most cases you're going to be better off forgetting SSDs and going with lots more RAM. If you have enough RAM to cache all your static files, you have the best solution. If you're running a dynamic site that generates pages from a DB, and that DB is continually written to, then putting the DB on an SSD is generally going to kill its performance just as quickly as if you had put /var/log on it.
RAID arrays are the fastest: striping data across 2 drives basically doubles your access speed, so stripe across an array of 4. The disadvantage is that one drive failure kills all the data - so mirror the lot. Eight drives in a stripe+mirror (mirror each pair, then put the stripe across the pairs - not the other way round) will give you fabulous performance, without the worry that your SSD will start garbage collecting all the time once it starts to fill up.
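The mirror-each-pair-then-stripe layout described above is what Linux md calls RAID10. A minimal sketch with mdadm (the eight device names are assumptions):

```shell
# Build an 8-drive RAID10 array; md's raid10 level handles the
# mirror-then-stripe layout in one step (device names are assumptions).
mdadm --create /dev/md0 --level=10 --raid-devices=8 /dev/sd[b-i]

# Then put your filesystem of choice on top of it.
mkfs.ext4 -L webdata /dev/md0
```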
What is wrong with you? (Score:5, Insightful)
This isn't 1999. You have no reason to host your web server, email server, and database server on the same operating system.
You would be well advised to run your web server on one machine, your email server on another, and your database server on a third. In fact, in some environments this is pretty much mandatory: standards such as PCI compliance require that you separate these functions.
Take advantage of the technology that has been created over the past 15 years and use a virtualized server environment. Run CentOS with Apache on one instance - and nothing else. Keep it completely pure, clean, and separate from all other features. Do not EVER be tempted to install any software on any server that is not directly required by its primary function.
Keep the database server similarly clean. Keep the email server similarly clean. Six months from now, when the email server dies, and you have to take the server offline to recover things, or when you decide to test an upgrade, you will suddenly be glad that you can tinker with your email server as much as you want without harming your web server.
Use the standard and focus on what matters (Score:4, Insightful)
Re:Just (Score:5, Insightful)
Due to the amount of reads/writes and the limited life span of SSDs, they are some of the worst drives you can get for a high-availability web server.
Only if you're completely ignorant about the difference between consumer and enterprise SSDs. The official rated endurance of a 200GB Intel 710 with random 4K writes (the worst-case scenario) and no over-provisioning is 1.0 PB. To put that in perspective: you could write 100GB of data in 4K chunks to this drive every day for nearly 30 years before you approached even the official endurance rating.
If you use a consumer SSD in a high-load enterprise scenario, you're going to get bit. If you use an enterprise SSD in a high-load enterprise scenario, you'll have no problems whatsoever with endurance, regardless of what people spreading FUD like you would have you believe.
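A quick sanity check of the parent's 100GB/day figure: thirty years of daily writes lands right around the 1.0 PB Intel quotes for the 710.

```shell
# 100 GB written per day, every day, for 30 years:
total_gb=$((100 * 365 * 30))
echo "${total_gb} GB"   # 1095000 GB, i.e. roughly 1.1 PB
```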
Re:What is wrong with you? (Score:2, Insightful)
After having worked for companies that do both, I honestly disagree. If you host your DBs and web servers on different machines, you wind up with a really heavy latency bottleneck which makes LAMP applications load even slower. It doesn't really make a difference in the "how many users can I fit into a machine" category. CPanel in particular is a very one-machine-centric piece of software; while you could link it to a remote database, it's really a better idea to put everything on one machine.
Re:Just (Score:4, Insightful)
If only they had some kind of way of allowing drives to fail while still retaining data integrity. It's probably because I just dropped Acid, but I'd call the system RAID.
Google Apps (Score:2, Insightful)
People still run their own email servers?
RAID10 (Score:5, Insightful)
Yep, agreed... agonizing over the FS choice isn't going to provide many gains compared to spending time optimizing the physical disk configuration and partitioning.
FS performance is only going to really matter if you're going to have directories with thousands of nodes in them. But then hopefully you have better ways to prevent that from happening.
But you do want to spend a good deal of time benchmarking different RAID and partitioning setups, where you can see some gains in the 100-200% range rather than 5-10%, especially under concurrent loads. Spend some quality time with bonnie++ and making some pretty comparison graphs. Configure jmeter to run some load tests on different parts of your system, and then all together to see how well it deals with concurrent accesses. Figure out which processes you want to dedicate resources to, and which can be well-behaved and share with other processes. Set everything up in a way to make it easier to scale out to other servers when you're ready to grow.
The FS choice is probably the least interesting aspect of the system (until you start looking at clustered FSs, like OCFS2 or Lustre)
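A rough sketch of that kind of comparison run, assuming bonnie++ is installed and each candidate array is mounted at the path shown (use a test size around twice your RAM so the page cache doesn't hide the disks):

```shell
# Run the same bonnie++ pass against each candidate array and collect CSV.
# Mount points and the 'nobody' user are assumptions; -n 0 skips the
# small-file tests, -q emits machine-readable CSV for graphing later.
for mnt in /mnt/raid10 /mnt/raid5 /mnt/raid0; do
    bonnie++ -d "$mnt" -s 16g -n 0 -u nobody -m "$(basename "$mnt")" -q >> bonnie.csv
done
```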
Re:Just (Score:5, Insightful)
Intel rates the endurance of the 710 at 1.0 PB and the 330 at 60 TB, so yeah, there's a pretty big difference there.
In Intel's case, specifically, the difference is between using MLC flash and MLC-HET flash. The difference is largely from binning, but it's the difference between 3k to 5k p/e cycles on typical MLC, and 90k p/e cycles on MLC-HET. SLC produces similar improvements. I could explain how they achieve this, but Anandtech and Tom's Hardware have both done pretty good write-ups explaining the difference.
It depends entirely on your workload. If you've got an enterprise workload where you don't do many writes, then a consumer drive will work just fine. And since most drives report their current wear levels, it's actually pretty safe to use a consumer drive if you monitor that.
Anandtech gave one example, when they were short on capacity and were facing a delay in getting some new enterprise SSDs; they walked out to the store, bought a bunch of consumer Intel SSDs, and slapped those into their servers. They were facing a write-heavy workload, so they wouldn't have lasted long, but they only needed them for a few months and kept an eye on the media wear indicator values, so they were fine.
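Checking that wear indicator is a one-liner with smartmontools; the attribute name below is Intel-specific (other vendors use different names), and the device path is an assumption:

```shell
# Intel SSDs expose SMART attribute 233, Media_Wearout_Indicator:
# it starts at 100 and counts down toward 1 as rated endurance is consumed.
smartctl -A /dev/sda | grep -i wearout
```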
My point overall is that you can't look at SSDs the same way if you're a consumer versus an enterprise user, and if you're an enterprise user, you need to pick an SSD appropriate for your workload.
One thing people don't consider is upgrade cycles. Hanging on to an SSD for ten years doesn't really make sense, because it only takes a few years for them to be replaced by drives enormously cheaper, larger, and faster. They're improving by Moore's Law, unlike HDDs. I paid $700 for a 160GB Intel G1, and three years later, I paid $135 for a much faster 180GB Intel 330. If you're going to replace an SSD in three to five years, does it matter whether the rated lifespan is 10 years or 30?