Ask Slashdot: Best *nix Distro For a Dynamic File Server?

Posted by timothy
from the when-birdwatching-goes-too-far dept.
An anonymous reader (citing "silly workplace security policies") writes "I'm in charge of developing for my workplace a particular sort of 'dynamic' file server for handling scientific data. We have all the hardware in place, but can't figure out what *nix distro would work best. Can the great minds at Slashdot pool their resources and divine an answer? Some background: We have sensor units scattered across a couple square miles of undeveloped land, which each collect ~500 gigs of data per 24h. When these drives come back from the field each day, they'll be plugged into a server featuring a dozen removable drive sleds. We need to present the contents of these drives as one unified tree (shared out via Samba), and the best way to go about that appears to be a unioning file system. There's also a requirement that the server has to boot in 30 seconds or less off a mechanical hard drive. We've been looking around, but are having trouble finding info for this seemingly simple situation. Can we get FreeNAS to do this? Do we try Greyhole? Is there a distro that can run unionfs/aufs/mhddfs out-of-the-box without messing with manual recompiling? Why is documentation for *nix always so bad?"
This discussion has been archived. No new comments can be posted.

  • Wow (Score:5, Insightful)

    by Anonymous Coward on Saturday August 25, 2012 @01:29PM (#41123315)

    I know I’m not going to be the first person to ask this, but if I understand it the plan here was:

    1 - buy lots of hardware and install
    2 - think about what kind of software it will run and how it will be used

    I think you got your methodology swapped around man!

    Why is documentation for *nix always so bad?

    You are looking for information that your average user won’t care about. Things like boot time don’t get documented because your average user isn’t going to have some arbitrary requirement to have their _file server_ boot in 30 seconds. That’s a very weird use case. Normally you reboot a file server infrequently (unless you want to be swapping disks out constantly..). I’m assuming this requirement is because you plan on doing a full shutdown to insert your drives... in which case you really should be looking into hotswap

    Also mandatory: you sound horribly underqualified for the job you are doing. Fess up before you waste even more (I assume grant) money and bring in someone that knows what the hell they are doing.

  • by Anonymous Coward on Saturday August 25, 2012 @01:30PM (#41123329)
    Why does it have to be a mechanical hard drive? Why not use an SSD for the boot drive?
  • by Anrego (830717) * on Saturday August 25, 2012 @01:37PM (#41123377)

    I have to assume they are using some clunky Windows analysis program that lacks the ability to accept multiple directories or something.

    Either way, the aufs (or whatever they use) bit seems to be the least of their worries. They bought and installed a bunch of gear and are just now looking into what to do with it, and they've decided they want it to boot in 30 seconds (protip: high end gear can take this long just doing its self checks, which is a good thing! Fast booting and file server don't go well together).

    Probably a summer student or the office "tech guy" running things. They'd be better off bringing in someone qualified.

  • Re:Wow (Score:4, Insightful)

    by LodCrappo (705968) on Saturday August 25, 2012 @01:40PM (#41123405) Homepage

    I know I’m not going to be the first person to ask this, but if I understand it the plan here was:

    1 - buy lots of hardware and install
    2 - think about what kind of software it will run and how it will be used

    I think you got your methodology swapped around man!

    Why is documentation for *nix always so bad?

    You are looking for information that your average user won’t care about. Things like boot time don’t get documented because your average user isn’t going to have some arbitrary requirement to have their _file server_ boot in 30 seconds. That’s a very weird use case. Normally you reboot a file server infrequently (unless you want to be swapping disks out constantly..). I’m assuming this requirement is because you plan on doing a full shutdown to insert your drives... in which case you really should be looking into hotswap

    Also mandatory: you sound horribly underqualified for the job you are doing. Fess up before you waste even more (I assume grant) money and bring in someone that knows what the hell they are doing.

    Wow.. I completely agree with an AC.

    The OP here is in way over his head and the entire project seems to have been planned by idiots.

    This will end badly.

  • Re:Wow (Score:0, Insightful)

    by Anonymous Coward on Saturday August 25, 2012 @01:44PM (#41123433)

    Agreed.

    Also: the submitter asks about a "distro". A distro is a pre-packaged solution for a broad group of users. He has to build and test his own solution.

    If you know what you are doing, it does not matter which distro you are using.

    To the boss of the submitter: fire him and hire somebody who has a clue.

  • Here we go again (Score:3, Insightful)

    by Anonymous Coward on Saturday August 25, 2012 @01:45PM (#41123441)

    Another "I don't know how to do my job, but will slag off OSS knowing someone will tell me what to do. Then I can claim to be l337 at work by pretending to know how to do my job".

    It's called reverse psychology, don't fall for it! Maybe shitdot will go back to its roots if no one comments on junk like this and the slashvertisements?

  • by NemoinSpace (1118137) on Saturday August 25, 2012 @01:45PM (#41123445) Homepage Journal
    • Enterprise-ready: Greyhole targets home users.

    Not sure why the 30s boot up requirement is there, so it depends on what you define as "booted". Spinning up 12 hard drives and making them available through Samba within 30s guarantees your costs will be 10x more than they need to be.
    This isn't another example of my tax dollars at work is it?

  • by wytcld (179112) on Saturday August 25, 2012 @01:51PM (#41123491) Homepage

    "Enterprise class" is a marketing slogan. In the real world, all the RH derivatives are pretty good (including Scientific Linux and Fedora as well as CentOS), and all the Debian derivatives are pretty good (including Ubuntu). Gentoo's solid too. "Enterprise class" doesn't mean much. The main thing that distinguishes CentOS from Scientific Linux - which is also just a recompile of the RHEL code - is that the CentOS devs have "enterprise class" attitude. Meanwhile, RH's own devs are universally decent, humble people. Those who do less often think more of themselves.

    For a great many uses, Debian's going to be easiest. But it depends on just what you need to run on it, as different distros do better with different packages, short of compiling from source yourself. No idea what the best solution is for the task here, but "CentOS" isn't by itself much of an answer.

  • by Anonymous Coward on Saturday August 25, 2012 @02:14PM (#41123645)

    500G in a 24h period sounds like it will be highly compressible data. I would recommend FreeBSD or Ubuntu with ZFS Native Stable installed. ZFS will allow you to create a very nice tree with each folder set to a custom compression level if necessary. (Don't use dedup.) You can put one SSD in as a cache drive to accelerate the shared folders' speed. I imagine there would be an issue with restoring the data to magnetic while people are trying to read off the SMB share. An SSD cache or SSD ZIL drive for ZFS can help a lot with that.

    Some nagging questions though.
    How long are you intending on storing this data? How many sensors are collecting data? Because even with 12 drive bay slots, assuming cheap SATA drives of 3TB apiece (36TB total storage with no redundancy), let's say 5 sensors, that's 2.5TB a day of data collection, and assuming good compression of 3x, 833GB a day. You will fill up that storage in just 43 days.
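
    For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic. The bay count, drive size, sensor count and compression ratio are all guesses from this comment, not figures from the submitter, and `days_until_full` is just a made-up helper name.

```python
# Back-of-the-envelope fill time, using the guessed figures from this comment.
BAYS = 12                # removable drive sleds in the server
DRIVE_TB = 3.0           # assumed cheap SATA drive size
SENSORS = 5              # assumed; the submitter never says how many
GB_PER_SENSOR_DAY = 500  # from the submission
COMPRESSION = 3.0        # assumed "good" compression ratio


def days_until_full(bays, drive_tb, sensors, gb_per_sensor_day, compression):
    """Days until the array is full, with no redundancy and no deletions."""
    capacity_gb = bays * drive_tb * 1000
    stored_per_day_gb = sensors * gb_per_sensor_day / compression
    return capacity_gb / stored_per_day_gb


if __name__ == "__main__":
    days = days_until_full(BAYS, DRIVE_TB, SENSORS, GB_PER_SENSOR_DAY, COMPRESSION)
    print(f"Array full after about {days:.1f} days")
```

    Doubling the sensor count or dropping the compression assumption shortens that window proportionally, which is the whole point of asking the questions above.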

    I think this project needs to be re-thought. Either you need a much bigger storage array, or data needs to be discarded very quickly. If the data will be discarded quickly, then you really need to think about more disk arrays so you can use ZFS to partition the data in such a way that each SMB share can be on its own set of drives so as to not head thrash and interfere with someone else who is "discarding" or reading data.

  • by Anonymous Coward on Saturday August 25, 2012 @02:14PM (#41123649)

    "Enterprise class" means that it runs the multi-million dollar crappy closed source software you bought to run on it without the vendor bugging out when you submit a support ticket.

  • by DamnStupidElf (649844) <Fingolfin@linuxmail.org> on Saturday August 25, 2012 @02:21PM (#41123695)
    Unless you're talking about millions of individual files on each drive it should be relatively quick to mount each hard drive and set up symbolic links in one shared directory to the files on each of the mounted drives. Just make sure Samba has "follow symlinks" set to yes and the Windows clients will just see normal files in the shared directory.
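
    A minimal sketch of the symlink approach, assuming the drives are already mounted somewhere like /mnt/sled0, /mnt/sled1 and so on (those paths and the function name `link_drive_contents` are made up for illustration). It links only the top-level entries of each drive and skips name collisions rather than guessing which copy should win.

```python
from pathlib import Path


def link_drive_contents(mount_points, shared_dir):
    """Symlink every top-level entry of each mounted drive into shared_dir.

    Returns a list of (name, drive) pairs that were skipped because an entry
    with the same name already existed in shared_dir.
    """
    shared = Path(shared_dir)
    shared.mkdir(parents=True, exist_ok=True)
    skipped = []
    for mount in map(Path, mount_points):
        for entry in sorted(mount.iterdir()):
            link = shared / entry.name
            # is_symlink() also catches dangling links left by a pulled drive.
            if link.exists() or link.is_symlink():
                skipped.append((entry.name, str(mount)))
                continue
            link.symlink_to(entry.resolve())
    return skipped
```

    You would call it as `link_drive_contents(["/mnt/sled0", "/mnt/sled1"], "/srv/share")` after the drives are mounted. Since the links point outside the shared directory, Samba may also need its "wide links" option enabled; check smb.conf(5) for your version.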
  • waaaay over head (Score:5, Insightful)

    by itzdandy (183397) <`moc.liamg' `ta' `nosnednad'> on Saturday August 25, 2012 @02:39PM (#41123807) Homepage

    What is the point of 30 second boot on a file server? If this is on the list of 'requirements', then the 'plan' is 1/4 baked. 1/2 baked for buying hardware without a plan, then 1/2 again for not having a clue.

    unioning filesystem? what is the use scenario? how about automounting the drives on hot-plug and sharing the /mnt directory?

    Now, 500GB/day in 12 drive sleds....so 6TB a day? do the workers get a fresh drive each day or is the data only available for a few hours before it gets sent back out or are they rotated? I suspect that mounting these drives for sharing really isn't what is necessary, more like pull contents to 'local' storage. Then, why talk about unioning at all, just put the contents of each drive in a separate folder.

    Is the data 100% new each day? Are you really storing 6TB a day from a sensor network? 120TB+ a month?

    Are you really transporting 500GB of data by hand to local storage and expecting the disks to last? reading or writing 500GB isn't a problem, but constant power cycling and then physically moving/shaking the drives around each day to transport is going to put the MTBF of these drives in months not years.

    dumb

  • by Knuckles (8964) <knuckles.dantian@org> on Saturday August 25, 2012 @02:39PM (#41123809)

    Saying "only good mp3 player" makes no sense unless you specify your criteria. Amarok, Banshee, VLC, Rhythmbox, or smplayer are all capable mp3 players by various criteria and easily found by googling for "linux mp3 player". If you use Ubuntu, searching for mp3 player in Software Center finds a plethora of good players. Googling "list of linux audio software" easily finds other things besides just mp3 players: maybe something like Audacity satisfies your requirements better. Searching for "mp3" on xmms2.org finds the answer in the first link - your xmms2 install needs to have the MAD library, maybe your distro does not install that.

    Does not seem like the problem is with bad docs.

  • Re:Wow (Score:4, Insightful)

    by plover (150551) * on Saturday August 25, 2012 @03:42PM (#41124229) Homepage Journal

    While I'm curious as to the application, it's his data rates that ultimately count, not our opinions of whether he's doing it right.

    500GB may sound like a lot to us, but the LHC spews something like that with every second of operation. They have a large cluster of machines whose job it is to pre-filter that data and only record the "interesting" collisions. Perhaps the OP would consider pre-filtering as much as possible before dumping it into this server as well. If this is for a limited 12 week research project, maybe they already have all the storage they need. Or maybe they are doing the filtering on the server before committing the data to long term storage. They just dump the 500GB of raw data into a landing zone on the server, filter it, and keep only the relevant few GB.

    Regarding mesh networking, they'd have to build a large custom network of expensive radios to carry that volume of data. Given the distances mentioned, it's not like they could build it out of 802.11 radios. Terrain might also be an issue, with mountains and valleys to contend with, and sensors placed near to access roads. That kind of expense would not make sense for a temporary installation.

    I don't think he's an idiot. I just think he couldn't give us enough details about what he's working on.

  • Not gonna happen. (Score:5, Insightful)

    by Anonymous Coward on Saturday August 25, 2012 @04:10PM (#41124453)

    You have to be able to identify the disks being mounted. Since these are hot swappable, they will not be automatically identifiable.

    Also note, not all disks spin up at the same speed. Disks made for desktops are not reliable either - though they tend to spin up faster. Server disks might take 5 seconds before they are failed. You also seem to have forgotten that even with all disks spun up, each must be read (one at a time) for them to be mounted.

    Hot swap disks are not something automatically mounted unless they are known ahead of time - which means they have to have suitable identification.

    UnionFS is not what you want. That isn't what it was designed for. Unionfs only has one drive that can be written to - the top one in the list. Operations on the other disks force it to copy it to the top disk for any modifications. Deletes don't happen to any but the top disk.
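
    To make the copy-up point concrete, here is a toy model in Python. It is not UnionFS code; `ToyUnion` is an invented name, branches are plain dicts, and deletions are recorded as whiteout markers on the writable side, which is a simplification of how union filesystems hide lower-branch files.

```python
class ToyUnion:
    """Toy model of a union mount: branches[0] is the only writable branch."""

    def __init__(self, *branches):
        self.branches = list(branches)  # each branch is a dict: path -> bytes
        self.whiteouts = set()          # deletions, recorded on the top branch

    def read(self, path):
        if path in self.whiteouts:
            raise FileNotFoundError(path)
        for branch in self.branches:    # the first branch holding path wins
            if path in branch:
                return branch[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        top = self.branches[0]
        if path not in top:
            # Copy-up: the whole file moves to the top branch before any change.
            top[path] = self.read(path)
        top[path] += data

    def delete(self, path):
        self.read(path)                 # raises if the file is not visible
        self.branches[0].pop(path, None)
        self.whiteouts.add(path)        # lower branches are never touched
```

    With `top = {}` and `sled = {"a.dat": b"raw"}`, calling `ToyUnion(top, sled).append("a.dat", b"+x")` leaves the sled untouched and puts a full modified copy in `top`. Scale that up to a 500GB capture file and you can see why the top disk becomes the bottleneck.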

    Some of what you describe is called an HSM (hierarchical storage management), and requires a multi-level archive where some volumes may be on line, others off line, yet others in between. Boots are NOT fast, mostly due to the need to validate the archive first.

    Back to the unreliability of things - if even one disk has a problem, your union filesystem will freeze - and not nicely either. The first access to a file that is inaccessible will cause a lock on the directory. That lock will lock all users out of that directory (they go into an infinite wait). Eventually, the locks accumulate to include the parent directory... which then locks all leaf directories under it. This propagates to the top level when the entire system freezes - along with all the clients. This freezing nature is one of the things that an HSM handles MUCH better. A detected media error causes the access to abort, and that releases the associated locks. If the union filesystem detects the error, then the entire filesystem goes down the tubes, not just one file on one disk.

    Another problem is going to be processing the data - I/O rates are not good going through a union filesystem yet. Even though UnionFS is pretty good at it, expect the I/O rate to be 10% to 20% less than maximum. Now client I/O has to go through a network connection, so that may make it bearable. But trying to process multiple 300 GB data sets in one day is not likely to happen.

    Another issue you have ignored is the original format of the data. You imply that the filesystem on the server will just "mount the disk" and use the filesystem as created/used by the sensor. This is not likely to happen - trying to do so invites multiple failures; it also means no users of the filesystem while it is getting mounted. You would do better to have a server disk farm that you copy the data to before processing. That way you get to handle the failures without affecting anyone that may be processing data, AND you don't have to stop everyone working just to reboot. You will also find that local copy rates will be more than double what the server's client systems can read anyway.

    As others have mentioned, using gluster file system to accumulate the data allows multiple systems to contribute to the global, uniform, filesystem - but it does not allow for plugging in/out disks with predefined formats. It has a very high data throughput though (due to the distributed nature of the filesystem), and would allow many systems to be copying data into the filesystem without interference.

    As for experience - I've managed filesystems with up to about 400TB in the past. Errors are NOT fun as they can take several days to recover from.

  • Re:Wow (Score:2, Insightful)

    by Anonymous Coward on Saturday August 25, 2012 @07:02PM (#41125545)

    Seismic data.
    Radio spectrum noise level.
    Acoustic data.
    High frequency geomagnetic readings.
    Any of various types of environmental sensors.

    Any of the above, or combination thereof, would be pretty common in research projects, and could easily generate 500gb+ per day. And the only thing you thought of was photos. You're not a geek, you're some Facebook Generation fuckwit who knows jack shit about science. Go back to commenting on YouTube videos.
