Linux IT

Ask Slashdot: Linux Login and Resource Management In a Computer Lab?

Posted by timothy
from the explain-your-system dept.
New submitter rongten (756490) writes: I am managing a computer lab composed of various kinds of Linux workstations, from small desktops to powerful workstations with plenty of RAM and cores. The users' $HOME is NFS mounted, and they access the machines via console (no user switching allowed), ssh, or x2go. In the past, the powerful workstations were reserved for certain power users, but now even "regular" students may need access to high-memory machines for some tasks. Is there some kind of resource management that would allow the following? To forbid the same user from logging in graphically more than once (like UserLock); to limit the number of ssh sessions (i.e. no user using distcc and spamming the rest of the machines, or even worse, running in parallel); to give priority to the console user (i.e. automatically renicing remote users' jobs and restricting their memory usage); and to avoid swapping and waiting (i.e. when all the users try to log into the latest and greatest machine, limit the number of logins in proportion to the capacity of the machine). The system being put in place uses Fedora 20 and LDAP PAM authentication; it is Puppet-managed and NFS based. In the past I tried to achieve similar functionality via cron jobs, login scripts, ssh and nx management, and a queuing system, but it is not an elegant solution, and it is hacked a lot. Since I think these requirements should be pretty standard for a computer lab, I am surprised that I cannot find something already written for it. Do you know of a similar system, preferably open source? A commercial solution could be acceptable as well.
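
One way to read the last requirement (logins proportional to the capacity of the machine) is as a per-host session cap derived from core count and RAM. A minimal Python sketch follows; the function names and the per-user budgets (2 cores, 4 GiB) are illustrative assumptions, not anything taken from the submitter's setup.

```python
import os


def max_sessions(cores, ram_gib, cores_per_user=2, ram_gib_per_user=4):
    """Return a session cap limited by whichever resource runs out first.

    The per-user budgets are hypothetical defaults; tune them per lab.
    """
    by_cpu = cores // cores_per_user
    by_ram = ram_gib // ram_gib_per_user
    # Always allow at least one session so small desktops stay usable.
    return max(1, min(by_cpu, by_ram))


def local_capacity():
    """Read this host's core count and physical RAM in whole GiB (Linux)."""
    cores = os.cpu_count() or 1
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, ram_bytes // 2**30


if __name__ == "__main__":
    cores, ram_gib = local_capacity()
    print(max_sessions(cores, ram_gib))
```

The resulting number could plausibly be written by Puppet into a pam_limits `maxsyslogins` entry on each host, though whether that item behaves as wanted with x2go sessions would need testing.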

  • by lyapunov (241045) on Tuesday July 22, 2014 @12:44PM (#47509143)

    I would do it up A Clockwork Orange style.

    The original BOFH stories are a good guide: http://bofh.ntk.net/BOFH/ [ntk.net]

  • Try ulimit. It helps a lot keeping things under control.
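
For readers who have not used ulimit: the same per-process limits are reachable from Python's standard `resource` module. The sketch below starts a command under a soft CPU-time limit, roughly what `ulimit -S -t` does in a shell. The helper name `run_with_cpu_limit` is made up for illustration.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv with a soft CPU-time limit, like `ulimit -S -t` in a shell."""

    def apply_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        # A soft limit may not exceed the hard limit already in force.
        if hard == resource.RLIM_INFINITY:
            soft = cpu_seconds
        else:
            soft = min(cpu_seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (soft, hard))

    # preexec_fn runs in the child after fork and before exec, so the
    # limit applies to the launched command and not to this script.
    return subprocess.run(
        argv, preexec_fn=apply_limit, capture_output=True, text=True
    )


if __name__ == "__main__":
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU))"
    print(run_with_cpu_limit([sys.executable, "-c", probe], 30).stdout)
```

A process that exceeds the soft limit receives SIGXCPU, so a runaway job is stopped without affecting anyone else's processes.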
  • by Anonymous Coward

    I believe you can do at least some of that with systemd user sessions and resource restrictions
    http://0pointer.de/blog/projects/resources.html [0pointer.de]
    User sessions are currently kind of beta-ish but they're getting better / more useful... I already launch emacs and a MIDI synth through it on login, and it works wonderfully (ironically, though, PulseAudio, the other Lennart project that got a lot of flak, doesn't launch through this mechanism yet).

  • by Anonymous Coward on Tuesday July 22, 2014 @12:48PM (#47509181)

    Trust your users.

    • by Culture20 (968837)
      This was modded funny, but it *is* a classroom computer lab, not a government installation. At some point, you have to let them learn by stepping on each other's toes. Protect the students' files from the other students. Protect the systems' secrets from the students. Beyond that, just institute a written policy of "don't be a jerk: nice your background processes". If a student uses up too many resources, use it as a teachable moment. Chances are, the students aren't trying to be jerks. They'll lp bi
  • Good grief (Score:1, Redundant)

    by sunking2 (521698)
    Is this 1988? The easiest/cheapest solution is spend a couple bucks on decent machines.
    • by Nimey (114278)

      Around here some of the public schools just got rid of Pentium IIIs. Not everyone can afford something decent.

      • Old saying goes, "I can't afford to buy cheap crap."

        I have yet to see a computing environment where the demand of computing power significantly outstripped supply due to antiquated technology except where the network administrators were practically tenured. In those cases they were gobbling up so much in salary and blowing time to keep fixing stuff mostly due to age.

        The administrator even seems to hint that he is trying to fix problems that don't fully exist. "...and it is hacked a lot." Is one of thos
    • by dissy (172727)

      Is this 1988? The easiest/cheapest solution is spend a couple bucks on decent machines.

      Sweet, I've been needing an upgrade myself as well, but there seems to be a strange shortage of people who insist we spend more than a couple bucks on the problem and will also pay for the upgrade. I'm glad I found you!

      250 workstations upgraded to top tier is roughly $200000.00 or so. Better make it $250000 so we can get new LCDs too, these 10 year old 19" ones are getting a tiny bit of burn-in.

      Just go ahead and paypal it to me, and I'll get right on implementing your suggestion!

    • by cdrudge (68377)

      Even if all the machines were identical top of the line machines, many of the things that were listed as requirements would still apply.

      "Spend[ing] a couple bucks" isn't always fiscally possible in an education or non-profit environment, which the computing lab is likely a part of.

      Finally, given likely limited resources, it likely made a lot more sense to buy more lower end less expensive machines if they could adequately meet the needs of the majority of users while having just a couple of high end machines f

      • by rongten (756490)

        Exactly the last point.

        What I dislike the most are users that take advantage of others due to their lack of knowledge. And this is either done intentionally or unintentionally when rules are not enforced.

        I would like all the students (often coming in contact with linux, shell programming and clusters for the first time) to have a fair shot of using the available resources, and not to backstab each other.

        Before everyone could run on the cluster, until I discovered that certain students were g

  • Maybe they have an EDU license? http://www-03.ibm.com/systems/... [ibm.com]
    • by rongten (756490)

      Hi,

      another alternative might be sysfera-ds [sysfera.com], but their open source offering seems to lack documentation and features (see here [github.io]).

      Need to investigate. It seems to be something along the lines of what vizstack [sourceforge.net] could have done.

  • by Sycraft-fu (314770) on Tuesday July 22, 2014 @01:00PM (#47509279)

    Seems like you are trying to work out a solution to a problem you don't have yet. Maybe first see if users are just willing to play nice. Get a powerful system and let them have at it. That's what we do. I work for an engineering college and we have a fairly large Linux server that is for instructional use. Students can log in and run the provided programs. Our resource management? None, unless the system is getting hit hard, in which case we will see what is happening and maybe manually nice something or talk to a user. We basically never have to. People use it to do their assignments and go about their business.

    Hardware is fairly cheap, so you can throw a lot of power at the problem. Get a system with a decent amount of cores and RAM and you'll probably find out that it is fine.

    Now, if things become a repeated problem then sure, look at a technical solution. However don't go getting all draconian without a reason. You may just be wasting your time and resources.

    • by Charliemopps (1157495) on Tuesday July 22, 2014 @01:26PM (#47509461)

      We did it like you describe. We had some problems with people doing dumb stuff and we just stuck post-its on the monitors describing how to use the "top" command.

      [you@server1 ~]$ top
      PID USER %CPU COMMAND
      1960 you 2.3 top
      2457 Bob 97.0 bitcoin

      [you@server1 ~]$ write Bob DUDE! wtf?!?!

      etc...

    • by Anonymous Coward

      Seems like you are trying to work out a solution to a problem you don't have yet. Maybe first see if users are just willing to play nice.

      You'll also discover that once in a blue moon, users do have a legitimate reason to briefly consume many more resources than usual.

      Spikes happen. It's normal. Monitor the usage, but don't cap it until your problems are more than theoretical.

    • by MerlynEmrys67 (583469) on Tuesday July 22, 2014 @01:38PM (#47509547)
      This is hilarious. I was in college several decades ago: large computer labs and lots of SSH/X forwarding to do work. The only time I remember getting in "trouble" was when we were on a LISP module as freshmen. Their resource management only allowed a few LISP interpreters on the machine - otherwise it would deny them for resource reasons. I quickly got sick of typing $lisp and waiting for my session to actually start - so I created a shell script that ran an infinite loop asking for a lisp interpreter...
      15 minutes later, someone tapped on my shoulder and asked me what I was doing - I had taken up the machine's full processing capacity for a while. I showed my script - gasp, horror - and a 1 second pause was added to the script and I was good to go. Learned a lesson too.
      The year before I got there, enough people were learning how to hack the system to crash it that they were having trouble keeping the system up. Their solution: install a button next to each keyboard that, when pushed, would crash the system. No work was accomplished for a week - then it didn't go down again. We were told about the button; it was rough for a couple of days, and then the systems were rock solid.
      Kids will be kids - good kids will create a nightmare for you - work to focus that energy in a positive way and good things will result.
  • by Anonymous Coward on Tuesday July 22, 2014 @01:00PM (#47509285)

    Some of what you're asking for are ulimit settings - total number of processes, for example. That's pam_limits. Some could also be handled with pam_tally2. Or, since you're already using LDAP, you could use a simple web-based reservation system which specifies allowed login hosts in the LDAP server for however long someone wants to "check out" a machine; that's how I've done it when I've needed to control access to cluster resources.

    When you talk about controlling other resources beyond logins, it's generally better to handle it at the application level rather than the OS level if you can. But using ulimits (and again, this can be integrated into LDAP pretty easily), you can restrict resources and apply process priority (ionice and nice are your friend) based on membership in a specific group or another LDAP attribute.

    You could, for example, create a "highpower" group per set of machines / per machine (highpower_serverA) and add users to that group based on a checkout system, then define limits on the number of processes they can use, amount of memory they can use, total CPU time they can use, etc in limits.conf based on being in that group or not being in that group.
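
To make the group-based limits.conf idea concrete, here is a small Python sketch that renders pam_limits entries for one group. The group name comes from the comment above; every numeric value and the function name are hypothetical placeholders, and the output would still need to be checked against the limits.conf man page on the target distribution.

```python
def limits_for_group(group, nproc, as_kib, cpu_minutes, maxlogins, priority):
    """Build limits.conf lines for one group (the @ prefix selects a group)."""
    rows = [
        ("hard", "nproc", nproc),        # max number of processes
        ("hard", "as", as_kib),          # address space limit, in KiB
        ("hard", "cpu", cpu_minutes),    # max CPU time, in minutes
        ("-", "maxlogins", maxlogins),   # concurrent logins per user
        ("-", "priority", priority),     # nice value for the user's processes
    ]
    return [f"@{group} {kind} {item} {value}" for kind, item, value in rows]


if __name__ == "__main__":
    # All values below are made-up examples, not recommendations.
    for line in limits_for_group("highpower_serverA", 200, 8388608, 60, 2, 5):
        print(line)
```

The printed lines are meant for /etc/security/limits.conf or a file under /etc/security/limits.d/. Since pam_limits applies limits when a session opens, a change in group membership would only take effect at the user's next login.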

    I'll send you my bill tomorrow.

    • by nthcode (1119827)

      I'll send you my bill tomorrow.

      Agree. PAM plus some usage guidelines and monitoring should be enough. Stuff like this http://www.ibm.com/developerwo... [ibm.com] BTW It feels being like one of the torrent nodes backing up your encrypted files for you.

  • by ZorinLynx (31751) on Tuesday July 22, 2014 @01:02PM (#47509299) Homepage

    Have these problems actually been happening a lot?

    When I first started to help manage a computer lab, I was concerned users would behave really badly and do horrible things. The truth is, very few users did, and we just talked to those users and told them how to behave.

    If you get the occasional repeatedly defiant user, locking out their account can be the final solution. But most people (at least at our site) aren't jerks and listen. Most "bad things" are due more to incompetence than malice, and educating students is easy.

    Also, as someone with experience in these matters, allow me to recommend AGAINST Fedora for production systems. I like to call Fedora the self-breaking distro; updates break things CONSTANTLY. You're much better off running Ubuntu (even non-LTS is more stable than Fedora) or the RHEL clones like CentOS or Scientific Linux.

    • When I used to work for a university (mid-1990s), our department's sysadmin had gotten in trouble at the engineering school because he had written a script that would log into every machine multiple times until all ttys were exhausted ... so he could run his ray-tracing jobs undisturbed. I heard he got away with it for quite some time before one of their sysadmins came in early and realized something wasn't right.

      They told him not to do it, but instead of banning him, they put him to work ... he wrote some

    • by rongten (756490)

      Hi,

      the Beowulf clusters we have run either CentOS or SLES. For the development workstations, where newer versions of certain software are needed, I install Fedora.

      This means the developers basically run production on the cluster and develop on the workstations.

      Since there is always a gap between the two (i.e. centos 5 on cluster and fedora 16 on workstations before, centos 6 on cluster and fedora 20 on workstation), when the cluster is updated there is limited breakag

    • This!

      I've been managing systems with hundreds of well-meaning and not-so-well-meaning scientists for years.
      Generally, I subscribe to the school of thought that putting up too many fences does more harm than good.

      I know for myself that I *can* create trouble in so many ways on a system that fencing against it is almost pointless:
      * fork bombs
      * malloc bombs
      * /tmp overuse
      * /dev/shm overuse
      * deliver daemonized processes in the background
      The first two you may handle a bit with the PAM limit techniques des
  • by dougmc (70836) <dougmc+slashdot@frenzied.us> on Tuesday July 22, 2014 @01:08PM (#47509353) Homepage

    If you're giving your users access to the machines, they should be able to use them. And if you can't trust them to use them responsibly, don't give them access.

    If it were me, I'd secure the boxes normally, set up some resource usage rules (guidelines?) and see what happens. If problems happen often, then maybe look into something automated to enforce the rules, but if not, then you're done.

    As for renicing stuff done by remote users, I'm not sure this is a good idea, but if you want to do it you can renice sshd itself, and to be thorough you can also renice crond (if you give them access to cron/at.) But do keep in mind that nice (and ionice) can't do magic with an overloaded system -- they help, but they don't do magic.

    As for commercial systems, I haven't really seen this as being a big problem outside academia. Multiuser *nix systems where different people are competing for resources are kind of rare in the commercial sector, as it seems like the trends lately are to have enough hardware, often dedicated, and to enforce limits through voluntary compliance (and have their boss talk to them if it's still a problem.)

    That "have their boss talk to them" bit may not work so well for students, but still, I would wait for a problem to appear before I put too much effort into solving it.

    Instead, put your efforts into proper sysadmin stuff -- stay up to date on patches, look for problems (especially security ones), make sure backups work, help users with problems, etc. If there's any troublemakers, talk to them, and if they don't shape up after a few warnings, kick them out. (And make sure the policies permit that!)

    You can enforce limits on specific users through pam and sshd_config and some other mechanisms, but I'd suggest leaving that for later. Anything you do that will limit what people can do will eventually keep them from doing what they legitimately need to be doing.
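
The suggestion to renice sshd itself relies on niceness being inherited across fork and exec: sessions started by a reniced daemon begin at the daemon's niceness. A minimal Python sketch of that inheritance, using only the standard library (the helper name `child_niceness` is made up for illustration):

```python
import os
import subprocess
import sys


def child_niceness(increment):
    """Start a child with its niceness raised and return the value it sees."""
    probe = "import os; print(os.getpriority(os.PRIO_PROCESS, 0))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        # preexec_fn runs in the forked child just before exec, so the
        # raised niceness is inherited by the program that gets started.
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print("parent:", os.getpriority(os.PRIO_PROCESS, 0))
    print("child: ", child_niceness(10))
```

An unprivileged process can raise its niceness but not lower it again, so remote jobs started this way cannot simply undo the setting.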

  • That sounds like a lot of overhead for a problem that seems unlikely. I've used lots of multi-user linux boxes over the years and never noticed that a few bad users ruined the experience for everybody else. If it's really an issue, think of it instead as a learning opportunity - post concise instructions on proper lab utilization and how to use top, etc to check if somebody else is the reason why the machine you are using is slow. Then let users police each other.
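
For the "let users police each other" approach, a per-user summary can be easier to read than raw top output. The Python sketch below sums %CPU per user from text with one `<user> <pcpu>` pair per line; it assumes the input comes from something like `ps -eo user,pcpu`, and the function name is hypothetical.

```python
def cpu_by_user(ps_output):
    """Sum %CPU per user from lines of the form '<user> <pcpu>'."""
    totals = {}
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        user, pcpu = parts
        try:
            value = float(pcpu)
        except ValueError:
            continue  # skip a header row such as 'USER %CPU'
        totals[user] = totals.get(user, 0.0) + value
    # Heaviest users first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = "USER %CPU\nyou 2.3\nBob 97.0\n"
    for user, total in cpu_by_user(sample):
        print(f"{user:10s} {total:6.1f}")
```

Note that ps reports CPU use averaged over each process's lifetime, so a short spike may look smaller here than it does in top.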

    • by dougmc (70836)

      I've used lots of multi-user linux boxes over the years and never noticed that a few bad users ruined the experience for everybody else.

      I did ... but this was 25 years ago at college when hardware was scarce (we had 1 MB disk quotas!) and the computer system was used to do all sorts of things that people just couldn't do from their own personal computers (i.e. access mail, news or the Internet.)

      Users policed each other back then to a degree, but there wasn't much you could do to make a bad user behave unless the sysadmins backed you on it, and they'd only back you if they explicitly broke the rules set down. And often you didn't even know

  • by Vellmont (569020) on Tuesday July 22, 2014 @01:34PM (#47509519) Homepage

    If your users can't play nice together, the solution isn't to treat the place like a prison with automated systems enforcing a hard and fast set of rules.

    The solution is for users to create their own enforcement. If some guy tries to take all the resources across your network with distcc, then the people affected should be able to notice that and tell the guy to knock that the fuck off.

    In other words, give the users the freedom to break stuff, but also the knowledge to find out who's breaking their stuff. It'll serve them far better than creating a walled garden where someone else has the responsibility to enforce social rules.

    Slashdot and reddit work this way. Neither goes around trying to enforce how people behave; they give the users the power to do that themselves.

  • I would write my own with LDAP and some custom code that will manage ulimit and other tools to manage resources. It's a piece of cake.
  • Back when I worked in schools, one of our techs set up LTSP with NFS-mounted homedirs.
    I mentioned that perhaps IP-based host authorization wasn't exactly a secure way of doing things, especially when it applied to both students and teachers/admin-staff.
    I was told that it wouldn't be an issue, and that files were perfectly safe.

    So some time goes by and a demo is scheduled for the system. My compatriot logs in and... he gets a hot-pink desktop with My Little Pony wallpaper theme. Unfortunately that didn't diss

  • by SampleFish (2769857) on Tuesday July 22, 2014 @01:54PM (#47509645)

    Easy solution:

    Put all of your systems in to one big active/active server cluster. Then everyone is sharing all the resources evenly by default.

    Here is a Fedora resource:
    http://clusterlabs.org/doc/en-... [clusterlabs.org]

    If you really want to have some fun you should try to create a Plan9 cluster. This is a transparent cluster OS that was designed for the purpose of resource sharing.
    http://plan9.bell-labs.com/pla... [bell-labs.com]

  • You write:

    In the past I tried to achieve similar functionality via cron jobs, login scripts, ssh and nx management

    NX? But you are using x2go? THAT is not NX. Contact the experts, i.e. NoMachine http://nomachine.com/ [nomachine.com]. Only the real authors of probably the most amazing remote access and management tool can help you there.

  • Since you are on Fedora already, I'd recommend FreeIPA. It'll give you more than your LDAP+PAM for centralized authentication and authorization, like Host-based Access Control, centralized sudoers policy, DNS, etc.

    However, it wouldn't accomplish any of the tasks you specifically asked for out-of-the-box. I was thinking you could write some of these tasks as FreeIPA plugins.

  • I mean, if I had limits on how many systems I could connect to and use at once I would never have passed two of my courses.

    One was a neural networking course which involved programming a computational model and then running 100,000 iterations of the model and analyzing the results. We had been given 6 weeks for it because it was going to take at least 1 week or so to run, but I could not get my model to work for the life of me, and working with the professor finally got it working the night before the resu
  • What about scalable cloud instances that students pay for out of their tuition fees? That way if they want to use 32GB of ram and 12 cores for their hello world.c program, they can do so without affecting other users, but they have to pay?
  • As much as I would like to see Linux displace Windows in these kinds of environments, there really aren't any systems that give you the same kind of management functionality as Userlock, or even Active Directory and Group Policy. It's possible of course, but only if you have the time, skill, and manpower to rig something together yourself. I'm sure I'll get flamed for saying this, but the Linux desktop has a long way to go before it can even hope to be a viable alternative to Windows in the enterprise. E
  • The only way I see this happening is if you totally migrate your lab to something like Amazon AWS/EC2, and link each user to an individual account with specific bandwidth and storage (GRATIS) quotas.

    For one, processing power won't be an issue since that's on Amazon's side, and it's virtually unlimited. Now, everyone will have a decent amount of the other resources for whatever they need, as long as quotas stay inside each user's scope (for which their free quota should have been well defined).

    A user abuses h

    • by tibit (1762298)

      I'd just run my own "cloud" instead, using, say KVM. With billing etc. like in the old times.

  • We had a similar issue with our engineers. We had login servers which worked great as they were poorly advertised and woefully underused, but once we had a system in place for them to make efficient use of them, they started to randomly crash. Most times it was due to them trying to submit a job to our compute farm and end up running it on the login servers, but sometimes it was malicious and a deliberate attempt to get a few extra CPU cycles at the expense of others. For us, the solution was rolling our own virtual desktop farm. We used KVM for the hypervisor, python for the back end control, and php for the front end web interface. We used Active Directory for authentication and rights management. That way we could control precisely how much resources each engineer had rights to.

    As you are working at a school, it is not unreasonable to believe that you can use the students to help develop a system to manage the virtual instances. With a bit of forethought and a limit to the specifications, you can have a simple VDI broker developed and tested in a month. And if you avoid my mistake and use the libvirt API, you will even have the ability to easily expand the system to using Linux containers.

  • To paraphrase Syndrome: When everyone's impacted by everyone's compile, no-one is.

    Also, find me something other than a full kernel compile that takes measurable amounts of time on a real machine.

  • by Minwee (522556) <dcr@neverwhen.org> on Tuesday July 22, 2014 @02:56PM (#47510069) Homepage

    Post a short, general list of rules in several obvious places. Make them reasonable enough to cover most possible user needs but flexible enough to cover things that you haven't thought of yet. Any user who is stupid enough to break the rules by running fork bombs, torrents, mining, hiding stashes of lemur porn or anything else which a child of six could tell you was a bad idea, will have their accounts disabled as soon as they are discovered.

    If they have a good excuse for abusing the systems then discuss it with them, suggest alternatives to running rendering jobs on the lab servers and keeping passwords on sticky notes or whatever else it is that they are doing wrong and then restore their access, trusting that they will know better. If you do it right, they may even decide that it is better to ask for permission than forgiveness next time.

    If they don't, send a memo to their department head briefly outlining what they did, how it was detected, what action you have taken, and that you won't be reversing this decision until you see a presidential pardon come down from an appropriately high authority. It doesn't matter if they have Really Important Work which needs to be done by the end of the week or not, just cut them off until the proper User Apology and Restoration procedure has been completed.

    There you go. This solution is licensed under the WTFPL [wtfpl.net] which is compatible with the Open Source Definition and the Debian Free Software Guidelines so you can use it any way you want. You can even supply your own LART and display it prominently by the door of your office if that helps get the message across.

    • by evilviper (135110)

      That sounds reasonable only if you have a very small group of users, and loads of time to deal with it.

      Everybody runs a fork bomb once in their life. A computer lab should be a safe place to make mistakes, not somewhere that any mistakes will make you a pariah. If you do take that unreasonable attitude, the "presidential pardons" will be coming down on a regular basis, just signed-off as a routine duty without the slightest thought, every time a department head requests it.

      • by Minwee (522556)

        If they have a good excuse for abusing the systems then discuss it with them, suggest alternatives to running rendering jobs on the lab servers and keeping passwords on sticky notes or whatever else it is that they are doing wrong and then restore their access, trusting that they will know better.

        Everybody runs a fork bomb once in their life. A computer lab should be a safe place to make mistakes, not somewhere that any mistakes will make you a pariah.

        It's good that we agree on that.

  • The 1970's called, they want their userspace problems back:

    http://www.cmu.edu/computing/c... [cmu.edu]

  • Run user sessions in Linux containers (Docker is gaining momentum and may be the right option), whose resource usage you can limit while being far more efficient than VMs. Just a word of caution: they aren't as secure as VMs. There may be present or future vulnerabilities that let hostile students break their limits and/or access the main system, as containers have more surface contact with the machine's kernel than proper virtualization, mixing VMs for security with containers for ef
  • I think FreeIPA can address most of your needs and if you are already running Fedora then adding it to your network should be fairly trivial. FreeIPA is kind-of like an Active Directory type dealie (and it can synchronise against AD) that offers a lot of integration and control.
  • I did my undergrad degree in a lab not unlike this (actually Sun workstations using NIS/NFS to mount home directories - this was the 1990s). These machines were likely 1-2 orders of magnitude less powerful than even your smallest desktop - desktops with 32MB of RAM and servers with 128-256MB. There was no resource management aside from disk quotas and the lab worked fine.

    Depending on what you mean by high-usage I would have thought even modest desktop systems would be powerful enough for just about any

  • I saw someone suggesting that the users should play nice. That'd be great... and maybe they did, 30 years ago. (We'll ignore the late 80's early 90's stealing of someone else in the lab's xterm....)

    I had a user last year - an intern - like everyone, NFS-mounted home directory. It was, of course, shared with a good number of other users. He ran a job that dumped a logfile in his home directory. MANY gigs of logfile, enough to blow out the filesystem. Users were not amused. *I* was NOT AMUSED, as my home dire
