Ask Slashdot: Building a Large Email Service 484
Rewd asks:
"I'm looking at implementing a large scale email server (cluster) to handle POP3 and IMAP4 for about 25000 people, including a lot of attachments. I'd like to go for an Open Source solution, but a lot of people around here want to go for Microsoft Exchange on NT.
Has anyone here successfully built anything like this? Can you recommend any combinations and components which are particularly
efficient, capable, secure and reliable?"
Q-Popper (Score:1)
What about Hotmail ? (Score:1)
Won't 25,000 user licenses for Exchange cost the equivalent of the debt of a small African country? Why not spend it on developers to make a free solution (or another one, since IMP or TWIG may already scale that big) in C, Perl or PHP?
Greg
Re:FreeBSD (Score:1)
Blitzmail (Score:1)
Consider this:
* Very, very efficient. A cluster of six NeXT boxes used to serve 15,000 accounts, and handled more than 200,000 e-mails a day. AFAIK, these machines have now been replaced with 6 Alphas.
* Unbeatable durability. It was first developed in 1987, and has been in continuous use for more than 10 years! Being a non-commercial product it has not suffered from featuritis or bloat. In other words, it has a very solid code-base.
* Excellent scalability. It is designed so that you can simply slot in another machine in the cluster to improve performance.
* A great feature-set, in particular it has an excellent real-name based user-database as its core, with secure random number authentication.
And yes, it is free. Source code available etc.
Take a look at: http://www.dartmouth.edu/pages/softdev/blitz.html
Skip through the client bits and you will find links to the server software.
25 thousand users shouldn't be a problem (Score:1)
The company I work for has a user base of 19 thousand mail accounts. These run off of Microsoft Exchange on i386 NT boxes. Hold your breath, because getting this `solution' to work demands a cluster of no fewer than 60 NT machines, all running Exchange. At least 10 of them crash each day due to heavy mail load. No, all you NT pundits out there, this has nothing to do with `bad drivers' or `poor configuration'. The amount of mail and attachments flying around demands this many machines. We have even checked in with Microsoft on this, and they have only told us that getting more redundancy will help this problem. Exchange, from my point of view, is _not_ a good solution.
Then we called Sun. They brought 3 boxes, Sun Ultra Enterprise 450s, with qmail and I do believe it was cyrus-imapd. This was because they did not have an `evaluation' of the mail software that was going to power these machines. They did let us borrow the boxes though... Funny, I think. Anyway, they set it up in a day, and we let the machines run instead of the Exchange cluster. None of the boxes crashed. They kept going and delivering mail, without _one_ mail going astray for a whole 2 months. Nobody was complaining about mailservers being down or any other kind of cheesy messages in their mail clients.
Now here is the funny part. The chief of investments called up one day and told us that the sales department from Sun had contacted him to ask him if we wanted to keep the boxes or not. He had told them no, and they were going to pick them up two days after... That meant booting the NT cluster again... After about 30 hours of synchronizing Exchange machines, they were online. Sun came and picked up the machines. My boss said that the chief of investments had the firm belief that Microsoft was The Right Way to Go(TM). Now this decision was _not_ based on our impressions of stability, scalability and reliability, but rather on Microsoft buying executives too many lunches...
So my advice: if you don't want excessive work, stay away from Exchange. Call Sun. They will let you evaluate their boxes. Trust me, they are eager to sell you their stuff. =)
Scalable systems up to 10^6 users (Score:1)
I found this while going through the sendmail newsgroup on http://www.deja.com. It's about the mail system that EarthLink uses. It discusses how they use open source to build a scalable mail system.
A Highly Scalable Electronic Mail Service Using Open Systems [earthlink.com]
"...In September of 1997, EarthLink provided email service for over 350,000 subscribers with a 99.9+% service uptime record. In fact, we expect the current system to scale to well over 1,000,000 users without significant alteration of the architecture as presented....."
microsoft, exchange and PSTs, Oh My! (Score:1)
The Solaris system is a SPARC 10, with a couple of gigabytes. It worked well for years, with little complaint.
The NT server, with just as much disk space, has had backup problems and space problems (no quota system), and is slow at accessing its 'PST', the single file it stores mail in.
Additionally, our network calendar system, years in use, was dumped for The Exchange's calendar system. Exchange/Calendar is very limited in memory/diskspace usage, compared to Solaris.
And of course the NT manager did this on a shoestring, so no backup servers (Solaris does this easily) were bought.
Now for the best. The company started with Exchange servers at several sites, but has now consolidated to a single server. The other sites must put up with very slow mail access over T1 or ISDN connections. Late afternoons many NT systems seem to hang, as they wait on the complex network file access to the Exchange server.
For my money the NT/Exchange system is a waste. It's slow when you need it most, hard to back up, and its file structure is not easily adaptable to other mail programs (like M$ would make a program interoperable).
Sendmail (on any good U*IX platform) is a bitch to setup, but reliable year after year. Any good pop3 or imap client can be used.
Go with the Gold, Sendmail!
Re:NT (Score:1)
These concerns are, for the most part, unsubstantiated. Recent tests in c't magazine showed that Linux and FreeBSD perform nearly identically well for web serving; FreeBSD is slightly faster serving static content and Linux is slightly faster serving dynamic content. Linux' slow userland NFS is being supplanted by a kernel NFS daemon comparable in performance to FreeBSD's, and Linux is significantly farther along in SMP support than any of the BSDs.
Arguing about Linux' stability is absurd. Properly administered Linux boxes, like properly administered FreeBSD boxes, can stay up for a year or more at a time. In fact, IIRC, a Linux machine currently holds the uptime record between these two OSes.
WRT security: well, if one is prepared to admit NT to consideration as a server OS, one certainly cannot write off Linux as insecure. I am not aware of any outstanding security holes in the Linux kernel. Both Linux and FreeBSD are widely deployed commercially, have withstood considerable scrutiny, and are certainly suitable for anything short of mission-critical banking and military use.
The key theme here is that, in the hands of a competent sysadmin, either Linux or FreeBSD is an excellent choice for a server OS. If the recent influx of Linux newbies has managed to tweak the BSD camp by their exuberance, let me assure you that the recent rise of self-righteous BSD advocacy has been at least equally vexatious to experienced users of Linux and *BSD alike.
Let me apologize if this has come off as a flame, but it really is time for the Linux and *BSD circles to play nicely and get along with each other. Unsupported statements like "Linux IMO is not a server OS" are flamebait, pure and simple.
AC
QI/PH + popper (Score:1)
and it works fine. All users are stored in
a QI/PH Database which is designed to handle a large amount of mail users and is integrated with sendmail.
For fetching mail I modified popper so
that it looks up the password and mailbox in the
database as well.
All in all it was done within a few days easily.
Send a mail to 007@freemail.at for further info if you are interested in the system.
Re:No recommendation... (Score:1)
So no, that isn't a unique story.
Not Netscape... (Score:2)
I've never tried any of the open source products on any large scale so I don't know how they perform. I wouldn't go with Exchange if you are looking for good scalability. It would be nice to run the service on a big beefy box, and Sun boxes are far better than Intels.
Re:25000 on NT? (Score:1)
I saw an NT box running exchange for only about 20 people and the damn thing would keep crashing once a month or so. We moved to Linux / Sendmail / CuciPOP and everything was all good.
I'm sure FreeBSD would work as well...
Re:Thank You... (Score:2)
----
Re:Exchange => Pain (Score:1)
Exchange does suck. Thankfully I've only had to use it once in my life. I'll stick with sendmail and qmail.
MS antiSupport (Score:1)
Case studies (Score:1)
Re:Don't use exchange - scalability and $$$ reason (Score:1)
Exchange Experiences with 2000 users (Score:1)
It doesn't scale well at all (NT doesn't...so Exchange doesn't) and those boxes look in pain as they try to do what people ask them to. However, response-wise, it frequently "pauses" to check new mail and your composing is also paused...what a great product!
Unix solutions will be far less headache if anyone understands Unix in your environment and the support is gonna be worth training someone, versus paying Microsoft to vainly (in our case) strive to repair your dead Exchange databases/servers. Our company is STILL headed into a MS only environment and it's sickening. Ok...enough rant and rave.
Qmail is a good way to go (Score:1)
On the Web server side, sadly, it's Windows NT and IIS/ASP, with some ASP components including a custom-written client-side mail store. We use the commercial AspHTTP component from ServerObjects to send requests to the mini CGI server on the Unix box when we need to create accounts. We also use AspMail and AspPOP3 to handle sending and receiving messages. (The mail server is firewalled, so you can't connect to it from the outside with POP3.)
Qmail is definitely industrial strength and free, two qualities that we appreciated. It's also easy to configure and fairly easy to customize. Recommended. Oh, and you can find the end result at www.webb.net [webb.net].
Eric
--
Re:FreeBSD (Score:2)
And I suppose yours does?
IMAP's Problems (Score:1)
Cyrus and UW IMAP are not light solutions. They are very large binaries which run from inetd and are anything but clean implementations.
Really, I'd say stick to POP3 using something like Qpopper or go with a commercial vendor. Qpopper is probably the best POP3 server you are going to find in the OSS world, though it does still run from inetd.
Open source POP3/IMAP4 servers really are lacking in the Unix world once you get past a certain number of users.
--
Re:Exchange bad, any UNIX Good (Score:1)
Remember, however, that the version of Exchange the Senate was running was at least two major revisions out of date, running on (I think) NT 3.51.
...phil
How about... (Score:1)
...phil
Re:Why Slashdot is going down hill... (Score:1)
Typical Microsoft tactic - can't stand to have the issues it uses on others applied to its own world.
...phil
Re:Exchange... why not?? (Score:1)
Hope you don't mind some questions. How many servers? How much maintenance staff for those servers? How stable is the system? Can you provide more details? We've received lots of details about Exchange systems that have failed, how about providing some more info on a successful system?
On a related note, anyone who says, "Don't use Exchange because Micro$soft sucks," really needs to be a little more open-minded.
Agreed. How do you respond to those that say 'Don't use exchange because it crashes all the time, and here's another example'?
...phil
Re:what about ignorance (Score:2)
He was talking about the user interface, not the mail server. The mail server is a hacked Qmail running on Solaris. The web servers executing the mod_perl code are FreeBSD.
And what does hotmail being owned by Microsoft have anything to do with it? Microsoft bought Hotmail after it had already been established, after all. Unless Microsoft has suddenly added a
bash-2.03$ telnet www.hotmail.com 80
Trying 216.33.151.7...
Connected to www.hotmail.com.
Escape character is '^]'.
HEAD / HTTP/1.0
HTTP/1.1 302 Found
Date: Fri, 30 Jul 1999 02:26:01 GMT
Server: Apache/1.3.6 (Unix) mod_ssl/2.2.8 SSLeay/0.9.0b
Location: http://lc3.law5.hotmail.passport.com/cgi-bin/logi
Connection: close
Content-Type: text/html
-E
Yep, that's typical NT type (Score:2)
Then it crashes, and the Microsoft apologists say that it's because it takes an expert to install, configure, and maintain an NT installation, and of course it's going to crash if you have the janitor maintaining it.
Typical. Just typical.
My question: If you need a skilled system administrator in the first place, regardless of the operating system, where's the TCO benefit for the Microsoft software?
-E
Solaris/SPARC vs Linux/*BSD (Score:2)
IMAP for 30,000 users on Linux or *BSD would require a cluster of machines for the front end, but you'd still only need one machine feeding the data into the Netapps on the back end. But the big iron Solaris solution definitely has the cojones to handle this situation without clustering, and will be tremendously easier to configure and maintain than the cluster. And also quite a bit more expensive hardware and software-wise. So it is a tradeoff, and it depends upon how much in-house talent you have. If you have a lot of inhouse talent, the Linux cluster will save quite a bit of money. If you don't, the Solaris solution will require less expensive consultant time to set up and configure, meaning it will be the more cost effective solution. Wanted a simple answer? The only simple answer I can give is "Don't use NT, at least not with Exchange" (grin).
-E
Set the record straight (Score:1)
Hotmail runs many email servers behind an IP switch (a la F5's BigIP, Ipivot's Broker, Alteon's switch, Cisco's LocalDirector, LinuxDirector, etc). The responses gained from queso or nmap are the answers from the load balancing box, not the mail servers. This has been experimentally proven and confirmed.
According to all of their press releases they run Sun hardware and Solaris.
Hotmail runs a (heavily) modified Zmailer for inbound connections, and local delivery, probably in some failsafe and distributed manner.
Hotmail uses qmail for delivery, outbound. This version of qmail is also modified, at least somewhat, and probably a *lot* by now.
-Peter
Qmail, Netapp, F5 Labs, cyrus (Score:1)
Qmail - Fast mta, reliable NFS delivery in the Maildir format. Others may vote for PostFix. I don't know it, but it may do as well.
Netapp - Fast NFS server. Multiple mail servers can write to it. There are other NFS solutions from much higher-end storage vendors like MTI and EMC, but that may be more than you want.
F5 makes BigIP, a load-balancing IP switch. This lets you multiplex things like pop3, imap, and smtp to a farm of servers, and eliminates the possibility that anyone won't be able to check their mail or won't be able to receive email.
Cyrus is an imap server. It's pretty good from all I hear. I still have yet to implement it. The other 3 parts I've used a lot, and love 'em.
-Peter
Re:References (and a MSX Nightmare Story) (Score:1)
-Peter
Re:Sun Solution, of course! (Score:1)
Use cyrus or something else for imap. The above system, according to sun sales literature from about 1.5 years ago regarding SIMS, should support about 250 users. Plus there's no way to recover the SIMS database that I've heard of if there's corruption.
Good Luck!
-Peter
Re:Exchange server does work fine (Score:1)
-Peter
Re:PostFix (Score:1)
-Peter
Re:Is Linux scalable enough for mail processing? (Score:1)
FreeBSD probably would have served you better than Linux in this case as well.
Question: anyone know if future filesystems will address this issue, or is this outside of the filesystem and in the kernel?
-Peter
Ridiculous proposition (Score:2)
1) syslog - for this much volume, syslog will slow your system a lot.
2) Qpopper requires a read through the entire mail file for that user each time mail is checked. For a user with a couple of megabytes of crap (think attachments) this can be a few seconds worth of activity just to get the first 5 lines of each message. Solution? Use the maildir format, which gives each message a file. Don't use MH Mail file format. Why? Because mh will do ungodly amounts of rename() calls each time the user deletes a message from the middle of their mailbox. Maildir is much more efficient.
3) Sendmail takes a lot of tuning to meet this sort of demand. Sendmail also has a large footprint. Using a mail server like qmail (my pref) or postfix (others' pref) will buy you a lot of performance for a one time learning curve of about a week's time, without having to guess at how to get high-capacity out of the system.
4) Linux is good, but unfortunately if you're going to do this on a local file system for a system with 25,000 users you need to have a lot of space. I think a Journaled (sp?) filesystem is called for here. Currently for supported tools that really means a commercial unix. I've used solaris and veritas' filesystems a lot, and I know that for a mail queue and for mail delivery veritas does amazing things. In addition, it makes recovery in the case of a system crash amazingly fast, and its snapshot facility gives you backup options that are better than what is usually available on a mail system (i.e. minimal to no downtime to perform a backup of a stable image of the filesystem).
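To make point 2 concrete, here is a minimal sketch of the one-file-per-message layout using Python's standard `mailbox` module. It is an illustration only, not how Qpopper or qmail are implemented; the function names and the example addresses are made up for the sketch.

```python
import mailbox
from email.message import EmailMessage


def deliver(maildir_path, sender, recipient, subject, body):
    """Write one message into a Maildir as its own file and return its key."""
    # create=True builds the tmp/new/cur subdirectories if the path is new.
    md = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return md.add(msg)


def count_messages(maildir_path):
    """Count messages by listing the directory; no message body is read."""
    md = mailbox.Maildir(maildir_path, create=False)
    return len(md)


def delete_message(maildir_path, key):
    """Remove a single message file; the other messages are left untouched."""
    md = mailbox.Maildir(maildir_path, create=False)
    md.remove(key)
```

Counting and deleting only touch directory entries and one file, which is the property that matters for users with megabytes of attachments; with a single mbox-style file the server has to scan or rewrite the whole thing.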
Anyway, hopefully I've contributed some useful thoughts to this!
-Peter
Maildir (Score:2)
-Peter
Re:Exchange and relaying (Score:1)
Qmail handles *BIG* mailboxes realy well. (Score:1)
Re:OS/2 Solution: Inet.Mail (Score:1)
Exchange 5.5SP1 fixes this (Score:1)
-o
an MCSE who hasn't used MS products in 15+ months..
Re:NT (Score:2)
Re:References (and a MSX Nightmare Story) (Score:1)
Why obscure the vendors name, when its obvious that the vendor here is Microsoft? Unless there is some other OS and MTA vendor in Redmond Washington...
--
Python
You're looking for Cyrus (Score:1)
As for the MTA, well I've seen plenty of votes for Qmail and Exim. I'm still pretty partial to Sendmail though. I think they'll all work (though I've been told Sendmail on a single server probably would have a tough time keeping up with the load on something like this).
With all of these solutions, if the users are getting much mail, you'll probably have to do something kind of exotic to break up the mail requests across multiple servers. The more transparent you can make this, the better. Either the users will have to know WHICH server their mail goes on, or you will have to make the multiple servers ALL have access to ALL the mail.
One possible solution would be to use something like CODA (also from CMU http://www.coda.cs.cmu.edu/ [cmu.edu]). This is a caching network file system that you could set up on a backend server (running over something like a multi-ported 100Mbps Ethernet Switch with the multiple client servers on the front-end exposed to the network). When client server "x" gets a request from "joe", "x" accesses the file system and gets all the files in joe's mail box (a series of directories) (the ones requested first, then pre-caching all the others). When joe stops using his files, they are allowed to expire on "x" (releasing the cache for use by mary, or adam). Once downloaded, the files can be manipulated on the client and changes are sent to the server when there is time/bandwidth (I'm not sure how the locking and similar mechanisms work on this ... read the coda docs for details).
This way, you can dedicate one or more MTA servers to stuffing mail into the backend CODA server, then have one or more client servers pulling the data out and handing it to the clients. You spend most of your money getting a BSB (Big Stinking Box) for the backend, and use cheap, easily-replaceable-if-it-crashes machines for the front end.
Another nice thing about Cyrus: It allows you to set per-user space limitations and access restrictions, and mail sent to multiple users is put into a special cache directory meaning it doesn't take up space for each copy.
One warning: Cyrus suffers from the same problem as INN's traditional storage system - it eats the hell out of inodes because each message is a file. Most email messages are in the 1-2 K range, so when you create the filesystem for Cyrus, make sure to create the maximum Inodes.
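A back-of-the-envelope sketch of the inode warning, in Python. The per-user message count and the filesystem size in the usage note are assumed figures for illustration, not numbers from CMU or from this thread; the function names are hypothetical.

```python
def inodes_needed(users, messages_per_user, extra_files_per_user=0):
    """Estimate inodes for a one-file-per-message store.

    extra_files_per_user covers per-mailbox bookkeeping files and
    directories; its value is an assumption you would measure yourself.
    """
    return users * (messages_per_user + extra_files_per_user)


def bytes_per_inode(filesystem_bytes, inodes):
    """Largest bytes-per-inode ratio that still provides enough inodes."""
    return filesystem_bytes // inodes
```

For example, 25,000 users keeping an assumed 200 messages each need about 5 million inodes, so on an assumed 30 GB filesystem you would want at most roughly 6 KB of disk per inode. If the filesystem's default ratio is coarser than that, you run out of inodes before you run out of blocks.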
I know from the docs that CMU uses this on a 10000+ user mail network, and they apparently are quite happy with it. I've heard similar things from other large sites.
Basically, Cyrus is what Exchange hoped to be :^).
Don
Re:No recommendation... (Score:1)
Cyrus probably a good bet. (Score:3)
The Cyrus server at CMU is probably your best bet. You'll find it at this link [cmu.edu].
It's worth noting that this project is currently supporting all of CMU's e-mail needs. It's also my understanding that it forms the basis for Netscape's Message Server and Post.Office. This should satisfy any concerns about its scalability. It has lots of handy features like kerberos authentication, a database style message repository, support for ACAP, etc.
Alternatively try QMail [qmail.org]. Personally, while I think it provides better SMTP performance than Sendmail, I'd rather use the Cyrus IMAP server than the UW one (the only one supported by QMail). You could go with using a combo of sendmail|postfix + Cyrus for incoming mail (i.e. what your MX records point to) and QMail for outgoing mail. It depends on your performance needs.
Exchange Server is NOTORIOUS for being both difficult and expensive when you need it to scale to a large number of users, although I understand it's improved substantially since the 4.x days when it was just impossible.
Re:software.com (Score:1)
for 25k users, I'd go with a higher-class system to support the load that they'd generate.
I'd recommend running Sun SPARC-based systems running Solaris 2.7 and InterMail, with a fiber-channel storage solution like a A5000. The multi-threaded design of both Solaris and InterMail will do you much better than any Linux/*BSD solution.
Perspective (Score:1)
I'd seriously look at all other solutions, whether commercial or OpenSource.
The disk subsystem is the bottleneck, kids! (Score:1)
So get a good, fast disk subsystem and attach it to whatever Unixoid OS you like. Run Qmail if you like (it's a bitch to configure, and not as versatile as sendmail, and it's not open source but it's fast), or get a bunch of good sendmail admins.
You can configure any mailer to handle large load on the SMTP side of things by using multiple MX records and mail relays. POP and IMAP are a little tougher.
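A small Python sketch of why multiple MX records spread the SMTP load: a sending mailer tries the lowest preference value first and picks randomly among hosts that share a preference. The host names in the usage note are hypothetical, and this only models the ordering, not DNS lookups or delivery.

```python
import random


def delivery_order(mx_records, rng=random):
    """Return hostnames in the order a sender would try them.

    mx_records is a list of (preference, hostname) pairs. Lower
    preference values are tried first; hosts with equal preference
    are shuffled so that load spreads across them.
    """
    by_pref = {}
    for pref, host in mx_records:
        by_pref.setdefault(pref, []).append(host)

    order = []
    for pref in sorted(by_pref):
        hosts = list(by_pref[pref])
        rng.shuffle(hosts)
        order.extend(hosts)
    return order
```

With records such as `(10, "mx1.example.com")`, `(10, "mx2.example.com")` and `(20, "backup.example.com")`, the two preference-10 relays share the incoming load and the backup is only tried when both are unreachable. Nothing comparable exists for POP and IMAP clients, which is part of why that side is tougher.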
--
I noticed
Notes for 25000 people??? PLEASE, NO!! (Score:1)
Nobody buys Domino for the mail capability, they buy it for the groupware and application system, and they *accept* the mail function that comes with it. The client UI is so amazingly bad that a UI design site did an in-depth critique, attacking pretty much the entire app. (http://www.iarchitect.com/lotus.htm)
The server is difficult to administer; the user directory is flat, slow and scales badly; and a given user's mail store only *looks* hierarchical, but is actually just one big lump of data which must be read in its entirety, first-in-first-out, for many applications, especially connectivity with third-party apps attempting to access the VIM subsystem (which is itself pretty awful) for messaging. This means that anything attempting to read a Notes message will read every single thing in the mailfile in order of its creation until it finds what it is looking for, including any archived, sent, and deleted items.
The client is difficult to navigate, crash-prone, and really beats the hell out of a network.
Hopefully the newer versions are not so bad as 4.1, but I was involved in a rollout of Notes to a mere 600 users at a chemical plant two years ago. Notes was hosted on a single high end NT/Intel box, which should be sufficient for the intended usage. Users were sent to training classes on Notes in small groups twice a day, and their machines were converted to the new mail system while they were in the class. So we knew how many users would be on the system on any given day throughout the rollout. The guy onsite from Bay Networks and I made a bet on when the network would crash from the load - I was within an *hour*. We didn't even get halfway through the deployment before the net died. Not the server, the network. Admittedly, the network was sufficiently stressed and just plain odd at the best of times, which is why the Bay guy had a desk, but 300 casual mail users should not kill a production network.
(One of the plant managers was upset by the burning fuse I drew up on a whiteboard in the MIS room and updated daily. After I was proven to be exactly correct in my forecast, he bought us lunch.)
Domino is not a mail system. If you don't need all of the other whiz-bang features of Notes/Domino, don't do it. For groupware it kicks some serious ass, really doesn't have any serious competition, but for straight mail you are better off with nearly anything else. Except Exchange.
-reemul
who dumped Notes 5 testing on someone else just this morning, and is still smiling
Linux filesystems (Score:1)
ext2fs returns from write calls while the metadata may still be in RAM cache (making it fast) -- if you want complete end-to-end reliability, that won't do. XFS will fix this for Linux, (as does a tiny patch Linus sent to the qmail mailing list) otherwise FreeBSD may be a better free UNIX to use.
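For readers wondering what "end-to-end reliability" costs the application, here is a general POSIX-style sketch in Python of writing a message so that both the data and the directory entry are forced to disk before delivery is acknowledged. This is not the patch mentioned above and not qmail's code; the function name is made up for the illustration.

```python
import os


def durable_write(directory, name, data):
    """Write data to directory/name and force it to stable storage."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)

    # Write to a temporary name and flush the file contents to disk.
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Atomically move the file to its final name.
    os.rename(tmp_path, final_path)

    # Sync the directory so the new name itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path
```

The directory sync is the step that matters on a filesystem that keeps metadata in RAM after the call returns; without it the file contents may be safe while the name pointing at them is not.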
--
Re:Why Slashdot is going down hill... (Score:1)
We have problems, true. But we're getting better overall.
Anyhow, about Exchange: ouch. The worst part is that in order to use Exchange at work I _have_ to use Outlook. That alone makes it inconceivable to actually WANT to use Exchange. A real turnoff. Outlook 97 was a real bastard; 98 is passable.
I've got many Unix accounts, and I've had many others, and never lost past email. I only have one Exchange email address, and already I lost 30 megs.
-Billy
Cyrus and Exim (Score:2)
I've had good luck using the Cyrus IMAP server and the Exim mail transfer agent (MTA). The Cyrus server handles POP3 and IMAP, and stores the mail in an internal file per message format, and is designed for hosting mailboxes for those without accounts on the system. I've found both Exim and Cyrus to be fast, secure, scalable, and stable for thousands of customers, and I don't see any trouble scaling it further.
my $0.02 (Score:1)
A couple of them feed into the mainframe systems.
The only ones that ever give us problems are the notes servers.
overall it works very well.
Use qmail. Case study: pgpkeyserver (Score:1)
Ok, me neither, but the server I work at has a pgpkeyserver, run by our admins.
We HAD to change to qmail, because sendmail wasn't able to cooperate with the thousands of emails we have to handle in about half an hour (about 7200, which is about 4 incoming mails per second, this being a low number) when the pgpservers synchronized.
Little side note: we run a 2.0.36 Linux kernel, on a single 233MHz Intel Pentium II with 128MB of RAM, and sometimes we reach loads of 24 or more when the pgpservers synchronize.
It is one public key per mail.
Bulk email, open source, qmail rulez.
Regards
Re:Exchange => Pain (Score:1)
-Harry
I worked for a 25k user ISP and a 1.2mil one now (Score:1)
3 ultra 2's
512MB RAM each
30GB array by artecon that was NFS mounted.
This was slightly overkill for us. A few things to keep in mind.
1. Have more than one machine running this. I would say use 4 PII's. Use DNS round robin for load balancing. If you have the money get a real load balancer. With an NFS disk array and sendmail file locking this isn't hard to administer.
2. Use as much RAM as you can afford. 512MB min. 1GB to 2GB is better.
3. Fast local disks. QPop serves files locally. Have at least 4 gig for mail to be queued. We had at least one user trying to cycle 2 gig attachments through our machines every month, Bastards.
4. Set up qpop (or whatever) in server mode. This will decrease the traffic from you to your RAID array. Server mode tells it that it is transferring data across the network; if nothing in the data changes, it just reverts and doesn't move the extra traffic.
5. Disk. We were fine with 30, but upgraded to 50GB with the last upgrade. Artecon NFS mounted to the 3 machines. Look at Netapps though. You can cluster them in a failover config. I have heard of some hardware problems with them though. We pushed 16MB/sec across a peak so make sure you are AT LEAST ultra wide if not ultra2. You could set up a server with the disk attached and let it do your NFS instead of a full network disk thingy.
6. I'd suggest sendmail, but qmail is nice, too. Sendmail seems easier to set up to use with pine if you want to hook a shell machine up to it with Pine or use a webmail package, but honestly I haven't played with Qmail much.
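A toy Python model of the DNS round robin from point 1: the name server hands out the same address list on every query but rotates it each time, so successive clients start with a different machine. The class name is hypothetical and the addresses in the usage note come from the documentation range; a real deployment does this in the DNS server, not in application code.

```python
from collections import deque


class RoundRobinAnswers:
    """Rotate a fixed list of server addresses, one step per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate for the next query."""
        result = list(self._addresses)
        self._addresses.rotate(-1)
        return result
```

With four POP servers at, say, 192.0.2.1 through 192.0.2.4, every fourth client starts on the same machine. Note that this spreads connections evenly but knows nothing about which box is down or overloaded, which is the argument for a real load balancer if the money is there.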
Kashani
Re:Scalability Issues (Score:1)
/var/mail/u/s/user for ours with an Oracle backend for authentication. We tried switching back to
Kashani
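The /var/mail/u/s/user layout above splits mailboxes into subdirectories keyed on the first letters of the username, so no single directory ends up with tens of thousands of entries. A minimal Python sketch, assuming two levels as in the example path; the padding character for very short names is an assumption of this sketch, not something stated in the post.

```python
import posixpath


def mailbox_path(username, root="/var/mail", depth=2, pad="_"):
    """Map a username to a hashed spool path such as /var/mail/u/s/user.

    Each of the first `depth` characters becomes one directory level.
    Names shorter than `depth` are padded with `pad` (an assumption).
    """
    levels = [username[i] if i < len(username) else pad for i in range(depth)]
    return posixpath.join(root, *levels, username)
```

Both the delivery agent and the POP server have to compute the same path, so a function like this would need to be shared (or reimplemented identically) on every machine that touches the spool.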
OS/2 Solution: Inet.Mail (Score:2)
It's heavily multithreaded, so the performance is excellent. I couldn't say whether it's ever been used with 25,000 users, though.
Timur Tabi
Remove "nospam_" from email address
Using Exchange (Score:1)
Problems: If you don't buy the enterprise version, you have a limit of (I think it was) 17Gig TOTAL of mail... Which came out to about 40 users on our system until we started bitching at people to delete old mail. It really bites because exchange shuts down with no warning when you hit the limit. If you must use exchange, get the enterprise edition, which has no database size limit.
Exchange's database under 5.0 is guaranteed to corrupt eventually. You need to shut down regularly (at least once a month) and do an ISINTEG, or you get fun things like the President's email being delivered to the Janitor with a header from the Lawyer. 5.5 is a lot more stable, but I still don't trust it enough to not want to run ISINTEG once in a while. To date though, 5.5 has yet to corrupt.
Make backups!
Consider a massive farm of smaller Exchanges rather than one large box, and reserve one day a month of downtime for maintenance BEFORE deploying users.
If you go for a unix box, I'll give you my metric. While working at an ISP, we had a limited budget due to accounting problems (ie: no accountant, so we didn't know how much cash we had. *SIGH*), and we had a unix box in desperate need of upgrading, but it still ran:
486DX2 66MHz
narrow SCSI-1 drives
FULL news feed
400 domains
2000 mailboxes locally
routing mail for another 2000
web server
shell accounts
anonymous FTP
96 modems doing SLIP, and PPP at 28.8
Accounting
logging
and to add insult to injury, it was SCO!
I wouldn't ask that much out of ANY single machine if I had the choice, and it was SUCH a relief when we got 4 Alphas running OSF/1 (and some telebits) to replace the tired SCO, but you know it can be done with a solid OS. So a 486 properly configured with fast disk and lots of RAM should be able to handle 2 to 4 thousand users. Use of Cyrus or similar types of software will drag that number up, otherwise large mailboxes drag that number down (LARGE POP3 mailboxes on IO bound machines can actually take too long to scan. If POP3 doesn't report the number of messages fast enough, POP3 clients will time out).
Today, if I were you, I'd get a Sun Netra T1, and add a FibreChannel RAID card, with as much ram as you can cram in (on the motherboard AND the RAID card). That should handle your 25k users on one box in a pinch, but use two at least. Spreading the people out makes everyone breathe easier.
Been There Done That (Score:1)
We used Control Data's MailHub, a system built on a portable X.500 database, send/pop mail and Perl scripts. It ran on a Sun SPARC box.
These days you could replace the whole thing with an LDAP server and Linux.
I'd use a large Intel box (dual Pentium III) with 128 MB of memory and 19 GB of hard disk, mirrored. This would be a good start. Strip down the Linux system, only run mail on it. Put the LDAP on another box of the same size and power.
I'd use OpenLDAP for the user server. Then write some Perl scripts (Web forms) to update and manage the LDAP data. (The PHP module for Apache would work well too.)
If needed, use a round robin DNS to balance the load between several POP servers. These servers will all need to connect up to one large file server. I'm using IBM's AFS file system. It's better than NFS.
Mark@grennan.com
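As a rough illustration of the round-robin DNS idea above, here is a toy Python model of one name with several A records whose order is rotated on each query. The class name and the addresses are made up for the example (the addresses come from the documentation range), and a real DNS server does this for you; the sketch only shows why successive clients end up on different POP servers.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS name with several A records, rotated per query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        """Return the current record order, then rotate for the next query."""
        answer = list(self._records)
        self._records.rotate(-1)   # first address moves to the back
        return answer


if __name__ == "__main__":
    # Hypothetical POP server addresses.
    pop = RoundRobinZone(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
    for _ in range(4):
        # Most clients simply connect to the first address in the answer.
        print(pop.resolve()[0])
```

Note that plain round robin spreads connections but knows nothing about server health, so a dead POP server keeps receiving its share until its record is pulled.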
Re:Hotmail (Score:2)
From http://pobox.com/~djb/qmail/dist.html [pobox.com]
"If you want to distribute modified versions of
qmail (including ports, no matter how minor the
changes are) you'll have to get my approval."
Please reply in email if you feel the need, I'd
rather not start a flamewar here
--
Kevin Doherty
kdoherty+slashdot@jurai.net
Re:Hotmail (Score:2)
They also state that along with Solaris being used, Windows NT is also used, but they fail to mention how/where it is used, so my guess would be as devel, and not production.
My favorite quote from the article is "Solaris is Hotmail's legacy production operating system". bwuahahaha.
Run, do not walk, away from Exchange. (Score:5)
MS would like people to believe that Exchange is an enterprise-level communications tool, when in fact it is a butchered and bloated descendant of a mediocre 1992 X.400 email system from Data Connection Limited (check out http://www.datcon.co.uk/press/messserv.htm [datcon.co.uk]). Don't believe the version number; Exchange is in its second major release (4.x really is 1.x, 5.x = 2.x, etc) and still has significant stability problems. [slashdot.org]
In my experience, Exchange can support 300 users per server happily on commonly acceptable x86 corporate server hardware (say, a 2 processor PII with 512mb ram). It seems that (in my limited experience, lest MS lawyers take this to be a declaration of fact, which it is not) once you've reached this level, doubling the ram and adding more cpu's has only a minimal effect, which means that you really have to add more servers to add capacity.
Let's do the math. 25,000 users at 500 users per server (to be quite generous) means that you're going to need a Windows NT server farm of about 50 systems just to do email. Again, being generous bargain hunters, let's say you can buy one of these servers for $10kUS. That means you're out $500,000 just for hardware. In my experience, you can support 500 POP users easily on a SPARC 2 or IPX, which can be had these days for about $500 decked out (including a 17" monitor). You could support the same (probably many more) on a $500 x86 box running any of the free *nixes. Assume you blow $500 on disk storage for these boxen just to level the starting line, bringing the total cost to $1000 per. That's still only $50,000.
One less zero usually gets the accountants' attention on an expenditure like this.
But let's talk about administrative support. IMHO you're going to need 1:1 admin per NT server at that usage level, given that remote admin of NT is difficult, and 500 users per server is going to prompt more than the occasional pretty blue interface. (Never mind the security team you're going to need for a major NT installation.) Say a cheap NT admin costs $50kUS including benefits & overhead. You're looking at an HR budget of $2,500,000us. On the other hand, say you splurge and spend $150kUS per *nix admin. If they couldn't handle 10 little boxen apiece, I'll eat the electrons this was posted with. That's an HR budget of $750,000us.
That's 1/10th the hardware expense and 1/3 the maintenance expense of using Exchange. And that's (a) making some wild assumptions that benefit the Exchange argument, and (b) assumes that you're running *nix on shit hardware. Spend 5 times as much on hardware for new, supported stuff (say $250,000us, which would buy you a couple of well-outfitted Sparc 4500s, or 10 really gorgeous systems from VA Research [varesearch.com]). Your downtime will become next to nothing, you'll still have spent only half of what you would have for NT and Exchange, and your ongoing yearly administrative cost will be 1/3 of the other option. The *nix administrative savings alone will pay for the *nix hardware in a few months.
Oh yeah. I forgot the expense of 50 copies of Windows NT, 50 copies of Exchange Server, and 25,000 client licenses... (*erk*!!)
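For anyone who wants to rerun the numbers with their own assumptions, the arithmetic in this post can be sketched in a few lines of Python. Every figure below is taken from the post (25,000 users, 500 users per server, $10k vs. $1k per box, one NT admin per server at $50k, one *nix admin per ten boxes at $150k); the function and variable names are just for illustration, and software licenses are left out, as they are in the estimate above.

```python
import math

USERS = 25_000
USERS_PER_SERVER = 500


def farm_cost(cost_per_server, servers_per_admin, admin_salary):
    """Return server count, hardware cost, admin count and yearly staff cost."""
    servers = math.ceil(USERS / USERS_PER_SERVER)
    admins = math.ceil(servers / servers_per_admin)
    return {
        "servers": servers,
        "hardware": servers * cost_per_server,
        "admins": admins,
        "staff": admins * admin_salary,
    }


if __name__ == "__main__":
    nt = farm_cost(cost_per_server=10_000, servers_per_admin=1, admin_salary=50_000)
    nix = farm_cost(cost_per_server=1_000, servers_per_admin=10, admin_salary=150_000)
    print("NT:  ", nt)
    print("*nix:", nix)
    print("hardware ratio:", nix["hardware"] / nt["hardware"])
    print("staff ratio:   ", nix["staff"] / nt["staff"])
```

Changing `USERS_PER_SERVER` to the 300 users per server quoted earlier only widens the gap on the NT side.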
Cyrus (Score:2)
It's also trivial to write scripts to automate the management of the server, so you can create a new user quickly and easily.
Two years ago I installed Cyrus at a company that was using NT domain servers for their logins on all the client machines. Quick patch to Cyrus to work with PAM, and a SMB PAM module, and people were able to check their mail using their NT passwords without having any security issues of having all those users on the mail server.
I also hacked something together that automagically created the mailbox when an IMAP connection was attempted with a username/password on the NT domain that a mailbox didn't already exist for, so the NT-centric admins didn't need to ever touch the mailserver.
The number of users is much smaller, but other installations have shown that Cyrus will scale, so the ability to extend it like this is also important.
Notes from Linux '99 (Score:3)
The open source solution was much more cost effective and has proved fairly stable.
Unfortunately the proceedings from the event are not yet online, however I'll try and forward you a copy (or post a link to this thread) as it may prove useful to you.
--
Re:No recommendation... (Score:5)
A sysadmin at, ahem, a "large jeans manufacturer" was put in charge of Exchange on hundreds of NT servers. He dutifully logged and reported dozens of bugs, system outages, etc., to MS support, as the thing crashed and burned like the Hindenburg II. After a few months of this, Microsoft decided to act on the problems. The solution was simple: they sent a letter to his boss saying he was a troublemaker.
Where are "Exchange horror stories" online? cost? (Score:3)
Knowing MS Exchange is a "Bad Thing", and wanting to save the company money where possible, I decided to search the web for a collection of "horror stories and MS Exchange"... to my surprise I couldn't find ANYTHING!
Now I've seen articles here and there (InfoWorld, news.com etc.) about Exchange bugs, but I would have thought SOMEONE had collected URLs and posted them. Nothing. I'd have to do a lot of research to get this info, and given my workload it would be an unwise distraction.
The second thing I'd like to know, is how much does MS Exchange COST? I know the price varies, and larger companies get breaks if they "cozy" up to MS, but that doesn't help me much. Say a company has 50-150 employees... what does that translate into just for the software licensing?
Re:No recommendation... (Score:5)
Listen to this advice, it's obviously born on the hard back of experience, just as much as me reiterating this same line: do not use exchange.
For example:
This is only a start, but I'm sure other people have many of their own reasons as well...
I remember our migration of a mere 750 (users) with extreme horror. We had to manually create each user.
You can create mailboxes in exchange via a config file with the mailbox import tool, although I figured it out by looking at files it created and not via any documentation. With exchange 5.5 I'm pretty sure you can create mailboxes with ldap (although this is far from documented last I looked).
As to solutions, I haven't used any open source email solutions with more than ~5000 users, for which sendmail and the UW pop3d and imapd worked well for the users that I had (many were very light on email). It'd be really neat to integrate an MTA and an IMAP server with LDAP to support IMAP referrals and smart mail redirection. I know some of this is done, as sendmail has LDAP patches and example rules for this, but I'm not so sure about the IMAP side.
Why Reinvent the Wheel? (Score:4)
Re:Outlook doesn't scale, look at other solutions (Score:2)
Exchange => Pain (Score:3)
Check out Cyrus, from Carnegie Mellon, which is gratis (but not free).
Or maybe you'd like to spend some money. Then there are lots of companies, like Mirapoint [mirapoint.com], who I work for.
I have 40,000 pop3 users currently with FreeBSD (Score:2)
I use the qpopper with a *lot* of local modifications for security and performance. A custom perl+mysql system manages the userids locally and talks to a campus-wide "meta-directory" which allows people to manage the users from their Winblows machines...User management is probably a bigger problem than performance.
IO will be your biggest concern, followed closely by getpw* calls, network bandwidth, then RAM and/or CPU. There are lots of other issues such as expiring mail and preventing/detecting mailbox corruption.
Cyrus IMAPd will solve a lot of problems with IO bandwidth, quotas, expiring mail, etc...but it will require more RAM to sustain more simultaneous connections, and more disk space as users can/will/should leave more mail on the server. I have not tested Cyrus in a large scale environment...yet...
Sendmail works well. Other mailers such as qmail may work as well, and many claim to be more efficient, but a properly configured sendmail environment is hard to beat. Any reasonable mailer should be adequate, though; the actual MTA load shouldn't be that great. Now, delivering to the mailbox, that's another story.
Feel free to contact me directly if you desire any more details or statistics.
Re:I had Exchange to work well (Score:2)
Amazing how little hardware we had to use.
---
qpopper bad for large mailboxes (Score:2)
We were using sendmail at the time, so we started using qmail as the local delivery agent. And pop agent of course. Eventually we switched entirely to qmail.
One thing to watch out for regardless which solution you use is that (last time I looked) linux (or is it ext2?) is limited to 16-bit uids. There's ways to get around that; I just wish we'd considered it when we started.
Large Scale free email (Score:5)
I know of three potential semi-free solutions.
Carnegie Mellon Cyrus (go to the FTP site and download the latest version. Don't rely on the way out of date web page to link to it.) IMAP server.
University of Washington's imapd. This seems to be under more active development, and supports a nice range of features, mailbox formats, and security mechanisms. However, it uses the passwd file (although you might be able to get around this using PAM) and it doesn't natively support quotas (although you can do this at the OS level).
Dartmouth's Blitzmail Server: This has been ported to linux, and is *wonderfully* scalable across multiple machines. It includes its own directory services too. The only problem is that it doesn't support IMAP (although some work has started on that front), and the only database it supports as a backend is oracle. I would love it if someone hacked it to use mysql or postgresql with IMAP support, but that's a tall order. The client is also under-featured.
All of these have their drawbacks though. You might wish to go with a commercial IMAP/POP server on linux. There are a few good ones that exist. You definitely don't want to go with exchange. A lot of people go that route because they are forced to. My experience with exchange 5.5 was so bad that I would not recommend it to anyone.
-OT
Hotmail (Score:2)
see Horms' paper from the CALU conference (Score:2)
larger Australian ISPs. They run Linux internally
and he presented a paper at CALU on building a
large and scalable mail system.
See:
http://www.linux.org.au/projects/calu/cdrom/pap
for the conference paper.
Well, over here at Cisco... (Score:5)
You gave it away (Score:2)
Hmmmm....who could this be?
---
Put Hemos through English 101!
No recommendation... (Score:3)
If you are going to set up 25,000 users, do not, repeat NOT, use Exchange. I remember our migration of a mere 750 with extreme horror. We had to manually create each user.
Of course I was simply a lowly programmer working under the direction of our totally incompetent network admin--maybe there was an easier way and she missed that topic in the training the week before.
What you really need is a requirements analysis. Exchange is a totally different thing than, say, Sendmail. Analyzing what you need will tell you which to go with. For instance, do you need public folders, scheduling, etc? If so, maybe use Exchange. Do you need configurability, speed and Internet email? Then you want not-Exchange.
---
Put Hemos through English 101!
Sendmail @ Netcom (Score:3)
We noted that directory lookups got worse in a distinct knee -- i.e., we had no problems for a long time and then we hit a magic number and things went all to hell. I do not know offhand how well linux or Solaris deals with directory lookups, but you could test easily enough.
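Testing this on your own OS is indeed easy. Below is a rough Python sketch of such a test: it fills a single directory with empty files and times random `stat()` calls. The function name and the file counts are arbitrary choices for the example, and a warm filesystem cache will hide much of the disk cost, so treat the numbers as a hint rather than a benchmark.

```python
import os
import random
import tempfile
import time


def time_lookups(n_files, n_lookups=1000):
    """Create n_files empty files in one directory and time random stat() calls.

    Returns the average number of seconds per lookup.
    """
    with tempfile.TemporaryDirectory() as spool:
        names = ["user%07d" % i for i in range(n_files)]
        for name in names:
            open(os.path.join(spool, name), "w").close()
        sample = [random.choice(names) for _ in range(n_lookups)]
        start = time.perf_counter()
        for name in sample:
            os.stat(os.path.join(spool, name))
        elapsed = time.perf_counter() - start
    return elapsed / n_lookups


if __name__ == "__main__":
    for n in (1_000, 10_000, 50_000):
        print(n, "files:", "%.2f us per lookup" % (time_lookups(n) * 1e6))
```

If the per-lookup time jumps sharply between two sizes, that is the knee; keep each spool directory below it.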
The thing you didn't tell us was what the volume would be like; the number of users matters for the mail spool but the number of email messages matters for the CPU usage... I suspect that you won't need a very heavy box, though. Email is cheaper than you might think.
Oh. Run a DNS server on the mail hub, to avoid a lot of lengthy DNS queries on some other poor machine. Flush the cache daily.
Exchange support (Score:2)
So, unless you're willing to fork over the $$$$ for consultants from Digital to come and build the whole thing for you, I'd avoid Exchange like the plague for a project this size.
Here at a mailing list company. (Score:2)
That's me: here are the details (Score:4)
Re:No recommendation... (Score:3)
I use and maintain an Exchange server (well, three) and the main server consumes 10 gigs of a harddrive and all of a 333 MHz Pentium. This is for about 200 users and most are not that active.
Besides the hardware overhead there are other negatives to Exchange. Namely, it does not route internet traffic well, it has poor error reporting, and it "clusters" badly. I'll take each point one by one.
My company has affiliates in small offices around the world and they have neither the on-site resources nor the talent to maintain an e-mail server, so these offices use our Exchange server as POP3 and SMTP. This creates an open relay and all attempts to close the relay have met with stiff opposition -- users complaining they now have to use a password, cannot remember what domain they are on, and general users resisting change. At the moment, Exchange has no true "Back Office" solution for this problem and I would have to personally configure all of our affiliate offices if I want to completely secure routing.
The error reporting comes down to this -- either you log all of the messages passing through Exchange or none of them. I wanted to log the messages that caused errors, for obvious reasons, and after about 4 days noticed the drives filling up with archives of all the messages, not just those messages generating errors. Microsoft admits this is a problem but there is still no fix, at least not in SP2.
And finally, "clustering". I'm not talking about true clustering but instead about using multiple Exchange servers to distribute the load somewhat. We have two e-mail domains and wanted to start putting people on the second domain to balance the load. Each server runs fine on its own but for some reason they hate talking to each other. The replication services keep stopping (pausing, really) and site connector is more frustrating than helpful.
I have not had many problems with our Exchange server otherwise. It runs forever and reliably. It has the longest uptime of any of our NT machines, only needing a reboot every month or two. However, I'd think long and hard before accepting a job caring for 25,000 users' e-mail if the server were NT. For anything over about 1000 users you should look elsewhere if you can.
Mail cluster (Score:2)
We use exim, Qpopper with the mysql patch, mon, fake and rsync. Each base box holds 88 GB of data and is fully duplicated (double delivery with exim, and further synchronization with rsync). The switch between a main base box and its double is handled by mon and fake. A hot spare then reconstructs a new double; delivery and popper deletions are blocked during the reconstruction.
Two problems aren't solved yet:
- raid 1 between boxes
- imap
I hope that imap will work once nfs locking is reliable. For raid 1 over boxes I have a very tiny hope that nbd could be a solution.
Anyway, we made some tests, and it somewhat works already. We are now tuning various parts and writing procedures to handle the beast and react to failures (our current estimate is one major but manageable failure every month).
If you have ideas of working solutions for my 2 problems, don't hesitate to share.
Nicolas
Scaling the box might be the real problem... (Score:2)
Solaris will do this, but you will probably need to run it on a _big_ box, like a Sun Ex500 class machine with about 8 or more processors. And get their SIMS product, too, it's pretty well optimised for the high end. Other high end commercial unixes like AIX and IRIX will no doubt scale this far as well.
If you are able to go distributed (ie, the organisation is easily divisible geographically), then something like Linux or FreeBSD with qmail or smail will probably cut it.
Beware that exchange servers offer a fairly high level of integration with Outlook, which a product based on open standards will not be able to deliver.
Exchange on NT for 25000 people??? PLEASE, NO!! (Score:3)
First of all, suggesting to implement an NT solution for an organization of that size is already tempting your job security, but to actually do it?
Assuming standard users and needs for this system, I can only recommend using a Lotus Notes/Domino system. If you've got the cash, there is simply no better solution out there, or even close.
Run Domino (the server end of Notes) on several UNIX servers. Solaris (SPARC and x86), AIX, and HP/UX are all supported, with a Linux port on its way in Q4 99 per DevCon (Caldera 2.3, currently in beta, and Red Hat 6.0 will be supported, as well as SuSE 6.1 and Pacific HiTech).
Notes has got all you'll ever need, and R5 simply blows away anything M$ has out there. You've got to pay for seats with Notes, but to tell you the truth, Exchange is free, and you get less than you pay for.
Plus, your users can run the Notes client on any Win32 they think is prettiest (please tell me you'll use NT and not 9x on the client end).
Look at this [weightlessdog.com] for a guy in your situation who had to deal with Exchange.
Some other really good links are here [sandia.gov], here [computerworld.com], and here:
http://www.notes.net/50beta.nsf/7d6a87824e2f097
(problem with the last one: copy it and cut out the space that is stuck between the zeros, the href tag keeps putting it in! It is a great article though : )
(TIP: Show the guys with the money those links so they know why you should use a Domino/Notes solution.)
50 000 clients quite easy. (Score:2)
FreeBSD Does the Web. Solaris does the database. (Score:2)
Outlook doesn't scale, look at other solutions (Score:4)
This is a large client trying to implement a server farm of 20+ NT machines, each server supporting 600-800 users, and combining the whole lot into a coherent whole. Fortunately I only have to fix their poor network designs. The team of administrators now numbers more than 50, most are MCSEs, none with less than 5 years' experience with Micro~1.oft products. They are tearing their hair out on a daily basis. Complaints number in the hundreds every day, and that's just the users who haven't given up completely.
My advice is to start looking at the larger commercial products, possibly Netscape's server. Get a reputable vendor to support it.
If you look at open source systems, start with OpenBSD and NetBSD.
Divide your system up between the MTA doing delivery/reception of the messages, and the MTA serving the users. It's OK if email to the outside world goes down for short periods of time; it's almost expected. But if users can't get to their mailbox 100% of the time, you will look bad.
You also need to look at managing more than 32000 or 65000 users in the future; remember that various *nixes have either 15 or 16 bit UID fields. You should make sure user accounts/authentication/logins are separate from any UID system on any machine type. This means getting some kind of medium sized DB, and tying it into your auth and login schemes. Others have done it, it's not that hard (look at AOL with 10 million+ user accounts).
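A minimal sketch of what "accounts separate from UIDs" can look like, in Python with an in-memory SQLite database standing in for the medium sized DB. The table layout, function names and mailbox paths are all hypothetical; the point is only that the login name maps to a mailbox path and a salted password hash, and never to a system UID, so the number of mail accounts is not bounded by the width of the UID field.

```python
import hashlib
import hmac
import os
import sqlite3


def open_db():
    """Create the (hypothetical) mail user table in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mail_users ("
        " login TEXT PRIMARY KEY,"
        " salt BLOB NOT NULL,"
        " pw_hash BLOB NOT NULL,"
        " maildir TEXT NOT NULL)"
    )
    return db


def _hash(password, salt):
    # Salted, iterated hash so the table never stores clear-text passwords.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def add_user(db, login, password, maildir):
    salt = os.urandom(16)
    db.execute(
        "INSERT INTO mail_users VALUES (?, ?, ?, ?)",
        (login, salt, _hash(password, salt), maildir),
    )


def authenticate(db, login, password):
    """Return the user's mailbox path on success, or None on failure."""
    row = db.execute(
        "SELECT salt, pw_hash, maildir FROM mail_users WHERE login = ?", (login,)
    ).fetchone()
    if row is None:
        return None
    salt, pw_hash, maildir = row
    if hmac.compare_digest(_hash(password, salt), pw_hash):
        return maildir
    return None


if __name__ == "__main__":
    db = open_db()
    add_user(db, "alice", "s3cret", "/var/spool/vmail/a/alice")
    print(authenticate(db, "alice", "s3cret"))
    print(authenticate(db, "alice", "wrong"))
```

In a setup like this the POP/IMAP daemon runs as one system account that owns all the mail, and the daemon's auth hook calls something like `authenticate` instead of getpw*.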
the AC
A request for rewd (Score:2)
Thanks...
Re:Scaling the box might be the real problem... (Score:2)
I work for the second (or third, I forget) largest Sun reseller on the east coast. I have set up mail systems for several fortune 100 companies with 10k+ users. Anyone recommending a *500 series machines for this number of users is insane, especially with more than 2 processors.
None of the MTAs out there are capable of making use of an SMP system, so anything more than 2 processors is really going to go to waste. (This is not entirely true; however, disk bottlenecks are far more critical to system performance.)
I have also set up mail systems based on FreeBSD. My last box was a Dual p][450 with 2 gigs of ram and a pair of mirrored seagate cheetah system disks. The machine has a pair of SmartRAID IV caching RAID controllers from DPT with 64 megs of cache. Connected to each controller is a series of seagate cheetah hard drives in 4 DPT Drive cabinets (per controller). The controllers run RAID 0+1 for maximum performance and reliability.
The OS itself has been configured with a large MAXUSER limit and it is running Postfix using an LDAP server and running UW imapd (all hacked slightly to work together more smoothly). The system is also configured with softupdates to improve FS performance. This system is as fast as anything I have ever used. It is easily capable of handling 5 million messages a day. That works out to 200 messages a day per person on a 25k user system. Needless to say this system continually outperforms my expectations.
I have set up similar systems on Sun hardware but the high cost of that hardware makes these solutions prohibitive. It also makes it a lot harder to get a system to do what you want it to do if you can't hack the source code a little.
In the end a freeware solution like FreeBSD is more than up to the task of handling a large mail system like this. The only issue is proper configuration of the system. This issue applies to Solaris on Sun hardware as well, so it should not be mistaken as only a freeware problem.
Re:qpopper bad for large mailboxes (Score:2)
A good way to get around this is using qmail's LDAP patch. This way, you only need qmail's own local users. You should be able to convert your existing users to LDAP with no problem.
Qmail with 50,000+ (Score:2)
If you need IMAP, it gets tough... Except for IMAP I'd recommend Qmail.. it's the most robust thing, besides the NetApp that we have.. With NetApp and a RAID0 backed queue drive it screams...
You could do all of the above with Penguins or VARs for pretty cheap... at a guess 10,000 excluding the NetApp... If you use a beefy linux box with a fast raid 5 for the NFS server back end you'll also allow your servers to "load gracefully"
If you need IMAP but on a single domain, use the UW IMAP server... It even comes as an RPM and looks great... If you need multiple virtual domains like we do... IMAP looks pretty grim...
As it is when we go to web based email it looks like we're going to have to do a WebBased POP client...
Careful though, IMAP can get *WAY* more abused though... With IMAP you have the tendency for people to park and use more space... With POP it's just grab and go..
Resource wise POP is a better bargain and most clients can deal with it just fine... IMAP isn't worth the server load IMHO... Use qmail anyway ya can...
Re:Run, do not walk, away from Exchange. (Score:2)
First off I have 1200 users on a dual processor 16G hard drive, 512mb ram system. It has run 372 days without crashing/reboots/etc. Mail delivery is fast enough that it might as well be a chat room at times from people sending emails and replying so fast.
It has taken a good 4 waves of the Melissa virus without crashing or even blinking hard.
Other than adding users and deleting users there is NO, I reiterate- NO, other work done on it. The damn thing just runs. Period. No extra maintenance at all.
Yes the license cost blows goats. Yes MS does too. No Exchange isn't all that bad for a large scale environment if the people setting it up have a clue.
Meta (Score:2)
The paper is available here [stacken.kth.se] (in postscript).
At the talk I had the impression that the software was free. I cannot find it on their (skimpy) web site though.
From their description, 25,000 users wouldn't begin to make it sweat.
MCIS (Score:2)
i would recommend against using ms exchange as i know many people who have had loads of trouble with it.