How Facebook Runs Its LAMP Stack
prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."
Re:One question: (Score:5, Informative)
About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!
I haven't watched the presentation so I don't know if this is answered there, but it's hard to pin down any numbers on precisely how many servers Facebook operates. That said, an estimate of their expected power usage in their recently acquired second datacenter [datacenterknowledge.com] is 6 megawatts, placed at twice the usage in their current datacenter. Realistically, this probably equates to a cluster of around 5,000 machines in the current datacenter.
Costs per machine are likely to be restricted to Windows Server Web Edition; other software would not be needed on all machines (depending on cluster architecture, of course) so would be a trivial cost in comparison. Retail for the Web Edition is $399; I think we could expect such a high-profile user to qualify for a 50% discount. This would put their software costs at about $1M (roughly 5,000 machines at about $200 each). Considering that they're believed to have spent over 100 times this on hardware and support costs over the last year, I doubt this would be a particular concern. Price of purchase is not a factor in why Facebook does not run on proprietary software.
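For anyone who wants to check the arithmetic, here is a rough sketch in Python. The inputs are the estimates from this comment (machine count, retail price, guessed discount), not confirmed Facebook figures, and the names are made up for illustration. The watts-per-machine number is only what the 5,000-machine guess implies.

```python
# All inputs are this comment's estimates, not confirmed figures.
RETAIL_PRICE_USD = 399        # Windows Server Web Edition, retail
ASSUMED_DISCOUNT = 0.50       # guessed volume discount
ESTIMATED_MACHINES = 5_000    # guessed size of the current cluster
SECOND_DC_WATTS = 6_000_000   # 6 MW estimate for the second datacenter


def license_cost(machines, unit_price, discount):
    """Total licence cost for `machines` servers at a discounted unit price."""
    return machines * unit_price * (1 - discount)


# The current datacenter is said to draw half of the second one's 6 MW.
current_dc_watts = SECOND_DC_WATTS / 2
implied_watts_per_machine = current_dc_watts / ESTIMATED_MACHINES

total = license_cost(ESTIMATED_MACHINES, RETAIL_PRICE_USD, ASSUMED_DISCOUNT)
print(f"Implied power per machine: {implied_watts_per_machine:.0f} W")
print(f"Estimated licence cost: ${total:,.0f}")
```

Changing the discount or the machine count moves the total proportionally, so the "about $1M" figure is only as good as those two guesses.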
Re:One question: (Score:1, Informative)
I think the real question is, if they run with LAMP so much, how come they have, and are advertising for, so many Oracle developers?
Palo Alto, CA
Description
Facebook is seeking an Oracle Applications Database Administrator to join the IT team and help build and maintain the IT application footprint. This is a full-time position based in our main office in downtown Palo Alto and will report to the manager of IT Development.
Re:Both Java and PHP Are Interpreted (Score:4, Informative)
Both Java and PHP are interpreted languages because this is how you create a cross-platform language.
Each gets compiled to bytecode which gets executed in an OS-specific VM.
Java is JIT compiled to native code, whereas PHP is bytecode interpreted. The difference is more than an order of magnitude. In fact, judging by this comparison [debian.org], in many cases Java is about 100 times faster than PHP.
Frankly, most websites do not need an app server. Wikipedia uses PHP, not Java. It is not a 'simple' website that you say PHP is suited for.
Wikipedia is presenting uncustomised content to most users. It runs a huge squid cache in front of its PHP servers. If it tried to run PHP for each user it would crawl. I run mediawiki locally on an AMD Athlon64 2200+. It takes ~0.2 seconds of 100% CPU time to process a simple request. There is simply no way Wikipedia could run without content caching.
This is not to say that the task of serving that content is cheap. But they're doing a lot better than Facebook; they're serving 30,000 requests/sec with only 350 servers. The difference, I suspect, is mostly down to the amount of caching they perform.
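To put rough numbers on that, here is a back-of-envelope sketch in Python using only the figures quoted in this comment. It assumes, purely for illustration, that the ~0.2 s measured on my local Athlon64 is representative of one production CPU, which it almost certainly is not exactly. The variable names are made up.

```python
# Figures quoted in the comment; the 0.2 s timing is a local measurement
# and is ASSUMED here to be representative of a production CPU.
REQUESTS_PER_SEC = 30_000
SERVERS = 350
CPU_SECONDS_PER_UNCACHED_REQUEST = 0.2

# Average load each server actually handles.
per_server_rate = REQUESTS_PER_SEC / SERVERS

# What one fully busy CPU could serve if every request ran PHP.
uncached_rate_per_cpu = 1 / CPU_SECONDS_PER_UNCACHED_REQUEST

# CPUs needed to serve the whole load with no caching at all.
cpus_needed_uncached = REQUESTS_PER_SEC * CPU_SECONDS_PER_UNCACHED_REQUEST

print(f"Observed rate per server:  {per_server_rate:.1f} req/s")
print(f"Uncached rate per CPU:     {uncached_rate_per_cpu:.1f} req/s")
print(f"CPUs needed with no cache: {cpus_needed_uncached:.0f}")
```

Under those assumptions each server handles roughly 86 requests/sec against about 5 requests/sec for one CPU running PHP on every hit, which is the gap the cache has to cover.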
Facebook is much less able to cache content. It doesn't have a squid front end because relatively few users see the same exact content, unlike for Wikipedia; most users are logged in most of the time and see pages customised for themselves.
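A toy illustration of why personalisation hurts a front-end cache, sketched in Python. The workload (100 users each viewing the same 10 pages once) and the function name are invented for the example. It models a cache that never evicts and says nothing about how either site's cache is really implemented.

```python
from itertools import product


def hit_rate(requests, cache_key):
    """Fraction of requests served from a cache that never evicts."""
    cached = set()
    hits = 0
    for request in requests:
        key = cache_key(request)
        if key in cached:
            hits += 1          # an earlier request already produced this page
        else:
            cached.add(key)    # first time: generate the page and store it
    return hits / len(requests)


# Hypothetical workload: (user_id, page_id) pairs, every user views every page.
requests = list(product(range(100), range(10)))

# Same content for everyone: the cache key is just the page.
shared = hit_rate(requests, lambda r: r[1])

# Content customised per user: the cache key must include the user.
personalised = hit_rate(requests, lambda r: r)

print(f"Shared pages hit rate:       {shared:.0%}")
print(f"Personalised pages hit rate: {personalised:.0%}")
```

With shared pages only the first view of each page misses; once the key has to include the user, every request in this workload is a miss.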
Re:Yeah, Blame the Language (Score:5, Informative)
Depending on the application, PHP can handle several hundred transactions per second, on *one* machine. It is common knowledge that Java requires far more resources to achieve a typical transaction rate, than PHP.[citation needed]
This is just bullshit. A Java-based server will typically require a fairly constant 64MB more RAM than an equivalent PHP server, but other than this the Java system will outperform PHP in every sense. If the content generation is even remotely complex, Java can be up to 100 times faster, which translates to a 100 times higher transaction rate.
Sure, PHP can handle several hundred transactions per second, if your script is <?php echo "hello world"; ?>. This benchmark [mindcraft.com] of a non-trivial e-commerce application shows that Java can easily handle 500 requests per second on a small 2000-era 4-CPU cluster. A modern quad-core server should be handling at least 20 times that rate, absent any improvements in Java architecture since then (and there have been many; this test was run on Java 1.1, which was hideously slow compared to modern Java versions), and ignoring the performance improvement from not having to load balance requests at the front end or access the database server across the network.