How Google Uses Linux
postfail writes 'lwn.net coverage of the 2009 Linux Kernel Summit includes a recap of a presentation by Google engineers on how they use Linux. According to the article, a team of 30 Google engineers is rebasing to the mainline kernel every 17 months, presently carrying 1208 patches to 2.6.26 and inserting almost 300,000 lines of code; roughly 25% of those patches are backports of newer features.'
A New Culture (Score:3, Funny)
Re:A New Culture (Score:5, Informative)
Funnily enough the roads were there before the cars.
Re: (Score:3, Informative)
Fair chance you were just joking, but I figure, why not go on an info dump?
Release the patches already (Score:5, Interesting)
I. Want. This.
Re:Release the patches already (Score:5, Informative)
Try iotop.
http://guichaz.free.fr/iotop/ [guichaz.free.fr]
Re:Release the patches already (Score:5, Funny)
Can we donate some money and buy these people a site that _doesn't_ look like a goatse link?
Re: (Score:2)
I like the way the site is designed. Nice and simple, not like some sites where you have to turn to Google to find a single page.
Re: (Score:2)
I think he was referring to the URL rather than the actual site.
Re: (Score:3, Funny)
In contrast to goatse, which has all the content on one page?
Tough (Score:3, Informative)
Google does not distribute the binaries, so they are not obliged to publish the source.
Re: (Score:3, Interesting)
TFA does suggest, though, that Google has gotten itself into a horrible mess with its local changes and would be better off offloading its stuff to the community and taking properly integrated releases.
Re: (Score:1, Informative)
I think TFA also points out how unwise it is to base all your work on an old kernel just because it's supposed to be the well-known stable release used in the organization, and then waste lots of human resources backporting features from newer kernels. This is what Red Hat and SUSE used to do years ago, and avoiding it is the main reason Linus set up the new development model. Google could learn from the distros; they could probably use all those human resources to follow kernel development more closely.
Re:Tough (Score:5, Insightful)
Yeah great work Linus.
The distros STILL stick with older versions and backport fixes, because who in their right mind is going to bump a kernel version in the middle of a support cycle? It's even MORE broken because the kernel devs rarely identify security fixes as such, and often don't understand the security implications of a fix, so they don't always get backported as they should.
The Linux dev model is NOT something to be proud of.
Re:Tough (Score:5, Funny)
Indeed:
"The Linux dev model is the worst form of development, except for all those other forms that have been tried from time to time." - Winston Churchill
... Oh wait, no. That was me, actually.
Re: (Score:3, Insightful)
Oh, actually I think the form of development used by the BSDs is a lot better. At least it is a lot more efficient. They don't just crap out software and then deprecate it as soon as it remotely works (HAL).
Re: (Score:3, Informative)
If you read [lwn.net] about [lwn.net] them, you'd know that devtmpfs just populates /dev with device nodes at boot.
Then during init, udev's job is to parse udev rules, add user configuration, and fix the permissions of nodes in /dev.
Re: (Score:3, Interesting)
Indeed:
"The Linux dev model is the worst form of development, except for all those other forms that have been tried from time to time." - Winston Churchill
... Oh wait, no. That was me, actually.
Holy humour-impaired down-modding, Batman! How is the above a troll?
For those too dense to get the joke: I actually agree that the Linux development model has significant weaknesses. It's just that, despite its shortcomings, it actually has proven workable for many years now.
I'm not implying that there aren't better community-driven coding projects in existence. Nor do I want to suggest that critiquing the community is unwarranted (or even unwanted). It's just that, for all its warts, it has produced consistently usable kernels for a long time.
Re: (Score:2)
The Linux dev model isn't the only one in existence, isn't the only one that has endured for a lengthy period of time, and certainly isn't the only one with a plausible claim to being the best (unlike democracy in the quote).
The FreeBSD (core-team) development model has certainly been around, in its current form, longer than the Linux kernel's, and it doesn't suffer from the same problems.
Re: (Score:2)
You are right... ...except that the distro makers and the kernel hackers think the contrary.
Re: (Score:2)
How is your fork coming along?
DTrace (Score:2, Informative)
I. Want. This.
DTrace code:
#pragma D option quiet
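/* For each I/O start event, aggregate bytes (b_bcount) by device name, program name, and PID. */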
io:::start
{
@[args[1]->dev_statname, execname, pid] = sum(args[0]->b_bcount);
}
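/* On exit (Ctrl-C), print a header and the aggregated per-device/app/PID byte counts. */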
END
{
printf("%10s %20s %10s %15s\n", "DEVICE", "APP", "PID", "BYTES");
printa("%10s %20s %10d %15@d\n", @);
}
Output:
# dtrace -s ./whoio.d
^C
DEVICE APP PID BYTES
cmdk0 cp 790 1515520
sd2 cp 790 1527808
More examples at:
http://wikis.sun.com/display/DTrace/io+Provider
Re: (Score:2)
I thought about this too, and I guess your example shows why they don't use it...
All the printfs are a burden, so they probably keep the stats in memory and do something else with them.
The above should be so much modded up (Score:2)
People just don't know about the innovation that has been going on in the storage arena by Sun.
It is funny, but I think what let Sun down was their marketing, not their Engineering.
You can currently get storage devices that run those diagnostics at the click of a mouse.
How regrettable that this wonderful technology may be shelved. Tragic really.
Re: (Score:2)
SystemTap [sourceware.org]
Use Solaris then. (Score:2)
Or buy one of their Solaris/ZFS/Dtrace based storage devices, you can do what you ask with a few clicks of the mouse...
Is it worth it? (Score:2, Interesting)
The whole article sounds so painful, what do they actually get out of it?
Google started with the 2.4.18 kernel - but they patched over 2000 files, inserting 492,000 lines of code. Among other things, they backported 64-bit support into that kernel. Eventually they moved to 2.6.11, primarily because they needed SATA support. A 2.6.18-based kernel followed, and they are now working on preparing a 2.6.26-based kernel for deployment in the near future. They are currently carrying 1208 patches to 2.6.26, inserting almost 300,000 lines of code. Roughly 25% of those patches, Mike estimates, are backports of newer features.
In the area of CPU scheduling, Google found the move to the completely fair scheduler to be painful. In fact, it was such a problem that they finally forward-ported the old O(1) scheduler and can run it in 2.6.26. Changes in the semantics of sched_yield() created grief, especially with the user-space locking that Google uses. High-priority threads can make a mess of load balancing, even if they run for very short periods of time. And load balancing matters: Google runs something like 5000 threads on systems with 16-32 cores.
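For anyone who hasn't seen the pattern, here is a minimal sketch, in C, of the kind of user-space lock that leans on sched_yield() when contended. It is illustrative only, not Google's code, and the names (yield_lock, yield_unlock) are made up; the point is that how quickly a spinning thread gets out of the lock holder's way depends entirely on what the scheduler does with sched_yield(), which is why the O(1)-to-CFS change hurt.
Sketch (C):
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical user-space lock that backs off with sched_yield() when
 * contended. Whether the yielding thread actually gets out of the lock
 * holder's way depends on the scheduler's sched_yield() semantics. */
static atomic_flag lock = ATOMIC_FLAG_INIT;
static long counter;

static void yield_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        sched_yield();              /* spin politely instead of blocking */
}

static void yield_unlock(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        yield_lock();
        counter++;                  /* critical section */
        yield_unlock();
    }
    return NULL;
}

int main(void)
{
    pthread_t t[8];
    for (int i = 0; i < 8; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 8; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);   /* should print 800000 */
    return 0;
}
Build with something like gcc -pthread yieldlock.c; the final count is the same under any scheduler, but how much CPU time is burned getting there depends on what sched_yield() actually does.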
Google makes a lot of use of the out-of-memory (OOM) killer to pare back overloaded systems. That can create trouble, though, when processes holding mutexes encounter the OOM killer. Mike wonders why the kernel tries so hard, rather than just failing allocation requests when memory gets too tight.
Ooooh... efficiency.. I'm curious what the net savings is.. compared to buying more cheap hardware.
So what is Google doing with all that code in the kernel? They try very hard to get the most out of every machine they have, so they cram a lot of work onto each.
(30 * kernel engineer salary) / (generic x86 server + cooling + power) = ?
Re: (Score:2)
How did you manage that?
Re: (Score:2)
Funny, but that search doesn't give any reliable source for the 1 million servers estimate. The only source is an estimate by Gartner, and if you believe them, you also believe that Itanium II is the best-selling 64-bit server chip.
Re: (Score:2)
Either way they have a hell of a lot of servers, far more so than is needed to justify arguments about how a few dozen salaries is peanuts compared to their hardware investment.
They also have a lot of desktops (Score:2)
Not telling how many servers they run. My numbers would be 18 months out of date anyway.
-B
Re:Is it worth it? (Score:5, Insightful)
They are already running absolutely absurd amounts of cheap hardware. "Just buying more" is something that I'm sure they are already doing all the time but clearly that only goes so far.
I suspect the answer to that is a very very small number.
Re: (Score:2)
Another way to put it... say you can squeeze 1% extra performance out of each server.
A *very* conservative estimate of 100,000 servers (I'd be shocked if they didn't have many times that) means that you now have the capacity of an extra 1,000 servers, which means 1,000 fewer servers that have to be purchased, deployed and maintained.
Re:Is it worth it? (Score:5, Insightful)
You are clearly not an engineer or scientist. Aside from the fact that some people just like to solve technical problems, I am betting Google's logic goes something like this:
We have a problem that is basically only costing us $0.01 * 10,000 computers/day. While that seems low, we plan on staying in business a long time, so we could pay someone to solve the problem. Then there is the X factor: if you don't do it, if you stop innovating, your competitors will, and they will get more and you will get less from the pool of money that is out there. In addition, the CS guy you paid to solve it is now worth more to your company (if you employed him) because [s]he now has a better understanding of a complex bit of code (the Linux kernel) that you rely on heavily.
Re:Is it worth it? (Score:5, Insightful)
That could amount to hundreds of millions of dollars saved over the next decade, and it doesn't take a genius to realize that a couple dozen programmer salaries will be a hell of a lot less than that.
Re:Is it worth it? (Score:4, Interesting)
A while back I got an invitation to work for Google as a kernel developer. I declined to interview, because I already had a job doing just that. This article makes me glad I never accepted that offer. I feel sorry for those kernel developers at Google. Porting all that code back-and-forth over and over again. Now *that's* a crappy job.
Re: (Score:2)
please read that 25% line one more time.
now remember that comes after 20% free time.
so that's 1 day backporting and 1 day free time per week. doesn't sound bad to me.
now... last time Google offered me a job I took it, so maybe that's the difference between us.
What an ass. (Score:2)
Support of older applications has a great pedigree in the IT industry.
You will find much more interesting problems from a technical point of view doing that, because you will be basically on your own.
Re: (Score:2)
You are clearly not an engineer or scientist. Aside from the fact that some people just like to solve technical problems, I am betting Google's logic goes something like this:
... because I question your efficiency? I'm keenly aware of the "just because" excuse, and to hear Google say that would make my day. They have the resources to do it for sure.
We have a problem that is basically only costing us $0.01 * 10,000 computers/day. While that seems low, we plan on staying in business a long time, so we could pay someone to solve the problem. Then there is the X factor: if you don't do it, if you stop innovating, your competitors will, and they will get more and you will get less from the pool of money that is out there. In addition, the CS guy you paid to solve it is now worth more to your company (if you employed him) because [s]he now has a better understanding of a complex bit of code (the Linux kernel) that you rely on heavily.
I see many of the things added/backported in Linux by Google are already included in other current operating systems.
Google does not sell operating systems.
What is the relationship between computing efficiency and advertising revenue?
How have these practices affected Google's bottom line?
I agree with that last sentence you wrote.
Re: (Score:2)
You commented on my aside "just because" twice, as if it were part of the main point of my response. It was not.
I see many of the things added/backported in Linux by Google are already included in other current operating systems.
In the article they talk about how they have slowly gone from one kernel release to another. Some specifics are in TFA, but the only thing unsaid that might help is that because the kernel is so complex and development so fast, Google can't just keep updating to the next release. It would put undue work on the people maintaining the google_specific_patches. It would put that much more work on their team.
Re: (Score:1, Informative)
Wait, what? Has Google seriously never heard of vm.overcommit_memory [kernel.org]?
Re:Is it worth it? (Score:5, Interesting)
Ooooh... efficiency.. I'm curious what the net savings is.. compared to buying more cheap hardware.
We're talking about Google here. They have dozens of datacenters all over the globe, filled with hundreds of thousands of servers. Some estimates even put it at a million servers or more.
So let's assume they do indeed have a million servers and they need 5% more efficiency out of their server farms. Following your logic, it would be better to add 50,000 (!) cheap servers, which consume space and power and require cooling and maintenance, but I'll bet that paying a handful of engineers to tweak your software is *a lot* cheaper. Especially since Google isn't "a project" or something. They're in it for the long run. They're here to stay, and in order to make that happen, they need to get as much out of their platform as possible.
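To make that concrete, a back-of-envelope sketch follows. The million-server and 5% figures come from the comment above; the per-server and per-engineer dollar amounts are placeholder assumptions, not real Google numbers.
Sketch (C):
/* Back-of-envelope version of the parent's arithmetic. The server count
 * (1,000,000) and the 5% efficiency figure come from the comment above;
 * the dollar figures are hypothetical placeholders. */
#include <stdio.h>

int main(void)
{
    const double servers         = 1000000;  /* assumed fleet size (from parent) */
    const double efficiency_gain = 0.05;     /* 5% more work per machine */
    const double cost_per_server = 3000;     /* hypothetical: hardware + power + cooling, per year */
    const double engineer_cost   = 250000;   /* hypothetical: fully loaded salary, per year */
    const double team_size       = 30;       /* from the article summary */

    double servers_avoided = servers * efficiency_gain;
    double savings         = servers_avoided * cost_per_server;
    double team_cost       = team_size * engineer_cost;

    printf("servers avoided: %.0f\n", servers_avoided);
    printf("yearly savings : $%.0f\n", savings);
    printf("team cost      : $%.0f\n", team_cost);
    printf("ratio          : %.1fx\n", savings / team_cost);
    return 0;
}
Even with deliberately conservative placeholders, the savings come out more than an order of magnitude above the team cost, which is the parent's point.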
Re: (Score:2)
Google has more than 15,000 servers. A well-tuned system can outperform a poorly tuned system 2:1 for very specialized apps like Google's. You don't think that having 15,000 vs 30,000 servers is worth maybe $2 million in wages and power bills? Google had a $2 million power bill per month. Those developers are starting to look pretty cheap.
Increasing the efficiency of their code, from memory management and the scheduler to proxy servers, can save huge amounts of CPU time, which in turn lowers electricity requirements and cooling costs.
Re: (Score:2)
People seem to be ignoring in this equation that this team of engineers becomes deeply familiar with the Linux kernel and likely participates in a lot of problem solving and strategic work on the side. Knowing Google, they are confronting this patch migration problem from a high level and generally thinking about *all* the problems in the Linux kernel development and maintenance space. I'm sure this mess also counts toward code review against their mission-critical infrastructure and their general handling of it.
Re: (Score:2)
I'm not quite sure where it says in the whole OSS ethos that making a profit from OSS is against the rules. Red Hat has been doing it for a while, as has IBM, and I'm sure Dell et al. wouldn't be selling Linux PCs and servers if they weren't making money from doing so. Google has released the source to loads of different stuff as well, so again I'm not sure exactly where you're coming from or why the insightful mod was awarded.
Low memory conditions (Score:5, Interesting)
This is something I have been wondering too. Doesn't it just lead to applications crashing more often, rather than simply reporting that they cannot allocate more memory?
Re: (Score:2)
This is something I have been wondering too. Doesn't it just lead to applications crashing more often, rather than simply reporting that they cannot allocate more memory?
It results in (practically speaking) non-deterministic behaviour, which is pretty much the worst thing you could have when it comes to system reliability. The OOM killer (a solution to a problem that shouldn't even exist) basically kills stuff at random and (at least in my experience) rarely the process that's actually causing the problem in the first place.
Re: (Score:2, Interesting)
In Unix, if malloc returns NULL, the memory allocation failed and you don't have the memory. A well-written program should check for that. Overcommitting memory can have efficiency advantages, but things can also turn out badly. Linux has heuristics to determine how much memory to overcommit, or overcommit can be disabled entirely.
http://utcc.utoronto.ca/~cks/space/blog/unix/MemoryOvercommit [utoronto.ca]
http://utcc.utoronto.ca/~cks/space/blog/linux/LinuxVMOvercommit [utoronto.ca]
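For what it's worth, a minimal sketch of the malloc-returns-NULL point above, assuming a 64-bit Linux box; which branch you actually hit depends on the vm.overcommit_memory setting discussed in those links (0 is the default heuristic, 1 always overcommits, 2 disables overcommit).
Sketch (C):
/* Minimal sketch. With vm.overcommit_memory = 2 (no overcommit), a
 * hopeless allocation fails up front and the program can handle it;
 * with overcommit allowed, malloc may "succeed" and the process only
 * finds out when the pages are touched and the OOM killer steps in. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t huge = (size_t)1 << 42;      /* 4 TiB, intentionally absurd */

    char *p = malloc(huge);
    if (p == NULL) {
        /* The failure we can actually handle gracefully. */
        fprintf(stderr, "malloc(%zu) failed: falling back\n", huge);
        return 1;
    }

    /* With overcommit, the process may only get killed here, when pages are touched. */
    memset(p, 0, huge);
    free(p);
    return 0;
}
With overcommit disabled the program can degrade gracefully; with overcommit enabled, the failure may only show up later as an OOM kill, which is the non-determinism people are complaining about above.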
Does Google give code back (Score:5, Insightful)
Does Google give any code and patches back to the Linux kernel maintainers? Since they probably only use it internally and never distribute anything, they are not required to by the GPL, but it would still be the right thing to do.
Re:Does Google give code back (Score:5, Informative)
Yes, they do. Since they use older kernels and have... unique... needs, they aren't a huge contributor like Red Hat, but they do a lot.
During 2.6.31, they were responsible for 6% [lwn.net] of the changes to the kernel.
Re:Does Google give code back (Score:5, Interesting)
Andrew Morton, Google employee and maintainer of the -mm tree, contributed the vast majority of the changes filed under "Google" (and most of those changes aren't Google-specific - Andrew has been doing this since before he was employed there). If you subtract Andrew, Google is responsible for a tiny part of kernel development last I heard, unfortunately.
Re:Does Google give code back (Score:5, Informative)
Andrew has been doing a large amount of kernel work for some time now, before his employment with Google. Note that the 6% figure is under non-author signoffs - people that patches went through, instead of people who actually authored them. Heck, even I submitted a patch that went through Andrew once (and I've submitted like 5 patches to the kernel). Andrew does a lot of gatekeeping for the kernel, but he doesn't write that much code, and he certainly doesn't appear to be committing code written by Google's kernel team under his name as a committer.
Google isn't even on the list of actual code-writing employers, which means they're under 0.9%. I watched a Google Tech Talk about the kernel once (I forget the exact name) where it was mentioned that Google was (minus Andrew) somewhere in the 40th place or so of companies who contribute changes to Linux.
Re: (Score:2)
http://en.wikipedia.org/wiki/Andrew_Morton_(computer_programmer) [wikipedia.org]
Sending him a "thank you" e-mail would be one way to pat him on the back.
Re: (Score:2)
A lot of companies will also use a single employee for all of their commits too. I know the company I used to work for made one man look like a code factory to a certain open source project, but, in fact, it was a team of 20 or so devs behind him doing the real work.
You clearly know nothing about Linux Kernel development if you think Morton is a face for a team of hidden coders.
Re: (Score:2, Insightful)
most of those changes aren't Google-specific
Why would they submit "Google-specific" patches?
It would make sense for them to only submit those patches that they believed to be of general utility. Other stuff would likely not be accepted.
Re:Does Google give code back (Score:4, Informative)
By that I meant "developed for Google, useful to other people".
We can divide Andrew's potential kernel work into four rough categories: (1) Google-specific changes, (2) generally useful changes developed as part of his job at Google, (3) code he would have authored anyway, regardless of employer, and (4) maintainer work, i.e. reviewing and signing off on other people's patches.
Points 1 and 2 can be considered a result of Andrew's employment at Google. Points 3 and 4 would happen even if he weren't employed at Google. From my understanding, the vast majority of Andrew's work is point 4 (that's why he's listed under non-author signoffs at 6%, along with Google). Both Andrew's and Google's commit-author contributions are below 0.9%.
So what we can derive from the data in the article, assuming it's accurate, is that Google doesn't contribute much to the kernel. Having Andrew on board gives them some presence and credibility in kernel-land, but they don't actually author much public kernel code. Hiring someone to keep doing what they were already doing doesn't make you a kernel contributor.
Re: (Score:2)
Don't tell him or Google that... I'm sure he prefers to eat in-between maintaining Linux.
Re:Does Google give code back (Score:5, Informative)
Google is responsible for a tiny part of kernel development last I heard, unfortunately.
I don't know that much about google's private modifications, but the question of "what to give back" does not always have a clear default answer. I've modified lots of OSS in the past and not given it back, simply because my best guess was that I am the only person who will ever want feature x. There's no point in cluttering up mailing lists or documentation with something extremely esoteric. It's not because I'm lazy or selfish or greedy -- sometimes the right answer is to just keep things to yourself. (Of course, there are times when I've modified something hackishly, and had been too lazy or embarrassed to send it back upstream :)
Perhaps google answers this question in a different way than others would, but that doesn't necessarily conflict with "the spirit of OSS", whatever that might be.
Real example... (Score:5, Interesting)
Back in the 90's, we had a customized patch to Apache to make it forward tickets within our intranet as supplied by our (also customized) Kerberos libraries for our (also customized) build of Lynx. It all had to do with a very robust system for managing customer contacts that ran with virtually no maintenance from 1999 to 2007--and I was the only person who understood it because I wrote it as the SA--when it was scrapped for a "modern" and "supportable" solution that (of course) requires a dozen full-time developers and crashes all the time.
Not really bitching too much, because that platform was a product of the go-go 90's, and IT doctrine has changed for the better. No way should a product be out there with all your customer information that only one person understands. But it was a sweet solution that did its job and did its job well for a LONG time. Better living through the UNIX way of doing things!
But, anyway, I never bothered to contribute any of the patches from that back to the Apache tree (or the other trees) because they really only made sense in that particular context and as a group. If you weren't doing EXACTLY what we were doing, there was no point in the patches, and NOBODY was doing exactly what we were doing.
Well... (Score:2)
In fairness, the new system has capabilities that the old system did not. And I also think that there is a certain amount of overhead that necessarily accompanies a serious development effort, especially one that involves more than one person. If I just want to throw something up that will work, I can do it, by myself, without documentation etc., in a matter of weeks. Require documentation, controlled processes, and make me work with a team, and it will take months. The "mythical man month" is an ever-present reality.
Re:Does Google give code back (Score:4, Insightful)
If you subtract search engines, Google is responsible for a tiny portion of the internet. Andrew gets benefits from Google, so I suppose they do get some credit for the quantity of his work, as he needs to eat and pay rent so that he can code.
wtf? (Score:2, Funny)
Oh sorry...title had me thinking this was penguin porn
Reminds me of Android (Score:2, Insightful)
Somehow I'm reminded of the whole Android thing. Google really seems to have the urge to only do their own thing. Same thing with Android, where they have thrown out the whole "Linux" userspace to reinvent the wheel (only not as well; see Harald Welte's blog for a rant about it). Here it seems to be the same: they just do their own thing without merging back, disregarding experiences others might have had.
On a side note, their problems with the Completely Fair Scheduler should be a good argument for pluggable schedulers.
Re: (Score:2)
If what you say is true, and if what is said about Free Software is true, then there should therefore exist an opportunity for someone to come along and eat google's lunch by taking advantage of the Bazaar development model. I suggest you get crack-a-lackin'.
Solaris (Score:2)
It's amazing how many of these problems, especially with regard to multi-threading issues and multiple cores, had already been solved and implemented in Sun Solaris. In 1994. Fifteen years ago.
Re: (Score:3, Informative)
Pick your poison.
I never used tar (Score:2)
One had cpio and dumpfs which worked fine as far as I can tell.
When you bought Sun back then the last thing you were worrying about was if tar worked or not ....
Google is not giving back a shit (Score:2)
Google uses open source extensively, but it is not giving back any significant technology to the open source world.
No efficient search technology.
No decent OCR software (OCRopus + Tesseract are still years behind what you get for free with any multifunction HP printer in the Windows world).
No GIS technology.
No JSP cooperation.
Minimal kernel patches, etc., etc.
Google could be a major open/free source contributor; they have the money and the skills, but they have no will to do it. In fact, Google is behaving more like a consumer than a contributor.
Re: (Score:2)
According to http://code.google.com/opensource/ [google.com], Google has released 1M lines of source code across 100 projects. Are you disappointed in the volume of contributions, or because they aren't releasing software that you're interested in? Sure, Google could open source their entire search product, but that's kind of a critical part of their revenue stream, yeah?
They don't have to. (Score:2)
You should really read carefully the licenses of open source software before the foam in your mouth asphyxiates you.
Re:Open source is the coat tails that Google rides (Score:4, Insightful)
you missed the point of open source then
Re: (Score:1, Insightful)
For Free Software, 'take' is fine. 'Provide but restrict' is not.
Re: (Score:2)
Hmm, you realize that Android alone is over 10 million lines of code, right? That's a pretty big open source contribution right there. But then there's also over a million lines of code across 100+ smaller projects too. So I am not sure what your definition of "table scraps" is, but it's significantly more lines of code than most companies produce.
I see millions of lines of code from the Apache Foundation's various Java projects in Android.
So about 1/10th Sun's contribution (Score:2)
Re: (Score:2)
Schwartz ruined Sun.
Re:So about 1/10th Sun's contribution (Score:5, Funny)
That's a drop in the bucket compared to what Sun has contributed to open source. Of course, slashdot appears to be perversely against Sun for some reason I cannot fathom.
Names are very important. The name Sun reminds us of that place on the other side of the door where, if we go, our skin gets red and burns. Google reminds us of that friendly homepage that would load in under 5 seconds on dial-up.
Re: (Score:2)
I'm assuming this is to give Verizon exclusivity, with their "Droid" phone being the only one running 2.0. I don't think they anticipated projects like CyanogenMod taking off quite like they have.
Re: (Score:2)
I'm not a big fan of Google, but god damn man. These guys are a huge player no matter what they do.
Re:Open source is the coat tails that Google rides (Score:2)
Is "Amazingly short sighted" your sig, that is a self referential thing you need to tack onto everything you write? Seems very apt.
Are you nuts (Score:3, Insightful)
I'm not a huge Goog fan; I never take their cookies, so I don't use anything but search... but JUST search is way more "give back" than table scraps. If they announced tomorrow that their search would now cost X dollars a year, as long as it was somewhat reasonable, like an extra 5 bucks a month on top of my ISP bill, I'd pay for those table scraps. Google search has done more than anything else to make the web actually *useful* since the invention of the hyperlink.
Sure, there are other search engines, but if you actually use them, you'll see the difference.
Re:Open source is the coat tails that Google rides (Score:4, Insightful)
They take and take from open source and throw back a couple of table scraps and you people all kiss their ass for it.
300K lines of code? Yep, table scraps.
For people who wonder why I continue to want to see the end of the FSF, the above attitude is the reason why: Stallman and his organisation are responsible for it.
Aside from being ugly and spiritually bankrupt, reciprocity paranoia is based on completely erroneous reasoning as well. The same people who talk about how music piracy isn't harming anyone, because it doesn't physically take away from a finite supply of copies, are also the ones who express the above paranoia about people "taking" from FOSS, as if that were somehow a physically finite resource when music isn't.
Get rid of your fear.
Re: (Score:2)
I think you need to distinguish between true FOSS zealots and leeches who just want stuff for free. Hint: the grandparent is the latter.
Re: (Score:2)
Who said they were throwing those lines back? They don't have to, and a short look at TFA didn't make it look as if they did. (Not that I mind -- I myself maintain such kernel code at work.)
Re: (Score:2)
Isn't it a bit odd for Slashdot to discuss news this old...
You must be new here