10 Years of Git: An Interview With Linus Torvalds

LibbyMC writes Git will celebrate its 10-year anniversary tomorrow. To celebrate this milestone, Linus shares the behind-the-scenes story of Git and tells us what he thinks of the project and its impact on software development. From the article: "Ten years ago this week, the Linux kernel community faced a daunting challenge: They could no longer use their revision control system BitKeeper and no other Software Configuration Management (SCMs) met their needs for a distributed system. Linus Torvalds, the creator of Linux, took the challenge into his own hands and disappeared over the weekend to emerge the following week with Git. Today Git is used for thousands of projects and has ushered in a new level of social coding among programmers."

  • by Anonymous Coward on Monday April 06, 2015 @12:34PM (#49415723)

    I hear from so many people who love git, and also from so many people who see it as needlessly complicated to the point of getting in the way of getting things done. If that latter view didn't have any truth to it, this page wouldn't exist:

    http://git-man-page-generator.lokaltog.net/

    So, which is it? A useful tool, or simply a way for the brightest technology people to feel smarter than everyone?

    • by jones_supa ( 887896 ) on Monday April 06, 2015 @12:43PM (#49415779)
      It isn't complicated. Check out Git - The Simple Guide [github.io].
      • by Anrego ( 830717 ) * on Monday April 06, 2015 @01:26PM (#49416127)

        As someone mostly in the "I dun get it" crowd, I'll say the problem for me is that I feel like while I can use it, I don't have a great deal of understanding as to what it's actually doing outside of the basics. I feel like I'm following a bunch of recipes that I know work.

        With svn (which admittedly I've used for many years and on sizable projects vs git which I've used for months and on small stuff), I feel like I have a really good grasp of the whole thing. Sure there are some subtle bits I don't know because I've never needed, but I know the important bits, and I feel like from that I can solve just about any problem I run into by understanding what svn is trying to do and why it's not working.

        I get that at least some of this is just inexperience, but I think even with experience, git seems far more complex and nuanced than svn, which has a relatively consistent way of working and seems to have a much smaller set of features. I feel like I got comfortable with svn way faster, and at that point I was only mildly familiar with version control in general.

        I know I'm gonna get flamed for this, but just wanted to provide some insight into the mind of someone who hasn't jumped on the git bandwagon yet.

        • by pz ( 113803 )

          The team I was on was using cvs for a long time (quite successfully) and then switched to git. I could never use git without having a page of cheat-sheet notes in front of me. There were some good things about it, some really good things (the code merger was magic), but you had to stay on top of the state of your code in a way that CVS never required.

        • by Traf-O-Data-Hater ( 858971 ) on Monday April 06, 2015 @06:58PM (#49418923)
          Like others, I both love git and hate it. The bit I dislike the most is the inconsistency in commands and their opposites.
          For instance, it is easy to add files to staging:
          git add .
          Oops! A bunch of other things got added, because I'm a newbie and haven't yet tuned my .gitconfig. Fine, I'm still learning.
          OK, have a guess at undoing it:
          git unadd
          wtf?...nope..
          Frustrating searching to find that git reset is really unadd. Yeah, I could guess that! not.
          And that's the crux of it. Sure you can add git aliases, but an xxx/unxxx pattern could have been built in right from the (ahem) git-go for any sensible command. Git commit/uncommit... merge/unmerge... etc etc.

          And the great thing about git: Linus realised disk space was becoming too cheap to meter. Why bother crunching a delta on something when it was easier to just store compressed blobs? Thus the advantage of simple, fast and cheap (pick any three) branching.
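
          The "just store compressed blobs" idea can be sketched in a few lines. This is a toy in-memory model, not git's actual code: `store_blob` and `load_blob` are hypothetical names, and the dict stands in for the object directory. It does follow the way git names and stores loose objects (SHA-1 over a `blob <size>\0` header plus content, zlib-compressed); git can later delta-compress objects when it packs them, which this sketch ignores.

```python
import hashlib
import zlib


def store_blob(store: dict, content: bytes) -> str:
    """Store the whole content, compressed, under an id derived from the content."""
    # The id is the SHA-1 of a small header plus the raw bytes; no delta is computed.
    payload = b"blob " + str(len(content)).encode() + b"\0" + content
    object_id = hashlib.sha1(payload).hexdigest()
    store[object_id] = zlib.compress(payload)
    return object_id


def load_blob(store: dict, object_id: str) -> bytes:
    """Decompress an object and strip its header to recover the original bytes."""
    payload = zlib.decompress(store[object_id])
    _header, _sep, content = payload.partition(b"\0")
    return content


store = {}
v1 = store_blob(store, b"hello\n")
v2 = store_blob(store, b"hello, world\n")
same = store_blob(store, b"hello\n")  # identical content maps to the identical id
```

          Storing the same content twice costs nothing extra, because both writes land on the same key.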

      • by Yunzil ( 181064 )

        It isn't complicated.

        Yes it is. Ten years is about how long it takes to figure out.

      • Here is an even simpler guide: A Guide to GIT using spatial analogies [tartley.com].

    • by kthreadd ( 1558445 ) on Monday April 06, 2015 @12:46PM (#49415809)
      Very few people actually know their version control software. Most people know the basic commands, and that's the case for pretty much all of them. Git is not much different in that regard.
      • by Anrego ( 830717 ) *

        I made a post about this above, but yeah, that describes my current relationship with git, and is one of the reasons I don't enjoy using it.

        I feel like I truly know svn, I understand what it does and am very comfortable with how it works. Part of that is just having used it for a long time, but I do feel like git is much harder to wrap your head around.

        With git I feel like I'm just following a bunch of recipes that I know work (or seem to work), and that's really not a good way to go about anything. Every t

      • by Kjella ( 173770 )

        Very few people actually know their version control software. Most people know the basic commands, and that's the case for pretty much all of them. Git is not much different in that regard.

        And I suspect some don't understand version control at all. I've worked with the following (non-git) setup:
        One production branch, at any time one active development branch. When we merge to production, we branch off a new development branch. Could it get simpler? I don't see how. Yet people manage to start developing on the prod branch (that they have access to for hotfixes), fail to understand that bugfixes to prod must go into dev or be overwritten at the next merge, branch the dev branch instead of the p

        • I prefer the model where the "master" branch is the continuous trunk of development. When you want to release, you branch off for that release.

          Functionally equivalent, but has the nice property that most stuff goes into "master" and only bugfixes/backports or special-case stuff goes onto the release branch.

      • by jrumney ( 197329 )
        Except that for people who are not working alone, there are more commands that need to be considered basic in DVCS than in traditional VCS.
    • by jandar ( 304267 )

      You have to understand the data-structure, how files, directories and commits are all content-addressable objects. The linkage of the commits by means of their id's must be understood.

      To understand merging you should have used diff and patch a few times.

      This is all.

      With these few concepts, anybody who can write a program should be able to understand git. Inventing new use cases or workflows is another thing, but comprehending a given usage of git should be straightforward.
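
      For readers who prefer code to prose, here is a rough sketch of the data structure described above. It is a simplified model under stated assumptions: the tree and commit bodies use a made-up text layout rather than git's real formats, and `put`, `commit`, and `history` are hypothetical helper names. The point it illustrates is only that every object is addressed by a hash of its content and that commits link to their parents by id.

```python
import hashlib

objects = {}  # object id -> (kind, body); stands in for the object database


def put(kind: str, body: bytes) -> str:
    """Store an object under the SHA-1 of its kind, size, and content."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    oid = hashlib.sha1(header + body).hexdigest()
    objects[oid] = (kind, body)
    return oid


def commit(tree_id: str, parent_id, message: str) -> str:
    """Create a commit that names its tree and (optionally) its parent by id."""
    lines = [f"tree {tree_id}"]
    if parent_id:
        lines.append(f"parent {parent_id}")
    lines += ["", message]
    return put("commit", "\n".join(lines).encode())


def history(commit_id):
    """Follow parent ids back to the root, collecting commit messages."""
    messages = []
    while commit_id:
        _kind, body = objects[commit_id]
        head, _sep, message = body.decode().partition("\n\n")
        messages.append(message)
        commit_id = None
        for line in head.split("\n"):
            if line.startswith("parent "):
                commit_id = line[len("parent "):]
    return messages


blob1 = put("blob", b"print('hi')\n")
tree1 = put("tree", f"{blob1} main.py".encode())  # simplified tree listing
first = commit(tree1, None, "initial commit")

blob2 = put("blob", b"print('hello')\n")
tree2 = put("tree", f"{blob2} main.py".encode())
second = commit(tree2, first, "change greeting")

log = history(second)
```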

      • by Guy Harris ( 3803 ) <guy@alum.mit.edu> on Monday April 06, 2015 @01:57PM (#49416451)

        You have to understand the data-structure, how files, directories and commits are all content-addressable objects. The linkage of the commits by means of their id's must be understood.

        Git: the best file system anybody ever confused with a version control system. :-)

      • by Yunzil ( 181064 ) on Monday April 06, 2015 @03:01PM (#49417065) Homepage

        You have to understand the data-structure, how files, directories and commits are all content-addressable objects. The linkage of the commits by means of their id's must be understood.

        See, here's the thing. Why should I have to understand internal data structures in order to use a piece of software? Imagine if you made a word processor and people found it difficult to understand, and you said, "It's easy once you understand that the words in the text are stored in a hash map along with a structure with various flags that encode things like whether it's italic or not." People would look at you funny and go back to using Word.

        • Why should I have to understand internal data structures in order to use a piece of software?

          Because you're not used to thinking about source code the way Git thinks about source code. Git is very much like a database from a usability standpoint, and you will probably get into bad trouble trying to use either without understanding both the problem that they are trying to solve and the implementation. If you do read about these things, you will understand that git's internals make sense, the decisions it makes are logical, and the user interface is (mostly) transparent and simple. Revisions are hard

          • Because you're not used to thinking about source code the way Git thinks about source code. Git is very much like a database from a usability standpoint

            That's probably a significant part of the problem. :-)

        • by mschaef ( 31494 )

          After a point, users must develop a deeper understanding of how a given software package works in order to use it effectively.

          You mentioned word processing as an example... carrying that along, Microsoft Word faced a huge amount of resistance from WordPerfect users who had internalized WordPerfect's 'stream-of-markup' model for representing formatted text. In both cases, you could highlight text and make it bold, but WordPerfect's model was much clearer to more advanced users. Microsoft has tried to replica

    • by bobbied ( 2522392 ) on Monday April 06, 2015 @01:40PM (#49416277)

      I hear from so many people who love git, and also from so many people who see it as needlessly complicated to the point of getting in the way of getting things done. If that latter view didn't have any truth to it, this page wouldn't exist:

      http://git-man-page-generator.lokaltog.net/

      So, which is it? A useful tool, or simply a way for the brightest technology people to feel smarter than everyone?

      Personally, I'm in both camps. I both hate and love Git. I hate that I have to explain and provide "scripts" to developers that explain how the project uses Git and I love that I can manage my project in multiple ways, depending on my needs.

      Git's problems stem from its Unix-like, basic command-line user interface. It's not a surprise that they decided to go with this kind of interface; they were basically Linux developers, after all. In the tradition of good CLIs, Git is full-featured, meaning it does a LOT of things, or really it supports doing things in a lot of different ways. I love the flexibility. But unless you understand what Git is doing for you under the covers, you may not know which of the confusing commands in Git you need to use. If you don't understand how your project is using Git, it may be difficult for the newbie to come up with the necessary commands to get things done.

      Personally, I end up writing scripts for my developers. I force them into following a set procedure to "check out" the source, do their local development and get their changes through the review cycles and into the main repository again. Developers don't like scripts like this, and because it's a script that describes how they use git, they think they don't like git. What they really hate is being told exactly what to do...

      I love git because it allows me to control my project's source. It keeps local backups and my history on MY machine, but doesn't expose others to my mindless rambling commits unless I decide to push them. I love the flexibility to manage my source how I want to locally....

      So, IMHO, git is great and a curse at the same time. Much like AWK and SED were a huge boost to the Unix CLI (if you understand them), git is wonderfully complex and thus frustrating to learn. Git doesn't force you into a configuration management model, but lets you roll your own process. Being flexible is great, but git doesn't stop you from shooting yourself in the foot, so be careful and know how you want to manage your repo, figure out how to make git do that, document how it's done, and test your process.

      Remember, it's the PROCESS you need to have straight in your head. Just googling Git is going to cause you trouble because how THEY use git is unlikely to match how YOU want to use it. You got to know your tools and git is no exception but you REALLY need to know what you are trying to do with git.

      • by gmack ( 197796 )

        Or you can just use GitHub's Windows client [github.com], which, the last time I used it, required me to use the command line to init non-GitHub repos but then didn't require the user to use the command line for anything. Might even be better now; it's been a couple of years since I had to support software developers running on Windows.

        • Still, if you don't know what's happening locally and remotely, you can get into big trouble even with a GUI.
    • For those who are used to any of the older-style centralized revision control systems, it's a huge gear shift to begin using Git. And yes, it seems needlessly complicated, and seems to get in the way of getting things done. However, once you get over the hump and begin to "git" it [tee-hee], it seems very well designed and is very efficient to use.

      To git over the hump, you really need to study it somewhere such as Pro Git [git-scm.com]. After that, you need some practice, as well as some help from a buddy or two. Soon, i

      • After reading TFA, I found that Linus Torvalds made some of the same points, where "a traditional version control system" could be substituted wherever he uses "CVS":

        The other big reason people thought git was hard is that git is very different. There are people who used things like CVS for a decade or two, and git is not CVS. Not even close. The concepts are different. The commands are different. Git never even really tried to look like CVS, quite the reverse. And if you've used a CVS-like system for a long time, that makes git appear complicated and needlessly different. People were put off by the odd revision numbers. Why is a git revision not "1.3.1" with nice incrementing numbers like it was in CVS? Why is it that odd scary 40-character HEX number?

        But git wasn't "needlessly different." The differences are required. It's just that it made some people really think it was more complicated than it is, because they came from a very different background.

      • The problem with what you're saying is that Mercurial exists, and it can do everything that git can do with an easy to use interface.
        • I used Mercurial for a short time a few years ago. Although I focused above on Git, the same points apply to Mercurial since it's also a distributed version control system. Both use similar concepts, but both are very different from what people who haven't used a distributed version control system are used to. So, it's a huge gear shift to really grok the new concepts of either one. Likewise, it's probably pretty easy to switch between Git and Mercurial since the concepts are similar.

          So, the problem wit

    • by rwven ( 663186 )

      GIT, imho, is no more complicated or confusing than any other scm system. I've used svn, mercurial, and git quite a bit, and found git and hg to be the easiest to work with, with svn coming in a distant 3rd...

      • I used SVN casually for a few years, but I can't say I ever fully grasped its approach to branching and tagging, which are done mostly by using a set of specially named directories rather than as a fundamental feature of SVN itself. In contrast, Git's approach to those things makes perfect sense to me.

        • Wow, I can't believe I'm defending svn....

          But as far as I understand (and as far as I've been *using* it for several years now), there are not "specially named directories". There is a user-created convention of directory names and locations..

          But underneath, there's NO difference between a tag directory and a branch directory.. (Which _annoys_ me, since I wish I could "lock" tag directories so I don't accidentally check into them, which I think I did once.. then undid of course..)

          I guess this is a case of

    • I think it is a bit of all of it.
      Git isn't for everyone or every project... Some people are better off with Subversion or something else, just because their project may have a structure that makes their tool easier to handle.

      However, I expect there is a lot of room to make it easier to use that the end users do not want to embrace, primarily because they have gotten used to how it is now and don't want to change their methods. But also as a way so they feel good that they are somehow special

      • SVN's strengths are:

        - Centralized repository model, which is simpler and for less technical users makes it less likely that they will screw up. Once something is committed to the SVN server, you can back it up and not worry that you have portions of your data not covered by backups. Plus you get monotonically increasing version numbers, which non-techies find easier to digest.

        - Excellent at handling binary files. Like MSOffice files, or LibreOffice, or images, or other binary assets. We have a few r
    • What's annoyed me in the past is when I've gone to download some software and rather than giving you a simple link to an archive, you get instructions on how to do it all with Git.

    • by gweihir ( 88907 )

      Stupid people will always perceive most things as "difficult". That does not mean they are.

  • by Johnny Loves Linux ( 1147635 ) on Monday April 06, 2015 @12:48PM (#49415823)

    As a software developer who's been a git user for 7 years, I don't know how I could have written any serious code without git. Branching and merging is trivial. Cloning is trivial. The staging area makes choosing what to commit trivial. git rebase makes life much easier when it comes to reordering/editing/removing commits out of the history. git blame --- such a nice tool. Binary searching to find bugs is trivial. Every git tool is documented to within an inch of its life.
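
    The "binary searching to find bugs" mentioned above is what `git bisect` automates. The idea can be sketched as follows, assuming a purely linear history ordered oldest to newest, a known-good first commit, and a known-bad last commit. The names `first_bad_commit` and `is_bad` are hypothetical, and real histories with merges need more care than this.

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit, given commits[0] is good and commits[-1] is bad."""
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history of 100 commits where the bug first appears in "c73".
commits = [f"c{i}" for i in range(100)]
tested = []


def is_bad(commit):
    """Stand-in for building and testing a checkout of the given commit."""
    tested.append(commit)
    return int(commit[1:]) >= 73


culprit = first_bad_commit(commits, is_bad)
```

    Each test halves the remaining range, so a hundred commits need only a handful of builds rather than one per commit.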

    And the icing on the cake? Code cowboy [wikipedia.org] hates git. Like sunlight or garlic to a vampire, Code cowboy abhors git. He can't hold the source code hostage to his every brain damaged whim. He can't hose anybody with a distributed version control system. It's no wonder why Code Cowboy is always yapping away at git -- he can't show off his genius if his code can be ignored.

    • As an end user most of the time (I rarely diddle the code, I'm not that good at it, though I do occasionally fix things), I am kind of grumpy about git. Nobody tells you to use --depth 1 or however you do that; luckily it's been a while since I've downloaded any large sources for which I wasn't able to get a simple tarball. The inability to resume a fetch is pretty horrible, though. Unless that's been fixed recently?

      • I wouldn't bother with --depth options, at that point you're better off downloading a tarball/zip of the branch. Downloading a nightly snapshot tarball is even better, because you can resume that.

        • I wouldn't bother with --depth options, at that point you're better off downloading a tarball/zip of the branch.

          I agree, but with the caveat that I occasionally run into someone using git on their own server and without providing a tarball, in which case woe is me.

    • As a software developer who's been a git user for 7 years, I don't know how I could have written any serious code without git.

      Very much this. Git has a steep initial learning curve. The concept of a branch being just a pointer is foreign at first. New users try to fit the concept of a branch into their pre-existing notion that it's the sum of everything that has changed since the last merge.
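
      A toy model may help with the "branch is just a pointer" idea. The sketch below is an illustration under simplifying assumptions (single-parent commits, made-up ids such as "a1", and hypothetical helpers `create_branch`, `commit`, and `log`), not a description of git's storage:

```python
# Each commit records only its parent; ids here are made-up placeholders.
parents = {"a1": None, "b2": "a1", "c3": "b2"}

# A branch is nothing more than a name pointing at one commit id.
branches = {"master": "c3"}


def create_branch(name, start):
    """Creating a branch copies a pointer; no commits or files are copied."""
    branches[name] = branches[start]


def commit(branch, new_id):
    """Committing adds one commit and moves only that branch's pointer."""
    parents[new_id] = branches[branch]
    branches[branch] = new_id


def log(branch):
    """The branch's history is whatever is reachable from its pointer."""
    out = []
    cid = branches[branch]
    while cid is not None:
        out.append(cid)
        cid = parents[cid]
    return out


create_branch("feature", "master")
commit("feature", "d4")
feature_log = log("feature")
master_log = log("master")
```

      In this model "feature" and "master" share the commits a1 through c3, and the only thing that differs between them is which commit each name points at.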

      But once you really understand how Git works, you're ruined for every other version control system. When I'm forced to use TFS for a project, I use Git locally and Git-TFS to keep them in sync. Now I commit often, all day long, tracking all my

      • by Shados ( 741919 )

        But once you really understand how Git works, you're ruined for every other version control system. When I'm forced to use TFS for a project, I use Git locally and Git-TFS to keep them in sync. Now I commit often, all day long, tracking all my changes and (relatively) easy rolling them back or reordering them if necessary.

        Yup. I can deal with any language (ok, aside PHP...), any operating system (yeah, I don't mind developing on Windows), any framework, any technology...but source control has to be on git o

    • I write serious code all the time without git. In other tools it was easier to avoid branching or merging by keeping the teams small and only tracking a "dev" branch and several release branches that are rarely updated. Apply the same patch to each release branch to avoid having to do any complicated merges between branches. Yes, it's all kind of silly and painful, but it doesn't take a long time if you avoid the weaknesses in other tools. Plenty of time left over to write "serious code".

      The easiest of course is t

      • You can actually share an RCS repo just fine since it does file locking. It's actually quite common even nowadays for configuration files in cases where a larger configuration management system is overkill.
        • Well you can share it on the same system or over a network filesystem that is designed to do file locking in a unix friendly way (like NFS).

          And I totally agree with other posters that RCS for /etc is a great way to go. I use it myself. (but I keep my DNS zone files in git)

    • by pspahn ( 1175617 )

      git pull --rebase origin master

      There might possibly be no other command in the history of software development that has saved more man-hours than this gem.

      • by tlhIngan ( 30335 )

        git pull --rebase origin master

        There might possibly be no other command in the history of software development that has saved more man-hours than this gem.

        Except when you forget the --rebase and now have hours of work fixing your tree.

        Especially if you provide your work as a bunch of patches against an official (but read-only) repo (because said repo is like AOSP where it's easily 30+GB).

        git's ability to generate working patches that apply cleanly breaks if you branched somewhere along the line, then merged

        • by Shados ( 741919 )

          Except when you forget the --rebase and now have hours of work fixing your tree.

          Hit any merge conflict and you'll quickly notice what happened, and then you can simply abort. If somehow there's no conflict, you can just look at the reflog and reset to the pre-merge commit.

          Whoop-y-doo.

      • And why then does it say in all kinds of manuals that you should not use rebase unless you know what you are doing!

        After using Git for about a year now, my conclusion is that subversion is good enough for your team, and heaven compared to Git. With subversion you can just perform an update without having to wonder if you might have something that still needs to be committed. I am also not convinced that creating branches and merging all the time is really a good way of working for the team we are working in.

        • by Shados ( 741919 )

          And why then does it say in all kinds of manuals that you should not use rebase unless you know what you are doing!

          Because you shouldn't do anything without knowing what you're doing. That command is just unique-ish to git, so it requires a bit of special attention (you need to understand how the commit hashes work).

          Once you do though, merge conflicts are 100x easier to handle, commit history makes more sense, etc. There's cases where you don't want to use it (when you want to be able to trace branching hist

        • The rule against using rebase really only applies if you are publishing your git repo (and specifically the branch in question) to other people.

          The reason for the rule is that if you rebase your changes on top of the latest upstream, anyone pulling in your branch is then forced to rebase as well since you've essentially rewritten history. (Doing a rebase changes the hashes on all your local commits.) If you instead merge upstream work onto your local branch then your history is preserved and everyone downs
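
          The claim that a rebase changes the hashes of all your local commits follows from each commit id covering its parent's id. A minimal sketch, assuming a simplified commit made of only a parent id and a change description (`commit_id` and `replay` are hypothetical names, and real commits also hash the tree, author, and dates):

```python
import hashlib


def commit_id(parent, change):
    """Derive a commit id from the parent id and the change it carries."""
    return hashlib.sha1(f"parent {parent}\n{change}".encode()).hexdigest()


def replay(base, changes):
    """Apply the same changes on top of a base, returning the new commit ids."""
    ids = []
    parent = base
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


old_upstream = commit_id(None, "upstream work 1")
new_upstream = commit_id(old_upstream, "upstream work 2")

local_changes = ["fix typo", "add feature"]
before = replay(old_upstream, local_changes)  # local work as originally committed
after = replay(new_upstream, local_changes)   # same changes, rebased onto new upstream
```

          The changes are identical, yet every id in `after` differs from its counterpart in `before`, so anyone who already pulled the old commits now holds history that no longer matches.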

      • I *never* do development on an upstream branch. So instead of the above I would always checkout the local "master" branch, do a "git pull", then checkout my development branch and rebase my work on top of the latest local "master" branch.

        The nice thing about this is that the local "master" branch is always identical to some version of the upstream "master" branch, I never need to worry about it getting polluted with my development work.

    • There is a difference between Serious Code and Distributed code.
      You could write serious code without any sort of Source Control... However it isn't recommended.
      But sometimes having a small team working on a program is much better than having hundreds of people. So for the smaller teams GIT is too cumbersome.

      • All small teams I worked with the last 3 years love git.
        What is cumbersome about git? I don't get it.
        The daily work cycle is exactly the same as in, e.g., CVS.
        Everyone simply does a commit with push and optional rebase; unless you have to merge, this is a single command, usually triggered via the GUI.
        At some point your team is done; someone merges the current branch to "the trunk" and makes a new branch for the next sprint or story/feature. You do something similar in CVS as well, but much, much later when you have a

  • by Digana ( 1018720 ) on Monday April 06, 2015 @12:54PM (#49415871)

    Let's not forget the other contender for replacing BitKeeper: Mercurial [iu.edu]. We will also be celebrating its 10th anniversary next week during the PyCon sprints [selenic.com].

  • by Anonymous Coward on Monday April 06, 2015 @01:16PM (#49416029)

    how bitkeeper fucked up and was swiftly relegated to irrelevance... you have to wonder how many of these [bitkeeper.com] are even still using bk......

    • by Dracos ( 107777 )

      Exactly, BitKeeper committed suicide by throwing a fit over their licensing for open source projects, the terms of which stipulated that copies of all commit messages must be sent to BitKeeper, and one of the kernel devs figured out (basically, IIRC) a way to circumvent that.

      At the time of the fiasco that caused git to be created, the top two OSS projects (by lines of code) using BK were the kernel and MySQL (the third was a PHP CMS that I was part of at the time). There used to be an OSS projects page with

    • by gmack ( 197796 )

      I'm thinking IBM doesn't use their license anymore. AFAIK IBM was only using BitKeeper for the Linux kernel. Larry made them pay for a license since IBM has its own competing SCM and the free version banned use by any company that has a competing SCM.

  • Udacity has a free class [udacity.com] on Git and GitHub. I recommend it. They spend a little too much time on writing a chart that diagrams the different parts of Git, but the class is well-structured and clear.

  • This is a security failing that is hard to overlook for projects with far-flung participants. Any time data is downloaded it could be subjected to MITM attack.

    Core git coders appear oblivious to the problem. Even so, how hard can it be to replace SHA1 with SHA256?

  • by Foresto ( 127767 ) on Monday April 06, 2015 @11:33PM (#49420167) Homepage

    Git is its own worst enemy

    Sigh... Git. Ten years later, and it's still making people suffer with its unforgivably awful user interface. Seriously. I like the command line, and git is my primary version control system, but git's UI is the single most user-hostile example of human-computer interaction that I have had the misfortune to encounter in years. Maybe decades.

    Git's command structure is a train wreck of inconsistencies, some of its most important terminology is worse than worthless, and its man pages and built-in help text are idiotically obtuse. I have been following its development closely enough to understand how it got this way. A lot of it has to do with placeholder terms that were never updated, synonyms that were never reconciled, features that were grafted onto existing commands and never properly organized, and its origin as a set of low-level components rather than a tool intended for humans. In other words, a pattern of evolution much like any other software, except for one thing: Even after years of being relatively stable, its maintainers still haven't addressed its glaring usability problems.

    These aren't just minor warts that only affect a few people, either. There are countless articles, blog posts, and forum threads expressing frustration with git and detailing specific improvements that could transform it from a usability nightmare to an elegant piece of work. Sadly, the maintainers either ignore them or respond with some half-witted reason to resist change. Frankly, I am embarrassed to see my fellow software developers failing so miserably to recognize the importance of usability, and failing to fix it.

    What is the cache? It's a place where you're expected to manually arrange your data before you commit it. Does it function like a person would expect a cache to function? No, but we call it that anyway. What is the index? It's the same thing. Does it function like a person would expect an index to function? No, but we call it that anyway. You're referring to the same thing in both cases? Yes, for the most part. Does it function like anything that might be familiar to anyone? Yes, it's essentially a staging area. Why don't you call it a staging area? We do, but only in the minority of cases. You mean you have three names for the same thing, and the most accurate name is the one that you use the least? Yes. Why? Because the meaningful name might be harder to translate into other languages. So you deliberately use a confusing variety of misleading names when writing in English, the single most widely used language in computer science, because one of your translators didn't want to describe a staging area in another language? Yes. Well, that's probably okay, because this thing is probably some obscure piece of git that most people don't have to use, right? No, it's actually one of git's most distinguishing features, and interacting with it is absolutely required in order to use git. I see.

    Newcomers shouldn't have to be encouraged to "take the time to learn git." It should be easy. A programmer familiar with version control systems should be able to pick up a new one in five minutes, and find the answer to most intermediate-to-advanced problems in maybe ten or fifteen. They should be able to walk away for a month or two, come back, and still remember how to use it. That doesn't generally happen with git. One has to invest quite a bit of time and patience to confidently use anything beyond its most basic operations without screwing something up, and stay in practice with it, or else end up having to learn most of it all over again.

    The ridiculous thing is that it doesn't have to be this way. Mercurial is real-world proof of that.

    I hate git for these reasons. It's a cantankerous bastard of a tool that will just as soon kneecap you as handle your data. I only use it because of github (which is brilliant, by the way.) If you want to see an example of how version control should be done, get to know mercurial. Its internal de

    • by Foresto ( 127767 )

      To anyone curious about Mercurial enough to try it, keep in mind that what git calls a "branch" is called a "bookmark" in mercurial, because "branch" has a more traditional meaning over there.
