Linux Kernel Developer Chris Mason's New Initiative: AI Prompts for Code Reviews (phoronix.com) 47

Posted by EditorDavid on Monday February 02, 2026 @05:34AM from the life-with-Linux dept.

Phoronix reports: Chris Mason, the longtime Linux kernel developer most known for being the creator of Btrfs, has been working on a Git repository with AI review prompts he has been working on for LLM-assisted code review of Linux kernel patches. This initiative has been happening for some weeks now while the latest work was posted today for comments... The Meta engineer has been investing a lot of effort into making this AI/LLM-assisted code review accurate and useful to upstream Linux kernel stakeholders. It's already shown positive results and with the current pace it looks like it could play a helpful part in Linux kernel code review moving forward.
"I'm hoping to get some feedback on changes I pushed today that break the review up into individual tasks..." Mason wrote on the Linux kernel mailing list. "Using tasks allows us to break up large diffs into smaller chunks, and review each chunk individually. This ends up using fewer tokens a lot of the time, because we're not sending context back and forth for the entire diff with every turn. It also catches more bugs all around."

Linux Kernel Developer Chris Mason's New Initiative: AI Prompts for Code Reviews

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 47 Comments Log In/Create an Account

Comments Filter:

- Re: (Score:3)
  
  by Mascot ( 120795 ) writes:
  
  I'm not sure the business numbers are all that important when it comes to code. We already have them trained on _a lot_ of code, and since they're more focused they can be smaller without being useless compared to the full size ones. If we can run them locally on a single GPU, it doesn't go away when the bubble pops and the big players stop throwing away money.
  As with any tool, they need to be used where they actually offer value. Which is definitely not to architect solutions, but to sanity check smaller c
  - Re: Don't be stupid, people (Score:3)
    
    by simlox ( 6576120 ) writes:
    
    I am on both impressed and disappointed with AI software development. Impressed when it can analyse logs and code and pin point the course of a bug for so I can save 30 minutes. Or when it can give me a huge template for code I can start with instead of spending hour coding it myself and/or googling libraries to use. Disappointed in code generated too often doesnt work in the, and I have the feeling I should have done it myself - fortunately something discovereable up front most of the time. But also cost:
    - Re: (Score:3)
      
      by bleedingobvious ( 6265230 ) writes:
      
      The hidden costs - especially on up-and-coming devs - is the fact that knowledge isn't being retained so a solved problem will end up being re-solved by LLM again. And again. And again.
      Worse yet, given how human creativity works, this means we won't see novel applications or solutions. Just layer after layer of mostly functional AI slop
      - Re: (Score:2)
        
        by HiThere ( 15173 ) writes:
        
        If you just use it to create an initial mock-up, I don't think that cost occurs.
      - Re: (Score:2)
        
        by Quasar1999 ( 520073 ) writes:
        
        The hidden costs - especially on up-and-coming devs - is the fact that knowledge isn't being retained so a solved problem will end up being re-solved by LLM again. And again. And again.
        Worse yet, given how human creativity works, this means we won't see novel applications or solutions. Just layer after layer of mostly functional AI slop
        Funny thing is, the more AI consumes all of our CPU and RAM for their data centers, the more necessary it is to build performant and optimized code, both for CPU cycles and for memory constraints, since consumer devices are going to have worse compute resources available going forward due to cost.
        
        The dinosaurs among the devs (of which I sadly count myself as one), know how to write code that squeezes every bit of useful work from every clock cycle, and how to use the least amount of memory necessary to a
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        So the more layer after layer the AI slop generates, the more job security actual skilled developers will have.
        That is certainly true. And since junior people will get reduced in numbers and very few learn actual skills anymore, actually skilled developers may all but die out. Will take a while, but we are looking at a potential catastrophe in the making.
      - Re: (Score:3)
        
        by gweihir ( 88907 ) writes:
        
        Indeed. And when it comes to code security or architecture, LLM type AI is a complete catastrophe. Now, who is supposed to do these when junior people get scarce and those left do not learn the basics anymore?
        I predict that even if LLMs stay around and available, LLM code will cause a delayed catastrophe.
  - Re: Don't be stupid, people (Score:3)
    
    by devslash0 ( 4203435 ) writes:
    
    Correction:
    They've been trained on a lot of UNCHECKED code.
    - Re: (Score:2)
      
      by Quasar1999 ( 520073 ) writes:
      
      Correction: They've been trained on a lot of UNCHECKED code.
      Garbage in, garbage out.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      Indeed. And they do not know never developments, including security bugs, docu updates, features made obsolete, new regular bugs, etc.
      A coding model becomes a problem when not updated for a year or two.
    - - Re: (Score:2)
        
        by devslash0 ( 4203435 ) writes:
        
        The 10-20% difference that you're quoting is actually huge. It's a difference between a successful cancer treatment and deadly poison. It's a difference between a building standing strong, or collapsing and killing thousands of people. Correctness matters. If you can't rely on results, the tool is useless in a professional setting where your reputation and life of others depends on it.
        As for the PhD-level problems - most of the time there's nothing to solve. AI just serves you a reheated pancacke of the sol
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    First, you need to retrain frequently for code as well unless you want to stay on low amateur level. Think security problems (of which tons of new ones are discovered all the time), new libraries, old stuff getting deprecated, etc. And second, you cannot run the large coding LLMs locally in a meaningful way. And you would need to be able to get them in the first place.
    So, yes, the catastrophic (3 years in) business numbers matter. In fact, they are critical.
    - Re: (Score:2)
      
      by Mascot ( 120795 ) writes:
      
      This is starting to look like bad faith argumentation to me. Remember that the premise here was using LLMs as assistants for code review, not code production. Decent coding assistant LLMs that can be run locally already exists, and have for years. They're not as good as full size models running in data centers, of course, but they're also not useless. As long as they are able to catch _some_ security issues, at an acceptable signal to noise ratio, they offer value in freeing up reviewer brain bandwidth to f
- Re: (Score:1)
  
  by Zero__Kelvin ( 151819 ) writes:
  
  I am not a big AI fan, but you are like the person in the 1970s who said home computers would be a fad because they don't have the power of a mainframe. Just as computing moved from 8 bit processors with a few kilobytes of RAM and no hard drive to pocket "supercomputers" (todays "smartphones" are far, far more powerful than supercomputers of the 1970s), AI is also going to experience an exponential growth in terms of power and capability. LLMs are the current state of the art, perhaps, but that is going t
  - Re: (Score:2)
    
    by Entrope ( 68843 ) writes:
    
    The question isn't "is AI ever useful", but rather "is it useful enough today for the specific use case?". That is what this guy is exploring. My gut feeling is that it isn't, but I don't have the experience to know for sure, and neither does anyone else.
    Well, most of us don't have the specific experience to know whether AI is useful enough for Chris Mason's specific use case, but we have enough from our own.
    I have found that AI is good enough for a first draft of code, or for providing comments on existing code -- but I want a human to review whatever it generates, and would expect the normal suite of other tools (linters, SAST/DAST, fuzzers, etc.) to pass the code before publishing it. I recently interviewed someone else with 25-ish years of professional
    - Re: (Score:1)
      
      by Ted Stoner ( 648616 ) writes:
      
      Well, most of us don't have the specific experience to know whether AI is useful enough for Chris Mason's specific use case, but we have enough from our own.
      I have found that AI is good enough for a first draft of code, or for providing comments on existing code
      I have no idea how good AI is right now for this task, but presumably existing off-the-shelf dedicated s/w for doing code inspections will use way less resources than an LLM. Regardless of the inspection tool, human review after the code has passed that inspection is always helpful, if for nothing else than maintaining styles, standards, and for awareness of what is being done. Awareness is important until we turn the whole shebang over to the LLMs when we are enslaved.
      It will be interesting to see the poin
      - Re: (Score:2)
        
        by Entrope ( 68843 ) writes:
        
        I have no idea how good AI is right now for this task, but presumably existing off-the-shelf dedicated s/w for doing code inspections will use way less resources than an LLM.
        Based on my experience, I am almost certain you are right: the static cofe analyzer / static application security testing tool that I have used professionally needs fewer resources than an LLM. But on the other hand, an LLM might catch things that the special purpose tool does not. The guy I interviewed said the race conditions escaped his static analyzer, and I've seen even a locally hosted mid-size (120B parameter) LLM flag cut-and-paste errors that an SCA tool might miss. (I did not run a dedicated an
    - Re: (Score:2)
      
      by HiThere ( 15173 ) writes:
      
      You can't have the specific expertise, because he's training it himself. Expect it to catch the kinds of errors he trains it to catch, and to probably do a good job at that...and a worse than lousy job on other kinds of errors.
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Indeed. And that is only one of the problems. It is already on show-stopper level though.
        Let's face it, to do things competently (and find the issues AI misses), you need a lot of experience. Experience can only be gotten from doing things. If you let the AI do the easier things, you will never get to the skill level needed to master the harder things. If enough people do that, it could be catastrophic.
        
        Re: (Score:2)
        
        by HiThere ( 15173 ) writes:
        
        That NOT a problem. Don't expect any tool to be the be-all-end-all. It's really useful to have a bunch of classes of error be automatically detectable.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    You do not understand what the problem here is. I am pretty sure I do.
    - Re: Don't be stupid, people (Score:3)
      
      by Zero__Kelvin ( 151819 ) writes:
      
      The fact that you are "pretty sure" that you understand is actually the problem that you don't understand. It's called the Dunning Kruger effect. I'm certain you know what that is ... you just don't recognize that you are a walking example of it every time you discuss AI.
- Re:Don't be stupid, people (Score:5, Insightful)
  
  by rocket rancher ( 447670 ) writes: <themovingfinger@gmail.com> on Monday February 02, 2026 @10:26AM (#65964062)
  
  First, LLM-type AI may not actually be around in any suitable way in a few years. The business numbers are catastrophic.
  That’s not an argument, it’s astrology with a spreadsheet. Even if Vendor X faceplants into a crater, the workflow Chris is talking about doesn’t evaporate. These are prompts and scripts that turn “big diff, big context” into small, reviewable chunks. Swap the engine, keep the tooling. The kernel has outlived entire tech empires, compilers, VCSes, “next big things,” and at least three “Linux is doomed” decades. Tools that reduce reviewer fatigue stick around because reviewers keep using them, not because a quarterly earnings call went well.
  Also: the LKML thread is about making AI review less magical by structuring it, scoping it, and forcing it to show its work. That’s basically the opposite of “bet the farm on a single vendor’s hype cycle.”
  Second, LLM-type AI misses what is really important, namely quality of architecture and interfaces
  Correct in the most trivial way possible: lint won’t design your subsystem either -- and nobody claimed it would. This isn’t “let the chatbot be a maintainer,” it’s “use a tool to catch more bugs while humans stay responsible for architecture and interfaces.”
  Kernel review is layered. Humans do the high-level “does this belong, does it fit, is the interface sane, does it age well?” work. Tools do the tireless “did you miss a refcount, a NULL check, a lock ordering hazard, a surprising call path” work. Chris is explicitly carving the diff into tasks, extracting call graphs, and even cross-checking lore and Fixes tags. That’s a checklist machine, not an architect. Complaining it’s not an architect is like complaining grep can’t write a better filesystem -- which Chris *obviously* can do... :)
  and is[t] bad at finding security problems outside of toy examples.
  If your model is “AI must find every non-toy security bug or it’s worthless,” then congrats, you’ve also just declared static analyzers, fuzzers, and humans worthless, because none of them are complete. In reality, we stack imperfect tools and get better outcomes. Syzkaller doesn’t understand architecture either, yet it finds terrifyingly real bugs. Sparse doesn’t grok interfaces, yet it saves us from type and annotation shooting us in the foot. Smatch doesn’t have to grok the dev's intent to catch patterns reviewers miss at 2AM.
  AI review is the same category: a probabilistic pattern spotter that can flag suspicious deltas fast, especially when you constrain context, force targeted questions, and make it operate on extracted facts instead of vibes. That’s exactly what this informal RFC is doing, including extra rigor around syzbot reports.
  if you don’t want to use the prompts, don’t. But don’t pretend “VC math scary” and “AI isn’t a maintainer” are substantive rebuttals to an RFC, even an informal one, about reducing token waste and catching more bugs with a structured, auditable review pipeline.
  
  - Re: (Score:1)
    
    by gweihir ( 88907 ) writes:
    
    First, LLM-type AI may not actually be around in any suitable way in a few years. The business numbers are catastrophic.
    That’s not an argument, it’s astrology with a spreadsheet.
    That you do not understand business analytics does not mean they do not work. Seriously. Your ignorance is strong.
    - Re: (Score:2)
      
      by DamnOregonian ( 963763 ) writes:
      
      That you don't understand that these workflows exist even if these businesses disappear from the Earth is surprising. You do know that local LLMs are a thing, right?
      And if you don't have the hardware to run one, there are many crowdsourced solutions, or even companies that are profitable who do nothing but inference on local models.
      
      It's very true that training is a big fucking problem. Companies building out datacenters to house their GPU horsepower requirements are a big fucking problem.
      Raw rented-hors
- Re: (Score:2)
  
  by Junta ( 36770 ) writes:
  
  This is a use case where I could see LLM being decent, subject to a couple of constraints:
  - The submitter is always able to advance it to the next stage regardless of what the LLM says
  - A human review is always the next step after the LLM review
  LLM code review is actually one of the more innocuous situations, it offers very small, digestible indirect feedback about code. It's still usually wrong, but it occasionally will catch something useful that was otherwise overlooked. It may spare the reviewers from
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    So essentially catch submissions by incompetents and AI-slop? That could work. That may even become necessary, given how many idiots think AI turns them magically into good coders.
    But it seems that is not what the person from the story is trying to do.
Wtf is a kernel stakeholder? (Score:3)

by Viol8 ( 599362 ) writes: on Monday February 02, 2026 @05:57AM (#65963836) Homepage

Is that pointy hair speak for kernel code contributers?

- Re: (Score:3)
  
  by Mascot ( 120795 ) writes:
  
  Based on the summary, to me the use of "upstream" indicates "maintainers". As in the people responsible for approving and merging.
- Re: (Score:2)
  
  by kwalker ( 1383 ) writes:
  
  "stakeholder" in this case means "contributors" and "people who benefit from those contributions" so maintainers and distro developers.
- Re: (Score:2)
  
  by ChunderDownunder ( 709234 ) writes:
  
  Disagree [ I am not a kernel contributor ]. If you look at the mailing list, patches go through multiple iterations, with back and forth between code reviews.
  Code ought still be reviewed by humans but if an AI can reduce patchsets from, say, 8 to 5 based on the experiences of a veteran kernel developer then it has performed a service.
  And maybe the AI can introduce a special, pedantic, Linus-mode to pre-emptively yell at you when your approach is bad ! :)
  - Re: (Score:2)
    
    by piojo ( 995934 ) writes:
    
    Yeah, I'd rather be warned privately that my code likely has some issues than have it pointed out publicly! Of course for educational purposes some people need to make public mistakes, but it need not be as many as are doing it now.
    - Re: (Score:2)
      
      by SirSlud ( 67381 ) writes:
      
      What does that have to do with *what* or *who* is reviewing your code? If you're submitting PRs to the kernel, you're way beyond the concern of being self-conscious about shit like that.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  Imagine getting your first code reviewed instantly instead of waiting a day. And then iterating 5 times with the bot before the first human sees it and tells you what's against the guidelines. You save a lot of time. You have less wait time to the first feedback and can prepare a good patchset, and the maintainers only need to read your code 3 times instead of 8. Everyone wins.
- - Re: HELP! (Score:2)
    
    by devslash0 ( 4203435 ) writes:
    
    Probably confusion caused by too many drugs.
    - - Re: (Score:2)
        
        by Quasar1999 ( 520073 ) writes:
        
        Help! I am trapped inside the fortune cookie factory, being forced to write quips that sound funny if you append [in bed]!
        I wish to subscribe to your fortune cookie newsletter.
Did AI write the summary? (Score:3)

by 93 Escort Wagon ( 326346 ) writes: on Monday February 02, 2026 @01:13PM (#65964362)

... has been working on a Git repository with AI review prompts he has been working on for LLM-assisted code review of Linux kernel patches.
It's good to know he has been working on a thing he has been working on.

Why should I trust him... (Score:2)

by Wolfrider ( 856 ) writes:

When btrfs RAID5/6 has been broken for well over a decade? The filesystem is a shitshow of bugs and bad implementation compared to ZFS.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Re: (Score:3)

Re: Don't be stupid, people (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: Don't be stupid, people (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Don't be stupid, people (Score:3)

Re:Don't be stupid, people (Score:5, Insightful)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Wtf is a kernel stakeholder? (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: HELP! (Score:2)

Re: (Score:2)

Did AI write the summary? (Score:3)

Why should I trust him... (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals