Linux

'Blue Screen of Death' Comes To Linux (phoronix.com) 109

In 2016, Phoronix remembered how the early days of Linux kernel mode-setting (KMS) had brought hopes for improved error messages. And one long-awaited feature was error messages for "Direct Rendering Manager" (or DRM) drivers, something analogous to the "Blue Screen of Death" Windows gives for critical errors.

Now Linux 6.10 is introducing a new DRM panic handler infrastructure enabling messages when a panic occurs, Phoronix reports today. "This is especially important for those building a kernel without VT/FBCON support where otherwise viewing the kernel panic message isn't otherwise easily available." With Linux 6.10 the initial DRM Panic code has landed as well as wiring up the DRM/KMS driver support for the SimpleDRM, MGAG200, IMX, and AST drivers. There is work underway on extending DRM Panic support to other drivers that we'll likely see over the coming kernel cycles for more widespread support... On Linux 6.10+ with platforms having the DRM Panic driver support, this "Blue Screen of Death" functionality can be tested via a route such as echo c > /proc/sysrq-trigger.
The article links to a picture shared on Mastodon by Red Hat engineer Javier Martinez Canillas of the error message being generated on a BeaglePlay single board computer.

Phoronix also points out that some operating systems have even considered QR codes for kernel error messages...
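
Before trying the `echo c > /proc/sysrq-trigger` route mentioned in the summary, it may help to confirm the running kernel is at least 6.10. The following is a minimal sketch, not an official check: the function names are hypothetical, and a new enough version alone does not guarantee that the kernel was built with DRM Panic enabled or that the active driver supports it.

```python
import re

# Assumption: DRM Panic first landed in mainline Linux 6.10, per the summary.
MIN_DRM_PANIC_VERSION = (6, 10)


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a `uname -r` style release string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def may_have_drm_panic(release: str) -> bool:
    """True if the version is new enough; says nothing about config or driver."""
    return parse_kernel_release(release) >= MIN_DRM_PANIC_VERSION
```

On a live system the release string could be taken from `platform.release()`. Note that writing `c` to `/proc/sysrq-trigger` deliberately crashes the machine, so it belongs on a test box only.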
This discussion has been archived. No new comments can be posted.

  • I'm confused (Score:4, Insightful)

    by ArchieBunker ( 132337 ) on Sunday June 16, 2024 @04:16PM (#64553893)

    Is there something wrong with plaintext?

    • If Linux machines are ever going to make it to the mainstream then users are going to need simple upfront feedback. If Linux devs do their job well there will be users who probably won't even know system logs exist.
      • Re:I'm confused (Score:5, Insightful)

        by 93 Escort Wagon ( 326346 ) on Sunday June 16, 2024 @04:42PM (#64553939)

        If Linux devs do their job well there will be users who probably won't even know system logs exist.

        That's a big "if". Did you ever see a Windows BSOD that was actually useful?

        • Re:I'm confused (Score:4, Informative)

          by ArchieBunker ( 132337 ) on Sunday June 16, 2024 @04:49PM (#64553957)

          Yes, actually. Well back when they showed all the debug info. I was able to determine the video card drivers were causing the crash.

          • Debug information would be helpful, rather than the random lockups that make me press the Big Red Button for 8sec to reboot.

            The syslog very rarely tells me enough to make me understand why it locked. I blame NVIDIA closed-source drivers, but perhaps it's something else. The drivers are the only closed-source installed.

        • His point was, I think, that Linux - to be successful with mainstream users - needs to handle issues that don't involve users digging into log files to diagnose a problem. If a user needs to start reading log files to fix a failing application, the mainstream user will not do that, they'll seek easier alternatives, perhaps on another platform.

          It's a very reasonable suggestion/observation.

          • Re: I'm confused (Score:4, Informative)

            by PPH ( 736903 ) on Sunday June 16, 2024 @06:11PM (#64554079)

            Mainstream users won't be diagnosing kernel/driver problems. The level of expertise needed to do so is typically possessed by users who are not put off by "digging into log files".

        • yes all the time. For various hardware or driver failures they can be fantastic. For dodgy memory they tend to be various levels of randomness (which in itself can be telling)
        • Actually yes, the windows blue screen of death gave me excellent advice many times. The advice was always "use UNIX".

        • Did you ever see a Windows BSOD that was actually useful?

          I was getting the same error code repeatedly, looked online, GPU issue. It was the GPU alright -- its fans were clogged with dust. Cleaned, no more problems.

        • No, but the crash dump file it generates is useful. I've loaded those in to WinDBG many times and tracked down the source of BSODs. I'm not sure of the use of the info at the time of crash when the system is unresponsive.

        • That's a big "if". Did you ever see a Windows BSOD that was actually useful?

          Yes constantly. If nothing else the faulting module can often point you to the piece of hardware that is failing. I have on several occasions googled what has come up on the BSOD in Windows and it led me virtually directly to the source of the problem. It's rare but I've had 3 BSODs, and in two cases it virtually directly took me to the solution (one was GPU driver problem, the other was RAM), and the third case at least pointed me to have to look deeper with answers from Google basically saying you need

      • I think that's how they become mainstream. By keeping the Windows users in their comfort zone, which means seeing the occasional BSOD.
      • The point is when the feedback is so simple to the point of being useless. I've seen the image linked in the article and the BSOD doesn't give any information besides that there has been a kernel panic.
      • I like error messages that offer enough information for a technical person to resolve the issue. Plus being able to cut and paste the error message seems useful if you want to search for a solution online.

        For people who are not technical enough to resolve the error. They'll probably just reboot or find someone who can fix their problem for them. So the depiction of error messages is essentially irrelevant to them, as it wouldn't alter the outcome in any case.

    • The issue isn't text, it's that the OS gets so lost it suffers an unrecoverable error and simply halts the computer, with all pending work lost.

      When Windows did it, Linux advocates howled, said it was unacceptable and proof that Windows was a flawed OS with deep issues in its fundamental design.

      Now that Linux is doing it, I am positive Linux advocates will point out that BSODs are the only reasonable response when an OS has certain issues...

      • Re: I'm confused (Score:5, Insightful)

        by test321 ( 8891681 ) on Sunday June 16, 2024 @07:26PM (#64554179)

        When Windows did it, Linux advocates howled, said it was unacceptable and proof that Windows was a flawed OS with deep issues in its fundamental design.

        The problem with Windows BSOD that made it unacceptable, is not the screen itself, it is the fact that it happened frequently, and that is the indicator of deep OS flaws.

        Now that Linux is doing it,

        It has always done it. Dennis Ritchie commented about a UNIX panic() function in 1971 https://www.multicians.org/uni... [multicians.org] . It's just uncommon to see a kernel panic, because Linux is incredibly reliable.

        • typo, it was 1973

        • it is the fact that it happened frequently
           
          I haven't seen chronic BSODs since win9x/XP. And XP was a driver issue or a $15 PSU

          • I haven't seen chronic BSODs since win9x/XP. And XP was a driver issue or a $15 PSU

            Yes, the word "happened" happens to be the past tense of the word happens. BSODs are rare these days and the systems which do experience them are often likely to experience a kernel panic too (hardware problem).

        • by AmiMoJo ( 196126 )

          Linux wasn't that much better when it came to hard crashes. The issue on both operating systems was 95% drivers. Both were running them in the kernel.

          Over time both Windows and Linux moved drivers out of the kernel and into user space, greatly improving stability. Both also replaced a lot of drivers with more generic interfaces that provided better stability, such as WinUSB and libusb.

        • by mjwx ( 966435 )

          When Windows did it, Linux advocates howled, said it was unacceptable and proof that Windows was a flawed OS with deep issues in its fundamental design.

          The problem with Windows BSOD that made it unacceptable, is not the screen itself, it is the fact that it happened frequently, and that is the indicator of deep OS flaws.

          Indeed, the problem wasn't that the Blue Screen was useless, it wasn't. It's the fact that a poorly written driver could crash the entire OS seemingly at random.

          VMWare's PSOD also tends to have some useful info about what went wrong, although 9 times out of 10 what went wrong was the hardware.

        • Making the Linux kernel panic is what allowed us (gave us the idea) to find the ping of death for Windows.

      • It better be more info than the crap that was posted as a "hey look at this!" shit on whatever twitter wannabe site was linked in TFS.

        Literally just "Kernel panic. Please reboot your computer". Thanks, that message was SUPER helpful about what actually went wrong.

        I mean fuck, would it kill them to let you know that "your RAM be fucked up yo" or "Your video card crashed, update your drivers or replace the card", or "Your CPU just went nuts, have you cleaned the fucking dust out of it lately so it doesn't ove

    • by Cyberax ( 705495 )

      Is there something wrong with plaintext?

      How are you going to print it when your monitor is in graphics mode?

      • Since the "monitor" is almost always on a remote virtualized host, with some kind of remote serial console access, the admin running cloud hosts can get screen shots of that. Getting at the _hardware_ remote consoles of the docker server, or the cloud provider, is a different level of access but there's often a means to get a screen shot of that today.

        • by Cyberax ( 705495 )
          The BSOD functionality is not for servers, but for local devices (like laptops). On servers you can just output the debugging data onto the serial console.
          • by kenh ( 9056 )

            On servers you can just output the debugging data onto the serial console.

            Uh, what? Where are you working that you have "serial console(s)"? (How have you kept that VAX running for all these years?)

            • by Cyberax ( 705495 )
              ????

              I have not seen a server _without_ a serial console. It often is virtualized through the management card, but all the servers that I used over the decade support console output for debugging. Most servers even have a good old RS-232 plug.
              • So we're calling a powershell session a "serial console"?

                Oh, and I guess a KVM session into a server's BIOS is also a "serial console"?

                When I hear something called a "serial console" I'm experienced enough to think you are talking about a terminal connected to a serial port - apparently you kids consider any interface that doesn't involve a mouse/pointer is a "serial console"...

                And yes, "Get off my lawn!" LOL

            • The serial console is great, and it's almost never actually a physical connection. It's basically a text console to a BMC.

              Sometimes it's emulating a serial port fully and it's a bit obnoxiously slow topping out at 115200, but sometimes it just immediately signals transmit complete and has an effective data rate that's extremely high, much higher than 115200.

              The result is that you can trivially log months of console output, copy and paste data with ease, and when you feel like it watch a hundred systems con
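
To put the 115200 figure in perspective, here is a rough calculation of what a fully emulated serial port can carry. It is a sketch that assumes the common 8N1 framing (one start bit, eight data bits, no parity, one stop bit), which the comment does not state; the function names are illustrative.

```python
def serial_bytes_per_second(baud: int, data_bits: int = 8,
                            parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Payload bytes per second for an asynchronous serial link."""
    # Every byte on the wire is wrapped in one start bit plus the framing bits.
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_frame


def seconds_to_send(num_bytes: int, baud: int) -> float:
    """Time to push num_bytes through the link, assuming 8N1 framing."""
    return num_bytes / serial_bytes_per_second(baud)
```

Under those assumptions 115200 baud works out to 11520 bytes per second, so repainting a full 80x25 text screen (2000 bytes) takes roughly 0.17 seconds, which is why a fully emulated port feels slow next to one that signals transmit complete immediately.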

            • Serial consoles are available in every server I oversee (which is about 300)
              BIOS on servers is also usually able to output the basic character-mode screen output (getting BIOS, POST, Grub, etc) to it as well.
              Paired with IPMI, you've got an out-of-band network-connected text management interface to the server.

              I get the feeling you haven't touched a server since the VAX days ;)
            • It's not actually a serial port. For VMware and KVM, it's access to the remote video at boot time of the individual guest instance, useful for debugging kernel errors and filesystem issues. Access to it tends to be limited by policy. For physical hardware, it's usually actually a remote keyboard-video-monitor setup.

    • by znrt ( 2424692 )

      i loved tos' bombs back in the day: https://content.invisioncic.co... [invisioncic.com]

    • Re:I'm confused (Score:5, Informative)

      by gweihir ( 88907 ) on Sunday June 16, 2024 @05:25PM (#64554029)

      As the article says: This is for a "kernel without VT/FBCON support". Personally, I consider such a kernel simply broken. This whole thing seems to be a dumbing-down. On the other hand, maybe that is needed to make Linux more mainstream.

      • by guruevi ( 827432 )

        I believe systemd has a mode where there is no VT. I know on some modern OS I have had to enable it to get any error messages before systemd loads. This basically solves that.

        • by russotto ( 537200 ) on Sunday June 16, 2024 @07:16PM (#64554175) Journal
          When I saw the headline, I knew Poettering had to be involved somehow.
          • by gweihir ( 88907 )

            I wonder what other screw-ups fly under the radar on Linux because Poettering is so prolific in breaking things.

          • When I saw the headline, I knew Poettering had to be involved somehow.

            He's not, but your hateboner made it happen anyway. systemd resolved this already. In the situations where VT is disabled systemd has a module for logging kernel panics during boot called "systemd-bsod" (no not joking). https://www.freedesktop.org/so... [freedesktop.org]

            • by guruevi ( 827432 )

              Yes, but that doesn't help when your system doesn't even boot to get the (binary) logs from journald.

              • Yes, but that doesn't help when your system doesn't even boot to get the (binary) logs from journald.

                You'd think that... if you had no idea what systemd-bsod is about. Hint: It's there to resolve specifically the problem you mentioned.

          • by stooo ( 2202012 )

            when you need a BSOD, you get the best experts to achieve it.

      • by Burdell ( 228580 )

        Modern video cards don't have classic text modes, except as a barely-functional emulation for BIOS menus (which are graphical now instead of depending on those classic text modes). Plus, the majority of desktop Linux systems spend all their time in X or Wayland, and switching to text mode requires a whole lot of reprogramming of the video card (at a time when every bit of code and data has to be considered "suspect" at best).

        So being able to dump useful information while still in a graphical state with mini

      • Personally, I consider such a kernel simply broken.

        Such a kernel very much is not broken.

        This is the norm in the embedded space where you may not be able to dedicate a UART to a linux console.
        In this instance, the driver is being used to show a blue screen- but it's useful for far more than that. Like throwing the panic into an area of RAM for next-boot retrieval, dropping a message into a mailbox for a hypervisor, or letting the bootloader know (increment a counter, for alternate image boot, perhaps)

        This feature has been hacked into kernels out-of-tre
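
The "increment a counter, for alternate image boot" idea above can be illustrated with a small simulation. This is only a sketch of the decision logic a bootloader might apply: the `BootState` fields, the image names, and the threshold of three panics are all hypothetical, and nothing here reflects an actual kernel or bootloader interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootState:
    """Hypothetical state kept somewhere that survives a reboot."""
    panic_count: int = 0
    active_image: str = "primary"


def record_panic(state: BootState) -> BootState:
    """What a panic hook might do: bump the persistent counter."""
    return BootState(state.panic_count + 1, state.active_image)


def select_image(state: BootState, max_panics: int = 3) -> str:
    """What a bootloader might do: fall back after repeated panics."""
    if state.panic_count >= max_panics:
        return "fallback"
    return state.active_image
```

The point of the hook is that the kernel only has to record that it panicked; the policy (how many panics to tolerate, which image to try next) can live entirely outside the kernel.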

        • by gweihir ( 88907 )

          Soo, color display but no serial console? Got an example of that? Because it seems to be really uncommon to not have that serial line in anything large enough to run a Linux kernel.

          • No- no color display either. No display of any kind, and no serial console.
            Example? The one I'm most familiar with- SOHO CPEs that use their single UART to talk to another peripheral in the device.
            But even that's a red herring- the console itself isn't fundamentally useful if you don't expect the device to have a console. This is just about being able to provide a panic() driver.

            Linux provides no way to save any panic information without directly modifying the panic call.
            This will allow us to hook it w
            • by gweihir ( 88907 )

              Ah, you are arguing for the hook, not the actual blue screen, correct? That I have zero issue with. In fact, I could have used that myself in the past. On revisiting the story, looks like I should have read it more carefully. My apologies.

    • by guruevi ( 827432 )

      No, and plaintext will continue to work, this is for the weird event where you are in a DRI environment and experience a crash, I don't believe there is a method to get back to VGA before printing the error. Then there is also the issue that the terminal/kernel isn't always able to address the GPU during a crash. This will however be more similar to how macOS prints the error messages over a drawn window rather than a true BSOD that gives no information.

    • Is there something wrong with plaintext?

      Yes. I've never had a plaintext kernel panic produce some usable result. More often than not the error runs off the screen, so unless you have some external logging on a console you don't get the whole output. Also the whole basis for this is that VT is being deprecated in some cases and as a result you won't get kernel panic messages able to be logged at all.

    • Is there something wrong with plaintext?

      The geniuses responsible for systemd apparently think so.

  • Fictional (Score:2, Insightful)

    by The Cat ( 19816 )

    I've been using Linux as my daily driver for over 30 years and it has never crashed.

    Not once.

    • Re:Fictional (Score:4, Interesting)

      by gweihir ( 88907 ) on Sunday June 16, 2024 @05:26PM (#64554035)

      Yep, pretty much the same here. Maybe these people are preparing to make Linux as shoddy as some other OSes.

    • by HiThere ( 15173 )

      I've had it crash a few times. Generally a hardware fault, but once due to an update that removed something it shouldn't. Not in the last decade, though. (But there was an install failure that I traced to a failing disk drive.)

    • Come to think of it the last kernel panic I remember involved me doing deliberately weird things to an early version of reiserfs. That was some time ago.

    • Congrats. I've been using Linux for 30 years as well and I've had similar level of crashes to post windows 7 windows. Which is to say I've had kernel panics and they have been related to a) dodgy drivers, and b) failing hardware.

      I'm impressed you've never had hardware fail on you.

  • BSOD (Score:4, Interesting)

    by markdavis ( 642305 ) on Sunday June 16, 2024 @05:05PM (#64553995)

    >"And one long-awaited feature was error messages for "Direct Rendering Manager" (or DRM) drivers - something analogous to the "Blue Screen of Death" Windows gives for critical errors."

    Thankfully, Linux users are probably far less likely to ever see such a screen. I manage hundreds of various Linux machines (different distros, hardware, roles, users, apps, environments) doing all kinds of things and I can go years without seeing a panic (or messageless freeze/crash) on any of them.

    • by Anonymous Coward
      Sadly I see more kernel panics than I do BSOD. hardware drivers being the most common root cause.
    • by antdude ( 79039 )

      Once in a while, I get those hard crashes like kernel panics and freezes. It's even more annoying when it happens when my screen was asleep! :(

  • and my first thought was: what has Lennart Poettering been up to now ?

    Pleasantly surprised to find that I was wrong.

  • by Luckyo ( 1726890 ) on Sunday June 16, 2024 @05:26PM (#64554033)

    But can it even be allowed to be blue, considering how widely associated with Windows BSOD is?

  • by Big Hairy Gorilla ( 9839972 ) on Sunday June 16, 2024 @06:14PM (#64554081)
    I've never really seen a windows user use the information on the BSOD... so why bother?
    Also, coming from Redhat? Those guys want to BE the Microsoft of Linux, so they have to "innovate". If they don't add things, then how do they convince you that the new version is worth buying?

    In the end it doesn't add much or any value, but it keeps people talking, and others in jobs.

    Designers gotta design. <tm symbol>
    • I've never really seen a windows user use the information on the BSOD... so why bother?

      Those messages were never for users. They were for administrators who could actually do something with those codes.

      • Never saw that either... and admins could look at logs... so I'm gonna say there's still no practical value to intercepting an error and presenting it nicely. It's still just window dressing.
        • Never saw that either... and admins could look at logs... so I'm gonna say there's still no practical value to intercepting an error and presenting it nicely. It's still just window dressing.

          You should probably tell ArchieBunker what you said: https://linux.slashdot.org/com... [slashdot.org]

          After you are done chatting with him, we can chat about the times I used the information on the BSOD to fix what the issue was. *shrug* Why be so negative? Too much information can be a problem; however, when it causes a full stop of the operating system, more information is always better.

    • I've never really seen a windows user use the information on the BSOD... so why bother?

      Did you ask everyone who has had a BSOD? Quite a few people ignore them, then there's those who know their significance and we *DO* use the information on the BSOD. It often points to the source of the problem. I've had 3 cases of BSODs since windows 7 and in two of the three the information on the screen directly pointed to the problem: a) NVIDIA drivers playing up, b) randomly changing windows module errors -> almost always RAM. In the third case googling the problem resulting in internet search re

  • This is like car manufacturers building all kinds of features to handling accident impacts with no regard to ADAS features that can actually prevent accidents.

  • About every 2 to 3 months my linux box just freezes with everything locked solid.

    To be fair, I don't know that it's linux causing this.

    I may have a bad RAM module or some other bit of hardware that glitches occasionally and brings everything to a 100% instant frozen halt. Holding the reboot button down for 5 seconds or whatever forces a reboot and then it's fine for another couple of months.

    • About every 2 to 3 months my linux box just freezes with everything locked solid.

      To be fair, I don't know that it's linux causing this.

      I may have a bad RAM module or some other bit of hardware that glitches occasionally and brings everything to a 100% instant frozen halt. Holding the reboot button down for 5 seconds or whatever forces a reboot and then it's fine for another couple of months.

      Hmm. The fact that your computer has a "reboot button" and you need to hold it for 5 seconds gives me the feeling that it's vintage. Hardware does age, so maybe you're right that it's glitchy. Diagnostic programs like memtest could help you determine this.

      Also, when was the last time you updated or upgraded your Linux system? Sometimes freezing can be caused by bugs or misconfigurations in a distribution that are fixed with updates.

      • I update within a day or so of updates being available. Currently using Linux Mint 21.3 Cinnamon, 5.15.0-112-generic.

        I built the system about 5 years ago maybe (??) something like that.

        The last PC I built, a gaming PC, has inputs for a reset switch just like all the motherboards I've seen, but I notice that some manufacturers are putting a reset button on the case anymore, is that a trend?

        I like having a reset button as sometimes the power button won't kill or restart the PC.

        • I update within a day or so of updates being available. Currently using Linux Mint 21.3 Cinnamon, 5.15.0-112-generic.

          Sounds good. I haven't used Mint but I have heard great things about it. I use Ubuntu 22.04, on which Mint 21 is based.

          BTW, I had trouble getting Ubuntu to run on my laptop until I changed the configuration of the Intel i915 GFX graphics driver. I needed to set the enable_psr parameter to 0 in /etc/modprobe.d/i915.conf. Without this configuration, my laptop would freeze up shortly after booting. It appears all releases after 18.04 require this change.

          I built the system about 5 years ago maybe (??) something like that.

          So, fairly new then. You're not overclocking? Anyway, goo
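
For anyone wanting to try the `enable_psr` workaround described above, module parameters in `/etc/modprobe.d/` files are written as `options <module> <param>=<value>` lines. The helper below is a small sketch that renders such a line; the function name is made up for illustration, and whether the setting helps on a given laptop is the commenter's experience rather than a general fix.

```python
def render_modprobe_options(module: str, params: dict[str, object]) -> str:
    """Render one `options` line in modprobe.d syntax."""
    rendered = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {rendered}"


# The line the comment describes for /etc/modprobe.d/i915.conf
line = render_modprobe_options("i915", {"enable_psr": 0})
```

Here `line` holds `options i915 enable_psr=0`. A change like this normally only takes effect once the module is reloaded, which in practice usually means a reboot.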

  • ...has finally arrived!

  • That's it, I'm moving to BSD.

    • by jmccue ( 834797 )

      Maybe you are joking, this is another indicator that IBM, Microsoft is making Linux work like Windows.

      If you ever used a BSD, you will see how much more any BSD system is compared to Linux. I have been evaluating the BSD for a few years and I have chosen my escape from Linux Route. All I need to see is have a couple of more straws loaded on the camel so to speak. To me this is 1 more straw.

      I say that because I wonder if this has to do with DRM (Digital Rights Mang.) which I heard was added a few releases

  • Except for the pretty blue color, that error message is useless. "Something went wrong" gives zero information. No error message, no data, nothing. Even Windows' blue screens are more informative.
    • I suppose it's probably more informative than status quo running a graphical desktop. It's been years since I've seen a panic on a desktop machine, but last time I did the UI just hung, the kernel couldn't printk to system while GUI was up.

      Which was a step backward from the Sun systems I used to work where a panic would just start scrolling the text away.

      So having an indication that it is crashed is better than an ambiguous hang. I also hope that they have access to the panic text and could render it to t

  • BSOD screensaver (Score:3, Informative)

    by kge ( 457708 ) on Monday June 17, 2024 @02:42AM (#64554593)

    Just install xscreensaver-screensaver-bsod for fun...

  • This driver seems extremely modern, judging by its name. A fitting choice for implementing bleeding-edge features.

  • ...Linux is ready for the desktop.
