Linus Torvalds Injects Tabs To Thwart Kconfig Parsers Not Correctly Handling Them (phoronix.com) 117
Michael Larabel reports via Phoronix: Within yesterday's Linux 6.9-rc4 release is an interesting little nugget by Linus Torvalds to battle Kconfig parsers that can't correctly handle tabs but rather just assume spaces for whitespace in this kernel configuration format. Because a patch was queued last week to replace a tab with a space character in the kernel tracing Kconfig file, Linus Torvalds decided to take matters into his own hands for Kconfig parsers that can't deal with tabs... Torvalds authored a patch to intentionally add some tabs of his own into Kconfig to throw off any out-of-tree/third-party parsers that can't correctly handle them. Torvalds added these intentional hidden tabs to the common Kconfig file for handling page sizes for the kernel. This is sure to cause dramatic and noticeable breakage for any parsers not handling tabs correctly.
Re: (Score:2, Insightful)
What is the point of this. To show your parser is broken? Cool. Now you just completely broke projects downstream so you can say told you so? This is a product enterprises depend on. I felt bad when Linus was forced to step down during the plandemic.. now? Good riddance. Time for someone with more tact to start running this project
Keep using Windows. You'll fit right in there. We don't need shitty developers like yourself bitching that someone broke your shitty code or that something that has been working for 40 years doesn't work in the latest version of the OS because you'll actually have to get off your lazy ass and update your code.
Re:Why (Score:5, Funny)
And yet, some still install to "c:\PROGRA~1" or whatever, because Microsoft, in their opposite of wisdom, didn't design things for the future.
Sounds like a retcon'd attribution, to me.
Re:Why (Score:5, Informative)
Gary Kildall was the CEO of Digital Research. They created CP/M, a very popular operating system for (at the time) 8 bit microcomputers running the Z80 CPU.
Microsoft at best made a BASIC for these microcomputers - things like the FAT filesystem can trace their origins to microcomputer BASIC from Micro-Soft (as it was known then).
Now, IBM famously tried to arrange a hush-hush meeting with Digital Research to get CP/M ported to 16-bit processors (i.e., the 8086) for their new Personal Computer. Unfortunately, Gary was out flying, while his wife refused to sign the NDA IBM wanted. Now, depending on the version of the story, the widely publicized one was that Gary was flying for fun and not to be disturbed which irritated IBM as they mutually arranged for this meeting. In reality, Gary's wife was the one who did the business matters (Gary was the technical guy/programmer/etc) so Gary wasn't actually needed since it was thought that they would hammer out the legal agreements. But it was not only the failure of Gary's wife to sign the NDA, but that they could not hammer out a licensing arrangement. IBM wanted "free and clear" licensing - they'd pay Digital Research a sum of money for unlimited copies, and that was something Digital Research refused to do. So IBM left.
IBM then asked Micro-Soft (who was providing the BASIC and language support) if they would supply the OS as well, which is where they bought QDOS from Seattle Computer Products, which was a clean room implementation of a CP/M-like OS for the 8086. (It was by definition clean room, as CP/M for 16-bit processors didn't yet exist.) Of course, Digital Research sued IBM and Microsoft, and IBM settled by offering CP/M, when it was available, with the IBM PC. Incidentally, this was one of the first "look and feel" lawsuits - it was felt that MS-DOS felt a little "too close" to CP/M.
Of course, by then it was too late - MS-DOS and PC-DOS were well established for years, and were cheaper at $99 vs. DR's 16-bit CP/M at $250.
And yes, 8.3 was ridiculous in the 80s, when even contemporary PCs supported longer file names - I think the Commodore 64 supported 16 characters for the filename, while the Apple II supported 30 characters (!) in DOS and 15 characters (aww...) in ProDOS. And the Macintosh supported 32 characters initially.
And that we put up with this silliness until the mid 90s. When every other computer and OS on the market supported much longer filenames. Or *gasp*, spaces.
It should be noted that FAT12 and FAT16 are hardcoded to only support 8.3 filenames - the directory contains 8 characters for name, and 3 characters for extension (all filenames are padded to 8 with spaces). FAT32 was the first to support long filenames natively (support was added in Windows 98, so Windows 95 only did FAT12 and FAT16). So long filename support was a hack for FAT12 and FAT16 (and added to Linux as "vfat", but you could mount it as "msdos" to get back to 8.3; there was also the umsdos extension to add UNIX permissions and ownership information).
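The fixed 8+3 layout described above can be sketched in a few lines. This is only an illustration: the function name `to_short_name` is made up here, and it ignores the character-set restrictions and the rest of the directory entry.

```python
def to_short_name(filename: str) -> bytes:
    """Pack a name into the 11-byte, space-padded 8.3 field (illustrative only)."""
    # Split on the last dot; a name without a dot has an empty extension.
    if "." in filename:
        base, ext = filename.rsplit(".", 1)
    else:
        base, ext = filename, ""
    base, ext = base.upper(), ext.upper()
    # Anything that does not fit the 8+3 layout is rejected in this sketch.
    if not 1 <= len(base) <= 8 or len(ext) > 3 or "." in base:
        raise ValueError(f"{filename!r} does not fit an 8.3 name")
    # The dot itself is not stored; both parts are padded with spaces.
    return base.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")


print(to_short_name("readme.txt"))
```

A name such as "Program Files" fails the length check, which is why long names needed a separate mechanism on top of this field.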
Incidentally, long filename support was one of the patents Microsoft sued over initially, leading to vfat being slightly incompatible (I believe it stopped making the 8.3 compatible filename), but it was OK as Windows would auto-create them on mounting.
Re: (Score:2)
Windows 95 introduced support for longer file names.
>Released in August 1995, Windows 95 featured a new version of FAT, called VFAT (virtual file allocation table), that supported file names with a maximum length of 255 characters. All this was accomplished without losing backward compatibility with existing DOS volumes.
Re: (Score:3)
Irrelevant. The patch submitted was to fix the config file for all non-compliant parsers. Program Files was intended to break non-compliant programs.
That some programs ignored this, either intentionally or via their installer wrapper, does not take away from the number of applications whose installer wouldn't work and got fixed.
Pretty sure this is an example of the exception that proves the rule. Meaning that there were rules, and your example is an exception to those, but nonetheless, the rule was follow
Re: (Score:2)
And yet, some still install to "c:\PROGRA~1"
No they do not, unless they originally targeted XP and previous. Your modern install of Windows probably has short file names disabled. Aside from that, there is no guarantee that the short filename for "Program Files" is actually "PROGRA~1"
Re: Why (Score:1)
Or they use broken installers or are sourced from older systems. I regularly see the C:\Program~1 directory or even C:\Program (completely eliminating the Files, probably because someone wrote a parser that couldn't handle spaces in file names). The latter you can actually find in components of Adobe Creative Cloud.
Re: (Score:1)
Microsoft didn't design the 8.3 file naming system. Tim Paterson did.
Paterson was working for Seattle Computer Products, who needed an OS for their new computer. He created QDOS, "Quick and Dirty Operating System". That eventually became MS-DOS, on which Windows 95 and long file name support were built.
So it's a legacy from a hack done by a 24 year old contractor who didn't think his code would be used for more than a niche kit computer.
Re: (Score:2)
Nope, Seattle Computer Products didn't design it. CP/M already used 8.3 filenames, and the original FAT filesystem used by QDOS was a copy of the filesystem format used by Microsoft Standalone Disk BASIC, which used 8.3 filenames to imitate CP/M. It got to QDOS from CP/M via Microsoft.
Re: Why (Score:1)
And CP/M got it from PDP-11 (TOPS/TENEX) OS.
Re: (Score:3)
"Program Files" has a space in it for the same reason Linus put tabs in his Kconfig -- to force devs to handle spaces in paths correctly.
Because the people that did this were focused on a GUI experience and not a command-line one and failed to consider the ramifications of their actions. Personally, I'm not a huge fan of white-space in file paths/names and take efforts to name things w/o them.
Re: (Score:2)
I wish GNU Make would support spaces in filenames. There have been numerous times I've thought that a Makefile would be a great way to quickly solve a problem, only to be fouled up by some files with spaces in their names.
Re: Why (Score:4, Informative)
There is no change to announce, the parser is broken, everyone has known it's broken. It's unfortunate, but sometimes private discussions don't do the trick and the issue has to be taken public. This is that.
Re: (Score:2)
... you don't release a breaking change because you feel like causing drama.
There's a difference between a release and a release candidate - and this is the latter.
Although personally I'd rather Linus had just rejected the silly kludge that got submitted to the previous RC - perhaps accompanied by one of his diatribes - instead of this passive-aggressive action.
Re: (Score:3)
What Linus makes sure of here is that people who are too stupid or lazy to run real tests against their parsers do not get away with it anymore. You know, fixing the problem permanently. This is not "passive aggressive", this is called good engineering.
Re: Why (Score:2)
Good engineering is not impacted by things like this. Good engineering follows the spec. Good engineering doesn't implement a broken parser then push their shit downhill onto everyone else.
Re: (Score:2)
Indeed. But what this whole bunch of fake-criticism shows is that many people are working in the IT space that have absolutely no place in there because they only think about themselves and their short-term issues but have zero larger or let alone strategic understanding.
Re: (Score:2)
For something like this? No. Whoever is affected did not even try to do proper testing of their crapware. And if this causes any real problems, that is purely on them. We need coders that deserve the name, not amateurs with delusions.
Re: Why (Score:2)
Well to be fair it could be absolute dog shit ... but it's free
Re: (Score:2)
"What a joke, worshipping a developer and a system that has garbage like this in it 50 years after everyone knew better."
Everyone knew better how to parse the config files, but didn't. I don't think anyone here is worshiping a developer, but if you think developers should be able to just screw up any way they want to, think again. Mangled config files is serious business, and being unable to read config files means they might have problems writing a valid config file. In other words, don't trust them.
Re:Why (Score:4, Funny)
Python and YAML Users: "Shots fired!"
Re: (Score:2)
"Shots fired!"
Missed. Hit nothing but white space.
Re: (Score:3)
One good thing in these stories is that they always make some incompetent, intellectually lazy and hugely arrogant assholes obvious. Congratulations, you are one of them.
Re: (Score:2)
And yet all the cool kids love Python and YAML these days, both of which break in fun and interesting ways if you get the indenting wrong.
But that's by design, and is very clearly spelled out. And if you can't deal with Python's formatting rules, maybe you should go back to BASIC. The rest of us are making great stuff with it.
Re: (Score:3)
Indeed. Sometimes you have to just force lazy or stupid people to clean up their act. He is doing that here. When teaching imbeciles, you cannot always be friendly or they will learn nothing. And when letting things like this pass, the culture slowly goes to hell.
Re:Why (Score:5, Insightful)
Tabs vs Spaces is right up there in geek hills-to-die-on with Vi vs Emacs. It has been around for decades.
If your parser is that shit and can't HANDLE tabs, it needs to be broken explicitly so you fix it and it doesn't break accidentally, in some weird and obscure way. Even just blindly converting tabs to spaces is better than breaking.
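A minimal sketch of what "handling tabs" amounts to in a line-based config format. This is not the kernel's own Kconfig parser; the function `parse_entry` and the option name `EXAMPLE_OPTION` are made up for illustration.

```python
from typing import Optional, Tuple


def parse_entry(line: str) -> Optional[Tuple[str, str]]:
    """Split one config-style line into (keyword, argument)."""
    stripped = line.strip()  # drops leading/trailing spaces and tabs alike
    if not stripped or stripped.startswith("#"):
        return None  # blank line or comment
    # split(None, 1) splits on any run of whitespace, so a tab and a space
    # between keyword and argument give the same result.
    parts = stripped.split(None, 1)
    keyword = parts[0]
    argument = parts[1] if len(parts) > 1 else ""
    return keyword, argument


# Both spellings parse identically.
print(parse_entry("config EXAMPLE_OPTION"))
print(parse_entry("config\tEXAMPLE_OPTION"))
```

A parser that instead splits on a literal " " is the kind that breaks the moment a tab shows up.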
Re: Why (Score:2)
You are not wrong. But current behavior is correct behavior even if it is being used against your expected use case. So now you have users dependent on this feature. If you need to deprecate a feature, you announce it. In kernel release N+2, tabs will be introduced. You have been warned.
Way better than saying SURPRISE! Your shit is broken, enjoy the delays to your project that depends on a poorly defined feature!
Re: Why (Score:4, Interesting)
It's a release candidate, so he's done exactly as you recommended, except N+1 instead of 2. Nobody fixes an N+2 problem anyway.
Re: (Score:2)
It's not new functionality.
That's unchanged from at least ~2016, which is the oldest conveniently retrievable:
https://www.kernel.org/doc/htm... [kernel.org].
Also, there are old commits replacing spaces with tabs. Here's one from 2016:
https://patchwork.kernel.org/p... [kernel.org]
I guess if Slashdot needs to go for
Re: (Score:2)
Likely the parser was written based on an assumption because of well established prior practice. Basically for the entire history except in a few cases spaces have been used. Nowhere is it explicitly written that tabs and spaces can be used interchangeably, or what the tab means semantically speaking to the config.
Re: (Score:3)
except in a few cases spaces have been used
One of those cases being the instance that was requested (but not approved) to remove a tab from a Kconfig file.
Implying that there has been one there in the past. And unless the "third party parsers" referred to in TFS are brand new, they must have been handling tabs correctly in the past. Now they crash. Someone either jumped the gun with a change to their parser code. Or they are trying to force the issue by crying that Linus "broke my stuff".
I imagine that Linus is trying to prove one of two possible
Re: (Score:2)
How many spaces should a tab be converted to, especially in a file that already has tabs and spaces?
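One way to see why there is no single answer: a tab advances to the next tab stop, so the number of spaces depends on the column it starts in and on the assumed tab width. A small sketch using Python's built-in `str.expandtabs`; the widths of 8 and 4 are just example values.

```python
def expand(line: str, tab_width: int) -> str:
    """Replace each tab with spaces up to the next multiple of tab_width."""
    return line.expandtabs(tab_width)


# The same single tab becomes a different number of spaces each time.
print(repr(expand("a\tb", 8)))    # tab starts at column 1, stops at column 8
print(repr(expand("abc\tb", 4)))  # tab starts at column 3, stops at column 4
print(repr(expand("\tb", 4)))     # tab starts at column 0, stops at column 4
```

That is why treating a tab as "some whitespace" is simpler for a parser than converting it to a fixed count of spaces.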
Re:Why (Score:5, Informative)
What is the point of this. To show your parser is broken?
The summary is missing one very important detail: there used to be tabs in a KConfig file, and the people who wrote that broken parser submitted a patch to remove the tabs instead of fixing their parser. Linus merely put back something that used to be there.
Re: (Score:1)
All bugs are shallow right? Thank goodness we have all these competent eyes on open source.
Re: (Score:3)
What a sad take. Because "enterprise" users chose to use half-baked code Linus must walk on egg shells? Linus has done exactly the right thing here: make the low caliber programmers fix their crap work. That's what makes Linus special: 99% of the species would have done exactly as you expect and knuckled under to incompetence.
Re: (Score:2)
I had a particular electrical engineering professor on my PhD committee. He told me one time that when his students made spelling mistakes in their code he told them that rather than correcting the spelling mistakes, they should make macros that fix the problem.
I was of the opinion that this was a bad idea. The software engineering prof who was also on my committee almost had a heart attack.
Re: (Score:2)
Indeed. And the Linus mind-set of "good engineering over bowing to incompetent assholes" is exactly what put Linux on the landscape and keeps it there.
Re: (Score:2)
The problem is that people start submitting (as in this case) patches to the kernel for their preferred styles despite the fact that the tabs exist in various other locations (you just didn't happen to need them, so they were ignored).
It requires time and effort from maintainers for what is basically a stylistic decision. Just like someone converting the entire codebase to their preferred 3-spaces-per-indent and then submitting that as a patch would not help anyone.
Re: (Score:2)
plandemic.
Really, you casually slip that into an unrelated conversation? 7 million people died. Freedom of speech is precious but some people really don't deserve it.
Re: (Score:2)
Freedom of speech is precious but some people really don't deserve it.
Agreed. The only reason even inane cretins must have freedom of speech is because all mechanisms to limit the noise (or worse) they produce are too easily abused.
Re: (Score:2)
Found the lazy/shitty programmer that couldn't be bothered to add a trivial additional check for tabs. /s
This isn't fucking rocket science but basic 101 Computer Science.
Re: Why (Score:2)
Use the language.
The isspace() function has been part of the language for 35 years, and covers all six standard whitespace values if you are processing each individual letter. Or, use the formatted i/o functions that can skip over all the whitespace characters.
No need to write your own and miss any.
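For reference, the six characters C's isspace() accepts in the default "C" locale are space, \t, \n, \v, \f and \r. A rough Python equivalent of the "skip over whitespace" idea, using a hypothetical helper named `skip_whitespace`:

```python
import string

# The six characters C's isspace() accepts in the "C" locale.
C_WHITESPACE = " \t\n\v\f\r"


def skip_whitespace(text: str, pos: int = 0) -> int:
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in C_WHITESPACE:
        pos += 1
    return pos


# Python's string.whitespace happens to contain the same six characters.
print(set(C_WHITESPACE) == set(string.whitespace))
print(skip_whitespace(" \t\v value"))
```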
Re: (Score:2)
Which (ironically?) both VT and FF are missing from JSON [json.org]. /shrug
Re: (Score:2)
Indeed. But for that you need to know it exists. Apparently there are complete fuckups out there that write frigging kernel config parsers and do not know this very basic thing.
On a related note, I have noticed that a lot of "network security elements" got hacked in the last few years. It seems that "cretin level" coders are routinely used for security-critical stuff these days.
Re: (Score:2)
Not if your code might ever be used on Solaris where the function is f-ed up at least in en_US.UTF-8. Or, on better systems but with wrongly set locale. Then, should it handle whitespace values outside 7-bit? I'm saying "7" because U+A0 might or might not be handled by isspace().
With all such mess, it's safer to code your own check, even though you risk missing '\v' and '\f'. But hey, by now these characters are not whitespace, they're garbage.
Faster, too -- you avoid loading locale tables from the disk
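A sketch of the hand-coded, locale-independent check argued for here, next to a library predicate that answers differently for U+00A0. Python's `str.isspace` stands in for a locale- or Unicode-aware isspace(); `is_blank` is a made-up name.

```python
# Only the two characters this hypothetical format treats as separators.
ASCII_BLANKS = frozenset(" \t")


def is_blank(ch: str) -> bool:
    """Explicit check: does not depend on locale or Unicode tables."""
    return ch in ASCII_BLANKS


for ch in (" ", "\t", "\v", "\u00a0"):
    # The two predicates agree on space and tab, and disagree on the rest.
    print(repr(ch), is_blank(ch), ch.isspace())
```

The trade-off is the one stated above: the explicit set is predictable, but anything left out of it (such as '\v' and '\f') is silently treated as non-whitespace.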
Re: (Score:2)
Linus was never forced to step down.
Indeed. Linux was confronted by some virtue signalling "woke" assholes and just decided to not give them a fight because his time is a million times more valuable than theirs. Hence he stepped out of the center of attention for a bit but never let the rains go. Smart person.
Re: (Score:1)
Indeed. Linux was confronted by some virtue signalling "woke" assholes
Linus agreed he was going a bit over the top sometimes in his interactions and dealt with it.
Only morons like you make it into some culture wars "wokeist" shit. You're fucking obsessed.
BTW You need a new word or phrase, you've worn that one out now, like you did with "PC" and "SJW". Now it just means "anything the far right doesn't like".
Re: (Score:2)
... but never let the rains go.
I read this imagining Linus as Thor summoning a thunderstorm.
Re: (Score:2)
Well, if there was any actual recognition of merit, he would be able to.
Re: (Score:2)
The point of this is to make sure that some lazy and incompetent assholes fix their act. Not being able to handle tab as whitespace is beyond broken. I guess the people that have trouble dealing with tabs in kernel config files use the cheapest and most incapable excuses for "coders" they could find.
This is 100% on whoever cannot handle the tab, and that this is even necessary shows how fundamentally broken some organizations are.
Re: Why (Score:2)
Enterprises can depend on it precisely because there are people forcing the parsers to follow the standard, rather than just some random made up garbage.
Re: Why (Score:3)
If someone wrote some code without test cases in a regular nightly build, that's not enterprise level. But Linus was nice enough to write some tests for FREE.
Re: (Score:2)
If your enterprise depends on a product that can't parse a textfile correctly without appropriate and simplistic sanity-checking, you absolutely and desperately need a new product for your enterprise.
And if that product says "Oh, we can't do that, because it's undocumented and the format could change at any time, so no warranty for that..." take that as a hint.
I would guess that absolutely nobody is paying the kernel team to solve their boo-boos with their third-party, out-of-tree, unnecessary KConfig parse
dramatic (Score:2)
Re: (Score:2)
Well he wants the breakage to be noticed so that coders will fix their parser bugs. So in this case, yes, "dramatic" is an appropriate word. Maybe not the best choice, but it works. If the breakage was subtle and minor, it might go unnoticed by some of the authors of the bad parsers. At least that's how I read it.
Re: dramatic (Score:2)
Sometimes creating a hard break is the only way to get people to make the necessary code updates.
I'll intentionally do this in my own code sometimes, so I know what I need to be fixing. Treat this as a pressurisation test.
Re: (Score:2)
Nothing "dramatic" here. Just stupid writing that tries to make everything earth-shattering, as is so common today.
Unix philosophy (Score:2)
Can anyone give a reference for Unix on tabs and spaces? Found this while trying to make a Python point: "Tabs or spaces, but you have to handle both".
Re: (Score:2)
man isspace_l
Re: (Score:2)
IME many things expect only spaces, but most config files with lots of spaces will tolerate tabs. The really core classic UNIX stuff either expects spaces or uses some other delimiter entirely, and possibly even doesn't have indentation or comments. It was designed for brevity.
How incompetent can you be? (Score:5, Insightful)
These people must be the most stupid or lazy fucks. Obviously, tabs are whitespace. Obviously, you need to be able to handle them in a config file.
I am fully with Linus on this one. These idiots need to clean up their act.
Re: (Score:3)
It's a perennial problem with text handling. Made worse by Unicode, which now has many different space characters.
This is what libraries are for. You have a kernel config parsing library so that it only needs fixing in one place.
Re: (Score:3)
Unfortunately, that is not true. Unicode does indeed have "many different space characters": https://unicode-explorer.com/a... [unicode-explorer.com]
Cost me 2 minutes with a search engine to confirm.
Re: (Score:2)
Turns out there's lots of whitespace [wikipedia.org]
Why (Score:3)
Is there more than one parser for KConfig?
Re: (Score:3)
Is there more than one parser for KConfig?
The other's were written by a group who call themselves "The nights who saaaaay NIH"
Re: (Score:2)
*knights. WTF does english have words which sound the same but are spelled differently.
Laziness (Score:2)
1. We stopped saying the letters, for laziness.
2. We didn't update the spelling, for laziness.
Nothing is stopping you from going around saying "K'Nih-gh (gurgle sound)-T". Or go with writing "nite" for both.
But what about that vowel-consonant-e thing? English should have a way of denoting a long letter without having to snuggle up to a consonant and have a third-wheel e tagging along, "I'm silent," Yeah, that's what they all say at first. But you weren't, were you, "so called" "silent"-e? It took those sno
Re: (Score:2)
Nothing is stopping you from going around saying "K'Nih-gh (gurgle sound)-T". Or go with writing "nite" for both.
Shame. Shame is stopping me from saying K'nights :-)
Re: (Score:2)
English is my first language (and only, as I suck at learning more), but I have trouble spelling... stuff like this. I always try to look up comic book sounds, e.g. argh and the like. A Monty Python quote will tend to be confusing to non-native English speakers, and honestly, I find it hilarious, but I'm not sure "I get it". I love British comedies but I don't get everything the first time.
As for the article, seems like an easy way to prevent lazy parsing. I would hope Linus has laid out some idea/plan
Re: (Score:2)
Yeah I'm a non-native speaker. ... And thanks for not pointing out the additional grammatical mistake I made. XD
Dick move, should have done it with unit tests (Score:3)
Big project, warned another dev not to take some lazy ass approach and create a God object instead of doing it properly.
I added a unit test that counted fields under some circumstances.
Test would run fine even in dev scenarios, the minute it made it to main line CI all hell broke loose.
Test wasn't pointless btw, it said something akin to "Don't add new fields to this object, you'll break the flaky deserialisation in (external system) and you'll also cause issues with size of the object in (xyz scenarios)."
That developer was exceptionally pissed with me because his feature didn't go out that sprint. Had it gone out, it would have destabilised the already flaky as fuck, tech debt ridden prod system relied on by thousands of people. So he had to change his solution, since the error was now visible to all other project devs. That's what Linus should have done, and blocked it from mainline merge.
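A guard test of the kind described might look roughly like this. Everything in it is hypothetical: the `Order` class, the budget of 3 fields, and the helper name `check_field_budget` are stand-ins, since the original project's code is not shown.

```python
from dataclasses import dataclass, fields

MAX_FIELDS = 3  # hypothetical budget agreed for this object


@dataclass
class Order:  # hypothetical stand-in for the object being protected
    order_id: int
    customer: str
    total_cents: int


def check_field_budget(cls, limit: int = MAX_FIELDS) -> int:
    """Fail loudly if someone adds fields beyond the agreed budget."""
    count = len(fields(cls))
    if count > limit:
        raise AssertionError(
            f"{cls.__name__} has {count} fields (limit {limit}): "
            "don't add new fields to this object, see the notes on "
            "downstream deserialisation and object size"
        )
    return count


print(check_field_budget(Order))
```

The failure message carries the explanation, so whoever trips the check in CI sees the reason and not just a red build.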
Commit link (Score:2)
I suppose the file format is documented? (Score:2)
Re: (Score:2)
If it's undocumented, third-party integrators should be asking for such and/or KNOWING that their integration could break at any time and that they should therefore do a lot more sanity-checking.
Tabs vs Spaces (Score:2)
Small brain: Four spaces
Large brain: Tabs
Galaxy brain: Both
Re: Linus, the benevolent Linux asshole for life (Score:4, Insightful)
If your team isn't following through with what they need to do, then maybe being an asshole is a necessary evil to get their act together.
Now if I could find the secret sauce to get team members to review each other's code.
Re: (Score:2)
I do not even see him being an asshole here at all. He just helpfully provides a test-case that apparently the real assholes are not testing their software against.
Re: (Score:2)
I'd agree with that too, but unfortunately people who aren't happy with the change will use the label. Change is hard, but dealing with breaking software is even harder.