Linux Virtual Ethernet Bug Delivers Corrupt TCP/IP Data (vijayp.ca)
jones_supa writes: Vijay Pandurangan from Twitter warns about a Linux kernel bug that causes containers using Virtual Ethernet devices for network routing to not check TCP checksums. Examples of software stacks that use Virtual Ethernet devices are Docker on IPv6, Kubernetes, Google Container Engine and Mesos. The kernel flaw results in applications incorrectly receiving corrupt data in a number of situations, such as with bad networking hardware. The bug dates back at least three years; it is present in kernels as far back as the Twitter engineering team has tested. Their patch has been reviewed and accepted into the kernel, and is currently being backported to -stable releases back to 3.14 in various distributions. If you use containers in your setup, Pandurangan recommends that you deploy a kernel with this patch.
Re: (Score:2, Funny)
Corrupt Data? I thought his name was Lore.
Re: corrupt data (Score:1)
Better late than never (Score:4, Funny)
After ten years and billions of dollars, Twitter has finally contributed something useful to society.
Twitter's fix should be denied (Score:1, Interesting)
Given that they wantonly violate people's civil liberties by shadowbanning Twitter accounts of people they deem politically incorrect while continuing to allow tweets from known terrorist groups - Twitter's patch should NOT be accepted in accordance with their own code of conduct.
There's a term for that (Score:4, Insightful)
"Cutting off your nose to spite your face."
Re: (Score:3)
My experience has been that the TCP checksums are fairly useless - they can detect single bit errors only since they are just simple checksums, not CRCs or something more sophisticated. According to the article what was actually happening was that the virtual ethernet driver (veth) did not flag bad packets correctly. There's a flag that tells TCP there's no need for it to checksum since the hardware has already verified the packet. On errors, the veth driver set that flag instead of the one that says it
Re: (Score:3)
How often does the TCP/UDP checksum detect errors that the previous two could not?
TCP/UDP checksums are useful for one thing primarily - mitigating the effect of defective network hardware. That is about the only thing that can cause a transport level checksum error. Anything else is caught with a very high probability by Layer 2 protocols, which typically use a 32 bit CRC. Some Layer 2 protocols do have relatively weak checksums, but not so weak that TCP checksums are likely to catch much more than they
Re: (Score:2)
How often does the TCP/UDP checksum detect errors that the previous two could not?
May I remind the distinguished audience that IPv6 does NOT have a Header checksum. Therefore, on IPv6, TCP/UDP/SCTP checks are MANDATORY in all cases (UDP checks were optional in IPv4; the guys doing VoIP are jumping for joy </sarcasm> about it...).
One REALLY NEEDS to do those checks.
(Computer networks teacher speaking here).
Re: (Score:2)
How often does the TCP/UDP checksum detect errors that the previous two could not?
May I remind the distinguished audience that IPv6 does NOT have a Header checksum. Therefore, on IPv6, TCP/UDP/SCTP checks are MANDATORY in all cases...
One REALLY NEEDS to do those checks.
(Computer networks teacher speaking here).
May I remind the distinguished teacher that (a) the checksums in TCP and UDP are lame compared to CRC and (b) they are irrelevant given a sufficiently robust data link layer. As I said in another post just above, TCP and UDP originally included checksums because IP was being carried over lame data links, and so the checksums were a bit of "belt and suspenders". Very few data link protocols today lack a robust CRC, so the checksums are anachronistic.
The particular issue in this topic seems to be not that t
Re: (Score:2)
I worked with very early internet technology in the 1990's. Back then, the network chip was on a separate board just like the graphics card. The MAC address had to be uploaded into flash memory on startup. These could blow up given the right conditions, then the card would either just keep blasting out random packet data, or traffic collisions would result in fragment packets (less than the minimum size) going out. Some early day drivers would pick up these packets. Filtering was done in software. Now with th
Re: (Score:2)
I once ran into problems where a server with a certain Intel chip would sometimes corrupt data over the PCI bus. I had to put a check in my driver to detect that chip and turn off a major PCI optimization if that one chip was detected. CRC errors would not detect it because that was handled in the network adapter. At the time some of my tests were with NetBEUI, which has no L3/L4 checksums, and resulted in corrupt files (which were detected by the test scripts).
I've run into a number of times where data ge
Re: (Score:2)
I started working on transport protocols, and I always wondered about this:
ethernet has its crc, ipv4 has its crc, how often does the TCP/UDP checksum detect errors that the previous two could not?
TCP was finalized in 1981, long before modern Ethernet was around. TCP was originally developed for ARPAnet which used various longhaul communications technologies (think "modems") to interconnect sites. In those days, communications hardware usually did not have CRC or any other checksum checking. So TCP did its (simplistic) checksum to provide some protection.
IPv6 does not have the checksum, but the ethernet one is still there.
IPv6 came along much later, after Ethernet (and long-haul communications) had advanced to the point where CRC protection was a standard expectati
I could have sworn this was intentional (Score:5, Interesting)
Re: (Score:2)
Re:I could have sworn this was intentional (Score:5, Informative)
Most NICs don't drop packets with bad L3/L4 checksums. Instead they flag them as bad and pass them to software, and the packet doesn't get checked until it hits the TCP/IP stack. The problem is that in this configuration, the packet arrives at the physical NIC and is flagged as bad, but when it is passed through the veth device that flag is intentionally cleared, and only after passing through the veth device does it hit the TCP/IP stack. Because the checksum was marked as good, the stack trusts it and passes the data up to the socket.
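To make the flag handling described above concrete, here is a minimal toy model in Python. It is only a sketch: `CsumState`, `Packet`, `veth_forward` and `stack_deliver` are hypothetical names invented for illustration, not kernel identifiers, and the model ignores everything about real packets except whether the checksum matches and what the receive path has been told about it.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class CsumState(Enum):
    # Illustrative labels only; these are not the kernel's own names.
    UNVERIFIED = "stack must verify the checksum itself"
    VERIFIED = "checksum already verified, stack may skip the check"


@dataclass(frozen=True)
class Packet:
    payload: bytes
    checksum_matches: bool                    # ground truth for this toy model
    state: CsumState = CsumState.UNVERIFIED   # what the receive path believes


def veth_forward(pkt: Packet, buggy: bool) -> Packet:
    """Hand a packet across a hypothetical virtual ethernet pair."""
    if buggy:
        # The behaviour described in the comment above: the packet is
        # re-marked as verified regardless of what the physical NIC reported.
        return replace(pkt, state=CsumState.VERIFIED)
    # Assumed fixed behaviour: leave the marking alone.
    return pkt


def stack_deliver(pkt: Packet) -> Optional[bytes]:
    """Return the payload if the stack accepts the packet, else None."""
    if pkt.state is CsumState.VERIFIED:
        return pkt.payload                    # trusted, no software check
    return pkt.payload if pkt.checksum_matches else None


corrupt = Packet(b"garbled", checksum_matches=False)
print(stack_deliver(veth_forward(corrupt, buggy=True)))   # corrupt data reaches the socket
print(stack_deliver(veth_forward(corrupt, buggy=False)))  # packet is dropped
```

In this model a corrupt packet only reaches the application when the forwarding step overwrites the marking, which is the failure mode the comment describes.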
Re: (Score:2)
Most NICs don't drop packets with bad L3/L4 checksums
Traditionally, NICs do not even "know" that there is such a thing as Layer 3, let alone check it in any way. L3 checksum validation is a bonus feature.
Bad L3 checksums tend to be caused by defective networking hardware, and in this case the defective networking hardware of the recipient. If you are using checksum validation offload, ignoring the result in the presence of defective hardware isn't likely to make a difference either way.
Good it's fixed, but not too bad of a bug. (Score:1)
Any application that checked for proper data (encrypted links, ssh, etc) would have automatically been protected from this.
And, any attacker with access to the local network can already craft arbitrary TCP or other data and calculate a 'proper' checksum to have data pass up the stack.
So, I'm glad it's fixed...but hard to see why this made it to Slashdot!
Re: (Score:3)
A lot of traffic is sent unencrypted because encryption just isn't needed. You don't get encryption for free in most cases since it requires a fair amount of CPU overhead to implement it and/or additional hardware, plus there's all the overhead of setting up an encrypted link. Within a LAN, encryption usually isn't required for most of the data being sent.
As for corrupting packets, I had a setup in my cubicle a few weeks ago running 10G traffic where I could corrupt packets on request by switching the fluor
All the more reason to use systemD (Score:1, Flamebait)
It is not the operating system's job to route packets. It's the startup daemon, silly
I had the same thing (Score:2)
It turned out to be the fault of the VM and functionality offloading. See here: http://stackoverflow.com/quest... [stackoverflow.com]
It's a feature not a bug (Score:2)
The only purpose of the checksum is to increment a universally ignored error counter so operators know to replace broken hardware.
TCP checksums are wholly insufficient to prevent corruption of TCP streams at anything resembling a useful rate. It went unnoticed for years because checksums are irrelevant.
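How weak the checksum is can be shown in a few lines. The sketch below assumes the usual 16-bit ones'-complement sum used by TCP and UDP (the function name `internet_checksum` and the sample message are made up for illustration) and compares it with CRC-32 from Python's `zlib`. Because the sum is order-independent, swapping two 16-bit words changes the data without changing the checksum, while the CRC does change.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"PAY 100 TO ALICE"
# Swap the first and third 16-bit words ("PA" and "10").
swapped = original[4:6] + original[2:4] + original[0:2] + original[6:]

print(original != swapped)                                        # data differs
print(internet_checksum(original) == internet_checksum(swapped))  # checksum cannot tell
print(zlib.crc32(original) != zlib.crc32(swapped))                # CRC-32 can
```

This only illustrates one class of undetected error; it says nothing about how often such corruption occurs on real hardware.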
I've been bitten by this (Score:2)
I make a local cache of Debian packages on one of my VMs, using apt-mirror. From time to time one of the packages would fail its checksum - reloading it from the offsite source would usually work. When I changed the VM's ethernet device to a virtual e1000, the problems went away. I later found an interesting cabling issue that was causing transmission errors between a switch and the physical host.
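The package checksum failures mentioned above are an application-level integrity check catching what the transport let through. A minimal sketch of that kind of check in Python follows; `sha256_of` and `is_intact` are hypothetical helper names, and the choice of SHA-256 is an assumption for illustration rather than a statement about what apt-mirror does internally.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected_hex: str) -> bool:
    """Compare a downloaded file against a digest published by the source."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A caller would fetch the expected digest over a separate, trusted channel and re-download the file when `is_intact` returns False, much like the reload described in the comment.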
How many eyes were looking at the Virtual Ethernet (Score:2)
How many eyes were looking at the Virtual Ethernet feature/code?
Clearly, not enough.
I've said it before and I say it again. You need enough QUALIFIED and MOTIVATED eyes. You also need clear QA test cases in order to render all bugs shallow.
XenServer? (Score:1)