
Linux Virtual Ethernet Bug Delivers Corrupt TCP/IP Data (vijayp.ca) 40

jones_supa writes: Vijay Pandurangan from Twitter warns of a Linux kernel bug that causes containers routing their traffic through virtual Ethernet (veth) devices to skip TCP checksum verification. Software stacks that use veth devices include Docker on IPv6, Kubernetes, Google Container Engine and Mesos. Because of the flaw, applications can receive corrupt data in a number of situations, such as when networking hardware is faulty. The bug dates back at least three years: it is present in kernels as far back as the Twitter engineering team has tested. Their patch has been reviewed and accepted into the kernel, and is currently being backported to -stable releases back to 3.14 in various distributions. If you use containers in your setup, Pandurangan recommends that you deploy a kernel with this patch.


  • by Anonymous Coward on Monday February 22, 2016 @01:40PM (#51559741)

    After ten years and billions of dollars, Twitter has finally contributed something useful to society.

  • by Anonymous Coward

    Given that they wantonly violate people's civil liberties by shadowbanning Twitter accounts of people they deem politically incorrect while continuing to allow tweets from known terrorist groups - Twitter's patch should NOT be accepted in accordance with their own code of conduct.

  • by Verdatum ( 1257828 ) on Monday February 22, 2016 @02:23PM (#51560201)
    I was under the impression that virtual ethernet devices intentionally don't bother verifying checksums, because they were intended to be used in situations where there is very little probability of the data being corrupted.
    • This. How do you get corrupted data from bad networking hardware into a virtual machine, without it passing through a real NIC first?
      • by Anonymous Coward on Monday February 22, 2016 @02:49PM (#51560491)

        Most NICs don't drop packets with bad L3/L4 checksums. Instead they flag them as bad and pass them to software, and the packet doesn't get checked until it hits the TCP/IP stack. The problem is that in this configuration, the packet arrives at the physical NIC and is flagged as bad, but when it is passed through the veth device that flag is intentionally cleared, and only after passing through the veth device does it hit the TCP/IP stack. Because the checksum was marked as good, the stack trusts it and passes the data up to the socket.
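
To make the sequence described above concrete, here is a toy Python model of it. It is only a sketch, not kernel code: `Packet`, `veth_forward`, `stack_receive` and the `verified` flag are hypothetical names invented for illustration, and the checksum is computed over the payload alone rather than over the real TCP pseudo-header and segment.

```python
from dataclasses import dataclass, replace
from typing import Optional


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style used by TCP/UDP/IP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


@dataclass(frozen=True)
class Packet:
    payload: bytes
    checksum: int
    verified: bool = False  # hypothetical stand-in for "checksum already known good"


def veth_forward(pkt: Packet, buggy: bool) -> Packet:
    # Assumed buggy behaviour, per the comment: mark the checksum good unconditionally.
    # Assumed fixed behaviour: leave the packet's checksum state untouched.
    return replace(pkt, verified=True) if buggy else pkt


def stack_receive(pkt: Packet) -> Optional[bytes]:
    # The stack only verifies in software when nothing upstream vouched for the packet.
    if not pkt.verified and internet_checksum(pkt.payload) != pkt.checksum:
        return None  # dropped as corrupt
    return pkt.payload


sent = b"hello, container"
# One byte is damaged in transit; the checksum still describes the original data.
on_wire = Packet(payload=b"hellp, container", checksum=internet_checksum(sent))

delivered_buggy = stack_receive(veth_forward(on_wire, buggy=True))
delivered_fixed = stack_receive(veth_forward(on_wire, buggy=False))
print("buggy veth delivers:", delivered_buggy)
print("fixed veth delivers:", delivered_fixed)
```

In this model the only difference between the two paths is whether the veth hop vouches for a checksum that nobody actually checked.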

        • by butlerm ( 3112 )

          Most NICs don't drop packets with bad L3/L4 checksums

          Traditionally, NICs do not even "know" that there is such a thing as Layer 3, let alone check it in any way. L3 checksum validation is a bonus feature.

          Bad L3 checksums tend to be caused by defective networking hardware, and in this case the defective networking hardware of the recipient. If you are using checksum validation offload, ignoring the result in the presence of defective hardware isn't likely to make a difference either way.

  • by Anonymous Coward

    Any application that checked for proper data (encrypted links, ssh, etc) would have automatically been protected from this.

    And, any attacker with access to the local network can already craft arbitrary TCP or other data and calculate a 'proper' checksum to have data pass up the stack.

    So, I'm glad it's fixed...but hard to see why this made it to Slashdot!
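
As a rough illustration of what "checking for proper data" at the application layer can look like, here is a minimal Python sketch using a keyed MAC from the standard library. The `seal` and `open_sealed` helpers and the key are hypothetical, and this is not how ssh or TLS frame their data; it only shows that a keyed tag catches corruption regardless of what the transport delivers, and that someone without the key cannot simply recompute it the way a TCP checksum can be recomputed.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def seal(key: bytes, message: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the message (hypothetical framing)."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Return the message if its tag verifies, otherwise raise ValueError."""
    message, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("integrity check failed")
    return message


key = b"example-shared-key"  # assumption: both ends already share this key
blob = seal(key, b"rows 1-500 of the nightly export")

# Simulate a single flipped bit somewhere on the path.
corrupted = bytes([blob[0] ^ 0x01]) + blob[1:]

print(open_sealed(key, blob))
try:
    open_sealed(key, corrupted)
except ValueError as err:
    print("rejected:", err)
```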

    • by AaronW ( 33736 )

      A lot of traffic is sent unencrypted because encryption just isn't needed. You don't get encryption for free in most cases since it requires a fair amount of CPU overhead to implement it and/or additional hardware, plus there's all the overhead of setting up an encrypted link. Within a LAN, encryption usually isn't required for most of the data being sent.

      As for corrupting packets, I had a setup in my cubicle a few weeks ago running 10G traffic where I could corrupt packets on request by switching the fluor

  • It is not the operating system's job to route packets. It's the startup daemon, silly

  • It turned out to be the fault of the VM and functionality offloading. See here: http://stackoverflow.com/quest... [stackoverflow.com]

  • The only purpose of the checksum is to increment a universally ignored error counter so operators know to replace broken hardware.

    TCP checksums are wholly insufficient to prevent corruption of TCP streams at anything resembling a useful rate. It went unnoticed for years because checksums are irrelevant.
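
One well-known reason the checksum is weak can be shown in a few lines of Python. This is a sketch of the 16-bit ones' complement sum applied to bare bytes (the real TCP checksum also covers a pseudo-header and the TCP header), and it says nothing about how often such errors happen in practice. It only shows two kinds of corruption the sum cannot see: reordered 16-bit words and errors that cancel each other out.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style used by TCP/UDP/IP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes.fromhex("1234abcd")
swapped = bytes.fromhex("abcd1234")      # the two 16-bit words traded places
compensated = bytes.fromhex("1235abcc")  # one word +1, the other -1
single_error = bytes.fromhex("1235abcd")  # one word +1, nothing cancels it

for name, data in [("original", original), ("swapped", swapped),
                   ("compensated", compensated), ("single_error", single_error)]:
    print(f"{name:13s} {data.hex()} -> {internet_checksum(data):#06x}")
```

The first three inputs produce the same checksum because addition does not care about word order or about errors that sum to zero; only the uncompensated single error changes the result.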

  • I make a local cache of Debian packages on one of my VMs, using apt-mirror. From time to time one of the packages would fail its checksum - reloading it from the offsite source would usually work. When I changed the VM's ethernet device to a virtual e1000, the problems went away. I later found an interesting cabling issue that was causing transmission errors between a switch and the physical host.

  • How many eyes were looking at the Virtual Ethernet feature/code?

    Clearly, not enough.

    I've said it before and I say it again. You need enough QUALIFIED and MOTIVATED eyes. You also need clear QA test cases in order to render all bugs shallow.

  • Does anyone know if XenServer uses this functionality?
