Cloud Networking Open Source Software Upgrades Linux

Pushing the Limits of Network Traffic With Open Source

An anonymous reader writes: CloudFlare's content delivery network relies on their ability to shuffle data around. As they've scaled up, they've run into some interesting technical limits on how fast they can manage this. Last month they explained how the unmodified Linux kernel can only handle about 1 million packets per second, when easily-available NICs can manage 10 times that. So, they did what you're supposed to do when you encounter a problem with open source software: they developed a patch for the Netmap project to increase throughput. "Usually, when a network card goes into the Netmap mode, all the RX queues get disconnected from the kernel and are available to the Netmap applications. We don't want that. We want to keep most of the RX queues back in the kernel mode, and enable Netmap mode only on selected RX queues. We call this functionality: 'single RX queue mode.'" With their changes, Netmap was able to receive about 5.8 million packets per second. Their patch is currently awaiting review.

  • by Anonymous Coward

    must be thoroughly considered. CloudFlare is the greatest man-in-the-middle on the Internet, and don't think for a second they're not collaborating with U.S. agencies that want to get at sensitive data going through their systems.

    • by Anonymous Coward

      "Prince and his team were inspired to start the company after a call from the Department of Homeland Security." (quote from article, not my opinion)

      Interesting take on it?

  • by JoeyRox ( 2711699 ) on Saturday October 10, 2015 @03:13PM (#50700007)
    If they only need to "shuffle" packets around (i.e., not crack open the frames and actually interpret the data beyond making routing decisions) then routers/switches are better suited for this. If they actually need to do something more with the data then that quoted rate of 5.8 million packets/sec will drop very quickly with each line of code they add that does anything with the data.
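To put the rates in this thread in perspective, here is a rough sketch of the time budget per packet. The three rates are the ones quoted in the summary; the function name is made up for illustration, and the sketch assumes a single core handles every packet one after another.

```python
def ns_per_packet(packets_per_second: float) -> float:
    """Nanoseconds available per packet if one core handles packets serially."""
    return 1e9 / packets_per_second


# Rates quoted in the story summary.
rates = [
    ("unmodified kernel", 1e6),
    ("patched Netmap", 5.8e6),
    ("commodity NIC", 10e6),
]

for label, pps in rates:
    print(f"{label}: about {ns_per_packet(pps):.0f} ns per packet")
```

At 5.8 million packets/sec the budget is well under 200 ns per packet, so any extra per-packet work eats a visible share of it.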
    • by raxx7 ( 205260 )

      Their goal is to receive the packets into their own user space analysis software and drop most of them (as being a flood attack).
      Their problem is that, using the existing methods, they can't get more than ~1 M packets/s into their software.

      I guess they are not using dedicated router hardware because there's no way to run their software on it.
      At which point, maybe they need a piece of kit based on Cavium's chips (lots of low-performance cores).

      • by AaronW ( 33736 ) on Saturday October 10, 2015 @06:20PM (#50700619) Homepage

        I work at Cavium on the SDK team (I do all the bootloader stuff for their MIPS chips). The Ubiquiti Edgerouter Lite uses one of our old (2nd gen CN5020) low-end dual core chips and is able to handle 1M packets/second by running the packet processing on a dedicated core and Linux on the other core. Our current generation (4th gen) is far faster. I work with chips from 4 up to 48x2 cores (48 cores, 2 chips running in NUMA). There's a lot of support for offloading packet processing in our chips, for example, directing packet flows to different groups of CPU cores. There's also various engines built-in to the chips for things like compression, pattern matching, deep packet inspection, encryption, RAID calculations and more. We also are selling NIC cards (Liquid I/O) which can run Linux on the NIC card as well as dedicated software that can offload a lot. For example, it can perform all the SSL, VPN and firewall stuff on the NIC. I'm working on some of the new ones now. I'd love to see some inexpensive eval boards available, especially with our CN73XX or even CN70xx chip. Even our low-end quad core CN71xx can handle 10Gbps of traffic.
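The idea of directing packet flows to different groups of CPU cores can be illustrated in software. This is only a sketch: the function name is hypothetical, and CRC32 is used as a stand-in hash, not what the chips described above (or any particular NIC) actually compute. The point is that hashing a flow's 5-tuple sends every packet of that flow to the same core.

```python
import zlib


def core_for_flow(src_ip: str, dst_ip: str, src_port: int,
                  dst_port: int, proto: int, num_cores: int) -> int:
    """Map a flow's 5-tuple to a core index in the range [0, num_cores)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    # CRC32 is an illustrative choice; real hardware uses its own hash.
    return zlib.crc32(key) % num_cores


# Two packets of the same TCP flow (protocol 6) land on the same core.
first = core_for_flow("192.0.2.1", "198.51.100.7", 40000, 443, 6, num_cores=4)
second = core_for_flow("192.0.2.1", "198.51.100.7", 40000, 443, 6, num_cores=4)
print(first == second)
```

Keeping a flow on one core avoids reordering its packets and keeps per-flow state in that core's cache.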

        • I'm curious: I just looked and I don't see any LiquidIO 40G adapters. Am I just missing them on the website? The ones I found seem nice, but the Mellanox 40G with an FPGA chip seems nicer, as 10G is kind of out of date at this point in the server market.
    • You do realize that routers are made out of software too, right?
  • SystemD (Score:3, Funny)

    by El_Muerte_TDS ( 592157 ) on Saturday October 10, 2015 @04:08PM (#50700181) Homepage

    Wouldn't it just be easier to put this in systemd?

  • by AaronW ( 33736 ) on Saturday October 10, 2015 @06:02PM (#50700555) Homepage

    My employer deals with this on their multi-core MIPS processors. What we do is we can run Linux on one set of cores and dedicated applications on other cores. These applications offload most of the TCP/IP stack and only pass the relevant traffic to the kernel. The Ubiquiti EdgeRouter Lite uses one of our lowest-end chips and handles 1M packets/second. Our higher-end chips can easily handle far more packets. Then again, the dedicated cores are also able to take much better advantage of the hardware offload support for forwarding and filtering. Even without using the dedicated special application we can handle 40Gbps or more of traffic on the high-end chips. We can also handle stuff like IPSec at these rates due to built-in encryption and hashing instructions if coded properly.

    Having the right NIC card can also help since some NIC cards can offload things like TCP/IP segmentation and reassembly. I've also dealt with small gigabit switch chips that can offload stuff like NAT but Linux can't really take advantage of that as-is.

    There's a lot of room for improvement. Some years ago I was doing performance analysis for Atheros with respect to CPU cache utilization. The biggest bottleneck was the fact that the transmit path in the Linux networking stack would only pass a single packet at a time. Batch processing of packets for WiFi makes a HUGE difference since groups of packets need to be aggregated for 802.11n. It would also allow for more efficient packet processing for non-wireless traffic. There are a lot of other areas that could be improved too.
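A toy model of the batching point, assuming nothing about the real Linux transmit path: the helper names below are invented for illustration. It simply counts how many calls into a transmit routine are needed when packets are handed down one at a time versus in groups.

```python
from typing import Iterable, Iterator, List


def batches(packets: Iterable[bytes], max_batch: int) -> Iterator[List[bytes]]:
    """Group packets into lists of at most max_batch, preserving order."""
    batch: List[bytes] = []
    for pkt in packets:
        batch.append(pkt)
        if len(batch) == max_batch:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


def transmit_calls(num_packets: int, max_batch: int) -> int:
    """Number of transmit calls needed for num_packets at a given batch size."""
    packets = (b"" for _ in range(num_packets))
    return sum(1 for _ in batches(packets, max_batch))


print(transmit_calls(64, 1), transmit_calls(64, 16))
```

Each call carries fixed overhead (function calls, locking, cache misses), so fewer calls per packet means less overhead per packet, which is the effect described above.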

    • by Bengie ( 1121981 )
      A 900 MHz single-core x86 CPU can handle 14 million pps, but only if using Netmap or some other decent network API/stack.
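The 14 million figure matches the line rate of 10 Gigabit Ethernet with minimum-size frames; reading the comment that way is an assumption on my part. A quick sketch of the arithmetic, using the standard Ethernet framing overheads (the function and variable names are illustrative):

```python
MIN_FRAME_BYTES = 64      # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8    # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12 # mandatory idle time between frames


def line_rate_pps(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Maximum frames per second a link can carry at a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


pps = line_rate_pps(10e9)        # 10 Gbps link, 64-byte frames
cycles_per_packet = 900e6 / pps  # clock cycles per packet on a 900 MHz core
print(f"{pps / 1e6:.2f} Mpps, {cycles_per_packet:.1f} cycles per packet")
```

That works out to roughly 60 clock cycles per packet at 900 MHz, which is why the claim only holds when the per-packet software path is extremely short.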
    • You can see it clearly if you take a managed switch apart. There are usually two large chips. A very big one that connects to all the interfaces and does the actual switching logic with specialised silicon, and a much smaller x86 or ARM processor that runs the management software.

  • Real switching, the high-speed carrier-grade stuff, is more about hardware ASICs than software. It's comparatively very expensive to route subnet or VLAN traffic in software because the CPU on most machines isn't quick enough once bus overhead is counted. Cisco and others own a monopoly on the ultra-high-speed ASIC-enabled hardware used by CloudFlare and others. Modern virtual switching hardware is fast enough to crush practically any consumer hardware.
