Multi-Threaded SSH/SCP 228

Posted by kdawson
from the recovering-wasted-bandwidth dept.
neo writes "Chris Rapier has presented a paper describing how to dramatically increase the speed of SCP transfers. It appears that because SCP relies on a single thread in SSH, the crypto can sometimes be the bottleneck instead of the wire speed. Their new implementation (HPN-SSH) takes advantage of multi-core systems, dramatically increasing the speed of securely copying files. They are currently looking for potential users with very high bandwidth to test the upper limits of the system."
  • by sqldr (838964) on Wednesday February 13, 2008 @05:29AM (#22404034)
    Hi, I've invented a new way of downloading pron^H^H^H^H^H^Hcopying files across a network. If you have uber bandwidth, please contact me urgently!
    • Re:A likely story (Score:5, Insightful)

      by slyn (1111419) <ozzietheowl@gmail.com> on Wednesday February 13, 2008 @05:34AM (#22404072)
      Makes you wonder how many innovations can either be directly attributed to or partially attributed to the distribution of porn (not (necessarily) that this is).

      VHS v Betamax comes to mind.
      • Re:A likely story (Score:5, Insightful)

        by shenanigans (742403) on Wednesday February 13, 2008 @06:50AM (#22404426)
        I think this story is interesting because it shows a general trend: increased focus on multi-threading. We will see much more of this in the future as multi-core and multi-processor systems become more common. This trend is driven not by porn though, but by that other big driving force behind the computer industry, gaming.
        • Re:A likely story (Score:5, Insightful)

          by rapier1 (855437) on Wednesday February 13, 2008 @10:08AM (#22405990)
          Actually, this is one of the main reasons why we did this. If you look at where processor architectures are going they aren't necessarily increasing the pure computational power of each core as much as they are using more and more cores. If you are in a situation where you have a single threaded process that is CPU bound - like SSH can be - you'll find that throughput rates (assuming that you aren't network bound) will remain flat. In order to make full use of the available network capacity we'll have to start doing things like this. There is still a lot of work to be done but we're pleased by our progress so far.
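The core-scaling argument above can be sketched concretely. In CTR mode each block's keystream depends only on the key and a block counter, so blocks have no data dependency and can be farmed out to worker threads. Below is a minimal illustration using SHA-256 as a stand-in keystream generator (not SSH's actual cipher code; in CPython the GIL limits any real speedup, so this shows the parallel structure rather than the performance):

```python
# Sketch: CTR-style keystreams are embarrassingly parallel because each
# block depends only on (key, counter). SHA-256 stands in for the cipher.
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32  # digest size of the stand-in keystream generator

def keystream_block(key: bytes, counter: int) -> bytes:
    return hashlib.sha256(key + counter.to_bytes(8, "big")).digest()

def ctr_encrypt(key: bytes, data: bytes, workers: int = 1) -> bytes:
    nblocks = (len(data) + BLOCK - 1) // BLOCK
    # Blocks are independent, so they can be computed in any order/thread.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ks = b"".join(pool.map(lambda i: keystream_block(key, i), range(nblocks)))
    return bytes(a ^ b for a, b in zip(data, ks[: len(data)]))
```

Because XOR is its own inverse, the same call decrypts, and the thread count never changes the result; only the wall-clock time does.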
        • Re: (Score:3, Interesting)

          by dodobh (65811)
          There's also message passing and event driven programming, which can be a much simpler model if done right. Multi threading tends to shared state, and that's bad for programmers.
      • Re:A likely story (Score:5, Insightful)

        by mikael (484) on Wednesday February 13, 2008 @08:06AM (#22404810)
        People are always willing to pay more to be entertained than to be educated.

        Which explains why football players and movie stars get paid more than the innovators who carried out the research to develop the broadcast technology that helped make those stars famous.
      • Hey, I graduated college with a final project I originally built to search porn.

        Porn powers computer science.
    • by bigmouth_strikes (224629) on Wednesday February 13, 2008 @06:13AM (#22404234) Journal
      Wally: "My proposed work plan for the year is to stress-test our product under severe network conditions. I will accomplish this by downloading large image files from the busiest servers on the net."

      (PHB rejects suggestion)
      (later)

      Wally: "I was this close to making it my job to download naughty pictures."
      Dilbert : "It's just as well; I would have had to kill you."

      ( http://books.google.com/books?id=dCeVfKrZ-3MC&pg=PA77&source=gbs_selected_pages&cad=0_1&sig=xD5tmMhG1RcspLch8gCIJu8ro2U#PPA79,M1 [google.com] )
  • by Anonymous Coward
    not that scp as-is isn'--stalled--
  • by MichaelSmith (789609) on Wednesday February 13, 2008 @05:33AM (#22404060) Homepage Journal
    I get a lot of use out of ssh for moving files around and rsync is definitely the best way to do heavy lifting in this area. Improving scp would be good too. I can't wait to hear what Theo thinks about this. I don't see him as a fan of adding complexity to improve performance.

    Big scp copies through my wifi router used to cause kernel panics under netbsd current of about a year ago. I never had that problem running rsync inside ssh.
    • Re: (Score:2, Insightful)

      by cheater512 (783349)
      Rsync doesn't encrypt. SSH/SCP does.

      Rsync is only really useful as a synchronizing method between a source and an out-of-date copy.
      That's when its real benefits show.
      • by AnyoneEB (574727) on Wednesday February 13, 2008 @07:13AM (#22404520)
        Unless your servers are running rsh, rsync is probably going to get routed through ssh, in which case it gets encrypted just like scp. ref [everythinglinux.org]:

        Secure Shell - The security concious of you out there would like this, and you should all be using it. The stream from rsync is passed through the ssh protocol to encrypt your session instead of rsh, which is also an option (and required if you don't use ssh - enable it in your /etc/inet.d and restart your inet daemon if you disabled it for security).
        • by Tony Hoyle (11698)
          That's just flat out wrong. rsyncd has its own, unencrypted, protocol.

          You can run it from inetd or as a daemon, but it's unrelated to rsh.

          That connection may or may not be encrypted depending on the route it takes.. VPNs tend to be encrypted for example, but LAN connections not.

          • by blueg3 (192743)
            From "man rsync":

            There are eight different ways of using rsync. They are: ...
            * for copying from the local machine to a remote machine using a remote shell program as the transport (such as ssh or rsh). This is invoked when the destination path contains a single : separator.
            * (same as the above, but copy from remote to local machine)

            Note that the remote machine could alternately have an rsync server. However, this is not required -- if the remote machine does not have an rsync server, transport is done vi
          • Okay, it's very simple.

            Encrypted and tunneled over SSH, rsync is spawned by a login shell at the other side:
            rsync /some/path myhost.com:my/directory

            Not encrypted, rsyncs daemon must be running at other end:
            rsync /some/path rsync://myhost.com/my/directory OR
            rsync /some/path myhost.com::my/directory
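The three invocation forms above differ only in destination syntax. A toy classifier (hypothetical helper, not rsync's actual parsing code) makes the rule explicit:

```python
# Sketch: rsync:// URLs and double-colon destinations go to the rsync
# daemon (unencrypted); a single colon selects a remote shell transport.
def transport_for(dest: str) -> str:
    if dest.startswith("rsync://") or "::" in dest:
        return "rsync daemon (unencrypted)"
    if ":" in dest:
        return "remote shell (ssh/rsh, encrypted when the shell is ssh)"
    return "local copy"
```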
        • And if you have both available, it becomes a little ambiguous. However, this excerpt from the rsync man page indicates it uses ssh as default (unless the system is old):

          Once installed, you can use rsync to any machine that you can access
          via a remote shell (as well as some that you can access using the rsync
          daemon-mode protocol). For remote transfers, a modern rsync uses ssh
          for its communications, but it may have been configured to use a
          different remote shell by default, such as rsh or remsh.

          You c

    • Re: (Score:3, Informative)

      by rapier1 (855437)
      As a note, the changes are actually to SSH itself and not just SCP. So any application that uses SSH as a transport mechanism can see a performance boost. This isn't to say *every* user will. This is mainly geared towards paths with a high bandwidth-delay product (greater than 1 MB) or GigE LANs.
  • by Diomidis Spinellis (661697) on Wednesday February 13, 2008 @05:40AM (#22404096) Homepage
    If you want to speed up transfers and you're working on a LAN you trust (i.e. you don't worry about the integrity and confidentiality of the data passing through it), you can dramatically increase throughput using socketpipe [spinellis.gr]. Although the initial socketpipe communication setup is performed through client-server intermediaries such as ssh(1), the communication channel that socketpipe establishes is a direct socket connection between the local and the remote commands. This eliminates not only the encryption/decryption overhead, but also the copying between your processes and ssh or rsh.
    • by upside (574799) on Wednesday February 13, 2008 @05:59AM (#22404176) Journal
      You can also use a cheaper cipher. From the ssh manpage:

      -c blowfish|3des|des
                                Selects the cipher to use for encrypting the session. 3des is
                                used by default. It is believed to be secure. 3des (triple-des)
                                is an encrypt-decrypt-encrypt triple with three different keys.
                                blowfish is a fast block cipher, it appears very secure and is
                                much faster than 3des. des is only supported in the ssh client
                                for interoperability with legacy protocol 1 implementations that
                                do not support the 3des cipher. Its use is strongly discouraged
                                due to cryptographic weaknesses.
      • And if you really trust your network, you can recompile SSH with the 'none' cipher enabled, which (as the name implies) uses no encryption on the datastream whatsoever.
        • Re: (Score:3, Interesting)

          by HAKdragon (193605)
          I may be speaking out of ignorance, but doesn't that defeat the point of SSH?
          • by drinkypoo (153816)
            The "point" is whatever you want it to be. When copying files on my LAN, I still want secure password exchange, but I could give a shit about the data. I don't know if it's actually possible to achieve this (encrypt only for exchange) with ssh or not, though.
          • Re: (Score:3, Interesting)

            by forkazoo (138186)

            I may be speaking out of ignorance, but doesn't that defeat the point of SSH?

            SSH is one of those uberutilities that has a surprising amount of usefulness once you dig a bit. Sure, secure telnet functionality is great, and I use it a lot. But, I still use ssh on my own LAN where I don't really care about security. I use sshfs because it is easier and more convenient for me than bothering with Samba. SCP/SFTP to avoid bothering with ftp. I use it for forwarding ports between various machines, and I use i

      • by KiloByte (825081) on Wednesday February 13, 2008 @07:11AM (#22404508)
        Actually, it appears that (at least on Debian) AES is already the default. Selecting 3des gives tremendous slowdown; blowfish is somewhat slower than AES.

        Copying 100MB of data over 100mbit ethernet to a P2 350Mhz box (the slowest I got) gives:
        * 3des 1.9MB/s
        * AES 4.8MB/s
        * blowfish 4.4MB/s
        • Re: (Score:3, Informative)

          by Neil Watson (60859)
          Actually, it depends upon the SSH protocol. Both Debian and Cygwin have this to say:

          -c cipher_spec
          Selects the cipher specification for encrypting the session.

          Protocol version 1 allows specification of a single cipher. The
      • That's protocol v. 1. Protocol version 2, which we should all be using, will try aes128-cbc first off, and that should be one of the fastest ciphers available.
        $ cat /etc/ssh/sshd_config | grep Pro
        Protocol 2
    • by MoogMan (442253)
      Or use netcat/nc (installed by default on most Linux distros). The listening side pipes nc's output to a file (or to tar -x); the sending side feeds a file (or tar's output) into nc.

      Server: nc -l 1234 | tar -x
      Client: tar -c file_list_here | nc server_host 1234
      • Re: (Score:3, Informative)

        Nc is useful, but it still involves the overhead of copying the data through it (once at the client and once at the server). Nowadays, in most settings this overhead can be ignored. But, given the fact that a well-behaved application will work with a socket exactly as well as with a pipe or a file descriptor, I thought it would be very elegant to be able to connect two instances of (say) tar through a socket. Hence the implementation of socketpipe. Socketpipe sets up the plumbing and then just waits fo
    • by houghi (78078)
      The situations where this can be useful will be almost nonexistent. The places that could benefit from the increase will most likely have several people working.

      That means the better way is to encrypt BY DEFAULT. Security is not just a technical issue; it is also a social one. Encrypting by default will benefit you in the long run.

      This also means sending gpg-signed messages by default. Much shorter, better and more useful than the legal sh|t they attach now.
  • How will this affect the operation of applications which use SSH to standardize operations across Linux, like urpmi's --parallel parameter?

    This is one of the most useful aspects of Mandriva, but as the number of nodes I have to manage increases, I find RPMs being SCPed to other nodes taking longer and longer. I think this is because, even though authentication with Kerberos is much faster, urpmi waits until one node finishes copying the files before it starts copying to the next node in the domain.

    Thoughts?
  • Sweet! (Score:5, Insightful)

    by miffo.swe (547642) <(moc.liamg) (ta) (molbdeh.leinad)> on Wednesday February 13, 2008 @05:49AM (#22404142) Homepage Journal
    I really hope this will make it into OpenSSH after some security auditing. The performance gains were pretty impressive. It will make ssh much more fun for rsync, backups and other times when I transfer large files. I also wonder if one can't get similar performance gains with normal ssh and, for example, forwarded X-windows. That would be very interesting indeed.
    • I also wonder if one can't get similar performance gains with normal ssh and, for example, forwarded X-windows.

      Probably not. The X11 protocol is very latency sensitive, so the bottleneck tends to be round-trip times rather than raw throughput.

      I haven't read the article, so I don't know what it says about per-packet set-up times, but I wouldn't be surprised if latency was actually increased due to the overhead of having to at least decide to distribute encryption work across multiple CPUs.
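The latency argument above is worth making concrete: a protocol full of synchronous requests is capped by round trips, not bandwidth, since each request must wait one RTT for its reply. A one-line sketch:

```python
# Sketch: a latency-bound protocol is capped by round trips, not bandwidth.
# Each synchronous request waits one full RTT for its reply, so extra
# throughput never raises this ceiling.
def sync_requests_per_sec(rtt_ms: float) -> float:
    return 1000.0 / rtt_ms
```

At 50 ms RTT that is at most 20 synchronous requests per second on any link, which is why chatty protocols like X11 feel slow long-haul even on fat pipes.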

    • Re:Sweet! (Score:4, Informative)

      by Per Wigren (5315) on Wednesday February 13, 2008 @08:48AM (#22405124) Homepage
      Use NX [nomachine.com] instead of plain old remote DISPLAY or ssh's X11 forwarding or even VNC! It's silly fast! You get a perfectly usable desktop even on slow, high latency connections. The free edition is free as in GPL.
    • AFAIK, the OpenBSD kernel has adopted the SMP approach of the Linux 2.2 kernel (i.e. one great big kernel lock), and threads are implemented in a userland library. I assume that there will be less of a performance benefit on OpenBSD.

      Given this stance, is it very likely that either the core maintainers, or the maintainers for the portable releases, will integrate this code?

      Given the danger of protecting critical sections of code from race conditions and other exploits, should we keep things simple?

      p.s.

  • by pla (258480) on Wednesday February 13, 2008 @06:02AM (#22404198) Journal
    the crypto can sometimes be the bottleneck instead of the wire speed.

    Between two devices on my gigabit home LAN, the CPU barely even registers while SCP'ing a large file (and that with every CPU-expensive protocol option turned on, including compression). What sort of connection do these guys have, that the CPU overhead of en/decryption throttles the transfer???


    Coming next week: SSH compromised via a thread injection attack, thanks to a "feature" that only benefits those of us running our own undersea fiber.
    • by dm(Hannu) (649585) on Wednesday February 13, 2008 @06:19AM (#22404252) Homepage
      They claim that the first bottleneck is actually flow control of buffers, which prevents utilizing full network bandwidth in normal gigabit connections. The threads will help only after this first bottleneck has been cleared. They have patches to fix both problems. The slashdot summary was therefore a bit inaccurate, and reading TFA certainly helps.
      • by egede (266062) on Wednesday February 13, 2008 @07:53AM (#22404748)
        The limitation on transfer rates for scp is often the round-trip time consumed by confirmation of received packets. This is a serious issue for transfers from Europe to the US West Coast (around 200 ms) or to Australia (around 400 ms). Having several parallel TCP streams can solve this problem and has been in use for many years for transfer of data in High Energy Physics. An example of such a solution is GridFTP http://www.globus.org/toolkit/docs/4.0/data/gridftp/ [globus.org].
        • by Vellmont (569020)

          The limitation on transfer rates for scp is often the round-trip time consumed by confirmation of received packets. This is a serious issue for transfers from Europe to the US West Coast (around 200 ms) or to Australia (around 400 ms).

          Huh. I'm surprised TCP sliding window protocol doesn't take care of that. Shouldn't it account for filling up the pipeline between sender and receiver?
          • by Andy Dodd (701)
            Older implementations of TCP only allow for a 64 KB window size. (Older meaning "really old" - nearly any implementation in the last decade implements extensions that allow for much larger transmit/receive windows.)

            Many apps set fixed window sizes (incl. apparently standard SSH - the webpage implies 64K.)

            Linux can "autotune" window sizes, but most OSes don't, hence the need for an app to be able to specify a larger window.

            Even with larger window sizes, TCP congestion control starts breaking on networks wit
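The window-size arithmetic in this subthread reduces to one formula: a window-limited connection can keep at most one window in flight per round trip, so throughput is capped at window/RTT regardless of link capacity. A sketch of the math:

```python
# Sketch: throughput of a window-limited connection (TCP, or an
# application-layer window like SSH's) is capped at window_size / RTT.
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6

# A fixed 64 KB window at 200 ms RTT (Europe to the US West Coast) caps
# out around 2.6 Mbit/s, however fat the pipe is.
```

This is also why parallel streams (the GridFTP approach mentioned upthread) help: n streams keep roughly n windows in flight at once.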
    • by totally bogus dude (1040246) on Wednesday February 13, 2008 @06:38AM (#22404364)

      Have you measured your actual throughput on the file transfer? It tends to take a crapload of tuning to get anywhere near saturating gigabit, even if you're not using encrypted transfers.

      I wrote the bit below which I'll keep because it might be interesting to someone, but dm(Hannu) already mentioned the flaw in the logic behind the PP and article summary: if the CPU is the bottleneck, how could adding more threads possibly help?

      Just for a laugh I used scp to copy a 512 MB file from my file server to itself, an Athlon 3700+ running at 2.2ghz. I got about 18 megabytes / second out of it. I took a snapshot of top's output right at the end (97% complete) and the CPU usage was as follows:

      ssh: 48.6%
      sshd: 44.9%
      scp: 3.7%
      scp: 1.3%
      pdflush: 0.7%

      So this system was pretty much pegged by this copy operation, and it achieved less than a fifth the capacity of a gigabit network link. Obviously the system is capable of transferring data much faster than this; the source was a RAID-5 set of 5 new 500 GB drives, and the destination was a stripe across two old 40 GB drives. I'd also repeated the experiment a few times (and this was the fastest transfer I got) so it's likely the source file was cached, too.

      I do agree that there are probably more interesting and useful things to optimise (and make easy to optimise) than scp's speed, but I know for sure that scp'ing iso images to our ESX servers is a crapload slower than using Veeam's copy utility or the upload facility in the new version of Infrastructure Client (at least I think it's new, never noticed it before).

      • by mikael_j (106439) on Wednesday February 13, 2008 @06:49AM (#22404414)

        A possible problem source here is that you're also doing disk I/O. When transferring data on my home network, I've noticed that rsyncing things for redundancy leaves me with a lot more CPU usage (even when reading from a RAID5 via a hardware controller) than if I just pump random data from one machine to another. I recommend you try transferring random data and piping it directly to /dev/null on the receiving machine to see if there's any difference in CPU usage.

        /Mikael

        • by Kyojin (672334)
          That's mostly a good idea, except for the randomly generated bit. Random number generation can be CPU intensive. A better option might be to randomly generate 500 MB of data in memory once and send that same data to /dev/null over whatever link each time.
        • Re: (Score:2, Interesting)

          True enough, but my main point was that getting to actual gigabit speeds in the first place is pretty difficult. Plus, I couldn't find an easy way to copy only X amount of "random" data via scp, which was the point of the article. Regardless, copying random data is rarely if ever a useful thing to do with scp, anyway.

        • by drinkypoo (153816)
          The only time disk access ever causes measurable perturbations in CPU usage, at least for me, is when I'm either doing software RAID (which is handled by your CPU) or USB (which is also handled by your CPU, even in USB2 - don't believe the hype.) The SAME DISK in the SAME ENCLOSURE which is USB2/IEEE1394 will raise my CPU to as much as 12% total (I have a core duo, so that's 24% of a 2.16GHz Core core) just, say, reading 25MB/sec and writing it to the internal disk. Reading 10MB and writing 10MB at the same
      • by Andy Dodd (701)
        "if the CPU is the bottleneck, how could adding more threads possibly help?"

        Pretty sure the article summary covered this - it is intended for multicore/multiprocessor systems.

        i.e. a single CPU is a bottleneck, but multithreading allows the load to be distributed over multiple CPUs, removing the bottleneck a single CPU might provide.
      • Re: (Score:2, Informative)

        by rapier1 (855437)
        > if the CPU is the bottleneck, how could adding more threads possibly help?

        This is actually a great question. On single-core systems it's very unlikely that the multi-threading aspect of our patch will be of much use to you. The stock version of SSH is, because of its design, unable to use more than one core regardless of how many cores you actually happen to have. Which means that you could have one core that's pegged by SSH and have other cores that are essentially running idle (if you look at the pre
    • by alexhs (877055)
      Hello !

      My home server/internet gateway is a Pentium MMX at 200MHz, with a 100 Mb/s NIC.
      With SCP (default options, server sending), I can transfer at 8Mb/s.
      With RCP, at 25 Mb/s
    • by BitZtream (692029)
      This was my first thought as well. I've never had SSH use more than a tiny amount of CPU. As you said, it will be great to have some race conditions in there to make it all that much more secure. $10 says OpenSSH won't be in a big rush to put these patches in place.
    • by Danathar (267989)
      PSC = Pittsburgh Supercomputing Center (At Carnegie Mellon).

      They are a node on the Teragrid which has throughput over some segments of around 100Gb/s
    • the crypto can sometimes be the bottleneck instead of the wire speed.

      Between two devices on my gigabit home LAN, the CPU barely even registers while SCP'ing a large file (and that with every CPU-expensive protocol option turned on, including compression). What sort of connection do these guys have, that the CPU overhead of en/decryption throttles the transfer???


      Coming next week: SSH compromised via a thread injection attack, thanks to a "feature" that only benefits those of us running our own undersea fiber.

      I worked on a program that is similar to what the summary describes. For various reasons (legacy code-base) we only supported Triple-DES encryption. But the bottlenecks break down as follows:

      1. Network Bandwidth
      2. CPU/encryption
      3. Disk Access

      Network bandwidth is easily overcome to a degree - in the manner the summary describes, it will easily fill the pipe if you can get past the other two issues.

      CPU/encryption - this is really more the encryption and how much it affects you will depend on wha

  • by arkarumba (763047) on Wednesday February 13, 2008 @06:49AM (#22404420) Journal
    Preferably, we would like to test it on any very high bandwidth systems running Linux kernel versions 2.6.17 to 2.6.24.1.
  • by Eunuchswear (210685) on Wednesday February 13, 2008 @06:53AM (#22404438) Journal
    Almost all the improvements they talk about come from optimising TCP buffer usage. The summary of the fixes:

    HPN-13 A la Carte
    • Dynamic Windows and None Cipher
      This is a basis of the HPN-SSH patch set. It provides dynamic window in SSH and the ability to switch to a NONE cipher post authentication. Based on the HPN12 v20 patch.
    • Threaded CTR cipher mode
      This patch adds threading to the CTR block mode for AES and other supported ciphers. This may allow SSH to make use of multiple cores/cpus during transfers and significantly increase throughput. This patch should be considered experimental at this time.
    • Peak Throughput
      This patch modifies the progress bar to display the 1 second throughput average. On completion of the transfer it will display the peak throughput over the life of the connection.
    • Server Logging
      This patch adds additional logging to the SSHD server including encryption used...
    So the main part of the patch set is "It provides dynamic window in SSH and the ability to switch to a NONE cipher post authentication" and the only part that has to do with threading is marked "This patch should be considered experimental at this time".

    By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?
    • Re: (Score:3, Informative)

      by Arimus (198136)

      By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?

      Not really. For some of the stuff I do via SSH - e.g. logging into my webhost to untar a patch and apply it - the only part of the transaction I want to be secure is my initial password/key exchange; post authentication I really don't give a stuff who sees me type

      cd ~/www
      tar xvfz ~/patch.tar.gz

      or any of the other commands I type in. However it should be down to the admin of the system in the first

    • by dissy (172727)

      By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?

      Yes, you might almost as well just use telnet or rlogin.

      The only advantage of ssh with no cipher is that an attacker will not see the authentication details (password or key) you use to log in to the remote machine.

      Unfortunately, just like telnet, using ssh with the none cipher opens the connection up to TCP hijacking and injection of packets, so the attacker doesn't really need your password anymore; they can just execute commands as you on the server once you are authenticated.

      My guess is with the dynamic tcp window s

      • by Atzanteol (99067)

        Yes, you almost might as well just use telnet or rlogin.

        Sorta. If you're on a private network sending a 4Gig ISO (or other large file/files) why do you need the data to be encrypted? Encrypting credentials is sufficient.

        • by dissy (172727)

          Sorta. If you're on a private network sending a 4Gig ISO (or other large file/files) why do you need the data to be encrypted? Encrypting credentials is sufficient.

          Exactly. As long as you can trust (or at worst, assume) your LAN is secure.
          Much easier to do on a home network where it is either just you, or you and family.
          A rather safe assumption to make if your client machines and 'servers' are secured from each other, limiting the potential damage from an infected wi^H^H^H client machine.

          The main use I see is that the scp command is damn handy compared to most any other command-line method of transferring files. Especially so with RSA/DSA keys.
          As far as people that u

      • by rapier1 (855437) on Wednesday February 13, 2008 @10:49AM (#22406608)
        As a note - while the NONE cipher switch turns off data encryption we do *not* disable the message authentication code (MAC). So each data packet is still signed and authenticated. If the receiver detects any in-transit modification of a packet, the connection is immediately dropped.
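The integrity-without-confidentiality point above can be illustrated with a generic authenticate-only framing: the payload travels in the clear, but an HMAC over the sequence number and payload is verified on receipt, so injected or modified packets are rejected. (Illustration only; this is not SSH's actual packet format.)

```python
# Sketch: payload in the clear, HMAC-SHA-256 over (seq, payload) checked
# on receipt. Tampering or replay under a different seq fails the check.
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA-256 output length

def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    tag = hmac.new(key, seq.to_bytes(4, "big") + payload, hashlib.sha256).digest()
    return payload + tag

def open_packet(key: bytes, seq: int, packet: bytes) -> bytes:
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expect = hmac.new(key, seq.to_bytes(4, "big") + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ConnectionError("MAC mismatch: drop the connection")
    return payload
```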
    • Re: (Score:3, Insightful)

      by tbuskey (135499)

      By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?

      I'd like it when I tunnel a new SSH or scp through another SSH tunnel. We call it a sleeve. I've had to sleeve within a sleeve's sleeve before to get through multiple SSH gateways and firewalls to an inner system. You can tell ssh to use XOR but I'm not sure you can in scp.

      Of course, if speed is paramount, you can use netcat inside the sleeve(s) to copy files. No encryption of the netcat

    • Re: (Score:2, Informative)

      by rapier1 (855437)
      Where the performance boost comes from is going to depend a lot on the characteristics of the network path. If it's a high bandwidth-delay product path then the majority of the performance increase may very well come from the dynamic window sizing (this is application-layer windowing, by the way). However, if the path has a low BDP and you are CPU bound then either the NONE cipher switch or the multi-threading may provide more of a performance increase than the window sizing. Alternatively, in some high BDP pa
  • by AceJohnny (253840) <jlargentaye@gmai ... minus herbivore> on Wednesday February 13, 2008 @07:09AM (#22404502) Journal
    I've been wondering, do there exist hardware accelerators usable by OpenSSL or GnuTLS? I work in embedded systems, and our chip includes a crypto and hash processor. I'm surprised nothing equivalent exists on modern PCs, or have I just not been looking in the right places?
    • Re: (Score:2, Informative)

      by neumayr (819083)
      There were crypto acceleration cards, but I think the market was fairly small. They made sense for sites with lots of https traffic, but nowadays general purpose cpus are blazingly fast compared to back then.
      So I guess they disappeared..
    • by bzzzt (313005)
      Those embedded systems usually have slow CPUs which are outperformed by hardware-accelerated encryption boards. Your average desktop CPU running openssl will be a lot faster than most cheap accelerator cards. Most of the time it's cheaper to invest in a faster generic CPU than a piece of custom hardware.
    • by TeknoHog (164938)

      I've been wondering, does there exist hardware accelerators usable by OpenSSL or GnuTLS? I work in embedded systems, and our chip includes a crypto and hash processor. I'm surprised nothing equivalent exists on modern PCs, or have I just not been looking in the right places?

      The VIA C7 processor has hardware crypto acceleration (AES and some helper functions) that's supported by OpenSSL out of the box. Applications still require some patching, for example OpenSSH. The reason seems to be that the application has to choose the encryption engine used by OpenSSL.

      http://bugs.gentoo.org/show_bug.cgi?id=162967 [gentoo.org]

    • by Bill Wong (583178)
      There is a whole lot of available crypto hardware listed here [openbsd.org].
      I've used a Hifn Crypto Accelerator [hifn.com] a year or three ago. Worked with OpenSSL for the most part.
    • Re: (Score:2, Interesting)

      by htd2 (854946)
      There are a number of vendors who supply PCI/PCIe/PCI-X crypto accelerators. However, these are mostly targeted at servers, where the overhead of serving up numerous encrypted streams of data is much higher than on a client PC.

      The Sun T1 and T2 processors in the T2000 and T5000 also have on-chip crypto units (1 on the T2000 and 8 on the T5000) which accelerate OpenSSL traffic by offloading DES, AES, MD5, etc.
    • Typically the overhead of the memory copy across the PCI bus to the crypto card and back into main memory is higher than just letting the host CPU do the crypto work itself (AES was designed to be very efficient even on slow CPUs, far better than DES/3DES).

      As others have mentioned, the VIA CPUs have built-in accelerators which avoid those memory copies.
  • by j.a.mcguire (551738) on Wednesday February 13, 2008 @07:34AM (#22404642)
    Crowds were shocked to discover that multi-threaded software runs faster on a multi-core system. It was hard to contain the excitement over the discovery, which could revolutionise software development!
  • It's weird that they're looking for high-bandwidth users only; they could also use low-powered systems to test the approach :)
    Of course, if they want to test how well it scales, that's a different matter.
  • Comcast (Score:2, Funny)

    by OglinTatas (710589)
    "They are currently looking for potential users with very high bandwidth to test the upper limits of the system."

    You comcast users can forget about it, then.
  • Old news? (Score:2, Interesting)

    Digging around for the best way to apply the patch without screwing up my portage updates, I came across a request for this to be merged into portage back in 2005; it's apparently usable with the HPN USE flag.

    Not that I'm that surprised to see this is old news, since they're apparently on major revision 13...
  • Today you can buy a machine with eight cores, 8 GB of memory, and a 1 TB hard drive for less than 2000. And most software will only use one core and a maximum of 2 GB of memory.

    WE NEED MULTITHREADING NOW BIG AND EVERYWHERE.

    Multithreading is maybe the biggest change in software development. In contrast to instruction-set extensions like MMX, SSE, and so on, it is not about some peephole optimization, about replacing a bunch of x86_32 instructions with some SSE instructions; it is about changing the whole approach, finding new algorith
  • by rapier1 (855437) on Wednesday February 13, 2008 @11:19AM (#22407074)
    First off, thank you for taking the time to read down this far. There have been some very interesting and useful comments so far. Second, I need to point out that both Ben Bennett of PSC and Michael Stevens of CMU were instrumental in getting this patch written. Without them there would be no HPN-SSH patch. I also highly suggest that interested people go to http://www.psc.edu/networking/projects/hpn-ssh [psc.edu] and read about what we've done. There is a lot of good material in the papers and presentations section as well as the FAQ.

    A couple of notes about the multi-threading: The main goal was to allow SSH to make use of multiple processing cores. The stock OpenSSH is, by design, limited to using one core. As such, a user can encounter situations where they have more network capacity and more compute capacity but will be unable to exploit them. The goal of this patch was to allow users to make full use of the resources available to them. The upshot of this is that it's best suited for high-performance network and compute environments (the HPN in HPN-SSH stands for High Performance Networking). This doesn't mean it won't be useful to home users - only that they might not see the dramatic performance gains someone in a higher-capacity environment might see. It's really going to depend on the specifics of their environment.
    Based on our research, we decided the most effective way to do this would be to make the AES-CTR mode cipher multi-threaded. The CTR mode is well suited to threading because there is no inter-block dependency and, even better, the resulting cipher stream is indistinguishable from a single-threaded CTR mode cipher stream. As a result, we retain full compatibility with other implementations of SSH - you don't need to have HPN-SSH on both sides of the connection. Of course, you won't see the same improvements unless you do.
    We still see this as somewhat experimental because we've not yet implemented a way to allow users to choose between a single-threaded AES-CTR and multi-threaded AES-CTR mode. As such, users on single-core machines, if using AES-CTR, may see a decrease in performance. We suggest those users just make use of the AES-CBC mode instead (which is the default anyway). Also, you need to be able to support POSIX threads.
    Future work will involve pipelining the MAC routine and that should provide us with another 30% or so improvement in throughput.

    Also, it's important to keep in mind that these improvements are *not* just for SCP but for SSH as a whole. People using HPN-SSH as a transport mechanism for rsync, tunnels, pipes, and so forth may also see considerable performance improvements. Additionally, the windowing patches don't necessarily require HPN-SSH to be installed on both ends of the connection. As long as the patch is installed on the receiving side (the data sink), you may (assuming you were previously window-limited) see a performance gain.

    We welcome any comments, suggestions, ideas, or problem reports you might have regarding the HPN-SSH patch. Go to the website mentioned above and use the email address there to get in touch with us. This is a work in progress and we are doing what we can to enable line-rate, easy-to-use, fully encrypted communications. We've a lot more to do, but I hope what we've done so far is of use and value to the community.
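    The CTR-mode property described in the comment above - no inter-block dependency, and a parallel result indistinguishable from the serial one - can be sketched in a few lines. This is a toy illustration, not HPN-SSH's code: a hash stands in for the AES block function, and the point is only that per-counter keystream blocks can be computed in any order and reassembled identically (Python threads won't actually speed up this CPU-bound toy because of the GIL):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def keystream_block(key: bytes, counter: int) -> bytes:
    """Toy CTR keystream block: hash(key || counter) stands in for
    AES-encrypting the counter block. Each block depends only on the
    key and its own counter value, never on neighboring blocks."""
    return hashlib.sha256(key + counter.to_bytes(16, "big")).digest()

def keystream_serial(key: bytes, nblocks: int) -> bytes:
    """Generate the keystream one block at a time, in order."""
    return b"".join(keystream_block(key, i) for i in range(nblocks))

def keystream_parallel(key: bytes, nblocks: int) -> bytes:
    """Generate the same keystream with a thread pool. Because blocks are
    independent, reassembling them by counter index yields a stream
    byte-identical to the serial version."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        blocks = pool.map(lambda i: keystream_block(key, i), range(nblocks))
    return b"".join(blocks)

key = b"example-key"  # hypothetical key, for illustration only
assert keystream_serial(key, 64) == keystream_parallel(key, 64)
print("parallel CTR keystream matches serial")
```

    A CBC-style mode could not be split this way, since each block's input includes the previous ciphertext block; that serial dependency is exactly why the HPN-SSH threading work targets CTR mode.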
