Multi-Threaded SSH/SCP
neo writes "Chris Rapier has presented a paper describing how to dramatically increase the speed of SCP networks. It appears that because SCP relies on a single thread in SSH, the crypto can sometimes be the bottleneck instead of the wire speed. Their new implementation (HPN-SSH) takes advantage of multi-threaded-capable systems, dramatically increasing the speed of securely copying files. They are currently looking for potential users with very high bandwidth to test the upper limits of the system."
Re:Alternative solution for a trusted LAN (Score:5, Informative)
-c blowfish|3des|des
Selects the cipher to use for encrypting the session. 3des is
used by default. It is believed to be secure. 3des (triple-des)
is an encrypt-decrypt-encrypt triple with three different keys.
blowfish is a fast block cipher, it appears very secure and is
much faster than 3des. des is only supported in the ssh client
for interoperability with legacy protocol 1 implementations that
do not support the 3des cipher. Its use is strongly discouraged
due to cryptographic weaknesses.
Re:To *have* such problems... (Score:5, Informative)
Re:Alternative solution for a trusted LAN (Score:1, Informative)
I surely missed having that option when copying files between hosts on my LAN. I don't need to hide data from myself. If someone else connects and encrypting data is a concern, I'll simply not use the 'none' "cipher".
Pretty much totally incorrect summary (Score:5, Informative)
By the way, does anybody else think "the ability to switch to a NONE cipher post authentication" is pretty dodgy?
Re:Alternative solution for a trusted LAN (Score:5, Informative)
Copying 100MB of data over 100Mbit Ethernet to a P2 350MHz box (the slowest I got) gives:
* 3des 1.9MB/s
* AES 4.8MB/s
* blowfish 4.4MB/s
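To put those figures next to the wire speed, here is a rough back-of-the-envelope sketch. It assumes 1 MB = 10^6 bytes and ignores protocol overhead on the link; the rates are the ones quoted above, and the names (`summarize`, `wire_mb_per_s`) are just for illustration.

```python
# Figures quoted in the comment above (MB/s over 100 Mbit Ethernet).
measured_mb_per_s = {"3des": 1.9, "AES": 4.8, "blowfish": 4.4}

LINK_MBIT_PER_S = 100   # nominal link speed from the comment
TRANSFER_MB = 100       # size of the test copy from the comment

# Assumption: 1 MB = 10**6 bytes and no protocol overhead on the wire.
wire_mb_per_s = LINK_MBIT_PER_S / 8


def summarize(rate_mb_per_s):
    """Return (seconds for the 100 MB copy, fraction of nominal wire speed)."""
    return TRANSFER_MB / rate_mb_per_s, rate_mb_per_s / wire_mb_per_s


for cipher, rate in measured_mb_per_s.items():
    seconds, fraction = summarize(rate)
    print(f"{cipher:9s} {seconds:6.1f} s  {fraction:6.1%} of wire speed")
```

Even the fastest cipher here stays well under half of the nominal 12.5 MB/s, which fits the claim that on a slow CPU the crypto, not the wire, is the limit.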
Re:Must be why rsync over ssh is much faster (Score:4, Informative)
Re:Alternative solution for a trusted LAN (Score:3, Informative)
This is the setup using nc:
and this is the setup that socketpipe arranges:
Re:Hardware acceleration (Score:2, Informative)
So I guess they disappeared..
Re:To *have* such problems... (Score:5, Informative)
Re:Pretty much totally incorrect summary (Score:3, Informative)
Not really. For some of the stuff I do via SSH (e.g. logging into my webhost to untar a patch and apply it), the only part of the transaction I want to be secure is my initial password/key exchange. Post-authentication, I really don't give a stuff who sees what I type or any of the other commands I type in. However, it should be down to the admin of the system in the first place to decide whether to allow the NONE downgrade (either system-wide or on a per-user/session basis), and then down to me as a user to decide whether to take advantage.
Re:Sweet! (Score:4, Informative)
Re:Alternative solution for a trusted LAN (Score:3, Informative)
-c cipher_spec
Selects the cipher specification for encrypting the session.
Protocol version 1 allows specification of a single cipher. The
supported values are "3des", "blowfish", and
"des". 3des (triple-des) is an encrypt-decrypt-encrypt
triple with three different keys. It is believed to be secure.
blowfish is a fast block cipher; it appears very secure and is
much faster than 3des. des is only supported in the ssh client
for interoperability with legacy protocol 1 implementations that
do not support the 3des cipher. Its use is strongly
discouraged due to cryptographic weaknesses. The default
is "3des".
For protocol version 2, cipher_spec is a comma-separated list of
ciphers listed in order of preference. The supported ciphers are:
3des-cbc, aes128-cbc, aes192-cbc, aes256-cbc, aes128-ctr,
aes192-ctr, aes256-ctr, arcfour128, arcfour256, arcfour,
blowfish-cbc, and cast128-cbc. The default is:
aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,
arcfour256,arcfour,aes192-cbc,aes256-cbc,aes128-ctr,
aes192-ctr,aes256-ctr
Re:Must be why rsync over ssh is much faster (Score:3, Informative)
Re:Why not loopback? (Score:3, Informative)
BDP is the bandwidth-delay product. BDP is one of the main things these patches address. Loopback has very, very little delay. You could, I suppose, add artificial delay over loopback, but now you're diverging further from the actual deployment scenario.
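As a worked example of why delay matters: the bandwidth-delay product is the amount of data that has to be in flight to keep a link full, and a fixed window caps throughput at window / RTT. The sketch below uses made-up link speeds, round-trip times, and a 64 KiB window purely for illustration; none of those numbers come from the paper or the patch.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return bandwidth_bits_per_s * rtt_s / 8


def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on throughput (bytes/s) when at most one window is sent per RTT."""
    return window_bytes / rtt_s


# Hypothetical paths: 1 Gbit/s with 50 ms RTT versus a 0.1 ms loopback-like RTT.
GBIT = 1_000_000_000
WINDOW = 64 * 1024  # illustrative fixed window size (an assumption)

for label, rtt in (("50 ms WAN path", 0.050), ("0.1 ms loopback-like path", 0.0001)):
    print(label)
    print(f"  BDP at 1 Gbit/s:       {bdp_bytes(GBIT, rtt):,.0f} bytes")
    print(f"  cap with 64 KiB window: {window_limited_rate(WINDOW, rtt):,.0f} bytes/s")
```

With so little delay on loopback, even a small window is enough to fill the pipe, so a loopback test would not exercise the window-limited case at all.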
The other thing is that when sender and receiver are the same host, you don't engage the full network stack (no ethernet queuing, for example, no dropped packets, etc. etc.), so you don't find out all the curve balls that TCP/IP will throw you.
And yet another thing is that sender and receiver will compete for the same CPUs, and so whatever upper CPU bound you have with separate sender and receiver, you'll be at roughly half that (assuming send and receive are balanced) when both are on the same machine.
--Joe
Re:Must be why rsync over ssh is much faster (Score:2, Informative)
Re:To *have* such problems... (Score:2, Informative)
Re:Pretty much totally incorrect summary (Score:2, Informative)
As for the dodgy aspect of the NONE cipher switching: I'll be the first to admit that it's not a perfect solution. The authentication remains fully encrypted and you can't use the NONE switch in an interactive session, which obviates some of the problems. However, it is still, in some ways, counter to the idea of SSH, which is why we came up with the threaded cipher. If you are willing to accept the NONE cipher then you can use that, but if you want full encryption then you can use the threaded AES-CTR mode cipher.
Re:Pretty much totally incorrect summary (Score:4, Informative)
Re:Must be why rsync over ssh is much faster (Score:3, Informative)
tar cfpz - . | ssh user@host '( cd /destination ; tar xfpvz - )'
I'd use a "." instead of *: it avoids shell line-length problems and will also copy hidden files (speaking as someone who has learned this the hard way). Also, in my experience, on anything faster than 10MB, don't bother with compression (it's really a CPU-to-network speed ratio; on transfers I did regularly, that was the rule of thumb with P4 2.2GHz Xeons). Also, I removed the "v" from the source tar, as it lists every file name twice and can be hard to read. I can't remember if ssh or tar had better compression; I know I tested both. It really just changed the tipping point of the CPU speed. I also used to use blowfish for the cipher, as it was easier on the CPU if you were running out of CPU instead of network. On a Gigabit network, I always ran out of CPU first.
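The "CPU to network speed ratio" rule of thumb can be sketched as a toy model. This is a simplification with made-up rates, not a measurement: it assumes the compressor and the link form a pipeline, so the effective rate (in uncompressed bytes) is whichever stage is slower. The function name and every number below are hypothetical.

```python
def effective_rate(link_mb_per_s, compress_mb_per_s=None, ratio=1.0):
    """Uncompressed MB/s delivered end to end under a simple pipeline model.

    link_mb_per_s     -- what the network can carry
    compress_mb_per_s -- how fast the CPU can compress (None = no compression)
    ratio             -- uncompressed size / compressed size
    """
    if compress_mb_per_s is None:
        return link_mb_per_s
    # The link carries compressed bytes, so it moves ratio times as much payload,
    # but the CPU cannot feed it faster than it can compress.
    return min(compress_mb_per_s, link_mb_per_s * ratio)


# Hypothetical CPU that compresses 20 MB/s at 2:1.
slow_link, fast_link = 1.25, 125.0
print(effective_rate(slow_link), effective_rate(slow_link, 20.0, 2.0))
print(effective_rate(fast_link), effective_rate(fast_link, 20.0, 2.0))
```

On the slow link, compression doubles the effective rate in this model; on the fast one the CPU becomes the bottleneck and compression makes things worse, which is the tipping point described above.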
I normally use -C instead of a subshell, but that's merely a matter of taste. I also use the technique in reverse quite often so I can untar on the destination machine as root.
Kirby
Some comments from one of the authors (Score:5, Informative)
A couple of notes about the multi-threading: The main goal was to allow SSH to make use of multiple processing cores. The stock OpenSSH is, by design, limited to using one core. As such, a user can encounter situations where they have more network capacity and more compute capacity but will be unable to exploit them. The goal of this patch was to allow users to make full use of the resources available to them. The upshot of this is that it's best suited for high performance network and compute environments (the HPN in HPN-SSH stands for High Performance Networking). This doesn't mean it won't be useful to home users - only that they might not see the dramatic performance gains someone in a higher capacity environment might see. It's really going to depend on the specifics of their environment.
Based on our research we decided the most effective way to do this would be to make the AES-CTR mode cipher multi-threaded. The CTR mode is well suited to threading because there is no inter-block dependency and, even better, the resulting cipher stream is indistinguishable from a single-threaded CTR mode cipher stream. As a result, we retain full compatibility with other implementations of SSH - you don't need to have HPN-SSH on both sides of the connection. Of course, you won't see the same improvements unless you do.
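The "no inter-block dependency" point can be illustrated with a small sketch. To be clear about the assumptions: this is not the HPN-SSH code and not real AES. The Python standard library has no AES, so `keystream_block` uses HMAC-SHA256 as a stand-in for the block cipher, and all function names are made up. The only thing it shows is that keystream block i depends on the counter alone, so chunks can be handled by separate threads and the joined output is byte-for-byte the same as the single-threaded stream.

```python
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # bytes per keystream block, matching the AES block size


def keystream_block(key, nonce, counter):
    """Stand-in for AES_encrypt(key, nonce || counter); NOT real AES."""
    msg = nonce + counter.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key, nonce, data, first_counter=0):
    """Single-threaded CTR: XOR each block with the keystream for its counter."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, first_counter + i // BLOCK)
        out += bytes(a ^ b for a, b in zip(data[i:i + BLOCK], ks))
    return bytes(out)


def ctr_xor_threaded(key, nonce, data, workers=4, blocks_per_chunk=64):
    """Split the data into chunks; each worker starts at its own counter value."""
    chunk = BLOCK * blocks_per_chunk
    offsets = range(0, len(data), chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda off: ctr_xor(key, nonce, data[off:off + chunk], off // BLOCK),
            offsets,
        )
        return b"".join(parts)  # pool.map preserves input order


key, nonce = b"k" * 16, b"n" * 8
plaintext = b"some file contents " * 1000
single = ctr_xor(key, nonce, plaintext)
threaded = ctr_xor_threaded(key, nonce, plaintext)
print(single == threaded)                         # same stream either way
print(ctr_xor(key, nonce, threaded) == plaintext)  # any CTR peer can decrypt
```

Don't expect a speedup from this sketch in CPython; it is only meant to show the equivalence of the two streams, which is the property that keeps a threaded sender compatible with an unthreaded receiver.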
We still see this as somewhat experimental because we've not yet implemented a way to allow users to choose between a single-threaded AES-CTR and a multi-threaded AES-CTR mode. As such, users on single-core machines may, if using AES-CTR, see a decrease in performance. We suggest those users just make use of the AES-CBC mode instead (which is the default anyway). Also, your system needs to support POSIX threads.
Future work will involve pipelining the MAC routine and that should provide us with another 30% or so improvement in throughput.
Also, it's important to keep in mind that these improvements are *not* just for SCP but for SSH as a whole. People using HPN-SSH as a transport mechanism for rsync, tunnels, pipes, and so forth may also see considerable performance improvements. Additionally, the windowing patches don't necessarily require HPN-SSH to be installed on both ends of the connection. As long as the patch is installed on the receiving side (the data sink) you may (assuming you were previously window limited) see a performance gain.
We welcome any comments, suggestions, ideas, or problem reports you might have regarding the HPN-SSH patch. Go to the website mentioned above and use the email address there to get in touch with us. This is a work in progress and we are doing what we can to enable line-rate, easy-to-use, fully encrypted communications. We've a lot more to do, but I hope what we've done so far is of use and value to the community.
Re:A likely story (Score:3, Informative)
Re:FUNNY?! That's not funny, try for TRUE (Score:3, Informative)
Re:Must be why rsync over ssh is much faster (Score:3, Informative)
ssh user@host.com tar -C /remote/path -cpzf - remotefile1 remotefile2 | tar -C /local/path -xvzpf -