Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SFTP Incremental Reads + Writes #725

Open
emeryalden opened this issue Dec 6, 2024 · 17 comments
Open

SFTP Incremental Reads + Writes #725

emeryalden opened this issue Dec 6, 2024 · 17 comments

Comments

@emeryalden
Copy link

emeryalden commented Dec 6, 2024

Trying to solve an issue where running .get() on a large file (4GB+) will work at first but appears to eventually exhaust the system's RAM and results in the download freezing.

When using .get(), does asyncssh write data to the destination file as it arrives or does everything need to be collected into memory before it can actually be written to a file?

If it's the latter, is there a way to incrementally download a file from an SFTP and write chunks to the destination file as it goes, allowing the system to then clear some memory space? Or is there another way to recommended approach for instances where file sizes are larger than memory?

@ronf
Copy link
Owner

ronf commented Dec 6, 2024

All of the get/put/copy and wildcard mget/mput/mcopy functions should write data to the destination as soon as it is read in. The default block size used to be fixed at 16 KB and the max parallel requests at 128, so the most you should need is about 2 MB per file being transferred. If you transfer multiple files at once, this would be the amount needed per file.

With AsyncSSH 2.18.0, the default buffer size can be higher if the server supports the "limit" extension and advertises a larger max read/write size, so on very large files you may see an increase in memory needed (up to something like 512 MB). If this is too much, though, you can either set a smaller block size or reduce the max number of parallel requests, though you may lose some speed depending on how small you make these.

The arguments to control this are block_size and max_requests.

@emeryalden
Copy link
Author

Hey @ronf,

Really appreciate your input here. I played around with the block size and max_requests settings, but am still running into difficulties. Upon further investigation, the issue appears not to be related to exhausting RAM but rather on the system failing to rekey with the SFTP server. While the key exchange at the start of the transfer is fine, the one occurring ~1GB into the transfer simply hangs until the connection drops. No exception is raised. I've included the debug logs for the events leading up to this below

DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052262400 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052278784 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052295168 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh:[conn=20, chan=0] Received 32720 data bytes
DEBUG:asyncssh:[conn=20, chan=0] Received 32720 data bytes
DEBUG:asyncssh:[conn=20, chan=0] Received 32720 data bytes
DEBUG:asyncssh:[conn=20, chan=0] Received 20921 data bytes
DEBUG:asyncssh:[conn=20] Requesting key exchange
DEBUG:asyncssh:[conn=20]   Key exchange algs: curve25519-sha256,[email protected],curve448-sha512,ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,ecdh-sha2-1.3.132.0.10,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group15-sha512,diffie-hellman-group16-sha512,diffie-hellman-group17-sha512,diffie-hellman-group18-sha512,[email protected],diffie-hellman-group14-sha1,rsa2048-sha256,ext-info-c,[email protected]
DEBUG:asyncssh:[conn=20]   Host key algs: ssh-ed25519
DEBUG:asyncssh:[conn=20]   Encryption algs: [email protected],[email protected],[email protected],aes256-ctr,aes192-ctr,aes128-ctr
DEBUG:asyncssh:[conn=20]   MAC algs: [email protected],[email protected],[email protected],[email protected],[email protected],[email protected],[email protected],hmac-sha2-256,hmac-sha2-512,hmac-sha1,[email protected],[email protected],[email protected],[email protected],[email protected]
DEBUG:asyncssh:[conn=20]   Compression algs: none,[email protected]
DEBUG:asyncssh:[conn=20] Received key exchange request
DEBUG:asyncssh:[conn=20]   Key exchange algs: curve25519-sha256,[email protected],diffie-hellman-group18-sha512,diffie-hellman-group16-sha512
DEBUG:asyncssh:[conn=20]   Host key algs: ssh-ed25519,rsa-sha2-256,rsa-sha2-512
DEBUG:asyncssh:[conn=20]   Client to server:
DEBUG:asyncssh:[conn=20]     Encryption algs: [email protected],[email protected],aes256-ctr,aes192-ctr,aes128-ctr
DEBUG:asyncssh:[conn=20]     MAC algs: [email protected],[email protected],hmac-sha2-512,hmac-sha2-256
DEBUG:asyncssh:[conn=20]     Compression algs: none,zlib
DEBUG:asyncssh:[conn=20]   Server to client:
DEBUG:asyncssh:[conn=20]     Encryption algs: [email protected],[email protected],aes256-ctr,aes192-ctr,aes128-ctr
DEBUG:asyncssh:[conn=20]     MAC algs: [email protected],[email protected],hmac-sha2-512,hmac-sha2-256
DEBUG:asyncssh:[conn=20]     Compression algs: none,zlib
DEBUG:asyncssh:[conn=20] Beginning key exchange
DEBUG:asyncssh:[conn=20]   Key exchange alg: curve25519-sha256
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Received 16384 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052311552 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052327936 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052344320 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052360704 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052377088 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052393472 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052409856 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending read for 16384 bytes at offset 1052426240 in handle 313033
DEBUG:asyncssh:[conn=20, chan=0] Sending 28 data bytes
DEBUG:asyncssh:[conn=20] Sending keepalive request
DEBUG:asyncssh:[conn=20] Sending keepalive request
DEBUG:asyncssh:[conn=20] Sending keepalive request
INFO:asyncssh:[conn=20] Server not responding to keepalive
INFO:asyncssh:[conn=20, chan=0] Closing channel due to connection close
INFO:asyncssh:[conn=20, chan=0] Channel closed: Server not responding to keepalive
INFO:asyncssh.sftp:[conn=20, chan=0] SFTP client exited: Server not responding to keepalive
INFO:asyncssh:[conn=20, chan=0] Closing channel
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending close for handle 313033
INFO:asyncssh.sftp:[conn=20, chan=0] Starting SFTP get of XXXXXXXXXXXXXXXX to XXXXXXXXXXXXXXXX
DEBUG:asyncssh.sftp:[conn=20, chan=0] Sending lstat for XXXXXXXXXXXXXXXX
INFO:asyncssh:[conn=20] Closing connection
INFO:asyncssh:[conn=20] Sending disconnect: Disconnected by application (11)

So far I've tried modifying both kex_algs and rekey_bytes to try to resolve this, but the issue persists. In fact, rekey_bytes does not seem to have any impact as even lowing the value does not prompt the system to attempt to rekey sooner than the ~1GB mark.

Any advice on where to go from here?

@ronf
Copy link
Owner

ronf commented Dec 7, 2024

So far, I'm unable to reproduce this here, using AsyncSSH as both client and server as well as an AsyncSSH client talking to an OpenSSH server. What client and server (and versions) are you using when you see the problem?

Changing rekey_bytes on the client won't do much here, as almost all of the data is being sent by the server and rekey_bytes kicks in based on the number of bytes sent. So, if your server has this set to 1 GB, that'll be when you see the rekey happening.

The fact that keepalive begins to fail at some point suggests that either something is hanging on the server side, or maybe that there's a problem with the network between the client and server after sending a large amount of data.

You might try running AsyncSSH as both client and server and see if you still see the problem. You could also setting a lower value for rekey_bytes on the server side in this case.

If you control the machine the SFTP server is running on, you could also change its config file to rekey sooner using OpenSSH as the server and see if the changes when the problem occurs.

@emeryalden
Copy link
Author

Thank you again for your input on this. It's perhaps worth mentioning that this connection just uses username + password for authentication. Could that be tripping up the rekey process?

@ronf
Copy link
Owner

ronf commented Dec 7, 2024

No - authentication isn't performed again during rekeying. Only the key exchange which happens prior to auth is repeated. Once complete, both sides switch over to new encryption keys, but the authentication information remains the same.

@emeryalden
Copy link
Author

Hey, so wanted to share an update here. The SFTP is hosted via the Move It Cloud service. We've been working with their support team to assess what's going on here. They just responded with the following note in case you are interested. Thanks again for your help!
image

@ronf
Copy link
Owner

ronf commented Jan 17, 2025

Thanks for the additional info. That is consistent with what I'm seeing in the trace. AsyncSSH is initiating a key exchange here by sending a new KEXINIT message, and it is getting back a KEXINIT in response. It then proceeds to start an ECDH key exchange (using curve25519-sha256), but that doesn't seem to complete. As a result, neither side seems to get to the point where NEWKEYS is sent and the new keys become active. It may not even be getting far enough to be able to generate those keys.

Unfortunately, there's not much logging in the AsyncSSH key exchange code right now. Could you try setting the AsyncSSH debug level to 3 and post what you see after "Request key exchange"? That'll let us see what key exchange messages (if any) are sent/received.

The expected result is for AsyncSSH to send a KEX_ECDH_INIT and then get back a KEX_ECDH_REPLY, after which both sides should be sending a NEWKEYS message.

@emeryalden
Copy link
Author

Sure thing @ronf! Here's what I'm seeing. Nothing is shown after the text below and the process simply hangs.

DEBUG:asyncssh:[conn=4, pktid=398] Received MSG_KEXINIT (20), 553 bytes
  00000000: 14 e6 36 78 0c 94 ed f7 8a 4f 99 1b 96 49 22 4e  ..6x.....O...I"N
  00000010: 4d 00 00 00 6a 63 75 72 76 65 32 35 35 31 39 2d  M...jcurve25519-
  00000020: 73 68 61 32 35 36 2c 63 75 72 76 65 32 35 35 31  sha256,curve2551
  00000030: 39 2d 73 68 61 32 35 36 40 6c 69 62 73 73 68 2e  9-sha256@libssh.
  00000040: 6f 72 67 2c 64 69 66 66 69 65 2d 68 65 6c 6c 6d  org,diffie-hellm
  00000050: 61 6e 2d 67 72 6f 75 70 31 38 2d 73 68 61 35 31  an-group18-sha51
  00000060: 32 2c 64 69 66 66 69 65 2d 68 65 6c 6c 6d 61 6e  2,diffie-hellman
  00000070: 2d 67 72 6f 75 70 31 36 2d 73 68 61 35 31 32 00  -group16-sha512.
  00000080: 00 00 25 73 73 68 2d 65 64 32 35 35 31 39 2c 72  ..%ssh-ed25519,r
  00000090: 73 61 2d 73 68 61 32 2d 32 35 36 2c 72 73 61 2d  sa-sha2-256,rsa-
  000000a0: 73 68 61 32 2d 35 31 32 00 00 00 4e 61 65 73 32  sha2-512...Naes2
  000000b0: 35 36 2d 67 63 6d 40 6f 70 65 6e 73 73 68 2e 63  [email protected]
  000000c0: 6f 6d 2c 61 65 73 31 32 38 2d 67 63 6d 40 6f 70  om,aes128-gcm@op
  000000d0: 65 6e 73 73 68 2e 63 6f 6d 2c 61 65 73 32 35 36  enssh.com,aes256
  000000e0: 2d 63 74 72 2c 61 65 73 31 39 32 2d 63 74 72 2c  -ctr,aes192-ctr,
  000000f0: 61 65 73 31 32 38 2d 63 74 72 00 00 00 4e 61 65  aes128-ctr...Nae
  00000100: 73 32 35 36 2d 67 63 6d 40 6f 70 65 6e 73 73 68  s256-gcm@openssh
  00000110: 2e 63 6f 6d 2c 61 65 73 31 32 38 2d 67 63 6d 40  .com,aes128-gcm@
  00000120: 6f 70 65 6e 73 73 68 2e 63 6f 6d 2c 61 65 73 32  openssh.com,aes2
  00000130: 35 36 2d 63 74 72 2c 61 65 73 31 39 32 2d 63 74  56-ctr,aes192-ct
  00000140: 72 2c 61 65 73 31 32 38 2d 63 74 72 00 00 00 57  r,aes128-ctr...W
  00000150: 68 6d 61 63 2d 73 68 61 32 2d 35 31 32 2d 65 74  hmac-sha2-512-et
  00000160: 6d 40 6f 70 65 6e 73 73 68 2e 63 6f 6d 2c 68 6d  [email protected],hm
  00000170: 61 63 2d 73 68 61 32 2d 32 35 36 2d 65 74 6d 40  ac-sha2-256-etm@
  00000180: 6f 70 65 6e 73 73 68 2e 63 6f 6d 2c 68 6d 61 63  openssh.com,hmac
  00000190: 2d 73 68 61 32 2d 35 31 32 2c 68 6d 61 63 2d 73  -sha2-512,hmac-s
  000001a0: 68 61 32 2d 32 35 36 00 00 00 57 68 6d 61 63 2d  ha2-256...Whmac-
  000001b0: 73 68 61 32 2d 35 31 32 2d 65 74 6d 40 6f 70 65  sha2-512-etm@ope
  000001c0: 6e 73 73 68 2e 63 6f 6d 2c 68 6d 61 63 2d 73 68  nssh.com,hmac-sh
  000001d0: 61 32 2d 32 35 36 2d 65 74 6d 40 6f 70 65 6e 73  a2-256-etm@opens
  000001e0: 73 68 2e 63 6f 6d 2c 68 6d 61 63 2d 73 68 61 32  sh.com,hmac-sha2
  000001f0: 2d 35 31 32 2c 68 6d 61 63 2d 73 68 61 32 2d 32  -512,hmac-sha2-2
  00000200: 35 36 00 00 00 09 6e 6f 6e 65 2c 7a 6c 69 62 00  56....none,zlib.
  00000210: 00 00 09 6e 6f 6e 65 2c 7a 6c 69 62 00 00 00 00  ...none,zlib....
  00000220: 00 00 00 00 00 00 00 00 00                       .........
DEBUG:asyncssh:[conn=4] Received key exchange request
DEBUG:asyncssh:[conn=4]   Key exchange algs: curve25519-sha256,[email protected],diffie-hellman-group18-sha512,diffie-hellman-group16-sha512
DEBUG:asyncssh:[conn=4]   Host key algs: ssh-ed25519,rsa-sha2-256,rsa-sha2-512
DEBUG:asyncssh:[conn=4]   Client to server:
DEBUG:asyncssh:[conn=4]     Encryption algs: [email protected],[email protected],aes256-ctr,aes192-ctr,aes128-ctr
DEBUG:asyncssh:[conn=4]     MAC algs: [email protected],[email protected],hmac-sha2-512,hmac-sha2-256
DEBUG:asyncssh:[conn=4]     Compression algs: none,zlib
DEBUG:asyncssh:[conn=4]   Server to client:
DEBUG:asyncssh:[conn=4]     Encryption algs: [email protected],[email protected],aes256-ctr,aes192-ctr,aes128-ctr
DEBUG:asyncssh:[conn=4]     MAC algs: [email protected],[email protected],hmac-sha2-512,hmac-sha2-256
DEBUG:asyncssh:[conn=4]     Compression algs: none,zlib
DEBUG:asyncssh:[conn=4] Beginning key exchange
DEBUG:asyncssh:[conn=4]   Key exchange alg: curve25519-sha256
DEBUG:asyncssh:[conn=4, pktid=802] Sent MSG_IGNORE (2), 5 bytes
  00000000: 02 00 00 00 00                                   .....
DEBUG:asyncssh:[conn=4, pktid=803] Sent MSG_KEX_ECDH_INIT (30), 37 bytes
  00000000: 1e 00 00 00 20 28 a8 7d df e3 6b 77 2b e4 c8 8d  .... (.}..kw+...
  00000010: cc 52 9d 0b 30 59 e2 46 65 c2 90 ce 6a 3f 2e 15  .R..0Y.Fe...j?..
  00000020: 49 27 8f 06 6f                                   I'..o

@ronf
Copy link
Owner

ronf commented Jan 18, 2025

Thanks!

So, it looks like AsyncSSH is sending a proper MSG_KEX_ECDH_INIT after the KEXINIT messages, but the SSH server isn't sending the expected MSG_KEX_ECDH_REPLY in response, and so that's why AsyncSSH is "hanging". A proper exchange would look something like:

DEBUG:asyncssh:[conn=0, pktid=1] Sent MSG_KEX_ECDH_INIT (30), 37 bytes
  00000000: 1e 00 00 00 20 9a a6 4f e0 00 c5 f8 6f e0 32 62  .... ..O....o.2b
  00000010: dd 9b e3 75 6c 6f 4f 4f 94 e5 fc 47 b5 7b fe 35  ...uloOO...G.{.5
  00000020: a2 bc 08 80 1e                                   .....
DEBUG:asyncssh:[conn=0, pktid=1] Received MSG_KEX_ECDH_REPLY (31), 600 bytes
  00000000: 1f 00 00 01 17 00 00 00 07 73 73 68 2d 72 73 61  .........ssh-rsa
  00000010: 00 00 00 03 01 00 01 00 00 01 01 00 b9 3f e5 a1  .............?..
  00000020: 81 8f d5 f6 ee ba 5a 7c 48 b4 b5 91 5e 6a b8 49  ......Z|H...^j.I
  00000030: 6f fa 38 13 5b ff 24 9a e6 eb 39 5b 2e c0 84 f5  o.8.[.$...9[....
  00000040: c6 44 c9 28 7d 97 13 d5 c5 06 b0 d3 3f 3e 19 d9  .D.(}.......?>..
  00000050: 2e a3 2b 13 2e cb 55 ad 4b 8d 14 cb fa 4b 01 61  ..+...U.K....K.a
  00000060: b0 dc 18 a7 44 b8 f0 04 48 be 16 0b 4b 19 2d 46  ....D...H...K.-F
  00000070: 65 26 07 99 a4 0b d9 88 ad 73 cb 5d 93 e1 5d 57  e&.......s.]..]W
  00000080: 39 6e e9 2c 70 19 ba 4a 5b 68 03 e4 c8 43 80 74  9n.,p..J[h...C.t
  00000090: 65 13 95 da 9a 62 98 a7 c3 26 4f f6 8e 4d b9 de  e....b...&O..M..
  000000a0: fc fe 75 32 c9 ee 4f bf ba 1f d7 b4 05 fe 82 87  ..u2..O.........
  000000b0: 08 95 cf 52 0c 1b d0 5c f3 55 27 12 2b 33 9e 1f  ...R...\.U'.+3..
  000000c0: bc cd 10 03 de ea 4d 2b d8 03 8c 52 48 4e a0 0f  ......M+...RHN..
  000000d0: 8c 81 b9 8f 4d 77 62 1c 87 18 43 0c 02 b3 04 65  ....Mwb...C....e
  000000e0: c2 d0 91 63 41 12 5e 32 6c 78 3a ad b8 4e 71 67  ...cA.^2lx:..Nqg
  000000f0: 1b 71 ee f5 6b ae 35 79 b2 d0 c9 4f d6 98 22 79  .q..k.5y...O.."y
  00000100: 2e 58 69 93 02 1e 72 b0 e4 fe f8 b3 be ed 65 98  .Xi...r.......e.
  00000110: 33 51 95 f6 0c da db b5 32 f1 99 71 00 00 00 20  3Q......2..q...
  00000120: 28 a7 a2 aa d7 35 7e dd 39 0b 08 d8 1c 71 44 55  (....5~.9....qDU
  00000130: cb c3 69 8c 43 ab 32 44 e8 8c bd 12 80 d0 f4 25  ..i.C.2D.......%
  00000140: 00 00 01 14 00 00 00 0c 72 73 61 2d 73 68 61 32  ........rsa-sha2
  00000150: 2d 32 35 36 00 00 01 00 13 27 25 69 e3 4d 8e 88  -256.....'%i.M..
  00000160: 62 d1 5b 66 04 05 ad 43 0f c4 c7 78 53 66 a1 21  b.[f...C...xSf.!
  00000170: 50 dc 19 3f 0d 5a 5d 81 c7 58 26 11 7b 02 f9 24  P..?.Z]..X&.{..$
  00000180: d0 94 65 1c 3f b9 ae da f4 87 5f ad 71 7a f9 b7  ..e.?....._.qz..
  00000190: 83 43 80 d5 24 aa 2f cf 27 13 73 f5 88 df 72 c2  .C..$./.'.s...r.
  000001a0: 67 fa 5d d0 76 80 ac 85 63 a0 ec 9c ea ba c7 c9  g.].v...c.......
  000001b0: 35 8d 3f bf b5 d3 bb 80 a3 d6 2f 22 ee 09 02 23  5.?......./"...#
  000001c0: a0 a9 ec 3a 96 f3 cf 41 b5 09 55 37 13 f6 e4 02  ...:...A..U7....
  000001d0: d4 28 3e ec 32 30 a9 73 a4 64 a9 90 7f 6e c5 3b  .(>.20.s.d..�n.;
  000001e0: 54 9c a9 05 40 2a 8e 7e af cd 7d 85 72 2b e2 be  T...@*.~..}.r+..
  000001f0: 10 4c 98 f1 09 d9 23 20 0f e3 86 bc cf 98 fe 3e  .L....# .......>
  00000200: 7e d3 26 2c 6b 9d 85 2b dd 4a 38 a0 25 30 49 62  ~.&,k..+.J8.%0Ib
  00000210: e2 99 a2 cd ba 32 c9 cc 6e aa f1 e2 0f b8 67 e3  .....2..n.....g.
  00000220: 7c 72 5c 4e c3 78 be 1e 0e 72 3e 33 d4 44 08 58  |r\N.x...r>3.D.X
  00000230: 52 ea f3 9b 97 fe cd 28 e9 f6 b8 09 ca 98 4b bb  R......(......K.
  00000240: 3f 44 66 8f 42 ac 8b 5b f0 6a 25 f8 be 3b 15 b4  ?Df.B..[.j%..;..
  00000250: 73 cd 05 47 7c 62 40 f5                          s..G|b@.

The content of the KEX_ECDH_INIT is just the client's ephemeral public key bytes and that looks well-formed in your trace, so I don't know why the SSH server isn't replying to that. I wonder if maybe there's an issue with a fix for the Terrapin attack, where it thinks it is in the wrong state to do a key exchange.

@emeryalden
Copy link
Author

Thanks again! I sent your response on to the company and will let you know how they respond.

@emeryalden
Copy link
Author

Hey @ronf,

Just heard back from the vendor on the latest. See below for the response and let me know if you have any additional questions. Thanks again!!

Hello,

The dev team did get a response from the SSH library vendor Nsoftware and this is a summary of their response and you are correct it is related to the Terrapin fix:

First, we confirmed that this is an issue with the SSH client. At the end of 2023, we made some changes to protect against the Terrapin attack by adding support for the "strict kex" extension.

We confirmed that the SSH client is also using the "strict kex" extension when connecting to the SFTP server. However, during re-negotiation, the extension requires that any key exchange stops if any packet is received that is not expected.

The AsyncSSH SSH client however seems to be sending SSH_MSG_IGNORE during renegotiation.

Since the ignore packet is being sent during key exchange, the SSH server stop responding to the AsyncSSH client. However, we did not consider that this would cause clients to hang.

While it is not recommended, if compatibility is needed, you can disable "strict kex" and the server will allow for the SSH_MSG_IGNORE packet to be sent but the server would then be vulnerable to the attack when the conditions of the attack are met.

Turning off strict kex in the client would theoretically allow you to get past this, though it wouldn't be recommended for security reasons as it could make your connection susceptible to the Terrapin attack.

Best option would probably be to either look for another client library or open a bug with asyncssh to see if they can fix the issue.

I hope this helps.

@ronf
Copy link
Owner

ronf commented Jan 22, 2025

It shouldn't be too hard to make AsyncSSH not send MSG_IGNORE during later key exchanges. However, if you look at the description of strict-kex in the OpenSSH docs, it says:

a) During initial KEX, terminate the connection if any unexpected or
out-of-sequence packet is received. This includes terminating the
connection if the first packet received is not SSH2_MSG_KEXINIT.
Unexpected packets for the purpose of strict KEX include messages
that are otherwise valid at any time during the connection such as
SSH2_MSG_DEBUG and SSH2_MSG_IGNORE.

Note the first sentence, where it says "During initial KEX" (emphasis mine). I don't think this requirement is meant to apply to rekeying, and AsyncSSH has no problem interoperating with OpenSSH even with it sending the MSG_IGNORE, suggesting that OpenSSH is not rejecting later key exchanges which have additional messages present.

I was initially worried that there could be data in flight during rekeying that also might cause a problem. However, looking more closely at the original SSH RFCs, I see the following in RFC 4253 section 7.1:

Once a party has sent a SSH_MSG_KEXINIT message for key exchange or
re-exchange, until it has sent a SSH_MSG_NEWKEYS message (Section
7.3), it MUST NOT send any messages other than:

o Transport layer generic messages (1 to 19) (but
SSH_MSG_SERVICE_REQUEST and SSH_MSG_SERVICE_ACCEPT MUST NOT be
sent);

o Algorithm negotiation messages (20 to 29) (but further
SSH_MSG_KEXINIT messages MUST NOT be sent);

o Specific key exchange method messages (30 to 49).

The provisions of Section 11 apply to unrecognized messages.

Note, however, that during a key re-exchange, after sending a
SSH_MSG_KEXINIT message, each party MUST be prepared to process an
arbitrary number of messages that may be in-flight before receiving a
SSH_MSG_KEXINIT message from the other party.

So, in the original design, it looks like there's no concern about data in flight except for packets before the peer sends a KEXINIT. This means that the primary change with strict-kex was not allowing any messages of type 1-19 once the key exchange has started, which really comes down primarily to MSG_IGNORE and MSG_DEBUG. I still think it's wrong for the server here to enforce this on re-keying, but it gives me a bit more confidence that blocking these additional message types at both initial kex and rekeying might allow AsyncSSH to work with this server.

@ronf
Copy link
Owner

ronf commented Jan 22, 2025

Here's a potential fix you can test:

diff --git a/asyncssh/connection.py b/asyncssh/connection.py
index 2db356b..3ea2af4 100644
--- a/asyncssh/connection.py
+++ b/asyncssh/connection.py
@@ -1742,7 +1742,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):
             self._kexinit_sent = True

         if (((pkttype in {MSG_SERVICE_REQUEST, MSG_SERVICE_ACCEPT} or
-              pkttype > MSG_KEX_LAST) and not self._kex_complete) or
+              pkttype > MSG_KEX_LAST) and self._kexinit_sent) or
                 (pkttype == MSG_USERAUTH_BANNER and
                  not (self._auth_in_progress or self._auth_complete)) or
                 (pkttype > MSG_USERAUTH_LAST and not self._auth_complete)):
@@ -1751,7 +1751,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):

         # If we're encrypting and we have no data outstanding, insert an
         # ignore packet into the stream
-        if self._send_encryption and pkttype not in (MSG_IGNORE, MSG_EXT_INFO):
+        if self._send_encryption and pkttype > MSG_KEX_LAST:
             self.send_packet(MSG_IGNORE, String(b''))

         orig_payload = Byte(pkttype) + b''.join(args)
@@ -1934,6 +1934,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):
                                      self._mac_alg_sc, mac_key_sc, etm_sc)

         self.send_packet(MSG_NEWKEYS)
+        self._kexinit_sent = False

         self._extensions_to_send[b'global-requests-ok'] = b''

@@ -2368,10 +2369,9 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):
                                 'KEXINIT was not the first packet')


-        if self._kexinit_sent:
-            self._kexinit_sent = False
-        else:
+        if not self._kexinit_sent:
             self._send_kexinit()
+            self._kexinit_sent = True

         if self._gss:
             self._gss.reset()

It will also require some minor tweaks in the unit testing, but you shouldn't need those to give this a try.

This change avoids sending MSG_IGNORE for any of the messages below KEX_LAST rather than only singling out specific messages where sending it has been a problem in the past. In addition, it makes sure to defer sending any additional data messages during the key exchange, whether it is the initial kex or a rekeying.

Technically, there is also an issue here if send_debug() is called on a connection during a key exchange, so I may need to also handle that case, but first I want to see if the above fixes the main issue.

@emeryalden
Copy link
Author

Appears that did the trick! Transfers completed and rekey did not trip up the process as it had before. Even when I use the "rekey_seconds" parameter to prompt an early rekey, the process continues just fine. Let me know if there's anything else you'd like me to check on my end. Thanks again!

@ronf
Copy link
Owner

ronf commented Jan 23, 2025

Could you try one more version of this? After looking at it a second time, I came up with a simpler patch:

diff --git a/asyncssh/connection.py b/asyncssh/connection.py
index 2db356b..6656255 100644
--- a/asyncssh/connection.py
+++ b/asyncssh/connection.py
@@ -1741,7 +1741,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):
             self._send_kexinit()
             self._kexinit_sent = True

-        if (((pkttype in {MSG_SERVICE_REQUEST, MSG_SERVICE_ACCEPT} or
+        if (((pkttype in {MSG_DEBUG, MSG_SERVICE_REQUEST, MSG_SERVICE_ACCEPT} or
               pkttype > MSG_KEX_LAST) and not self._kex_complete) or
                 (pkttype == MSG_USERAUTH_BANNER and
                  not (self._auth_in_progress or self._auth_complete)) or
@@ -1751,7 +1751,7 @@ class SSHConnection(SSHPacketHandler, asyncio.Protocol):

         # If we're encrypting and we have no data outstanding, insert an
         # ignore packet into the stream
-        if self._send_encryption and pkttype not in (MSG_IGNORE, MSG_EXT_INFO):
+        if self._send_encryption and pkttype > MSG_KEX_LAST:
             self.send_packet(MSG_IGNORE, String(b''))

         orig_payload = Byte(pkttype) + b''.join(args)

Also, would you mind passing on my previous comments about initial kex vs. rekeying above to the SFTP library vendor? I'm happy to leave my change in place, but I do consider it a workaround for a problem in the library given the wording in the OpenSSH strict-kex specification.

@emeryalden
Copy link
Author

Hey @ronf,

That works too!! I also passed along your feedback to our vendor and will let you know how they respond. Thanks again!

@ronf
Copy link
Owner

ronf commented Jan 26, 2025

Sounds good - thanks very much!

This change is now available as commit b88eed6 in the "develop" branch and will be included in the next release.

Thanks for your help tracking this down, and for testing the potential fixes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants