Appendix H: PuTTY authentication plugin protocol
- H.1 Requirements
- H.2 Transport and configuration
- H.3 Data formats and marshalling
- H.4 Protocol versioning
- H.5 Overview and sequence of events
- H.6 Message formats
- H.6.1 PLUGIN_INIT
- H.6.2 PLUGIN_INIT_RESPONSE
- H.6.3 PLUGIN_INIT_FAILURE
- H.6.4 PLUGIN_PROTOCOL
- H.6.5 PLUGIN_PROTOCOL_REJECT
- H.6.6 PLUGIN_PROTOCOL_ACCEPT
- H.6.7 PLUGIN_KI_SERVER_REQUEST
- H.6.8 PLUGIN_KI_SERVER_RESPONSE
- H.6.9 PLUGIN_KI_USER_REQUEST
- H.6.10 PLUGIN_KI_USER_RESPONSE
- H.6.11 PLUGIN_AUTH_SUCCESS
- H.6.12 PLUGIN_AUTH_FAILURE
- H.7 References
This appendix contains the specification for the protocol spoken over local IPC between PuTTY and an authentication helper plugin.
If you already have an authentication plugin and want to configure PuTTY to use it, see section 4.22.3 for how to do that. This appendix is for people writing new authentication plugins.
H.1 Requirements
The following requirements informed the specification of this protocol.
Automate keyboard-interactive authentication. We're motivated in the first place by the observation that the general SSH userauth method «keyboard-interactive» (defined in RFC4256) can be used for many kinds of challenge/response or one-time-password styles of authentication, and in more than one of those, the necessary responses might be obtained from an auxiliary network connection, such as an HTTPS transaction. So it's useful if a user doesn't have to manually copy-type or copy-paste from their web browser into their SSH client, but instead, the process can be automated.
Be able to pass prompts on to the user. On the other hand, some userauth methods can be only partially automated; some of the server's prompts might still require human input. Also, the plugin automating the authentication might need to ask its own questions that are not provided by the SSH server. (For example, «please enter the master key that the real response will be generated by hashing».) So after the plugin intercepts the server's questions, it needs to be able to ask its own questions of the user, which may or may not be the same questions sent by the server.
Allow automatic generation of the username. Sometimes, the authentication method comes with a mechanism for discovering the username to be used in the SSH login. So the plugin has to start up early enough that the client hasn't committed to a username yet.
Future expansion route to other SSH userauth flavours. The initial motivation for this protocol is specific to keyboard-interactive. But other SSH authentication methods exist, and they may also benefit from automation in future. We're making no attempt here to predict what those methods might be or how they might be automated, but we do need to leave a space where they can be slotted in later if necessary.
Minimal information loss. Keyboard-interactive prompts and replies should be passed to and from the plugin in a form as close as possible to the way they look on the wire in SSH itself. Therefore, the protocol resembles SSH in its data formats and marshalling (instead of, for example, translating from SSH binary packet style to another well-known format such as JSON, which would introduce edge cases in character encoding).
Half-duplex. Simultaneously trying to read one I/O stream and write another adds a lot of complexity to software. It becomes necessary to have an organised event loop containing select or WaitForMultipleObjects or similar, which can invoke the handler for whichever event happens soonest. There's no need to add that complexity in an application like this, which isn't transferring large amounts of bulk data or multiplexing unrelated activities. So, to keep life simple for plugin authors, we set the ground rule that it must always be 100% clear which side is supposed to be sending a message next. That way, the plugin can be written as sequential code progressing through the protocol, making simple read and write calls to receive or send each message.
Communicate success/failure, to facilitate caching in the plugin. A plugin might want to cache recently used data for next time, but only in the case where authentication using that data was actually successful. So the client has to tell the plugin what the outcome was, if it's known. (But this is best-effort only. Obviously the plugin cannot depend on hearing the answer, because any IPC protocol at all carries the risk that the other end might crash or be killed by things outside its control.)
H.2 Transport and configuration
Plugins are executable programs on the client platform.
The SSH client must be manually configured to use a plugin for a particular connection. The configuration takes the form of a command line, including the location of the plugin executable, and optionally command-line arguments that are meaningful to the particular plugin.
The client invokes the plugin as a subprocess, passing it a pair of 8-bit-clean pipes as its standard input and output. On those pipes, the client and plugin will communicate via the protocol specified below.
H.3 Data formats and marshalling
This protocol borrows the low-level data formatting from SSH itself, in particular the following wire encodings from RFC4251 section 5:
- byte: an integer between 0 and 0xFF inclusive, transmitted as a single byte of binary data.
- boolean: the values «true» or «false», transmitted as the bytes 1 and 0 respectively.
- uint32: an integer between 0 and 0xFFFFFFFF inclusive, transmitted as 4 bytes of binary data, in big-endian («network») byte order.
- string: a sequence of bytes, preceded by a uint32 giving the number of bytes in the sequence. The length field does not include itself. For example, the empty string is represented by four zero bytes (the uint32 encoding of 0); the string "AB" is represented by the six bytes 0,0,0,2,'A','B'.
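The encodings above can be sketched in a few lines of C. This is only an illustration: the helper names (put_uint32, get_uint32, put_boolean, put_string) are made up for this example and are not part of the protocol or of PuTTY, and the caller is assumed to supply a large enough output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; each "put" returns the number of bytes written. */

/* uint32: 4 bytes, most significant byte first. */
size_t put_uint32(unsigned char *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return 4;
}

uint32_t get_uint32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* boolean: a single byte, 1 for true and 0 for false. */
size_t put_boolean(unsigned char *out, int value)
{
    out[0] = value ? 1 : 0;
    return 1;
}

/* string: uint32 byte count, then the bytes themselves.
 * The caller must provide room for 4 + len bytes. */
size_t put_string(unsigned char *out, const void *data, uint32_t len)
{
    put_uint32(out, len);
    if (len > 0)
        memcpy(out + 4, data, len);
    return 4 + (size_t)len;
}
```

Calling put_string(buf, "AB", 2) writes the six bytes given in the example above.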
Unlike SSH itself, the protocol spoken between the client and the plugin is unencrypted, because local inter-process pipes are assumed to be secured by the OS kernel. So the binary packet protocol is much simpler than SSH proper, and is similar to SFTP and the OpenSSH agent protocol.
The data sent in each direction of the conversation consists of a sequence of messages exchanged between the SSH client and the plugin. Each message is encoded as a string. The contents of the string begin with a byte giving the message type, which determines the format of the rest of the message.
H.4 Protocol versioning
This protocol itself is versioned. At connection setup, the client states the highest version number it knows how to speak, and then the plugin responds by choosing the version number that will actually be spoken (which may not be higher than the client's value).
Including a version number makes it possible to make breaking changes to the protocol later.
Even version numbers represent released versions of this spec. Odd numbers represent drafts or development versions in between releases. A client and plugin negotiating an odd version number are not guaranteed to interoperate; the developer testing the combination is responsible for ensuring the two are compatible.
This document describes version 2 of the protocol, the first released version. (The initial drafts had version 1.)
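As a rough sketch of the negotiation rule, assuming a plugin that simply picks the highest version both sides know: the function names below are invented for illustration, and the spec does not say how a plugin should choose among lower versions, so treat that part as one possible policy.

```c
#include <assert.h>
#include <stdint.h>

/* Plugin side: one possible policy is to speak the highest version that
 * both ends know, i.e. the smaller of the two maxima (an assumption; the
 * spec only requires the result not to exceed the client's value). */
uint32_t choose_version(uint32_t client_max, uint32_t plugin_max)
{
    return client_max < plugin_max ? client_max : plugin_max;
}

/* Client side: the plugin's chosen version may not be higher than the
 * value the client announced. */
int version_acceptable(uint32_t chosen, uint32_t client_max)
{
    return chosen <= client_max;
}

/* Even numbers are released versions; odd numbers are drafts. */
int is_released_version(uint32_t version)
{
    return version % 2 == 0;
}
```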
H.5 Overview and sequence of events
At the very beginning of the user authentication phase of SSH, the client launches the plugin subprocess, if one is configured. It immediately sends the PLUGIN_INIT message, giving the plugin some initial information about the destination of the SSH connection.
The plugin responds with PLUGIN_INIT_RESPONSE, which may optionally tell the SSH client what username to use.
The client begins trying to authenticate with the SSH server in the usual way, using the username provided by the plugin (if any) or alternatively one obtained via its normal (non-plugin) policy.
The client follows its normal policy for selecting authentication methods to attempt. If it chooses a method that this protocol does not cover, then the client will perform that method in its own way without consulting the plugin.
However, if the client and server decide to attempt a method that this protocol does cover, then the client sends PLUGIN_PROTOCOL specifying the SSH protocol id for the authentication method being used. The plugin responds with PLUGIN_PROTOCOL_ACCEPT if it's willing to assist with this auth method, or PLUGIN_PROTOCOL_REJECT if it isn't.
If the plugin sends PLUGIN_PROTOCOL_REJECT, then the client will proceed as if the plugin were not present. Later, if another auth method is negotiated (either because this one failed, or because it succeeded but the server wants multiple auth methods), the client may send a further PLUGIN_PROTOCOL and try again.
If the plugin sends PLUGIN_PROTOCOL_ACCEPT, then a protocol segment begins that is specific to that auth method, terminating in either PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. After that, again, the client may send a further PLUGIN_PROTOCOL.
Currently the only supported method is «keyboard-interactive», defined in RFC4256. Once the client has announced this to the server, the followup protocol is as follows:
Each time the server sends an SSH_MSG_USERAUTH_INFO_REQUEST message requesting authentication responses from the user, the SSH client translates the message into PLUGIN_KI_SERVER_REQUEST and passes it on to the plugin.
At this point, the plugin may optionally send back PLUGIN_KI_USER_REQUEST containing prompts to be presented to the actual user. The client will reply with a matching PLUGIN_KI_USER_RESPONSE after asking the user to reply to the question(s) in the request message. The plugin can repeat this cycle multiple times.
Once the plugin has all the information it needs to respond to the server's authentication prompts, it sends PLUGIN_KI_SERVER_RESPONSE back to the client, which translates it into SSH_MSG_USERAUTH_INFO_RESPONSE to send on to the server.
After that, as described in RFC4256, the server is free to accept authentication, reject it, or send another SSH_MSG_USERAUTH_INFO_REQUEST. Each SSH_MSG_USERAUTH_INFO_REQUEST is dealt with in the same way as above.
If the server terminates keyboard-interactive authentication with SSH_MSG_USERAUTH_SUCCESS or SSH_MSG_USERAUTH_FAILURE, the client informs the plugin by sending either PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. PLUGIN_AUTH_SUCCESS is sent when that particular authentication method was successful, regardless of whether the SSH server chooses to request further authentication afterwards: in particular, SSH_MSG_USERAUTH_FAILURE with the «partial success» flag (see RFC4252 section 5.1) translates into PLUGIN_AUTH_SUCCESS.
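The translation of the server's verdict into a plugin message can be sketched as below. The function name plugin_outcome is hypothetical; the SSH message numbers 51 and 52 are the values assigned in RFC4252, and the plugin codes are the ones defined in section H.6.

```c
#include <assert.h>

#define SSH_MSG_USERAUTH_FAILURE 51   /* RFC4252 */
#define SSH_MSG_USERAUTH_SUCCESS 52   /* RFC4252 */
#define PLUGIN_AUTH_SUCCESS 6
#define PLUGIN_AUTH_FAILURE 7

/*
 * Client side: decide which message to send to the plugin when the
 * server ends a keyboard-interactive attempt. partial_success is the
 * boolean from SSH_MSG_USERAUTH_FAILURE and is ignored otherwise.
 * Returns -1 for any SSH message that does not end the attempt.
 */
int plugin_outcome(int ssh_msg_type, int partial_success)
{
    switch (ssh_msg_type) {
    case SSH_MSG_USERAUTH_SUCCESS:
        return PLUGIN_AUTH_SUCCESS;
    case SSH_MSG_USERAUTH_FAILURE:
        /* This method worked, even if the server wants more afterwards. */
        return partial_success ? PLUGIN_AUTH_SUCCESS : PLUGIN_AUTH_FAILURE;
    default:
        return -1;
    }
}
```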
The plugin's standard input will close when the client no longer requires the plugin's services, for any reason. This could be because authentication is complete (with overall success or overall failure), or because the user has manually aborted the session in mid-authentication, or because the client crashed.
H.6 Message formats
This section describes the format of every message in the protocol.
As described in section H.3, every message starts with the same two fields:
- uint32: overall length of the message
- byte: message type.
The length field does not include itself, but does include the type code.
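A minimal sketch of this framing, working on in-memory buffers, might look as follows. The names frame_message and unframe_message are invented for this example; a real plugin would pair something like this with read and write calls on its standard input and output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write length, type code and body into out, which must have room for
 * 5 + body_len bytes. Returns the total number of bytes written. */
size_t frame_message(unsigned char *out, unsigned char type,
                     const unsigned char *body, uint32_t body_len)
{
    assert(body_len < UINT32_MAX);   /* the length field holds body_len + 1 */
    uint32_t len = body_len + 1;     /* counts the type code, not itself */
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    out[4] = type;
    if (body_len > 0)
        memcpy(out + 5, body, body_len);
    return 5 + (size_t)body_len;
}

/* Split one message off the front of a buffer holding avail bytes.
 * Returns the number of bytes consumed, or 0 if the buffer does not
 * contain a complete, well-formed message. *body points into in. */
size_t unframe_message(const unsigned char *in, size_t avail,
                       unsigned char *type,
                       const unsigned char **body, uint32_t *body_len)
{
    if (avail < 5)
        return 0;
    uint32_t len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                   ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    if (len < 1 || len > avail - 4)
        return 0;                    /* no type code, or truncated */
    *type = in[4];
    *body = in + 5;
    *body_len = len - 1;
    return 4 + (size_t)len;
}
```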
The following subsections each give the format of the remainder of the message, after the type code.
The type codes themselves are defined here:
#define PLUGIN_INIT 1
#define PLUGIN_INIT_RESPONSE 2
#define PLUGIN_PROTOCOL 3
#define PLUGIN_PROTOCOL_ACCEPT 4
#define PLUGIN_PROTOCOL_REJECT 5
#define PLUGIN_AUTH_SUCCESS 6
#define PLUGIN_AUTH_FAILURE 7
#define PLUGIN_INIT_FAILURE 8
#define PLUGIN_KI_SERVER_REQUEST 20
#define PLUGIN_KI_SERVER_RESPONSE 21
#define PLUGIN_KI_USER_REQUEST 22
#define PLUGIN_KI_USER_RESPONSE 23
If this protocol is extended to be able to assist with further auth methods, their message type codes will also begin from 20, overlapping the codes for keyboard-interactive.
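Because the protocol is half-duplex, the type code of the last message is enough to tell which side speaks next. The sketch below encodes the "What happens next" rules from the subsections of H.6, assuming the negotiated method is keyboard-interactive (codes from 20 upwards could mean something else for a future method). The enum and function names are illustrative only.

```c
#include <assert.h>

#define PLUGIN_INIT 1
#define PLUGIN_INIT_RESPONSE 2
#define PLUGIN_PROTOCOL 3
#define PLUGIN_PROTOCOL_ACCEPT 4
#define PLUGIN_PROTOCOL_REJECT 5
#define PLUGIN_AUTH_SUCCESS 6
#define PLUGIN_AUTH_FAILURE 7
#define PLUGIN_INIT_FAILURE 8
#define PLUGIN_KI_SERVER_REQUEST 20
#define PLUGIN_KI_SERVER_RESPONSE 21
#define PLUGIN_KI_USER_REQUEST 22
#define PLUGIN_KI_USER_RESPONSE 23

enum next_sender { NEXT_CLIENT, NEXT_PLUGIN, NEXT_NOBODY };

/* Given the type of the message just transferred, say who sends next.
 * NEXT_CLIENT also covers "or the client terminates the session". */
enum next_sender next_sender_after(int last_type)
{
    switch (last_type) {
    case PLUGIN_INIT:               /* client messages: plugin must reply */
    case PLUGIN_PROTOCOL:
    case PLUGIN_KI_SERVER_REQUEST:
    case PLUGIN_KI_USER_RESPONSE:
        return NEXT_PLUGIN;
    case PLUGIN_INIT_RESPONSE:      /* plugin messages: client continues */
    case PLUGIN_PROTOCOL_ACCEPT:
    case PLUGIN_PROTOCOL_REJECT:
    case PLUGIN_KI_SERVER_RESPONSE:
    case PLUGIN_KI_USER_REQUEST:
    case PLUGIN_AUTH_SUCCESS:       /* client may send another PLUGIN_PROTOCOL */
    case PLUGIN_AUTH_FAILURE:
        return NEXT_CLIENT;
    case PLUGIN_INIT_FAILURE:       /* the session is over */
    default:                        /* unknown code: treated as an error here */
        return NEXT_NOBODY;
    }
}
```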
H.6.1 PLUGIN_INIT
Direction: client to plugin
When: the first message sent at connection startup
What happens next: the plugin will send PLUGIN_INIT_RESPONSE or PLUGIN_INIT_FAILURE
Message contents after the type code:
- uint32: the highest version number of this protocol that the client knows how to speak.
- string: the hostname of the server. This will be the logical hostname, in cases where it differs from the physical destination of the network connection. Whatever name would be used by the SSH client to cache the server's host key, that's the same name passed in this message.
- uint32: the port number on the server. (Together with the host name, this forms a primary key identifying a particular server. Port numbers may be vital because a single host can run two unrelated SSH servers with completely different authentication requirements, e.g. system sshd on port 22 and Gerrit on port 29418.)
- string: the username that the client will use to log in, if the plugin chooses not to override it. An empty string means that the client has no opinion about this (and might, for example, prompt the user).
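For illustration, a client could assemble a complete PLUGIN_INIT message roughly as follows. The function build_plugin_init, its use of NUL-terminated C strings, and the MAX_FIELD_LEN limit are all assumptions made for this sketch; none of them comes from the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLUGIN_INIT 1
#define MAX_FIELD_LEN 0xFFFFu   /* arbitrary sanity limit, not in the spec */

static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Build the whole message, length prefix included, in out.
 * Returns the number of bytes written, or 0 if it does not fit in cap.
 * An empty user string means the client has no opinion on the username. */
size_t build_plugin_init(unsigned char *out, size_t cap, uint32_t max_version,
                         const char *host, uint32_t port, const char *user)
{
    assert(out != NULL && host != NULL && user != NULL);
    size_t hlen = strlen(host), ulen = strlen(user);
    if (hlen > MAX_FIELD_LEN || ulen > MAX_FIELD_LEN)
        return 0;
    /* type(1) + version(4) + host(4+hlen) + port(4) + user(4+ulen) */
    size_t len = 1 + 4 + (4 + hlen) + 4 + (4 + ulen);
    if (cap < 4 + len)
        return 0;

    unsigned char *p = out;
    put_u32(p, (uint32_t)len);   p += 4;   /* overall length */
    *p++ = PLUGIN_INIT;                    /* message type */
    put_u32(p, max_version);     p += 4;
    put_u32(p, (uint32_t)hlen);  p += 4;
    memcpy(p, host, hlen);       p += hlen;
    put_u32(p, port);            p += 4;
    put_u32(p, (uint32_t)ulen);  p += 4;
    memcpy(p, user, ulen);       p += ulen;
    return (size_t)(p - out);
}
```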
H.6.2 PLUGIN_INIT_RESPONSE
Direction: plugin to client
When: response to PLUGIN_INIT
What happens next: the client will send PLUGIN_PROTOCOL, or perhaps terminate the session (if no auth method is ever negotiated that the plugin can help with)
Message contents after the type code:
- uint32: the version number of this protocol that the connection will use. Must be no greater than the max version number sent by the client in PLUGIN_INIT.
- string: the username that the plugin suggests the client use. An empty string means that the plugin has no opinion and the client should stick with the username it already had (or prompt the user, if it had none).
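On the client side, checking a PLUGIN_INIT_RESPONSE might look like the sketch below. The function name and its error policy are assumptions for illustration: it rejects a truncated body or a version above the client's maximum, and tolerates trailing bytes, which the specification does not address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * body/len: the message contents after the type code.
 * Returns 0 on success and fills in *version, *user and *user_len;
 * *user points into body and is not NUL-terminated. A zero *user_len
 * means the plugin has no opinion about the username.
 * Returns -1 if the body is truncated or the version is too high.
 */
int parse_init_response(const unsigned char *body, size_t len,
                        uint32_t client_max, uint32_t *version,
                        const unsigned char **user, uint32_t *user_len)
{
    assert(body != NULL);
    if (len < 8)
        return -1;                   /* need a uint32 and a string header */
    uint32_t v = get_u32(body);
    uint32_t ulen = get_u32(body + 4);
    if (ulen > len - 8)
        return -1;                   /* username runs past the end */
    if (v > client_max)
        return -1;                   /* plugin may not exceed our maximum */
    *version = v;
    *user = body + 8;
    *user_len = ulen;
    return 0;
}
```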
H.6.3 PLUGIN_INIT_FAILURE
Direction: plugin to client
When: response to PLUGIN_INIT
What happens next: the session is over
Message contents after the type code:
- string: an error message to present to the user indicating why the plugin was unable to start up.
H.6.4 PLUGIN_PROTOCOL
Direction: client to plugin
When: sent after PLUGIN_INIT_RESPONSE, or after a previous auth phase terminates with PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE
What happens next: the plugin will send PLUGIN_PROTOCOL_ACCEPT or PLUGIN_PROTOCOL_REJECT
Message contents after the type code:
- string: the SSH protocol id of the auth method the client intends to attempt. Currently the only method specified for use in this protocol is «keyboard-interactive».
H.6.5 PLUGIN_PROTOCOL_REJECT
Direction: plugin to client
When: sent after PLUGIN_PROTOCOL
What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session
Message contents after the type code:
- string: an error message to present to the user, explaining why the plugin cannot help with this authentication protocol.
An example might be «unable to open <config file>: <OS error message>», if the plugin depends on some configuration that the user has not set up.
If the plugin does not support this particular authentication protocol at all, this string should be left blank, so that no message will be presented to the user.
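A plugin that only knows keyboard-interactive could answer PLUGIN_PROTOCOL along these lines. The function name answer_protocol is hypothetical; the sketch accepts the one supported method and otherwise sends PLUGIN_PROTOCOL_REJECT with a blank string, so that the user sees no message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLUGIN_PROTOCOL_ACCEPT 4
#define PLUGIN_PROTOCOL_REJECT 5

/*
 * method/method_len: the string from the PLUGIN_PROTOCOL message.
 * Writes a complete reply (length prefix included) into out, which must
 * have room for at least 9 bytes, and returns the reply's size.
 */
size_t answer_protocol(const unsigned char *method, uint32_t method_len,
                       unsigned char *out)
{
    static const char ki[] = "keyboard-interactive";
    assert(method != NULL && out != NULL);

    memset(out, 0, 9);
    if (method_len == sizeof ki - 1 && memcmp(method, ki, method_len) == 0) {
        out[3] = 1;                       /* length: just the type code */
        out[4] = PLUGIN_PROTOCOL_ACCEPT;  /* no further contents */
        return 5;
    }
    out[3] = 5;                           /* type code + empty string */
    out[4] = PLUGIN_PROTOCOL_REJECT;
    /* out[5..8] stay zero: a string of length 0, i.e. no error text */
    return 9;
}
```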
H.6.6 PLUGIN_PROTOCOL_ACCEPT
Direction: plugin to client
When: sent after PLUGIN_PROTOCOL
What happens next: depends on the auth protocol agreed on. For keyboard-interactive, the client will send PLUGIN_KI_SERVER_REQUEST or PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE. No other method is specified.
Message contents after the type code: none.
H.6.7 PLUGIN_KI_SERVER_REQUEST
Direction: client to plugin
When: sent after PLUGIN_PROTOCOL_ACCEPT, or after a previous PLUGIN_KI_SERVER_RESPONSE, when the SSH server has sent SSH_MSG_USERAUTH_INFO_REQUEST
What happens next: the plugin will send either PLUGIN_KI_USER_REQUEST or PLUGIN_KI_SERVER_RESPONSE
Message contents after the type code: the exact contents of the SSH_MSG_USERAUTH_INFO_REQUEST just sent by the server. See RFC4256 section 3.2 for details. The summary:
- string: name of this prompt collection (e.g. to use as a dialog-box title)
- string: instructions to be displayed before this prompt collection
- string: language tag (deprecated)
- uint32: number of prompts in this collection
- That many copies of:
  - string: prompt (in UTF-8)
  - boolean: whether the response to this prompt is safe to echo to the screen
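The layout summarised above can be walked with a small bounds-checked cursor. This is a sketch only: struct cursor, struct ki_prompt and parse_ki_request are names invented here, and the caller-supplied prompt limit is an assumption rather than anything the protocol requires. Any non-zero boolean byte is read as true, as in RFC4251.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read position plus a sticky "ok" flag that is cleared on any overrun. */
struct cursor { const unsigned char *p; size_t left; int ok; };

static uint32_t cur_u32(struct cursor *c)
{
    if (!c->ok || c->left < 4) { c->ok = 0; return 0; }
    uint32_t v = ((uint32_t)c->p[0] << 24) | ((uint32_t)c->p[1] << 16) |
                 ((uint32_t)c->p[2] << 8) | (uint32_t)c->p[3];
    c->p += 4; c->left -= 4;
    return v;
}

static const unsigned char *cur_string(struct cursor *c, uint32_t *len)
{
    uint32_t n = cur_u32(c);
    if (!c->ok || c->left < n) { c->ok = 0; *len = 0; return NULL; }
    const unsigned char *s = c->p;
    c->p += n; c->left -= n; *len = n;
    return s;
}

static int cur_bool(struct cursor *c)
{
    if (!c->ok || c->left < 1) { c->ok = 0; return 0; }
    int b = c->p[0] != 0;
    c->p += 1; c->left -= 1;
    return b;
}

struct ki_prompt { const unsigned char *text; uint32_t len; int echo; };

/* Parse the contents after the type code. Fills prompts[] (pointing
 * into body) and returns the prompt count, or -1 if the body is
 * malformed or has more than max_prompts prompts. */
int parse_ki_request(const unsigned char *body, size_t len,
                     struct ki_prompt *prompts, int max_prompts)
{
    struct cursor c = { body, len, 1 };
    uint32_t ignored;
    assert(body != NULL && prompts != NULL);
    cur_string(&c, &ignored);            /* name */
    cur_string(&c, &ignored);            /* instructions */
    cur_string(&c, &ignored);            /* language tag (deprecated) */
    uint32_t count = cur_u32(&c);
    if (!c.ok || max_prompts < 0 || count > (uint32_t)max_prompts)
        return -1;
    for (uint32_t i = 0; i < count; i++) {
        prompts[i].text = cur_string(&c, &prompts[i].len);
        prompts[i].echo = cur_bool(&c);
    }
    return c.ok ? (int)count : -1;
}
```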
H.6.8 PLUGIN_KI_SERVER_RESPONSE
Direction: plugin to client
When: response to PLUGIN_KI_SERVER_REQUEST, perhaps after one or more intervening pairs of PLUGIN_KI_USER_REQUEST and PLUGIN_KI_USER_RESPONSE
What happens next: the client will send a further PLUGIN_KI_SERVER_REQUEST, or PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE
Message contents after the type code: the exact contents of the SSH_MSG_USERAUTH_INFO_RESPONSE that the client should send back to the server. See RFC4256 section 3.4 for details. The summary:
- uint32: number of responses (must match the «number of prompts» field from the corresponding server request)
- That many copies of:
  - string: response to the nth prompt (in UTF-8)
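Building the response contents is the mirror image of parsing the request. In the sketch below, build_ki_response and the MAX_ANSWER_LEN limit are assumptions for illustration, and the answers are taken as NUL-terminated C strings for brevity; the caller is responsible for passing as many answers as the request had prompts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ANSWER_LEN 0xFFFFu   /* arbitrary sanity limit, not in the spec */

static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Write the message contents after the type code into out.
 * Returns the number of bytes written, or 0 if an answer is too long
 * or the result does not fit in cap. */
size_t build_ki_response(unsigned char *out, size_t cap,
                         const char *const *answers, uint32_t count)
{
    assert(out != NULL);
    size_t need = 4;                         /* the response count */
    for (uint32_t i = 0; i < count; i++) {
        size_t len = strlen(answers[i]);
        if (len > MAX_ANSWER_LEN)
            return 0;
        need += 4 + len;
    }
    if (need > cap)
        return 0;

    unsigned char *p = out;
    put_u32(p, count);
    p += 4;
    for (uint32_t i = 0; i < count; i++) {
        size_t len = strlen(answers[i]);
        put_u32(p, (uint32_t)len);
        p += 4;
        memcpy(p, answers[i], len);
        p += len;
    }
    return need;
}
```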
H.6.9 PLUGIN_KI_USER_REQUEST
Direction: plugin to client
When: response to PLUGIN_KI_SERVER_REQUEST, if the plugin cannot answer the server's auth prompts without presenting prompts of its own to the user
What happens next: the client will send PLUGIN_KI_USER_RESPONSE
Message contents after the type code: exactly the same as in PLUGIN_KI_SERVER_REQUEST (see section H.6.7).
H.6.10 PLUGIN_KI_USER_RESPONSE
Direction: client to plugin
When: response to PLUGIN_KI_USER_REQUEST
What happens next: the plugin will send PLUGIN_KI_SERVER_RESPONSE, or another PLUGIN_KI_USER_REQUEST
Message contents after the type code: exactly the same as in PLUGIN_KI_SERVER_RESPONSE (see section H.6.8).
H.6.11 PLUGIN_AUTH_SUCCESS
Direction: client to plugin
When: sent after PLUGIN_KI_SERVER_RESPONSE, or (in unusual cases) after PLUGIN_PROTOCOL_ACCEPT
What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session
Message contents after the type code: none
H.6.12 PLUGIN_AUTH_FAILURE
Direction: client to plugin
When: sent after PLUGIN_KI_SERVER_RESPONSE, or (in unusual cases) after PLUGIN_PROTOCOL_ACCEPT
What happens next: the client will either send another PLUGIN_PROTOCOL or terminate the session
Message contents after the type code: none
H.7 References
[RFC4251] RFC 4251, «The Secure Shell (SSH) Protocol Architecture».
[RFC4252] RFC 4252, «The Secure Shell (SSH) Authentication Protocol».
[RFC4256] RFC 4256, «Generic Message Exchange Authentication for the Secure Shell Protocol (SSH)» (better known by its wire id «keyboard-interactive»).