The AS2 file transfer protocol, defined by RFC 4130, is a secure file exchange protocol widely used for business-to-business data exchange. Its key features, such as end-to-end encryption, digital signatures and non-repudiation, have made the protocol popular in many industries, including retail (Walmart, Amazon, Target etc.), health (FDA, EMA) and logistics. You can read this article for a detailed explanation of how the AS2 protocol operates to provide these features.
Since AS2 transmission operates on top of the HTTP protocol, it is also vulnerable to the usual HTTP-related transmission failures, such as connection timeouts and socket timeouts. But unlike plain HTTP, the AS2 protocol's features make it possible to clearly identify almost all transmission failure scenarios, so the transmission can be retried.
In most cases where AS2 is in use, the files being exchanged are fairly small, usually in the kilobyte (kB) range or a few megabytes (MB) at most. In such scenarios, it is feasible to re-transmit the whole file when a transmission failure is identified. But in some cases, especially in the health domain, the AS2 protocol is used to exchange very large files that can be a few gigabytes (GB) in size. Because such files take much longer to transmit over HTTP, the possibility of a transmission failure increases. In such a scenario, re-transmitting the whole file is not favorable: it can run into the same failure again, and the repeated transmissions add significant network congestion and cost.
As a solution to the above issue, a new Internet Draft document named AS2 Restart for Very Large Messages was submitted. It introduces a mechanism to restart a failed transmission from the point of failure instead of re-transmitting the whole file. The mechanism relies on already existing HTTP headers, so any AS2 software that implements it remains backwards compatible with AS2 software that does not support AS2 restart.
From this point onwards, the terms AS2 client or client refer to the AS2 software that is sending the file, and the terms AS2 server or server refer to the AS2 software that is receiving the file. The term file refers to the content transferred over HTTP, after encrypting and/or signing the original file based on the trading partner's AS2 configuration.
In addition to the usual AS2 related headers, this mechanism utilizes the following two additional HTTP headers for its purpose.
ETag - The AS2 client will use this header to indicate the transfer ID associated with this transmission. The client must guarantee that the value of this transfer ID adheres to the following conditions, and any re-transmission attempts of the same file should contain the same transfer ID.
In HTTP, typically the hash of the file is used as the ETag. But since calculating the hash of large files is costly, the client has the freedom to use any other mechanism at its disposal (such as a database ID or a timestamp) to generate a valid transfer ID, as long as it conforms to the conditions mentioned above.
Content-Range - The AS2 client will use this header to indicate which part of the file (the byte range) is being transferred in this transmission attempt.
The AS2 server should utilize the values of ETag and Content-Range to temporarily cache the received file until the full file is transmitted, so that in case of a failure, the transmission can be restarted from the point of failure.
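To make the server-side caching more concrete, here is a minimal in-memory sketch of how a server could key partially received content by transfer ID and place incoming bytes according to Content-Range. The class and method names (PartialStore, received, store) are hypothetical and not taken from the draft or from any AS2 product. The sketch assumes the standard HTTP form of the header, bytes first-last/total, with zero-based offsets, and it rejects a range that would leave a hole in the cached data.

```python
import re

# Standard HTTP Content-Range form: "bytes <first>-<last>/<total>", zero-based.
_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


class PartialStore:
    """Hypothetical in-memory cache of partially received files."""

    def __init__(self):
        self._parts = {}  # transfer ID (ETag value) -> bytearray received so far

    def received(self, etag: str) -> int:
        """Number of bytes cached for this transfer ID (0 if unknown)."""
        return len(self._parts.get(etag, b""))

    def store(self, etag: str, body: bytes, content_range: str = None) -> int:
        """Cache the bytes that arrived in a POST and return the cached size."""
        if content_range is None:
            # No Content-Range: treat as a new file or an overwrite.
            self._parts[etag] = bytearray(body)
        else:
            match = _RANGE.fullmatch(content_range)
            if match is None:
                raise ValueError(f"unsupported Content-Range: {content_range!r}")
            start = int(match.group(1))
            buf = self._parts.setdefault(etag, bytearray())
            if start > len(buf):
                raise ValueError("range would leave a gap in the cached file")
            # Write at the given offset; overlapping bytes are overwritten.
            buf[start:start + len(body)] = body
        return len(self._parts[etag])
```

A real server would persist the partial content to disk and discard it after some expiry period; the dictionary here only keeps the example short.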
Usually in AS2 transmissions, a file is transferred via a POST request from the AS2 client to the AS2 server. But in the AS2 restart mechanism, an additional HEAD request is used by the client to query the server on how much content of the file has already been received from previous partial transmissions. Let's see how this is used during the initial transmission and during subsequent transmission restart attempts.
Initial request
- The client sends a HEAD request to the AS2 server with the above transfer ID as the value of the ETag header. This request asks the server whether it has already received (at least partially) a file with this transfer ID and, if so, how much content it has received.
- The server responds with a 200 status and a Content-Length header with the value 0. This response means that the server doesn't have any content of a file with this transfer ID.
- The client then sends the file through a POST request. In addition to the usual AS2 related headers, this request must contain the same ETag header sent in the previous HEAD request. Since this POST request is transferring a new file (or overwriting an existing file), it is not required to send the Content-Range header.
Although the draft states that the HEAD query should be used even with the initial transmission of a file, some AS2 software does not use it for initial transmissions. Most probably this is to avoid unnecessary delay and network overhead, as the client is aware that this is a fresh transmission and there is no information to get from the server via the HEAD query.
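The two requests of an initial transmission can be sketched with Python's standard urllib.request.Request, building the requests without sending them. The endpoint URL, the function names and the sample AS2 header values are assumptions made for illustration only. Note that urllib normalizes header names when storing them (ETag is kept as Etag), which is harmless because HTTP header names are case-insensitive.

```python
from urllib.request import Request

AS2_URL = "https://as2.example.com/receive"  # hypothetical server endpoint


def build_head_query(transfer_id: str) -> Request:
    """HEAD query asking how much of this transfer the server already holds."""
    return Request(AS2_URL, headers={"ETag": transfer_id}, method="HEAD")


def build_initial_post(transfer_id: str, payload: bytes, as2_headers: dict) -> Request:
    """POST carrying the whole file; no Content-Range on an initial transmission."""
    headers = dict(as2_headers)    # the usual AS2 related headers
    headers["ETag"] = transfer_id  # same transfer ID as in the HEAD query
    return Request(AS2_URL, data=payload, headers=headers, method="POST")


if __name__ == "__main__":
    head = build_head_query("transfer-0001")
    post = build_initial_post(
        "transfer-0001",
        b"...signed and encrypted AS2 payload...",
        {"AS2-From": "SenderID", "AS2-To": "ReceiverID"},
    )
    print(head.get_method(), head.header_items())
    print(post.get_method(), post.header_items())
```

Passing either object to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a server.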
Restart request
- The client sends a HEAD request to the AS2 server with the above transfer ID as the value of the ETag header.
- The server responds with a 200 status and a Content-Length header indicating how many bytes of this file have already been received by the server (let's indicate this value by n).
- The client sends the remaining content of the file (starting from byte n+1) through a POST request. In addition to the usual AS2 related headers, this request must contain the same ETag header sent in the previous HEAD request. Most importantly, it should also contain the Content-Range header indicating the range of bytes that the client expects to transfer through this request.
- [...] Content-Range header.
If the Content-Length value returned from the HEAD query equals the total file size, the client should still send at least one byte of data in the next POST request.
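As a small worked example of the restart arithmetic, the function below turns the Content-Length value returned by the HEAD query into a starting offset and a Content-Range value for the next POST. The function name is hypothetical. It assumes the standard zero-based HTTP form bytes first-last/total, so byte n+1 of the file has offset n. For the case where the server already holds every byte, it resends only the final byte, which is one possible way to satisfy the "at least one byte" rule and not necessarily what a given AS2 product does.

```python
def restart_range(received: int, total: int):
    """Return (start_offset, content_range) for a restart POST.

    received: Content-Length reported by the server for the HEAD query (n).
    total:    full size in bytes of the file being transferred.
    """
    if total <= 0 or not 0 <= received <= total:
        raise ValueError("received must be between 0 and total, and total > 0")
    # Normally continue right after the last cached byte. If everything is
    # already cached, step back one byte so the POST still carries data.
    start = min(received, total - 1)
    return start, f"bytes {start}-{total - 1}/{total}"


if __name__ == "__main__":
    print(restart_range(400, 1000))   # (400, 'bytes 400-999/1000')
    print(restart_range(1000, 1000))  # (999, 'bytes 999-999/1000')
```

The client would then send the file content from start_offset onwards as the body of the POST, together with the same ETag as before.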
In summary, when a very large file is sent from an AS2 client to a server (with both supporting the AS2 Restart feature), the transmission starts with an 'initial request' as described above, and is then re-attempted with one or more 'restart requests' until the file is fully received by the server.
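The whole cycle can be illustrated with a toy simulation in which an in-memory stand-in for the server accepts only a limited number of bytes per POST before the connection "drops". Everything here (FlakyServer, send_with_restart, the fail_after parameter) is hypothetical and exists only to show the query-then-resume loop; no real HTTP traffic or AS2 packaging is involved.

```python
class FlakyServer:
    """In-memory stand-in for an AS2 server that supports restart."""

    def __init__(self, fail_after: int):
        self.cache = {}               # transfer ID -> bytearray received so far
        self.fail_after = fail_after  # bytes accepted per POST before a drop

    def head(self, etag: str) -> int:
        """Bytes already held for this transfer ID (the Content-Length reply)."""
        return len(self.cache.get(etag, b""))

    def post(self, etag: str, start: int, chunk: bytes) -> None:
        """Store what arrives; raise if the simulated connection drops."""
        buf = self.cache.setdefault(etag, bytearray())
        accepted = chunk[: self.fail_after]
        buf[start:start + len(accepted)] = accepted
        if len(accepted) < len(chunk):
            raise ConnectionError("simulated connection drop")


def send_with_restart(server, etag: str, payload: bytes, max_attempts: int = 10) -> int:
    """Query with HEAD, resume with POST, repeat; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        received = server.head(etag)
        # Always send at least one byte, even if the server has everything.
        start = min(received, len(payload) - 1)
        try:
            server.post(etag, start, payload[start:])
            return attempt
        except ConnectionError:
            continue  # restart from wherever the server got to
    raise RuntimeError("transfer did not complete")


if __name__ == "__main__":
    data = bytes(range(256)) * 4  # 1024 bytes of sample content
    server = FlakyServer(fail_after=300)
    attempts = send_with_restart(server, "transfer-0001", data)
    print(attempts, bytes(server.cache["transfer-0001"]) == data)  # 4 True
```

With 1024 bytes and at most 300 bytes accepted per attempt, the first three POSTs fail partway and the fourth delivers the remaining 124 bytes, so only the missing content is re-sent each time.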
Udith is the Chief Technology Officer at Aayu Technologies. With over 9 years of experience in the enterprise software industry, he has been instrumental in architecting, developing, and maintaining a range of enterprise software solutions, particularly B2B communication software, with a significant focus on cloud technologies.