Modern tools make it easy to transfer an EDI file to your partner over secure protocols like AS2, but troubleshooting a failed AS2 EDI transfer is often harder. With a basic idea of how an EDI transaction works over AS2, though, pinpointing and fixing the failure is not as hard as it seems.
Correctly identifying the location and nature of the failure is one key factor in troubleshooting a failed AS2 EDI transfer.
AS2 EDI transfers are transactional, meaning that the sender has to be fully aware that the recipient was able to interpret or understand the received EDI content. This is more advanced than normal AS2 transactionality, where it is enough for the receiver to have received the original content/file.
In the X12 flavor of EDI, a transaction/transfer is completed by a functional acknowledgement (code 997) EDI sent back by the receiver. This file contains the acceptance/rejection status of each EDI-level transaction set (ST), and an overall acceptance/rejection status for the complete document.
(This holds true for EDIs exchanged over any medium, e.g. VAN or SFTP - not just AS2.)
Accordingly, to declare a successful AS2 EDI transfer:
Note that each EDI file is a separate AS2 message transmission, so a successful EDI transaction would contain at least 2 AS2 transactions (one for the original file, and another for the EDI ACK).
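As a rough illustration of what the sender looks for in a returned 997, the sketch below pulls the per-transaction-set (AK5) and overall (AK9) acknowledgement codes out of a document. It assumes `*` as the element separator and `~` as the segment terminator (real interchanges declare their own delimiters in the ISA header); the function name `summarize_997` and the sample content are made up for illustration.

```python
def summarize_997(edi_text, element_sep="*", segment_term="~"):
    """Collect acknowledgement codes from a 997 (delimiters are assumed)."""
    set_statuses = []      # one entry per acknowledged transaction set (AK5)
    group_status = None    # overall status reported in the AK9 segment
    for raw in edi_text.split(segment_term):
        segment = raw.strip()
        if not segment:
            continue
        elements = segment.split(element_sep)
        if elements[0] == "AK5" and len(elements) > 1:
            set_statuses.append(elements[1])
        elif elements[0] == "AK9" and len(elements) > 1:
            group_status = elements[1]
    return {"transaction_sets": set_statuses, "group": group_status}


# Hypothetical 997 body: one 850 accepted ("A"), overall status accepted ("A").
sample = "ST*997*0001~AK1*PO*1~AK2*850*0001~AK5*A~AK9*A*1*1*1~SE*6*0001~"
print(summarize_997(sample))
```

A status of `R` in either position would indicate a rejection that needs follow-up, even though both AS2 transmissions themselves succeeded.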
We will be discussing EDI transaction-level failures separately, in a future article.
These are experienced by the recipient, generally when they are unable to decode or verify the data received from the sender.
If the receiving system cannot determine which of their partners is sending the message (AS2-From) or does not have an identity matching the intended recipient (AS2-To), it would reject the message.
Symptom (error):
The exact mechanism of rejection may vary across AS2 systems, but common behaviors are:
unknown AS2 association or cannot find partner/identity with identifier
Solution:
Compare the AS2 identifiers on both ends, and fix any mismatches, register missing identifiers, etc.
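When comparing identifiers by eye, subtle differences are easy to miss. The following is a minimal sketch of such a comparison, assuming each side's configuration has been copied into a small dictionary with hypothetical `as2_from` and `as2_to` keys. AS2 identifiers are generally treated as case-sensitive, so a difference in letter case is reported as a mismatch.

```python
def find_id_mismatches(local, partner_view):
    """Compare AS2 identifiers as configured on each side.

    Both arguments are dicts with "as2_from" and "as2_to" keys (a made-up
    shape for this sketch). Returns a list of human-readable findings.
    """
    findings = []
    for key in ("as2_from", "as2_to"):
        ours, theirs = local.get(key), partner_view.get(key)
        if ours is None or theirs is None:
            findings.append(f"{key}: missing on one side")
        elif ours == theirs:
            continue
        elif ours.strip() == theirs.strip():
            findings.append(f"{key}: differs only by surrounding whitespace")
        elif ours.strip().lower() == theirs.strip().lower():
            findings.append(f"{key}: differs by letter case (after trimming)")
        else:
            findings.append(f"{key}: {ours!r} != {theirs!r}")
    return findings
```

An empty result means the two sides agree; anything else points at the identifier that needs fixing or registering.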
During sending, the message is encrypted using the recipient’s public key/certificate as available to the sender (so that only the recipient can decode/decrypt the message, using their corresponding private key). Some common scenarios can render this encrypted message unreadable even for the recipient:
Symptom:
Recipient’s AS2 system may produce a negative MDN with an error similar to:
Expected ... recipient with certificate serial number NNN issued by CN=ReceiverCN,...
However, the received message was intended for Serial number XYZ issued by CN=SomeOtherCN,...
Solution:
Compare the encryption certificate on sender, with the identity/decryption certificate of the receiver. The combination ({Issuer’s distinguished name (DN)}, {certificate’s serial number}) must match on both ends, but usually checking the serial number is sufficient.
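Different tools display the same serial number in different ways (with or without colons, upper or lower case, with a leading zero or `0x` prefix), which can make a matching pair look different. The sketch below normalizes such strings before comparing; it assumes the serial numbers are shown in hexadecimal, and the function names are illustrative only.

```python
def normalize_serial(serial):
    """Turn a hex serial such as '0A:1B:2C', '0x0a1b2c' or 'a1b2c' into an int."""
    text = serial.strip().lower().replace(":", "").replace(" ", "")
    if text.startswith("0x"):
        text = text[2:]
    return int(text, 16)


def same_certificate(issuer_a, serial_a, issuer_b, serial_b):
    """Compare the (issuer DN, serial number) pair from both ends."""
    # Issuer DNs are compared as plain strings after trimming; DN formatting
    # can also vary between tools, which this sketch does not handle.
    return (issuer_a.strip() == issuer_b.strip()
            and normalize_serial(serial_a) == normalize_serial(serial_b))
```

If a tool prints the serial in decimal instead, it would have to be converted first; this sketch does not attempt to guess the base.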
This is technically the same as the earlier issue. It usually happens when the certificate is renewed after/close to its expiry, so the sender may also become aware of it (by noticing the past/approaching expiration date).
In the longer term, it may be beneficial to have certificate expiration alerts set up, for early detection and avoidance of such issues.
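A minimal sketch of such an expiration check is shown below, using only Python's standard library. It assumes the certificate's "not after" date is available in the textual form printed by common TLS tools; the 30-day warning window and the function names are arbitrary choices for illustration.

```python
import ssl


def days_until_expiry(not_after, now_epoch):
    """Days left before a certificate's notAfter date.

    not_after uses the form 'Jan  5 09:34:43 2018 GMT';
    now_epoch is the current time in seconds since the epoch.
    """
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    return (expiry_epoch - now_epoch) / 86400


def needs_alert(not_after, now_epoch, warn_days=30):
    """True when the certificate expires within warn_days (or already has)."""
    return days_until_expiry(not_after, now_epoch) <= warn_days
```

In a scheduled job, `needs_alert(not_after, time.time())` could be evaluated for each partner certificate, with the result feeding whatever alerting channel is already in place.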
Generally, if the AS2 payload (HTTP request body) is damaged or truncated during transmission, decryption will fail at the recipient. (Even in the unlikely event that decryption succeeds, signature verification would fail later in the process.)
Symptom:
Negative MDN, with an error similar to premature end of file
or invalid content
Solution:
1. If the transmission had a Content-Length header, compare it with the length/size of the actual payload recorded by the receiver.
2. Retry the transmission, possibly using AS2 restart if the payload is large (several MB/GB range).
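The length comparison can be sketched as follows, assuming the receiver has logged the request headers as a dictionary and the raw body as bytes (both hypothetical inputs; actual log formats differ across AS2 systems).

```python
def check_payload_length(headers, received_body):
    """Compare the declared Content-Length with the bytes actually received."""
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-length")
    if declared is None:
        return "no Content-Length header (possibly chunked transfer)"
    expected, actual = int(declared), len(received_body)
    if actual < expected:
        return f"truncated: got {actual} of {expected} bytes"
    if actual > expected:
        return f"longer than declared: got {actual}, expected {expected}"
    return "length matches"
```

A matching length does not prove the content is intact, but a shorter body is a strong hint that the transfer was cut off.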
During sending, sender signs the message using their own private key (so that the recipient can verify the integrity of the transferred content/file, using sender’s public key/certificate, and authenticate that it actually came from the claimed sender). This process can fail due to several reasons:
Usually, if the issuer/trust chain of sender’s certificate is not available in the recipient’s trust store, recipient’s AS2 system would refuse to trust the signature generated by sender - causing the verification to fail or be skipped.
(In case the sender’s certificate is self-signed, whereby issuer is same as the subject (sender), a copy of that certificate itself needs to be present in the recipient’s trust store.)
Symptom:
Negative MDN, with an error similar to trust anchor for certification path not found
Solution:
If the certificate chain is sender -> issuerA -> issuerB -> issuerC and receiver already trusts issuerB, adding just issuerA would complete the chain.
In this case, even though the certificate itself is trusted, authentication will fail because the recipient is expecting the sender to use a different certificate in its signatures.
Symptom:
Negative MDN, with an error similar to:
signature certificate <sender's actual certificate> does not match verification certificate <certificate expected by recipient>
Solution:
Confirm with the sender and assign their correct certificate as the verification certificate on the recipient’s end.
Sender’s signature contains a digest/hash of the original content/file. Receiver calculates the digest over the same content. If these two digests do not match, receiver cannot trust the integrity of the content that they got after decoding.
Symptom:
Negative MDN, with an error similar to:
message-digest does not match calculated value
Solution:
Content-Transfer-Encoding) in their AS2/MIME systems.
This is somewhat uncommon, but may come up if the payload transfer is corrupted/interrupted.
Symptom/Solution:
Same as “Transferred payload is corrupted” scenario; error message may differ, such as decompression failed
These are faced by the sender, when their message either fails to transmit to the receiving system, or fails processing at the recipient’s end.
AS2 applications usually isolate such failed messages for easy identification, automatically retry the transmissions where applicable, and alert the system administrators in case of permanent failures.
AS2 operates over the unreliable HTTP protocol, so any error faced by the underlying network could affect the transmission of the message (request) and MDN (response; or another request, in case of asynchronous processing). (Even the most stable networks may encounter such issues randomly.)
Symptom:
Message reports a network-related error (see below).
Solution:
Based on the error category, check the network health and retry the transmission.
Some applications may also provide independent connectivity test options for quick troubleshooting - instead of having to submit a test message to re-check connectivity every time.
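Where no such built-in option exists, a plain TCP connection attempt can serve as a rough substitute. The sketch below is one way to do that with Python's standard library; the category labels and function names are made up for this example, and it only checks that the host and port are reachable, not that an AS2 service is answering.

```python
import socket


def classify_connect_error(exc):
    """Map a socket-level exception to a rough failure category."""
    if isinstance(exc, socket.gaierror):
        return "unknown host"          # hostname could not be resolved
    if isinstance(exc, socket.timeout):
        return "connect timeout"       # no answer within the time limit
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"    # host reachable, nothing listening
    if isinstance(exc, OSError):
        return "network error"         # e.g. no route to host
    return "unexpected error"


def probe(host, port, timeout=5.0):
    """Try a plain TCP connection; return 'ok' or a failure category."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)
```

For example, `probe("as2.acme.com", 443)` would return a category string without sending any AS2 data (the hostname here is a placeholder).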
Connection was interrupted while data was being sent; usually there is no guarantee whether the remote system received the message completely, so it is necessary to confirm with the receiver whether the message was processed successfully on their end - before resending it.
Sender could not connect to the receiver’s remote endpoint/URL at all. Note that this is different from the connection time-out explained above. A connect-timeout is most commonly due to:
Since it is guaranteed that the recipient did not receive any data, it is safe to retry/resend the message - once the connectivity issue is resolved.
Sender was unable to resolve the internet (IP) address of the receiver URL. Most common cause is an invalid hostname in the receiver URL; double-check the URL.
Sender could resolve the hostname in receiver’s URL, but could not find a network route/path to reach it. Commonly happens when either the sender or the receiver is in a private network with no internet access/routes. Revise network routing to ensure that the network where the sender application is hosted is able to reach the receiver application’s network.
There are some IP ranges/classes explicitly used for private networks; if the receiver URL points to one of these and the sender is not part of the same private network, that is almost certainly the problem.
If the URL contains a hostname, the corresponding IP address can be found by running an IP resolution program within the sender’s network, e.g. nslookup {receiver.hostname}
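The private-range check can also be scripted. The following sketch uses Python's standard `ipaddress` and `socket` modules; the helper names are illustrative, and `resolve_and_check` must be run from within the sender's network for the resolution result to be meaningful.

```python
import ipaddress
import socket


def is_private_address(ip_text):
    """True if the address is in a range reserved for private/local use."""
    return ipaddress.ip_address(ip_text).is_private


def resolve_and_check(hostname):
    """Resolve a hostname (IPv4) and report whether it is a private address."""
    ip_text = socket.gethostbyname(hostname)
    return ip_text, is_private_address(ip_text)
```

Note that `is_private` also returns true for loopback addresses such as 127.0.0.1, which are equally unreachable from a remote sender.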
When the remote URL is a secure/HTTPS one (with https:// protocol prefix), the transmission can start only if the sender can create a secure connection to the receiver endpoint by means of a TLS handshake. This handshake can fail due to various reasons:
Symptom:
Certificate or hostname-verification issues may be directly visible in the connection failure error description, but in most cases sender system may only issue a generic “handshake failure” or “failed to establish secure connection” message.
To identify the underlying cause, it may be necessary to run third-party tools like openssl s_client, to attempt a handshake independently, against the receiver endpoint.
Solution:
Note that in most cases, legacy server-side configurations would not be allowed to change (to avoid breaking of working connections from other legacy clients); so, although it is non-ideal from a security perspective, it may be necessary for the client to relax its TLS configurations (such as minimum allowed TLS version).
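An independent handshake attempt can also be scripted. The sketch below uses Python's standard `ssl` module to build a client context with an explicit minimum TLS version and to attempt a handshake; the function names are illustrative, and lowering the minimum version below TLS 1.2 should be treated as a last resort for the reasons given above.

```python
import socket
import ssl


def build_client_context(minimum_version=ssl.TLSVersion.TLSv1_2):
    """Default client context with an explicit minimum TLS version."""
    context = ssl.create_default_context()  # verifies certificate and hostname
    context.minimum_version = minimum_version
    return context


def probe_tls(host, port=443, context=None, timeout=10.0):
    """Attempt a handshake; return (TLS version, cipher name) on success."""
    context = context or build_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

A failed handshake surfaces as an `ssl.SSLError` (or its subclass `ssl.SSLCertVerificationError` for certificate and hostname problems), whose message usually names the underlying cause more precisely than a generic "handshake failure".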
Sometimes the network transfer is successful, but the remote system returns an error response - indicating that the HTTP-level (application layer) transaction failed. These usually follow the standard HTTP error response code semantics; but some AS2 systems may just return a generic “server error” (500) code, in which case it may be necessary to manually check with the partner to determine the actual problem.
Here, one would need to check the response code (and possibly the reason phrase), headers, and the response body if available (whose format varies widely across AS2 implementations), to get an idea of the failure cause. A typical HTTP response contains these pieces arranged as follows:
HTTP/{protocol version} {response code} {reason phrase}
{response header 1}
{response header 2}
...
{response body}
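To make the layout concrete, here is a small sketch that splits a raw HTTP/1.x response into those pieces. It is a simplified parser for troubleshooting captured responses (no chunked encoding or repeated headers), and the function name is made up for this example.

```python
def parse_http_response(raw):
    """Split a raw HTTP/1.x response (bytes) into its main pieces."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, response code, optional reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"version": version, "code": code, "reason": reason,
            "headers": headers, "body": body}


# Hypothetical captured response
captured = (b"HTTP/1.1 403 Forbidden\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 6\r\n"
            b"\r\n"
            b"denied")
print(parse_http_response(captured)["code"])
```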
Note that any of these error codes might also indicate that the configured AS2 URL is incorrect. For example, if the URL points to a standard web server (e.g. the recipient’s company website) where POST requests, as used by AS2, are disallowed, instead of an AS2 acceptor endpoint, sender would receive HTTP 403 (Forbidden) responses.
The remote URL/system is asking you to send the message to a different location; the currently configured partner URL is somehow incorrect. Common causes include:
http:// prefix) when remote system expects HTTPS
www.as2.acme.com when just as2.acme.com is expected), or
The expected URL is usually sent back in a Location response header.
Remote AS2 system needs some form of authentication in order to accept/allow the request. Different systems expect auth in different forms, with the most common being an Authorization HTTP request header containing credentials - such as basic auth (encoded username/password) or a third-party issued token.
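For reference, a basic auth header is simply the Base64 encoding of `username:password` prefixed with `Basic`. The sketch below builds one; the credentials are placeholders, and how the header gets attached to the AS2 request depends on the AS2 application in use.

```python
import base64


def basic_auth_header(username, password):
    """Build an Authorization header for HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


print(basic_auth_header("user", "pass"))
# {'Authorization': 'Basic dXNlcjpwYXNz'}
```

Since Base64 is an encoding rather than encryption, such credentials are only protected in transit when the partner URL uses HTTPS.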
Another possible cause (esp. for 403) could be an application-level firewall blocking incoming requests; it is necessary to confirm with the partner that the sender’s system is allowed access under its current configuration (source IP address, user agent, etc.).
Sender’s message exceeds the maximum size that is acceptable by the receiver’s system. Some receivers may have different configurations (e.g. dedicated large-payload endpoints) to accept “large” messages. It is recommended to:
These are generic errors indicating that the receiver cannot accept/process the received message, because it is somehow malformed by the sender.
AS2-From header missing, or indicating an unknown partner).
Unless the reason is included in the error response body, it may be necessary to check the actual network/HTTP request received at the receiver, esp. in case of 400, to determine the cause.
AS2 service on remote system is currently not functioning, possibly due to a temporary downtime or maintenance. It may be advisable to retry the transmission later.
An intermediate routing/proxy server is failing to forward the sender’s request to the receiver, or return the receiver’s response back to the sender, within a reasonable time window/wait. Similar to above, this could indicate slowness or a downtime of the underlying AS2 service; retrying later may be a possible solution.
Similar to the generic client (sender) error 400, this is a generic error indicating that the server (receiver) could not process the received request/message for some reason - within the transmission window. It is not known whether the receiver managed to store the message and process it later on; so retrying is generally unsafe, and it may be necessary to confirm the final status of the message, out-of-band, before taking further steps.
Note that, esp. in case of transmissions involving async MDNs/processing, the partner may later manage to process such a message and return a successful MDN. While this is in violation of the AS2 specification, there are widely-adopted AS2 implementations that still behave in this manner.
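The guidance for the individual response codes can be condensed into a small lookup, sketched below. The mapping of specific status codes to each case is an assumption based on standard HTTP semantics (3xx redirects, 401/403 for access problems, 413 for oversized payloads, 503/504 for temporary unavailability), and the wording of the suggested actions is illustrative.

```python
def suggest_action(status_code):
    """Rough next step for an HTTP status returned to an AS2 sender."""
    if 200 <= status_code < 300:
        return "accepted at HTTP level; check the MDN for the final status"
    if status_code in (301, 302, 307, 308):
        return "update the partner URL (see the Location header)"
    if status_code in (401, 403):
        return "check credentials and partner-side access rules"
    if status_code == 413:
        return "reduce payload size or ask for a large-payload endpoint"
    if status_code in (503, 504):
        return "likely temporary; retry later"
    if 400 <= status_code < 500:
        return "inspect the request; the message may be malformed"
    if 500 <= status_code < 600:
        return "confirm message status with the partner before resending"
    return "unrecognized status; inspect the full response"
```

The important distinction is between codes where a blind retry is safe (temporary unavailability) and those where the receiver may already hold the message (generic server errors).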
If sender requested a synchronous MDN, receiver must send back the MDN in the same response. In this case, if the MDN is not available after the AS2 transfer completes, the transaction is effectively a failure.
Since the MDN was supposed to be sent back on the already established connection, a missing MDN could most commonly be due to:
Disposition-Notification-Options header, or is sending a Receipt-Delivery-Option async-MDN header while actually expecting a sync MDN
Note that if the receiver was not actually able to return the MDN to the sender due to some form of interruption, it would be marked as a message transmission failure - because the MDN is actually the response of the original AS2 transmission.
If sender requested an asynchronous MDN (by including a Receipt-Delivery-Option header in the AS2 transmission), and receiver indicated successful acceptance of the message (by returning an HTTP 200-range response), the transaction should be held as pending - until receiver sends back the MDN indicating final message processing status.
However, if the MDN does not reach the sender within an acceptable time frame (usually a few minutes, maybe an hour at most), it is possible that the async MDN delivery has somehow failed.
Symptoms and solutions are the same as those explained under connectivity issues and HTTP error responses, now being encountered by the receiver (“MDN sender”) against the MDN receiving endpoint of the sender (“MDN receiver”).
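When checking a captured AS2 request, the MDN mode that was actually requested can be read off its headers. The sketch below assumes the headers are available as a dictionary; it relies on the `Disposition-Notification-To` header (which, per the AS2 specification, is what requests an MDN in the first place) and on `Receipt-Delivery-Option` for the async case. The one-hour limit and the function names are illustrative choices.

```python
from datetime import datetime, timedelta


def requested_mdn_mode(request_headers):
    """Return 'none', 'sync' or 'async' based on the AS2 request headers."""
    names = {name.lower() for name in request_headers}
    if "disposition-notification-to" not in names:
        return "none"
    return "async" if "receipt-delivery-option" in names else "sync"


def is_mdn_overdue(sent_at, now, max_wait=timedelta(hours=1)):
    """True once a pending async MDN has waited longer than max_wait."""
    return now - sent_at > max_wait
```

A message whose request turns out to be in a different mode than the sender's configuration expects is a likely candidate for the "missing MDN" cases described above.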
Even if the network transfer and HTTP response indicate success, the actual processing status of the AS2 message can only be determined by the MDN receipt that the receiver sends back to the sender.
Errors returned in the MDN fall into the same categories described under incoming message failures.
As a verification step, receiver calculates a message integrity check code (MIC) over the received content and embeds it in the MDN that is sent back to the sender. If this value does not match the MIC calculated by the sender (against the originally sent content), it indicates that the transferred content was somehow modified/corrupted.
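Conceptually, the MIC is a Base64-encoded digest of the content, followed by the name of the algorithm used. The sketch below shows the idea using Python's standard library; it assumes both sides hash exactly the same bytes (which bytes are covered in a real exchange is defined by the AS2 specification and depends on whether the message is signed), and the function name is made up for this example.

```python
import base64
import hashlib


def compute_mic(content, algorithm="sha256"):
    """Base64 digest of content plus an algorithm label, MDN-style."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{base64.b64encode(digest).decode('ascii')}, {algorithm}"


original = b"ISA*00*...example payload..."
print(compute_mic(original) == compute_mic(original))          # same bytes: match
print(compute_mic(original) == compute_mic(original + b"\n"))  # altered: mismatch
```

Even a single added or changed byte, such as a converted line ending, produces a completely different value.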
Symptom:
Sender’s AS2 system reports an “unmatched MIC” error upon receipt of the MDN. Note that in this case, the MDN itself could be marked as successful; it is the sender that does the final MIC comparison and finds the mismatch.
Solution:
If this happens during initial connectivity testing, it could indicate a bad configuration; common scenarios include:
sha1 and md5 as MIC algorithms, but in most cases AS2 vendors prefer stronger algorithms like SHA-256 or SHA-512.
Disposition-Notification-Options header in the AS2 transmission; however the recipient may have ignored it and used a different/fixed algorithm to compute the returned MIC.
The MIC string is usually {MIC value}, {MIC algorithm name}; the variants in white-space usage and the algorithm name format/case could cause issues across implementations. For example, if sender expects sha-256 as the algorithm name (e.g. MIC foobar, sha-256) but receiver uses sha256 name (foobar, sha256), MIC may be marked as unmatched on sender’s side.
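To tell a formatting difference apart from a genuine mismatch, the two MIC strings can be compared after normalization. The sketch below is one tolerant way to do so, assuming the `{MIC value}, {MIC algorithm name}` layout; the function names are illustrative, and a real AS2 system may apply stricter or different rules.

```python
def parse_mic(mic_text):
    """Split '<value>, <algorithm>' into a normalized (value, algorithm)."""
    value, _, algorithm = mic_text.partition(",")
    # Drop whitespace, case and hyphens/underscores so 'SHA-256' == 'sha256'.
    normalized = algorithm.strip().lower().replace("-", "").replace("_", "")
    return value.strip(), normalized


def mic_matches(sent_mic, returned_mic):
    """Compare two MIC strings, ignoring formatting-only differences."""
    return parse_mic(sent_mic) == parse_mic(returned_mic)


print(mic_matches("foobar, sha-256", "foobar,sha256"))  # True
print(mic_matches("foobar, sha-256", "foobaz, sha256"))  # False
```

If the values match under this comparison but the AS2 system still reports an unmatched MIC, the difference is most likely one of formatting rather than of content.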
If there have been successful (MIC-matched) exchanges previously, an isolated occurrence (or series of occurrences) could actually indicate a corruption in the payload, or a man-in-the-middle (MITM) exploitation. Review all security parameters of the AS2 connection; esp. the algorithms and certificates used for encryption and signature. If there is a suspicion of the security keys being compromised, they would need to be rotated and re-configured on the remote end.
Receiver can optionally sign the returned MDN using their private key (for the sender to verify its authenticity, upon receipt - similar to how the receiver verifies the received AS2 message signature). If this verification fails, sender has no guarantee that the MDN is authentic (that it originated from the receiver) or intact (that it was not modified during transfer).
Symptom:
Sender’s AS2 system reports a signature verification failure for the received MDN. As in previous cases, the MDN itself could be marked as successful.
Solution:
Ensure that the receiver’s signature/verification certificate is configured correctly on sender’s side.
Janaka is a Software Architect at Aayu Technologies. He is experienced in diverse areas including enterprise integration, B2B communication, and cloud and serverless technologies; and has been involved in the design and implementation of almost every Aayu product. Any interesting bug will keep him up overnight, as will tea, movies, and music.