Troubleshooting AS2 EDI transfer issues | Aayu Technologies
Troubleshooting AS2 EDI transfer issues

Learn how to correctly identify and troubleshoot AS2 transfer issues, with symptoms like transmission errors, negative MDN receipts, etc.

01 Aug 2022

Modern tools make it quite easy to get your EDI file transferred to your partner over secure protocols like AS2, but sometimes the same cannot be said about troubleshooting the flow when your AS2 EDI transfer has failed. But if you have a basic idea of how an EDI transaction works over AS2, pinpointing and fixing/remedying the failure is not as hard as it seems.

Correctly identifying the location and nature of the failure is a key factor in troubleshooting a failed AS2 EDI transfer.

Anatomy of an EDI transaction over AS2

AS2 EDI transfers are transactional, meaning that the sender has to be fully aware that the recipient was able to interpret or understand the received EDI content. This is more advanced than normal AS2 transactionality, where it is enough for the receiver to have received the original content/file.

In the X12 flavor of EDI, a transaction/transfer is completed by a functional acknowledgement (code 997) EDI sent back by the receiver. This file contains the acceptance/rejection status of each EDI-level transaction set (ST), and an overall acceptance/rejection status for the complete document.

(This holds true for EDIs exchanged over any medium, e.g. VAN or SFTP - not just AS2.)
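
As a rough illustration of how the statuses in a 997 could be read, the sketch below collects the per-transaction-set result (AK5 segments) and the overall result (AK9 segment). It assumes the common `~` segment terminator and `*` element separator, and the function name and sample data are made up for this example; real files declare their delimiters in the ISA header.

```python
def summarize_997(edi_text, segment_terminator="~", element_separator="*"):
    """Collect per-transaction-set (AK2/AK5) and group-level (AK9) statuses."""
    status_names = {
        "A": "accepted",
        "E": "accepted with errors",
        "P": "partially accepted",
        "R": "rejected",
    }
    summary = {"transaction_sets": [], "group_status": None}
    current_set = None
    for raw_segment in edi_text.split(segment_terminator):
        elements = raw_segment.strip().split(element_separator)
        tag = elements[0]
        if tag == "AK2":
            # AK2 opens the response for one transaction set (type + control number)
            current_set = {"type": elements[1], "control_number": elements[2]}
        elif tag == "AK5" and current_set is not None:
            # AK5 closes it with an acknowledgement code
            current_set["status"] = status_names.get(elements[1], elements[1])
            summary["transaction_sets"].append(current_set)
            current_set = None
        elif tag == "AK9":
            # AK9 carries the overall status for the functional group
            summary["group_status"] = status_names.get(elements[1], elements[1])
    return summary


# Hypothetical 997 body acknowledging two purchase orders, one of them rejected
sample_997 = (
    "ST*997*0001~AK1*PO*1~AK2*850*0001~AK5*A~"
    "AK2*850*0002~AK5*R~AK9*P*2*2*1~SE*8*0001~"
)
print(summarize_997(sample_997))
```

Unknown status codes are passed through unchanged, so less common values are still visible to whoever reads the summary.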

Accordingly, to declare a successful AS2 EDI transfer:

  • the AS2 transmissions (protocol level) should be successful, usually with positive MDN receipts.
  • the EDI transaction should be acknowledged with a success ACK (e.g. 997 in case of X12).

Note that each EDI file is a separate AS2 message transmission, so a successful EDI transaction would contain at least 2 AS2 transactions (one for the original file, and another for the EDI ACK).

We will be discussing EDI transaction-level failures separately, in a future article.

Types of AS2 failures

Inbound failures

These are experienced by the recipient, generally when they are unable to decode or verify the data received from the sender.

Identification/correlation issues

If the receiving system cannot determine which of their partners is sending the message (AS2-From) or does not have an identity matching the intended recipient (AS2-To), it would reject the message.

Symptom (error):

The exact mechanism of rejection may vary across AS2 systems, but common behaviors are:

  • receiver returns an HTTP error code (e.g. 422, “Unprocessable Entity”) to sender, with a body detailing the error (and may not save the message on their end)
  • receiver sends back a negative receipt (error MDN) to the sender, with a message such as unknown AS2 association or cannot find partner/identity with identifier

Solution:

Compare the AS2 identifiers on both ends, and fix any mismatches, register missing identifiers, etc.
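
The comparison can be sketched as a simple lookup. This is a minimal example, not any particular product's logic: the function name, the plain-dict headers and the two identifier sets are assumptions. Identifiers are compared exactly, since AS2 identifiers are case-sensitive.

```python
def check_as2_identifiers(headers, local_ids, known_partners):
    """Return a list of identification problems for an inbound AS2 request."""
    problems = []
    # Identifiers may arrive wrapped in double quotes; strip those before comparing
    as2_from = headers.get("AS2-From", "").strip().strip('"')
    as2_to = headers.get("AS2-To", "").strip().strip('"')
    if not as2_from:
        problems.append("AS2-From header missing")
    elif as2_from not in known_partners:
        problems.append(f"unknown partner: {as2_from}")
    if not as2_to:
        problems.append("AS2-To header missing")
    elif as2_to not in local_ids:
        problems.append(f"no local identity: {as2_to}")
    return problems


# A case difference alone is enough to break the association
print(check_as2_identifiers({"AS2-From": "acme", "AS2-To": "MyCo"}, {"MyCo"}, {"ACME"}))
```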

Decryption failed

During sending, the message is encrypted using the recipient’s public key/certificate as available to the sender (so that only the recipient can decode/decrypt the message, using their corresponding private key). Some common scenarios can render this encrypted message unreadable even for the recipient:

Sender uses an incorrect certificate to encrypt the message

Symptom:

Recipient’s AS2 system may produce a negative MDN with an error similar to:

Expected ... recipient with certificate serial number NNN issued by CN=ReceiverCN,...
However, the received message was intended for Serial number XYZ issued by CN=SomeOtherCN,...

Solution:

Compare the encryption certificate configured on the sender with the identity/decryption certificate of the receiver. The combination ({Issuer’s distinguished name (DN)}, {certificate’s serial number}) must match on both ends, but usually checking the serial number is sufficient.
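
Tools display serial numbers in different ways (colon-separated hex, upper or lower case, a `0x` prefix), which can make two identical certificates look different. The sketch below normalizes the serial before comparing; it assumes the serial is given as hex text or as an integer, and it compares issuer DNs as plain strings, which is a simplification.

```python
def normalize_serial(serial):
    """Reduce a displayed serial number to an integer.

    Assumes hex text (optionally with ':' or space separators, or a 0x
    prefix) unless an int is passed in directly.
    """
    if isinstance(serial, int):
        return serial
    cleaned = serial.replace(":", "").replace(" ", "").lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    return int(cleaned, 16)


def same_certificate(issuer_a, serial_a, issuer_b, serial_b):
    """Compare the (issuer DN, serial number) pair from both ends."""
    return (issuer_a.strip() == issuer_b.strip()
            and normalize_serial(serial_a) == normalize_serial(serial_b))


print(same_certificate("CN=Test CA", "0A:1B:2C", "CN=Test CA", "0a1b2c"))
```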

Recipient’s certificate has been renewed/rotated

This is technically the same as the earlier issue. It usually happens when the certificate is renewed after/close to its expiry, so the sender may also become aware of it (by noticing the past/approaching expiration date).

In the longer term, it may be beneficial to have certificate expiration alerts set up, for early detection and avoidance of such issues.
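
A basic expiration check could look like the following sketch. It assumes the expiry is available in the `notAfter` text format that OpenSSL-based tools print (e.g. `Jan 31 00:00:00 2025 GMT`); the 30-day threshold and the function names are arbitrary choices for illustration.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days left until the certificate's notAfter timestamp."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def needs_renewal_alert(not_after, warn_days=30, now=None):
    """True once the certificate is within warn_days of expiring (or already expired)."""
    return days_until_expiry(not_after, now) <= warn_days


reference = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jan 31 00:00:00 2025 GMT", now=reference))
```

Running such a check on a schedule against both your own and your partners' certificates gives both sides time to exchange the renewed certificate before the old one lapses.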

Transferred payload is corrupted

Generally if the AS2 payload (HTTP request body) is damaged or truncated during transmission, decryption will fail at the recipient. (Even in the unlikely, rare chance of decryption being successful, signature verification is guaranteed to fail later in the process.)

Symptom:

Negative MDN, with an error similar to premature end of file or invalid content

Solution:

  • If the transmission had a Content-Length header, compare it with the length/size of the actual payload recorded by the receiver.
  • Retry the transmission, possibly using AS2 restart if the payload is large (several MB/GB range).

(Signature) Verification failed

During sending, sender signs the message using their own private key (so that the recipient can verify the integrity of the transferred content/file, using sender’s public key/certificate, and authenticate that it actually came from the claimed sender). This process can fail due to several reasons:

Recipient does not trust sender’s certificate

Usually, if the issuer/trust chain of sender’s certificate is not available in the recipient’s trust store, recipient’s AS2 system would refuse to trust the signature generated by sender - causing the verification to fail or be skipped.

(In case the sender’s certificate is self-signed, whereby issuer is same as the subject (sender), a copy of that certificate itself needs to be present in the recipient’s trust store.)

Symptom:

Negative MDN, with an error similar to trust anchor for certification path not found

Solution:

  1. Import sender’s certificate (if self-signed), or a sufficient portion of its issuer chain, to the recipient’s trust store - in order to complete the trust chain. For example, if sender’s certificate chain is sender -> issuerA -> issuerB -> issuerC and receiver already trusts issuerB, adding just issuerA would complete the chain.
  2. If available, run a trust anchor validation on the receiver’s side upon the imported certificate, to ensure that it is trusted by the system.
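
The chain-completion rule from step 1 can be sketched with plain names standing in for certificates. This is an illustration only: a real trust store matches certificates by subject, issuer and key, and the function name is hypothetical.

```python
def certificates_to_import(chain, trusted):
    """List what must be imported so the chain reaches a trusted entry.

    chain is ordered leaf-first, e.g. ["sender", "issuerA", "issuerB", "issuerC"].
    """
    if chain[0] in trusted:
        return []
    missing = []
    for issuer in chain[1:]:
        if issuer in trusted:
            return missing  # chain is anchored here; nothing further is needed
        missing.append(issuer)
    # No trusted issuer found: import the whole issuer chain, or the
    # certificate itself when it is self-signed (no issuers at all)
    return missing if missing else [chain[0]]


print(certificates_to_import(["sender", "issuerA", "issuerB", "issuerC"], {"issuerB"}))
```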

Recipient has assigned a different certificate as sender’s signature/verification certificate

In this case, even though the certificate itself is trusted, authentication will fail because the recipient is expecting the sender to use a different certificate in its signatures.

Symptom:

Negative MDN, with an error similar to: signature certificate <sender's actual certificate> does not match verification certificate <certificate expected by recipient>

Solution:

Confirm with the sender and assign their correct certificate as the verification certificate on the recipient’s end.

Message Digest mismatch

Sender’s signature contains a digest/hash of the original content/file. Receiver calculates the digest over the same content. If these two digests do not match, receiver cannot trust the integrity of the content that they got after decoding.

Symptom:

Negative MDN, with an error similar to: message-digest does not match calculated value

Solution:

  • This could indicate that the content was actually tampered with or corrupted in transit. If tampering is suspected, the sender may need to refresh their certificate/key material (and propagate the public portions again to their recipients), because the current ones may have been compromised.
  • In some cases, encoding issues (e.g. type of line endings in original vs. decoded content) can lead to digest mismatches, even for untampered content. If so, both sender and receiver need to recheck and compare the MIME content encoding parameters/mechanisms (e.g. Content-Transfer-Encoding) in their AS2/MIME systems.
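
The line-ending case is easy to reproduce. In the sketch below, the same text hashed with LF and with CRLF line endings gives two different digests, and converting both sides to CRLF (the canonical form for MIME text) makes them agree. The sample content and function names are made up for the example.

```python
import hashlib


def sha256_hex(content):
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


def to_crlf(content):
    """Convert bare LF or CR line endings to CRLF without doubling existing CRLFs."""
    return content.replace(b"\r\n", b"\n").replace(b"\r", b"\n").replace(b"\n", b"\r\n")


as_sent = b"line one\r\nline two\r\n"    # what the sender signed
as_decoded = b"line one\nline two\n"     # what the receiver ended up with

print(sha256_hex(as_sent) == sha256_hex(as_decoded))            # digests differ
print(sha256_hex(as_sent) == sha256_hex(to_crlf(as_decoded)))   # equal after canonicalization
```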

Decompression failed

This is somewhat uncommon, but may come up if the payload transfer is corrupted/interrupted.

Symptom/Solution:

Same as “Transferred payload is corrupted” scenario; error message may differ, such as decompression failed

Outbound failures

These are faced by the sender, when their message either fails to transmit to the receiving system, or fails processing at the recipient’s end.

AS2 applications usually isolate such failed messages for easy identification, automatically retry the transmissions where applicable, and automatically alert the system administrators in case of permanent failures.

Transmission/network errors

AS2 operates over HTTP, which provides no delivery guarantee of its own, so any error faced by the underlying network could affect the transmission of the message (request) and MDN (response; or another request, in case of asynchronous processing). (Even the most stable networks may encounter such issues randomly.)

Symptom:

Message reports a network-related error (see below).

Solution:

Based on the error category, check the network health and retry the transmission.

Some applications may also provide independent connectivity test options for quick troubleshooting - instead of having to submit a test message to re-check connectivity every time.

Connection timed out / closed unexpectedly / reset / broken pipe etc.

Connection was interrupted while data was being sent; usually there is no guarantee whether the remote system received the message completely, so it is necessary to confirm with the receiver whether the message was processed successfully on their end - before resending it.

Connect timed out / connection refused

Sender could not connect to the receiver’s remote endpoint/URL at all. Note that this is different from the connection time-out explained above. A connect-timeout is most commonly due to:

  • an invalid hostname or port number on the receiver URL,
  • a firewall blocking connections between the sender and receiver, or
  • a temporary downtime/maintenance of the receiver endpoint

Since it is guaranteed that the recipient did not receive any data, it is safe to retry/resend the message - once the connectivity issue is resolved.
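
The difference between the two timeout situations matters mostly for deciding whether a resend is safe. A minimal sketch of that decision follows, assuming the sending code tracks whether any request bytes were written before the failure (the `data_sent` flag and the returned labels are hypothetical).

```python
import socket


def retry_advice(error, data_sent):
    """Suggest a next step for a failed transmission attempt."""
    timeouts = (TimeoutError, socket.timeout)
    if isinstance(error, socket.gaierror):
        return "fix-url"  # hostname could not be resolved
    if not data_sent and isinstance(error, (ConnectionRefusedError,) + timeouts):
        return "retry-when-reachable"  # nothing reached the receiver
    if isinstance(error, (ConnectionResetError, BrokenPipeError) + timeouts):
        return "confirm-with-receiver"  # receiver may hold a full or partial copy
    return "investigate"


print(retry_advice(ConnectionRefusedError(), data_sent=False))
print(retry_advice(ConnectionResetError(), data_sent=True))
```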

Unknown host / resolution failed

Sender was unable to resolve the internet (IP) address of the receiver URL. Most common cause is an invalid hostname in the receiver URL; double-check the URL.

No route to host

Sender could resolve the hostname in receiver’s URL, but could not find a network route/path to reach it. Commonly happens when either the sender or the receiver is in a private network with no internet access/routes. Revise network routing to ensure that the network where sender application is hosted, is able to reach the receiver application’s network.

Certain IP ranges are reserved for private networks (for IPv4: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16); if the receiver URL points to one of these and the sender is not on the same private network, that is very likely the problem. If the URL contains a hostname, the corresponding IP address can be found by running an IP resolution program within the sender’s network, e.g. nslookup {receiver.hostname}
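
The same check can be scripted with Python's standard library. This is a small sketch; `resolve_and_check` must be run from inside the sender's network to be meaningful, and both function names are made up for the example.

```python
import ipaddress
import socket


def is_private_address(address):
    """True if the IP address belongs to a private (non-internet-routable) range."""
    return ipaddress.ip_address(address).is_private


def resolve_and_check(hostname):
    """Resolve hostname from the current network and flag private addresses."""
    infos = socket.getaddrinfo(hostname, None)
    addresses = sorted({info[4][0] for info in infos})
    return {address: is_private_address(address) for address in addresses}


print(is_private_address("192.168.0.10"))
```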

TLS/SSL handshake error

When the remote URL is a secure/HTTPS one (with https:// protocol prefix), the transmission can start only if the sender can create a secure connection to the receiver endpoint by means of a TLS handshake. This handshake can fail due to various reasons:

  • Two parties are unable to agree upon a common TLS version or set of cipher suites/algorithms for the secure connection.
  • TLS/HTTPS certificate of the receiver (server) is not trusted by the sender (client) (“trust anchor not found” or “untrusted certificate”)
  • Hostname (or alternative domain name) in server’s TLS certificate does not match the actual domain name in the client side URL (“hostname verification failure”)
  • Server expects TLS client authentication (two-way handshake) but client does not support two-way SSL / client’s certificate cannot be trusted by the server
  • In rare cases, intermediate proxies/firewalls may drop certain packets related to secure/TLS communication (theoretically this can happen on any connection, but is more commonly visible in TLS)

Symptom:

Certificate or hostname-verification issues may be directly visible in the connection failure error description, but in most cases sender system may only issue a generic “handshake failure” or “failed to establish secure connection” message.

To identify the underlying cause, it may be necessary to run third-party tools like openssl s_client, to attempt a handshake independently, against the receiver endpoint.
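
Where openssl is not at hand, a comparable probe can be sketched with Python's ssl module. The function names and the returned category labels are assumptions of this example; it only separates certificate/hostname problems from other handshake failures and from plain network errors.

```python
import socket
import ssl


def build_context(min_version=ssl.TLSVersion.TLSv1_2):
    """Client context that verifies certificates and hostnames."""
    context = ssl.create_default_context()
    context.minimum_version = min_version
    return context


def probe_tls(host, port=443, context=None, timeout=10):
    """Attempt a TLS handshake and report a coarse result."""
    context = context or build_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return {"ok": True, "version": tls.version(), "cipher": tls.cipher()[0]}
    except ssl.SSLCertVerificationError as exc:
        # untrusted issuer, expired certificate, hostname mismatch, ...
        return {"ok": False, "category": "certificate", "detail": exc.verify_message}
    except ssl.SSLError as exc:
        # e.g. no common protocol version or cipher suite
        return {"ok": False, "category": "handshake", "detail": str(exc)}
    except OSError as exc:
        return {"ok": False, "category": "network", "detail": str(exc)}
```

Calling `probe_tls` with the receiver's hostname and port, and repeating the call with a context built with a lower `min_version`, can help confirm whether a protocol version mismatch is the cause.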

Solution:

  • In case of a certificate/trust issue, the issuer (or a suitable parent from the trust chain) of the server certificate should be imported to the trust store of the client (or vice versa, in case of a client-auth issue).
  • A certificate hostname mismatch is a server-side misconfiguration; if server cannot be rectified, usually the client can explicitly disable hostname verification on its connections made to this URL (although it is highly discouraged).
  • Other TLS parameter mismatches, such as non-intersection of cipher suites, can usually be rectified by adjusting the client (or server) TLS configurations.

Note that in most cases, legacy server-side configurations would not be allowed to change (to avoid breaking of working connections from other legacy clients); so, although it is non-ideal from a security perspective, it may be necessary for the client to relax its TLS configurations (such as minimum allowed TLS version).

HTTP error responses

Sometimes the network transfer is successful, but the remote system returns an error response - indicating that the HTTP-level (application layer) transaction failed. These usually follow the standard HTTP error response code semantics; but some AS2 systems may just return a generic “server error” (500) code, in which cases it may be necessary to manually check with the partner to determine the actual problem.

Here, one would need to check the response code (and possibly the reason phrase), headers, and the response body if available (whose format varies widely across AS2 implementations), to get an idea of the failure cause. A typical HTTP response contains these pieces arranged as follows:

HTTP/{protocol version} {response code} {reason phrase}
{response header 1}
{response header 2}
...

{response body}
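
If only a raw response dump is available (for example from a log), the pieces can be separated as in the sketch below. It assumes the dump is text with CRLF line breaks and a single header block; the function name is hypothetical.

```python
def parse_http_response(raw):
    """Split a raw HTTP response into status line parts, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The reason phrase is optional, so pad to three parts
    version, code, reason = (status_line.split(" ", 2) + [""])[:3]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"version": version, "code": int(code), "reason": reason,
            "headers": headers, "body": body}


raw = ("HTTP/1.1 422 Unprocessable Entity\r\n"
       "Content-Type: text/plain\r\n"
       "\r\n"
       "unknown AS2 partner")
print(parse_http_response(raw)["code"])
```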

Note that any of these error codes might also be an indication that the configured AS2 URL is incorrect. For example, if the URL points to a standard web server (e.g. the recipient’s company website) instead of an AS2 acceptor endpoint, POST requests (as used by AS2) are usually disallowed there, and the sender would typically receive HTTP 403 (Forbidden) or 405 (Method Not Allowed) responses.

301 (Moved Permanently) / 302 (Found) / 303 (See Other) / 307 (Temporary Redirect) / 308 (Permanent Redirect)

The remote URL/system is asking you to send the message to a different location; the currently configured partner URL is somehow incorrect. Common causes include:

  • using HTTP protocol (http:// prefix) when remote system expects HTTPS
  • an invalid hostname (e.g. using www.as2.acme.com when just as2.acme.com is expected), or
  • missing a trailing slash on the URL.

The expected URL is usually sent back in a Location response header.

401 (Unauthorized) / 403 (Forbidden)

Remote AS2 system needs some form of authentication in order to accept/allow the request. Different systems expect auth in different forms, with the most common being an Authorization HTTP request header containing credentials - such as basic auth (encoded username/password) or a third-party issued token.

Another possible cause (esp. for 403) could be an application-level firewall blocking incoming requests; it is necessary to confirm with the partner that the sender’s system is allowed access under its current configuration (source IP address, user agent, etc.).
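
For the basic auth case, the Authorization header value is the word `Basic` followed by the base64 encoding of `username:password`. A short sketch, with placeholder credentials and a hypothetical function name:

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("user", "pass"))
```

Basic auth is only an encoding, not encryption, so it is normally used over HTTPS endpoints.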

413 (Payload Too Large)

Sender’s message exceeds the maximum size that is acceptable by the receiver’s system. Some receivers may have different configurations (e.g. dedicated large-payload endpoints) to accept “large” messages. It is recommended to:

  • confirm the file-sizing configuration with the receiver, out-of-band; or,
  • if/when possible, split the message into multiple smaller messages for sequential sending - assuming that the partner is capable of combining them as one unit during processing.

400 (Bad Request) / 422 (Unprocessable Entity)

These are generic errors indicating that the receiver cannot accept/process the received message, because it is somehow malformed by the sender.

  • 422 may indicate missing or invalid identifier associations (e.g. AS2-From header missing, or indicating an unknown partner).
  • 400, in addition to such AS2-specific issues, may also be due to general HTTP protocol-level issues; such as malformed request headers, or problems with the request payload length, encoding, etc.

Unless the reason is included in the error response body, it may be necessary to check the actual network/HTTP request received at the receiver (esp. in case of 400) to determine the cause.

503 (Service Unavailable)

AS2 service on remote system is currently not functioning, possibly due to a temporary downtime or maintenance. It may be advisable to retry the transmission later.

502 (Bad Gateway) / 504 (Gateway Timeout)

An intermediate routing/proxy server is failing to forward the sender’s request to the receiver, or return the receiver’s response back to the sender, within a reasonable time window/wait. Similar to above, this could indicate slowness or a downtime of the underlying AS2 service; retrying later may be a possible solution.

500 (Internal Server Error)

Similar to the generic client (sender) error 400, this is a generic error indicating that the server (receiver) could not process the received request/message for some reason - within the transmission window. It is not known whether the receiver managed to store the message and process it later on; so retrying is generally unsafe, and it may be necessary to confirm the final status of the message, out-of-band, before taking further steps.

Note that, esp. in case of transmissions involving async MDNs/processing, the partner may later manage to process such a message and return a successful MDN. While this is in violation of the AS2 specification, there are widely-adopted AS2 implementations that still behave in this manner.
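
The guidance for the individual response codes above can be condensed into a small lookup. The sketch below is only a summary aid; the function name and the wording of the returned actions are illustrative, not taken from any AS2 product.

```python
def triage_http_status(code, headers=None):
    """Map an HTTP status code to a coarse next step for the sender."""
    headers = {name.lower(): value for name, value in (headers or {}).items()}
    if 200 <= code < 300:
        return "accepted: wait for or inspect the MDN"
    if code in (301, 302, 303, 307, 308):
        target = headers.get("location", "the URL in the Location header")
        return f"fix partner URL: use {target}"
    if code in (401, 403):
        return "check credentials and firewall allow-lists with the partner"
    if code == 413:
        return "confirm size limits or split the message"
    if code in (400, 422):
        return "inspect the request: headers, identifiers, payload encoding"
    if code in (502, 503, 504):
        return "retry later"
    if code == 500:
        return "confirm message status out-of-band before retrying"
    return "unclassified: check the response body and ask the partner"


print(triage_http_status(301, {"Location": "https://as2.acme.com/"}))
```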

MDN pending/not received

In synchronous MDN mode

If sender requested a synchronous MDN, receiver must send back the MDN in the same response. In this case, if the MDN is not available after the AS2 transfer completes, the transaction is effectively a failure.

Since the MDN was supposed to be sent back on the already established connection, a missing MDN could most commonly be due to:

  • a misconfiguration of the MDN requesting mode at the sender; e.g. the sender is not explicitly requesting the MDN by means of a Disposition-Notification-To header (Disposition-Notification-Options only sets the signing/MIC preferences for it), or is sending a Receipt-Delivery-Option (async MDN) header while actually expecting a sync MDN
  • a misconfiguration of the MDN issuing mode at the receiver; e.g. receiver decides not to send back an MDN at all, or sends back an async MDN instead of a sync one, disregarding the header-driven MDN configuration requested by the sender

Note that if the receiver was not actually able to return the MDN to the sender due to some form of interruption, it would be marked as a message transmission failure - because the MDN is actually the response of the original AS2 transmission.
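
When checking for such a misconfiguration, it helps to derive the requested MDN mode directly from the headers of the captured request. The sketch below follows the AS2 specification (RFC 4130), where a Disposition-Notification-To header requests an MDN and a Receipt-Delivery-Option header makes it asynchronous; the function name and sample values are made up.

```python
def requested_mdn_mode(headers):
    """Return "none", "sync" or "async" based on the AS2 request headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if not lowered.get("disposition-notification-to"):
        return "none"   # no MDN was requested at all
    if lowered.get("receipt-delivery-option"):
        return "async"  # MDN is to be posted to the given URL later
    return "sync"       # MDN is expected in the same HTTP response


print(requested_mdn_mode({"Disposition-Notification-To": "edi@example.com"}))
```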

Asynchronous MDN mode

If sender requested an asynchronous MDN (by including a Receipt-Delivery-Option header in the AS2 transmission), and receiver indicated successful acceptance of the message (by returning an HTTP 200-range response), the transaction should be held as pending until receiver sends back the MDN indicating final message processing status. However, if the MDN does not reach the sender within an acceptable time frame (usually a few minutes, maybe an hour at most), it is possible that the async MDN delivery has somehow failed.

Symptoms and solutions are the same as those explained under connectivity issues and HTTP error responses, now being encountered by the receiver (“MDN sender”) against the MDN receiving endpoint of the sender (“MDN receiver”).
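
The pending state and its time limit can be tracked with a simple check like the one below. The one-hour limit mirrors the upper bound mentioned above; the state labels and function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def mdn_state(sent_at, mdn_received_at=None, max_wait=timedelta(hours=1), now=None):
    """Classify an async-MDN transaction as completed, pending or overdue."""
    if mdn_received_at is not None:
        return "completed"
    now = now or datetime.now(timezone.utc)
    return "pending" if now - sent_at <= max_wait else "mdn-overdue"


sent = datetime(2022, 8, 1, 12, 0, tzinfo=timezone.utc)
print(mdn_state(sent, now=sent + timedelta(minutes=5)))
```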

Negative/error MDN receipts

Even if the network transfer and HTTP response indicate success, the actual processing status of the AS2 message can only be determined by the MDN receipt that the receiver sends back to the sender.

Errors returned in the MDN fall into the same categories described under inbound failures.

Unmatched MIC

As a verification step, receiver calculates a message integrity check code (MIC) over the received content and embeds it in the MDN that is sent back to the sender. If this value does not match the MIC calculated by the sender (against the originally sent content), it indicates that the transferred content was somehow modified/corrupted.

Symptom:

Sender’s AS2 system reports an “unmatched MIC” error upon receipt of the MDN. Note that in this case, the MDN itself could be marked as successful; it is the sender that does the final MIC comparison and finds the mismatch.

Solution:

If this happens during initial connectivity testing, it could indicate a bad configuration; common scenarios include:

Algorithm used to compute the MIC on the two ends is different.

  • AS2 specification only provides sha1 and md5 as MIC algorithms, but in most cases AS2 vendors prefer stronger algorithms like SHA-256 or SHA-512.
  • Usually the sender requests a specific MIC algorithm by including a Disposition-Notification-Options header in the AS2 transmission; however the recipient may have ignored it and used a different/fixed algorithm to compute the returned MIC.
  • It is a common practice to use the message signature algorithm itself to compute the MIC as well.

Format of the MIC string is different.

The MIC string usually takes the form {MIC value}, {MIC algorithm name}; variations in white-space usage and in the format/case of the algorithm name can cause issues across implementations. For example, if sender expects sha-256 as the algorithm name (e.g. MIC foobar, sha-256) but receiver uses the name sha256 (foobar, sha256), the MIC may be marked as unmatched on sender’s side.
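
A tolerant comparison avoids this kind of false mismatch. The sketch below computes a MIC as a base64-encoded digest and compares two MIC strings after normalizing white-space and the algorithm name. The function names are hypothetical, and exactly which bytes go into the digest depends on whether the message is signed, which is not covered here.

```python
import base64
import hashlib


def compute_mic(content, algorithm="sha256"):
    """Base64-encoded digest of the given bytes."""
    return base64.b64encode(hashlib.new(algorithm, content).digest()).decode("ascii")


def parse_mic(mic_field):
    """Split "{value}, {algorithm}" and normalize the algorithm name."""
    value, _, algorithm = mic_field.rpartition(",")
    return value.strip(), algorithm.strip().lower().replace("-", "")


def mics_match(sent, received):
    """Compare two MIC strings, ignoring spacing and algorithm name style."""
    return parse_mic(sent) == parse_mic(received)


print(mics_match("foobar, sha-256", "foobar,SHA256"))
```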

Payload is altered/corrupted.

If there have been successful (MIC-matched) exchanges previously, an isolated occurrence (or series of occurrences) could actually indicate a corruption in the payload, or a man-in-the-middle (MITM) exploitation. Review all security parameters of the AS2 connection; esp. the algorithms and certificates used for encryption and signature. If there is a suspicion of the security keys being compromised, they would need to be rotated and re-configured on the remote end.

MDN signature verification failure

Receiver can optionally sign the returned MDN using their private key (for the sender to verify its authenticity upon receipt, similar to how the receiver verifies the received AS2 message signature). If this verification fails, sender has no guarantee that the MDN is authentic (that it originated from the receiver) or intact (that it was not modified during transfer).

Symptom:

Sender’s AS2 system reports a signature verification failure for the received MDN. As in previous cases, the MDN itself could be marked as successful.

Solution:

Ensure that the receiver’s signature/verification certificate is configured correctly on sender’s side.