
48. SMTP processing

Exim supports SMTP over TCP/IP, and also so-called `batched SMTP'. The latter is the name for a process in which batches of messages are stored in or read from files, in a format in which SMTP commands are used to contain the envelope information. Such batches are delivered to or received from other systems using some transport mechanism other than Exim. For each of these kinds of SMTP processing there are two aspects: outgoing and incoming. There is also support for a third kind of SMTP when a message is passed from a local process to Exim by running the SMTP protocol over the standard input and output. This is called `local SMTP', and is an input process only.

48.1 Outgoing SMTP over TCP/IP

Outgoing SMTP over TCP/IP is implemented by the smtp transport. If, in response to its EHLO command, it is told that the SIZE parameter is supported, it adds SIZE=<n> to each subsequent MAIL command. The value of <n> is the message size plus the value of the size_addition option (default 1024) to allow for additions to the message such as per-transport header lines, or changes made in a transport filter. If size_addition is set negative, the use of SIZE is suppressed.
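
The SIZE arithmetic described above can be sketched in Python as follows. This is only an illustration of the rule, not Exim's own code; the function name build_mail_command and its parameter names are assumptions made for the example.

```python
def build_mail_command(sender, message_size, server_offers_size, size_addition=1024):
    """Return the MAIL command line a client might send (illustrative only)."""
    command = f"MAIL FROM:<{sender}>"
    # SIZE is added only if the server advertised it in its EHLO response
    # and size_addition has not been set negative.
    if server_offers_size and size_addition >= 0:
        command += f" SIZE={message_size + size_addition}"
    return command
```

With the default size_addition, a 5000-byte message would be announced as SIZE=6024.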

If the remote server advertises support for PIPELINING, Exim uses the pipelining extension to SMTP (RFC 2197) to reduce the number of TCP/IP packets required for the transaction.

If the remote server advertises support for the STARTTLS command, and Exim was built to support TLS encryption, it tries to start a TLS session unless the server matches hosts_avoid_tls. See chapter 38 for more details.

If the remote server advertises support for the AUTH command, and Exim was built to support SMTP authentication, it scans the authenticators configuration for any suitable client settings, as described in chapter 35.

Responses from the remote host are supposed to be terminated by CR followed by LF. However, there are known to be hosts that do not send CR characters, so in order to be able to interwork with such hosts, Exim treats LF on its own as a line terminator.
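
A tolerant line splitter of the kind described above might look like the following Python sketch. The function name split_smtp_lines and its return convention are assumptions for illustration; Exim's actual input handling is not shown in this document.

```python
def split_smtp_lines(data: bytes):
    """Split received bytes into complete lines, accepting CRLF or bare LF.

    Returns (lines, remainder); remainder holds any incomplete trailing line.
    """
    lines = []
    while True:
        end = data.find(b"\n")
        if end < 0:
            return lines, data
        line = data[:end]
        # Drop the CR that precedes LF when the peer sent a proper CRLF.
        if line.endswith(b"\r"):
            line = line[:-1]
        lines.append(line)
        data = data[end + 1:]
```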

If a message contains a number of different addresses, all those with the same characteristics (for example, the same envelope sender) that resolve to the same set of hosts, in the same order, are sent in a single SMTP transaction, even if they are for different domains. The exception is when there are more addresses than the max_rcpt option of the smtp transport allows, in which case they are split into groups containing no more than max_rcpt addresses each. If remote_max_parallel is greater than one, such groups may be sent in parallel sessions. The order of hosts with identical MX values is not significant when checking whether addresses can be batched in this way.
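
The grouping rule can be illustrated with a short Python sketch. It is a simplification under stated assumptions: only the envelope sender stands in for "the same characteristics", host lists are compared exactly (so the relaxation for hosts with identical MX values is not modelled), and the names group_for_transactions and max_per_transaction are invented for the example, the latter standing for the transport's recipient limit.

```python
from collections import defaultdict

def group_for_transactions(addresses, max_per_transaction):
    """addresses: iterable of (recipient, envelope_sender, host_list) tuples,
    where host_list gives the host names in routing order.
    Returns a list of recipient lists, one per SMTP transaction."""
    buckets = defaultdict(list)
    for recipient, sender, hosts in addresses:
        # Same sender and same hosts in the same order share a bucket,
        # whatever the recipient's domain is.
        buckets[(sender, tuple(hosts))].append(recipient)
    transactions = []
    for recipients in buckets.values():
        # Split an over-large bucket into groups no bigger than the limit.
        for start in range(0, len(recipients), max_per_transaction):
            transactions.append(recipients[start:start + max_per_transaction])
    return transactions
```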

When the smtp transport suffers a temporary failure that is not message-related, Exim updates its transport-specific database, which contains records indexed by host name that remember which messages are waiting for each particular host. It also updates the retry database with new retry times. Exim's retry hints are based on host name plus IP address, so if one address of a multi-homed host is broken, it will soon be skipped most of the time. See the next section for more detail about error handling.

When a message is successfully delivered over a TCP/IP SMTP connection, Exim looks in the hints database for the transport to see if there are any queued messages waiting for the host to which it is connected. If it finds one, it creates a new Exim process using the -MC option (which can only be used by a process running as root or the Exim user) and passes the TCP/IP socket to it. The new process does only those deliveries that are routed to the connected host, and may in turn pass the socket on to a third process, and so on.

If this is happening in a queue run, the queue-runner process must not proceed to the next message in the queue until the whole sequence of deliveries is complete. However, making each process wait for its successor is not a good idea, as there may be many of them. To avoid having to do this, a queue-runner process creates a pipe which is passed to all the created processes, none of which actually write to it. The queue-runner tries to read from the pipe. This causes it to block until all the created processes have finished.
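
The pipe technique can be demonstrated with the following Python sketch for POSIX systems. It is not Exim's code: the function name wait_for_all is an assumption, and for simplicity the parent starts every child itself, whereas in Exim each delivery process passes the pipe on to its successor.

```python
import os
import subprocess
import sys

def wait_for_all(child_commands):
    """Start each command with the pipe's write end inherited, then block
    until every process holding that write end has exited."""
    read_fd, write_fd = os.pipe()
    children = [
        subprocess.Popen(cmd, pass_fds=(write_fd,)) for cmd in child_commands
    ]
    # The parent must close its own copy, or the read below never sees EOF.
    os.close(write_fd)
    # Nobody writes to the pipe. The read returns b"" (end of file) only
    # when every inherited copy of the write end has been closed, which
    # here happens when the last child exits.
    data = os.read(read_fd, 1)
    os.close(read_fd)
    for child in children:
        child.wait()  # reap the finished processes
    return data
```

A call such as wait_for_all([[sys.executable, "-c", "pass"]]) returns only once the child has exited, without the parent having to know how many processes ended up sharing the pipe.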

The batch_max option of the smtp transport can be used to limit the number of messages sent down a single TCP/IP connection. The second and subsequent messages delivered down an existing connection are identified in the main log by the addition of an asterisk after the closing square bracket of the IP address.

48.2 Errors in outgoing SMTP

Three different kinds of error are recognized for outgoing SMTP: host errors, message errors, and recipient errors.

A host error is not associated with a particular message or with a particular recipient of a message. The host errors are:
- Connection refused or timed out,
- Any error response code on connection,
- Any error response code to EHLO or HELO,
- Loss of connection at any time, except after `.',
- I/O errors at any time,
- Timeouts during the session, other than in response to MAIL, RCPT or the `.' at the end of the data.

A permanent error response on connection, or in response to EHLO, causes all addresses routed to the host to be failed. Any other host error causes all addresses to be deferred, and retry data to be created for the host. It is not tried again, for any message, until its retry time arrives. If the current set of addresses are not all delivered in this run (to some alternative host), the message is added to the list of those waiting for this host, so if it is still undelivered when a subsequent successful delivery is made to the host, it will be sent down the same SMTP connection.

A message error is associated with a particular message when sent to a particular host, but not with a particular recipient of the message. The message errors are:
- Any error response code to MAIL, DATA, or the `.' that terminates the data,
- Timeout after MAIL,
- Timeout or loss of connection after the `.' that terminates the data. A timeout after the DATA command itself is treated as a host error, as is loss of connection at any other time.

A permanent error response (5xx) causes all addresses to be failed, and a delivery error report to be returned to the sender. A temporary error response (4xx) or one of the timeouts causes all addresses to be deferred. Retry data is not created for the host, but instead, a retry record for the combination of host plus message id is created. The message is not added to the list of those waiting for this host. This ensures that the failing message will not be sent to this host again until the retry time arrives. However, other messages that are routed to the host are not affected, so if it is some property of the message that is causing the error, it will not stop the delivery of other mail. If the remote host specified support for the SIZE parameter in its response to EHLO, Exim adds SIZE=nnn to the MAIL command, so an over-large message will cause a message error because it will arrive as a response to MAIL.

A recipient error is associated with a particular recipient of a message. The recipient errors are:
- Any error response to RCPT,
- Timeout after RCPT.

A permanent error response (5xx) causes the recipient address to be failed, and a delivery error report to be returned to the sender. A temporary error response (4xx) or a timeout causes the failing address to be deferred, and routing retry data to be created for it. This is used to delay processing of the address in subsequent queue runs, until its routing retry time arrives. This applies to all messages, but because it operates only in queue runs, one attempt will be made to deliver a new message to the failing address before the delay starts to operate. This ensures that, if the failure is really related to the message rather than the recipient (`message too big for this recipient' is a possible example), other messages have a chance of getting delivered. However, if a delivery to the address does succeed, the retry information gets cleared, so all stuck messages get tried again, and the retry clock is reset.

The message is not added to the list of those waiting for this host. Use of the host for other messages is unaffected, and except in the case of a timeout, other recipients are processed independently, and may be successfully delivered in the current SMTP session. After a timeout it is of course impossible to proceed with the session, so all addresses get deferred. However, those other than the one that failed do not suffer any subsequent retry delays. Therefore, if one recipient is causing trouble, the others have a chance of getting through when a subsequent delivery attempt occurs before the failing recipient's retry time.

In all cases, if there are other hosts (or IP addresses) available for the current set of addresses (for example, from multiple MX records), they are tried in this run for any undelivered addresses, subject of course to their own retry data. In other words, recipient error retry data does not take effect until the next delivery attempt.

Some hosts have been observed to give temporary error responses to every MAIL command at certain times (`insufficient space' has been seen). It would be nice if such circumstances could be recognized, and defer data for the host itself created, but this is not possible within the current Exim design. What actually happens is that retry data for every (host, message) combination is created.

The reason that timeouts after MAIL and RCPT are treated specially is that these can sometimes arise as a result of the remote host's verification procedures. Exim makes this assumption, and treats them as if a temporary error response had been received. A timeout after `.' is treated specially because it is known that some broken implementations fail to recognize the end of the message if the last character of the last line is a binary zero. Thus it is helpful to treat this case as a message error.

Timeouts at other times are treated as host errors, assuming a problem with the host, or the connection to it. If a timeout after MAIL, RCPT, or `.' is really a connection problem, the assumption is that at the next try the timeout is likely to occur at some other point in the dialogue, causing it then to be treated as a host error.

There is experimental evidence that some MTAs drop the connection after the terminating `.' if they don't like the contents of the message for some reason, in contravention of the RFC, which indicates that a 5xx response should be given. That is why Exim treats this case as a message rather than a host error, in order not to delay other messages to the same host.
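
The classification rules of this section can be summarized in a small Python sketch. The function name classify_error and the stage and kind labels are assumptions introduced for the example; they are not names used by Exim.

```python
def classify_error(stage, kind):
    """Classify an outgoing SMTP failure as 'host', 'message' or 'recipient'.

    stage: 'connect', 'helo', 'mail', 'rcpt', 'data' or 'dot' (the final `.')
    kind:  'response' (an error reply), 'timeout', 'lost' (connection lost)
           or 'io' (an I/O error)
    """
    if kind == "io" or stage in ("connect", "helo"):
        return "host"
    if kind == "lost":
        # Only a connection dropped after the final dot is blamed on the message.
        return "message" if stage == "dot" else "host"
    if kind == "timeout":
        if stage in ("mail", "dot"):
            return "message"
        # A timeout after DATA itself falls through to 'host'.
        return "recipient" if stage == "rcpt" else "host"
    # What remains is an error response code.
    if stage == "rcpt":
        return "recipient"
    return "message"  # response to MAIL, DATA, or the final dot
```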

48.3 Variable Envelope Return Paths (VERP)

Variable Envelope Return Paths -- see ftp://koobera.math.uic.edu/www/proto/verp.txt -- can be supported in Exim by using the return_path generic transport option to rewrite the return path at transport time. For example, the following could be used on an smtp transport:

return_path = \
  ${if match {$return_path}{^(.+?)-request@your.domain\$}\
  {$1-request=$local_part%$domain@your.domain}fail}

This has the effect of rewriting the return path (envelope sender) on all outgoing SMTP messages, if the local part of the original return path ends in `-request', and the domain is your.domain. The rewriting inserts the local part and domain of the recipient into the return path. If, for example, a message with return path somelist-request@your.domain is sent to subscriber@other.domain, the return path is rewritten as

somelist-request=subscriber%other.domain@your.domain

For this to work, you must arrange for outgoing messages that have `-request' in their return paths to have just a single recipient. This can be done by setting

max_rcpt = 1

in the smtp transport. Otherwise a single copy of a message might be addressed to several different recipients in the same domain, in which case $local_part is not available (because it is not unique). Of course, if you do start sending out messages with this kind of return path, you must also configure Exim to accept the bounce messages that come back to those paths. Typically this would be done by setting a suffix option in a suitable director.

The overhead incurred in using VERP depends very much on the size of the message, the number of recipient addresses that resolve to the same remote host, and the speed of the connection over which the message is being sent. If a lot of addresses resolve to the same host and the connection is slow, sending a separate copy of the message for each address may take substantially longer than sending a single copy with many recipients (for which VERP cannot be used).
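
As an illustration of the return_path rewriting described in this section, the following Python sketch applies the same pattern outside Exim. The function name verp_return_path and the list_domain parameter are assumptions for the example; returning None stands for the expansion's fail branch, in which no rewritten value is produced.

```python
import re

def verp_return_path(return_path, local_part, domain, list_domain="your.domain"):
    """Rewrite a '-request' return path so that it carries the recipient."""
    pattern = r"^(.+?)-request@" + re.escape(list_domain) + r"$"
    match = re.match(pattern, return_path)
    if match is None:
        return None  # not a '-request' address at list_domain
    return f"{match.group(1)}-request={local_part}%{domain}@{list_domain}"
```

For the return path somelist-request@your.domain and the recipient subscriber@other.domain, the function returns somelist-request=subscriber%other.domain@your.domain, matching the example above.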

48.4 Incoming SMTP messages over TCP/IP

Incoming SMTP messages can be accepted in one of two ways: by running a listening daemon, or by using inetd. In the latter case, the entry in /etc/inetd.conf should be like this:

smtp  stream  tcp  nowait  exim  /opt/exim/bin/exim  in.exim  -bs

Exim distinguishes between this case and the case of a user agent using the -bs option by checking whether the standard input is a socket using the getpeername() function.

By default, Exim does not make a log entry when a remote host connects or disconnects (either via the daemon or inetd), unless the disconnection is unexpected. It can be made to write such log entries by setting the log_smtp_connections option.

Commands from the remote host are supposed to be terminated by CR followed by LF. However, there are known to be hosts that do not send CR characters, so in order to be able to interwork with such hosts, Exim treats LF on its own as a line terminator.

One area that sometimes gives rise to problems concerns the EHLO or HELO commands. Some clients send syntactically invalid versions of these commands, which Exim rejects by default. (This is nothing to do with verifying the data that is sent, so helo_verify is not relevant.) You can tell Exim not to apply a syntax check by setting helo_accept_junk_hosts to match the broken hosts that send invalid commands.

The amount of disc space available is checked whenever SIZE is received on a MAIL command, independently of whether message_size_limit or check_spool_space is configured, unless smtp_check_spool_space is set false. A temporary error is given if there isn't enough. If check_spool_space is set, the check is for that amount of space plus the value given with SIZE, that is, it checks that the addition of the incoming message will not reduce the space below the threshold.
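
The space check amounts to a single comparison, sketched here in Python. The function name enough_spool_space is an assumption, and all three quantities are assumed to be expressed in the same unit.

```python
def enough_spool_space(free_space, declared_size, check_spool_space=0):
    """True if accepting a message of declared_size (the SIZE value) still
    leaves at least check_spool_space free; otherwise the MAIL command
    would be given a temporary error."""
    return free_space >= check_spool_space + declared_size
```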

When a message is successfully received, Exim includes the local message id in its response to the final `.' that terminates the data. If the remote host logs this text it can help with tracing what has happened to a message.

The Exim daemon can limit the number of simultaneous incoming connections it is prepared to handle (see the smtp_accept_max option). It can also limit the number of simultaneous incoming connections from a single remote host (see the smtp_accept_max_per_host option). Additional connection attempts are rejected using the SMTP temporary error code 421.

On some operating systems the SIGCHLD signal that is used to detect when a subprocess has finished can get lost at busy times. However, the daemon looks for completed subprocesses every time it wakes up, so provided there are other things happening (new incoming calls, starts of queue runs), the completion of processes created to handle incoming calls should get noticed eventually. If, however, Exim appears not to be accepting as many incoming connections as expected, sending the daemon a SIGCHLD signal will wake it up and cause it to check for any completed subprocesses.

When running as a daemon, Exim can reserve some SMTP slots for specific hosts, and can also be set up to reject SMTP calls from non-reserved hosts at times of high system load -- for details see the smtp_accept_reserve, smtp_load_reserve, and smtp_reserve_hosts options. The load check applies in both the daemon and inetd cases.

Exim normally starts a delivery process for each message received, though this can be varied by means of the -odq command line option and the queue_only, queue_only_file, and queue_only_load options. The number of simultaneously running delivery processes started in this way from SMTP input can be limited by the smtp_accept_queue and smtp_accept_queue_per_connection options. When either limit is reached, subsequently received messages are just put on the input queue.

The controls that involve counts of incoming SMTP calls (smtp_accept_max, smtp_accept_queue, smtp_accept_reserve) are not available when Exim is started up from the inetd daemon, since each connection is handled by an entirely independent Exim process. Control by load average is, however, available with inetd.

Exim can be configured to verify addresses in incoming SMTP commands as they are received. See chapter 45 for details. It can also be configured to rewrite addresses at this time -- before any syntax checking is done. See section 34.7.

48.5 The VRFY, EXPN, and DEBUG commands

The SMTP command VRFY is accepted only when the configuration option smtp_verify is set, and if so, it runs exactly the same code as when Exim is called with the -bv option. The SMTP command EXPN is permitted only if the calling host matches smtp_expn_hosts (add `localhost' if you want calls to 127.0.0.1 to be able to use it). A single-level expansion of the address is done. EXPN is treated as an `address test' (similar to the -bt option) rather than a verification (the -bv option). If an unqualified local part is given as the argument to EXPN, it is qualified with qualify_domain. Rejections of VRFY and EXPN commands are logged on the main and reject logs, and VRFY verification failures are logged on the main log for consistency with RCPT failures.

The SMTP command DEBUG is not supported at all. Occurrences of this command are rejected, and the incident is logged.

48.6 The ETRN command

RFC 1985 describes an SMTP command called ETRN that is designed to overcome the security problems of the TURN command (which has fallen into disuse). Exim recognizes ETRN if the calling host matches smtp_etrn_hosts. Attempts to use ETRN from other hosts are logged on the main and reject logs; when ETRN is accepted, it is logged on the main log.

The ETRN command is concerned with `releasing' messages that are awaiting delivery to certain hosts. As Exim does not organize its message queue by host, the only form of ETRN that is supported by default is the one where the text starts with the `#' prefix, in which case the remainder of the text is specific to the SMTP server. A valid ETRN command causes a run of Exim with the -R option to happen, with the remainder of the ETRN text as its argument. For example,

ETRN #brigadoon

runs the command

exim -R brigadoon

which causes a delivery attempt on all messages with undelivered addresses containing the text `brigadoon'. Because a separate delivery process is run to do the delivery, there is no security risk with ETRN.
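
The mapping from an ETRN argument to the resulting command can be sketched in Python. The function name etrn_to_argv and the exim_path parameter are assumptions for the example, and only the default `#' form is handled.

```python
def etrn_to_argv(etrn_argument, exim_path="exim"):
    """Map the argument of an ETRN command to the argument list of the
    queue run it triggers, or return None for an unsupported form."""
    if not etrn_argument.startswith("#"):
        return None
    # An argument list (rather than a shell string) keeps the remainder of
    # the text as one argument, whatever characters it contains.
    return [exim_path, "-R", etrn_argument[1:]]
```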

When smtp_etrn_serialize is set (the default), it prevents the simultaneous execution of more than one queue run for the same argument string as a result of an ETRN command. This prevents a mis-behaving client from starting more than one queue-runner at once. Exim implements the serialization by means of a hints database in which a record is written whenever a process is started by ETRN, and deleted when a -R queue run completes.

Obviously there is scope for hints records to get left lying around if there is a system or program crash. To guard against this, Exim ignores any records that are more than six hours old, but you should normally arrange to delete any files in the spool/db directory whose names begin with `serialize-' after a reboot.
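
The serialization logic, including the six-hour expiry of stale records, can be modelled with a dictionary in place of the hints database. This is a sketch only: the function names are assumptions, and the real hints database is a file shared between processes rather than an in-memory dictionary.

```python
import time

SIX_HOURS = 6 * 60 * 60  # seconds

def may_start_etrn_run(hints, argument, now=None):
    """Record and allow a queue run for argument unless a recent record exists."""
    now = time.time() if now is None else now
    started = hints.get(argument)
    if started is not None and now - started <= SIX_HOURS:
        return False  # a run for this argument is still considered active
    # No record, or a stale one left by a crash: (re)write it and proceed.
    hints[argument] = now
    return True

def etrn_run_finished(hints, argument):
    """Delete the record when the -R queue run completes."""
    hints.pop(argument, None)
```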

For more control over what ETRN does, the smtp_etrn_command option can be used. This specifies a command that is run whenever ETRN is received, whatever the form of its argument. For example:

smtp_etrn_command = /etc/etrn_command $domain $sender_host_address

The string is split up into arguments which are independently expanded. The expansion variable $domain is set to the argument of the ETRN command, and no syntax checking is done on the contents of this argument. A new freestanding process is created to run the command. Exim does not wait for it to complete, so its status code is not checked. As Exim is normally running under its own uid and gid when receiving incoming SMTP, it is not possible for it to change them before running the command.
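
The order of operations matters here: because the string is split before expansion, an ETRN argument containing spaces or shell metacharacters still ends up as a single argument. The following Python sketch illustrates this. The function name build_etrn_command is an assumption, and only plain $name references are substituted, which is far less than Exim's real string expansion does.

```python
import re

def build_etrn_command(command_setting, variables):
    """Split the setting into arguments first, then expand each one on its own."""
    def expand(argument):
        # Replace each $name with its value; unknown names expand to nothing.
        return re.sub(r"\$(\w+)", lambda m: variables.get(m.group(1), ""), argument)
    return [expand(argument) for argument in command_setting.split()]
```

With the setting shown above and an ETRN argument of `some text; rm -rf', the result still has exactly three elements, the second being the whole unchecked argument.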

If you use smtp_etrn_command to do something other than run Exim with the -R option, you must disable smtp_etrn_serialize, because otherwise hints never get deleted, and further ETRN commands are ignored until the hints time out.

48.7 Incoming local SMTP

Some user agents use SMTP to pass messages to their local MTA using the standard input and output, as opposed to passing the envelope on the command line and writing the message to the standard input. This is supported by the -bs option. This form of SMTP is handled in the same way as incoming messages over TCP/IP, except that all host-specific processing is bypassed, and any envelope sender given in a MAIL command is ignored unless the caller is trusted.

48.8 Outgoing batched SMTP

Both the appendfile and pipe transports can be used for handling batched SMTP. Each has an option called bsmtp which, if set to anything other than `none', causes the message to be output in SMTP format. The message is written to the file or pipe preceded by the SMTP commands MAIL and RCPT, and followed by a line containing a single dot. The SMTP command HELO is not normally used, but if the transport's bsmtp_helo option is set true, a HELO command line precedes each message. No SMTP responses are possible for this form of delivery. All it is doing is using SMTP commands as a way of transmitting the envelope along with the message.

Lines in the message that start with a dot have an extra dot added. If the prefix option is set, its contents are included after the SMTP commands, and the contents of suffix appear at the end of the message, before the terminating dot; normally these options are specified as empty, to override the defaults.
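
The layout of one message in batched SMTP format can be sketched in Python as follows. The function name bsmtp_message is an assumption, the prefix and suffix options are not modelled, and the DATA line is also an assumption: the text above names only MAIL and RCPT, but a reader of the batch needs DATA to find where the message starts.

```python
def bsmtp_message(sender, recipients, message_lines, helo_name=None):
    """Return the lines written for one message in batched SMTP format."""
    out = []
    if helo_name is not None:  # corresponds to bsmtp_helo being set
        out.append(f"HELO {helo_name}")
    out.append(f"MAIL FROM:<{sender}>")
    out.extend(f"RCPT TO:<{rcpt}>" for rcpt in recipients)
    out.append("DATA")  # assumed; see the note above
    for line in message_lines:
        # A line that starts with a dot has an extra dot added.
        out.append("." + line if line.startswith(".") else line)
    out.append(".")  # terminating line containing a single dot
    return out
```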

The value of the bsmtp option determines how multiple addresses in a single message may be batched, if other conditions permit. If the value of bsmtp is `one', there is no batching, and a copy of the message is output for each address. If the value is `domain' then a single copy (with multiple RCPT commands) is output for all addresses that have the same domain. If the value is `all' then only a single copy of the message is written. The batching is further constrained by other parameters:

- If any of the transport's expandable strings contains a reference to $local_part, no batching takes place.
- If any of the transport's expandable strings contains a reference to $domain, only domain batching is done.
- Addresses are not batched if they have different error addresses, associated hosts, header additions or removals and so on.
- The uid and gid for delivery must be explicitly set. This is normally done in the transport, but if they are specified by a router or director, batching occurs only for addresses that have the same uid/gid set up.
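
The effect of the three bsmtp values on grouping can be sketched in Python. The function name bsmtp_batches is an assumption, and the additional constraints listed above (references to $local_part or $domain, differing error addresses, uid/gid and so on) are deliberately not modelled.

```python
from collections import defaultdict

def bsmtp_batches(addresses, bsmtp):
    """Group recipient addresses into batches for bsmtp = one, domain or all.
    Each batch corresponds to one copy of the message."""
    if bsmtp == "one":
        return [[address] for address in addresses]
    if bsmtp == "all":
        return [list(addresses)] if addresses else []
    if bsmtp == "domain":
        by_domain = defaultdict(list)
        for address in addresses:
            # Domains compare without regard to case.
            by_domain[address.rsplit("@", 1)[1].lower()].append(address)
        return list(by_domain.values())
    raise ValueError(f"unsupported bsmtp value: {bsmtp!r}")
```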

When one or more messages are routed to a BSMTP transport by a router that sets up a host list, the name of the first host on the list is available to the transport in the variable $host. Here is an example of such a transport and router for batched SMTP:

# transport
smtp_appendfile:
  driver = appendfile
  directory = /var/bsmtp/$host
  bsmtp = all
  prefix =
  suffix =
  user = exim

# router
route_append:
  driver = domainlist
  transport = smtp_appendfile
  route_list = some.domain  batch.host

This causes messages addressed to some.domain to be written in batched SMTP format to /var/bsmtp/batch.host, with only a single copy of each message. Note that prefix and suffix must be explicitly changed from their defaults.

48.9 Incoming batched SMTP

The -bS command line option causes Exim to accept one or more messages by reading SMTP on the standard input, but to generate no responses. If the caller is trusted, the senders in the MAIL commands are believed; otherwise the sender is always the caller of Exim. Unqualified senders and receivers are not rejected (there seems little point) but instead just get qualified. If sender_verify is set, sender verification takes place only if sender_verify_batch is set (it defaults unset). Receiver verification and administrative rejection are not done, even if configured. HELO and EHLO act as RSET; VRFY, EXPN, ETRN, HELP, and DEBUG act as NOOP; QUIT quits.

If any error is detected while reading a message, including a missing `.' at the end, Exim gives up immediately. It writes details of the error to the standard output in a stylized way that the calling program should be able to make some use of automatically, for example:

554 Unexpected end of file
Transaction started in line 10
Error detected in line 14

It writes a more verbose version, for human consumption, to the standard error file, for example:

An error was detected while processing a file of BSMTP input.
The error message was:

  501 '>' missing at end of address

The SMTP transaction started in line 10.
The error was detected in line 12.
The SMTP command at fault was:

   rcpt to:<malformed@in.com.plete

1 previous message was successfully processed.
The rest of the batch was abandoned.

The return code from Exim is zero only if there were no errors. It is 1 if some messages were accepted before an error was detected, and 2 if no messages were accepted.
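
A calling program might combine the return code with the stylized output along the following lines. This Python sketch is an assumption-laden illustration: the function name interpret_batch_result and the keys of the returned dictionary are invented, and the output format is inferred solely from the single example shown above.

```python
import re

def interpret_batch_result(return_code, stdout_text=""):
    """Summarize a batched SMTP input run from its return code and standard output."""
    result = {
        "ok": return_code == 0,
        # Return code 1 means some messages were accepted before the error.
        "earlier_messages_accepted": return_code == 1,
    }
    if return_code != 0:
        lines = stdout_text.strip().splitlines()
        if lines:
            result["error"] = lines[0]
        for line in lines[1:]:
            match = re.fullmatch(
                r"(Transaction started|Error detected) in line (\d+)", line.strip()
            )
            if match:
                key = "started_line" if match.group(1).startswith("Transaction") else "error_line"
                result[key] = int(match.group(2))
    return result
```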
