The Exim Mail Transfer Agent


Exim Overview

Date: 9 November 2000

Exim is a mail transfer agent (MTA) developed at the University of Cambridge for use on Unix systems connected to the Internet. It is freely available under the terms of the GNU General Public Licence. In overall style it is similar to Smail 3, but its facilities are more extensive. It contains facilities for verifying incoming sender and recipient addresses, for refusing mail from specified hosts, networks, or senders, and for controlling mail relaying.

Exim is in production use at quite a few sites, some of which move hundreds of thousands of messages per day. This document contains an overview description of the way Exim works, with a certain amount of omission and simplification to keep it fairly short. Please address any enquiries about Exim to Philip Hazel:

Email:  <ph10@cus.cam.ac.uk>
Phone:  +44 1223 334714
Fax:    +44 1223 334679

University of Cambridge Computing Service
New Museums Site
Pembroke Street
Cambridge CB2 3QG
United Kingdom

This document is copyright (c) University of Cambridge 2000, but copying permission is granted to all.

-------------------------------------------------------------------------

"If I have seen further it is by standing on the shoulders of giants."
(Isaac Newton)

Background

Exim owes a great deal to Smail 3 and its author, Ron Karr. Without the experience of running and working on the Smail 3 code, I could never have contemplated starting to write a new mailer. The general style of operation and configuration are taken from Smail 3, though the actual code of Exim is entirely new.

My intention was to write a mailer that had more functionality than Smail 3, but which retained the simple lightweight approach, as this seemed to me to be all that was needed for systems directly connected to the Internet, where most messages are delivered almost immediately. On the central mail machines at Cambridge University, `most' means around 98%.

Since the first versions of Exim went into service at the end of 1995, it has continued to develop, and now has far more facilities than my original conception.

Availability

The current distribution of Exim is available via the Exim page at http://www.exim.org. The distribution contains an ASCII copy of the documentation; other formats (HTML, PDF, PostScript, Texinfo) can be downloaded from the web site. The HTML version is also available online.

The following operating systems are currently supported: AIX, BSDI, DGUX, FreeBSD, GNU/Hurd, GNU/Linux, HI-OSF (Hitachi), HP-UX, IRIX, MIPS RISCOS, NetBSD, OpenBSD, QNX, SCO, SCO SVR4.2 (aka UNIX-SV), Solaris (aka SunOS5), SunOS4, Tru64-Unix (aka DEC OSF1, aka Digital UNIX), Ultrix, and Unixware.

Limitations

For the benefit of those reading this overview to see whether Exim is of interest to them, its limitations are listed first.

Main features

Exim follows the same general approach of decentralized control that Smail 3 does. There is no central process doing overall management of mail delivery. However, the independent delivery processes share data in the form of `hints', which makes delivery more efficient in some cases. The hints are kept in a number of DBM files. If any of these files are lost, the only effect is to change the pattern of delivery attempts and retries.

Here is a summary of Exim's main features. More details are given in the sections which follow.

Performance

Although I did not specifically set out to write a high-performance MTA, Exim does seem to be fairly efficient. One site I heard of was a mailing list exploder that sometimes handles over 100,000 deliveries a day on a big Linux box, the record being 177,000 deliveries (791MB in total). Up to 13,000 deliveries in an hour have been reported. On larger systems, up to 800,000 messages a day have been reported.

Interface

Like many MTAs, Exim has adopted the Sendmail interface so that it can be a straight replacement for `/usr/sbin/sendmail' or `/usr/lib/sendmail'. All the relevant Sendmail options are implemented. There are also some additional options that are compatible with Smail 3, and some further options that are new to Exim.

The runtime configuration interface is a single file which is divided into a number of sections. The entries in this file consist of keywords and values, in the style of Smail 3 configuration files.

Control of messages on the queue can be done via certain privileged command line options. There is also an optional monitor program called eximon, which displays current information in an X window and contains interfaces to the command line options.

Method of operation

When Exim receives a message, it writes two files in its spool directory. The first contains the envelope information, the current status of the message, and the headers, while the second contains the body of the message. The status of the message includes a complete list of recipients and a list of those that have already received the message. The header file gets updated during the course of delivery if necessary.

A message remains in the spool directory until it is completely delivered to its recipients or to an error address, or until it is deleted by an administrator or by the user who originally created it. In cases when delivery cannot proceed -- for example, when a message can neither be delivered to its recipients nor returned to its sender -- the message is marked `frozen' on the spool, and no more deliveries are attempted. The administrator can thaw such messages when the problem has been corrected, and can also freeze individual messages by hand if necessary.

As delivery proceeds, Exim writes timestamped information about each address to a per-message log file; this includes any delivery error messages. This log is solely for the benefit of the administrator. All the information Exim itself needs for delivery is kept in the header spool file. The message log file is deleted with the spool files. If a message is delayed for more than a configured time, a warning message is sent to the sender.

The main delivery processing elements of Exim are called directors, routers, and transports. Code for a number of these is provided, and compile-time options specify which ones are actually included in the binary. Directors handle addresses that include one of the local domains, routers handle remote addresses, and transports do actual deliveries.

When a message is to be delivered, the sequence of events is roughly as follows:

Mail filtering

Exim can be configured to allow users to set up filter files as an alternative to the traditional `.forward' files. A filter file can test various characteristics of a message, including the contents of the headers and the start of the body, and direct delivery to specified addresses, files, or pipes according to what it finds. The system-wide filter file uses the same control syntax.

Directors

The existing directors are listed below. I use the RFC 822 term local part to mean that portion of an address that comes before the @ character.

The configuration file determines which directors are actually used, and in which order. It is possible to use the same director more than once, with different options.

The addresses a director handles can be constrained in the following ways:

In addition, certain files can be required to exist or not exist for a given director to be run.

Routers

The existing routers are:

The configuration file determines which routers are actually used, and in which order. It is possible to use the same router more than once, with different options.

Like directors, routers can be constrained to handle only certain domains or certain local parts (though I haven't seen a good use for that yet), or messages from certain senders. If a router times out, either the delivery can be deferred, or the address can be passed on to the next router.

A flag controls whether a router is called when an address is being verified on an incoming message, as opposed to being routed for delivery.

Transports

Local and remote transports are handled differently. A local transport is always run in a separate process with an appropriate real uid and gid. Their values can be specified in the transport's configuration, or passed over from the director that handled the address. Remote transports run under Exim's own uid. The existing transports are:

Exim logs

Exim writes to three different log files:

A utility script for renaming and compressing the main and reject logs each night is provided. There are also scripts for extracting statistics from log files and for searching log files for message entries that match a given pattern. For example, one can pull out all entries relating to messages for a given local part.

It is possible to configure Exim to write its logging data to syslog instead of, or as well as, to local files.

Exim databases

Exim maintains a number of databases in DBM files to help it perform efficient mail delivery. In effect, the files contain hints, and if they are lost it is not a disaster -- Exim's performance just suffers a bit. The databases are:

There is a utility program that lists the contents of one of these databases, and another that allows manual modifications to be applied in some cases. Database records are timestamped, and there is a utility that removes records that are older than a given period, and also cleans up wait-smtp records containing references to messages that no longer exist. Running this daily or weekly should be sufficient to keep the files reasonably tidy.

Exim can use any one of the following DBM libraries: ndbm, gdbm, DB 1.85, DB 2.x, DB 3.x, or tdb.

SMTP batching

When an SMTP delivery attempt fails, causing the message to be deferred till later, Exim updates a DBM database that contains records keyed by host name plus IP address. Each record holds a list of messages that are waiting for that host and address.

When an SMTP delivery succeeds, Exim consults the database to see if there are any other messages waiting for the same host and address. If it finds any, it creates a new Exim process and passes it the open SMTP channel and a message identification. The new process then delivers the waiting message down the existing channel and may in turn cause the creation of yet another process. Any other waiting addresses in the message are skipped. The maximum number of messages sent down one connection is configurable.

This scheme achieves some SMTP efficiency when a number of messages have been queued up for a given host, without the overhead of a heavyweight queueing apparatus.
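
The bookkeeping described in this section can be sketched roughly as follows. This is only an illustration, not Exim's code: the class `WaitingHints`, its method names, and the in-memory dictionary standing in for the DBM file are all hypothetical.

```python
from collections import defaultdict


class WaitingHints:
    """Toy stand-in for the hints database keyed by host name plus IP address."""

    def __init__(self, max_per_connection=3):
        # Hypothetical name for the configurable per-connection limit.
        self.max_per_connection = max_per_connection
        self._waiting = defaultdict(list)

    def record_deferral(self, host, ip, message_id):
        """Remember that a message is waiting for this host and address."""
        queue = self._waiting[(host, ip)]
        if message_id not in queue:
            queue.append(message_id)

    def next_waiting(self, host, ip, sent_on_connection):
        """After a successful delivery, pick another message for the open channel.

        Returns None when nothing is waiting or the connection limit is reached.
        """
        if sent_on_connection >= self.max_per_connection:
            return None
        queue = self._waiting.get((host, ip))
        if not queue:
            return None
        return queue.pop(0)
```

In this sketch the caller would hand the returned message id, together with the open SMTP channel, to a new delivery process.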

Retries

When a message cannot immediately be directed, routed, or delivered, it remains on the queue and another delivery attempt occurs at a later time. While failures to deliver to remote hosts are the most common cause of this, it is also possible for a message to be deferred as a result of temporary local delivery failure, or following directing or routing. A local delivery can fail if the user is over quota, while directing can be delayed if a user's home directory is not available (for example, a missing NFS mount), and therefore the existence of a `.forward' file cannot be tested. Routing can be delayed by DNS timeouts.

Exim can be given a set of rules which specify how often to retry deferred addresses, and when to give up. These rules apply to directing and routing as well as to transporting, and are keyed by (wildcarded) domain name or, for local users, by local part and domain name, either of which can be wildcarded.

Each rule is actually a sequential list of subrules, which are applied successively as time passes. At present there are two kinds of subrule: fixed interval, and geometrically increasing interval. For example, it is possible to specify a rule such as `retry every 15 minutes for 2 hours; then increase the interval between retries by a factor of 1.5 each time until 8 hours have passed; then retry every 8 hours until 4 days have passed; then give up'. The times are measured from when the address first failed, so, for example, if a host has been down for two days, new messages will immediately go on to the 8-hour retry schedule.
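
To make the example rule concrete, here is a minimal sketch of how such a list of subrules might be evaluated. It is not Exim's configuration syntax or implementation; the names `RULE`, `retry_schedule` and `active_subrule` are hypothetical, and the starting interval of the geometric phase (15 minutes times 1.5) is an assumption.

```python
from datetime import timedelta

# Each subrule: (kind, cutoff measured from first failure, interval, factor).
# "F" is a fixed interval, "G" a geometrically increasing one.
RULE = [
    ("F", timedelta(hours=2), timedelta(minutes=15), None),
    ("G", timedelta(hours=8), timedelta(minutes=22.5), 1.5),  # assumed start
    ("F", timedelta(days=4), timedelta(hours=8), None),
]


def retry_schedule(rule):
    """Return retry times, as offsets from the first failure, until giving up."""
    times = []
    now = timedelta(0)
    for kind, cutoff, interval, factor in rule:
        step = interval
        # Keep retrying under this subrule until its cutoff would be passed.
        while now + step <= cutoff:
            now += step
            times.append(now)
            if kind == "G":
                step *= factor
    return times


def active_subrule(rule, elapsed):
    """Pick the subrule that applies once `elapsed` time has passed since the
    address first failed; None means the rule has run out (give up)."""
    for subrule in rule:
        if elapsed < subrule[1]:
            return subrule
    return None
```

Because the elapsed time is measured from the first failure, `active_subrule(RULE, timedelta(days=2))` selects the final 8-hour subrule, which matches the behaviour described for a host that has been down for two days.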

Exim does not have an elaborate series of alarm clocks to cause retries to happen exactly on schedule. A queue-runner process is started periodically, to attempt delivery, one by one, of messages containing addresses that have passed their next retry time. If such an address fails again, a new retry time is computed, and so subsequent messages queued for the same address get skipped. The queue is not processed sequentially, but in a `random' order, to prevent one rogue message that causes a problem from blocking other messages to the same destination for ever.

When the maximum time for retrying has passed, pending addresses are failed. However, a next try time is still computed from the final subrule. Until that time is reached, any new messages for the address are immediately failed. When the next try time is passed, one further delivery attempt is made; if this fails, a new next try time is computed, and so on.

The increasing number of small computers on the Internet has led to a lot of messages being addressed to hosts that are never going to listen. The retry logic described above should reduce the amount of time wasted on trying to deliver such messages. However, some administrators are unhappy about this rather draconian approach, which can cause an address to be failed without any deliveries being attempted. Exim can alternatively be configured always to try at least once those hosts whose last failure was before the arrival of the message. This option increases the number of attempts to deliver to dead hosts.

Retry rules can be predicated on particular errors as well as on domain names, and for domains that are looked up in the DNS, further discrimination on whether MX records were used or not is also possible. Thus it is possible to treat `connection refused' and `connection timed out' differently, or to distinguish between `connection refused and there was only an A record' and `connection refused from a host pointed to by an MX record'.

When a local delivery fails because a user is over quota, the retry rule can be predicated on the length of time since the mailbox was last read. For example, if the mailbox has been recently read, the delivery can be retried for a while; otherwise it can be failed quickly.

Header rewriting

There are those who argue that header rewriting is a totally Bad Thing; there are others who swear they cannot live without it. Exim provides the facility -- you do not have to use it!

Exim can be configured to rewrite the address portions of headers when a message is received. From release 3.20, Exim can also be configured to rewrite addresses in header lines at transport time. Rewriting rules can be targeted at individual headers and the envelope fields; it is possible, for example, just to rewrite the `From' header and no others.

Rewriting rules are keyed by local part and domain, either of which can be wildcarded, and the replacement text is a general expansion string which can contain file lookups. This makes it possible to replace login names by `friendly' names in outgoing addresses via a DBM lookup, for example. The other most common rewriting requirement of replacing `*.foo.bar' with `foo.bar' is also easily handled.
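
The two common cases mentioned above can be illustrated with a small sketch. It does not use Exim's rule syntax; the function names and the dictionary standing in for a DBM lookup are hypothetical.

```python
def collapse_subdomains(address, suffix):
    """Rewrite user@anything.<suffix> to user@<suffix>; leave others alone."""
    local_part, sep, domain = address.rpartition("@")
    if sep and domain.lower().endswith("." + suffix):
        return f"{local_part}@{suffix}"
    return address


def friendly_name(address, names):
    """Replace a login name with a `friendly' name found in a lookup table.

    `names` is a plain dict here; it stands in for a DBM file lookup.
    """
    local_part, sep, domain = address.rpartition("@")
    if sep and local_part in names:
        return f"{names[local_part]}@{domain}"
    return address
```

For instance, `collapse_subdomains("jim@host1.foo.bar", "foo.bar")` gives `jim@foo.bar`, while an address already at `foo.bar` is returned unchanged.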

Headers are also automatically rewritten by Exim in two cases:

In addition to the rewriting rules, Exim can be configured to add or delete specific headers at transport time. The configuration can either be on the transport, in which case it applies to all copies of the message sent by that transport, or on a director or router, in which case it applies only to copies of the message that are associated with the addresses that the router or director handled.

Host verification

Exim can be configured to accept incoming SMTP calls from certain hosts only, or it can be configured to reject calls from certain hosts. In both cases, the test may include an RFC 1413 identification check. A system that gets all its mail via a central hub might want to lock out the rest of the world, while a number of systems under one management might want to exchange mail only via the standard mailer, and hence reject mail from all but certain specified ids within the group.

Hosts for rejection can be identified explicitly, or via the RBL DNS databases. Alternatively, the SMTP AUTH extension mechanism may be used for authenticating hosts.

When a host fails an acceptance test, Exim can either give an error code immediately on connection, or allow the connection to proceed and then give error codes to all the message's recipients. The latter approach is useful when using the mechanism to reject unsolicited junk mail and mail bombs, because it normally prevents the sender from trying again with the same message.

SMTP port reservation

The maximum number of simultaneous incoming SMTP calls can be set, and in addition, a number of them can be reserved for particular hosts or particular IP networks. It is also possible to specify a system load value above which only calls from the reserved hosts are accepted.
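
A minimal sketch of the decision this implies is shown below. The function and parameter names are hypothetical, and the exact order of the checks is an assumption rather than a description of Exim's code.

```python
def accept_connection(current_calls, max_calls, reserved,
                      is_reserved_host, load, reserve_load=None):
    """Decide whether to accept one more incoming SMTP call.

    current_calls    -- calls already in progress
    max_calls        -- overall limit on simultaneous calls
    reserved         -- number of slots kept back for reserved hosts
    is_reserved_host -- whether the caller is on the reserved list
    load             -- current system load
    reserve_load     -- load above which only reserved hosts are accepted
    """
    if current_calls >= max_calls:
        return False            # no slots left for anyone
    if is_reserved_host:
        return True             # reserved hosts may use any free slot
    if reserve_load is not None and load > reserve_load:
        return False            # system is busy: reserved hosts only
    # Ordinary hosts may not eat into the reserved slots.
    return current_calls < max_calls - reserved
```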

Control of relaying

A host is said to act as a relay if it accepts an incoming message from an external host and delivers it to an external host. Unscrupulous persons have been known to use unsuspecting hosts as relays in an attempt to disguise the origin of messages. An Exim host can be configured to accept mail from any host for onward transmission to a limited set of domains only, and to accept mail only from a specified list of hosts or networks for onward transmission to any domain. Such hosts may also be identified by means of the SMTP AUTH mechanism. It is also possible to enable relaying to any domain whose MX records point to the local host, without having to list the domains explicitly.
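
The basic relay checks can be sketched as below. This is an illustration only: the function `relay_permitted` and its parameters are hypothetical, and the MX-based option mentioned above is left out.

```python
import ipaddress


def relay_permitted(client_ip, recipient_domain, relay_domains,
                    relay_networks, authenticated=False):
    """Return True if a message from client_ip may be relayed onward.

    relay_domains  -- set of domains to which anyone may relay
    relay_networks -- ipaddress network objects allowed to relay anywhere
    authenticated  -- whether the client authenticated with SMTP AUTH
    """
    if recipient_domain.lower() in relay_domains:
        return True
    if authenticated:
        return True
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in relay_networks)
```

A caller would apply this only to recipients in non-local domains; mail for local domains is not relaying and needs no such check.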

Sender verification

The return path of a message (also known as the `envelope sender') is used when Exim has to return an error message. If this is a bad address, the error message cannot be delivered, and the postmaster has to sort things out.

Sender verification (a configurable option that applies to SMTP input) is intended to pass this work to a foreign postmaster, by refusing to accept the message in the first place. There is an exception list which can specify certain hosts (with optional RFC 1413 identifications) that are allowed to bypass the check.

A certain amount of spam contains invalid return paths. Apart from this, there are two main causes of bad return paths: misconfigured mailers (gateways in particular), and users fooling around with mail. Sender verification catches both of them. It operates by passing the sender address through the directors and routers in verification mode; if this fails, the message is not accepted.

The first thing foreign postmasters ask when they learn about an apparently legitimate rejected message is `What were the headers?'. For this reason, and also to collect evidence in cases of mail forgery, Exim does not initially reject a message after the MAIL FROM command in the SMTP session. It reads the message, so as to be able to write the headers to the rejection log, and then gives a hard error response to the sending host.

Unfortunately, several mailers believe that any error response after the data for a message has been sent indicates a temporary error. Consequently, such mailers will continue to try to send a message that has been rejected as described above. To prevent this, whenever a message is rejected, Exim records the time, bad address, and host in a DBM database. If the same host sends the same bad address within 24 hours, it is rejected immediately at the MAIL FROM command.

Sadly, even this doesn't stop some mailers from repeatedly trying to send the message. As a last resort, if the same host sends the same bad address for a third time in 24 hours, the MAIL FROM command is accepted, but all subsequent RCPT TO commands are rejected. If this does not stop a remote mailer then it is badly broken.
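
The escalation described in the last three paragraphs can be sketched as a small state table. The class `RejectHints` and the strings it returns are hypothetical, a dictionary stands in for the DBM database, and measuring the 24 hours from the most recent rejection is an assumption.

```python
WINDOW = 24 * 3600  # seconds


class RejectHints:
    """Toy record of (host, bad sender address) rejections."""

    def __init__(self):
        self._seen = {}  # (host, sender) -> (count, time of last rejection)

    def reject(self, host, sender, now):
        """Record a failed sender verification and say where to reject.

        `now` is a timestamp in seconds, e.g. from time.time().
        """
        key = (host, sender)
        count, last = self._seen.get(key, (0, None))
        if last is None or now - last > WINDOW:
            count = 0  # earlier rejections are too old to matter
        count += 1
        self._seen[key] = (count, now)
        if count == 1:
            return "after DATA"      # read the message, log the headers
        if count == 2:
            return "at MAIL FROM"    # reject immediately
        return "at every RCPT TO"    # last resort
```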

If the attempt to verify the sender address cannot be completed (typically because of a DNS timeout) Exim returns a temporary error code after the MAIL FROM command, which should cause the remote mailer to try again later. However, it is possible to configure Exim to accept the message in these circumstances.

Many messages with bad return paths in fact contain perfectly valid `From' or `Reply-to' headers. For administrators that want a quieter life, there is a configuration option which causes Exim to check these headers if the return path is bad, and if a good address is found, to use it to replace the return path. The old value is retained in a header whose name begins with `X-'.

Sender lock out

More and more unsolicited junk mail is being seen on the Internet. It is sometimes useful to be able to reject messages (from any host) with particular sender addresses in the envelope. Exim can be configured to reject messages whose sender addresses match certain patterns, either by failing the MAIL FROM command, or (because some mailers take no notice of that) by failing all RCPT TO commands.

Receiver verification

Exim can be configured so that it checks the addresses given in incoming SMTP RCPT TO commands as they are received. A failing address can be immediately rejected, or it can be logged and accepted. If verification cannot be completed (typically because of a DNS timeout) either a temporary error code can be given, or the address can be logged and accepted.

The `percent hack'

The so-called `percent hack' is the feature of mailers whereby a local part containing a percent sign gets interpreted as an entire new address, with the percent replaced by @. This is used for explicit mail routing and sometimes for testing. In Exim, it is possible to configure which local domains, if any, allow the `percent hack', though this is not recommended. Such usage, if configured, is, however, subject to the relay controls.
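
As an illustration of the transformation, here is a minimal sketch. The function name and the `percent_hack_domains` parameter are hypothetical, and splitting on the rightmost percent sign is an assumption about how a local part with several percent signs would be treated.

```python
def apply_percent_hack(address, percent_hack_domains):
    """Turn local%remote@domain into local@remote if domain allows it.

    percent_hack_domains -- set of local domains with the hack enabled
    """
    local_part, sep, domain = address.rpartition("@")
    if not sep or domain.lower() not in percent_hack_domains:
        return address
    if "%" not in local_part:
        return address
    # Assumed: the rightmost percent sign becomes the new @.
    new_local, _, new_domain = local_part.rpartition("%")
    return f"{new_local}@{new_domain}"
```

The resulting address would then go through the relay controls like any other remote address.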

Security

Exim is written as a single binary that has to run setuid to root. I did start off trying to write it as a number of different modules, but soon came to the conclusion that, for this design of mailer, it was not worth it, because the functions don't decompose cleanly. For example, if you want to verify addresses while receiving mail you need all the directing and routing apparatus to be available.

Exim runs each local delivery in a separate process which is setuid to the relevant local user. In addition, it can be configured to run under a given non-root uid (and gid) for much of the rest of the time, and this is the recommended practice. In particular, it need not be root while sending or receiving SMTP mail.

Exim checks the permissions and owners of files to which messages are to be appended, and refuses to proceed with the delivery if things are not right.

Delivery of messages to pipes or files is supported only as a result of expanding an address via an alias or a forward file, provided this is permitted by the configuration. Externally generated local addresses cannot specify files or pipes -- no special action is taken for addresses starting with the file or pipe characters, so they will usually fail.

Use of the VRFY, EXPN, and ETRN functions in SMTP connections is controlled by configuration options. The DEBUG function is not supported at all.

The Exim Monitor

A program for monitoring Exim and displaying information in an X window is provided. This can be configured to show stripcharts of incoming and outgoing mail in various categories. It also shows a `tail' of the main log file, and information about messages on the queue.

There is a menu of operations that can be performed by suitably privileged users. Messages can be frozen, thawed, deleted, caused to be delivered, modified, or returned to their senders from this interface. However, all these actions can also be performed from the command line interface.

* * *


This document was generated on 11 May 2001 using the texi2html translator version 1.52.