
6. File and database lookups

Exim can be configured to look up data in files or databases in a number of different circumstances (see 6.4 below). Two different styles of data lookup are implemented:

The single-key style requires the specification of a file in which to look, and a single key to search for. The lookup type determines how the file is searched.
The query style accepts a generalized database query, which may contain one or more keys.

The code for each lookup type is in a separate source file which is compiled and included in the binary of Exim only if the corresponding compile-time option is set. The default settings in `src/EDITME' are:

LOOKUP_DBM=yes
LOOKUP_LSEARCH=yes

which means that only linear searching and DBM lookups are included by default.

6.1 Single-key lookup types

The following single-key lookup types are implemented:

lsearch: The given file is a text file which is searched linearly for a line beginning with the single key, terminated by a colon or white space or the end of the line. White space between the key and the colon is permitted. The remainder of the line, with leading and trailing white space removed, is the data. This can be continued onto subsequent lines by starting them with any amount of white space, but only a single space character is included in the data at such a junction. If the data begins with a colon, the key must be terminated by a colon, for example:
```
baduser:  :fail:
```
Empty lines and lines beginning with # are ignored, even if they occur in the middle of an item. This is the traditional textual format of alias files.
dbm: Calls to DBM library functions are used to extract data from the given DBM file by looking up the record with the given key. The terminating binary zero is included in the key that is passed to the DBM library. There is a variant called dbmnz which does not include the terminating binary zero in the key.
nis: The given file is the name of a NIS map, and a NIS lookup is done with the given key, excluding the terminating binary zero. There is a variant called nis0 which does include the terminating binary zero in the key. This is reportedly needed for Sun-style alias files. Exim does not recognize NIS aliases; the full map names must be used.
cdb: The given file is searched as a Constant DataBase file, using the key string without the terminating binary zero. The cdb format is designed for indexed files that are read frequently and never updated, except by total re-creation. As such, it is particularly suitable for large files containing aliases or other indexed data referenced by an MTA. Information about cdb can be found at
```
http://www.pobox.com/~djb/cdb.html
```
The cdb distribution is not needed in order to build Exim with cdb support, as the code for reading cdb files is included directly in Exim itself. However, no means of building or testing cdb files is provided with Exim because these are available within the cdb distribution.
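
The lsearch rules above can be illustrated with a short Python sketch. This is not Exim's code; the function name `lsearch` and the exact (case-sensitive) key comparison are assumptions made for the illustration, and the file is represented simply as a list of lines.

```python
def lsearch(lines, key):
    """Return the data stored for `key`, or None if no line begins with it.

    Illustrative only: exact, case-sensitive key comparison is an
    assumption of this sketch, not a statement about Exim itself.
    """
    found = None  # pieces of data collected once the key line has been seen
    for raw in lines:
        line = raw.rstrip("\n")
        # Empty lines and comment lines are ignored, even inside an item.
        if line.strip() == "" or line.startswith("#"):
            continue
        if found is not None:
            if line[0].isspace():
                found.append(line.strip())  # continuation of our item
                continue
            break  # a new key line ends the item
        if line[0].isspace():
            continue  # continuation line belonging to some other key
        if not line.startswith(key):
            continue
        rest = line[len(key):]
        # The key must end at a colon, white space, or the end of the line.
        if rest and not (rest[0].isspace() or rest[0] == ":"):
            continue
        rest = rest.lstrip()  # white space between key and colon is allowed
        if rest.startswith(":"):
            rest = rest[1:]
        found = [rest.strip()]
    if found is None:
        return None
    # Only a single space is included at each continuation junction.
    return " ".join(found).strip()
```

With a file containing the line `baduser:  :fail:`, looking up `baduser` in this sketch yields `:fail:`, because the first colon terminates the key and the second begins the data.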

6.2 An lsearch file is not an item list

There has been some confusion about the way lsearch lookups work, in particular in domain and host lists. An item in one of these lists may be a plain file name, or a file name preceded by a search type, and these behave differently. For a plain file name, for example

local_domains = /etc/local-mail-domains

each line of the file is treated as if it appeared as an item in the list, and negated items, wild cards, and regular expressions may be present. However, if an item is specified as an lsearch lookup, for example

local_domains = lsearch;/etc/local-mail-domains

then negated items, wild cards, and regular expressions may not be used, because lsearch is an indexed lookup method which, when given a key (the domain in the above example), yields a data value that corresponds to that key. The fact that the file is searched linearly does not make this kind of search any different from the other single-key lookup types, and an lsearch file can always be directly converted into one of the other types without change of function. Thus, the keys in lsearched files are literal strings and are not interpreted in any way.

6.3 Query-style lookup types

The following query-style lookup types are implemented:

nisplus: This does a NIS+ lookup using a query that may contain any number of keys, and which can specify the name of the field to be returned. See section 6.10 below.
ldap: This does an LDAP lookup using a query in the form of a URL, and returns attributes from a single entry. There is a variant called ldapm which permits values from multiple entries to be returned. A third variant called ldapdn returns the Distinguished Name of a single entry instead of any attribute values. See section 6.11 below.
mysql: The format of the query is an SQL statement that is passed to a MySQL database. See section 6.12 below.
pgsql: The format of the query is an SQL statement that is passed to a PostgreSQL database. See section 6.12 below.
dnsdb: This does a DNS search for a record whose domain name is the supplied query. The resulting data is the contents of the record. See section 6.13 below.
testdb: This is a lookup type which is for use in debugging Exim. It is not likely to be useful in normal operation.

6.4 Use of data lookups

There are three different types of configuration item in which data lookups can be specified:

Any string that is to be expanded may contain explicit lookup requests. String expansions are described in chapter 9.
Some drivers can be configured directly to look up data in files.
Lists of domains and other items can contain lookup requests as a way of avoiding excessively long linear lists. In this case, any data that is returned by the lookup is normally discarded; whether the lookup succeeds or fails is all that counts. However, in the case of the domains and local_parts options for directors and routers, the data is preserved in variables for later use. See sections 7.12, 7.13, and 7.16 for descriptions of the different list types.

In a string expansion, all the parameters of the lookup are specified explicitly, while for the other types there is always one implicit key involved. For example, the local_domains option contains a list of local domains; when it is being searched there is some domain name that is an implicit key.

This is not a problem for single-key lookups; the relevant file name is specified, and the key is implicit. For example, the list of local domains could be given as

local_domains = dbm;/local/domain/list

However, for query-style lookups the entire query has to be specified, and to do this, some means of including the implicit key is required. The special expansion variable $key is provided for this purpose. NIS+ could be used to look up local domains by a setting such as

local_domains = nisplus;[domain=$key],domains.org_dir

In cases where drivers can be configured to do lookups, there are always three alternative configuration options: file is used for single-key lookups, using an implicit key, and query or queries is specified for query-style lookups. In these cases the query is an expanded string, and the implicit key that would be used for file is always available as one of the normal expansion variables. The difference between query and queries is that in the latter case the string is treated as a colon-separated list of queries that are tried in order until one succeeds.

6.5 Temporary errors in lookups

Lookup functions can return temporary error codes if the lookup cannot be completed. (For example, a NIS or LDAP database might be unavailable.) For this reason, it is not advisable to use a lookup that might do this for critical options such as (to give an extreme example) local_domains.

When a lookup cannot be completed in a transport, director, or router, delivery of the message is deferred, as for any other temporary error. In other circumstances Exim may assume the lookup has failed, or may give up altogether. These are some specific cases:

local_domains, hold_domains, or queue_remote_domains during delivery: the address being checked is deferred; other addresses may succeed if they match something earlier in the list.
domains, local_parts, senders, or condition on a router or director: delivery is deferred.
local_domains, percent_hack_domains, or relay_domains while receiving SMTP: a 451 temporary error is given to the RCPT command.
local_domains during verification: a temporary error is given.
mx_domains during lookuphost: delivery is deferred.
mx_domains in the smtp transport (for hosts specified on the transport): treat as not matching.
queue_smtp_domains in the smtp transport: treat as not matching -- otherwise all SMTP deliveries would be held up.

6.6 Default values in single-key lookups

In this context, a `default value' is a value specified by the administrator that is to be used if a lookup fails.

If `*' is added to a single-key lookup type (for example, lsearch*) and the initial lookup fails, the key `*' is looked up in the file to provide a default value. See also the section on partial matching below.

Alternatively, if `*@' is added to a single-key lookup type (for example dbm*@) then, if the initial lookup fails and the key contains an @ character, a second lookup is done with everything before the last @ replaced by *. This makes it possible to provide per-domain defaults in alias files that include the domains in the keys. If the second lookup fails (or doesn't take place because there is no @ in the key), `*' is looked up.
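
The order in which keys are tried for the `*' and `*@' variants can be sketched as follows. The function `default_keys` is hypothetical and only produces the sequence of keys described above; it performs no lookups and ignores any `partial-' prefix.

```python
def default_keys(key, search_type):
    """List the keys tried, in order, for a single-key lookup type
    that may end in '*' or '*@' (illustrative sketch only)."""
    keys = [key]  # the initial lookup always uses the key as given
    if search_type.endswith("*@"):
        if "@" in key:
            # Everything before the last @ is replaced by *.
            keys.append("*" + key[key.rindex("@"):])
        keys.append("*")  # final fallback
    elif search_type.endswith("*"):
        keys.append("*")
    return keys
```

For the key `jim@example.com` and the type `dbm*@`, the sketch gives `jim@example.com`, then `*@example.com`, then `*`. The lookup stops at the first key that is found.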

6.7 Partial matching in single-key lookups

The normal operation of a single-key lookup is to search the file for an exact match with the given key. However, in a number of situations where domains are being looked up, it is useful to be able to do partial matching. In this case, information in the file that has a key starting with `*.' is matched by any domain that ends with the components that follow the full stop. For example, if a key in a DBM file is

*.dates.fict.book

then when partial matching is enabled this is matched by (amongst others) 2001.dates.fict.book and 1984.dates.fict.book. It is also matched by dates.fict.book, if that does not appear as a separate key in the file.

Partial matching is implemented by doing a series of separate lookups using keys constructed by modifying the original subject key. This means that it can be used with any of the single-key lookup types, provided that the special partial-matching keys beginning with `*.' are included in the data file. Keys in the file that do not begin with `*.' are matched only by unmodified subject keys when partial matching is in use.

Partial matching is requested by adding the string `partial-' to the front of the name of a single-key lookup type, for example, partial-dbm. When this is done, the subject key is first looked up unmodified; if that fails, `*.' is added at the start of the subject key, and it is looked up again. If that fails, further lookups are tried with dot-separated components removed from the start of the subject key, one-by-one, and `*.' added on the front of what remains.

A minimum of two non-* components is required by default. This can be adjusted by including a number before the hyphen in the search type. For example, partial3-lsearch specifies a minimum of three non-* components in the modified keys. Omitting the number is equivalent to `partial2-'. If the subject key is 2250.dates.fict.book then the following keys are looked up when the minimum number of non-* components is two:

2250.dates.fict.book
*.2250.dates.fict.book
*.dates.fict.book
*.fict.book

As soon as one key in the sequence is successfully looked up, the lookup finishes. If `partial0-' is used, the original key gets shortened right down to the null string, and the final lookup is for `*' on its own.
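
The sequence of keys can be reproduced with a small Python sketch. The function `partial_keys` is hypothetical; it follows the description above, and it assumes that the key formed by adding `*.' to the whole subject key is always tried, even when the subject key has fewer components than the minimum.

```python
import re


def partial_keys(key, search_type):
    """List the keys tried, in order, for a 'partial-' lookup type.

    Illustrative sketch only; a trailing '*' or '*@' on the search
    type is not handled here.
    """
    match = re.match(r"partial(\d*)-", search_type)
    if match is None:
        return [key]  # partial matching was not requested
    minimum = int(match.group(1)) if match.group(1) else 2
    keys = [key, "*." + key]
    parts = key.split(".")
    # Remove leading components one by one while enough non-* ones remain.
    while len(parts) - 1 >= max(minimum, 1):
        parts = parts[1:]
        keys.append("*." + ".".join(parts))
    if minimum == 0:
        keys.append("*")  # the key shortened right down to nothing
    return keys
```

Calling `partial_keys("2250.dates.fict.book", "partial-dbm")` returns the four keys listed above, and with `partial0-dbm` the list continues with `*.book` and finally `*`.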

If the search type ends in `*' or `*@' (see section 6.6 above), the search for an ultimate default that this implies happens after all partial lookups have failed. If `partial0-' is specified, adding `*' to the search type has no effect, because the `*' key is already included in the sequence of partial lookups.

The use of `*' in lookup partial matching differs from its use as a wildcard in domain lists and the like. Partial matching works only in terms of dot-separated components; a key such as `*fict.book' in a database file is useless, because the asterisk in a partial matching subject key is always followed by a dot.

6.8 Lookup caching

Exim caches the most recent lookup result on a per-file basis for single-key lookup types, and keeps the relevant files open. In some types of configuration this can lead to many files being kept open for messages with many recipients. To avoid hitting the operating system limit on the number of simultaneously open files, Exim closes the least recently used file when it needs to open more files than its own internal limit, which can be changed via the lookup_open_max option. For query-style lookups, a single data cache per lookup type is kept. The files are closed and the caches flushed at strategic points during delivery -- for example, after all directing and routing is complete.
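
The policy of closing the least recently used file can be pictured with the following Python sketch. The class `OpenFileCache` and its `opener` parameter are inventions for this illustration and say nothing about how Exim implements the limit internally; `max_open` plays the role of lookup_open_max.

```python
from collections import OrderedDict


class OpenFileCache:
    """Keep at most max_open files open, closing the least recently used."""

    def __init__(self, max_open, opener=open):
        self.max_open = max_open
        self.opener = opener          # hypothetical hook, handy for testing
        self.handles = OrderedDict()  # file name -> handle, oldest first

    def get(self, filename):
        if filename in self.handles:
            # Already open: mark it as the most recently used.
            self.handles.move_to_end(filename)
            return self.handles[filename]
        if len(self.handles) >= self.max_open:
            # At the limit: close the least recently used file first.
            _, oldest = self.handles.popitem(last=False)
            oldest.close()
        handle = self.opener(filename)
        self.handles[filename] = handle
        return handle

    def close_all(self):
        """Close everything, as happens at strategic points in delivery."""
        while self.handles:
            _, handle = self.handles.popitem()
            handle.close()
```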

6.9 Quoting lookup data

When data from an incoming message is included in a query-style lookup, there is the possibility of special characters in the data messing up the syntax of the query. For example, a NIS+ query that contains

[name=$local_part]

will be broken if the local part happens to contain a closing square bracket. For NIS+, data can be enclosed in double quotes like this:

[name="$local_part"]

but this still leaves the problem of a double quote in the data. The rule for NIS+ is that double quotes must be doubled. Other lookup types have different rules, and to cope with the differing requirements, an expansion operator of the following form is provided:

${quote_<lookup-type>:<string>}

For example, the safest way to write the NIS+ query is

[name="${quote_nisplus:$local_part}"]

See chapter 9 for full coverage of string expansions. The quote operator can be used for all lookup types, but has no effect for single-key lookups, since no quoting is ever needed in their key strings.
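
The NIS+ rule stated above, that double quotes must be doubled, amounts to a one-line transformation. The Python function below is only a sketch of that rule under the name `quote_nisplus`; it is not the Exim implementation.

```python
def quote_nisplus(text):
    """Double every double-quote character, as the NIS+ rule requires."""
    return text.replace('"', '""')


def nisplus_name_query(local_part):
    """Build the [name="..."] part of a query from untrusted data."""
    return '[name="%s"]' % quote_nisplus(local_part)
```

A local part containing a closing square bracket or a double quote then stays inside the quoted value instead of breaking the syntax of the query.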

6.10 More about NIS+

NIS+ queries consist of a NIS+ indexed name followed by an optional colon and field name. If this is given, the result of a successful query is the contents of the named field; otherwise the result consists of a concatenation of field-name=field-value pairs, separated by spaces. Empty values and values containing spaces are quoted. For example, the query

[name=mg1456],passwd.org_dir

might return the string

name=mg1456 passwd="" uid=999 gid=999 gcos="Martin Guerre"
home=/home/mg1456 shell=/bin/bash shadow=""

(split over two lines here to fit on the page), whereas

[name=mg1456],passwd.org_dir:gcos

would just return

Martin Guerre

with no quotes. A NIS+ lookup fails if NIS+ returns more than one table entry for the given indexed key. The effect of the quote_nisplus expansion operator is to double any quote characters within the text.

6.11 More about LDAP

The original LDAP implementation came from the University of Michigan; this has become `Open LDAP', and there are now two different releases. Another implementation comes from Netscape, and Solaris 7 and subsequent releases contain inbuilt LDAP support. Unfortunately, though these are all compatible at the lookup function level, their error handling is different. For this reason it is necessary to set a compile-time variable when building Exim with LDAP, to indicate which LDAP library is in use. One of the following should appear in your `Local/Makefile':

LDAP_LIB_TYPE=UMICHIGAN
LDAP_LIB_TYPE=OPENLDAP1
LDAP_LIB_TYPE=OPENLDAP2
LDAP_LIB_TYPE=NETSCAPE
LDAP_LIB_TYPE=SOLARIS

If LDAP_LIB_TYPE is not set, Exim assumes OpenLDAP 1, which has the same interface as the University of Michigan version.

There are three LDAP lookup types, which behave slightly differently in the way they handle the results of a query.

ldap requires the result to contain just one entry; if there are more, it gives an error.
ldapdn also requires the result to contain just one entry, but it is the Distinguished Name that is returned rather than any attribute values.
ldapm permits the result to contain more than one entry; the attributes from all of them are returned.

An LDAP query takes the form of a URL as defined in RFC 2255. For example, in the configuration of an aliasfile director one might have these settings:

search_type = ldap
query = ldap:///cn=$local_part,o=University%20of%20Cambridge,\
        c=UK?mailbox?base?

Two levels of quoting are required in LDAP queries, the first for LDAP and the second because the LDAP query is represented as a URL. The quote_ldap expansion operator implements the following rules:

For LDAP quoting, the characters #,+"\<>;*() have to be preceded by a backslash. (In fact, only some of these need to be quoted in Distinguished Names, and others in LDAP filters, but it does no harm to have a single quoting rule for all of them.)
For URL quoting, all characters except alphanumerics and !$'()*+-._ are replaced by %xx where xx is the hexadecimal character code. Note that backslash has to be quoted in a URL, so characters that are escaped for LDAP end up preceded by %5C in the final encoding.
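
The two quoting levels can be sketched in Python as follows. This is an illustration of the two rules above, not Exim's code; encoding the text as UTF-8 before applying the %xx step and using upper-case hexadecimal digits are assumptions of the sketch.

```python
LDAP_SPECIAL = set('#,+"\\<>;*()')   # characters escaped with a backslash
URL_SAFE = set("!$'()*+-._")         # characters left alone in the URL


def quote_ldap(text):
    """Apply LDAP quoting first, then URL quoting (illustrative sketch)."""
    escaped = "".join("\\" + c if c in LDAP_SPECIAL else c for c in text)
    out = []
    # Assumption: non-ASCII text is treated as UTF-8 bytes.
    for byte in escaped.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and (ch.isalnum() or ch in URL_SAFE):
            out.append(ch)
        else:
            out.append("%%%02X" % byte)
    return "".join(out)
```

For example, `Smith, John` becomes `Smith%5C%2C%20John`: the comma is first preceded by a backslash for LDAP, and then the backslash, the comma, and the space are each replaced by their %xx form.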

The example above does not specify an LDAP server. A server can be specified in a query by starting it with

ldap://<hostname>:<port>/...

If the port (and preceding colon) are omitted, the standard LDAP port (389) is used. When, however, no server is specified in a query, a list of default servers is taken from the ldap_default_servers configuration option. This supplies a colon-separated list of servers which are tried in turn until one successfully handles a query, or there is a serious error. Successful handling either returns the requested data, or indicates that it does not exist. Serious errors are syntactical, or multiple values when only a single value is expected. Errors which cause the next server to be tried are connection failures, bind failures, and timeouts.

For each server name in the list, a port number can be given. The standard way of specifying a host and port is to use a colon separator (RFC 1738). Because ldap_default_servers is a colon-separated list, such colons have to be doubled. For example

ldap_default_servers = ldap1.example.com::145:ldap2.example.com
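
The effect of doubling the colon can be seen by splitting such a list by hand. The Python function `split_colon_list` below is a hypothetical sketch of the rule (a doubled colon stands for one literal colon, a single colon separates items); stripping white space around each item is an assumption of the sketch.

```python
def split_colon_list(value):
    """Split a colon-separated list in which '::' means a literal colon."""
    items, current, i = [], [], 0
    while i < len(value):
        if value[i] == ":":
            if value[i + 1:i + 2] == ":":
                current.append(":")  # doubled colon: keep one, stay in item
                i += 2
                continue
            items.append("".join(current).strip())  # single colon ends item
            current = []
        else:
            current.append(value[i])
        i += 1
    items.append("".join(current).strip())
    return items
```

Applied to the setting above, the sketch yields two servers: `ldap1.example.com:145` and `ldap2.example.com`.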

If ldap_default_servers is unset, a URL with no server name is passed to the LDAP library unchanged, and the library's default (normally the local host) is used.

The LDAP URL syntax provides no way of passing authentication and other control information to the server. To make this possible, the URL in an LDAP query may be preceded by any number of `<name>=<value>' settings, separated by spaces. If a value contains spaces it must be enclosed in double quotes, and when double quotes are used, backslash is interpreted in the usual way inside them. The following names are recognized:

USER     set the DN, for authenticating the LDAP bind
PASS     set the password, likewise
SIZE     set the limit for the number of entries returned
TIME     set the maximum waiting time for a query

The values may be given in any order. The default is no time limit, and no limit on the number of entries returned. Here is an example of an LDAP query in an Exim lookup which uses some of these values. This is a single line, folded for ease of reading:

${lookup ldap
  {user="cn=manager,o=University of Cambridge,c=UK" pass=secret
  ldap:///o=University%20of%20Cambridge,c=UK?sn?sub?(cn=foo)}
  {$value}fail}

The encoding of spaces as %20 applies only to the URL itself; it should not be done for any of the auxiliary data. Exim configuration settings that include lookups which contain password information should be preceded by `hide' to prevent non-admin users from using the -bP option to see their values.

The ldapdn lookup type returns the Distinguished Name from a single entry as a sequence of values, for example

cn=manager, o=University of Cambridge, c=UK

For ldap and ldapm, if a query finds only entries with no attributes, Exim behaves as if the entry did not exist, and the lookup fails.

The ldap lookup type generates an error if more than one entry matches the search filter, whereas ldapm permits this case, and inserts a newline in the result between the data from different entries. It is possible for multiple values to be returned for both ldap and ldapm, but in the former case you know that whatever values are returned all came from a single entry in the directory.

In the common case where you specify a single attribute in your LDAP query, the result is not quoted, and if there are multiple values, they are separated by commas. If you specify multiple attributes, they are returned as space-separated strings, quoted if necessary, preceded by the attribute name. For example,

ldap:///o=base?attr1,attr2?sub?(uid=fred)

might yield

attr1="value one" attr2=value2

If you do not specify any attributes in the search, the same format is used for all attributes in the entry. For example,

ldap:///o=base??sub?(uid=fred)

might yield

objectClass=top attr1="value one" attr2=value2

The extract operator in string expansions can be used to pick out individual fields from such data.
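
Within Exim the extract operator is the tool for this job. For a script that post-processes such data outside Exim, the format can be parsed along the following lines. The function `parse_fields` is hypothetical, and treating a backslash inside double quotes as escaping the next character is an assumption of the sketch.

```python
def parse_fields(data):
    """Parse 'name=value name="quoted value"' data into (name, value) pairs.

    Illustrative sketch: raises ValueError if a field has no '=' sign.
    """
    pairs = []
    i, n = 0, len(data)
    while i < n:
        while i < n and data[i].isspace():
            i += 1  # skip the separator between fields
        if i >= n:
            break
        eq = data.index("=", i)
        name = data[i:eq]
        i = eq + 1
        if i < n and data[i] == '"':
            i += 1
            chars = []
            while i < n and data[i] != '"':
                if data[i] == "\\" and i + 1 < n:
                    i += 1  # assumption: backslash escapes the next character
                chars.append(data[i])
                i += 1
            i += 1  # step over the closing quote
            value = "".join(chars)
        else:
            start = i
            while i < n and not data[i].isspace():
                i += 1
            value = data[start:i]
        pairs.append((name, value))
    return pairs
```

A list of pairs is returned rather than a dictionary so that repeated field names are not lost.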

6.12 More about MySQL and PostgreSQL

If any MySQL or PostgreSQL lookups are used, the mysql_servers or pgsql_servers option (as appropriate) must be set to a colon-separated list of tuples, each consisting of a host, database, user, and password separated by slashes. Because password data is sensitive, you should precede the setting with `hide', to prevent non-admin users from obtaining the setting via the -bP option. For example:

hide mysql_servers = localhost/users/root/secret:\
                     otherhost/users/root/othersecret

For each query, these parameter groups are tried in order until a connection and a query succeed. For MySQL, no database need be supplied -- if it is absent, it must be given in the queries. A host may be specified as <name>:<port> but because this is a colon-separated list, the colon has to be doubled. Queries are SQL statements, so an example might be

${lookup mysql{select mailbox from users where id='ph10'}{$value}fail}

If the result of the query contains more than one field, the data for each field in the row is returned, preceded by its name, so the result of

${lookup pgsql{select home,name from users where id='ph10'}{$value}}

might be

home=/home/ph10 name="Philip Hazel"

Values containing spaces and empty values are double quoted, with embedded quotes escaped by backslash.

If the result of the query contains just one field, the value is passed back verbatim, without a field name, for example:

Philip Hazel

If the result of the query yields more than one row, it is all concatenated, with a newline between the data for each row.

The quote_mysql and quote_pgsql expansion operators convert newline, tab, carriage return, and backspace to \n, \t, \r, and \b respectively, and the characters single-quote, double-quote, and backslash are escaped with backslashes. The quote_pgsql expansion operator, in addition, escapes the percent and underscore characters. This cannot be done for MySQL because these escapes are not recognized in contexts where these characters are not special.
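
The quoting rules just described can be written out as a Python sketch. The helper `quote_sql` is hypothetical and exists only so that the two operators can share one loop; none of this is Exim's own code.

```python
# Control characters that are replaced by two-character escapes.
_CONTROL = {"\n": "\\n", "\t": "\\t", "\r": "\\r", "\b": "\\b"}


def quote_sql(text, extra=""):
    """Escape text for use inside an SQL string (illustrative sketch)."""
    out = []
    for ch in text:
        if ch in _CONTROL:
            out.append(_CONTROL[ch])
        elif ch in "'\"\\" or ch in extra:
            out.append("\\" + ch)  # precede the character with a backslash
        else:
            out.append(ch)
    return "".join(out)


def quote_mysql(text):
    return quote_sql(text)


def quote_pgsql(text):
    # PostgreSQL quoting additionally escapes percent and underscore.
    return quote_sql(text, extra="%_")
```

In a configuration the operator would wrap the variable inside the query, in the same way as the quote_nisplus example in section 6.9.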

6.13 More about dnsdb

The dnsdb lookup type uses the DNS as its database. A query consists of a record type and a domain name, separated by an equals sign. For example, an expansion string could contain:

${lookup dnsdb{mx=a.b.example}{$value}fail}

The supported record types are A, CNAME, MX, NS, PTR, and TXT, and, when Exim is compiled with IPv6 support, AAAA and A6. If no type is given, TXT is assumed. When the type is PTR, the address should be given as normal; it gets converted to the necessary magic internally. For example:

${lookup dnsdb{ptr=192.168.4.5}{$value}fail}

For MX records, both the preference value and the host name are returned, separated by a space. If multiple records are found (or, for A6 lookups, if a single record leads to multiple addresses), the data is returned as a concatenation, separated by newlines. The order, of course, depends on the DNS resolver.
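
The way a dnsdb query is taken apart can be sketched in Python. The function `parse_dnsdb_query` is hypothetical; accepting the record type in either case is an assumption, and the PTR conversion shown is the standard reverse-lookup name as produced by Python's ipaddress module, not a statement about Exim's internals.

```python
import ipaddress

SUPPORTED = {"A", "CNAME", "MX", "NS", "PTR", "TXT", "AAAA", "A6"}


def parse_dnsdb_query(query):
    """Split '<type>=<name>' into (type, name to look up in the DNS)."""
    rtype, sep, name = query.partition("=")
    if not sep:
        rtype, name = "TXT", query  # no type given: TXT is assumed
    rtype = rtype.upper()           # assumption: type is case-insensitive
    if rtype not in SUPPORTED:
        raise ValueError("unsupported record type: " + rtype)
    if rtype == "PTR":
        # An address given in the normal form is turned into the
        # reverse-lookup domain name.
        name = ipaddress.ip_address(name).reverse_pointer
    return rtype, name
```

For the PTR example above, the name actually looked up is `5.4.168.192.in-addr.arpa`.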
