Previous   Next   Contents       (Exim 4.20 Specification)

9. File and database lookups

Exim can be configured to look up data in files or databases as it processes messages. Two different kinds of syntax are used:

  1. A string that is to be expanded may contain explicit lookup requests. These cause parts of the string to be replaced by data that is obtained from the lookup.

  2. Lists of domains, hosts, and email addresses can contain lookup requests as a way of avoiding excessively long linear lists. In this case, the data that is returned by the lookup is often (but not always) discarded; whether the lookup succeeds or fails is what really counts. These kinds of list are described in chapter 10.

It is easy to confuse the two different kinds of lookup, especially as the lists that may contain the second kind are always expanded before being processed as lists. Therefore, they may also contain lookups of the first kind. Be careful to distinguish between the following two examples:

  domains = ${lookup{$sender_host_address}lsearch{/some/file}}
  domains = lsearch;/some/file

The first uses a string expansion, the result of which must be a domain list. String expansions are described in detail in chapter 11. The expansion takes place first, and the file that is searched could contain lines like this:

  192.168.3.4: domain1 : domain2 : ...
  192.168.1.9: domain3 : domain4 : ...

Thus, the result of the expansion is a list of domains (and possibly other types of item that are allowed in domain lists).

In the second case, the lookup is a single item in a domain list. It causes Exim to use a lookup to see if the domain that is being processed can be found in the file. The file could contains lines like this:

  domain1: 
  domain2:

Any data that follows the keys is not relevant when checking that the domain matches the list item.

It is possible to use both kinds of lookup at once. Consider a file containing lines like this:

  192.168.5.6: lsearch;/another/file

If the value of $sender_host_address is 192.168.5.6, expansion of the first domains setting above generates the second setting, which therefore causes a second lookup to occur.

The rest of this chapter describes the different lookup types that are available. Any of them can be used in either of the circumstances described above. The syntax requirements for the two cases are described in chapters 11 and 10, respectively.

9.1. Lookup types

Two different styles of data lookup are implemented:

The code for each lookup type is in a separate source file that is included in the binary of Exim only if the corresponding compile-time option is set. The default settings in src/EDITME are:

  LOOKUP_DBM=yes
  LOOKUP_LSEARCH=yes

which means that only linear searching and DBM lookups are included by default. For some types of lookup (e.g. SQL databases), you need to install appropriate libraries and header files before building Exim.

9.2. Single-key lookup types

The following single-key lookup types are implemented:

9.3. Query-style lookup types

The supported query-style lookup types are listed below. Further details about many of them are given in later sections.

9.4. Temporary errors in lookups

Lookup functions can return temporary error codes if the lookup cannot be completed. For example, a NIS or LDAP database might be unavailable. For this reason, it is not advisable to use a lookup that might do this for critical options such as a list of local domains.

When a lookup cannot be completed in a router or transport, delivery of the message (to the relevant address) is deferred, as for any other temporary error. In other circumstances Exim may assume the lookup has failed, or may give up altogether.

9.5. Default values in single-key lookups

In this context, a ``default value'' is a value specified by the administrator that is to be used if a lookup fails.

If ``*'' is added to a single-key lookup type (for example, lsearch*) and the initial lookup fails, the key ``*'' is looked up in the file to provide a default value. See also the section on partial matching below.

Alternatively, if ``*@'' is added to a single-key lookup type (for example dbm*@) then, if the initial lookup fails and the key contains an @ character, a second lookup is done with everything before the last @ replaced by *. This makes it possible to provide per-domain defaults in alias files that include the domains in the keys. If the second lookup fails (or doesn't take place because there is no @ in the key), ``*'' is looked up. For example, a redirect router might contain:

  data = ${lsearch*@{$local_part@$domain}{/etc/mixed-aliases}}

Suppose the address that is being processed is jane@eyre.example. Exim looks up these keys, in this order:

  jane@eyre.example
  *@eyre.example
  *

The data is taken from whichever key it finds first. Note: in an lsearch file, this does not mean the first of these keys in the file. A complete scan is done for each key, and only if it is not found at all does Exim move on to try the next key.

9.6. Partial matching in single-key lookups

The normal operation of a single-key lookup is to search the file for an exact match with the given key. However, in a number of situations where domains are being looked up, it is useful to be able to do partial matching. In this case, information in the file that has a key starting with ``*.'' is matched by any domain that ends with the components that follow the full stop. For example, if a key in a DBM file is

  *.dates.fict.example

then when partial matching is enabled this is matched by (amongst others) 2001.dates.fict.example and 1984.dates.fict.example. It is also matched by dates.fict.example, if that does not appear as a separate key in the file.

Note: Partial matching is not available for query-style lookups. It is also not available for any lookup items in address lists (see section 10.13).

Partial matching is implemented by doing a series of separate lookups using keys constructed by modifying the original subject key. This means that it can be used with any of the single-key lookup types, provided that partial matching keys beginning with a special prefix (default ``*.'') are included in the data file. Keys in the file that do not begin with the prefix are matched only by unmodified subject keys when partial matching is in use.

Partial matching is requested by adding the string ``partial-'' to the front of the name of a single-key lookup type, for example, partial-dbm. When this is done, the subject key is first looked up unmodified; if that fails, ``*.'' is added at the start of the subject key, and it is looked up again. If that fails, further lookups are tried with dot-separated components removed from the start of the subject key, one-by-one, and ``*.'' added on the front of what remains.

A minimum number of two non-* components are required. This can be adjusted by including a number before the hyphen in the search type. For example, partial3-lsearch specifies a minimum of three non-* components in the modified keys. Omitting the number is equivalent to ``partial2-''. If the subject key is 2250.dates.fict.example then the following keys are looked up when the minimum number of non-* components is two:

  2250.dates.fict.example
  *.2250.dates.fict.example
  *.dates.fict.example
  *.fict.example

As soon as one key in the sequence is successfully looked up, the lookup finishes.

The use of ``*.'' as the partial matching prefix is a default that can be changed. The motivation for this feature is to allow Exim to operate with file formats that are used by other MTAs. A different prefix can be supplied in parentheses instead of the hyphen after ``partial''. For example:

  domains = partial(.)lsearch;/some/file

In this example, if the domain is a.b.c, the sequence of lookups is a.b.c, .a.b.c, and .b.c (the default minimum of 2 non-wild components is unchanged). The prefix may consist of any punctuation characters other than a closing parenthesis. It may be empty, for example:

  domains = partial1()cdb;/some/file

For this example, if the domain is a.b.c, the sequence of lookups is a.b.c, b.c, and c.

If ``partial0'' is specified, what happens at the end (when the lookup with just one non-wild component has failed, and the original key is shortened right down to the null string) depends on the prefix:

If the search type ends in ``*'' or ``*@'' (see section 9.5 above), the search for an ultimate default that this implies happens after all partial lookups have failed. If ``partial0'' is specified, adding ``*'' to the search type has no effect with the default prefix, because the ``*'' key is already included in the sequence of partial lookups. However, there might be a use for lookup types such as ``partial0(.)lsearch*''.

The use of ``*'' in lookup partial matching differs from its use as a wildcard in domain lists and the like. Partial matching works only in terms of dot-separated components; a key such as *fict.example in a database file is useless, because the asterisk in a partial matching subject key is always followed by a dot.

9.7. Lookup caching

An Exim process caches the most recent lookup result on a per-file basis for single-key lookup types, and keeps the relevant files open. In some types of configuration this can lead to many files being kept open for messages with many recipients. To avoid hitting the operating system limit on the number of simultaneously open files, Exim closes the least recently used file when it needs to open more files than its own internal limit, which can be changed via the lookup_open_max option.

For query-style lookups, a single data cache per lookup type is kept. The files are closed and the caches flushed at strategic points during delivery - for example, after all routing is complete.

9.8. Quoting lookup data

When data from an incoming message is included in a query-style lookup, there is the possibility of special characters in the data messing up the syntax of the query. For example, a NIS+ query that contains

  [name=$local_part]

will be broken if the local part happens to contain a closing square bracket. For NIS+, data can be enclosed in double quotes like this:

  [name="$local_part"]

but this still leaves the problem of a double quote in the data. The rule for NIS+ is that double quotes must be doubled. Other lookup types have different rules, and to cope with the differing requirements, an expansion operator of the following form is provided:

  ${quote_<lookup-type>:<string>}

For example, the safest way to write the NIS+ query is

  [name="${quote_nisplus:$local_part}"]

See chapter 11 for full coverage of string expansions. The quote operator can be used for all lookup types, but has no effect for single-key lookups, since no quoting is ever needed in their key strings.

9.9. More about dnsdb

The dnsdb lookup type uses the DNS as its database. A query consists of a record type and a domain name, separated by an equals sign. For example, an expansion string could contain:

  ${lookup dnsdb{mx=a.b.example}{$value}fail}

The supported record types are A, CNAME, MX, NS, PTR, and TXT, and, when Exim is compiled with IPv6 support, AAAA (and A6 if that is also configured). If no type is given, TXT is assumed. When the type is PTR, the address should be given as normal; it is converted to the necessary inverted format internally. For example:

  ${lookup dnsdb{ptr=192.168.4.5}{$value}fail}

For MX records, both the preference value and the host name are returned, separated by a space. If multiple records are found (or, for A6 lookups, if a single record leads to multiple addresses), the data is returned as a concatenation, separated by newlines. The order, of course, depends on the DNS resolver.

9.10. More about LDAP

The original LDAP implementation came from the University of Michigan; this has become ``Open LDAP'', and there are now two different releases. Another implementation comes from Netscape, and Solaris 7 and subsequent releases contain inbuilt LDAP support. Unfortunately, though these are all compatible at the lookup function level, their error handling is different. For this reason it is necessary to set a compile-time variable when building Exim with LDAP, to indicate which LDAP library is in use. One of the following should appear in your Local/Makefile:

  LDAP_LIB_TYPE=UMICHIGAN
  LDAP_LIB_TYPE=OPENLDAP1
  LDAP_LIB_TYPE=OPENLDAP2
  LDAP_LIB_TYPE=NETSCAPE
  LDAP_LIB_TYPE=SOLARIS

If LDAP_LIB_TYPE is not set, Exim assumes OPENLDAP1, which has the same interface as the University of Michigan version.

There are three LDAP lookup types in Exim. These behave slightly differently in the way they handle the results of a query:

For ldap and ldapm, if a query finds only entries with no attributes, Exim behaves as if the entry did not exist, and the lookup fails. The format of the data returned by a successful lookup is described in the next section. First we explain how LDAP queries are coded.

9.11. Format of LDAP queries

An LDAP query takes the form of a URL as defined in RFC 2255. For example, in the configuration of a redirect router one might have this setting:

  data = ${lookup ldap \
    {ldap:///cn=$local_part,o=University%20of%20Cambridge,\
    c=UK?mailbox?base?}}

The URL may begin with ldap or ldaps if your LDAP library supports secure (encrypted) LDAP connections. The second of these ensures that an encrypted TLS connection is used.

9.12. LDAP quoting

Two levels of quoting are required in LDAP queries, the first for LDAP itself and the second because the LDAP query is represented as a URL. Furthermore, within an LDAP query, two different kinds of quoting are required. For this reason, there are two different LDAP-specific quoting operators.

The quote_ldap operator is designed for use on strings that are part of filter specifications. Conceptually, it first does the following conversions on the string:

  *   =>   \2A
  (   =>   \28
  )   =>   \29
  \   =>   \5C

in accordance with RFC 2254. The resulting string is then quoted according to the rules for URLs, that is, all characters except

  ! $ ' - . _ ( ) * +

are converted to their hex values, preceded by a percent sign. For example:

  ${quote_ldap: a(bc)*, a<yz>; }

yields

  %20a%5C28bc%5C29%5C2A%2C%20a%3Cyz%3E%3B%20

Removing the URL quoting, this is (with a leading and a trailing space):

  a\28bc\29\2A, a<yz>;

The quote_ldap_dn operator is designed for use on strings that are part of base DN specifications in queries. Conceptually, it first converts the string by inserting a backslash in front of any of the following characters:

  , + " \ < > ;

It also inserts a backslash before any leading spaces or # characters, and before any trailing spaces. (These rules are in RFC 2253.) The resulting string is then quoted according to the rules for URLs. For example:

  ${quote_ldap_dn: a(bc)*, a<yz>; } 

yields

  %5C%20a(bc)*%5C%2C%20a%5C%3Cyz%5C%3E%5C%3B%5C%20

Removing the URL quoting, this is (with a trailing space):

  \ a(bc)*\, a\<yz\>\;\

There are some further comments about quoting in the section on LDAP authentication below.

9.13. LDAP connections

The connection to an LDAP server may either be over TCP/IP, or, when OpenLDAP is in use, via a Unix domain socket. The example given above does not specify an LDAP server. A server that is reached by TCP/IP can be specified in a query by starting it with

  ldap://<hostname>:<port>/...

If the port (and preceding colon) are omitted, the standard LDAP port (389) is used. When no server is specified in a query, a list of default servers is taken from the ldap_default_servers configuration option. This supplies a colon-separated list of servers which are tried in turn until one successfully handles a query, or there is a serious error. Successful handling either returns the requested data, or indicates that it does not exist. Serious errors are syntactical, or multiple values when only a single value is expected. Errors which cause the next server to be tried are connection failures, bind failures, and timeouts.

For each server name in the list, a port number can be given. The standard way of specifing a host and port is to use a colon separator (RFC 1738). Because ldap_default_servers is a colon-separated list, such colons have to be doubled. For example

  ldap_default_servers = ldap1.example.com::145:ldap2.example.com

If ldap_default_servers is unset, a URL with no server name is passed to the LDAP library with no server name, and the library's default (normally the local host) is used.

If you are using the OpenLDAP library, you can connect to an LDAP server using a Unix domain socket instead of a TCP/IP connection. This is specified by using ldapi instead of ldap in LDAP queries. What follows here applies only to OpenLDAP. If Exim is compiled with a different LDAP library, this feature is not available.

For this type of connection, instead of a host name for the server, a pathname for the socket is required, and the port number is not relevant. The pathname can be specified either as an item in ldap_default_servers, or inline in the query. In the former case, you can have settings such as

  ldap_default_servers = /tmp/ldap.sock : backup.ldap.your.domain

When the pathname is given in the query, you have to escape the slashes as %2F to fit in with the LDAP URL syntax. For example:

  ${lookup ldap {ldapi://%2Ftmp%2Fldap.sock/o=...

When Exim processes an LDAP lookup and finds that the ``hostname'' is really a pathname, it uses the Unix domain socket code, even if the query actually specifies ldap or ldaps. In particular, no encryption is used for a socket connection. This behaviour means that you can use a setting of ldap_default_servers such as in the example above with traditional ldap or ldaps queries, and it will work. First, Exim tries a connection via the Unix domain socket; if that fails, it tries a TCP/IP connection to the backup host.

If an explicit ldapi type is given in a query when a host name is specified, an error is diagnosed. However, if there are more items in ldap_default_servers, they are tried. In other words:

Using ldapi with no host or path in the query, and no setting of ldap_default_servers, does whatever the library does by default.

9.14. LDAP authentication and control information

The LDAP URL syntax provides no way of passing authentication and other control information to the server. To make this possible, the URL in an LDAP query may be preceded by any number of ``<name>=<value>'' settings, separated by spaces. If a value contains spaces it must be enclosed in double quotes, and when double quotes are used, backslash is interpreted in the usual way inside them.

The following names are recognized:

  CONNECT set a connection timeout
  USER set the DN, for authenticating the LDAP bind
  PASS set the password, likewise
  SIZE set the limit for the number of entries returned
  TIME set the maximum waiting time for a query

Here is an example of an LDAP query in an Exim lookup that uses some of these values. This is a single line, folded for ease of reading:

  ${lookup ldap
    {user="cn=manager,o=University of Cambridge,c=UK" pass=secret
    ldap:///o=University%20of%20Cambridge,c=UK?sn?sub?(cn=foo)}
    {$value}fail}

The encoding of spaces as %20 is a URL thing which should not be done for any of the auxiliary data. Exim configuration settings that include lookups which contain password information should be preceded by ``hide'' to prevent non-admin users from using the -bP option to see their values.

The auxiliary data items may be given in any order. The default is no connection timeout (the system timeout is used), no user or password, no limit on the number of entries returned, and no time limit on queries.

The time limit for connection is given in seconds; zero means use the default. This facility is available in Netscape SDK 4.1; it may not be available in other LDAP implementations. Exim uses the given value if LDAP_X_OPT_CONNECT_TIMEOUT is defined in the LDAP headers.

When a DN is quoted in the USER= setting for LDAP authentication, Exim removes any URL quoting that it may contain before passing it LDAP. Apparently some libraries do this for themselves, but some do not. Removing the URL quoting has two advantages:

For example, a setting such as

  USER=cn=${quote_ldap_dn:$1}

should work even if $1 contains spaces.

Expanded data for the PASS= value should be quoted using the quote expansion operator, rather than the LDAP quote operators. The only reason this field needs quoting is to ensure that it conforms to the Exim syntax, which does not allow unquoted spaces. For example:

  PASS=${quote:$3}

The LDAP authentication mechanism can be used to check passwords as part of SMTP authentication. See the ldapauth expansion string condition in chapter 11.

9.15. Format of data returned by LDAP

The ldapdn lookup type returns the Distinguished Name from a single entry as a sequence of values, for example

  cn=manager, o=University of Cambridge, c=UK

The ldap lookup type generates an error if more than one entry matches the search filter, whereas ldapm permits this case, and inserts a newline in the result between the data from different entries. It is possible for multiple values to be returned for both ldap and ldapm, but in the former case you know that whatever values are returned all came from a single entry in the directory.

In the common case where you specify a single attribute in your LDAP query, the result is not quoted, and does not contain the attribute name. If the attribute has multiple values, they are separated by commas.

If you specify multiple attributes, the result contains space-separated, quoted strings, each preceded by the attribute name and an equals sign. Within the quotes, the quote character, backslash, and newline are escaped with backslashes, and commas are used to separate multiple values for the attribute. Apart from the escaping, the string within quotes takes the same form as the output when a single attribute is requested. Specifying no attributes is the same as specifying all of an entry's attributes.

Here are some examples of the output format. The first line of each pair is an LDAP query, and the second is the data that is returned. The attribute called attr1 has two values, whereas attr2 has only one value:

  ldap:///o=base?attr1?sub?(uid=fred)
  value1.1, value1.2
  
  ldap:///o=base?attr2?sub?(uid=fred)
  value two
  
  ldap:///o=base?attr1,attr2?sub?(uid=fred)
  attr1="value1.1, value1.2" attr2="value two"
  
  ldap:///o=base??sub?(uid=fred)
  objectClass="top" attr1="value1.1, value1.2" attr2="value two"

The extract operator in string expansions can be used to pick out individual fields from data that consists of key=value pairs. You can make use of Exim's -be option to run expansion tests and thereby check the results of LDAP lookups.

9.16. More about NIS+

NIS+ queries consist of a NIS+ indexed name followed by an optional colon and field name. If this is given, the result of a successful query is the contents of the named field; otherwise the result consists of a concatenation of field-name=field-value pairs, separated by spaces. Empty values and values containing spaces are quoted. For example, the query

  [name=mg1456],passwd.org_dir

might return the string

  name=mg1456 passwd="" uid=999 gid=999 gcos="Martin Guerre"
  home=/home/mg1456 shell=/bin/bash shadow=""

(split over two lines here to fit on the page), whereas

  [name=mg1456],passwd.org_dir:gcos

would just return

  Martin Guerre

with no quotes. A NIS+ lookup fails if NIS+ returns more than one table entry for the given indexed key. The effect of the quote_nisplus expansion operator is to double any quote characters within the text.

9.17. More about MySQL, PostgreSQL, and Oracle

If any MySQL, PostgreSQL, or Oracle lookups are used, the mysql_servers, pgsql_servers, or oracle_servers option (as appropriate) must be set to a colon-separated list of server information. Each item in the list is a slash-separated list of four items: host name, database name, user name, and password. In the case of Oracle, the host name field is used for the ``service name'', and the database name field is not used and should be empty. For example:

  hide oracle_servers = oracle.plc.example//ph10/abcdwxyz

Because password data is sensitive, you should always precede the setting with ``hide'', to prevent non-admin users from obtaining the setting via the -bP option. Here is an example where two MySQL servers are listed:

  hide mysql_servers = localhost/users/root/secret:\
                       otherhost/users/root/othersecret

For MySQL and PostgreSQL, a host may be specified as <name>:<port> but because this is a colon-separated list, the colon has to be doubled.

For each query, these parameter groups are tried in order until a connection and a query succeeds. Queries for these databases are SQL statements, so an example might be

  ${lookup mysql{select mailbox from users where id='ph10'}{$value}fail}

If the result of the query contains more than one field, the data for each field in the row is returned, preceded by its name, so the result of

  ${lookup pgsql{select home,name from users where id='ph10'}{$value}}

might be

  home=/home/ph10 name="Philip Hazel"

Values containing spaces and empty values are double quoted, with embedded quotes escaped by a backslash.

If the result of the query contains just one field, the value is passed back verbatim, without a field name, for example:

  Philip Hazel

If the result of the query yields more than one row, it is all concatenated, with a newline between the data for each row.

The quote_mysql, quote_pgsql, and quote_oracle expansion operators convert newline, tab, carriage return, and backspace to \n, \t, \r, and \b respectively, and the characters single-quote, double-quote, and backslash itself are escaped with backslashes. The quote_pgsql expansion operator, in addition, escapes the percent and underscore characters. This cannot be done for MySQL because these escapes are not recognized in contexts where these characters are not special.

9.18. Special MySQL features

For MySQL, an empty host name or the use of ``localhost'' in mysql_servers causes a connection to the server on the local host by means of a Unix domain socket. An alternate socket can be specified in parentheses. The full syntax of each item in mysql_servers is:

  <hostname>::<port>(<socket name>)/<database>/<user>/<password>

Any of the three sub-parts of the first field can be omitted. For normal use on the local host it can be left blank or set to just ``localhost''.

No database need be supplied - but if it is absent here, it must be given in the queries.

If a MySQL query is issued that does not request any data (an insert, update, or delete command), the result of the lookup is the number of rows affected.

9.19. Special PostgreSQL features

PostgreSQL lookups can also use Unix domain socket connections to the database. This is usually faster and costs less CPU time than a TCP/IP connection. However it can be used only if the mail server runs on the same machine as the database server. A configuration line for PostgreSQL via Unix domain sockets looks like this:

  hide pgsql_servers = (/tmp/.s.PGSQL.5432)/db/user/password : ...

In other words, instead of supplying a host name, a path to the socket is given. The path name is enclosed in parentheses so that its slashes aren't visually confused with the delimiters for the other server parameters.

If a PostgreSQL query is issued that does not request any data (an insert, update, or delete command), the result of the lookup is the number of rows affected.


Previous  Next  Contents       (Exim 4.20 Specification)