/[pcre]/code/trunk/doc/pcrestack.3
ViewVC logotype

Diff of /code/trunk/doc/pcrestack.3

Parent Directory Parent Directory | Revision Log Revision Log | View Patch Patch

revision 91 by nigel, Sat Feb 24 21:41:34 2007 UTC revision 487 by ph10, Wed Jan 6 10:26:55 2010 UTC
# Line 17  the recursive call would immediately be Line 17  the recursive call would immediately be
17  current call (a "tail recursion"), the function is just restarted instead.  current call (a "tail recursion"), the function is just restarted instead.
18  .P  .P
19  The \fBpcre_dfa_exec()\fP function operates in an entirely different way, and  The \fBpcre_dfa_exec()\fP function operates in an entirely different way, and
20  hardly uses recursion at all. The limit on its complexity is the amount of  uses recursion only when there is a regular expression recursion or subroutine
21  workspace it is given. The comments that follow do NOT apply to  call in the pattern. This includes the processing of assertion and "once-only"
22  \fBpcre_dfa_exec()\fP; they are relevant only for \fBpcre_exec()\fP.  subpatterns, which are handled like subroutine calls. Normally, these are never
23  .P  very deep, and the limit on the complexity of \fBpcre_dfa_exec()\fP is
24  You can set limits on the number of times that \fBmatch()\fP is called, both in  controlled by the amount of workspace it is given. However, it is possible to
25  total and recursively. If the limit is exceeded, an error occurs. For details,  write patterns with runaway infinite recursions; such patterns will cause
26  see the  \fBpcre_dfa_exec()\fP to run out of stack. At present, there is no protection
27  .\" HTML <a href="pcreapi.html#extradata">  against this.
28  .\" </a>  .P
29  section on extra data for \fBpcre_exec()\fP  The comments that follow do NOT apply to \fBpcre_dfa_exec()\fP; they are
30  .\"  relevant only for \fBpcre_exec()\fP.
31  in the  .
32  .\" HREF  .
33  \fBpcreapi\fP  .SS "Reducing \fBpcre_exec()\fP's stack usage"
34  .\"  .rs
35  documentation.  .sp
 .P  
36  Each time that \fBmatch()\fP is actually called recursively, it uses memory  Each time that \fBmatch()\fP is actually called recursively, it uses memory
37  from the process stack. For certain kinds of pattern and data, very large  from the process stack. For certain kinds of pattern and data, very large
38  amounts of stack may be needed, despite the recognition of "tail recursion".  amounts of stack may be needed, despite the recognition of "tail recursion".
# Line 52  frame for each matched character. For a Line 51  frame for each matched character. For a
51  required. Consider now this rewritten pattern, which matches exactly the same  required. Consider now this rewritten pattern, which matches exactly the same
52  strings:  strings:
53  .sp  .sp
54    ([^<]++|<(?!inet))    ([^<]++|<(?!inet))+
55  .sp  .sp
56  This uses very much less stack, because runs of characters that do not contain  This uses very much less stack, because runs of characters that do not contain
57  "<" are "swallowed" in one item inside the parentheses. Recursion happens only  "<" are "swallowed" in one item inside the parentheses. Recursion happens only
# Line 61  assume this is relatively rare). A posse Line 60  assume this is relatively rare). A posse
60  backtracking into the runs of non-"<" characters, but that is not related to  backtracking into the runs of non-"<" characters, but that is not related to
61  stack usage.  stack usage.
62  .P  .P
63    This example shows that one way of avoiding stack problems when matching long
64    subject strings is to write repeated parenthesized subpatterns to match more
65    than one character whenever possible.
66    .
67    .
68    .SS "Compiling PCRE to use heap instead of stack for \fBpcre_exec()\fP"
69    .rs
70    .sp
71  In environments where stack memory is constrained, you might want to compile  In environments where stack memory is constrained, you might want to compile
72  PCRE to use heap memory instead of stack for remembering back-up points. This  PCRE to use heap memory instead of stack for remembering back-up points when
73  makes it run a lot more slowly, however. Details of how to do this are given in  \fBpcre_exec()\fP is running. This makes it run a lot more slowly, however.
74  the  Details of how to do this are given in the
75  .\" HREF  .\" HREF
76  \fBpcrebuild\fP  \fBpcrebuild\fP
77  .\"  .\"
78    documentation. When built in this way, instead of using the stack, PCRE obtains
79    and frees memory by calling the functions that are pointed to by the
80    \fBpcre_stack_malloc\fP and \fBpcre_stack_free\fP variables. By default, these
81    point to \fBmalloc()\fP and \fBfree()\fP, but you can replace the pointers to
82    cause PCRE to use your own functions. Since the block sizes are always the
83    same, and are always freed in reverse order, it may be possible to implement
84    customized memory handlers that are more efficient than the standard functions.
85    .
86    .
87    .SS "Limiting \fBpcre_exec()\fP's stack usage"
88    .rs
89    .sp
90    You can set limits on the number of times that \fBmatch()\fP is called, both in
91    total and recursively. If a limit is exceeded, \fBpcre_exec()\fP returns an
92    error code. Setting suitable limits should prevent it from running out of
93    stack. The default values of the limits are very large, and unlikely ever to
94    operate. They can be changed when PCRE is built, and they can also be set when
95    \fBpcre_exec()\fP is called. For details of these interfaces, see the
96    .\" HREF
97    \fBpcrebuild\fP
98    .\"
99    documentation and the
100    .\" HTML <a href="pcreapi.html#extradata">
101    .\" </a>
102    section on extra data for \fBpcre_exec()\fP
103    .\"
104    in the
105    .\" HREF
106    \fBpcreapi\fP
107    .\"
108  documentation.  documentation.
109  .P  .P
110  In Unix-like environments, there is not often a problem with the stack, though  As a very rough rule of thumb, you should reckon on about 500 bytes per
111  the default limit on stack size varies from system to system. Values from 8Mb  recursion. Thus, if you want to limit your stack usage to 8Mb, you
112  to 64Mb are common. You can find your default limit by running the command:  should set the limit at 16000 recursions. A 64Mb stack, on the other hand, can
113    support around 128000 recursions.
114    .P
115    In Unix-like environments, the \fBpcretest\fP test program has a command line
116    option (\fB-S\fP) that can be used to increase the size of its stack. As long
117    as the stack is large enough, another option (\fB-M\fP) can be used to find the
118    smallest limits that allow a particular pattern to match a given subject
119    string. This is done by calling \fBpcre_exec()\fP repeatedly with different
120    limits.
121    .
122    .
123    .SS "Changing stack size in Unix-like systems"
124    .rs
125    .sp
126    In Unix-like environments, there is not often a problem with the stack unless
127    very long strings are involved, though the default limit on stack size varies
128    from system to system. Values from 8Mb to 64Mb are common. You can find your
129    default limit by running the command:
130  .sp  .sp
131    ulimit -s    ulimit -s
132  .sp  .sp
133  The effect of running out of stack is often SIGSEGV, though sometimes an error  Unfortunately, the effect of running out of stack is often SIGSEGV, though
134  message is given. You can normally increase the limit on stack size by code  sometimes a more explicit error message is given. You can normally increase the
135  such as this:  limit on stack size by code such as this:
136  .sp  .sp
137    struct rlimit rlim;    struct rlimit rlim;
138    getrlimit(RLIMIT_STACK, &rlim);    getrlimit(RLIMIT_STACK, &rlim);
# Line 88  such as this: Line 142  such as this:
142  This reads the current limits (soft and hard) using \fBgetrlimit()\fP, then  This reads the current limits (soft and hard) using \fBgetrlimit()\fP, then
143  attempts to increase the soft limit to 100Mb using \fBsetrlimit()\fP. You must  attempts to increase the soft limit to 100Mb using \fBsetrlimit()\fP. You must
144  do this before calling \fBpcre_exec()\fP.  do this before calling \fBpcre_exec()\fP.
145  .P  .
146  PCRE has an internal counter that can be used to limit the depth of recursion,  .
147  and thus cause \fBpcre_exec()\fP to give an error code before it runs out of  .SS "Changing stack size in Mac OS X"
148  stack. By default, the limit is very large, and unlikely ever to operate. It  .rs
149  can be changed when PCRE is built, and it can also be set when  .sp
150  \fBpcre_exec()\fP is called. For details of these interfaces, see the  Using \fBsetrlimit()\fP, as described above, should also work on Mac OS X. It
151  .\" HREF  is also possible to set a stack size when linking a program. There is a
152  \fBpcrebuild\fP  discussion about stack sizes in Mac OS X at this web site:
153  .\"  .\" HTML <a href="http://developer.apple.com/qa/qa2005/qa1419.html">
154  and  .\" </a>
155  .\" HREF  http://developer.apple.com/qa/qa2005/qa1419.html.
 \fBpcreapi\fP  
156  .\"  .\"
157  documentation.  .
158  .P  .
159  As a very rough rule of thumb, you should reckon on about 500 bytes per  .SH AUTHOR
160  recursion. Thus, if you want to limit your stack usage to 8Mb, you  .rs
161  should set the limit at 16000 recursions. A 64Mb stack, on the other hand, can  .sp
162  support around 128000 recursions. The \fBpcretest\fP test program has a command  .nf
163  line option (\fB-S\fP) that can be used to increase its stack.  Philip Hazel
164  .P  University Computing Service
165  .in 0  Cambridge CB2 3QH, England.
166  Last updated: 29 June 2006  .fi
167  .br  .
168  Copyright (c) 1997-2006 University of Cambridge.  .
169    .SH REVISION
170    .rs
171    .sp
172    .nf
173    Last updated: 03 January 2010
174    Copyright (c) 1997-2010 University of Cambridge.
175    .fi

Legend:
Removed from v.91  
changed lines
  Added in v.487

webmaster@exim.org
ViewVC Help
Powered by ViewVC 1.1.12