mirror of
https://github.com/tildeclub/ex-vi.git
synced 2026-01-24 02:10:19 +00:00
files from last release. ex-050325
This commit is contained in:
504
libuxre/COPYING.LGPL
Normal file
504
libuxre/COPYING.LGPL
Normal file
@@ -0,0 +1,504 @@
|
||||
GNU LESSER GENERAL PUBLIC LICENSE
|
||||
Version 2.1, February 1999
|
||||
|
||||
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
|
||||
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
[This is the first released version of the Lesser GPL. It also counts
|
||||
as the successor of the GNU Library Public License, version 2, hence
|
||||
the version number 2.1.]
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
Licenses are intended to guarantee your freedom to share and change
|
||||
free software--to make sure the software is free for all its users.
|
||||
|
||||
This license, the Lesser General Public License, applies to some
|
||||
specially designated software packages--typically libraries--of the
|
||||
Free Software Foundation and other authors who decide to use it. You
|
||||
can use it too, but we suggest you first think carefully about whether
|
||||
this license or the ordinary General Public License is the better
|
||||
strategy to use in any particular case, based on the explanations below.
|
||||
|
||||
When we speak of free software, we are referring to freedom of use,
|
||||
not price. Our General Public Licenses are designed to make sure that
|
||||
you have the freedom to distribute copies of free software (and charge
|
||||
for this service if you wish); that you receive source code or can get
|
||||
it if you want it; that you can change the software and use pieces of
|
||||
it in new free programs; and that you are informed that you can do
|
||||
these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
distributors to deny you these rights or to ask you to surrender these
|
||||
rights. These restrictions translate to certain responsibilities for
|
||||
you if you distribute copies of the library or if you modify it.
|
||||
|
||||
For example, if you distribute copies of the library, whether gratis
|
||||
or for a fee, you must give the recipients all the rights that we gave
|
||||
you. You must make sure that they, too, receive or can get the source
|
||||
code. If you link other code with the library, you must provide
|
||||
complete object files to the recipients, so that they can relink them
|
||||
with the library after making changes to the library and recompiling
|
||||
it. And you must show them these terms so they know their rights.
|
||||
|
||||
We protect your rights with a two-step method: (1) we copyright the
|
||||
library, and (2) we offer you this license, which gives you legal
|
||||
permission to copy, distribute and/or modify the library.
|
||||
|
||||
To protect each distributor, we want to make it very clear that
|
||||
there is no warranty for the free library. Also, if the library is
|
||||
modified by someone else and passed on, the recipients should know
|
||||
that what they have is not the original version, so that the original
|
||||
author's reputation will not be affected by problems that might be
|
||||
introduced by others.
|
||||
|
||||
Finally, software patents pose a constant threat to the existence of
|
||||
any free program. We wish to make sure that a company cannot
|
||||
effectively restrict the users of a free program by obtaining a
|
||||
restrictive license from a patent holder. Therefore, we insist that
|
||||
any patent license obtained for a version of the library must be
|
||||
consistent with the full freedom of use specified in this license.
|
||||
|
||||
Most GNU software, including some libraries, is covered by the
|
||||
ordinary GNU General Public License. This license, the GNU Lesser
|
||||
General Public License, applies to certain designated libraries, and
|
||||
is quite different from the ordinary General Public License. We use
|
||||
this license for certain libraries in order to permit linking those
|
||||
libraries into non-free programs.
|
||||
|
||||
When a program is linked with a library, whether statically or using
|
||||
a shared library, the combination of the two is legally speaking a
|
||||
combined work, a derivative of the original library. The ordinary
|
||||
General Public License therefore permits such linking only if the
|
||||
entire combination fits its criteria of freedom. The Lesser General
|
||||
Public License permits more lax criteria for linking other code with
|
||||
the library.
|
||||
|
||||
We call this license the "Lesser" General Public License because it
|
||||
does Less to protect the user's freedom than the ordinary General
|
||||
Public License. It also provides other free software developers Less
|
||||
of an advantage over competing non-free programs. These disadvantages
|
||||
are the reason we use the ordinary General Public License for many
|
||||
libraries. However, the Lesser license provides advantages in certain
|
||||
special circumstances.
|
||||
|
||||
For example, on rare occasions, there may be a special need to
|
||||
encourage the widest possible use of a certain library, so that it becomes
|
||||
a de-facto standard. To achieve this, non-free programs must be
|
||||
allowed to use the library. A more frequent case is that a free
|
||||
library does the same job as widely used non-free libraries. In this
|
||||
case, there is little to gain by limiting the free library to free
|
||||
software only, so we use the Lesser General Public License.
|
||||
|
||||
In other cases, permission to use a particular library in non-free
|
||||
programs enables a greater number of people to use a large body of
|
||||
free software. For example, permission to use the GNU C Library in
|
||||
non-free programs enables many more people to use the whole GNU
|
||||
operating system, as well as its variant, the GNU/Linux operating
|
||||
system.
|
||||
|
||||
Although the Lesser General Public License is Less protective of the
|
||||
users' freedom, it does ensure that the user of a program that is
|
||||
linked with the Library has the freedom and the wherewithal to run
|
||||
that program using a modified version of the Library.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow. Pay close attention to the difference between a
|
||||
"work based on the library" and a "work that uses the library". The
|
||||
former contains code derived from the library, whereas the latter must
|
||||
be combined with the library in order to run.
|
||||
|
||||
GNU LESSER GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License Agreement applies to any software library or other
|
||||
program which contains a notice placed by the copyright holder or
|
||||
other authorized party saying it may be distributed under the terms of
|
||||
this Lesser General Public License (also called "this License").
|
||||
Each licensee is addressed as "you".
|
||||
|
||||
A "library" means a collection of software functions and/or data
|
||||
prepared so as to be conveniently linked with application programs
|
||||
(which use some of those functions and data) to form executables.
|
||||
|
||||
The "Library", below, refers to any such software library or work
|
||||
which has been distributed under these terms. A "work based on the
|
||||
Library" means either the Library or any derivative work under
|
||||
copyright law: that is to say, a work containing the Library or a
|
||||
portion of it, either verbatim or with modifications and/or translated
|
||||
straightforwardly into another language. (Hereinafter, translation is
|
||||
included without limitation in the term "modification".)
|
||||
|
||||
"Source code" for a work means the preferred form of the work for
|
||||
making modifications to it. For a library, complete source code means
|
||||
all the source code for all modules it contains, plus any associated
|
||||
interface definition files, plus the scripts used to control compilation
|
||||
and installation of the library.
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running a program using the Library is not restricted, and output from
|
||||
such a program is covered only if its contents constitute a work based
|
||||
on the Library (independent of the use of the Library in a tool for
|
||||
writing it). Whether that is true depends on what the Library does
|
||||
and what the program that uses the Library does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Library's
|
||||
complete source code as you receive it, in any medium, provided that
|
||||
you conspicuously and appropriately publish on each copy an
|
||||
appropriate copyright notice and disclaimer of warranty; keep intact
|
||||
all the notices that refer to this License and to the absence of any
|
||||
warranty; and distribute a copy of this License along with the
|
||||
Library.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy,
|
||||
and you may at your option offer warranty protection in exchange for a
|
||||
fee.
|
||||
|
||||
2. You may modify your copy or copies of the Library or any portion
|
||||
of it, thus forming a work based on the Library, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) The modified work must itself be a software library.
|
||||
|
||||
b) You must cause the files modified to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
c) You must cause the whole of the work to be licensed at no
|
||||
charge to all third parties under the terms of this License.
|
||||
|
||||
d) If a facility in the modified Library refers to a function or a
|
||||
table of data to be supplied by an application program that uses
|
||||
the facility, other than as an argument passed when the facility
|
||||
is invoked, then you must make a good faith effort to ensure that,
|
||||
in the event an application does not supply such function or
|
||||
table, the facility still operates, and performs whatever part of
|
||||
its purpose remains meaningful.
|
||||
|
||||
(For example, a function in a library to compute square roots has
|
||||
a purpose that is entirely well-defined independent of the
|
||||
application. Therefore, Subsection 2d requires that any
|
||||
application-supplied function or table used by this function must
|
||||
be optional: if the application does not supply it, the square
|
||||
root function must still compute square roots.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Library,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Library, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote
|
||||
it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Library.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Library
|
||||
with the Library (or with a work based on the Library) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may opt to apply the terms of the ordinary GNU General Public
|
||||
License instead of this License to a given copy of the Library. To do
|
||||
this, you must alter all the notices that refer to this License, so
|
||||
that they refer to the ordinary GNU General Public License, version 2,
|
||||
instead of to this License. (If a newer version than version 2 of the
|
||||
ordinary GNU General Public License has appeared, then you can specify
|
||||
that version instead if you wish.) Do not make any other change in
|
||||
these notices.
|
||||
|
||||
Once this change is made in a given copy, it is irreversible for
|
||||
that copy, so the ordinary GNU General Public License applies to all
|
||||
subsequent copies and derivative works made from that copy.
|
||||
|
||||
This option is useful when you wish to copy part of the code of
|
||||
the Library into a program that is not a library.
|
||||
|
||||
4. You may copy and distribute the Library (or a portion or
|
||||
derivative of it, under Section 2) in object code or executable form
|
||||
under the terms of Sections 1 and 2 above provided that you accompany
|
||||
it with the complete corresponding machine-readable source code, which
|
||||
must be distributed under the terms of Sections 1 and 2 above on a
|
||||
medium customarily used for software interchange.
|
||||
|
||||
If distribution of object code is made by offering access to copy
|
||||
from a designated place, then offering equivalent access to copy the
|
||||
source code from the same place satisfies the requirement to
|
||||
distribute the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
5. A program that contains no derivative of any portion of the
|
||||
Library, but is designed to work with the Library by being compiled or
|
||||
linked with it, is called a "work that uses the Library". Such a
|
||||
work, in isolation, is not a derivative work of the Library, and
|
||||
therefore falls outside the scope of this License.
|
||||
|
||||
However, linking a "work that uses the Library" with the Library
|
||||
creates an executable that is a derivative of the Library (because it
|
||||
contains portions of the Library), rather than a "work that uses the
|
||||
library". The executable is therefore covered by this License.
|
||||
Section 6 states terms for distribution of such executables.
|
||||
|
||||
When a "work that uses the Library" uses material from a header file
|
||||
that is part of the Library, the object code for the work may be a
|
||||
derivative work of the Library even though the source code is not.
|
||||
Whether this is true is especially significant if the work can be
|
||||
linked without the Library, or if the work is itself a library. The
|
||||
threshold for this to be true is not precisely defined by law.
|
||||
|
||||
If such an object file uses only numerical parameters, data
|
||||
structure layouts and accessors, and small macros and small inline
|
||||
functions (ten lines or less in length), then the use of the object
|
||||
file is unrestricted, regardless of whether it is legally a derivative
|
||||
work. (Executables containing this object code plus portions of the
|
||||
Library will still fall under Section 6.)
|
||||
|
||||
Otherwise, if the work is a derivative of the Library, you may
|
||||
distribute the object code for the work under the terms of Section 6.
|
||||
Any executables containing that work also fall under Section 6,
|
||||
whether or not they are linked directly with the Library itself.
|
||||
|
||||
6. As an exception to the Sections above, you may also combine or
|
||||
link a "work that uses the Library" with the Library to produce a
|
||||
work containing portions of the Library, and distribute that work
|
||||
under terms of your choice, provided that the terms permit
|
||||
modification of the work for the customer's own use and reverse
|
||||
engineering for debugging such modifications.
|
||||
|
||||
You must give prominent notice with each copy of the work that the
|
||||
Library is used in it and that the Library and its use are covered by
|
||||
this License. You must supply a copy of this License. If the work
|
||||
during execution displays copyright notices, you must include the
|
||||
copyright notice for the Library among them, as well as a reference
|
||||
directing the user to the copy of this License. Also, you must do one
|
||||
of these things:
|
||||
|
||||
a) Accompany the work with the complete corresponding
|
||||
machine-readable source code for the Library including whatever
|
||||
changes were used in the work (which must be distributed under
|
||||
Sections 1 and 2 above); and, if the work is an executable linked
|
||||
with the Library, with the complete machine-readable "work that
|
||||
uses the Library", as object code and/or source code, so that the
|
||||
user can modify the Library and then relink to produce a modified
|
||||
executable containing the modified Library. (It is understood
|
||||
that the user who changes the contents of definitions files in the
|
||||
Library will not necessarily be able to recompile the application
|
||||
to use the modified definitions.)
|
||||
|
||||
b) Use a suitable shared library mechanism for linking with the
|
||||
Library. A suitable mechanism is one that (1) uses at run time a
|
||||
copy of the library already present on the user's computer system,
|
||||
rather than copying library functions into the executable, and (2)
|
||||
will operate properly with a modified version of the library, if
|
||||
the user installs one, as long as the modified version is
|
||||
interface-compatible with the version that the work was made with.
|
||||
|
||||
c) Accompany the work with a written offer, valid for at
|
||||
least three years, to give the same user the materials
|
||||
specified in Subsection 6a, above, for a charge no more
|
||||
than the cost of performing this distribution.
|
||||
|
||||
d) If distribution of the work is made by offering access to copy
|
||||
from a designated place, offer equivalent access to copy the above
|
||||
specified materials from the same place.
|
||||
|
||||
e) Verify that the user has already received a copy of these
|
||||
materials or that you have already sent this user a copy.
|
||||
|
||||
For an executable, the required form of the "work that uses the
|
||||
Library" must include any data and utility programs needed for
|
||||
reproducing the executable from it. However, as a special exception,
|
||||
the materials to be distributed need not include anything that is
|
||||
normally distributed (in either source or binary form) with the major
|
||||
components (compiler, kernel, and so on) of the operating system on
|
||||
which the executable runs, unless that component itself accompanies
|
||||
the executable.
|
||||
|
||||
It may happen that this requirement contradicts the license
|
||||
restrictions of other proprietary libraries that do not normally
|
||||
accompany the operating system. Such a contradiction means you cannot
|
||||
use both them and the Library together in an executable that you
|
||||
distribute.
|
||||
|
||||
7. You may place library facilities that are a work based on the
|
||||
Library side-by-side in a single library together with other library
|
||||
facilities not covered by this License, and distribute such a combined
|
||||
library, provided that the separate distribution of the work based on
|
||||
the Library and of the other library facilities is otherwise
|
||||
permitted, and provided that you do these two things:
|
||||
|
||||
a) Accompany the combined library with a copy of the same work
|
||||
based on the Library, uncombined with any other library
|
||||
facilities. This must be distributed under the terms of the
|
||||
Sections above.
|
||||
|
||||
b) Give prominent notice with the combined library of the fact
|
||||
that part of it is a work based on the Library, and explaining
|
||||
where to find the accompanying uncombined form of the same work.
|
||||
|
||||
8. You may not copy, modify, sublicense, link with, or distribute
|
||||
the Library except as expressly provided under this License. Any
|
||||
attempt otherwise to copy, modify, sublicense, link with, or
|
||||
distribute the Library is void, and will automatically terminate your
|
||||
rights under this License. However, parties who have received copies,
|
||||
or rights, from you under this License will not have their licenses
|
||||
terminated so long as such parties remain in full compliance.
|
||||
|
||||
9. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Library or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Library (or any work based on the
|
||||
Library), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Library or works based on it.
|
||||
|
||||
10. Each time you redistribute the Library (or any work based on the
|
||||
Library), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute, link with or modify the Library
|
||||
subject to these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties with
|
||||
this License.
|
||||
|
||||
11. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Library at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Library by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Library.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under any
|
||||
particular circumstance, the balance of the section is intended to apply,
|
||||
and the section as a whole is intended to apply in other circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
12. If the distribution and/or use of the Library is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Library under this License may add
|
||||
an explicit geographical distribution limitation excluding those countries,
|
||||
so that distribution is permitted only in or among countries not thus
|
||||
excluded. In such case, this License incorporates the limitation as if
|
||||
written in the body of this License.
|
||||
|
||||
13. The Free Software Foundation may publish revised and/or new
|
||||
versions of the Lesser General Public License from time to time.
|
||||
Such new versions will be similar in spirit to the present version,
|
||||
but may differ in detail to address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Library
|
||||
specifies a version number of this License which applies to it and
|
||||
"any later version", you have the option of following the terms and
|
||||
conditions either of that version or of any later version published by
|
||||
the Free Software Foundation. If the Library does not specify a
|
||||
license version number, you may choose any version ever published by
|
||||
the Free Software Foundation.
|
||||
|
||||
14. If you wish to incorporate parts of the Library into other free
|
||||
programs whose distribution conditions are incompatible with these,
|
||||
write to the author to ask for permission. For software which is
|
||||
copyrighted by the Free Software Foundation, write to the Free
|
||||
Software Foundation; we sometimes make exceptions for this. Our
|
||||
decision will be guided by the two goals of preserving the free status
|
||||
of all derivatives of our free software and of promoting the sharing
|
||||
and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
|
||||
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
|
||||
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
|
||||
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
|
||||
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
|
||||
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
|
||||
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
|
||||
|
||||
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
|
||||
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
|
||||
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
|
||||
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
|
||||
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
|
||||
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
|
||||
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
|
||||
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
|
||||
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
|
||||
DAMAGES.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
How to Apply These Terms to Your New Libraries
|
||||
|
||||
If you develop a new library, and you want it to be of the greatest
|
||||
possible use to the public, we recommend making it free software that
|
||||
everyone can redistribute and change. You can do so by permitting
|
||||
redistribution under these terms (or, alternatively, under the terms of the
|
||||
ordinary General Public License).
|
||||
|
||||
To apply these terms, attach the following notices to the library. It is
|
||||
safest to attach them to the start of each source file to most effectively
|
||||
convey the exclusion of warranty; and each file should have at least the
|
||||
"copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
<one line to give the library's name and a brief idea of what it does.>
|
||||
Copyright (C) <year> <name of author>
|
||||
|
||||
This library is free software; you can redistribute it and/or
|
||||
modify it under the terms of the GNU Lesser General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2.1 of the License, or (at your option) any later version.
|
||||
|
||||
This library is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
Lesser General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU Lesser General Public
|
||||
License along with this library; if not, write to the Free Software
|
||||
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
You should also get your employer (if you work as a programmer) or your
|
||||
school, if any, to sign a "copyright disclaimer" for the library, if
|
||||
necessary. Here is a sample; alter the names:
|
||||
|
||||
Yoyodyne, Inc., hereby disclaims all copyright interest in the
|
||||
library `Frob' (a library for tweaking knobs) written by James Random Hacker.
|
||||
|
||||
<signature of Ty Coon>, 1 April 1990
|
||||
Ty Coon, President of Vice
|
||||
|
||||
That's all there is to it!
|
||||
|
||||
|
||||
12
libuxre/Makefile
Normal file
12
libuxre/Makefile
Normal file
@@ -0,0 +1,12 @@
|
||||
CFLAGS = $(COPT) $(RPMCFLAGS) -I.
|
||||
OBJS = bracket.o _collelem.o _collmult.o regcomp.o regdfa.o regerror.o regexec.o regfree.o regnfa.o regparse.o stubs.o
|
||||
|
||||
.c.o: ; $(CC) $(CFLAGS) -c $<
|
||||
|
||||
all: libuxre.a
|
||||
|
||||
libuxre.a: $(OBJS)
|
||||
ar cr libuxre.a $(OBJS)
|
||||
|
||||
clean:
|
||||
rm -f libuxre.a $(OBJS) core
|
||||
14
libuxre/NOTES
Normal file
14
libuxre/NOTES
Normal file
@@ -0,0 +1,14 @@
|
||||
Notes for the modified 'UNIX(R) Regular Expression Library'
|
||||
============================================================
|
||||
|
||||
The code this is based on was released by Caldera as 'osutils-0.1a'
|
||||
and is available at <http://unixtools.sourceforge.net/>. Notable
|
||||
changes include:
|
||||
|
||||
- Support for multibyte characters was enabled again.
|
||||
- Support for traditional extended regular expression syntax was added.
|
||||
- Fix: With REG_ICASE, [B-z] matches 'A', 'a', and '[' according to
|
||||
POSIX.2.
|
||||
- Some speed improvements.
|
||||
|
||||
Gunnar Ritter 9/22/03
|
||||
119
libuxre/_collelem.c
Normal file
119
libuxre/_collelem.c
Normal file
@@ -0,0 +1,119 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)_collelem.c 1.4 (gritter) 10/18/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include "colldata.h"
|
||||
#include <stddef.h>
|
||||
|
||||
#define CCE(p) ((const CollElem *)(p))
|
||||
#define CCM(p) ((const CollMult *)(p))
|
||||
|
||||
LIBUXRE_STATIC const CollElem *
|
||||
libuxre_collelem(struct lc_collate *col, CollElem *spare, wchar_t wc)
|
||||
{
|
||||
const char *tbl;
|
||||
size_t hi, lo, cur;
|
||||
const CollMult *cmp;
|
||||
const CollElem *cep;
|
||||
long diff;
|
||||
int sz;
|
||||
|
||||
/*
|
||||
* ELEM_ENCODED is returned when the collation is entirely
|
||||
* based on the encoded value of the character.
|
||||
*/
|
||||
if (col == 0 || col->flags & CHF_ENCODED
|
||||
|| (tbl = (const char *)col->maintbl) == 0)
|
||||
{
|
||||
return ELEM_ENCODED;
|
||||
}
|
||||
if ((wuchar_type)wc <= UCHAR_MAX)
|
||||
{
|
||||
indexed:;
|
||||
cep = CCE(&tbl[(wuchar_type)wc * col->elemsize]);
|
||||
if (cep->weight[0] == WGHT_SPECIAL)
|
||||
return ELEM_BADCHAR;
|
||||
return cep;
|
||||
}
|
||||
if (col->flags & CHF_INDEXED)
|
||||
{
|
||||
if ((wuchar_type)wc >= col->nmain)
|
||||
return ELEM_BADCHAR;
|
||||
goto indexed;
|
||||
}
|
||||
/*
|
||||
* Binary search for a match. Could speed up the search if
|
||||
* some interpolation was used, but keep it simple for now.
|
||||
* Note that this is actually a table of CollMult's.
|
||||
*
|
||||
* To save space in the file, sequences of similar elements
|
||||
* are sometimes compressed into a single CollMult that
|
||||
* describes many entries. This is denoted by a subnbeg
|
||||
* with the SUBN_SPECIAL bit set. The rest of the bits give
|
||||
* the range covered by this entry.
|
||||
*/
|
||||
sz = col->elemsize + (sizeof(CollMult) - sizeof(CollElem));
|
||||
tbl += (1 + UCHAR_MAX) * col->elemsize;
|
||||
lo = 0;
|
||||
hi = col->nmain - UCHAR_MAX;
|
||||
while (lo < hi)
|
||||
{
|
||||
if ((cur = (hi + lo) >> 1) < lo) /* hi+lo overflowed */
|
||||
cur |= ~(~(size_t)0 >> 1); /* lost high order bit */
|
||||
cmp = CCM(&tbl[cur * sz]);
|
||||
if ((diff = wc - cmp->ch) < 0)
|
||||
hi = cur;
|
||||
else if (cmp->elem.subnbeg & SUBN_SPECIAL)
|
||||
{
|
||||
if (diff > (long)(cmp->elem.subnbeg & ~SUBN_SPECIAL))
|
||||
lo = cur + 1;
|
||||
else /* create an entry from the sequence in spare */
|
||||
{
|
||||
spare->multbeg = cmp->elem.multbeg;
|
||||
spare->subnbeg = 0;
|
||||
spare->weight[0] = cmp->elem.weight[0] + diff;
|
||||
for (lo = 1; lo < col->nweight; lo++)
|
||||
{
|
||||
wuchar_type w;
|
||||
|
||||
if ((w = cmp->elem.weight[lo])
|
||||
== WGHT_SPECIAL)
|
||||
{
|
||||
w = spare->weight[0];
|
||||
}
|
||||
spare->weight[lo] = w;
|
||||
}
|
||||
return spare;
|
||||
}
|
||||
}
|
||||
else if (diff == 0)
|
||||
return &cmp->elem;
|
||||
else
|
||||
lo = cur + 1;
|
||||
}
|
||||
return ELEM_BADCHAR;
|
||||
}
|
||||
55
libuxre/_collmult.c
Normal file
55
libuxre/_collmult.c
Normal file
@@ -0,0 +1,55 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)_collmult.c 1.4 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include "colldata.h"
|
||||
#include <stddef.h>
|
||||
|
||||
#define CCM(p) ((const CollMult *)(p))
|
||||
|
||||
LIBUXRE_STATIC const CollElem *
|
||||
libuxre_collmult(struct lc_collate *col, const CollElem *cep, wchar_t wc)
|
||||
{
|
||||
const char *tbl;
|
||||
size_t sz;
|
||||
w_type ch;
|
||||
|
||||
if (col == 0 || cep->multbeg == 0
|
||||
|| (tbl = (const char *)col->multtbl) == 0)
|
||||
{
|
||||
return ELEM_BADCHAR;
|
||||
}
|
||||
sz = col->elemsize + (sizeof(CollMult) - sizeof(CollElem));
|
||||
tbl += sz * cep->multbeg;
|
||||
while ((ch = CCM(tbl)->ch) != wc)
|
||||
{
|
||||
if (ch == 0)
|
||||
return ELEM_BADCHAR; /* end of list */
|
||||
tbl += sz;
|
||||
}
|
||||
return &CCM(tbl)->elem;
|
||||
}
|
||||
829
libuxre/bracket.c
Normal file
829
libuxre/bracket.c
Normal file
@@ -0,0 +1,829 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)bracket.c 1.14 (gritter) 10/18/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include <ctype.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include "re.h"
|
||||
|
||||
/*
|
||||
* Build and match the [...] part of REs.
|
||||
*
|
||||
* In general, each compiled bracket construct holds a set of mapped
|
||||
* wide character values and a set of character classifications.
|
||||
* The mapping applied (when the current LC_COLLATE is not CHF_ENCODED)
|
||||
* is the "basic" weight (cep->weight[0]); otherwise the actual wide
|
||||
* character is used.
|
||||
*
|
||||
* To support simplified range handling, this code assumes that a w_type,
|
||||
* a signed integer type, can hold all valid basic weight values (as well
|
||||
* as all wide character values for CHF_ENCODED locales) and that these
|
||||
* are all positive. Negative values indicate error conditions (BKT_*);
|
||||
* zero (which must be the same as WGHT_IGNORE) indicates success, but
|
||||
* that the item installed is not a range endpoint.
|
||||
*/
|
||||
|
||||
static int
|
||||
addwide(Bracket *bp, wchar_t ord)
|
||||
{
|
||||
unsigned int nw;
|
||||
|
||||
if ((nw = bp->nwide) < NWIDE)
|
||||
bp->wide[nw] = ord;
|
||||
else
|
||||
{
|
||||
if (nw % NWIDE == 0 && (bp->exwide =
|
||||
realloc(bp->exwide, nw * sizeof(wchar_t))) == 0)
|
||||
{
|
||||
return BKT_ESPACE;
|
||||
}
|
||||
nw -= NWIDE;
|
||||
bp->exwide[nw] = ord;
|
||||
}
|
||||
bp->nwide++;
|
||||
return 0;
|
||||
}
|
||||
|
||||
#if USHRT_MAX == 65535 /* have 16 bits */
|
||||
#define PLIND(n) ((n) >> 4)
|
||||
#define PLBIT(n) (1 << ((n) & 0xf))
|
||||
#else
|
||||
#define PLIND(n) ((n) / CHAR_BIT)
|
||||
#define PLBIT(n) (1 << ((n) % CHAR_BIT))
|
||||
#endif
|
||||
|
||||
#define RANGE ((wchar_t)'-') /* separates wide chars in ranges */
|
||||
|
||||
static int
|
||||
addrange(Bracket *bp, wchar_t ord, w_type prev)
|
||||
{
|
||||
int ret;
|
||||
|
||||
if (prev > 0 && prev != ord) /* try for range */
|
||||
{
|
||||
if (prev > ord)
|
||||
{
|
||||
if (bp->flags & BKT_ODDRANGE) /* prev only - done */
|
||||
return 0;
|
||||
else if ((bp->flags & BKT_BADRANGE) == 0)
|
||||
return BKT_ERANGE;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (++prev <= UCHAR_MAX) /* "prev" already there */
|
||||
{
|
||||
do
|
||||
{
|
||||
bp->byte[PLIND(prev)] |= PLBIT(prev);
|
||||
if (prev == ord)
|
||||
return 0;
|
||||
} while (++prev <= UCHAR_MAX);
|
||||
}
|
||||
if ((ret = addwide(bp, prev)) != 0)
|
||||
return ret;
|
||||
if (++prev > ord)
|
||||
return 0;
|
||||
if (prev < ord && (ret = addwide(bp, RANGE)) != 0)
|
||||
return ret;
|
||||
return addwide(bp, ord);
|
||||
}
|
||||
}
|
||||
if (ord <= UCHAR_MAX)
|
||||
{
|
||||
bp->byte[PLIND(ord)] |= PLBIT(ord);
|
||||
return 0;
|
||||
}
|
||||
if (prev == ord) /* don't bother */
|
||||
return 0;
|
||||
return addwide(bp, ord);
|
||||
}
|
||||
|
||||
static w_type
|
||||
place(Bracket *bp, wchar_t wc, w_type prev, int mb_cur_max)
|
||||
{
|
||||
const CollElem *cep;
|
||||
CollElem spare;
|
||||
int ret;
|
||||
|
||||
if ((cep = libuxre_collelem(bp->col, &spare, wc)) != ELEM_ENCODED)
|
||||
{
|
||||
if (cep == ELEM_BADCHAR)
|
||||
return BKT_BADCHAR;
|
||||
wc = cep->weight[0];
|
||||
}
|
||||
if ((ret = addrange(bp, wc, prev)) != 0)
|
||||
return ret;
|
||||
return wc;
|
||||
}
|
||||
|
||||
#ifndef CHARCLASS_NAME_MAX
|
||||
# define CHARCLASS_NAME_MAX 127
|
||||
#endif
|
||||
|
||||
static w_type
|
||||
chcls(Bracket *bp, const unsigned char *s, int n)
|
||||
{
|
||||
char clsstr[CHARCLASS_NAME_MAX + 1];
|
||||
unsigned int nt;
|
||||
wctype_t wct;
|
||||
|
||||
if (n > CHARCLASS_NAME_MAX)
|
||||
return BKT_ECTYPE;
|
||||
(void)memcpy(clsstr, s, n);
|
||||
clsstr[n] = '\0';
|
||||
if ((wct = wctype(clsstr)) == 0)
|
||||
return BKT_ECTYPE;
|
||||
if ((nt = bp->ntype) < NTYPE)
|
||||
bp->type[nt] = wct;
|
||||
else
|
||||
{
|
||||
if (nt % NTYPE == 0 && (bp->extype =
|
||||
realloc(bp->extype, nt * sizeof(wctype_t))) == 0)
|
||||
{
|
||||
return BKT_ESPACE;
|
||||
}
|
||||
nt -= NTYPE;
|
||||
bp->extype[nt] = wct;
|
||||
}
|
||||
bp->ntype++;
|
||||
return 0; /* cannot be end point of a range */
|
||||
}
|
||||
|
||||
/*
|
||||
* The purpose of mcce() and its Mcce structure is to locate
|
||||
* the next full collation element from "wc" and "s". It is
|
||||
* called both at compile and execute time. These two differ
|
||||
* primarily in that at compile time there is an exact number
|
||||
* of bytes to be consumed, while at execute time the longest
|
||||
* valid collation element is to be found.
|
||||
*
|
||||
* When BKT_ONECASE is set, MCCEs become particularly messy.
|
||||
* There is no guarantee that all possible combinations of
|
||||
* upper/lower case are defined as MCCEs. Thus, this code
|
||||
* tries both lower- and uppercase (in that order) for each
|
||||
* character than might be part of an MCCE.
|
||||
*/
|
||||
|
||||
typedef struct
|
||||
{
|
||||
const unsigned char *max; /* restriction by caller */
|
||||
const unsigned char *aft; /* longest successful */
|
||||
Bracket *bp; /* readonly */
|
||||
struct lc_collate *col; /* readonly */
|
||||
const CollElem *cep; /* entry matching longest */
|
||||
wchar_t ch; /* initial character (if any) */
|
||||
w_type wc; /* character matching "aft" */
|
||||
} Mcce;
|
||||
|
||||
static int
|
||||
mcce(Mcce *mcp, const CollElem *cep, const unsigned char *s, int mb_cur_max,
|
||||
int compile_time)
|
||||
{
|
||||
const CollElem *nxt;
|
||||
CollElem spare;
|
||||
w_type ch, wc;
|
||||
int i;
|
||||
|
||||
/*
|
||||
* Get next character.
|
||||
*/
|
||||
if ((wc = mcp->ch) != '\0')
|
||||
{
|
||||
mcp->ch = '\0';
|
||||
}
|
||||
else if (ISONEBYTE(wc = *s++))
|
||||
{
|
||||
if (wc == '\0')
|
||||
return 0;
|
||||
}
|
||||
else if ((i = libuxre_mb2wc(&wc, s)) > 0)
|
||||
{
|
||||
s += i;
|
||||
if (mcp->max != 0 && s > mcp->max)
|
||||
return 0;
|
||||
}
|
||||
else if (i < 0)
|
||||
return BKT_ILLSEQ;
|
||||
/*
|
||||
* Try out the this character as part of an MCCE.
|
||||
* If BKT_ONECASE is set, this code tries both the lower- and
|
||||
* uppercase version, continuing if it matches so far.
|
||||
*/
|
||||
ch = wc;
|
||||
if (mcp->bp->flags & BKT_ONECASE)
|
||||
{
|
||||
if ((wc = to_lower(wc)) == ch)
|
||||
ch = to_upper(wc);
|
||||
}
|
||||
for (;;) /* at most twice */
|
||||
{
|
||||
if (cep == ELEM_BADCHAR) /* first character */
|
||||
{
|
||||
if ((nxt = libuxre_collelem(mcp->col, &spare, wc))
|
||||
== ELEM_ENCODED
|
||||
|| (mcp->col->flags & CHF_MULTICH) == 0
|
||||
|| s == mcp->max)
|
||||
{
|
||||
mcp->aft = s;
|
||||
mcp->cep = nxt;
|
||||
mcp->wc = wc;
|
||||
break;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
nxt = libuxre_collmult(mcp->col, cep, wc);
|
||||
}
|
||||
if (nxt != ELEM_BADCHAR)
|
||||
{
|
||||
/*
|
||||
* Okay so far. Record this collating element
|
||||
* if it's really one (not WGHT_IGNORE) and
|
||||
* we've reached a new high point or it's the
|
||||
* first match.
|
||||
*
|
||||
* If there's a possibility for more, call mcce()
|
||||
* recursively for the subsequent characters.
|
||||
*/
|
||||
if (nxt->weight[0] != WGHT_IGNORE
|
||||
&& (mcp->aft < s || mcp->cep == ELEM_BADCHAR))
|
||||
{
|
||||
mcp->aft = s;
|
||||
mcp->cep = nxt;
|
||||
mcp->wc = wc;
|
||||
}
|
||||
if (nxt->multbeg != 0
|
||||
&& (mcp->max == 0 || s < mcp->max))
|
||||
{
|
||||
if ((i = mcce(mcp, nxt, s, mb_cur_max,
|
||||
compile_time)) != 0)
|
||||
return i;
|
||||
}
|
||||
}
|
||||
if (wc == ch)
|
||||
break;
|
||||
wc = ch;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static w_type
|
||||
eqcls(Bracket *bp, const unsigned char *s, int n, w_type prev, int mb_cur_max)
|
||||
{
|
||||
w_type last;
|
||||
Mcce mcbuf;
|
||||
int err;
|
||||
|
||||
mcbuf.max = &s[n];
|
||||
mcbuf.aft = &s[0];
|
||||
mcbuf.bp = bp;
|
||||
mcbuf.col = bp->col;
|
||||
mcbuf.cep = ELEM_BADCHAR;
|
||||
mcbuf.ch = '\0';
|
||||
if ((err = mcce(&mcbuf, ELEM_BADCHAR, s, mb_cur_max, 1)) != 0)
|
||||
return err;
|
||||
if (mcbuf.cep == ELEM_BADCHAR || mcbuf.aft != mcbuf.max)
|
||||
return BKT_EEQUIV;
|
||||
last = mcbuf.wc;
|
||||
if (mcbuf.cep != ELEM_ENCODED && mcbuf.col->nweight > 1)
|
||||
{
|
||||
const CollElem *cep;
|
||||
|
||||
/*
|
||||
* The first and last weight[0] values for equivalence
|
||||
* classes are stuffed into the terminator for the
|
||||
* multiple character lists. If these values are
|
||||
* scattered (elements that are not part of this
|
||||
* equivalence class have weight[0] values between the
|
||||
* two end points), then SUBN_SPECIAL is placed in
|
||||
* this terminator. Note that weight[1] of the
|
||||
* terminator must be other than WGHT_IGNORE, too.
|
||||
*/
|
||||
last = mcbuf.cep->weight[0];
|
||||
if ((cep = libuxre_collmult(bp->col, mcbuf.cep, 0))
|
||||
!= ELEM_BADCHAR
|
||||
&& cep->weight[1] != WGHT_IGNORE)
|
||||
{
|
||||
last = cep->weight[1];
|
||||
if (cep->subnbeg == SUBN_SPECIAL)
|
||||
{
|
||||
unsigned int nq;
|
||||
|
||||
/*
|
||||
* Permit ranges up to the first and
|
||||
* after the last.
|
||||
*/
|
||||
if (prev > 0 && prev != cep->weight[0]
|
||||
&& (prev = addrange(bp,
|
||||
cep->weight[0], prev)) != 0)
|
||||
{
|
||||
return prev;
|
||||
}
|
||||
/*
|
||||
* Record the equivalence class by storing
|
||||
* the primary weight.
|
||||
*/
|
||||
if ((nq = bp->nquiv) < NQUIV)
|
||||
bp->quiv[nq] = mcbuf.cep->weight[1];
|
||||
else
|
||||
{
|
||||
if (nq % NQUIV == 0 && (bp->exquiv =
|
||||
realloc(bp->exquiv,
|
||||
nq * sizeof(wuchar_type)))
|
||||
== 0)
|
||||
{
|
||||
return REG_ESPACE;
|
||||
}
|
||||
nq -= NQUIV;
|
||||
bp->exquiv[nq] = mcbuf.cep->weight[1];
|
||||
}
|
||||
bp->nquiv++;
|
||||
return last;
|
||||
}
|
||||
mcbuf.cep = cep;
|
||||
}
|
||||
mcbuf.wc = mcbuf.cep->weight[0];
|
||||
}
|
||||
/*
|
||||
* Determine range, if any, to install.
|
||||
*
|
||||
* If there's a pending low (prev > 0), then try to use it.
|
||||
*
|
||||
* Otherwise, try to use mcbuf.wc as the low end of the range.
|
||||
* Since addrange() assumes that the low point has already been
|
||||
* placed, we try to fool it by using a prev of one less than
|
||||
* mcbuf.wc. But, if that value would not look like a valid
|
||||
* low point of a range, we have to explicitly place mcbuf.wc.
|
||||
*/
|
||||
if (prev <= 0 && (prev = mcbuf.wc - 1) <= 0)
|
||||
{
|
||||
if ((prev = addrange(bp, mcbuf.wc, 0)) != 0)
|
||||
return prev;
|
||||
}
|
||||
if ((mcbuf.wc = addrange(bp, last, prev)) != 0)
|
||||
return mcbuf.wc;
|
||||
return last;
|
||||
}
|
||||
|
||||
static w_type
|
||||
clsym(Bracket *bp, const unsigned char *s, int n, w_type prev, int mb_cur_max)
|
||||
{
|
||||
Mcce mcbuf;
|
||||
int err;
|
||||
|
||||
mcbuf.max = &s[n];
|
||||
mcbuf.aft = &s[0];
|
||||
mcbuf.bp = bp;
|
||||
mcbuf.col = bp->col;
|
||||
mcbuf.cep = ELEM_BADCHAR;
|
||||
mcbuf.ch = '\0';
|
||||
if ((err = mcce(&mcbuf, ELEM_BADCHAR, s, mb_cur_max, 1)) != 0)
|
||||
return err;
|
||||
if (mcbuf.cep == ELEM_BADCHAR || mcbuf.aft != mcbuf.max)
|
||||
return BKT_ECOLLATE;
|
||||
if (mcbuf.cep != ELEM_ENCODED)
|
||||
mcbuf.wc = mcbuf.cep->weight[0];
|
||||
if ((err = addrange(bp, mcbuf.wc, prev)) != 0)
|
||||
return err;
|
||||
return mcbuf.wc;
|
||||
}
|
||||
|
||||
/*
|
||||
* Scans the rest of a bracket construction within a regular
|
||||
* expression and fills in a description for it.
|
||||
* The leading [ and the optional set complement indicator
|
||||
* were handled already by the caller.
|
||||
* Returns:
|
||||
* <0 error (a BKT_* value)
|
||||
* >0 success; equals how many bytes were scanned.
|
||||
*/
|
||||
LIBUXRE_STATIC int
|
||||
libuxre_bktmbcomp(Bracket *bp, const unsigned char *pat0,
|
||||
int flags, int mb_cur_max)
|
||||
{
|
||||
static const Bracket zero = {0};
|
||||
const unsigned char *pat = pat0;
|
||||
struct lc_collate *savecol;
|
||||
w_type n, wc, prev = 0;
|
||||
|
||||
/*
|
||||
* Set represented set to empty. Easiest to copy an empty
|
||||
* version over the caller's, (re)setting col and flags.
|
||||
*/
|
||||
savecol = bp->col;
|
||||
*bp = zero;
|
||||
bp->col = savecol;
|
||||
bp->flags = flags
|
||||
& (BKT_NEGATED | BKT_ONECASE | BKT_NOTNL | BKT_BADRANGE |
|
||||
BKT_ODDRANGE);
|
||||
/*
|
||||
* Handle optional "empty" brackets; typically only used
|
||||
* in combination with BKT_QUOTE or BKT_ESCAPE.
|
||||
*/
|
||||
if ((wc = *pat) == ']' && (flags & BKT_EMPTY) != 0)
|
||||
return 1;
|
||||
/*
|
||||
* Populate *bp.
|
||||
*/
|
||||
for (;; prev = n)
|
||||
{
|
||||
switch (wc)
|
||||
{
|
||||
case '\0':
|
||||
ebrack:;
|
||||
n = BKT_EBRACK;
|
||||
goto err;
|
||||
case '\n':
|
||||
if (flags & BKT_NLBAD)
|
||||
goto ebrack;
|
||||
goto regular;
|
||||
case '/':
|
||||
if (flags & BKT_SLASHBAD)
|
||||
goto ebrack;
|
||||
goto regular;
|
||||
case '\\':
|
||||
if ((flags & (BKT_ESCAPE | BKT_QUOTE
|
||||
| BKT_ESCNL | BKT_ESCSEQ)) == 0)
|
||||
{
|
||||
goto regular;
|
||||
}
|
||||
switch (wc = *++pat)
|
||||
{
|
||||
default:
|
||||
noesc:;
|
||||
if ((flags & BKT_ESCAPE) == 0)
|
||||
{
|
||||
wc = '\\';
|
||||
pat--;
|
||||
}
|
||||
break;
|
||||
case '\\':
|
||||
case ']':
|
||||
case '-':
|
||||
case '^':
|
||||
if ((flags & BKT_QUOTE) == 0)
|
||||
goto noesc;
|
||||
break;
|
||||
case 'a':
|
||||
if ((flags & BKT_ESCSEQ) == 0 ||
|
||||
(flags & BKT_OLDESC))
|
||||
goto noesc;
|
||||
wc = '\a';
|
||||
break;
|
||||
case 'b':
|
||||
if ((flags & BKT_ESCSEQ) == 0)
|
||||
goto noesc;
|
||||
wc = '\b';
|
||||
break;
|
||||
case 'f':
|
||||
if ((flags & BKT_ESCSEQ) == 0)
|
||||
goto noesc;
|
||||
wc = '\f';
|
||||
break;
|
||||
case 'n':
|
||||
if ((flags & (BKT_ESCSEQ | BKT_ESCNL)) == 0)
|
||||
goto noesc;
|
||||
wc = '\n';
|
||||
break;
|
||||
case 'r':
|
||||
if ((flags & BKT_ESCSEQ) == 0)
|
||||
goto noesc;
|
||||
wc = '\r';
|
||||
break;
|
||||
case 't':
|
||||
if ((flags & BKT_ESCSEQ) == 0)
|
||||
goto noesc;
|
||||
wc = '\t';
|
||||
break;
|
||||
case 'v':
|
||||
if ((flags & BKT_ESCSEQ) == 0 ||
|
||||
(flags & BKT_OLDESC))
|
||||
goto noesc;
|
||||
wc = '\v';
|
||||
break;
|
||||
case 'x':
|
||||
if ((flags & BKT_ESCSEQ) == 0 ||
|
||||
(flags & BKT_OLDESC))
|
||||
goto noesc;
|
||||
if (!isxdigit(wc = *++pat))
|
||||
{
|
||||
pat--;
|
||||
goto noesc;
|
||||
}
|
||||
/*
|
||||
* Take as many hex digits as possible,
|
||||
* ignoring overflows.
|
||||
* Any positive result is okay.
|
||||
*/
|
||||
n = 0;
|
||||
do
|
||||
{
|
||||
if (isdigit(wc))
|
||||
wc -= '0';
|
||||
else if (isupper(wc))
|
||||
wc -= 'A' + 10;
|
||||
else
|
||||
wc -= 'a' + 10;
|
||||
n <<= 4;
|
||||
n |= wc;
|
||||
} while (isxdigit(wc = *++pat));
|
||||
pat--;
|
||||
if ((wc = n) <= 0)
|
||||
{
|
||||
n = BKT_BADESC;
|
||||
goto err;
|
||||
}
|
||||
break;
|
||||
case '0':
|
||||
case '1':
|
||||
case '2':
|
||||
case '3':
|
||||
case '4':
|
||||
case '5':
|
||||
case '6':
|
||||
case '7':
|
||||
case '8':
|
||||
case '9':
|
||||
if ((flags & BKT_ESCSEQ) == 0 ||
|
||||
(flags & BKT_OLDESC))
|
||||
goto noesc;
|
||||
/*
|
||||
* For compatibility (w/awk),
|
||||
* permit "octal" 8 and 9.
|
||||
*/
|
||||
n = wc - '0';
|
||||
if ((wc = *++pat) >= '0' && wc <= '9')
|
||||
{
|
||||
n <<= 3;
|
||||
n += wc - '0';
|
||||
if ((wc = *++pat) >= '0' && wc <= '9')
|
||||
{
|
||||
n <<= 3;
|
||||
n += wc - '0';
|
||||
}
|
||||
}
|
||||
pat--;
|
||||
if ((wc = n) <= 0)
|
||||
{
|
||||
n = BKT_BADESC;
|
||||
goto err;
|
||||
}
|
||||
break;
|
||||
}
|
||||
goto regular;
|
||||
case '[':
|
||||
if (((wc = *++pat) == ':' || wc == '=' || wc == '.') &&
|
||||
(flags & BKT_NOI18N) == 0)
|
||||
{
|
||||
n = 0;
|
||||
while (*++pat != wc || pat[1] != ']')
|
||||
{
|
||||
if (*pat == '\0')
|
||||
{
|
||||
badpat:;
|
||||
n = BKT_BADPAT;
|
||||
goto err;
|
||||
}
|
||||
else if (*pat == '/')
|
||||
{
|
||||
if (flags & BKT_SLASHBAD)
|
||||
goto badpat;
|
||||
}
|
||||
else if (*pat == '\n')
|
||||
{
|
||||
if (flags & BKT_NLBAD)
|
||||
goto badpat;
|
||||
}
|
||||
n++;
|
||||
}
|
||||
if (n == 0)
|
||||
{
|
||||
n = BKT_EMPTYSUBBKT;
|
||||
goto err;
|
||||
}
|
||||
if (wc == ':')
|
||||
n = chcls(bp, &pat[-n], n);
|
||||
else if (wc == '=')
|
||||
n = eqcls(bp, &pat[-n], n, prev,
|
||||
mb_cur_max);
|
||||
else /* wc == '.' */
|
||||
n = clsym(bp, &pat[-n], n, prev,
|
||||
mb_cur_max);
|
||||
pat++;
|
||||
break;
|
||||
}
|
||||
wc = '[';
|
||||
pat--;
|
||||
goto regular;
|
||||
default:
|
||||
if (!ISONEBYTE(wc) &&
|
||||
(n = libuxre_mb2wc(&wc, pat + 1)) > 0)
|
||||
pat += n;
|
||||
regular:;
|
||||
n = place(bp, wc, prev, mb_cur_max);
|
||||
break;
|
||||
}
|
||||
if (n < 0) {
|
||||
n = BKT_ILLSEQ;
|
||||
goto err;
|
||||
}
|
||||
if ((wc = *++pat) == ']')
|
||||
break;
|
||||
if (wc == '-' && n != 0)
|
||||
{
|
||||
if (prev == 0 || (flags & BKT_SEPRANGE) == 0)
|
||||
{
|
||||
if ((wc = *++pat) != ']')
|
||||
continue; /* valid range */
|
||||
wc = '-';
|
||||
pat--;
|
||||
}
|
||||
}
|
||||
n = 0; /* no range this time */
|
||||
}
|
||||
return pat - pat0 + 1;
|
||||
err:;
|
||||
libuxre_bktfree(bp);
|
||||
return n;
|
||||
}
|
||||
|
||||
LIBUXRE_STATIC void
|
||||
libuxre_bktfree(Bracket *bp)
|
||||
{
|
||||
if (bp->extype != 0)
|
||||
free(bp->extype);
|
||||
if (bp->exquiv != 0)
|
||||
free(bp->exquiv);
|
||||
if (bp->exwide != 0)
|
||||
free(bp->exwide);
|
||||
}
|
||||
|
||||
LIBUXRE_STATIC int
|
||||
libuxre_bktmbexec(Bracket *bp, wchar_t wc,
|
||||
const unsigned char *str, int mb_cur_max)
|
||||
{
|
||||
unsigned int i;
|
||||
wchar_t lc, uc;
|
||||
Mcce mcbuf;
|
||||
|
||||
mcbuf.aft = str; /* in case of match in character classes */
|
||||
mcbuf.ch = wc;
|
||||
/*
|
||||
* First: check the single wc against any character classes.
|
||||
* Since multiple character collating elements are not part
|
||||
* of this world, they don't apply here.
|
||||
*/
|
||||
if ((i = bp->ntype) != 0)
|
||||
{
|
||||
wctype_t *wctp = &bp->type[0];
|
||||
|
||||
if (bp->flags & BKT_ONECASE)
|
||||
{
|
||||
if ((wc = to_lower(wc)) == mcbuf.ch)
|
||||
mcbuf.ch = to_upper(wc);
|
||||
}
|
||||
for (;;)
|
||||
{
|
||||
if (iswctype(mb_cur_max==1?btowc(wc):wc, *wctp))
|
||||
goto match;
|
||||
if (wc != mcbuf.ch &&
|
||||
iswctype(mb_cur_max==1?btowc(mcbuf.ch):mcbuf.ch,
|
||||
*wctp))
|
||||
goto match;
|
||||
if (--i == 0)
|
||||
break;
|
||||
if (++wctp == &bp->type[NTYPE])
|
||||
wctp = &bp->extype[0];
|
||||
}
|
||||
}
|
||||
/*
|
||||
* The main match is determined by the weight[0] value
|
||||
* of the character (or characters, if the input can be
|
||||
* taken as a multiple character collating element).
|
||||
*/
|
||||
mcbuf.max = 0;
|
||||
mcbuf.bp = bp;
|
||||
mcbuf.col = bp->col;
|
||||
mcbuf.cep = ELEM_BADCHAR;
|
||||
mcce(&mcbuf, ELEM_BADCHAR, str, mb_cur_max, 0);
|
||||
if (mcbuf.cep == ELEM_BADCHAR)
|
||||
return -1; /* never matches */
|
||||
if (mcbuf.cep != ELEM_ENCODED)
|
||||
mcbuf.wc = mcbuf.cep->weight[0];
|
||||
/*
|
||||
* POSIX.2 demands that both a character and its case counterpart
|
||||
* can match if REG_ICASE is set. This means that [B-z] matches
|
||||
* 'A', 'a', and '['.
|
||||
*/
|
||||
if (bp->flags & BKT_ONECASE)
|
||||
{
|
||||
lc = to_lower(mcbuf.wc);
|
||||
uc = to_upper(mcbuf.wc);
|
||||
}
|
||||
else
|
||||
lc = uc = mcbuf.wc;
|
||||
/*
|
||||
* See if it's in the set. Note that the list of true wide
|
||||
* character values has explicit ranges.
|
||||
*/
|
||||
if (mcbuf.wc <= UCHAR_MAX)
|
||||
{
|
||||
if (bp->byte[PLIND(lc)] & PLBIT(lc))
|
||||
goto match;
|
||||
if (lc != uc && (bp->byte[PLIND(uc)] & PLBIT(uc)))
|
||||
goto match;
|
||||
}
|
||||
else if ((i = bp->nwide) != 0)
|
||||
{
|
||||
wchar_t *wcp = &bp->wide[0];
|
||||
long lcmp, ucmp;
|
||||
|
||||
for (;;)
|
||||
{
|
||||
if ((lcmp = lc - *wcp) == 0)
|
||||
goto match;
|
||||
ucmp = uc - *wcp;
|
||||
if (lc != uc && ucmp == 0)
|
||||
goto match;
|
||||
if (--i == 0)
|
||||
break;
|
||||
if (++wcp == &bp->wide[NWIDE])
|
||||
wcp = &bp->exwide[0];
|
||||
if (*wcp == RANGE)
|
||||
{
|
||||
if (++wcp == &bp->wide[NWIDE])
|
||||
wcp = &bp->exwide[0];
|
||||
if (lcmp > 0 && lc <= *wcp)
|
||||
goto match;
|
||||
if (lc != uc && ucmp > 0 && uc < *wcp)
|
||||
goto match;
|
||||
if ((i -= 2) == 0)
|
||||
break;
|
||||
if (++wcp == &bp->wide[NWIDE])
|
||||
wcp = &bp->exwide[0];
|
||||
}
|
||||
}
|
||||
}
|
||||
/*
|
||||
* The last chance for a match is if an equivalence class
|
||||
* was specified for which the primary weights are scattered
|
||||
* through the weight[0]s.
|
||||
*/
|
||||
if ((i = bp->nquiv) != 0 && mcbuf.cep != ELEM_ENCODED)
|
||||
{
|
||||
wuchar_type *wucp = &bp->quiv[0];
|
||||
|
||||
mcbuf.wc = mcbuf.cep->weight[1];
|
||||
for (;;)
|
||||
{
|
||||
if (mcbuf.wc == *wucp)
|
||||
goto match;
|
||||
if (--i == 0)
|
||||
break;
|
||||
if (++wucp == &bp->quiv[NQUIV])
|
||||
wucp = &bp->exquiv[0];
|
||||
}
|
||||
}
|
||||
/*
|
||||
* Only here when no match against the set was found.
|
||||
* One final special case w/r/t newline.
|
||||
*/
|
||||
if (bp->flags & BKT_NEGATED)
|
||||
{
|
||||
if (wc != '\n' || (bp->flags & BKT_NOTNL) == 0)
|
||||
return mcbuf.aft - str;
|
||||
}
|
||||
return -1;
|
||||
match:;
|
||||
/*
|
||||
* Only here when a match against the described set is found.
|
||||
*/
|
||||
if (bp->flags & BKT_NEGATED)
|
||||
return -1;
|
||||
return mcbuf.aft - str;
|
||||
}
|
||||
226
libuxre/colldata.h
Normal file
226
libuxre/colldata.h
Normal file
@@ -0,0 +1,226 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)colldata.h 1.5 (gritter) 5/1/04
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
#ifndef LIBUXRE_COLLDATA_H
|
||||
#define LIBUXRE_COLLDATA_H
|
||||
|
||||
typedef struct
|
||||
{
|
||||
long coll_offst; /* offset to xnd table */
|
||||
long sub_cnt; /* length of subnd table */
|
||||
long sub_offst; /* offset to subnd table */
|
||||
long str_offst; /* offset to strings for subnd table */
|
||||
long flags; /* nonzero if reg.exp. used */
|
||||
} hd;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
unsigned char ch; /* character or number of followers */
|
||||
unsigned char pwt; /* primary weight */
|
||||
unsigned char swt; /* secondary weight */
|
||||
unsigned char ns; /* index of follower state list */
|
||||
} xnd;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
char *exp; /* expression to be replaced */
|
||||
long explen; /* length of expression */
|
||||
char *repl; /* replacement string */
|
||||
} subnd;
|
||||
|
||||
/*----------------------------------*/
|
||||
|
||||
#include <wcharm.h>
|
||||
#include <limits.h>
|
||||
/* #include <stdlock.h> */
|
||||
|
||||
/*
|
||||
* Structure of a collation file:
|
||||
* 1. CollHead (maintbl is 0 if CHF_ENCODED)
|
||||
* if !CHF_ENCODED then
|
||||
* 2. CollElem[bytes] (256 for 8 bit bytes)
|
||||
* 3. if CHF_INDEXED then
|
||||
* CollElem[wides] (nmain-256 for 8 bit bytes)
|
||||
* else
|
||||
* CollMult[wides]
|
||||
* 4. CollMult[*] (none if multtbl is 0)
|
||||
* 5. wuchar_type[*] (none if repltbl is 0)
|
||||
* 6. CollSubn[*] (none if subntbl is 0)
|
||||
* 7. strings (first is pathname for .so if CHF_DYNAMIC)
|
||||
*
|
||||
* The actual location of parts 2 through 7 is not important.
|
||||
*
|
||||
* The main table is in encoded value order.
|
||||
*
|
||||
* All indeces/offsets must be nonzero to be effective; zero is reserved
|
||||
* to indicate no-such-entry. This implies either that an unused initial
|
||||
* entry is placed in each of (4) through (7), or that the "start offset"
|
||||
* given by the header is artificially pushed back by an entry size.
|
||||
*
|
||||
* Note that if CHF_ENCODED is not set, then nweight must be positive.
|
||||
*
|
||||
* If an element can begin a multiple character element, it contains a
|
||||
* nonzero multbeg which is the initial index into (4) for its list;
|
||||
* the list is terminated by a CollMult with a ch of zero.
|
||||
*
|
||||
* If there are elements with the same primary weight (weight[1]), then
|
||||
* for each such element, it must have a CollMult list. The CollMult
|
||||
* that terminates the list (ch==0) notes the lowest and highest basic
|
||||
* weights for those elements with that same primary weight value
|
||||
* respectively in weight[0] and weight[1]. If there are some basic
|
||||
* weights between these values that do not have the same primary
|
||||
* weight--are not in the equivalence class--then the terminator also
|
||||
* has a SUBN_SPECIAL mark. Note that this list terminator should be
|
||||
* shared when the elements are not multiple character collating
|
||||
* elements because they wouldn't otherwise have a CollMult list.
|
||||
*
|
||||
* WGHT_IGNORE is used to denote ignored collating elements for a
|
||||
* particular collation ordering pass. All main table entries other
|
||||
* than for '\0' will have a non-WGHT_IGNORE weight[0]. However, it is
|
||||
* possible for a CollMult entries from (4) to have a WGHT_IGNORE
|
||||
* weight[0]: If, for example, "xyz" is a multiple character collating
|
||||
* element, but "xy" is not, then the CollMult for "y" will have a
|
||||
* WGHT_IGNORE weight[0]. Also, WGHT_IGNORE is used to terminate each
|
||||
* list of replacement weights.
|
||||
*
|
||||
* Within (3), it is possible to describe a sequence of unremarkable
|
||||
* collating elements with a single CollMult entry. If the SUBN_SPECIAL
|
||||
* bit is set, the rest of subnbeg represents the number of collating
|
||||
* elements covered by this entry. The weight[0] values are determined
|
||||
* by adding the difference between the encoded value and the entry's ch
|
||||
* value to the entry's weight[0]. This value is then substituted for
|
||||
* any weight[n], n>0 that has only the WGHT_SPECIAL bit set. libuxre_collelem()
|
||||
* hides any match to such an entry by filling in a "spare" CollElem.
|
||||
*
|
||||
* If there are substitution strings, then for each character that begins
|
||||
* a string, it has a nonzero subnbeg which is similarly the initial
|
||||
* index into (6). The indeces in (6) refer to offsets within (7).
|
||||
*/
|
||||
|
||||
#define TOPBIT(t) (((t)1) << (sizeof(t) * CHAR_BIT - 1))
|
||||
|
||||
#define CHF_ENCODED 0x1 /* collation by encoded values only */
|
||||
#define CHF_INDEXED 0x2 /* main table indexed by encoded values */
|
||||
#define CHF_MULTICH 0x4 /* a multiple char. coll. elem. exists */
|
||||
#define CHF_DYNAMIC 0x8 /* shared object has collation functions */
|
||||
|
||||
#define CWF_BACKWARD 0x1 /* reversed ordering for this weight */
|
||||
#define CWF_POSITION 0x2 /* weight takes position into account */
|
||||
|
||||
#define CLVERS 1 /* most recent version */
|
||||
|
||||
#define WGHT_IGNORE 0 /* ignore this collating element */
|
||||
#define WGHT_SPECIAL TOPBIT(wuchar_type)
|
||||
#define SUBN_SPECIAL TOPBIT(unsigned short)
|
||||
|
||||
#ifndef COLL_WEIGHTS_MAX
|
||||
#define COLL_WEIGHTS_MAX 1
|
||||
#endif
|
||||
|
||||
typedef struct
|
||||
{
|
||||
unsigned long maintbl; /* start of main table */
|
||||
unsigned long multtbl; /* start of multi-char table */
|
||||
unsigned long repltbl; /* start of replacement weights */
|
||||
unsigned long subntbl; /* start of substitutions */
|
||||
unsigned long strstbl; /* start of sub. strings */
|
||||
unsigned long nmain; /* # entries in main table */
|
||||
unsigned short flags; /* CHF_* bits */
|
||||
unsigned short version; /* handle future changes */
|
||||
unsigned char elemsize; /* # bytes/element (w/padding) */
|
||||
unsigned char nweight; /* # weights/element */
|
||||
unsigned char order[COLL_WEIGHTS_MAX]; /* CWF_* bits/weight */
|
||||
} CollHead;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
unsigned short multbeg; /* start of multi-chars */
|
||||
unsigned short subnbeg; /* start of substitutions */
|
||||
wuchar_type weight[COLL_WEIGHTS_MAX];
|
||||
} CollElem;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
wchar_t ch; /* "this" character (of sequence) */
|
||||
CollElem elem; /* its full information */
|
||||
} CollMult;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
unsigned short strbeg; /* start of match string */
|
||||
unsigned short length; /* length of match string */
|
||||
unsigned short repbeg; /* start of replacement */
|
||||
} CollSubn;
|
||||
|
||||
struct lc_collate
|
||||
{
|
||||
const unsigned char *strstbl;
|
||||
const wuchar_type *repltbl;
|
||||
const CollElem *maintbl;
|
||||
const CollMult *multtbl;
|
||||
const CollSubn *subntbl;
|
||||
#ifdef DSHLIB
|
||||
void *handle;
|
||||
void (*done)(struct lc_collate *);
|
||||
int (*strc)(struct lc_collate *, const char *, const char *);
|
||||
int (*wcsc)(struct lc_collate *, const wchar_t *, const wchar_t *);
|
||||
size_t (*strx)(struct lc_collate *, char *, const char *, size_t);
|
||||
size_t (*wcsx)(struct lc_collate *, wchar_t *, const wchar_t *, size_t);
|
||||
#endif
|
||||
const char *mapobj;
|
||||
size_t mapsize;
|
||||
unsigned long nmain;
|
||||
short nuse;
|
||||
unsigned short flags;
|
||||
unsigned char elemsize;
|
||||
unsigned char nweight;
|
||||
unsigned char order[COLL_WEIGHTS_MAX];
|
||||
};
|
||||
|
||||
#define ELEM_BADCHAR ((CollElem *)0)
|
||||
#define ELEM_ENCODED ((CollElem *)-1)
|
||||
|
||||
/*
|
||||
LIBUXRE_STATIC int libuxre_old_collate(struct lc_collate *);
|
||||
LIBUXRE_STATIC int libuxre_strqcoll(struct lc_collate *, const char *,
|
||||
const char *);
|
||||
LIBUXRE_STATIC int libuxre_wcsqcoll(struct lc_collate *, const wchar_t *,
|
||||
const wchar_t *);
|
||||
*/
|
||||
extern struct lc_collate *libuxre_lc_collate(struct lc_collate *);
|
||||
LIBUXRE_STATIC const CollElem *libuxre_collelem(struct lc_collate *,
|
||||
CollElem *, wchar_t);
|
||||
LIBUXRE_STATIC const CollElem *libuxre_collmult(struct lc_collate *,
|
||||
const CollElem *, wchar_t);
|
||||
/*
|
||||
LIBUXRE_STATIC const CollElem *libuxre_collmbs(struct lc_collate *,
|
||||
CollElem *, const unsigned char **);
|
||||
LIBUXRE_STATIC const CollElem *libuxre_collwcs(struct lc_collate *,
|
||||
CollElem *, const wchar_t **);
|
||||
*/
|
||||
|
||||
#endif /* !LIBUXRE_COLLDATA_H */
|
||||
38
libuxre/onefile.c
Normal file
38
libuxre/onefile.c
Normal file
@@ -0,0 +1,38 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)onefile.c 1.1 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
#define LIBUXRE_STATIC static
|
||||
|
||||
#include "_collelem.c"
|
||||
#include "_collmult.c"
|
||||
#include "stubs.c"
|
||||
#include "bracket.c"
|
||||
#include "regdfa.c"
|
||||
#include "regnfa.c"
|
||||
#include "regparse.c"
|
||||
#include "regcomp.c"
|
||||
#include "regexec.c"
|
||||
228
libuxre/re.h
Normal file
228
libuxre/re.h
Normal file
@@ -0,0 +1,228 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)re.h 1.15 (gritter) 2/6/05
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
#ifndef LIBUXRE_RE_H
|
||||
#define LIBUXRE_RE_H
|
||||
|
||||
/*
|
||||
* Maps safe external tag to internal one
|
||||
*/
|
||||
#define re_coll_ lc_collate /* <regex.h> */
|
||||
/* #define __fnm_collate lc_collate */ /* <fnmatch.h> */
|
||||
|
||||
#include <limits.h>
|
||||
#include <regex.h>
|
||||
/* #include <fnmatch.h> */
|
||||
#include <colldata.h>
|
||||
|
||||
#define NBSHT (sizeof(unsigned short) * CHAR_BIT)
|
||||
#define NBYTE (((1 << CHAR_BIT) + NBSHT - 1) / NBSHT)
|
||||
#define NTYPE 4
|
||||
#define NWIDE 32
|
||||
#define NQUIV 4
|
||||
|
||||
typedef struct
|
||||
{
|
||||
struct lc_collate *col; /* only member set by caller */
|
||||
wctype_t *extype;
|
||||
wuchar_type *exquiv;
|
||||
wchar_t *exwide;
|
||||
wctype_t type[NTYPE];
|
||||
wuchar_type quiv[NQUIV];
|
||||
wchar_t wide[NWIDE];
|
||||
unsigned short byte[NBYTE];
|
||||
unsigned short ntype;
|
||||
unsigned short nquiv;
|
||||
unsigned short nwide;
|
||||
unsigned int flags;
|
||||
} Bracket;
|
||||
|
||||
#define BKT_NEGATED 0x001 /* complemented set */
|
||||
#define BKT_ONECASE 0x002 /* uppercase same as lowercase */
|
||||
#define BKT_NOTNL 0x004 /* do not match newline when BKT_NEGATED */
|
||||
#define BKT_BADRANGE 0x008 /* accept [m-a] ranges as [ma] */
|
||||
#define BKT_SEPRANGE 0x010 /* disallow [a-m-z] style ranges */
|
||||
#define BKT_NLBAD 0x020 /* newline disallowed */
|
||||
#define BKT_SLASHBAD 0x040 /* slash disallowed (for pathnames) */
|
||||
#define BKT_EMPTY 0x080 /* take leading ] is end (empty set) */
|
||||
#define BKT_ESCAPE 0x100 /* allow \ as quote for next anything */
|
||||
#define BKT_QUOTE 0x200 /* allow \ as quote for \\, \^, \- or \] */
|
||||
#define BKT_ESCNL 0x400 /* take \n as the newline character */
|
||||
#define BKT_ESCSEQ 0x800 /* otherwise, take \ as in C escapes */
|
||||
#define BKT_ODDRANGE 0x1000 /* oawk oddity: [m-a] means [m] */
|
||||
#define BKT_NOI18N 0x2000 /* disable [::] [==] [..] */
|
||||
#define BKT_OLDESC 0x4000 /* enable \b \f \n \r \t only */
|
||||
|
||||
/*
|
||||
* These error returns for libuxre_bktmbcomp() are directly tied to
|
||||
* the error returns for regcomp() for convenience.
|
||||
*/
|
||||
#define BKT_BADPAT (-REG_BADPAT)
|
||||
#define BKT_ECOLLATE (-REG_ECOLLATE)
|
||||
#define BKT_ECTYPE (-REG_ECTYPE)
|
||||
#define BKT_EEQUIV (-REG_EEQUIV)
|
||||
#define BKT_BADCHAR (-REG_EBKTCHAR)
|
||||
#define BKT_EBRACK (-REG_EBRACK)
|
||||
#define BKT_EMPTYSUBBKT (-REG_EMPTYSUBBKT)
|
||||
#define BKT_ERANGE (-REG_ERANGE)
|
||||
#define BKT_ESPACE (-REG_ESPACE)
|
||||
#define BKT_BADESC (-REG_BADESC)
|
||||
#define BKT_ILLSEQ (-REG_ILLSEQ)
|
||||
|
||||
/*
|
||||
* These must be distinct from the flags in <fnmatch.h>.
|
||||
*/
|
||||
#define FNM_COLLATE 0x2000 /* have collation information */
|
||||
#define FNM_CURRENT 0x4000 /* have full-sized fnm_t structure */
|
||||
|
||||
/*
|
||||
* These must be distinct from the flags in <regex.h>.
|
||||
*/
|
||||
#define REG_NFA 0x20000000
|
||||
#define REG_DFA 0x40000000
|
||||
#define REG_GOTBKT 0x80000000
|
||||
|
||||
#define BRACE_INF USHRT_MAX
|
||||
#define BRACE_MAX 5100 /* arbitrary number < SHRT_MAX */
|
||||
#define BRACE_DFAMAX 255 /* max amount for r.e. duplication */
|
||||
|
||||
typedef union /* extra info always kept for some tokens/nodes */
|
||||
{
|
||||
Bracket *bkt; /* ROP_BKT */
|
||||
size_t sub; /* ROP_LP (ROP_RP), ROP_REF */
|
||||
unsigned short num[2]; /* ROP_BRACE: num[0]=low, num[1]=high */
|
||||
} Info;
|
||||
|
||||
typedef struct /* lexical context while parsing */
|
||||
{
|
||||
Info info;
|
||||
const unsigned char *pat;
|
||||
unsigned char *clist;
|
||||
struct lc_collate *col;
|
||||
unsigned long flags;
|
||||
w_type tok;
|
||||
size_t maxref;
|
||||
size_t nleft;
|
||||
size_t nright;
|
||||
size_t nclist;
|
||||
int bktflags;
|
||||
int err;
|
||||
int mb_cur_max;
|
||||
} Lex;
|
||||
|
||||
typedef struct t_tree Tree; /* RE parse tree node */
|
||||
struct t_tree
|
||||
{
|
||||
union
|
||||
{
|
||||
Tree *ptr; /* unary & binary nodes */
|
||||
size_t pos; /* position for DFA leaves */
|
||||
} left;
|
||||
union
|
||||
{
|
||||
Tree *ptr; /* binary nodes */
|
||||
Info info;
|
||||
} right;
|
||||
Tree *parent;
|
||||
w_type op; /* positive => char. to match */
|
||||
};
|
||||
|
||||
typedef struct re_dfa_ Dfa; /* DFA engine description */
|
||||
typedef struct re_nfa_ Nfa; /* NFA engine description */
|
||||
|
||||
typedef struct
|
||||
{
|
||||
const unsigned char *str;
|
||||
regmatch_t *match;
|
||||
size_t nmatch;
|
||||
unsigned long flags;
|
||||
int mb_cur_max;
|
||||
} Exec;
|
||||
|
||||
/*
|
||||
* Regular expression operators. Some only used internally.
|
||||
* All are negative, to distinguish them from the regular
|
||||
* "match this particular wide character" operation.
|
||||
*/
|
||||
#define BINARY_ROP 0x02
|
||||
#define UNARY_ROP 0x01
|
||||
#define LEAF_ROP 0x00
|
||||
|
||||
#define MAKE_ROP(k, v) (-((v) | ((k) << 4)))
|
||||
#define KIND_ROP(v) ((-(v)) >> 4)
|
||||
|
||||
#define ROP_OR MAKE_ROP(BINARY_ROP, 1)
|
||||
#define ROP_CAT MAKE_ROP(BINARY_ROP, 2)
|
||||
|
||||
#define ROP_STAR MAKE_ROP(UNARY_ROP, 1)
|
||||
#define ROP_PLUS MAKE_ROP(UNARY_ROP, 2)
|
||||
#define ROP_QUEST MAKE_ROP(UNARY_ROP, 3)
|
||||
#define ROP_BRACE MAKE_ROP(UNARY_ROP, 4)
|
||||
#define ROP_LP MAKE_ROP(UNARY_ROP, 5)
|
||||
#define ROP_RP MAKE_ROP(UNARY_ROP, 6)
|
||||
|
||||
#define ROP_NOP MAKE_ROP(LEAF_ROP, 1) /* temporary */
|
||||
#define ROP_BOL MAKE_ROP(LEAF_ROP, 2) /* ^ anchor */
|
||||
#define ROP_EOL MAKE_ROP(LEAF_ROP, 3) /* $ anchor */
|
||||
#define ROP_ALL MAKE_ROP(LEAF_ROP, 4) /* anything (added) */
|
||||
#define ROP_ANYCH MAKE_ROP(LEAF_ROP, 5) /* . w/\n */
|
||||
#define ROP_NOTNL MAKE_ROP(LEAF_ROP, 6) /* . w/out \n */
|
||||
#define ROP_EMPTY MAKE_ROP(LEAF_ROP, 7) /* empty string */
|
||||
#define ROP_NONE MAKE_ROP(LEAF_ROP, 8) /* match failure */
|
||||
#define ROP_BKT MAKE_ROP(LEAF_ROP, 9) /* [...] */
|
||||
#define ROP_BKTCOPY MAKE_ROP(LEAF_ROP, 10) /* [...] (duplicated) */
|
||||
#define ROP_LT MAKE_ROP(LEAF_ROP, 11) /* \< word begin */
|
||||
#define ROP_GT MAKE_ROP(LEAF_ROP, 12) /* \> word end */
|
||||
#define ROP_REF MAKE_ROP(LEAF_ROP, 13) /* \digit */
|
||||
#define ROP_END MAKE_ROP(LEAF_ROP, 14) /* final (added) */
|
||||
|
||||
/*
|
||||
* Return values:
|
||||
* libuxre_bktmbcomp()
|
||||
* <0 error (see BKT_* above); >0 #bytes scanned
|
||||
* libuxre_bktmbexec()
|
||||
* <0 doesn't match; >=0 matches, #extra bytes scanned
|
||||
*/
|
||||
LIBUXRE_STATIC void libuxre_bktfree(Bracket *);
|
||||
LIBUXRE_STATIC int libuxre_bktmbcomp(Bracket *, const unsigned char *,
|
||||
int, int);
|
||||
LIBUXRE_STATIC int libuxre_bktmbexec(Bracket *, wchar_t,
|
||||
const unsigned char *, int);
|
||||
|
||||
LIBUXRE_STATIC void libuxre_regdeltree(Tree *, int);
|
||||
LIBUXRE_STATIC Tree *libuxre_reg1tree(w_type, Tree *);
|
||||
LIBUXRE_STATIC Tree *libuxre_reg2tree(w_type, Tree *, Tree *);
|
||||
LIBUXRE_STATIC Tree *libuxre_regparse(Lex *, const unsigned char *, int);
|
||||
|
||||
extern void libuxre_regdeldfa(Dfa *);
|
||||
LIBUXRE_STATIC int libuxre_regdfacomp(regex_t *, Tree *, Lex *);
|
||||
LIBUXRE_STATIC int libuxre_regdfaexec(Dfa *, Exec *);
|
||||
|
||||
extern void libuxre_regdelnfa(Nfa *);
|
||||
LIBUXRE_STATIC int libuxre_regnfacomp(regex_t *, Tree *, Lex *);
|
||||
LIBUXRE_STATIC int libuxre_regnfaexec(Nfa *, Exec *);
|
||||
#endif /* !LIBUXRE_RE_H */
|
||||
77
libuxre/regcomp.c
Normal file
77
libuxre/regcomp.c
Normal file
@@ -0,0 +1,77 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regcomp.c 1.6 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include "re.h"
|
||||
|
||||
/* #pragma weak regcomp = _regcomp */
|
||||
|
||||
int
|
||||
regcomp(regex_t *ep, const char *pat, int flags)
|
||||
{
|
||||
Tree *tp;
|
||||
Lex lex;
|
||||
|
||||
if ((tp=libuxre_regparse(&lex, (const unsigned char *)pat, flags)) == 0)
|
||||
goto out;
|
||||
ep->re_nsub = lex.nleft;
|
||||
ep->re_flags = lex.flags & ~(REG_NOTBOL | REG_NOTEOL | REG_NONEMPTY);
|
||||
ep->re_col = lex.col;
|
||||
ep->re_mb_cur_max = lex.mb_cur_max;
|
||||
/*
|
||||
* Build the engine(s). The factors determining which are built:
|
||||
* 1. If the pattern built insists on an NFA, then only build NFA.
|
||||
* 2. If flags include REG_NOSUB or REG_ONESUB and not (1),
|
||||
* then only build DFA.
|
||||
* 3. Otherwise, build both.
|
||||
* Since libuxre_regdfacomp() modifies the tree and libuxre_regnfacomp()
|
||||
* doesn't, libuxre_regnfacomp() must be called first, if both are to
|
||||
* be called.
|
||||
*/
|
||||
if (ep->re_nsub != 0 && (flags & (REG_NOSUB | REG_ONESUB)) == 0
|
||||
|| lex.flags & REG_NFA)
|
||||
{
|
||||
ep->re_flags |= REG_NFA;
|
||||
if ((lex.err = libuxre_regnfacomp(ep, tp, &lex)) != 0)
|
||||
goto out;
|
||||
}
|
||||
if ((lex.flags & REG_NFA) == 0)
|
||||
{
|
||||
ep->re_flags |= REG_DFA;
|
||||
if ((lex.err = libuxre_regdfacomp(ep, tp, &lex)) != 0)
|
||||
{
|
||||
if (ep->re_flags & REG_NFA)
|
||||
libuxre_regdelnfa(ep->re_nfa);
|
||||
}
|
||||
}
|
||||
out:;
|
||||
if (lex.err != 0 && lex.col != 0)
|
||||
(void)libuxre_lc_collate(lex.col);
|
||||
if (tp != 0)
|
||||
libuxre_regdeltree(tp, lex.err);
|
||||
return lex.err;
|
||||
}
|
||||
877
libuxre/regdfa.c
Normal file
877
libuxre/regdfa.c
Normal file
@@ -0,0 +1,877 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regdfa.c 1.9 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <ctype.h>
|
||||
#include "regdfa.h"
|
||||
|
||||
/*
|
||||
* Deterministic Finite Automata.
|
||||
*/
|
||||
|
||||
/*
|
||||
* Postorder traversal that returns a copy of the subtree,
|
||||
* except that ROP_BKT becomes ROP_BKTCOPY (since they
|
||||
* share the same pointed to Bracket object).
|
||||
*/
|
||||
static Tree *
|
||||
copy(regex_t *ep, Tree *tp)
|
||||
{
|
||||
Tree *np;
|
||||
|
||||
if ((np = malloc(sizeof(Tree))) == 0)
|
||||
return 0;
|
||||
switch (np->op = tp->op) /* almost always correct */
|
||||
{
|
||||
case ROP_BKT:
|
||||
np->op = ROP_BKTCOPY;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_BKTCOPY:
|
||||
np->right.info.bkt = tp->right.info.bkt;
|
||||
/*FALLTHROUGH*/
|
||||
default:
|
||||
np->left.pos = ep->re_dfa->nposn++;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_EMPTY:
|
||||
return np;
|
||||
case ROP_CAT:
|
||||
case ROP_OR:
|
||||
if ((np->right.ptr = copy(ep, tp->right.ptr)) == 0)
|
||||
{
|
||||
free(np);
|
||||
return 0;
|
||||
}
|
||||
np->right.ptr->parent = np;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_STAR:
|
||||
case ROP_PLUS:
|
||||
case ROP_QUEST:
|
||||
case ROP_LP:
|
||||
if ((np->left.ptr = copy(ep, tp->left.ptr)) == 0)
|
||||
break;
|
||||
np->left.ptr->parent = np;
|
||||
return np;
|
||||
}
|
||||
libuxre_regdeltree(np, 1);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Postorder traversal.
|
||||
* Assign unique ascending integer values to the leaves.
|
||||
* Since the right child is traversed before the left,
|
||||
* the position for ROP_END is guaranteed to be zero.
|
||||
* The parse tree is rewritten in two cases:
|
||||
* - Each ROP_BRACE is replaced by an equivalent--sometimes
|
||||
* large--subtree using only ROP_CAT, ROP_QUEST, and
|
||||
* ROP_PLUS.
|
||||
* - If REG_ICASE, replace each simple character that has
|
||||
* an uppercase equivalent with a ROP_OR subtree over the
|
||||
* two versions.
|
||||
* Since these rewrites occur bottom up, they have already
|
||||
* been applied before any subtrees passed to copy().
|
||||
*/
|
||||
static Tree *
|
||||
findposn(regex_t *ep, Tree *tp, int mb_cur_max)
|
||||
{
|
||||
unsigned int lo, hi;
|
||||
Tree *ptr, *par;
|
||||
w_type wc;
|
||||
|
||||
switch (tp->op)
|
||||
{
|
||||
default:
|
||||
if (ep->re_flags & REG_ICASE
|
||||
&& (wc = to_upper(tp->op)) != tp->op)
|
||||
{
|
||||
if ((ptr = libuxre_reg1tree(tp->op, 0)) == 0)
|
||||
return 0;
|
||||
ptr->parent = tp;
|
||||
ptr->left.pos = ep->re_dfa->nposn++;
|
||||
tp->op = ROP_OR;
|
||||
tp->left.ptr = ptr;
|
||||
ptr = libuxre_reg1tree(wc, 0);
|
||||
if ((tp->right.ptr = ptr) == 0)
|
||||
return 0;
|
||||
ptr->parent = tp;
|
||||
ptr->left.pos = ep->re_dfa->nposn++;
|
||||
return tp;
|
||||
}
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_BOL:
|
||||
case ROP_EOL:
|
||||
case ROP_ALL:
|
||||
case ROP_ANYCH:
|
||||
case ROP_NOTNL:
|
||||
case ROP_NONE:
|
||||
case ROP_BKT:
|
||||
case ROP_BKTCOPY:
|
||||
case ROP_END:
|
||||
tp->left.pos = ep->re_dfa->nposn++;
|
||||
return tp;
|
||||
case ROP_EMPTY:
|
||||
return tp;
|
||||
case ROP_OR:
|
||||
case ROP_CAT:
|
||||
if ((tp->right.ptr = findposn(ep, tp->right.ptr,
|
||||
mb_cur_max)) == 0)
|
||||
return 0;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_STAR:
|
||||
case ROP_PLUS:
|
||||
case ROP_QUEST:
|
||||
case ROP_LP:
|
||||
if ((tp->left.ptr = findposn(ep, tp->left.ptr,
|
||||
mb_cur_max)) == 0)
|
||||
return 0;
|
||||
return tp;
|
||||
case ROP_BRACE:
|
||||
if ((tp->left.ptr = findposn(ep, tp->left.ptr,
|
||||
mb_cur_max)) == 0)
|
||||
return 0;
|
||||
break;
|
||||
}
|
||||
/*
|
||||
* ROP_BRACE as is cannot be handled in a DFA. This code
|
||||
* duplicates the ROP_BRACE subtree as a left-towering
|
||||
* series of ROP_CAT nodes, the first "lo" of which are
|
||||
* direct copies of the original subtree. The tail of
|
||||
* the series are either some number of ROP_QUESTs over
|
||||
* copies of the original subtree, or a single ROP_PLUS
|
||||
* over a copy (when "hi" is infinity).
|
||||
*
|
||||
* All interesting cases {lo,hi}:
|
||||
* {0,0} -> ROP_EMPTY, parsing, temporary
|
||||
* {0,1} -> ROP_QUEST, parsing
|
||||
* {0,2} -> CAT(QUEST(left), QUEST(copy))
|
||||
* {0,n} -> CAT({0,n-1}, QUEST(copy))
|
||||
* {0,} -> ROP_STAR, parsing
|
||||
*
|
||||
* {1,1} -> ROP_NOP, parsing, temporary
|
||||
* {1,2} -> CAT(left, QUEST(copy))
|
||||
* {1,n} -> CAT({1,n-1}, QUEST(copy))
|
||||
* {1,} -> ROP_PLUS, parsing
|
||||
*
|
||||
* {2,2} -> CAT(left, copy)
|
||||
* {2,n} -> CAT({2,n-1}, QUEST(copy))
|
||||
* {2,} -> CAT(left, PLUS(copy))
|
||||
*
|
||||
* {3,3} -> CAT({2,2}, copy)
|
||||
* {3,n} -> CAT({3,n-1}, QUEST(copy))
|
||||
* {3,} -> CAT({2,2}, PLUS(copy))
|
||||
*
|
||||
* {n,} -> CAT({n-1,n-1}, PLUS(copy))
|
||||
*
|
||||
* In all cases, the ROP_BRACE node is turned into the
|
||||
* left-most ROP_CAT, and a copy of its original subtree
|
||||
* is connected as the right child. Note that the bottom-
|
||||
* up nature of this duplication guarantees that copy()
|
||||
* never sees a ROP_BRACE node.
|
||||
*/
|
||||
par = tp->parent;
|
||||
lo = tp->right.info.num[0];
|
||||
hi = tp->right.info.num[1];
|
||||
if ((ptr = copy(ep, tp->left.ptr)) == 0)
|
||||
return 0;
|
||||
ptr->parent = tp;
|
||||
tp->op = ROP_CAT;
|
||||
tp->right.ptr = ptr;
|
||||
if (lo == 0)
|
||||
{
|
||||
if ((tp->left.ptr = libuxre_reg1tree(ROP_QUEST, tp->left.ptr))
|
||||
== 0)
|
||||
return 0;
|
||||
tp->left.ptr->parent = tp;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (hi == BRACE_INF || (hi -= lo) == 0)
|
||||
lo--; /* lo > 1; no extra needed */
|
||||
while (--lo != 0)
|
||||
{
|
||||
if ((tp = libuxre_reg2tree(ROP_CAT, tp, copy(ep, ptr)))
|
||||
== 0)
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
if (hi == BRACE_INF)
|
||||
{
|
||||
if ((tp->right.ptr = libuxre_reg1tree(ROP_PLUS, tp->right.ptr))
|
||||
== 0)
|
||||
return 0;
|
||||
tp->right.ptr->parent = tp;
|
||||
}
|
||||
else if (hi != 0)
|
||||
{
|
||||
if ((tp->right.ptr = libuxre_reg1tree(ROP_QUEST, tp->right.ptr))
|
||||
== 0)
|
||||
return 0;
|
||||
ptr = tp->right.ptr;
|
||||
ptr->parent = tp;
|
||||
while (--hi != 0)
|
||||
{
|
||||
if ((tp = libuxre_reg2tree(ROP_CAT, tp, copy(ep, ptr)))
|
||||
== 0)
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
tp->parent = par;
|
||||
return tp;
|
||||
}
|
||||
|
||||
/*
|
||||
* Postorder traversal, but not always entire subtree.
|
||||
* For each leaf reachable by the empty string, add it
|
||||
* to the set. Return 0 if the subtree can match empty.
|
||||
*/
|
||||
static int
|
||||
first(Dfa *dp, Tree *tp)
|
||||
{
|
||||
switch (tp->op)
|
||||
{
|
||||
case ROP_BOL:
|
||||
if (dp->flags & REG_NOTBOL)
|
||||
return 0;
|
||||
break;
|
||||
case ROP_EOL:
|
||||
if (dp->flags & REG_NOTEOL)
|
||||
return 0;
|
||||
break;
|
||||
case ROP_EMPTY:
|
||||
return 0;
|
||||
case ROP_OR:
|
||||
return first(dp, tp->left.ptr) & first(dp, tp->right.ptr);
|
||||
case ROP_CAT:
|
||||
if (first(dp, tp->left.ptr) != 0)
|
||||
return 1;
|
||||
return first(dp, tp->right.ptr);
|
||||
case ROP_BRACE:
|
||||
if (tp->right.info.num[0] != 0 && first(dp, tp->left.ptr) != 0)
|
||||
return 1;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_STAR:
|
||||
case ROP_QUEST:
|
||||
first(dp, tp->left.ptr);
|
||||
return 0;
|
||||
case ROP_LP:
|
||||
case ROP_PLUS:
|
||||
return first(dp, tp->left.ptr);
|
||||
}
|
||||
if (dp->posset[tp->left.pos] == 0)
|
||||
{
|
||||
dp->posset[tp->left.pos] = 1;
|
||||
dp->nset++;
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
/*
|
||||
* Walk from leaf up (most likely not to root).
|
||||
* Determine follow set for the leaf by filling
|
||||
* set[] with the positions reachable.
|
||||
*/
|
||||
static void
|
||||
follow(Dfa *dp, Tree *tp)
|
||||
{
|
||||
Tree *pp;
|
||||
|
||||
switch ((pp = tp->parent)->op)
|
||||
{
|
||||
case ROP_CAT:
|
||||
if (pp->left.ptr == tp && first(dp, pp->right.ptr) != 0)
|
||||
break;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_OR:
|
||||
case ROP_QUEST:
|
||||
case ROP_LP:
|
||||
follow(dp, pp);
|
||||
break;
|
||||
case ROP_STAR:
|
||||
case ROP_PLUS:
|
||||
case ROP_BRACE:
|
||||
first(dp, tp);
|
||||
follow(dp, pp);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Postorder traversal.
|
||||
* At each leaf, copy it into posn[] and assign its follow set.
|
||||
* Because the left-most subtree is ROP_ALL under ROP_STAR, the
|
||||
* follow set for its leaf (position dp->nposn-1) is the same
|
||||
* as the initial state's signature (prior to any ROP_BOL).
|
||||
*/
|
||||
static int
|
||||
posnfoll(Dfa *dp, Tree *tp)
|
||||
{
|
||||
unsigned char *s;
|
||||
size_t i, n;
|
||||
size_t *fp;
|
||||
Posn *p;
|
||||
int ret;
|
||||
|
||||
switch (tp->op)
|
||||
{
|
||||
case ROP_OR:
|
||||
case ROP_CAT:
|
||||
if ((ret = posnfoll(dp, tp->right.ptr)) != 0)
|
||||
return ret;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_STAR:
|
||||
case ROP_PLUS:
|
||||
case ROP_QUEST:
|
||||
case ROP_LP:
|
||||
if ((ret = posnfoll(dp, tp->left.ptr)) != 0)
|
||||
return ret;
|
||||
return 0;
|
||||
case ROP_END: /* keeps follow() from walking above the root */
|
||||
p = &dp->posn[tp->left.pos];
|
||||
p->op = tp->op;
|
||||
p->seti = 0;
|
||||
p->nset = 0;
|
||||
return 0;
|
||||
case ROP_BKT:
|
||||
case ROP_BKTCOPY:
|
||||
p = &dp->posn[tp->left.pos];
|
||||
p->bkt = tp->right.info.bkt;
|
||||
goto skip;
|
||||
case ROP_BOL:
|
||||
dp->flags |= REG_NOTBOL; /* adjacent ROP_BOLs match empty */
|
||||
break;
|
||||
case ROP_EOL:
|
||||
dp->flags |= REG_NOTEOL; /* adjacent ROP_EOLs match empty */
|
||||
break;
|
||||
}
|
||||
p = &dp->posn[tp->left.pos];
|
||||
skip:;
|
||||
p->op = tp->op;
|
||||
memset(dp->posset, 0, dp->nposn);
|
||||
dp->nset = 0;
|
||||
follow(dp, tp);
|
||||
dp->flags &= ~(REG_NOTBOL | REG_NOTEOL);
|
||||
fp = dp->posfoll;
|
||||
if ((p->nset = dp->nset) > dp->avail) /* need more */
|
||||
{
|
||||
if ((n = p->nset << 1) < dp->nposn)
|
||||
n = dp->nposn;
|
||||
dp->avail += n;
|
||||
if ((fp = realloc(dp->posfoll,
|
||||
sizeof(size_t) * (dp->avail + dp->used))) == 0)
|
||||
{
|
||||
return REG_ESPACE;
|
||||
}
|
||||
dp->posfoll = fp;
|
||||
}
|
||||
p->seti = dp->used;
|
||||
if ((i = dp->nset) != 0)
|
||||
{
|
||||
dp->used += i;
|
||||
dp->avail -= i;
|
||||
fp += p->seti;
|
||||
s = dp->posset;
|
||||
n = 0;
|
||||
do
|
||||
{
|
||||
if (*s++ != 0)
|
||||
{
|
||||
*fp++ = n;
|
||||
if (--i == 0)
|
||||
break;
|
||||
}
|
||||
} while (++n != dp->nposn);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int
|
||||
addstate(Dfa *dp) /* install state if unique; return its index */
|
||||
{
|
||||
size_t *sp, *fp;
|
||||
size_t t, n, i;
|
||||
int flushed;
|
||||
|
||||
/*
|
||||
* Compare dp->nset/dp->cursig[] against remembered states.
|
||||
*/
|
||||
t = dp->top;
|
||||
do
|
||||
{
|
||||
if (dp->nsig[--t] != dp->nset)
|
||||
continue;
|
||||
if ((n = dp->nset) != 0)
|
||||
{
|
||||
fp = &dp->sigfoll[dp->sigi[t]];
|
||||
sp = &dp->cursig[0];
|
||||
loop:;
|
||||
if (*fp++ != *sp++)
|
||||
continue; /* to the do-while */
|
||||
if (--n != 0)
|
||||
goto loop;
|
||||
}
|
||||
return t + 1;
|
||||
} while (t != 0);
|
||||
/*
|
||||
* Not in currently cached states; add it.
|
||||
*/
|
||||
flushed = 0;
|
||||
if ((t = dp->top) >= CACHESZ) /* need to flush the cache */
|
||||
{
|
||||
flushed = 1;
|
||||
n = dp->anybol;
|
||||
n = dp->sigi[n] + dp->nsig[n]; /* past invariant states */
|
||||
dp->avail += dp->used - n;
|
||||
dp->used = n;
|
||||
dp->top = n = dp->nfix;
|
||||
memset((void *)&dp->trans, 0, sizeof(dp->trans));
|
||||
memset((void *)&dp->acc[n], 0, CACHESZ - n);
|
||||
t = n;
|
||||
}
|
||||
dp->top++;
|
||||
fp = dp->sigfoll;
|
||||
if ((n = dp->nset) > dp->avail) /* grow strip */
|
||||
{
|
||||
i = dp->avail + n << 1;
|
||||
if ((fp = realloc(fp, sizeof(size_t) * (i + dp->used))) == 0)
|
||||
return 0;
|
||||
dp->avail = i;
|
||||
dp->sigfoll = fp;
|
||||
}
|
||||
dp->acc[t] = 0;
|
||||
if ((dp->nsig[t] = n) != 0)
|
||||
{
|
||||
sp = dp->cursig;
|
||||
if (sp[0] == 0)
|
||||
dp->acc[t] = 1;
|
||||
dp->sigi[t] = i = dp->used;
|
||||
dp->used += n;
|
||||
dp->avail -= n;
|
||||
fp += i;
|
||||
do
|
||||
*fp++ = *sp++;
|
||||
while (--n != 0);
|
||||
}
|
||||
t++;
|
||||
if (flushed)
|
||||
return -t;
|
||||
return t;
|
||||
}
|
||||
|
||||
void
|
||||
libuxre_regdeldfa(Dfa *dp)
|
||||
{
|
||||
Posn *pp;
|
||||
size_t np;
|
||||
|
||||
if (dp->posfoll != 0)
|
||||
free(dp->posfoll);
|
||||
if (dp->sigfoll != 0)
|
||||
free(dp->sigfoll);
|
||||
if (dp->cursig != 0)
|
||||
free(dp->cursig);
|
||||
if ((pp = dp->posn) != 0)
|
||||
{
|
||||
/*
|
||||
* Need to walk the positions list to free any
|
||||
* space used for ROP_BKTs.
|
||||
*/
|
||||
np = dp->nposn;
|
||||
do
|
||||
{
|
||||
if (pp->op == ROP_BKT)
|
||||
{
|
||||
libuxre_bktfree(pp->bkt);
|
||||
free(pp->bkt);
|
||||
}
|
||||
} while (++pp, --np != 0);
|
||||
free(dp->posn);
|
||||
}
|
||||
free(dp);
|
||||
}
|
||||
|
||||
int
|
||||
regtrans(Dfa *dp, int st, w_type wc, int mb_cur_max)
|
||||
{
|
||||
const unsigned char *s;
|
||||
size_t *fp, *sp;
|
||||
size_t i, n;
|
||||
Posn *pp;
|
||||
int nst;
|
||||
|
||||
if ((n = dp->nsig[st]) == 0) /* dead state */
|
||||
return st + 1; /* stay here */
|
||||
memset(dp->posset, 0, dp->nposn);
|
||||
dp->nset = 0;
|
||||
fp = &dp->sigfoll[dp->sigi[st]];
|
||||
do
|
||||
{
|
||||
pp = &dp->posn[*fp];
|
||||
switch (pp->op)
|
||||
{
|
||||
case ROP_EOL:
|
||||
if (wc == '\0' && (dp->flags & REG_NOTEOL) == 0)
|
||||
break;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_BOL:
|
||||
default:
|
||||
if (pp->op == wc)
|
||||
break;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_END:
|
||||
case ROP_NONE:
|
||||
continue;
|
||||
case ROP_NOTNL:
|
||||
if (wc == '\n')
|
||||
continue;
|
||||
/*FALLTHROUGH*/
|
||||
case ROP_ANYCH:
|
||||
if (wc <= '\0')
|
||||
continue;
|
||||
break;
|
||||
case ROP_ALL:
|
||||
if (wc == '\0')
|
||||
continue;
|
||||
break;
|
||||
case ROP_BKT:
|
||||
case ROP_BKTCOPY:
|
||||
/*
|
||||
* Note that multiple character bracket matches
|
||||
* are precluded from DFAs. (See regparse.c and
|
||||
* regcomp.c.) Thus, the continuation string
|
||||
* argument is not used in libuxre_bktmbexec().
|
||||
*/
|
||||
if (wc > '\0' &&
|
||||
libuxre_bktmbexec(pp->bkt, wc, 0, mb_cur_max) == 0)
|
||||
break;
|
||||
continue;
|
||||
}
|
||||
/*
|
||||
* Current character matches this position.
|
||||
* For each position in its follow list,
|
||||
* add that position to the new state's signature.
|
||||
*/
|
||||
i = pp->nset;
|
||||
sp = &dp->posfoll[pp->seti];
|
||||
do
|
||||
{
|
||||
if (dp->posset[*sp] == 0)
|
||||
{
|
||||
dp->posset[*sp] = 1;
|
||||
dp->nset++;
|
||||
}
|
||||
} while (++sp, --i != 0);
|
||||
} while (++fp, --n != 0);
|
||||
/*
|
||||
* Move the signature (if any) into cursig[] and install it.
|
||||
*/
|
||||
if ((i = dp->nset) != 0)
|
||||
{
|
||||
fp = dp->cursig;
|
||||
s = dp->posset;
|
||||
for (n = 0;; n++)
|
||||
{
|
||||
if (*s++ != 0)
|
||||
{
|
||||
*fp++ = n;
|
||||
if (--i == 0)
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
if ((nst = addstate(dp)) < 0) /* flushed cache */
|
||||
nst = -nst;
|
||||
else if (nst > 0 && (wc & ~(long)(NCHAR - 1)) == 0)
|
||||
dp->trans[st][wc] = nst;
|
||||
return nst;
|
||||
}
|
||||
|
||||
LIBUXRE_STATIC int
|
||||
libuxre_regdfacomp(regex_t *ep, Tree *tp, Lex *lxp)
|
||||
{
|
||||
Tree *lp;
|
||||
Dfa *dp;
|
||||
Posn *p;
|
||||
int st;
|
||||
|
||||
/*
|
||||
* It's convenient to insert an STAR(ALL) subtree to the
|
||||
* immediate left of the current tree. This makes the
|
||||
* "any match" libuxre_regdfaexec() not a special case,
|
||||
* and the initial state signature will fall out when
|
||||
* building the follow sets for all the leaves.
|
||||
*/
|
||||
if ((lp = libuxre_reg1tree(ROP_ALL, 0)) == 0
|
||||
|| (lp = libuxre_reg1tree(ROP_STAR, lp)) == 0
|
||||
|| (tp->left.ptr = lp
|
||||
= libuxre_reg2tree(ROP_CAT, lp, tp->left.ptr)) == 0)
|
||||
{
|
||||
return REG_ESPACE;
|
||||
}
|
||||
lp->parent = tp;
|
||||
if ((dp = calloc(1, sizeof(Dfa))) == 0)
|
||||
return REG_ESPACE;
|
||||
ep->re_dfa = dp;
|
||||
/*
|
||||
* Just in case null pointers aren't just all bits zero...
|
||||
*/
|
||||
dp->posfoll = 0;
|
||||
dp->sigfoll = 0;
|
||||
dp->cursig = 0;
|
||||
dp->posn = 0;
|
||||
/*
|
||||
* Assign position values to each of the tree's leaves
|
||||
* (the important parts), meanwhile potentially rewriting
|
||||
* the parse tree so that it fits within the restrictions
|
||||
* of our DFA.
|
||||
*/
|
||||
if ((tp = findposn(ep, tp, lxp->mb_cur_max)) == 0)
|
||||
goto err;
|
||||
/*
|
||||
* Get space for the array of positions and current set,
|
||||
* now that the number of positions is known.
|
||||
*/
|
||||
if ((dp->posn = malloc(sizeof(Posn) * dp->nposn + dp->nposn)) == 0)
|
||||
goto err;
|
||||
dp->posset = (unsigned char *)&dp->posn[dp->nposn];
|
||||
/*
|
||||
* Get follow sets for each position.
|
||||
*/
|
||||
if (posnfoll(dp, tp) != 0)
|
||||
goto err;
|
||||
/*
|
||||
* Set up the special invariant states:
|
||||
* - dead state (no valid transitions); index 0.
|
||||
* - initial state for any match [STAR(ALL) follow set]; index 1.
|
||||
* - initial state for any match after ROP_BOL.
|
||||
* - initial state for left-most longest if REG_NOTBOL.
|
||||
* - initial state for left-most longest after ROP_BOL.
|
||||
* The final two are not allocated if leftmost() cannot be called.
|
||||
* The pairs of initial states are the same if there is no
|
||||
* explicit ROP_BOL transition.
|
||||
*/
|
||||
dp->avail += dp->used;
|
||||
dp->used = 0;
|
||||
if ((dp->sigfoll = malloc(sizeof(size_t) * dp->avail)) == 0)
|
||||
goto err;
|
||||
p = &dp->posn[dp->nposn - 1]; /* same as first(root) */
|
||||
dp->cursig = &dp->posfoll[p->seti];
|
||||
dp->nset = p->nset;
|
||||
dp->top = 1; /* index 0 is dead state */
|
||||
addstate(dp); /* must be state index 1 (returns 2) */
|
||||
if ((dp->cursig = malloc(sizeof(size_t) * dp->nposn)) == 0)
|
||||
goto err;
|
||||
dp->nfix = 2;
|
||||
if ((st = regtrans(dp, 1, ROP_BOL, lxp->mb_cur_max)) == 0)
|
||||
goto err;
|
||||
if ((dp->anybol = st - 1) == 2) /* new state */
|
||||
dp->nfix = 3;
|
||||
if ((ep->re_flags & REG_NOSUB) == 0) /* leftmost() might be called */
|
||||
{
|
||||
/*
|
||||
* leftmost() initial states are the same as the
|
||||
* "any match" ones without the STAR(ALL) position.
|
||||
*/
|
||||
dp->sigi[dp->nfix] = 0;
|
||||
dp->nsig[dp->nfix] = dp->nsig[1] - 1;
|
||||
dp->acc[dp->nfix] = dp->acc[1];
|
||||
dp->leftbol = dp->leftmost = dp->nfix;
|
||||
dp->nfix++;
|
||||
if (dp->anybol != 1) /* distinct state w/BOL */
|
||||
{
|
||||
dp->sigi[dp->nfix] = dp->sigi[2];
|
||||
dp->nsig[dp->nfix] = dp->nsig[2] - 1;
|
||||
dp->acc[dp->nfix] = dp->acc[2];
|
||||
dp->leftbol = dp->nfix;
|
||||
dp->nfix++;
|
||||
}
|
||||
dp->top = dp->nfix;
|
||||
}
|
||||
return 0;
|
||||
err:;
|
||||
libuxre_regdeldfa(dp);
|
||||
return REG_ESPACE;
|
||||
}
|
||||
|
||||
static int
|
||||
leftmost(Dfa *dp, Exec *xp)
|
||||
{
|
||||
const unsigned char *s, *beg, *end;
|
||||
int i, nst, st, mb_cur_max;
|
||||
w_type wc;
|
||||
|
||||
mb_cur_max = xp->mb_cur_max;
|
||||
beg = s = xp->str;
|
||||
end = 0;
|
||||
st = dp->leftbol;
|
||||
if (xp->flags & REG_NOTBOL)
|
||||
st = dp->leftmost;
|
||||
if (dp->acc[st] && (xp->flags & REG_NONEMPTY) == 0)
|
||||
end = s; /* initial empty match allowed */
|
||||
for (;;)
|
||||
{
|
||||
if ((wc = *s++) == '\n')
|
||||
{
|
||||
if (xp->flags & REG_NEWLINE)
|
||||
wc = ROP_EOL;
|
||||
}
|
||||
else if (!ISONEBYTE(wc) && (i = libuxre_mb2wc(&wc, s)) > 0)
|
||||
s += i;
|
||||
if ((wc & ~(long)(NCHAR - 1)) != 0
|
||||
|| (nst = dp->trans[st][wc]) == 0)
|
||||
{
|
||||
if ((nst=regtrans(dp, st, wc, mb_cur_max)) == 0)
|
||||
return REG_ESPACE;
|
||||
if (wc == ROP_EOL) /* REG_NEWLINE only */
|
||||
{
|
||||
if (dp->acc[nst - 1])
|
||||
{
|
||||
if (end == 0 || end < s)
|
||||
end = s;
|
||||
break;
|
||||
}
|
||||
beg = s;
|
||||
st = dp->leftbol;
|
||||
goto newst;
|
||||
}
|
||||
}
|
||||
if ((st = nst - 1) == 0) /* dead state */
|
||||
{
|
||||
if (end != 0)
|
||||
break;
|
||||
if ((wc = *beg++) == '\0')
|
||||
return REG_NOMATCH;
|
||||
else if (!ISONEBYTE(wc) &&
|
||||
(i = libuxre_mb2wc(&wc, beg)) > 0)
|
||||
beg += i;
|
||||
s = beg;
|
||||
st = dp->leftmost;
|
||||
goto newst;
|
||||
}
|
||||
if (wc == '\0')
|
||||
{
|
||||
if (dp->acc[st])
|
||||
{
|
||||
s--; /* don't include \0 */
|
||||
if (end == 0 || end < s)
|
||||
end = s;
|
||||
break;
|
||||
}
|
||||
if (end != 0)
|
||||
break;
|
||||
return REG_NOMATCH;
|
||||
}
|
||||
newst:;
|
||||
if (dp->acc[st])
|
||||
{
|
||||
if (end == 0 || end < s)
|
||||
end = s;
|
||||
}
|
||||
}
|
||||
xp->match[0].rm_so = beg - xp->str;
|
||||
xp->match[0].rm_eo = end - xp->str;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Optimization by simplification: singlebyte locale and REG_NEWLINE not set.
|
||||
* Performance gain for grep is 25% so it's worth the hack.
|
||||
*/
|
||||
static int
|
||||
regdfaexec_opt(Dfa *dp, Exec *xp)
|
||||
{
|
||||
const unsigned char *s;
|
||||
int nst, st;
|
||||
|
||||
s = xp->str;
|
||||
st = dp->anybol;
|
||||
if (xp->flags & REG_NOTBOL)
|
||||
st = 1;
|
||||
if (dp->acc[st] && (xp->flags & REG_NONEMPTY) == 0)
|
||||
return 0; /* initial empty match allowed */
|
||||
do
|
||||
{
|
||||
if ((nst = dp->trans[st][*s]) == 0)
|
||||
{
|
||||
if ((nst = regtrans(dp, st, *s, 1)) == 0)
|
||||
return REG_ESPACE;
|
||||
}
|
||||
if (dp->acc[st = nst - 1])
|
||||
return 0;
|
||||
} while (*s++ != '\0'); /* st != 0 */
|
||||
return REG_NOMATCH;
|
||||
}
|
||||
|
||||
LIBUXRE_STATIC int
|
||||
libuxre_regdfaexec(Dfa *dp, Exec *xp)
|
||||
{
|
||||
const unsigned char *s;
|
||||
int i, nst, st, mb_cur_max;
|
||||
w_type wc;
|
||||
|
||||
dp->flags = xp->flags & REG_NOTEOL; /* for regtrans() */
|
||||
mb_cur_max = xp->mb_cur_max;
|
||||
if (xp->nmatch != 0)
|
||||
return leftmost(dp, xp);
|
||||
if (mb_cur_max == 1 && (xp->flags & REG_NEWLINE) == 0)
|
||||
return regdfaexec_opt(dp, xp);
|
||||
s = xp->str;
|
||||
st = dp->anybol;
|
||||
if (xp->flags & REG_NOTBOL)
|
||||
st = 1;
|
||||
if (dp->acc[st] && (xp->flags & REG_NONEMPTY) == 0)
|
||||
return 0; /* initial empty match allowed */
|
||||
for (;;)
|
||||
{
|
||||
if ((wc = *s++) == '\n')
|
||||
{
|
||||
if (xp->flags & REG_NEWLINE)
|
||||
wc = ROP_EOL;
|
||||
}
|
||||
else if (!ISONEBYTE(wc) && (i = libuxre_mb2wc(&wc, s)) > 0)
|
||||
s += i;
|
||||
if ((wc & ~(long)(NCHAR - 1)) != 0
|
||||
|| (nst = dp->trans[st][wc]) == 0)
|
||||
{
|
||||
if ((nst=regtrans(dp, st, wc, mb_cur_max)) == 0)
|
||||
return REG_ESPACE;
|
||||
if (wc == ROP_EOL) /* REG_NEWLINE only */
|
||||
{
|
||||
if (dp->acc[nst - 1])
|
||||
return 0;
|
||||
if (dp->acc[st = dp->anybol])
|
||||
return 0;
|
||||
continue;
|
||||
}
|
||||
}
|
||||
if (dp->acc[st = nst - 1])
|
||||
return 0;
|
||||
if (wc == '\0') /* st == 0 */
|
||||
return REG_NOMATCH;
|
||||
}
|
||||
}
|
||||
75
libuxre/regdfa.h
Normal file
75
libuxre/regdfa.h
Normal file
@@ -0,0 +1,75 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regdfa.h 1.3 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
|
||||
/*
|
||||
* Deterministic Finite Automata.
|
||||
*/
|
||||
|
||||
#ifndef LIBUXRE_REGDFA_H
|
||||
#define LIBUXRE_REGDFA_H
|
||||
|
||||
#include <re.h>
|
||||
|
||||
typedef struct
|
||||
{
|
||||
Bracket *bkt; /* extra info for ROP_BKT */
|
||||
size_t nset; /* number of items in the follow set */
|
||||
size_t seti; /* index into the follow set strip */
|
||||
w_type op; /* the leaf match operation */
|
||||
} Posn;
|
||||
|
||||
#define CACHESZ 32 /* max. states to remember (must fit in uchar) */
|
||||
#define NCHAR (1 << CHAR_BIT)
|
||||
|
||||
struct re_dfa_ /*Dfa*/
|
||||
{
|
||||
unsigned char *posset; /* signatures built here */
|
||||
size_t *posfoll; /* follow strip for posn[] */
|
||||
size_t *sigfoll; /* follow strip for sigi[] */
|
||||
size_t *cursig; /* current state's signature */
|
||||
Posn *posn; /* important positions */
|
||||
size_t nposn; /* length of posn,cursig,posset */
|
||||
size_t used; /* used portion of follow strip */
|
||||
size_t avail; /* unused part of follow strip */
|
||||
size_t nset; /* # items nonzero in posset[] */
|
||||
size_t nsig[CACHESZ]; /* number of items in signature */
|
||||
size_t sigi[CACHESZ]; /* index into sigfoll[] */
|
||||
unsigned char acc[CACHESZ]; /* nonzero for accepting states */
|
||||
unsigned char leftmost; /* leftmost() start, not BOL */
|
||||
unsigned char leftbol; /* leftmost() start, w/BOL */
|
||||
unsigned char anybol; /* any match start, w/BOL */
|
||||
unsigned char nfix; /* number of invariant states */
|
||||
unsigned char top; /* next state index available */
|
||||
unsigned char flags; /* interesting flags */
|
||||
unsigned char trans[CACHESZ][NCHAR]; /* goto table */
|
||||
};
|
||||
|
||||
extern int regtrans(Dfa *, int, w_type, int);
|
||||
|
||||
#endif /* !LIBUXRE_REGDFA_H */
|
||||
95
libuxre/regerror.c
Normal file
95
libuxre/regerror.c
Normal file
@@ -0,0 +1,95 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regerror.c 1.4 (gritter) 3/29/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include <string.h>
|
||||
#include "re.h"
|
||||
/* include "_locale.h" */
|
||||
|
||||
/* #pragma weak regerror = _regerror */
|
||||
|
||||
size_t
|
||||
regerror(int err, const regex_t *ep, char *str, size_t max)
|
||||
{
|
||||
const struct
|
||||
{
|
||||
int index;
|
||||
const char *str;
|
||||
} unk =
|
||||
{
|
||||
88, "unknown regular expression error"
|
||||
}, msgs[] =
|
||||
{
|
||||
/*ENOSYS*/ { 89, "feature not implemented" },
|
||||
/*0*/ { 0, "" },
|
||||
/*NOMATCH*/ { 90, "regular expression failed to match" },
|
||||
/*BADPAT*/ { 91, "invalid regular expression" },
|
||||
/*ECOLLATE*/ { 92, "invalid collating element construct" },
|
||||
/*ECTYPE*/ { 93, "invalid character class construct" },
|
||||
/*EEQUIV*/ { 94, "invalid equivalence class construct" },
|
||||
/*EBKTCHAR*/ { 95, "invalid character in '[ ]' construct" },
|
||||
/*EESCAPE*/ { 96, "trailing \\ in pattern" },
|
||||
/*ESUBREG*/ { 97, "'\\digit' out of range" },
|
||||
/*EBRACK*/ { 98, "'[ ]' imbalance" },
|
||||
/*EMPTYSUBBKT*/ { 99, "empty nested '[ ]' construct" },
|
||||
/*EMPTYPAREN*/ { 100, "empty '\\( \\)' or '( )'" },
|
||||
/*NOPAT*/ { 101, "empty pattern" },
|
||||
/*EPAREN*/ { 102, "'\\( \\)' or '( )' imbalance" },
|
||||
/*EBRACE*/ { 103, "'\\{ \\} or '{ }' imbalance" },
|
||||
/*BADBR*/ { 104, "invalid '\\{ \\}' or '{ }'" },
|
||||
/*ERANGE*/ { 105, "invalid endpoint in range" },
|
||||
/*ESPACE*/ { 106, "out of regular expression memory" },
|
||||
/*BADRPT*/ { 107, "invalid *, +, ?, \\{\\} or {} operator" },
|
||||
/*BADESC*/ { 108, "invalid escape sequence (e.g. \\0)" },
|
||||
/*ILLSEQ*/ { 109, "illegal byte sequence"}
|
||||
};
|
||||
const char *p;
|
||||
size_t len;
|
||||
int i;
|
||||
|
||||
if (err < REG_ENOSYS || REG_ILLSEQ < err)
|
||||
{
|
||||
i = unk.index;
|
||||
p = unk.str;
|
||||
}
|
||||
else
|
||||
{
|
||||
i = msgs[err - REG_ENOSYS].index;
|
||||
p = msgs[err - REG_ENOSYS].str;
|
||||
}
|
||||
/* p = __gtxt(_str_uxlibc, i, p); */
|
||||
len = strlen(p) + 1;
|
||||
if (max != 0)
|
||||
{
|
||||
if (max > len)
|
||||
max = len;
|
||||
else if (max < len)
|
||||
str[--max] = '\0';
|
||||
memcpy(str, p, max);
|
||||
}
|
||||
return len;
|
||||
}
|
||||
153
libuxre/regex.h
Normal file
153
libuxre/regex.h
Normal file
@@ -0,0 +1,153 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regex.h 1.13 (gritter) 2/6/05
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
#ifndef LIBUXRE_REGEX_H
|
||||
#define LIBUXRE_REGEX_H
|
||||
/* from unixsrc:usr/src/common/head/regex.h /main/uw7_nj/1 */
|
||||
|
||||
#include <sys/types.h> /* really only want [s]size_t */
|
||||
|
||||
/*
|
||||
* Official regexec() flags.
|
||||
*/
|
||||
#define REG_NOTBOL 0x000001 /* start of string does not match ^ */
|
||||
#define REG_NOTEOL 0x000002 /* end of string does not match $ */
|
||||
|
||||
/*
|
||||
* Additional regexec() flags.
|
||||
*/
|
||||
#define REG_NONEMPTY 0x000004 /* do not match empty at start of string */
|
||||
|
||||
/*
|
||||
* Extensions to provide individual control over each
|
||||
* of the differences between basic and extended REs.
|
||||
*/
|
||||
#define REG_OR 0x0000001 /* enable | operator */
|
||||
#define REG_PLUS 0x0000002 /* enable + operator */
|
||||
#define REG_QUEST 0x0000004 /* enable ? operator */
|
||||
#define REG_BRACES 0x0000008 /* use {m,n} (instead of \{m,n\}) */
|
||||
#define REG_PARENS 0x0000010 /* use (...) [instead of \(...\)] */
|
||||
#define REG_ANCHORS 0x0000020 /* ^ and $ are anchors anywhere */
|
||||
#define REG_NOBACKREF 0x0000040 /* disable \digit */
|
||||
#define REG_NOAUTOQUOTE 0x0000080 /* no automatic quoting of REG_BADRPTs */
|
||||
|
||||
/*
|
||||
* Official regcomp() flags.
|
||||
*/
|
||||
#define REG_EXTENDED (REG_OR | REG_PLUS | REG_QUEST | REG_BRACES | \
|
||||
REG_PARENS | REG_ANCHORS | \
|
||||
REG_NOBACKREF | REG_NOAUTOQUOTE)
|
||||
#define REG_ICASE 0x0000100 /* ignore case */
|
||||
#define REG_NOSUB 0x0000200 /* only success/fail for regexec() */
|
||||
#define REG_NEWLINE 0x0000400 /* take \n as line separator for ^ and $ */
|
||||
|
||||
/*
|
||||
* Additional regcomp() flags.
|
||||
* Some of these assume that int is >16 bits!
|
||||
* Beware: 0x20000000 and above are used in re.h.
|
||||
*/
|
||||
#define REG_ONESUB 0x0000800 /* regexec() only needs pmatch[0] */
|
||||
#define REG_MTPARENFAIL 0x0001000 /* take empty \(\) or () as match failure */
|
||||
#define REG_MTPARENBAD 0x0002000 /* disallow empty \(\) or () */
|
||||
#define REG_BADRANGE 0x0004000 /* accept [m-a] ranges as [ma] */
|
||||
#define REG_ODDRANGE 0x0008000 /* oawk oddity: [m-a] means [m] */
|
||||
#define REG_SEPRANGE 0x0010000 /* disallow [a-m-z] style ranges */
|
||||
#define REG_BKTQUOTE 0x0020000 /* allow \ in []s to quote \, -, ^ or ] */
|
||||
#define REG_BKTEMPTY 0x0040000 /* allow empty []s (w/BKTQUOTE, BKTESCAPE) */
|
||||
#define REG_ANGLES 0x0080000 /* enable \<, \> operators */
|
||||
#define REG_ESCNL 0x0100000 /* take \n as newline character */
|
||||
#define REG_NLALT 0x0200000 /* take newline as alternation */
|
||||
#define REG_ESCSEQ 0x0400000 /* otherwise, take \ as start of C escapes */
|
||||
#define REG_BKTESCAPE 0x0800000 /* allow \ in []s to quote next anything */
|
||||
#define REG_NOBRACES 0x1000000 /* disable {n,m} */
|
||||
#define REG_ADDITIVE 0x2000000 /* a+*b means + and * additive, ^+ is valid */
|
||||
#define REG_NOI18N 0x4000000 /* disable I18N features ([::] etc.) */
|
||||
#define REG_OLDESC 0x8000000 /* recognize \b \f \n \r \t \123 only */
|
||||
#define REG_AVOIDNULL 0x10000000/* avoid null subexpression matches */
|
||||
#define REG_OLDBRE (REG_BADRANGE | REG_ANGLES | REG_ESCNL)
|
||||
#define REG_OLDERE (REG_OR | REG_PLUS | REG_QUEST | REG_NOBRACES | \
|
||||
REG_PARENS | REG_ANCHORS | REG_ODDRANGE | \
|
||||
REG_NOBACKREF | REG_ADDITIVE | REG_NOAUTOQUOTE)
|
||||
|
||||
/*
|
||||
* Error return values.
|
||||
*/
|
||||
#define REG_ENOSYS (-1) /* unsupported */
|
||||
#define REG_NOMATCH 1 /* regexec() failed to match */
|
||||
#define REG_BADPAT 2 /* invalid regular expression */
|
||||
#define REG_ECOLLATE 3 /* invalid collating element construct */
|
||||
#define REG_ECTYPE 4 /* invalid character class construct */
|
||||
#define REG_EEQUIV 5 /* invalid equivalence class construct */
|
||||
#define REG_EBKTCHAR 6 /* invalid character in [] construct */
|
||||
#define REG_EESCAPE 7 /* trailing \ in pattern */
|
||||
#define REG_ESUBREG 8 /* number in \digit invalid or in error */
|
||||
#define REG_EBRACK 9 /* [] imbalance */
|
||||
#define REG_EMPTYSUBBKT 10 /* empty sub-bracket construct */
|
||||
#define REG_EMPTYPAREN 11 /* empty \(\) or () [REG_MTPARENBAD] */
|
||||
#define REG_NOPAT 12 /* no (empty) pattern */
|
||||
#define REG_EPAREN 13 /* \(\) or () imbalance */
|
||||
#define REG_EBRACE 14 /* \{\} or {} imbalance */
|
||||
#define REG_BADBR 15 /* contents of \{\} or {} invalid */
|
||||
#define REG_ERANGE 16 /* invalid endpoint in expression */
|
||||
#define REG_ESPACE 17 /* out of memory */
|
||||
#define REG_BADRPT 18 /* *,+,?,\{\} or {} not after r.e. */
|
||||
#define REG_BADESC 19 /* invalid escape sequence (e.g. \0) */
|
||||
#define REG_ILLSEQ 20 /* illegal byte sequence */
|
||||
|
||||
typedef struct
|
||||
{
|
||||
size_t re_nsub; /* only advertised member */
|
||||
unsigned long re_flags; /* augmented regcomp() flags */
|
||||
struct re_dfa_ *re_dfa; /* DFA engine */
|
||||
struct re_nfa_ *re_nfa; /* NFA engine */
|
||||
struct re_coll_ *re_col; /* current collation info */
|
||||
int re_mb_cur_max; /* MB_CUR_MAX acceleration */
|
||||
void *re_more; /* just in case... */
|
||||
} regex_t;
|
||||
|
||||
typedef ssize_t regoff_t;
|
||||
|
||||
typedef struct
|
||||
{
|
||||
regoff_t rm_so;
|
||||
regoff_t rm_eo;
|
||||
} regmatch_t;
|
||||
|
||||
#ifdef __cplusplus
|
||||
extern "C" {
|
||||
#endif
|
||||
|
||||
int regcomp(regex_t *, const char *, int);
|
||||
int regexec(const regex_t *, const char *, size_t, regmatch_t *, int);
|
||||
size_t regerror(int, const regex_t *, char *, size_t);
|
||||
void regfree(regex_t *);
|
||||
|
||||
#ifdef __cplusplus
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif /* !LIBUXRE_REGEX_H */
|
||||
68
libuxre/regexec.c
Normal file
68
libuxre/regexec.c
Normal file
@@ -0,0 +1,68 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regexec.c 1.7 (gritter) 2/6/05
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include "re.h"
|
||||
|
||||
/* #pragma weak regexec = _regexec */
|
||||
|
||||
int
|
||||
regexec(const regex_t *ep, const char *s, size_t n, regmatch_t *mp, int flg)
|
||||
{
|
||||
Exec ex;
|
||||
int ret;
|
||||
|
||||
ex.flags = flg | (ep->re_flags & (REG_NEWLINE|REG_ICASE|REG_AVOIDNULL));
|
||||
ex.str = (const unsigned char *)s;
|
||||
ex.match = mp;
|
||||
ex.mb_cur_max = ep->re_mb_cur_max;
|
||||
if ((ex.nmatch = n) != 0) /* impose limits from compile flags */
|
||||
{
|
||||
if (ep->re_flags & REG_NOSUB)
|
||||
n = ex.nmatch = 0;
|
||||
else if (ep->re_flags & REG_ONESUB)
|
||||
ex.nmatch = 1;
|
||||
else if (n > ep->re_nsub + 1)
|
||||
ex.nmatch = ep->re_nsub + 1;
|
||||
}
|
||||
if (ep->re_flags & REG_DFA && ex.nmatch <= 1)
|
||||
ret = libuxre_regdfaexec(ep->re_dfa, &ex);
|
||||
else
|
||||
ret = libuxre_regnfaexec(ep->re_nfa, &ex);
|
||||
/*
|
||||
* Fill unused part of mp[].
|
||||
*/
|
||||
if (ret != 0)
|
||||
ex.nmatch = 0;
|
||||
while (n > ex.nmatch)
|
||||
{
|
||||
n--;
|
||||
mp[n].rm_so = -1;
|
||||
mp[n].rm_eo = -1;
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
42
libuxre/regfree.c
Normal file
42
libuxre/regfree.c
Normal file
@@ -0,0 +1,42 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)regfree.c 1.3 (gritter) 9/22/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
|
||||
/* #include "synonyms.h" */
|
||||
#include "re.h"
|
||||
|
||||
/* #pragma weak regfree = _regfree */
|
||||
|
||||
void
|
||||
regfree(regex_t *ep)
|
||||
{
|
||||
if (ep->re_flags & REG_DFA)
|
||||
libuxre_regdeldfa(ep->re_dfa);
|
||||
if (ep->re_flags & REG_NFA)
|
||||
libuxre_regdelnfa(ep->re_nfa);
|
||||
if (ep->re_col != 0)
|
||||
(void)libuxre_lc_collate(ep->re_col);
|
||||
}
|
||||
1070
libuxre/regnfa.c
Normal file
1070
libuxre/regnfa.c
Normal file
File diff suppressed because it is too large
Load Diff
1091
libuxre/regparse.c
Normal file
1091
libuxre/regparse.c
Normal file
File diff suppressed because it is too large
Load Diff
97
libuxre/stubs.c
Normal file
97
libuxre/stubs.c
Normal file
@@ -0,0 +1,97 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)stubs.c 1.24 (gritter) 10/12/04
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
/* stubbed-out routines needed to complete the RE libc code */
|
||||
|
||||
#include "colldata.h"
|
||||
|
||||
struct lc_collate *
|
||||
libuxre_lc_collate(struct lc_collate *cp)
|
||||
{
|
||||
static struct lc_collate curinfo = {0}; /* means CHF_ENCODED */
|
||||
|
||||
return &curinfo;
|
||||
}
|
||||
|
||||
#include "wcharm.h"
|
||||
|
||||
LIBUXRE_STATIC int
|
||||
libuxre_mb2wc(w_type *wt, const unsigned char *s)
|
||||
{
|
||||
wchar_t wc;
|
||||
int len;
|
||||
|
||||
if ((len = mbtowc(&wc, (const char *)&s[-1], MB_LEN_MAX)) > 0)
|
||||
*wt = wc;
|
||||
else if (len == 0)
|
||||
*wt = '\0';
|
||||
else /*if (len < 0)*/
|
||||
*wt = (w_type)WEOF;
|
||||
return len > 0 ? len - 1 : len;
|
||||
}
|
||||
|
||||
#if __GNUC__ >= 3 && __GNUC_MINOR__ >= 4
|
||||
#define USED __attribute__ ((used))
|
||||
#elif defined __GNUC__
|
||||
#define USED __attribute__ ((unused))
|
||||
#else
|
||||
#define USED
|
||||
#endif
|
||||
static const char sccsid[] USED = "@(#)libuxre.sl 1.24 (gritter) 10/12/04";
|
||||
/*
|
||||
_collelem.c:
|
||||
_collelem.c 1.4 (gritter) 10/18/03
|
||||
_collmult.c:
|
||||
_collmult.c 1.4 (gritter) 9/22/03
|
||||
bracket.c:
|
||||
bracket.c 1.14 (gritter) 10/18/03
|
||||
colldata.h:
|
||||
colldata.h 1.4 (gritter) 10/18/03
|
||||
onefile.c:
|
||||
onefile.c 1.1 (gritter) 9/22/03
|
||||
re.h:
|
||||
re.h 1.14 (gritter) 10/18/03
|
||||
regcomp.c:
|
||||
regcomp.c 1.6 (gritter) 9/22/03
|
||||
regdfa.c:
|
||||
regdfa.c 1.9 (gritter) 9/22/03
|
||||
regdfa.h:
|
||||
regdfa.h 1.3 (gritter) 9/22/03
|
||||
regerror.c:
|
||||
regerror.c 1.4 (gritter) 3/29/03
|
||||
regex.h:
|
||||
regex.h 1.12 (gritter) 9/22/03
|
||||
regexec.c:
|
||||
regexec.c 1.6 (gritter) 9/22/03
|
||||
regfree.c:
|
||||
regfree.c 1.3 (gritter) 9/22/03
|
||||
regnfa.c:
|
||||
regnfa.c 1.7 (gritter) 9/22/03
|
||||
regparse.c:
|
||||
regparse.c 1.12 (gritter) 9/22/03
|
||||
wcharm.h:
|
||||
wcharm.h 1.12 (gritter) 10/18/03
|
||||
*/
|
||||
63
libuxre/wcharm.h
Normal file
63
libuxre/wcharm.h
Normal file
@@ -0,0 +1,63 @@
|
||||
/*
|
||||
* Changes by Gunnar Ritter, Freiburg i. Br., Germany, November 2002.
|
||||
*
|
||||
* Sccsid @(#)wcharm.h 1.12 (gritter) 10/18/03
|
||||
*/
|
||||
/* UNIX(R) Regular Expresssion Library
|
||||
*
|
||||
* Note: Code is released under the GNU LGPL
|
||||
*
|
||||
* Copyright (C) 2001 Caldera International, Inc.
|
||||
*
|
||||
* This library is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU Lesser General Public
|
||||
* License as published by the Free Software Foundation; either
|
||||
* version 2 of the License, or (at your option) any later version.
|
||||
*
|
||||
* This library is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
* Lesser General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU Lesser General Public
|
||||
* License along with this library; if not, write to:
|
||||
* Free Software Foundation, Inc.
|
||||
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
*/
|
||||
/* Stubbed-out wide character locale information */
|
||||
|
||||
#ifndef LIBUXRE_WCHARM_H
|
||||
#define LIBUXRE_WCHARM_H
|
||||
|
||||
#ifndef LIBUXRE_STATIC
|
||||
#define LIBUXRE_STATIC
|
||||
#endif
|
||||
|
||||
#ifndef LIBUXRE_WUCHAR_T
|
||||
#define LIBUXRE_WUCHAR_T
|
||||
typedef unsigned int wuchar_type;
|
||||
#endif
|
||||
|
||||
#ifndef LIBUXRE_W_TYPE
|
||||
#define LIBUXRE_W_TYPE
|
||||
typedef int w_type;
|
||||
#endif
|
||||
|
||||
#include <wchar.h>
|
||||
#include <wctype.h>
|
||||
#include <stdlib.h>
|
||||
|
||||
#ifdef notdef
|
||||
#define ISONEBYTE(ch) ((ch), 1)
|
||||
|
||||
#define libuxre_mb2wc(wp, cp) ((wp), (cp), 0)
|
||||
#endif /* notdef */
|
||||
|
||||
#define ISONEBYTE(ch) (((ch) & 0200) == 0 || mb_cur_max == 1)
|
||||
|
||||
#define to_lower(ch) (mb_cur_max > 1 ? towlower(ch) : tolower(ch))
|
||||
#define to_upper(ch) (mb_cur_max > 1 ? towupper(ch) : toupper(ch))
|
||||
|
||||
LIBUXRE_STATIC int libuxre_mb2wc(w_type *, const unsigned char *);
|
||||
|
||||
#endif /* !LIBUXRE_WCHARM_H */
|
||||
Reference in New Issue
Block a user