Added mathsym+tcc and boost against all advice

This commit is contained in:
maartenkeijzer 2005-10-06 12:13:53 +00:00
commit 90702a435d
136 changed files with 14409 additions and 0 deletions

340
eo/contrib/mathsym/COPYING Normal file
View file

@ -0,0 +1,340 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

View file

@ -0,0 +1,87 @@
COMPILEFLAGS=-Wno-deprecated -g -Wall -Wshadow #-DINTERVAL_DEBUG
OPTFLAGS= -O2 -DNDEBUG
PROFILE_FLAGS=#-pg
INCLUDES=-I. -Isym -Ifun -Igen -Ieval -Iregression -I../../src -Ieo_interface -I..
CPPFLAGS=$(COMPILEFLAGS) $(OPTFLAGS) $(INCLUDES) $(PROFILE_FLAGS) -D__I386__ -DSIZEOF_UNSIGNED_LONG=4
EXTLIBS=tcc/libtcc.a tcc/libtcc1.a ../../src/libeo.a ../../src/utils/libeoutils.a
LIBS=${EXTLIBS} -ldl
SYMLIB=libsym.a
VPATH=sym fun gen eval regression eo_interface
CXXSOURCES=FunDef.cpp Sym.cpp SymImpl.cpp SymOps.cpp sym_compile.cpp TreeBuilder.cpp LanguageTable.cpp\
Dataset.cpp ErrorMeasure.cpp Scaling.cpp TargetInfo.cpp BoundsCheck.cpp util.cpp NodeSelector.cpp\
eoSymCrossover.cpp
PROGRAMS=test/test test/test_compile test/testeo
OBJS= $(CXXSOURCES:.cpp=.o) c_compile.o
all: tcc/ symreg
include $(CXXSOURCES:.cpp=.d) symreg.d
clean:
rm *.o *.d $(PROGRAMS) $(SYMLIB) symreg || true
distclean: clean
rm -rf tcc
symreg: libsym.a symreg.o $(EXTLIBS)
$(CXX) -o symreg symreg.o libsym.a $(LIBS) $(PROFILE_FLAGS)
libsym.a: $(OBJS)
rm libsym.a; ar cq $(SYMLIB) $(OBJS)
check: $(PROGRAMS)
test/test && test/test_compile && test/testeo && echo "all tests succeeded"
test/test: test/test.o ${SYMLIB}
$(CXX) -o test/test test/test.o ${SYMLIB} ${LIBS}
test/test_compile: test/test_compile.o ${SYMLIB}
$(CXX) -o test/test_compile test/test_compile.o $(SYMLIB) ${LIBS}
test/testeo: test/testeo.o ${SYMLIB}
$(CXX) -o test/testeo test/testeo.o $(SYMLIB) ${LIBS}
# eo
../../src/libeo.a:
make -C ../../src
../../src/utils/libeoutils.a:
make -C ../../src/utils
# tiny cc
tcc/: tcc.tar.gz
tar xvfz tcc.tar.gz && cd tcc && ./configure && make
tcc/Makefile: tcc/
cd tcc && ./configure
tcc/libtcc.a: tcc/Makefile
make -Ctcc
tcc/libtcc1.a: tcc/Makefile
make -Ctcc
#rules
c_compile.o: eval/c_compile.c
gcc -c eval/c_compile.c -I./tcc $(COMPILEFLAGS) $(OPTFLAGS)
%.o:%.cpp
$(CXX) -o $@ -c $< $(CPPFLAGS) $(INCLUDE)
%.d: %.cpp
$(SHELL) -ec '$(CXX) -M $(CPPFLAGS) $< | sed "s/$*.o/& $@/g" > $@ '
%.d: %.c
$(SHELL) -ec '$(CXX) -M $(CPPFLAGS) $< | sed "s/$*.o/& $@/g" > $@ '

172
eo/contrib/mathsym/README Normal file
View file

@ -0,0 +1,172 @@
This is not yet another gp system (nyagp). For one, it is not general.
It does one thing, find mathematical functions, and tries to do that well.
So, if you're trying to steer ants on various New Mexico trails, or build your
own tiny block world, you're in the wrong place. However, if you're interested
in finding mathematical functions either through direct application on data or
running it through a simulator, you might find what you're looking for here.
=== Representation (sym/ + gen/) ========
Mathsym has a few interesting characteristics. First and foremost is the
basic representation. It uses trees, but these trees are stored in a
reference counted hashtable. This means that every subtree that is alive
is stored once and only once. The reference counting mechanism takes care
of memory management.
The idea of using a hashtable (for offline analysis) comes from Walter Tackett, in his
1994 dissertation. The current system is just a real-time implementation of this
idea, adding the reference counting for ease of use.
The hashtable brings overhead. It's still pretty fast, but a string based representation
would run rounds around it. However, by virtue of it storing every subtree only once, it
is fairly tight on memory. This helps tremendously when confronted with growing populations, bloat.
The hashtable implementation can not stop bloat, but does make it more manageable. In a typical
GP run, the number of distinct subtrees is only 10-20% of the total number of subtrees.
Other advantages of the hashtable are in the ability to examine the run more thoroughly. It is easy
to check how many subtrees are present in the system, and for each subtree you can check the reference
count.
The basic tree is called a Sym. A Sym is simply a tree, and has children, accessible through args().
A Sym simply contains an iterator (== decorated pointer) to an entry in the hashtable.
Every time you create a Sym, it is either looked up in the hashtable or added to the hashtable.
A Sym has several members: size, depth, args, etc. One interesting member is the refcount().
This returns the reference count of the Sym in the hashtable, and thus returns the number
of distinct contexts in which the Sym is used.
Another nice thing of these hashtable Syms is that a check for equality reduces to a pointer comparison.
The Sym nodes are identified by a simple token, of type token_t (usually an unsigned int). It
is completely generic and could conceivably be adapted to steer ants. The rest of the library
is however targeted at mathematical functions purely.
sym/Sym.h is the file to look into for the functionality provided by Sym. The sym/ directory
is where the source files are stored that are relevant for the generic Sym functionality. The
'gen/' directory contains some generic functionality to build and traverse trees, independent of
the function and terminal set.
The file sym/README.cpp documents the use of the sym library for general GP use.
=== Function Set (fun/) ===
The standard GP function set of binary functions: addition, multiplication, subtraction and
division is NOT supported.
What is however supported are the functions of:
summation: arbitrary arity, arity zero meaning 0.0. Arity 2 is standard addition
product: arbitrary arity, arity zero meaning 1.0. Arity 2 is standard multiplication
inversion: 1.0 / x. Only arity 1
unary minus: -x. Only arity 1
Plus a whole bunch of other functions (see "fun/FunDef.h")
The reason for this is the observation (actually from a friend of mine, thanks Luuk),
that this set of functions is complete and slightly more orthogonal than a binary set.
The directory 'fun' contains the functionality for the function and terminal set, together
with ERC's etc. fun/FunDef.cpp contains the definition of the functionality. Stuff can be
added here, but best to contact me if you miss particular functions.
=== Evaluation (eval/) ===
The second important thing is evaluation. Although Syms can be evaluated through an interpreter,
this is not the fastest way to go about with it. The standard way of evaluating a Sym is to
first *compile* it to a function, and then run it in your favourite environment. Compilation
is done through the use of the excellent tinycc compiler, which is blazingly fast and produces
pretty good functions.
Compilation comes in several flavours: compile a single function and retrieve a pointer to a function
of signature:
double func(const double* x);
where x is the input array. Another option is to compile a bunch of functions in one go, and retrieve an array
of such function pointers. The Syms are simply printed and compiled. An example:
double func(const double* x) { return x*x + x * 1./x; }
The batch version proceeds significantly more quickly than calling compile every time. The function pointers
can be given to a simulation for extremely quick evaluation.
A third option is to compile a complete population in one go, and return a single pointer of signature
void func(const double* x, double* y);
Where 'y' is the (preallocated) output array. This allows to evaluate a complete population in one function
call, storing the results in 'y'. It uses the hashtable to store every calculation only once. An example
for the two function x*x + x*1./x and x + sin(x*x) is:
void func(const double* x, double* y) {
double a0 = x;
double a1 = a0 * a0;
double a2 = 1.0;
double a3 = a2 / a0;
double a4 = a2 * a3;
y[0] = a4;
double a5 = sin(a1);
double a6 = a0 + a5;
y[1] = a6;
}
This is the fastest way to evaluate even humongous populations quickly. You might be surprised at
the amount of code re-use in a GP population.
The three compilation functions can be found in eval/sym_compile.h
A limiting factor in tinycc is that the struct TCCState that is used to hold the compilation context,
is not really self-contained. This unfortunately means that with every call to 'compile' ALL previous
pointers that have been produced are invalidated. I'm still looking at ways to circumvent this.
To work with mathsym, a few small changes in tccelf.c were necessary, check README.TCC for details.
=== Interval Arithmetic (eval/) ===
GP is pretty good at finding mathematical expressions that are numerically unsound. Take for instance
the function '1 / x'. This is well defined only when x is strictly positive, but will lead to problems
when x equals 0. The standard answer is to define some pseudo-arithmetical function called 'protected
division' that will return some value (usually 1) when a division by zero occurs. This leads to a number
of protected functions (sqrt, log, tan, etc.) which all need to be protected. Interpreting results from
GP using such functions is in general hard.
Interval arithmetic (through another excellent library boost/numeric/interval) is used to calculate
if particular functions can conceivably produce problems. This completely annihilates the use for Koza-style
protected operators and is a more safe and sound method. For interval arithmetic to function, the bounds
on the input variables need to be known. As for every function we can calculate a guarenteed,
though not necessarily tight, output interval given the input intervals, we can check arbitrary functions
for possible problems. If, for example for division, the input interval contains 0, we know that a division
by zero is theoretically possible. It's then best to throw away the entire function.
Interval Arithmetic is accessible through the class IntervalBoundsCheck (eval/BoundsCheck.h)
=== More generic support (gen/) ===
The gen subdirectory contains some general utility classes for defining function sets and for
creating trees. The idea is that these functions are generic and only append on the sym/ part
of the library. Unfortunately, the language table currently needs an ERC function, a default
implementation is hidden inside fun/FunDef.cpp. Will fix at some point.
gen/LanguageTable.cpp -> defines the functions/terminals that can be used
gen/TreeBuilder.cpp -> can create trees based on a LanguageTable
=== Data and Errors (regression/) ===
The above classes are generic and apply for any type of problem where a mathematical function can be
used to steer some process, run a simulation, whatever. First check the intervals, then compile the
Sym(s) to a (set of) function pointer(s), and use the pointers in some way to evaluate for fitness.
One particular type of problem for which support is built in is 'symbolic regression'. This type of
problem involves finding an mathematical input/output relationship based on some data.
To enable this, regression/ introduces the class Dataset to contain the data and ErrorMeasure to calculate
error. Currently supported: mean squared error, mean absolute error and mean squared error scaled (proportional
to correlation squared). They use some helper classes such as Scaling and TargetInfo.
=== EO interface (eo_interface/) ===
Contains the classes to make it all work with EO. Check the root application 'symreg' for ways to use this

View file

@ -0,0 +1,5 @@
To refrain tcc from searching for libtcc1.a in the path, uncomment
out the lines looking for that in 'tccelf.c'. Search for libtcc1 and uncomment
these two lines. All should be well.

View file

@ -0,0 +1,62 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef EOSYM_H_
#define EOSYM_H_
#include <EO.h>
#include <Sym.h>
#include <FunDef.h>
template <class Fitness>
class EoSym : public EO<Fitness>, public Sym {
public:
void set(const Sym& sym) {
invalidate();
static_cast<Sym*>(this)->operator=(sym);
}
virtual void printOn(ostream& os) const;
virtual void readFrom(istream& is);
};
template <class Fitness>
void EoSym<Fitness>::printOn(ostream& os) const {
EO<Fitness>::printOn(os);
os << ' ';
write_raw(os, *this);
}
template <class Fitness>
void EoSym<Fitness>::readFrom(istream& is) {
EO<Fitness>::readFrom(is);
read_raw(is, *this);
}
template <class Fitness>
inline std::ostream& operator<<(std::ostream& os, const EoSym<Fitness>& f) { f.printOn(os); return os; }
template <class Fitness>
inline istream& operator>>(std::istream& is, EoSym<Fitness>& f) { f.readFrom(is); return os; }
#endif

View file

@ -0,0 +1,76 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <Sym.h>
#include <NodeSelector.h>
#include <eoSymCrossover.h>
#include <utils/eoRNG.h>
bool subtree_quad(Sym& a, Sym& b, NodeSelector& select) {
unsigned i = select.select_node(a);
unsigned j = select.select_node(b);
Sym aprime = insert_subtree(a, i, get_subtree(b, j));
Sym bprime = insert_subtree(b, j, get_subtree(a, i));
a = aprime;
b = bprime;
return true;
}
bool subtree_bin(Sym& a, const Sym& b, NodeSelector& select) {
unsigned i = select.select_node(a);
unsigned j = select.select_node(b);
a = insert_subtree(a, i, get_subtree(b,j));
return true;
}
Sym homologous_binimpl(Sym a, Sym b) {
bool use_a = rng.random(2);
token_t head = (use_a? a : b).token();
SymVec args = use_a?a.args() : b.args();
const SymVec& a_args = a.args();
const SymVec& b_args = b.args();
unsigned mn = std::min(a_args.size(), b_args.size());
bool changed = !use_a;
for (unsigned i = 0; i < mn; ++i) {
args[i] = homologous_binimpl(a_args[i], b_args[i]);
if (args[i] != a_args[i]) {
changed = true;
}
}
return changed? Sym(head, args) : a;
}
bool homologous_bin(Sym& a, const Sym& b) {
if (a==b) return false;
Sym org = a;
a = homologous_binimpl(a,b);
return org != a;
}

View file

@ -0,0 +1,64 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef EOSYMCROSSOVER_H
#define EOSYMCROSSOVER_H
class NodeSelector;
class Sym;
#include <eoOp.h>
extern bool subtree_quad(Sym& a, Sym& b, NodeSelector& select);
template <class EoType>
class eoQuadSubtreeCrossover : public eoQuadOp<EoType> {
NodeSelector& node_selector;
public:
eoQuadSubtreeCrossover(NodeSelector& _node_selector) : node_selector(_node_selector) {}
bool operator()(EoType& a, EoType& b) { return subtree_quad(a,b, node_selector); }
};
extern bool subtree_bin(Sym& a, const Sym& b, NodeSelector& select);
template <class EoType>
class eoBinSubtreeCrossover : public eoBinOp<EoType> {
NodeSelector& node_selector;
public :
eoBinSubtreeCrossover(NodeSelector& _node_selector) : node_selector(_node_selector) {}
bool operator()(EoType& a, const EoType& b) { return subtree_bin(a, b, node_selector); }
};
/** Yet another homologous crossover, afaik not particularly
* defined in the literature
*/
extern bool homologous_bin(Sym& a, const Sym& b);
template <class EoType>
class eoBinHomologousCrossover : public eoBinOp<EoType> {
public:
bool operator()(EoType& a, const EoType& b) {
return homologous_bin(a,b);
}
};
#endif

View file

@ -0,0 +1,94 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SYMEVAL_H
#define SYMEVAL_H
#include <Sym.h>
#include <ErrorMeasure.h>
#include <BoundsCheck.h>
#include <eoPopEvalFunc.h>
template <class EoType>
class eoSymPopEval : public eoPopEvalFunc<EoType> {
BoundsCheck& check;
ErrorMeasure& measure;
public:
eoSymPopEval(BoundsCheck& _check, ErrorMeasure& _measure) :
check(_check), measure(_measure) {}
/** apparently this thing works on two populations, why?
*
* In any case, currently only implemented the population wide
* evaluation version, as that one is much faster. This because the
* compile going on behind the scenes is much faster when done in one
* go (and using subtree similarity) then when done on a case by case
* basis.
*/
void operator()(eoPop<EoType>& p1, eoPop<EoType>& p2) {
std::vector<unsigned> unevaluated;
std::vector<Sym> tmppop;
for (unsigned i = 0; i < p1.size(); ++i) {
if (p1[i].invalid()) {
if (check.in_bounds(p1[i])) {
unevaluated.push_back(i);
tmppop.push_back( static_cast<Sym>(p1[i]) );
} else {
p1[i].fitness( measure.worst_performance() );
}
}
}
for (unsigned i = 0; i < p2.size(); ++i) {
if (p2[i].invalid()) {
if (check.in_bounds(p2[i])) {
unevaluated.push_back(p1.size() + i);
tmppop.push_back( static_cast<Sym>(p2[i]) );
} else {
p2[i].fitness( measure.worst_performance() ); // pretty bad error
}
}
}
std::vector<ErrorMeasure::result> result = measure.calc_error(tmppop);
for (unsigned i = 0; i < result.size(); ++i) {
unsigned idx = unevaluated[i];
if (idx < p1.size()) {
p1[idx].fitness(result[i].error);
} else {
idx -= p1.size();
p2[idx].fitness(result[i].error);
}
}
}
};
#endif

View file

@ -0,0 +1,64 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef EOSYMINIT_H
#define EOSYMINIT_H
#include <eoInit.h>
#include <TreeBuilder.h>
/** Default initializer, Koza style */
template <class EoType>
class eoSymInit : public eoInit<EoType> {
TreeBuilder& builder;
double own_grow_prob;
unsigned own_max_depth;
double& grow_prob;
unsigned& max_depth;
public:
/** By default build ramped half and half with max depth 6 */
eoSymInit(TreeBuilder& _builder)
: builder(_builder),
own_grow_prob(0.5),
own_max_depth(6),
grow_prob(own_grow_prob),
max_depth(own_max_depth)
{}
/** Control the grow_prob and max_depth externally */
eoSymInit(TreeBuilder& _builder, double& _grow_prob, unsigned& _max_depth)
: builder(_builder),
grow_prob(_grow_prob),
max_depth(_max_depth)
{}
/** build the tree */
void operator()(EoType& tree) {
int depth_to_use = rng.random(max_depth-2) + 2; // two levels minimum
builder.build_tree(tree, depth_to_use, rng.flip(grow_prob));
}
};
#endif

View file

@ -0,0 +1,122 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SYMMUTATE_H
#define SYMMUTATE_H
#include <TreeBuilder.h>
#include <NodeSelector.h>
#include <eoSym.h>
#include <eoOp.h>
template <class EoType>
class eoSymSubtreeMutate : public eoMonOp<EoType> {
TreeBuilder& subtree_builder;
NodeSelector& node_selector;
public :
eoSymSubtreeMutate(TreeBuilder& _subtree_builder, NodeSelector& _node_selector)
: subtree_builder(_subtree_builder), node_selector(_node_selector) {}
bool operator()(EoType& tomutate) {
unsigned xover_point = node_selector.select_node(tomutate);
// create subtree
Sym newtree = subtree_builder.build_tree(6, true); // TODO, parameterize
static_cast<Sym&>(tomutate) = insert_subtree(tomutate, xover_point, newtree);
return true;
}
};
/** Class for doing node mutation
* Two parameters:
*
* mutation_rate (the rate at which to do mutation)
* is_rate_absolute : don't rescale the rate to the size of the tree
*/
template <class EoType>
class eoSymNodeMutate : public eoMonOp<EoType> {
LanguageTable& table;
double own_mutation_rate;
bool own_is_rate_absolute;
// these two can (should?) move to an impl file
bool mutate(Sym& sym, double p) {
std::pair<Sym, bool> r = do_mutate(sym, p);
sym = r.first;
return r.second;
}
std::pair<Sym, bool> do_mutate(Sym sym, double p) {
bool changed = false;
SymVec args = sym.args();
if (rng.flip(p)) {
token_t new_token = table.get_random_function( args.size());
if (new_token != sym.token()) changed = true;
sym = Sym(new_token, args);
}
for (unsigned i = 0; i < args.size(); ++i) {
std::pair<Sym,bool> r = do_mutate(args[i], p);
changed |= r.second;
if (r.second)
args[i] = r.first;
}
if (changed)
return std::make_pair(Sym(sym.token(), args), true);
// else
return std::make_pair(sym, false);
}
public:
double& mutation_rate;
bool& is_rate_absolute;
eoSymNodeMutate(LanguageTable& _table)
: table(_table),
own_mutation_rate(1.0),
own_is_rate_absolute(false), // this means a probability of node mutation of 1/sym.size()
mutation_rate(own_mutation_rate),
is_rate_absolute(own_is_rate_absolute)
{}
eoSymNodeMutate(LanguageTable& _table, double& _mutation_rate, bool& _is_rate_absolute)
: table(_table),
mutation_rate(_mutation_rate),
is_rate_absolute(_is_rate_absolute)
{}
bool operator()(EoType& _eo) {
double p = mutation_rate;
if (!is_rate_absolute) p /= _eo.size();
return mutate(_eo, p);
}
};
#endif

View file

@ -0,0 +1,80 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <vector>
#include "BoundsCheck.h"
#include <Sym.h>
#include <FunDef.h>
#include <sstream>
using namespace std;
class IntervalBoundsCheckImpl {
public :
vector<Interval> bounds;
};
IntervalBoundsCheck::IntervalBoundsCheck(const vector<double>& mins, const vector<double>& maxes) {
pimpl = new IntervalBoundsCheckImpl;
vector<Interval>& b = pimpl->bounds;
b.resize( mins.size());
for (unsigned i = 0; i < b.size(); ++i) {
b[i] = Interval(mins[i], maxes[i]);
}
}
IntervalBoundsCheck::~IntervalBoundsCheck() { delete pimpl; }
IntervalBoundsCheck::IntervalBoundsCheck(const IntervalBoundsCheck& that) { pimpl = new IntervalBoundsCheckImpl(*that.pimpl); }
IntervalBoundsCheck& IntervalBoundsCheck::operator=(const IntervalBoundsCheck& that) { *pimpl = *that.pimpl; return *this; }
bool IntervalBoundsCheck::in_bounds(const Sym& sym) const {
Interval bounds;
try {
bounds = eval(sym, pimpl->bounds);
if (!valid(bounds)) return false;
} catch (interval_error) {
return false;
}
return true;
}
std::string IntervalBoundsCheck::get_bounds(const Sym& sym) const {
try {
Interval bounds = eval(sym, pimpl->bounds);
if (!valid(bounds)) return "err";
ostringstream os;
os << bounds;
return os.str();
} catch (interval_error) {
return "err";
}
}
std::pair<double, double> IntervalBoundsCheck::calc_bounds(const Sym& sym) const {
Interval bounds = eval(sym, pimpl->bounds);
return make_pair(bounds.lower(), bounds.upper());
}

View file

@ -0,0 +1,58 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef BOUNDS_CHECK_H_
#define BOUNDS_CHECK_H_
#include <string>
class IntervalBoundsCheckImpl;
class Sym;
class BoundsCheck {
public :
virtual ~BoundsCheck() {};
virtual bool in_bounds(const Sym&) const = 0;
virtual std::string get_bounds(const Sym&) const = 0;
};
// checks if a formula keeps within bounds using interval arithmetic
class IntervalBoundsCheck : public BoundsCheck {
IntervalBoundsCheckImpl* pimpl;
public:
IntervalBoundsCheck(const std::vector<double>& minima, const std::vector<double>& maxima);
~IntervalBoundsCheck();
IntervalBoundsCheck(const IntervalBoundsCheck&);
IntervalBoundsCheck& operator=(const IntervalBoundsCheck&);
bool in_bounds(const Sym&) const;
std::string get_bounds(const Sym&) const;
std::pair<double, double> calc_bounds(const Sym&) const;
};
class NoBoundsCheck : public BoundsCheck {
bool in_bounds(const Sym&) const { return false; }
std::string get_bounds(const Sym&) const { return ""; }
};
#endif

View file

@ -0,0 +1,115 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef INTERVAL__H__
#define INTERVAL__H__
#include <boost/numeric/interval.hpp>
#include <iostream>
#include <limits>
typedef boost::numeric::interval_lib::rounded_transc_exact<double> RoundingTransc;
typedef boost::numeric::interval_lib::save_state<RoundingTransc> Rounding;
typedef boost::numeric::interval_lib::checking_base<double> Checking;
typedef boost::numeric::interval_lib::policies<Rounding,Checking> Policy;
typedef boost::numeric::interval<double, Policy> Interval;
struct interval_error{};
inline bool valid(const Interval& val) {
if (!finite(val.lower()) || !finite(val.upper())) return false;
return val.lower() > -1e10 && val.upper() < 1e10;
}
inline Interval sqrt(const Interval& val) {
if (val.lower() < 0.0) {
return Interval::whole();
}
return boost::numeric::sqrt(val);
}
inline Interval sqr(const Interval& val) {
return square(val);
}
inline Interval acos(const Interval& val) {
if (val.lower() < 1.0 || val.upper() > 1.0) {
return Interval::whole();
}
return boost::numeric::acos(val);
}
inline Interval asin(const Interval& val) {
if (val.lower() < 1.0 || val.upper() > 1.0) {
return Interval::whole();
}
return boost::numeric::asin(val);
}
inline Interval acosh(const Interval& val) {
if (val.lower() < 1.0) return Interval::whole();
return boost::numeric::acosh(val);
}
inline
std::ostream& operator<<(std::ostream& os, const Interval& val) {
os << '[' << val.lower() << ", " << val.upper() << ']';
return os;
}
#ifdef TEST_INTERVAL
using namespace std;
using namespace boost::numeric;
int main() {
Interval a(0, 10);
Interval b(-1.5, 2);
cout << "a = " << a << endl;
cout << "b = " << b << endl;
cout << "a + b = " << a + b << endl;
cout << "a - b = " << a - b << endl;
cout << "b - a = " << b - a << endl;
cout << "-a = " << -a << endl;
cout << "a * b = " << a * b << endl;
cout << "b/(a+1) = " << b / (a + 1.0) << endl;
cout << "b * a = " << b * a << endl;
cout << "b / a = " << b/a << endl;
cout << "cos(a) = " << cos(a) << endl;
cout << "cos(b) = " << cos(b) << endl;
cout << "log(b) = " << log(b) << endl;
cout << "sqrt(b) = " << sqrt(b) << endl;
cout << "sqrt(a) = " << sqrt(a) << endl;
cout << "sqr(b) = " << sqr(b) << endl;
cout << "exp(b*a)= " << exp(b*a) << endl;
cout << "atan(a) = " << atan(a) << endl;
cout << "cosh(b) = " << cosh(b) << endl;
}
#endif
#endif

View file

@ -0,0 +1,40 @@
#include <stdio.h>
#include <libtcc.h>
#include <math.h>
static TCCState* s = 0;
extern void symc_init() {
if (s != 0) {
tcc_delete(s);
}
s = tcc_new();
if (s == 0) {
fprintf(stderr, "Tiny cc doesn't function properly");
exit(1);
}
}
extern void symc_compile(const char* func_str) {
//printf("Compiling %s\n", func_str);
tcc_compile_string(s, func_str);
}
extern void symc_link() {
tcc_relocate(s);
}
extern void* symc_get_fun(const char* func_name) {
unsigned long val;
tcc_get_symbol(s, &val, func_name);
return (void*) val;
}
extern void* symc_make(const char* func_str, const char* func_name) {
symc_init();
symc_compile(func_str);
symc_link();
return symc_get_fun(func_name);
}

View file

@ -0,0 +1,181 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#if __GNUC__ == 3
#include <backward/hash_map.h>
#else
#include <hash_map.h>
using std::hash_map;
#endif
#include "Sym.h"
#include "FunDef.h"
#include "sym_compile.h"
#include <sstream>
using namespace std;
extern "C" {
void symc_init();
void symc_compile(const char* func_str);
void symc_link();
void* symc_get_fun(const char* func_name);
void* symc_make(const char* func_str, const char* func_name);
}
string make_prototypes() {
string prot = get_prototypes();
prot += "double sqr(double x) { return x*x; }";
return prot;
}
// contains variable names, like 'a0', 'a1', etc.
typedef hash_map<Sym, string, HashSym> HashMap;
HashMap::iterator find_entry(Sym sym, ostream& os, HashMap& map) {
HashMap::iterator result = map.find(sym);
if (result == map.end()) { // new entry
const SymVec& args = sym.args();
vector<string> argstr(args.size());
for (unsigned i = 0; i < args.size(); ++i) {
argstr[i] = find_entry(args[i], os, map)->second;
}
unsigned current_entry = map.size(); // current number of variables defined
// write out the code
const FunDef& fun = get_element(sym.token());
string code = fun.c_print(argstr, vector<string>());
os << "double a" << current_entry << "=" << code << ";\n";
// insert variable ref in map
ostringstream str;
str << 'a' << current_entry;
result = map.insert( make_pair(sym, str.str()) ).first; // only want iterator
}
return result;
}
void write_entry(Sym sym, ostream& os, HashMap& map, unsigned out) {
HashMap::iterator it = find_entry(sym, os, map);
os << "y[" << out << "]=" << it->second << ";\n";
}
multi_function compile(const std::vector<Sym>& syms) {
ostringstream os;
os << make_prototypes();
os << "double func(const double* x, double* y) { \n ";
HashMap map;
for (unsigned i = 0; i < syms.size(); ++i) {
write_entry(syms[i], os, map, i);
}
os << ";}";
string func_str = os.str();
//cout << "compiling " << func_str << endl;
return (multi_function) symc_make(func_str.c_str(), "func");
}
single_function compile(Sym sym) {
ostringstream os;
os << make_prototypes();
os << "double func(const double* x) { return ";
string code = c_print(sym);
os << code;
os << ";}";
string func_str = os.str();
//cout << "compiling " << func_str << endl;
return (single_function) symc_make(func_str.c_str(), "func");
}
/* finds and inserts the full code in a hashmap */
HashMap::iterator find_code(Sym sym, HashMap& map) {
HashMap::iterator result = map.find(sym);
if (result == map.end()) { // new entry
const SymVec& args = sym.args();
vector<string> argstr(args.size());
for (unsigned i = 0; i < args.size(); ++i) {
argstr[i] = find_code(args[i], map)->second;
}
// write out the code
const FunDef& fun = get_element(sym.token());
string code = fun.c_print(argstr, vector<string>());
result = map.insert( make_pair(sym, code) ).first; // only want iterator
}
return result;
}
string print_code(Sym sym, HashMap& map) {
HashMap::iterator it = find_code(sym, map);
return it->second;
}
void compile(const std::vector<Sym>& syms, std::vector<single_function>& functions) {
symc_init();
ostringstream os;
os << make_prototypes();
HashMap map;
for (unsigned i = 0; i < syms.size(); ++i) {
os << "double func" << i << "(const double* x) { return ";
os << print_code(syms[i], map); //c_print(syms[i]);
os << ";}\n";
//symc_compile(os.str().c_str());
//cout << "compiling " << os.str() << endl;
}
#ifdef INTERVAL_DEBUG
//cout << "Compiling " << os.str() << endl;
#endif
symc_compile(os.str().c_str());
symc_link();
functions.resize(syms.size());
for (unsigned i = 0; i < syms.size(); ++i) {
ostringstream os2;
os2 << "func" << i;
functions[i] = (single_function) symc_get_fun(os2.str().c_str());
}
}

View file

@ -0,0 +1,36 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SYMCOMPILE_H_
#define SYMCOMPILE_H_
#include <vector>
typedef double (*single_function)(const double *);
typedef void (*multi_function)(const double*, double*); // last argument is output
/*
* Important, after every call of the functions below, the function pointers of the previous
* call are invalidated. Sorry, but that's the way the cookie crumbles (in tcc)
* */
single_function compile(Sym sym);
multi_function compile(const std::vector<Sym>& sym);
void compile(const std::vector<Sym>& sym, std::vector<single_function>& functions);
#endif

View file

@ -0,0 +1,569 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <sstream>
#if __GNUC__ == 3
#include <backward/hash_map.h>
#else
#include <hash_map.h>
using std::hash_map;
#endif
#include "Sym.h"
#include "FunDef.h"
#include <LanguageTable.h>
using namespace std;
using namespace boost::numeric;
vector<const FunDef*> language;
token_t add_function(FunDef* function) {
language.push_back(function);
return token_t(language.size()-1);
}
const FunDef& get_element(token_t token) { return *language[token]; }
/* Printing */
string c_print(const Sym& sym) {
return c_print(sym, vector<string>());
}
string c_print(const Sym& sym, const vector<string>& vars) {
const SymVec& args = sym.args();
vector<string> names(args.size());
for (unsigned i = 0; i < args.size(); ++i) {
names[i] = c_print(args[i], vars);
}
return language[sym.token()]->c_print(names, vars);
}
/* Evaluation */
double eval(const Sym& sym, const std::vector<double>& inputs) {
return language[sym.token()]->eval(sym.args(), inputs);
}
/* Interval Logic */
Interval eval(const Sym& sym, const vector<Interval>& inputs) {
const SymVec& args = sym.args();
vector<Interval> interv(args.size());
for (unsigned i = 0; i < args.size(); ++i) {
interv[i] = eval(args[i], inputs);
if (!valid(interv[i])) throw interval_error();
}
return language[sym.token()]->eval(interv, inputs);
}
/* */
void add_function_to_table(LanguageTable& table, token_t token) {
const FunDef& fundef = *language[token];
if (fundef.has_varargs() == false) {
table.add_function(token, fundef.min_arity());
} else { // sum or prod (or min or max)
table.add_function(token, 2);
}
}
// by default it is eager
double FunDef::eval(const SymVec& args, const vector<double>& inputs) const {
vector<double> values(args.size());
for (unsigned i = 0; i < args.size(); ++i) {
values[i] = ::eval(args[i], inputs);
}
return eval(values, inputs);
}
/* Variable Handling */
FunDef* make_var(int idx); // defined in FunDefs.h
static vector<token_t> var_token;
Sym SymVar(unsigned idx) {
if (var_token.size() <= idx) {
// it is new
var_token.resize(idx+1, token_t(-1));
var_token[idx] = add_function( make_var(idx) );
} else if (var_token[idx] == token_t(-1)) {
var_token[idx] = add_function( make_var(idx) );
}
return Sym(var_token[idx]);
}
/* Constant Handling */
struct HashDouble{
size_t operator()(double val) const {
unsigned long h = 0;
char* s = (char*)&val;
for (unsigned i=0 ; i<sizeof(double); ++i)
h = 5*h + s[i];
return size_t(h);
}
};
typedef hash_map<double, token_t, HashDouble> DoubleSet;
static DoubleSet doubleSet; // for quick checking if a constant already exists
static vector<bool> is_constant;
static vector<double> token_value;
static std::vector<token_t> free_list;
void delete_val(token_t token) { // clean up the information about this constant
if (is_constant[token]) {
double value = token_value[token];
delete language[token];
language[token] = 0;
doubleSet.erase(value);
free_list.push_back(token);
}
}
FunDef* make_const(double value);
Sym SymConst(double value) {
Sym::set_extra_dtor(delete_val);
DoubleSet::iterator it = doubleSet.find(value);
if (it != doubleSet.end()) {
return Sym(it->second);
}
if (free_list.empty()) { // make space for tokens;
unsigned sz = language.size();
language.resize(sz + sz+1); // double
is_constant.resize(language.size(), false);
token_value.resize(language.size(), 0.0);
for (unsigned i = sz; i < language.size(); ++i) {
free_list.push_back(i);
}
}
token_t token = free_list.back();
free_list.pop_back();
assert(language[token] == 0);
language[token] = make_const(value);
doubleSet[value] = token;
is_constant[token] = true;
token_value[token] = value;
return Sym(token);
}
/* LanguageTable depends on this one, XXX move somewhere safe.*/
#include <utils/eoRNG.h>
extern Sym default_const() { return SymConst(rng.normal()); }
/* The functions */
class Var : public FunDef {
public :
int idx;
string default_str;
Var(int _idx) : idx(_idx) {
ostringstream os;
os << "x[" << idx << ']'; // CompiledCode expects this form
default_str = os.str();
}
double eval(const vector<double>& _, const vector<double>& inputs) const { return inputs[idx]; }
double eval(const SymVec& _, const vector<double>& inputs) const { return inputs[idx]; }
string c_print(const vector<string>& _, const vector<string>& names) const {
if (names.empty()) {
return default_str;
}
return names[idx];
}
Interval eval(const vector<Interval>& _, const vector<Interval>& inputs) const {
return inputs[idx];
}
unsigned min_arity() const { return 0; }
string name() const { return "var"; }
};
class Const : public FunDef {
private:
double value;
string value_str;
public:
Const(double _value) : value(_value) {
ostringstream os;
os.precision(17);
os.setf(ios::showpoint);
os << '(' << value << ')';
value_str = os.str();
}
double eval(const vector<double>& _, const vector<double>& inputs) const { return value; }
double eval(const SymVec& _, const vector<double>& inputs) const { return value; }
string c_print(const vector<string>& _, const vector<string>& names) const {
return value_str;
}
Interval eval(const vector<Interval>& _, const vector<Interval>& inputs) const {
// Profil/Bias seems to have a problem with 0 * inf when the Interval is exact zero (fpe)
//if (value == 0.0) return Interval(-BiasEpsilon,BiasEpsilon);
return Interval(value);
}
unsigned min_arity() const { return 0; }
string name() const { return "parameter"; }
};
// Get functions out, excluding Const and Var
vector<const FunDef*> get_defined_functions() {
vector<const FunDef*> res;
for (unsigned i = 0; i < language.size(); ++i) {
res.push_back(language[i]);
if (dynamic_cast<const Const*>(language[i]) != 0 || (dynamic_cast<const Var*>(language[i]) != 0) ) {
res.back() = 0; // erase
}
}
return res;
}
class Sum : public FunDef {
public :
double eval(const vector<double>& vals, const vector<double>& _) const {
double res = 0;
for (unsigned i = 0; i < vals.size(); ++i) res += vals[i];
return res;
}
string c_print(const vector<string>& args, const vector<string>& _) const {
if (args.empty()) { return "0.0"; }
ostringstream os;
os << "(" << args[0];
for (unsigned i = 1; i < args.size(); ++i) {
os << "+" << args[i];
}
os << ")";
return os.str();
}
Interval eval(const vector<Interval>& args, const vector<Interval>& inputs) const {
Interval interv(0.0); //(0.0-BiasEpsilon, 0.0+BiasEpsilon); // Profil/Bias seems to have a problem with 0 * inf when the Interval is exact zero (fpe)
for (unsigned i = 0; i < args.size(); ++i) {
Interval a = args[i]; // Profil doesn't know much about const correctness
interv += a;
}
return interv;
}
unsigned min_arity() const { return 0; }
bool has_varargs() const { return true; }
string name() const { return "sum"; }
};
class Prod : public FunDef {
public :
double eval(const vector<double>& vals, const vector<double>& _) const {
double res = 1;
for (unsigned i = 0; i < vals.size(); ++i) res *= vals[i];
return res;
}
string c_print(const vector<string>& args, const vector<string>& _) const {
if (args.empty()) { return "1.0"; }
ostringstream os;
os << "(" << args[0];
for (unsigned i = 1; i < args.size(); ++i) {
os << "*" << args[i];
}
os << ")";
return os.str();
}
Interval eval(const vector<Interval>& args, const vector<Interval>& inputs) const {
Interval interv(1.0);
for (unsigned i = 0; i < args.size(); ++i) {
Interval a = args[i]; // Profil doesn't know much about const correctness
interv *= a;
}
return interv;
}
unsigned min_arity() const { return 0; }
bool has_varargs() const { return true; }
string name() const { return "prod"; }
};
class Power : public FunDef {
public :
double eval(const vector<double>& vals, const vector<double>& _) const {
return pow(vals[0], vals[1]);
}
string c_print(const vector<string>& args, const vector<string>& _) const {
return "pow(" + args[0] + ',' + args[1] + ')';
}
Interval eval(const vector<Interval>& args, const vector<Interval>& _) const {
Interval first = args[0];
Interval second = args[1];
Interval lg = log(first);
if (!valid(lg)) throw interval_error();
return exp(second * lg);
}
unsigned min_arity() const { return 2; }
string name() const { return "pow"; }
};
class IsNeg : public FunDef {
public:
double eval(const vector<double>& vals, const vector<double>& _) const {
if (vals[0] < 0.0) return vals[1];
return vals[2];
}
double eval(const Sym& sym, const vector<double>& inputs) const {
const SymVec& args = sym.args();
double arg0 = ::eval(args[0], inputs);
if (arg0 < 0.0) {
return ::eval(args[1], inputs);
}
return ::eval(args[2], inputs);
}
string c_print(const vector<string>& args, const vector<string>& _) const {
return "((" + args[0] + "<0.0)?" + args[1] + ":" + args[2]+")";
}
Interval eval(const vector<Interval>& args, const vector<Interval>& _) const {
Interval a0 = args[0];
if (a0.upper() < 0.0) return args[1];
if (a0.lower() >= 0.0) return args[2];
return Interval( std::min(args[1].lower(), args[2].lower()), std::max(args[1].upper(), args[2].upper()));
}
unsigned min_arity() const { return 3; }
string name() const { return "ifltz"; }
};
template <typename Func>
class Unary : public FunDef {
Func un;
double eval(const vector<double>& vals, const vector<double>& _) const {
return un(vals[0]);
}
string c_print(const vector<string>& args, const vector<string>& _) const {
return un(args[0]);
}
Interval eval(const vector<Interval>& args, const vector<Interval>& _) const {
return un(args[0]);
}
unsigned min_arity() const { return 1; }
string name() const { return un.name(); }
};
struct Inv {
double operator()(double val) const { return 1.0 / val; }
string operator()(string v) const { return "(1./" + v + ")"; }
Interval operator()(Interval v) const { return 1.0 / v; }
string name() const { return "inv"; }
};
struct Min {
double operator()(double val) const { return -val; }
string operator()(string v) const { return "(-" + v + ")"; }
Interval operator()(Interval v) const { return -v; }
string name() const { return "min"; }
};
string prototypes = "double pow(double, double);";
string get_prototypes() { return prototypes; }
unsigned add_prototype(string str) { prototypes += string("double ") + str + "(double);"; return prototypes.size(); }
#define FUNCDEF(funcname) struct funcname##_struct { \
double operator()(double val) const { return funcname(val); }\
string operator()(string val) const { return string(#funcname) + '(' + val + ')'; }\
Interval operator()(Interval val) const { return funcname(val); }\
string name() const { return string(#funcname); }\
};\
const token_t funcname##_token = add_function( new Unary<funcname##_struct>);\
unsigned funcname##_size = add_prototype(#funcname);
FunDef* make_var(int idx) { return new Var(idx); }
FunDef* make_const(double value) { return new Const(value); }
const token_t sum_token = add_function( new Sum );
const token_t prod_token = add_function( new Prod);
const token_t inv_token = add_function( new Unary<Inv>);
const token_t min_token = add_function( new Unary<Min>);
const token_t pow_token = add_function( new Power);
const token_t ifltz_token = add_function( new IsNeg);
FUNCDEF(sin);
FUNCDEF(cos);
FUNCDEF(tan);
FUNCDEF(asin);
FUNCDEF(acos);
FUNCDEF(atan);
FUNCDEF(sinh);
FUNCDEF(cosh);
FUNCDEF(tanh);
FUNCDEF(asinh);
FUNCDEF(acosh);
FUNCDEF(atanh);
FUNCDEF(exp);
FUNCDEF(log);
double sqr(double x) { return x*x; }
FUNCDEF(sqr);
FUNCDEF(sqrt);
/* Serialization */
void write_raw(ostream& os, const Sym& sym) {
token_t token = sym.token();
const SymVec& args = sym.args();
if (is_constant.size() > token && is_constant[token]) {
os << "c" << language[token]->c_print(vector<string>(), vector<string>());
} else {
const Var* var = dynamic_cast<const Var*>( language[token] );
if (var != 0) {
os << "v" << var->idx;
} else {
os << "f" << token << ' ' << args.size();
}
}
for (unsigned i = 0; i < args.size(); ++i) {
write_raw(os, args[i]);
}
}
string write_raw(const Sym& sym) {
ostringstream os;
write_raw(os, sym);
return os.str();
}
Sym read_raw(istream& is) {
char id = is.get();
switch (id) {
case 'c' :
{
double val;
is.get(); // skip '('
is >> val;
is.get(); // skip ')'
return SymConst(val);
}
case 'v' :
{
unsigned idx;
is >> idx;
return SymVar(idx);
}
case 'f' :
{
token_t token;
unsigned arity;
is >> token;
is >> arity;
SymVec args(arity);
for (unsigned i = 0; i < arity; ++i) {
args[i] = read_raw(is);
}
return Sym(token, args);
}
default : {
cerr << "Character = " << id << " Could not read formula from stream" << endl;
exit(1);
}
}
return Sym();
}
Sym read_raw(string str) {
istringstream is(str);
return read_raw(is);
}
void read_raw(istream& is, Sym& sym) {
sym = read_raw(is);
}

View file

@ -0,0 +1,124 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef FUNCTION_DEF_H_
#define FUNCTION_DEF_H_
#include <string>
#include <vector>
#include <memory>
#include "Sym.h"
#include "Interval.h"
class FunDef {
public:
virtual ~FunDef() {}
// (possibly) lazy evaluation function, default implementation calls 'eager' eval
virtual double eval(const SymVec& args, const std::vector<double>& inputs) const;
// eager evaluation function
virtual double eval(const std::vector<double>& args, const std::vector<double>& inputs) const = 0;
// interval evaluation
virtual Interval eval(const std::vector<Interval>& args, const std::vector<Interval>& inputs) const = 0;
// prints 'c' like code
virtual std::string c_print(const std::vector<std::string>& names, const std::vector<std::string>& names) const = 0;
virtual unsigned min_arity() const = 0;
virtual bool has_varargs() const { return false; } // sum, prod, min, max are variable arity
virtual std::string name() const = 0;
protected:
};
extern std::vector<const FunDef*> get_defined_functions();
extern const FunDef& get_element(token_t token);
/* Single case evaluation */
extern double eval(const Sym& sym, const std::vector<double>& inputs);
/* Static analysis through interval arithmetic */
extern Interval eval(const Sym& sym, const std::vector<Interval>& inputs);
/* Pretty printers, second version allows setting of variable names */
extern std::string c_print(const Sym& sym);
extern std::string c_print(const Sym& sym, const std::vector<std::string>& var_names);
/* Pretty printer streamer */
inline std::ostream& operator<<(std::ostream& os, Sym sym) { return os << c_print(sym); }
/* Create constant */
extern Sym SymConst(double value);
/* Create variable */
extern Sym SymVar(unsigned idx);
/* Add function to the language table (and take a guess at the arity) */
class LanguageTable;
extern void add_function_to_table(LanguageTable& table, token_t token);
// token names
extern const token_t sum_token;
extern const token_t prod_token;
extern const token_t inv_token;
extern const token_t min_token;
extern const token_t pow_token;
extern const token_t ifltz_token;
#define HEADERFUNC(name) extern const token_t name##_token;\
inline Sym name(Sym arg) { return Sym(name##_token, arg); }
/* This defines the tokens: sin_token, cos_token, etc. */
HEADERFUNC(sin);
HEADERFUNC(cos);
HEADERFUNC(tan);
HEADERFUNC(asin);
HEADERFUNC(acos);
HEADERFUNC(atan);
HEADERFUNC(sinh);
HEADERFUNC(cosh);
HEADERFUNC(tanh);
HEADERFUNC(asinh);
HEADERFUNC(acosh);
HEADERFUNC(atanh);
HEADERFUNC(exp);
HEADERFUNC(log);
HEADERFUNC(sqr);
HEADERFUNC(sqrt);
extern std::string get_prototypes();
// reading and writing in internal format
extern std::string write_raw(const Sym& sym);
extern void write_raw(ostream& os, const Sym& sym);
extern Sym read_raw(std::string str);
extern Sym read_raw(std::istream& is);
extern void read_raw(std::istream& is, Sym& sym);
#include "SymOps.h"
#endif

View file

@ -0,0 +1,109 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "FunDef.h"
#include "SymOps.h"
#include "Sym.h"
using namespace std;
void collect(token_t t, Sym a, SymVec& args) {
if (a.token() == t) {
const SymVec& a_args = a.args();
for (unsigned i = 0; i < a_args.size(); ++i) {
collect(t, a_args[i], args);
}
return;
}
args.push_back(a);
}
Sym operator+(Sym a, Sym b) {
SymVec args;
collect(sum_token, a, args);
collect(sum_token, b, args);
return Sym(sum_token, args);
}
Sym operator*(Sym a, Sym b) {
SymVec args;
collect(prod_token, a, args);
collect(prod_token, b, args);
return Sym(prod_token, args);
}
Sym operator/(Sym a, Sym b) {
SymVec args;
collect(prod_token, a, args);
SymVec args2;
collect(prod_token, b, args2);
SymVec inv;
inv.push_back(Sym(prod_token, args2));
args.push_back( Sym(inv_token, inv) );
return Sym(prod_token, args);
}
Sym operator-(Sym a, Sym b) {
SymVec args;
collect(sum_token, a, args);
SymVec args2;
collect(sum_token, b, args2);
SymVec min;
min.push_back(Sym(sum_token, args2));
args.push_back( Sym(min_token, min) );
return Sym(sum_token, args);
}
Sym operator-(Sym a) {
return Sym(min_token, a);
}
Sym pow(Sym a, Sym b) {
SymVec args;
args.push_back(a);
args.push_back(b);
return Sym(pow_token, args);
}
Sym ifltz(Sym a, Sym b, Sym c) {
SymVec args;
args.push_back(a);
args.push_back(b);
args.push_back(c);
return Sym(ifltz_token, args);
}

View file

@ -0,0 +1,31 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SYMOPS_H
#define SYMOPS_H
#include "Sym.h"
extern Sym operator+(Sym a, Sym b);
extern Sym operator*(Sym a, Sym b);
extern Sym operator/(Sym a, Sym b);
extern Sym operator-(Sym a, Sym b);
extern Sym pow(Sym a, Sym b);
extern Sym ifltz(Sym a, Sym b, Sym c);
extern Sym operator-(Sym a);
#endif

View file

@ -0,0 +1,27 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <string>
#include <iostream>
using namespace std;
double error(string errstr) {
cerr << "ERROR: " << errstr << endl;
exit(1);
}

View file

@ -0,0 +1,97 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "LanguageTable.h"
#include "Sym.h"
#include <utils/eoRNG.h>
using namespace std;
eoRng rng(time(0));
extern Sym default_const();
class LanguageImpl {
public :
std::vector<Sym> vars;
LanguageTable::erc_func erc;
std::vector<functor_t> functions;
std::vector< std::vector<token_t> > functions_per_arity;
LanguageImpl() : erc(default_const) {}
};
LanguageTable::LanguageTable() {
pimpl = new LanguageImpl;
}
LanguageTable::~LanguageTable() {
delete pimpl;
}
LanguageTable::LanguageTable(const LanguageTable& that) {
pimpl = new LanguageImpl(*that.pimpl);
}
LanguageTable& LanguageTable::operator=(const LanguageTable& that) {
*pimpl = *that.pimpl;
return *this;
}
void LanguageTable::add_function(token_t token, unsigned arity) {
functor_t f = {token, arity};
add_function( f );
}
void LanguageTable::add_function(functor_t f) {
if (f.arity > 0) {
pimpl->functions.push_back(f);
} else {
pimpl->vars.push_back(Sym(f.token));
}
if (pimpl->functions_per_arity.size() <= f.arity) pimpl->functions_per_arity.resize(f.arity+1);
pimpl->functions_per_arity[f.arity].push_back(f.token);
}
void LanguageTable::set_erc( erc_func func) { pimpl->erc = func; }
/* Getting info out */
extern Sym SymConst(double val);
Sym LanguageTable::get_random_var() const { return rng.choice(pimpl->vars); }
Sym LanguageTable::get_random_const() const { return pimpl->erc(); }
functor_t LanguageTable::get_random_function() const
{
return rng.choice(pimpl->functions);
}
token_t LanguageTable::get_random_function(unsigned arity) const
{
return rng.choice(pimpl->functions_per_arity[arity]);
}

View file

@ -0,0 +1,56 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef LANGUAGE_TABLE_H
#define LANGUAGE_TABLE_H
#include "token.h"
class LanguageImpl;
class Sym;
class LanguageTable {
LanguageImpl* pimpl;
public:
LanguageTable();
~LanguageTable();
LanguageTable(const LanguageTable& org);
LanguageTable& operator=(const LanguageTable& org);
/* setting it up */
typedef Sym (*erc_func)();
void add_function(token_t token, unsigned arity);
void add_function(functor_t functor);
void set_erc(erc_func func);
/* Getting info out */
Sym get_random_var() const;
Sym get_random_const() const;
functor_t get_random_function() const;
token_t get_random_function(unsigned arity) const;
};
#endif

View file

@ -0,0 +1,40 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "NodeSelector.h"
#include "Sym.h"
#include <utils/eoRNG.h>
unsigned RandomNodeSelector::select_node(Sym sym) const {
return rng.random(sym.size());
}
unsigned BiasedNodeSelector::select_node(Sym sym) const {
unsigned p = rng.random(sym.size());
for (unsigned i = 0; i < nRounds; ++i) {
Sym res = get_subtree(sym, p);
if (res.args().size() > 0) break;
p = rng.random(sym.size());
}
return p;
}

View file

@ -0,0 +1,47 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef NODESELECTOR_H
#define NODESELECTOR_H
class Sym;
/** Base class for selecting nodes */
class NodeSelector {
public:
virtual unsigned select_node(Sym sym) const = 0;
};
/** Select nodes uniformly */
class RandomNodeSelector : public NodeSelector {
public:
unsigned select_node(Sym sym) const;
};
/** A node selector that does a specified number of rounds ignoring terminals */
class BiasedNodeSelector : public NodeSelector {
public:
unsigned nRounds;
BiasedNodeSelector() : nRounds(3) {} // 3: for binary trees 87.5% chance of selecting an internal node
BiasedNodeSelector(unsigned n) : nRounds(n) {}
unsigned select_node(Sym sym) const;
};
#endif

View file

@ -0,0 +1,46 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <utils/eoRNG.h>
#include "TreeBuilder.h"
Sym TreeBuilder::make_terminal() const {
if (rng.flip(vcprob)) {
return table.get_random_var();
}
return table.get_random_const();
}
Sym TreeBuilder::build_tree(unsigned max_depth, bool grow) const {
if (max_depth == 0 || grow && rng.random(2) == 0) {
return make_terminal();
}
// pick a random function, no matter what arity
functor_t func = table.get_random_function();
SymVec args(func.arity);
for (unsigned i = 0; i < args.size(); ++i) {
args[i] = build_tree(max_depth-1, grow);
}
return Sym(func.token, args);
}

View file

@ -0,0 +1,45 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef TREEBUILDER_H_
#define TREEBUILDER_H_
#include "Sym.h"
#include "LanguageTable.h"
class TreeBuilder {
const LanguageTable& table;
// probability of selecting a var versus a const when the choice boils down to selecting a terminal
double vcprob;
Sym make_terminal() const;
public:
TreeBuilder(const LanguageTable& t) : table(t), vcprob(0.9) {};
TreeBuilder(const LanguageTable& t, double vc) : table(t), vcprob(vc) {};
void set_var_vs_const_probability(double p) { vcprob = p; }
Sym build_tree(unsigned max_depth, bool grow) const;
void build_tree(Sym& tree, unsigned max_depth, bool grow) const { tree = build_tree(max_depth, grow); }
};
#endif

17
eo/contrib/mathsym/header Normal file
View file

@ -0,0 +1,17 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/

View file

@ -0,0 +1,133 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "Dataset.h"
#include <fstream>
#include <sstream>
#include <iostream>
using namespace std;
class DataSetImpl {
public:
vector< vector<double> > inputs;
vector<double> targets;
void read_data(vector<string> strings) {
// find the number of inputs
istringstream cnt(strings[0]);
unsigned n = 0;
for (;;) {
string s;
cnt >> s;
if (!cnt) break;
++n;
}
inputs.resize(strings.size(), vector<double>(n-1));
targets.resize(strings.size());
for (unsigned i = 0; i < strings.size(); ++i) {
istringstream is(strings[i]);
for (unsigned j = 0; j < n; ++j) {
if (!is) {
cerr << "Too few targets in record " << i << endl;
exit(1);
}
if (j < n-1) {
is >> inputs[i][j];
} else {
is >> targets[i];
}
}
}
}
};
Dataset::Dataset() { pimpl = new DataSetImpl; }
Dataset::~Dataset() { delete pimpl; }
Dataset::Dataset(const Dataset& that) { pimpl = new DataSetImpl(*that.pimpl); }
Dataset& Dataset::operator=(const Dataset& that) { *pimpl = *that.pimpl; return *this; }
unsigned Dataset::n_records() const { return pimpl->targets.size(); }
unsigned Dataset::n_fields() const { return pimpl->inputs[0].size(); }
const std::vector<double>& Dataset::get_inputs(unsigned record) const { return pimpl->inputs[record]; }
double Dataset::get_target(unsigned record) const { return pimpl->targets[record]; }
double error(string errstr);
void Dataset::load_data(std::string filename) {
vector<string> strings; // first load it in strings
ifstream is(filename.c_str());
for(;;) {
string s;
getline(is, s);
if (!is) break;
if (s[0] == '#') continue; // comment, skip
strings.push_back(s);
}
is.close();
if (strings.size() == 0) {
error("No data could be loaded");
}
pimpl->read_data(strings);
}
std::vector<double> Dataset::input_minima() const {
vector<vector<double> >& in = pimpl->inputs;
vector<double> mn(in[0].size(), 1e+50);
for (unsigned i = 0; i < in.size(); ++i) {
for (unsigned j = 0; j < in[i].size(); ++j) {
mn[j] = std::min(mn[j], in[i][j]);
}
}
return mn;
}
vector<double> Dataset::input_maxima() const {
vector<vector<double> >& in = pimpl->inputs;
vector<double> mx(in[0].size(), -1e+50);
for (unsigned i = 0; i < in.size(); ++i) {
for (unsigned j = 0; j < in[i].size(); ++j) {
mx[j] = std::max(mx[j], in[i][j]);
}
}
return mx;
}

View file

@ -0,0 +1,51 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef DATASET_H_
#define DATASET_H_
#include <string>
#include <vector>
class DataSetImpl;
class Dataset {
DataSetImpl* pimpl;
Dataset& operator=(const Dataset&); // cannot assign
public:
Dataset();
~Dataset();
Dataset(const Dataset&);
void load_data(std::string filename);
unsigned n_records() const;
unsigned n_fields() const;
const std::vector<double>& get_inputs(unsigned record) const;
double get_target(unsigned record) const;
std::vector<double> input_minima() const;
std::vector<double> input_maxima() const;
};
#endif

View file

@ -0,0 +1,237 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <vector>
#include <valarray>
#include "ErrorMeasure.h"
#include "Dataset.h"
#include "Sym.h"
#include "sym_compile.h"
#include "TargetInfo.h"
using namespace std;
#ifdef INTERVAL_DEBUG
#include <BoundsCheck.h>
#include <FunDef.h>
vector<double> none;
IntervalBoundsCheck bounds(none, none);
#endif
static double not_a_number = atof("nan");
class ErrorMeasureImpl {
public:
const Dataset& data;
TargetInfo train_info;
ErrorMeasure::measure measure;
Scaling no_scaling;
ErrorMeasureImpl(const Dataset& d, double t_p, ErrorMeasure::measure m) : data(d), measure(m) {
#ifdef INTERVAL_DEBUG
bounds = IntervalBoundsCheck(d.input_minima(), d.input_maxima());
#endif
unsigned nrecords = d.n_records();
unsigned cases = unsigned(t_p * nrecords);
valarray<double> t(cases);
for (unsigned i = 0; i < cases; ++i) {
t[i] = data.get_target(i);
}
train_info = TargetInfo(t);
no_scaling = Scaling(new NoScaling);
}
ErrorMeasure::result eval(const valarray<double>& y) {
ErrorMeasure::result result;
result.scaling = no_scaling;
switch(measure) {
case ErrorMeasure::mean_squared:
result.error = pow(train_info.targets() - y, 2.0).sum() / y.size();
return result;
case ErrorMeasure::absolute:
result.error = abs(train_info.targets() - y).sum() / y.size();
return result;
case ErrorMeasure::mean_squared_scaled:
result.scaling = ols(y, train_info);
result.error = pow(train_info.targets() - result.scaling->transform(y), 2.0).sum() / y.size();
return result;
default:
cerr << "Unknown measure encountered: " << measure << " " << __FILE__ << " " << __LINE__ << endl;
}
return result;
}
unsigned train_cases() const {
return train_info.targets().size();
}
vector<ErrorMeasure::result> multi_function_eval(const vector<Sym>& pop) {
multi_function all = compile(pop);
std::vector<double> y(pop.size());
std::vector<double> err(pop.size());
const std::valarray<double>& t = train_info.targets();
for (unsigned i = 0; i < train_cases(); ++i) {
// evaluate
all(&data.get_inputs(i)[0], &y[0]);
for (unsigned j = 0; j < y.size(); ++j) {
double diff = y[j] - t[i];
if (measure == ErrorMeasure::mean_squared) { // branch prediction will probably solve this inefficiency
err[j] += diff * diff;
} else {
err[j] += fabs(diff);
}
}
}
std::vector<ErrorMeasure::result> result(pop.size());
double n = train_cases();
Scaling no = Scaling(new NoScaling);
for (unsigned i = 0; i < pop.size(); ++i) {
result[i].error = err[i] / n;
result[i].scaling = no;
}
return result;
}
vector<ErrorMeasure::result> single_function_eval(const vector<Sym> & pop) {
vector<single_function> funcs(pop.size());
compile(pop, funcs); // get one function pointer for each individual
valarray<double> y(train_cases());
vector<ErrorMeasure::result> result(pop.size());
for (unsigned i = 0; i < funcs.size(); ++i) {
for (unsigned j = 0; j < train_cases(); ++j) {
y[j] = funcs[i](&data.get_inputs(j)[0]);
}
#ifdef INTERVAL_DEBUG
//cout << "eval func " << i << " " << pop[i] << endl;
pair<double, double> b = bounds.calc_bounds(pop[i]);
// check if y is in bounds
for (unsigned j = 0; j < y.size(); ++j) {
if (y[j] < b.first -1e-4 || y[j] > b.second + 1e-4 || !finite(y[j])) {
cout << "Error " << y[j] << " not in " << b.first << ' ' << b.second << endl;
cout << "Function " << pop[i] << endl;
exit(1);
}
}
#endif
result[i] = eval(y);
}
return result;
}
vector<ErrorMeasure::result> calc_error(const vector<Sym>& pop) {
// currently we can only accumulate simple measures such as absolute and mean_squared
switch(measure) {
case ErrorMeasure::mean_squared:
case ErrorMeasure::absolute:
return multi_function_eval(pop);
case ErrorMeasure::mean_squared_scaled:
return single_function_eval(pop);
}
return vector<ErrorMeasure::result>();
}
};
ErrorMeasure::result::result() {
error = 0.0;
scaling = Scaling(0);
}
bool ErrorMeasure::result::valid() const {
return isfinite(error);
}
ErrorMeasure::ErrorMeasure(const Dataset& data, double train_perc, measure meas) {
pimpl = new ErrorMeasureImpl(data, train_perc, meas);
}
ErrorMeasure::~ErrorMeasure() { delete pimpl; }
ErrorMeasure::ErrorMeasure(const ErrorMeasure& that) { pimpl = new ErrorMeasureImpl(*that.pimpl); }
ErrorMeasure::result ErrorMeasure::calc_error(Sym sym) {
single_function f = compile(sym);
valarray<double> y(pimpl->train_cases());
for (unsigned i = 0; i < y.size(); ++i) {
y[i] = f(&pimpl->data.get_inputs(i)[0]);
if (!finite(y[i])) {
result res;
res.scaling = Scaling(new NoScaling);
res.error = not_a_number;
return res;
}
}
return pimpl->eval(y);
}
vector<ErrorMeasure::result> ErrorMeasure::calc_error(const vector<Sym>& syms) {
return pimpl->calc_error(syms);
}
double ErrorMeasure::worst_performance() const {
if (pimpl->measure == mean_squared_scaled) {
return pimpl->train_info.tvar();
}
return 1e+20; // TODO: make this general
}

View file

@ -0,0 +1,61 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef ERROR_MEASURE_H
#define ERROR_MEASURE_H
#include "Scaling.h"
class ErrorMeasureImpl;
class Sym;
class Dataset;
class ErrorMeasure {
ErrorMeasureImpl* pimpl;
public :
enum measure {
absolute,
mean_squared,
mean_squared_scaled,
};
struct result {
double error;
Scaling scaling;
result();
bool valid() const;
};
ErrorMeasure(const Dataset& data, double train_perc, measure meas = mean_squared);
~ErrorMeasure();
ErrorMeasure(const ErrorMeasure& that);
ErrorMeasure& operator=(const ErrorMeasure& that);
result calc_error(Sym sym);
std::vector<result> calc_error(const std::vector<Sym>& sym);
double worst_performance() const;
};
#endif

View file

@ -0,0 +1,417 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "Scaling.h"
#include "TargetInfo.h"
using namespace std;
Scaling slope(const std::valarray<double>& x, const TargetInfo& targets) {
double xx = 0.0;
double xy = 0.0;
const valarray<double>& y = targets.targets();
for (unsigned i = 0; i < x.size(); ++i) {
xx += x[i] * x[i];
xy += x[i] * y[i];
}
if (xx < 1e-7) return Scaling(new LinearScaling(0.0,0.0));
double b = xy / xx;
return Scaling(new LinearScaling(0.0, b));
}
// Still needs proper testing with non-trivial lambda
Scaling regularized_least_squares(const std::valarray<double>& inputs, const TargetInfo& targets, double lambda) {
double n = inputs.size();
valarray<double> x = inputs;
double a,b,d;
a=b=d=0;
for (unsigned i = 0; i < n; ++i) {
a += 1 + lambda;
b += x[i];
d += x[i] * x[i] + lambda;
}
//invert
double ad_bc = a*d - b * b;
// if ad_bc equals zero there's a problem
if (ad_bc < 1e-17) return Scaling(new LinearScaling);
double ai = d/ad_bc;
double bi = -b/ad_bc;
double di = a/ad_bc;
double ci = bi;
// Now multiply this inverted covariance matrix (C^-1) with x' * t
std::valarray<double> ones = x;
// calculate C^-1 * x' )
for (unsigned i = 0; i < n; ++i)
{
ones[i] = (ai + bi * x[i]);
x[i] = (ci + di * x[i]);
}
// results are in [ones, x], now multiply with y
a = 0.0; // intercept
b = 0.0; // slope
const valarray<double>& t = targets.targets();
for (unsigned i = 0; i < n; ++i)
{
a += ones[i] * t[i];
b += x[i] * t[i];
}
return Scaling(new LinearScaling(a,b));
}
Scaling ols(const std::valarray<double>& y, const std::valarray<double>& t) {
double n = y.size();
double y_mean = y.sum() / n;
double t_mean = t.sum() / n;
std::valarray<double> y_var = (y - y_mean);
std::valarray<double> t_var = (t - t_mean);
std::valarray<double> cov = t_var * y_var;
y_var *= y_var;
t_var *= t_var;
double sumvar = y_var.sum();
if (sumvar == 0. || sumvar/n < 1e-7 || sumvar/n > 1e+7) // breakout when numerical problems are likely
return Scaling(new LinearScaling(t_mean,0.));
double b = cov.sum() / sumvar;
double a = t_mean - b * y_mean;
Scaling s = Scaling(new LinearScaling(a,b));
return s;
}
Scaling ols(const std::valarray<double>& y, const TargetInfo& targets) {
double n = y.size();
double y_mean = y.sum() / n;
std::valarray<double> y_var = (y - y_mean);
std::valarray<double> cov = targets.tcov_part() * y_var;
y_var *= y_var;
double sumvar = y_var.sum();
if (sumvar == 0. || sumvar/n < 1e-7 || sumvar/n > 1e+7) // breakout when numerical problems are likely
return Scaling(new LinearScaling(targets.tmean(),0.));
double b = cov.sum() / sumvar;
double a = targets.tmean() - b * y_mean;
if (!finite(b)) {
cout << a << ' ' << b << endl;
cout << sumvar << endl;
cout << y_mean << endl;
cout << cov.sum() << endl;
exit(1);
}
Scaling s = Scaling(new LinearScaling(a,b));
return s;
}
Scaling wls(const std::valarray<double>& inputs, const TargetInfo& targets) {
std::valarray<double> x = inputs;
const std::valarray<double>& w = targets.weights();
unsigned n = x.size();
// First calculate x'*W (as W is a diagonal matrix it's simply elementwise multiplication
std::valarray<double> wx = targets.weights() * x;
// Now x'*W is contained in [w,wx], calculate x' * W * x (the covariance)
double a,b,d;
a=b=d=0.0;
for (unsigned i = 0; i < n; ++i)
{
a += w[i];
b += wx[i];
d += x[i] * wx[i];
}
//invert
double ad_bc = a*d - b * b;
// if ad_bc equals zero there's a problem
if (ad_bc < 1e-17) return Scaling(new LinearScaling);
double ai = d/ad_bc;
double bi = -b/ad_bc;
double di = a/ad_bc;
double ci = bi;
// Now multiply this inverted covariance matrix (C^-1) with x' * W * y
// create alias to reuse the wx we do not need anymore
std::valarray<double>& ones = wx;
// calculate C^-1 * x' * W (using the fact that W is diagonal)
for (unsigned i = 0; i < n; ++i)
{
ones[i] = w[i]*(ai + bi * x[i]);
x[i] = w[i]*(ci + di * x[i]);
}
// results are in [ones, x], now multiply with y
a = 0.0; // intercept
b = 0.0; // slope
const valarray<double>& t = targets.targets();
for (unsigned i = 0; i < n; ++i)
{
a += ones[i] * t[i];
b += x[i] * t[i];
}
return Scaling(new LinearScaling(a,b));
}
//Scaling med(const std::valarray<double>& inputs, const TargetInfo& targets);
double mse(const std::valarray<double>& y, const TargetInfo& t) {
valarray<double> residuals = t.targets()-y;
residuals *= residuals;
double sz = residuals.size();
if (t.has_weights()) {
residuals *= t.weights();
sz = 1.0;
}
return residuals.sum() / sz;
}
double rms(const std::valarray<double>& y, const TargetInfo& t) {
return sqrt(mse(y,t));
}
double mae(const std::valarray<double>& y, const TargetInfo& t) {
valarray<double> residuals = abs(t.targets()-y);
if (t.has_weights()) residuals *= t.weights();
return residuals.sum() / residuals.size();
}
/*
double standard_error(const std::valarray<double>& y, const std::pair<double,double>& scaling) {
double a = scaling.first;
double b = scaling.second;
double n = y.size();
double se = sqrt( pow(a+b*y-current_set->targets,2.0).sum() / (n-2));
double mean_y = y.sum() / n;
double sxx = pow( y - mean_y, 2.0).sum();
return se / sqrt(sxx);
}
double scaled_mse(const std::valarray<double>& y){
std::pair<double,double> scaling;
return scaled_mse(y,scaling);
}
double scaled_mse(const std::valarray<double>& y, std::pair<double, double>& scaling)
{
scaling = scale(y);
double a = scaling.first;
double b = scaling.second;
std::valarray<double> tmp = current_set->targets - a - b * y;
tmp *= tmp;
if (weights.size())
return (weights * tmp).sum();
return tmp.sum() / tmp.size();
}
double robust_mse(const std::valarray<double>& ny, std::pair<double, double>& scaling) {
double smse = scaled_mse(ny,scaling);
std::valarray<double> y = ny;
// find maximum covariance case
double n = y.size();
int largest = 0;
{
double y_mean = y.sum() / n;
std::valarray<double> y_var = (y - y_mean);
std::valarray<double> cov = tcov * y_var;
std::valarray<bool> maxcov = cov == cov.max();
for (unsigned i = 0; i < maxcov.size(); ++i) {
if (maxcov[i]) {
largest = i;
break;
}
}
}
double y_mean = (y.sum() - y[largest]) / (n-1);
y[largest] = y_mean; // dissappears from covariance calculation
std::valarray<double> y_var = (y - y_mean);
std::valarray<double> cov = tcov * y_var;
y_var *= y_var;
double sumvar = y_var.sum();
if (sumvar == 0. || sumvar/n < 1e-7 || sumvar/n > 1e+7) // breakout when numerical problems are likely
return worst_performance();
double b = cov.sum() / sumvar;
double a = tmean - b * y_mean;
std::valarray<double> tmp = current_set->targets - a - b * y;
tmp[largest] = 0.0;
tmp *= tmp;
double smse2 = tmp.sum() / (tmp.size()-1);
static std::ofstream os("smse.txt");
os << smse << ' ' << smse2 << '\n';
if (smse2 > smse) {
return worst_performance();
//std::cerr << "overfit? " << smse << ' ' << smse2 << '\n';
}
scaling.first = a;
scaling.second = b;
return smse2;
}
class Sorter {
const std::valarray<double>& scores;
public:
Sorter(const std::valarray<double>& _scores) : scores(_scores) {}
bool operator()(unsigned i, unsigned j) const {
return scores[i] < scores[j];
}
};
double coc(const std::valarray<double>& y) {
std::vector<unsigned> indices(y.size());
for (unsigned i = 0; i < y.size(); ++i) indices[i] = i;
std::sort(indices.begin(), indices.end(), Sorter(y));
const std::valarray<double>& targets = current_set->targets;
double neg = 1.0 - targets[indices[0]];
double pos = targets[indices[0]];
double cumpos = 0;
double cumneg = 0;
double sum=0;
double last_score = y[indices[0]];
for(unsigned i = 1; i < targets.size(); ++i) {
if (fabs(y[indices[i]] - last_score) < 1e-9) { // we call it tied
pos += targets[indices[i]];
neg += 1.0 - targets[indices[i]];
if (i < targets.size()-1)
continue;
}
sum += pos * cumneg + (pos * neg) * 0.5;
cumneg += neg;
cumpos += pos;
pos = targets[indices[i]];
neg = 1.0 - targets[indices[i]];
last_score = y[indices[i]];
}
return sum / (cumneg * cumpos);
}
// iterative re-weighted least squares (for parameters.classification)
double irls(const std::valarray<double>& scores, std::pair<double,double>& scaling) {
const std::valarray<double>& t = current_set->targets;
std::valarray<double> e(scores.size());
std::valarray<double> u(scores.size());
std::valarray<double> w(scores.size());
std::valarray<double> z(scores.size());
parameters.use_irls = false; parameters.classification=false;
scaling = scale(scores);
parameters.use_irls=true;parameters.classification=true;
if (scaling.second == 0.0) return worst_performance();
for (unsigned i = 0; i < 10; ++i) {
e = exp(scaling.first + scaling.second*scores);
u = e / (e + exp(-(scaling.first + scaling.second * scores)));
w = u*(1.-u);
z = (t-u)/w;
scaling = wls(scores, u, w);
//double ll = (log(u)*t + (1.-log(u))*(1.-t)).sum();
//std::cout << "Scale " << i << ' ' << scaling.first << " " << scaling.second << " LL " << 2*ll << std::endl;
}
// log-likelihood
u = exp(scaling.first + scaling.second*scores) / (1 + exp(scaling.first + scaling.second*scores));
double ll = (log(u)*t + (1.-log(u))*(1.-t)).sum();
return 2*ll;
}
*/

View file

@ -0,0 +1,93 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SCALING_H_
#define SCALING_H_
#include "shared_ptr.h"
#include <valarray>
#include <iostream>
#include <string>
class TargetInfo;
class ScalingBase {
public:
std::valarray<double> apply(const std::valarray<double>& x) {
std::valarray<double> xtmp = x;
transform(xtmp);
return xtmp;
}
virtual double transform(double input) const = 0;
virtual void transform(std::valarray<double>& inputs) const = 0;
virtual std::ostream& print(std::ostream& os, std::string str) const = 0;
virtual std::valarray<double> transform(const std::valarray<double>& inputs) const = 0;
};
typedef shared_ptr<ScalingBase> Scaling;
class LinearScaling : public ScalingBase {
double a,b;
public:
LinearScaling() : a(0.0), b(1.0) {}
LinearScaling(double _a, double _b) : a(_a), b(_b) {}
double transform(double input) const { input *=b; input += a; return input; }
void transform(std::valarray<double>& inputs) const { inputs *= b; inputs += a; }
std::valarray<double> transform(const std::valarray<double>& inputs) const {
std::valarray<double> y = a + b * inputs;
return y;
}
double intercept() const { return a; }
double slope() const { return b; }
std::ostream& print(std::ostream& os, std::string str) const {
os.precision(16);
os << a << " + " << b << " * " << str;
return os;
}
};
class NoScaling : public ScalingBase{
void transform(std::valarray<double>&) const {}
double transform(double input) const { return input; }
std::valarray<double> transform(const std::valarray<double>& inputs) const { return inputs; }
std::ostream& print(std::ostream& os, std::string str) const { return os << str; }
};
extern Scaling slope(const std::valarray<double>& inputs, const TargetInfo& targets); // slope only
extern Scaling ols(const std::valarray<double>& inputs, const TargetInfo& targets);
extern Scaling wls(const std::valarray<double>& inputs, const TargetInfo& targets);
extern Scaling med(const std::valarray<double>& inputs, const TargetInfo& targets);
extern Scaling ols(const std::valarray<double>& inputs, const std::valarray<double>& outputs);
extern double mse(const std::valarray<double>& y, const TargetInfo& t);
extern double rms(const std::valarray<double>& y, const TargetInfo& t);
extern double mae(const std::valarray<double>& y, const TargetInfo& t);
// Todo Logistic Scaling
#endif

View file

@ -0,0 +1,138 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "TargetInfo.h"
using namespace std;
TargetInfo::TargetInfo(const TargetInfo& org) { operator=(org); }
TargetInfo& TargetInfo::operator=(const TargetInfo& org) {
_targets.resize(org._targets.size());
_weights.resize(org._weights.size());
_tcov_part.resize(org._tcov_part.size());
_targets = org._targets;
_weights = org._weights;
_tcov_part = org._tcov_part;
_tmean = org._tmean;
_tvar = org._tvar;
_tstd = org._tstd;
_tmed = org._tmed;
return *this;
}
TargetInfo::TargetInfo(const std::valarray<double>& t) {
_weights.resize(0);
_targets.resize(t.size());
_targets = t;
_tmean = _targets.sum()/_targets.size();
_tcov_part.resize(_targets.size());
_tcov_part = _targets;
_tcov_part -= _tmean;
std::valarray<double> tmp = _tcov_part;
tmp = _tcov_part;
tmp *= tmp;
_tvar = tmp.sum() / (tmp.size()-1);
_tstd = sqrt(_tvar);
_tmed = 0;
}
TargetInfo::TargetInfo(const std::valarray<double>& t, const std::valarray<double>& w) {
_targets.resize(t.size());
_weights.resize(w.size());
_targets = t;
_weights = w;
double sumw = _weights.sum();
// scale weights so that they'll add up to 1
_weights /= sumw;
_tmean = (_targets * _weights).sum();
_tcov_part.resize(_targets.size());
_tcov_part = _targets;
_tcov_part -= _tmean;
_tvar = (pow(_targets - _tmean, 2.0) * _weights).sum();
_tstd = sqrt(_tvar);
_tmed = 0.;
}
// calculate the members, now in the context of a mask
void TargetInfo::set_training_mask(const std::valarray<bool>& tmask) {
TargetInfo tmp;
if (has_weights() ) {
tmp = TargetInfo( _targets[tmask], _weights[tmask]);
} else {
tmp = TargetInfo( _targets[tmask] );
}
_tcov_part.resize(tmp._tcov_part.size());
_tcov_part = tmp._tcov_part;
_tmean = tmp._tmean;
_tvar = tmp._tvar;
_tstd = tmp._tstd;
_tmed = tmp._tmed;
_training_mask.resize(tmask.size());
_training_mask = tmask;
}
struct SortOnTargets
{
const valarray<double>& t;
SortOnTargets(const valarray<double>& v) : t(v) {}
bool operator()(int i, int j) const {
return fabs(t[i]) < fabs(t[j]);
}
};
vector<int> TargetInfo::sort() {
vector<int> ind(_targets.size());
for (unsigned i = 0; i < ind.size(); ++i) { ind[i] = i; }
std::sort(ind.begin(), ind.end(), SortOnTargets(_targets));
valarray<double> tmptargets = _targets;
valarray<double> tmpweights = _weights;
valarray<double> tmpcov = _tcov_part;
for (unsigned i = 0; i < ind.size(); ++i)
{
_targets[i] = tmptargets[ ind[i] ];
_tcov_part[i] = tmpcov[ ind[i] ];
if (_weights.size()) _weights[i] = tmpweights[ ind[i] ];
}
return ind;
}

View file

@ -0,0 +1,65 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef TARGETINFO_H_
#define TARGETINFO_H_
#include <valarray>
#include <vector>
class TargetInfo {
std::valarray<double> _targets;
std::valarray<double> _weights;
std::valarray<bool> _training_mask;
// some stuff for ols
std::valarray<double> _tcov_part;
double _tmean;
double _tvar;
double _tstd;
double _tmed;
public:
TargetInfo() {}
TargetInfo(const std::valarray<double>& t);
TargetInfo(const std::valarray<double>& t, const std::valarray<double>& w);
TargetInfo(const TargetInfo& org);
TargetInfo& operator=(const TargetInfo& org);
~TargetInfo() {}
const std::valarray<double>& targets() const { return _targets; }
const std::valarray<double>& weights() const { return _weights; }
const std::valarray<bool>& mask() const { return _training_mask; }
void set_training_mask(const std::valarray<bool>& mask);
bool has_weights() const { return _weights.size(); }
bool has_mask() const { return _training_mask.size(); }
std::vector<int> sort();
const std::valarray<double>& tcov_part() const { return _tcov_part; }
double tmean() const { return _tmean; }
double tvar() const { return _tvar; }
double tstd() const { return _tstd; }
double devmedian() const { return _tmed; }
};
#endif

View file

@ -0,0 +1,102 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef _SHARED_PTR_H
#define _SHARED_PTR_H
template <class T> class weak_ptr;
template <class T>
class shared_ptr {
private:
T* ptr;
unsigned* count; //
/* special case, null pointer (nil-code) */
static unsigned* nil() { static unsigned nil_counter(1); return &nil_counter; }
void decref() { if (--(*count) == 0) { delete ptr; delete count; }}
void incref() { ++(*count); }
friend class weak_ptr<T>;
public:
shared_ptr() : ptr(0), count(nil()) { incref(); }
~shared_ptr() { decref(); }
shared_ptr(const shared_ptr<T>& o) : ptr(o.ptr), count(o.count) { incref(); }
shared_ptr(T* p) : ptr(p), count(new unsigned(1)) {}
explicit shared_ptr(const weak_ptr<T>& w) : ptr(w.ptr), count(w.count) { incref(); }
shared_ptr<T>& operator=(const shared_ptr<T>& o) {
if (ptr == o.ptr) return *this;
decref();
ptr = o.ptr;
count = o.count;
incref();
return *this;
}
T* get() { return ptr; }
T* operator->() { return ptr; }
T& operator*() { return *ptr; }
const T* get() const { return ptr; }
const T* operator->() const { return ptr; }
const T& operator*() const { return *ptr; }
bool operator==(const shared_ptr<T>& o) const { return ptr == o.ptr; }
bool operator!=(const shared_ptr<T>& o) const { return ptr != o.ptr; }
bool operator<(const shared_ptr<T>& o) const { return ptr < o.ptr; }
unsigned refcount() const { return *count; }
};
template <class T>
class weak_ptr {
T* ptr;
unsigned* count;
friend class shared_ptr<T>;
public:
weak_ptr() : ptr(0), count(shared_ptr<T>::nil()) {}
explicit weak_ptr( const shared_ptr<T>& s) : ptr(s.ptr), count(s.count) {}
shared_ptr<T> lock() const { return shared_ptr<T>(*this); }
T* get() { return ptr; }
T* operator->() { return ptr; }
T& operator*() { return *ptr; }
const T* get() const { return ptr; }
const T* operator->() const { return ptr; }
const T& operator*() const { return *ptr; }
bool operator==(const shared_ptr<T>& o) const { return ptr == o.ptr; }
bool operator!=(const shared_ptr<T>& o) const { return ptr != o.ptr; }
bool operator<(const shared_ptr<T>& o) const { return ptr < o.ptr; }
unsigned refcount() const { return *count; }
};
#endif

View file

@ -0,0 +1,364 @@
/*
DESCRIPTION:
The class 'Sym' in this package provides a reference counted, hashed tree structure that can be used in genetic programming.
The hash table behind the scenes makes sure that every subtree in the application is stored only once.
This has a couple of advantages:
o Memory: all subtrees are stored only once
o Comparison: comparison for equality for two subtrees boils down to a pointer comparison
o Overview: by accessing the hashtable, you get an instant overview of the state of the population
The disadvantage of this method is the constant time overhead for computing hashes. In practice,
it seems to be fast enough.
===== How to Use this =========
In essence, the Sym data structure contains two important pieces of data,
the 'token' (of type token_t = int) and the children, a vector of Sym (called SymVec).
The token should contain all information to be able to figure out which
function/terminal is represented by the node in the tree. By retrieving this token value
and the SymVec it is possible to write recursive traversal routines for evaluation, printing,
etc.
*/
#include <iostream>
#include "Sym.h"
using namespace std;
/*
* Suppose token value '0' designates our terminal, and token value '1' designates a binary function.
* Later on a ternary function will be used as well, designated with token value '2'
* The function below will create a tree of size three
*/
Sym test1() {
SymVec children;
children.push_back( Sym(0) ); // push_back is a member from std::vector, SymVec is derived from std::vector
children.push_back( Sym(0) );
Sym tree = Sym(token_t(1), children); // creates the tree
/* Done, now print some information about the node */
cout << "Size = " << tree.size() << endl; // prints 3
cout << "Depth = " << tree.depth() << endl; // prints 2
cout << "Refcount = " << tree.refcount() << endl; // prints 1
Sym tree2 = tree; // make a copy (this only changes refcount)
cout << "Refcount now = " << tree.refcount() << endl; // print 2
return tree; // tree2 will be deleted and reference count returns to 1
}
/* To actually use the tree, evaluate it, the following simple recursive function
* can be used
*/
int eval(const Sym& sym) {
if (sym.token() == 0) { // it's a terminal in this example
return 1;
}
// else it's the function
const SymVec& children = sym.args(); // get the children out, children.size() is the arity
// let's assume that we've also got a ternary function designated by token '2'
if (sym.token() == token_t(1))
return eval(children[0]) + eval(children[1]); // evaluate
return eval(children[0]) + eval(children[1]) * eval(children[2]); // a ternary function
}
/* Note that you simply use the stored token that was defined above. Simply checking the size of SymVec in
* this particular example could have sufficed, but it's instructive to use the tokens.
*
* And to test this:
*/
void test_eval() {
Sym tree = test1();
cout << "Evaluating tree1 returns " << eval(tree) << endl;
}
/* Writing initialization functions.
*
* As the Sym class is recursive in nature, initialization can simply be done using
* recursive routines as above. As an example, the following code does 'full' initialization.
*/
Sym init_full(int depth_left) {
if (depth_left == 0) return Sym(0); // create terminal
// else create either a binary or a ternary function
depth_left--;
if (rand() % 2 == 0) { // create binary
SymVec vec(2);
vec[0] = init_full(depth_left);
vec[1] = init_full(depth_left);
return Sym(token_t(1), vec);
} else { // create ternary tree
SymVec vec(3);
vec[0] = init_full(depth_left);
vec[1] = init_full(depth_left);
vec[2] = init_full(depth_left);
return Sym(token_t(2), vec); // token value 2 designates a ternary now, even though the arity can simply be read from the size of the 'SymVec'
}
}
/* Examining the hash table.
*
* The hash table is a static member of the Sym class, but can be obtained and inspected
* at any point during the run. The hash table follows the SGI implementation of hashmap (and effectively
* uses it in gcc). An example:
*/
void inspect_hashtable() {
SymMap& dag = Sym::get_dag(); // get the hashmap
unsigned i = 0;
for (SymMap::iterator it = dag.begin(); it != dag.end(); ++it) {
Sym node(it); // initialize a 'sym' with the iterator
cout << "Node " << i++ << " size " << node.size();
cout << " refcount " << node.refcount()-1; // -1: note that by creating the Sym above the refcount is increased
cout << " depth " << node.depth();
cout << '\n';
}
}
/* The above code effectively examines all distinct subtrees in use in the application and prints some stats for the node */
/* Manipulating trees
*
* The Sym class is set up in such a way that you cannot change a Sym, so how do you perform crossover and mutation?
*
* Simple, you create new syms. The Sym class supports two functions to make this easier: 'get_subtree' and 'insert_subtree'.
* These traverse the tree by index, where 0 designates the root and other values are indexed depth first.
*/
Sym subtree_xover(Sym a, Sym b) {
Sym to_insert = get_subtree(a, rand() % a.size() ); // select random subtree, will crash if too high a value is given
/* 'insert' it into b. This will not really insert, it will however create a new sym,
* equal to 'b' but with a's subtree inserted at the designated spot. */
return insert_subtree(b, rand() % b.size(), to_insert);
}
/* Tying it together, we can create a simple genetic programming system. Mutation is not implemented here,
* but would be easy enough to add by using recursion and/or 'set'. */
void run_gp() {
int ngens = 50;
int popsize = 1000;
cout << "Starting running " << popsize << " individuals for " << ngens << " generations." << endl;
vector<Sym> pop(popsize);
// init population
for (unsigned i = 0; i < pop.size(); ++i) {
pop[i] = init_full(5);
}
double best = 0.0;
// do a very simple steady state tournament
for (unsigned gen = 0; gen < ngens * pop.size(); ++gen) {
int sel1 = rand()% pop.size();
int sel2 = rand() % pop.size();
int sel3 = rand() % pop.size();
double ev1 = eval(pop[sel1]);
double ev3 = eval(pop[sel3]);
double bst = max(ev1,ev3);
if (bst > best) {
best = bst;
}
if (ev3 > ev1) {
sel1 = sel3; // selection pressure
}
Sym child = subtree_xover(pop[sel1], pop[sel2]);
// Check for uniqueness
if (child.refcount() == 1) pop[ rand() % pop.size() ] = child;
}
// and at the end:
inspect_hashtable();
// and also count number of nodes in the population
int sz = 0;
for (unsigned i = 0; i < pop.size(); ++i) { sz += pop[i].size(); }
cout << "Number of distinct nodes " << Sym::get_dag().size() << endl;
cout << "Nodes in population " << sz << endl;
cout << "ratio " << double(Sym::get_dag().size())/sz << endl;
cout << "Best fitness " << best << endl;
}
/* One extra mechanism is supported to add annotations to nodes. Something derived from
* 'UniqueNodeStats' can be used to attach new information to nodes. For this to function,
* we need to supply a 'factory' function that creates these node-stats; attach this function to the
* Sym class, so that it gets called whenever a new node is created. The constructors of the Sym class
* take care of this.
*
* IMPORTANT:
* in a realistic application, the factory function needs to be set BEFORE any Syms are created
* Mixing Syms creating with and without the factory can lead to unexpected results
*
* First we derive some structure from UniqueNodeStats: */
struct MyNodeStats : public UniqueNodeStats {
int sumsize;
~MyNodeStats() { cout << "MyNodeStats::~MyNodeStats, sumsize = " << sumsize << endl; }
};
/* then define the factory function. It will get a Sym, which is just created. */
UniqueNodeStats* create_stats(const Sym& sym) {
MyNodeStats* stats = new MyNodeStats; // Sym will take care of memory management
int sumsize = sym.size();
for (unsigned i = 0; i < sym.args().size(); ++i) {
// retrieve the extra node stats of the child
UniqueNodeStats* unique_stats = sym.args()[i].extra_stats(); // extra_stats retrieves the stats
MyNodeStats* child_stats = static_cast<MyNodeStats*>(unique_stats); // cast it to the right struct
sumsize += child_stats->sumsize;
}
stats->sumsize = sumsize;
return stats; // now it will get attached to the node and deleted when its reference count goes to zero
}
void test_node_stats() {
if (Sym::get_dag().size() != 0) {
cerr << "Cannot mix nodes with and without factory functions" << endl;
exit(1);
}
/* Very Important: attach the factory function to the Sym class */
Sym::set_factory_function(create_stats);
Sym tree = init_full(5); // create a tree
// get extra node stats out
MyNodeStats* stats = static_cast<MyNodeStats*>( tree.extra_stats() );
cout << "Size = " << tree.size() << " SumSize = " << stats->sumsize << endl;
Sym::clear_factory_function(); // reset
}
/* And run the code above */
int main() {
srand(time(0));
cout << "********** TEST EVALUATION **************\n";
test_eval();
cout << "********** TEST ALGORITHM ***************\n";
run_gp();
cout << "********** TEST FACTORY ****************\n";
test_node_stats(); // can work because there are no live nodes
}
/* ********** Member function reference: ********************
*
* Sym() The default constructor will create an undefined node (no token and no children), check for empty() to see if a node is undefined
*
* Sym(token_t) Create a terminal
*
* Sym(token_t, const SymVec&)
* Create a node with token and SymVec as the children
*
* Sym(SymIterator it)
* Create a sym from an iterator (taken from the hashtable directly, or from Sym::iterator)
*
* dtor, copy-ctor and assignment
*
* UniqueNodeStats* extra_stats()
* Returns an UniqueNodeStats pointer (= 0 if no factory is defined)
*
*
* int hashcode() returns the hashcode for the node
*
* int refcount() returns the reference count for the node
*
* bool operator== checks for equality (note that this is a pointer compare, really really fast)
*
* bool empty() returns whether the node is undefined, i.e. created through the default ctor
*
* int arity() shorthand for sym.args().size()
*
* token_t token() return identifying token for the node
*
* const SymVec& args()
* returns the children of the node (in a vector<Sym>)
*
* unsigned size() returns the size, i.e., number of nodes
*
* unsigned depth() returns the depth
*
* iterator() returns the pointer to the node in the hashtable
*
*
********** Static functions: ********************
*
*
*
* SymMap& get_dag() returns the hash table containing all nodes. This should only be used for inspection,
* even though the dag itself is not const. This to enable the use of the ctor Sym(SymIterator) to inspect
* using the Sym interface (rather than the hash table interface). This does allow you to make destructive
* changes to the class, so use with care
*
* set_factory_function( UniqueNodeStats (*)(const Sym&) )
* Set the factory function
*
* clear_factory_function()
* Clears the factory function, allocated UniqueNodeStats will still be deleted, but no new ones will be created.
*
********** Utility Functions ********************
*
* Sym get_subtree(const Sym& org, int i)
* Retreive the i-th subtree from the Sym. Standard depth first ordering, where root has index 0 and the
* rightmost terminal has index sym.size()-1
*
* Sym insert_subtree(const Sym& org, int i, const Sym& subtree)
* Returns a Sym that is equal to 'org', for which the i-th subtree (same ordering as get_subtree) is replaced
* by the third argument subtree.
*
* Sym next(const Sym&)
* Returns the successor of the argument sym from the hashtable with wrap around. This is implemented just because
* it can be done. It may be an interesting way to mutate...
*
* */

View file

@ -0,0 +1,135 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <sstream>
#include <vector>
#include "Sym.h"
using namespace std;
typedef UniqueNodeStats* (*NodeStatFunc)(Sym&);
UniqueNodeStats* (*Sym::factory)(const Sym&) = 0;
void (*Sym::extra_dtor)(token_t) = 0;
SymMap Sym::dag(100000); // reserve space for so many nodes
size_t get_size(const SymVec& vec) {
size_t sz = 0;
for (unsigned i = 0; i < vec.size(); ++i) {
sz += vec[i].size();
}
return sz;
}
size_t get_depth(const SymVec& vec) {
size_t dp = 1;
for (unsigned i = 0; i < vec.size(); ++i) {
dp = std::max(dp, vec[i].depth());
}
return dp;
}
Sym::Sym(token_t tok, const SymVec& args_)
{
detail::SymKey key(tok, detail::SymArgs(args_));
detail::SymValue val;
node = dag.insert(pair<const detail::SymKey, detail::SymValue>(key, val)).first;
if (__unchecked_refcount() == 0) { // new node, set some stats
node->second.size = 1 + get_size(args_);
node->second.depth = 1 + get_depth(args_);
incref();
node->first.fixate();
// call the factory function if available
if (factory) node->second.uniqueNodeStats = factory(*this);
}
else incref();
}
Sym::Sym(token_t tok, const Sym& a) {
SymVec args_; args_.push_back(a);
detail::SymKey key(tok, detail::SymArgs(args_));
detail::SymValue val;
node = dag.insert(pair<const detail::SymKey, detail::SymValue>(key, val)).first;
if (__unchecked_refcount() == 0) { // new node, set some stats
node->second.size = 1 + get_size(args_);
node->second.depth = 1 + get_depth(args_);
incref();
node->first.fixate();
// call the factory function if available
if (factory) node->second.uniqueNodeStats = factory(*this);
}
else incref();
}
Sym::Sym(token_t tok) {
detail::SymKey key(tok);
detail::SymValue val;
node = dag.insert(pair<const detail::SymKey, detail::SymValue>(key, val)).first;
if (__unchecked_refcount() == 0) { // new node, set some stats
node->second.size = 1;
node->second.depth = 1;
incref();
// call the factory function if available
if (factory) node->second.uniqueNodeStats = factory(*this);
}
else incref();
}
std::pair<Sym,bool> insert_subtree_impl(const Sym& cur, size_t w, const Sym& nw) {
if (w-- == 0) return make_pair(nw, !(nw == cur));
const SymVec& vec = cur.args();
std::pair<Sym,bool> result;
unsigned i;
for (i = 0; i < vec.size(); ++i) {
if (w < vec[i].size()) {
result = insert_subtree_impl(vec[i], w, nw);
if (result.second == false) return std::make_pair(cur, false); // unchanged
break;
}
w -= vec[i].size();
}
SymVec newvec = cur.args();
newvec[i] = result.first;
return make_pair(Sym(cur.token(), newvec), true);
}
Sym insert_subtree(const Sym& cur, size_t w, const Sym& nw) {
return insert_subtree_impl(cur,w,nw).first;
}
Sym get_subtree(const Sym& cur, size_t w) {
if (w-- == 0) return cur;
const SymVec& vec = cur.args();
for (unsigned i = 0; i < vec.size(); ++i) {
if (w < vec[i].size()) return get_subtree(vec[i], w);
w-=vec[i].size();
}
return cur;
}

View file

@ -0,0 +1,164 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef SYMNODE_H_
#define SYMNODE_H_
#include <cassert>
#if __GNUC__ == 3
#include <backward/hash_map.h>
#else
#include <hash_map.h>
using std::hash_map;
#endif
/* Empty 'extra statistics' structure, derive from this to keep other characteristics of nodes */
struct UniqueNodeStats { virtual ~UniqueNodeStats(){} };
#include "SymImpl.h"
#include "token.h"
typedef hash_map<detail::SymKey, detail::SymValue, detail::SymKey::Hash> SymMap;
typedef SymMap::iterator SymIterator;
class Sym;
typedef std::vector<Sym> SymVec;
/* Sym is the tree, for which all the nodes are stored in a hash table.
* This makes checking for equality O(1) */
class Sym
{
public:
Sym() : node(dag.end()) {}
explicit Sym(token_t token, const SymVec& args);
explicit Sym(token_t token, const Sym& args);
explicit Sym(token_t var);
explicit Sym(SymIterator it) : node(it) { incref(); }
Sym(const Sym& oth) : node(oth.node) { incref(); }
~Sym() { decref(); }
const Sym& operator=(const Sym& oth) {
if (oth.node == node) return *this;
decref();
node = oth.node;
incref();
return *this;
}
/* Unique Stats are user defined */
UniqueNodeStats* extra_stats() const { return empty()? 0 : node->second.uniqueNodeStats; }
int hashcode() const { detail::SymKey::Hash hash; return hash(node->first); }
// Friends, need to touch the node
friend struct detail::SymKey::Hash;
friend struct detail::SymKey;
unsigned refcount() const { return empty()? 0: node->second.refcount; }
bool operator==(const Sym& other) const {
return node == other.node;
}
bool operator!=(const Sym& other) const { return !(*this == other); }
bool empty() const { return node == dag.end(); }
/* Support for traversing trees */
unsigned arity() const { return node->first.arity(); }
token_t token() const { assert(!empty()); return node->first.token; }
const SymVec& args() const { return node->first.vec(); }
/* size() - depth */
unsigned size() const { return empty()? 0 : node->second.size; }
unsigned depth() const { return empty()? 0 : node->second.depth; }
SymMap::iterator iterator() const { return node; }
/* Statics accessing some static members */
static SymMap& get_dag() { return dag; }
/* This function can be set to create some UniqueNodeStats derivative that can contain extra stats for a node,
* it can for instance be used to create ERC's and what not. */
static void set_factory_function(UniqueNodeStats* (*f)(const Sym&)) { factory=f; }
static void clear_factory_function() { factory = 0; }
static void set_extra_dtor( void (*extra)(token_t) ) { extra_dtor = extra; }
unsigned address() const { return reinterpret_cast<unsigned>(&*node); }
private :
// implements getting subtrees
Sym private_get(size_t w) const;
unsigned __unchecked_refcount() const { return node->second.refcount; }
void incref() {
if (!empty())
++(node->second.refcount);
}
void decref() {
if (!empty() && --(node->second.refcount) == 0) {
if (extra_dtor) {
extra_dtor(token());
}
dag.erase(node);
}
}
// The one and only data member, an iterator into the static map below
SymIterator node;
// A static hash_map that contains all live nodes..
static SymMap dag;
// Factory function for creating extra node stats, default will be 0
static UniqueNodeStats* (*factory)(const Sym&);
static void (*extra_dtor)(token_t);
};
/* Utility hash functor for syms */
class HashSym {
public:
int operator()(const Sym& sym) const { return sym.hashcode(); }
};
/* Utility Functions */
// get_subtree retrieves a subtree by standard ordering (0=root, and then depth first)
Sym get_subtree(const Sym& org, size_t w);
// insert_subtree uses the same ordering as get and inserts the second argument, returning a new tree
Sym insert_subtree(const Sym& org, size_t w, const Sym& nw);
/* Get the successor from the hashtable, no particular purpose other than an interesting way to mutate */
inline Sym next(const Sym& sym) {
SymIterator it = sym.iterator();
++it;
if (it == Sym::get_dag().end()) it = Sym::get_dag().begin();
return Sym(it);
}
#endif

View file

@ -0,0 +1,109 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "Sym.h"
using namespace std;
namespace detail {
class SymArgsImpl {
public:
vector<Sym> owned_args;
};
size_t SymArgs::len() const {
return vec().size();
}
SymArgs::SymArgs() : pimpl( new SymArgsImpl ) {
args_ptr = &pimpl->owned_args;
}
SymArgs::SymArgs(const std::vector<Sym>& v) : pimpl(0) {
args_ptr = &v;
}
SymArgs::~SymArgs() {
delete pimpl;
}
SymArgs::SymArgs(const SymArgs& args) : pimpl(0), args_ptr(args.args_ptr) {
if (args.pimpl && args.args_ptr == &args.pimpl->owned_args) {
pimpl = new SymArgsImpl(*args.pimpl);
args_ptr = &pimpl->owned_args;
}
}
const SymArgs& SymArgs::operator=(const SymArgs& args) {
if (args.pimpl && args.args_ptr == &args.pimpl->owned_args) {
pimpl = new SymArgsImpl(*args.pimpl);
args_ptr = &pimpl->owned_args;
} else {
args_ptr = args.args_ptr;
}
return *this;
}
void SymArgs::fixate() const {
assert(pimpl == 0);
pimpl = new SymArgsImpl;
pimpl->owned_args = *args_ptr;
args_ptr = &pimpl->owned_args;
}
// For Tackett's hashcode
#define PRIMET 21523
#define HASHMOD 277218551
const int nprimes = 4;
const unsigned long primes[] = {3221225473ul, 201326611ul, 1610612741ul, 805306457ul};
int SymKey::calc_hash() const {
unsigned long hash = unsigned(token);
hash *= PRIMET;
const SymVec& v = args.vec();
for (unsigned i = 0; i < v.size(); ++i) {
hash += ( (v[i].address() >> 3) * primes[i%nprimes]) % HASHMOD;
}
return hash;// % HASHMOD;
}
bool SymKey::operator==(const SymKey& other) const {
if (token != other.token) return false;
return args.vec() == other.args.vec();
}
/* Just to store this info somewhere:
*
* Address Based Hash Function Implementation
* uint32 address_hash(char* addr)
* {
* register uint32 key;
* key = (uint32) addr;
* return (key >> 3) * 2654435761;
* }
*/
SymValue::SymValue() : refcount(0), size(0), depth(0), uniqueNodeStats(0) {}
SymValue::~SymValue() { delete uniqueNodeStats; }
} // namespace detail

View file

@ -0,0 +1,103 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef __SYM_IMPL_H__
#define __SYM_IMPL_H__
#include <vector>
#include "token.h"
class Sym;
namespace detail {
class SymArgsImpl;
class SymArgs {
mutable SymArgsImpl* pimpl; // contains circular reference to vector<Sym>
mutable const std::vector<Sym>* args_ptr;
public:
SymArgs();
SymArgs(const std::vector<Sym>& v);
~SymArgs();
SymArgs(const SymArgs& args);
const SymArgs& SymArgs::operator=(const SymArgs& other);
size_t len() const;
const std::vector<Sym>& vec() const { return *args_ptr; }
void fixate() const;
};
class SymKey
{
public:
SymKey(token_t _token) : args(), token(_token) {}
SymKey(token_t _token, const detail::SymArgs& _args) : args(_args), token(_token) {}
private:
detail::SymArgs args;
public:
bool operator==(const SymKey& other) const;
struct Hash
{
int operator()(const SymKey& k) const { return k.calc_hash(); };
};
unsigned arity() const { return args.len(); }
const std::vector<Sym>& vec() const { return args.vec(); }
token_t token; // identifies the function
// fixates (i.e. claims memory) for the embedded vector of Syms
void fixate() const { args.fixate(); }
private:
int calc_hash() const;
};
struct SymValue
{
friend class Sym;
SymValue();
~SymValue();
unsigned getRefCount() const { return refcount; }
unsigned getSize() const { return size; }
unsigned getDepth() const { return depth; }
private :
// for reference counting
unsigned refcount;
// some simple stats
unsigned size;
unsigned depth;
UniqueNodeStats* uniqueNodeStats;
};
} // namespace detail
#endif

View file

@ -0,0 +1,28 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef TOKEN_H
#define TOKEN_H
typedef unsigned token_t;
struct functor_t {
token_t token;
unsigned arity;
};
#endif

View file

@ -0,0 +1,359 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <LanguageTable.h>
#include <TreeBuilder.h>
#include <FunDef.h>
#include <Dataset.h>
#include <eoSymInit.h>
#include <eoSym.h>
#include <eoPop.h>
#include <eoSymMutate.h>
#include <eoSymCrossover.h>
#include <eoSymEval.h>
#include <eoOpContainer.h>
#include <eoDetTournamentSelect.h>
#include <eoMergeReduce.h>
#include <eoGenContinue.h>
#include <eoEasyEA.h>
#include <eoGeneralBreeder.h>
#include <utils/eoParser.h>
#include <utils/eoCheckPoint.h>
#include <utils/eoStat.h>
#include <utils/eoStdoutMonitor.h>
#include <utils/eoRNG.h>
using namespace std;
typedef EoSym<eoMinimizingFitness> EoType;
static int functions_added = 0;
void add_function(LanguageTable& table, eoParser& parser, string name, unsigned arity, token_t token, const FunDef& fun);
void setup_language(LanguageTable& table, eoParser& parser);
template <class T>
T& select(bool check, T& a, T& b) { if (check) return a; return b; }
class eoBestIndividualStat : public eoSortedStat<EoType, string> {
public:
eoBestIndividualStat() : eoSortedStat<EoType, string>("", "best individual") {}
void operator()(const vector<const EoType*>& _pop) {
ostringstream os;
os << (Sym) *_pop[0];
value() = os.str();
}
};
class AverageSizeStat : public eoStat<EoType, double> {
public:
AverageSizeStat() : eoStat<EoType, double>(0.0, "Average size population") {}
void operator()(const eoPop<EoType>& _pop) {
double total = 0.0;
for (unsigned i = 0; i < _pop.size(); ++i) {
total += _pop[i].size();
}
value() = total/_pop.size();
}
};
class SumSizeStat : public eoStat<EoType, unsigned> {
public:
SumSizeStat() : eoStat<EoType, unsigned>(0u, "Number of subtrees") {}
void operator()(const eoPop<EoType>& _pop) {
unsigned total = 0;
for (unsigned i = 0; i < _pop.size(); ++i) {
total += _pop[i].size();
}
value() = total;
}
};
class DagSizeStat : public eoStat<EoType, unsigned> {
public:
DagSizeStat() : eoStat<EoType, unsigned>(0u, "Number of distinct subtrees") {}
void operator()(const eoPop<EoType>& _pop) {
value() = Sym::get_dag().size();
}
};
int main(int argc, char* argv[]) {
eoParser parser(argc, argv);
/* Language */
LanguageTable table;
setup_language(table, parser);
/* Data */
eoValueParam<string> datafile = parser.createParam(string(""), "datafile", "Training data", 'd', string("Regression"), true); // mandatory
double train_percentage = parser.createParam(1.0, "trainperc", "Percentage of data used for training", 0, string("Regression")).value();
/* Population */
unsigned pop_size = parser.createParam(1500u, "population-size", "Population Size", 'p', string("Population")).value();
uint32_t seed = parser.createParam( uint32_t(time(0)), "random-seed", "Seed for rng", 'D').value();
cout << "Seed " << seed << endl;
rng.reseed(seed);
double var_prob = parser.createParam(
0.9,
"var-prob",
"Probability of selecting a var vs. const when creating a terminal",
0,
"Population").value();
double grow_prob = parser.createParam(
0.5,
"grow-prob",
"Probability of selecting 'grow' method instead of 'full' in initialization and mutation",
0,
"Population").value();
unsigned max_depth = parser.createParam(
8u,
"max-depth",
"Maximum depth used in initialization and mutation",
0,
"Population").value();
bool use_uniform = parser.createParam(
false,
"use-uniform",
"Use uniform node selection instead of bias towards internal nodes (functions)",
0,
"Population").value();
double subtree_mut_prob = parser.createParam(
0.2,
"subtree-mut-rate",
"Probability of performing subtree mutation",
0,
"Population").value();
double node_mut_prob = parser.createParam(
0.2,
"node-mut-rate",
"Probability of performing node mutation",
0,
"Population").value();
double subtree_xover_prob = parser.createParam(
0.4,
"xover-rate",
"Probability of performing subtree crossover",
0,
"Population").value();
double homologous_prob = parser.createParam(
0.4,
"homologous-rate",
"Probability of performing homologous crossover",
0,
"Population").value();
unsigned max_gens = parser.createParam(
50,
"max-gens",
"Maximum number of generations to run",
'g',
"Population").value();
unsigned tournamentsize = parser.createParam(
5,
"tournament-size",
"Tournament size used for selection",
't',
"Population").value();
unsigned meas_param = parser.createParam(
2u,
"measure",
"Error measure:\n\
0 -> absolute error\n\
1 -> mean squared error\n\
2 -> mean squared error scaled (equivalent with correlation)\n\
",
'm',
"Regression").value();
ErrorMeasure::measure meas = ErrorMeasure::mean_squared_scaled;
if (meas_param == 0) meas = ErrorMeasure::absolute;
if (meas_param == 1) meas = ErrorMeasure::mean_squared;
if (functions_added == 0) {
cout << "ERROR: no functions defined" << endl;
exit(1);
}
/* End parsing */
if (parser.userNeedsHelp())
{
parser.printHelp(std::cout);
return 1;
}
Dataset dataset;
dataset.load_data(datafile.value());
cout << "Data " << datafile.value() << " loaded " << endl;
/* Add Variables */
unsigned nvars = dataset.n_fields();
for (unsigned i = 0; i < nvars; ++i) {
table.add_function( SymVar(i).token(), 0);
}
TreeBuilder builder(table, var_prob);
eoSymInit<EoType> init(builder, grow_prob, max_depth);
eoPop<EoType> pop(pop_size, init);
BiasedNodeSelector biased_sel;
RandomNodeSelector random_sel;
NodeSelector& node_selector = select<NodeSelector>(use_uniform, random_sel, biased_sel);
eoProportionalOp<EoType> genetic_operator;
eoSymSubtreeMutate<EoType> submutate(builder, node_selector);
genetic_operator.add( submutate, subtree_mut_prob);
eoSymNodeMutate<EoType> nodemutate(table);
genetic_operator.add(nodemutate, node_mut_prob);
//eoQuadSubtreeCrossover<EoType> quad(node_selector);
eoBinSubtreeCrossover<EoType> bin(node_selector);
genetic_operator.add(bin, subtree_xover_prob);
eoBinHomologousCrossover<EoType> hom;
genetic_operator.add(hom, homologous_prob);
IntervalBoundsCheck check(dataset.input_minima(), dataset.input_maxima());
ErrorMeasure measure(dataset, train_percentage, meas);
eoSymPopEval<EoType> evaluator(check, measure);
eoDetTournamentSelect<EoType> selectOne(tournamentsize);
eoGeneralBreeder<EoType> breeder(selectOne, genetic_operator);
eoPlusReplacement<EoType> replace;
// Terminators
eoGenContinue<EoType> term(max_gens);
eoCheckPoint<EoType> checkpoint(term);
eoBestFitnessStat<EoType> beststat;
checkpoint.add(beststat);
eoBestIndividualStat printer;
AverageSizeStat avgSize;
DagSizeStat dagSize;
SumSizeStat sumSize;
checkpoint.add(printer);
checkpoint.add(avgSize);
checkpoint.add(dagSize);
checkpoint.add(sumSize);
eoStdoutMonitor genmon;
genmon.add(beststat);
genmon.add(printer);
genmon.add(avgSize);
genmon.add(dagSize);
genmon.add(sumSize);
checkpoint.add(genmon);
eoPop<EoType> dummy;
evaluator(pop, dummy);
eoEasyEA<EoType> ea(checkpoint, evaluator, breeder, replace);
ea(pop); // run
}
void add_function(LanguageTable& table, eoParser& parser, string name, unsigned arity, token_t token, const FunDef& fun, bool all) {
ostringstream desc;
desc << "Enable function " << name << " arity = " << arity;
bool enabled = parser.createParam(false, name, desc.str(), 0, "Language").value();
if (enabled || all) {
cout << "Func " << name << " enabled" << endl;
table.add_function(token, arity);
if (arity > 0) functions_added++;
}
}
void setup_language(LanguageTable& table, eoParser& parser) {
bool all = parser.createParam(false,"all", "Enable all functions").value();
bool ratio = parser.createParam(false,"ratio","Enable rational functions (inv,min,sum,prod)").value();
bool poly = parser.createParam(false,"poly","Enable polynomial functions (min,sum,prod)").value();
// assumes that at this point all tokens are defined (none are zeroed out, which can happen with ERCs)
vector<const FunDef*> lang = get_defined_functions();
for (token_t i = 0; i < lang.size(); ++i) {
if (lang[i] == 0) continue;
bool is_poly = false;
if (poly && (i == prod_token || i == sum_token || i == min_token) ) {
is_poly = true;
}
bool is_ratio = false;
if (ratio && (is_poly || i == inv_token)) {
is_ratio = true;
}
const FunDef& fun = *lang[i];
if (fun.has_varargs() ) {
for (unsigned j = fun.min_arity(); j < fun.min_arity() + 8; ++j) {
if (j==1) continue; // prod 1 and sum 1 are useless
ostringstream nm;
nm << fun.name() << j;
bool addanyway = (all || is_ratio || is_poly) && j == 2;
add_function(table, parser, nm.str(), j, i, fun, addanyway);
}
}
else {
add_function(table, parser, fun.name(), fun.min_arity(), i, fun, all || is_ratio || is_poly);
}
}
}

Binary file not shown.

Binary file not shown.

View file

@ -0,0 +1,185 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <utils/eoRNG.h>
#include "FunDef.h"
#include "sym_compile.h"
#include "Dataset.h"
#include "ErrorMeasure.h"
#include "LanguageTable.h"
#include "BoundsCheck.h"
#include "TreeBuilder.h"
#include <iostream>
using namespace std;
void test_xover();
int main() {
Dataset dataset;
dataset.load_data("problem4.dat");
cout << "Records/Fields " << dataset.n_records() << ' ' << dataset.n_fields() << endl;
LanguageTable table;
table.add_function(sum_token, 2);
table.add_function(prod_token, 2);
table.add_function(sum_token, 0);
table.add_function(prod_token, 0);
table.add_function(inv_token, 1);
table.add_function(min_token, 1);
for (unsigned i = 0; i < dataset.n_fields(); ++i) {
table.add_function( SymVar(i).token(), 0);
}
TreeBuilder builder(table);
IntervalBoundsCheck bounds(dataset.input_minima(), dataset.input_maxima() );
ErrorMeasure measure(dataset, 1.0);
unsigned n = 1000;
unsigned k = 0;
vector<Sym> pop;
double sumsize = 0;
for (unsigned i = 0; i < n; ++i) {
Sym sym = builder.build_tree(6, i%2);
pop.push_back(sym);
sumsize += sym.size();
}
cout << "Size " << sumsize/pop.size() << endl;
// shuffle
for (unsigned gen = 0; gen < 10; ++gen) {
random_shuffle(pop.begin(), pop.end());
for (unsigned i = 0; i < pop.size(); i+=2) {
unsigned p1 = rng.random(pop[i].size());
unsigned p2 = rng.random(pop[i+1].size());
Sym a = insert_subtree(pop[i], p1, get_subtree(pop[i+1], p2));
Sym b = insert_subtree(pop[i+1], p2, get_subtree(pop[i], p1));
pop[i] = a;
pop[i+1] = b;
}
cout << gen << ' ' << Sym::get_dag().size() << endl;
}
vector<Sym> oldpop;
swap(pop,oldpop);
for (unsigned i = 0; i < oldpop.size(); ++i) {
Sym sym = oldpop[i];
if (!bounds.in_bounds(sym)) {
k++;
continue;
}
pop.push_back(sym);
}
cout << "Done" << endl;
// full compilation
time_t start_time = time(0);
time_t compile_time;
{
multi_function f = compile(pop);
compile_time = time(0);
vector<double> out(pop.size());
for (unsigned j = 0; j < dataset.n_records(); ++j) {
f(&dataset.get_inputs(j)[0], &out[0]);
}
}
time_t end_time = time(0);
cout << "Evaluated " << n-k << " syms in " << end_time - start_time << " seconds, compile took " << compile_time - start_time << " seconds" << endl;
start_time = time(0);
vector<single_function> funcs;
compile(pop, funcs);
compile_time = time(0);
for (unsigned i = 0; i < pop.size(); ++i) {
single_function f = funcs[i];
for (unsigned j = 0; j < dataset.n_records(); ++j) {
f(&dataset.get_inputs(j)[0]);
}
}
end_time = time(0);
cout << "Evaluated " << n-k << " syms in " << end_time - start_time << " seconds, compile took " << compile_time - start_time << " seconds" << endl;
return 0; // skip the 'slow' one-by-one method
start_time = time(0);
for (unsigned i = 0; i < pop.size(); ++i) {
single_function f = compile(pop[i]);
for (unsigned j = 0; j < dataset.n_records(); ++j) {
f(&dataset.get_inputs(j)[0]);
}
}
end_time = time(0);
cout << "Evaluated " << n-k << " syms in " << end_time - start_time << " seconds" << endl;
}
void test_xover() {
Sym c = SymVar(0);
Sym x = c + c * c + c;
cout << c << endl;
cout << x << endl;
vector<Sym> pop;
for (unsigned i = 0; i < x.size(); ++i) {
for (unsigned j = 0; j < x.size(); ++j) {
Sym s = insert_subtree(x, i, get_subtree(x, j));
pop.push_back(s);
cout << i << ' ' << j << ' ' << s << endl;
}
}
x = Sym();
c = Sym();
SymMap& dag = Sym::get_dag();
for (SymMap::iterator it = dag.begin(); it != dag.end(); ++it) {
Sym s(it);
cout << s << ' ' << s.refcount() << endl;
}
}

Binary file not shown.

View file

@ -0,0 +1,119 @@
/*
* Copyright (C) 2005 Maarten Keijzer
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include "LanguageTable.h"
#include "TreeBuilder.h"
#include "FunDef.h"
#include "Dataset.h"
#include "eoSymInit.h"
#include "eoSym.h"
#include "eoPop.h"
#include "eoSymMutate.h"
#include "eoSymCrossover.h"
#include "eoSymEval.h"
typedef EoSym<double> EoType;
int main() {
LanguageTable table;
table.add_function(sum_token, 2);
table.add_function(prod_token, 2);
table.add_function(inv_token, 1);
table.add_function(min_token, 1);
table.add_function( SymVar(0).token(), 0);
table.add_function(tan_token, 1);
table.add_function(sum_token, 0);
table.add_function(prod_token, 0);
TreeBuilder builder(table);
eoSymInit<EoType> init(builder);
eoPop<EoType> pop(10, init);
for (unsigned i = 0; i < pop.size(); ++i) {
// write out pretty printed
cout << (Sym) pop[i] << endl;
}
BiasedNodeSelector node_selector;
eoSymSubtreeMutate<EoType> mutate1(builder, node_selector);
eoSymNodeMutate<EoType> mutate2(table);
cout << "****** MUTATION ************" << endl;
for (unsigned i = 0; i < pop.size(); ++i) {
cout << "Before " << (Sym) pop[i] << endl;
mutate1(pop[i]);
cout << "After 1 " << (Sym) pop[i] << endl;
mutate2(pop[i]);
cout << "After 2 " << (Sym) pop[i] << endl;
}
cout << "****** CROSSOVER ***********" << endl;
eoQuadSubtreeCrossover<EoType> quad(node_selector);
eoBinSubtreeCrossover<EoType> bin(node_selector);
eoBinHomologousCrossover<EoType> hom;
for (unsigned i = 0; i < pop.size()-1; ++i) {
cout << "Before " << (Sym) pop[i] << endl;
cout << "Before " << (Sym) pop[i+1] << endl;
hom(pop[i], pop[i+1]);
cout << "After hom " << (Sym) pop[i] << endl;
cout << "After hom " << (Sym) pop[i+1] << endl;
quad(pop[i], pop[i+1]);
cout << "After quad " << (Sym) pop[i] << endl;
cout << "After quad " << (Sym) pop[i+1] << endl;
bin(pop[i], pop[i+1]);
cout << "After bin " << (Sym) pop[i] << endl;
cout << "After bin " << (Sym) pop[i+1] << endl;
cout << endl;
}
cout << "****** Evaluation **********" << endl;
Dataset dataset;
dataset.load_data("problem4.dat");
IntervalBoundsCheck check(dataset.input_minima(), dataset.input_maxima());
ErrorMeasure measure(dataset, 0.90, ErrorMeasure::mean_squared_scaled);
eoSymPopEval<EoType> evaluator(check, measure);
eoPop<EoType> dummy;
evaluator(pop, dummy);
for (unsigned i = 0; i < pop.size(); ++i) {
cout << pop[i] << endl;
}
}

Binary file not shown.

View file

@ -0,0 +1,102 @@
# 101 2 nan nan
0 -0
0.1 -8.89903723981037e-05
0.2 -0.00122598240763888
0.3 -0.00517587564272387
0.4 -0.0132382052428645
0.5 -0.0254643081877282
0.6 -0.0407070337997998
0.7 -0.057285499199392
0.8 -0.0737562490233578
0.9 -0.0892727708954409
1 -0.103268200413493
1.1 -0.114577577792354
1.2 -0.120446083316857
1.3 -0.116000062935524
1.4 -0.0946298417869761
1.5 -0.0493963195458011
1.6 0.0248409598316732
1.7 0.129207388804052
1.8 0.259260510831339
1.9 0.404709502287215
2 0.550653582802201
2.1 0.680124882844178
2.2 0.777313232294796
2.3 0.830628236863242
2.4 0.834788256127692
2.5 0.79140362503436
2.6 0.707953643967287
2.7 0.595509207315954
2.8 0.46588939354196
2.9 0.329064872475183
3 0.191504886385553
3.1 0.0558518299407852
3.2 -0.0781019284912663
3.3 -0.211542824409141
3.4 -0.344524221510933
3.5 -0.474312294176053
3.6 -0.594727989459508
3.7 -0.696713721292122
3.8 -0.769988957905497
3.9 -0.805344773512717
4 -0.796949283133396
4.1 -0.744036874310458
4.2 -0.651525836186196
4.3 -0.529396820025977
4.4 -0.39098574050144
4.5 -0.250612439788816
4.6 -0.121113466670896
4.7 -0.0118426008551395
4.8 0.0724429930487275
4.9 0.13163323998176
5 0.169341449166714
5.1 0.191319990014267
5.2 0.203657703492626
5.3 0.211210264725322
5.4 0.216611758596317
5.5 0.220036646440173
5.6 0.219677788419796
5.7 0.212731354762643
5.8 0.196574232374672
5.9 0.169803441763318
6 0.132875383837662
6.1 0.0882099910986659
6.2 0.0397733622937463
6.3 -0.00771703924831375
6.4 -0.0497388405970527
6.5 -0.0828196592845663
6.6 -0.105101954751146
6.7 -0.116508794142315
6.8 -0.118508967610477
6.9 -0.113576955648085
7 -0.104507044103426
7.1 -0.0937591677397028
7.2 -0.0829871341075209
7.3 -0.0728391340503617
7.4 -0.0630443677389944
7.5 -0.0527284491080759
7.6 -0.0408508284036276
7.7 -0.0266394584962638
7.8 -0.00991225704809318
7.9 0.00878546797299183
8 0.0282459694154097
8.1 0.0468334978875681
8.2 0.0628175943181446
8.3 0.0747179317764707
8.4 0.0815794072993519
8.5 0.0831208473565567
8.6 0.0797359724522078
8.7 0.072361534724142
8.8 0.0622560784562081
8.9 0.0507477567273995
9 0.0390091892966309
9.1 0.0279034956336959
9.2 0.0179233472968783
9.3 0.00922050032813963
9.4 0.00170282350837218
9.5 -0.00483635091477891
9.6 -0.010593872761556
9.7 -0.0156676782318098
9.8 -0.020019888098685
9.9 -0.0234934811028401
10 -0.0258701880584331