The Evolving Tree core programs

The core of the Evolving Tree consists of two programs, etreetrain and etreequery, which train and query a tree structure, respectively. This document describes how to invoke them, together with the etreenewick export utility.

etreetrain

The first step in using any neural algorithm is training. The etreetrain program trains a tree from a given data file and a set of training parameters. It is invoked as follows.

etreetrain <training data file> <output file> <parameter file>

The structure of the training data file is documented separately. The output file is where the resulting tree structure is saved; it should have the suffix ebd (Evolving Tree binary data). Note that this file is a raw binary dump and is not portable between different systems. The parameter file contains the training parameters; its structure is documented in the Doxygen portion of the documentation. Usually the wrapper scripts handle this transparently.
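The invocation above can be wrapped in a small script. The following is a minimal sketch, not part of the Evolving Tree distribution: the function names are hypothetical, it assumes etreetrain is on the PATH, and it assumes the program signals failure with a non-zero exit status, which this document does not confirm.

```python
import subprocess
from pathlib import Path


def build_train_command(data_file, output_file, parameter_file):
    """Return the etreetrain argument list in the documented order."""
    output_path = Path(output_file)
    if output_path.suffix != ".ebd":
        # The documentation recommends the ebd suffix; append it if missing.
        output_path = output_path.with_name(output_path.name + ".ebd")
    return ["etreetrain", str(data_file), str(output_path), str(parameter_file)]


def train(data_file, output_file, parameter_file):
    """Run etreetrain (assumed to be on the PATH) and return the tree file path.

    Assumption: a failed run yields a non-zero exit status, in which case
    subprocess.run raises CalledProcessError.
    """
    command = build_train_command(data_file, output_file, parameter_file)
    subprocess.run(command, check=True)
    return command[2]
```

Because the tree file is a non-portable binary dump, a wrapper like this is best run on the same system that will later query the tree.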

etreequery

This program loads a previously trained tree and runs queries against it. It is invoked as follows.

etreequery <training data file> <testing data file> <tree file>

The first argument is the file containing the data to be analyzed, usually the training data. The second is the file containing the query vectors. The last is the tree file produced by the etreetrain program above.

The program first loads the tree and then maps all the training vectors to the leaf nodes. It then takes the query vectors one by one and finds the best-matching unit (BMU) for each. All training vectors that were mapped to this node are then looked up and their indices (usually class ids) are printed. The exact output format is described in the Doxygen files.
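The procedure above can be illustrated with a simplified sketch. It is not the program's actual implementation: it treats the leaf nodes as a flat list of prototype vectors, finds the BMU by brute-force Euclidean distance, and returns the ids instead of printing them in the real output format. All names and the choice of distance measure are assumptions made for illustration.

```python
from collections import defaultdict
import math


def best_matching_unit(vector, leaf_prototypes):
    """Return the index of the leaf prototype closest to vector.

    Assumption: Euclidean distance over a flat list of leaves; the real
    program's search strategy is not described in this document.
    """
    return min(
        range(len(leaf_prototypes)),
        key=lambda i: math.dist(vector, leaf_prototypes[i]),
    )


def query(leaf_prototypes, training_vectors, training_ids, query_vectors):
    """For each query vector, list the ids of training vectors sharing its BMU."""
    # Step 1: map every training vector to its leaf node.
    members = defaultdict(list)
    for vector, vector_id in zip(training_vectors, training_ids):
        members[best_matching_unit(vector, leaf_prototypes)].append(vector_id)
    # Step 2: find each query vector's BMU and collect that leaf's members.
    return [
        members.get(best_matching_unit(q, leaf_prototypes), [])
        for q in query_vectors
    ]
```

When the ids are class labels, the list returned for a query vector can be used, for example, for a majority vote over the classes found in its leaf.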

etreenewick

This program outputs the specified ETree and data set in the Newick tree format. Newick trees are commonly used in bioinformatics for defining tree structures. Using it is straightforward.

etreenewick <data file> <tree file> <output file>

The data file usually contains the data used in training, and the tree file contains the tree trained with etreetrain. The result is written to the output file.
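For readers unfamiliar with the Newick format, the sketch below serializes a small nested tree into a Newick string. It illustrates the general format only; the node names and tree shape are hypothetical, and the exact labels that etreenewick writes are not specified in this document.

```python
def to_newick(node):
    """Serialize a (name, children) tuple as a Newick subtree string.

    Names are assumed to contain no Newick special characters such as
    parentheses, commas, colons, or semicolons.
    """
    name, children = node
    if not children:
        return name
    inner = ",".join(to_newick(child) for child in children)
    return f"({inner}){name}"


# Hypothetical tree: a root with one leaf and one internal node with two leaves.
tree = ("root", [("leaf1", []), ("node1", [("leaf2", []), ("leaf3", [])])])

# A complete Newick tree is terminated by a semicolon.
newick_text = to_newick(tree) + ";"
print(newick_text)  # (leaf1,(leaf2,leaf3)node1)root;
```

Children are written in parentheses before the name of their parent, so the nesting of parentheses mirrors the nesting of the tree.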


Copyright 2004-2006 Jussi Pakkanen, Laboratory of Computer and Information Science, Helsinki University of Technology.