#include <etree.hh>
Public Member Functions | |
void | Initialize () |
Initialization. | |
bool | HasTreeGrown () |
Stopping criteria. | |
Etree (DataCache *dc) | |
A basic constructor. | |
void | SetParameters (int dT, int dF, double e0, double s0, double t1, double t2, double bdecay) |
Sets the training parameters to the given values. | |
int | FindBMU (const double *datavec) const |
Find the best matching unit. | |
double | CalculateUpdateFactor (const double distance) const |
Calculate the update weight. | |
void | UpdateNodes (int BMU, const double *datavec) |
Moves the fringe nodes towards the data vector. | |
int | TreeDistance (const int node1, const int node2) const |
Calculate the tree distance between nodes. | |
vector< int > | FindTreePath (int nodenum) const |
Find path from root to the specified node. | |
vector< pair< int, int > > | GetSortedFringeDistances (const int nodenum) const |
Get tree distances from the given node. | |
bool | SetDataCache (DataCache &newcache) |
Sets a new data cache. | |
bool | IsLeafNode (unsigned int i) const |
Check for leafnodedness. | |
void | SingleIteration (const double *datavec) |
Trains the tree with one data vector. | |
void | SingleRound (bool exportData=false) |
This function calculates one round of the training algorithm. | |
void | DecayBMUcounts () |
Decrease BMU counters. | |
void | KmeansAdjust () |
Fine-tune leaf locations. | |
void | TrunkReshape (int nodenum) |
Optimizes locations of trunk nodes. | |
void | ExportChildrenAsMatlab (string varname, ostream &os) |
Export the current tree as a Matlab matrix. | |
void | ExportShapeAsMatlab (ostream &os) |
Prints every non-leaf node's relationship data as plain text. | |
bool | ReadFromDisk (string fname) |
Read tree data from given file. | |
int | SubtreeSize (int node) |
int | TotalNodes () |
int | FringeSize () |
vector< int > | NodesAtDepth (int depth) |
Returns the nodes that are at the specified depth in the tree. | |
unsigned int | MaxDepthNode () const |
Returns the index of the deepest node. Ties broken arbitrarily. | |
vector< int > | GetDepthDistribution () const |
int | GetMaxDepth () const |
Return the path to the deepest node. | |
void | PrintNewickTree (ostream &out, const int curnodenum=0) const |
void | PrintTreeInfo (ostream &os) const |
Prints some statistics in text form. | |
void | PrintNodeInfo (int nodenum, ostream &os) const |
Print the relationship info of a given node in plain text. | |
void | PrintNodeProto (int nodenum, ostream &os) const |
Print the prototype vector of a given node as plaintext. | |
void | CreateIndex () |
Builds the index information. | |
void | ClearIndex () |
vector< int > | GetIndexLabels (int i) const |
vector< int > | GetClassification (const double *queryvec) const |
Seeks the best matches to the given query vector. | |
void | PrintIndexStats (ostream &out) const |
Prints the distribution of mapped data vectors on leaf nodes. | |
Protected Member Functions | |
void | Unserialize (istream &is) |
Extract the tree structure from a binary mess. | |
void | UpdateNode (int curnode, int prevnode, int BMU, const double *datavec, double distance) |
Moves the fringe nodes towards the data vector. | |
Protected Attributes | |
vector< Node > | nodes |
A container for the nodes. | |
vector< int > | fringe |
int | dim |
Dimension of data. | |
double | round |
How manyth iteration is going on. | |
double | eta0 |
Initial learning-rate decay value. | |
double | sigma0 |
Initial neighborhood decay. | |
double | tau1 |
A time constant. | |
double | tau2 |
A time constant. See Haykin, pages 450-. | |
double | bmuDecay |
A regularization factor. | |
int | divisionThreshold |
When a node is BMU this many times, it is split. | |
int | divisionFactor |
When split, a node gets this many children. | |
double | weightThreshold |
A cutoff threshold for neighbor updating. | |
double | growingThreshold |
A cutoff threshold for tree growing. | |
int | prevTreeSize |
The size of the tree in the previous round. | |
DataCache * | datavecs |
A container for the training vectors. | |
vector< vector< int > > | labels |
A container for data vector indices that have been mapped to nodes. | |
Friends | |
class | VisualizerData |
ostream & | operator<< (ostream &os, Etree &outtree) |
Serializes the tree to the stream. | |
istream & | operator>> (istream &is, Etree &intree) |
Reads an Evolving Tree from a binary stream. |
The Evolving Tree contains a group of nodes, the training parameters and the functions necessary to train and query the tree.
|
A basic constructor. Sets all values to sensible guesstimates. For better performance, tweak them by hand.
|
|
Calculate the update weight. Figure out how much to update a node towards a data vector. Has exponential neighborhood and decay.
|
|
Builds the index information. This should be called after training, but before querying. |
|
Decrease BMU counters. Multiplies all the BMU counters of the leaves by regularization factor. |
|
Export the current tree as a Matlab matrix. The matrix is built from rows of node prototype vectors. Only exports the nodes that are in the fringe.
|
|
Prints every non-leaf node's relationship data as plain text. Uses the following format.
numberofelements
|
|
Find the best matching unit. Starting at the root, progresses down along the path until a leaf node is found. Note that this assumes nodes[0] is the root. Being a leaf node is equivalent to having zero children. |
|
Find path from root to the specified node. Returns a vector of integers that tell the path from the root node to the node specified along the tree. The first element is the root and the last element is the node in question.
|
|
Seeks the best matches to the given query vector. Finds the BMU, and returns the labels of the data vectors mapped to it.
|
|
Returns vector indicating how many nodes are at each level of the tree. |
|
Get tree distances from the given node. This function calculates the tree distances between the given node and all fringe nodes. The result is sorted according to distance. The first element in pair<int, int> is the distance and the second one is the index of the node. This is so that STL's sort can be used directly.
|
|
Stopping criteria. Check if the tree has grown between rounds. Can be used to stop the training.
|
|
Initialization. Initializes the tree and its root node. |
|
Check for leafnodedness.
|
|
Fine-tune leaf locations. Simple k-means optimization. Map all nodes to leafs, and then place the node to their center of mass. |
|
Returns the nodes that are at the specified depth in the tree. Depth of 0 means root, 1 means the first level and so on.
|
|
Prints the distribution of mapped data vectors on leaf nodes. Note that you must call CreateIndex before calling this. |
|
Read tree data from given file. Datacache must be set to something proper beforehand.
|
|
Sets a new data cache.
|
|
Trains the tree with one data vector. This takes a datavector and does one iteration of the training algorithm. It also splits the BMU should that be necessary. It does not update the round variable, because one round consists of many iterations.
|
|
This function calculates one round of the training algorithm. In a nutshell: shuffle data vectors, run SingleIteration on them. Stop.
|
|
Calculate the tree distance between nodes. Figuring out the tree distance is quite simple. First we find out the paths from the nodes to root. Then we find the place where they differ. Then we just add the lengths of the remaining paths and subtract one.
|
|
Optimizes locations of trunk nodes. A bottom-up recursion that places trunk nodes to the center of mass of their children.
|
|
Moves the fringe nodes towards the data vector. Recursively goes through the tree and updates weights of the leaf nodes.
|
|
Moves the fringe nodes towards the data vector. Updates the fringe node locations towards the data vector.
|
|
Serializes the tree to the stream. Note that datacache is not exported. Also no magic numbers or checksums are used. The order is as follows. First the constants in the order they appear in the class declaration. Next comes the fringe, then the serialized nodes.
|
|
Reads an Evolving Tree from a binary stream. No checking of any kind is done.
|
|
Indices to bottom layer nodes. Only these may be BMUs. |