M-file format specifications for the SOM Toolbox project
========================================================

--------------------------------------------------------
File names
----------

File names will be of the form

   som_xxx.m

where xxx stands for an arbitrarily long descriptive name.
Let's try to keep them short, though. For DOS / Win3.1 we'll
prepare a separate package with shorter 8+3 filenames at some
point.

--------------------------------------------------------
File structure
--------------

In principle each file should be structured as follows. In
practice you can add, combine or leave out parts as necessary.

function [ret1] = fu(arg1,arg2)

% Help
%
%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Check arguments

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Initialization

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Action

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Build / clean up the return arguments

--------------------------------------------------------
Error
-----

When a function encounters an error, call error('message')
to exit from the function.

--------------------------------------------------------
Help
----

Each file will have a short help notice at the beginning of
the file. The format of the help is as follows:

% FUNCTION_NAME: one-line description of the purpose/function of the file
%
% [ret1, ret2] = function(arg1, arg2, arg3, ...  ;call format with optional
%                         arg4, [arg5])          ;parameters in []'s
% ARGUMENTS:
%
%  arg1:   size / type and description
%  arg2:   size / type and description
%  [arg5]: size / type and description [default value]  ;optional argument
%
%  arguments in []'s are optional
%
% RETURNS:
%
%  ret1: size / type and description
%  ret2: size / type and description
%
% A longer multiline description of the function,
% if necessary.
%
% SEE ALSO: function1, function2, function3
%

In the help use the following variable names:

   n1, n2, ..., nk   the sizes of the SOM along each dimension
   dim               the input dimension
   k                 the output dimension (dimension of the grid)
   p                 the length of training

--------------------------------------------------------
Comments
--------

The files should be well documented. At least the following
things should be done:

 - logical parts of the algorithm should be highlighted
 - the most important variables should be explained

Additionally you may include:

 - a description of the purpose of important sections/commands
 - possibly a simplified algorithm to aid in understanding
 - advice for users who want to change something (such as the
   distance measure)

--------------------------------------------------------
Standard variable names
-----------------------

To make the development and understanding of the package
easier, the following standard variable names should be used:

 - all structures start with a leading 's' and continue with
   a capital letter (e.g. sData)
 - all matrices start with a capital letter (e.g. Udist)
 - all vectors, strings and scalars are in lowercase letters
   (e.g. diff)

   dim        = dimension of the input space

   sData      = data struct
   D          = data matrix
   dlen       = number of data vectors
   dlabels    = labels of data
   data_id    = data set name

   sMap       = map struct
   M          = map matrix (codebook)
   msize      = map grid sizes
   mdim       = map output dimension (dimension of the grid)
   munits     = total number of map units
   mlabels    = labels of map
   lattice    = map lattice
   shape      = map shape
   neigh      = neighborhood function
   cnames     = names of components
   cweights   = weights of components

   rad        = neighborhood radius
   rad_ini    = initial neighborhood radius
   rad_fin    = final neighborhood radius

   sTrseq     = train sequence
   alpha      = training coefficient
   alpha_ini  = initial training coefficient
   alpha_fin  = final training coefficient
   alphaf     = training coefficient function
   epochs     = train cycles
   train_len  = training length

Name strings:

   'hexa'     = hexagonal
   'rect'     = rectangular
   'toroid'   = toroid shape
   'cyl'      = cylinder shape
   'bubble'   = bubble neighborhood function
   'gauss'    = gaussian neighborhood function
   'ep'       = ep neighborhood function
   'random'   = random initialization
   'linear'   = linear initialization and linear train coeff. function
   'inv'      = inverse train coefficient function
   'seq'      = sequential training
   'batch'    = batch training
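
--------------------------------------------------------
Example
-------

As an illustration only, the following is a minimal sketch of
an M-file that follows the conventions in this document: the
som_xxx.m file name, the help format, the four-part file
structure, error('message') for exits, and the standard
variable names msize, mdim and munits. The function name
som_count_units and its behaviour are hypothetical and are not
part of the toolbox; the argument checks shown are one possible
choice, not a required set.

```matlab
function [munits] = som_count_units(msize)

% SOM_COUNT_UNITS: total number of map units for a given grid size
%
% [munits] = som_count_units(msize)
%
% ARGUMENTS:
%
%  msize: 1 x k vector / map grid sizes [n1, n2, ..., nk]
%
% RETURNS:
%
%  munits: scalar / total number of map units, n1*n2*...*nk
%
% Hypothetical example file; it only illustrates the file layout.
%
% SEE ALSO: prod
%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Check arguments

if nargin < 1
  error('Not enough input arguments: msize is required.');
end
if ~isnumeric(msize) || isempty(msize)
  error('msize must be a non-empty numeric vector.');
end
if any(msize(:) < 1) || any(msize(:) ~= round(msize(:)))
  error('msize must contain positive integers only.');
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Initialization

msize = msize(:).';      % force a row vector [n1, n2, ..., nk]
mdim  = length(msize);   % map output dimension (dimension of the grid)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Action

% the number of units is the product of the sizes along each dimension
munits = prod(msize(1:mdim));

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Build / clean up the return arguments

% munits is already a scalar, so nothing needs to be cleaned up here
```

With this file saved as som_count_units.m, the call
som_count_units([10 15]) returns 150, and calling the function
with no arguments exits through error().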