There are two ways to fuse data from different data sets. A simple approach is to combine the features from all sets into a single vector. The other is based on a hierarchical map structure, as depicted in figure 5.9. In this approach each set is first examined separately, after which the information extracted by the low level maps is combined on a high level map. Usually either the indexes or the coordinates of the BMUs on the low level maps are concatenated and used to train the high level map. Data histograms could also be used, as in the construction of the mill-technology maps.
Figure 5.9: Data fusion to combine data from three different areas of information: technology, economics and environment.
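The construction of the high level input can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three low level maps are assumed to be already trained, their codebooks are tiny hypothetical 2x2 grids with made-up weights, and the helper names `bmu_coords` and `fuse` are not taken from any library.

```python
from math import dist

def bmu_coords(codebook, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(codebook, key=lambda pos: dist(codebook[pos], x))

def fuse(codebooks, parts):
    """Concatenate the BMU coordinates of each part on its own low level map."""
    fused = []
    for codebook, x in zip(codebooks, parts):
        fused.extend(bmu_coords(codebook, x))
    return fused

# Hypothetical 2x2 low level maps: grid position -> weight vector.
tech_map = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
            (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
econ_map = {(0, 0): [0.1], (0, 1): [0.4], (1, 0): [0.6], (1, 1): [0.9]}
env_map = {(0, 0): [0.2, 0.2], (0, 1): [0.2, 0.8],
           (1, 0): [0.8, 0.2], (1, 1): [0.8, 0.8]}

# One mill described by its technology, economics and environment parts.
mill = ([0.9, 0.1], [0.45], [0.7, 0.9])
high_level_input = fuse([tech_map, econ_map, env_map], mill)
print(high_level_input)  # [1, 0, 0, 1, 1, 1]
```

Using coordinates rather than unit indexes keeps the grid topology of the low level maps visible to the high level map, since neighbouring units then have nearby values.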
To compare the two methods, both the hierarchical and the combined feature vector approaches were examined. In the hierarchical approach the data for the upper level map was constructed by taking the technological, economic and environmental information associated with each mill, finding the BMUs on the respective maps and concatenating the BMU coordinates. In addition, the original vector components were appended to the vector, but these components were used only in the updating phase, not in the winner search. This way they were classified implicitly (see section 3.3.2). By examining the u-matrix and the implicit classification results, six clusters were extracted from the resulting map. The clusters are listed in table 5.9 and shown in figure 5.10a. The map resulting from the combined feature vector approach is shown in figure 5.10b. This map was also divided into six clusters, which are listed in table 5.10.
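The idea of leaving components out of the winner search while still updating them can be illustrated with a binary mask. The sketch below is a simplification and makes several assumptions: it updates only the winning unit (a real SOM would also update the winner's neighbours), and the names `masked_bmu` and `train_step`, the learning rate and the toy numbers are all hypothetical.

```python
def masked_bmu(weights, x, mask):
    """Index of the unit closest to x, counting only components where mask is 1."""
    def masked_sqdist(w):
        return sum(m * (wi - xi) ** 2 for wi, xi, m in zip(w, x, mask))
    return min(range(len(weights)), key=lambda i: masked_sqdist(weights[i]))

def train_step(weights, x, mask, alpha=0.5):
    """Move the winner toward x in ALL components, including masked-out ones."""
    b = masked_bmu(weights, x, mask)
    weights[b] = [wi + alpha * (xi - wi) for wi, xi in zip(weights[b], x)]
    return b

# Two units; the first two components stand for BMU coordinates,
# the third for an original variable that is only classified implicitly.
weights = [[0.0, 1.0, 0.0], [3.0, 3.0, 10.0]]
x = [0.0, 1.0, 10.0]
mask = [1, 1, 0]  # the original variable is ignored in the winner search

winner = train_step(weights, x, mask)
print(winner, weights[winner])  # 0 [0.0, 1.0, 5.0]
```

Without the mask, unit 1 would win here because of the third component. With the mask, unit 0 wins on the coordinates alone, and its third component then drifts toward the values of the samples it attracts, which is what allows that component to be read off the map afterwards.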
The six clusters produced by the two approaches have many similarities. Both maps have a NEWS/WOODC cluster with high total production and low environmental load. The PULP AND LINERBOARD and OTHER clusters can also be found on both maps. The WF DEBTED and WF COATED clusters of the hierarchical map correspond to the WF and PULP clusters of the combined feature vector map.
However, the results of the two approaches differ considerably in certain respects. On the combined feature vector map TISSUE forms a cluster of its own, while on the hierarchical map it is grouped in the same cluster with the WF papers. The combined feature vector map also has a separate PULP cluster which is not present on the hierarchical map.
The largest variation is found in the economic variables. This should not come entirely as a surprise, since companies typically own many kinds of mills, and the relation between mill type and the financial situation of the parent company is therefore hardly a simple one. In future work, a promising remedy would be to add mill-specific cost-competitiveness data to the feature vector.
In the end, using both approaches is probably useful. The hierarchical approach is structured: it first examines each area independently and then tries to link the cases from the separate areas with each other. The combined feature vector approach, on the other hand, directly seeks correlations between all variables regardless of their area of origin. The final results are most reliable when the two approaches agree. When they do not, they offer two different viewpoints on the problem field.
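One way to check where the two approaches agree is to cross-tabulate the cluster that each mill receives on the two maps. The following is a minimal sketch and not the procedure used in this work: the labels are hypothetical placeholders rather than the mill data, and the correspondence between clusters is assumed to be supplied by the analyst.

```python
from collections import Counter

def cross_tabulate(labels_a, labels_b):
    """Count how often each pair of cluster labels occurs for the same sample."""
    return Counter(zip(labels_a, labels_b))

def agreement(labels_a, labels_b, correspondence):
    """Fraction of samples whose two labels match under the given correspondence."""
    hits = sum(1 for a, b in zip(labels_a, labels_b)
               if correspondence.get(a) == b)
    return hits / len(labels_a)

# Hypothetical cluster labels for five samples on the two maps.
hierarchical = ["A", "A", "B", "B", "C"]
combined = ["X", "X", "Y", "Z", "Z"]
correspondence = {"A": "X", "B": "Y", "C": "Z"}  # assumed pairing of clusters

table = cross_tabulate(hierarchical, combined)
for (a, b), count in sorted(table.items()):
    print(a, b, count)
print(agreement(hierarchical, combined, correspondence))  # 0.8
```

Samples that fall outside the assumed correspondence, such as the fourth sample here, are the cases where the two maps offer different viewpoints and are worth examining individually.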
Figure 5.10: U-matrices of the combined maps: the hierarchical approach (a) and the combined feature vector approach (b).
Table 5.9: Clusters of the hierarchical map.

| Cluster | Mills | Economic variables | Environmental load |
|---|---|---|---|
| PULP AND LINERBOARD | Linerboard mills and chemical pulp mills. | Low fixed investments, high equity and ROCE. | High environmental load, especially pulp production related. |
| WOODC AND TISSUE | Newsprint, fluting and tissue mills and associated pulp production. | Medium economic variables. | Low environmental load except for energy related variables. |
| OTHER | Cartonboard, other papers, diwa and chemimechanical pulps. | High pulp and paper share of sales, otherwise medium economic figures. | Low environmental load. |
| WF DEBTED | Woodfree papers and chemical pulp. | High fixed investments and debts. Low equity, ROCE and capital turnover. | High environmental load, except for energy related variables. |
| WF COATED | Woodfree papers and coaters. | High sales and EBIT, low sales growth. | Fairly high environmental load. |
|  |  | High forest industry share of sales, sales growth, equity, ROCE and capital turnover, low net indebtedness. | Mostly low environmental load. |
Table 5.10: Clusters of the combined feature vector map.

| Cluster | Mills | Economic variables | Environmental load |
|---|---|---|---|
| NEWS | High total production capacity: wood containing papers and mechanical pulp. | High sales, EBIT and net indebtedness. Low forest industry share of sales, equity, ROCE and capital turnover. | Low or medium environmental load. |
| PULP AND LINERBOARD | Linerboard and fluting papers, some woodfree paper, high chemical and semichemical pulp production. | High sales and EBIT, low capital turnover. | High or medium environmental load. |
| PULP | Chemical pulp mills. | Low sales, EBIT, fixed investments and debts, high sales growth, equity, ROCE and capital turnover. | High environmental load. |
| WF | WF papers and cartonboard. | High sales, EBIT and fixed investments. | Low environmental load. |
| OTHER | Wrapping and other papers, partially also diwa pulp. | Medium economic figures except for fairly high debts and low ROCE. | Low environmental load. |
| TISSUE | Tissue paper, diwa pulp. | High pulp and paper share of sales, fair ROCE and capital turnover. | Medium environmental load. |