Adaptive and Intelligent Systems
Applications 1994-1999 - Final Report

7. Applications of probabilistic modeling and search methods (PROMISE)

7.1 Intelligent inventory control (VTT-PROMISE)

Abstract

The VTT-PROMISE project was concerned with the forecasting of product sales for companies operating in the consumer market. The goal was to prepare the forecasts as automatically as possible, and in time spans consistent with the business processes of the companies. In addition, a small demonstration of inventory control was prepared using real inventory data. A prototype system was built.

Results

A system for automatic forecasting has been built. The system incorporates forecasting, automatic model building and database and user interfaces. The system can be run in either interactive or batch mode.

The forecasts estimated by the system are substantially better than those obtained through the old forecasting practices of the companies: in test runs, the system forecasts more accurately than the company's old system for 88 % of products.

The use of genetic algorithms (GAs) in model selection has been researched. The results seem promising, but GAs have not been incorporated in the actual system due to sensitivity problems. A new method for the combination of weekly and monthly forecasts has been developed. Model selection strategies for the situation where time is a scarce resource have been studied. The main result is that starting from a "full" model (that is, one that initially incorporates all the factors that might affect the individual process) and dropping out one insignificant factor at a time is both efficient and gives accurate forecasting models. The superiority of actual within-sample forecasting accuracy over e.g. the Akaike information criterion as a model selection criterion has been demonstrated.
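The elimination strategy described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the ordinary least-squares model, the |t| >= 2 significance threshold, and the factor names used in the demo are assumptions chosen only for illustration.

```python
import numpy as np


def backward_eliminate(X, y, names, t_threshold=2.0):
    """Start from the full model and drop the least significant factor
    one at a time until every remaining factor has |t| >= t_threshold.

    Assumption: an ordinary least-squares model with an intercept;
    the report does not specify the model class or the test used.
    """
    keep = list(range(X.shape[1]))
    while keep:
        # Design matrix: intercept column plus the factors still in the model.
        A = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        dof = len(y) - A.shape[1]
        sigma2 = resid @ resid / dof
        cov = sigma2 * np.linalg.inv(A.T @ A)
        # t-statistics of the factors (the intercept is never dropped).
        t = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))
        worst = int(np.argmin(t))
        if t[worst] >= t_threshold:
            break
        del keep[worst]
    return [names[i] for i in keep]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical factor names; only "price" and "season" drive the sales.
    names = ["price", "promo", "season", "weekday", "weather"]
    X = rng.standard_normal((200, 5))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.standard_normal(200)
    print(backward_eliminate(X, y, names))
```

Each pass refits the model once, so at most as many fits are needed as there are candidate factors, which is what makes the strategy attractive when time is scarce.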

Project information

Participants

The work was carried out by VTT Information Technology, and the participating companies were Valio (Finland's largest dairy company), Kesko (Finland's largest wholesale company) and ICL Data (a major software company).

Project dates

The project started on April 1, 1998 and ended on February 28, 2000.

Project volume

The volume of the project is FIM 2.750.000, and approximately 70 man months in total will be spent over the whole project.

Project manager

Ilkka Karanta
P.O. Box 1201, VTT Information Technology
FIN-02044 VTT, Finland
Phone: +358-9-456 4509, Mobile: +358-40-514 7589
Fax: +358-9-456 6027
E-mail: ilkka.karanta@vtt.fi
URL: http://www.vtt.fi/tte/staff/kai/

Publications

Some manuscripts are under preparation. The publications of the project so far are:

Ilkka Karanta: Multilevel forecasting improves corporate planning and operations. ERCIM News, No. 38 (July 1999), p. 36.

Ilkka Karanta: Constrained forecasting with time series models. Bulletin of the International Statistical Institute, 52nd session, contributed papers, Vol. 2, pp. 117-118.

7.2 Probabilistic modeling and stochastic optimization (UH-PROMISE)

Abstract

The project focused on two research areas: probabilistic modeling and stochastic optimization. In probabilistic modeling, the main goal of the project was to develop computationally efficient methods for building and applying probabilistic models, such as Bayesian networks and finite mixture models. In stochastic optimization, the goal was to empirically study and compare different stochastic search methods, such as simulated annealing and genetic algorithms, in complex, highly constrained problem domains.

Results

In probabilistic modeling, the research concentrated on theoretical and practical issues concerning model selection with respect to the predictive performance of the selected models. The methods developed in the project were validated by using proprietary real-world problems provided by the industrial partners, as well as benchmark problems publicly available on the Internet. In the empirical tests performed, the group was able to show that even relatively simple Bayesian models in many cases yield better results than alternative techniques, if implemented in a theoretically correct manner. Moreover, the group was able to show, theoretically and empirically, that there exist several “urban legends” concerning the Bayesian methodology for model selection, and that the well-known procedures commonly used in machine learning are in many cases based on misunderstandings or theoretically invalid arguments that lead to sub-optimal behavior of the models. For some of these cases, the group was able to develop alternative model selection procedures which gave good results in the empirical tests performed.
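As a rough illustration of selecting a model by its predictive performance, the following Python sketch scores feature subsets of a simple discrete naive Bayes classifier by their leave-one-out predictive log score. This is a generic textbook construction, not the group's method or the BAYDA implementation; the symmetric Dirichlet prior (alpha = 1) and the toy data set are assumptions.

```python
import math
from collections import Counter


def loo_log_score(rows, labels, features, alpha=1.0):
    """Leave-one-out predictive log score of a discrete naive Bayes
    classifier restricted to the given feature indices.

    Assumption: symmetric Dirichlet(alpha) priors on all parameters.
    """
    classes = sorted(set(labels))
    values = {f: sorted({r[f] for r in rows}) for f in features}
    total = 0.0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Train on everything except case i.
        train = [(r, c) for j, (r, c) in enumerate(zip(rows, labels)) if j != i]
        class_counts = Counter(c for _, c in train)
        log_joint = {}
        for c in classes:
            lp = math.log((class_counts[c] + alpha)
                          / (len(train) + alpha * len(classes)))
            for f in features:
                n_fc = sum(1 for r, cc in train if cc == c and r[f] == row[f])
                lp += math.log((n_fc + alpha)
                               / (class_counts[c] + alpha * len(values[f])))
            log_joint[c] = lp
        # Normalize to get the predictive probability of the true class.
        norm = math.log(sum(math.exp(v) for v in log_joint.values()))
        total += log_joint[label] - norm
    return total


if __name__ == "__main__":
    # Toy data: feature 0 equals the class, feature 1 is irrelevant.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 3
    labels = [0, 0, 1, 1] * 3
    for subset in ([0], [0, 1], [1]):
        print(subset, round(loo_log_score(rows, labels, subset), 3))
```

On this toy data the subset containing only the informative feature obtains the highest score, and adding the irrelevant feature lowers it slightly.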

Figure 1. A set of high-dimensional data vectors, visualized in 3-D space by using a Bayesian network model.

In stochastic optimization, the group concentrated on empirical comparisons between different stochastic optimization algorithms, such as simulated annealing and genetic algorithms. The empirical results demonstrated that although it is possible to obtain consistently good results with genetic algorithms, similar performance could in many cases be achieved with much simpler and more efficient methods, such as different stochastic greedy algorithms. The group also developed a novel version of the celebrated simulated annealing algorithm. In this algorithm the difficult problem of parameter selection is solved by adjusting the so-called cooling schedule automatically during the optimization process.
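The idea of adjusting the cooling schedule during the run can be sketched as follows in Python. The report does not describe the group's algorithm, so the rule used here (steering the temperature toward a target acceptance rate that decays over the run) is only an assumed, commonly used heuristic, and the function and parameter names are hypothetical.

```python
import math
import random


def anneal(cost, neighbor, x0, steps=5000, start_accept=0.5,
           window=100, seed=0):
    """Simulated annealing with a temperature that adapts itself.

    Assumption: every `window` steps the temperature is lowered if the
    observed acceptance rate exceeds a target that decays linearly to
    zero, and raised otherwise. This is NOT the project's method.
    """
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    temp = 1.0
    accepted = 0
    for step in range(1, steps + 1):
        cand = neighbor(x, rng)
        fc = cost(cand)
        delta = fc - fx
        # Always accept improvements; accept worsening moves with
        # the Metropolis probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, fc
            accepted += 1
            if fx < fbest:
                best, fbest = x, fx
        if step % window == 0:
            target = start_accept * (1.0 - step / steps)
            rate = accepted / window
            temp *= 0.9 if rate > target else 1.1
            temp = max(temp, 1e-12)  # keep the temperature positive
            accepted = 0
    return best, fbest


if __name__ == "__main__":
    # Toy problem: minimize (x - 7)^2 over the integers.
    result = anneal(lambda x: (x - 7) ** 2,
                    lambda x, rng: x + rng.choice([-1, 1]),
                    x0=50)
    print(result)
```

The caller supplies only the cost and neighbourhood functions; no initial temperature or cooling factor has to be tuned by hand, which is the point of an automatic schedule.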

For the empirical part of the work, the group developed software that allows the researchers to use several dozen Linux workstations as a single “virtual supercomputer”, which has made it possible to study interesting exponential-time problems. Some of the Bayesian modeling methods developed in the project were implemented in BAYDA, a Java software package for flexible data analysis in classification domains. BAYDA is available free of charge for research and teaching purposes from the group’s homepage. The scientific results are reported in more than 20 international scientific publications produced during the project; copies of the articles can be downloaded from the group’s home site.

Figure 2. A snapshot of the VRML interface designed for visualizing the highly constrained packing problem provided by TietoEnator and StoraEnso.

The results of the project are already being exploited in several commercial products. StoraEnso is already widely using intelligent container packing software, implemented by TietoEnator, based on the optimization algorithms developed in the project. Some of the optimization methods developed by the group have also been integrated into fielded telecommunications software packages developed and used by Nokia. A commercial product development project, aiming at a data analysis software suite exploiting the probabilistic modeling methods developed in the project, is also currently in progress by one of the industrial partners.

During the project, the research group has established excellent research contacts with all major probabilistic modeling research groups in the world, and hosted visits from researchers from, for example, NASA, UCL (London) and CWI (Amsterdam). The group’s researchers have made several visits to these institutes in return, and as a concrete result of this cooperation, the group has published joint papers with foreign colleagues. On the European level, the group has been actively participating in two European research networks: Neural and Computational Learning (NeuroCOLT) and Highly Structured Stochastic Systems (HSSS). The group is also participating in three project proposals within the fifth framework research programme of the EU.

Project information

Participants

University of Helsinki
TietoEnator
Kone
BayesIT
Nokia
Kibron

Project dates

March 1, 1998 - April 30, 2000.

Project volume

Total budget FIM 2.650.000, 105 man months

Project manager

Dr. Petri Myllymäki
P.O. Box 26, Department of Computer Science
FIN-00014 University of Helsinki, Finland
Tel: +358 9 191 44212
Fax: +358 9 191 44441
E-mail: Petri.Myllymaki@cs.Helsinki.FI
URL: http://www.cs.Helsinki.FI/petri.myllymaki/

Publications

1. P.Kontkanen, J.Lahtinen, P.Myllymäki, T.Silander, and H.Tirri, Supervised Model-Based Visualization of High-Dimensional Data. To appear in Intelligent Data Analysis.

2. P.Kontkanen, P.Myllymäki, T.Silander, H.Tirri, and P.Grünwald, On Predictive Distributions and Bayesian Networks. Statistics and Computing 10 (2000), 39-54.

3. P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, Density Estimation by Minimum Encoding Mixtures of Histograms. Pp. 162-164 in Book of Abstracts, Second European Conference on Highly Structured Stochastic Systems (HSSS'99), Pavia, Italy, September 1999.

4. P.Myllymäki, Massively Parallel Probabilistic Reasoning with Boltzmann Machines. Applied Intelligence 11, 31-44 (1999).

5. P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, On the Accuracy of Stochastic Complexity Approximations. Chapter 9 in Causal Models and Intelligent Data Management, edited by A.Gammerman. Springer-Verlag, 1999.

6. P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, On Supervised Selection of Bayesian Networks. Pp. 334-342 in Proceedings of the 15th International Conference on Uncertainty in Artificial Intelligence (UAI'99), edited by K. Laskey and H. Prade. Morgan Kaufmann Publishers, 1999.

7. J.Lahtinen, P.Myllymäki, T.Silander, H.Tirri, and H.Wettig, An Empirical Evaluation of Stochastic Search Methods in Real-World Telecommunication Domains. Pp. 181-187 in Proceedings of the 3rd World Multiconference on Systemics, Cybernetics and Informatics (SCI'99) and 5th International Conference on Information Systems Analysis and Synthesis (ISAS'99), Volume 4, edited by M. Torres, B. Sanchez, S. Radhakrishan and R. Osers. International Institute of Information and Systemics, 1999.

8. P.Ruohotie, H.Tirri, P.Nokelainen and T.Silander, Modern Modeling of Professional Growth. Research Center for Vocational Education, Saarijärven Offset 1999.

9. H. Tirri, What the heritage of Thomas Bayes has to offer for modern educational research? Chapter II in P.Ruohotie, H.Tirri, P.Nokelainen and T.Silander, Modern Modeling of Professional Growth. Research Center for Vocational Education, Saarijärven Offset 1999.

10. T. Silander and H. Tirri, Bayesian classification. Chapter III in P.Ruohotie, H.Tirri, P.Nokelainen and T.Silander, Modern Modeling of Professional Growth. Research Center for Vocational Education, Saarijärven Offset 1999.

11. P. Nokelainen, P.Ruohotie and H.Tirri, Professional Growth Determinants-Comparing Bayesian and linear approaches to classification. Chapter IV in P.Ruohotie, H.Tirri, P.Nokelainen and T.Silander, Modern Modeling of Professional Growth. Research Center for Vocational Education, Saarijärven Offset 1999.

12. P.Kontkanen, P.Myllymäki, T.Silander, H.Tirri, K.Valtonen, Exploring the Robustness of Bayesian and Information-Theoretic Methods for Predictive Inference. Pp. 231-236 in Proceedings of Uncertainty'99: The Seventh International Workshop on Artificial Intelligence and Statistics, edited by D.Heckerman and J.Whittaker. Morgan Kaufmann Publishers, 1999.

13. P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, On Bayesian Case Matching. Pp. 13-24 in Advances in Case-Based Reasoning, Proceedings of the 4th European Workshop (EWCBR-98), edited by B.Smyth and P.Cunningham. Vol. 1488 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1998.

14. P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, BAYDA: Software for Bayesian Classification and Feature Selection. Pp. 254-258 in Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD-98), edited by R.Agrawal, P.Stolorz and G.Piatetsky-Shapiro. AAAI Press, Menlo Park, CA, 1998.

15. P.Grünwald, P.Kontkanen, P.Myllymäki, T.Silander, and H.Tirri, Minimum Encoding Approaches for Predictive Modeling. Pp. 183-192 in Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence (UAI'98), edited by G.Cooper and S.Moral. Morgan Kaufmann Publishers, San Francisco, CA, 1998.

16. E.Koskimäki, J.Göös, P.Kontkanen, P.Myllymäki, and H.Tirri, Comparing Soft Computing Methods in Prediction of Manufacturing Data. Pp. 775-784 in Tasks and Methods in Applied Artificial Intelligence, Proceedings of the 11th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA-98-AIE), edited by A.P. del Pobil, J.Mira and M. Ali. Vol. 1416 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1998.

17. H.Tirri and T.Silander, Stochastic complexity based estimation of missing elements in questionnaire data. The Annual American Educational Research Association Meeting (AERA'98), SIG Educational Statisticians, San Diego, 1998. ERIC Document Reproduction Service, microfiche No. TM029880.

18. P.Myllymäki and H.Tirri, Prospects of Bayesian networks (in Finnish). Technology Report 58/98. Technology Development Center (Tekes), 1998.

19. P.Kontkanen, P. Myllymäki, T. Silander, and H.Tirri, Bayes Optimal Instance-Based Learning. Pp. 77-88 in Machine Learning: ECML-98, Proceedings of the 10th European Conference, edited by C.Nédellec and C.Rouveirol. Vol. 1398 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1998.

20. P.Kontkanen, P.Myllymäki, T.Silander, H.Tirri, and P.Grünwald, Bayesian and Information-Theoretic Priors for Bayesian Network Parameters. Pp. 89-94 in Machine Learning: ECML-98, Proceedings of the 10th European Conference, edited by C.Nédellec and C.Rouveirol. Vol. 1398 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1998.

21. P.Kontkanen, P. Myllymäki, T.Silander, and H.Tirri, Batch Classifications with Discrete Finite Mixtures. Pp. 208-213 in Machine Learning: ECML-98, Proceedings of the 10th European Conference, edited by C.Nédellec and C.Rouveirol. Vol. 1398 in Lecture Notes in Artificial Intelligence, Springer-Verlag, 1998.

The above publications and additional information can be obtained through the CoSCo group home page at URL http://www.cs.Helsinki.FI/research/cosco/ .

7.3 Using background knowledge in neural modelling (HUT-PROMISE)

Abstract

One of the most important features of neural networks is generality, as the same network can be trained to solve quite different tasks, depending on the training data. However, this is also one of the most prominent problems in using neural networks in real-world problems, as including existing domain knowledge in the models is difficult and as yet has no general theory. The goal of the project was to develop methods for using background knowledge in neural network modeling, and to apply the methods to real problems together with the industrial partners. The main focus has been on Bayesian techniques for neural networks, which provide an efficient method for choosing the correct model complexity, and tools for analysing the confidence of the resulting models.

The application-oriented goals were 1) to develop a fast and accurate solution to the inverse problem in electrical impedance tomography for industrial tomography purposes (with Ahlström Pumps), 2) to assist OWC Ltd in applying neural network technology to improve the company's proprietary weight measurement system, and 3) to develop a statistical model and analysis tools for quality control of concrete (with Lohja Rudus Ltd.). A small task in the project was also a case study of a neural modelling tool, Q-opt, developed in the earlier MENES project, that allows the use of background knowledge in training the model.

Results

The main results of the methodological research consisted of methods for Bayesian analysis of neural networks, and a novel approach for statistical inverse methods, applied in electrical impedance tomography. Bayesian modelling is based on defining a probability model for the problem, choosing suitable prior distributions for the unknown entities, and integrating over all unknown variables to obtain the model predictions for the given data. We have studied Markov chain Monte Carlo techniques for this integration, mainly in order to speed up the computationally expensive sampling process and to assess the convergence of the sampling. The quality of the Bayesian model depends largely on the validity of the chosen distributions, and we have studied the use of distributions that allow more flexibility in specifying the prior assumptions than the standard normal/independence assumptions. We have validated the results in several real applications related to this project and other similar statistical modelling tasks in the laboratory.
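The integration over unknown variables by Markov chain Monte Carlo sampling can be illustrated with a deliberately small Python sketch. It uses a two-parameter linear model instead of a neural network, a random-walk Metropolis sampler, Gaussian priors and a known noise level; all of these are simplifying assumptions for illustration and do not reproduce the samplers or priors studied in the project.

```python
import numpy as np


def log_posterior(theta, x, y, noise_sd=0.5, prior_sd=10.0):
    """Unnormalized log posterior of a line y = slope * x + intercept.

    Assumptions: Gaussian noise with known standard deviation and
    independent zero-mean Gaussian priors on both parameters.
    """
    slope, intercept = theta
    resid = y - (slope * x + intercept)
    log_lik = -0.5 * np.sum((resid / noise_sd) ** 2)
    log_prior = -0.5 * np.sum((theta / prior_sd) ** 2)
    return log_lik + log_prior


def metropolis(x, y, n_samples=4000, step=0.1, seed=0):
    """Random-walk Metropolis sampler over (slope, intercept)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    logp = log_posterior(theta, x, y)
    samples = np.empty((n_samples, 2))
    for i in range(n_samples):
        proposal = theta + step * rng.standard_normal(2)
        logp_new = log_posterior(proposal, x, y)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < logp_new - logp:
            theta, logp = proposal, logp_new
        samples[i] = theta
    return samples


def predictive_mean(samples, x_new, burn_in=1000):
    """Average the model output over posterior samples, i.e. a Monte
    Carlo estimate of the integral over the unknown parameters."""
    kept = samples[burn_in:]
    return float(np.mean(kept[:, 0] * x_new + kept[:, 1]))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 50)
    y = 2.0 * x + 1.0 + 0.5 * rng.standard_normal(50)
    print(predictive_mean(metropolis(x, y), 0.5))
```

The prediction is an average over many plausible parameter values rather than the output of a single fitted model, which is why both sampling speed and convergence assessment matter in practice.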

In electrical impedance tomography (EIT) the aim is to recover the internal structure of an object based on impedance measurements from the surface. EIT is a very promising technique for industrial tomography (process monitoring) as the instruments are inexpensive, but the inverse problem for the image reconstruction is very difficult. We have developed a novel approach for the EIT inverse problem, where the problem is transformed into a more regular space (eigenspace) and a Bayesian neural network is used to approximate the inverse mapping. The method is highly competitive with the state-of-the-art inverse methods, and provides many advantages over other approaches: the reconstruction is nearly five orders of magnitude faster, facilitating real-time reconstruction, and incorporating additional background knowledge or constraints is easy. In addition, the end goal in EIT is often some index variable computed from the reconstructed image, such as the void fraction in a mixed flow of liquid and gas, and with the developed approach these can be estimated directly without the far more complex reconstruction process.

In the case problem of concrete quality estimation, a central problem was the small amount of available data, as each sample requires casting tests and the final compression strength is measured three months after casting. Thus the estimation method needs to use all samples as efficiently as possible, making Bayesian techniques a tempting choice. We have applied generalized linear models and the developed Bayesian neural network techniques (including non-Gaussian and mutually correlating noise models) to estimate the quality parameters (density, compression strength, slump, bleeding, air percentage, etc.) given the recipe (amount of cement, water and additives) and several variables related to the properties of the stone material (natural or crushed, size and shape distributions of the grains, mineralogical composition, etc.). We have also developed an image analysis tool for measuring the shape attributes of the sand grains (size, shape, texture, roughness, angularity, etc.) based on standard 1200 dpi color scanner images of the grains. The study supported a large quality research programme of the industrial partner, and the result is the first statistical quality model of concrete of this extent.
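The caption of the image analysis figure describes angularity as being estimated from the amount of material worn off the grains by morphological erosion. A minimal Python sketch of such an erosion spectrum on a binary grain mask is given here. The 4-neighbour structuring element and the use of pixel area as the measure of material are assumptions; the report does not define the actual angularity index.

```python
import numpy as np


def erode(mask):
    """One step of binary erosion with a 4-neighbour (cross-shaped)
    structuring element; pixels outside the image count as background."""
    p = np.pad(mask, 1, constant_values=False)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])


def erosion_spectrum(mask):
    """Remaining area after 0, 1, 2, ... erosions, until nothing is left."""
    mask = np.asarray(mask, dtype=bool)
    areas = [int(mask.sum())]
    while areas[-1] > 0:
        mask = erode(mask)
        areas.append(int(mask.sum()))
    return areas


if __name__ == "__main__":
    # Toy "grain": a 5 x 5 square inside a 9 x 9 image.
    grain = np.zeros((9, 9), dtype=bool)
    grain[2:7, 2:7] = True
    areas = erosion_spectrum(grain)
    print(areas)                        # remaining area per scale
    print((-np.diff(areas)).tolist())   # material worn off per step
```

For the toy square the remaining areas are 25, 9, 1 and 0 pixels. Differences between successive areas give the material removed at each scale, from which a shape index could be derived.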

In the weight measurement application (OWC) we have provided the industrial partner with neural network methods that are used in the product to estimate the weight of the object based on the signals from the company's proprietary strain gauge devices. We have also developed a vehicle recognition and analysis system that is used in an on-line road scale product that weighs vehicles that pass the scale under normal traffic conditions.

Project information

Participants

Laboratory of Computational Engineering, HUT
Ahlström Pumps Ltd.
Lohja Rudus Ltd.
OWC - Omni Weight Control Ltd.
Taipale Engineering Ltd.

Publications

Aki Vehtari and Jouko Lampinen. Bayesian neural networks: Case studies in industrial applications. In Suzuki, Roy, Ovaska, Furuhashi, and Dote, editors, Soft Computing in Industrial Applications. Springer-Verlag, 1999.

Jouko Lampinen, Paula Litkey, and Harri Hakkarainen. Selection of training samples for learning with hints. In Proc. IJCNN'99, Washington, DC, USA, July 1999.

Aki Vehtari and Jouko Lampinen. Bayesian neural networks with correlating residuals. In Proc. IJCNN'99, Washington, DC, USA, July 1999.

Aki Vehtari and Jouko Lampinen. Bayesian neural networks for image analysis. In B. K. Ersboll and P. Johansen, editors, Proceedings of SCIA'99, pages 95-102, Kangerlussuaq, Greenland, June 1999.

Aki Vehtari, Jukka Heikkonen, Jouko Lampinen, and Jouni Juujärvi. Using Bayesian neural networks to classify forest scenes. In David P. Casasent, editor, Intelligent Robots and Computer Vision XVII: Algorithms, Techniques, and Active Vision, volume 3522 of Proceedings of SPIE, pages 66-73, Boston, MA, USA, November 1998.

Jouni Juujärvi, Jukka Heikkonen, Sami Brandt, and Jouko Lampinen. Digital image based tree measurement for forest industry. In David P. Casasent, editor, Intelligent Robots and Computer Vision XVII: Algorithms, Techniques, and Active Vision, volume 3522 of Proceedings of SPIE, pages 114-123, Boston, MA, November 1998.

Jouko Lampinen, Aki Vehtari, and Kimmo Leinonen. Application of Bayesian neural network in electrical impedance tomography. In Proc. IJCNN'99, Washington, DC, USA, July 1999.

Aki Vehtari and Jouko Lampinen. Bayesian neural networks for industrial applications. In Proceedings of SMCIA/99 - 1999 IEEE Midnight-Sun Workshop on Soft Computing Methods in Industrial Applications, pages 63-68, Kuusamo, Finland, June 1999.

Jukka Heikkonen, Jari Varjo, and Aki Vehtari. Forest change detection via Landsat TM difference features. In Proceedings of the 11th Scandinavian Conference on Image Analysis SCIA'99, pages 157-164, Kangerlussuaq, Greenland, 1999.

Aki Vehtari, Jouni Juujärvi, Jukka Heikkonen, and Jouko Lampinen. Forest scene classification: Comparison of classifiers. In Human and Artificial Information Processing, Proceedings of SteP'98, the 8th Finnish artificial intelligence conference, pages 152-160, Jyväskylä, Finland, September 1998. Picaset Oy, Helsinki.

Figure 1. Example of EIT reconstruction of a gas bubble in a liquid. The left figure shows the potential field due to one current injection with opposite electrodes. The right figure shows the reconstruction with a Bayesian neural network. The color indicates the bubble probability and the blue contour the detected bubble boundary.

Figure 2. Example of the image analysis system for concrete quality modeling. The figures show samples of crushed (left) and natural (right) gravel. One feature for describing the shape of the grains is their angularity. This is based on estimating the amount of material worn off from the grains, using morphological erosion. The erosion spectra on different scales are shown in the figures. The number in each image is the relative total index of angularity. The size of the grains in the image is 3.1-5.4 mm.

jukka.iivarinen@hut.fi
http://www.cis.hut.fi/neuronet/Tekes/7.shtml
Wednesday, 29-Nov-2000 10:27:11 EET