| 6.1 | Introduction |
| 6.2 | Fundamentals of Modeling |
| 6.3 | Model Structures for Prediction |
| 6.3.1 | Regression Models with Linear Structure |
| 6.3.2 | Local Piecewise Model Structures for Regression |
| 6.3.3 | Nonparametric "Memory-Based" Local Models |
| 6.3.4 | Stochastic Components of Model Structures |
| 6.3.5 | Predictive Models for Classification |
| 6.3.6 | An Aside: Selecting a Model of Appropriate Complexity |
| 6.4 | Models for Probability Distributions and Density Functions |
| 6.4.1 | General Concepts |
| 6.4.2 | Mixtures of Parametric Models |
| 6.4.3 | Joint Distributions for Unordered Categorical Data |
| 6.4.4 | Factorization and Independence in High Dimensions |
| 6.5 | The Curse of Dimensionality |
| 6.5.1 | Variable Selection for High-Dimensional Data |
| 6.5.2 | Transformations for High-Dimensional Data |
| 6.6 | Models for Structured Data |
| 6.7 | Pattern Structures |
| 6.7.1 | Patterns in Data Matrices |
| 6.7.2 | Patterns for Strings |
| 6.8 | Further Reading |
| 8.1 | Introduction |
| 8.2 | Searching for Models and Patterns |
| 8.2.1 | Background on Search |
| 8.2.2 | The State-Space Formulation for Search in Data Mining |
| 8.2.3 | A Simple Greedy Search Algorithm |
| 8.2.4 | Systematic Search and Search Heuristics |
| 8.2.5 | Branch-and-Bound |
| 8.3 | Parameter Optimization Methods |
| 8.3.1 | Parameter Optimization: Background |
| 8.3.2 | Closed Form and Linear Algebra Methods |
| 8.3.3 | Gradient-Based Methods for Optimizing Smooth Functions |
| 8.3.4 | Univariate Parameter Optimization |
| 8.3.5 | Multivariate Parameter Optimization |
| 8.3.6 | Constrained Optimization |
| 8.4 | Optimization with Missing Data: The EM Algorithm |
| 8.5 | Online and Single-Scan Algorithms |
| 8.6 | Stochastic Search and Optimization Techniques |
| 8.7 | Further Reading |
| 9.1 | Introduction |
| 9.2 | Describing Data by Probability Distributions and Densities |
| 9.2.1 | Introduction |
| 9.2.2 | Score Functions for Estimating Probability Distributions and Densities |
| 9.2.3 | Parametric Density Models |
| 9.2.4 | Mixture Distributions and Densities |
| 9.2.5 | The EM Algorithm for Mixture Models |
| 9.2.6 | Nonparametric Density Estimation |
| 9.2.7 | Joint Distributions for Categorical Data |
| 9.3 | Background on Cluster Analysis |
| 9.4 | Partition-Based Clustering Algorithms |
| 9.4.1 | Score Functions for Partition-Based Clustering |
| 9.4.2 | Basic Algorithms for Partition-Based Clustering |
| 9.5 | Hierarchical Clustering |
| 9.5.1 | Agglomerative Methods |
| 9.5.2 | Divisive Methods |
| 9.6 | Probabilistic Model-Based Clustering Using Mixture Models |
| 9.7 | Further Reading |
| 11.1 | Introduction |
| 11.2 | Linear Models and Least Squares Fitting |
| 11.2.1 | Computational Issues in Fitting the Model |
| 11.2.2 | A Probabilistic Interpretation of Linear Regression |
| 11.2.3 | Interpreting the Fitted Model |
| 11.2.4 | Inference and Generalization |
| 11.2.5 | Model Search and Model Building |
| 11.2.6 | Diagnostics and Model Inspection |
| 11.3 | Generalized Linear Models |
| 11.4 | Artificial Neural Networks |
| 11.5 | Other Highly Parameterized Models |
| 11.5.1 | Generalized Additive Models |
| 11.5.2 | Projection Pursuit Regression |
| 11.6 | Further Reading |
| 12.1 | Introduction |
| 12.2 | Memory Hierarchy |
| 12.3 | Index Structures |
| 12.3.1 | B-trees |
| 12.3.2 | Hash Indices |
| 12.4 | Multidimensional Indexing |
| 12.5 | Relational Databases |
| 12.6 | Relational Algebra |
| 12.7 | The Structured Query Language (SQL) |
| 12.8 | Query Execution and Optimization |
| 12.9 | Data Warehousing and On-Line Analytical Processing (OLAP) |
| 12.10 | Data Structures for OLAP |
| 12.11 | String Databases |
| 12.12 | Very Large Data Sets, Data Management, and Data Mining |
| 12.12.1 | Force the Data into Main Memory |
| 12.12.2 | Scalable Versions of Data Mining Algorithms |
| 12.12.3 | Special-Purpose Algorithms for Disk Access |
| 12.12.4 | Pseudo Data Sets and Sufficient Statistics |
| 12.13 | Further Reading |
| 14.1 | Introduction |
| 14.2 | Evaluation of Retrieval Systems |
| 14.2.1 | The Difficulty of Evaluating Retrieval Performance |
| 14.2.2 | Precision versus Recall |
| 14.2.3 | Precision and Recall in Practice |
| 14.3 | Text Retrieval |
| 14.3.1 | Representation of Text |
| 14.3.2 | Matching Queries and Documents |
| 14.3.3 | Latent Semantic Indexing |
| 14.3.4 | Relevance Feedback |
| 14.4 | Automated Recommender Systems |
| 14.5 | Document and Text Classification |
| 14.6 | Image Retrieval |
| 14.6.1 | Image Understanding |
| 14.6.2 | Image Representation |
| 14.6.3 | Image Queries |
| 14.6.4 | Image Invariants |
| 14.6.5 | Generalizations of Image Retrieval |
| 14.7 | Time Series and Sequence Retrieval |
| 14.7.1 | Global Models for Time Series Data |
| 14.7.2 | Structure and Shape in Time Series |
| 14.8 | Summary |
| 14.9 | Further Reading |