Evaluation and Analysis of Supervised Learning Algorithms and Classifiers

This is a thesis from Karlskrona: Blekinge Institute of Technology

Abstract: The fundamental question studied in this thesis is how to evaluate and analyse supervised learning algorithms and classifiers. As a first step, we analyse current evaluation methods. Each method is described and categorised according to a number of properties. One conclusion of the analysis is that performance is often measured only in terms of accuracy, e.g., through cross-validation tests. However, some researchers have questioned the validity of using accuracy as the only performance metric. Also, the number of instances available for evaluation is usually very limited. To deal with these issues, measure functions have been suggested as a promising approach. However, a limitation of current measure functions is that they can only handle two-dimensional instance spaces. We present the design and implementation of a generalised multi-dimensional measure function and demonstrate its use through a set of experiments. The results indicate that there are cases for which measure functions may be able to capture aspects of performance that cannot be captured by cross-validation tests. Finally, we investigate the impact of learning algorithm parameter tuning. To accomplish this, we first define two quality attributes (sensitivity and classification performance) as well as two metrics for measuring each of the attributes. Using these metrics, a systematic comparison is made between four learning algorithms on eight data sets. The results indicate that parameter tuning is often more important than the choice of algorithm. Moreover, quantitative support is provided for the assertion that some algorithms are more robust than others with respect to parameter configuration.
To sum up, the contributions of this thesis include: the definition and application of a formal framework that enables comparison and deeper understanding of evaluation methods from different fields of research; a survey of current evaluation methods; the implementation and analysis of a multi-dimensional measure function; and the definition and analysis of quality attributes used to investigate the impact of learning algorithm parameter tuning.
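To make the parameter-tuning comparison concrete, the following is a minimal sketch, not the thesis's implementation. The abstract does not name the metrics used for sensitivity and classification performance, so the range of cross-validation accuracy across parameter configurations stands in for sensitivity, and the best and mean accuracy stand in for classification performance. The k-nearest-neighbour classifier, the synthetic two-class data, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean


def make_data(n=120, seed=0):
    """Synthetic two-class data: two Gaussian blobs in 2-D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k):
    """Majority vote among the k nearest training instances (k should be odd)."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def cross_val_accuracy(data, k_neighbours, folds=10):
    """Mean accuracy over `folds` disjoint test folds."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [d for i, d in enumerate(data) if i % folds != f]
        correct = sum(knn_predict(train, x, k_neighbours) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


def tuning_summary(data, configs):
    """Return (best, mean, range) of accuracy across parameter configurations.

    Assumed stand-ins: best and mean for classification performance,
    range for sensitivity to the parameter configuration.
    """
    accs = [cross_val_accuracy(data, k) for k in configs]
    return max(accs), mean(accs), max(accs) - min(accs)


if __name__ == "__main__":
    best, avg, spread = tuning_summary(make_data(), configs=[1, 3, 5, 9, 15])
    print(f"best={best:.3f} mean={avg:.3f} range={spread:.3f}")
```

Running the same summary for several algorithms on the same data would allow a comparison in the spirit of the one described above: a small range suggests an algorithm is robust to its parameter configuration, while a large gap between best and mean accuracy suggests that tuning matters.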