Ensembles of classifiers by Evolutionary Algorithms
The effectiveness of a classifier ensemble depends on the selection of accurate and diverse
classification models; that is, the committee should be composed of accurate models
that make different prediction errors. Finding the globally optimal solution can be
computationally costly since the number of possible combinations of classifiers, and hence
the search space, is vast. Evolutionary Algorithms are metaheuristics that have been
shown to be effective on NP-hard problems. One of their advantages over traditional search
methods is that they are less likely to get stuck in a local optimum, mainly because
they process a set of solutions (a population) instead of a single
solution. Furthermore, there are diversity-guided Evolutionary Algorithms in which the
selection of the individuals to compose the next generation is based on the dissimilarity
between the members of the population. Given these characteristics, the use of Evolutionary
Algorithms with diversity-enforcing heuristics is an appropriate approach to optimize the
construction of ensembles. Therefore, an Evolutionary Algorithm that encourages the
selection of diverse classifiers to form the committee, called Diversity-based Classifier
Ensemble (DCE), was developed in this dissertation. However, running this Evolutionary
Algorithm to create ensembles is CPU- and memory-intensive. For that reason, a parallel
Evolutionary Algorithm called Parallel Diversity-based Classifier Ensemble (P-DCE) was
also developed using the global parallelization model to distribute the computational cost
among multiple CPUs. In addition, to encourage ensemble diversity while also
distributing the computational cost among multiple CPUs, two parallel Evolutionary
Algorithms based on the island model were proposed: Island Diversity-based
Classifier Ensemble (IDCE) and Island Classifier Ensemble (ICE). The results
obtained from the computational experiments indicate that the proposed algorithms can
be useful tools for solving the classifier ensemble optimization problem when compared
to the random search method. More precisely, the DCE and IDCE algorithms proved
promising for finding ensembles with satisfactory accuracy in a competitive time when
compared with the random search approach. The ICE algorithm achieved better results
than the random search approach and than the DCE and IDCE algorithms. Finally,
it was also possible to conclude that as parallelism increases, the P-DCE execution time
decreases.
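
The general idea of a diversity-guided evolutionary search over classifier subsets can be
illustrated with a minimal Python sketch. It is not the DCE algorithm of this dissertation:
the bit-mask encoding, the majority-vote fitness, the Hamming-distance diversity measure,
the greedy survivor selection, the synthetic predictions, and all function and parameter
names (majority_vote_accuracy, diverse_survivors, evolve, weight) are assumptions made
only for illustration.

```python
import random

def synthetic_predictions(n_classifiers=10, n_samples=200, p_correct=0.7, seed=0):
    """Hypothetical data: binary labels and noisy predictions of base classifiers."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    preds = [[y if rng.random() < p_correct else 1 - y for y in labels]
             for _ in range(n_classifiers)]
    return preds, labels

def majority_vote_accuracy(mask, preds, labels):
    """Accuracy of the majority vote of the classifiers selected by the bit mask."""
    chosen = [p for bit, p in zip(mask, preds) if bit]
    if not chosen:
        return 0.0  # an empty committee is treated as useless
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        vote = 1 if 2 * votes > len(chosen) else 0  # ties resolve to class 0
        correct += (vote == y)
    return correct / len(labels)

def hamming(a, b):
    """Number of positions in which two masks differ."""
    return sum(x != y for x, y in zip(a, b))

def diverse_survivors(candidates, fitness, k, weight):
    """Keep the fittest candidate, then greedily add candidates that balance
    fitness against normalized distance to the survivors chosen so far."""
    pool = sorted(dict.fromkeys(candidates), key=fitness, reverse=True)
    survivors = [pool.pop(0)]
    while pool and len(survivors) < k:
        def score(c):
            nearest = min(hamming(c, s) for s in survivors)
            return fitness(c) + weight * nearest / len(c)
        best = max(pool, key=score)
        pool.remove(best)
        survivors.append(best)
    return survivors

def evolve(preds, labels, pop_size=12, generations=30, weight=0.1, seed=0):
    """Evolve bit masks over the base classifiers; returns (best_mask, accuracy)."""
    rng = random.Random(seed)
    n = len(preds)
    cache = {}

    def fitness(mask):
        if mask not in cache:
            cache[mask] = majority_vote_accuracy(mask, preds, labels)
        return cache[mask]

    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(population, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = list(a[:cut] + b[cut:])
            j = rng.randrange(n)               # flip one bit as mutation
            child[j] = 1 - child[j]
            offspring.append(tuple(child))
        population = diverse_survivors(population + offspring, fitness, pop_size, weight)
    best = max(population, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    preds, labels = synthetic_predictions()
    mask, acc = evolve(preds, labels)
    print("selected classifiers:", mask, "validation accuracy:", acc)
```

In this sketch, setting weight to 0 reduces survivor selection to plain fitness ranking,
while larger values favor individuals that differ from those already kept. Because each
fitness evaluation is independent of the others, this step is the natural candidate for
distribution among multiple CPUs.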