Overfitting and Diversity in Classification Ensembles based on Feature Selection
Cunningham, Padraig
TCD-CS-2000-07 This paper addresses Wrapper-like approaches to feature subset selection and the production of classifier ensembles based on members with different feature subsets. The paper starts with the observation that if an insufficient amount of data is used to guide the Wrapper search then the feature selection will overfit the data. If the objective of the feature selection exercise is to build a better predictor, rather than identify important features for data mining reasons, then ensembles offer a solution. Overfitting may be used to provide diversity in ensembles provided the overfitted members have variety. The paper concludes with an assessment of entropy as a measure of diversity in classifier ensembles. A tentative conclusion is that diversity is not such a problem where a large number of features is involved but needs to be monitored for problems with smaller numbers of features, say fewer than 25.
Keyword(s): Computer Science
Publication Date: 2000
Type: Report
Peer-Reviewed: Unknown
Language(s): English
Institution: Trinity College Dublin
Citation(s): Cunningham, Padraig. 'Overfitting and Diversity in Classification Ensembles based on Feature Selection'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2000-07, 2000, pp8
Publisher(s): Trinity College Dublin, Department of Computer Science
File Format(s): application/pdf
First Indexed: 2014-05-13 05:31:23
Last Updated: 2015-04-10 05:14:04