On this page I will post more information about the projects that I have completed or am currently working on. This page is under construction.
In this paper, we quantify the effect of distribution complexity on the RMS for several popular data-efficient error estimators, including resubstitution, leave-one-out, cross-validation, bootstrap, and bolstering. Several classification rules are considered: Quadratic Discriminant Analysis (QDA), 3-nearest-neighbor (3NN), and neural networks (NNet).
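As a rough illustration of two of the estimators named above, here is a minimal sketch (not the paper's implementation) of resubstitution and leave-one-out error estimation for a plain 3NN classifier, written from scratch in NumPy; the synthetic two-Gaussian data is my own assumption for demonstration:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict labels by a k-nearest-neighbor majority vote (Euclidean distance)."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(dists)[:k]]
        preds.append(np.bincount(nearest).argmax())
    return np.array(preds)

def resubstitution_error(X, y, k=3):
    """Error of the classifier evaluated on its own training data (optimistically biased)."""
    return np.mean(knn_predict(X, y, X, k) != y)

def loo_error(X, y, k=3):
    """Leave-one-out: design on n-1 points, test on the held-out point, average."""
    n = len(y)
    errors = 0
    for i in range(n):
        mask = np.arange(n) != i
        pred = knn_predict(X[mask], y[mask], X[i:i + 1], k)[0]
        errors += int(pred != y[i])
    return errors / n

# Illustrative synthetic data: two Gaussian classes in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(2, 1, (30, 2))])
y = np.repeat([0, 1], 30)
print(resubstitution_error(X, y), loo_error(X, y))
```

Note that resubstitution counts each point among its own neighbors, which is what makes it optimistic, while leave-one-out removes that point at the cost of n classifier designs; the paper studies how the RMS of such estimators behaves as the underlying distribution grows more complex.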
In defining distributional complexity we want to differentiate between the complexity and the separability of the classes. Specifically, we want a measure of complexity that is related not to the Bayes error but to the complexity of the Bayes decision boundary. The classes may be multimodal, with different “modes” highly interwoven in Euclidean space, yet with no overlap among the class-conditional densities. In this case the Bayes error is zero, but the Bayes classifier may possess a complex decision boundary. Although such a situation involves, in principle, perfectly separable classes, in practice it presents a difficult problem for both classifier design and error estimation. Our interest here is not in error-estimation RMS as a function of Bayes error, but as a function of distribution complexity, and thus of the complexity of the decision boundary, and our proposed definition of complexity reflects this fact.
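A simple concrete instance of this phenomenon (my own illustrative construction, not the paper's complexity definition) is a checkerboard distribution: the two classes occupy interleaved unit cells, so the class-conditional densities never overlap and the Bayes error is exactly zero, yet the Bayes decision boundary is the entire grid of cell edges:

```python
import numpy as np

def bayes_rule(X):
    """Bayes classifier for the checkerboard: class = parity of the unit cell.
    It is exact by construction (zero Bayes error), but its decision boundary
    is the whole grid of cell edges -- a highly complex boundary."""
    return (np.floor(X[:, 0]) + np.floor(X[:, 1])).astype(int) % 2

# Sample points uniformly over a 4x4 block of unit cells and label them
# by the cell parity, so the modes of the two classes are interwoven.
rng = np.random.default_rng(1)
X = rng.uniform(0, 4, size=(500, 2))
y = bayes_rule(X)

# The optimal classifier makes no errors, yet any simple (e.g. linear)
# rule would misclassify a large fraction of either class.
print("Bayes error on sample:", np.mean(bayes_rule(X) != y))
print("class balance:", y.mean())
```

Distributions like this are exactly why a complexity measure tied to the Bayes error would be uninformative here: the error is zero, but classifier design and error estimation remain hard.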
Esmaeil Atashpaz-Gargari, Chao Sima, Ulisses M. Braga-Neto, Edward R. Dougherty, "Relationship between the accuracy of classifier error estimation and complexity of decision boundary," Pattern Recognition, 2012 (Link)