
Cost-Sensitive Boosting Algorithms: Do We Really Need Them?
Authors: Nikolaos Nikolaou, Narayanan Edakunni, Meelis Kull, Peter Flach, Gavin Brown
Journal: Machine Learning
Publication Date: 2 August 2016
Department: Computer Science
In Brief
A unifying framework to understand two decades of research.
The world relies on data, and increasingly on data analytics. “Boosting” is the name of a family of algorithms used extensively in data science, across numerous applications. “Cost-sensitive” algorithms are those that continue to work well when the cost of a false positive prediction outweighs the cost of a false negative, or vice versa. This occurs, for example, in safety-critical applications such as medical informatics.
Boosting has long been perceived to fail (i.e. predict incorrectly) on such “imbalanced cost” data, and much effort has been expended on patching this perceived weakness. This research identified 15 distinct cost-sensitive Boosting variants, published over 20 years, each with its own claims to distinction and superiority.
Researchers at Manchester analysed these variants through four theoretical frameworks: Bayesian decision theory, margin theory, functional gradient descent, and probabilistic modelling. From these frameworks they identified axioms that any such algorithm must obey to be coherent. Only one algorithm turned out to be consistent with all four frameworks while remaining practical for deployment in real applications. Surprisingly, it is the original 1997 algorithm (AdaBoost), with a “calibration” applied to its outputs. Extensive experiments support this finding, showing that the calibrated original outperforms all other variants.
In effect, this research renders almost 20 years of heuristic algorithms redundant, showing that a simple calibration of the original method is more than sufficient.
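To make the recommendation concrete, here is a minimal sketch of “calibrated boosting with a cost-sensitive threshold”, assuming scikit-learn’s AdaBoostClassifier and CalibratedClassifierCV and purely illustrative (hypothetical) misclassification costs; it is not the authors’ experimental code:

```python
# Sketch: plain AdaBoost, calibrated with Platt scaling, then a
# cost-sensitive decision threshold from Bayesian decision theory.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical costs: a false negative is 5x as costly as a false positive.
C_FP, C_FN = 1.0, 5.0

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# The "original 1997 algorithm", wrapped in sigmoid (Platt) calibration so
# its scores become usable probability estimates.
booster = AdaBoostClassifier(n_estimators=200, random_state=0)
calibrated = CalibratedClassifierCV(booster, method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)

# Bayes-optimal rule for unequal costs: predict positive when
# p(y=1|x) > C_FP / (C_FP + C_FN), instead of the default 0.5.
threshold = C_FP / (C_FP + C_FN)
p_pos = calibrated.predict_proba(X_te)[:, 1]
y_pred = (p_pos > threshold).astype(int)

fp = np.sum((y_pred == 1) & (y_te == 0))
fn = np.sum((y_pred == 0) & (y_te == 1))
print(f"threshold={threshold:.2f}  total cost={fp * C_FP + fn * C_FN:.1f}")
```

The point of the sketch is that the costs enter only at prediction time, through the threshold applied to calibrated probabilities, rather than through any modification of the boosting procedure itself.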
- The researchers found 15 “core” variants of the algorithm, and at least five further minor variants, published across many different venues.
- For many years Boosting was the algorithm used by digital cameras to find faces in images.