Real-world data are never perfect and often suffer from noise that may affect interpretations of the data, the models built from the data, and the decisions based on them. A common way to handle noise is to employ outlier detection techniques. The local outlier factor (LOF) is a well-known and widely used outlier detection algorithm based on the local densities of data. However, it does not perform well at removing class noise, since it does not take class label information into account. In this paper, we propose a new noise removal algorithm based on Combined Local Outlier Factors: CLOF. Specifically, CLOF first defines three local outlier factors, i.e., $lof_a$, $lof_1$ and $lof_0$, and eliminates attribute noise using $lof_a$. Then, CLOF finds class noise and corrects its labels by using the three local outlier factors simultaneously. Experimental results on artificial and real-world UCI data sets demonstrate that CLOF can effectively identify class noise and attribute noise and thereby improve the classification performance of various classifiers, especially for data sets with severe class overlap.
Keywords: pattern recognition; class noise; attribute noise; local outlier factor; outlier detection
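To illustrate the abstract's observation that plain LOF ignores class labels, the following is a minimal sketch of a simplified LOF in pure Python. It is not the CLOF algorithm, whose three factors are not defined in the abstract. The function names, the toy data, and the choice of k = 3 are assumptions made for illustration only, and the neighbourhood is simplified to exactly k neighbours (ties are cut off rather than included).

```python
import math


def k_nearest(points, i, k):
    """Return the k smallest (distance, index) pairs for points[i]."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def lof_scores(points, k=3):
    """Simplified LOF: assumes fewer than k exact duplicates of any point."""
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance to the k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neigh]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean neighbour density divided by the point's own density
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (hypothetical): two tight clusters, one isolated point,
# and one point whose label disagrees with all of its neighbours.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0), (10.5, 10.5),
    (5.0, 5.0),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
ATTRIBUTE_OUTLIER = 10  # isolated point between the clusters
MISLABELLED = 4         # labelled 1, but sits inside the class-0 cluster

# Labels are never passed to lof_scores, so they cannot influence the result.
scores = lof_scores(points, k=3)
for p, y, s in zip(points, labels, scores):
    print(p, y, round(s, 3))
```

In this sketch the isolated point receives a score well above 1, while the mislabelled point scores close to 1 like its neighbours, because the labels never enter the computation. This is the gap that a label-aware method such as CLOF is meant to address.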