Consistency Measures for Feature Selection
Kilho Shin, Danny Fernandes, Seiya Miyazaki
Although consistency-based feature selection is an important category of feature selection research, it is defined only intuitively in the literature. We therefore first provide a formal definition of a consistency measure and then use this definition to evaluate 19 feature selection measures from the literature. Only 5 of these were labeled as consistency measures by their original authors, but under our definition an additional 9 measures qualify as consistency measures. To compare these 14 consistency measures in terms of sensitivity, we introduce the concept of quasi-linear compatibility order and partially determine the order among the measures. Next, we propose a new fast algorithm for consistency-based feature selection. In experiments on eleven large datasets, we compare the performance of our algorithm against INTERACT and LCC, the only two consistency-based algorithms with potential real-world application. Our algorithm shows a vast improvement in time efficiency, while its accuracy is comparable with that of INTERACT and LCC.