A Fast Algorithm for Feature Selection in Conditional Maximum Entropy Modeling

YQ Zhou,FL Weng,L Wu,H Schmidt
DOI: https://doi.org/10.3115/1119355.1119375
2003-01-01
Abstract:This paper describes a fast algorithm that selects features for conditional maximum entropy modeling. Berger et al. (1996) presents an incremental feature selection (IFS) algorithm, which computes the approximate gains for all candidate features at each selection stage, and is very time-consuming for any problems with large feature spaces. In this new algorithm, instead, we only compute the approximate gains for the top-ranked features based on the models obtained from previous stages. Experiments on WSJ data in Penn Treebank are conducted to show that the new algorithm greatly speeds up the feature selection process while maintaining the same quality of selected features. One variant of this new algorithm with look-ahead functionality is also tested to further confirm the good quality of the selected features. The new algorithm is easy to implement, and given a feature space of size F, it only uses O(F) more space than the original IFS algorithm.
What problem does this paper attempt to address?