A data quality improvement method based on non-word errors correction

Keting Yin, Shan Wang, Zirui Liu, Qi Yu, Bo Zhou
DOI: https://doi.org/10.1109/ICSESS.2014.6933721
2014-01-01
Abstract: Spelling errors in data entry are an important factor affecting banking data quality. Working from a banking information system, we study non-word spelling errors that occur during keyboard entry. Letters are grouped with the typist's fingering and the QWERTY international keyboard layout taken into account: the 26 letters are divided into 5 types according to fingering and keyboard partitions, and a mathematical model based on keyboard probability is proposed accordingly. Combined with effective labeling and ranking methods for erroneous words, this model yields a recommendation list for misspelled words. A case study based on the model is carried out, demonstrating the process of correcting non-word spelling errors. The study shows that the proposed method effectively produces the recommendation list for misspelled words and improves the quality of data entry.
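To make the idea of a keyboard-based ranking concrete, the following is a minimal sketch and not the authors' model. The abstract does not list the five letter types or the probabilities, so the sketch substitutes a simpler, hypothetical grouping (same finger, same hand, other hand) under standard touch-typing fingering, with made-up weights. The names `FINGER_KEYS`, `substitution_weight`, and `rank_candidates` are illustrative assumptions, and only same-length substitution errors are considered.

```python
# Standard touch-typing finger assignment on a QWERTY layout.
# Indices 0-3 are left-hand fingers (pinky to index),
# indices 4-7 are right-hand fingers (index to pinky).
FINGER_KEYS = ["qaz", "wsx", "edc", "rfvtgb", "yhnujm", "ik", "ol", "p"]
FINGER = {ch: i for i, keys in enumerate(FINGER_KEYS) for ch in keys}


def substitution_weight(typed, intended):
    """Hypothetical weight for 'intended' having been mistyped as 'typed'.

    The values are illustrative assumptions, not taken from the paper.
    """
    if typed == intended:
        return 1.0
    f_typed, f_intended = FINGER[typed], FINGER[intended]
    if f_typed == f_intended:
        return 0.5   # same finger: assumed most likely slip
    if (f_typed < 4) == (f_intended < 4):
        return 0.2   # same hand, different finger
    return 0.05      # different hands: assumed least likely


def rank_candidates(misspelled, dictionary):
    """Rank same-length dictionary words by the product of letter weights.

    Assumes all words consist of lowercase letters a-z.
    """
    scored = []
    for word in dictionary:
        if len(word) != len(misspelled):
            continue  # this sketch only handles substitution errors
        score = 1.0
        for typed, intended in zip(misspelled, word):
            score *= substitution_weight(typed, intended)
        scored.append((word, score))
    # Highest score first; ties keep dictionary order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    words = ["bank", "balk", "back", "bask", "loan"]
    for word, score in rank_candidates("bamk", words):
        print(word, score)
```

In this sketch, a non-word such as "bamk" ranks "bank" first because "m" and "n" are struck by the same finger. A fuller treatment would also need insertion, deletion, and transposition errors, which the sketch leaves out.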