Methods and Quality of Occupational Coding in Social Surveys

Ren Liying, Qiu Zeqi, Li Li, Yan Jie
DOI: https://doi.org/10.3785/j.issn.1008-942X.2011.09.211
2012-01-01
Abstract: Occupation is an important variable in social science research, but mistakes in the coding of occupations in survey research are unavoidable. Coding operations can take various forms. They are classified as centralized or decentralized according to where the work is done, and as manual or computer-assisted according to the coding tools used. Combining these two dimensions yields four coding methods: manual centralized coding, manual decentralized coding, computer-assisted centralized coding, and computer-assisted decentralized coding. Computer-assisted coding has not been well developed in China, so most Chinese surveys have used the two manual methods: either interviewers code during the interview, or experienced coders code within the survey organization after data collection. When choosing a coding method, survey practitioners usually have three factors in mind: cost, time efficiency, and coding quality. It is commonly believed that on-site coding by interviewers is cheaper and quicker than centralized coding by coders. However, views on the relative quality of these two methods are contradictory, and very few empirical studies have addressed the question. Based on an analysis of the occupational information collected by the Chinese Family Panel Studies (CFPS) in 2010, this study compares the results of these two coding methods in China and discusses the core factors that affect coding quality. The study shows that the coding results from the two methods differ greatly. At the most detailed level of coding, with 595 categories, only about one-third of the results from the two methods are identical. Even for simple coding with only eight categories, the proportion of identical results is still only three-fourths.
The quality of interviewers' text recording is an important factor affecting coding quality. In addition, interviewers' backgrounds and coding experience are the two main reasons for discrepancies in the detailed coding results. The study also shows that occupational categories differ in coding difficulty, which in turn affects coding results. Quality control over interviewers' on-site occupational coding is difficult to administer in practice. Therefore, in rigorous social surveys, especially when detailed coding results are needed, centralized coding is strongly recommended. Moreover, since the quality of interviewers' text recording is so important to collecting accurate and complete occupational information, the following steps are recommended: establish a standard for interviewers' text recording, strengthen the training of interviewers, and check their performance on a regular basis. It is also important to enhance quality control in the coding process, for example by paying more attention to the design of the coding process and to the supervision of the coders' work. These suggestions can be put into practice effectively in computer-assisted interviewing surveys.