A Novel Localization Method for Mathematical Formula in English Scientific Document

LI Feng, WU Wei
2009-01-01
Abstract: A novel method that combines bottom-up and top-down analysis is proposed for locating mathematical formulas in images of English scientific documents. First, a benchmark parameter is calculated from statistics of the whole document image. Second, the document image is divided into lines using the horizontal projection data of the local-maximum components in the image, and each line is divided into sub-regions according to its vertical projection data. These sub-regions are classified against the benchmark parameter. Finally, the locations of formulas in the document image are obtained by suitably merging certain specific regions. The method can be applied to documents that mix pictures and text, and it reduces the effect of pictures and forms in the document image on the localization of mathematical expressions.
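The projection-based splitting step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the page is already binarized (1 for ink, 0 for background), raw pixel counts stand in for the paper's local-maximum components, and the names `runs`, `segment`, and `min_col_gap` are hypothetical. The benchmark parameter, the classification rule, and the merging step are not specified in the abstract and are not reproduced here.

```python
def runs(profile, min_gap=1):
    """Return (start, end) pairs (end exclusive) of nonzero runs in a 1-D profile.

    Zero gaps shorter than min_gap are bridged, so nearby marks stay together.
    """
    spans = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the run at the first zero of the gap.
                spans.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The run reaches the end of the profile (minus any short trailing gap).
        spans.append((start, len(profile) - gap))
    return spans


def segment(image, min_col_gap=2):
    """Split a binary image (list of rows of 0/1) into lines, then sub-regions.

    Returns a list of (top, bottom, left, right) boxes with exclusive ends.
    """
    regions = []
    # Horizontal projection: amount of ink in each row.
    row_profile = [sum(row) for row in image]
    for top, bottom in runs(row_profile):
        line = image[top:bottom]
        # Vertical projection within this line: amount of ink in each column.
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile, min_gap=min_col_gap):
            regions.append((top, bottom, left, right))
    return regions
```

Calling `segment(page)` on a binarized page returns one box per sub-region. A classifier of the kind the abstract describes would then compare each box against a page-level statistic before merging neighbouring formula candidates.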