Extraction of Web Mathematical Formulas Based on Nutch

CUI Lin-wei, SU Wei, GUO Wei, LI Lian
DOI: https://doi.org/10.3969/j.issn.1001-6600.2011.01.034
2011-01-01
Abstract: This paper introduces methods for recognizing and extracting mathematical expressions in a formula-based mathematics search engine. It summarizes the characteristic features of MathML, OpenMath, LaTeX, and infix notation when they are embedded in a Web page, and presents a feature-based heuristic method for recognizing and extracting mathematical expressions. Experiments show that the method is effective and useful.
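
As an illustration of what a feature-based heuristic of this kind might look like, the following is a minimal Python sketch. It is not the authors' implementation and does not use Nutch's plugin API. The abstract does not list the features, so the patterns here are assumptions: `<math>` elements for MathML, `<OMOBJ>` elements for OpenMath, `$$...$$`, `\(...\)` and `\[...\]` delimiters for LaTeX, and simple equations containing `=` for infix notation. The names `PATTERNS`, `TAG` and `extract_formulas` are hypothetical.

```python
import re

# Assumed surface features for each encoding; the paper's real feature set
# is not given in the abstract.
PATTERNS = {
    "mathml": re.compile(r"<math\b.*?</math>", re.IGNORECASE | re.DOTALL),
    "openmath": re.compile(r"<OMOBJ\b.*?</OMOBJ>", re.IGNORECASE | re.DOTALL),
    "latex": re.compile(r"\$\$.+?\$\$|\\\(.+?\\\)|\\\[.+?\\\]", re.DOTALL),
}

# Assumption: an infix formula is an equation such as "y = x + 1".
INFIX = re.compile(
    r"\b\w+(?:\s*[-+*/^]\s*\w+)*\s*=\s*\w+(?:\s*[-+*/^]\s*\w+)*"
)

# Matches any remaining HTML tag so infix search runs on visible text only.
TAG = re.compile(r"<[^>]+>")


def extract_formulas(html):
    """Return a list of (kind, source_text) pairs found in an HTML string."""
    found = []
    remaining = html
    # Markup-delimited encodings first; blank out each match so that its
    # contents are not reported again as an infix formula.
    for kind, pattern in PATTERNS.items():
        found.extend((kind, m.group(0)) for m in pattern.finditer(remaining))
        remaining = pattern.sub(" ", remaining)
    # Strip leftover tags, then look for infix equations in the plain text.
    text = TAG.sub(" ", remaining)
    found.extend(("infix", m.group(0)) for m in INFIX.finditer(text))
    return found


if __name__ == "__main__":
    page = "<p>Area: <math><mi>x</mi></math> and $$a^2+b^2$$ and y = x + 1.</p>"
    for kind, source in extract_formulas(page):
        print(kind, source)
```

Regular expressions keep the sketch short, but they are fragile on real pages: nested or malformed markup, single-dollar LaTeX delimiters, and infix expressions without an equals sign would all be missed. A production extractor would more plausibly work on the parsed DOM.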