Recognition of Machine-Printed Arabic Mathematical Formulas
Abstract
The proposed system recognizes machine-printed Arabic mathematical formulas extracted from scanned images of clearly printed documents and outputs the recognition results in MathML. The system works in two main stages: symbol recognition and symbol-arrangement analysis. For symbol recognition, a combination of statistical features (run length, central Zernike moments, bi-level co-occurrence, etc.) is used with K*, an instance-based classifier, to achieve high accuracy on mathematical symbols. To parse mathematical formulas, we define a set of replacement rules in a coordinate grammar, applied with emphasis on both the symbol recognizer and the symbol-arrangement analysis. Both ascending and descending parsing schemes, based on operator dominance, are used to parse each formula. The proposed system achieves satisfactory results.
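As a rough illustration of symbol-arrangement analysis driven by operator dominance, the sketch below turns a list of recognized symbols with bounding boxes into a MathML fragment. It is not the authors' grammar: the `Symbol` class, the `"frac"` label for a fraction bar, the rule that the widest bar dominates, and the right-to-left reading order with `dir="rtl"` are all assumptions made for the example.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

OPERATORS = {"+", "-", "=", "×", "÷"}  # hypothetical operator set


@dataclass
class Symbol:
    label: str        # recognizer output, e.g. "x", "+", or "frac" for a bar
    x0: float         # left edge
    y0: float         # top edge (image coordinates, y grows downward)
    x1: float         # right edge
    y1: float         # bottom edge
    mathml: str = ""  # pre-built markup for already-parsed groups


def _cx(s):
    return (s.x0 + s.x1) / 2


def _cy(s):
    return (s.y0 + s.y1) / 2


def _leaf(s):
    """Emit the MathML token for a single symbol or a pre-built group."""
    if s.mathml:
        return s.mathml
    if s.label.isdigit():
        return f"<mn>{escape(s.label)}</mn>"
    if s.label in OPERATORS:
        return f"<mo>{escape(s.label)}</mo>"
    return f"<mi>{escape(s.label)}</mi>"


def parse(symbols, rtl=True):
    """Recursively group symbols; the widest fraction bar dominates."""
    symbols = list(symbols)
    bars = [s for s in symbols if s.label == "frac"]
    if bars:
        bar = max(bars, key=lambda s: s.x1 - s.x0)
        spanned = [s for s in symbols
                   if s is not bar and bar.x0 <= _cx(s) <= bar.x1]
        above = [s for s in spanned if _cy(s) < bar.y0]
        below = [s for s in spanned if _cy(s) > bar.y1]
        used = {id(bar)} | {id(s) for s in above + below}
        frac = ("<mfrac><mrow>" + parse(above, rtl) + "</mrow><mrow>"
                + parse(below, rtl) + "</mrow></mfrac>")
        # Replace the bar and its operands by one composite node.
        node = Symbol("group", bar.x0, bar.y0, bar.x1, bar.y1, frac)
        rest = [s for s in symbols if id(s) not in used] + [node]
        return parse(rest, rtl)
    # No dominant operator left: read the baseline in writing order.
    ordered = sorted(symbols, key=_cx, reverse=rtl)
    return "".join(_leaf(s) for s in ordered)


def formula_to_mathml(symbols, rtl=True):
    """Wrap the parsed fragment in a <math> element."""
    direction = "rtl" if rtl else "ltr"
    return ('<math xmlns="http://www.w3.org/1998/Math/MathML" '
            f'dir="{direction}">' + parse(symbols, rtl) + "</math>")
```

The sketch handles only fraction bars and a single baseline. A fuller grammar would add rules for superscripts, subscripts, roots, and large operators, each with its own spatial test.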
Origin: Files produced by the author(s)