Detecting overlapping text in map images is a challenging problem. Previous algorithms generally assume specific cartographic styles (e.g., road shapes and text format) and are difficult to adjust for handling different map types. In this paper, we build on our previous text recognition work, Strabo, to develop an algorithm for detecting characters that overlap non-text symbols. We call this algorithm Overlapping Text Detection (OTD). OTD uses the recognition results and locations of detected text labels (from Strabo) to detect potential areas that contain overlapping text. Next, OTD classifies these areas as either text or non-text regions based on their shape descriptions (including the ratio of the number of foreground pixels to area size, the number of connected components, and the number of holes). The average precision and recall of OTD in classifying text and non-text regions were 77% and 86%, respectively. We show that OTD improved the precision and recall of text detection in Strabo by 19% and 41%, respectively, and produced higher accuracy compared to a state-of-the-art text/graphic separation algorithm.
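The three shape descriptions named above can be illustrated with a minimal sketch. It assumes a candidate area is given as a small binary grid (1 for foreground, 0 for background), counts foreground components with 8-connectivity, and counts holes as background regions (4-connectivity) that do not touch the area border. The function names and connectivity choices are assumptions made for illustration, not details taken from OTD, and the classification step itself is not shown.

```python
from collections import deque

# Neighbor offsets: 4-connectivity and 8-connectivity (assumed choices).
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(grid, target, neighbors, skip_border=False):
    """Count connected regions of cells equal to `target`.

    With skip_border=True, regions that touch the grid border are ignored.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill over one region.
            touches_border = False
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                if y in (0, rows - 1) or x in (0, cols - 1):
                    touches_border = True
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not (skip_border and touches_border):
                count += 1
    return count


def shape_descriptors(region):
    """Return the three shape descriptions for a binary region (hypothetical helper)."""
    rows, cols = len(region), len(region[0])
    foreground = sum(sum(row) for row in region)
    return {
        # Ratio of the number of foreground pixels to area size.
        "foreground_ratio": foreground / (rows * cols),
        # Number of connected foreground components.
        "components": count_regions(region, 1, N8),
        # Number of holes: enclosed background regions.
        "holes": count_regions(region, 0, N4, skip_border=True),
    }
```

For a ring-like shape such as the letter "O" drawn in a 5x5 grid, this sketch reports one component and one hole, whereas two isolated pixels yield two components and no holes. Features of this kind could then be passed to any classifier that separates text from non-text regions.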
Narges Honarvar Nazari, Tianxiang Tan, Yao-Yi Chiang, "Integrating Text Recognition for Overlapping Text Detection in Maps" in Proc. IS&T Int’l. Symp. on Electronic Imaging: Document Recognition and Retrieval XXIII, 2016, https://doi.org/10.2352/ISSN.2470-1173.2016.17.DRR-061