This paper proposes a fast, robust, strip-based text detection algorithm for low-cost embedded devices such as scanners and printers, designed to operate with minimal memory. In such devices the whole document is not available at once, which, together with tight limits on memory
and processing speed, poses a significant challenge. Whereas conventional approaches apply computationally intensive algorithms to the whole page image, our algorithm processes the page one strip at a time and is efficient in both speed and memory use. To this effect,
a DCT block-based approach, combined with appropriate pre- and post-processing, is used to create a map of text pixels from the original page while suppressing non-text background, graphics, and images. The proposed algorithm detects text pixels in documents with varying backgrounds,
colors, and non-textual content. The algorithm is implemented in both MATLAB and C and tested on a wide variety of documents, using a Beagle Board to represent a low-performance CPU. The average execution time for a full 8.5 x 11 in. page scanned at 300 dpi is approximately 0.5
seconds in C and about 3 seconds on the Beagle Board.
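
To make the strip-based, DCT block idea concrete, the following is a minimal C sketch rather than the algorithm evaluated in this paper. It assumes 8-bit grayscale input, 8x8 blocks, and a single hypothetical feature (the AC energy of each block compared against a caller-supplied threshold); none of these choices is stated in the abstract, and the names `classify_strip`, `block_ac_energy`, and `threshold` are illustrative only. The pre- and post-processing stages are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLK 8 /* block size: an assumption, 8x8 is the common DCT block */

/* cos(m * pi / 16) from a small table, so no math library is needed. */
static double cos16(unsigned m)
{
    static const double C[9] = {
        1.0, 0.9807852804032304, 0.9238795325112867,
        0.8314696123025452, 0.7071067811865476, 0.5555702330196022,
        0.3826834323650898, 0.1950903220161283, 0.0
    };
    m %= 32;
    if (m > 16) m = 32 - m;          /* cos(2*pi - t) = cos(t) */
    return (m > 8) ? -C[16 - m] : C[m]; /* cos(pi - t) = -cos(t) */
}

/* Sum of squared 2-D DCT-II coefficients of one block, DC term excluded.
 * px points at the block's top-left pixel; stride is the strip width. */
static double block_ac_energy(const uint8_t *px, size_t stride,
                              double basis[BLK][BLK])
{
    double tmp[BLK][BLK];
    double energy = 0.0;

    /* Separable transform: 1-D DCT along rows, then along columns. */
    for (int y = 0; y < BLK; y++)
        for (int v = 0; v < BLK; v++) {
            double s = 0.0;
            for (int x = 0; x < BLK; x++)
                s += basis[v][x] * px[(size_t)y * stride + x];
            tmp[y][v] = s;
        }
    for (int u = 0; u < BLK; u++)
        for (int v = 0; v < BLK; v++) {
            double c = 0.0;
            for (int y = 0; y < BLK; y++)
                c += basis[u][y] * tmp[y][v];
            if (u != 0 || v != 0)
                energy += c * c;
        }
    return energy;
}

/* Classify one strip of BLK rows by `width` pixels (row-major).
 * map receives width / BLK entries: 1 where the block's AC energy
 * exceeds the (hypothetical) threshold, else 0. A trailing partial
 * block is ignored. Returns the number of flagged blocks. */
size_t classify_strip(const uint8_t *strip, size_t width,
                      double threshold, uint8_t *map)
{
    double basis[BLK][BLK];
    size_t nblocks = width / BLK;
    size_t hits = 0;

    assert(strip != NULL && map != NULL);

    /* Orthonormal DCT-II basis, built once per strip. */
    for (unsigned k = 0; k < BLK; k++) {
        double scale = (k == 0) ? 0.35355339059327373 : 0.5;
        for (unsigned n = 0; n < BLK; n++)
            basis[k][n] = scale * cos16((2 * n + 1) * k);
    }
    for (size_t b = 0; b < nblocks; b++) {
        map[b] = block_ac_energy(strip + b * BLK, width, basis) > threshold;
        hits += map[b];
    }
    return hits;
}
```

Under these assumptions the working set stays small: a 300 dpi page of 8.5 x 11 in. is 2550 x 3300 pixels, so one 8-row strip of 8-bit pixels occupies 20,400 bytes, compared with roughly 8.4 MB for the full page. A production version on an embedded CPU would more likely use a fixed-point fast DCT than the floating-point loops shown here.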