Print quality (PQ) is a central concern in the printing industry, and detecting and analyzing print defects is an effective way to improve it. Because different types of print defects appear in different regions of interest (ROIs) in the digital image of a scanned page, extracting
these ROIs helps to detect and analyze print defects. This paper proposes a method to extract ROIs based on the digital image object map [1], which uses three labels: raster (images or pictures), vector (background and smooth gradient color areas), and
symbol (symbols and text). Our method extracts four kinds of ROIs from these three labeled object types, so we need to distinguish the background area and the smooth gradient color area (color vector) from other vector objects. The ROI extraction process includes
four parts, each of which extracts one kind of ROI. For color vector and background ROI extraction, we develop two approaches: one obtains the maximum-area rectangular ROI, and the other extracts the deepest rectangular ROI. With both approaches, we
use a greedy algorithm to gather additional useful ROIs. In the final result of the ROI extraction process, we save only the top-left and bottom-right positions of each ROI. Finally, we design a MATLAB GUI tool and label the ROI ground truth manually. We calculate the intersection over
union (IoU) between the ROI extraction result and the manually labeled ROI ground truth to evaluate our ROI extraction algorithm and to check whether it is good enough to crop different ROIs from the image of the scanned page for print defect detection and analysis.
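
As an illustration of the evaluation metric described above, the following minimal sketch computes the IoU of two rectangular ROIs stored by their top-left and bottom-right positions. The function name `roi_iou`, the `(left, top, right, bottom)` tuple layout, and the convention that the right and bottom coordinates are exclusive are assumptions made for this example; they are not taken from the paper's implementation.

```python
def roi_iou(box_a, box_b):
    """Intersection over union of two axis-aligned rectangles.

    Each box is (left, top, right, bottom). This layout and the
    exclusive right/bottom convention are assumptions of this sketch.
    """
    a_left, a_top, a_right, a_bottom = box_a
    b_left, b_top, b_right, b_bottom = box_b

    # Width and height of the overlap, clamped to zero when disjoint.
    inter_w = max(0, min(a_right, b_right) - max(a_left, b_left))
    inter_h = max(0, min(a_bottom, b_bottom) - max(a_top, b_top))
    intersection = inter_w * inter_h

    area_a = (a_right - a_left) * (a_bottom - a_top)
    area_b = (b_right - b_left) * (b_bottom - b_top)
    union = area_a + area_b - intersection

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return intersection / union if union > 0 else 0.0


# Hypothetical extracted ROI versus hypothetical hand-labeled ground truth.
extracted = (0, 0, 10, 10)
ground_truth = (5, 5, 15, 15)
print(roi_iou(extracted, ground_truth))
```

An IoU of 1.0 means the extracted ROI matches the labeled rectangle exactly, and 0.0 means the two do not overlap at all.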