Many human detection algorithms are limited in capability because they do not exploit supplemental algorithms to enhance detection. We propose using two additional algorithms to extract information that augments human detection and increases its accuracy. The first is the computation of depth information. Recovering depth requires knowing how the camera's viewpoint changes from frame to frame. Calibrated stereo cameras can produce accurate depth maps, but the motion between consecutive frames can also be exploited to obtain a rough depth estimate of the objects in the scene. Block-matching and optical flow algorithms can compute these inter-frame disparities, which in turn provide depth information for the human detection algorithm. The second algorithm is superpixel segmentation, which computes a rough over-segmentation of the image into larger, boundary-preserving regions. This information can be used to separate foreground from background and to produce a segmentation tightly fitted to the detected human, rather than a bounding box that may include background clutter. Fusing these algorithms with human detection has been shown to increase detection accuracy and to better localize the human in the imagery.
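To illustrate the block-matching idea behind the disparity computation, a minimal one-dimensional sum-of-absolute-differences (SAD) matcher over a single scanline of a rectified image pair might look like the sketch below. This is not the authors' implementation; the function names, block size, and search range are our own assumptions for the example.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length patches."""
    return sum(abs(x - y) for x, y in zip(a, b))

def block_match_disparity(left_row, right_row, block=3, max_disp=4):
    """Estimate per-pixel disparity along one scanline of a rectified pair.

    For each block in the left row, search leftward shifts in the right
    row and keep the shift with the lowest SAD cost. Depth is then
    inversely proportional to disparity (depth = f * B / d for focal
    length f and baseline B).
    """
    half = block // 2
    disparities = []
    for x in range(half, len(left_row) - half):
        patch = left_row[x - half : x + half + 1]
        best_d, best_cost = 0, float("inf")
        # Only shifts that keep the candidate window inside the row
        for d in range(0, min(max_disp, x - half) + 1):
            cand = right_row[x - d - half : x - d + half + 1]
            cost = sad(patch, cand)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities
```

In practice the same cost-and-search loop runs over 2-D blocks across the whole image, and the resulting disparity map feeds the depth cue described above.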
In this paper, we propose a new human detection descriptor based on a combination of three major types of visual information: color, shape, and texture. Shape features are extracted from both the gradient and the phase congruency in the LUV color space. The Center-Symmetric Local Binary Pattern (CSLBP) approach is used to capture the texture information of the image. Fusing this complementary information captures a broad range of human appearance details and improves detection accuracy. The proposed features are formed by computing the phase congruency of the three color channels, together with the gradient magnitude and the CSLBP value of each pixel with respect to its neighborhood. Only the maximum phase congruency values are selected from the corresponding color channels. Histograms of oriented phase and gradients, as well as histograms of CSLBP values, are computed over local regions of the image and concatenated to construct the proposed descriptor, which fuses the shape and texture features and is named Chromatic domain Phase features with Gradient and Texture (CPGT). Several experiments were conducted to evaluate the performance of the proposed CPGT descriptor. The experimental results show that it achieves better detection performance and lower error rates than several state-of-the-art feature extraction methods.
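The CSLBP operator compares the four center-symmetric pairs among a pixel's eight neighbors, yielding a compact 4-bit code (0 to 15) rather than the 8-bit code of standard LBP. A minimal per-pixel sketch, assuming a plain 2-D list of intensities and a small comparison threshold (both our own assumptions, not the paper's exact implementation), could look like this:

```python
def cslbp(image, y, x, threshold=0.01):
    """Center-Symmetric LBP code for pixel (y, x) of a 2-D intensity grid.

    Each of the four opposite neighbor pairs contributes one bit:
    the bit is set when the difference across the pair exceeds the
    threshold, producing a 4-bit texture code in [0, 15].
    """
    # 8 neighbors in circular order; opposite pairs are indices i and i+4
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    n = [image[y + dy][x + dx] for dy, dx in offs]
    code = 0
    for i in range(4):
        if n[i] - n[i + 4] > threshold:
            code |= 1 << i
    return code
```

A histogram of these codes over each local region, concatenated with the phase and gradient histograms, would then form the texture part of a descriptor in the spirit of CPGT.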