Convolutional neural networks (CNNs) are being rapidly deployed in ADAS and autonomous driving for object detection, recognition, and semantic segmentation. Prior art for CNN support (HW IP or multi-core SW) does not address efficient implementation of the first layer, the YUV color space, or output stride support. This paper proposes a new pre-processing technique to enhance CNN-based HW IP or multi-core SW solutions. The pre-processor enables three new features: (1) higher parallelism for the first layer, boosting its performance; (2) efficient YUV color space support; and (3) efficient output stride support. The pre-processor uses a novel phase-split method to support these features. The proposed solution splits the input into multiple phases based on spatial location, e.g. 2 phases for the YUV 4:2:0 format and 4 phases for an output stride of 2. It is a unified solution that enables high utilization (>90%) for the first layer and a 2-4x reduction in bandwidth for an output stride of 2. For the YUV color space, it reduces computation by a factor of 2 and saves ∼0.1 mm² of silicon area, with negligible loss in accuracy.
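To make the phase-split idea concrete for an output stride of 2, the following is a minimal sketch, assuming the split behaves like a standard polyphase decomposition: the input and the kernel are each divided into 4 phases by spatial location, and the stride-2 result is the sum of four stride-1 results. The function names (`phase_split`, `correlate_valid`, `strided_via_phases`) are hypothetical and chosen for illustration; the pre-processor described here may organize the computation differently.

```python
def phase_split(x, s=2):
    """Split a 2-D list into s*s phases by spatial location.

    Phase (p, q) holds the samples x[s*i + p][s*j + q].
    """
    return {(p, q): [row[q::s] for row in x[p::s]]
            for p in range(s) for q in range(s)}


def correlate_valid(x, w, stride=1):
    """Reference 'valid' 2-D cross-correlation with the given output stride."""
    kh, kw = len(w), len(w[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    return [[sum(w[a][b] * x[stride * i + a][stride * j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def strided_via_phases(x, w, s=2):
    """Stride-s correlation as a sum of stride-1 correlations over phases."""
    x_phases = phase_split(x, s)
    w_phases = phase_split(w, s)
    kh, kw = len(w), len(w[0])
    out_h = (len(x) - kh) // s + 1
    out_w = (len(x[0]) - kw) // s + 1
    out = [[0] * out_w for _ in range(out_h)]
    for key, x_phase in x_phases.items():
        w_phase = w_phases[key]
        if not w_phase or not w_phase[0]:
            continue  # the kernel has no taps in this phase
        part = correlate_valid(x_phase, w_phase, stride=1)
        # A phase may yield extra rows/columns; only the first
        # out_h x out_w entries belong to the strided output.
        for i in range(out_h):
            for j in range(out_w):
                out[i][j] += part[i][j]
    return out


# Small integer example so the comparison is exact.
image = [[(3 * i + 5 * j) % 11 for j in range(8)] for i in range(8)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
assert strided_via_phases(image, kernel, 2) == correlate_valid(image, kernel, 2)
```

In this sketch each of the 4 phases holds a quarter of the input samples and is processed with stride 1, so no computed output is discarded. That is one plausible way to read the stated bandwidth reduction for an output stride of 2, offered here as an interpretation rather than a description of the actual hardware.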