In this article, we present the BinDCT algorithm, a fast approximation of the Discrete Cosine Transform, and efficient VLSI architectures for its hardware implementation. The design objective is to meet the real-time constraints of embedded systems. Two VLSI architectures are proposed. The first targets low-complexity applications such as videophones, digital cameras, and digital camcorders. The second is designed for high-performance applications, including high-definition TV and digital cinema. To meet the real-time constraints of these applications, we decompose the structure of the BinDCT algorithm into simple matrices and map them onto multi-stage pipeline architectures. For the low-complexity implementation, the proposed 2-D BinDCT architecture can be realized at a cost of 10 integer adders, 80 registers, and 384 bytes of embedded memory. The high-performance architecture can be implemented with an additional 30 adders. These designs can compute real-time DCT/IDCT for CIF-format video at a 5 MHz clock rate with a 1.55 V power supply. With its high performance and low power consumption, the BinDCT coprocessor is an excellent candidate for real-time DCT-based image and video processing applications.
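The abstract does not spell out why the design needs only integer adders. In the broader literature, BinDCT is generally described as a lifting-based transform whose coefficients are dyadic rationals (k / 2^n), so each multiplication reduces to shifts and additions. The sketch below illustrates that general idea for a single pair of lifting steps. The coefficient 3/8 and the function names are illustrative assumptions, not values or structures taken from the article.

```python
def times_3_over_8(x: int) -> int:
    """Approximate (3/8) * x using only a shift, an add, and a shift.

    Illustrative dyadic coefficient (assumption, not from the article).
    Python's >> on negative ints floors, so the result is deterministic.
    """
    return ((x << 1) + x) >> 3


def lift_forward(a: int, b: int) -> tuple[int, int]:
    """Apply two lifting steps to the pair (a, b) with integer arithmetic."""
    b = b - times_3_over_8(a)   # first step: update b from a
    a = a + times_3_over_8(b)   # second step: update a from the new b
    return a, b


def lift_inverse(a: int, b: int) -> tuple[int, int]:
    """Undo lift_forward by reversing the steps and flipping the signs."""
    a = a - times_3_over_8(b)   # undo second step first
    b = b + times_3_over_8(a)   # then undo first step
    return a, b


# Round-trip check: the rounding inside each step cancels exactly,
# because the inverse subtracts the same rounded quantity that was added.
for a0 in range(-64, 65):
    for b0 in range(-64, 65):
        assert lift_inverse(*lift_forward(a0, b0)) == (a0, b0)
```

Because each lifting step adds a rounded function of one input to the other, the inverse can subtract exactly the same quantity. This is why a shift-and-add structure of this kind can serve both the forward and inverse transform without multipliers.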
Philip P. Dang, Paul M. Chau, Truong Q. Nguyen, Trac D. Tran, "BinDCT and Its Efficient VLSI Architectures for Real-Time Embedded Applications," in Journal of Imaging Science and Technology, 2005, pp. 124-137, https://doi.org/10.2352/J.ImagingSci.Technol.2005.49.2.art00004