We propose a novel hybrid framework for estimating a clean panoramic background from consumer RGB-D cameras. The method explicitly handles moving objects, eliminates the perspective distortions observed in traditional 2D stitching methods, and adaptively compensates for errors in the input depth maps that are common failure points of purely 3D-based schemes. It produces a panoramic output that integrates parts of the scene captured from the different poses of the moving camera and removes moving objects by replacing them with the correct background information in both color and depth. A fused and cleaned RGB-D panorama has multiple applications, such as virtual reality, video compositing, and creative video editing. Existing image-stitching methods rely on either color or depth information alone and thus suffer from perspective distortions or low RGB fidelity. A detailed comparison of traditional and state-of-the-art methods against the proposed framework demonstrates the advantages of fusing 2D and 3D information for panoramic background estimation.