Most sports competitions are still judged by humans; judging not only demands skill and experience but is also prone to errors and unfairness. Advances in sensing and computing technologies have been successfully applied to assist human judges in the refereeing process (e.g., the well-known Hawk-Eye system). Along this line of research, we propose a computer vision (CV)-based objective synchronization scoring system for synchronized diving, a relatively young Olympic sport. In synchronized diving, subjective judgement is difficult because of the rapidity of human motion, the limited viewing angles, and the limits of human memory, which motivates our development of an automatic and objective scoring system. Our CV-based scoring system consists of three components: (1) background estimation using color and optical-flow cues, which effectively segments the silhouettes of both divers from the input video; (2) feature extraction using histograms of oriented gradients (HOG) and stick figures, yielding an abstract representation of each diver's posture that is invariant to body attributes (e.g., height and weight); (3) synchronization evaluation by training a feed-forward neural network with cross-validation. We tested the system on 22 diving videos collected at the 2012 London Olympic Games. Our experimental results show that the CV-based approach can produce synchronization scores close to those given by human judges, with an MSE as low as 0.24.
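The feature-extraction and scoring steps of such a pipeline can be illustrated with a minimal sketch. Everything below is hypothetical: `hog_descriptor` is a crude single-cell orientation histogram rather than full HOG (no cells, blocks, or block normalization), and the feed-forward network uses random, untrained weights purely to show the data flow from two posture descriptors to a synchronization score.

```python
import numpy as np

def hog_descriptor(patch, n_bins=9):
    """Crude HOG-style descriptor: a weighted orientation histogram of
    image gradients. Illustrative only -- real HOG adds cells, blocks
    and contrast normalization."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-8)      # unit-norm feature

def sync_score(f1, f2, W1, b1, W2, b2):
    """Tiny one-hidden-layer feed-forward net mapping the per-bin feature
    difference of the two divers to a score in [0, 10]."""
    h = np.tanh(np.abs(f1 - f2) @ W1 + b1)           # hidden layer
    return 10.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid scaled to 0..10

# Random stand-ins for segmented diver silhouettes and trained weights.
rng = np.random.default_rng(0)
diver_a = rng.random((32, 32))
diver_b = rng.random((32, 32))
W1, b1 = rng.standard_normal((9, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.standard_normal(8) * 0.1, 0.0

f_a, f_b = hog_descriptor(diver_a), hog_descriptor(diver_b)
score = float(sync_score(f_a, f_b, W1, b1, W2, b2))
```

In a real system the weights would be learned by cross-validated regression against human judges' scores, and the descriptor would be computed on aligned stick-figure or silhouette crops of each diver per frame.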
We propose a novel hybrid framework for estimating a clean panoramic background from consumer RGB-D cameras. The method explicitly handles moving objects, eliminates the distortions observed in traditional 2D stitching methods, and adaptively compensates for errors in the input depth maps to avoid the failures common in 3D-based schemes. It produces a panoramic output that integrates parts of the scene captured from the different poses of the moving camera and removes moving objects by replacing them with the correct background information in both color and depth. A fused and cleaned RGB-D panorama has many applications, such as virtual reality, video compositing, and creative video editing. Existing image stitching methods rely on either color or depth information alone and thus suffer from perspective distortions or low RGB fidelity. A detailed comparison between traditional and state-of-the-art methods and the proposed framework demonstrates the advantages of fusing 2D and 3D information for panoramic background estimation.
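The core idea of removing moving objects by filling in their color and depth from other views can be sketched in a few lines. This is not the paper's pipeline: camera-pose warping and 2D/3D fusion are replaced here by an assumed stack of already-registered RGB-D frames, on which a temporal median with depth-consistency filtering recovers the background wherever it is visible in most frames.

```python
import numpy as np

def clean_background(frames, depths, depth_tol=0.5):
    """Per-pixel background estimate from registered RGB-D frames.

    frames: (T, H, W, 3) color stack, depths: (T, H, W) depth stack.
    A temporal median over depth gives the static geometry; color samples
    whose depth disagrees with it (i.e. moving foreground) are discarded
    before taking the color median."""
    bg_depth = np.median(depths, axis=0)                  # static geometry
    mask = np.abs(depths - bg_depth) < depth_tol          # background samples
    masked = np.where(mask[..., None], frames, np.nan)
    bg_color = np.nanmedian(masked, axis=0)               # movers removed
    return bg_color, bg_depth

# Synthetic example: a grey wall at 3 m, with a red "moving object" that
# occludes pixel (1, 1) at depth 1 m in the first three frames only.
T, H, W = 8, 4, 4
depths = np.full((T, H, W), 3.0)
frames = np.full((T, H, W, 3), 0.5)
depths[0:3, 1, 1] = 1.0
frames[0:3, 1, 1] = [1.0, 0.0, 0.0]

bg_color, bg_depth = clean_background(frames, depths)
```

Because the object's depth samples disagree with the background geometry, they are masked out and the recovered pixel shows the wall's color and depth rather than the mover's.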