Reviewing athletic performance is a critical part of modern sports training, but snapshots that show only part of a course or exercise can be misleading, and travelling cameras are expensive. In this paper we describe a system that merges the output of many inexpensive, autonomous camera nodes distributed around a course to reliably synthesize tracking shots of multiple athletes training concurrently. The system handles uncontrolled lighting, athlete occlusions and overtaking/pack motion, and compensates for the quirks of cheap image sensors. It is fully automated, inexpensive and scalable, and provides output in near real time, allowing coaching staff to give immediate, relevant feedback on a performance. Because it requires no alteration to existing training exercises, the system has seen strong uptake by coaches, with over 100,000 videos recorded to date.
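As a rough illustration of the camera-selection step such a system implies, the sketch below picks, for each sample of an athlete's track, the node with the best view and concatenates those choices into a tracking shot. The node layout, the scoring rule, and all numbers are illustrative assumptions, not the paper's actual algorithm.

```python
# Minimal sketch of per-frame camera selection for a synthesized tracking
# shot. Camera geometry and the scoring rule are illustrative assumptions.
from dataclasses import dataclass
import math

@dataclass
class CameraNode:
    cam_id: int
    x: float          # position along the course (metres)
    y: float          # lateral offset from the course (metres)
    max_range: float  # beyond this distance the cheap sensor is unusable

def visibility_score(cam: CameraNode, athlete_x: float) -> float:
    """Higher is better; negative means the athlete is out of range."""
    dist = math.hypot(cam.x - athlete_x, cam.y)
    return cam.max_range - dist

def select_cameras(cams, track):
    """For each (t, x) sample of the athlete's track, pick the best node."""
    shot = []
    for t, x in track:
        best = max(cams, key=lambda c: visibility_score(c, x))
        if visibility_score(best, x) > 0:
            shot.append((t, best.cam_id))
    return shot

cams = [CameraNode(i, x=50.0 * i, y=5.0, max_range=40.0) for i in range(5)]
track = [(t, 4.0 * t) for t in range(0, 50, 2)]  # athlete at ~4 m/s
for t, cam_id in select_cameras(cams, track):
    print(f"t={t:3d}s -> camera {cam_id}")
```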
Results from wind-tunnel testing of athletes cannot always be reproduced on the track, yet reducing aerodynamic drag is critical for racing. Drag force is highly correlated with an athlete's frontal area, so in this paper we describe a system that segments an athlete from the very challenging background found in a standard racing environment. Given an accurate segmentation, a front-on view, and the athlete's position (for scaling), one can effectively count pixels and thereby measure the moving frontal area. The method does not rely on altering the track lighting, background, or the athlete's appearance. An image-matting algorithm better known from the film industry is combined with a novel model-based pre-processing step so that the whole measurement can be automated. Area results show better than one percent error compared with hand-extracted measurements over a representative period, while frame-by-frame measurements capture the expected cyclic variation. A near real-time implementation permits rapid iteration of aerodynamic experiments during training.
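To make the pixel-counting-with-scaling step concrete, here is a minimal sketch under a pinhole-camera assumption: at distance d, one pixel subtends d / f_px metres, so each foreground pixel covers (d / f_px)² square metres of the frontal plane. The focal length, distance, and mask below are invented for illustration and are not the paper's calibration procedure.

```python
# Back-of-envelope frontal-area estimate from a binary segmentation mask,
# assuming a pinhole camera model; all numeric values are illustrative.
import numpy as np

def frontal_area_m2(mask: np.ndarray, distance_m: float,
                    focal_px: float) -> float:
    """mask: boolean athlete segmentation; focal_px: focal length in pixels."""
    metres_per_px = distance_m / focal_px
    return int(mask.sum()) * metres_per_px ** 2

# Toy example: a 180x60-pixel silhouette, 20 m away, f = 2000 px.
mask = np.zeros((1080, 1920), dtype=bool)
mask[450:630, 900:960] = True
print(f"{frontal_area_m2(mask, distance_m=20.0, focal_px=2000.0):.3f} m^2")
```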
Understanding complex events in unstructured video, such as scoring a goal in a football game, is an extremely challenging task due to the dynamics, complexity and variation of video sequences. In this work, we attack this problem by exploiting the capabilities of the recently developed deep learning framework. We independently encode spatial and temporal information via convolutional neural networks and fuse the features via regularized autoencoders. To demonstrate the capabilities of the proposed scheme, a new dataset composed of goal and no-goal sequences is compiled. Experimental results demonstrate that extremely high classification accuracy can be achieved from a dramatically limited number of examples by leveraging pre-trained models with fine-tuned fusion of spatio-temporal features.
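A minimal sketch of the described fusion idea, assuming pre-extracted 2048-dimensional spatial and temporal CNN features and using weight decay plus a reconstruction loss as the regularizer; the layer sizes, the two-class head, and the joint training objective are assumptions for illustration, not the paper's exact architecture.

```python
# PyTorch sketch: fuse pre-extracted spatial and temporal CNN features
# through a regularized autoencoder, classifying from the bottleneck code.
import torch
import torch.nn as nn

class FusionAutoencoder(nn.Module):
    def __init__(self, spatial_dim=2048, temporal_dim=2048, code_dim=256):
        super().__init__()
        fused = spatial_dim + temporal_dim
        self.encoder = nn.Sequential(nn.Linear(fused, 1024), nn.ReLU(),
                                     nn.Linear(1024, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 1024), nn.ReLU(),
                                     nn.Linear(1024, fused))
        self.classifier = nn.Linear(code_dim, 2)  # goal vs. no-goal

    def forward(self, spatial_feat, temporal_feat):
        fused = torch.cat([spatial_feat, temporal_feat], dim=1)
        code = self.encoder(fused)
        return self.decoder(code), self.classifier(code), fused

model = FusionAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
spatial = torch.randn(8, 2048)    # e.g. pooled frame CNN features
temporal = torch.randn(8, 2048)   # e.g. optical-flow CNN features
labels = torch.randint(0, 2, (8,))
recon, logits, fused = model(spatial, temporal)
loss = nn.functional.mse_loss(recon, fused) \
     + nn.functional.cross_entropy(logits, labels)
loss.backward(); opt.step()
print(f"combined loss: {loss.item():.3f}")
```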
In competitive swimming, quantitative evaluation of kinematic parameters is a valuable tool for coaches but also a labor-intensive task. We present a system that automates the extraction of many kinematic parameters, such as stroke frequency, kick rates and stroke-specific intra-cyclic parameters, from video footage of an athlete. While this task can in principle be solved by human pose estimation, the problem is exacerbated by constantly changing self-occlusion and severe noise caused by air bubbles, splashes, and light reflection and refraction. Current approaches to pose estimation cannot provide the localization precision these conditions demand for accurate estimates of all desired kinematic parameters. In this paper we therefore reduce the problem of kinematic parameter derivation to detecting key frames with a deep neural network human pose estimator. We show that we can detect key frames with a precision on par with human annotation performance, and from the correctly located key frames the aforementioned parameters can be successfully inferred.
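Once key frames are located, parameters such as stroke frequency follow directly from their spacing at a known frame rate. The sketch below shows only that derivation step with mocked detections; the pose-estimation network itself is not reproduced, and the numbers are invented.

```python
# Deriving stroke frequency from detected key-frame indices, assuming one
# key frame per stroke cycle; key-frame values here are synthetic.
import numpy as np

def stroke_frequency_per_min(key_frames, fps: float) -> float:
    """Mean strokes per minute from frame indices of successive key frames."""
    periods = np.diff(np.asarray(key_frames)) / fps   # seconds per cycle
    return 60.0 / periods.mean()

# Toy key frames ~1.2 s apart at 50 fps -> roughly 50 strokes/min.
key_frames = [12, 72, 133, 192, 253]
print(f"{stroke_frequency_per_min(key_frames, fps=50.0):.1f} strokes/min")
```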
Collegiate athletics, particularly football, provide tremendous value to schools through branding, revenue, and publicity. As a result, extensive effort is put into recruiting talented students. When recruiting, home games are exceptional tools for showing a school's unique game-day atmosphere; however, this is not a viable option during the offseason or for off-site visits. This paper explores a solution to these challenges: using virtual reality (VR) to recreate the game-day experience. The Virtual Reality Application Center, in conjunction with Iowa State University (ISU) athletics, created a VR application mimicking the game-day experience at ISU. The application was displayed using the world's highest-resolution six-sided CAVE™, an Oculus Rift DK2 computer-driven head-mounted display (HMD), and a Merge VR smartphone-driven HMD. A between-subjects user study compared presence across the different systems and a video control. In total, 82 students participated, rating their presence using the Witmer and Singer questionnaire. Results revealed that while the CAVE™ scored highest in presence, the Oculus and Merge showed only a slight drop relative to the CAVE™. This suggests that the mobile, ultra-low-cost Merge is a viable alternative to the CAVE™ and Oculus for delivering the game-day experience to ISU recruits.
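A hypothetical sketch of how such a between-subjects comparison might be analyzed, using a one-way ANOVA over presence totals; the scores below are synthetic placeholders, not the study's data, and the study's actual statistical procedure is not specified in the abstract.

```python
# Illustrative between-subjects comparison of presence scores across
# display conditions; all values are synthetic.
from scipy import stats

cave   = [112, 118, 109, 121, 115, 117]   # questionnaire presence totals
oculus = [108, 111, 106, 113, 110, 107]
merge  = [105, 109, 104, 111, 108, 106]
video  = [ 92,  88,  95,  90,  87,  93]

f_stat, p_val = stats.f_oneway(cave, oculus, merge, video)
print(f"one-way ANOVA: F = {f_stat:.2f}, p = {p_val:.4f}")
```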
A playbook in American Football can contain hundreds of plays, and learning each play with its corresponding assignments and responsibilities is a major challenge for players. In this paper we propose a teaching tool for American Football coaches, based on computer vision and visualization techniques, which eases the learning process and helps players gain deeper knowledge of the underlying concepts. Coaches can create, manipulate and animate plays with adjustable parameters that affect the player actions in the animation. General player behaviors and interactions between players are modeled from expert knowledge. The final goal of the framework is to compare the theoretical concepts with their practical execution in training and games, using computer vision algorithms that extract spatio-temporal motion patterns from corresponding real video material. First results indicate that coaches can use the software effectively and that the animation system improves players' understanding of critical moments of a play.
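One plausible data model for an editable, parameterized play is sketched below: each assignment is a route whose waypoints and speed can be adjusted and sampled over time for animation. The class names, the linear interpolation scheme, and the example route are assumptions, not the paper's actual design.

```python
# Illustrative data model for an animatable play assignment: a route that
# can be sampled at any time t. Interpolation is linear between waypoints.
from dataclasses import dataclass

@dataclass
class Route:
    waypoints: list          # [(x, y), ...] in field coordinates (yards)
    speed: float             # yards per second, an adjustable parameter

    def position(self, t: float):
        """Position along the route at time t, clamped to the endpoint."""
        remaining = self.speed * t
        for (x0, y0), (x1, y1) in zip(self.waypoints, self.waypoints[1:]):
            seg = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
            if remaining <= seg:
                f = remaining / seg
                return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
            remaining -= seg
        return self.waypoints[-1]

# A slant route: release upfield 5 yards, then cut inside at 45 degrees.
slant = Route(waypoints=[(0, 0), (0, 5), (8, 13)], speed=7.0)
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    x, y = slant.position(t)
    print(f"t={t:.1f}s -> ({x:.1f}, {y:.1f})")
```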