BroadTrack: Broadcast Camera Tracking for Soccer

Floriane Magera, Thomas Hoyoux, Olivier Barnich, Marc Van Droogenbroeck
2024-12-03
Abstract: Camera calibration and localization, sometimes simply named camera calibration, enables many applications in the context of soccer broadcasting, for instance regarding the interpretation and analysis of the game, or the insertion of augmented reality graphics for storytelling or refereeing purposes. To contribute to such applications, the research community has typically focused on single-view calibration methods, leveraging the near-omnipresence of soccer field markings in wide-angle broadcast views, but leaving all temporal aspects, if considered at all, to general-purpose tracking or filtering techniques. Only a few contributions have been made to leverage any domain-specific knowledge for this tracking task, and, as a result, there lacks a truly performant and off-the-shelf camera tracking system tailored for soccer broadcasting, specifically for elevated tripod-mounted cameras around the stadium. In this work, we present such a system capable of addressing the task of soccer broadcast camera tracking efficiently, robustly, and accurately, outperforming by far the most precise methods of the state-of-the-art. By combining the available open-source soccer field detectors with carefully designed camera and tripod models, our tracking system, BroadTrack, halves the mean reprojection error rate and gains more than 15% in terms of Jaccard index for camera calibration on the SoccerNet dataset. Furthermore, as the SoccerNet dataset videos are relatively short (30 seconds), we also present qualitative results on a 20-minute broadcast clip to showcase the robustness and the soundness of our system.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **improving the accuracy and robustness of camera tracking in soccer broadcasts, especially for cameras mounted on elevated tripods around the stadium**. The authors point out that existing methods can use soccer field markings for single-frame calibration, but in practice they do not fully account for temporal consistency or for the particular properties of broadcast camera lenses (such as zoom and lens distortion). As a result, existing camera tracking systems perform poorly on soccer broadcasts. To address this, the authors propose a new system named **BroadTrack**, which combines open-source soccer field detectors with carefully designed camera and tripod models to track broadcast cameras efficiently and accurately. On the SoccerNet dataset, BroadTrack halves the mean reprojection error and improves the Jaccard index by more than 15%.

### Main problem summary:

1. **Limitations of existing methods**: Most existing methods rely only on single-frame calibration and ignore temporal consistency, which leads to poor tracking performance on long video sequences.
2. **Camera lens characteristics**: Broadcast cameras usually have high-power zoom lenses and lens distortion. Existing methods do not fully account for these characteristics, which reduces the accuracy of calibration and tracking.
3. **Lack of domain-specific knowledge**: Few studies use domain-specific knowledge to improve camera tracking, especially for cameras mounted on elevated tripods around the stadium.

### Solutions:

- **Combine domain knowledge**: Use the specific characteristics of broadcast cameras and tripods to design more accurate camera models.
- **Multi-frame tracking**: Introduce temporal consistency constraints so that camera parameters change smoothly over time.
- **Optimization algorithm**: Use non-linear optimization to minimize the reprojection error and other error terms, improving the accuracy and robustness of tracking.

Through these improvements, BroadTrack outperforms existing methods on the SoccerNet dataset, and the authors also report qualitative results on a 20-minute broadcast clip to illustrate its robustness on longer sequences.
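To make the combination of reprojection-error minimization and temporal consistency more concrete, here is a minimal sketch in Python with NumPy. It is not the BroadTrack implementation. It assumes a much simpler model: a distortion-free pinhole camera at a known, fixed tripod position, with only pan, tilt, and log focal length varying per frame. The names `CAMERA_POS`, `R0`, `project`, `residuals`, `refine`, and `track`, the synthetic pitch points, and the smoothness weight are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical setup: tripod position in metres, and the axis swap mapping world
# axes (x along the pitch, y across it, z up) to camera axes at pan = tilt = 0.
CAMERA_POS = np.array([0.0, -60.0, 15.0])
R0 = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])

def project(points, theta):
    """Project world points (N, 3) with theta = (pan, tilt, log focal length)."""
    pan, tilt, log_f = theta
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    r_pan = np.array([[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]])
    r_tilt = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    cam = (r_tilt @ R0 @ r_pan @ (points - CAMERA_POS).T).T
    return np.exp(log_f) * cam[:, :2] / cam[:, 2:3]

def residuals(theta, points, observed, prev_theta, smooth_weight):
    """Stack reprojection residuals with a penalty on change since the last frame."""
    reproj = (project(points, theta) - observed).ravel()
    if prev_theta is None:
        return reproj
    return np.concatenate([reproj, smooth_weight * (theta - prev_theta)])

def refine(theta0, points, observed, prev_theta=None, smooth_weight=0.0, iters=30):
    """Levenberg-Marquardt refinement with a forward-difference Jacobian."""
    theta = np.asarray(theta0, dtype=float).copy()
    args = (points, observed, prev_theta, smooth_weight)
    r = residuals(theta, *args)
    damping = 1e-3
    for _ in range(iters):
        jac = np.empty((r.size, theta.size))
        for k in range(theta.size):
            step = np.zeros_like(theta)
            step[k] = 1e-6
            jac[:, k] = (residuals(theta + step, *args) - r) / 1e-6
        hess = jac.T @ jac
        delta = np.linalg.solve(hess + damping * np.diag(np.diag(hess)), -jac.T @ r)
        r_new = residuals(theta + delta, *args)
        if r_new @ r_new < r @ r:  # accept the step and relax the damping
            theta, r, damping = theta + delta, r_new, damping / 10.0
        else:                      # reject the step and damp harder
            damping *= 10.0
        if np.linalg.norm(delta) < 1e-12:
            break
    return theta

def track(points, observations, theta_init, smooth_weight=10.0):
    """Refine each frame, starting from and regularised towards the previous estimate."""
    estimates, prev = [], None
    for observed in observations:
        start = theta_init if prev is None else prev
        prev = refine(start, points, observed, prev, smooth_weight)
        estimates.append(prev)
    return np.array(estimates)

# Synthetic demo: nine pitch landmarks on the ground plane, ten noise-free frames.
xs, ys = np.meshgrid([-30.0, 0.0, 30.0], [-20.0, 0.0, 20.0])
pitch_points = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
frames = np.arange(10)
truth = np.column_stack(
    [0.01 * frames, 0.2 + 0.002 * frames, np.log(2000.0) + 0.02 * frames]
)
observations = [project(pitch_points, theta) for theta in truth]
estimates = track(pitch_points, observations, theta_init=truth[0] + [0.02, -0.01, 0.05])
errors = [
    np.linalg.norm(project(pitch_points, est) - obs, axis=1).mean()
    for est, obs in zip(estimates, observations)
]
print("mean reprojection error (px):", np.mean(errors))
```

The sketch parameterizes zoom as log focal length so that all three parameters are on a comparable scale, which lets a single smoothness weight apply to all of them. A real system would also need to handle lens distortion, noisy or missing field detections, and uncertainty in the tripod position, none of which are modelled here.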