A Short Note on Evaluating RepNet for Temporal Repetition Counting in Videos

Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman
2024-11-14
Abstract: We discuss some consistent issues on how RepNet has been evaluated in various papers. As a way to mitigate these issues, we report RepNet performance results on different datasets, and release evaluation code and the RepNet checkpoint to obtain these results. Code URL: https://github.com/google-research/google-research/blob/master/repnet/
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses how RepNet's performance on the video repetition counting task should be evaluated. The authors point out that RepNet has been evaluated inconsistently across papers, and that some papers report results for a modified RepNet model, which has led to a misleading picture of its performance. They note that the original RepNet model can already handle predictions with a period (cycle) length greater than 32 through the multi-speed evaluation technique, without any change to the model. To clear up the confusion, they report RepNet's results on several datasets and release the evaluation code and the RepNet checkpoint.

### Main problem points:

1. **Evaluation consistency problem**: Several papers report that RepNet performs poorly on some repetition-counting datasets, but these reports often use a modified RepNet model rather than the original.
2. **Impact of model modification**: For example, the TransRAC paper states that, for a fair comparison, its authors modified the last fully connected layer of RepNet so that it can handle videos with more than 32 action cycles. With this modification, the model's performance is close to zero on the UCFRep and RepCount datasets.
3. **Multi-speed evaluation technique**: The original RepNet paper proposed a multi-speed evaluation technique: by changing the video playback speed, predictions with longer periods can be handled without modifying the model. This method was also validated on the Countix dataset.

### Solutions:

- **Re-evaluate RepNet**: The authors re-evaluated the original RepNet model on the Countix, UCFRep, and RepCount-A datasets using the multi-speed evaluation technique and reported detailed performance metrics.
- **Release evaluation code and model**: For transparency and reproducibility, the authors released the code used for evaluation and the RepNet checkpoint.

### Conclusion:

The re-evaluation shows that the original RepNet model still performs strongly on video repetition counting and even outperforms more recent methods on some datasets. Although many new models and larger backbone networks have appeared in recent years, the RepNet model trained in 2020 is still highly competitive at low resolution. The authors hope the community can use the released model to further improve repetition-counting methods.
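The multi-speed idea summarized above can be illustrated with a small sketch. This is not the released evaluation code: `toy_predictor`, `multi_speed_count`, `MAX_PERIOD`, the integer "phase label" stand-in for video frames, and the rule of keeping the most confident playback speed are all assumptions made for illustration. The point is only to show how subsampling frames lets a predictor whose output period is capped at 32 still count an action whose period is longer.

```python
from typing import Callable, Sequence, Tuple

MAX_PERIOD = 32  # assumed cap on the period (in frames) the predictor can output


def toy_predictor(clip: Sequence[int]) -> Tuple[int, float]:
    """Hypothetical stand-in for a counting model (not RepNet itself).

    Returns (period_in_frames, confidence). Confidence is the fraction of
    frames that equal the frame one period later; periods above MAX_PERIOD
    cannot be predicted.
    """
    best_period, best_conf = MAX_PERIOD, 0.0  # fall back to the cap if nothing matches
    for lag in range(1, min(MAX_PERIOD, len(clip) - 1) + 1):
        pairs = len(clip) - lag
        matches = sum(clip[i] == clip[i + lag] for i in range(pairs))
        conf = matches / pairs
        if conf > best_conf:  # strict '>' keeps the shortest best period
            best_period, best_conf = lag, conf
    return best_period, best_conf


def multi_speed_count(
    frames: Sequence[int],
    predictor: Callable[[Sequence[int]], Tuple[int, float]] = toy_predictor,
    strides: Sequence[int] = (1, 2, 3, 4),
) -> Tuple[float, int]:
    """Run the predictor at several playback speeds and keep the most confident.

    Returns (estimated_count, chosen_stride).
    """
    best_count, best_stride, best_conf = 0.0, strides[0], -1.0
    for stride in strides:
        clip = frames[::stride]      # faster playback: keep every stride-th frame
        period, conf = predictor(clip)
        count = len(clip) / period   # repetitions seen in the subsampled clip
        if conf > best_conf:
            best_count, best_stride, best_conf = count, stride, conf
    return best_count, best_stride


# Toy "video": 10 repetitions of a 48-frame action, each frame labelled by its phase.
video = [i % 48 for i in range(480)]
single_speed = len(video) / toy_predictor(video)[0]  # 15.0: the period cap gives a wrong count
count, stride = multi_speed_count(video)             # (10.0, 2): correct count at 2x speed
```

At normal speed the 48-frame period exceeds the cap, so the toy predictor returns a low-confidence, incorrect estimate. At 2x speed the period becomes 24 subsampled frames, which fits under the cap, and the count is recovered without touching the predictor.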