Language and Multimodal Models in Sports: A Survey of Datasets and Applications

Haotian Xia, Zhengbang Yang, Yun Zhao, Yuqing Wang, Jingxi Li, Rhys Tracy, Zhuangdi Zhu, Yuan-fang Wang, Hanjie Chen, Weining Shen

2024-06-18

Abstract: Recent integration of Natural Language Processing (NLP) and multimodal models has advanced the field of sports analytics. This survey presents a comprehensive review of the datasets and applications driving these innovations post-2020. We review and categorize datasets into three primary types: language-based, multimodal, and convertible datasets. Language-based datasets are for tasks involving text, and multimodal datasets are for tasks involving multiple modalities (e.g., text, video, audio). Convertible datasets, initially single-modal (video), can be enriched with additional annotations, such as explanations of actions and video descriptions, to become multimodal, offering future potential for richer and more diverse applications. Our study highlights the contributions of these datasets to various applications, from improving fan experiences to supporting tactical analysis and medical diagnostics. We also discuss the challenges and future directions in dataset development, emphasizing the need for diverse, high-quality data to support real-time processing and personalized user experiences. This survey provides a foundational resource for researchers and practitioners aiming to leverage NLP and multimodal models in sports, offering insights into current trends and future opportunities in the field.

Computation and Language

What problem does this paper attempt to address?

The paper systematically examines how Natural Language Processing (NLP) and multimodal models are applied in the sports domain, providing a comprehensive review of datasets and applications. Specifically, the paper addresses the following key issues:

1. **Dataset Classification**: The paper categorizes datasets into three types: language-based datasets, multimodal datasets, and convertible datasets. Language-based datasets serve text tasks, and multimodal datasets serve tasks that combine modalities (such as text, video, and audio). Convertible datasets are single-modal (such as video) and can be converted into multimodal datasets by adding annotations.
2. **Application Areas**: The paper details various applications of NLP and multimodal models in the sports domain, including match prediction and analysis, hate speech detection, named entity recognition, news summarization, fan interaction, sports understanding, medical applications, and educational applications.
3. **Future Directions and Challenges**: The paper discusses future directions and challenges for NLP and multimodal models in the sports domain, emphasizing the need for more high-quality and diverse data to support real-time processing and personalized user experiences.

Through this review, the paper provides a foundational resource for researchers and practitioners, helping them better utilize NLP and multimodal models in sports research and practice.
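The three-way dataset taxonomy described above can be illustrated with a minimal Python sketch. This is not code from the paper: the names `DatasetType`, `SportsDataset`, `categorize`, and `enrich_with_text` are hypothetical, and the rule used here (text only is language-based, more than one modality is multimodal, any other single modality is convertible) is a simplifying assumption drawn from the summary.

```python
from dataclasses import dataclass, field
from enum import Enum


class DatasetType(Enum):
    LANGUAGE = "language-based"
    MULTIMODAL = "multimodal"
    CONVERTIBLE = "convertible"


@dataclass
class SportsDataset:
    """Hypothetical record describing a dataset by its modalities."""
    name: str
    modalities: set[str]
    annotations: list[str] = field(default_factory=list)


def categorize(ds: SportsDataset) -> DatasetType:
    """Assumed rule: text-only -> language-based; several modalities ->
    multimodal; any other single modality (e.g. video) -> convertible."""
    if ds.modalities == {"text"}:
        return DatasetType.LANGUAGE
    if len(ds.modalities) > 1:
        return DatasetType.MULTIMODAL
    return DatasetType.CONVERTIBLE


def enrich_with_text(ds: SportsDataset, descriptions: list[str]) -> SportsDataset:
    """Return a new dataset with text annotations (e.g. action explanations
    or video descriptions) added as an extra modality."""
    return SportsDataset(
        name=ds.name,
        modalities=ds.modalities | {"text"},
        annotations=ds.annotations + descriptions,
    )


# Usage: a video-only dataset starts as convertible and becomes
# multimodal once textual descriptions are attached.
clips = SportsDataset(name="example_clips", modalities={"video"})
enriched = enrich_with_text(clips, ["A player serves the ball."])
print(categorize(clips).value, "->", categorize(enriched).value)
```

In this sketch the conversion returns a new record rather than mutating the original, so the single-modal source dataset remains available alongside its annotated version.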

Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision

A Survey of Multimodal Large Language Model from A Data-centric Perspective

SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models

Multimodal Large Language Models: A Survey

SportQA: A Benchmark for Sports Understanding in Large Language Models

Sports Video Analysis on Large-Scale Data

Personalized Multimodal Large Language Models: A Survey

SNIL: Generating Sports News From Insights With Large Language Models

A Survey on Image-text Multimodal Models

A Survey on Multimodal Benchmarks: In the Era of Large AI Models

A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications

A survey on advancements in image-text multimodal models: From general techniques to biomedical implementations

Sporthesia: Augmenting Sports Videos Using Natural Language

Vision+X: A Survey on Multimodal Learning in the Light of Data

Towards Universal Soccer Video Understanding

OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters

Large Language Models in Sport Science & Medicine: Opportunities, Risks and Considerations

A Survey on Benchmarks of Multimodal Large Language Models

SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs