What problem does this paper attempt to address?

The problem this paper addresses is detecting clickbait in YouTube videos. Specifically, the author focuses on using machine learning and deep learning techniques to identify videos whose titles, descriptions, or thumbnails do not match the actual content. Such videos are designed to attract clicks in order to increase view counts and creator income.

### Problem Background

As the Internet has grown, more and more people rely on it for information. Many platforms allow anyone to publish content, but the authenticity of that content cannot be guaranteed. On video-sharing platforms such as YouTube in particular, some creators use misleading titles, descriptions, or thumbnails to attract clicks and thereby increase views and income. This practice, called "clickbait", not only wastes users' time but may also undermine trust in the platform.

### Research Objectives

The author aims to detect clickbait videos on YouTube by experimenting with a variety of machine learning techniques applied to text features (such as titles and descriptions). The specific research objectives are:

1. **Identify clickbait videos**: Analyze video content such as titles, descriptions, thumbnails, and comments to determine whether a video is clickbait.
2. **Improve detection accuracy**: Experiment with different machine learning and deep learning models to find the most effective detection method.
3. **Explore multimodal features**: In addition to text features, consider statistical features (such as the number of likes and the number of comments) to improve detection.

### Method Overview

The author used several machine learning and deep learning models and combined them with different feature extraction methods, mainly:

- **Logistic Regression with Word2Vec**: Use Word2Vec to generate word vectors and combine them with video metadata features for classification.
- **Random Forest with Word2Vec**: A random forest classifier built on Word2Vec embeddings and trained with additional features.
- **MLP with Word2Vec**: Use a multi-layer perceptron (MLP) to process Word2Vec-embedded text features together with metadata features.
- **MLP with BERT**: Use BERT to generate contextual word vectors and combine them with metadata features for classification.
- **MLP with DistilBERT**: Use DistilBERT (a lightweight version of BERT) for the same task.

### Results and Conclusions

Through experiments, the author found that:

- Using more features (such as titles, descriptions, the number of likes, and the number of comments) significantly improves detection accuracy.
- The random forest model performs best when all features are combined, reaching an accuracy of 92.5%.
- Pretrained language models such as BERT and DistilBERT perform well on natural language tasks and capture richer semantic information.

In conclusion, this research provides an effective method for automatically detecting clickbait videos on YouTube, which helps improve user experience and platform trustworthiness.
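To make the first model in the method overview more concrete, here is a minimal sketch of the general idea: average word vectors for a title, append metadata counts, and fit a logistic regression. It is not the author's implementation. The tiny embedding table is a hand-made stand-in for trained Word2Vec vectors, the training rows are invented, and the names `TOY_EMBEDDINGS`, `embed_title`, `build_features`, and the divide-by-10 scaling are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional vectors standing in for trained Word2Vec embeddings.
# The words and values are invented for illustration only.
TOY_EMBEDDINGS = {
    "shocking": [0.9, 0.1, 0.0],
    "unbelievable": [0.8, 0.2, 0.1],
    "secret": [0.7, 0.0, 0.2],
    "tutorial": [0.0, 0.9, 0.1],
    "lecture": [0.1, 0.8, 0.0],
    "review": [0.1, 0.7, 0.2],
}
DIM = 3


def embed_title(title):
    """Average the vectors of known words; return zeros if none are known."""
    vectors = [TOY_EMBEDDINGS[w] for w in title.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def build_features(title, likes, comments):
    """Concatenate the title embedding with scaled metadata counts."""
    # log1p compresses large counts; dividing by 10 is an arbitrary choice
    # that keeps the metadata on a scale similar to the embedding values.
    return embed_title(title) + [math.log1p(likes) / 10, math.log1p(comments) / 10]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a feature vector belongs to a clickbait video."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    weights, bias, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for j, value in enumerate(features):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical labelled rows: (title, likes, comments, is_clickbait).
TRAIN = [
    ("shocking secret", 120, 300, 1),
    ("unbelievable shocking", 90, 250, 1),
    ("secret unbelievable", 150, 400, 1),
    ("tutorial lecture", 800, 60, 0),
    ("lecture review", 950, 75, 0),
    ("review tutorial", 700, 50, 0),
]

X = [build_features(title, likes, comments) for title, likes, comments, _ in TRAIN]
y = [label for *_, label in TRAIN]
weights, bias = train_logistic_regression(X, y)

for title, likes, comments in [
    ("shocking unbelievable secret", 100, 280),
    ("tutorial review lecture", 850, 65),
]:
    p = predict_proba(weights, bias, build_features(title, likes, comments))
    print(f"{title!r}: P(clickbait) = {p:.2f}")
```

In practice the embedding table would come from a trained Word2Vec model and the classifier from a standard library. The random forest and MLP variants in the list would reuse the same feature vector and swap only the classifier.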

Clickbait Detection in YouTube Videos
