Abstract: In the digital age, the rapid dissemination of information has made it harder to distinguish authentic news from disinformation. The challenge is particularly acute in regions experiencing geopolitical tensions, where information plays a pivotal role in shaping public perception and policy. The prevalence of disinformation in the Ukrainian-language information space, intensified by the hybrid war with russia, calls for sophisticated tools to detect and mitigate it. Our study introduces the "Online Learning with Sliding Windows for Text Classifier Ensembles" (OLTW-TEC) method to address this need. The research aims to develop and validate a machine learning method that adapts dynamically to evolving disinformation tactics, with a focus on an accurate, flexible, and efficient system for detecting disinformation in Ukrainian-language texts. OLTW-TEC combines an ensemble of classifiers with a sliding window technique that continuously updates the model on the most recent data, improving its adaptability and accuracy over time. A unique dataset comprising both authentic and fake news items was used to evaluate the method, and precision, recall, and F1-score were used to analyze its effectiveness. The method achieved a classification accuracy of 93%. The integration of the sliding window technique with a classifier ensemble contributed significantly to the system's ability to identify disinformation, making it a robust tool against fake news in the Ukrainian context and highlighting its potential as a versatile solution for disinformation detection. Its adaptability to the specifics of the Ukrainian language and to the dynamic nature of information warfare offers insights for developing similar tools for other languages and regions. The development and successful implementation of OLTW-TEC underscore the importance of innovative machine learning techniques in combating fake news and pave the way for further research and application in the field of digital information integrity.
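The abstract describes the core idea only at a high level: an ensemble of classifiers that is kept current by retraining on a sliding window of recent data. The sketch below is one minimal way to illustrate that idea and is not the authors' implementation. The class names (`NaiveBayes`, `NearestNeighbour`, `SlidingWindowEnsemble`), the choice of ensemble members, the majority vote, the window size, and the English toy headlines are all assumptions made for illustration; the paper's actual classifiers, features, and Ukrainian-language dataset are not specified in the abstract.

```python
from collections import Counter, deque
import math


def words(text):
    return text.lower().split()


def char_trigrams(text):
    s = text.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative member)."""

    def __init__(self, tokenize):
        self.tokenize = tokenize

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.token_counts[label].update(self.tokenize(text))
        self.vocab = set().union(*self.token_counts.values())
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n in self.class_counts.items():
            denom = sum(self.token_counts[c].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in self.tokenize(text):
                if tok in self.vocab:  # tokens unseen in the window are ignored
                    score += math.log((self.token_counts[c][tok] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


class NearestNeighbour:
    """1-nearest-neighbour on Jaccard word overlap (illustrative member)."""

    def fit(self, texts, labels):
        self.examples = [(set(words(t)), y) for t, y in zip(texts, labels)]
        return self

    def predict(self, text):
        query = set(words(text))

        def jaccard(tokens):
            union = query | tokens
            return len(query & tokens) / len(union) if union else 0.0

        return max(self.examples, key=lambda ex: jaccard(ex[0]))[1]


def make_members():
    # Assumed ensemble composition; three members avoid ties on two classes.
    return [NaiveBayes(words), NaiveBayes(char_trigrams), NearestNeighbour()]


class SlidingWindowEnsemble:
    """Retrains every member on the most recent `window_size` labelled items."""

    def __init__(self, make_members, window_size):
        self.make_members = make_members
        self.window = deque(maxlen=window_size)  # oldest items fall out
        self.members = []

    def update(self, batch):
        """Add new (text, label) pairs, then refit all members on the window."""
        self.window.extend(batch)
        texts = [t for t, _ in self.window]
        labels = [y for _, y in self.window]
        self.members = [m.fit(texts, labels) for m in self.make_members()]

    def predict(self, text):
        """Majority vote over members; call update() at least once first."""
        votes = Counter(m.predict(text) for m in self.members)
        return votes.most_common(1)[0][0]


# Toy stream (hypothetical English headlines standing in for real data).
batch1 = [("official report confirms budget vote", "real"),
          ("secret cure hidden by doctors", "fake"),
          ("parliament passes budget law", "real"),
          ("miracle cure doctors hate", "fake")]
batch2 = [("secret lab created the flood", "fake"),
          ("city council confirms flood repairs", "real"),
          ("hidden lab caused the flood", "fake"),
          ("officials report flood repairs done", "real")]

model = SlidingWindowEnsemble(make_members, window_size=4)
for batch in (batch1, batch2):
    model.update(batch)
print(model.predict("secret lab flood"))
```

With `window_size=4`, the second `update` call pushes the first batch out of the window, so the members are refit only on the newer topic. Refitting from scratch on each update keeps the sketch simple; a production system would more likely use incremental updates and a weighted vote, but the abstract does not say which approach OLTW-TEC takes.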

Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches

Knowledge-based Document Embedding for Cross-Domain Text Classification

Expanding the Text Classification Toolbox with Cross-Lingual Embeddings

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

About Methods for Classifying Hidden Language Concepts in Specialized Texts Involving Pseudoinverse, Clustering and Data Grouping

Universal Cross-Lingual Text Classification

Methods for Detoxification of Texts for the Russian Language

Unraveling Bi-Lingual Multi-feature Based Text Classification: A Case Study

Detection of Toxic Language in Short Text Messages

The Grammar and Syntax Based Corpus Analysis Tool For The Ukrainian Language

Three language political leaning text classification using natural language processing methods

Be My Donor. Transfer the NLP Datasets Between the Languages Using LLM

OLTW-TEC: online learning with sliding windows for text classifier ensembles

A survey on text classification: Practical perspectives on the Italian language

Multilingual text categorization and sentiment analysis: a comparative analysis of the utilization of multilingual approaches for classifying twitter data

Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models

T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification

Usage of the Speech Disfluency Detection Method for the Machine Translation of the Transcriptions of Spoken Language

From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation

UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

Exploring transfer learning for Deep NLP systems on rarely annotated languages