News and Load: A Quantitative Exploration of Natural Language Processing Applications for Forecasting Day-ahead Electricity System Demand

Yun Bai, Simon Camal, Andrea Michiorri
DOI: https://doi.org/10.48550/arXiv.2301.07535
2024-01-30
Abstract: The relationship between electricity demand and weather is well established in power systems, along with the importance of behavioral and social aspects such as holidays and significant events. This study explores the link between electricity demand and more nuanced information about social events. This is done using mature Natural Language Processing (NLP) and demand forecasting techniques. The results indicate that day-ahead forecasts are improved by textual features such as word frequencies, public sentiments, topic distributions, and word embeddings. The social events contained in these features include global pandemics, politics, international conflicts, transportation, etc. Causality effects and correlations are discussed to propose explanations for the mechanisms behind the links highlighted. This study is believed to bring a new perspective to traditional electricity demand analysis. It confirms the feasibility of improving forecasts from unstructured text, with potential consequences for sociology and economics.
Computation and Language, Artificial Intelligence, Computers and Society
### What problems does this paper attempt to solve?

This paper aims to explore the application of natural language processing (NLP) techniques in power demand forecasting, especially how to use text information (such as news reports) to improve day-ahead power demand forecasting. Specifically, the main research objectives include:

1. **Verify whether the information extracted from the news can improve power demand forecasting**:
   - The research assumes that social factors (such as global events, political developments, international conflicts, etc.) have an impact on power demand, and these factors can be reflected through text data such as news.
2. **Develop a complete forecasting chain integrating text and other structured data**:
   - Combine the traditional forecasting model based on meteorological and calendar data with the features extracted from news texts to form a more comprehensive forecasting framework.
3. **Explain the mechanisms behind the improved performance**:
   - Explain why text features can enhance the forecasting effect from the perspectives of global, local, and causal relationships. By analyzing different types of text features (such as word frequency, sentiment analysis, topic distribution, word embeddings, etc.), explore their associations with power demand.

### Research background and motivation

Traditionally, power demand forecasting mainly relies on meteorological data (such as temperature) and human activity patterns (such as weekdays and weekends). However, in recent years, the influence of social and economic factors (such as the global pandemic, climate change, international conflicts, etc.) on power demand has become more and more significant. These factors can usually be captured through text data such as news reports. Therefore, researchers hope to introduce NLP techniques to extract valuable information from news texts to improve the accuracy of power demand forecasting.

### Method overview

To achieve the above-mentioned goals, the researchers adopted a multi-step method:

1. **Data acquisition and pre-processing**:
   - Collect power load, air temperature, calendar data, and news texts.
   - Pre-process the text data, including word segmentation, removing stop words, converting to lowercase, and removing irrelevant information.
2. **Feature extraction**:
   - Use statistical methods (such as word frequency, sentence length, etc.), semantic methods (such as sentiment analysis, topic distribution), and representation methods (such as word embeddings) to extract features from the text.
   - Screen out features useful for forecasting through the Granger causality test.
3. **Model construction and evaluation**:
   - Use the ExtraTrees regression model for day-ahead power demand forecasting.
   - Evaluate the model performance and use indicators such as RMSE, MAE, and SMAPE to measure the forecasting error.
   - Explain the behavior and causal relationships of the model through Local Interpretable Model-agnostic Explanations (LIME) and double machine learning (Double ML).

### Main contributions

- **Verify that news text information can improve power demand forecasting**.
- **Develop a complete forecasting framework that combines text and other structured data**.
- **Provide a detailed explanation of the mechanisms behind the improved performance**, including analysis from the perspectives of global, local, and causal relationships.

Through these efforts, researchers hope to provide a new perspective for the power system, helping it better cope with uncertain social events and improve the accuracy and reliability of power demand forecasting.
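
The text pre-processing step and the error metrics named in the method overview can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the function names, and the sample values are assumptions made for illustration, and the SMAPE variant shown (mean absolute error divided by the average of the absolute actual and forecast values, in percent) is only one of several definitions in use.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only;
# the list actually used in the paper is not given in this summary.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "is", "are"}


def preprocess(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def word_frequencies(texts):
    """Count token occurrences across a collection of news texts."""
    counts = Counter()
    for text in texts:
        counts.update(preprocess(text))
    return counts


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE in percent (assumed variant, see lead-in).

    Assumes actual and forecast are never both zero at the same step,
    which holds for system-level electricity load.
    """
    terms = [
        abs(a - f) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast)
    ]
    return 100 * sum(terms) / len(terms)


if __name__ == "__main__":
    headlines = ["Strike on the railway", "Railway strike ends"]
    print(word_frequencies(headlines))

    load = [100.0, 200.0]      # made-up demand values
    predicted = [110.0, 190.0]  # made-up forecasts
    print(rmse(load, predicted), mae(load, predicted), smape(load, predicted))
```

In a fuller pipeline, daily word counts like these would be one of several text feature families joined to temperature and calendar features before fitting the regression model; the sketch deliberately stops short of that step.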