Multi-Industry Simplex 2.0: Temporally-Evolving Probabilistic Industry Classification

Maksim Papenkov
2024-07-23
Abstract: Accurate industry classification is critical for many areas of portfolio management, yet the traditional single-industry framework of the Global Industry Classification Standard (GICS) struggles to comprehensively represent risk for highly diversified multi-sector conglomerates like Amazon. Previously, we introduced the Multi-Industry Simplex (MIS), a probabilistic extension of GICS that utilizes topic modeling, a natural language processing approach. Although our initial version, MIS-1, improved upon GICS by providing multi-industry representations, it relied on an overly simple architecture that required prior knowledge of the number of industries and made the unrealistic assumption that industries are uncorrelated and independent over time. We improve upon this model with MIS-2, which addresses three key limitations of MIS-1: we utilize Bayesian Non-Parametrics to automatically infer the number of industries from data, we employ Markov Updating to account for industries that change over time, and we adjust for correlated and hierarchical industries, allowing for both broad and niche industries (similar to GICS). Further, we provide an out-of-sample test directly comparing MIS-2 and GICS on the basis of future correlation prediction, where we find evidence that MIS-2 provides a measurable improvement over GICS. MIS-2 provides portfolio managers with a more robust tool for industry classification, empowering them to more effectively identify and manage risk, particularly around multi-sector conglomerates in a rapidly evolving market in which new industries periodically emerge.
Portfolio Management
What problem does this paper attempt to address?
The paper addresses the limitations of existing industry classification systems when applied to modern diversified enterprises, especially multi-sector conglomerates such as Amazon. Specifically, it targets the following issues:

1. **Limitations of Traditional Industry Classification Systems**:
   - The dominant Global Industry Classification Standard (GICS) assigns each company to a single industry, which cannot comprehensively represent risk for highly diversified, multi-sector enterprises.
   - This single-industry framework cannot capture the actual business composition of a company like Amazon, or the risks associated with each industry it operates in.
2. **Introduction of Multi-Industry Simplex 2.0 (MIS-2)**:
   - MIS-2 improves on the earlier MIS-1, a probabilistic extension of GICS built on topic modeling, a natural language processing approach.
   - MIS-2 addresses three limitations of MIS-1:
     - It uses Bayesian non-parametric methods to infer the number of industries from the data, so the number need not be known in advance.
     - It employs Markov updating to account for industries that change over time.
     - It adjusts for correlated and hierarchical industries, allowing for both broad and niche industries (similar to GICS).
3. **Empirical Comparison and Validation**:
   - The paper reports an out-of-sample test that directly compares MIS-2 and GICS on predicting future correlation, and finds evidence that MIS-2 provides a measurable improvement over GICS.
   - MIS-2 gives portfolio managers a more robust tool for identifying and managing risk, particularly around multi-sector conglomerates in a rapidly evolving market in which new industries periodically emerge.
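The first MIS-2 improvement listed above, inferring the number of industries from the data, can be illustrated with a generic Bayesian non-parametric construction. The text here does not say which prior MIS-2 uses, so the sketch below assumes a Dirichlet process with a truncated stick-breaking representation purely for illustration. The function name `stick_breaking_weights`, the concentration `alpha`, the truncation level, and the 1% threshold are all hypothetical choices, not values from the paper.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking weights (assumed Dirichlet process prior)."""
    # Each beta is the fraction of the remaining stick broken off for component k.
    betas = rng.beta(1.0, alpha, size=truncation)
    # Stick length remaining before each break: prod_{j<k} (1 - beta_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)

# Count components holding a non-negligible share; the 1% cutoff is arbitrary.
effective = int((weights > 0.01).sum())
print(f"total mass covered: {weights.sum():.3f}")
print(f"components above 1%: {effective}")
```

Only a handful of the 50 candidate components typically receive meaningful weight, which is the sense in which the number of "industries" is left for the model to determine rather than fixed up front. A larger `alpha` spreads mass over more components.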
In summary, the paper aims to improve the existing single-industry classification method by introducing a more flexible and accurate probabilistic industry classification system to better accommodate the diversified nature of modern enterprises.
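To make the contrast between single-industry and probabilistic classification concrete, here is a minimal sketch. It is not the paper's model: the industry labels, the weights, and the helper names (`one_hot`, `on_simplex`, `cosine_similarity`) are assumptions chosen only for illustration, and cosine similarity stands in for whatever comparison the paper actually uses.

```python
import numpy as np

# Hypothetical industry labels, chosen only for illustration.
industries = ["Retail", "Cloud Computing", "Media", "Logistics"]

def one_hot(label):
    """Single-industry (GICS-style) vector: all weight on one industry."""
    vec = np.zeros(len(industries))
    vec[industries.index(label)] = 1.0
    return vec

def on_simplex(weights):
    """Normalize non-negative weights to sum to 1 (a point on the simplex)."""
    w = np.asarray(weights, dtype=float)
    if (w < 0).any() or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    return w / w.sum()

def cosine_similarity(a, b):
    """Cosine of the angle between two industry vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The same conglomerate under the two representations (made-up weights).
conglomerate_single = one_hot("Retail")
conglomerate_multi = on_simplex([5, 3, 1, 1])
cloud_peer = one_hot("Cloud Computing")

# Single-industry view: no overlap with a pure cloud company.
print(cosine_similarity(conglomerate_single, cloud_peer))
# Multi-industry view: the shared cloud exposure shows up as nonzero similarity.
print(cosine_similarity(conglomerate_multi, cloud_peer))
```

Under the one-hot view the conglomerate and the cloud company share nothing, while the simplex view registers their common exposure. That overlap is the kind of information a single-industry label discards.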