Optimal ETF Selection for Passive Investing

David Puelz, Carlos M. Carvalho, P. Richard Hahn
DOI: https://doi.org/10.48550/arXiv.1510.03385
2015-11-28
Abstract: This paper considers the problem of isolating a small number of exchange traded funds (ETFs) that suffice to capture the fundamental dimensions of variation in U.S. financial markets. First, the data is fit to a vector-valued Bayesian regression model, which is a matrix-variate generalization of the well known stochastic search variable selection (SSVS) of George and McCulloch (1993). ETF selection is then performed using the decoupled shrinkage and selection (DSS) procedure described in Hahn and Carvalho (2015), adapted in two ways: to the vector-response setting and to incorporate stochastic covariates. The selected set of ETFs is obtained under a number of different penalty and modeling choices. Optimal portfolios are constructed from selected ETFs by maximizing the Sharpe ratio posterior mean, and they are compared to the (unknown) optimal portfolio based on the full Bayesian model. We compare our selection results to popular ETF advisor Wealthfront.com. Additionally, we consider selecting ETFs by modeling a large set of mutual funds.
Statistical Finance, Applications
What problem does this paper attempt to address?
This paper asks how to select, from a large universe of exchange-traded funds (ETFs), a small subset that captures the fundamental dimensions of variation in U.S. financial markets, so that individual investors can invest passively with only a few funds. The data are first fit to a vector-valued Bayesian regression model, a matrix-variate generalization of the stochastic search variable selection (SSVS) method of George and McCulloch (1993). ETFs are then selected with the decoupled shrinkage and selection (DSS) procedure of Hahn and Carvalho (2015), adapted in two ways: it is extended to the vector-response setting, and it incorporates stochastic covariates, meaning the design matrix is treated as random at the selection stage. The second adaptation matters in the investment context because future ETF returns are unknown at the time of selection.

Finally, optimal portfolios are constructed from the selected ETFs by maximizing the posterior mean of the Sharpe ratio, and they are compared with the (unknown) optimal portfolio based on the full Bayesian model. Overall, the research aims to give individual investors a statistically grounded way to pick a few ETFs out of many, achieving diversification and spreading risk while keeping investment costs low.
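
The Sharpe-ratio step can be made concrete with a minimal sketch. It is not the paper's procedure: it uses plug-in values for the mean and covariance of excess returns rather than averaging the Sharpe ratio over posterior draws, and the function names `max_sharpe_weights` and `sharpe_ratio`, the three-asset setup, and all numbers are illustrative assumptions. It relies on the standard result that, absent constraints, the weights maximizing w'μ / sqrt(w'Σw) are proportional to Σ⁻¹μ.

```python
import numpy as np


def max_sharpe_weights(mu, sigma):
    """Unconstrained maximum-Sharpe weights, proportional to inv(sigma) @ mu.

    Assumes mu holds expected excess returns, sigma is positive definite,
    and the raw weights have a positive sum so normalizing keeps their sign.
    """
    raw = np.linalg.solve(sigma, mu)  # solves sigma @ raw = mu
    return raw / raw.sum()            # rescale so the weights sum to 1


def sharpe_ratio(w, mu, sigma):
    """Sharpe ratio of portfolio w: expected excess return over volatility."""
    return float(w @ mu / np.sqrt(w @ sigma @ w))


# Toy inputs (hypothetical, not taken from the paper).
mu = np.array([0.05, 0.03, 0.04])
sigma = np.array([
    [0.04, 0.01, 0.000],
    [0.01, 0.02, 0.005],
    [0.00, 0.005, 0.030],
])

w = max_sharpe_weights(mu, sigma)
equal = np.ones(3) / 3

print("weights:", np.round(w, 4))
print("Sharpe (optimized):", sharpe_ratio(w, mu, sigma))
print("Sharpe (equal weight):", sharpe_ratio(equal, mu, sigma))
```

In a Bayesian setting closer to the one described above, one would presumably evaluate `sharpe_ratio` on each posterior draw of the mean and covariance and average the results, then search over weights numerically; the closed form here applies only to the plug-in case.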