Application of L_ρ Norm Regularization Methods for Modelling Biological Systems

Kang Li, Padhraig Gormley, Shiji Song
DOI: https://doi.org/10.1109/icmlc.2009.5212751
2009-01-01
Abstract: In systems biology, molecular interactions are typically modelled using white-box differential equations based on mass action kinetics. Unfortunately, problems with dimensionality can arise when the number of molecular species in the system becomes very large, which makes transparent modelling and behavior simulation extremely difficult or computationally too expensive. As an alternative, data-driven identification of molecular interaction pathways using a black-box approach has recently been investigated. One of the main objectives in building black-box models, which in many cases are linear-in-the-parameters, is to produce a sparse model that effectively represents the system behavior. A popular approach is to select model terms one by one from a pool of candidates (basis functions), with an information criterion used to stop the selection process. The advantage is computational efficiency; the disadvantage is that the derived model is not necessarily sparse. An alternative approach is to add a penalty term on the parameters to the usual loss function, leading to improved sparseness and generalization performance of the derived model. Moreover, there is a positive probability that the model structure can be accurately picked up among a wide range of possibilities. Generally speaking, there are three l_ρ norm regularization methods: the Lasso (ρ = 1), Ridge (ρ = 2) and Bridge (0 < ρ < 1). In particular, the Lasso has been introduced into computational biology in recent years. This paper investigates the effectiveness of the three l_ρ regularization methods on the model identification of the MAPK signal transduction pathway, and simulation results are compared and analyzed.
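The penalized loss described in the abstract can be illustrated with a minimal sketch. It assumes a linear-in-the-parameters model y = Φθ + e and the objective ||y − Φθ||² + λ Σ|θ_j|^ρ. The symbols Φ, θ and λ, the function names, and the choice of solvers (a closed form for Ridge, plain coordinate descent for the Lasso) are illustrative assumptions, not taken from the paper. No Bridge solver is given because the objective is non-convex for 0 < ρ < 1; the loss function below can still evaluate it.

```python
import numpy as np


def lp_penalized_loss(theta, Phi, y, lam, rho):
    """Squared residual sum plus lam * sum(|theta_j| ** rho)."""
    residual = y - Phi @ theta
    return float(residual @ residual + lam * np.sum(np.abs(theta) ** rho))


def ridge_fit(Phi, y, lam):
    """Closed-form minimiser of the objective for rho = 2."""
    n_terms = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_terms), Phi.T @ y)


def soft_threshold(z, t):
    """Shrink z towards zero by t, clipping at exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_fit(Phi, y, lam, n_iter=200):
    """Coordinate descent for rho = 1 (assumes no all-zero column in Phi)."""
    theta = np.zeros(Phi.shape[1])
    col_sq = np.sum(Phi ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(Phi.shape[1]):
            # Residual with the contribution of term j added back in.
            r_j = y - Phi @ theta + Phi[:, j] * theta[j]
            z = Phi[:, j] @ r_j
            # With penalty lam * |theta_j|, the threshold is lam / 2.
            theta[j] = soft_threshold(z, lam / 2.0) / col_sq[j]
    return theta
```

With an identity design matrix, y = [2, 0.1] and λ = 1, `ridge_fit` scales both coefficients to [1, 0.05], whereas `lasso_fit` returns [1.5, 0]: the small coefficient is set exactly to zero, which is the sparsity effect the abstract refers to.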