Separating biological variance from noise by applying EM algorithm to modified General Linear Model

Tien-Wen Lee
DOI: https://doi.org/10.1101/2024.09.29.615661
2024-10-01
Abstract: Introduction: The General Linear Model (GLM) has been widely used in research, with its error term treated as noise. However, compelling evidence suggests that in biological systems the target variables may possess innate variance of their own. Methods: A modified GLM was proposed to explicitly model both biological variance and non-biological noise. An Expectation-Maximization (EM) scheme, termed EMSEV (EM for Separating Variances), was employed to distinguish the biological variance from the noise. The performance of EMSEV was evaluated by varying the noise level, the dimensions of the design matrix, and the covariance structure of the target variables. Results: The deviation between EMSEV outputs and the pre-defined distribution parameters increased with noise level. With a proper initial guess, when the noise magnitude and the variance of the target variables were similar, the estimated mean and covariance of the target variables deviated by 3% and 10 to 16%, respectively, and the noise estimate deviated by 1.7%. Conclusion: EMSEV appears promising for distinguishing signal variance from noise in biological systems. The potential applications and implications in biological science and statistical inference are discussed.
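The idea in the Methods can be illustrated with a minimal sketch. The abstract does not give the model equations, so the code below assumes one common formulation rather than the author's exact one: each unit i has observations y_i = X b_i + e_i, where the target variables b_i ~ N(mu, Sigma) carry the biological variance and e_i ~ N(0, sigma2 * I) is the noise. The function name `em_separate_variances`, the shared design matrix, the Gaussian assumptions, the least-squares initial guess, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def em_separate_variances(Y, X, n_iter=200):
    """EM for the ASSUMED model y_i = X b_i + e_i,
    with b_i ~ N(mu, Sigma) and e_i ~ N(0, sigma2 * I).

    Y: (N, n) array, one row of n observations per unit.
    X: (n, p) design matrix, assumed shared by all units.
    Returns the estimates (mu, Sigma, sigma2).
    """
    N, n = Y.shape
    p = X.shape[1]
    XtX = X.T @ X

    # Initial guess from per-unit ordinary least squares (an assumption).
    B = np.linalg.lstsq(X, Y.T, rcond=None)[0].T  # shape (N, p)
    mu = B.mean(axis=0)
    Sigma = np.atleast_2d(np.cov(B, rowvar=False)) + 1e-6 * np.eye(p)
    sigma2 = np.mean((Y - B @ X.T) ** 2) + 1e-6

    for _ in range(n_iter):
        # E-step: Gaussian posterior of each b_i given y_i.
        Sigma_inv = np.linalg.inv(Sigma)
        V = np.linalg.inv(Sigma_inv + XtX / sigma2)      # shared posterior covariance
        M = (Sigma_inv @ mu + (Y @ X) / sigma2) @ V      # posterior means, shape (N, p)

        # M-step: update mean, biological covariance, and noise variance.
        mu = M.mean(axis=0)
        D = M - mu
        Sigma = D.T @ D / N + V
        resid = Y - M @ X.T
        sigma2 = (np.sum(resid ** 2) + N * np.trace(XtX @ V)) / (N * n)

    return mu, Sigma, sigma2


# Synthetic check with pre-defined parameters (illustrative values only).
rng = np.random.default_rng(0)
N, n, p = 2000, 20, 2
X = rng.normal(size=(n, p))
mu_true = np.array([1.0, -2.0])
Sigma_true = np.array([[1.0, 0.3], [0.3, 0.5]])
sigma2_true = 1.0
B_true = rng.multivariate_normal(mu_true, Sigma_true, size=N)
Y = B_true @ X.T + rng.normal(scale=np.sqrt(sigma2_true), size=(N, n))

mu_hat, Sigma_hat, sigma2_hat = em_separate_variances(Y, X)
print("mu:", mu_hat)
print("Sigma:", Sigma_hat)
print("sigma2:", sigma2_hat)
```

In this sketch the E-step attributes part of the spread in each unit's fitted coefficients to noise, so the M-step estimate of Sigma is not inflated the way a plain covariance of least-squares coefficients would be. Comparing the printed estimates against the pre-defined values mirrors, in a simplified way, the kind of evaluation the abstract describes.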
Bioinformatics