Estimation of Amino Acid Residue Substitution Rates at Local Spatial Regions and Application in Protein Function Inference: A Bayesian Monte Carlo Approach

Yan Y. Tseng, Jie Liang
DOI: https://doi.org/10.1093/molbev/msj048
2006-01-13
Abstract: The amino acid sequences of proteins provide rich information for inferring distant phylogenetic relationships and for predicting protein functions. Estimating the rate matrix of residue substitutions from amino acid sequences is also important because the rate matrix can be used to develop scoring matrices for sequence alignment. Here we use a continuous time Markov process to model the substitution rates of residues and develop a Bayesian Markov chain Monte Carlo method for rate estimation. We validate our method using simulated artificial protein sequences. Because different local regions such as binding surfaces and the protein interior core experience different selection pressures due to functional or stability constraints, we use our method to estimate the substitution rates of local regions. Our results show that the substitution rates are very different for residues in the buried core and residues on the solvent-exposed surfaces. In addition, residues on the binding surfaces also have very different substitution rates from residues in the rest of the protein. Based on these findings, we further develop a method for protein function prediction by surface matching using scoring matrices derived from estimated substitution rates for residues located on the binding surfaces. We show with examples that our method is effective in identifying functionally related proteins that have overall low sequence identity, a task known to be very challenging.
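The link the abstract draws between a continuous time Markov process, a rate matrix, and a scoring matrix can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-letter toy alphabet, the numeric values, the function names, the reversible parametrization `q_ij = s_ij * pi_j`, and the log-odds score `log(P_ij(t) / pi_j)`. The Bayesian Markov chain Monte Carlo estimation step itself is not shown.

```python
import numpy as np


def build_rate_matrix(exchange, freqs):
    """Assumed reversible model: q_ij = s_ij * pi_j for i != j, rows sum to zero."""
    exchange = np.asarray(exchange, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    q = exchange * freqs[np.newaxis, :]
    np.fill_diagonal(q, 0.0)
    # Diagonal entries make every row sum to zero, as a rate matrix requires.
    np.fill_diagonal(q, -q.sum(axis=1))
    return q


def transition_matrix(q, t):
    """P(t) = exp(Q t), computed through an eigendecomposition of Q."""
    eigvals, eigvecs = np.linalg.eig(q)
    p = eigvecs @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(eigvecs)
    return p.real


def log_odds_scores(p, freqs):
    """Log-odds score of pairing i with j relative to chance: log(P_ij / pi_j)."""
    freqs = np.asarray(freqs, dtype=float)
    return np.log(p / freqs[np.newaxis, :])


# Hypothetical toy values for a three-letter alphabet (not real amino acid data).
exchange = np.array([[0.0, 1.0, 2.0],
                     [1.0, 0.0, 0.5],
                     [2.0, 0.5, 0.0]])
freqs = np.array([0.5, 0.3, 0.2])

Q = build_rate_matrix(exchange, freqs)
P = transition_matrix(Q, 0.5)
S = log_odds_scores(P, freqs)
```

Because the toy model is reversible, the resulting score matrix `S` is symmetric, which is the property usually wanted from an alignment scoring matrix. A real application would use the 20 amino acids and rates estimated from data.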
Biomolecules
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to search for Strange Quark Matter (SQM) in cosmic rays through the SLIM (Search for Light and Intermediate mass Monopoles) detector, especially low-mass and intermediate-mass SQM lumps, namely "strangelets" and "nuclearites". Specifically, the paper focuses on the following aspects:

1. **Explore the existence of Strange Quark Matter**: The paper discusses whether SQM could be the ground state of Quantum Chromodynamics (QCD) and assumes that these SQM lumps may exist in cosmic rays. The mass range of these lumps is from the mass of heavy nuclei to larger values.
2. **Evaluate the sensitivity of the SLIM detector**: SLIM is a large-area experiment installed in the Chacaltaya Laboratory in Bolivia for detecting penetrating cosmic rays. The paper discusses the sensitivity of the SLIM detector to strangelets and nuclearites and presents preliminary results.
3. **Study the properties and behavior of SQM**: The paper describes in detail the expected properties of SQM, including its density, the relationship between charge and mass, and their propagation behavior in the atmosphere. For low-mass SQM lumps, electrons will form an electron cloud around the quark core; for larger-mass lumps, all electrons will reach equilibrium inside SQM.
4. **Calculate the flux upper limit**: Based on the Dark Matter (DM) density assumption, the paper calculates the maximum flux upper limit of nuclearites. In addition, it also discusses the galactic propagation model of cosmic-ray nuclearites generated by the tidal disruption of binary strange stars.
5. **Analyze the data of the SLIM detector**: Through the testing of some SLIM modules, no candidate events of strangelets or nuclearites were found. Based on this, the paper gives the flux upper limit at the 90% confidence level, which is of great significance for restricting some production and propagation models.
### Summary of mathematical formulas

- **Upper limit of nuclearite flux**:
  \[ \Phi_{\text{max}} = \frac{\rho_{\text{DM}}\, v}{2\pi M} \]
  where $\rho_{\text{DM}}$ is the dark matter density, $v$ is the average speed of nuclearites, and $M$ is the mass of nuclearites.
- **Nuclearite deceleration formula**:
  \[ v(L) = v_0\, e^{-\frac{\sigma}{M}\int_0^L \rho(x)\,dx} \]
  where $v_0$ is the initial speed, $\rho(x)$ is the air density, $\sigma$ is the cross-section of nuclearites, and $L$ is the penetration depth.
- **Nuclearite cross-section formula**:
  \[ \sigma = \begin{cases} \pi\left(\frac{3M}{4\pi\rho_N}\right)^{2/3} & \text{for } M \geq 8.4\times 10^{14}\text{ GeV} \\ \pi\times 10^{-16}\text{ cm}^2 & \text{for lower mass nuclearites} \end{cases} \]
  where $\rho_N = 3.6\times 10^{14}\text{ g/cm}^3$, the value at which the two branches agree at the mass threshold.

### Conclusion

Through the research of the SLIM detector, this paper provides the latest data and theoretical analysis on the existence of strangelets and nuclearites in cosmic rays. Although no definite candidate events have been found so far, the research sets an important flux upper limit for future work and provides valuable references for further exploration in this field.
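The three formulas in the summary above can be evaluated numerically with a short sketch. The function names and every numeric input are assumptions chosen for illustration, not values quoted from the paper: a local dark matter density of 0.3 GeV/cm^3, a speed of 3e7 cm/s, an atmospheric column of 540 g/cm^2, and a nuclear density of 3.6e14 g/cm^3, picked so that the two cross-section branches join near the stated mass threshold. The flux comes out in cm^-2 s^-1 sr^-1 if the factor 2π is read as a solid angle, which is also an assumption.

```python
import math

GEV_TO_GRAM = 1.78266192e-24        # mass of 1 GeV/c^2 expressed in grams
SIGMA_ATOMIC_CM2 = math.pi * 1e-16  # constant branch of the cross-section
MASS_THRESHOLD_GEV = 8.4e14         # branch point stated in the summary


def flux_upper_limit(rho_dm_gev_cm3, v_cm_s, mass_gev):
    """Phi_max = rho_DM * v / (2 * pi * M), with M in GeV."""
    return rho_dm_gev_cm3 * v_cm_s / (2.0 * math.pi * mass_gev)


def cross_section_cm2(mass_gev, rho_n_g_cm3=3.6e14):
    """Piecewise cross-section; rho_n_g_cm3 defaults to an assumed value."""
    if mass_gev >= MASS_THRESHOLD_GEV:
        mass_g = mass_gev * GEV_TO_GRAM
        radius_cubed = 3.0 * mass_g / (4.0 * math.pi * rho_n_g_cm3)
        return math.pi * radius_cubed ** (2.0 / 3.0)
    return SIGMA_ATOMIC_CM2


def velocity_after_column(v0_cm_s, sigma_cm2, mass_gev, column_g_cm2):
    """v = v0 * exp(-sigma * X / M), where X is the integral of rho(x) dx."""
    mass_g = mass_gev * GEV_TO_GRAM
    return v0_cm_s * math.exp(-sigma_cm2 * column_g_cm2 / mass_g)


# Hypothetical inputs, for illustration only.
mass_gev = 1e15
v0 = 3e7
phi_max = flux_upper_limit(0.3, v0, mass_gev)
sigma = cross_section_cm2(mass_gev)
v_detector = velocity_after_column(v0, sigma, mass_gev, 540.0)
```

With these inputs the exponent in the deceleration formula is of order 1e-4, so a nuclearite of this mass would reach the detector with almost its full initial speed.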