Abstract: Analyzing probabilistic programs and randomized algorithms is a classical problem in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to consider concentration bounds, i.e., showing that large deviations from the mean have small probability. Similarly, in the context of probabilistic programs and randomized algorithms, the analysis of expected termination time/running time and their concentration bounds are fundamental problems. In this work, we focus on concentration bounds for probabilistic programs and probabilistic recurrences of randomized algorithms. For probabilistic programs, the basic technique to achieve concentration bounds is to consider martingales and apply the classical Azuma's inequality. For probabilistic recurrences of randomized algorithms, Karp's classical "cookbook" method, which is similar to the master theorem for recurrences, is the standard approach to obtain concentration bounds. In this work, we propose a novel approach for deriving concentration bounds for probabilistic programs and probabilistic recurrence relations through the synthesis of exponential supermartingales. For probabilistic programs, we present algorithms for synthesis of such supermartingales in several cases. We also show that our approach can derive better concentration bounds than simply applying the classical Azuma's inequality over various probabilistic programs considered in the literature. For probabilistic recurrences, our approach can derive tighter bounds than Karp's well-established method on classical algorithms. Moreover, we show that our approach can derive bounds comparable to the optimal bound for quicksort, proposed by McDiarmid and Hayward. We also present a prototype implementation that can automatically infer these bounds.
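
To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of what a concentration bound from Azuma's inequality looks like. It is not the paper's method or its prototype. It assumes the simplest textbook martingale, a symmetric random walk with steps of +1 or -1, and compares the exact tail probability with the Azuma bound exp(-t^2 / (2 n c^2)) for increments bounded by c. The function names `exact_tail` and `azuma_bound` are hypothetical and chosen only for illustration.

```python
import math


def exact_tail(n: int, t: int) -> float:
    """Exact P(S_n >= t) for a symmetric +/-1 random walk S_n started at 0."""
    # S_n = 2*H - n with H ~ Binomial(n, 1/2), so S_n >= t iff H >= (n + t) / 2.
    h_min = math.ceil((n + t) / 2)
    return sum(math.comb(n, h) for h in range(h_min, n + 1)) / 2 ** n


def azuma_bound(n: int, t: float, c: float = 1.0) -> float:
    """Azuma's upper bound on P(S_n - S_0 >= t) when every increment is bounded by c."""
    return math.exp(-t * t / (2 * n * c * c))


if __name__ == "__main__":
    n = 100  # number of steps (illustrative choice)
    for t in (10, 20, 30):
        # The exact tail never exceeds the Azuma bound; the gap shows the slack.
        print(t, exact_tail(n, t), azuma_bound(n, t))
```

The gap between the two printed columns illustrates the kind of slack that motivates looking for tighter bounds than a direct application of Azuma's inequality.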

Calculating complexity of large randomized libraries

Why high-error-rate random mutagenesis libraries are enriched in functional and improved proteins

Estimating the Number of Essential Genes in Random Transposon Mutagenesis Libraries

Designing gene libraries from protein profiles for combinatorial protein experiments

Sampling Strategies for Experimentally Mapping Molecular Fitness Landscapes Using High-Throughput Methods

Calculation of Relative Binding Free Energy for Mutations in Protein Complexes: The Alchemical Path

Quantification of the effect of mutations using a global probability model of natural sequence variation

Estimation of demography and mutation rates from one million haploid genomes

Synthesis cost-optimal targeted mutant protein libraries

Statistical Methods for Estimating Complexity from Competition Experiments between Two Populations

Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations

Hierarchy and extremes in selections from pools of randomized proteins

A Statistical Mechanical Approach to Combinatorial Chemistry

GGAssembler: precise and economical design and synthesis of combinatorial mutation libraries

Statistical distributions of sequencing by synthesis with probabilistic nucleotide incorporation

Methods to Estimate Cryptic Sequence Complexity

Using BEAN-counter to quantify genetic interactions from multiplexed barcode sequencing experiments

Improving the Accuracy of Bulk Fitness Assays by Correcting Barcode Processing Biases

Rare-Event Sampling Analysis Uncovers the Fitness Landscape of the Genetic Code

Counting unique molecular identifiers in sequencing using a multitype branching process with immigration

Estimating Clonality