Frontiers in Mathematical Biology

Asymptotic distribution of motifs in a stochastic context-free grammar model of RNA folding

Svetlana Poznanovikj

Georgia Institute of Technology


Some recent methods for predicting RNA secondary structures are based on stochastic context-free grammars (SCFGs). We analyze one of the most notable SCFGs, the grammar used in the prediction program Pfold. In particular, we show that the numbers of base pairs, helices, and various types of loops in RNA secondary structures generated by this SCFG are asymptotically Gaussian for a generic choice of the grammar probabilities. Our proofs are based on singularity analysis of probability generating functions.