24
feb
Identification of non-coding RNAs in the genome
In contrast to the fairly reliable and complete annotation of protein-coding genes in the human genome, comprehensive information on non-coding RNAs (ncRNAs) was lacking until very recently. Even now, the quality and coverage of ncRNA annotation are not comparable to those of protein-coding genes.
Recently, we completed a comparative screen of vertebrate genomes for structural non-coding RNAs, which evaluates conserved genomic DNA sequences for signatures of structural conservation of base-pairing patterns and exceptional thermodynamic stability. This study has been complemented by corresponding computational efforts on a variety of other clades, including urochordates, insects, nematodes, yeasts, and trypanosomatids, as well as a detailed study of the ENCODE regions.
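To make the idea of a conserved base-pairing pattern concrete, the following is a minimal sketch and not the method used in the screens themselves. It assumes a toy alignment and a consensus structure in dot-bracket notation, both invented for illustration, and the helper names `paired_columns` and `pair_support` are hypothetical. For each paired column it reports how many sequences can still form a pair and how many distinct pair types occur; more than one pair type at a fully paired position indicates compensatory mutations, which is the kind of signal that points to a conserved structure.

```python
# Watson-Crick and G-U wobble pairs that can form in RNA.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket string."""
    stack, pairs = [], []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.append((stack.pop(), idx))
    return sorted(pairs)


def pair_support(alignment, structure):
    """Summarise, per paired column, how well the alignment supports the pair."""
    results = []
    for i, j in paired_columns(structure):
        observed = [(seq[i], seq[j]) for seq in alignment]
        valid = [p for p in observed if p in PAIRS]
        results.append({
            "columns": (i, j),
            # Share of sequences in which the two columns can pair.
            "fraction_paired": len(valid) / len(alignment),
            # Distinct pair types; >1 means compensatory mutations occurred.
            "pair_types": len(set(valid)),
        })
    return results


# Hypothetical gap-free toy alignment of three homologous sequences.
alignment = [
    "GGCAAAAGCC",
    "GACAAAAGUC",
    "GGUAAAAACC",
]
structure = "(((....)))"  # assumed consensus structure: a 3-bp hairpin

for row in pair_support(alignment, structure):
    print(row)
```

In this toy case every sequence can form all three pairs, and the two inner pairs are each realised by two different pair types. A real screen would also have to handle gaps, weight sequences by phylogenetic distance, and assess thermodynamic stability, none of which is attempted here.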
In this presentation I will focus on the comparative genomics techniques used to perform these screens and on a comparison of the major results:
In human, we find more than 30,000 structured RNA elements, almost 1000 of which are conserved across all vertebrates. In contrast, the number of detectable structured ncRNAs in invertebrate genomes appears to be at least one order of magnitude smaller, with an even smaller list of candidates in unicellular organisms. In vertebrates, roughly a third of the candidates are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions of protein-coding mRNAs, and about half are located far from any known gene. Only a small fraction of these sequences has been described previously. A comparison with recent tiling array data shows that more than 40% of the predicted structured human RNAs overlap with experimentally detected sites of transcription. The widespread conservation of secondary structure points to a large number of functional ncRNAs and cis-acting mRNA structures in the human genome.