Institute of Molecular Biology and Biotechnologies, Azerbaijan National Academy of Sciences
The computer program TSSPlant was used to search for putative transcription start sites (TSSs), i.e., promoters, in the [-1000:+101] regions (+1 is the annotated gene start) of 22,258, 23,330, 17,896, 18,226, 17,645, 38,702 and 11,035 (in total, 149,092) protein-coding genes from the monocots Oryza sativa and Zea mays and the dicots Arabidopsis thaliana, Glycine max, Medicago truncatula, Populus trichocarpa and Vitis vinifera, respectively. At least one potential TSS was predicted for every gene. A comparative analysis of these TSSs by promoter class, for all genes as well as for plastid or mitochondrial genes only, revealed that in all plants TATA-less promoters prevail over TATA promoters (~70% TATA-less vs ~30% TATA). Taking, for every gene, only the predicted TSS located closest to the annotated gene start (TSSp), an analysis of the distances between TSSp and the gene start showed that for at least 70% of genes this distance is less than 100 bp. These findings indicate that the prediction accuracy of the TSSPlant program is quite high.
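The closest-TSS distance analysis described above can be illustrated with a minimal Python sketch. It is not the TSSPlant program or its output format. The function names (closest_tss, fraction_within), the dictionary layout, and the toy positions are hypothetical. Positions are assumed to be signed offsets in bp from the annotated gene start, which is a simplification of the +1 numbering used in the text.

```python
from typing import Dict, List


def closest_tss(offsets: List[int]) -> int:
    """Return the predicted TSS offset nearest to the annotated gene start.

    Offsets are signed distances in bp from the gene start (an assumed
    convention for this sketch); ties go to the first offset listed.
    """
    return min(offsets, key=abs)


def fraction_within(predictions: Dict[str, List[int]], max_dist: int = 100) -> float:
    """Fraction of genes whose closest predicted TSS (TSSp) lies
    less than max_dist bp from the annotated gene start."""
    genes = [gene for gene, offsets in predictions.items() if offsets]
    if not genes:
        return 0.0
    hits = sum(1 for gene in genes if abs(closest_tss(predictions[gene])) < max_dist)
    return hits / len(genes)


# Made-up offsets for four hypothetical genes (not real predictions).
toy_predictions = {
    "geneA": [-850, -42, 60],
    "geneB": [-400, -230],
    "geneC": [-15],
    "geneD": [-999, -120],
}

if __name__ == "__main__":
    print(fraction_within(toy_predictions))
```

In this toy input, geneA and geneC have a TSSp within 100 bp of the gene start and geneB and geneD do not, so the function returns 0.5. The same calculation applied per species would give the percentages of the kind reported in the text.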