Protein evidence of unannotated ORFs in Drosophila reveals diversity in the evolution and properties of young proteins.

Title: Protein evidence of unannotated ORFs in Drosophila reveals diversity in the evolution and properties of young proteins.
Publication Type: Journal Article
Year of Publication: 2022
Authors: Zheng EB, Zhao L
Date Published: 2022 09 30
Keywords: Animals; Drosophila; Drosophila melanogaster; Evolution, Molecular; Male; Open Reading Frames; Proteins

De novo gene origination, where a previously nongenic genomic sequence becomes genic through evolution, is increasingly recognized as an important source of novelty. Many de novo genes have been proposed to be protein-coding, and a few have been experimentally shown to yield protein products. However, the systematic study of de novo proteins has been hampered by doubts regarding their translation without the experimental observation of protein products. Using a systematic, mass-spectrometry-first computational approach, we identify 993 unannotated open reading frames with evidence of translation (utORFs) in Drosophila melanogaster. To quantify the similarity of these utORFs across Drosophila and infer phylostratigraphic age, we develop a synteny-based protein similarity approach. Combining these results with reference datasets on tissue- and life stage-specific transcription and conservation, we identify different properties amongst these utORFs. Contrary to expectations, the fastest-evolving utORFs are not the youngest evolutionarily. We observed more utORFs in the brain than in the testis. Most of the identified utORFs may be of de novo origin, even accounting for the possibility of false-negative similarity detection. Finally, sequence divergence after an inferred de novo origin event remains substantial, suggesting that de novo proteins turn over frequently. Our results suggest that there is substantial unappreciated diversity in de novo protein evolution: many more may exist than previously appreciated; there may be divergent evolutionary trajectories, and they may be gained and lost frequently. All in all, there may not exist a single characteristic model of de novo protein evolution, but instead, there may be diverse evolutionary trajectories.

Alternate Journal: Elife
PubMed ID: 36178469
PubMed Central ID: PMC9560153
Grant List: R35 GM133780 / GM / NIGMS NIH HHS / United States
T32 GM007739 / GM / NIGMS NIH HHS / United States