
Category: Clinical Microbiology
Helitrons, the Eukaryotic Rolling-circle Transposable Elements
Abstract:
Helitrons are one of three groups of eukaryotic class 2 transposable elements (TEs) so far described. Unique in structure and coding capacity, they are hypothesized to move by a rolling-circle-like replication mechanism via a single-stranded DNA intermediate (1, 2). The other two groups, the classic cut-and-paste elements and the Maverick/Polinton elements (3–5), both encode a transposase/integrase and are flanked by target site duplications (TSDs) (for review see reference 6). The repair resulting from the staggered double-stranded joining of the TE to the target DNA creates the TSD flanking the insertion (for reviews see references 7 and 8). Helitrons encode a putative protein called the Rep/Helicase (1), which is predicted to have both HUH endonuclease activity (for review see reference 9) and 5′ to 3′ helicase activity. The HUH endonuclease (the Rep of the Rep/Helicase) (Figure 1) would likely make a single-stranded nick in the host DNA, which is consistent with the lack of TSDs observed flanking Helitron insertions. A related protein with an HUH endonuclease domain, encoded by various bacterial insertion sequence families (IS608, IS91, and ISCR1), makes a single-stranded nick in the host DNA, and the insertions are not flanked by TSDs (for review see reference 9).
Structure and coding capacity of canonical animal and plant Helitrons, Helentrons, Proto-Helentron, Helitron2, and IS91. (A) Structure of a typical animal Helitron. (B) A typical plant Helitron encoding Rep/Helicase and RPA proteins; one to three RPA genes can be found on either side of the Rep/Helicase gene. (C) Structure of a typical nonautonomous plant or animal Helitron; these elements do not encode the Rep/Helicase gene but share the common structural features. (D) Structure and coding capacity of a Helentron; Helentrons have subterminal inverted repeats (subTIRs) (red) and a short palindrome at the 3′ end (stem loop). The subTIRs can either be palindromic or form a palindrome with the short inverted repeats (sideways triangle) near the subTIR, if present. (E) Structure of a Helentron-associated INterspersed Element (HINE); HINEs are nonautonomous but have the same structural features as the autonomous partner. (F) Structure and coding capacity of Proto-Helentron. (G) Structure and coding capacity of Helitron2 (redrawn from reference 12). (H) Structure of a bacterial IS91 element that is proposed to transpose by a rolling-circle mechanism (redrawn from references 9, 46). The genes that are occasionally carried by Helentrons are indicated with a black asterisk (*) and are included only if they were found in multiple families or across species. Sequences flanking the elements are shown in red.
Amino acid alignment of the Rep motifs of select Helitrons and Helentrons. An alignment of the Rep motif of Helentrons from 12 species and Helitrons from seven species (redrawn from reference 11). Identical residues are shaded in black and conservative changes are shaded in gray. Amino acids that distinguish Helentrons from Helitrons are boxed in red. The black triangles and stars above the alignment denote the two histidine residues and the two tyrosines, respectively, which are known to be critical for catalytic activity of the rolling-circle elements. Sequences representing Helentrons have “Hele” and Helitrons have “Helit” as suffix to the name of the organism. The accessions and coordinates of the sequences used in this alignment are available in reference 11.
Criteria for classifying Helitrons and Helentrons. (A) Classification criteria for Helitrons; colors of the 5′ and 3′ ends denote common ancestry. Helitrons belong to the same family (Family A) when they share > 80% identity over the last 30 bp (denoted by an orange 3′ end). Subfamilies share 80% sequence identity in the 3′ end but have different 5′ ends (Family A, subfamily B). Helitrons belonging to Family C have a different 3′ end. Exemplars have internal sequences that are > 20% diverged compared with any other exemplar. (B) Classification criteria for Helentrons. Helentrons belonging to the same family share 100% sequence identity across the 11-bp subterminal inverted repeats (subTIRs) (Family A). A subfamily has at least 80% identity over the last 60 bp at the 3′ end (excluding variable Ts) (Family A, subfamily B). Helentrons belonging to Family C have a different subTIR. Exemplars have internal sequences that are > 20% diverged.
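The family thresholds in this caption can be illustrated with a minimal Python sketch. It assumes the two ends being compared are already the same length and are compared position by position without gaps; a real analysis would work from a proper sequence alignment. The function names are hypothetical and are not taken from any published tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length, ungapped sequences (assumption)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def same_helitron_family(seq1: str, seq2: str,
                         tail: int = 30, threshold: float = 80.0) -> bool:
    """Helitron rule from the caption: > 80% identity over the last 30 bp."""
    return percent_identity(seq1[-tail:], seq2[-tail:]) > threshold


def same_helentron_family(subtir1: str, subtir2: str) -> bool:
    """Helentron rule from the caption: 100% identity across the 11-bp subTIR."""
    return len(subtir1) == 11 and subtir1.upper() == subtir2.upper()
```

The subfamily and exemplar criteria (different 5′ ends, > 20% internal divergence) could be layered on top with the same `percent_identity` helper.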
Genome-wide identification of Helitrons. (A) A pipeline for genome-wide identification of candidate Helitrons and their verification. Examples of structure-based and repeat-based tools that could be used for annotating candidate Helitrons are listed on the respective sides. The black star denotes pipelines that use a set of tools. (B) Alignment of two Helitron copies inserted at different locations to identify the boundary of the element. Homology drops at the boundary of the element and the sequences at the boundary have canonical Helitron features. (C) Empty site verification for a Helitron. The first line is the host sequence with a Helitron insertion and the second line is the paralogous site without the Helitron insertion. (D) Alignment of two Helentron-associated INterspersed Element (HINE) copies at different locations to identify the boundary of the element. Homology drops at the boundary of the element. Since the insertion is in T-rich sequence, the precise boundary of the element can be unambiguously identified only through the identification of an empty site. (E) Empty site verification of a HINE copy. The first line is the host sequence containing the HINE insertion and the second line is the paralogous site without the HINE insertion. The accession numbers and coordinates are given in black.
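The boundary-finding idea in panel B, where homology between two copies drops at the element ends, can be sketched in Python as a maximal-scoring-segment scan. This is an illustration only and not the method of any tool in the pipeline: it assumes the two copies plus their flanks are already aligned without gaps, and the function name and the mismatch penalty are hypothetical choices.

```python
def homologous_block(copy1: str, copy2: str, mismatch_penalty: int = 3):
    """Return (start, end) of the highest-scoring matching segment.

    Scores +1 per matching column and -mismatch_penalty per mismatch,
    resetting whenever the running score falls to zero or below.
    The end coordinate is exclusive.
    """
    if len(copy1) != len(copy2):
        raise ValueError("inputs must be pre-aligned to the same length")
    best_score, best = 0, (0, 0)
    score, start = 0, 0
    for i, (x, y) in enumerate(zip(copy1.upper(), copy2.upper())):
        score += 1 if x == y else -mismatch_penalty
        if score <= 0:
            # Homology has broken down; restart after this column.
            score, start = 0, i + 1
        elif score > best_score:
            best_score, best = score, (start, i + 1)
    return best
```

Chance matches in the flanks next to the element extend the reported block, which is the same ambiguity the caption describes for insertions in T-rich sequence; an empty site is still needed to confirm the exact boundary.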
Proposed model for the transposition of Helitrons. The blue line indicates the Helitron, the small triangle indicates the 5′ end of the Helitron, and the star indicates the 3′ end of the Helitron. (A) The first tyrosine (Y1) residue of the Rep protein (shown as a black oval) cleaves at the 5′ end of the Helitron in the donor strand (shown as green lines) and the second tyrosine (Y2) residue cleaves the target DNA (shown as black lines). The tyrosine residues covalently join to the 5′ end of the respective strands. (B) The free 3′ hydroxyl in the target DNA attacks the DNA–Y1 bond and forms a covalent bond with the donor strand, resulting in strand transfer. The free 3′ hydroxyl in the donor strand serves as a primer for DNA synthesis by host DNA polymerase. The strand is displaced by the 5′ to 3′ activity of the Helicase protein and remains single-stranded (ss) with the help of ssDNA-binding protein. (C) At the termination site, the free Y1 residue cleaves the 3′ end, becomes covalently linked to the 5′ end of the nicked strand, and initiates strand transfer when the 3′ hydroxyl of the cleaved Helitron attacks Y2 at the 5′ end of the target DNA and forms a covalent bond. (D) The heteroduplex is passively resolved by DNA replication (redrawn from references 2, 46).
Helitron-containing gene fragment captured at the DNA level and RNA level. (A) The structure of the HelibatN23.3 exemplar that has captured the promoter, 5′ untranslated region (UTR), Exon1, and Intron1 of the PIAS1 (protein inhibitor of activated signal transducer and activator of transcription 1 [STAT-1]) gene, which inhibits STAT1-mediated gene activation and the DNA-binding activity. (B) Structure of HelibatN127.3 containing the cDNA of the protein phosphatase 1, regulatory (inhibitor) subunit 12C (PPP1R12C) gene; the Helitron contains seven exons (blue box), 3′ UTR (pink box), polyAs (yellowish green box), and an 11-bp target site duplication (TSD) (purple arrows). (C) Empty site for the retroposed mRNA; the first line is the Helitron containing the PPP1R12C cDNA and the second line is a paralogous site within another Helitron but without the retrogene. The black bold letters show the TSD. The accession and coordinates of the Helitrons are given. The flanking AT dinucleotide of the Helitron insertion is shown in red.
Active transposition-based models for gene capture. (A) End bypass model (2). (a) The Rep/Helicase protein cleaves the 5′ end of the Helitron (red line) and invades the target site (blue line) (shown in c) (see transposition model, Figure 5, for details). (b and c) Capture of flanking sequence (black line) occurs when the protein fails to recognize the termination signal (black star). Later the transposition is terminated by a cryptic random termination signal (four-pointed star) and the donor strand is cleaved and transferred to the target site (redrawn from reference 2). (B) Modified end bypass/chimeric transposition model (69). (a) The Rep/Helicase protein cleaves the 5′ end of the Helitron (red line) and invades the target site (blue line). (b and c) Capture of the flanking sequence occurs when the protein fails to recognize the termination signal or the 3′ end is truncated. The 3′ end of another Helitron (green line) in the proper orientation is recognized and is used as a new 3′ end, thus creating a novel composite element. The protein cleaves the donor strand at the new termination signal and the donor strand is transferred to the target site.
Filler DNA model and site-specific recombination model of gene capture. (A) Filler DNA model (for review see reference 56). When a double-strand break occurs in the acceptor DNA (Helitron) (red line), the host exonuclease creates 3′ single-stranded (ss) overhangs. The free 3′ ssDNA anneals to the donor DNA based on microhomology, triggering the synthesis of new DNA, which then anneals back to the acceptor DNA. The new DNA acquired from the donor acts as the template for the other strand. Hence, the Helitron now contains host sequences from a random location, which could be of genic or nongenic origin depending on which region of the host was used for repair. (B) Site-specific recombination model (107). The sites of recombination are marked by crosses. The capture of the host sequence by the Helitron would require three recombinational events, one within the Helitron and two flanking the host sequence. As Helitrons do not encode an integrase, a host integrase is used for the capture.
Distribution of Helitrons across the eukaryotic tree of life. The four-pointed star shows the presence of Helitrons, the five-pointed star represents the presence of Helentrons, and the rectangular bars represent the presence of the Helentron Rep protein. The tree of life was redrawn from reference 192. The numbers within parentheses represent the numbers of whole genome sequences available at the NCBI whole genome shotgun (wgs) database as of 18 June 2014 (http://www.ncbi.nlm.nih.gov/). The plus sign within parentheses indicates that the respective element was identified from the transcriptome assembly deposited at the wgs database. The dot within parentheses indicates that Helentrons were identified from the Mucorales group of fungi, traditionally classified as Zygomycetes but not deposited as Zygomycetes at NCBI. TBLASTN searches (193) were employed against the wgs database to identify sequences homologous to the Helitron Rep protein query, and the signature amino acids (see Figure 2) were used to differentiate Helentron from Helitron proteins (we do not know how this correlates with structure outside of animals and plants). Hits with very low copy number or on short contigs were not reported because of the possibility of contamination.
The abundance of Helitron-generated DNA in different organisms. The different organisms include Arabidopsis thaliana (1.6%; 1.9/120 Mb), Medicago truncatula (1.2%; 5/419 Mb), Oryza sativa ssp. japonica (4%; 15/374 Mb), Sorghum bicolor (1%; 7.4/734 Mb), Zea mays (6.6%; 136.4/2066 Mb) (83), Caenorhabditis elegans (2.3%; 2.3/100 Mb) (81), Nematostella vectensis (3%; 8.9/297 Mb) (133), Bombyx mori (4.2%; 19.7/465.7 Mb) (84), Heliconius melpomene (6.62%; 17.1/260 Mb) (84), Drosophila virilis (5%; 9.5/189 Mb) (56), and Myotis lucifugus (5.8%; 109.5/1887 Mb) (51).
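Each percentage in this caption is the Helitron-derived sequence length divided by the genome size. The short Python sketch below reproduces that arithmetic from the Mb values quoted in the caption; the dictionary and function names are hypothetical and exist only for illustration.

```python
# (Helitron-derived Mb, genome size in Mb), as quoted in the caption.
HELITRON_MB = {
    "Arabidopsis thaliana": (1.9, 120),
    "Medicago truncatula": (5, 419),
    "Oryza sativa ssp. japonica": (15, 374),
    "Sorghum bicolor": (7.4, 734),
    "Zea mays": (136.4, 2066),
    "Caenorhabditis elegans": (2.3, 100),
    "Nematostella vectensis": (8.9, 297),
    "Bombyx mori": (19.7, 465.7),
    "Heliconius melpomene": (17.1, 260),
    "Drosophila virilis": (9.5, 189),
    "Myotis lucifugus": (109.5, 1887),
}


def helitron_percent(helitron_mb: float, genome_mb: float) -> float:
    """Percentage of the genome made up of Helitron-generated DNA."""
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * helitron_mb / genome_mb


if __name__ == "__main__":
    for species, (h_mb, g_mb) in HELITRON_MB.items():
        print(f"{species}: {helitron_percent(h_mb, g_mb):.2f}%")
```

Because the Mb values in the caption are rounded, a recomputed percentage may differ slightly from the one printed beside it.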
Distribution of horizontally transferred Helitrons and their phylogenetic relationship. (A) A Venn diagram showing the distribution of different horizontally transferred Helitron families (Heligloria, Helisimi, Heliminu, and Helianu [70]; Lep1 [115]) and nonautonomous Helentrons, the Helentron-associated INterspersed Elements (HINEs) (11, 33). (B) Phylogenetic relationship between different organisms that carry horizontally transferred Helitrons. The exact divergence of Microsporidia from other groups is not known. The phylogenetic tree is redrawn from reference 70. The distribution of a Helitron family (black) is shown as organisms within the black ellipse. The colors of letters for each organism are related to the group in the phylogeny. The red rectangle shows the distribution of HINEs and red letters represent the HINE family.
Tandem copies of Helitrons and Helentron-associated INterspersed Elements (HINEs). (A) Tandem copies of four Helitrons having 5′ TC and 3′ CTAG are counted as indicated by the horizontal black line underneath the box. Boxes with the same color indicate that they have > 85% sequence identity. Sequences homologous to multiple Helitron ends can be identified within a single Helitron. The sequences that are homologous to Helitron 5′ ends are shown before the dots and sequences that are homologous to Helitron 3′ ends are shown after the dots within the box. (B) Empty site of the tandem Helitron described above. The first line is the host sequence with the Helitron insertion and the second line is an orthologous site in another bat, Rhinolophus ferrumequinum. (C) Three tandem copies of a HINE insertion in the Drosophila ananassae genome. One copy is truncated because it is at the end of the contig. The HINE copies are 99% identical to each other.
Examples of the impact of Helitrons on gene structure and expression. (A) Insertion of a Helitron in the promoter (yellow box with letter P) disrupts the transcription of the gene (102, 169). (B) A Helitron insertion upstream of the promoter can increase the expression of the gene (170). (C) A Helentron-associated INterspersed Element (HINE) insertion in the promoter region of the P element disrupts the promoter but provides a de novo promoter (172, 173). (D) Chimeric transcript generated from a Helitron containing multiple gene fragments (maroon, light orange, and light pink boxes) (e.g., 82, 98, 101). (E) Helitron insertions in the 5′ UTR contribute to transcript diversity (51, 84). (F) Insertion in the 3′ UTR can disrupt the polyadenylation of the transcripts, causing loss of function (174). (G) Helitron insertion provides novel alternative polyadenylation sites (51). (H) Helitron insertions in the 3′ UTR provide putative microRNA binding sites (51). (I) Insertion of a Helitron in an intron can disrupt or alter splicing, increasing transcript diversity (51, 82) and often causing loss of function (99, 177). (J) Helitrons provide cryptic splice sites and create novel fusion transcripts (51, 82). (K) Two different waves of HINE amplification provided binding sites for the protein involved in dosage compensation (shown in orange and red bars) on the X chromosomes that were generated ∼15 and ∼1 million years ago (127). (L) Helitron insertions contribute to long noncoding RNAs (51).