
Category: Microbial Genetics and Molecular Biology
Insertion Sequences Revisited
Abstract:
This chapter is an update of a survey of insertion sequences (ISs) published in 1998. Researchers have retained the same basic structure: a first section including some key properties of ISs and a second section that defines and describes the different IS families. Throughout the text the authors have tried to compare and contrast the different IS families in terms of their transposition mechanism and control of their transposition activity. Researchers have introduced an additional section concerning bacterial genomes and plasmids since a number of genome sequences have become available over the past three years, and a large number of potential ISs have been identified in several of these. Researchers have also retained a section on eukaryotic insertion sequences. A general pattern for the functional organization of Tpases appears to be emerging from the limited number that have been analyzed. Another general feature of IS elements is that, on insertion, most generate short directly repeated sequences (DR) of the target DNA flanking the IS. Transposition activity is frequently modulated by host factors. The G+C content of family members varies from 70% in the mycobacterial examples to 25% in those isolated from Mycoplasma species. Family members from Mycoplasma species merit special attention.
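The G+C range quoted above (about 70% down to about 25%) is a simple base-composition statistic. A minimal sketch of how it could be computed is given below; the function name `gc_content` and both toy sequences are illustrative assumptions, not data from the chapter.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Symbols other than A, C, G and T (for example N) are ignored.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequences for illustration only; these are not real IS elements.
gc_rich = "GCGCGGCCGA"
at_rich = "ATTTAAGATA"

if __name__ == "__main__":
    print(f"G+C-rich toy sequence: {gc_content(gc_rich):.1f}%")
    print(f"A+T-rich toy sequence: {gc_content(at_rich):.1f}%")
```

In practice the same calculation would be applied to the full nucleotide sequence of each family member rather than to short fragments.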
Organization of a typical insertion sequence. The IS is represented as an open box in which the terminal IRs are shown as gray boxes labeled IRL (left inverted repeat) and IRR (right inverted repeat). A single open reading frame encoding the Tpase is indicated as a hatched box stretching over the entire length of the IS and extending within the IRR sequence. XYZ enclosed in a pointed box flanking the IS represents short directly repeated sequences generated in the target DNA as a consequence of insertion. The Tpase promoter, p, partially localized in IRL, is shown by a horizontal arrow. A typical domain structure (gray boxes) of the IRs is indicated beneath. Domain I represents the terminal base pairs at the very tip of the element whose recognition is required for Tpase-mediated cleavage. Domain II represents the base pairs necessary for sequence-specific recognition and binding by the Tpase.
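The two structural landmarks described in this caption, terminal inverted repeats (IRL/IRR) and the flanking direct repeats (the "XYZ" boxes), can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the toy sequences, and the simplification that the IRs are perfect (real IRs are often imperfect, so a practical search would tolerate mismatches).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_ir_length(element: str, max_len: int = 50) -> int:
    """Length of the perfect terminal inverted repeat of `element`.

    The left end is compared base by base with the reverse complement
    of the right end; counting stops at the first mismatch.
    """
    rc = reverse_complement(element)
    limit = min(max_len, len(element) // 2)
    n = 0
    while n < limit and element[n] == rc[n]:
        n += 1
    return n


def direct_repeat(host: str, start: int, end: int, max_len: int = 15) -> str:
    """Longest identical sequence directly flanking host[start:end] on both sides."""
    best = ""
    for k in range(1, max_len + 1):
        if start - k < 0 or end + k > len(host):
            break
        if host[start - k:start] == host[end:end + k]:
            best = host[end:end + k]
    return best


# Toy element (hypothetical): 8-bp IRL, 10-bp body, IRR = reverse complement of IRL.
irl = "GGTCATCA"
element = irl + "ACGTTGCAAC" + reverse_complement(irl)

# Toy host (hypothetical): the element is flanked by a 3-bp direct repeat, GAT.
host = "CCTA" + "GAT" + element + "GAT" + "TTCG"
start = host.index(element)
end = start + len(element)

if __name__ == "__main__":
    print("terminal IR length:", terminal_ir_length(element))
    print("flanking direct repeat:", direct_repeat(host, start, end))
```

The direct-repeat search deliberately keeps the longest match rather than stopping at the first mismatch, because a k-bp repeat only lines up when the full k bases on each side are compared.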
Different types of Tpase-mediated cleavage at transposon ends. (A) Tpase-catalyzed cleavages associated with different transposable elements with DDE Tpases. Transposons are represented by hatched boxes, and flanking donor DNA is represented by black lines. The arrows indicate Tpase-mediated cleavages at the 3′ ends of each element, which give rise to active 3′OH groups shown as open circles and 5′-phosphate groups shown as t-bars. Closed circles indicate 3′OH groups generated in flanking donor DNA. (B) Chemistry of the cleavage and strand transfer events. The left-hand panel shows nucleophilic attack by a water molecule on the transposon phosphate backbone. The nucleotide shown as base A represents the terminal 3′ base of the transposon and that marked B, the neighboring 5′ nucleotide of the vector backbone DNA. Initial attack generates a 3′OH group on the transposon end. The right-hand panel shows a strand transfer event. The 3′OH group at the transposon end acts as a nucleophile in the attack of the target phosphodiester backbone (bases X and Y), joining the 3′ transposon end to a 5′ target end and creating a 3′OH group on the neighboring target base (X). Also shown in this panel as dashed arrows is the "disintegration" reaction in which the 3′OH of the target (X) attacks the newly created phosphodiester bond between the transposon (A) and target (Y) to regenerate the original phosphodiester bond between X and Y.
DDE consensus of different families. Individual representative members of each family are shown. Amino acids forming part of the conserved motif are indicated by large bold letters. Uppercase letters indicate conservation within a family and lowercase letters indicate that the particular amino acid is predominant. The numbers in parentheses show the distance in amino acids between the amino acids of the conserved motif. Conservations indicated were derived from previously published alignments or from alignments generated for this chapter. The retroviral integrase alignment is based on reference 287. The overall alignment for the IS3 family (not shown) is essentially that obtained in reference 287. For IS21, see reference 134; mariner, see references 90 and 318; IS630, see reference 90; IS4 and IS5, see reference 312; IS256, see reference 281. N2, N3, and C1 are regions originally defined in the IS4 family (312).
Simple insertions and cointegrate formation. (A) Strand transfer and replication leading to simple insertions and cointegrates. The IS DNA is shown as a shaded box. Liberated transposon 3′OH groups are shown as small shaded circles and those of the donor backbone (bold lines) as filled circles. 5′ phosphates are indicated by a bar. Strand polarity is indicated. Target DNA is shown as unfilled boxes. The left-hand column shows an example of an IS that undergoes double-strand cleavage prior to strand transfer. The right-hand column presents an element that undergoes single-strand cleavage at its ends. After strand transfer, this can evolve into a cointegrate molecule by replication or a simple insertion by second-strand cleavage. (B) Replicative and nonreplicative transposition as mechanisms leading to cointegrates. The figure shows three pathways that generate "cointegrate" molecules by (I) replicative transposition, (II) simple insertion from a dimeric form of the donor molecule, and (III) simple insertion from a donor carrying tandem copies of the transposable element. Transposon DNA is indicated by a heavy line and the terminal repeats by small open circles. The relative orientation is indicated by an open arrowhead. The square and oval symbols represent compatible origins of replication and are included to visually distinguish the different replicons.
IS distribution among different families. The figure shows the number distribution of the entire IS database into the various IS families. Isoforms are not taken into account.
Organization of IS1. (A) Dendrogram of the InsB′ reading frames of IS1 elements from the enterobacteria and IS1-like Synechocystis elements. (B) Comparison of terminal inverted repeats. (C) Structure of IS1. Left (IRL) and right (IRR) inverted terminal repeats are shown as filled boxes. Relative positions of the insA and insB′ reading frames, together with their overlap region, are shown within the open box representing IS1. The IS1 promoter pIRL, partially located in IRL, is indicated as a small arrow. IHF binding sites located partially within each terminal IR are shown as small open boxes. The InsA protein is represented as a hatched box beneath. The InsA and InsB′ components of the InsAB′ frameshift product are shown as hatched and stippled boxes, respectively. Thin arrows indicate the probable region of action of InsA and InsAB′ proteins. The effect of InsA and InsAB′ on transposition is shown above.
The IS3 family. (A) General organization of IS3 family members. The black boxes indicate the left (IRL) and right (IRR) terminal inverted repeats. Transcription probably occurs from a weak promoter located partially in IRL. The two consecutive overlapping open reading frames are indicated (orfA and orfB) and are arranged in reading phases 0 and −1, respectively. The products of these frames are shown below. OrfA and OrfB are shown as hatched and open boxes, respectively. The position of a potential helix-turn-helix motif (HTH) is shown as a stippled box in OrfA and the DDE catalytic domain as a stippled box in OrfB. A potential leucine zipper (LZ) at the C-terminal end of OrfA and extending into OrfAB is also indicated. Each leucine heptad is indicated by an oval. Those present in the OrfA domain are crosshatched whereas that deriving from the frameshifted product is open. (B) The nucleotide sequence of the terminal IRs of two representative elements of each subgroup is shown.
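The arrangement of orfA and orfB in reading phases 0 and −1 means that a fused OrfAB product requires the ribosome to slip back one nucleotide. The sketch below illustrates only that arithmetic. The 29-nt sequence, the position of the slip, and the function names are hypothetical and chosen so the result is easy to follow; they are not taken from any IS3 family member.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, start: int = 0) -> str:
    """Translate `seq` from index `start` until a stop codon or the end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(seq: str, shift_codon_index: int) -> str:
    """Translate in phase 0 through codon `shift_codon_index` (0-based),
    then slip back one nucleotide and continue in phase -1."""
    upstream_end = 3 * (shift_codon_index + 1)
    upstream = translate(seq[:upstream_end])
    # Re-reading the last nucleotide of the shift codon gives phase -1.
    downstream = translate(seq, start=upstream_end - 1)
    return upstream + downstream


# Hypothetical toy sequence: phase 0 ends at a TAA stop after four codons;
# the overlapping phase -1 frame continues downstream.
toy = "ATGGCAAAAAAGTAACGGATGACGAATAA"

if __name__ == "__main__":
    print("phase 0 product (OrfA-like):", translate(toy))
    print("frameshifted fusion (OrfAB-like):", translate_with_minus1_shift(toy, 3))
```

Without the shift, translation terminates at the phase 0 stop codon and yields only the short upstream product; with it, the same upstream residues are extended by the phase −1 frame.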
The IS4 family. (A) Dendrogram of different members of the IS4 family. (B) Comparison of a representative set of terminal IRs. (C) Organization of IS10 and IS50. IS10: The Tpase promoter, pIN, and the anti-RNA promoter, pOUT, are indicated as horizontal arrows. A mechanistically important IHF site is indicated by an open box next to IRL. The Tpase is represented underneath. Stippled boxes indicate the positions of consensus sequence within members of the IS4 family (from positions 93 to 132, 157 to 187, and 266 to 326). I and II indicate patch I and patch II as defined by mutagenesis (189). The vertical arrow indicates a protease-sensitive site. IS50: The promoters for Tpase and inhibitor protein, p1 and p2, are indicated as horizontal arrows. DnaA and Fis binding sites located close to the left and right ends, respectively, are indicated by open boxes.
The IS5 family. (A) Dendrogram of the Tpases of present members of the family showing the different subgroups. (B) Comparison of the terminal IRs of representative members of each subgroup.
The IS6 family. (A) Terminal inverted repeats. (B) Transposition mechanism. A target plasmid is distinguished by an open oval representing the origin of replication. The transposon carried by the donor plasmid is composed of two copies of the IS (heavy double lines terminated by small circles) in direct relative orientation (indicated by the open arrowhead) flanking an interstitial DNA segment (shown as a zigzag). The donor plasmid is distinguished by an open rectangle representing its origin of replication. Tpase-mediated replicon fusion of the two molecules generates a third copy of the IS in the same orientation as the original pair (open arrowhead). Homologous recombination using the recA system between any two copies can, in principle, occur. This will either regenerate the donor plasmid leaving a single IS copy in the target, delete the transposon, or transfer the transposon to the target (as shown) leaving a single copy of the IS in the donor molecule.
The IS21 family. (A) General organization. Terminal inverted repeats IRL and IRR are shown as filled boxes. The position of the istA and istB reading frames is also shown. The horizontal lines below show the relative positions of the multiply repeated elements whose sequence is presented in B. IstA (hatched box) together with the potential "DDE" motif (stippled box) and IstB (open box) are indicated below. The possibility of translational coupling between the two reading frames is indicated. (B) The nucleotide sequence of the multiple terminal repeats and their coordinates are presented. CS, complementary strand. L1, L2, L3, and R1, R2 indicate internal repeated sequences at the left and right ends, respectively.
The IS30 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats.
The IS66 family. (A) Organization of IS866. A "best guess" diagram of the open reading frames is shown. All are transcribed from left to right. The difference in shading is simply to facilitate their distinction. Terminal IRs are shown as black boxes. (B) Terminal inverted repeats.
The IS110 family. The dendrogram is based on Tpase alignments.
The IS256 family. (A) Dendrogram. (B) Terminal inverted repeats.
The IS481 family. (A) Dendrogram. (B) Terminal inverted repeats.
The IS630 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats showing the 3′-TA-5′ target dinucleotide duplicated following insertion.
The IS982 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats.
The IS1380 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats.
The ISAs1 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats.
The ISL3 family. (A) Dendrogram based on Tpase alignments. (B) Terminal inverted repeats.
The IS200 complex. The figure shows the organization of IS200 (top) with short inverted repeats (open arrows) at the left end and the relative position of the potential open reading frame (hatched box). Selected examples of IS605 and IS607 relatives are also included. In all cases the orfB frames (unfilled boxes) show clear similarities. The upstream orfA frames are similar to that of IS200 for members of the IS605 group (thin crosshatching). For the IS607 group, the orfA frames (heavy hatching) resemble each other but are not related to those of the IS605 group. The relative localization of the two frames is indicated: either a significant overlapping region, a one-base overlap (suggesting translational coupling), or no overlap at all. Some isolated members carry short IRs. These are indicated by filled boxes.