Bacteriophage N4, an Escherichia coli (E. coli) K-12 strain-specific podovirus, is the founding member of the N4-like phage subfamily, which contains phage species prevalent in the human gut microbiome. Although numerous phage species in available databases have recently been identified as N4-like, no disciplined methodology for the classification of phages to the N4-like phage subfamily currently exists. Therefore, I defined a universal framework for the classification of new N4-like phage species using N4 virion-encapsidated RNA polymerase (vRNAP) as a marker, due to its activity across numerous N4 physiological processes and its ease of detection in genomic sequences and isolated virions. Fifty-four phage species encoding N4 vRNAP homologs were detected in current databases through DELTA-BLAST homology search. These species share genomic and morphological properties similar to those of N4, but infect a broad range of Proteobacteria occupying diverse ecological niches across five continents. I created a reticulate phylogeny of N4-like phages based on shared ORFs, which revealed greater genetic similarity among phages infecting closely related hosts and identified the N4 transcriptional machinery (vRNAP, N4 RNAPII, and N4SSB) as a hallmark of N4-like phages due to its complete conservation across the subfamily. I also identified ORFs that are uniquely conserved within clusters of phages and encode putative host specificity factors, which are excellent targets for the design of novel antibiotics and for engineering phages to infect new hosts. Comparative genomics studies demonstrated that the N4 transcriptional strategy, which uses the sequential activity of a virion-encapsidated RNAP for early transcription, a heterodimeric RNAP for middle transcription, and a single-stranded DNA-binding protein (SSB) required for coupling late transcription with DNA replication, is a unique feature of N4 and its relatives.
The phage-encoded heterodimeric N4 RNA polymerase II (N4 RNAPII), responsible for the transcription of N4 middle genes, is a member of the T7-like RNA polymerase family. Unlike T7 RNAP, N4 RNAPII cannot initiate transcription from double-stranded templates and requires the additional factor N4 gp2 for transcription in vivo. Gp2 is an SSB that activates N4 RNAPII transcription through direct interaction with N4 RNAPII. In this work, I define the requirements for N4 RNAPII promoter recognition and elucidate the molecular mechanism of transcription activation by its transcription factor gp2. In vitro transcription, DNA binding, and crosslinking assays show that the N4 RNAPII specificity loop directly interacts with bases in the template strand within short, AT-rich sequences for promoter recognition and transcription initiation. I also used crosslinking and mass spectrometry techniques to show that the gp2 N-terminus localizes to the N4 RNAPII active site, where it activates N4 RNAPII transcription by increasing the catalytic efficiency of first phosphodiester bond formation 1.5-fold. I propose a model for N4 RNAPII transcription activation in which interaction between the N-terminus of ssDNA-bound gp2 and the N4 RNAPII active site recruits N4 RNAPII to single-stranded templates and coordinates the N4 RNAPII active site to increase the catalytic efficiency of transcription initiation.



