Published September 1, 2023 | Version v1
Journal article Open

Genome analysis of SARS-CoV-2 isolates from a population reveals the rapid selective sweep of a haplotype carrying many pre-existing and new mutations

  • 1. Institute of Advanced Study in Science and Technology
  • 2. Gauhati Medical College and Hospital
  • 3. University of Chicago

Description

To understand the mechanism underlying the evolution of SARS-CoV-2 in a population, we sequenced 92 viral genomes from Assam, India. Analysis of these and database sequences revealed a complete selective sweep of a haplotype in Assam carrying 13 pre-existing variants, including a sharp rise in the frequency of a variant on ORF8, which is involved in immune evasion. A comparative study of sequences of the same lineage and similar time frames in and outside Assam showed that 10 of the 13 pre-existing variants had a frequency ranging from 96 to 99%, and the remaining 3 had a low frequency outside Assam. Using a phylogenetic approach to infer the sequential occurrence of variants, we found that the variant Phe120del on ORF8, which had a low frequency (1.75%) outside Assam, is at the base of the phylogenetic tree of variants and became fixed (100%) in the Assam population. Based on this observation, we inferred that the variant on ORF8 had a selective advantage and therefore carried the haplotype to 100% frequency. The haplotype also carried 32 pre-existing variants at a frequency from 1.00 to 80.00% outside Assam. Among these, the variants more closely linked to the S-protein locus, which often carries advantageous mutations and is tightly linked to the ORF8 locus, retained higher frequencies, while the less tightly linked variants showed lower frequencies, likely due to recombination among co-circulating variants in Assam. The ratios of non-synonymous to synonymous substitutions suggested that some genes, such as those coding for the S-protein and non-structural proteins, underwent positive selection, while others were subject to purifying selection during their evolution in Assam. Furthermore, we observed a negative correlation between the patients' qRT-PCR Ct values and the abundance of ORF6 transcripts, suggesting that ORF6 can be used as a marker for estimating viral titer.
In conclusion, our in-depth analysis of SARS-CoV-2 genomes in a regional population reveals the mechanism and dynamics of viral evolution.
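
The frequency comparison in the description (a variant at low frequency in one set of genomes and at 100% in another) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input format (one set of variant labels per genome), and the toy data are assumptions made for illustration, and the label "variant_A" is hypothetical.

```python
from collections import Counter


def variant_frequencies(genomes):
    """Return the percentage of genomes carrying each variant.

    `genomes` is a list with one collection of variant labels per genome
    (an assumed input format, not the format used in the study).
    """
    n = len(genomes)
    if n == 0:
        return {}
    # Count each variant at most once per genome.
    counts = Counter(v for g in genomes for v in set(g))
    return {v: 100.0 * c / n for v, c in counts.items()}


def fixed_variants(freqs, threshold=100.0):
    """Return the variants whose frequency is at or above `threshold` percent."""
    return sorted(v for v, f in freqs.items() if f >= threshold)


# Toy data (hypothetical): four genomes, all carrying the ORF8 deletion.
toy_genomes = [
    {"ORF8:Phe120del", "variant_A"},
    {"ORF8:Phe120del"},
    {"ORF8:Phe120del", "variant_A"},
    {"ORF8:Phe120del"},
]
toy_freqs = variant_frequencies(toy_genomes)
toy_fixed = fixed_variants(toy_freqs)
```

Running the same function on two sets of genomes (for example, one regional and one from elsewhere) and comparing the resulting percentages gives the kind of contrast the description reports.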

Data availability

The sequences generated in this study have been submitted to the GISAID database and can be accessed using the accession number given in Additional file 2: Table S2.

Files

Genome-analysis-of-SARS-CoV-2-isolates-from-a-population.pdf

Files (8.6 MB)

Name Size
Article
md5:e0e91f8415e3b198ed686b812c27df5e
6.8 MB
md5:33abba5337a2d85b50f6fbff0a2bbc40
1.8 MB

Additional details

Identifiers

DOI
10.1186/s12985-023-02139-3
Other
oai:uchicago.tind.io:7774

Funding

Department of Science and Technology, India
SEED/TITE/2019/103/G dtd: 12.03.20
Department of Science and Technology, India
SEED/TITE/2019/103/C dtd: 12.03.2020
Department of Science and Technology, India
SEED/TITE/2019/103/C dtd: 21.09.2020

UChicago Information

Division(s)
Biological Sciences Division
Department(s)
Ecology and Evolution