Demographic information of the participants
Between 2006 and 2020, eight Burmese individuals were newly diagnosed with HIV in Baoshan, China. The participants comprised six females and two males, aged 21 to 42 years in 2020, of Lisu, Dai, Han, and Jingpo ethnicity (Table 1). Most participants were either uneducated or had received only primary education, and all were farmers. Most were married; three were unmarried. The primary route of HIV transmission was heterosexual contact; only one male participant was infected through needle sharing (Table 1).
Prevalence of known subtypes of HIV strains
The maximum likelihood tree, constructed using near full-length genomes of HIV, showed that among the eight sequences amplified from Burmese PLHIV in Baoshan, only one sequence (YN33F28) clustered with the known subtype C and one sequence (YN9M24) clustered with the known CRF08_BC, suggesting that they belonged to subtype C and CRF08_BC, respectively (Fig. 1). The remaining six sequences did not cluster within any known subtype, indicating that they might represent newly formed recombinant strains (Fig. 1). These findings were corroborated by a neighbor-joining tree incorporating reference sequences of all HIV subtypes and CRFs (Figure S1).
Maximum likelihood tree based on the near full-length genome of HIV. The red triangles represent the sequences amplified in this study, and the black circles represent the similar sequences downloaded from the HIV database. The sectorial shadings in different colors are used to distinguish different HIV subtypes
The Bootscan plot of sequence YN33F28 showed that it was most similar to HIV subtype C, with no recombination breakpoints, further confirming that it was indeed subtype C (Fig. 2A). The Bootscan plot of sequence YN9M24 demonstrated that its recombination structure was consistent with that of CRF08_BC, which was formed by recombination between subtype B and subtype C. It contained three subtype B fragments and three subtype C fragments, separated by five breakpoints (Fig. 2B). These results further confirmed that sequence YN9M24 belonged to CRF08_BC.
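The relationship between fragment counts and breakpoint counts can be illustrated with a minimal Python sketch. It assumes a mosaic genome is described as a list of (start, end, subtype) segments; the `breakpoints` helper and all coordinates below are hypothetical and are not taken from this study. A mosaic of six alternating fragments yields five breakpoints, matching the structure described for CRF08_BC.

```python
from typing import List, Tuple

# (start, end, subtype) with 1-based inclusive coordinates; hypothetical layout
Segment = Tuple[int, int, str]


def breakpoints(segments: List[Segment]) -> List[int]:
    """Return the start positions at which the assigned subtype changes."""
    ordered = sorted(segments)
    return [
        nxt[0]
        for prev, nxt in zip(ordered, ordered[1:])
        if prev[2] != nxt[2]
    ]


# Illustrative coordinates only, NOT the real CRF08_BC breakpoints
mosaic = [
    (1, 1000, "C"), (1001, 2000, "B"), (2001, 4000, "C"),
    (4001, 5000, "B"), (5001, 7000, "C"), (7001, 9000, "B"),
]

bps = breakpoints(mosaic)
print(len(mosaic), "fragments,", len(bps), "breakpoints")
```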

Bootscan plots of HIV sequences of known subtypes. (A) Subtype C; (B) CRF08_BC. In each plot, the lines in different colors represent the reference sequences of different subtypes, and the black arrows indicate the shared breakpoints among different HIV sequences
Recombinant strains formed through second-generation recombination
The maximum likelihood tree revealed that sequences YN36F38, YN35F22, YN34F21, YN7F27, and YN32M22 each clustered with a known CRF (CRF82_cpx, CRF86_BC, or CRF178_BC) or a URF (KY406739). However, these sequences were positioned closer to the root of the phylogenetic tree than their corresponding reference sequences, indicating that they were genetically closely related but not identical to the reference sequences. This suggested that they might be recombinant strains formed through second-generation recombination involving these reference sequences (Fig. 1).
Bootscan analysis further supported this hypothesis. YN36F38 exhibited a recombination structure in the latter half of the genome identical to that of CRF82_cpx, with three shared recombination breakpoints (Fig. 3A). Similarly, YN35F22 shared a recombination structure in the latter half of the genome with CRF86_BC, with five shared recombination breakpoints (Fig. 3B). YN34F21 and YN7F27 shared four and three recombination breakpoints with CRF178_BC, respectively (Fig. 3C). YN32M22 shared two recombination breakpoints with KY406739 in both the anterior and posterior regions of the genome (Fig. 4). These findings provided further evidence that these five sequences (YN36F38, YN35F22, YN34F21, YN7F27, and YN32M22) represent recombinant strains resulting from second-generation recombination involving known CRFs or URFs.
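Counting shared breakpoints between a query sequence and a reference CRF or URF can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the `shared_breakpoints` helper, the 50-nucleotide tolerance, and all positions are assumptions chosen only to show the comparison.

```python
from typing import List


def shared_breakpoints(
    query: List[int], reference: List[int], tolerance: int = 50
) -> List[int]:
    """Return query breakpoints within `tolerance` nt of any reference breakpoint."""
    return [
        q for q in query
        if any(abs(q - r) <= tolerance for r in reference)
    ]


# Illustrative positions only, NOT measured breakpoints from this study
query_bps = [1200, 3050, 5600, 6210, 8400]
reference_bps = [3040, 6200, 8420, 9000]

shared = shared_breakpoints(query_bps, reference_bps)
print(len(shared), "shared breakpoints at", shared)
```

A query that shares only some breakpoints with a reference, while carrying additional breakpoints of its own, is the pattern expected of a second-generation recombinant.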

Bootscan plots of second-generation recombinant HIV sequences composed of CRFs. (A) CRF82_cpx; (B) CRF86_BC; (C) CRF178_BC. In each plot, the lines in different colors represent the reference sequences of different subtypes, and the black arrows indicate the shared breakpoints among different HIV sequences

Bootscan plots of second-generation recombinant HIV sequences composed of URF_BC. In each plot, the lines in different colors represent the reference sequences of different subtypes, and the black arrows indicate the shared breakpoints among different HIV sequences
Maximum likelihood trees constructed from the subregions of these five sequences revealed that their identical fragments clustered with their respective parental sequences (Figure S2). This observation further supports the conclusion that these sequences are second-generation recombinants derived from known CRFs or URFs.
Newly formed URFs
The remaining sequence, YN8F28, did not cluster with any known subtypes, CRFs, or URFs. Instead, it was positioned closer to the root of the phylogenetic tree, suggesting that it was genetically distant from known sequences and might represent a newly formed URF (Fig. 1).
Bootscan analysis confirmed that YN8F28 was a recombinant strain derived from HIV subtypes B and C, as well as CRF01_AE, comprising four subtype B fragments, six subtype C fragments, and four CRF01_AE fragments (Fig. 5). Although phylogenetically positioned between CRF82_cpx and CRF83_cpx (Fig. 1), YN8F28 shared no recombination breakpoints with these CRFs (Fig. 5). These findings indicated that YN8F28 was a novel URF arising from complex recombination events involving multiple HIV subtypes.

Bootscan plots of newly identified HIV URF and known CRFs. In each plot, the lines in different colors represent the reference sequences of different subtypes