BMC Biol. 2024 Mar 13;22(1):59.
doi: 10.1186/s12915-024-01838-9.

Reconstructing the ancestral gene pool to uncover the origins and genetic links of Hmong-Mien speakers

Yang Gao et al. BMC Biol. 2024.

Abstract

Background: Hmong-Mien (HM) speakers are linguistically related and live primarily in China, but little is known about their ancestral origins or the evolutionary mechanism shaping their genomic diversity. In particular, the lack of whole-genome sequencing data on the Yao population has prevented a full investigation of the origins and evolutionary history of HM speakers. As such, their origins are debatable.

Results: Here, we deep-sequenced 80 Yao genomes. Our analysis, together with data from 28 East Asian populations and 968 ancient Asian genomes, suggested that there is a strong genetic basis for the formation of the HM language family. We estimated that the most recent common ancestor dates to 5800 years ago, while the genetic divergence between the HM and Tai-Kadai speakers was estimated at 8200 years ago. We propose that HM speakers originated in the Yangtze River Basin and spread with agricultural civilization. We identified highly differentiated variants between HM speakers and Han Chinese; in particular, a deafness-related missense variant (rs72474224) in the GJB2 gene occurs at a higher frequency in HM speakers than in other populations.

Conclusions: Our results indicate that complex gene flow and medically relevant variants were involved in the evolutionary history of HM speakers.

Keywords: Genomic diversity; Hmong-Mien; Local adaptation; Next-generation sequencing; Reconstructing genomes.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Genetic diversity among Hmong-Mien-speaking (HM) subpopulations. a The hierarchical genetic coordinates of HM speakers by PCA in the context of East Asia. The northern and southern populations are divided by the geographical boundary along the Qinling Mountains–Huai River line. b Linear fitting between PC1 and latitude. Red dots denote HM groups. The regression lines were fitted using all groups in the PCA. The blue shading shows the 95% confidence interval. c Linear fitting between PC2 and longitude. d The ancestral component inference of the HM speakers by ADMIXTURE software. e Paternal genetic model and divergence time estimation of the HM population by Y chromosome O-M7 haplogroup. f The genealogy tree of the HM language family, constructed from a distance matrix generated by comparing basic words
Fig. 2
Gene flow from surrounding populations promoted the formation of a subpopulation structure within the HM population. a Pairwise population FST, obtained by randomly selecting 9 samples and repeating the procedure 100 times. Each point represents one repetition. b–d Diamonds denote the two target populations, and the other dots denote the populations tested. The Yao samples come from two sites, Guilin and Laibin. The color represents the relative difference in FST between the population tested and the two target populations (see “Methods”). The results show that surrounding populations contributed gene flow to the HM populations
Fig. 3
A fine origin history model of the HM speakers by qpGraph. The basic skeleton was obtained through the analysis of the above group structure and history and then further adjusted according to the score. The final score was 7.53 × 10⁻⁶. Dotted lines represent admixture events, and the percentages give the admixture proportion of each event. The divergence times in the figure are the result of MSMC-IM analysis
Fig. 4
A workflow for reconstructing the ancestral genome based on present-day populations. I. Local ancestry inference of modern human genomes. The segments inferred as HM in 5 of the 10 results were treated as candidate segments. Grey dots denote variants. II. Assembly of HM ancestral genomes using candidate segments. Each time, a segment was randomly selected from the optional starting sites and extended back no more than 6 kb. The low-coverage regions of the ancestral segments were skipped. The pentagrams denote the starting sites for candidates.
Fig. 5
A map for tracking the population migration history of the HM population. The black dots in the image are all publicly available ancient DNA samples. The red triangle denotes the Daxi site in the middle reaches of the Yangtze River. The red dots show the eight ancient DNA samples that share the most genetic drift with the reconstructed aHM population. The numbers in parentheses represent the total number of samples at the site and the number of samples associated with aHM. These ancient DNA samples came from people who mainly lived in southern East Asia about 3000–4000 years ago, which is later than the time of the most recent common ancestor of present-day HM populations. These ancient DNA samples reflect the footprints of population diffusion after the divergence of HM populations. Ancient DNA samples showed that the ancestors of the HM population reached the China–Indochina Peninsula in the south and the Yellow River Basin in the north
Fig. 6
rs72474224 showed a specific positive selection signal in populations of Southern East Asia, including HM. a Distribution of allele frequency of rs72474224-T in the context of worldwide populations. b Haplotype network was constructed by PopART using the 39 SNVs located in GJB2. The rs72474224-T only appears on the haplotypes in the red box. c Extended haplotype homozygosity (EHH) of rs72474224-T. This allele showed positive selection signals in HM populations
Fig. 7
Rare derived alleles with strong effects were dispersed in different individuals. The top shows all transcripts of each gene, and each line below is a Yao individual. Triangles of different colors indicate rare derived alleles located at different loci. A box indicates that two related individuals share the same rare variants at this locus. A short red line marks an individual who carries at least one variant with a strong effect on the gene. The carrying rate is the proportion of individuals carrying strong-effect variants in each gene
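
Step I of the workflow in Fig. 4 keeps segments that were inferred as HM in 5 of the 10 local-ancestry results. The following is a minimal sketch of that kind of support filter, not the authors' implementation. It assumes that "5 of the 10" means "at least 5", that every run labels the same ordered set of segments, and that the labels other than "HM" ("Han", "TK") are made-up placeholders; the function name and input layout are hypothetical.

```python
from collections import Counter


def candidate_segments(runs, label="HM", min_support=5):
    """Return indices of segments labelled `label` in at least `min_support` runs.

    runs: list of equal-length lists, where runs[r][i] is the ancestry label
    that run r assigned to segment i (hypothetical input layout).
    """
    if not runs:
        return []
    n_segments = len(runs[0])
    if any(len(run) != n_segments for run in runs):
        raise ValueError("all runs must cover the same segments")

    # Count, per segment, how many runs assigned the target label.
    support = Counter()
    for run in runs:
        for i, ancestry in enumerate(run):
            if ancestry == label:
                support[i] += 1

    # Keep segments whose support reaches the threshold (assumed: >= 5 of 10).
    return [i for i in range(n_segments) if support[i] >= min_support]


# Toy data: 10 runs over 4 segments (labels are illustrative only).
runs = [["HM", "Han", "HM", "TK"]] * 5 + [["Han", "Han", "HM", "HM"]] * 5
print(candidate_segments(runs))  # [0, 2, 3]
```

Raising `min_support` makes the filter stricter: with `min_support=6`, only segment 2 (labelled HM in all 10 toy runs) would remain.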
