Building a timeline for the evolution of language subcomponents

Alejandro M. Andirkó, Juan Moriano and Cedric Boeckx

The evolutionary history of the faculty of language is complex and multifaceted,
parallel to the muddled history of our species [Bergstrom et al., 2021].
The multiple subcomponents that make language possible seem to have originated
at distinct points in evolutionary history, and often build upon brain
networks and structures that precede the emergence of our species [Dediu and
Levinson, 2013]. For this reason, large-scale estimates of the time of emergence
of genetic variants are essential for giving precise answers about the emergence of
traits that enable the faculty of language. Assigning a temporal dimension to
genetic variation in humans would open up the possibility of predicting when
changes affecting the language subcomponents emerged. This temporal
aspect is a crucial question to clarify in language evolution, as H. sapiens has
species-specific changes in brain regions related to the language faculty (such as
the parietal lobes and the cerebellum) [Pereira-Pedro et al., 2020].
Until recently, estimating the age of variants was heavily dependent on making
assumptions on demographic models that are quickly revised by new findings
in population genetics and paleogenomics. The appearance of a non-parametric,
genome-wide open repository of genetic variant age estimation, in combination
with high quality paleogenomic data, offers us the opportunity of a temporal
evaluation of sets of variants related to recent changes in brain morphology
likely impacting language.
Here, we introduce an analysis workflow from genetic variation to changes
in gene expression in specific brain regions. First, we selected high-frequency
Homo sapiens-specific variants, variants under positive selection since the
divergence from extinct human species, introgression data, and enhancer and
promoter data from brain genomics datasets. We then crossed these data with the
database of genetic variant age estimates, showing that high-frequency variants
follow a recurrent bimodal distribution, peaking at circa 50 thousand and 1
million years ago. Then, we applied a machine learning-based prediction of variant
expression in brain regions, as well as time-sensitive gene ontology analysis. We
show that specific enrichments of gene categories accumulate in particular time
windows, which brings into prominence the 300-500 thousand year time slice, a key
moment in our species' history [Hublin et al., 2017]. Additionally, we find evidence for
very early mutations impacting the facial phenotype, and much more recent
molecular events linked to specific brain regions such as the cerebellum or the
precuneus, key areas in the human capacity for language, in consonance with
previous work.
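As an illustration only, the step of crossing a variant set with age estimates and grouping the result into time windows could be sketched as below. This is a minimal sketch, not the pipeline used in this work: the variant identifiers, the ages, the 100-thousand-year window size, and the function names `bin_variant_ages` and `variants_in_window` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy data: variant identifier -> estimated age in years.
variant_ages = {
    "var1": 45_000,
    "var2": 52_000,
    "var3": 310_000,
    "var4": 480_000,
    "var5": 950_000,
    "var6": 1_050_000,
}

# Hypothetical set of high-frequency Homo sapiens-specific variants.
# "var7" has no age estimate and is therefore dropped by the functions below.
high_frequency = {"var1", "var3", "var4", "var6", "var7"}


def bin_variant_ages(selected, ages, window=100_000):
    """Count selected variants per time window, keyed by the window start."""
    counts = Counter()
    for variant_id in selected:
        age = ages.get(variant_id)
        if age is None:
            continue  # no age estimate available for this variant
        counts[(age // window) * window] += 1
    return dict(sorted(counts.items()))


def variants_in_window(selected, ages, start, end):
    """Return the selected variants whose age falls in [start, end)."""
    return sorted(
        v for v in selected if v in ages and start <= ages[v] < end
    )


windows = bin_variant_ages(high_frequency, variant_ages)
slice_300_500k = variants_in_window(high_frequency, variant_ages, 300_000, 500_000)
```

With real data, the per-window counts would be the input to a distribution plot, and the per-window variant lists would be the input to a window-specific gene ontology enrichment test.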

Anders Bergstrom, Chris Stringer, Mateja Hajdinjak, Eleanor M. L. Scerri, and
Pontus Skoglund. Origins of modern human ancestry. Nature, 590(7845):
229–237, February 2021. ISSN 1476-4687. doi: 10.1038/s41586-021-03244-5.
Dan Dediu and Stephen C. Levinson. On the antiquity of language: the reinterpretation
of Neandertal linguistic capacities and its consequences. Frontiers
in Psychology, 4, 2013. ISSN 1664-1078. doi: 10.3389/fpsyg.2013.00397. URL
Ana Sofia Pereira-Pedro, Emiliano Bruner, Philipp Gunz, and Simon Neubauer.
A morphometric comparison of the parietal lobe in modern humans and Neanderthals.
Journal of Human Evolution, 142:102770, 2020. ISSN 0047-2484.
doi: 10.1016/j.jhevol.2020.102770.
Jean-Jacques Hublin, Abdelouahed Ben-Ncer, Shara E. Bailey, Sarah E. Freidline,
Simon Neubauer, Matthew M. Skinner, Inga Bergmann, Adeline
Le Cabec, Stefano Benazzi, Katerina Harvati, and Philipp Gunz. New fossils
from Jebel Irhoud, Morocco and the pan-African origin of Homo sapiens.
Nature, 546(7657):289–292, June 2017. ISSN 1476-4687. doi: 10.1038/nature22336.