Linguistics by Dmitry Nikolaev
Language (forthcoming), 2024
This study provides two mathematical formalizations of borrowability. These operationalizations allow us to quantitatively evaluate the borrowability of phonological segments and to make predictions about the likelihood that speech sounds will be borrowed in language contact situations. Our approach departs from traditional borrowability hierarchies based on qualitative observations and instead provides empirically motivated models based on probability theory and statistics. Our study uses as input two large cross-linguistic segment inventory databases and our results show that segments have markedly different borrowability profiles, highlighting their different diffusion patterns through space and time.
Routledge Handbook of Eurolinguistics (forthcoming), 2024

Linguistic Typology, 2023
Nikolaev and Grossman (2020) recently showed that it is possible to provide a fine-grained typological analysis of consonant inventories of the world's languages by investigating co-occurrence classes of segments, i.e. groups of segments that tend to be found together in inventories. Nikolaev and Grossman argued that the structure of many such co-occurrence classes contradicts the Feature-Economy Principle. As a side product of this analysis, a new definition of the Basic Consonant Inventory (BCI), a cluster of segments forming the bedrock of consonantal inventories of the world's languages, was provided. This paper replicates the co-occurrence study in an arguably more robust way. In addition to making a methodological contribution, it shows that some of the co-occurrence classes defined by Nikolaev and Grossman, including the BCI, are not statistically stable and may be an artefact of the imbalance in the language sample used for the analysis. The authors' findings regarding the Feature-Economy Principle, however, were corroborated.
Linguistic Typology, 2021
This article provides a new precise algorithmic definition of the notion "phonological-inventory gap". On the basis of this definition, I propose a method for identifying gaps, provide descriptive data on several types of consonant-inventory gaps in the world's languages, and investigate the relationships between gaps and inventory size, processes of sound change, and phonological segment borrowing.

International Review of Social Psychology, 2021
The last decade saw rapid growth of the body of work devoted to relations between social thermoregulation and various other domains, with a particular focus on the connection between prosociality and physical warmth. This paper reports on a first systematic cross-linguistic study of the exponents of the conceptual metaphor AFFECTION IS WARMTH (Lakoff & Johnson, 1980; Grady, 1997), which provides the motivation for a large share of research in this area. Assumed to be universal, it enables researchers, mostly speakers of major European languages, to treat words like warm and cold as self-evident and easily translatable between languages, both in their concrete uses (to feel warm/cold) and as applied to interpersonal relationships (a cold/warm person, warm feelings, etc.). Based on a sample of 94 languages from all around the world and using methodology borrowed from typological linguistics and mixed-effects regression modelling, we show that the relevant expressions show a remarkably skewed distribution and seem to be absent or extremely marginal in the majority of language families and linguistic macro-areas. The study demonstrates once again the dramatic influence of the Anglocentric, Standard Average European, and WEIRD perspectives on many of the central concepts and conclusions in linguistics, psychology, and cognitive research and discusses how changing this perspective can impact research in social psychology in general and in social thermoregulation in particular.
Phonology, 2020
The feature-economy principle is one of the key theoretical notions which have been postulated to account for the structure of phoneme inventories in the world's languages. In this paper, we test the explanatory power of this principle by conducting a study of the co-occurrence of consonant segments in phonological inventories, based on a sample of 2761 languages. We show that the feature-economy principle is able to account for many important patterns in the structure of the world's phonological inventories; however, there are particular classes of sounds, such as what we term the ‘basic consonant inventory’ (the core cluster of segments found in the majority of the world's languages), as well as several more peripheral clusters whose organisation follows different principles.

Language Dynamics and Change, 2019
This paper discusses the impact of linguistic contact on the make-up of consonantal inventories of the languages of Eurasia. New measures for studying the importance of language contact for the development of phonological inventories are proposed, and two empirical studies are reported. First, using two different measures of dissimilarity of phonemic inventories (the Jaccard dissimilarity measure and the novel Closest-Relative Cumulative Jaccard Dissimilarity measure), it is demonstrated that language contact (operationalized as languages being connected by an edge in a neighbor network) makes a significant contribution to between-inventory differences when phylogenetic variables are controlled for. Second, a novel measure of the exposure of a language to a particular segment, the Neighbor-Pressure Metric (NPM), is proposed as a means of quantifying language contact with respect to phonological inventories. It is shown that adding NPM helps achieve higher prediction accuracy than using bare phylogenetic data and that the distributions of different consonants depend on language-contact processes to different degrees. Finally, more complex models for predicting consonant inventories are briefly explored, demonstrating the presence of complex non-linear relationships between inventories of neighboring languages.
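The Jaccard dissimilarity mentioned in this abstract is a standard set measure, and a minimal sketch may help readers unfamiliar with it. The function name and the two toy inventories below are illustrative assumptions, not data or code from the paper, and the paper's Closest-Relative Cumulative variant and the NPM are not reproduced here because the abstract does not define them.

```python
def jaccard_dissimilarity(inv_a, inv_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two segment inventories."""
    a, b = set(inv_a), set(inv_b)
    union = a | b
    if not union:
        # Two empty inventories are treated as identical (an assumption).
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy inventories, not taken from any database.
lang_x = {"p", "t", "k", "m", "n", "s"}
lang_y = {"p", "t", "k", "m", "n", "ts", "q"}

# 5 shared segments out of 8 distinct ones: 1 - 5/8
print(jaccard_dissimilarity(lang_x, lang_y))  # 0.375
```

A value of 0 means the two inventories are identical and 1 means they share no segments.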
Studies in Language, 2018
This paper makes a contribution to phonological typology by investigating the distribution of affricate-rich languages in Eurasia. It shows that affricate-rich and affricate-dense languages cluster areally within Eurasia, and have area-specific histories. In particular, the affricate-rich areas of western Eurasia (a 'European' area and a Caucasian area) are not the result of contact-induced sound changes or borrowing, while the two affricate-rich areas of eastern Eurasia (the Hindukush area and the eastern Himalayan area) are the result of contact. Specifically, affricate-rich areas depend on the emergence of retroflex affricates. Moreover, languages outside these affricate-rich areas tend to lose retroflex affricates.
Linguistics Vanguard
It is often assumed that translated texts are easier to process than original ones. However, it has also been shown that translated texts contain evident traces of source-language morphosyntax, which should presumably make them less predictable and harder to process. We test these competing observations by measuring morphosyntactic entropies of original and translated texts in several languages and show that there may exist a categorical distinction between translations made from structurally similar languages (which are more predictable than original texts) and those made from structurally divergent languages (which are often non-idiomatic, involve structural transfer, and therefore are more entropic).
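The abstract does not say how morphosyntactic entropy was computed. As a rough illustration of the general idea only, the sketch below computes plain Shannon entropy (in bits) over a sequence of morphosyntactic tags; the function name and the tag sequences are hypothetical, and the paper's actual measure may differ.

```python
import math
from collections import Counter


def shannon_entropy(tags):
    """Shannon entropy in bits of the empirical distribution of tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    # Sum of -p * log2(p) over the observed tag types.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical tag sequences, not taken from the paper's corpora.
uniform_text = ["NOUN", "VERB", "NOUN", "VERB"]
skewed_text = ["NOUN", "NOUN", "NOUN", "VERB"]

print(shannon_entropy(uniform_text))  # 1.0
print(shannon_entropy(skewed_text))   # lower than 1.0: more predictable
```

Under this simplified measure, a lower value corresponds to a more predictable tag distribution.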
Linguistic Typology, 2020
This paper investigates universal and areal structures in the lexicon as manifested by colexification patterns in the semantic domains of perception and cognition, based on data from both small and large datasets. Using several methods, including weighted semantic maps, formal concept lattices, correlation analysis, and dimensionality reduction, we identify colexification patterns in the domains in question and evaluate the extent to which these patterns are specific to particular areas. This paper contributes to the methodology of investigating areal patterns in the lexicon, and identifies a number of cross-linguistic regularities and of area-specific properties in the structuring of lexicons.
Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020
Phonological segment borrowing is a process through which languages acquire new contrastive speech sounds as the result of borrowing new words from other languages. Despite the fact that phonological segment borrowing is documented in many of the world's languages, to date there has been no large-scale quantitative study of the phenomenon. In this paper, we present SEGBO, a novel cross-linguistic database of borrowed phonological segments. We describe our data aggregation pipeline and the resulting language sample. We also present two short case studies based on the database. The first deals with the impact of large colonial languages on the sound systems of the world's languages; the second deals with universals of borrowing in the domain of rhotic consonants.
Linguistics Vanguard, 2018
The paper presents an overview of The Database of Eurasian Phonological Inventories, a new information resource and analytical tool for research in the field of distributional phonological typology, theoretical phonology, and areal linguistics.
Forthcoming in Voprosy Yazykoznaniya
Voprosy jazykoznanija, 2020
On the sound change *ld > n d in Tibetic, a reply to Xun Gong
Slides for the talk given at ALT 13 in Pavia in September 2019.
The aim of this talk is to provide additional typological data in support of Graham Isaac's hypothesis that the distribution of absolute vs conjunct endings and deuterotonic vs prototonic forms in the Old Irish verb not only was an outcome of some purely formal historical processes like word-accent shift, but also must have had a grammatical meaning which is best understood from the perspective of the sentence information structure.
Slides for the talk given at SLE 52 in Leipzig in August 2019
SLE, 2019
These are our slides from the SLE meeting in Leipzig.
The article is based on the assumption that there existed an early Christian tradition of autobiographical narrative, which included St. Patrick's "Confessio". The assumption is based on similarities in the narrative schemes employed in "Confessio" and other autobiographical texts and text fragments dating from the second to fifth centuries. These include "Confessions" by St. Augustine, "Letter to Donatus" by St. Cyprian of Carthage, "On the Trinity" by Hilary of Poitiers and "Dialogue with Trypho" by St. Justin Martyr. These are the earliest known autobiographical texts in Western European literature. In addition, the main unifying feature of all these texts is the bipartite structure, where the first part is the autobiographical narrative proper while the second is a theological treatise. This scheme evidently can be traced back to some autobiographical notions in the letters of St. Paul: the authors use their own life story as an argument to prove the verity of the sacral truth that they set out. Patrick, who was probably familiar with the writings of Cyprian and Augustine, also uses this compositional scheme. But the thrust of his argument is not in proving the truth of his words concerning God and religion but in asserting his right to speak of it, to preach the Gospel to pagans in spite of the sins in his own past.
https://www.indiana.edu/~iucweb/egyptology/
http://www.ddl.cnrs.fr/colloques/NCW2019/
http://www.ddl.cnrs.fr/colloques/NCW2019/pageweb/pdf/NCW2019_Abstract_book.pdf