New linguistic research reveals the origins of the Sino-Tibetan language family

Phylogenetic analysis of the Sino-Tibetan language family, which includes Chinese, Tibetan and Burmese, suggests that it originated some 7,200 years ago in northern China, in connection with the late Cishan and early Yangshao cultures.

The Sino-Tibetan language family, which includes early literary languages such as Chinese, Tibetan and Burmese, consists of over 400 modern languages spoken in China, India, Burma and Nepal. It is one of the most diverse language families in the world and has 1.4 billion speakers. Although it has been studied since the beginning of the 19th century, knowledge of the origin of these languages remains severely limited. An interdisciplinary study published in PNAS, led by scientists from the Center for Linguistic Research on East Asia (Paris), the Max Planck Institute for the Science of Human History (Jena), and the Center for Research in Decision Mathematics (Paris), sheds new light on the place and time of origin of these languages. Based on a phylogenetic study of 50 ancient and modern Sino-Tibetan languages, the researchers concluded that the Sino-Tibetan languages originated among millet farmers in northern China some 7,200 years ago.

Over the past 10,000 years, two of the world's largest language families have emerged, one in western and one in eastern Eurasia. Together, Indo-European (3.2 billion speakers) and Sino-Tibetan (1.4 billion speakers) account for almost 60% of the world's population. The Sino-Tibetan family comprises some 500 languages spoken across a wide geographic range, from the western Pacific coast to Nepal, India, and Pakistan. Speakers of these languages played an important role in human prehistory, giving rise to early high cultures such as those of China, Tibet, Burma, and Nepal. However, while archaeogeneticists, phylogeneticists and linguists have vigorously discussed the origins of the Indo-European language family, the formation of the Sino-Tibetan languages has received little attention to date.

One of the most diverse language families in the world

"The Sino-Tibetan family is one of the most diverse families in the world. It includes all the different types of morphological systems, from isolating languages such as Chinese, Burmese and Tujia to polysynthetic languages such as the Gyalrongic and Kiranti languages," explains Guillaume Jacques of the Center for Linguistic Research on East Asia, a co-author of the study. "Although our understanding of how to compare these languages linguistically is improving, important aspects of the development of their phonological systems and grammars remain poorly understood."

A database of core words in 50 Sino-Tibetan languages

In order to shed light on the complex history of these languages, the researchers compiled a lexical database containing the core vocabulary of 50 Sino-Tibetan languages. This database, now published for the first time, includes ancient languages spoken 1,000 or more years ago, such as Old Chinese, Old Burmese and Old Tibetan, as well as modern languages documented through fieldwork.

"To compare these languages in a transparent manner, we have developed a specific annotation framework that allows us not only to mark which words we identify as sharing a common origin, but also which sounds in these words we believe are related," says Johann-Mattis List of the Max Planck Institute for the Science of Human History, who led the study. "A particular problem in identifying the truly related words was the numerous cases in which languages borrowed words from each other," Jacques says. "Fortunately, we know the history of certain languages well, and we can rely on techniques we developed earlier to reveal the actual history hidden by these borrowings."

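As a rough illustration of what such an annotation involves, the Python sketch below stores each word with a cognate identifier (which words share an origin) and an alignment (which sounds correspond). It is a minimal sketch under stated assumptions, not the study's actual framework: the `Entry` fields, the function names, and all language labels and word forms are invented placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    language: str     # placeholder language label (hypothetical)
    concept: str      # meaning slot in the core vocabulary list
    cognate_id: int   # words with the same id are judged to share an origin
    alignment: tuple  # sound segments, padded with "-" so related sounds line up


# Invented placeholder data, NOT taken from the study's database.
ENTRIES = [
    Entry("lang_A", "CONCEPT_1", 1, ("p", "a", "t")),
    Entry("lang_B", "CONCEPT_1", 1, ("b", "a", "-")),
    Entry("lang_C", "CONCEPT_1", 2, ("k", "u", "n")),
    Entry("lang_A", "CONCEPT_2", 3, ("m", "i")),
    Entry("lang_B", "CONCEPT_2", 3, ("m", "e")),
    Entry("lang_C", "CONCEPT_2", 3, ("m", "i")),
]


def cognate_sets(entries):
    """Group entries by cognate id."""
    sets = defaultdict(list)
    for entry in entries:
        sets[entry.cognate_id].append(entry)
    return dict(sets)


def sound_correspondences(entries):
    """For each cognate set, list the columns of sounds marked as related."""
    result = {}
    for cid, members in cognate_sets(entries).items():
        members = sorted(members, key=lambda e: e.language)
        result[cid] = list(zip(*(m.alignment for m in members)))
    return result


def presence_matrix(entries):
    """Binary matrix: 1 if a language has a word in a cognate set, else 0."""
    languages = sorted({e.language for e in entries})
    ids = sorted({e.cognate_id for e in entries})
    have = {(e.language, e.cognate_id) for e in entries}
    return {lang: [int((lang, cid) in have) for cid in ids] for lang in languages}
```

A presence/absence matrix of this kind is a common way to hand cognate judgments to phylogenetic software; whether the study encoded its data exactly this way is not stated in this article.
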
Evolutionary trees suggest that the family originated about 7,200 years ago

Using powerful methods of computational phylogenetics, the team inferred the most likely relationships between these languages and then estimated when the family might have originated. "We found clear evidence for seven major subgroups, with a complex pattern of overlapping signals below this level," says Simon J. Greenhill of the Max Planck Institute for the Science of Human History. "Our estimates suggest that the ancestral language arose about 7,200 years ago."
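
To give a feel for how shared vocabulary can be turned into a family tree, the Python sketch below clusters languages by how many cognate sets they share. This article does not specify the team's phylogenetic methods; the sketch swaps in UPGMA (average-linkage clustering), a far simpler distance-based technique that yields only a grouping and no dates. The language labels and the 0/1 vectors are invented placeholders, not data from the study.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Share of cognate sets in which two languages differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(vectors):
    """Repeatedly merge the two closest clusters; return a nested tuple tree.

    The distance between two clusters is the mean distance over all pairs
    of member languages (average linkage).
    """
    # Each key is a (sub)tree; each value lists the languages it contains.
    clusters = {name: [name] for name in vectors}

    def average_distance(c1, c2):
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        total = sum(hamming_distance(vectors[a], vectors[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members
    return next(iter(clusters))


# Invented presence/absence vectors (1 = language has a word in that cognate set).
VECTORS = {
    "lang_A": [1, 0, 1, 1, 0],
    "lang_B": [1, 0, 1, 0, 0],
    "lang_C": [0, 1, 1, 0, 1],
    "lang_D": [0, 1, 0, 0, 1],
}

tree = upgma(VECTORS)
print(tree)  # (('lang_A', 'lang_B'), ('lang_C', 'lang_D'))
```

In this toy example the first split separates two pairs of languages. Estimating when such splits happened requires additional modeling and calibration, for instance from languages attested at known dates, which this sketch does not attempt.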

An analysis of agricultural vocabulary reveals the most likely scenario for the origin and expansion of the language family

Seeking to untangle the complex paths of the evolution of the Sino-Tibetan languages, the authors analyzed words related to domestication, since these could reveal how agricultural knowledge spread throughout the region. This agricultural analysis suggested that the Sino-Tibetan family originated in northern Chinese communities of millet farmers belonging to the late Cishan and early Yangshao cultures. "The most likely scenario of language expansion involves an initial separation between an eastern group, from which Chinese dialects evolved, and a western group, which is the ancestor of the rest of the Sino-Tibetan languages," summarizes Laurent Sagart of the Center for Linguistic Research on East Asia, a co-author of the study who was responsible for the agricultural analysis.

"We are very excited about our findings," says List. "Our approach combines robust traditional scholarship with cutting-edge computational methods in a computer-assisted framework that allows us to use our knowledge of current languages as a key to their past."