Welcome to another week of Dear Duolingo, an advice column just for learners. Catch up on past installments here.
Hi! I’m Olivia Thayer, and I’m thrilled to be taking over this week’s Dear Duolingo column! I have a master’s degree in historical linguistics, which means I LOVE talking about the similarities and differences between languages around the world, what makes them unique, and how they’ve changed over time. So this week, I’ll be taking you on a world tour of language families 🌍
Our question this week:

What is a language family?
A language family is a group of languages (or, in some cases, a larger group encompassing multiple language families!) that are generally accepted to have come from the same proto-language. A proto-language is an older language from which these languages descended, changing over time. That change is often driven by things like migration and colonization, which can split one linguistic group into several, or by contact between two or more linguistic groups that hadn’t previously lived in the same place.
But how can we know for sure which languages are related? The short answer is: We can’t! The most helpful tool for working out which languages might be related is knowing what older forms of each language looked like, but not all languages kept written records, and not all written records have stood the test of time. Also, languages don’t start out with writing systems; they start out spoken or signed. When a language isn’t well documented through writing, it’s hard to know what words, grammar, or pronunciations it had a few centuries ago, so we can’t always say with certainty which languages belong on the same family tree.
This is especially true of creoles—because of the unique ways these languages arise and evolve, they don’t fit neatly into the imperfect system we’ve created (but they can still teach us a lot about the possibilities of language!). Signed languages can also be related to each other, with their own families, and there are also cases of them arising as brand-new languages!
So, as you’ll see below, what is or isn’t a language family is not always clear or universally agreed on. This is partly because linguists are almost always working with limited historical data like written records, and partly because of the question of what counts as a language vs. a dialect (spoiler: no one knows!). And even if everyone agreed on what counts as a language family, it would be impossible to fit all (or even most) of the world’s language families into one post: there are hundreds of families, and they’re all interesting! 🤓
Without further ado, here are 22 of the world’s language families, along with where they’re most frequently spoken and the estimated number of languages in each family today (there may have been many more, or fewer, in the past!).
Tupian
Quechuan
Aymaran
Arawakan
Oto-Manguean
Uto-Aztecan
Mayan
Indo-European
Niger-Congo
Afro-Asiatic
Austronesian
Khoe-Kwadi
Turkic
Kartvelian
Mongolic
Sino-Tibetan
Dravidian
Japonic
Kra-Dai
Koreanic
Trans-New Guinea
Pama-Nyungan
Some of the world’s major language families
Tupian
Spoken mainly in: Brazil, Bolivia, Paraguay, and Uruguay
Approximate number of languages: 70
Did you know? Guaraní, a Tupian language, is one of the two official languages of Paraguay alongside Spanish, and Bolivia recognizes Guaraní, Guarasu’we, Guarayu, Sirionó, Tapiete, and Yuki—all Tupian languages!—as six of its thirty-nine official languages. English has Tupian languages to thank for the words jaguar, petunia, and tapioca!
Quechuan
Spoken mainly in: Peru, Ecuador, Bolivia, Colombia, Argentina, and Chile
Approximate number of languages: 40
Did you know? Many Quechuan languages distinguish between what linguists call inclusive “we” and exclusive “we.” This means that there are two different ways to say “we”: one if you’re including the person you’re speaking to (“you and I”), and one if you’re not (“Raúl and I”). In other words, if someone told you We’re going hiking in the Andes, you would know whether to pack your bags or expect a postcard!
Aymaran
Spoken mainly in: Bolivia, Peru, Chile, and Argentina
Approximate number of languages: 3
Did you know? Aymaran and Quechuan languages share so much vocabulary that some linguists group them together into one language family. However, since they are the most widespread indigenous language families in the Andes, others believe that the similarities between them are due to centuries of contact, resulting in lots of words and even grammar borrowed from one language to another.
Arawakan
Spoken mainly in: The Amazon region of South America, including French Guiana, Suriname, Guyana, Venezuela, Colombia, Peru, Brazil, and Bolivia
Approximate number of languages: 40
Did you know? Arawakan languages cover a huge amount of ground, and some linguists hypothesize that Arawakan languages diversified and spread throughout South America as a lingua franca for trade before European colonization. The English words hurricane, barbecue, and tobacco are all thought to come from the formerly widespread (but now extinct) Arawakan language, Taíno.
Oto-Manguean
Spoken mainly in: Mexico, especially in the state of Oaxaca
Approximate number of languages: 180
Did you know? All languages in the Oto-Manguean family are tonal! This means that, like in Mandarin Chinese, the pitch of your voice—whether high, low, rising, falling, steady, etc.—is just as important in determining the meaning of a word as the individual consonants and vowels that make up the word.
Uto-Aztecan
Spoken mainly in: the United States and Mexico
Approximate number of languages: 60
Did you know? You’ll notice that many names for language families consist of two words separated by a hyphen—they’re often named after the two most distant members of the family! The Uto-Aztecan family was named for the Ute languages (where the U.S. state of Utah gets its name) and the Aztecan, or Nahuatl, languages spoken in Mexico. Most Uto-Aztecan languages belong to the Nahuatl family, and these languages were spoken by the Aztecs who founded the city of Tenochtitlan, present-day Mexico City.
Mayan
Spoken mainly in: Mexico, Guatemala, Belize, Honduras, and El Salvador
Approximate number of languages: 30
Did you know? Today, you’re most likely to see Mayan languages written in the Latin alphabet (the same one we use in English). But until European colonization, Mayan languages were written with a logosyllabic script, in which small, complex glyphs represented whole words or individual syllables. Linguists are still working on deciphering this writing system, which may have been the very first writing system to exist in Mesoamerica!
Indo-European
Spoken mainly in: the world! The historical range of Indo-European languages is thought to be from northern India to western Europe—including the British Isles and Iceland—but today there are millions of speakers of languages from this family on every continent.
Approximate number of languages: 400
Did you know? Indo-European is called a language family, but the smaller branches that constitute it are often called language families, too! These branches include Indic (such as Hindi and Bengali), Iranian (which includes Farsi, Pashto, and Kurdish), Germanic (including English, German, and Dutch), and Romance (such as Spanish, French, and Portuguese).
Niger-Congo
Spoken mainly in: Africa, from West Africa to the Horn of Africa in the east and south to South Africa
Approximate number of languages: 1,400
Did you know? This is one of the world’s largest language families (sometimes called a language phylum because of its size) in terms of number of speakers, number of languages, and geographical reach. The Niger-Congo languages are further divided into Bantu languages (including Swahili and Zulu, both taught on Duolingo!) and non-Bantu languages (including Yoruba and Igbo, spoken in and around Nigeria).
Afro-Asiatic
Spoken mainly in: North Africa, East Africa, and Southwest & Central Asia as far east as Uzbekistan
Approximate number of languages: 400
Did you know? This large language family is estimated to be the fourth-largest in the world by number of speakers, after Indo-European, Sino-Tibetan, and Niger-Congo. It is further divided into the Berber, Chadic, Cushitic, Egyptian, Omotic, and Semitic families. It’s also a great example of why it’s so difficult to get a perfectly accurate count of how many languages each language family has: Arabic (a Semitic language) has dozens of dialects, and even though they’re all related, speakers of one variety may sometimes find it challenging to understand speakers of another!
Austronesian
Spoken mainly in: Southeast Asia and the islands of the Pacific and Indian Oceans, from Madagascar to New Zealand
Approximate number of languages: 1,200
Did you know? The Austronesian and Niger-Congo language families each contain roughly one-fifth of the world’s languages! Nearly all of the languages in this family belong to the Malayo-Polynesian subgroup, which includes Malagasy (spoken in Madagascar), Indonesian, Javanese, Tagalog (spoken in the Philippines), Fijian, Hawaiian, and Māori (spoken in New Zealand).
Khoe-Kwadi
Spoken mainly in: southern Africa
Approximate number of languages: 10
Did you know? You may have heard of languages with click consonants even if you’ve never heard of Khoe-Kwadi languages, but this language family is an important one. It may be because of the Khoe-Kwadi languages that neighboring Bantu languages have clicks! In fact, these clicks were so remarkable to linguists—and so rare in other parts of the world—that experts used to believe that nearly every language in southern Africa that wasn’t part of the Niger-Congo family was related… mostly because of the clicks!
Turkic
Spoken mainly in: Eurasia, from Türkiye to Siberia
Approximate number of languages: 40
Did you know? As you might have guessed from its name, the most widely spoken language in the Turkic family is Turkish, but Turkic languages also include Uzbek, Azerbaijani, Kazakh, and Uyghur. Because of their extensive range, Turkic languages have borrowed many words from (and loaned many words to) other language families. For example, Turkish has lots of influence from Arabic, while Yakut (spoken in Siberia) has been heavily influenced by Mongolian.
Kartvelian
Spoken mainly in: Georgia
Approximate number of languages: 4
Did you know? All Kartvelian languages, including Georgian, are written using one of three writing systems, which together are called Georgian scripts. Mkhedruli is the most common of these scripts, but Asomtavruli and Nuskhuri are often seen in historical or liturgical texts. No one knows the origin of these scripts, or if they’re related to another of the world’s writing systems!
Mongolic
Spoken mainly in: Mongolia and parts of Russia, China, and Afghanistan
Approximate number of languages: 10
Did you know? The Mongolic languages (of which the most widely spoken is Mongolian) are sometimes grouped into a controversial proposed language family called Altaic, which also includes Turkic, Japonic, and Koreanic languages, as well as the endangered Tungusic family. The issue is whether or not these languages form a sprachbund (from German, literally meaning “language union”)—a group of unrelated or distantly related languages that share features because of long periods of contact, not because they descended from the same proto-language.
Sino-Tibetan
Spoken mainly in: China, Southeast Asia, and the Himalayas
Approximate number of languages: 500
Did you know? The most widely spoken language in this family, Mandarin Chinese (a Sinitic language), has a long history of written records. Usually, this makes a linguist’s job of reconstructing older forms of languages easier. But Chinese is written with a logographic script in which the writing often doesn’t give you clues as to how a word is pronounced, so it’s hard for historical linguists to know how Chinese sounds have changed over time!
Dravidian
Spoken mainly in: southern India, Sri Lanka, and Pakistan
Approximate number of languages: 70
Did you know? The most widely spoken Dravidian languages are Telugu, Tamil, Kannada, and Malayalam, and all four of these languages have official status in one or more states of India! We got the delicious words mango and curry from Dravidian languages.
Japonic
Spoken mainly in: Japan and the Ryukyu Islands
Approximate number of languages: 12
Did you know? The Japonic languages are divided into Japanese (which includes several Japanese dialects) and Ryukyuan, spoken on the Ryukyu Islands at the southern end of the Japanese archipelago. Japanese and Ryukyuan aren’t mutually intelligible (in other words, speakers of Japanese can’t understand Ryukyuan and vice versa). The Ryukyuan languages themselves are said to form a dialect continuum: not all of them are mutually intelligible with one another, but some that are geographically close to each other are.
Kra-Dai
Spoken mainly in: southern China and Southeast Asia
Approximate number of languages: 95
Did you know? Like the Oto-Manguean languages, all Kra-Dai languages are tonal! This includes Thai and Lao (the most widely spoken languages in Thailand and Laos, respectively), both written in an abugida derived from the Khmer script used in Cambodia.
Koreanic
Spoken mainly in: North Korea, South Korea, and northeastern China
Approximate number of languages: 2
Did you know? Some linguists have suggested that Koreanic and Japonic languages form one language family because of similar grammar structures and sounds in the proto-language of each, but this is more likely due to the languages’ long history of contact with one another. Koreanic languages are written using Hangul, a featural script in which the shapes of the consonant letters were designed to reflect where in the mouth each sound is pronounced.
Trans-New Guinea
Spoken mainly in: Papua New Guinea and Indonesia, mostly on the island of New Guinea
Approximate number of languages: 400
Did you know? Trans-New Guinea is one of the largest language families in terms of number of languages, but most languages within the family are spoken by no more than several thousand people and are not spoken outside of New Guinea, which makes the island one of the most linguistically diverse places on Earth!
Pama-Nyungan
Spoken mainly in: Australia
Approximate number of languages: 300
Did you know? Pama-Nyungan is named after the Paman languages spoken in northeastern Australia, near the Great Barrier Reef, and Nyungar, spoken in the southwest of the country, near Perth. That’s almost all of Australia! Australia’s other language families—and there are about 30 of them—are all spoken in the northern parts of Western Australia, the Northern Territory, and Queensland.
All in the family
This is just the beginning 🌍 With an estimated 7,000 languages spoken around the world, learning about the world’s languages could take as long as learning one of them! Which language (family) are you tackling next?
For more answers to your language and learning questions, get in touch with us by emailing dearduolingo@duolingo.com.