Duolingo is famous for silly sentences, like this one in Spanish: A mi caballo le gusta la tele (My horse likes TV). You might be thinking...
“Surely these must have been generated by a computer algorithm. No human expert would come up with those to teach a language!”
So, are they generated by AI? Or is there more happening behind the scenes to create the delightful and effective learning experience Duolingo is known for? Read on to find out!
The Duolingo way: human experts + AI
At Duolingo, we always look for optimal solutions. To create a high-quality learning experience, we’ve learned that the best thing to do is to combine human expertise with smart AI in a way that leverages the strengths of each.
Our course creation process can be divided into four stages, and each stage involves some combination of humans and AI. The later stages, which require scale and personalization, naturally rely more on AI, while the earlier stages involve more of our in-house learning experts!
Let’s explore each stage!
Stage 1: Curriculum design
Curriculum design is the first stage of course creation, and this is where human experts excel. At Duolingo, experienced curriculum designers carefully plan what to teach and when for any given course. They design the high-level structure of the course, specifying the order of learning objectives so that the course follows the CEFR standard but is also customized for the specific language background of our learners. They also pick compelling real-life scenarios that can be used in the course to illustrate each learning objective. Finally, they decide on the optimal way of distributing the words, phrases, and grammatical concepts across lessons so that learners aren’t overwhelmed by too many things at once and instead can gradually build on what they have previously learned. (Read more about this process here!)
What does this mean for our example above, A mi caballo le gusta la tele (My horse likes TV)? Well, at this early stage of course creation, there are no specific sentences yet, but curriculum designers have a plan for when to teach the words for “horse”, “TV”, and “to like”. That last one, “to like”, is tricky for English speakers learning Spanish, because Spanish sentences about likes and dislikes are structured differently from English ones. Instead of saying “I like TV”, you’d say something like “TV is pleasing to me”. When designing a Spanish course for English speakers, curriculum designers would plan to introduce this grammatical concept a bit later in the course and allow plenty of lessons to teach it.
All this to say, a curriculum designer is looking at the individual concepts, vocabulary, and structures that make up a sentence, but they’re not stringing them together… yet.
Stage 2: Raw content creation
The second stage of course creation is building the “raw” content for each lesson, which later serves as the pool of available material from which specific exercises are created. This part is done by human experts whose teaching experience and creative skills are essential for building our content. But AI provides some critical support for these experts to do their jobs efficiently.
For each lesson in the course sequence, Duolingo content developers write “raw” content that fits under the learning objective specified in the course plan, like talking about your hobbies. This includes everything from sentences to paragraphs to mini-dialogues that are common in day-to-day communication and illustrate the new words and concepts well. There’s also a need for some silliness to make learners laugh and help them stay engaged throughout their learning journey. Finally, we write translations for all the words and sentences so that learners can understand what they mean. But while human experts bring unique strengths to all these tasks, AI is a powerful partner! We build tools powered by AI algorithms to help content developers work faster and with fewer mistakes, focusing their brain power on what they do best. For example, we use AI to help create a range of possible translations of all the sentences so that we can later accept learners’ responses in cases where there are multiple correct ways of saying the same thing.
This is when our example sentence would be written. In a lesson about hobbies, a content developer would write many common sentences like “I like sports” or “My dad likes reading”, but would also include a few silly ones like “My horse likes TV”. The content team would also create translation hints for every word (like caballo = horse), as well as list all the possible ways of expressing that my horse likes TV (like using “television” instead of “TV”). And all of it would be created using AI-powered tooling.
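To picture how a list of accepted answers could work, here is a minimal Python sketch. It is an illustration only, not Duolingo’s actual tooling: the function names (expand_variants, normalize, is_accepted) and the slot lists are hypothetical, and real answer matching likely handles far more than casing and punctuation.

```python
from itertools import product


def expand_variants(slots):
    """Build every sentence that picks exactly one option from each slot."""
    return [" ".join(choice) for choice in product(*slots)]


def normalize(text):
    """Lowercase, drop edge punctuation, and collapse repeated spaces."""
    return " ".join(text.lower().strip(" .!?").split())


def is_accepted(response, accepted):
    """Check a learner's response against the accepted answers."""
    return normalize(response) in {normalize(answer) for answer in accepted}


# Hypothetical slots for the example sentence: only the last slot
# has two interchangeable options ("TV" and "television").
slots = [["My horse"], ["likes"], ["TV", "television"]]
accepted = expand_variants(slots)

print(accepted)
print(is_accepted("my horse likes television.", accepted))
```

Storing alternatives per slot keeps the list short to write by hand while still covering every combination a learner might type.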
Once the sentences, paragraphs, and dialogues are written, it’s time to put them to work in a lesson!
Stage 3: Exercise creation
At this point, we have a course plan with lots of “raw” content available for each lesson. The third stage consists of taking that content and creating a pool of interactive exercises that we can later show to learners in lessons. While for some exercises this process is led by our expert curriculum and content developers, most of the time we use computer algorithms to automatically create exercises from “raw” content.
Let’s revisit that example sentence: A mi caballo le gusta la tele. Here’s just a sampling of the exercises we might create from this sentence using AI:
- Ask learners to fill in the blank to complete just the most important part of a long sentence in Spanish, where we automatically figure out where to put the blank to target the le gusta structure
- Ask learners to put together the sentence in Spanish using a word bank that includes all the words of the actual sentence, plus some automatically generated distractors that include other recently taught words that learners might still be unsure about
- Ask learners to choose which word in a sentence they heard by picking from two recordings automatically determined to sound similar (such as gusta and cuesta)
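The first two exercise types above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithm Duolingo uses: make_cloze and make_word_bank are hypothetical names, the target structure is passed in by hand rather than detected automatically, and distractors are drawn at random from a made-up list of recently taught words.

```python
import random


def make_cloze(sentence, target):
    """Replace the target phrase with a blank; return (prompt, answer)."""
    words = sentence.split()
    target_words = [t.lower() for t in target.split()]
    n = len(target_words)
    for i in range(len(words) - n + 1):
        if [w.lower() for w in words[i:i + n]] == target_words:
            prompt = words[:i] + ["____"] + words[i + n:]
            return " ".join(prompt), " ".join(words[i:i + n])
    raise ValueError("target phrase not found in sentence")


def make_word_bank(sentence, recent_words, n_distractors=3, seed=0):
    """Return the sentence's words plus distractors, in shuffled order."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    words = sentence.split()
    in_sentence = {w.lower() for w in words}
    # A distractor must not already be part of the correct answer.
    candidates = [w for w in recent_words if w.lower() not in in_sentence]
    distractors = rng.sample(candidates, min(n_distractors, len(candidates)))
    bank = words + distractors
    rng.shuffle(bank)
    return bank


sentence = "A mi caballo le gusta la tele"
# Hypothetical list of recently taught words, invented for this example.
recent = ["perro", "gato", "cuesta", "tele", "libro"]

print(make_cloze(sentence, "le gusta"))
print(make_word_bank(sentence, recent))
```

In this sketch the blank always covers the whole le gusta structure, and a recently taught word that already appears in the sentence (tele) is never reused as a distractor.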
AI is also a critical part of how we grade many exercise types. For example, we may ask learners to say the sentence out loud and use AI to figure out whether they said it correctly. AI also allows us to generate the audio for exercises, and we’ve recently started doing that using our custom Duolingo World character voices in many courses!
But expert humans also have a part to play in exercise creation. Some types of exercises are difficult to get right by AI alone, so humans take the lead. One example is when we ask learners to read or listen to a paragraph and then ask them a question about it—our content developers write those questions themselves to make sure we ask about something that aligns with the lesson’s learning objective and that requires understanding the key parts of the text.
Stage 4: Lesson personalization
The final stage involves assembling personalized lessons that learners see when they use Duolingo. As explained above, Duolingo courses have a predetermined course structure, with a specific sequence of lessons and a pool of exercises available for each of those lessons. But each lesson that a learner sees is unique: We take the pool of available exercises for that lesson and use AI to figure out which of those exercises to show to a given learner at a particular time so that the experience is personalized to that learner’s specific needs. This is where AI shines!
Separate AI models work together to create a customized learning experience. In most cases, we use our Birdbrain model to figure out which exercises in a particular lesson are going to be the best match for a learner’s level of knowledge. For example, if a learner struggles with how to say they like something in Spanish, our algorithm might serve them an exercise that focuses specifically on gustar (to like). With our example sentence, A mi caballo le gusta la tele (My horse likes TV), this might mean serving the learner an exercise that asks them to select the proper form of gustar to complete the sentence.
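To make “best match for a learner’s level of knowledge” a little more concrete, here is a toy Python sketch. It is not Birdbrain: the logistic formula, the ability and difficulty numbers, the 0.8 target, and the names p_correct and pick_exercise are all assumptions made for illustration. The idea is simply to estimate how likely a learner is to answer each exercise correctly and pick the one closest to a chosen success rate.

```python
import math


def p_correct(ability, difficulty):
    """Toy estimate of the chance that a learner answers correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pick_exercise(ability, exercises, target=0.8):
    """Pick the exercise whose predicted success rate is closest to target."""
    return min(
        exercises,
        key=lambda ex: abs(p_correct(ability, ex["difficulty"]) - target),
    )


# Hypothetical exercise pool for one lesson; difficulty values are invented.
pool = [
    {"name": "match caballo to horse", "difficulty": -3.0},
    {"name": "choose the form of gustar", "difficulty": -1.4},
    {"name": "translate the full sentence", "difficulty": 1.0},
]

chosen = pick_exercise(ability=0.0, exercises=pool)
print(chosen["name"])
```

With these made-up numbers, the matching exercise is predicted to be too easy and the full translation too hard, so the gustar exercise is the one selected.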
We also have a model that predicts when it’s time to practice previously learned words, and we use it to add exercises targeting those words to a lesson. And all of this is combined with a desire to keep learners motivated by showing them a mix of different exercise types, and a variety of sentences and language material.
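As a rough picture of review scheduling, the Python sketch below marks a word as due once enough days have passed since it was last practiced. The fixed per-word intervals, the day numbers, and the name words_due are hypothetical; the actual model is not described here and is presumably learned from data rather than set by hand.

```python
def words_due(history, today):
    """Return words whose review interval has fully elapsed, sorted by name.

    history maps each word to (last_practiced_day, interval_in_days).
    """
    return sorted(
        word
        for word, (last_day, interval) in history.items()
        if today - last_day >= interval
    )


# Hypothetical practice history, invented for this example.
history = {
    "caballo": (1, 2),  # practiced on day 1, review every 2 days
    "tele": (9, 4),     # practiced on day 9, review every 4 days
    "gustar": (5, 5),   # practiced on day 5, review every 5 days
}

print(words_due(history, today=10))
```

A lesson builder could then pull one exercise per due word from the exercise pool and mix it in with the new material.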
The result of all four course creation stages is a personalized Duolingo lesson that (1) teaches you a carefully chosen set of concepts, (2) uses rich, communicatively useful, and sometimes silly language content, (3) is converted into interactive exercises that target what’s most important about what you’re learning, and (4) adapts to meet you where you are.
So, as you can see, there’s a lot going on behind the scenes where human experts and AI collaborate to create the Duolingo experience, including our famous silly sentences! Keep practicing so you can check out all of this work in action!