How the Duolingo Owl Decides What Notification To Send

Daily practice is essential for language learning, so Duolingo helps learners stay on track by sending daily practice reminders. In fact, Duo's persistence is so well known that it's even become a popular internet meme. And let's be honest, most of us have probably swiped away one of these notifications...and probably felt a bit guilty in the process.

But have you ever wondered how Duo decides what message to send? Last year, Duolingo’s Machine Learning Engineers built a really neat AI system to find the best reminder to send each learner each day! We recently published this novel algorithm in a paper and short presentation at the Knowledge Discovery and Data Mining (KDD) Conference 2020. In this post, we’ll take a peek at the AI behind these notorious notifications.

In a nutshell

We use a pool of pre-written notifications for our practice reminders, and we personalize them based on factors such as the language you're studying and your current streak. We periodically update the pool to keep things fresh and engaging. Since one of Duolingo’s operating principles is “Test everything,” we always run experiments to test new notifications on a small number of learners before using them across the board. This way, only the best templates get permanently added to the pool.

However, practice reminders used to be selected from the pool at random. We wondered if we could help more learners stay motivated by making the selection smarter. What if AI could find the best notification to send to each learner each day? So, last year some of our Machine Learning Engineers set out to build a custom AI system to do just that.

Bandit algorithms

To better understand how learners respond to the variety of notifications, we started experimenting with bandit algorithms. Bandit algorithms are a form of AI where an algorithm must repeatedly choose between the same set of options, and it gradually learns from past decisions which options are best. In our case, that means learning which of our notifications are most likely to get a learner to practice their language.

To understand how bandits work, imagine that you’re brought to a room full of slot machines. You’re given a bag full of tokens that you can use to play the slots, and some machines pay out more than others. To maximize your payout, you’d start by experimenting with many different machines, keeping track of how often each pays out. Over time, you’d start to get a sense of which machines pay out the most often and start playing those machines more.
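
The slot-machine strategy can be sketched in a few lines of Python. This is a minimal illustration using epsilon-greedy, a common textbook bandit strategy, and not necessarily the method Duolingo uses. The function name `play_slots`, the `epsilon` exploration rate, and the simulated payout probabilities are all assumptions made for the example.

```python
import random


def play_slots(payout_probs, tokens, epsilon=0.1, seed=0):
    """Play simulated slot machines with an epsilon-greedy strategy.

    payout_probs holds each machine's true payout probability. The player
    never reads it directly; it is only used to simulate each pull.
    """
    rng = random.Random(seed)
    pulls = [0] * len(payout_probs)  # how often each machine was played
    wins = [0] * len(payout_probs)   # how often each machine paid out

    for _ in range(tokens):
        if sum(pulls) == 0 or rng.random() < epsilon:
            # Explore: try a machine at random.
            arm = rng.randrange(len(payout_probs))
        else:
            # Exploit: play the machine with the best observed payout rate.
            rates = [w / p if p else 0.0 for w, p in zip(wins, pulls)]
            arm = max(range(len(rates)), key=rates.__getitem__)
        pulls[arm] += 1
        if rng.random() < payout_probs[arm]:
            wins[arm] += 1
    return pulls, wins


# Three hypothetical machines; the last one pays out most often.
pulls, wins = play_slots([0.1, 0.5, 0.9], tokens=5000)
```

With enough tokens, most pulls end up on the machine with the highest payout rate, while the small exploration rate keeps the other estimates from going stale.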

Our bandit uses a similar strategy, but instead of slot machines it chooses notifications and its “payout” is getting a learner to complete a lesson. Essentially, it works as follows:

[Figure: the steps the bandit follows to choose a notification]

Data science: How do we figure out which templates are best?

However, to make bandits work for notifications, we had to overcome a number of novel data science problems. We started by collecting data: the results of ~200 million practice reminders sent over a 34-day period. We used these to analyze which notifications were most likely to engage learners.

Our goal was to score each notification based upon how many learners completed a lesson after receiving it. However, one of the unique challenges we had to overcome was that different notifications are designed for different audiences. For example, some notifications only make sense if the user has a streak wager, or can only be sent on Mondays. And many learners will complete a lesson no matter what notification we send (especially those with crazy streaks!), which gives notifications designed for those audiences an unfair advantage. To score templates fairly, we devised a new way to compare each notification only to other notifications sent to the same type of learner.
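
To illustrate why this kind of comparison matters, here is a minimal sketch of one way to score templates relative to their audience. It is an assumption for illustration, not the scoring method from the paper: each template's completion rate is compared to the overall completion rate of the learner segment it was sent to. The segment names, template names, and the `relative_scores` function are hypothetical.

```python
from collections import defaultdict


def relative_scores(records):
    """Score templates against the baseline of their own audience.

    records is an iterable of (segment, template, completed) tuples.
    Returns {(segment, template): template completion rate minus the
    overall completion rate of that segment}.
    """
    seg_totals = defaultdict(lambda: [0, 0])   # segment -> [sent, completed]
    tmpl_totals = defaultdict(lambda: [0, 0])  # (segment, template) -> [sent, completed]
    for segment, template, completed in records:
        for counter in (seg_totals[segment], tmpl_totals[(segment, template)]):
            counter[0] += 1
            counter[1] += int(completed)

    scores = {}
    for (segment, template), (sent, done) in tmpl_totals.items():
        seg_sent, seg_done = seg_totals[segment]
        scores[(segment, template)] = done / sent - seg_done / seg_sent
    return scores
```

For example, a template with an 80% completion rate among long-streak learners can still score below a template with a 30% rate among new learners, if the first is below its segment's average and the second is above.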

After analyzing the notifications this way, we learned that not only did some notifications work much better than others, but that this varied from one language to another. For instance, the “Time for [language]” notification works very well for Chinese learners, but it's usually not the best option for English learners. These differences mean that we can get better engagement if we tailor the notification selection to each language.

But before we could integrate these findings into a usable bandit algorithm, we also had to overcome the “novelty effect.” We hypothesized that brand-new notifications, ones a learner had never seen before, would be especially convincing, but that eventually the novelty would wear off and the learner would need a different kind of notification to be persuaded to keep practicing. We were able to confirm this by analyzing the data from the ~200 million practice reminders we mentioned earlier. So, for best performance, we had to ensure that the same notification was not used too often.

However, that runs contrary to how conventional bandit algorithms work: they find the best option and then reuse it again and again. To correct this, we had to explicitly teach the AI algorithm that learners don’t like seeing the same notification too often by demoting reminders that have already been seen recently. We decided how much to space apart the repetitions of a notification using the same forgetting curve that we use to measure word learning!
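
The demotion idea can be sketched as follows, assuming the forgetting curve has an exponential half-life shape. The half-life of 15 days, the maximum penalty of 0.02, and the function names are hypothetical values chosen for illustration, not Duolingo's actual parameters.

```python
def recency_penalty(days_since_sent, half_life_days=15.0, max_penalty=0.02):
    """Penalty that halves every half_life_days since the template was last sent.

    days_since_sent is None when the learner has never seen the template,
    in which case there is no penalty.
    """
    if days_since_sent is None:
        return 0.0
    return max_penalty * 0.5 ** (days_since_sent / half_life_days)


def choose_template(base_scores, days_since_sent):
    """Pick the template with the best score after demoting recent ones."""
    adjusted = {
        template: score - recency_penalty(days_since_sent.get(template))
        for template, score in base_scores.items()
    }
    best = max(adjusted, key=adjusted.get)
    return best, adjusted


# Template "A" scores best overall, but it was sent today, so "B" wins.
best, adjusted = choose_template({"A": 0.05, "B": 0.04}, {"A": 0})
```

Because the penalty decays over time, a strong template is only set aside temporarily: once enough days have passed, it becomes the top choice again.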

Engineering: Make it fast, make it big!

We had to make the bandit fast so that it could handle the millions of practice reminders sent to learners each day. The algorithm also needed to handle a lot of data: the system produces tens of millions of records per week that need to be analyzed. To handle all this data, we use big data tools like AWS Kinesis Firehose and Spark.

Conclusion

When we tested our bandit algorithm in the real world, we could tell within a matter of weeks that more learners were completing lessons more frequently. It was especially successful at helping tens of thousands of new learners return to their lessons, which matters because developing good study habits is one of the toughest parts of language learning! What’s more, we can use the insights our AI has learned to design better notifications in the future and boost learner motivation even more! Stay tuned for more information this fall about how we approach writing all our notifications.

If you enjoy working on big problems like these, consider joining us! Check out all the Software Engineering, Data Science, and Machine Learning jobs posted on our careers page.
