Five years ago, we launched the first high-stakes, digital-first test that could be taken anytime and anywhere in the world, because we saw how technology could radically improve testing for students and institutions. In this five-part series, we take a look back at the enormous amount of research and development that went into reinventing the world of high-stakes testing.
__________________________________________________________

Today, the Duolingo English Test is accepted by over 3,000 universities and programs and has been taken by hundreds of thousands of people in over 207 countries and territories—but it didn’t start out that way! Read on to learn how we went from a hackathon (a collaborative event where employees across all areas of the company work together to develop brand-new ideas and passion projects) to the world’s most accessible English proficiency exam.

English opens doors

All over the world, English proficiency serves as a key that unlocks opportunity. As the de facto lingua franca, or “common language,” of business, government, and education in many regions, English can open doors that would otherwise remain closed.

We believe it's important to acknowledge that the privileged status of English is intertwined with a long history of imperialism and injustice. But we also believe that teaching (and even assessing) English in a human-centered way can be truly empowering, and that's what we aim to do with the Duolingo English Test.

Over the years, Duolingo has taught English to millions of learners who speak more than 20 different first languages. As they progressed through our courses, many of those learners reached out to ask whether we offered a way to certify their English proficiency that was just as accessible as our courses, since certification is a key step towards working or studying in an English-speaking country.

Image of a world map showing the percentage of learners studying English in different countries, with the highest density in non-English-speaking countries

Unfortunately, getting that certification can be challenging: apart from the $200 (or higher) price tag of traditional English proficiency exams, test takers also need to compete to book an appointment in a testing center, and transport themselves to and from their appointment. For the countless people who live hundreds of miles from their nearest test center, this means they must also cover the cost of lodging. Once they complete their exam, test takers must wait a week or more to receive their scores, and sharing them with institutions can cost extra. It’s no wonder learners everywhere were looking for a more accessible alternative.

Our co-founders, Luis von Ahn and Severin Hacker, were very sympathetic to these inquiries: both immigrated to the US in pursuit of higher education and had experienced the arduous and expensive process of certifying their English proficiency first-hand. We were already using our language and technology expertise to make language learning accessible for people across the globe through our app; we wondered if we could apply some of the same techniques to language assessment.

Filling in the blanks

Clearly, there was a need for change. But to break free from the traditional testing model that was a barrier to so many, we needed to overcome two hurdles: the limitations of physical test centers and the limitations of fixed-form exams.

Traditionally, the language proficiency exams administered at test centers are fixed-form: the questions that make up the test are determined before the exam begins. In this model, tests must be administered in a secure location to ensure that people can’t game the system by gaining access to test content in advance or by sharing test items once they’ve taken the exam.

An alternative is to make tests computer adaptive, so that each individual exam is uniquely assembled. This makes it nearly impossible to get information about the test before you take it. A computer-adaptive model is well suited to online administration, but this approach is only secure if the bank of items that the tests are assembled from is large enough to prevent the possibility of repeat items. The larger the item bank is, the less chance there is of test takers encountering the same items. But actually creating those items is no small feat!
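To make the link between bank size and repeat items concrete, here is a minimal sketch. It assumes, purely for illustration, that each test draws its items uniformly at random and without replacement from the bank. A real adaptive test chooses items based on the test taker's responses, so actual overlap rates would differ. The function name, the 20-item test length, and the bank sizes are all hypothetical.

```python
from math import comb


def prob_shared_item(bank_size: int, test_length: int) -> float:
    """Probability that two independently assembled tests share at least one item.

    Assumption: each test is a uniform random sample (without replacement)
    of `test_length` items from a bank of `bank_size` items.
    """
    if test_length < 0 or bank_size < test_length:
        raise ValueError("need 0 <= test_length <= bank_size")
    # Chance that the second test avoids every item used by the first test.
    no_overlap = comb(bank_size - test_length, test_length) / comb(bank_size, test_length)
    return 1.0 - no_overlap


if __name__ == "__main__":
    # Hypothetical bank sizes; the overlap probability shrinks as the bank grows.
    for bank_size in (100, 1_000, 10_000, 100_000):
        print(bank_size, round(prob_shared_item(bank_size, 20), 4))
```

Under this simplified model, the chance of two test takers seeing a common item falls steadily as the bank grows, which is why a large item bank matters for security.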

On traditional exams, each test item is written by a human. For decades, this was the only way to produce test items, but it takes time for people to come up with exam content. Instead, we wondered if our language experts could leverage AI to efficiently produce the tens of thousands of items we’d need to ensure digital test security.

“The idea was to lean on our strengths of data and machine learning to create a digital-first exam, using AI to help generate the items,” explains Burr Settles, director of research at Duolingo and creator of the Duolingo English Test. “We’d already developed listen-and-transcribe and read-and-speak question types for the learning app, so we were on our way there. We were eager to explore how we might use technology to create a fully functional proficiency exam.”

Hacking the system

Enter the 2013 Duolingo hackathon: an annual contest that sees our engineers, designers, and language experts collaborate to develop any idea they can dream up, as long as it promotes our mission of lowering barriers to education. It was the perfect opportunity to explore the possibility of creating a proficiency exam.

Settles, an expert in machine learning and natural language processing, had already built the computer-adaptive placement test used in the learning app. He had also been exploring research on test item types that could be generated automatically at the scale needed to support round-the-clock test administration.

One item type that caught his imagination was a “yes/no” vocabulary quiz, which assesses test takers on their knowledge of both common and rare words, by presenting them with a mix of real and pseudowords and asking them to select the ones that they know to be real.

Image showing screen view of vocabulary quiz. There are boxes containing real words such as ham, pesky, and spoil, and boxes containing pseudowords such as masquash, thand, and shamenting

“There’s been a lot of research into these kinds of items as far back as the 1970s, but the way they were created and delivered wasn't very user friendly or efficient,” says Settles. “So I got to thinking, how could we take this yes/no vocab task, and apply AI to generate the items, and make it computer adaptive?”

Over the next 24 hours of the hackathon, Settles focused on making the grading mechanism for yes/no vocabulary tests probabilistic and combining it with computer-adaptive testing techniques. The end result was a dynamic, ever-changing quiz of real words vs. pseudowords: the seed of what would become the Duolingo English Test.
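For readers curious what "probabilistic" and "computer adaptive" can mean in practice, here is a toy sketch. It is not the algorithm from the hackathon or from the Duolingo English Test. It assumes a simple Rasch (one-parameter logistic) model, a flat prior over ability, and a rule that picks the unused item whose difficulty is closest to the current estimate. The words come from the screenshot above; the difficulty values and all function names are made up for illustration.

```python
import math

# Hypothetical item bank: (text, is_real_word, difficulty). Difficulties are invented.
ITEM_BANK = [
    ("ham", True, -2.0),
    ("spoil", True, -1.0),
    ("pesky", True, 0.5),
    ("thand", False, -0.5),
    ("masquash", False, 0.0),
    ("shamenting", False, 1.0),
]

# Candidate ability values from -4.0 to 4.0 in steps of 0.1.
GRID = [x / 10 for x in range(-40, 41)]


def p_correct(ability, difficulty):
    """Rasch model: probability of judging an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def update(posterior, difficulty, correct):
    """Bayesian update of the ability distribution after one response."""
    weighted = [
        w * (p_correct(a, difficulty) if correct else 1.0 - p_correct(a, difficulty))
        for a, w in zip(GRID, posterior)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def estimate(posterior):
    """Posterior mean of ability."""
    return sum(a * w for a, w in zip(GRID, posterior))


def next_item(posterior, remaining):
    """Adaptive step: choose the unused item closest in difficulty to the estimate."""
    theta = estimate(posterior)
    return min(remaining, key=lambda item: abs(item[2] - theta))


def run_quiz(answer_fn, num_items=4):
    """Run the quiz; answer_fn(text) returns True for 'yes, this is a real word'."""
    posterior = [1.0 / len(GRID)] * len(GRID)  # flat prior (an assumption)
    remaining = list(ITEM_BANK)
    for _ in range(min(num_items, len(remaining))):
        item = next_item(posterior, remaining)
        remaining.remove(item)
        text, is_real, difficulty = item
        # A response is correct if "yes" matches a real word or "no" matches a pseudoword.
        correct = answer_fn(text) == is_real
        posterior = update(posterior, difficulty, correct)
    return estimate(posterior)
```

In this sketch, a test taker who says "yes" only to the real words ends up with a higher ability estimate than one who accepts the pseudowords, and each answer shifts which item is presented next.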

After the hackathon, we continued developing and building more items for the test, and released a free alpha version called the “Test Center” in 2014; the Duolingo English Test was officially launched in 2016.

Image of the original Duo the owl on a golden shield above the words "Duolingo Test Center"

Transforming testing for good

Nearly a decade after that first hackathon project, the Duolingo English Test has come a long way from a quiz built on a single item type. Today, our fully functional English proficiency exam is the first and only high-stakes test to use AI and machine learning at every step of the process, in order to make language proficiency certification truly accessible for all.

“We created the Duolingo English Test as an extension of our mission, to break down barriers to education,” says Settles. “We've learned that an online, personalized approach to testing is not only important for increasing access — it's an essential innovation that is reshaping the education system as we know it.”

To learn more about the evolution of the test, visit our 5th anniversary page, and stay tuned for more posts here on the blog!