We had a single developer when we first released our iOS app in October 2012. Over the past 12 years, our team of software engineers has grown tremendously (and continues to grow!), as has the size of the codebase. 

A graph showing an increase in the lines of code in the Duolingo iOS app between 2016 and 2024.

We recently examined the number of lines of code in each app release since Duolingo 4.0 in 2014. Our codebase is now nearly 20 times larger than it was ten years ago! While a large codebase isn't inherently bad, there are definite advantages to paring things down. Fewer lines of code make for a more readable and navigable project, with less surface area to maintain. Even code that never runs in production needs to be updated when dependencies and related files change, which wastes time for both the engineer making the change and the reviewer. Cleaning up dead code not only improves developer velocity but can also have user-facing impact. For instance, less code means a smaller app size, which is especially beneficial for learners in areas with lower connectivity.

Our first attempt was a tried-and-true software engineering technique: we'd delete a file and attempt to rebuild the app. If the build succeeded, we knew it was safe to delete that file. This method worked, but it was time-consuming and incomplete, since it relied on our own knowledge and memory to decide which files to try.
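To make the cost of that approach concrete, here is a minimal sketch of the delete-and-rebuild loop. It is an illustration rather than the script we actually ran: the function name and the `remove`, `restore`, and `build_succeeds` callbacks are hypothetical stand-ins for whatever your project uses to delete a file, bring it back, and run a build.

```python
from typing import Callable, Iterable, List


def find_deletable_files(
    candidates: Iterable[str],
    remove: Callable[[str], None],
    restore: Callable[[str], None],
    build_succeeds: Callable[[], bool],
) -> List[str]:
    """Try deleting each candidate file and keep the deletion only if the build passes.

    All four parameters are hypothetical hooks supplied by the caller.
    """
    deletable = []
    for path in candidates:
        remove(path)
        if build_succeeds():
            # The app still builds without this file, so leave it deleted.
            deletable.append(path)
        else:
            # Something still references the file, so put it back.
            restore(path)
    return deletable


# Toy usage: a fake "project" where only B.swift is required for the build.
project_files = {"A.swift", "B.swift", "C.swift"}
required_files = {"B.swift"}
removed = find_deletable_files(
    sorted(project_files),
    remove=project_files.discard,
    restore=project_files.add,
    build_succeeds=lambda: required_files <= project_files,
)
```

Note that the loop needs one full build per candidate, and it only ever checks the files someone thought to list, which is exactly why this approach was slow and incomplete for us.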

Next, we explored open-source tools like Periphery to statically analyze the codebase and generate reports on unused code. This sped up the process significantly since it only required running the tool once, covering the entire codebase without relying on human memory. However, even after cleaning up dead code, we suspected we could do more.

The tools we previously used could identify code that was never referenced, but they couldn't tell us which code was never used when the app was actually running with our users. This distinction is crucial because the Duolingo app behaves differently based on instructions from our backend servers. For instance, our servers dictate which exercises appear in a lesson. We also run numerous A/B tests, and once a test concludes, some code paths are no longer used. Static analysis tools can't detect this kind of unused code.
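The difference between "never referenced" and "never used at runtime" can be shown with a small sketch. Everything here is hypothetical: the exercise class names, the `EXERCISE_TYPES` registry, and the shape of the server payload are invented for illustration, and the `seen` set is just a stand-in for runtime usage tracking, not a description of how Reaper works internally.

```python
from typing import Dict, List, Set, Type


class Exercise:
    """Base class for the hypothetical exercise types below."""


class TranslateExercise(Exercise): ...
class ListenExercise(Exercise): ...
class RetiredExercise(Exercise): ...


# Every class is referenced in this registry, so a static analyzer
# would consider all three of them "used".
EXERCISE_TYPES: Dict[str, Type[Exercise]] = {
    "translate": TranslateExercise,
    "listen": ListenExercise,
    "retired": RetiredExercise,
}


def build_lesson(server_payload: List[str], seen: Set[str]) -> List[Exercise]:
    """Instantiate the exercises the server asked for, recording each class used."""
    lesson = []
    for name in server_payload:
        cls = EXERCISE_TYPES[name]
        seen.add(cls.__name__)
        lesson.append(cls())
    return lesson


def never_used(seen: Set[str]) -> Set[str]:
    """Return registered classes that were never instantiated at runtime."""
    return {cls.__name__ for cls in EXERCISE_TYPES.values()} - seen


# The server (simulated here) no longer sends the "retired" exercise type.
seen_classes: Set[str] = set()
build_lesson(["translate", "listen", "translate"], seen_classes)
unused = never_used(seen_classes)
```

In this toy setup `RetiredExercise` is referenced in code but never instantiated, because the decision about which exercises to show lives on the server. Only runtime data can reveal that.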

Thankfully, our friends at Emerge Tools, with whom we collaborated last year, have a new tool called Reaper. This tool identifies code that's never called at runtime. Integrating Reaper into our app took just one line of code, and we've been running it on every version of the app in our beta program since version 7.7.0 in January 2024.

After our learners use the app for a while, we check the dashboard at Emerge Tools to see all the unused classes reported by Reaper:

A screenshot from the Reaper tool showing 2684 unused classes out of 8567 total classes monitored.

While reviewing a single app version can be helpful, it can produce false positives, such as new classes that haven't run yet because their A/B tests are inactive. To get a clearer picture of truly unused classes, we compiled reports from all versions released since integrating Reaper and identified the classes that were unused in every version. Some classes were reported as unused only because they live in parts of the app rarely reached by our beta users, such as completing an entire course. Even after filtering these out, we found many more unused classes than expected.
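Combining the per-version reports boils down to a set intersection followed by a manual exclusion list. The sketch below assumes each report has already been exported as a set of class names keyed by app version; the function names, class names, and version numbers other than 7.7.0 are made up for illustration and are not part of any Emerge Tools API.

```python
from typing import Dict, Iterable, Set


def unused_in_every_version(reports: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the classes that appear as unused in every version's report."""
    per_version = [set(classes) for classes in reports.values()]
    if not per_version:
        return set()
    return set.intersection(*per_version)


def drop_rarely_reached(candidates: Set[str], rarely_reached: Set[str]) -> Set[str]:
    """Remove classes we know are used, just rarely reached by beta users."""
    return candidates - rarely_reached


# Hypothetical reports: version -> classes reported as unused in that version.
reports = {
    "7.7.0": {"OldExerciseView", "NewExperimentView", "CourseCompleteView"},
    "7.8.0": {"OldExerciseView", "CourseCompleteView"},
}
candidates = unused_in_every_version(reports)
safe_to_review = drop_rarely_reached(candidates, {"CourseCompleteView"})
```

Here `NewExperimentView` drops out because it was used in a later version, and `CourseCompleteView` is excluded by hand because it sits behind a flow that beta users rarely finish. What remains is a list to review, not a list to delete blindly.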

A prime example is the exercises shown in lessons. The app code supports various exercise types, but some exercises are retired or never fully launched, leaving their code in the app. Thanks to Reaper, we identified four exercise types that are never being used.

Since integrating Reaper, we've analyzed and acted on the results only twice, and we've already removed over 10,000 lines of code, nearly 1% of our codebase!

A screenshot from Reaper showing a code change that deletes 6,876 lines of code.

 

A screenshot from GitHub, where we store all of our source code, showing a code change that deletes 3,873 lines of code.

Using Reaper has been an excellent way to streamline our codebase. We're excited about future possibilities, like automating this analysis, adding alerts for newly unused classes, and potentially even detecting unused images and animations. With the help of our friends at Emerge Tools, we're confident that even more improvements are on the horizon. We encourage teams that suspect they have unused code in their app to give Reaper a try. You might be surprised by what you find.

The success we’ve achieved in optimizing our codebase with tools like Reaper is just one example of how we continuously strive for improvement and innovation. Our engineers play a crucial role in this journey, driving both technical excellence and impactful user experiences. If you’re a passionate engineer who is excited about tackling complex problems and making a real difference in the world with education, we're hiring!