Today we’re open sourcing Metasearch, our tool for searching up to 22 other tools in parallel! No more wondering “was that spec in a Google doc or a Confluence page?”


The full list of supported data sources: AWS tagged resources; Confluence pages; Dropbox files and folders; Figma files, projects, and teams; GitHub PRs, issues, and repo metadata; Google Drive docs and spreadsheets; Google Groups groups; Greenhouse job posts; Guru cards; Hound-indexed code; Jenkins job names; Jira issues; Lingo assets; Notion pages; PagerDuty schedules and services; Pingboard employees; Rollbar projects; Slack messages and channels; TalentLMS courses; Trello boards, cards, members, and workspaces; Zoom rooms; and arbitrary websites (such as this blog itself) via sitemaps.

Metasearch at Duolingo

We’ve been using Metasearch internally for nearly a year now. It offers these benefits:

  • A single, consistent interface to search all of our data at once
  • Better defaults than some of these tools’ own search interfaces
  • The ability to search some tools that don’t have their own search interfaces
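
To make the "search everything at once" idea concrete, here is a minimal sketch of a parallel fan-out. It is not taken from Metasearch's codebase: the `Result`, `SearchFn`, and `Merged` types and the `mergeSettled` and `searchAll` functions are hypothetical names chosen for illustration. It assumes each data source can be wrapped as an async function and uses the standard `Promise.allSettled` so that one slow or failing tool does not hide results from the others.

```typescript
// Hypothetical types for illustration; not Metasearch's actual API.
interface Result {
  source: string;
  title: string;
  url: string;
}

type Hit = Omit<Result, "source">;
type SearchFn = (query: string) => Promise<Hit[]>;

interface Merged {
  results: Result[];
  failed: string[]; // names of sources whose search rejected
}

// Combine settled outcomes, tagging each hit with the source it came from.
function mergeSettled(
  names: string[],
  settled: PromiseSettledResult<Hit[]>[],
): Merged {
  const merged: Merged = { results: [], failed: [] };
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      for (const hit of outcome.value) {
        merged.results.push({ source: names[i], ...hit });
      }
    } else {
      merged.failed.push(names[i]);
    }
  });
  return merged;
}

// Start every search at once, then wait for all of them to settle.
async function searchAll(
  sources: Record<string, SearchFn>,
  query: string,
): Promise<Merged> {
  const names = Object.keys(sources);
  const settled = await Promise.allSettled(
    names.map((name) => sources[name](query)),
  );
  return mergeSettled(names, settled);
}
```

Because the searches run concurrently, total latency in this sketch is bounded by the slowest source rather than the sum of all of them, and the `failed` list lets a UI show which tools could not be reached.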

Getting started

Metasearch’s only dependency is Docker, so you can get up and running with a single command. Or if you prefer, you can run it outside of Docker using Node.js.

Want to search something like SharePoint or MediaWiki that isn’t currently supported? Metasearch’s entire codebase is only around 3000 lines of TypeScript, and adding support for a new data source requires fewer than 100 lines on average. Feel free to open a pull request!
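
As a rough idea of how small such an adapter can be, here is a sketch of a MediaWiki search built on the MediaWiki Action API (`action=query&list=search`). Everything about its shape is an assumption: the `Hit` type, the function names, and the injected `fetchJson` helper are hypothetical and do not reflect Metasearch's actual engine interface, so check the repository before writing a real pull request.

```typescript
// Hypothetical shapes for illustration; Metasearch's real engine interface may differ.
interface Hit {
  title: string;
  url: string;
  snippet: string;
}

// Subset of the MediaWiki Action API response for action=query&list=search.
interface MediaWikiSearchResponse {
  query?: { search?: { title: string; snippet: string }[] };
}

// Build the request URL for a wiki whose api.php endpoint is `apiUrl`.
function buildSearchUrl(apiUrl: string, query: string): string {
  const params: Record<string, string> = {
    action: "query",
    list: "search",
    srsearch: query,
    format: "json",
  };
  const queryString = Object.keys(params)
    .map((key) => `${key}=${encodeURIComponent(params[key])}`)
    .join("&");
  return `${apiUrl}?${queryString}`;
}

// Convert the API response into hits; `pageBase` is the wiki's article URL prefix.
function toHits(pageBase: string, body: MediaWikiSearchResponse): Hit[] {
  return (body.query?.search ?? []).map((page) => ({
    title: page.title,
    url: pageBase + encodeURIComponent(page.title.replace(/ /g, "_")),
    // Snippets arrive with highlight markup; strip tags for plain-text display.
    snippet: page.snippet.replace(/<[^>]+>/g, ""),
  }));
}

// `fetchJson` is injected so the sketch does not depend on a particular HTTP client.
async function searchMediaWiki(
  fetchJson: (url: string) => Promise<unknown>,
  apiUrl: string,
  pageBase: string,
  query: string,
): Promise<Hit[]> {
  const body = (await fetchJson(buildSearchUrl(apiUrl, query))) as MediaWikiSearchResponse;
  return toHits(pageBase, body);
}
```

On a runtime with a global `fetch` (for example Node.js 18 or later), `fetchJson` could be `(url) => fetch(url).then((r) => r.json())`. Authentication, pagination, and error handling are left out to keep the sketch short.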

We’re hiring

Interested in building stuff like this? Duolingo is hiring software engineers. Learn more here!