Published: Sep 19, 2023

Data-hungry language apps: who learns more, you or the app?

In the aftermath of a recent data leak affecting 2.68M Duolingo users worldwide¹, the importance of safeguarding personal information has once again come into sharp focus. Such incidents are a reminder that our favorite applications are not merely tools for acquiring new skills but also custodians of our sensitive personal data. A closer look at language learning apps shows that the data they collect not only supports the learning experience but also holds significant value in the digital marketplace, attracting the attention of advertisers and data brokers.

Key insights

  • Among the 10 popular language learning apps analyzed, Duolingo, which recently experienced a significant data leak, stands out as the most data-hungry. It collects 19 of the 32 possible data points, nearly 60% of those listed. Busuu follows with 17 data points, and iHuman with 12.
  • On the opposite end of the spectrum, some apps adopt a more privacy-oriented approach. For example, EWA collects 5 out of 32 data points, HelloTalk gathers 7, and Mondly captures 8. Despite its conservative data collection, HelloTalk is the only analyzed app that tracks users' precise locations.
  • However, the pivotal issue lies not only in the volume of data collected but also in how that data is handled. Many of these apps use collected data to track users, which often means sharing it with third-party advertisers or even data brokers². Nine of the 10 analyzed apps use collected data for tracking purposes, with an average of 3 data points handled in this manner.
  • Duolingo leads in this category as well. It uses about two-thirds of the user data it collects (13 out of 19 data points) for tracking purposes, which is roughly 4 times the average among the analyzed apps. The data points used for tracking include purchase history, coarse location, and phone number.
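The percentages and ratios quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the counts stated in this article (six of the 10 apps are named with their totals); the variable and function names are illustrative, and the average of 3 tracked data points is taken as the rounded figure given above.

```python
# Counts as reported in this article; names are illustrative.
TOTAL_DATA_POINTS = 32  # data points listed by the App Store

# Data points collected per app (only the six apps named in the article).
collected = {
    "Duolingo": 19,
    "Busuu": 17,
    "iHuman": 12,
    "Mondly": 8,
    "HelloTalk": 7,
    "EWA": 5,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of the 32 possible data points that each app collects.
collected_share = {app: share(n, TOTAL_DATA_POINTS) for app, n in collected.items()}

# Duolingo's tracking figures.
duolingo_tracked = 13
average_tracked = 3  # rounded average across the analyzed apps, per the article

tracking_share = share(duolingo_tracked, collected["Duolingo"])
tracking_ratio = duolingo_tracked / average_tracked

for app, pct in collected_share.items():
    print(f"{app}: {collected[app]}/{TOTAL_DATA_POINTS} data points ({pct}%)")
print(f"Duolingo uses {tracking_share}% of its collected data points for tracking")
print(f"That is {tracking_ratio:.1f}x the average of {average_tracked}")
```

With these inputs, Duolingo's collection share comes out at 59.4% ("nearly 60%"), its tracking share at 68.4% (about two-thirds), and the ratio to the average at about 4.3.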

Methodology and sources

We analyzed 10 popular language learning apps. They were selected from articles prominently featured in search engine results for the keyword "the most popular language learning apps," together with apps that have a significant number of downloads according to the AppMagic platform. The data collection information for each app was sourced from its Apple App Store page on September 12th, 2023. The App Store provides a list of 32 unique data points grouped into 12 categories. We analyzed the data set according to the number, type, and handling of the data points collected by each app.

Note on data used to track the user: “Tracking refers to the act of linking user or device data collected from your app with user or device data collected from other companies' apps, websites, or offline properties for targeted advertising or advertising measurement purposes. Tracking also refers to sharing user or device data with data brokers².”
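As an illustration of the counting described above, the sketch below models one app's App Store privacy label as a mapping from data point to the ways that data point is handled, then counts how many are collected and how many are used for tracking. This is a minimal sketch, not the tooling used in the study: the `PrivacyLabel` class, the handling names, and the "ExampleApp" entry are hypothetical, and the sample values are made up for demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyLabel:
    """Hypothetical model of one app's App Store privacy label."""

    app: str
    # Data point name -> set of handling labels, e.g. {"tracking", "linked"}.
    data_points: dict[str, set[str]] = field(default_factory=dict)

    def collected_count(self) -> int:
        """Number of distinct data points the app declares collecting."""
        return len(self.data_points)

    def tracked_count(self) -> int:
        """Number of data points declared as used to track the user."""
        return sum(1 for handling in self.data_points.values() if "tracking" in handling)


# Made-up example entry; not taken from any real app's label.
example = PrivacyLabel(
    app="ExampleApp",
    data_points={
        "Purchase History": {"tracking", "linked"},
        "Coarse Location": {"tracking"},
        "Phone Number": {"linked"},
        "Email Address": {"linked"},
    },
)

print(f"{example.app}: {example.collected_count()} collected, "
      f"{example.tracked_count()} used for tracking")
```

Keying by data point rather than by handling type means a data point that appears under several handling sections is still counted once as collected.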

For the complete research material behind this study, visit here.

Data was collected from:

Apple (2023). Apps’ standardized privacy policies.

References:

¹ Surfshark (2023). Duolingo data leak exposed 2.68M users.
² Apple (2023). User privacy and data use.