LinkedIn may have trained AI models on users’ data without first changing its terms
In the U.S., but not in the EU, EEA, or Switzerland (likely because of data protection rules in those regions), LinkedIn users have a setting to opt out of having their scraped personal data used to train “content creation AI models.” The toggle isn’t new. But, as 404 Media first reported, LinkedIn didn’t initially update its privacy policy to reflect this use of data.
The terms of service have since been updated, but that kind of update usually comes well before a big change like this one, in which user data is put to a new use. The idea is that users then have a chance to adjust their accounts or leave the platform if they object. Not this time, it seems.
So, what kinds of models is LinkedIn training? In a Q&A, the company says they include its own models, such as those behind writing suggestions and post recommendations. But LinkedIn also says that generative AI models on its platform may be trained by “another provider,” such as its parent company, Microsoft.
“As with most of LinkedIn, when you interact with our platform, we collect and use (or process) data about your use of the platform, including personal data,” the Q&A reads. “This could include your posts and articles, how often you use LinkedIn, your preferred language, any feedback you may have given to our teams, and your use of generative AI (AI models used to make content). In line with our privacy policy, we use this information to improve or develop LinkedIn services.”
LinkedIn previously told TechCrunch that it limits the personal information in datasets used for generative AI training by “enhanc[ing] privacy through techniques such as redacting and removing information.”
If you don’t want LinkedIn training on your data, go to the “Data Privacy” section of the LinkedIn settings menu on desktop, click “Data for Generative AI improvement,” and toggle off “Use my data for training content creation AI models.” You can also attempt a broader opt-out through this form, but LinkedIn notes that opting out won’t affect training that has already taken place.
The nonprofit Open Rights Group (ORG) has called on the Information Commissioner’s Office (ICO), the U.K.’s data protection regulator, to investigate LinkedIn and other social networks that train on user data by default. Just this week, Meta announced it was resuming plans to scrape user data for AI training, after working with the ICO to make it easier for people to opt out.
In a statement, Mariano delli Santi, ORG’s law and policy officer, said LinkedIn had been processing users’ data without their consent. “Once again, the opt-out model is completely inadequate to protect our rights. The public cannot be expected to keep an eye on and go after every single online company that decides to use our data to train AI,” he said, adding that opt-in consent is both a legal requirement and common sense.
Ireland’s Data Protection Commission (DPC), which oversees compliance with the EU’s overarching privacy law, the General Data Protection Regulation (GDPR), told TechCrunch that LinkedIn notified it last week that updates to its global privacy policy would take effect today.
“LinkedIn told us that the policy would include an opt-out setting for members who did not want their data used to train content-generating AI models,” the DPC said. “This opt-out does not apply to EU/EEA members, as LinkedIn is not currently using data from EU/EEA members to train or fine-tune these models.”
TechCrunch has reached out to LinkedIn for comment. We’ll update this story if we hear back.
Because generative AI models need ever more data to train on, a growing number of platforms are repurposing their vast troves of user-generated content, and some have begun monetizing it. Tumblr owner Automattic, Photobucket, Reddit, and Stack Overflow are just a few of the networks selling data to AI model developers.
Not all of them have made it easy to opt out. When Stack Overflow announced it would begin licensing content, one user deleted their posts in protest, only to have the posts restored and their account banned.