Here's how to stop Meta from using your data for AI training

Meta has launched a new privacy setting that lets users ask the company not to use their data from public or licensed sources to train its generative AI models.
Rizwan Choudhury
Meta European headquarters

Derick Hudson/iStock 

Meta, the company that owns Facebook and Instagram, has launched a new option for users who do not want their data to be used for training its artificial intelligence (AI) models. The new privacy setting, announced on Thursday, allows users to submit requests to access, modify, or delete any personal information that Meta has collected from public or licensed sources for generative AI model training.

Generative AI

As you might be aware, generative AI is a relatively new branch of AI that can create new content, such as text, images, audio, or video, based on predictions and patterns learned from existing data. OpenAI’s ChatGPT blazed a trail for the industry, and companies such as Google, Meta, and Snap have been catching up by integrating generative AI into their products.

Meta uses generative AI to power various features on its platforms, such as chatbots, translations, captions, and recommendations. To train effective generative AI models, Meta says it needs a large amount of information from publicly available and licensed sources, such as blog posts, news articles, social media posts, and other online content.

However, some of this information may contain personal data, such as names, contact details, opinions, or preferences of individuals. Meta says it respects the data subject rights of users and gives them the option to object to their data being used for generative AI model training. Users can fill out a form titled “Generative AI Data Subject Rights” on Meta’s website to submit their requests.

The new form on Generative AI data subject rights.

Data scraping and privacy protection

The form does not apply to the data that users generate on Meta’s own platforms, such as Facebook comments or Instagram photos. A Meta spokesperson said that the company’s latest Llama 2 open-source large language model, which is one of the most advanced generative AI models in the world, was not trained on Meta user data. The spokesperson also said that Meta has not launched any generative AI consumer features on its platforms yet.

Meta’s move comes at a time when data scraping and privacy protection are becoming hot topics in the tech industry. Data scraping is the practice of collecting large amounts of data from websites or online platforms without the consent or knowledge of the owners or users. Many tech companies use data scraping to train their AI models or gain a competitive advantage.

As CNBC reports, last week a group of data protection authorities from several countries, including the UK, Canada, and Switzerland, issued a joint statement to Meta and other tech giants, such as Alphabet (Google’s parent company), ByteDance (TikTok’s parent company), X (formerly Twitter), and Microsoft. The statement reminded them that they are subject to various data protection and privacy laws worldwide and that they should protect personal information accessible on their websites from data scraping.

The statement also urged individuals to take steps to protect their personal information from data scraping and asked social media companies to enable users to engage with their services in a privacy-protective manner. It highlights that mass data scraping from social media companies (SMCs) and other websites has become more common in recent years, raising privacy concerns. One of them is that scraped data can be used for cyberattacks: for example, hackers can use identity and contact information obtained through scraping to trick people into scams.
