After receiving regulatory approval, Meta has resumed training its Large Language Models (LLMs) on public data from European Union users of Facebook and Instagram. The development follows significant concerns over data protection and regulatory compliance surrounding Meta’s data collection and model training.

Meta initially attempted to train its AI models on EU users’ personal data without explicit consent, relying on a legal basis under the GDPR known as “legitimate interests.” The company argued that using the data serves a purpose beneficial to it or to broader society, outweighing the need for explicit permission. This approach covered both data collected directly by Meta (first-party data) and data from external sources (third-party data) within the region. It prompted an intervention by the Irish Data Protection Commission (DPC), which resulted in a temporary pause of Meta’s AI training in the EU.

Meta has now secured approval from the European Data Protection Board (EDPB) to proceed with training its models on public EU user data. It has also put in place several privacy safeguards, such as:

  • An opt-out mechanism allowing EU users to explicitly object to their public data being used for AI training purposes.
  • Publicly available data from minors (under the age of 18) will not be used to train Meta’s generative AI.
  • Users’ private messages on Meta’s platforms will be excluded from training data.
  • Users’ interactions with Meta AI, including prompts and queries, will however be used to train and improve the model (see the filtering sketch after this list).
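
To make the safeguards above concrete, here is a minimal, hypothetical sketch of how such exclusion rules might be enforced in a data pipeline. The record schema, field names, and opt-out registry are illustrative assumptions, not Meta’s actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Record:
    user_id: str
    user_age: int
    visibility: str   # "public" or "private"
    source: str       # "post", "direct_message", or "assistant_interaction"
    text: str

# Hypothetical registry of users who have objected via the opt-out form.
OPTED_OUT_USERS: set[str] = set()

def is_eligible_for_training(record: Record) -> bool:
    """Apply the published safeguards to one candidate record."""
    if record.user_id in OPTED_OUT_USERS:
        return False                      # honor explicit objections (opt-out)
    if record.user_age < 18:
        return False                      # minors' data is excluded entirely
    if record.source == "direct_message":
        return False                      # private messages are never used
    if record.source == "assistant_interaction":
        return True                       # prompts and queries to Meta AI are used
    return record.visibility == "public"  # otherwise only public content qualifies

def build_training_corpus(records: list[Record]) -> list[str]:
    """Keep only the text of records that pass every safeguard."""
    return [r.text for r in records if is_eligible_for_training(r)]
```

The design point is that every exclusion (opt-out, minors, private messages) is checked before any inclusion rule, so a record is dropped the moment a single safeguard applies.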

Meta emphasizes that training its models on users’ data is meant to produce generative AI that can fully process regional language variations such as dialects and slang, understand culturally specific contexts and localized knowledge, and interpret complex communication styles like humor and sarcasm. The models should also handle various content formats, including text, voice, video, and images. Meta argues this will benefit millions of people and businesses in Europe.

The Meta case shows that AI governance is evolving and that security professionals must adapt to new rules for handling user data. Emerging guidelines include:

  • Companies should maintain well-documented records of where their training data comes from, ensuring transparency and accountability.
  • Users should retain the option to decide whether their public data is used for AI training.
  • Security teams must implement precise controls to exclude sensitive or prohibited categories of data from AI training, as sketched after this list.
  • AI systems operating across jurisdictions must obtain approval from regulators in each region to satisfy differing legal requirements.
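
As a hypothetical illustration of the first and third guidelines, the sketch below pairs a provenance record for each ingested batch with a gate that rejects prohibited data categories. All names here (the field names, category labels, and the `provenance_entry` and `admit_batch` helpers) are assumptions made for illustration, not any real tooling:

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical categories a security team might bar from training corpora.
PROHIBITED_CATEGORIES = {"private_message", "minor_authored", "health", "biometric"}

def provenance_entry(source: str, jurisdiction: str, legal_basis: str,
                     categories: set[str], payload: bytes) -> dict:
    """Build an auditable record of where a batch of training data came from."""
    return {
        "source": source,                 # e.g. "facebook_public_posts"
        "jurisdiction": jurisdiction,     # e.g. "EU"
        "legal_basis": legal_basis,       # e.g. "legitimate_interest_with_opt_out"
        "categories": sorted(categories),
        "sha256": hashlib.sha256(payload).hexdigest(),  # tamper-evident fingerprint
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

def admit_batch(entry: dict) -> bool:
    """Reject any batch tagged with a prohibited data category."""
    return not (set(entry["categories"]) & PROHIBITED_CATEGORIES)

entry = provenance_entry("facebook_public_posts", "EU",
                         "legitimate_interest_with_opt_out",
                         {"public_post"}, b"...batch bytes...")
print(json.dumps(entry, indent=2), admit_batch(entry))
```

Hash-stamping each batch gives the audit trail a tamper-evident anchor, and recording the legal basis alongside the jurisdiction makes it straightforward to answer a regulator’s question about why a given record was used.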

This case highlights the ever-changing security landscape that accompanies advancing AI technology. It also underscores the need for robust technical controls at every stage of the AI development lifecycle and stronger measures to ensure privacy and compliance.
