Major Tech Firms Use User Data to Train AI Models

Major technology companies are collecting user-generated data from emails, chat prompts, and browsing activity to train large language models [1, 2].

This trend highlights a growing tension between the rapid development of artificial intelligence and individual digital privacy. As companies scale their AI capabilities, the boundary between personal communication and corporate training data has blurred.

Companies including Google, OpenAI, Anthropic, and Perplexity use user interactions to improve model performance and develop AI-enabled tools [3, 4]. The practice spans platforms including Gmail and ChatGPT [1, 2].

Microsoft-owned LinkedIn implemented a policy change on Nov. 3, 2024 [5], allowing the platform to collect user data for AI training. Similarly, SpaceX's Starlink service has sought user data for its own AI model training [6].

Privacy concerns grew as these practices became more transparent. A report from the Washington Post on Sept. 5, 2025, detailed the ongoing struggle for users to maintain privacy against automated training systems [2].

Reports conflict on whether email data is used. Google has disputed assertions that it uses Gmail data for AI training, though some reports suggest the company may still do so [1].

Most of these providers offer opt-out mechanisms for users who do not want their information used [1, 4, 5, 6]. These settings are often buried in privacy menus, requiring users to manually disable data sharing to protect their personal information [3, 4].

The shift toward using personal data for AI training represents a transition from 'opt-in' to 'opt-out' privacy models. By making data collection the default, tech firms accelerate the refinement of their models using massive, real-world datasets, effectively shifting the burden of privacy management onto the end user.

Sources

[1] duckduckgo news — Google wants to use your emails to train its AI — here's how to turn that off

[2] duckduckgo news — How to use ChatGPT without giving up your data

[3] duckduckgo news — Stop letting ChatGPT and other AI chatbots train on your data. Here's why—and how

[4] duckduckgo news — Even Starlink Wants Your Data for AI Model Training. How to Opt Out

[5] duckduckgo news — Microsoft owned LinkedIn to collect user data to train AI models: Here's how you can opt out of the program

[6] duckduckgo news — Starlink Is Using Your Personal Data to Train AI. Here's How to Opt Out
