When you interact with a chatbot, there’s a good chance that everything you say, and every prompt you give, isn’t just used to generate replies to your queries. Nearly every chatbot company on the planet also uses the information you provide to train its AI models. This can leave your privacy—and even your employer’s confidential information—exposed. But you can mitigate these privacy risks by telling chatbots not to use your data for training. Here’s how.
What is AI chatbot training?
In order for a chatbot to provide knowledgeable and (hopefully) accurate answers, the underlying large language model (LLM) that powers it needs to assimilate a massive amount of information, which it then uses to help answer your questions. This process of information assimilation is known as “training.”
The more information an LLM trains on, the more intelligent it ostensibly gets. LLMs acquire training data from numerous sources, including public websites, social media platforms, encyclopedias, video-sharing sites like YouTube, and, unfortunately, the work of authors, novelists, artists, musicians, and other creatives, sometimes without their permission.
But LLMs also get their training data from you. Every time you enter a prompt to give a chatbot information, that information is likely being used by the AI company to further train its models. And that can leave your privacy severely exposed.
Why you shouldn’t let AI chatbots train on your data
It’s generally a good idea not to allow LLMs to train on your data, especially if you share a lot of sensitive information about yourself in your interactions with a chatbot. If you talk to a chatbot about your physical or mental health, your finances, or your relationships, you should know that this data is usually used, by default, by the AI company to further train its LLM, which means your most intimate thoughts, worries, and concerns are becoming part of the model.
AI companies say they anonymize the information you provide before using it to train their models—but you really just have to take them at their word. Even if they do anonymize your information, that doesn’t mean a bad actor in the future couldn’t use some technique to link all the prompts about a particular health, relationship, legal, or financial issue back to you.
And if you are using an AI chatbot for work, you could be exposing your employer to legal and regulatory risks if the data you feed it contains confidential user or client information. Even if it doesn’t, you could inadvertently give away your employer’s corporate secrets, such as proprietary code or sales data. The chatbot may give you the answers you’re searching for, but it will also use all the data you give it to further train its models, and retain that data as part of the model.
