GPTBot — OpenAI's Training Crawler
What is GPTBot? OpenAI's web crawler that collects training data for GPT models. Learn how to block GPTBot in robots.txt without losing ChatGPT Search visibility.
What is GPTBot?
GPTBot is OpenAI's primary web crawler for gathering the training data used to build and improve GPT foundation models. It fetches publicly accessible web pages and filters out paywalled sources, personally identifiable information, and content that violates OpenAI's usage policies. Blocking GPTBot does not affect your visibility in ChatGPT Search, which is handled by a separate crawler, OAI-SearchBot.
How to Block GPTBot
Add the following to your robots.txt file (located at the root of your website):
User-agent: GPTBot
Disallow: /
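To sanity-check the rule before deploying it, the following is a minimal sketch using Python's standard `urllib.robotparser`. It parses the two-line rule above and asks whether GPTBot and OAI-SearchBot may fetch a page. The helper name `is_allowed` and the `example.com` URL are illustrative assumptions, and the standard-library parser only approximates how a real crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rule from this section, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it wraps the standard-library parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page URL, used only for the check.
PAGE = "https://example.com/articles/post"

gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", PAGE)
searchbot_allowed = is_allowed(ROBOTS_TXT, "OAI-SearchBot", PAGE)

print("GPTBot allowed:", gptbot_allowed)
print("OAI-SearchBot allowed:", searchbot_allowed)
```

With this rule, the check should report that GPTBot is disallowed while OAI-SearchBot remains allowed, since no group in the file names it.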
What Happens When You Block GPTBot
Content crawled after the block takes effect will not be included in GPT training datasets. Data that has already been collected is not retroactively removed. ChatGPT Search citations and live browsing are not affected.
Should You Block GPTBot?
GPTBot is a training crawler: it collects data to build AI models. If you want to prevent your content from being used in future AI training by OpenAI, block it. The block is not retroactive: it only affects future crawls, not data already collected.
GPTBot vs Other OpenAI Crawlers
OpenAI operates multiple crawlers, each serving a different purpose:
| User-agent | Purpose | Type |
|---|---|---|
| GPTBot | Collects training data for GPT models | AI Training |
| ChatGPT-User | Live page fetches triggered by ChatGPT users | User-Triggered Fetch |
| OAI-SearchBot | Indexes content for ChatGPT Search citations | AI Search Index |
Each crawler operates independently. Blocking GPTBot does not block ChatGPT-User or OAI-SearchBot — you must add a separate rule for each.
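Because each crawler needs its own rule, a robots.txt that opts out of several of them contains one group per user-agent. The sketch below generates such a file and verifies it with the standard-library parser. The function name `build_robots_txt` and the `example.com` URL are assumptions made for illustration; which crawlers to block is your choice.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents: list[str]) -> str:
    """Build robots.txt text with one 'Disallow: /' group per user-agent.

    Hypothetical helper for illustration; groups are separated by blank lines.
    """
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    return "\n\n".join(groups) + "\n"


# Example choice: block training and user-triggered fetches,
# but leave OAI-SearchBot free to index for ChatGPT Search.
robots_txt = build_robots_txt(["GPTBot", "ChatGPT-User"])
print(robots_txt)

# Verify the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

page = "https://example.com/"  # hypothetical URL
results = {
    agent: parser.can_fetch(agent, page)
    for agent in ("GPTBot", "ChatGPT-User", "OAI-SearchBot")
}
print(results)
```

Listing only the crawlers you want to exclude keeps the remaining ones unaffected, which matches the independence described above.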