GPTBot
AI Training
Verify GPTBot IP Address
Verify whether an IP address genuinely belongs to OpenAI, using OpenAI's official verification methods. For the most accurate result, check both the IP address and the User-Agent from your logs.
GPTBot is an AI training crawler operated by OpenAI. It crawls content that may be used to train OpenAI's generative AI foundation models, with the stated aim of making those models more useful and safe. Disallowing GPTBot in robots.txt indicates that a site's content should not be used in training those models. RobotSense.io verifies GPTBot using OpenAI's official validation methods, so that only genuine GPTBot traffic is identified as such.
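The check described above combines two signals: the User-Agent claims to be GPTBot, and the source IP falls inside a range published by the operator. A minimal Python sketch of that idea follows. It assumes the published ranges are available as a list of CIDR blocks; the function names and `EXAMPLE_RANGES` are hypothetical, and the ranges shown are reserved documentation addresses, not OpenAI's real ones.

```python
import ipaddress

# Placeholder CIDR blocks from the reserved documentation ranges. These are
# NOT OpenAI's addresses; substitute the ranges OpenAI actually publishes.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def claims_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token."""
    return "gptbot" in user_agent.lower()


def ip_in_ranges(ip: str, cidr_ranges: list[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR blocks."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed address in the log line
    # Membership is always False when the IP versions differ (v4 vs v6).
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


def verify_gptbot(ip: str, user_agent: str, cidr_ranges: list[str]) -> str:
    """Classify a log entry as 'verified', 'spoofed', or 'not-gptbot'."""
    if not claims_gptbot(user_agent):
        return "not-gptbot"
    return "verified" if ip_in_ranges(ip, cidr_ranges) else "spoofed"
```

A request whose User-Agent names GPTBot but whose IP lies outside every range is labeled "spoofed" here. Checking the User-Agent alone is not enough, because any client can send that header.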
User Agent Examples
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot
Robots.txt Configuration for GPTBot
GPTBot
Use this identifier in your robots.txt User-agent directive to target GPTBot.
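To see how a parser selects the group addressed to this identifier, the following sketch uses Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, and the `SomeOtherBot` name are made up for illustration. Python's parser is only one implementation, so OpenAI's crawler may not behave identically.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block GPTBot, allow every other crawler.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Pass the product token ("GPTBot"), not the full User-Agent header:
# this parser only inspects the text before the first "/" in the name.
gptbot_ok = parser.can_fetch("GPTBot", "https://example.com/page")
other_ok = parser.can_fetch("SomeOtherBot", "https://example.com/page")

print(gptbot_ok, other_ok)  # prints: False True
```

The group that names GPTBot takes priority over the wildcard group, so the site-wide `Allow: /` does not apply to GPTBot.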
Recommended Configuration
Our recommended robots.txt configuration for GPTBot:
User-agent: GPTBot
Allow: /
Completely Block GPTBot
Prevent this bot from crawling your entire site:
User-agent: GPTBot
Disallow: /
Completely Allow GPTBot
Allow this bot to crawl your entire site:
User-agent: GPTBot
Allow: /
Block Specific Paths
Block this bot from specific directories or pages:
User-agent: GPTBot
Disallow: /private/
Disallow: /admin/
Disallow: /api/
Allow Only Specific Paths
Block everything but allow specific directories:
User-agent: GPTBot
Disallow: /
Allow: /public/
Allow: /blog/
Set Crawl Delay
Request a minimum delay, in seconds, between successive GPTBot requests:
User-agent: GPTBot
Allow: /
Crawl-delay: 10
Note: OpenAI's documentation does not state whether GPTBot honors the Crawl-delay directive.
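The "Block Specific Paths" and "Allow Only Specific Paths" configurations above depend on how a crawler resolves overlapping Allow and Disallow rules. The sketch below assumes the longest-match precedence described in RFC 9309: the matching rule with the longest path wins, and Allow wins a tie. Whether GPTBot applies exactly this rule is an assumption. The `is_allowed` helper and the rule lists are hypothetical, and wildcards (`*`, `$`) are not handled.

```python
def is_allowed(rules, path):
    """Apply longest-match precedence to one User-agent group.

    rules: list of ("allow" | "disallow", path_prefix) tuples.
    Returns True if the path may be crawled under those rules.
    """
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on equal length, allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Rule lists mirroring two of the configurations in this section.
BLOCK_SPECIFIC = [
    ("disallow", "/private/"),
    ("disallow", "/admin/"),
    ("disallow", "/api/"),
]
ALLOW_ONLY = [
    ("disallow", "/"),
    ("allow", "/public/"),
    ("allow", "/blog/"),
]

print(is_allowed(BLOCK_SPECIFIC, "/api/v1/users"))  # prints: False
print(is_allowed(BLOCK_SPECIFIC, "/blog/post-1"))   # prints: True
print(is_allowed(ALLOW_ONLY, "/blog/post-1"))       # prints: True
print(is_allowed(ALLOW_ONLY, "/admin/"))            # prints: False
```

Parsers that apply the first matching rule instead of the longest one would treat `Disallow: /` followed by `Allow: /public/` as blocking everything, so rule order can matter when testing with other tools.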