Google-Extended
Verify Google-Extended IP Address
Verify whether an IP address truly belongs to Google using official verification methods. Enter both the IP address and the User-Agent from your logs for the most accurate bot verification.
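The official verification method for Google's crawlers can be sketched in Python: a reverse DNS lookup on the IP, a check that the resulting hostname belongs to a Google-owned domain, then a forward DNS lookup to confirm the hostname maps back to the same IP. Since Google-Extended does not send requests itself, this check applies to the Google crawler IPs that appear in your logs. Function names here are illustrative, not part of any official API.

```python
import socket

def is_google_hostname(hostname: str) -> bool:
    """Check that a PTR hostname falls under a Google-owned crawler domain."""
    return hostname.rstrip(".").endswith(
        (".googlebot.com", ".google.com", ".googleusercontent.com")
    )

def verify_google_ip(ip: str) -> bool:
    """Reverse-then-forward DNS check, as Google documents for its crawlers."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)   # 1. reverse DNS (PTR lookup)
    except OSError:
        return False                                # no PTR record: not verifiable
    if not is_google_hostname(hostname):            # 2. domain must be Google's
        return False
    try:                                            # 3. forward DNS must map back
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        return False
    return ip in forward_ips
```

A spoofed User-Agent passes neither step 2 nor step 3: an attacker can fake the header, but not the PTR record for a Google-owned domain that also resolves back to their IP.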
Google-Extended is a special user-agent token that allows website owners to control whether their publicly accessible content can be used to train and improve Google’s AI models, including products like Gemini. It does not crawl the web itself; instead, it serves as a policy signal interpreted by Google’s AI systems. Site owners can allow or block AI training access by configuring robots.txt rules for Google-Extended. Blocking this token does not affect Google Search ranking, crawling, or indexing. Its purpose is purely governance: giving publishers a transparent way to manage how their content contributes to Google’s AI research and model development.
User Agent Examples
Google does not publish a specific user-agent string for Google-Extended; because it does not crawl the web itself, it never appears in HTTP request logs.

Robots.txt Configuration for Google-Extended
Google-Extended
Use this identifier in the User-agent directive of your robots.txt file to target Google-Extended.
Recommended Configuration
Our recommended robots.txt configuration for Google-Extended:
User-agent: Google-Extended
Allow: /

Completely Block Google-Extended
Prevent this bot from crawling your entire site:
User-agent: Google-Extended
Disallow: /

Completely Allow Google-Extended
Allow this bot to crawl your entire site:
User-agent: Google-Extended
Allow: /

Block Specific Paths
Block this bot from specific directories or pages:
User-agent: Google-Extended
Disallow: /private/
Disallow: /admin/
Disallow: /api/

Allow Only Specific Paths
Block everything but allow specific directories:
User-agent: Google-Extended
Disallow: /
Allow: /public/
Allow: /blog/

Set Crawl Delay
Limit how frequently Google-Extended can request pages (in seconds):
User-agent: Google-Extended
Allow: /
Crawl-delay: 10

Note: Google does not officially document support for the Crawl-delay directive, so this rule may be ignored.
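Before deploying a configuration like the ones above, you can sanity-check it with Python's standard urllib.robotparser module. One caveat worth knowing: Python's parser applies rules in file order (first matching rule wins), while Google uses longest-path matching; listing the Allow line before the broad Disallow keeps both interpretations in agreement. The domain and paths below are placeholders.

```python
from urllib import robotparser

# Hypothetical robots.txt: block Google-Extended everywhere except /public/.
# The Allow line comes first so Python's first-match-wins evaluation agrees
# with Google's longest-path-match semantics.
ROBOTS_TXT = """\
User-agent: Google-Extended
Allow: /public/
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Evaluate sample URLs against the Google-Extended rules.
print(parser.can_fetch("Google-Extended", "https://example.com/public/post.html"))
print(parser.can_fetch("Google-Extended", "https://example.com/private/data.html"))
```

This only checks your rule logic locally; Google still has to fetch the live robots.txt from your site root for the policy to take effect.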