Robots.txt Validator
Validate your robots.txt file for syntax errors, best practices, and compliance with standards. Fetch from any website or paste your content directly.
Enter a hostname (e.g., www.google.com or google.com) or full robots.txt URL
A robots.txt file is a text file placed in your website's root directory that tells search engine crawlers which pages or sections of your site they can or cannot access. It's part of the Robots Exclusion Protocol (REP).
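For instance, a minimal robots.txt placed at your site's root (the /private/ path here is only an illustration) might look like this:

    User-agent: *
    Disallow: /private/

This asks every crawler to skip anything under /private/ while leaving the rest of the site crawlable.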
Common syntax errors include: missing colons after directives (User-agent: or Disallow:), misspelled or unconventionally capitalized directive names (the standard forms are User-agent, Disallow, Allow, Crawl-delay, and Sitemap), invalid characters, or malformed paths and URLs. Each directive must be on its own line in the form 'Directive: value'.
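As a quick sketch, the first snippet below contains typical mistakes (a missing colon and a misspelled directive), and the second shows the corrected form; the /tmp/ path is purely illustrative:

    User agent *
    Disalow: /tmp

    User-agent: *
    Disallow: /tmp/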
The asterisk (*) is a wildcard that matches all robots/crawlers. 'User-agent: *' means the following rules apply to all web crawlers unless they have more specific rules defined elsewhere in the file.
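For example, in a file like the following (Googlebot and the /search/ path are illustrative), Googlebot follows only its own group, while every other crawler follows the * group:

    User-agent: *
    Disallow: /search/

    User-agent: Googlebot
    Disallow: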
Yes! You can block specific AI bots by using their user-agent names. For example: 'User-agent: GPTBot' followed by 'Disallow: /' blocks OpenAI's crawler. Use our Robots Builder to easily select which AI bots to block.
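A minimal sketch that blocks only OpenAI's crawler looks like this; crawlers with no matching group remain unrestricted:

    User-agent: GPTBot
    Disallow: /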
Disallow tells crawlers NOT to access specific paths, while Allow explicitly permits access. Allow is useful for creating exceptions within broader Disallow rules. For example, you might disallow /admin/ but allow /admin/public/.
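Written out, that example would be:

    User-agent: *
    Disallow: /admin/
    Allow: /admin/public/

Crawlers that support Allow apply the most specific (longest) matching rule, so /admin/public/ stays crawlable while the rest of /admin/ is blocked.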
Warnings indicate potential issues or best practice violations that won't prevent the file from working, but might not achieve your intended goals. For example, redundant rules, unusual patterns, or missing sitemaps are warnings, not errors.
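For example, a file like this (paths illustrative) is syntactically valid but would typically produce warnings, since the second rule is already covered by the first and no Sitemap line is present:

    User-agent: *
    Disallow: /private/
    Disallow: /private/archive/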
Invalid directive errors occur when using unsupported commands or misspelled directives. Valid directives are: User-agent, Disallow, Allow, Crawl-delay, and Sitemap. Check for typos and ensure proper capitalization.
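For instance, the first line below would be flagged as an invalid directive, while the second uses the supported spelling (the /cgi-bin/ path is just an example):

    # Flagged as invalid: misspelled directive
    Dissalow: /cgi-bin/
    # Corrected
    Disallow: /cgi-bin/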
Yes, it's a best practice to include your sitemap URL using the 'Sitemap:' directive. This helps search engines discover your sitemap, though you should also submit it directly through search engine webmaster tools.
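The directive takes a full absolute URL and can appear anywhere in the file, for example (replace the URL with your own sitemap location):

    Sitemap: https://www.example.com/sitemap.xml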