Robots.txt Tester & Validator
Test and validate your robots.txt file to ensure search engines and bots can properly crawl your website. Check URL blocking rules instantly.
- The most specific matching rule wins (the rule with the longer path takes precedence)
- If an Allow and a Disallow rule are equally specific, Allow wins (see the example below)
- If no rule matches, the default is ALLOW
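For example, under the following rules (the paths are purely illustrative):

User-agent: *
Disallow: /shop/
Allow: /shop/sale/

# /shop/cart      -> blocked (only Disallow: /shop/ matches)
# /shop/sale/item -> allowed (Allow: /shop/sale/ is the longer, more specific match)
# /about          -> allowed (no rule matches, so the default ALLOW applies)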
Frequently Asked Questions
Common questions about robots.txt and how to use this tool
Q. What is a robots.txt file?
A robots.txt file is a text file that tells search engine crawlers which pages or sections of your website they can or cannot access.
Q. Where should robots.txt be located?
The robots.txt file must be placed in your website’s root directory. For example: https://example.com/robots.txt
Q. Does robots.txt block pages from Google?
Robots.txt prevents crawling but doesn’t guarantee pages won’t appear in search results. To keep a page out of search results entirely, use a noindex meta tag instead (see The Golden Rule below).
Q. How do I block AI crawlers?
Add a User-agent rule for each AI bot you want to block, such as GPTBot or ClaudeBot, followed by “Disallow: /”. A complete example is included in the guide below.
Q. What does “Disallow” mean?
The Disallow directive tells search engines not to crawl specific URLs or directories. For example, “Disallow: /admin/” prevents crawling of everything under the /admin/ directory.
Q. How do I add a sitemap?
Add a Sitemap line such as “Sitemap: https://yoursite.com/sitemap.xml” to your robots.txt file to help search engines find your sitemap. The directive can appear anywhere in the file, but it is conventionally placed at the end.
Learn More About Robots.txt
Comprehensive guides to help you master robots.txt
What is Robots.txt?
Robots.txt is a text file webmasters create to tell search engine robots which parts of a website they may crawl and which they should stay out of.
Basic Syntax
# Rules for Google's main crawler
User-agent: Googlebot
Disallow: /private/
Allow: /public/

# Rules for all other crawlers
User-agent: *
Disallow: /admin/

# Location of the XML sitemap
Sitemap: https://example.com/sitemap.xml
Key Directives
- User-agent – Specifies which crawler the rules apply to
- Disallow – Tells crawlers not to access specific paths
- Allow – Explicitly permits crawling (useful for exceptions)
- Sitemap – Points to your XML sitemap location
Block All Major AI Crawlers
# Block OpenAI
User-agent: GPTBot
User-agent: ChatGPT-User
Disallow: /
# Block Anthropic (Claude)
User-agent: ClaudeBot
User-agent: anthropic-ai
Disallow: /
# Block Common Crawl
User-agent: CCBot
Disallow: /
DO:
- Always include a sitemap reference
- Block admin, login, and private areas
- Test your robots.txt before deploying
DON’T:
- Don’t use robots.txt as a security measure – it’s publicly visible
- Don’t block CSS/JS files needed for rendering (see the example after this list)
- Don’t use it to hide sensitive data
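If you must disallow an asset directory, a pattern like the following (the directory name is illustrative) keeps the CSS and JS files needed for rendering crawlable. Note that wildcard (*) matching is supported by major crawlers such as Googlebot and Bingbot:

User-agent: *
Disallow: /assets/
Allow: /assets/*.css
Allow: /assets/*.js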
Robots.txt (Disallow)
- Prevents crawling of specific URLs
- Does NOT guarantee exclusion from search results
- Good for: conserving crawl budget
Noindex Meta Tag
- Controls whether a page appears in search results
- Requires the page to be crawlable
- Good for: Removing pages from search results
The Golden Rule
If you want a page completely out of search results, use noindex – NOT robots.txt!
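A noindex directive lives in the page itself, not in robots.txt – either as a meta tag in the HTML head or as an X-Robots-Tag HTTP header. A minimal meta tag example:

<meta name="robots" content="noindex">

Remember that the page must remain crawlable (not disallowed in robots.txt), otherwise search engines will never see the noindex directive.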