# Robots.txt for AI Crawlers: The Complete 2026 Configuration Guide
> ⚠️ **Critical Finding:** In Audit 1 of getoutloop.com, the robots.txt was missing entries for all major AI crawlers. Result: GEO Technical score of 0/15. One file change fixed this entirely.
## What is robots.txt and Why Does It Matter for GEO?
robots.txt is a plain-text file served at yourdomain.com/robots.txt that tells web crawlers which URLs they may and may not access. All major search engines and AI platforms check this file before crawling your content, though compliance is voluntary: robots.txt is a convention, not an enforcement mechanism.
For GEO purposes, robots.txt is the single most important technical file on your website. If GPTBot, ClaudeBot, or PerplexityBot is blocked, whether explicitly or by a catch-all restriction, those AI platforms cannot read your content and will never cite you. Your GEO score will be zero no matter how well you have optimized everything else.
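To see the failure mode concretely, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content and domain below are hypothetical placeholders; the point is that a catch-all `Disallow: /` blocks GPTBot along with everyone else:

```python
from urllib import robotparser

# Hypothetical robots.txt with a catch-all restriction.
# No AI crawler is mentioned by name, yet all of them are blocked.
BLOCKED_POLICY = """\
User-agent: *
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(BLOCKED_POLICY.splitlines())

# GPTBot has no group of its own, so it falls under the '*' rules.
print(rp.can_fetch("GPTBot", "https://yourdomain.com/blog/post"))  # -> False
```

A crawler that cannot fetch the page cannot cite it, which is why auditing this one file comes before any content-level optimization.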
## Major AI Crawlers Reference Table
| User-agent | AI Platform | Priority |
|---|---|---|
| GPTBot | OpenAI (ChatGPT) | Tier 1 |
| OAI-SearchBot | OpenAI Search | Tier 1 |
| ClaudeBot | Anthropic (Claude) | Tier 1 |
| PerplexityBot | Perplexity AI | Tier 1 |
| Google-Extended | Google (Gemini; AI Overviews follow Googlebot) | Tier 1 |
| Bingbot | Microsoft (Copilot) | Tier 1 |
| Applebot-Extended | Apple Intelligence | Tier 2 |
| FacebookBot | Meta AI | Tier 2 |
| Amazonbot | Amazon / Alexa | Tier 2 |
| cohere-ai | Cohere | Tier 2 |
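The Tier 1 rows above can double as a checklist. The sketch below (again standard-library `urllib.robotparser`, with a hypothetical policy and placeholder domain) reports which priority crawlers a given robots.txt admits, here one that allows everyone except GPTBot:

```python
from urllib import robotparser

# Tier 1 user-agents from the reference table above.
TIER1 = ["GPTBot", "OAI-SearchBot", "ClaudeBot",
         "PerplexityBot", "Google-Extended", "Bingbot"]

# Hypothetical policy: open by default, but GPTBot is singled out.
POLICY = """\
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(POLICY.splitlines())

for bot in TIER1:
    status = "allowed" if rp.can_fetch(bot, "https://yourdomain.com/") else "BLOCKED"
    print(f"{bot}: {status}")  # only GPTBot reports BLOCKED
```

Pointing the same loop at your live file (via `RobotFileParser.set_url` plus `read()`) gives a quick pass/fail view of every platform in the table.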
## Copy-Paste robots.txt Template
Save this as robots.txt in your website root directory:
```
User-agent: *
Allow: /
Disallow: /private/
Disallow: /admin/
Crawl-delay: 1

# === AI SEARCH INDEXING (ALLOW ALL — GEO Visibility Strategy) ===

# ChatGPT / OpenAI
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Claude / Anthropic
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Google AI (Gemini)
User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

# Microsoft Bing / Copilot
User-agent: Bingbot
Allow: /

# Apple Intelligence
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

# Amazon Alexa / AI
User-agent: Amazonbot
Allow: /

# Common Crawl (AI Training)
User-agent: CCBot
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

# === STANDARD SEARCH ENGINES ===
User-agent: Googlebot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

# === SITEMAP ===
Sitemap: https://yourdomain.com/sitemap.xml
```
## How to Verify Your robots.txt Works
1. Visit `https://yourdomain.com/robots.txt` in your browser. You should see the plain-text content, not an error page or HTML.
2. Check Google Search Console → Settings → robots.txt report to confirm Google can fetch and parse the file (this covers Google's crawlers only).
3. Test server-side blocking with `curl -s -o /dev/null -w "%{http_code}" -A "GPTBot" https://yourdomain.com/`. It should print `200`; a `403` usually means a firewall or CDN rule is blocking that user-agent string.
4. Run a GEO audit using the /seo-geo-audit skill. The AI Crawler Access score should jump to 12+/15.
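The verification steps above can also be automated offline before you deploy. The sketch below parses a trimmed copy of the template with `urllib.robotparser` and asserts the intended behavior. Note one caveat: the Disallow lines are listed before `Allow: /` here because Python's parser applies rules in file order, whereas Google-style parsers match the longest path, so the template's original ordering is fine for real crawlers:

```python
from urllib import robotparser

# Trimmed, reordered copy of the template: general rules plus one AI crawler.
TEMPLATE = """\
User-agent: *
Disallow: /private/
Disallow: /admin/
Allow: /

User-agent: GPTBot
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(TEMPLATE.splitlines())

# AI crawler can read public content; generic bots stay out of /admin/.
assert rp.can_fetch("GPTBot", "https://yourdomain.com/blog/post")
assert not rp.can_fetch("SomeOtherBot", "https://yourdomain.com/admin/")
print("robots.txt rules behave as intended")
```

Dropping a check like this into CI catches an accidental catch-all `Disallow` before it silently zeroes out your AI visibility.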
## Want Me to Audit Your robots.txt?
The free AI Visibility Audit includes a full robots.txt review plus every GEO technical gap, with a prioritized fix plan.
Get Free Audit