AI Visibility Audit

Is your site invisible to ChatGPT, Claude & Perplexity?

Enter any domain and we'll check if your robots.txt is blocking the AI crawlers that decide whether you show up in AI-generated answers. Free. Takes 5 seconds.

✓ Recommended robots.txt — allows all major AI bots
Copy this into your robots.txt file to explicitly allow the AI crawlers that drive visibility in ChatGPT, Claude, Perplexity, Google AI Overviews and more.
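A sketch of what that snippet can look like, using the crawler names this page tests (keep any existing rules for other user agents below it; adjust the list to your own policy):

```txt
# Explicitly allow the major AI crawlers
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: PerplexityBot
User-agent: Perplexity-User
User-agent: Google-Extended
User-agent: Applebot-Extended
Allow: /
```

Because each crawler obeys only the most specific group that matches it, this group shields the listed bots from any broad User-agent: * rules elsewhere in the file.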

Get the weekly AI Visibility Report

New AI crawlers, robots.txt changes, GEO tactics, and tool drops — delivered Fridays. No spam, unsubscribe anytime.


What this tool checks

AI Bot Access Checker reads your site's robots.txt and tests it against every major AI crawler we know of — the bots that fetch content to train large language models and, more importantly, the bots that fetch content live so ChatGPT, Claude, Perplexity and Google AI Overviews can cite you in answers. For each one, we show whether your site is Allowed, Partially allowed, or Blocked, and we give you a copy-paste robots.txt snippet that fixes any gaps in 30 seconds.

Why AI bot access matters

AI-generated answers are eating the top of the search funnel. When someone asks ChatGPT "what's the best tool for X?" or Perplexity "who makes Y?", the models can only cite sites they're allowed to read. If your robots.txt quietly blocks GPTBot or ClaudeBot, you're invisible in those answers — even if you rank #1 in classic Google search.

Most sites don't know they're blocked. Blocks usually come from default hosting rules, legacy security plugins, or an over-cautious blanket User-agent: * rule with a broad Disallow. This tool surfaces the problem in seconds, for free.

The AI crawlers we test

GPTBot (OpenAI) — training
OAI-SearchBot (OpenAI) — ChatGPT Search
ChatGPT-User (OpenAI) — live browsing
ClaudeBot (Anthropic) — training
Claude-Web (Anthropic) — live browsing
anthropic-ai (Anthropic) — legacy
PerplexityBot (Perplexity) — indexing
Perplexity-User (Perplexity) — live fetch
Google-Extended (Google) — Gemini & AI Overviews
Applebot-Extended (Apple) — Apple Intelligence
Bytespider (ByteDance) — Doubao
CCBot (Common Crawl) — training data

If you want to be cited in live AI answers, the most important ones to allow are OAI-SearchBot, ChatGPT-User, Claude-Web, Perplexity-User, and Google-Extended. If you also want your content to feed future model training, add GPTBot, ClaudeBot, and CCBot.

What is llms.txt (and why we check for it)

llms.txt is an emerging companion to robots.txt. Where robots.txt controls which crawlers can access your site, llms.txt tells AI models how to describe your site — a plain Markdown file at your root (https://yoursite.com/llms.txt) with a one-line summary of what you do and a curated list of your key pages. It's a positive signal, designed specifically for large language models like ChatGPT, Claude, and Perplexity. Our scanner checks whether you have one, and our free generator builds one for you in seconds. Proposed spec: llmstxt.org.
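To make the format concrete, here's a minimal llms.txt sketch for a hypothetical site — the name, tagline, and URLs are placeholders, not a prescribed template:

```txt
# Example Widgets

> Example Widgets makes a self-hosted analytics dashboard for small teams.

## Key pages

- [Docs](https://example.com/docs): setup guide and API reference
- [Pricing](https://example.com/pricing): plans and feature comparison
- [Blog](https://example.com/blog): product updates and guides
```

A one-line summary up top and a short, curated link list is the whole idea; there's no need to mirror your entire sitemap.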

How the scan works

We fetch https://yoursite.com/robots.txt via a server-side proxy (so no CORS issues and nothing leaks through your browser), parse every User-agent / Disallow / Allow block using the same logic Google's robots parser uses, and resolve each AI crawler against that ruleset. We also check whether you have an llms.txt file at the root — an emerging standard that tells AI models how to describe your site in answers.
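In spirit, the per-crawler resolution step looks something like this sketch. It uses Python's standard-library robotparser as a stand-in (its matching differs from Google's open-source parser in some edge cases), and the robots.txt content and bot list are purely illustrative:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks GPTBot entirely,
# everyone else only under /private/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

def check_access(robots_txt: str, site: str, crawlers: list[str]) -> dict[str, bool]:
    """Resolve each crawler's user agent against the parsed ruleset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() applies the most specific matching group per crawler
    return {bot: parser.can_fetch(bot, site + "/") for bot in crawlers}

results = check_access(ROBOTS_TXT, "https://example.com", AI_CRAWLERS)
for bot, allowed in results.items():
    print(f"{bot}: {'Allowed' if allowed else 'Blocked'}")
```

Here GPTBot comes back Blocked (its dedicated group disallows everything) while the other three fall through to the wildcard group and are Allowed for the homepage.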

FAQ

Is this really free?

Yes. No signup, no rate limit worth worrying about, no hidden paywall. The tool is funded by an optional newsletter and occasional sponsor link — you can ignore both.

My robots.txt looks fine but the scan says I'm blocked. Why?

Three common causes: (1) a bare User-agent: * with a Disallow: / rule that implicitly blocks every AI crawler; (2) a security plugin (Wordfence, Cloudflare WAF, etc.) that blocks AI user-agents at the firewall level even when robots.txt permits them; (3) a CDN serving a stale robots.txt. The scan shows exactly which rule triggered the block so you can trace it.
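Cause (1) is worth seeing in full, because it surprises people: under the robots.txt spec, a crawler obeys only the most specific group that matches it, so a dedicated group overrides the wildcard. A hedged sketch:

```txt
# This alone implicitly blocks every AI crawler:
User-agent: *
Disallow: /

# A crawler with its own group ignores the * group,
# so this unblocks GPTBot without touching the rule above:
User-agent: GPTBot
Allow: /
```

You'd repeat the dedicated group (or add more User-agent lines to it) for each AI crawler you want to let through.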

Will allowing AI crawlers hurt my SEO?

No. AI crawlers are separate from traditional search crawlers (Googlebot, Bingbot) and the two don't interact. Allowing GPTBot has zero effect on your Google rankings — it only changes whether ChatGPT can see your content.

What's the difference between training bots and answer bots?

Training bots (GPTBot, ClaudeBot, CCBot) grab content to train future model versions — the benefit is long-term and diffuse. Answer bots (OAI-SearchBot, ChatGPT-User, Perplexity-User, Claude-Web) fetch live pages to cite in real answers — the benefit is immediate citations and referral traffic. Most sites should allow both; some content owners allow only answer bots.
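An answer-bots-only policy can be sketched like this, using the crawler names from the list above (a starting point, not a definitive ruleset — check each vendor's docs for current bot names):

```txt
# Allow live answer/citation bots
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: Claude-Web
User-agent: Perplexity-User
Allow: /

# Opt out of model training
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /
```

Grouping several User-agent lines over one rule block is valid robots.txt syntax and keeps the file readable as the bot list grows.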

What's llms.txt and do I need one?

llms.txt is a new standard (see llmstxt.org) — a plain-Markdown file at your site root that tells AI models what your site is about and which pages matter. It's a positive signal, in contrast to robots.txt's access-control focus. Almost nobody has one yet, which makes adding it a cheap way to stand out. Build your llms.txt in 10 seconds with our free generator →

How often should I re-check?

Any time you change your robots.txt, CMS, security plugin, or hosting. Also worth a re-check every few months — AI companies add new crawler names (this list has doubled in the last year), and your rules might need updating.

Next Step

Don't have an llms.txt yet?

Build one in seconds with our free generator. It reads your sitemap and homepage to create a ready-to-publish llms.txt template you can edit, copy, or download.

Generate your llms.txt →