CrawlerSignal

AI crawler policy checker

Audit your AI crawler access before you ship another llms.txt

Scan a public site for robots.txt, llms.txt, sitemap, and AI crawler rules, then copy an honest policy kit for ChatGPT Search, Claude, Perplexity, Gemini, and training crawlers.


Free beta. No account. This scan cannot modify your site.

Signal score: run an audit to see crawler policy health.
robots.txt: waiting
llms.txt: waiting
sitemap.xml: waiting
homepage: waiting

Crawler matrix

Separate search, training, and user-triggered fetches

Bot | Company | Use | Status | Rule
Run an audit to populate crawler rules.
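
For illustration, a populated matrix typically separates well-known user-agent tokens like these (an illustrative sample, not audit output; Status and Rule come from your scan, and each token should be verified against the vendor's current documentation):

    GPTBot | OpenAI | Training
    OAI-SearchBot | OpenAI | Search
    ChatGPT-User | OpenAI | User-triggered fetch
    ClaudeBot | Anthropic | Training
    PerplexityBot | Perplexity | Search
    Google-Extended | Google | Training (Gemini)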

Policy kit

Copy the pieces you can ship

robots.txt snippet

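What the scan generates depends on your site, but a sketch of the balanced default described in the FAQ below looks like this (illustrative tokens and a placeholder sitemap URL; confirm each user agent against the vendor's docs before shipping):

    # Keep AI search discovery open
    User-agent: OAI-SearchBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    # Opt out of training crawls
    User-agent: GPTBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    Sitemap: https://example.com/sitemap.xml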

llms.txt draft

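The draft follows the common llms.txt proposal shape: an H1 title, a one-line blockquote summary, and H2 sections of annotated links. A minimal sketch, with placeholder URLs:

    # Example Site
    > One-line summary of what the site covers and who it is for.

    ## Docs
    - [Getting started](https://example.com/docs/start): Installation and setup
    - [API reference](https://example.com/docs/api): Endpoints and auth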

audit.json

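The exact schema is the tool's own; this hypothetical shape is only meant to show the kind of record the audit exports:

    {
      "site": "https://example.com",
      "scanned_at": "2025-01-01T00:00:00Z",
      "checks": {
        "robots_txt": { "found": true, "status": 200 },
        "llms_txt": { "found": false, "status": 404 },
        "sitemap_xml": { "found": true, "status": 200 },
        "homepage": { "found": true, "status": 200 }
      },
      "crawlers": [
        { "bot": "OAI-SearchBot", "use": "search", "rule": "allow" },
        { "bot": "GPTBot", "use": "training", "rule": "disallow" }
      ]
    }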

FAQ

The boring caveats that keep this useful

Does llms.txt guarantee AI search ranking?

No. Treat it as an experimental AI-readable site map. The crawler rules that actually express allow/block choices still live in robots.txt.

Should I block GPTBot and allow OAI-SearchBot?

That is the balanced default: keep ChatGPT search discovery open via OAI-SearchBot while making a separate choice about GPTBot training crawls. Review it against your own legal and content-strategy constraints.
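
In robots.txt terms, that default reduces to two stanzas (the same tokens as the kit snippet above):

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: GPTBot
    Disallow: /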

Can CrawlerSignal see Cloudflare managed robots.txt or server logs?

No. It reads public URLs from the outside. CDN rules, WAF settings, and real bot traffic logs require platform access and belong in a later paid monitoring product.
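
A minimal sketch of that outside-in read, in Python with only the standard library (the paths mirror the four checks above; example.com is a placeholder):

    import urllib.error
    import urllib.request

    PATHS = ("/robots.txt", "/llms.txt", "/sitemap.xml", "/")

    def audit(base_url: str) -> dict:
        """Fetch each public path and record its HTTP status (None if unreachable)."""
        results = {}
        for path in PATHS:
            url = base_url.rstrip("/") + path
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    results[path] = resp.status
            except urllib.error.HTTPError as err:
                results[path] = err.code  # e.g. 404 when llms.txt is missing
            except OSError:
                results[path] = None  # DNS failure, timeout, or blocked
        return results

    print(audit("https://example.com"))

Anything behind a CDN bot rule or WAF looks the same as any other public response from this vantage point, which is exactly the limitation described above.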