TL;DR — Go to Technical > Technical Analysis and check the Configuration Files section to see whether your robots.txt allows key AI crawlers and whether a valid llms.txt exists. Use the Actions panel for exact fix recommendations, then verify changes are working in Analytics > Crawler Analytics over the following 4 weeks. Pro tip: your llms.txt should lead with your brand’s core value proposition in the first 3 lines — AI models use it to understand your site before crawling individual pages.

The Question

“Do I have the right robots.txt and llms.txt configuration?”
Two text files sit between your content and every AI crawler on the internet: robots.txt controls which bots can access which paths, and llms.txt provides AI language models with a structured, human-readable map of your most important content. A misconfigured robots.txt can block GPTBot or ClaudeBot without you ever noticing. A missing llms.txt means AI models have to infer your site structure from unstructured HTML instead of reading the distillation you wrote for them. You might also be wondering:
  • “Is GPTBot or ClaudeBot accidentally blocked on my site?”
  • “What should I put in an llms.txt file to help AI models understand my brand?”
  • “How do I verify that my configuration changes are being respected?”

Where to Go in Qwairy

1. Start here: Technical > Technical Analysis

Navigate to Technical > Technical Analysis — your primary audit view for both files. The Configuration Files section at the top of the audit shows the status of your robots.txt (fetched from your domain root) and llms.txt (checked at /llms.txt and /llms-full.txt). Each file is displayed with its current contents, a pass/fail status for each AI crawler user-agent, and a list of recommended changes with explanations.
2. Go deeper: Analytics > Crawler Analytics

Cross-reference with Analytics > Crawler Analytics to verify that your configuration is working in practice. After making changes to robots.txt or llms.txt, return here 2-4 weeks later and check whether the affected crawlers show increased visit frequency and broader page coverage. Use the Bot filter to focus on the specific user-agents you modified directives for.
3. Complete the picture: Actions (fix recommendations)

The Actions panel in Technical Analysis lists prioritized recommendations for your configuration files. Each action item includes the exact text to add or modify, the reason it matters for a specific AI crawler, and a link to the relevant AI provider’s documentation where available.

What to Look For

Technical Analysis — robots.txt Audit

The audit fetches your live robots.txt and parses it against the user-agent strings for all 24 tracked AI crawlers. It flags both explicit blocks (Disallow: /) and implicit blocks caused by wildcard rules that inadvertently catch AI bot user-agents.
| Element | What it tells you |
| --- | --- |
| User-agent coverage | Which AI crawlers are explicitly addressed in your robots.txt vs. falling through to wildcard rules |
| Disallow conflicts | Paths that are blocked for AI crawlers but should be accessible for GEO indexing |
| Crawl-delay directives | Whether crawl-delay values are set in ways that reduce AI bot visit frequency |
| Sitemap declaration | Whether your XML sitemap URL is declared, helping crawlers discover all your pages |
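Wildcard fallthrough is the subtlest of these checks, and you can reproduce it locally. A minimal sketch using Python's standard-library `urllib.robotparser` (the robots.txt contents and paths here are illustrative, not from any real site):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt with an explicit group for GPTBot only.
# ClaudeBot has no group of its own, so it falls through to the wildcard.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Allow: /
"""

def crawler_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given user-agent may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

# GPTBot matches its explicit group and ignores the wildcard Disallow.
print(crawler_allowed(ROBOTS_TXT, "GPTBot", "/private/page"))
# ClaudeBot matches only the wildcard group, so /private/ is blocked for it.
print(crawler_allowed(ROBOTS_TXT, "ClaudeBot", "/private/page"))
```

This is exactly the class of implicit block the audit flags: the wildcard rule was never written with AI crawlers in mind, but any bot without its own group inherits it.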

Technical Analysis — llms.txt Audit

llms.txt is a newer standard designed specifically for AI language models. Unlike robots.txt (which governs access), llms.txt governs comprehension — it tells AI models which parts of your site matter most and how to understand your content hierarchy.
Pro Tip: Your llms.txt should include your brand’s core value proposition in the first 3 lines, followed by a structured list of your most important pages with one-sentence descriptions. AI models use this file to understand context before crawling individual pages.
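A minimal llms.txt sketch following the structure the llms.txt proposal describes (an H1 title, a blockquote summary, then sectioned link lists with one-sentence descriptions); the brand name, URLs, and descriptions below are placeholders:

```markdown
# Acme Bank

> Acme Bank offers fee-free personal savings accounts, transparent mortgage
> rates, and free financial education tools for everyday consumers.

## Products
- [Savings accounts](https://www.example.com/savings): Instant-access and fixed-rate accounts with current rates
- [Mortgage rates](https://www.example.com/mortgages): Daily-updated mortgage rate comparison tables

## Resources
- [Financial education hub](https://www.example.com/learn): Guides on budgeting, saving, and credit
```

Note how the value proposition sits in the opening blockquote, within the first three lines, before any page listing.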

Filters That Help

| Filter | How to use it for this question |
| --- | --- |
| Provider | Map each audit finding to the AI product it affects — a ClaudeBot block impacts Anthropic’s products specifically |
| Period | After making configuration changes, use a 30-day window starting from the change date to measure the impact on crawler visits in Crawler Analytics |
| Issue Severity | Filter Technical Analysis to Critical issues only to focus on the configuration problems that cause complete access blocks |

How to Interpret the Results

Good result

robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, and Google-Extended (at minimum) with no conflicting Disallow rules on your key pages. An llms.txt file exists at /llms.txt, passes validation, includes a brand description and a curated list of important pages, and is under 100KB. Crawler Analytics confirms all four primary bots are visiting regularly.
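In robots.txt terms, that good state can be as simple as the following sketch (the four user-agent strings are the ones these providers publish; the blocked path and sitemap URL are placeholders):

```txt
# Explicitly allow the primary AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# Everyone else keeps the existing default rules
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
```

Because each AI crawler has its own group, none of them inherits the wildcard Disallow, and the Sitemap line satisfies the sitemap-declaration check in the audit.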

Needs attention

Any of the following requires immediate action: a wildcard Disallow that catches GPTBot or ClaudeBot, a Disallow on /blog/, /solutions/, or /pricing/ (your highest-value GEO content directories), a missing llms.txt file entirely, or an llms.txt file that exists but fails validation (malformed syntax, missing required sections). Even a partial block on a key section can cut that content out of AI model training entirely.
Removing a Disallow directive from robots.txt does not immediately restore your content to AI model indexes. Crawlers must re-visit and re-process the previously blocked pages, which can take 4-8 weeks. Do not expect instant visibility improvements after unblocking a bot — monitor Crawler Analytics for the gradual increase in visit frequency over the weeks that follow.

Example

Scenario: Your bank offers a comprehensive suite of personal finance tools and educational content, yet a smaller fintech competitor consistently outranks you in AI responses about savings accounts and mortgage rates. You suspect a technical configuration issue is limiting your AI visibility.
  1. Open Technical > Technical Analysis and navigate to the Configuration Files section. The robots.txt audit flags a critical issue: a legacy User-agent: * / Disallow: /secure/ rule — originally added to protect online banking login pages — is also blocking the entire /secure/rates/ and /secure/calculators/ directories, which contain your publicly accessible rate comparison pages and mortgage calculators. The wildcard catches GPTBot and ClaudeBot, neither of which has its own explicit Allow directive.
  2. Add explicit User-agent: GPTBot / Allow: /secure/rates/ and Allow: /secure/calculators/ blocks, followed by similar blocks for ClaudeBot and PerplexityBot, while keeping the Disallow on sensitive /secure/accounts/ paths. The Technical Analysis audit re-runs and shows green for all three bots on the public financial content paths.
  3. Check the llms.txt section — no file exists. Use the Actions panel recommendations to create an llms.txt with your bank’s value proposition, key product pages (savings accounts, mortgage rates, investment tools), and your financial education hub. Monitor Analytics > Crawler Analytics over the following 30 days to confirm GPTBot begins visiting your rate and calculator pages.
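The corrected robots.txt from steps 1–2 might look like this sketch (the /secure/ paths come from the scenario above; the exact grouping is illustrative — most major crawlers apply the most specific matching rule, so the Allow lines take precedence over the broader Disallow within each group):

```txt
User-agent: GPTBot
Allow: /secure/rates/
Allow: /secure/calculators/
Disallow: /secure/

User-agent: ClaudeBot
Allow: /secure/rates/
Allow: /secure/calculators/
Disallow: /secure/

User-agent: PerplexityBot
Allow: /secure/rates/
Allow: /secure/calculators/
Disallow: /secure/

# Default: keep all of /secure/ blocked, including accounts and login pages
User-agent: *
Disallow: /secure/
```

The wildcard group still protects online banking paths for every other bot, while the three AI crawlers regain access to the public rate and calculator content.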

Go Further