Do the test before you continue reading. Open a terminal and type:
curl -A "GPTBot" https://deine-domain.de/ -I
If the response is 200, your server lets requests with OpenAI's crawler user agent through. If it is a 403, or if your robots.txt carries a Disallow rule for GPTBot (curl does not check that file for you), you are practically non-existent for ChatGPT - no matter how well you rank on Google. This is the uncomfortable truth: a top 3 ranking on Google is of no use to you if the answer your customer sees no longer comes from Google at all.
Generative AI models such as ChatGPT, Claude and Perplexity are shifting the way your target group researches. Anyone who runs a shop or is responsible for one will notice this in a simple figure: the proportion of sessions that no longer start via a classic search results page, but via a fully formulated AI response. If your website does not appear in this response, you will lose the customer before they have even seen your shop.
This article explains what AI visibility is and how you can check and optimise your website along five pillars.
Ranking is no longer enough. You need to be cited.
Three developments make the topic urgent: OpenAI has given ChatGPT its own search layer, Google has rolled out its AI Overviews on a broad scale, and Perplexity has established itself as a serious search alternative. In addition, there are AI assistants with search functions such as Grok or Microsoft Copilot.
The difference from traditional search is where most people have to rethink their old SEO strategy: a Google results page shows ten blue links. An AI-generated answer often only cites two to four sources. Being indexed is not enough. The language model must classify your page as citation-worthy.
In practical terms, this means that a website in 5th place on Google can be cited more often in an AI response than its 1st place competitor - if it is optimised for AI visibility and the other is not. Link building and keywords remain relevant, but they are no longer the only leverage.
What AI visibility concretely means
AI visibility is the probability that your website will appear in a response generated by a language model. It is measurable and controllable, and it follows five rules that go beyond traditional SEO:
- Accessibility for AI crawlers - your site must be readable and retrievable.
- Entity clarity - the model must understand who you are and what you do.
- Citable content - structured and concise enough to extract a passage.
- External trust signals - other sources confirm your expertise.
- Technical hygiene - fast, up-to-date, cleanly rendered.
These five are interrelated. If one is missing, your AI visibility will drop noticeably, no matter how well the other four fit.
Pillar 1 - Can the AI crawlers get in at all?
Start here, because everything else is pointless if the answer is no. A common finding in AI visibility checks: The website blocks out the bots, often without anyone consciously deciding to do so.
The main AI crawlers are GPTBot (OpenAI), ClaudeBot (Anthropic) and PerplexityBot (Perplexity). Google controls the use of your content for AI training via the robots.txt token Google-Extended - this is not a separate crawler with its own user agent, but a rule that is based on the existing Googlebot infrastructure (and only affects the training, not the ranking).
The logic is important: robots.txt allows everything by default. You don't need to actively "invite" the bots - you just need to make sure that no Disallow rule is blocking them. A block typically looks like this:
# robots.txt - blocks AI crawlers (often unintentionally)
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
If something like this is in your robots.txt, get it out. However, the problem is often not in the robots.txt, but one level deeper:
- Firewall / WAF: Cloudflare, a WAF rule set or your host blocks unknown user agents across the board. For Shopware setups behind Cloudflare, this is the most common silent killer - the robots.txt says "allowed", the WAF says "403".
- Security plugins / bot management: Tools that automatically block "suspicious" bot patterns often also catch GPTBot & Co.
- Crawl-delay: appears in many examples, but is not an official standard and is simply ignored by many bots - not a reliable lever.
The curl test from the beginning is your quickest diagnosis. 200 means welcome, everything else means: check who is blocking.
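The split between robots.txt and the layers below it can be sketched in a few lines of Python. This is a minimal sketch, not a full audit: `diagnose` is a hypothetical helper, the status codes are assumed to come from one curl run per user agent, and `https://example.com/` stands in for your own domain.

```python
from urllib.robotparser import RobotFileParser

# Crawler names as listed in this article.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def diagnose(robots_txt: str, status_by_agent: dict[str, int],
             url: str = "https://example.com/") -> dict[str, str]:
    """Say for each AI crawler whether robots.txt or a deeper layer blocks it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    report = {}
    for agent in AI_CRAWLERS:
        status = status_by_agent[agent]
        if not parser.can_fetch(agent, url):
            report[agent] = "blocked by robots.txt"
        elif status != 200:
            # robots.txt allows the bot, so the block sits in a WAF, host or plugin.
            report[agent] = f"blocked below robots.txt (HTTP {status})"
        else:
            report[agent] = "allowed"
    return report


# Hypothetical input: robots.txt blocks GPTBot, a WAF answers ClaudeBot with 403.
robots_txt = """\
User-agent: GPTBot
Disallow: /
"""
statuses = {"GPTBot": 200, "ClaudeBot": 403, "PerplexityBot": 200}

for agent, verdict in diagnose(robots_txt, statuses).items():
    print(f"{agent}: {verdict}")
```

The sketch treats every non-200 status as a block, in line with the rule of thumb above. A status that was obtained with a spoofed user agent is only an approximation of what the real crawler sees.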
Pillar 2 - Does the model understand who you are?
Language models don't just read text, they read structure. Structured data are the labels by which a model recognises whether you are an agency, a SaaS provider or a furniture shop - and therefore which questions you are suitable for as a source. An incorrectly labelled company is cited less frequently and in the wrong contexts.
The tool for this is schema.org markup in JSON-LD format. For a B2B company, it looks like this:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "nextlevels GmbH",
"url": "https://next-levels.de",
"logo": "https://next-levels.de/logo.png",
"description": "Shopware agency for e-commerce and AI visibility",
"sameAs": [
"https://www.linkedin.com/company/nextlevels",
"https://www.shopware.com/de/partner/agenturen/nextlevels-gmbh/"
],
"address": {
"@type": "PostalAddress",
"addressCountry": "DE",
"addressLocality": "Mönchengladbach"
},
"contactPoint": {
"@type": "ContactPoint",
"telephone": "+49-2161-XXXXXX",
"contactType": "Customer Service"
}
}
</script>
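Before trusting such a block, it can help to check that it parses and carries the type you intended. The following is a rough sketch using only the Python standard library. `check_organization` and the list of "required" properties are assumptions made for illustration; schema.org itself does not mandate them. Type names are case-sensitive and use the spelling `Organization`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def check_organization(html: str, required=("name", "url", "sameAs")) -> list[str]:
    """Return a list of problems found in the page's Organization markup."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    orgs = [b for b in extractor.blocks
            if isinstance(b, dict) and b.get("@type") == "Organization"]
    if not orgs:
        return ["no block with @type 'Organization' found"]
    # "required" is an illustrative choice, not a schema.org rule.
    return [f"missing property: {key}" for key in required if key not in orgs[0]]


# Hypothetical page with an incomplete Organization block.
page = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example GmbH", "url": "https://example.com"}
</script></head><body></body></html>"""

print(check_organization(page))
```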
The markup alone is not enough. Three things are needed:
- NAP consistency: Name, address and telephone must be identical on the website, in the Google company profile, on LinkedIn and in business directories. Deviations confuse the models - the most common mistake is the old address still sitting around somewhere on the web.
- FAQPage schema: If your content answers questions, mark them up as FAQs. This increases the chance that a single answer will be extracted.
- Self-description in the visible text: A clear sentence such as "We are a Shopware agency from Mönchengladbach" in the body text has a stronger effect than any hidden tag.
You can check your markup with the Rich Results Test from Google.
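For the FAQPage markup mentioned in the list above, a small generator keeps the structure consistent across pages. This is a sketch under assumptions: `faq_jsonld` is a hypothetical helper, and the sample question is only an illustration built from this article's own definition.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # ensure_ascii=False keeps umlauts readable in the output.
    return json.dumps(data, indent=2, ensure_ascii=False)


faq = faq_jsonld([
    ("What is AI visibility?",
     "The probability that your website appears in a response "
     "generated by a language model."),
])
print(f'<script type="application/ld+json">\n{faq}\n</script>')
```

The questions and answers in the markup should match what is visible on the page word for word.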
Pillar 3 - Can your text be cited?
Imagine a model is looking for an answer to "How does AI visibility differ from classic SEO?". It finds your 2,000-word article. If the answer is buried in four unstructured paragraphs of dense text, it is hard to extract and the model moves on to the next source. If it sits in a clearly delineated section with a subheading, the model is more likely to cite you.
What helps:
- Inverted pyramid per section: most important statement first, details after.
- Questions as headings: "Can the AI crawlers even get in?" fits the search query better than "crawl optimisation".
- Tables and lists for comparable values - easily machine-readable.
This is what a quotable comparison table looks like (user figures as of early 2026):
| Platform | Users (approx.) | Focus |
|---|---|---|
| ChatGPT | ~900 million WAU | Generalist, real-time links |
| Google AI Overviews | ~2 billion / month | Integrated into Google Search |
| Perplexity | >100 million MAU | Research-focused |
| Claude | ~30 million MAU | Long-form analyses |
The warning here: Don't over-structure. A text that consists of 70% bullet points reads like a collection of keywords to humans. AI optimisation must not displace the human reader - they buy in the end.
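If you want a rough number for this, the share of list lines in a draft is easy to estimate. The sketch below is deliberately crude: `bullet_share` is a hypothetical helper, it counts lines rather than words, and the 70% figure from the paragraph above is used as an illustrative threshold, not a standard.

```python
import re


def bullet_share(text: str) -> float:
    """Fraction of non-empty lines that are list items ('- ', '* ', '+ ' or '1. ')."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return 0.0
    bullets = sum(1 for line in lines if re.match(r"(?:[-*+]\s|\d+\.\s)", line))
    return bullets / len(lines)


# Hypothetical draft: one sentence of prose followed by three bullets.
draft = """Intro sentence.
- point one
- point two
- point three
"""

share = bullet_share(draft)
if share >= 0.7:
    print(f"{share:.0%} of lines are bullets - consider adding prose")
```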
Pillar 4 - Trust arises outside your website
Language models quote like cautious journalists: preferably what several independent sources confirm. A single opinion on your own site weighs less than a consensus that extends across the web.
An example makes the difference tangible. Agency A has 50 blog posts and ten backlinks from standard B2B sites. Agency B has 30 blog posts, but one of them has been picked up in three trade publications and the founder is a regular on industry podcasts. In traditional SEO, A could rank higher. In the AI answer, the model tends to cite B because external trust is denser.
What pays off: mentions in specialist media, a consistent brand image across all channels (website, LinkedIn, interviews tell the same story), thematic depth on your own domain instead of a single article, and - where appropriate - a Wikipedia entry that models treat as a source of trust. Incidentally, more content alone does not help: poorly linked, unread content is more ballast than a signal of trust.
Pillar 5 - Technology that does not slow down AI crawlers
This is the point that affects most modern shops. AI crawlers such as GPTBot, ClaudeBot and PerplexityBot do not render JavaScript. They read the initial HTML - and nothing else. If your storefront is a pure client-side SPA and only loads the content via JavaScript, these bots see a blank page.
This is not a marginal issue: If you run a React or Vue front-end without server-side rendering, you are invisible to AI responses, even though the page looks perfect in the browser. The solution is server-side rendering or static generation (Next.js, Nuxt, SSG). Shopware delivers its storefront rendered on the server side - a structural advantage that should not be given away lightly by a headless conversion without SSR.
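Whether your key content is in the initial HTML can be checked without a browser. The following sketch assumes you have already saved the raw server response (for example with curl). `visible_without_js` and the two sample pages are hypothetical and only illustrate the difference between a client-side shell and server-rendered markup.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>: roughly what a non-rendering bot reads."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_without_js(raw_html: str, phrases: list[str]) -> dict[str, bool]:
    """Report for each phrase whether it appears in the un-rendered HTML text."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return {phrase: phrase.lower() in text for phrase in phrases}


# Hypothetical responses: an empty SPA shell and a server-rendered product page.
spa_shell = '<html><body><div id="app"></div><script src="/bundle.js"></script></body></html>'
ssr_page = "<html><body><h1>Oak dining table</h1><p>Price: 499 EUR</p></body></html>"

print(visible_without_js(spa_shell, ["Oak dining table"]))
print(visible_without_js(ssr_page, ["Oak dining table", "499 EUR"]))
```

Pick phrases that matter for a citation, such as the product name, the price or the core claim of an article.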
In addition, the usual technical hygiene counts:
- Performance: Google's "good" threshold for the time to first byte is ≤ 800 ms (according to web.dev). A slow server means that fewer pages are captured per crawl.
- Slim HTML: As a rule of thumb, it helps to keep the HTML size low and minimise CSS/JS - there is no hard standard for this.
- Maintained sitemap.xml with up-to-date lastmod information.
- Refresh: update old posts, repair dead links. Outdated content signals a lack of maintenance.
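The sitemap point in the list above lends itself to a quick script. This is a minimal sketch using the Python standard library. `stale_urls` is a hypothetical helper, the 365-day cut-off is an arbitrary assumption, and the code expects lastmod values that start with a full YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# Namespace of the sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 365) -> list[str]:
    """Return URLs whose <lastmod> is missing or older than max_age_days."""
    root = ET.fromstring(sitemap_xml)
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        # Only the date part (first 10 characters) is compared.
        if lastmod is None or date.fromisoformat(lastmod.strip()[:10]) < cutoff:
            stale.append(loc)
    return stale


# Hypothetical sitemap with one fresh, one old and one undated entry.
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-01-10</lastmod></url>
  <url><loc>https://example.com/old-post</loc><lastmod>2022-03-01T08:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/no-date</loc></url>
</urlset>"""

for url in stale_urls(sitemap, today=date(2026, 2, 1)):
    print("review:", url)
```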
And about llms.txt, which is currently haunting the SEO blogs: This is not a robots-like access file with Allow:/Disallow:. The proposal (llmstxt.org) describes a markdown file in the root that gives LLMs a curated map of your most important content - title, short description, then thematic link lists. You control access via robots.txt; llms.txt is discovery, not access control. Simplified:
# llms.txt
# H1 title, short description as blockquote, then curated link lists
> nextlevels: Shopware agency. The most important resources for AI systems.
## Services
- [Shopware migration](https://next-levels.de/...): SW5 → SW6 ...
## Blog
- [Check AI visibility](https://next-levels.de/...): The 5 pillars ...
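Since the file is plain markdown, it can be generated from the same data that feeds your navigation. The sketch below follows the layout described in the llmstxt.org proposal (H1 title, blockquote summary, H2 sections with link lists). `build_llms_txt`, the shop name and all example.com URLs are hypothetical.

```python
def build_llms_txt(title: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt file: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content map for an example shop.
content = build_llms_txt(
    title="Example Shop",
    summary="Furniture shop. The most important resources for AI systems.",
    sections={
        "Guides": [
            ("Care guide", "https://example.com/care", "How to maintain oak furniture"),
        ],
        "Blog": [
            ("Delivery times", "https://example.com/blog/delivery", "Current lead times"),
        ],
    },
)
print(content)
```

Keep the list curated: a handful of strong pages per section is closer to the intent of the proposal than a full dump of the sitemap.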
The five pillars at a glance
| Pillar | Core measure | Most common error |
|---|---|---|
| 1. Crawler access | No Disallow blocking, check WAF/Firewall | Cloudflare/WAF silently blocks GPTBot via 403 |
| 2. Entity clarity | schema.org + consistent NAP data | Old address somewhere on the web |
| 3. Citability | Structure, questions as H2 headings, tables | Walls of text without extractable passages |
| 4. External trust | PR, trade media, consistent brand image | Relying only on your own website |
| 5. Tech hygiene | SSR instead of pure client SPA, performance | AI crawlers don't render JS → empty page |
AI visibility is the answer side of the same coin as classic SEO - if you want to pull both levers together, you'll find more on our search engine optimisation page. And if you want to delve deeper into storefront architecture and the SSR topic from pillar 5, our Angular technology page is the right place for you.
Check now if AI can find you
You can do the curl test from the beginning yourself in 30 seconds. The other four pillars require a closer look. The nextlevels AI visibility check scans your website for all five pillars free of charge and provides a score plus prioritised recommendations for action. No signup, no obligation - just your data and concrete next steps.