AI Visibility

Generative Engine Optimisation, end to end

Schema architecture, llms.txt, robots.txt bot tiering, and the citation-share metric that replaces SERP ranking.


▸ Bottom line up front

When a buyer asks Perplexity, ChatGPT, Gemini, Claude, or Google AI Overviews about your category in 2026, the answer they get either cites you or it does not. GEO is the discipline that makes your brand the source the AI engine reaches for. It is not SEO with a new label. It is a different optimisation target: citation share across the AI engines, not ranking position on the SERP. Princeton GEO research published on arXiv (2311.09735) shows that statistic density lifts AI citation impressions by about 37 percent, expert quotation by about 27 percent, and outbound citation by about 22 percent. Those three patterns are the GEO playbook in one sentence.

What GEO actually is

Generative Engine Optimisation is the work of making a brand citable by generative AI systems. It is the discipline that replaces traditional SEO when the user reads the AI answer instead of clicking through to a website. It sits alongside paid acquisition, not apart from it: the same entity signals that earn an AI citation also sharpen the audience and creative models behind AI performance marketing, so a brand that is legible to the engines is also legible to the buying systems.

GEO has four pillars:

  1. Entity hygiene: Organization schema, sameAs across Wikidata, Crunchbase, LinkedIn Company, and named-publisher mentions
  2. Structured data: Article, Service, FAQPage, HowTo, and BreadcrumbList in a single @graph, cross-referenced by @id
  3. Sources-first content: statistic density, expert quotation, primary citation, self-contained 50 to 150 word chunks
  4. Bot access: robots.txt that allows inference bots (OAI-SearchBot, PerplexityBot, Claude-SearchBot) while blocking training crawlers if you choose

The six AI engines that matter in 2026

AI engines that cite brands in 2026
| Engine | Volume rank | Citation behaviour |
|---|---|---|
| Google AI Overviews + AI Mode | 1 | High citation density, prefers schema-rich pages, surfaces FAQPage answers verbatim |
| ChatGPT Search | 2 | Cites sources inline with hyperlinks, prefers primary-cited content with statistic density |
| Perplexity | 3 | Source list at end of answer, prefers self-contained chunks of 100-150 words |
| Gemini (inside Google products) | 4 | Cites alongside Google AI Overviews, similar selection criteria |
| Bing Copilot | 5 | Cites Bing-indexed content, prefers Bing-friendly schema and structured tables |
| Claude | 6 | Web search citations inline, prefers expert-quoted content and named sources |

Entity hygiene: making the AI know who you are

Before an AI engine can cite you it needs to know you exist as a discrete entity. Entity hygiene is the work of making your brand a stable, well-described node in the entity graphs the AI engines query. It carries the most weight in considered-purchase categories where buyers research before they commit, which is why we lean on it hardest for sectors like fintech, where a citation in an AI answer is often the first time a prospect meets the brand.

  • Organization schema with all five identifier fields: name, alternateName, legalName, identifier (UEN, EIN, or the local equivalent), and address as a PostalAddress block
  • sameAs ladder: link to Wikidata, Crunchbase, LinkedIn Company, X, YouTube, GitHub if relevant. Wikidata in particular is the public entity graph the AI engines reconcile against
  • Named-publisher mentions: brand mentions in publications the AI engines crawl as authoritative (industry trade press, regulator publications, academic citations)
  • Founder Person schema with hasCredential entries, sameAs to LinkedIn, jobTitle, worksFor cross-referenced @id to the Organization node
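The checklist above can be sketched as a small Python builder. This is a minimal illustration, not a canonical template: every name, URL, and identifier value below is a placeholder we made up (example.com, "Example Consulting", the UEN value, the Wikidata Q-number), and `missing_org_fields` is a hypothetical helper that only checks the five fields listed above.

```python
import json

# Hypothetical @id values; real ones should be stable URLs on your own domain.
ORG_ID = "https://example.com/#organization"
FOUNDER_ID = "https://example.com/#founder"

# The five identifier fields named in the checklist above.
REQUIRED_ORG_FIELDS = ("name", "alternateName", "legalName", "identifier", "address")


def build_entity_graph() -> dict:
    """Return a JSON-LD document with an Organization and a founder Person."""
    organization = {
        "@type": "Organization",
        "@id": ORG_ID,
        "name": "Example Consulting",            # placeholder
        "alternateName": "ExampleCo",            # placeholder
        "legalName": "Example Consulting Pte. Ltd.",
        "identifier": {
            "@type": "PropertyValue",
            "propertyID": "UEN",
            "value": "000000000X",               # placeholder registration number
        },
        "address": {
            "@type": "PostalAddress",
            "streetAddress": "1 Example Street",
            "addressLocality": "Singapore",
            "addressCountry": "SG",
        },
        # The sameAs ladder: one URL per external profile (all placeholders).
        "sameAs": [
            "https://www.wikidata.org/wiki/Q00000000",
            "https://www.crunchbase.com/organization/example-consulting",
            "https://www.linkedin.com/company/example-consulting",
        ],
    }
    founder = {
        "@type": "Person",
        "@id": FOUNDER_ID,
        "name": "Jane Doe",                      # placeholder
        "jobTitle": "Founder",
        # Pointer back to the Organization node by @id, not a copy of it.
        "worksFor": {"@id": ORG_ID},
        "sameAs": ["https://www.linkedin.com/in/example-founder"],
        "hasCredential": [
            {"@type": "EducationalOccupationalCredential", "name": "Example Certification"}
        ],
    }
    return {"@context": "https://schema.org", "@graph": [organization, founder]}


def missing_org_fields(document: dict) -> list[str]:
    """List the required identifier fields absent from the Organization node."""
    for node in document["@graph"]:
        if node.get("@type") == "Organization":
            return [field for field in REQUIRED_ORG_FIELDS if field not in node]
    return list(REQUIRED_ORG_FIELDS)
```

Serialise the result with `json.dumps(build_entity_graph(), indent=2)` and place it in a script tag of type application/ld+json. The `worksFor` pointer is the part that matters: it ties the founder to the Organization node instead of describing the company twice.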

Structured data: the @graph pattern

Every page on the site ships a single JSON-LD @graph with cross-referenced @id pointers. JSON-LD is the structured-data format Google recommends, and a single @graph lets every node on the page point at the others instead of repeating them. The 13 nodes that make up a complete consultancy page graph:

  1. Organization + ProfessionalService dual type
  2. Parent Organization (separate @graph node)
  3. Person (founder) with hasCredential
  4. WebSite with SearchAction
  5. WebPage with mainEntity, lastReviewed, reviewedBy, hasPart, Speakable
  6. Article or TechArticle wrapping the body
  7. BreadcrumbList
  8. Service with alternateName, hasOfferCatalog, areaServed
  9. FAQPage with author and dateModified per Answer
  10. HowTo with HowToStep array
  11. Dataset where benchmarks are published
  12. ImageObject for OG image
  13. ImageObject for hero accent

Validate every @graph block with json.loads before deploy. A single trailing comma or unclosed brace silently invalidates the entire script tag, and the AI engines see nothing.
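A pre-deploy check along those lines might look like the sketch below. It uses only the Python standard library: `html.parser` to pull out each JSON-LD script tag and `json.loads` to parse it. The dangling-@id check is our own addition, and it assumes that a dictionary containing nothing but "@id" is a pointer and anything richer is a node definition. `JsonLdExtractor`, `collect_ids`, and `validate_page` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def collect_ids(node, defined, referenced):
    """Walk parsed JSON-LD, sorting @id values into definitions and pointers."""
    if isinstance(node, dict):
        node_id = node.get("@id")
        if node_id is not None:
            # Assumption: {"@id": ...} alone is a pointer, anything richer is a node.
            (referenced if set(node) == {"@id"} else defined).add(node_id)
        for value in node.values():
            collect_ids(value, defined, referenced)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, defined, referenced)


def validate_page(html: str) -> list[str]:
    """Return a list of problems found in the page's JSON-LD; empty means clean."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    if not parser.blocks:
        return ["no JSON-LD script tag found"]
    errors = []
    for index, raw in enumerate(parser.blocks):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(
                f"block {index}: invalid JSON at line {exc.lineno}, "
                f"column {exc.colno}: {exc.msg}"
            )
            continue
        defined, referenced = set(), set()
        collect_ids(data, defined, referenced)
        for dangling in sorted(referenced - defined):
            errors.append(f"block {index}: @id {dangling} is referenced but never defined")
    return errors
```

Run `validate_page` over the rendered HTML of each template in CI and fail the build on a non-empty result. Passing this check only proves the JSON parses and the pointers resolve; it does not confirm the markup is valid schema.org vocabulary.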

Sources-first content: the Princeton GEO playbook

Princeton GEO research (arXiv:2311.09735) measured which content patterns lift AI citation impressions. The three highest-impact patterns come first, followed by two with smaller lifts:

Princeton GEO citation lift by content pattern
| Pattern | Lift in AI citation impressions | How to apply |
|---|---|---|
| Statistic addition | about 37 percent | At least one specific number per 150-200 words. Source named inline. |
| Expert quotation | about 27 percent | Quote a named industry source or regulator with attribution in the same paragraph. |
| Outbound citation | about 22 percent | Hyperlink to the primary source publisher (.gov, vendor official, academic). |
| Authoritative phrasing | about 15 percent | Phrase claims as decisive operator judgment, not hedged general-purpose advice. |
| Easy-to-understand | about 10 percent | Self-contained 50-150 word chunks that read as a complete answer on their own. |
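Two of those rules are mechanical enough to lint: chunk length (50 to 150 words) and statistic density (at least one number per 200 words, the loose end of the range). The sketch below is a rough heuristic under our own assumptions: chunks are paragraphs separated by blank lines, and any run of characters starting with a digit counts as one statistic. `audit_chunks` is a hypothetical helper, and it cannot tell whether a number is sourced or meaningful.

```python
import re

# Rough heuristic: a run that starts with a digit counts as one "statistic".
NUMBER_PATTERN = re.compile(r"\d[\d,.]*")


def audit_chunks(text, min_words=50, max_words=150, words_per_stat=200):
    """Report word count and statistic count for each blank-line-separated chunk."""
    reports = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        words = len(chunk.split())
        if words == 0:
            continue
        stats = len(NUMBER_PATTERN.findall(chunk))
        # Ceiling division: one number required per `words_per_stat` words.
        required = -(-words // words_per_stat)
        reports.append(
            {
                "words": words,
                "stats": stats,
                "length_ok": min_words <= words <= max_words,
                "stats_ok": stats >= required,
            }
        )
    return reports
```

Treat a failed check as a prompt for an editor, not an automatic rewrite: a short transition paragraph can legitimately fall under 50 words.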

Bot access: the robots.txt tier policy

Robots.txt in 2026 is a tier policy, not a binary allow/disallow. Three tiers:

  • Training crawlers: GPTBot, Google-Extended, CCBot, ClaudeBot. Block these if your content is your IP and you do not want it baked into the next model release.
  • Inference crawlers: OAI-SearchBot, ChatGPT-User, PerplexityBot, Claude-SearchBot, Claude-User. Allow these. They are the bots that fetch pages to answer user queries, and blocking them means you do not get cited.
  • Default crawlers: Googlebot, Bingbot, DuckDuckBot, etc. Allow with standard rules. Gemini has no user-agent token of its own, and Google AI Overviews and Bing Copilot draw on the Googlebot and Bingbot indexes, so these crawlers also decide whether those engines can cite you.

Most marketing sites get this wrong by either blocking everything or allowing everything. The middle path (block training, allow inference) is the 2026 default for consultancy and B2B brands. Get the access tier right and citation becomes a demand source in its own right, feeding the same funnel that performance marketing spends paid budget to fill.
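The middle path can be generated and checked with a few lines of Python. This is a sketch under stated assumptions: the `TIERS` mapping holds only an illustrative subset of tokens and should be confirmed against each vendor's current crawler documentation, and `build_robots_txt` and `can_fetch` are hypothetical helpers. The verification step uses the standard library's `urllib.robotparser`, which is one parser's reading of the file, not a guarantee of how each bot behaves.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of user-agent tokens; verify against vendor docs before use.
TIERS = {
    "training": ["GPTBot", "Google-Extended", "CCBot"],
    "inference": ["OAI-SearchBot", "PerplexityBot"],
}


def build_robots_txt(block_training: bool = True) -> str:
    """Render a robots.txt that blocks training bots and allows everything else."""
    lines = []
    training_rule = "Disallow: /" if block_training else "Allow: /"
    for agent in TIERS["training"]:
        lines += [f"User-agent: {agent}", training_rule, ""]
    for agent in TIERS["inference"]:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    # Default tier: every other crawler, including Googlebot and Bingbot.
    lines += ["User-agent: *", "Allow: /", ""]
    return "\n".join(lines)


def can_fetch(robots_txt: str, agent: str, url: str = "https://example.com/insights/") -> bool:
    """Check how the standard-library parser reads the policy for one agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Generating the file from one mapping keeps the tier decision in a single place, and the `can_fetch` check catches the common failure where a blanket Disallow accidentally locks out the inference bots.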

Questions, answered.

What is the difference between GEO, AEO, and SEO?

SEO is the discipline of ranking on the blue-link SERP. AEO (Answer Engine Optimisation) is the discipline of getting featured in answer boxes and AI Overviews on Google specifically. GEO (Generative Engine Optimisation) is the broader discipline of being cited across all generative AI engines (Perplexity, ChatGPT, Gemini, Claude, Bing Copilot, Google AI Overviews). GEO subsumes AEO and overlaps with SEO; the three are complementary, not exclusive.

How do we measure GEO success?

The primary metric is citation share across the six AI engines for your priority unbranded query set, tracked as a quarter-over-quarter delta. The secondary metric is competitor share-of-voice on the same queries. The leapbuzz visibility-citation-tracker polls the engines weekly and logs share, sentiment, and named-competitor citations. Without weekly polling you cannot tell whether your work moved the metric.
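The arithmetic behind the metric is simple enough to sketch. Here citation share is taken to mean the fraction of polled answers per engine that cite the brand, which is our working definition. The `PollResult` record shape is hypothetical and is not the tracker's actual log format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PollResult:
    """One polled answer: which engine, which query, which brands it cited."""
    engine: str
    query: str
    cited_brands: frozenset


def citation_share(results, brand):
    """Fraction of polled answers per engine that cite `brand`."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for result in results:
        total[result.engine] += 1
        if brand in result.cited_brands:
            cited[result.engine] += 1
    return {engine: cited[engine] / total[engine] for engine in total}


def share_delta(current, previous):
    """Change in share per engine; engines absent last period count from zero."""
    return {engine: current[engine] - previous.get(engine, 0.0) for engine in current}
```

Running `citation_share` once for your own brand and once for each named competitor over the same results gives the share-of-voice comparison, and `share_delta` between two quarters gives the movement.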

Does GEO replace SEO?

No, it complements SEO. The blue-link SERP is shrinking but not gone; for high-intent commercial queries it still drives most direct traffic. GEO captures the upstream layer where buyers research vendors via AI before they hit your paid funnel. The two work together; the accounts that ship both outperform single-discipline accounts.

What does the Princeton GEO research actually say?

Princeton GEO research (arXiv:2311.09735) measured the lift in AI citation impressions from content pattern changes. The three highest-impact patterns were statistic addition (about 37 percent lift), expert quotation (about 27 percent lift), and outbound citation (about 22 percent lift). Authoritative phrasing and easy-to-understand chunking showed smaller lifts of about 15 and about 10 percent respectively. Those numbers are the GEO playbook.

Should we block AI training crawlers?

It depends on whether your content is the asset. For consultancies, agencies, and B2B brands where the writing carries IP and brand authority, blocking training crawlers (GPTBot, Google-Extended, CCBot, ClaudeBot) while allowing inference crawlers (OAI-SearchBot, PerplexityBot, Claude-SearchBot) is the 2026 default. For e-commerce or commodity-content sites the trade-off is different; training inclusion may help discoverability more than it costs in IP leakage.
