Generative Engine Optimisation, end to end

What GEO actually is

Generative Engine Optimisation is the work of making a brand citable by generative AI systems. It takes over from traditional SEO in the moments when the user reads the AI answer instead of clicking through to a website. It sits alongside paid acquisition, not apart from it: the same entity signals that earn an AI citation also sharpen the audience and creative models behind AI performance marketing, so a brand that is legible to the engines is also legible to the buying systems.

GEO has four pillars:

Entity hygiene: Organization schema, sameAs across Wikidata, Crunchbase, LinkedIn Company, and named-publisher mentions
Structured data: Article, Service, FAQPage, HowTo, BreadcrumbList with @graph cross-referenced @id
Sources-first content: statistic density, expert quotation, primary citation, self-contained 50 to 150 word chunks
Bot access: robots.txt that allows inference bots (OAI-SearchBot, PerplexityBot, ClaudeBot, Gemini) while blocking training crawlers if you choose

The six AI engines that matter in 2026

AI engines that cite brands in 2026
Engine	Volume rank	Citation behaviour
Google AI Overviews + AI Mode	1	High citation density, prefers schema-rich pages, surfaces FAQPage answers verbatim
ChatGPT Search	2	Cites sources inline with hyperlinks, prefers primary-cited content with statistic density
Perplexity	3	Source-list at end of answer, prefers self-contained chunks of 100-150 words
Gemini (inside Google products)	4	Cites alongside Google AI Overviews, similar selection criteria
Bing Copilot	5	Cites Bing-indexed content, prefers Bing-friendly schema and structured tables
Claude	6	Web search citations inline, prefers expert-quoted content and named sources

Entity hygiene: making the AI know who you are

Before an AI engine can cite you it needs to know you exist as a discrete entity. Entity hygiene is the work of making your brand a stable, well-described node in the entity graphs the AI engines query. It carries the most weight in considered-purchase categories where buyers research before they commit, which is why we lean on it hardest for sectors like fintech, where a citation in an AI answer is often the first time a prospect meets the brand.

Organization schema with all five identifier fields: name, alternateName, legalName, identifier (UEN/EIN/etc.), and address (as a PostalAddress block)
sameAs ladder: link to Wikidata, Crunchbase, LinkedIn Company, X, YouTube, GitHub if relevant. Wikidata in particular is the public entity graph the AI engines reconcile against
Named-publisher mentions: brand mentions in publications the AI engines crawl as authoritative (industry trade press, regulator publications, academic citations)
Founder Person schema with hasCredential entries, sameAs to LinkedIn, jobTitle, worksFor cross-referenced @id to the Organization node
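
As a rough illustration of the list above, here is a minimal Python sketch that builds an Organization node and a founder Person node and cross-references them by @id. The base URL, company name, identifier value, and every sameAs link are hypothetical placeholders, not real profiles, and the choice of PropertyValue for identifier and EducationalOccupationalCredential for hasCredential is one reasonable schema.org reading rather than the only valid one.

```python
import json

BASE = "https://example.com"  # hypothetical site URL


def build_entity_graph() -> dict:
    """Return a JSON-LD document with Organization and founder Person nodes."""
    org_id = f"{BASE}/#organization"
    person_id = f"{BASE}/#founder"

    organization = {
        "@type": "Organization",
        "@id": org_id,
        # The five identifier fields named in the text.
        "name": "Example Consulting",                 # placeholder
        "alternateName": "ExampleCo",                 # placeholder
        "legalName": "Example Consulting Pte. Ltd.",  # placeholder
        "identifier": {
            "@type": "PropertyValue",
            "propertyID": "UEN",
            "value": "000000000X",                    # placeholder registration number
        },
        "address": {
            "@type": "PostalAddress",
            "streetAddress": "1 Example Street",
            "addressLocality": "Singapore",
            "postalCode": "000000",
            "addressCountry": "SG",
        },
        # sameAs ladder: replace each placeholder with the real profile URL.
        "sameAs": [
            "https://www.wikidata.org/wiki/Q0",
            "https://www.crunchbase.com/organization/example-consulting",
            "https://www.linkedin.com/company/example-consulting",
        ],
        "founder": {"@id": person_id},  # pointer to the Person node
    }

    person = {
        "@type": "Person",
        "@id": person_id,
        "name": "Jane Doe",                           # placeholder
        "jobTitle": "Founder",
        "worksFor": {"@id": org_id},                  # pointer back to the Organization
        "sameAs": ["https://www.linkedin.com/in/example-founder"],
        "hasCredential": [
            {
                "@type": "EducationalOccupationalCredential",
                "name": "Example Certification",      # placeholder
            }
        ],
    }

    return {"@context": "https://schema.org", "@graph": [organization, person]}


if __name__ == "__main__":
    print(json.dumps(build_entity_graph(), indent=2))
```

Using `{"@id": ...}` pointers rather than nesting the full node keeps each entity defined exactly once, which is what lets other nodes in the page graph refer to the same Organization.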

Structured data: the @graph pattern

Every page on the site ships a single JSON-LD @graph with cross-referenced @id pointers. JSON-LD is the structured-data format Google recommends, and it is the one the AI engines consume. A complete consultancy page graph has 13 nodes:

Organization + ProfessionalService dual type
Parent Organization (separate @graph node)
Person (founder) with hasCredential
WebSite with SearchAction
WebPage with mainEntity, lastReviewed, reviewedBy, hasPart, speakable
Article or TechArticle wrapping the body
BreadcrumbList
Service with alternateName, hasOfferCatalog, areaServed
FAQPage with author and dateModified per Answer
HowTo with HowToStep array
Dataset where benchmarks are published
ImageObject for OG image
ImageObject for hero accent

Validate every @graph block with json.loads before deploy. A single trailing comma or unclosed brace silently invalidates the entire script tag, and the AI engines see nothing.
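
The json.loads check can be wired into a pre-deploy step along these lines. This is a sketch using only the Python standard library; the names `JsonLdExtractor`, `collect_refs`, and `validate_page` are illustrative. The dangling-@id check is an extra house rule assumed here, not something the text requires: JSON-LD itself permits pointers to nodes defined elsewhere, so relax it if you reference external graphs.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def collect_refs(node, defined, referenced):
    """Walk parsed JSON-LD, separating node definitions from bare @id pointers."""
    if isinstance(node, dict):
        node_id = node.get("@id")
        if node_id is not None:
            # A dict whose only key is @id is a pointer; anything larger defines a node.
            (referenced if len(node) == 1 else defined).add(node_id)
        for value in node.values():
            collect_refs(value, defined, referenced)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, defined, referenced)


def validate_page(html: str) -> list:
    """Return a list of human-readable problems; an empty list means the page passed."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()

    errors = []
    if not parser.blocks:
        errors.append("no JSON-LD script tag found")
    for index, raw in enumerate(parser.blocks):
        try:
            data = json.loads(raw)  # strict parser: rejects trailing commas, open braces
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: invalid JSON ({exc.msg} at line {exc.lineno})")
            continue
        defined, referenced = set(), set()
        collect_refs(data, defined, referenced)
        for missing in sorted(referenced - defined):
            errors.append(f"block {index}: dangling @id {missing}")
    return errors
```

A build script could call `validate_page` on each rendered HTML file and fail the deploy when the returned list is non-empty.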

Sources-first content: the Princeton GEO playbook

Princeton GEO research (arXiv:2311.09735) measured which content patterns lift AI citation impressions. The three highest-impact patterns lead the table; two smaller ones follow:

Princeton GEO citation lift by content pattern
Pattern	Lift in AI citation impressions	How to apply
Statistic addition	about 37 percent	At least one specific number per 150-200 words. Source named inline.
Expert quotation	about 27 percent	Quote a named industry source or regulator with attribution in the same paragraph.
Outbound citation	about 22 percent	Hyperlink to the primary source publisher (.gov, vendor official, academic).
Authoritative phrasing	about 15 percent	Phrase claims as decisive operator judgment, not hedged general-purpose advice.
Easy-to-understand	about 10 percent	Self-contained 50-150 word chunks that read as a complete answer on their own.
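
Two of the rules in the table, chunk length and statistic density, are mechanical enough to lint. The sketch below is a crude heuristic under stated assumptions: it treats each blank-line-separated paragraph as a chunk, and it counts any digit as a "statistic", which will produce false positives (years, list numbers) and cannot tell whether the number is sourced. The function name `audit_chunks` and its thresholds are illustrative.

```python
import re


def audit_chunks(text: str, min_words: int = 50, max_words: int = 150) -> list:
    """Report word count, length compliance, and digit presence for each paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    report = []
    for paragraph in paragraphs:
        words = len(paragraph.split())
        report.append(
            {
                "words": words,
                # Self-contained chunk target from the table: 50 to 150 words.
                "length_ok": min_words <= words <= max_words,
                # Rough proxy for "at least one specific number" in the chunk.
                "has_statistic": bool(re.search(r"\d", paragraph)),
            }
        )
    return report


if __name__ == "__main__":
    sample = "A short paragraph with no figures.\n\nAnother short one citing 37 percent."
    for row in audit_chunks(sample):
        print(row)
```

Because a chunk capped at 150 words that contains a number automatically meets the one-number-per-150-200-words guideline, a single per-chunk flag is enough for a first pass; expert quotation and outbound citation still need human review.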

Bot access: the robots.txt tier policy

Robots.txt in 2026 is a tier policy, not a binary allow/disallow. Three tiers:

Training crawlers: GPTBot, Google-Extended, CCBot, anthropic-ai. Block these if your content is your IP and you do not want it baked into the next model release.
Inference crawlers: OAI-SearchBot, PerplexityBot, ClaudeBot, Gemini. Allow these. They are the bots that crawl in real time to answer user queries, and blocking them means you do not get cited.
Default crawlers: Googlebot, Bingbot, DuckDuckBot, etc. Allow with standard rules. Bingbot also feeds Bing Copilot, which cites Bing-indexed content.

Most marketing sites get this wrong by either blocking everything or allowing everything. The middle path (block training, allow inference) is the 2026 default for consultancy and B2B brands. Get the access tier right and citation becomes a demand source in its own right, feeding the same funnel that performance marketing spends paid budget to fill.
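
The block-training, allow-inference policy can be generated and checked before it ships. The sketch below uses Python's standard urllib.robotparser to confirm the generated file behaves as intended. The user-agent tokens are copied from this article and are an assumption: vendors rename and retire crawler tokens, so verify each one against the vendor's current documentation before relying on it. The helper names `build_robots_txt` and `can_fetch` are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Tokens as listed in this article; confirm against each vendor's current docs.
TRAINING_BOTS = ["GPTBot", "Google-Extended", "CCBot", "anthropic-ai"]
INFERENCE_BOTS = ["OAI-SearchBot", "PerplexityBot", "ClaudeBot"]


def build_robots_txt(block_training: bool = True) -> str:
    """Emit one group per named bot, then a catch-all group for default crawlers."""
    lines = []
    for bot in INFERENCE_BOTS:
        lines += [f"User-agent: {bot}", "Allow: /", ""]
    training_rule = "Disallow: /" if block_training else "Allow: /"
    for bot in TRAINING_BOTS:
        lines += [f"User-agent: {bot}", training_rule, ""]
    # Default tier: everything not named above, including Googlebot and Bingbot.
    lines += ["User-agent: *", "Allow: /", ""]
    return "\n".join(lines)


def can_fetch(robots_txt: str, user_agent: str, url: str = "https://example.com/") -> bool:
    """Parse the robots.txt text in memory and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    policy = build_robots_txt()
    print(policy)
    for agent in TRAINING_BOTS + INFERENCE_BOTS + ["Googlebot"]:
        print(f"{agent}: {'allowed' if can_fetch(policy, agent) else 'blocked'}")
```

Keep in mind that robots.txt is advisory: it states the policy, and compliance depends on each crawler honouring it.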

Frequently asked questions

What is the difference between GEO, AEO, and SEO?

SEO is the discipline of ranking on the blue-link SERP. AEO (Answer Engine Optimisation) is the discipline of getting featured in answer boxes and AI Overviews on Google specifically. GEO (Generative Engine Optimisation) is the broader discipline of being cited across all generative AI engines (Perplexity, ChatGPT, Gemini, Claude, Bing Copilot, Google AI Overviews). GEO subsumes AEO and overlaps with SEO; the three are complementary, not exclusive.

How do we measure GEO success?

Measure citation share across the six AI engines for your priority unbranded query set and track the quarter-over-quarter delta, alongside competitor share-of-voice on the same queries. The leapbuzz visibility-citation-tracker polls the engines weekly and logs share, sentiment, and named-competitor citations. Without weekly polling you cannot tell whether your work moved the metric.
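
For concreteness, here is one way the metric could be computed from poll logs. This is a sketch under an assumed definition, not the tracker's actual method: citation share is taken as the fraction of (engine, query) polls in which the brand appears among the cited sources, and the delta is the difference in that fraction between two periods. The `PollResult` record and the brand names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PollResult:
    """One poll of one engine for one query (hypothetical log record)."""
    engine: str
    query: str
    cited_brands: tuple  # brands cited in the engine's answer


def citation_share(results, brand: str) -> float:
    """Fraction of polls in which `brand` was cited; 0.0 when there are no polls."""
    if not results:
        return 0.0
    hits = sum(1 for result in results if brand in result.cited_brands)
    return hits / len(results)


def share_delta(current, previous, brand: str) -> float:
    """Change in citation share between two periods, in share points."""
    return citation_share(current, brand) - citation_share(previous, brand)


if __name__ == "__main__":
    last_quarter = [
        PollResult("Perplexity", "best fintech consultancy", ("rival",)),
        PollResult("Claude", "best fintech consultancy", ("rival", "acme")),
    ]
    this_quarter = [
        PollResult("Perplexity", "best fintech consultancy", ("acme",)),
        PollResult("Claude", "best fintech consultancy", ("acme", "rival")),
    ]
    print(citation_share(this_quarter, "acme"), share_delta(this_quarter, last_quarter, "acme"))
```

Running the same functions with a competitor's name gives the share-of-voice comparison on the identical query set.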

Does GEO replace SEO?

No, it complements SEO. The blue-link SERP is shrinking but not gone; for high-intent commercial queries it still drives most direct traffic. GEO captures the upstream layer where buyers research vendors via AI before they hit your paid funnel. The two work together; the accounts that ship both outperform single-discipline accounts.

What does the Princeton GEO research actually say?

Princeton GEO research (arXiv:2311.09735) measured the lift in AI citation impressions from content pattern changes. The three highest-impact patterns were statistic addition (about 37 percent lift), expert quotation (about 27 percent lift), and outbound citation (about 22 percent lift). Authoritative phrasing and easy-to-understand chunking added about 15 and about 10 percent respectively. Those numbers are the GEO playbook.

Should we block AI training crawlers?

It depends on whether your content is the asset. For consultancies, agencies, and B2B brands where the writing carries IP and brand authority, blocking training crawlers (GPTBot, Google-Extended, CCBot, anthropic-ai) while allowing inference crawlers (OAI-SearchBot, PerplexityBot, ClaudeBot) is the 2026 default. For e-commerce or commodity-content sites the trade-off is different; training inclusion may help discoverability more than it costs in IP leakage.