A diagram showing the 6-step GEO audit process with icons for crawlers, prompts, schema, and entity signals

A GEO audit is a structured review of whether AI answer engines - ChatGPT, Perplexity, Gemini, Claude, and Copilot - can access, understand, and cite your brand. In 30 minutes, you will establish a citation rate baseline, surface the technical and content gaps suppressing your visibility, and leave with a prioritized fix list.

What Is a GEO Audit?

Traditional SEO audits optimize for ranking positions in a list of links. A GEO (Generative Engine Optimization) audit optimizes for something different: inclusion as a cited source inside AI-generated answers. For a complete primer on what GEO means, see the generative engine optimization guide.

When a buyer asks ChatGPT "What are the best tools for tracking AI citations?" they do not get ten blue links. They get a synthesized answer with a short list of recommended tools, each with a citation. If your brand is not in that answer, you are invisible to that buyer - regardless of your Google ranking.

The business case for closing that gap is concrete. Semrush research from June 2025 found AI search visitors convert at 4.4x the rate of traditional organic visitors. They arrive pre-educated: the AI has already explained the market to them and recommended options. Being cited means reaching buyers who are already sold on the category.

The challenge is that AI citations are not static. Profound AI research found that 59.3% of citation positions change month-over-month in Google AI Overviews, with similar churn rates in ChatGPT (54.1%) and Copilot (53.4%). If you are not measuring and actively optimizing, you are invisibly losing ground every 30 days.

A GEO audit evaluates four dimensions:

Technical access - can AI crawlers physically reach your content?
Content citation-readiness - is your content structured so AI models can extract and quote it?
Entity clarity - do AI models have a consistent, high-confidence understanding of who your brand is?
Measurement infrastructure - are you tracking citation rates so you can detect drift?

What You'll Need Before You Start

Free tools required:

Access to your robots.txt file and Cloudflare dashboard (if you use Cloudflare)
Google Search Console (for question-format query export)
Google Rich Results Test - search.google.com/test/rich-results
Schema Markup Validator - validator.schema.org
Web accounts for ChatGPT, Perplexity, Gemini, Claude, and Microsoft Copilot
A spreadsheet to log results

Optional paid tools (speed up the baseline measurement step significantly):

CitedSpy, Otterly.ai, or Peec AI for automated prompt tracking

Time estimate: 30-35 minutes for a first-run manual audit. Subsequent monthly audits run 15-20 minutes because your prompt library and competitor baseline are already built. With an automated tracking tool handling baseline measurement, active audit time drops to roughly 20 minutes.

The 6-Step GEO Audit (30 Minutes Total)

A timeline graphic showing 6 audit steps across 30 minutes, with time allocations for each step

Step 1: Verify AI Crawler Access (5 minutes)

Before you optimize a single page, confirm AI crawlers can actually reach your content. ParseAI research found 27% of B2B SaaS sites accidentally block at least one major AI crawler - and most of that blocking happens at the CDN or WAF layer, not in robots.txt.

The six crawlers to verify:

Crawler	Owner	Purpose
GPTBot	OpenAI	Model training
OAI-SearchBot	OpenAI	Search retrieval (citations)
ClaudeBot	Anthropic	Model training
Claude-SearchBot	Anthropic	Search retrieval (citations)
PerplexityBot	Perplexity	Search retrieval (citations)
Google-Extended	Google	Gemini training and grounding (AI Overviews in Search are governed by Googlebot, not this token)

The critical distinction: search/retrieval crawlers (OAI-SearchBot, Claude-SearchBot, PerplexityBot) fetch your pages at query time to generate cited answers. Blocking them removes you from citation eligibility entirely. Blocking only training crawlers (GPTBot, ClaudeBot, Google-Extended) protects your content from model training while preserving your ability to be cited.

Check your robots.txt first. Fetch yourdomain.com/robots.txt and look for any of these patterns:

# This blocks all crawlers including AI bots - dangerous
User-agent: *
Disallow: /

# This blocks OpenAI's training crawler (GPTBot) only - OAI-SearchBot is unaffected
User-agent: GPTBot
Disallow: /

The safest configuration explicitly allows each crawler:

User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
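
The robots.txt review can also be scripted. The following is a minimal sketch, assuming Python's standard `urllib.robotparser` module interprets your rules the same way the crawlers do (each crawler's real parser may differ in edge cases). The `SAMPLE` robots.txt, the `example.com` test URL, and the helper names `audit_robots` and `blocked_retrieval_crawlers` are hypothetical and exist only for illustration; the crawler roles come from the table above.

```python
from urllib.robotparser import RobotFileParser

# Crawler roles as listed in the table above.
AI_CRAWLERS = {
    "GPTBot": "training",
    "OAI-SearchBot": "retrieval",
    "ClaudeBot": "training",
    "Claude-SearchBot": "retrieval",
    "PerplexityBot": "retrieval",
    "Google-Extended": "training",
}


def audit_robots(robots_txt: str, test_url: str = "https://example.com/") -> dict:
    """Return {crawler: bool} for whether each crawler may fetch test_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, test_url) for bot in AI_CRAWLERS}


def blocked_retrieval_crawlers(results: dict) -> list:
    """List blocked crawlers whose role is search retrieval (citations)."""
    return [bot for bot, allowed in results.items()
            if not allowed and AI_CRAWLERS[bot] == "retrieval"]


# Hypothetical robots.txt: a blanket block with a single explicit exception.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

results = audit_robots(SAMPLE)
for bot, allowed in results.items():
    status = "allowed" if allowed else "BLOCKED"
    print(f"{bot:18} {AI_CRAWLERS[bot]:10} {status}")

print("Blocked retrieval crawlers:", blocked_retrieval_crawlers(results))
```

To audit a live site, paste the contents of yourdomain.com/robots.txt into `audit_robots` in place of `SAMPLE`. A passing robots.txt says nothing about CDN or WAF blocks, which need their own check.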

Then check your Cloudflare dashboard. Cloudflare began blocking AI crawlers by default for all new domains in mid-2025. Navigate to Security > Bots and confirm the "Block AI scrapers and crawlers" toggle is off. Your robots.txt can be perfectly configured while Cloudflare silently blocks the same bots at the network layer.

Also scan your WAF custom rules for any rules referencing GPTBot, ClaudeBot, or associated IP ranges. Check your server access logs for 403 or 429 responses against these user-agents - that is how you confirm whether crawl attempts are succeeding or failing silently.
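
The log spot-check can be approximated with a short script. This is a sketch under stated assumptions: it assumes your server writes the common "combined" access log format (status code after the request, user-agent as the last quoted field), and the `SAMPLE_LOG` lines are invented for illustration. Google-Extended is left out on the assumption that it is a robots.txt control token rather than a crawler that sends its own user-agent string.

```python
import re
from collections import Counter

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "Claude-SearchBot", "PerplexityBot"]

# In combined log format the request is followed by: status, bytes, "referer", "user-agent".
LINE_RE = re.compile(r'" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"')


def blocked_hits(log_lines):
    """Count 403/429 responses per (AI crawler, status code)."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # line is not in the assumed format
        status = int(match.group("status"))
        if status not in (403, 429):
            continue
        user_agent = match.group("ua").lower()
        for bot in AI_BOTS:
            if bot.lower() in user_agent:
                counts[(bot, status)] += 1
    return counts


# Hypothetical log lines, not taken from a real server.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Jan/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 403 512 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.9 - - [10/Jan/2026:10:00:05 +0000] "GET /blog HTTP/1.1" 200 9000 "-" '
    '"Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '203.0.113.9 - - [10/Jan/2026:10:00:06 +0000] "GET /docs HTTP/1.1" 429 120 "-" '
    '"Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [10/Jan/2026:10:00:07 +0000] "GET /admin HTTP/1.1" 403 90 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for (bot, status), hits in sorted(blocked_hits(SAMPLE_LOG).items()):
    print(f"{bot}: {hits} response(s) with status {status}")
```

Any non-zero count for a retrieval crawler is worth tracing back to a WAF rule or bot-management setting. User-agent strings can be spoofed, so treat the counts as a signal rather than proof.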

Time breakdown: Fetch robots.txt (30 sec), review for the six user-agents (1 min), Cloudflare Bot Management check (1 min), WAF rules scan (1 min), server log spot-check (1.5 min).

Step 2: Build Your Prompt Library (5 minutes)

Your prompt library is the set of 15-25 natural-language questions your target buyers run in AI engines during research. These are not keywords - they are full questions.

Four sources for prompt discovery:

Google Search Console - filter for question-format queries (containing who/what/how/why/which/best/compare). These map directly to AI prompt patterns.
People Also Ask boxes - run your top 5 keywords in Google and collect the PAA questions.
AI engines themselves - run your top keywords in ChatGPT and Perplexity and note the follow-up questions the AI suggests.
G2 and Capterra comparison pages - these reflect how buyers frame evaluation questions in your software category.
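
The Search Console filter from the first source can be sketched in a few lines. The example below assumes you have already exported your queries into a plain list of strings; the sample queries and the helper name `question_queries` are hypothetical. The trigger words are the ones listed above.

```python
import re

# Trigger words from the Search Console filter described above.
QUESTION_TERMS = ("who", "what", "how", "why", "which", "best", "compare")

# Whole-word match so that "whatsapp" does not count as "what".
_PATTERN = re.compile(r"\b(" + "|".join(QUESTION_TERMS) + r")\b", re.IGNORECASE)


def question_queries(queries):
    """Keep only queries that contain one of the question-format terms."""
    return [q for q in queries if _PATTERN.search(q)]


# Hypothetical exported queries.
sample = [
    "geo audit checklist",
    "how to do a geo audit",
    "best ai citation tracking tools",
    "acme pricing",
    "What is generative engine optimization",
    "whatsapp business api",
]

for query in question_queries(sample):
    print(query)
```

The surviving queries are candidates, not finished prompts: rewrite each one as the full natural-language question a buyer would type into an AI engine.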

Structure your prompt list into three tiers:

Awareness prompts: "What is [category]?", "How does [category] work?", "Best tools for [use case]"
Evaluation prompts: "How do I track my brand in AI search?", "What's the best way to measure GEO performance?"
Comparison prompts: "[Your brand] vs [Competitor A]", "[Your brand] reviews", "Is [your brand] legit?"

Prioritize comparison prompts. They indicate higher purchase intent than awareness prompts - a buyer comparing two specific products is closer to a decision than one asking what a category is.

Time breakdown: GSC export (1 min), seed queries in ChatGPT for follow-up ideas (2 min), G2 category scan (1 min), compile and tier the prompt list (1 min).

Step 3: Run Your Baseline Citation Audit (8 minutes)

A spreadsheet template showing columns for prompt, engine, brand mentioned, domain cited, sentiment, and competitor citations

Run each prompt across all five engines and record the results. This is the most time-intensive step - 20 prompts across 5 engines equals 100 combinations at roughly 5-7 seconds each.

For each prompt-engine combination, record:

Brand mentioned: yes / no
Domain cited as a source link: yes / no
Mention sentiment: positive / neutral / negative / absent
Position of mention: first, second, later in the response
Competitor domains cited in the same response
Verbatim AI response (paste into a doc for later content gap analysis)

Two metrics to calculate from your raw data:

Citation rate = (prompt-engine combinations where your domain is cited) / (total combinations tested). If you run 20 prompts across 5 engines and your domain is cited in 15 of those 100 combinations, your citation rate is 15%.

Share of citation = your citation count / (your count + all competitor citation counts) across your prompt set. This tells you whether you are gaining or losing ground relative to competitors, independent of overall citation volume.
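
Both metrics are simple ratios, and a short sketch makes the arithmetic explicit. The `Result` record and its field names are hypothetical stand-ins for the spreadsheet columns, and the sample data reproduces the example above (20 prompts, 5 engines, 15 cited combinations) with an assumed one competitor citation per response.

```python
from dataclasses import dataclass


@dataclass
class Result:
    prompt: str
    engine: str
    our_domain_cited: bool
    competitor_citations: int  # competitor domains cited in the same response


def citation_rate(results):
    """Cited combinations divided by total combinations tested."""
    if not results:
        return 0.0
    return sum(r.our_domain_cited for r in results) / len(results)


def share_of_citation(results):
    """Our citations divided by (our citations + all competitor citations)."""
    ours = sum(r.our_domain_cited for r in results)
    theirs = sum(r.competitor_citations for r in results)
    total = ours + theirs
    return ours / total if total else 0.0


# Hypothetical data: 20 prompts x 5 engines, our domain cited in 15 of 100 runs.
engines = ["ChatGPT", "Perplexity", "Gemini", "Claude", "Copilot"]
results = [
    Result(f"prompt {i // 5 + 1}", engines[i % 5],
           our_domain_cited=(i < 15), competitor_citations=1)
    for i in range(100)
]

print(f"Citation rate:     {citation_rate(results):.1%}")
print(f"Share of citation: {share_of_citation(results):.1%}")
```

With this sample the citation rate is 15 / 100 and the share of citation is 15 / (15 + 100). Recomputing both from the same log each month keeps the two numbers comparable over time.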

Manual testing works for a first audit at under 25 prompts. For ongoing monthly monitoring, automated tools eliminate the 8-minute manual step and catch citation drift between audits.

Time breakdown: 100 combinations at 5 seconds each = 500 seconds, or approximately 8 minutes of active testing.

Step 4: Audit Your Schema Markup (4 minutes)

Schema markup (JSON-LD using Schema.org vocabulary) signals to AI systems what your content is, who created it, and what claims it contains. Correct implementation correlates with a +21.6% lift in AI citation likelihood according to GEO research aggregators.

Run your top 5 pages through Google's Rich Results Test at roughly 45 seconds per page.

Priority schema types to verify:

Article or BlogPosting on all editorial content - must include author with sameAs links to LinkedIn or Wikipedia, datePublished, dateModified, and publisher with logo
FAQPage on any page with Q&A sections - AI answer engines directly index FAQ markup to generate cited answers. Keep individual answers under 300 characters for maximum extractability.
Organization on your homepage - must include sameAs array with all your verified external profiles (LinkedIn, Twitter/X, G2, Crunchbase, Wikidata)
Person for all named founders and authors cited in content - include sameAs links to verify the person is real

Common errors that suppress schema benefits:

Missing author on Article schema (AI models cannot attribute the content to a verifiable expert)
Organization schema without sameAs (the entity cannot be linked to external records)
Schema that contradicts visible page content (triggers trust penalties)
Validation errors of any kind (broken schema is worse than no schema)
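
A quick pre-check for the property-level requirements above can be scripted before pages go through the validators. This is a rough sketch, not a replacement for the Rich Results Test or the Schema Markup Validator: it assumes each JSON-LD block is a single object with a string `@type`, and it only checks the properties named in this step. The `schema_issues` helper and the sample article are hypothetical.

```python
import json

ARTICLE_PROPS = ["author", "datePublished", "dateModified", "publisher"]
REQUIRED_PROPS = {
    "Article": ARTICLE_PROPS,
    "BlogPosting": ARTICLE_PROPS,
    "Organization": ["sameAs"],
    "Person": ["sameAs"],
}
MAX_FAQ_ANSWER_CHARS = 300  # answer length guideline from this step


def schema_issues(jsonld_text: str) -> list:
    """Return human-readable issues for one JSON-LD object."""
    try:
        data = json.loads(jsonld_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["expected a single JSON-LD object"]
    schema_type = data.get("@type")
    if not isinstance(schema_type, str):
        return ["@type is missing or not a single string"]

    issues = [f"{schema_type} is missing {prop}"
              for prop in REQUIRED_PROPS.get(schema_type, [])
              if not data.get(prop)]

    author = data.get("author")
    if schema_type in ("Article", "BlogPosting") and isinstance(author, dict):
        if not author.get("sameAs"):
            issues.append("author has no sameAs links")

    if schema_type == "FAQPage":
        entities = data.get("mainEntity", [])
        if isinstance(entities, dict):
            entities = [entities]
        for item in entities:
            answer = item.get("acceptedAnswer", {}).get("text", "")
            if len(answer) > MAX_FAQ_ANSWER_CHARS:
                issues.append(f"answer to {item.get('name', '?')!r} is {len(answer)} characters")
    return issues


# Hypothetical Article markup with no author and no dateModified.
SAMPLE_ARTICLE = json.dumps({
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Hypothetical headline",
    "datePublished": "2025-01-15",
    "publisher": {"@type": "Organization", "name": "Example Co"},
})

print(schema_issues(SAMPLE_ARTICLE))
```

An empty list only means the named properties are present. Syntax and vocabulary errors still need the validators listed in the tools section.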

Time breakdown: 5 pages through Rich Results Test at 45 seconds each (3.75 min), scan results for errors (30 sec).

Step 5: Score Your Top Pages for GEO Signals (5 minutes)

The Princeton GEO study (Aggarwal et al., ACM KDD 2024) tested nine content interventions and measured their effect on AI citation rate. The findings are specific:

Adding statistics with cited sources: +41% lift (the single largest measured lever)
Adding direct quotes from named experts: +28% lift
Citing authoritative external sources: large positive effect
Answer-first structure: up to 40% lift in citation frequency
Keyword stuffing: measurable negative effect

Spot-check your top 10 pages against three criteria. Spend roughly 30 seconds per page:

Answer-first structure: Does each page open with a standalone 1-3 sentence direct answer in the first 100 words? AI models extract this opening as their cited summary. Pages that bury the answer in a long introduction are less frequently cited because the extractable signal is weaker. Check whether each H2 section also opens with a 1-2 sentence direct answer before expanding.

Statistical density: Target approximately one cited statistic per 150-200 words of body content. A 2,000-word article should therefore contain roughly 10-13 data points with explicit source attribution (2,000 / 200 = 10; 2,000 / 150 is about 13). Vague claims ("many companies struggle with X") score worse than specific claims ("59.3% of AI citation positions change month-over-month, per Profound AI research").
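
The density check reduces to one division per page. The sketch below assumes you have already counted the words and the cited statistics on each page by hand or with your own tooling; the function names are hypothetical, and the 200-word ceiling is the upper end of the target range above.

```python
def words_per_stat(word_count: int, cited_stats: int) -> float:
    """Average number of body words per cited statistic."""
    if cited_stats == 0:
        return float("inf")  # no statistics at all
    return word_count / cited_stats


def meets_density_target(word_count: int, cited_stats: int,
                         max_words_per_stat: int = 200) -> bool:
    """True when the page has at least one cited stat per max_words_per_stat words."""
    return words_per_stat(word_count, cited_stats) <= max_words_per_stat


# Hypothetical pages: (label, word count, cited statistics).
pages = [
    ("pricing guide", 2000, 10),
    ("feature overview", 2000, 6),
    ("short explainer", 600, 0),
]

for label, words, stats in pages:
    verdict = "ok" if meets_density_target(words, stats) else "needs more sourced data"
    print(f"{label}: {words_per_stat(words, stats):.0f} words per stat - {verdict}")
```

Pages that fail this check and also carry strong Google traffic are natural candidates for the rewrite list.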

External citations and content format: Does each page link to at least 2-3 primary sources (original studies, official documentation - not other blog posts)? Are multi-point arguments formatted as bullet or numbered lists rather than dense prose? Are comparisons in tables?

Flag any page with strong Google traffic but zero AI citations - these are your highest-ROI rewrite candidates. For a full breakdown of the content tactics that drive citations, see the guide to how to get cited by ChatGPT.

Step 6: Check Your Entity Signals (3 minutes)

AI models maintain entity records for brands, people, and products - and they cite entities they have high-confidence records for more readily than ambiguous or unknown ones. This affects citation frequency independently of your content quality.

Five checks, 30-60 seconds each:

Google Knowledge Panel: Search your exact brand name in Google. A Knowledge Panel in the right sidebar indicates Google has a high-confidence entity record. No panel indicates entity ambiguity. If a panel appears, verify it shows the correct description, logo, founding date, and social links.

Wikidata: Search your brand at wikidata.org. A structured Wikidata entry is achievable without passing Wikipedia's notability requirements, and it provides the machine-readable entity record that AI systems reference.

Wikipedia: Does a brand article exist? If not, this is a longer-term authority building project - but knowing the gap matters.

Review platform presence: Domains listed on G2 and Capterra have a 3x higher chance of being cited by ChatGPT compared to sites without review platform presence. Check that your G2 and Capterra profiles are complete: accurate category, full description, website link, and at least 10 verified reviews.

Brand name consistency: Spot-check your brand name across your website, LinkedIn company page, Twitter/X, G2, Capterra, and Crunchbase. The name, description, and URL must be identical. Inconsistencies cause entity confusion and suppress citations - AI models treat consistency across sources as a trust signal.

Verify that your Organization schema sameAs array includes every verified external profile. This is how AI systems link the entity on your site to the entity on third-party platforms.
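
The consistency and sameAs checks can be sketched as two small comparisons. Everything in the sample data is hypothetical (the brand "Acme Analytics", the profile URLs, and the helper names); in practice you would paste in the names and URLs you collected during the spot-check.

```python
def name_mismatches(canonical: str, profiles: dict) -> dict:
    """Return {platform: name} for profiles whose name differs from the canonical one."""
    return {platform: name for platform, name in profiles.items()
            if name.strip() != canonical}


def missing_same_as(same_as: list, profile_urls: list) -> list:
    """Return profile URLs that are absent from the Organization sameAs array."""
    def normalize(url: str) -> str:
        # Ignore trailing slashes and letter case for comparison purposes only.
        return url.rstrip("/").lower()

    listed = {normalize(url) for url in same_as}
    return [url for url in profile_urls if normalize(url) not in listed]


# Hypothetical brand data.
profiles = {
    "Website": "Acme Analytics",
    "LinkedIn": "Acme Analytics",
    "G2": "Acme Analytics",
    "Crunchbase": "Acme Analytics Inc.",
}
same_as = [
    "https://www.linkedin.com/company/acme-analytics-example/",
    "https://www.g2.com/products/acme-analytics-example",
]
profile_urls = [
    "https://www.linkedin.com/company/acme-analytics-example",
    "https://www.g2.com/products/acme-analytics-example",
    "https://www.crunchbase.com/organization/acme-analytics-example",
]

print("Name mismatches:", name_mismatches("Acme Analytics", profiles))
print("Missing from sameAs:", missing_same_as(same_as, profile_urls))
```

The exact-match comparison is deliberately strict, since the goal stated above is identical naming across platforms rather than near matches.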

Reading Your Audit Results

After completing the six sections, calculate your scores:

Section	What good looks like
Crawler access	5/5 checks passed, no Cloudflare blocks
Prompt coverage	15-25 prompts across all three tiers
Citation rate	Baseline established (no target yet - this is your starting point)
Schema health	0 validation errors across top 5 pages
Content readiness	7+ of 10 pages answer-first with cited statistics
Entity signals	Knowledge Panel present, G2/Capterra profiles complete

A citation rate of 0-10% across your prompt set is common for brands new to GEO. Citation rates of 30-50% are achievable within 90 days with focused optimization. The exact starting number matters less than having one on record - you cannot measure improvement without a baseline.

What to Fix First: Prioritizing Your GEO Issues

Not every gap has equal ROI. Use this triage order:

Fix immediately (this week):

Any crawler blocks found in Step 1. This is table stakes - nothing else matters if AI engines cannot reach your pages.
Schema validation errors on your top 5 pages. Broken schema actively suppresses citation likelihood.

Fix in the next 30 days:

Pages with strong Google traffic but zero AI citations. These pages already have proven demand - adding answer-first structure, cited statistics, and FAQ markup is the fastest path to citation lift.
Incomplete G2 and Capterra profiles. Third-party presence has outsized impact on ChatGPT citations specifically.

Fix in 60-90 days:

Comparison prompts where competitors are cited but you are not. Fetch the cited competitor pages and analyze what structure, schema, and statistical density they use. Build or rewrite pages to match and exceed that pattern.
Entity building (Wikidata entry, Wikipedia notability building, review generation). These have longer lead times but compound over time.

The one stat to remember as you prioritize: The Princeton GEO study found that adding statistics with cited sources produces a +41% lift in AI citation rate. If your pages lack specific, sourced data points, that is your single highest-leverage content fix.

For a deeper look at how GEO and traditional SEO priorities compare at different company stages, see GEO vs SEO: Which Should You Prioritize?

The GEO Audit Checklist

A printable GEO audit checklist with six sections and scoring summary boxes

SECTION 1: CRAWLER ACCESS (5 min)

[ ] robots.txt: no Disallow: / under GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, PerplexityBot, or Google-Extended
[ ] No blanket User-agent: * Disallow: /
[ ] Cloudflare Bot Management: "Block AI scrapers and crawlers" is off
[ ] WAF custom rules: no rules blocking AI crawler user-agents or IP ranges
[ ] Server logs: no 403/429 responses to AI crawler user-agents

SECTION 2: PROMPT LIBRARY (5 min)

[ ] Exported question-format queries from Google Search Console
[ ] Ran 5 seed keywords in ChatGPT and Perplexity for follow-up prompt ideas
[ ] Reviewed G2/Capterra category comparison pages
[ ] Built prompt list of 15-25 prompts across awareness / evaluation / comparison tiers
[ ] Included at least 5 brand-direct prompts

SECTION 3: BASELINE MEASUREMENT (8 min)

[ ] Ran all prompts in ChatGPT (web search enabled)
[ ] Ran all prompts in Perplexity (default search mode)
[ ] Ran all prompts in Gemini
[ ] Ran all prompts in Claude
[ ] Ran all prompts in Microsoft Copilot
[ ] Recorded: brand mentioned, domain cited, sentiment, competitor domains
[ ] Calculated citation rate and share of citation

SECTION 4: SCHEMA AUDIT (4 min)

[ ] Top 5 pages tested in Google Rich Results Test - zero errors
[ ] Article/BlogPosting schema with author sameAs on all editorial pages
[ ] FAQPage schema on pages with Q&A sections
[ ] Organization schema on homepage with complete sameAs array
[ ] Person schema for all named authors/founders

SECTION 5: CONTENT AUDIT (5 min)

[ ] Top 10 pages: direct answer in first 100 words
[ ] Statistical density: at least 1 cited stat per 200 words
[ ] External citations: 2-3 primary sources per page
[ ] H2 headings in question format or "How to / What is / Why" phrasing
[ ] Multi-point arguments in bullet or numbered list format
[ ] Flagged pages needing answer-first rewrites

SECTION 6: ENTITY SIGNALS (3 min)

[ ] Google Knowledge Panel appears for brand name search
[ ] Wikidata entry exists
[ ] G2 profile: complete description, correct category, 10+ reviews
[ ] Capterra profile: same checks
[ ] Brand name identical across website, LinkedIn, Twitter/X, G2, Capterra, Crunchbase
[ ] Organization schema sameAs includes all verified external profiles

SCORING SUMMARY

Section	Score
Crawler access	__ / 5
Prompts in tracking set	__
Citation rate	__%
Schema: pages error-free	__ / 5
Content: pages answer-first	__ / 10
Entity signals present	__ / 6

Frequently Asked Questions

Run a full audit once as your baseline, then repeat it quarterly. Between audits, run continuous automated monitoring - given that citation positions churn at roughly 54-59% per month across major AI engines, a quarterly audit without ongoing tracking will miss significant drift.

The terms are used interchangeably. "AI visibility audit," "AI search audit," and "GEO audit" all refer to the same systematic review of a brand's citation presence and citation-readiness across AI answer engines.

You do not necessarily need to rank well in Google to be cited by AI engines, but there is significant overlap. Pages that earn featured snippets in Google already have the answer-first structure that AI engines prefer. That said, Perplexity and ChatGPT regularly cite sources that rank on page 2 or 3 in Google, because their citation logic weights content structure and statistical density over link authority. For a full comparison, see GEO vs SEO: Which Should You Prioritize?

There is no universal benchmark because citation rates vary by category, prompt type, and engine. What matters is tracking your rate over time. A citation rate that improves from 8% to 22% over 90 days, in your specific prompt set, against your specific competitors, is a meaningful result regardless of how it compares to an industry average.

Whether to block AI training crawlers is a business decision, not a GEO one. Blocking training crawlers (GPTBot, ClaudeBot, Google-Extended) prevents your content from being used in future model training but does not remove you from current citation eligibility, which depends on search/retrieval crawlers. Blocking search/retrieval crawlers (OAI-SearchBot, Claude-SearchBot, PerplexityBot) does remove you from citation eligibility. If citation visibility matters to you, leave retrieval crawlers unblocked.

Start with Step 1 - confirm crawlers can reach your site. Then move to entity signals (Step 6) - a brand with no G2 presence, no Knowledge Panel, and no Wikidata entry is effectively unknown to AI models regardless of content quality. Build the entity foundation first, then optimize content structure.

Running this audit monthly gives you a rolling picture of citation drift before it affects pipeline. Once your prompt library is built and your baseline is established, CitedSpy can automate the baseline measurement step - running your prompt set across all five engines continuously and alerting you when citation rates drop or competitors gain ground. The 30-minute manual audit is the right starting point; automated monitoring is how you stay ahead of the 54-59% monthly churn rate without rebuilding your spreadsheet every four weeks.
