How to Run an AI Search Audit for a Brand or URL

Dhriti • Posted on 12/06/26 • 11 min read

An AI search audit answers one question: when your buyers ask LLMs the questions that matter in your category, does your brand appear – and if not, who does, and why? Pepper’s audit methodology runs in six steps: (1) define the prompt universe, (2) run prompts across all major LLMs, (3) audit brand mentions vs. domain citations, (4) run theme-level competitor analysis, (5) generate page-level recommendations, and (6) implement and re-run monthly. This post walks through every step using a real case – the audit Pepper ran on itself, which found 0 mentions and a #612 rank, and became the foundation of an 8-month citation recovery program.

The Audit Trail: Every Step, Mapped

Why Every Brand Needs an AI Search Baseline – Now
The Case Study: Pepper Audited Itself (And the Results Were Brutal)
Step 1 – Define the Prompt Universe
Step 2 – Run Prompts Across All Major LLMs
Step 3 – Audit Brand Mentions vs. Domain Citations
Step 4 – Theme-Level Competitor Analysis
Step 5 – Page-Level Recommendations
Step 6 – Implement and Iterate Monthly
Industry Updates: What Marketing Leaders Are Saying
FAQ

You can’t fix what you haven’t measured

Every GEO conversation eventually arrives at the same question: where do we actually stand? Before a brand invests in content restructuring, schema implementation, or community presence, it needs a baseline – a rigorous measurement of how it appears (or doesn’t) across the AI search surfaces its buyers use.

That measurement is the AI search audit. Done properly, it produces three things: a quantified baseline (mentions, citations, share of voice, rank), a competitor map (who owns the queries you’re missing, and what they did to own them), and a prioritized action plan (which pages to create, which to fix, in what order).

This post walks through Pepper’s complete audit methodology – the same six-step process used in Atlas audits for enterprise clients – illustrated with the most honest case study available: the audit Pepper ran on itself.

“I’ll start by doing it tonight. Spend time looking through answer engines, asking the questions your customers are asking. See if it’s recommending you – and if it’s telling the story you want told about your brand. Start there.” – Panelist, Index ’26 ecosystem panel

DEFINITION: AI Search Audit (GEO Audit)

An AI search audit is a structured measurement of how a brand or URL appears across AI search platforms – ChatGPT, Gemini, Perplexity, Claude, and Google AI Overviews – for a defined set of buyer-relevant prompts. It quantifies brand mentions, domain citations, share of voice, and average brand position; identifies which competitors win the prompts the brand is missing; and produces page-level recommendations to close the gaps. Unlike a traditional SEO audit, which measures rankings on search results pages, an AI search audit measures presence inside generated answers.

The Case Study: Pepper Audited Itself (And the Results Were Brutal)

In March 2026, Pepper ran its own brand through Atlas – the same audit process used for enterprise clients. We mapped 100 prompts across 10 themes covering the queries our buyers actually use: content marketing platforms, AI-era SEO and GEO optimization, content production scaling, marketing ROI, and more.

The results:

Metric	Pepper	Top Competitor (Google)
Total LLM Mentions	0	74
Brand Rank	#612	#1
Share of Voice	0%	25.1%
Themes with Mentions	0 of 10	10 of 10
Domain Citations	0	Cited via YouTube, G2, press

Semrush had 49 mentions. Contently had 46. HubSpot had 46. Pepper – the company that coined ‘Search Everywhere Optimization’ and runs the Index GEO Growth Summit – had zero.

This is the most important property of a well-run audit: it doesn’t care about your brand story. It measures what the machines actually say. And what they say is frequently a shock – which is exactly why the audit must come before the strategy.

Why this case is instructive: Pepper’s product results were exceptional – Freshworks at a $330K renewal, Atlassian at 2.8x clicks per article, Mutual of Omaha at 189% MoM click growth. The zero-citation result wasn’t a product problem. It was a visibility and distribution problem. That distinction – which only an audit can establish – determines the entire strategy that follows.

STEP 1

Define the Prompt Universe
The 100 questions your buyers actually ask LLMs

The prompt universe is the foundation of the entire audit. Get it wrong, and every downstream measurement is measuring the wrong thing. The prompt universe is the set of questions your actual buyers ask AI systems during their discovery, evaluation, and decision process – not the keywords you wish they searched.

Pepper’s standard structure: 100 prompts organized into 10 themes, 10 prompts per theme. For the self-audit, themes included ‘AI-Era SEO & GEO Optimization,’ ‘Content Production Efficiency & Scalability,’ ‘Content Marketing ROI & Performance Measurement,’ and ‘Affordable Content & Marketing Solutions for Startups.’

There are 4 prompt sources to draw from:

Sales call language – The exact phrases prospects use when describing their problem. ‘How do we scale content without hiring 20 writers’ is a prompt. ‘Content scaling solutions’ is a keyword. Use the prompt.
Funnel-stage variants – Cover TOFU (definitional: ‘what is GEO’), MOFU (evaluative: ‘best GEO platforms for enterprise’), and BOFU (comparative: ‘[Brand A] vs [Brand B]’, ‘[Competitor] alternatives’) prompts in every theme.
Community phrasing – Mine Reddit threads, Quora questions, and LinkedIn comments in your category. These are verbatim records of how your market phrases its questions.
LLM-suggested expansions – Ask the LLMs themselves: ‘What questions do marketing leaders ask about [category]?’ The fan-out queries the models suggest are the queries they’re already answering for someone.
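To make the 10 × 10 structure checkable, here is a minimal Python sketch of a prompt universe with a balance check. The `Prompt` record, the source labels, and the `validate_universe` helper are illustrative assumptions, not part of Atlas; only the 10-theme, 10-prompt, three-funnel-stage shape comes from the methodology above.

```python
from collections import Counter
from dataclasses import dataclass

FUNNEL_STAGES = ("TOFU", "MOFU", "BOFU")
# Hypothetical labels for the four prompt sources listed above.
SOURCES = ("sales_call", "funnel_variant", "community", "llm_expansion")


@dataclass(frozen=True)
class Prompt:
    theme: str
    text: str
    stage: str   # one of FUNNEL_STAGES
    source: str  # one of SOURCES


def validate_universe(prompts, themes=10, per_theme=10):
    """Return a list of problems; an empty list means the set is balanced."""
    problems = []
    by_theme = Counter(p.theme for p in prompts)
    if len(by_theme) != themes:
        problems.append(f"expected {themes} themes, found {len(by_theme)}")
    for theme, count in sorted(by_theme.items()):
        if count != per_theme:
            problems.append(f"{theme}: expected {per_theme} prompts, found {count}")
        stages = {p.stage for p in prompts if p.theme == theme}
        missing = [s for s in FUNNEL_STAGES if s not in stages]
        if missing:
            problems.append(f"{theme}: missing funnel stages {missing}")
    return problems
```

Running the check before any collection starts catches a lopsided universe (for example, a theme with no BOFU prompts) while it is still cheap to fix.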

STEP 2

Run Prompts Across All Major LLMs
One model is not a baseline. Five models are.

Every prompt must run across every major AI search surface. LLM answers diverge significantly – a brand can be well-cited in Perplexity and invisible in ChatGPT, because each platform has different retrieval sources, different training data emphases, and different citation behaviors.

The minimum platform set: ChatGPT, Google Gemini (including AI Overviews), Perplexity, and Claude. Add Bing Copilot for B2B categories where Microsoft ecosystem buyers matter.

Execution notes that matter for measurement validity:

Run prompts in clean sessions – no conversation history, no custom instructions, no memory. You’re measuring the default answer a new buyer receives.
Record the full response, not just whether your brand appeared – position in the answer, sentiment of the mention, and which URLs were cited as sources.
Run the full set in a consistent window (1–2 days) – answers drift over time, and a scattered collection window corrupts comparability.
Automate it – manually running 100 prompts across 5 platforms is 500 executions per cycle. This is what Atlas automates, with weekly re-scans and competitor alerts.
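The execution notes above can be sketched as a small collection loop. This is an assumption-heavy outline, not how Atlas works: `ask` is a hypothetical function you supply that opens a clean session on a platform and returns the answer text plus cited URLs, and the platform list simply reuses the five surfaces named in the definition earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import product

# The five surfaces named in the definition above.
PLATFORMS = ("ChatGPT", "Gemini", "Perplexity", "Claude", "Google AI Overviews")


@dataclass
class RunRecord:
    platform: str
    prompt: str
    response_text: str  # the full answer, not just a hit/miss flag
    cited_urls: list = field(default_factory=list)
    collected_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def run_audit(prompts, ask, platforms=PLATFORMS):
    """Run every prompt on every platform through the caller-supplied `ask`.

    `ask(platform, prompt)` is hypothetical: it must use a clean session
    (no history, memory, or custom instructions) and return (text, urls).
    """
    records = []
    for platform, prompt in product(platforms, prompts):
        text, urls = ask(platform, prompt)
        records.append(RunRecord(platform, prompt, text, list(urls)))
    return records


def collection_window_days(records):
    """Span between the first and last collected answer, in days."""
    stamps = [r.collected_at for r in records]
    return (max(stamps) - min(stamps)).total_seconds() / 86400
```

With 100 prompts and five platforms, `run_audit` returns 500 records, and `collection_window_days` gives a quick check that the run stayed inside the 1–2 day window.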

STEP 3

Audit Brand Mentions vs. Domain Citations
Two different metrics. Two different problems. Never conflate them.

This is the most commonly misunderstood distinction in AI search measurement. A brand mention is when the LLM names your brand in its answer. A domain citation is when the LLM cites your website as a source. They are different signals, with different causes, requiring different fixes.

Brand Mentions	Domain Citations
LLM names your brand in the answer text	LLM links your website as a source for the answer
Driven by: training data presence, third-party coverage, entity recognition, directory listings	Driven by: content structure, schema, retrievability, indexed answer-format pages
Fix with: G2/Capterra profiles, press coverage, Wikipedia/Wikidata, community presence	Fix with: structured pages targeting prompts, FAQ schema, llms.txt, chunkable content
Measured as: Brand Coverage (% of prompts mentioning you) and Share of Voice	Measured as: Domain Coverage (% of prompts citing your site) and Domain Citation count
Failure means: the model doesn’t know your brand exists	Failure means: the model knows you but won’t use your site as evidence

The six core metrics every audit should report: Brand Mentions (count), Share of Voice (your mentions ÷ all brand mentions), Brand Position (average position in answers), Domain Citations (count), Brand Coverage (% of prompts mentioning you), and Domain Coverage (% of prompts citing your site).
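As a rough illustration, the six metrics can be computed from parsed answers along these lines. The `Answer` record is a hypothetical shape (one per prompt-and-platform run), and the sketch assumes each brand is listed once per answer in order of appearance, with a prompt counting toward coverage if any platform's answer includes the brand or domain.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    prompt: str
    brands: list   # brands named, in order of appearance in the answer text
    domains: list  # domains cited as sources


def core_metrics(answers, brand, domain):
    prompts = {a.prompt for a in answers}
    # 1-based position of the brand in each answer that names it
    positions = [a.brands.index(brand) + 1 for a in answers if brand in a.brands]
    all_mentions = sum(len(a.brands) for a in answers)
    mentioned = {a.prompt for a in answers if brand in a.brands}
    cited = {a.prompt for a in answers if domain in a.domains}
    return {
        "brand_mentions": len(positions),
        "share_of_voice": len(positions) / all_mentions if all_mentions else 0.0,
        "brand_position": sum(positions) / len(positions) if positions else None,
        "domain_citations": sum(a.domains.count(domain) for a in answers),
        "brand_coverage": len(mentioned) / len(prompts) if prompts else 0.0,
        "domain_coverage": len(cited) / len(prompts) if prompts else 0.0,
    }
```

A brand with no mentions gets a share of voice of 0.0 and a `brand_position` of `None`, since an average position is undefined when the brand never appears.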

The diagnostic power is in the combination. High mentions + low citations means the market knows you but your site isn’t structured for retrieval. Low mentions + low citations – Pepper’s own starting state – means an entity and distribution problem that content fixes alone won’t solve.

STEP 4

Theme-Level Competitor Analysis
Who wins each theme – and what they did to win it

Aggregate numbers hide the strategy. Theme-level analysis reveals it. For each of the 10 themes, the audit builds a benchmark: what percentage of that theme’s prompts does each competitor appear in?

In Pepper’s self-audit, the theme benchmarks were revealing: Google owned 63% visibility on ‘SEO & Organic Growth on Limited Budget’ but only 10% on ‘Expert Content Talent & Production Scaling.’ Contently led ‘Content Production Scaling’ themes at 20–27%. HubSpot dominated ‘AI-Powered Marketing Transformation & ROI’ at 47%. No single competitor owned everything – each had carved out theme-level territories.
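The theme benchmark is simple arithmetic: for each theme, the share of that theme's prompts in which a brand appears. A minimal sketch, assuming rows of `(theme, prompt, brands_mentioned)` have already been extracted from the answers (the row shape and function names are illustrative):

```python
from collections import defaultdict


def theme_visibility(rows):
    """rows: iterable of (theme, prompt, brands_mentioned) tuples.

    Returns {theme: {brand: share of that theme's prompts naming the brand}}.
    Several rows may share a prompt (one per platform); sets de-duplicate them.
    """
    prompts_in_theme = defaultdict(set)
    prompts_naming = defaultdict(lambda: defaultdict(set))
    for theme, prompt, brands in rows:
        prompts_in_theme[theme].add(prompt)
        for brand in brands:
            prompts_naming[theme][brand].add(prompt)
    return {
        theme: {
            brand: len(hits) / len(prompts_in_theme[theme])
            for brand, hits in prompts_naming[theme].items()
        }
        for theme in prompts_in_theme
    }


def theme_leader(visibility, theme):
    """Return (brand, share) for the most visible brand, or None if no brand appears."""
    shares = visibility[theme]
    if not shares:
        return None
    brand = max(shares, key=shares.get)
    return brand, shares[brand]
```

Reading the output theme by theme, rather than as one aggregate, is what surfaces territories like the ones described above.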

For each theme, the analysis answers three questions:

Who wins this theme, and at what visibility percentage? – This sets the realistic target. Beating a 63% incumbent requires a different investment than closing a gap on a theme where the leader sits at 17%.
What assets earn their visibility? – Trace the citations back. Is the competitor winning through comparison pages? YouTube videos? G2 category placement? Reddit threads? Press coverage? Each asset type implies a different replication path.
Which themes are winnable in 90 days vs. 12 months? – Themes with fragmented leadership (no competitor above 20%) and asset types you can produce quickly (comparison pages, FAQ guides) are the 90-day targets. Themes with entrenched leaders and entity-driven visibility (Wikipedia, massive review bases) are long-cycle plays.
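The 90-day versus 12-month call can be written down as a rule of thumb. The sketch below encodes only the two cases the text describes; the asset labels are hypothetical tags you would assign while tracing citations, and anything that fits neither case is deliberately returned for manual review rather than guessed.

```python
QUICK_ASSETS = {"comparison page", "faq guide"}  # asset types you can produce quickly
ENTITY_ASSETS = {"wikipedia", "review base"}     # entity-driven visibility
FRAGMENTED_CEILING = 0.20                        # "no competitor above 20%"


def theme_horizon(competitor_shares, winning_assets):
    """Classify a theme from competitor visibility shares and the asset
    types (hypothetical labels) that earn the leaders their visibility."""
    assets = set(winning_assets)
    leader_share = max(competitor_shares.values(), default=0.0)
    fragmented = leader_share <= FRAGMENTED_CEILING
    if fragmented and assets and assets <= QUICK_ASSETS:
        return "90-day"
    if not fragmented and assets & ENTITY_ASSETS:
        return "12-month"
    return "review manually"  # mixed cases are not covered by the rule above
```

For example, a theme whose leader sits at 17% and wins through comparison pages comes back as `"90-day"`, while a 63% incumbent backed by Wikipedia presence comes back as `"12-month"`.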

STEP 5

Page-Level Recommendations
Translate every gap into a specific URL to create or fix

An audit that ends with ‘improve your content’ has failed. The output of Step 5 is a prioritized list of specific pages – URLs to create, URLs to restructure – each mapped to the prompts it targets and the citation gap it closes.

From Pepper’s self-audit, the page-level recommendations were concrete:

Create comparison/alternative pages – ‘Alternatives to Contently / Skyword / ClearVoice’ appeared across 5 of 10 themes with no Pepper page targeting any of them. Highest-ROI single opportunity in the entire audit.
Create the definitional /geo page – Pepper coined ‘Search Everywhere Optimization’ yet had no authoritative GEO definition page. Competitors filled the definitional queries Pepper should own.
Launch a YouTube channel – YouTube was the #1 cited domain across LLMs (95 pages in Atlas data), and Pepper had zero presence there. Largest single channel gap.
Fix technical signals – no llms.txt, no Organization schema, no Wikidata entity. AI crawlers could not identify what Pepper was.
Claim G2 and review platforms – G2 alone drives 14 LLM citation pages; directory presence carries 20–70% citation weight. Pepper was unlisted.
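For the technical-signals fix, one concrete piece is Organization markup. The sketch below builds a schema.org `Organization` object as JSON-LD; the name and URLs are placeholders, not Pepper's actual markup, and the `sameAs` list is where profile links such as a Wikidata entity or a review-platform listing would go.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build a schema.org Organization object as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # profiles that identify the same entity elsewhere
    }
    return json.dumps(data, indent=2)


snippet = organization_jsonld(
    name="Example Co",                                   # placeholder brand
    url="https://www.example.com",                       # placeholder site
    same_as=["https://www.example.org/company-profile"], # placeholder profile URL
)
# Embed the result in the page head as a JSON-LD script tag.
print(f'<script type="application/ld+json">\n{snippet}\n</script>')
```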

Each recommendation carries three attributes: the prompts it targets (from the universe), the expected citation impact (based on asset-type weight), and the effort class (hours for technical fixes, days for pages, months for entity building). That triage produces the execution sequence.
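One way to turn those three attributes into a sequence is a plain sort. The ordering rule here (cheapest effort class first, then highest expected impact, then widest prompt reach) is an assumption for illustration, as is the 0–1 impact score; the text does not specify how the triage is weighted.

```python
from dataclasses import dataclass

EFFORT_ORDER = {"hours": 0, "days": 1, "months": 2}  # effort classes named above


@dataclass
class Recommendation:
    page: str
    prompts_targeted: int
    expected_impact: float  # hypothetical 0-1 score based on asset-type weight
    effort: str             # "hours", "days", or "months"


def execution_sequence(recs):
    """Sort by effort class, then by impact and prompt reach (both descending)."""
    return sorted(
        recs,
        key=lambda r: (EFFORT_ORDER[r.effort], -r.expected_impact, -r.prompts_targeted),
    )
```

Under this rule, hour-scale technical fixes land first, day-scale pages follow in impact order, and month-scale entity building goes last.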

STEP 6

Implement and Iterate Monthly
The audit is a loop, not a report

The audit’s value compounds only if it re-runs. The final step of the methodology: implement the page-level recommendations, then re-run the full prompt set monthly to measure lift, catch competitor moves, and uncover new topical gaps.

The monthly re-run answers: which new pages earned citations? Which themes moved? Did any competitor spike (and from what asset)? What new prompts are buyers asking that the universe should absorb? Pepper’s standard practice adds 20 new prompts per month to the tracked set as new content publishes.

In Pepper’s own program, the audit baseline (0 citations, #612) became the measurement spine for an 8-month execution calendar – with monthly Atlas reports tracking progress against targets of 30+ citations by Month 3, 100+ by Month 6, and 200+ by Month 8.
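The monthly loop reduces to two small calculations: lift against the previous run, and progress against the milestone targets. A minimal sketch using the targets and the 20-prompts-per-month growth stated above; the function names and report shape are illustrative.

```python
TARGETS = {3: 30, 6: 100, 8: 200}  # month -> minimum citations, from the program above
BASELINE_CITATIONS = 0
BASE_PROMPTS = 100
NEW_PROMPTS_PER_MONTH = 20


def tracked_prompts(month):
    """Size of the tracked prompt set after `month` monthly additions."""
    return BASE_PROMPTS + NEW_PROMPTS_PER_MONTH * month


def progress_report(citations_by_month):
    """citations_by_month: {month_number: citation count from that month's re-run}."""
    report = []
    previous = BASELINE_CITATIONS
    for month in sorted(citations_by_month):
        count = citations_by_month[month]
        target = TARGETS.get(month)
        report.append({
            "month": month,
            "citations": count,
            "lift": count - previous,  # change since the previous recorded run
            "target": target,          # None for months without a milestone
            "on_target": None if target is None else count >= target,
        })
        previous = count
    return report
```

At 20 new prompts per month, the tracked set reaches 260 prompts by Month 8, so collection cost grows alongside the citation targets.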

Industry Updates: What Marketing Leaders Are Saying

‘Go Do a Search. We’re Not Showing Up.’

At Pepper’s Index ’26 summit, Allison from O’Reilly described how a simple manual audit became the internal catalyst for the entire GEO program: ‘I actually had to start at the top. I started with the president. And I was like, go do a search. Go search for top learning platforms for a tech team. We don’t even show up. But our buyers are there, and we’re not.’ The lesson for practitioners: before the formal audit, the informal one – a leadership team watching their brand fail to appear – is the fastest budget-unlocking demonstration in GEO.

The ‘Audit Tonight’ Directive From the Ecosystem Panel

The Index ’26 ecosystem panel closed with a homework assignment: spend the evening querying answer engines with your customers’ actual questions, checking whether the story being told about your brand is the story you want told. The framing matters – the panelists positioned the audit not as a vendor deliverable but as a recurring leadership discipline, with one adding that gaps found at night should become content briefs by the next morning.

LLM Traffic Converts 4–6x Higher – Which Raises Audit Stakes

A data point from the Index ’26 enterprise CMO panel reframes why audit gaps are expensive: community data shared on stage showed LLM-referred traffic converting 4 to 6 times higher than other channels. Every prompt where a competitor appears and you don’t isn’t just a visibility loss – it’s a loss of the highest-converting traffic source currently measurable. The audit quantifies exactly how much of that traffic is being conceded, theme by theme.

Boards Are Now Asking for the Baseline

Christine, a CMO on the Index ’26 enterprise panel, described educating her board on GEO metrics: ‘We’ve had to educate our board a little bit on GEO and what are the metrics that you measure to see success. We measure it down to the pipeline and the revenue.’ The audit baseline – share of voice, theme coverage, citation counts – is becoming standard board-reporting material. CMOs who can’t produce a measured baseline are increasingly the exception in enterprise reviews.

Closed-Loop Audit Systems Are the Next Wave

Dave, an investor on the Index ’26 ecosystem panel, flagged where audit methodology is heading: ‘Startups coming out nowadays work with companies like Pepper to optimize appearance in answer engines, but also then take that signal, go back automatically, produce new content for all the channels, and then measure the impact of that content. There’s a closed-loop nature to this.’ The monthly manual re-run is the current standard; the automated audit-to-content-to-measurement loop is the emerging one.

FAQ: Running an AI Search Audit

What is an AI search audit?

An AI search audit is a structured measurement of how a brand or URL appears across AI search platforms – ChatGPT, Gemini, Perplexity, Claude, and Google AI Overviews – for a defined set of buyer-relevant prompts. It quantifies brand mentions, domain citations, share of voice, and brand position; maps which competitors win the prompts the brand misses; and converts those gaps into page-level recommendations. It differs from an SEO audit by measuring presence inside generated answers rather than rankings on results pages.

How many prompts should an AI search audit include?

Pepper’s standard is 100 prompts organized into 10 themes of 10 prompts each, covering TOFU (definitional), MOFU (evaluative), and BOFU (comparative) buyer questions. Fewer than 50 prompts produces unreliable theme-level analysis; more than 200 adds collection cost without proportional insight at baseline. The set should grow by roughly 20 prompts per month as new content publishes and new buyer questions surface in sales calls and communities.

What is the difference between a brand mention and a domain citation?

A brand mention is when an LLM names your brand in its answer text; a domain citation is when the LLM links your website as a source. Mentions are driven by training-data presence, third-party coverage, and entity recognition – fixed through directories, press, and Wikipedia/Wikidata. Citations are driven by content structure and retrievability – fixed through structured pages, FAQ schema, and llms.txt. High mentions with low citations means the market knows you but your site isn’t retrievable; low on both means an entity problem content alone won’t solve.

Which LLMs should an AI search audit cover?

The minimum platform set is ChatGPT, Google Gemini (including AI Overviews), Perplexity, and Claude – adding Bing Copilot for B2B categories with Microsoft-ecosystem buyers. Coverage across all platforms matters because answers diverge significantly: a brand can lead in Perplexity while remaining invisible in ChatGPT, since each platform uses different retrieval sources and citation behaviors. Single-platform audits routinely produce false confidence.

How often should an AI search audit be re-run?

Monthly for the full prompt set, with weekly scans on priority queries. LLM answers shift as new content gets indexed, competitors publish, and models update – a quarterly cadence misses competitive moves for up to 90 days. The monthly re-run measures lift from implemented recommendations, catches competitor citation spikes, and surfaces new prompts to absorb into the tracked universe. Platforms like Pepper’s Atlas automate the re-run with alerts for significant changes.

Want the audit run for your brand? Pepper’s Atlas platform executes this exact methodology – 100-prompt universe, five-LLM coverage, mention and citation measurement, theme benchmarks, and page-level recommendations – with monthly re-runs built in. Start your AI search audit at atlas.pepper.inc
