Field Reference

How Each AI Engine Cites Differently

ChatGPT, Perplexity, Claude, Gemini, Google AI Overviews, Microsoft Copilot, and DeepSeek each retrieve and rank the web differently. This is the field reference for what each one weighs and how to optimize for it.

How AI engines retrieve

Every modern answer engine has the same shape: a language model that generates the answer, a retrieval layer that fetches supporting evidence from the web, and a citation layer that decides which sources to surface. The differences are in the retrieval and citation layers — not the LLM.

Three things shape per-engine behavior:

Search backbone

Bing, Google, Brave, or proprietary index. Determines which content even enters the candidate set.

Crawler

What the engine fetches and how often. Some have separate training vs. real-time crawlers.

Ranking model

How candidates get ranked into the final answer. Recency, authority, structured data, EEAT — different weights per engine.

The implication for AEO: optimizing for one engine generalizes only partially to others. ChatGPT-friendly content on a low-Bing-DA domain still loses Copilot citations. Perplexity-friendly content with no Wikipedia entry still loses ChatGPT citations.

Quick comparison

Engine              | Backbone            | Avg cites | Recency bias | Top signal
--------------------|---------------------|-----------|--------------|---------------------------------
ChatGPT (search)    | Bing + SearchGPT    | 2-4       | Medium       | Bing DA + Wikipedia
Perplexity          | Google + Bing + own | 5-10      | Very high    | Recency + diversity
Claude (research)   | Brave + own         | 4-8       | Low          | First-party authority + llms.txt
Gemini              | Google              | 2-5       | Medium       | Google ranking
Google AI Overviews | Google              | 3-6       | Medium       | Featured snippet eligibility
Microsoft Copilot   | Bing                | 3-7       | Medium       | Bing index + Wikipedia
DeepSeek            | Own index           | 3-6       | Medium       | GitHub + technical content

ChatGPT

OpenAI

Crawlers

  • GPTBot (training)
  • ChatGPT-User (real-time browse)
  • OAI-SearchBot (ChatGPT Search index)

Search backbone

Bing API + OpenAI's SearchGPT index

Citation style

Numbered superscript citations in search/browse mode. Inline links in deep research.

Avg per answer / freshness

2-4 citations in search; 0 in plain chat without browse.

24-72 hours for indexed content; real-time for ChatGPT-User fetches.

Ranking signals (in order)

  1. Domain authority on Bing (highest weight)
  2. Wikipedia presence
  3. Structured data (FAQPage, Organization)
  4. llms.txt + llms-full.txt
  5. First-party canonical content (vs aggregators)
  6. Recency for time-sensitive queries

How to optimize for ChatGPT

  • Don't block GPTBot, ChatGPT-User, or OAI-SearchBot in robots.txt
  • Get a Wikipedia entry — single biggest unlock for ChatGPT citations
  • Submit your sitemap to Bing Webmaster Tools
  • Ship Organization + SoftwareApplication JSON-LD
  • Build llms.txt (ChatGPT actively reads it)
Note: ChatGPT's citation behavior changes by mode. Search mode cites heavily; plain chat cites very little; deep research cites 6-12 sources with extended reasoning.
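The crawl-access advice above translates directly into robots.txt. A minimal sketch that allows all three OpenAI crawlers site-wide (adjust paths to your own site):

```
# Allow OpenAI's training, real-time browse, and search crawlers
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /
```

An absent robots.txt also permits crawling, but stating the rules explicitly avoids accidental blanket Disallow rules taking these crawlers out.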

Perplexity

Perplexity AI

Crawlers

  • PerplexityBot
  • Perplexity-User

Search backbone

Blend of Google + Bing + Perplexity's own index. Reddit and Substack heavily weighted.

Citation style

Inline numbered citations on every claim. Most aggressive citation density of any engine.

Avg per answer / freshness

5-10 citations.

Often <1 hour for trending topics. Very recency-biased.

Ranking signals (in order)

  1. Recency (most weighted of any engine)
  2. Reddit threads and forum content
  3. Substack and independent publications
  4. First-party authoritative content
  5. Diversity — Perplexity rewards covering a topic from multiple angles
  6. Citation surface area on aggregators

How to optimize for Perplexity

  • Allow PerplexityBot in robots.txt
  • Build comparison and listicle content — Perplexity loves them
  • Engage on Reddit in your category — Perplexity surfaces threads
  • Ship FAQPage JSON-LD on every doc page
  • Keep dateModified current — recency lifts Perplexity citation rate fastest
Note: Easiest engine to win citations on if you publish frequently. Hardest engine to dominate because it diversifies sources aggressively.
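The FAQPage markup recommended above can be sketched as JSON-LD. The product name, question, answer, and date below are placeholders, not real data:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "dateModified": "2025-01-15",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Does ExampleTool support SSO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. ExampleTool supports SAML and OIDC single sign-on on all paid plans."
      }
    }
  ]
}
```

Embed it in a `<script type="application/ld+json">` tag, and bump `dateModified` whenever the answers actually change.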

Claude

Anthropic

Crawlers

  • ClaudeBot
  • Claude-Web
  • anthropic-ai

Search backbone

Anthropic's web index + Brave Search API. Conservative crawler.

Citation style

Inline citations only in research mode and Computer Use. Plain chat doesn't cite.

Avg per answer / freshness

4-8 in research mode; 0 in plain chat.

1-7 days typically. Less recency-biased than Perplexity.

Ranking signals (in order)

  1. First-party authoritative content (heaviest weight)
  2. Long-form, well-reasoned content
  3. Documentation and technical references
  4. llms.txt + llms-full.txt (Anthropic publicly endorses llms.txt)
  5. Structured data
  6. Author expertise signals (EEAT)

How to optimize for Claude

  • Allow ClaudeBot and Claude-Web in robots.txt
  • Ship llms.txt and llms-full.txt — Anthropic explicitly recommends this
  • Write long-form, comprehensive content — short tactical posts don't get cited as often
  • Include explicit author bios with credentials
  • Cite your own sources (Claude rewards content that itself cites well)
Note: Lowest citation volume of major engines but highest citation quality. Getting cited by Claude is a strong trust signal — Claude only cites content it trusts.
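A minimal llms.txt, following the markdown format of the llms.txt proposal (the product name, description, and URLs below are hypothetical):

```
# ExampleTool

> ExampleTool is an API monitoring platform for engineering teams.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Install and first run
- [API reference](https://example.com/docs/api): Endpoints and authentication

## Optional

- [Changelog](https://example.com/changelog): Release notes
```

Serve it at `/llms.txt`; `llms-full.txt` follows the same idea but inlines the full page contents instead of linking to them.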

Gemini

Google

Crawlers

  • Google-Extended (robots.txt token that controls Gemini training access; not a separate crawler)
  • GoogleOther (research)

Search backbone

Google Search index — same backbone as Google AI Overviews and AI Mode.

Citation style

'Sources' panel below the answer with 2-5 chip-style links. No inline citations in main text.

Avg per answer / freshness

2-5 sources in panel.

Standard Google Search freshness — varies by query type.

Ranking signals (in order)

  1. Standard Google ranking signals (PageRank, EEAT, freshness, intent match)
  2. Structured data (Google's preferred surface)
  3. Authoritative domain reputation
  4. Site reputation in Google's broader index
  5. Mobile usability and Core Web Vitals

How to optimize for Gemini

  • Don't block Google-Extended (separate from Googlebot — you can allow Search but block Gemini training)
  • Win at Google Search first — Gemini citations follow Search rankings closely
  • Ship comprehensive JSON-LD — Google reads it more reliably than any other engine
  • Maintain a current sitemap and use Search Console
  • Keep page experience metrics green (Core Web Vitals)
Note: Gemini citations are basically Google rankings with an LLM wrapper. If you rank for the query in Google Search, you'll show up in Gemini's Sources panel.
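Because Google-Extended is a separate robots.txt token from Googlebot, you can state your intent for each explicitly. A minimal sketch that keeps both Search and Gemini access open:

```
# Googlebot serves Search and AI Overviews
User-agent: Googlebot
Allow: /

# Google-Extended is a separate token; allowing it keeps
# content available to Gemini (disallow it here to opt out
# of Gemini training without affecting Search)
User-agent: Google-Extended
Allow: /
```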

Google AI Overviews & AI Mode

Google

Crawlers

  • Googlebot
  • Google-Extended

Search backbone

Google Search index. AI Overviews (panel-style) and AI Mode (full-page generative search) both pull from the same index.

Citation style

Linked supporting URLs in the AI Overview panel; full-page in AI Mode.

Avg per answer / freshness

3-6 supporting links per AI Overview.

Real-time for trending; standard Search freshness otherwise.

Ranking signals (in order)

  1. Featured snippet eligibility (strongest predictor)
  2. Top 10 organic ranking for the query
  3. EEAT signals — author bio, credentials, citations
  4. FAQPage and HowTo JSON-LD
  5. Content that directly answers a question in the first paragraph

How to optimize for Google AI Overviews & AI Mode

  • Optimize for featured snippets — same content tends to show in AI Overviews
  • Use 'People also ask' queries as content prompts
  • Lead every doc page with the answer in the first 50 words
  • Ship FAQPage JSON-LD on every page with Q&A
  • Build category authority via consistent first-party publishing
Note: AI Overviews get 90%+ of their content from pages already ranking on page 1 of Google. SEO and AEO are not separate disciplines for Google AI surfaces.

Microsoft Copilot

Microsoft

Crawlers

  • Bingbot
  • Microsoft Copilot fetcher

Search backbone

Bing index. Copilot in Edge sidebar, Microsoft 365 Copilot, and Copilot.com share the same retrieval.

Citation style

Numbered citations with full source list.

Avg per answer / freshness

3-7 citations.

Standard Bing freshness — generally 1-7 days.

Ranking signals (in order)

  1. Bing index ranking (primary signal)
  2. Domain authority on Bing
  3. Structured data
  4. Wikipedia presence
  5. Microsoft Learn / official docs (heavily weighted for technical queries)

How to optimize for Microsoft Copilot

  • Submit sitemap to Bing Webmaster Tools (the single highest-leverage Copilot fix)
  • Don't block Bingbot
  • Build a Wikipedia presence
  • Ship JSON-LD — Bing parses it as reliably as Google
  • Get listed in Bing-indexed directories
Note: Copilot is the easiest 'enterprise' engine to win — fewer competitors optimizing for Bing, lower bar to citation. Often cites domains ChatGPT misses.
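Alongside a Bing Webmaster Tools sitemap, Bing supports IndexNow, a ping protocol for notifying the index of updated URLs. A minimal sketch in Python; the page URL and key are placeholders, and the key must also be hosted as a text file on your domain so Bing can verify ownership:

```python
from urllib.parse import urlencode


def build_indexnow_ping(page_url: str, key: str) -> str:
    """Build a Bing IndexNow ping URL for a single updated page.

    The same key must be served at https://<your-domain>/<key>.txt
    so Bing can verify you own the site.
    """
    query = urlencode({"url": page_url, "key": key})
    return f"https://www.bing.com/indexnow?{query}"


# Hypothetical page and key for illustration
ping = build_indexnow_ping("https://example.com/docs/pricing", "abc123")
print(ping)
# Issuing a GET request to `ping` notifies Bing of the update, e.g.:
#   import urllib.request; urllib.request.urlopen(ping)
```

For bulk updates, IndexNow also accepts a JSON POST with a list of URLs; the single-URL GET above is the simplest starting point.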

DeepSeek

DeepSeek AI

Crawlers

  • DeepSeekBot

Search backbone

DeepSeek's own crawl. Limited Western web coverage compared to ChatGPT/Gemini; stronger Chinese-language coverage.

Citation style

Inline citations in search mode.

Avg per answer / freshness

3-6 citations.

1-3 days for indexed content.

Ranking signals (in order)

  1. First-party authoritative content
  2. GitHub repositories (heavy weight for technical queries)
  3. Academic and research content
  4. Open-source documentation
  5. English-language SaaS product pages

How to optimize for DeepSeek

  • Allow DeepSeekBot in robots.txt
  • Maintain a public GitHub presence with READMEs (DeepSeek heavily weights GitHub)
  • Publish technical/dev-focused content — DeepSeek's audience skews engineering
  • Ship llms.txt — DeepSeek's parser reads it
Note: Smallest market share of major engines but rapidly growing in dev tooling and APAC. Cheap to optimize for — most Western brands ignore DeepSeek entirely.

What this means for AEO

Three takeaways for how to allocate AEO effort across engines:

1. Cover the four prerequisites first

Crawl access, llms.txt + JSON-LD, citation surface area, and freshness apply to every engine. Until those are green, per-engine optimization is a distraction.

2. Pick your priority engine

If your buyers are devs, prioritize ChatGPT and Claude. If they are knowledge workers, Microsoft Copilot. If they are researchers or news-driven, Perplexity. If they are general consumers, Google AI Overviews.

3. Track per-engine, not aggregated

Aggregate visibility scores hide engine-level swings. Track citation share separately for each engine, then concentrate effort where you can move the needle fastest.

FAQ

Do all AI engines cite sources the same way?

No. Perplexity cites every claim inline with numbered sources. ChatGPT cites in browse/search mode but rarely in plain chat. Claude cites only in research mode. Gemini cites with its 'Sources' panel. Google AI Overviews cite a small subset of supporting links. Each engine has different defaults, different ranking signals, and different visible UI for citations.

Which AI engine cites the most sources per answer?

Perplexity averages 5-10 inline citations per answer — the most by a wide margin. ChatGPT in search/browse mode averages 2-4. Google AI Overviews show 3-6 supporting links. Claude (research mode) cites 4-8. Gemini cites 2-5. Copilot cites 3-7.

Does ChatGPT use Bing or Google?

ChatGPT's web search uses Bing as its primary backbone and OpenAI's own SearchGPT index for ChatGPT Search. Perplexity blends multiple sources (Google + Bing + its own index). Gemini uses Google. Copilot uses Bing. Claude (when it browses) uses a mix.

How do I get cited by every engine?

Cover the four prerequisites every engine cares about: (1) crawl access — don't block GPTBot, ClaudeBot, PerplexityBot, Google-Extended; (2) llms.txt + JSON-LD — give every engine the same factual brief; (3) citation surface area — get listed on category aggregators (Reddit, G2, niche directories); (4) freshness — keep dateModified current. After that, individual engines have the distinct preferences covered in the per-engine sections above.
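The crawl-access prerequisite can be sanity-checked with Python's standard-library robots.txt parser. The robots.txt content and URLs below are a hypothetical example:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot fully allowed, PerplexityBot
# blocked from /private/, everyone else allowed everywhere
robots_txt = """\
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Disallow: /private/

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Check whether each AI crawler may fetch a given page
for agent in AI_CRAWLERS:
    allowed = parser.can_fetch(agent, "https://example.com/docs/pricing")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

In production you would load the live file (e.g. `parser.set_url("https://yourdomain.com/robots.txt"); parser.read()`) instead of an inline string.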

Why does Perplexity cite my site but ChatGPT doesn't?

Most common cause: ChatGPT's index is heavier on .com, established domains, Wikipedia, and high-DA sources. Perplexity blends fresher content from Reddit, Substack, and niche blogs. If Perplexity cites you and ChatGPT doesn't, increase your presence on Bing-indexed authority sources (Wikipedia entry, G2 listing, established publication coverage) — that's what shifts ChatGPT.

Track your citation share by engine

AIExposureTool monitors your brand citations separately across ChatGPT, Perplexity, Claude, Gemini, Google AI Overviews, Copilot, and DeepSeek. Free audit, no signup.