# Vizelo.ai — How does Perplexity decide which sources to cite?
# Source: https://vizelo.ai/how-perplexity-chooses-citations.html
# Last reviewed: 2026-05-26

# How does Perplexity decide which sources to cite?

**Short answer:** It runs a retrieval-augmented pipeline against multiple search backends, ranks candidates, fetches a shortlist, and cites the spans it actually used. Recency, structural clarity, and citation-graph density move the needle most.

## How Perplexity's retrieval works

Perplexity is among the most citation-transparent engines available: every answer ships with inline source links, which is a gift to anyone trying to reverse-engineer its behavior.

The pipeline is recognizably retrieval-augmented generation:

1. A user query is rewritten into one or more search queries.
2. Those queries hit multiple backends — Perplexity's own crawler index (PerplexityBot) plus syndicated search APIs.
3. Candidates are reranked, a shortlist is fetched, spans are extracted.
4. A language model writes the answer over those spans, attaching citations to the URLs whose content actually contributed.
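
The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Perplexity's implementation: `Candidate`, `rewrite`, `search`, `shortlist`, and `answer_with_citations` are hypothetical names, the two backends are stubs returning made-up results, and simple term overlap stands in for real span extraction.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    text: str
    score: float  # hypothetical relevance score assigned by a backend


def rewrite(query):
    # Step 1 stand-in: a real system would use a model to rewrite the query.
    return [query, f"{query} explained"]


def search(backends, queries):
    # Step 2: fan every query out to every backend, dedupe by URL,
    # and keep the highest score seen for each URL.
    best = {}
    for backend in backends:
        for q in queries:
            for cand in backend(q):
                if cand.url not in best or cand.score > best[cand.url].score:
                    best[cand.url] = cand
    return list(best.values())


def shortlist(candidates, k=3):
    # Step 3: rerank by score and keep the top k for fetching.
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


def answer_with_citations(query, backends, k=3):
    picked = shortlist(search(backends, rewrite(query)), k)
    # Step 4 stand-in: cite only sources whose text shares a term with the query.
    terms = set(query.lower().split())
    return [c.url for c in picked if terms & set(c.text.lower().split())]


# Stub backends with made-up results (assumptions, not real data).
def own_index(q):
    return [
        Candidate("https://a.example/x", "perplexity citations guide", 0.9),
        Candidate("https://b.example/y", "unrelated cooking notes", 0.8),
    ]


def syndicated_api(q):
    return [
        Candidate("https://a.example/x", "perplexity citations guide", 0.5),
        Candidate("https://c.example/z", "citations faq", 0.7),
    ]


cited = answer_with_citations("perplexity citations", [own_index, syndicated_api])
```

Note that `https://b.example/y` makes the shortlist but is not cited, because none of its text was used. That mirrors the distinction in step 4 between being fetched and being cited.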

The exact blend between Perplexity's own index and third-party APIs isn't published. The practical implication is that being indexed directly by PerplexityBot *and* ranking well on the major search engines compound: each feeds a different path into the same answer.
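
One precondition for the direct-index path is that your robots.txt does not block the crawler. A minimal sketch using Python's standard `urllib.robotparser` follows. The helper name `crawler_allowed` and the sample robots.txt are hypothetical, and the check only reports what the file declares, not how any crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt, user_agent, url):
    # Parse robots.txt content supplied as a string and ask whether
    # the given user agent may fetch the given URL.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: PerplexityBot may crawl everything except
# /private/, while every other crawler is blocked site-wide.
ROBOTS = """\
User-agent: PerplexityBot
Disallow: /private/

User-agent: *
Disallow: /
"""

page_ok = crawler_allowed(ROBOTS, "PerplexityBot", "https://example.com/answers/page.html")
private_ok = crawler_allowed(ROBOTS, "PerplexityBot", "https://example.com/private/draft.html")
```

Running the same check against the user agents of the major search crawlers covers the second path, since a page blocked there cannot rank there.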

## The signals that get you cited

- **Recency.** For any query with a recency dimension — news, releases, prices, comparisons, "best of 2026" — Perplexity disproportionately favors content updated in the last 90 days. Same URL, fresher date, more citations.
- **Authority.** Domain trust still matters. Pages on established domains get fetched and surfaced more often than equivalent content on new ones.
- **Structural clarity.** Pages that are easy to parse — clean HTML, real headings, FAQ and Article schema, content rendered server-side — convert into citations at a higher rate than equivalent content trapped inside JS-heavy SPAs.
- **Query-intent fit.** Perplexity rewards specificity. A page that answers the exact question wins over a more comprehensive page that buries the answer in section seven.
- **Citation graph density.** Pages that are themselves cited by other trusted sources (G2, Reddit, Wikipedia, established blogs) show up as citations more often.
- **Claim density.** Content with short, specific, attributable claims is easier for the extractor to lift.
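
Two of the signals above, structural clarity and recency, can be expressed directly in markup. The sketch below builds a schema.org `FAQPage` JSON-LD block with a `dateModified` value. The function names and the sample question are hypothetical, and whether any particular engine reads this markup is an assumption this page does not verify.

```python
import json


def faq_jsonld(pairs, date_modified):
    # Build a schema.org FAQPage object from (question, answer) pairs.
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "dateModified": date_modified,
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def script_tag(data):
    # Serialize for embedding in server-rendered HTML. Escaping "</"
    # keeps a stray "</script>" inside an answer from closing the tag;
    # "\/" is a valid JSON escape for "/".
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">' + body + "</script>"


faq = faq_jsonld(
    [("How does Perplexity choose citations?", "It cites the spans it actually used.")],
    "2026-05-26",
)
tag = script_tag(faq)
```

Emitting the tag in the server-rendered HTML, rather than injecting it with client-side JavaScript, keeps it visible to crawlers that do not execute scripts.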

## What we've measured across the citation graph

Across the thousands of category-relevant prompts we track, a few patterns hold.

Recency edges out raw authority on time-sensitive queries: a one-week-old article on a smaller site routinely beats a two-year-old article on a household name. Pages with explicit FAQ or HowTo schema get cited at a higher rate than visually equivalent pages without it. And Perplexity surfaces a wider, longer-tail set of sources per answer than ChatGPT does, so you don't need to be in the top three to be cited; being in the top eight or ten is often enough.
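
A comparison like "pages with schema get cited at a higher rate" reduces to grouping observations by a page attribute and dividing cited by total. A minimal sketch follows, assuming a hypothetical record layout (`has_faq_schema`, `cited`). The sample rows are invented purely to exercise the function and are not measurements.

```python
from collections import defaultdict


def citation_rate_by(records, key):
    # Group records by the value of `key` and return cited / total per group.
    total = defaultdict(int)
    cited = defaultdict(int)
    for record in records:
        group = record[key]
        total[group] += 1
        cited[group] += int(record["cited"])
    return {group: cited[group] / total[group] for group in total}


# Invented rows for illustration only (not real data).
SAMPLE = [
    {"url": "https://a.example/1", "has_faq_schema": True, "cited": True},
    {"url": "https://a.example/2", "has_faq_schema": True, "cited": False},
    {"url": "https://b.example/1", "has_faq_schema": False, "cited": False},
    {"url": "https://b.example/2", "has_faq_schema": False, "cited": False},
]

rates = citation_rate_by(SAMPLE, "has_faq_schema")
```

The same function works for any categorical attribute, such as an age bucket for recency. A real analysis would also need to control for confounders like domain authority before attributing the difference to schema.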

## Optimizing for Perplexity vs cross-engine

The fundamentals are the same across engines: good generative engine optimization (GEO) is good GEO. But the emphasis tilts:

- **Perplexity** over-rewards **recency and crawlable HTML**.
- **ChatGPT** over-rewards **training-data presence and entity completeness**.
- **Google AI Overviews** lean heavily on the existing SERP.

Treat each engine as a separate distribution problem with shared infrastructure, then watch which lever moves which engine for which prompts.
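
One way to watch which lever moves which engine is to record, per engine, whether a prompt cited you before and after a single change, then compare rates. The sketch below assumes a hypothetical observation tuple of `(engine, period, cited)`. The engine labels and sample rows are illustrative, not real results.

```python
from collections import defaultdict


def citation_lift(observations):
    # Return {engine: after_rate - before_rate} for one content change.
    # Each counter is [times cited, prompts observed].
    counts = defaultdict(lambda: {"before": [0, 0], "after": [0, 0]})
    for engine, period, cited in observations:
        slot = counts[engine][period]
        slot[0] += int(cited)
        slot[1] += 1
    lift = {}
    for engine, periods in counts.items():
        (b_hit, b_n), (a_hit, a_n) = periods["before"], periods["after"]
        if b_n and a_n:  # skip engines missing either period
            lift[engine] = a_hit / a_n - b_hit / b_n
    return lift


# Illustrative observations around one hypothetical change.
OBSERVATIONS = [
    ("perplexity", "before", False),
    ("perplexity", "before", False),
    ("perplexity", "after", True),
    ("perplexity", "after", False),
    ("chatgpt", "before", True),
    ("chatgpt", "after", True),
    ("google_aio", "after", True),  # no "before" data, so it is skipped
]

lift = citation_lift(OBSERVATIONS)
```

Changing one lever at a time and holding the prompt set fixed keeps the per-engine deltas interpretable.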

## Related answers

- [How do I rank in ChatGPT?](https://vizelo.ai/how-to-rank-in-chatgpt.html)
- [Why aren't my pages cited by ChatGPT?](https://vizelo.ai/why-am-i-not-cited-by-chatgpt.html)
- [How do I track when AI engines cite my brand?](https://vizelo.ai/how-to-track-ai-citations.html)
- [Do AI engines respect robots.txt?](https://vizelo.ai/do-ai-engines-respect-robots-txt.html)
- [All answers](https://vizelo.ai/answers.html)