ChatGPT Cites Just 48 Domains for 22.5% of B2B Answers

Growtika's analysis of ChatGPT's citation patterns should worry B2B marketers: just 48 domains account for 22.5% of all the citations the model provides in B2B answers. That is a tiny handful of sites (think Forbes, Gartner, HubSpot) getting preferential treatment when ChatGPT answers business questions. The data comes from tracking responses across numerous business queries and analyzing which sources the model references most.
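For scale, 22.5% spread across 48 domains averages out to a little under 0.47% of all citations per domain. The sketch below shows one way a concentration figure like this could be computed from a log of cited domains. It is an illustration only, not Growtika's actual method: the function name `top_domain_share` and the sample data are hypothetical.

```python
from collections import Counter

def top_domain_share(cited_domains, n):
    """Return the fraction of all citations that go to the n most-cited domains."""
    counts = Counter(cited_domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / total

# Hypothetical citation log: one entry per citation observed in a response.
citations = ["a.example"] * 5 + ["b.example"] * 3 + ["c.example", "d.example"]

# The top 2 domains hold 8 of the 10 citations.
share = top_domain_share(citations, 2)
print(f"Top 2 domains: {share:.1%} of citations")
```

Run against a real citation log with `n=48`, the same calculation would yield the kind of headline number reported here.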

This concentration stems from how ChatGPT retrieves information. The model relies on Retrieval-Augmented Generation (RAG), which in its typical form converts a user query into a vector embedding and searches a pre-indexed database for semantically similar content. High-authority domains with structured, fact-dense content win because their pages are statistically more likely to be the "nearest neighbors" in vector search results. The algorithm favors them, so they get cited again and again.
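The nearest-neighbor step described above can be sketched in a few lines. This is a toy under stated assumptions, not ChatGPT's actual retrieval code: `embed` is a hypothetical stand-in that uses word counts instead of a learned embedding model, and the domains and page texts are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query, index, k=2):
    """Return the k sources whose text is most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), source) for source, text in index.items()]
    scored.sort(reverse=True)
    return [(source, round(score, 3)) for score, source in scored[:k]]

# Hypothetical pre-indexed pages (assumed data, for illustration only).
index = {
    "dense-guide.example": "crm pricing comparison crm features crm pricing tiers",
    "thin-post.example": "our thoughts on the future of work",
    "vendor-blog.example": "crm onboarding tips for sales teams",
}

results = top_k("crm pricing comparison", index)
for source, score in results:
    print(source, score)
```

In this toy index, the page that packs the most query-relevant terms into its text ranks first, which is the effect the paragraph attributes to fact-dense content. A production system would use learned embeddings and an approximate nearest-neighbor index, but the ranking principle is the same.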

If you want ChatGPT to cite your content, you need to structure it to match how these retrieval systems work. Dense, authoritative content on established domains gets amplified. Newer or smaller publications get squeezed out.

This is gatekeeping, baked into the architecture.

When fewer than 50 websites generate nearly a quarter of all B2B citations, you're getting a narrow slice of available expertise. AI models need reliable sources, but the systems that select those sources concentrate authority in familiar places.