How to Track Which AI Engines Are Sending Traffic to Your Site

AI-referred sessions grew 527% year over year through May 2025. Three chatbots account for 98% of all AI-driven site visits. Yet GA4, by default, cannot tell you which of those three chatbots sent any given session – the free tier of ChatGPT strips referrer data, so those visits inflate the Direct channel. Without custom configuration, AI engine attribution is invisible in standard analytics reporting.

The Analytics Referral Patterns That Identify AI Engine Traffic Sources

AI engine referral traffic appears in GA4 across four possible channels depending on how the platform passes referrer data: Referral (when the platform passes its domain as the referrer), Direct (when referrer data is stripped – most common for ChatGPT free tier), Organic Search (when Bing or Google attribution is passed rather than the AI platform’s attribution), or Unassigned (when the session matches no channel definition).

The platforms that pass referrer data reliably: Perplexity passes perplexity.ai as the referrer consistently, making Perplexity traffic identifiable in standard GA4 Referral tracking. Bing Copilot typically passes bing.com or copilot.microsoft.com as the referrer. Claude.ai passes claude.ai. Gemini passes gemini.google.com in most configurations.

The primary tracking gap: ChatGPT free tier, which accounts for the plurality of AI engine traffic, strips referrer data. Sessions originating from ChatGPT free often land in Direct or Unassigned. ChatGPT Plus may pass chat.openai.com or chatgpt.com as the referrer. The custom channel group configured later in this article matches both of those referrer domains.

Previsible research on AI-referred traffic growth documented the 527% YoY increase through May 2025. Profound’s 4.4x conversion rate advantage for AI-referred visitors – compared to traditional organic search – makes accurate AI traffic identification a revenue attribution priority, not only a vanity metric.

How to Distinguish Perplexity, Bing Copilot, and Other AI Engine Referrals in Your Data

Referral source identification by platform: in GA4’s standard reports, navigate to Reports > Acquisition > Traffic Acquisition and set the primary dimension to Session source/medium to find traffic from known AI engine domains. The domains perplexity.ai, pplx.ai (Perplexity’s shortlink domain), copilot.microsoft.com, gemini.google.com, claude.ai, and chat.openai.com or chatgpt.com appear in the source list when those platforms pass referrer data.

For a comprehensive view, build a custom channel group using the regex pattern:

(?i).*(chatgpt\.com|chat\.openai\.com|perplexity\.ai|pplx\.ai|claude\.ai|gemini\.google\.com|copilot\.microsoft\.com|you\.com|phind\.com)

This regex captures traffic from all major AI platforms that pass referrer data. Apply it as a Custom Channel Group in GA4 Admin > Data Display > Channel Groups. Name the channel “AI Engine Traffic.” Once configured, this channel appears in Traffic Acquisition reports and can be filtered for engagement, conversion, and value analysis.
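Before publishing the channel rule, it can help to run the pattern against a handful of source values exported from the Referral report. The following is a minimal Python sketch, not GA4's own matcher: it assumes the "matches regex" condition requires the whole source string to match (emulated here with `fullmatch`), it writes the dots escaped so they match literally, and the function name `is_ai_source` is hypothetical.

```python
import re

# Domain list from the channel rule, with dots escaped so that "." matches
# a literal dot rather than any character.
AI_SOURCE_PATTERN = re.compile(
    r".*(chatgpt\.com|chat\.openai\.com|perplexity\.ai|pplx\.ai|claude\.ai"
    r"|gemini\.google\.com|copilot\.microsoft\.com|you\.com|phind\.com)",
    re.IGNORECASE,
)


def is_ai_source(session_source: str) -> bool:
    """Return True when the whole source string matches the AI pattern."""
    # Assumption: a whole-string match mirrors the "matches regex" condition.
    return AI_SOURCE_PATTERN.fullmatch(session_source.strip()) is not None


if __name__ == "__main__":
    samples = ["chatgpt.com", "www.perplexity.ai", "google", "(direct)", "chatgptxcom"]
    for source in samples:
        print(f"{source!r}: {is_ai_source(source)}")
```

One thing the sketch makes visible: because of the leading `.*`, any source that merely ends in one of the listed domains also matches, so a referrer such as thankyou.com would be caught by the you.com alternative. If that kind of collision shows up in your data, a stricter prefix is worth considering.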

Platform-specific behavior notes: You.com and Phind are included in the regex for developer-focused queries. Kagi does not consistently pass referrer data. New AI platforms entering the market should be added to the regex as they become traffic sources – audit the Referral report quarterly for unfamiliar domains that may be new AI platforms.

The Tracking Gaps That Make AI Engine Attribution Difficult and How to Work Around Them

The primary workaround for ChatGPT referrer stripping: use UTM parameters on any link you control. If you publish content on platforms that ChatGPT draws on – Reddit posts, industry publications, social media – append UTM parameters so that when ChatGPT cites the link and users click through, the UTM data survives regardless of referrer stripping. Setting utm_source=chatgpt on links distributed through channels ChatGPT cites creates attributable sessions. One caveat: the tag travels with the link, so clicks made directly on the original platform carry the same value, and the resulting session count is best read as an upper bound on ChatGPT-driven visits.
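Tagging links by hand tends to produce inconsistent parameters, so a small helper can be useful. This is a sketch using only Python's standard library; the function name `add_utm`, the default medium value `referral`, and the example URL are assumptions for illustration, not values taken from any platform's documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str = "referral", campaign: str = "") -> str:
    """Return url with UTM parameters added, keeping existing query and fragment."""
    parts = urlsplit(url)
    # Existing parameters are kept; a repeated key collapses to its last value.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["utm_source"] = source
    query["utm_medium"] = medium  # assumed default, adjust to your own taxonomy
    if campaign:
        query["utm_campaign"] = campaign
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


if __name__ == "__main__":
    print(add_utm("https://example.com/guide?ref=1#top", "chatgpt"))
```

Keeping the helper in one place means every distributed link uses the same `utm_source` spelling, which matters because GA4 treats differently spelled source values as separate sources.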

The branded search proxy: many users discover brands through LLM responses but do not click the AI citation link. Instead they search for the brand directly in Google. Rising branded search impressions in Google Search Console – queries containing the brand name – alongside AI visibility improvements is a downstream indicator of LLM-driven brand discovery. This proxy captures the AI influence on conversions where the session itself is not traceable to an AI referral.

Server-side log analysis supplements GA4: AI crawler bot traffic – GPTBot, OAI-SearchBot, PerplexityBot – appears in server logs before the AI system cites the page. (Google-Extended is a robots.txt control token rather than a separate crawler user agent, so it does not appear in logs under that name.) High crawler activity from AI-specific bots on specific pages, correlated with subsequent referral traffic from the same AI platform, provides a leading indicator of which pages are citation candidates before they earn measurable referral traffic.
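The crawler counts described here can be pulled from an access log with a short script. The sketch below assumes the common "combined" log format (request line, status, size, referrer, user agent, each in that order); the regular expression, the function name `count_ai_crawler_hits`, and the sample user-agent strings are illustrative assumptions and should be checked against your own server's log layout.

```python
import re
from collections import Counter

# Crawler tokens to look for in the user-agent field. Google-Extended is a
# robots.txt token, not a user-agent string, so it is not listed here.
AI_CRAWLERS = ("GPTBot", "OAI-SearchBot", "PerplexityBot")

# Assumed combined log format: "METHOD path proto" status size "referrer" "agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_ai_crawler_hits(lines):
    """Count hits per (crawler, path) pair for known AI crawler user agents."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # line does not follow the assumed format
        agent = match.group("agent").lower()
        for bot in AI_CRAWLERS:
            if bot.lower() in agent:
                hits[(bot, match.group("path"))] += 1
                break
    return hits


if __name__ == "__main__":
    # Illustrative log lines with simplified user-agent strings.
    sample = [
        '203.0.113.5 - - [01/Jun/2025:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
        '203.0.113.9 - - [01/Jun/2025:10:05:00 +0000] "GET /blog/post HTTP/1.1" 200 800 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    ]
    for (bot, path), count in count_ai_crawler_hits(sample).most_common():
        print(bot, path, count)
```

Comparing these per-page counts with later referral sessions from the same platform is what turns the log data into the leading indicator described above.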

The LLM influence attribution gap: users discovering brands via AI and converting later through organic, direct, or paid channels are invisible to standard multi-touch attribution. Self-reported attribution – “How did you first hear about us?” – on demo request forms or checkout flows captures this influenced-but-not-referred conversion path. Survey data from this field correlated with periods of high AI visibility investment provides the only current method to quantify AI’s role in conversion paths that do not include an AI referral session.
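Free-text answers to a "How did you first hear about us?" field need some bucketing before they can be compared across periods. The following sketch counts responses that mention an AI assistant; the keyword list and the function names `tally_ai_mentions` and `ai_share` are assumptions for illustration, and a real list would need tuning against actual responses.

```python
import re
from collections import Counter

# Hypothetical keyword list; extend it as new assistants show up in responses.
AI_MENTION = re.compile(
    r"\b(chatgpt|perplexity|copilot|gemini|claude|ai (?:search|assistant|chatbot))\b",
    re.IGNORECASE,
)


def tally_ai_mentions(responses):
    """Bucket each free-text response as 'ai' or 'other'."""
    counts = Counter()
    for text in responses:
        counts["ai" if AI_MENTION.search(text) else "other"] += 1
    return counts


def ai_share(responses):
    """Fraction of responses that mention an AI assistant (0.0 if empty)."""
    counts = tally_ai_mentions(responses)
    total = sum(counts.values())
    return counts["ai"] / total if total else 0.0


if __name__ == "__main__":
    answers = ["Asked ChatGPT for recommendations", "Google search", "A colleague"]
    print(tally_ai_mentions(answers), ai_share(answers))
```

Tracking this share per month gives a series that can be set next to periods of AI visibility investment, as the paragraph above suggests.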

Setting Up a Custom Reporting View for AI Engine Traffic in Google Analytics and Search Console

GA4 custom channel group setup: in GA4 Admin, navigate to Data Display > Channel Groups. Create a new channel group. Add a new channel named “AI Engine Traffic.” Define the rule: Session source matches regex (?i).*(chatgpt\.com|chat\.openai\.com|perplexity\.ai|pplx\.ai|claude\.ai|gemini\.google\.com|copilot\.microsoft\.com|you\.com|phind\.com). Save and publish. The channel will now appear in Traffic Acquisition reports and can be added to custom Exploration reports.

Key metrics to track for the AI Engine Traffic channel: sessions, engaged sessions, engagement rate, average engagement time, conversions, and conversion rate. Compare these metrics against the Organic Search channel baseline – AI-referred traffic converting at 4.4x the organic search rate is the benchmark from Profound’s analysis.

Search Console for AI traffic signals: Search Console does not directly identify AI engine traffic, but branded query impressions function as a proxy. Create a custom filter in Search Console Performance reports showing only queries containing the brand name. Track impression volume over time. Correlation between this metric and AI engine traffic volume from GA4 validates whether the AI channel is driving downstream branded search behavior.
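The correlation check can be run on two exported weekly or monthly series, one of branded impressions from Search Console and one of AI channel sessions from GA4. This is a minimal sketch of a Pearson correlation in plain Python; the function name `pearson` and the sample numbers are illustrative assumptions, not real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up weekly values, aligned by week.
    branded_impressions = [1200, 1350, 1500, 1480, 1700]
    ai_sessions = [80, 95, 120, 110, 150]
    print(round(pearson(branded_impressions, ai_sessions), 3))
```

A coefficient near 1 is consistent with AI visibility feeding branded search, but with only a few data points and shared seasonality it is suggestive rather than conclusive.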

Bing Webmaster Tools AI Performance dashboard (launched February 10, 2026): shows how many times Copilot used specific pages to ground responses and which queries triggered those events. This is the closest to direct AI citation tracking currently available from a first-party platform. Cross-reference Copilot grounding events against referral traffic from copilot.microsoft.com to understand the ratio of grounding events to resulting user visits.

Using Traffic Data to Prioritize Which AI Engines Deserve More Optimization Investment

Traffic volume by platform, combined with conversion rate by platform, determines ROI-justified optimization investment. Three chatbots account for 98% of AI-driven visits – likely ChatGPT, Perplexity, and Copilot or Gemini, though the exact distribution varies by industry. The specific distribution for your site, visible in the custom channel group data, determines where optimization investment produces the most measurable return.

If Perplexity drives 60% of your AI traffic and converts at 5x the rate of the second platform, Perplexity-specific optimization – PerplexityBot access in robots.txt, 14-day freshness cycles, visible timestamps – has the highest trackable ROI. If Copilot grounding events are high but referral conversions are low, the content is being grounded but not generating clicks – this indicates content that satisfies Copilot’s citation criteria but has CTR issues in the Copilot response format.
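The comparison above reduces to two numbers per platform: share of AI sessions and conversion rate. The sketch below computes both from per-platform totals and ranks platforms by their product, which is one simple ordering among several possible ones. The class `PlatformStats`, the function `rank_platforms`, and the figures in the usage example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlatformStats:
    platform: str
    sessions: int
    conversions: int

    @property
    def conversion_rate(self) -> float:
        return self.conversions / self.sessions if self.sessions else 0.0


def rank_platforms(stats):
    """Return (platform, traffic_share, conversion_rate) sorted by share * rate."""
    total_sessions = sum(s.sessions for s in stats)
    rows = []
    for s in stats:
        share = s.sessions / total_sessions if total_sessions else 0.0
        rows.append((s.platform, share, s.conversion_rate))
    # share * rate is proportional to each platform's conversion count.
    return sorted(rows, key=lambda row: row[1] * row[2], reverse=True)


if __name__ == "__main__":
    # Made-up totals for illustration only.
    data = [
        PlatformStats("chatgpt", sessions=300, conversions=6),
        PlatformStats("perplexity", sessions=600, conversions=60),
        PlatformStats("copilot", sessions=100, conversions=1),
    ]
    for platform, share, rate in rank_platforms(data):
        print(f"{platform}: share={share:.0%}, conversion rate={rate:.1%}")
```

Sorting by the product favors platforms that already convert. A platform with high traffic share but a low rate would rank lower here and may call for the click-through investigation described above rather than more optimization spend.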

Quarterly investment allocation review: run the AI traffic analysis by platform each quarter, compare conversion rates, and update optimization investment allocation based on which platforms are producing measurable business outcomes. The platforms growing fastest in AI traffic share deserve proportionally increased optimization attention.


Boundary condition: The 527% AI traffic growth figure is from Previsible through May 2025 and may not reflect the current growth rate – the AI traffic category is evolving rapidly and growth rates vary by industry and query category. The 4.4x conversion rate advantage is from Profound’s analysis and represents an aggregate average – individual sites may see higher or lower conversion rate differences. Custom channel group regex should be updated quarterly as new AI platforms emerge and existing platforms change their referrer data practices.
