TL;DR: ROAS numbers that look too good are usually wrong. The failure is rarely the attribution model -- it is the data pipeline underneath. Duplicate events, missing UTMs, timezone mismatches, and incomplete cross-device stitching each corrupt ROAS in predictable, detectable ways. This post walks through each failure with a concrete example.
The problem with a number everyone trusts
ROAS is the number that drives channel allocation decisions. When paid search shows 4.2x and paid social shows 2.1x, you shift budget toward search. When a channel drops below your target, you cut it. When a new channel hits 5x in week one, you scale it.
These decisions are right if the numbers are right. They are wrong -- sometimes badly wrong -- if the numbers are not.
I have worked with enough growth teams to notice a pattern: when I ask how confident they are that their ROAS numbers are accurate, the honest answer is usually "pretty confident, but we have noticed some things that do not add up." The things that do not add up are almost never in the attribution model. They are in the data pipeline feeding the attribution model.
Here are the four most common pipeline failures that produce systematically wrong ROAS numbers. Each one has a predictable direction: it either inflates ROAS (making a channel look better than it is) or deflates it (causing you to cut spend on a channel that is actually working).
Failure 1: Duplicate events inflating conversion counts
What it is: Your pipeline records the same conversion event more than once, inflating the numerator of your ROAS calculation.
How it happens:
Most event pipelines combine multiple collection methods. A user converts on your site. Three things happen nearly simultaneously: your client-side JavaScript fires a conversion event to Google Analytics, your server-side webhook fires a conversion event to your data warehouse, and a Zapier automation fires a third event to a CRM. Each of these is recorded as a separate conversion.
If your attribution model is reading from the warehouse, and the warehouse has a deduplication step that only covers one of these sources, you end up with two or three records of the same conversion. Your conversion count is inflated. ROAS goes up. The channel looks better than it is.
A concrete example:
A growth team was reporting paid search ROAS of 5.8x. They noticed that their reported conversion count was about 40% higher than the number of new customers in their CRM for the same period. Investigation revealed that a new server-side tagging setup had been added without removing the original client-side tags. Both were firing on the same conversion event. Deduplication fixed the count. True ROAS was 4.1x -- still good, but a materially different picture of the channel's efficiency.
What to check: Run a reconciliation query that counts attribution-layer conversion events against CRM new customer records for the same date range. A gap above 5% warrants investigation. The most common source is mixed client-side and server-side collection without deduplication on a stable transaction ID.
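The reconciliation can be sketched in a few lines. This is a minimal illustration, not a production job: the event schema (a `transaction_id` field) and the 5% threshold are assumptions -- adapt both to your warehouse export and your own tolerance.

```python
def reconcile(conversion_events, crm_customer_ids, gap_threshold=0.05):
    """Compare attribution-layer conversion counts to CRM new-customer
    records for the same date range.

    conversion_events: list of dicts, each with a 'transaction_id' key
        (hypothetical schema -- adapt to your warehouse export).
    crm_customer_ids: set of new-customer IDs from the CRM.
    """
    raw_count = len(conversion_events)
    # Deduplicate on a stable transaction ID: client-side, server-side,
    # and automation-sourced copies of the same conversion collapse to one.
    deduped_count = len({e["transaction_id"] for e in conversion_events})
    gap = (raw_count - len(crm_customer_ids)) / len(crm_customer_ids)
    return {
        "raw": raw_count,
        "deduped": deduped_count,
        "crm": len(crm_customer_ids),
        "gap": gap,
        "investigate": gap > gap_threshold,
    }
```

If `raw` exceeds `deduped`, mixed collection methods are double-counting; if `deduped` still exceeds the CRM count, the gap is coming from somewhere else and needs a closer look.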
Failure 2: Missing UTM parameters in the warehouse
What it is: Ad clicks that should be attributed to a paid channel are recorded as direct or organic traffic because UTM parameters are lost in transit.
How it happens:
UTM parameters are appended to the landing page URL by the ad platform. When everything works, those parameters travel through the session and get recorded alongside the conversion. But there are several places where they can disappear:
- A redirect between the ad destination URL and the actual landing page strips query strings
- A mobile deep link that opens an app instead of a browser drops the URL parameters
- An in-app browser (common for social ads on iOS) applies ITP-style storage restrictions that prevent first-party cookies from persisting the UTM through a multi-page session
- A form submission that redirects to a thank-you page loses the referrer context if the form is on a different subdomain
When UTMs are lost, the conversion is recorded as direct traffic or organic. ROAS for paid channels drops because conversions that belong to those channels are not counted. Direct and organic traffic look mysteriously good.
A concrete example:
A growth team running significant spend on Meta noticed that their branded search volume was unusually correlated with paid social spend -- almost perfectly. Their attribution model was crediting branded search with conversions that were actually driven by Meta ads. Why? The Meta ads were running to mobile users, and the iOS in-app browser was not preserving the UTM parameters across the landing page session. Users who saw the ad and then later searched for the brand name were attributed to branded search rather than to the Meta campaign that drove the original awareness.
Meta ROAS looked low. Branded search ROAS looked high. Both numbers were wrong for the same underlying reason.
What to check: Look at your unattributed or direct-attributed conversions as a percentage of total conversions. If it is above 20-25%, you have a UTM capture problem. The fix requires auditing every path from ad click to conversion event: landing page redirects, mobile behavior, form submission flows, and any subdomain transitions.
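The unattributed-share check is a one-liner once the conversions are exported. A minimal sketch, assuming a hypothetical record shape where each conversion dict may carry a `utm_source` key; missing, empty, and "direct" values all count as unattributed for this purpose.

```python
def unattributed_share(conversions):
    """Fraction of conversions with no usable UTM source.

    conversions: list of dicts with an optional 'utm_source' key
    (hypothetical schema). Missing/empty values and 'direct' both
    count as unattributed for this check.
    """
    missing = sum(
        1 for c in conversions
        if not c.get("utm_source") or c["utm_source"] == "direct"
    )
    return missing / len(conversions)
```

A result above roughly 0.25 is the signal described above that UTMs are being lost somewhere between the ad click and the conversion event.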
Failure 3: Timezone mismatches between ad platforms and the warehouse
What it is: Ad spend is recorded in one timezone and conversion events are recorded in another, causing misalignment when you join spend to revenue for ROAS calculations.
How it happens:
Ad platforms let you set an account timezone. Many advertisers set this to their local timezone or the timezone of their primary market -- US Eastern, US Pacific, UK. Your data warehouse, if it follows good engineering practice, stores all timestamps in UTC. A conversion that happens at 11:30pm Eastern on a Monday is stored in your warehouse as 4:30am UTC on Tuesday.
When your attribution model joins daily ad spend against daily conversion events, Monday's spend is matched against Tuesday's conversions (and vice versa). For a typical campaign, this creates noise. For campaigns with strong day-of-week or time-of-day patterns -- which is most of them -- it creates systematic bias.
A concrete example:
A growth team running weekend promotions noticed that Monday's ROAS always looked worse than Friday's, even for campaigns with no day-of-week targeting. Investigation revealed that the ad platform was reporting Saturday spend in US Pacific time (UTC-8), while the warehouse was recording conversions in UTC. Conversions that happened on Saturday evening Pacific time were stored as Sunday in UTC, so Saturday's spend was joined against a conversion bucket missing its evening sales, and Sunday's bucket was padded with Saturday night's. Every campaign with a weekend promotion pattern had its evening conversions credited to the following day.
The error did not affect weekly ROAS, but it made daily optimization misleading. The team had been reducing bids on Saturdays based on apparent underperformance that was a timezone artifact.
What to check: For any ROAS calculation that uses daily or sub-daily granularity, confirm that both your spend data and your conversion data are in the same timezone before joining. The safest approach is to normalize everything to UTC in the warehouse and accept that your reporting will display in UTC. The more common (and more error-prone) approach is to convert warehouse timestamps to the ad platform timezone at query time -- which requires knowing which timezone each platform uses, a setting that can differ from one ad account to another.
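The query-time conversion looks like this in sketch form, using the standard library's zoneinfo module. The Pacific-time assumption mirrors the example above; substitute your own ad account's timezone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stdlib, Python 3.9+

PLATFORM_TZ = ZoneInfo("America/Los_Angeles")  # assumed ad-account timezone

def platform_day(utc_ts: datetime) -> str:
    """Bucket a UTC warehouse timestamp into the ad platform's local day,
    so daily conversions join against the platform's daily spend rows."""
    return utc_ts.astimezone(PLATFORM_TZ).date().isoformat()

# A conversion at 11:30pm Saturday Pacific is stored as 7:30am Sunday UTC.
# Rebucketing puts it back on Saturday, where the spend that drove it lives.
ts = datetime(2024, 1, 14, 7, 30, tzinfo=timezone.utc)  # Sunday in UTC
print(platform_day(ts))  # "2024-01-13" -- Saturday in platform time
```

The design choice worth noting: doing this conversion in one shared model (rather than ad hoc in each analyst's query) is what keeps every daily ROAS number on the same calendar.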
Failure 4: Last-click overweighting caused by incomplete cross-device stitching
What it is: Cross-device journeys are not tracked as single user paths, so last-click attribution credits the final touchpoint (often branded search or direct) rather than the upper-funnel touchpoint (often paid social or display) that actually drove the decision.
How it happens:
A user sees a paid social ad on their phone while commuting. They are interested but do not click. Two days later, they search for the product on their work laptop. They visit the site. A week later, they search for the brand by name and convert. Your attribution model records two touchpoints, both on the work laptop -- the phone session was never connected to the laptop sessions.
In a last-click model, branded search gets 100% of the credit. In a multi-touch model, the two laptop sessions split the credit. The phone session -- which was the first exposure that created the intent -- gets nothing, because there is no cross-device identity resolution connecting the phone to the user record.
This failure does not inflate or deflate total ROAS. It redistributes credit from upper-funnel channels (paid social, display, YouTube) to lower-funnel channels (branded search, direct). Upper-funnel channels look worse. Lower-funnel channels look better. The result is that growth teams systematically underinvest in channels that create intent and over-invest in channels that simply capture intent that already exists.
A concrete example:
A growth team made a significant cut to a paid social budget after two consecutive quarters of ROAS below target. Branded search ROAS was consistently strong, so the team shifted budget there. Branded search volume dropped 40% within six weeks. Investigation revealed that branded search volume was causally downstream of paid social -- the social ads were creating awareness that converted to branded search queries. When the social budget was cut, there was less demand to capture. The team rebuilt the social budget and branded search volume recovered.
The attribution model had been reporting accurately within its constraints. The constraint was that it had no visibility into the cross-device path that connected the paid social exposure to the branded search conversion. What looked like a budget optimization was actually a self-inflicted reduction in the demand generation that made the efficient lower-funnel channel work.
What to check: Look at the correlation between your upper-funnel channel spend and your branded search volume with a 1-4 week lag. If the correlation is high, your upper-funnel channels are generating branded demand that is being credited to branded search. A proper cross-device identity graph would show these as connected paths. Without it, you are flying blind on the channels that build the pipeline.
Why the attribution model is the wrong place to look
Each of these four failures is a data engineering problem. Duplicate event deduplication, UTM parameter preservation, timezone normalization, and cross-device identity resolution all require changes to the data layer -- the pipelines, tables, and joins that feed your attribution model.
Your attribution tool can be perfectly configured and still produce wrong numbers if the data it reads from is wrong. Switching from last-click to data-driven attribution does not fix duplicate events. Moving from one attribution platform to another does not fix the timezone offset between your ad account and your warehouse. These fixes require someone to open the warehouse and change the models.
The check that surfaces all four failures is the same: compare your attribution numbers against your source-of-truth systems (CRM records, billing data) on a regular cadence and investigate any discrepancies above a threshold. Most growth teams that do this for the first time find at least one material discrepancy. Some find three or four.
What accurate ROAS numbers actually look like
When these pipeline failures are fixed, a few things change:
- Channel performance numbers become defensible. When finance asks why you are increasing paid social spend, you can show the methodology behind the ROAS calculation, not just the number.
- Budget optimization decisions improve. You stop cutting channels that look weak because of attribution artifacts and start cutting channels that are actually weak.
- Upper-funnel channels get credit for the demand they generate. The growth model becomes more complete.
None of this requires a sophisticated attribution model. It requires accurate data feeding a simple model. A last-click model on clean data is more useful than a data-driven model on corrupt data.
The diagnostic that takes 30 minutes
If you want to know whether your ROAS numbers have these problems, start here:
- Compare your attribution-layer conversion count to your CRM new customer count for the last 90 days. If the ratio is not close to 1.0, you have a duplicate event or coverage problem.
- Look at the percentage of conversions with no UTM source (recorded as direct or unattributed). If it is above 25%, you have a UTM capture problem.
- Check whether your ad platform account timezone matches your warehouse timezone convention. If they differ, every daily ROAS calculation is misaligned.
- Run a correlation between paid social spend and branded search volume at a 1-4 week lag. A strong correlation suggests cross-device attribution is understating upper-funnel channel contribution.
Each of these takes a few queries to check. The fixes are real engineering work, but the diagnostic is fast.
If your ROAS or CAC numbers feel off, the problem is usually upstream. We build the data infrastructure that makes attribution actually work. Book a free 30-minute diagnostic: https://calendar.app.google/ebttpWW5efCSzY2D6
FAQ
Why does my attribution tool show different numbers than the ad platform?
Ad platforms use their own attribution models to report conversions -- typically crediting any conversion that happened within their platform's view or click window. Your attribution tool applies a different model across all channels simultaneously. Both are counting real events, but they are applying different rules. The discrepancy is expected. The question is which model you trust, and whether the underlying event data is accurate enough to make the model's output meaningful.
Is duplicate event inflation always visible in the data?
Not always. If your deduplication logic runs before writing to the warehouse, the duplicates are removed before you can count them. The problem surfaces through reconciliation -- when your warehouse conversion count is higher than your CRM or billing record count. If you have never run that reconciliation, you do not know whether duplicates are present.
How much does a timezone mismatch actually move ROAS numbers?
For a weekly or monthly ROAS calculation, the effect is usually small -- events are shifted by hours, not days, so they mostly land in the right period. For daily ROAS, the effect can be material, particularly for campaigns with strong time-of-day or day-of-week patterns. The most common case where this matters is weekend-heavy promotions or campaigns with dayparting optimization. For daily optimization decisions, a consistent timezone convention in the warehouse is essential.
My ROAS numbers look unusually good. Should I be suspicious?
Yes. ROAS numbers that look significantly better than industry benchmarks or significantly better than your own historical performance are often a signal of a data quality issue. The most common causes are duplicate event inflation and last-click overweighting of branded search. Neither means your campaigns are not working -- it means the reported efficiency is overstated, which creates a planning problem when you try to scale.