Search Engine Land Fails A/B Testing
To be fair to Search Engine Land, its editors disclaimed the opinions of guest writer Matt Van Wagner of Find Me Faster. Van Wagner’s article, The Pitfalls Of A/B Ad Split Testing, Part 2, might as well have been titled “The Pitfalls of Pointless Analysis,” given how it goes on about matters insignificant to the business of improving results through SEM.
Though A/B testing is crucial to success in this pursuit, it’s not a simple thing to get right. Beginners seeking pearls of wisdom in Van Wagner’s lengthy piece would be better off asking their engine sales reps (who are probably not accomplished testers, either, but then again some of them were trained at Razorfish).
Van Wagner offers a free lobster dinner (he’s from New Hampshire, where crustaceans are currency) for help with the common A/B test conundrum of a winning ad that performs poorly on its own.
He correctly identifies the problem — lack of a true A/B split among rotated ads — but gets woefully lost on the way to a solution, considering complications from custom ad serving, search histories, repeat queries and something he calls “over optimization” without identifying the classic culprit of a back-test failure.
He should have asked: Were any of the keywords in this problematic test on broad match?
By far, the most common faulty assumption in search A/B testing is that both ads are eligible to show on the same query set. This is only true on exact match. Beyond exact, the eligible query set expands (i.e. broad match gets broader) for the ad with the higher CTR. During a test attempt, the ad earning the higher CTR for the shared query set will seem to suffer a CTR reduction as the engine finds the maximum yield (for itself) via query-set expansion. Sustained A/B tests on broad match routinely “fail” to achieve a significant result as the algorithm automatically challenges the winner, driving its CTR down. The test isn’t really a failure, because increasing yield is what the algorithm is designed to do.
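To make the dilution effect concrete, here is a minimal Python sketch. Every number in it is hypothetical, and the helper name blended_ctr is ours, not anything an engine exposes. It assumes ad A beats ad B on the shared query set, and that the engine then shows ad A on an expanded set of looser queries where it earns a lower CTR.

```python
def blended_ctr(segments):
    """Overall CTR for a list of (impressions, ctr) segments."""
    total_impressions = sum(imps for imps, _ in segments)
    total_clicks = sum(imps * ctr for imps, ctr in segments)
    return total_clicks / total_impressions


# Hypothetical figures, chosen only to illustrate the mechanism.
SHARED_IMPRESSIONS = 10_000   # queries both ads are eligible for
EXPANDED_IMPRESSIONS = 8_000  # extra broad-match queries served only to ad A

# Ad A wins on the shared set (5% vs. 4%) but earns 2% on the expanded set.
ad_a = [(SHARED_IMPRESSIONS, 0.05), (EXPANDED_IMPRESSIONS, 0.02)]
ad_b = [(SHARED_IMPRESSIONS, 0.04)]

ctr_a = blended_ctr(ad_a)  # 660 clicks / 18,000 impressions, roughly 3.67%
ctr_b = blended_ctr(ad_b)  # 400 clicks / 10,000 impressions, 4.00%

print(f"Ad A reported CTR: {ctr_a:.4f}")
print(f"Ad B reported CTR: {ctr_b:.4f}")
print("Ad A looks like the loser:", ctr_a < ctr_b)
```

Under these assumed numbers, the ad that wins on the shared queries reports the lower overall CTR, even though it collects more total clicks (660 vs. 400). Comparing the two ads only on queries both were eligible for, which in practice means exact match, removes the distortion.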
This scenario causes confusion about the value of A/B testing in search. But there is no controversy: Understanding how paid search works enables experts to test to our hearts’ content. And the learning that pours in from a correctly executed A/B testing program makes our clients enough wampum to buy their own lobster.

You raise some interesting points here in your blog post, though I will say I find the snarky tone of your remarks a bit unwarranted and ungentlemanly. To be fair, Search Engine Land disclaims the opinions of all its writers.
I accept your point that keyword match causes problems for A/B testing. It was remiss of me not to mention that factor straight up. It is a common cause of A/B testing failure.
However, my point is not that A/B testing is not a valid test technique.
My point is that taken to its logical conclusion, A/B testing can cause performance degradation, and that a set of high performing ads can outperform a single champion ad declared through rounds of A/B testing, even when exact match is used exclusively.
If this is true, as is my contention, then using ad set optimization together with A/B ad testing methodology adds a new tactic for all search marketers to employ.
Razorfish are known and well-respected for their technical prowess and expertise in search marketing, so I’ll offer you this simple challenge.
Can you help prove or disprove the assertion that an exact match ad group containing a set of high performing ads will outperform any single ad that resulted from A/B testing?
If my theory of ad set optimization holds up under test, your reward, even if you don’t like lobster, is that you’ll have an effective new optimization method to offer your clients and become even more successful in managing their campaigns.
Are you up for the challenge? If so, drop me a line at matt at findmefaster and let’s discuss it.
Matt Van Wagner
Thanks to Matt Van Wagner for responding. We apologize for any perceived lack of gentlemanliness. Clearly, Mr. Van Wagner is an experienced search marketer who cares about results and deserves our respect.
As for the theory that a set of ads can outperform a single champion on the same set of exact keywords, we’re confused. If none of the ads in the set outperforms the champion individually, how can the entire set? We must be missing something, because this strikes us as analogous to claiming that 10 fast men together can be faster than the fastest man in the world.
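The arithmetic behind our confusion can be sketched in a few lines of Python. The CTR figures and the helper name group_ctr are hypothetical, and the sketch rests on one loudly stated assumption: each ad's CTR on the exact-match query set is fixed and does not change depending on which other ads rotate alongside it. If that assumption fails, the conclusion may not hold.

```python
def group_ctr(ads):
    """Ad group CTR as the impression-weighted mean of per-ad CTRs.

    ads: list of (impression_share, ctr) pairs.
    """
    total_share = sum(share for share, _ in ads)
    return sum(share * ctr for share, ctr in ads) / total_share


# Hypothetical ad set in even rotation on the same exact-match keywords.
ad_set = [
    (0.25, 0.050),  # the "champion"
    (0.25, 0.046),
    (0.25, 0.044),
    (0.25, 0.040),
]

champion_ctr = max(ctr for _, ctr in ad_set)
set_ctr = group_ctr(ad_set)

print(f"Champion alone: {champion_ctr:.4f}")
print(f"Rotated set:    {set_ctr:.4f}")
print("Set beats champion:", set_ctr > champion_ctr)
```

Under that assumption, a weighted average can never exceed its largest term, so the rotated set cannot beat the champion running alone. For the set to win, the ads would have to interact in some way this simple model leaves out.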
Thank you, Razorfish, I accept your apology.
As your question suggests, my Ad Sets Optimization model is counter-intuitive, but only if you are narrowly focused on the problem of finding the best performing ad, rather than optimizing your ad group performance.
Rather than your fastest runner analogy, think instead about the Tour de France bike race, where a peloton of ten good riders can always beat the fastest single rider. My Ad Set Optimization model works more like that.
I apologize that I cannot take the time to explain it in detail here in your blog, but I am planning to write and talk about it more expansively in weeks to come, so please stay tuned.
I believe you will find it worth exploring for your SEM campaigns.
Thank you.
Matt Van Wagner