Putting ChatGPT and Bard to the Test in PPC Marketing

Art Zabalov

AI has been found useful for SEO, programming, content editing, and even meal planning, and its applications continue to expand.

But is ChatGPT or Google Bard useful for PPC marketing?

To answer this question, my colleague Jake Wieczorek and I embarked on a month-long test, spanning all of our ad accounts and over 20 tasks, both big and small. 

In the following sections, I will be sharing our findings on 15 of these tasks (using Wayfair’s “All things outdoorsy” campaign for screenshot examples to protect client privacy).

Here are the 3 objectives of this test:

  1. To see if ChatGPT and Bard can perform real, complex tasks that we perform (and test various prompts in a practical environment)
  2. To see if ChatGPT and Bard can enhance our work and make it more effective and efficient
  3. To see if ChatGPT and Bard can replace us altogether

Minimum viable test sample:

  • 2 client accounts per task and all applicable prompt variants (from 2 to 5 depending on the task, not including refinement prompts)

We went into this with a bit of defiance (“We’re not getting replaced by some AI!”), but not without admiration for the things it’s been proven to do well and a hope that it could streamline our processes.

Scoring

Before we get into the results, a brief explanation of the scores.

โ€To simplify scoring, we looked at just the two most important things for us:

  • Usefulness (๐Ÿ‘ Useful, ๐Ÿค” Potentially useful, ๐Ÿ‘Ž Not useful). A โ€˜Usefulโ€™ score means we can leverage all or most of the results/answers while a โ€˜Potentially usefulโ€™ score means we can leverage only a portion of it, and ‘Not useful’ given to the results that required significant prompt refinement or result editing, defeating the purpose.
  • Self-sufficiency (๐Ÿค– Self-sufficient, ๐Ÿ”ง Not self-sufficient). We defined self-sufficiency as the ability to produce results that required little to no editing or quality control on our part.

Summary of findings

Across the 15 tasks reviewed below, we got the following results:

Usefulness:

  • ๐Ÿ‘ Useful – 4 tasks
  • ๐Ÿค” Potentially useful – 7 tasks
  • ๐Ÿ‘Ž Not useful – 4 tasks

Self-sufficiency:

  • 🤖 Self-sufficient – 1 task
  • 🔧 Not self-sufficient – 14 tasks

Takeaway: We emerged from this test a little underwhelmed, but also with a clearer understanding of the specific areas where ChatGPT and Bard can prove valuable.

In our assessment, AI’s best use cases for advertising are in copywriting and creative brief ideation, given their comparatively lower need for extensive refinement and quality control.

Unfortunately, when it comes to technical tasks ranging from keyword research to data analysis, both ChatGPT and Bard showed limitations, often providing only potentially useful information. They should therefore be used as an additional step in the process rather than as a replacement for it.

Lastly, almost none of the results are self-sufficient; nearly all require diligent quality control to ensure accuracy.

Keyword research

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: Can be helpful for generating additional topic ideas for further research in Google Keyword Planner.

Note: we’ve found ChatGPT’s estimations of search traffic to be inaccurate across all of our clients and wouldn’t recommend basing assumptions on its estimates.


[Screenshots: ChatGPT vs Google Keyword Planner]

Bard: The new topic ideas it provided were also useful, albeit narrower in scope, which limits expansion.

Its search volume estimates, surprisingly, were completely off-base.

[Screenshot: Bard]

Negative keywords

๐Ÿ‘ Useful

๐Ÿ”ง Not self-sufficient

ChatGPT: While not self-sufficient (or fully accurate as seen in the screenshot), we’ve found ChatGPT to be genuinely helpful for finding negative keywords for exclusions.

It’s worth noting, however, that it’s mostly useful for top-level exclusions and not for analyzing granular ad groups’ search terms, something we do on a weekly basis. It doesn’t understand the relationships between adjacent topics as well as a person would, and it applies overly restrictive exclusions.

[Screenshot: ChatGPT]

Bard: The negative keywords were more specific and offered new options, but didn’t offer the broader “we-didn’t-consider-these” options.

[Screenshot: Bard]
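
One way to catch the overly restrictive exclusions mentioned above before they go live is to check suggested negatives against the keywords you actually want to show for. The sketch below is only an illustration: it assumes broad-match negatives (which block a search when every word of the negative appears in it, in any order), and the keyword lists and the `find_conflicts` helper are hypothetical, not part of our test.

```python
def find_conflicts(negatives, keywords):
    """Return (negative, keyword) pairs where a broad-match negative
    would block one of our own target keywords."""
    conflicts = []
    for negative in negatives:
        negative_terms = set(negative.lower().split())
        if not negative_terms:
            continue  # skip empty suggestions
        for keyword in keywords:
            keyword_terms = set(keyword.lower().split())
            # Broad-match negative: blocked when all negative terms are present.
            if negative_terms <= keyword_terms:
                conflicts.append((negative, keyword))
    return conflicts


# Hypothetical example data, not taken from any real account.
suggested_negatives = ["cheap", "indoor", "used patio furniture", "sets"]
target_keywords = ["outdoor sofa sets", "patio furniture", "outdoor dining table"]

for negative, keyword in find_conflicts(suggested_negatives, target_keywords):
    print(f"'{negative}' would block '{keyword}'")
```

A check like this only flags exact word overlaps; it won’t tell you whether an exclusion is strategically wrong, so it complements the manual review rather than replacing it.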

Grouping keywords

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: Can be helpful to improve the speed of ad group creation. 

However, it consistently leaves a number of keywords without a group, and mistakenly groups certain keywords together (such as “sofa sets” in the example screenshot, which deserves its own ad group as it would need a separate product focus on sets of furniture). 

It’s not suitable for granular keyword grouping, which is how we set up campaigns.

[Screenshot: ChatGPT]

Bard: Bard approaches it differently, providing new grouping ideas, but it also throws in its own keywords :O (not always bad, but it means you’ll have to review every keyword after grouping)

[Screenshot: Bard]
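
Both problems described above (keywords left without a group, and keywords the tool adds on its own) can be spotted mechanically by comparing the list you supplied with the groups you got back. Here is a minimal sketch, assuming the grouped output has already been parsed into a dictionary; the `audit_grouping` helper and the sample keywords are hypothetical.

```python
def audit_grouping(original_keywords, groups):
    """Compare the keywords we supplied with the ad groups the tool returned."""
    original = {kw.strip().lower() for kw in original_keywords}
    grouped = {kw.strip().lower() for kws in groups.values() for kw in kws}
    return {
        "ungrouped": sorted(original - grouped),  # supplied but left without a group
        "added": sorted(grouped - original),      # introduced by the tool itself
    }


# Hypothetical example data for illustration only.
original_keywords = ["outdoor sofa", "sofa sets", "patio chairs", "garden bench"]
groups = {
    "Sofas": ["outdoor sofa", "sofa sets"],
    "Chairs": ["patio chairs", "adirondack chairs"],
}

report = audit_grouping(original_keywords, groups)
print("Left ungrouped:", report["ungrouped"])
print("Added by the tool:", report["added"])
```

This won’t catch keywords that were grouped together when they shouldn’t have been (like “sofa sets”); that still needs a human eye.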

Ad copy

๐Ÿ‘ Useful

๐Ÿ”ง Not self-sufficient

ChatGPT: Safe to say it struggles to stay within the character limits (and organize the copy into single-angle approaches) which is very important for most ad channels (Google and Microsoft Ads more so than others) and our own A/B testing efforts. 

Still, this is where we believe ChatGPT shines as, with appropriate guidance, it’s able to provide a large number of variations for different angles/personas/platforms (and neatly organize them into columns for easier analysis).

Having said that, the copy is rarely usable as-is and often needs additional refinement.

[Screenshots: ChatGPT]

Bard: Same issues with staying within character limits, but also more uninspiring ad copy, in my opinion.

[Screenshot: Bard]

Not to say you can’t get more out of it with some prompt refinement 😉

[Screenshot: Bard, after prompt refinement]

Pro Tip: Leverage “organize into tables, add X columns to the right and fill them out using different tones of voice” prompts to generate more copy at scale and in a more readable format.

ChatGPT ad copy script

👎 Not useful

🔧 Not self-sufficient

ChatGPT: We’ve tested the RSA-writing script that is currently circulating on LinkedIn. The script reviews search ads in Google Ads and suggests additional headlines where the ad doesn’t have the maximum number of headlines or descriptions.

Find the script here: https://searchengineland.com/google-ads-script-gpt-responsive-search-ads-395548

We’ve found that:

  1. The script doesn’t fit our approach of A/B testing different angles, as it ignores specific ad copy focuses and often produces generic ads
  2. It often produces copy that’s over the limit (as established above), making the variant unusable as there’s no way to relay the same information in 30 characters.
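
Since over-limit copy keeps coming up, a quick length check can filter generated variants before anyone reviews them. The sketch below assumes Google’s responsive search ad limits of 30 characters per headline and 90 per description; the sample headlines and the `over_limit` helper are hypothetical, and `len()` counts Unicode characters, which may differ from how an ad platform counts some characters.

```python
HEADLINE_LIMIT = 30      # responsive search ad headline limit
DESCRIPTION_LIMIT = 90   # responsive search ad description limit


def over_limit(lines, limit):
    """Return (text, length) pairs for every line longer than the limit."""
    return [(text, len(text)) for text in lines if len(text) > limit]


# Hypothetical generated headlines for illustration only.
headlines = [
    "Outdoor Furniture On Sale",
    "Shop All Things Outdoorsy At Great Prices Today",
]

for text, length in over_limit(headlines, HEADLINE_LIMIT):
    print(f"Too long ({length}/{HEADLINE_LIMIT}): {text}")
```

The same function works for descriptions by passing `DESCRIPTION_LIMIT` instead.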

We’re also planning on verifying the test results provided in the article by running A/B tests across a number of clients, as we haven’t noticed the number of headlines affecting results. In fact, Google often picks just 3-4 headlines and rotates through those.

Bonus: Analyzing reviews for ad copy

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT or Bard can help with analyzing your own and competitors’ reviews in bulk to:

  1. Identify common issues or complaints that should be addressed in the copy
  2. Discover emerging trends or topics to be added as keywords

This, however, requires additional preparation, as the data needs to be scraped and then published or pasted (in limited amounts) into ChatGPT/Bard for review.

[Screenshot]

Targeting research

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: Targeting research with ChatGPT could be useful when doing top-level market research, as well as to uncover new targeting angles for audience targeting/ad copy/creatives/keywords. 

Further refinement is key to uncovering additional ideas. As is the case with all tasks so far, though, none of it can be used as-is and requires verification and further analysis to be practically applicable.

[Screenshot: ChatGPT]

Similarly generic in nature, ChatGPT’s channel suggestions can help provide general ideas for ad networks worth considering (Houzz ads were a new one for me), but should not be acted on without proper research of network specifics (e.g. ChatGPT won’t tell you that at LinkedIn’s $5-10 CPCs, you’re not likely to make a profit promoting discounted outdoor furniture).

[Screenshot: ChatGPT]
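
To make the LinkedIn CPC point concrete, here is a rough break-even sketch. Only the $5-10 CPC range comes from our experience above; the conversion rate and profit per order are made-up placeholder numbers, so substitute your own before drawing any conclusions.

```python
def break_even_cpc(conversion_rate, profit_per_order):
    """Highest cost per click at which a click still pays for itself on average."""
    return conversion_rate * profit_per_order


# Hypothetical inputs for illustration only.
conversion_rate = 0.02    # assumed: 2% of clicks become orders
profit_per_order = 60.0   # assumed: $60 profit per order after discounts

linkedin_cpc_low = 5.0    # low end of the $5-10 range mentioned above

cpc = break_even_cpc(conversion_rate, profit_per_order)
print(f"Break-even CPC: ${cpc:.2f}")
print("Profitable at the low end of LinkedIn CPCs:", cpc >= linkedin_cpc_low)
```

Under these assumed numbers the break-even CPC sits well below $5, which is the kind of network-specific check a generic channel suggestion skips.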

Bard: Top-level, but surprisingly detailed results from Bard.

[Screenshot: Bard]

Auditing / Optimization

👎 Not useful

🔧 Not self-sufficient

There’s no way to connect ChatGPT or Bard directly to an ad account, so any kind of auditing would require feeding it considerable amounts of data first, which (A) is limited by how much data it can process and (B) requires teaching it exactly what to look for.

We’d need to build a ChatGPT for ad audits.

Next unicorn startup idea?

Creative brief ideas

๐Ÿ‘ Useful

๐Ÿ”ง Self-sufficient (well, kind of)

ChatGPT: Another useful application is coming up with new creative brief ideas (albeit a bit generic). The results are original and well-adjusted for brands, provided we give it enough context and specify the focus. The ad layout refinement prompts are also genuinely helpful for brief specifics.

[Screenshot: ChatGPT]

Bonus: it can even generate storyboards for video ads, although we haven’t tested implementing this use case specifically.

[Screenshot: ChatGPT storyboard]

Bard: Definitely less creative flair compared to ChatGPT. We couldn’t get it to produce specific ideas even with prompt refinement.

[Screenshot: Bard]

Audience research

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: Similar to general targeting research, ChatGPT provides useful top-level information, but none of its answers are actual Facebook audiences you’ll be able to select from the dropdown list.

[Screenshot: ChatGPT]

Bard: Unfortunately, hardly anything useful, very very top-level information.

[Screenshot: Bard]

Setup automation

👎 Not useful

🔧 Not self-sufficient

Wouldn’t it be nice to have AI create campaigns on autopilot?

Unfortunately, there’s no known way to fully automate any part of the setup through ChatGPT or similar tools (and considering the results from the various tasks we tested it against, we wouldn’t trust any automation that doesn’t involve human QC).

Data analysis

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: One of the most important tasks for us. Unfortunately, while it’s good for bulk data analysis, its prompt and answer character/token limits (4000 tokens) make it hard to embed it into data analysis processes, as we often can’t feed it all of the data points we need it to take into account. 

โ€Having said that, it’s great for smaller analysis tasks, especially around grouping topical data (i.e. keyword topics).

[Screenshot: ChatGPT]

Bard: Bard often omitted a large portion of the data and focused on very broad groups when grouping it. Same with the example below (using the same data as above).

[Screenshot: Bard]
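
One workaround for the token limits described above is to split the data into chunks that each fit within a budget and analyze them one at a time. The sketch below is a rough illustration, not something we used in the test: the 4-characters-per-token ratio is a crude assumption rather than a real tokenizer, the 3,000-token budget is an arbitrary choice that leaves room for the prompt and the answer, and the sample rows are made up.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; the chars-per-token ratio is an assumption."""
    return max(1, len(text) // chars_per_token)


def chunk_rows(rows, token_budget):
    """Split rows into consecutive chunks whose estimated size fits the budget.

    A single row larger than the budget still gets a chunk of its own.
    """
    chunks, current, used = [], [], 0
    for row in rows:
        cost = estimate_tokens(row)
        if current and used + cost > token_budget:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks


# Hypothetical export rows (keyword, clicks, CPC) for illustration only.
rows = [f"keyword {i},100,2.50" for i in range(1000)]
chunks = chunk_rows(rows, token_budget=3000)
print(f"{len(rows)} rows split into {len(chunks)} chunks")
```

Keep in mind that the tool sees each chunk in isolation, so anything that depends on the full dataset (totals, rankings, cross-chunk patterns) still has to be put together by hand.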

Budget allocation

👎 Not useful

🔧 Not self-sufficient

Budget analysis requires providing the tool with a considerable amount of data for context, which is limited by its prompt and answer character/token limits, rendering it unusable.

Furthermore, if we need to rely on bulk analysis, better results can be achieved through platforms’ own budget-related features or 3rd party tools such as Shape.io (although, in our experience, none of them can be relied on for completely autonomous budget allocation, and all still require human QC).

Brainstorming

๐Ÿ‘ Useful

๐Ÿ”ง Not self-sufficient

ChatGPT: Great for assisting in general top-level topic brainstorming (strategy, general A/B testing, promotional angles, etc.), but unreliable for niche topics or specific/technical problems.

[Screenshot: ChatGPT]

Bard: Equally great for top-level brainstorming, although often less accurate than ChatGPT.

[Screenshot: Bard]

CRO

🤔 Potentially useful

🔧 Not self-sufficient

ChatGPT: CRO suggestions are generic. Might as well ask for an extensive list of all CRO suggestions and evaluate them on a case-by-case basis.

[Screenshot: ChatGPT]

Bard: Similar suggestions, but even more generic.

[Screenshot: Bard]

Don’t agree with our assessment? Found other use cases for ChatGPT or Bard? Please do share, as I’m genuinely interested in finding ways to make these tools work better for us and our clients. You can reach me at art[at]artdoesads.com or on LinkedIn.
