Can Google’s AI Overviews Be Trusted in UK Financial Services?

Friday 15th November 2024

This is, to my knowledge, currently the largest study into how Google’s AI Overviews are used within the UK financial services market.

The College Investor has produced excellent research into how AI Overviews are implemented (and are causing issues) in the US, so this is our chance to see how things are shaping up in the UK.

This post outlines the findings from what I’m calling ‘phase one’ of the research. In phase one, we gathered data across over 200 search terms and assessed how Google implements AI Overviews in the UK.

We have assessed how prevalent AI Overviews are on this side of the pond and we’ve then worked with (ironically) AI models to analyse the results.

Phase two is underway; when it’s here we’ll inject the additional analysis of financial professionals to assess the quality of the AI Overviews results.

For now, we’ve benefited from systematic analysis using AI to help us fact-check and get a good first take on how well (or not) things are shaping up.

So, can Google’s AI Overviews be trusted in the UK financial services market?

The key information about this study

228 search terms reviewed

29 in banking
29 in borrowing
35 in insurance
26 in investing
34 in mortgages
30 in pensions
22 in savings
23 in tax

Search terms by type

10 are commercial - representing a user seeking a solution/product
218 are informational - representing a user seeking an answer or knowledge
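
As a quick sanity check, the two breakdowns above can be tallied to confirm that each adds up to the 228 search terms reviewed. This is only an illustrative sketch; the dictionary names are my own and not part of the study's tooling.

```python
# Topic counts as listed in the study summary above.
topic_counts = {
    "banking": 29,
    "borrowing": 29,
    "insurance": 35,
    "investing": 26,
    "mortgages": 34,
    "pensions": 30,
    "savings": 22,
    "tax": 23,
}

# Search terms by type, as listed above.
type_counts = {"commercial": 10, "informational": 218}

total_by_topic = sum(topic_counts.values())
total_by_type = sum(type_counts.values())

# Both breakdowns should describe the same set of search terms.
print(total_by_topic, total_by_type)
```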

Searches were carried out between 25th and 31st October 2024.

The key findings (so far)

122 of the search engine results pages (SERPs) contained an AI Overview, which is 54% of all searches reviewed.

% of AI Overviews by finance topic

90% of the banking searches contained an AI Overview.

Zero tax search terms contained an AI Overview.

Half of the commercial terms contained an AI Overview, and just over half (54%) of the informational terms contained an AI Overview.
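
For anyone wanting to reproduce the headline percentages, here is a minimal sketch of the arithmetic. The count of informational terms with an AI Overview (117) is not stated directly in the study; it is an assumption derived by subtracting the five commercial terms (half of ten) from the 122 total. The helper name `pct` is hypothetical.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

total_terms = 228
with_overview = 122

commercial_terms = 10
commercial_with_overview = 5  # "half of the commercial terms"

informational_terms = 218
# Assumption: derived by subtraction, not reported directly in the study.
informational_with_overview = with_overview - commercial_with_overview

overall_pct = pct(with_overview, total_terms)
commercial_pct = pct(commercial_with_overview, commercial_terms)
informational_pct = pct(informational_with_overview, informational_terms)

print(overall_pct, commercial_pct, informational_pct)
```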

Types of search term containing an AI Overview

The average word count for an AI Overview is 170.

35 of the AI Overviews (29%) contained a mention of at least one brand name.

Number of AI Overviews mentioning a finance brand

Gov.uk was the most referenced brand, with six mentions.

Pension Wise was referenced five times.

Banking brands: Barclays, Halifax, HSBC, Lloyds Bank, and NatWest were each mentioned three times.

Most mentioned finance brands in AI Overviews

Overall, 63 different brands were mentioned throughout the AI Overviews studied.

The initial analysis: how trustworthy are AI Overviews in financial services?

There are two ways we can assess how reliable Google’s AI Overviews are. By using both, we’re hoping to cover all bases.

The first method, the one that is now complete, uses the latest ChatGPT model (o1-preview) to study each AI Overview response and provide an accuracy rating, along with a rationale for how that rating was reached.

In effect, we’re using it to speed up the fact-checking process. This can’t yet replace a qualified human perspective; however, it has helped us to find the potential issues and flag them.

When the system detects a potential issue, it flags it as either ‘incorrect’, ‘inaccurate’, or ‘misleading’. We can then read why it has reached that conclusion and verify if we agree.
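
The flag-then-verify step described above could be sketched roughly as follows. This is not the tooling used in the study: the `Review` class, the `needs_human_check` function, and the sample entries are all hypothetical placeholders. Only the three flag labels come from the process described here.

```python
from dataclasses import dataclass

# The three flag labels named in this post; anything else is treated as not flagged.
FLAGS = {"incorrect", "inaccurate", "misleading"}

@dataclass
class Review:
    search_term: str  # the query that produced the AI Overview
    rating: str       # the model's verdict on the AI Overview
    rationale: str    # the model's explanation, read by a human before agreeing

def needs_human_check(review: Review) -> bool:
    """Return True when the model has flagged a potential issue."""
    return review.rating in FLAGS

# Placeholder entries for illustration only, not real study data.
reviews = [
    Review("example term A", "correct", "placeholder rationale"),
    Review("example term B", "misleading", "placeholder rationale"),
]

# Flagged reviews are queued so a person can read the rationale and verify it.
queue = [r for r in reviews if needs_human_check(r)]
print([r.search_term for r in queue])
```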

The second method is the human-only approach. This involves a financial services expert reviewing each Google response and making a decision on how accurate it is. As you can imagine, this is a more time-consuming method.

The results are in for method one (AI), and it makes for interesting reading.

Of all of the AI Overviews analysed:

75% were deemed to be correct
12% were judged to be inaccurate in some way
7% were found to be fundamentally incorrect
5% were believed to be misleading in some way
1% (a single search term) was rejected by the AI model for breaking policy (ChatGPT didn't want to address a question on tax avoidance)

In a nutshell, the majority were judged to be reliable answers. However, a quarter had issues, with 12% (the 7% incorrect plus the 5% misleading) containing fundamental flaws or providing misleading information.
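
The summary figures follow directly from the breakdown listed above. A short sketch of the arithmetic, using the rounded percentages as published (the variable names are my own):

```python
# Rounded percentages from the AI-assisted review, as listed above.
ratings = {
    "correct": 75,
    "inaccurate": 12,
    "incorrect": 7,
    "misleading": 5,
    "rejected": 1,  # the single term the model declined to assess
}

total = sum(ratings.values())

# "A quarter had issues": everything not judged correct.
not_correct = total - ratings["correct"]

# "Fundamental flaws or misleading information": incorrect plus misleading.
serious = ratings["incorrect"] + ratings["misleading"]

print(total, not_correct, serious)
```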

Accuracy of AI Overviews in financial services

What went wrong?

There were multiple reasons a response was judged to be incorrect, with the standout reasons being:

The AI Overview is providing an incorrect formula (for example, when answering “how to calculate mortgage interest”)
The AI Overview quotes facts and figures that cannot be verified - there’s a caveat here, because AI models aren’t great with up-to-the-minute data, so a human review will shed more light on this
The AI Overview incorrectly states information such as advising the reader that an employee cannot change workplace pension provider

What was misleading?

A number of the reported issues relate to prices and percentages that cannot be verified; this is another area our human review will iron out
Misleading information on life insurance requirements
Conflating state pension with pension credit
Suggests that grandparents can open a JISA for grandchildren

What was inaccurate?

Multiple responses mixed UK and US banking information
Suggesting proof of income is required to open a bank account
Oversimplification of PCP finance
Recommends an amount of annual income to budget for car payments, rather than monthly take-home
Mixes up driver's licence information concerning insurance
Refers to financial products not necessarily commonplace in the UK market
Inaccurate information on how savings accounts can impact credit score

What are my initial takeaways?

Most of the search terms we tested were naturally ‘informational’. AI Overviews lend themselves well to research information, so it’s perhaps unsurprising that Google dishes them out on over half of the results pages.

Google has avoided the ‘tax’ topic with AI Overviews. This is interesting, as it indicates that some suppression may be happening. I imagine tax is a potential minefield when it comes to providing information. The stakes are potentially higher and if regulators are watching certain areas, tax is likely to be one of them.

Brands feature more prominently than I had anticipated. Encouragingly, Gov.uk and Pension Wise (a government organisation) were the most referenced. Presumably, these are viewed as trusted resources and the ‘source of truth’.

Banking brands also featured strongly in relation to questions about where to access certain financial products. The banks mentioned are the ones you’d expect to see in the UK, with fewer mentions of the ‘neo-banks’. It suggests, as is often advised, that building a brand (as ambiguous as that advice is) is critical.

‘Investing’ is another topic that Google appears to be handling with care. Only 19% of results pages contained an AI Overview. Again, like tax, this is a high-risk area and is possibly being suppressed.

Observations

As I conducted the searches, I noticed some interesting behaviour in how the AI Overview content appears to be generated.

For most searches, the AI Overview loaded with the rest of the page; i.e. it was there ready to go.

For some searches, however, the AI Overview worked through a multi-step process of ‘searching’ to ‘generating overview’.

I can’t say for certain, but it looked to me like some of the responses were ‘canned’ whilst others were generated on the fly.

So, can AI Overviews be trusted?

Based on our initial findings, and with analysis ongoing, I have to say ‘no’: AI Overviews cannot be trusted.

Despite a strong initial result that suggests 75% of responses are correct, we must not overlook the fact that finance is an area where misinformation can wreak havoc.

Anything less than 100% accuracy is, in my view, a failure of the system.

I’m inclined to take this view when we consider Google’s standing in the market. Google maintains c. 90% market share in search, and its AI Overviews, despite showing a small disclaimer, are presented as the answer to the question you seek.

The intention of AI Overviews is to keep you within Google’s ecosystem; it doesn’t encourage the reader (well enough) to go and verify the information.

In the UK, regulation means that finance brands and influencers/finfluencers cannot publish financial advice if they are not qualified to do so. The same must be true for platforms.

Is there an opportunity for brands?

Almost certainly. Google and other generative AI players will be doubling down on tools like AI Overviews, but they will only be good if there’s a meeting in the middle between trusted ‘sources of truth’ and well-designed and maintained tools.

Brands can and should become these sources of truth. Regulators take great care to oversee the standards brands have to maintain, so it’s imperative that Google, with AI Overviews, correctly references these brands and refers people to them for expert guidance.

What happens next?

Phase two of this study is on its way. This will mix in insights from qualified financial experts who will give their take on the quality of AI Overviews as things stand.

Want to learn more about the process so far? Feel free to contact me!