
Oxford study warns: Friendly AI chatbots lie more often

Friendly chatbots lie more often, according to a new Oxford study. Researchers examined five well-known AI models and found that targeted empathy training increased error rates by up to 30 percent. Particularly problematic: the systems confirm false statements especially when users are emotionally vulnerable.

Developers are increasingly training language models to appear warm and friendly. Millions of people already use such chatbots regularly as digital companions or turn to them for advice. A new study from the University of Oxford shows the downside of this development: optimizing the software for empathy leads the systems to make factual errors far more frequently in practice.

In experiments with five well-known models, the error rate increased by ten to thirty percent after such training. The friendlier chatbots were more willing to spread conspiracy theories and sometimes gave incorrect medical advice.

This happened even though the models' basic capabilities remained almost entirely intact in standard tests. The researchers conclude from the data that emotional warmth and factual accuracy are often at odds in these systems.

Why emotional users often receive wrong answers

This behavior intensified when users revealed personal weaknesses or feelings in their messages. In such situations, the empathetic models tended to agree with users even when they were factually wrong.

According to the results, the chatbots confirmed users’ incorrect assumptions about forty percent more often than the original, purely factual versions of the software. This effect was strongest when people expressed overt sadness in their chat messages.

The systems appear to prioritize interpersonal harmony over factual truth. Much as people sometimes tell white lies to avoid conflict, the models confirm users’ incorrect statements.

Control experiments show that the training for friendliness itself is responsible for this loss of accuracy: a deliberately neutral or cool tone did not lead to comparable performance losses in the tests.

AI chatbot errors: What this means for digital therapy and advice

These findings pose challenges for AI providers. Language models are increasingly taking on sensitive roles in digital therapy and personal advice, and in such settings incorrect confirmations could put users at risk. Developers will have to find new ways for their systems to stay true to the facts while still responding in socially appropriate ways.

The study’s authors are calling for a rethink of how artificial intelligence is evaluated. Common testing procedures currently overlook these systematic weaknesses because they ignore the user’s emotional context.

To reduce future risks for consumers, the industry will need to adapt its training methods. The researchers write in their study: “Building models that are both warm and accurate will require conscious attention to how these two properties interact.”

