ChatGPT sometimes answers so convincingly that many users unconsciously treat it like a human expert – with opinions, attitudes and world knowledge. This is exactly what leads to errors in thinking that noticeably degrade the quality of the results.
You’ve probably experienced this yourself: you ask an LLM (Large Language Model) like ChatGPT something and get a convincingly formulated answer. Then you ask the opposite and again the answer is convincing. So are LLMs contradictory, unreliable and arbitrary? Or does the problem lie somewhere other than in the LLM?
LLMs seem like someone who has an opinion, perhaps even a belief. But an LLM is not a human colleague or expert, especially not one with character. It is a system that generates responses and imitates human communication.
This is precisely why we attribute human characteristics to these non-human systems. As a result, we make mistakes when using LLMs that seriously impair the results.
AI is not human, so stop treating it like one
It feels natural to treat ChatGPT and other LLMs like a human counterpart, because that is how they are designed: the interaction is dialogic, and the answers come across as calm, polite and structured, and sound extremely confident, as if there were no doubt about them. LLMs don’t sound like machines, they sound like “someone who knows.” But this is exactly what leads to erroneous conclusions.
When a person speaks convincingly, we usually assume that they hold a corresponding inner attitude. With a language model, this assumption leads to incorrect use: it has no beliefs that it defends and no worldview to which it adheres.
The first mental error is therefore surprisingly simple to avoid: stop looking for an opinion in an LLM’s answer. Instead, ask which perspective the language model is currently constructing for you.
AI has no opinion – and that’s a good thing
Nevertheless, many users unconsciously turn AI into something like a digital counterpart with character. They then say things like: “The AI thinks that’s good”, “The AI has changed its mind” or “The AI understands what I mean”.
That sounds harmless, but it isn’t, because this way of speaking shifts expectations: suddenly the AI is supposed to work like a human. That is not what a language model is built for. An LLM only produces probable, context-appropriate answers.
This is great for working around a human weakness: because LLMs don’t defend their own beliefs, they can do something that humans are often bad at, namely changing perspectives quickly.
So it can argue for your idea, argue against your idea, test it from a customer’s perspective, dissect it from a skeptic’s perspective, or turn it into a sober risk analysis. However, an LLM does not (yet) develop character like a human being can.
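To make this perspective switching concrete, here is a minimal sketch. It assumes the OpenAI Python SDK and uses an illustrative model name, example idea and role instructions; any chat-capable LLM API would work the same way, because only the role instruction changes while the question stays the same.

```python
# Minimal sketch: the same idea, examined from different perspectives.
# Assumes the OpenAI Python SDK ("pip install openai") and an API key in the
# environment; the model name, idea and role instructions are illustrative.
from openai import OpenAI

client = OpenAI()

idea = "We want to introduce a four-day week for our support team."

perspectives = {
    "advocate": "Argue as convincingly as possible in favor of this idea.",
    "skeptic": "Attack this idea: name its weakest points and riskiest assumptions.",
    "customer": "Assess this idea strictly from the perspective of an affected customer.",
    "risk analyst": "Write a sober risk analysis with concrete, verifiable risks.",
}

for name, instruction in perspectives.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": idea},
        ],
    )
    print(f"--- {name} ---")
    print(response.choices[0].message.content)
```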
Truth does not come from good language
The second big error in thinking is even more insidious: we confuse language quality with competence. People react strongly to linguistic confidence. Anyone who formulates clearly and argues in a structured manner sounds credible and therefore appears competent. Credibility and competence lead to trust, which is all too human. When using LLMs, however, this is dangerous.
LLMs are particularly good at producing plausible-sounding formulations. In particular, they can package uncertainty into sentences that sound like certainty and phrase mistakes in such a way that you only notice them when you look closely. This is precisely why hallucinations, i.e. factually incorrect but linguistically convincing answers, are a central risk of generative AI.
The problem lies in how you use the answers from LLMs. For factual claims, legal statements, figures, sources or reliable classifications, good language alone is not enough; these require verification.
People believe an answer too early because it already looks linguistically “finished”. If, on the other hand, you treat the answers from LLMs as drafts for ideas, variants, wording suggestions or new perspectives, they are valuable.
This is how you get more out of a prompt
Another error in thinking is treating the first answer of an LLM as a kind of “final result”. You ask a question, read the answer and immediately judge: good, bad, useful, useless. This misjudges how LLMs work best in practice: the first output is usually not the final result, but raw material.
A human mechanism is at work here: the anchoring effect. What we see first, and in the case of LLMs get as the first response, unduly shapes our judgment. If the first answer is mediocre, many consider the language model to be poor. If it is brilliant, they overestimate it. Both are wrong. LLMs respond to wording, context and prompt structure, so even small changes can significantly change the output.
This is why good results from LLMs rarely come from the first prompt. Rather, they emerge in the loops that follow. If you want to work well with LLMs, you have to ask follow-up questions and request different perspectives.
You therefore have to reformulate, condense, sharpen, attack, simplify and re-sort. Treat the answer from an LLM not as a verdict, but as material.
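As an illustration of such a loop, here is a minimal sketch, again assuming the OpenAI Python SDK with an illustrative model name and follow-up instructions. The point is not the specific wording, but that each answer is fed back in as material for the next refinement step.

```python
# Minimal sketch of an iteration loop: the first answer is treated as raw
# material and refined in several passes. Assumes the OpenAI Python SDK;
# the model name, task and follow-up instructions are illustrative.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # illustrative model name

messages = [
    {"role": "user", "content": "Draft a product description for a solar-powered bike light."}
]

follow_ups = [
    "Condense this to half the length without losing the key benefits.",
    "Now attack the draft: which claims are vague or unverifiable?",
    "Rewrite it, fixing the weaknesses you just identified.",
]

# First pass: the raw material.
reply = client.chat.completions.create(model=MODEL, messages=messages)
draft = reply.choices[0].message.content

# Refinement loop: each answer goes back in as context for the next step.
for instruction in follow_ups:
    messages.append({"role": "assistant", "content": draft})
    messages.append({"role": "user", "content": instruction})
    reply = client.chat.completions.create(model=MODEL, messages=messages)
    draft = reply.choices[0].message.content

print(draft)
```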
AI better than humans? What LLMs are good at
People like confirmation. We seek information that fits our assumptions, and we defend ideas in which we have already “invested,” even if only mentally. This is called confirmation bias.
This becomes a problem when dealing with AI. If you ask leading questions, a language model often provides answers in exactly the direction of the question: not because it has found the truth, but because it reacts to the framing you have set.
Conversely, this results in an often underestimated benefit: you can have an LLM provide you with counterarguments without offending anyone, hurting your ego or escalating a situation.
Many users don’t realize that LLMs regularly only confirm what they already believed. It would be wiser to do the opposite: use LLMs to attack your own comfort zone and your own assumptions.
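A small sketch of what this can look like in practice: instead of asking a leading question, you wrap your own claim in a prompt that explicitly requests dissent. The template wording and the example claim are illustrative.

```python
# Minimal sketch of deliberately inviting dissent instead of confirmation.
# The template wording and the example claim are illustrative.

def devils_advocate_prompt(claim: str) -> str:
    """Turn a claim you already believe into a request for counterarguments."""
    return (
        f"Take the claim: '{claim}'. Do not confirm it. "
        "Name the three strongest counterarguments, the evidence each would need, "
        "and the conditions under which the claim is false."
    )

# Leading framing (tends to produce confirmation):
#   "Why is a four-day week clearly the best way to increase productivity?"
# Neutral, adversarial framing instead:
print(devils_advocate_prompt("A four-day week increases productivity"))
```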
It is not the quality of LLMs, but the quality of the question that matters
A related error in thinking is assuming that the quality of the answer lies solely in the properties of an LLM, its algorithms and its training data. In fact, a surprisingly large part of the result depends on how you ask.
LLMs respond to framing, role description, objective, examples, context and nuances of wording. With LLMs, language is the control mechanism. If you ask unclear questions, you don’t just get a slightly worse answer; you get an answer to a question that was never clearly asked.
This means that better use of AI often does not start with “better” AI, i.e. with better models, but with clearer thinking in the form of clearer questions and tasks. Once you specify the audience, purpose, tone, depth, exclusions and success criteria for the answer, its quality often changes noticeably.
Talking to LLMs, so-called “prompting”, is therefore not a negligible banality. You write good prompts when you are clear about what you actually want.
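As a sketch of what such clarity can look like in practice, the following example turns the elements mentioned above (audience, purpose, tone, depth, exclusions and success criteria) into an explicit prompt template. The field names and example values are illustrative, not a fixed standard.

```python
# Minimal sketch of a structured prompt that makes the elements from the text
# explicit: audience, purpose, tone, depth, exclusions and success criteria.
# Field names and example values are illustrative.
from dataclasses import dataclass


@dataclass
class PromptSpec:
    task: str
    audience: str
    purpose: str
    tone: str
    depth: str
    exclusions: str
    success_criteria: str

    def render(self) -> str:
        """Combine the fields into a single, explicit prompt."""
        return (
            f"Task: {self.task}\n"
            f"Audience: {self.audience}\n"
            f"Purpose: {self.purpose}\n"
            f"Tone: {self.tone}\n"
            f"Depth: {self.depth}\n"
            f"Do not include: {self.exclusions}\n"
            f"The answer is good if: {self.success_criteria}"
        )


spec = PromptSpec(
    task="Explain what an LLM is.",
    audience="Non-technical management",
    purpose="Decision basis for a pilot project",
    tone="Sober, without marketing language",
    depth="One page, no mathematics",
    exclusions="Vendor comparisons and pricing",
    success_criteria="A reader can explain the opportunities and risks in two sentences.",
)

print(spec.render())
```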
Language models invite misunderstandings
To be fair, the way LLMs are built is not entirely innocent of our errors in thinking. LLMs sound friendly, coherent and confident.
They often seem as if they understand what you want from them. This is precisely why you almost automatically get the impression that you are dealing with a clear-thinking “digital person”.
So the problem does not lie with the user alone. It is also built into the way LLMs are made to simulate human communication.
This phenomenon is now being discussed under the term “anthropomorphization of LLMs”. This is precisely why dealing with LLMs requires a clear understanding of what they can and cannot do.
Better results start with better questions
In the end, it all boils down to an uncomfortable but, in my opinion, important insight: if you want to work better with LLMs, you need to better understand and adapt how you deal with them.
You need to stop automatically seeing truth in fluent language, stop taking the first answer as the final result, and stop mistaking an LLM’s confirmation of your views for good thinking. Above all, you have to stop treating a language model like a human being with attitudes and convictions.
At its best, an LLM is a thought amplifier, an accelerator for generating answers, a perspective generator and a dissent tool. It is not a digital colleague with wisdom and a worldview. That is why you should stop measuring an LLM by human standards. If you understand this, you are better armed against errors in thinking.