4 'uncomfortable' facts about AI that everyone should know.
AI, or artificial intelligence, is currently surrounded by a great deal of misinformation and unfounded speculation. The technology has been widely adopted, yet some core facts are routinely overlooked, and they can significantly change how you judge AI's true capabilities and its future.
Modern language models are merely predictive machines.
You'll occasionally hear someone say something like, "Ask ChatGPT what it thinks." This way of talking isn't unique to AI, but it is fundamentally misleading.
Today's large language models (LLMs) are not capable of 'thinking' or reasoning logically the way humans do. At their core, they are prediction machines. The answers you receive are generated through pattern matching: predicting which word is most likely to come next, based on the prompt you provide.
Those 'patterns' are learned from training data. Products like ChatGPT and Gemini have analyzed vast quantities of text to learn how to respond. They aren't thinking machines so much as enormous statistical memories, piecing words together by probability.
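To make that concrete, here is a deliberately tiny sketch of the idea in Python. It is not how a real LLM is built (those use neural networks with billions of parameters), just a minimal illustration of 'predicting the next word from statistics alone'; the training sentence and function name are invented for the example.

```python
# A toy 'language model': it learns nothing but word-pair statistics from
# its training text, then predicts the statistically likeliest next word.
from collections import Counter, defaultdict

training_text = "the cat sat on the mat the cat ate the fish"
words = training_text.split()

# Count which word follows which (a bigram table: pure pattern statistics).
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent next word. No understanding involved."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat' -- it appears most often after 'the'
print(predict_next("cat"))   # whichever of 'sat' / 'ate' the counts favor
```

Real models predict sub-word tokens with far richer context, but the core move is the same: continue the text with whatever the statistics say comes next.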
These models don't actually 'understand' your question, nor do they reason logically. While they might sometimes perform calculations or attempt to 'solve' a problem, the output doesn't come from reasoning. Prediction and understanding are two completely different concepts.
This limitation shows up in the way AI can confidently make mistakes. A classic example is asking a chatbot how many 'r's are in the word 'raspberry' and getting an answer that shows it never actually counted. Or worse: asking for house-cleaning tips and being handed a 'recipe' that produces highly toxic chlorine gas.
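The letter-counting failure has a concrete technical cause: models don't see letters at all, they see 'tokens', chunks of words. Here is a quick sketch using OpenAI's open-source tiktoken tokenizer; the exact splits depend on which tokenizer you pick.

```python
# Why counting letters trips models up: the model receives token IDs,
# never individual characters. Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several OpenAI models
token_ids = enc.encode("raspberry")

print(token_ids)                              # a short list of integer IDs
print([enc.decode([t]) for t in token_ids])   # the word split into sub-word chunks

# Whatever the exact split, the model is asked to reason about 'r's it has
# never seen as separate symbols -- a recipe for confident miscounting.
```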
AI is gradually running out of quality training data.
Data is what makes LLMs work in the first place. Once the architecture is in place, these models are 'fed' massive data sets. In that sense, modern AI is closer to something that is nurtured than something that is fully designed up front. You can lay the groundwork, but the data you feed in determines the final outcome, sometimes in very unpredictable ways.
To date, almost everything available has already been incorporated into modern AI models: years of web archives such as Common Crawl, the whole of Wikipedia, digitally published books, software source code, and even user-generated content from social media.
While we can continue to expand our hardware infrastructure by building more data centers, the data problem is far more difficult to solve. This raises speculation that we are approaching the true limits of current models.
Some frontier AI labs have begun using AI itself to generate synthetic training data. This approach, however, is like photocopying a photocopy: if quality degrades slightly with each copy, you eventually run into a phenomenon called 'model collapse'.
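You can watch a cartoon version of this effect in a few lines of Python. This is a statistical caricature, not a real training pipeline: each 'generation' fits a simple model (a mean and a spread) to samples produced by the previous generation, and the variety in the data steadily drains away.

```python
# A toy illustration of 'model collapse': each generation is trained only on
# the previous generation's synthetic output. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=0.0, scale=1.0, size=20)   # generation 0: 'real' data

for gen in range(1, 101):
    mu, sigma = data.mean(), data.std()          # 'train': fit mean and spread
    data = rng.normal(mu, sigma, size=20)        # sample the next training set
    if gen % 20 == 0:
        print(f"generation {gen:3d}: spread = {data.std():.4f}")

# The spread shrinks generation after generation: every copy of a copy
# loses a little of the original diversity.
```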
'Vibe coding' isn't as magical as advertised.
If you browse through specialized tech communities, you'll find plenty of criticism directed at 'vibe coding' – a term used to describe letting AI write code without actually understanding how that code works.
It's undeniable that vibe coding has certain applications, especially for beginners or those working on personal projects. For example, it can help you complete projects like making your own E-Ink photo frame with Arduino much faster than learning everything from scratch.
Outside of hobby projects, however, relying on AI to write code reveals several serious problems. Current LLMs can assist programmers inside an IDE by suggesting code and speeding things up, but that is very different from developing a complete product.
Even a 1% error rate in AI-generated code is a major problem: a small fraction of faulty lines can cause serious failures and demands careful correction. Humans make mistakes too, of course, but they understand the structure and logic of what they build, which lets them track down the cause and fix it effectively.
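Here is a contrived Python example of the kind of quiet bug that plausible-looking generated code can hide; the function names are made up for illustration.

```python
# Looks fine, runs without errors, and is still wrong.
def add_tag(item, tags=[]):          # bug: the default list is shared across calls
    tags.append(item)
    return tags

print(add_tag("urgent"))   # ['urgent']
print(add_tag("draft"))    # ['urgent', 'draft']  <- state leaked from the first call

# The fix, once a human who understands the language spots the cause:
def add_tag_fixed(item, tags=None):
    if tags is None:
        tags = []                    # a fresh list on every call
    tags.append(item)
    return tags
```

Nothing crashes, no error message appears; only someone who understands why the code misbehaves can diagnose it.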
Code can be edited and cleaned up under human supervision, but that is very time-consuming, or outright impossible, when the error sits deep in the project's foundations. It's like trying to repair a building's foundation after the frame has already gone up.
That's why you shouldn't believe advertisements that promise you can create your dream app and make money from it, even without any programming knowledge.
You may have unknowingly contributed to 'nurturing' AI.
It's very possible that content you created yourself has been used to train AI models. This could be just a forum comment, a social media post, a blog post you once wrote, or even a school assignment that was publicly posted.
It's not just text. Photos you take and share online, even if your license doesn't permit reuse, are likely to have been collected. Artwork and music you created years ago may also be in the training database.
The source code you've shared, the open-source projects you've contributed to, even digitized books, films, or artwork—all have the potential to become 'food' for AI.
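There is at least a partial defense if you run your own site: major AI crawlers publish user-agent names (OpenAI's GPTBot, Common Crawl's CCBot, Google's Google-Extended) that site owners can block in robots.txt. The sketch below, using only Python's standard library, checks how a site's rules treat those bots; example.com is a placeholder URL.

```python
# Check how a site's robots.txt treats well-known AI training crawlers.
# Uses only the Python standard library; the URL is a placeholder.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

for bot in ("GPTBot", "CCBot", "Google-Extended"):
    allowed = rp.can_fetch(bot, "https://example.com/my-blog-post")
    print(f"{bot}: {'may crawl' if allowed else 'blocked'}")

# Caveat: robots.txt is a polite request, not an enforcement mechanism,
# and it does nothing about content that was collected years ago.
```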
If you're still skeptical, ask yourself how ChatGPT can mimic Studio Ghibli's art style so closely on demand. You never gave permission for any of this, and you certainly weren't compensated.