What would happen if Claude fixed the faulty ChatGPT code?

Some time ago, someone asked Claude, ChatGPT, and Gemini to build a solar system simulator. That was around the time Claude was getting a lot of attention, and many people realized that perhaps they shouldn't limit themselves to ChatGPT.

 

That test yielded one of the clearest results ever. Claude did very well and won convincingly. Gemini produced working code, but it wasn't particularly impressive, and interestingly, ChatGPT failed.

Today, let's examine a different skill. This time, instead of writing code, let's give the LLMs the task of debugging it. Specifically, let's ask them to fix the ChatGPT code.

The ChatGPT code is broken.

A small but serious mistake.

 

The previous test had a simple constraint: No retrying. Whatever result was received on the first attempt would be used for evaluation. Unfortunately for ChatGPT, the code it generated had a small, subtle bug that rendered the entire program unusable.

ChatGPT's code uses kilometers for some distances and astronomical units (AU) for others, then mixes the two. Orbital distances computed in AU end up being treated as if they were kilometers. The result is that planets are only a few kilometers apart, which, on the scale of the solar system, means they are essentially inside each other.
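
To make the scale of the mistake concrete, here is a minimal Python sketch. It is not ChatGPT's actual code: the constant names, the `distance_in_sun_radii` helper, and the rounded orbital distances are illustrative assumptions. It compares a planet's distance from the Sun, expressed in solar radii, when an AU value is converted properly and when it is mistakenly read as kilometers.

```python
# Illustrative sketch only: names and rounded values are assumptions,
# not taken from the ChatGPT-generated simulator.

KM_PER_AU = 149_597_870.7   # kilometers in one astronomical unit
SUN_RADIUS_KM = 695_700.0   # nominal solar radius in kilometers

# Approximate mean orbital distances in AU (rounded).
ORBIT_AU = {"Mercury": 0.39, "Earth": 1.0, "Neptune": 30.07}


def distance_in_sun_radii(distance_km: float) -> float:
    """Express a distance in kilometers as a multiple of the Sun's radius."""
    return distance_km / SUN_RADIUS_KM


for name, au in ORBIT_AU.items():
    correct = distance_in_sun_radii(au * KM_PER_AU)  # AU converted to km
    buggy = distance_in_sun_radii(au)                # AU value misread as km
    print(f"{name}: correct = {correct:,.0f} solar radii, buggy = {buggy:.7f}")
```

With the conversion, Earth sits roughly 215 solar radii from the Sun. Without it, every planet in this sketch lands well inside the Sun's radius, which matches the symptom of planets sitting inside each other.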

ChatGPT's code is quite clean and implements everything, but unfortunately, this small bug has rendered it unusable, preventing other aspects of the code from being evaluated.

Writing code and debugging code are two different skills. This is also true for people. A developer might be good at writing code, while another might be good at debugging. The same concept applies to other fields. A good writer isn't necessarily a good editor, and vice versa.

Let's get to the test. The prompt is given below. It is quite simple and asks the models to state clearly which error they fixed, so the results are easy to compare.

 

You are given the broken source code of a solar system simulator. The source code contains a bug that makes the simulator unusable. Your task: Identify exactly what is causing the failure. Fix the source code. Return the fully fixed version of the source code. Briefly explain where the bug was and how you fixed it. Keep the explanation short and focused only on the root cause and the solution.

Note: The main goal of this test was to see whether Claude could fix the ChatGPT source code, but for context, all the usual participants were asked to do the same.

Gemini has fixed the code correctly.

Recently, Gemini has been the weakest of these chatbots. In practice, even if you send it perfect code and ask it to fix a bug, it will invent a bug, "fix" it, and break the code. So it was unclear whether it could accurately identify the actual bug in broken code. For the record, this test used Gemini 3.1 Thinking.

Surprisingly, Gemini did it. It correctly identified the error: the projection code expected positions in kilometers, but orbitalPosition and drawOrbit calculated them in astronomical units. It even described the consequence: the planets were effectively stacked on top of the Sun, making the system appear empty.
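
The kind of fix described here can be sketched in Python. This is an assumption-laden illustration, not the code Gemini returned: `orbital_position_au`, `au_to_km`, `project`, and `KM_PER_SCREEN_UNIT` are hypothetical stand-ins for the simulator's `orbitalPosition` and projection code, and the orbit is simplified to a circle.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

KM_PER_AU = 149_597_870.7          # kilometers in one astronomical unit
KM_PER_SCREEN_UNIT = 1_000_000.0   # hypothetical rendering scale (assumption)


def orbital_position_au(radius_au: float, angle_rad: float) -> Point:
    """Position on a simplified circular orbit, in AU."""
    return (radius_au * math.cos(angle_rad), radius_au * math.sin(angle_rad))


def au_to_km(point_au: Point) -> Point:
    """Convert a 2D point from AU to kilometers."""
    x, y = point_au
    return (x * KM_PER_AU, y * KM_PER_AU)


def project(point_km: Point) -> Point:
    """Map a position given in kilometers to screen units."""
    x, y = point_km
    return (x / KM_PER_SCREEN_UNIT, y / KM_PER_SCREEN_UNIT)


earth_au = orbital_position_au(1.0, 0.0)

buggy_screen = project(earth_au)            # AU passed where km is expected
fixed_screen = project(au_to_km(earth_au))  # units converted at the boundary

print("buggy:", buggy_screen)
print("fixed:", fixed_screen)
```

Converting once, at the boundary between the orbit math and the projection, keeps each function working in a single unit. Under the assumed scale, the unconverted position collapses to almost exactly the origin, while the converted one lands about 150 screen units out.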

 

The solution it used was also correct, and the fixed code worked. Finally, we can see ChatGPT's solar system simulator!

ChatGPT can also self-correct errors.

After all, this is ChatGPT's own flawed code, and one would not necessarily expect the chatbot that wrote the bug to find it. However, ChatGPT has improved recently. In practice, it performs better than Gemini. ChatGPT used to be so frustrating that many people switched to Claude, but now, when they use it occasionally, they generally get good results.

That's natural. These models are constantly being refined and updated, even if the version names don't change. In practice, ChatGPT is also much more sensitive to custom instructions than other chatbots, so your custom instructions in ChatGPT can significantly affect the experience.

Surprisingly, ChatGPT did a great job. It found the root cause, provided a concise but clear explanation as requested, and fixed the code. And the fixed code works well (using ChatGPT 5.4 Thinking as the model).

 

ChatGPT spent the longest time thinking in the original task. Perhaps all that reasoning before generating code filled the context window and contributed to the error. Or maybe ChatGPT is fine-tuned in a way that makes it better at handling smaller tasks and minor tweaks than creating a project from scratch.

The contrast is very interesting. However, the most interesting part is what comes next.

Claude created the biggest surprise.

Claude's results in the previous test were simply on a different level. They were more thorough, detailed, informative, and scientifically sound, far surpassing ChatGPT and Gemini.

But here's the surprise: Claude failed to find the main error in ChatGPT's code.

Instead, it found a different bug related to the camera panning mechanism. To be fair, it was not hallucinating. The bug did exist, but it only appeared when you dragged the mouse while holding the Shift key to pan the camera, and at most camera angles it was subtle enough to be easily missed. Yet Claude completely overlooked a much bigger bug, one that rendered the simulator almost useless.

Strange, isn't it? The chatbot that writes the best code is now the worst at debugging other people's code. This time, Claude fails where ChatGPT and Gemini succeed. To be clear, the model was still Claude Sonnet 4.6, the same one as before.

Given another chance, Claude promised to review the code more carefully and produced another result, which was still wrong. Interestingly, it didn't stop there. It said it "will review more carefully" and kept going.

Once again it found a bug, but still not the serious scale-related one. Fortunately, Claude didn't give up. It continued to think and finally realized the problem was a conversion error between AU and kilometers. The consequence it deduced was wrong, but the solution was still correct. You can see some of the dialogue in the image gallery above.

Finally, Claude reported the bug along with the correct solution.

Claude is more like a bot than other chatbots. That's part of the reason why people are switching to Claude and why it's been the most used lately. A chatbot should just be a true chatbot; it doesn't need to act like a human.

While more popular chatbots like Gemini and ChatGPT seem geared towards the average user, striving to listen and feel more human, Claude is different. That difference is also evident in this test. Claude found real bugs, but not the decisive one at first. Its reasoning seems to go something like this: this is a bug; the code shouldn't have bugs; so this must be the critical bug; therefore the task is complete.

Claude created the most powerful original simulator, but it was the weakest at identifying the most critical errors when constraints were in place. That's the main lesson here!

One LLM might seem similar to another at first glance, but they differ in key respects. Try it yourself: start a private conversation and ask a chatbot to find errors in its own code, then send the same code to another chatbot and ask it to do the same.

It is becoming increasingly clear that no single model can dominate everything. Perhaps we will need to combine multiple models, just in case.
