Anthropic launches Claude 3.5 Sonnet, beating ChatGPT 4o

By Isabella Humphrey, updated 22 June 2024

Hot on the heels of releasing the Claude 3 model family three months ago, Anthropic has now introduced the much-improved Claude 3.5 Sonnet model.

This isn't the biggest model in Anthropic's lab, but it beats GPT-4o and Gemini 1.5 Pro, at least in some benchmarks. Claude 3.5 Sonnet is a mid-range model and runs twice as fast as Claude 3 Opus, the largest Claude 3 model.

Anthropic has kept the API price unchanged for Claude 3.5 Sonnet, which has a context window of 200K tokens. For general users, the model is available for free on claude.ai and supports uploading both images and documents. Keep in mind that free users are subject to rate limits.

Claude 3.5 Sonnet beats GPT-4o on most benchmarks except MMLU and MATH, though the margins are small. On the HumanEval coding test, Claude 3.5 Sonnet scored 92% while GPT-4o scored 90.2%. On GPQA Diamond, which evaluates graduate-level reasoning ability, the new Sonnet model scored 59.4% while GPT-4o scored 53.6%.

On the MMLU test, Claude 3.5 Sonnet scored 88.3% and OpenAI's GPT-4o scored 88.7%. From the benchmark table, you can see that Anthropic has developed a highly capable model that outperforms both GPT-4o and Gemini 1.5 Pro on most tests.
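
To make the size of these gaps concrete, here is a minimal Python sketch that computes Claude 3.5 Sonnet's lead over GPT-4o in percentage points. It uses only the three pairs of scores quoted above; the names `SCORES` and `margins` are illustrative and not part of any published tool.

```python
# Scores (percent) exactly as quoted in this article; nothing else is assumed.
SCORES = {
    "HumanEval": {"Claude 3.5 Sonnet": 92.0, "GPT-4o": 90.2},
    "GPQA Diamond": {"Claude 3.5 Sonnet": 59.4, "GPT-4o": 53.6},
    "MMLU": {"Claude 3.5 Sonnet": 88.3, "GPT-4o": 88.7},
}


def margins(scores):
    """Return Claude 3.5 Sonnet's lead over GPT-4o per benchmark, in points.

    A negative value means GPT-4o scored higher on that benchmark.
    """
    return {
        name: round(row["Claude 3.5 Sonnet"] - row["GPT-4o"], 1)
        for name, row in scores.items()
    }


if __name__ == "__main__":
    for name, gap in margins(SCORES).items():
        leader = "Claude 3.5 Sonnet" if gap > 0 else "GPT-4o"
        print(f"{name}: {leader} leads by {abs(gap)} points")
```

On these figures the widest gap is on GPQA Diamond, while the MMLU difference is a fraction of a point in GPT-4o's favor.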

Claude 3.5 Sonnet is also a strong vision model and again outperforms GPT-4o in various visual reasoning tests. It is very good at reading and transcribing text from images that are difficult to read. It also excels at interpreting charts, graphs, and illustrations.

Furthermore, Anthropic announced a new Artifacts feature for Claude, which works like OpenAI's Code Interpreter tool. Artifacts shows the code and content that Claude generates in a separate interface. It is not limited to Python and can also work with other programming languages.

Anthropic says Claude 3.5 Haiku and Claude 3.5 Opus will be available later this year. Overall, I was impressed with the speed and intelligence of Claude 3.5 Sonnet. It looks like users can finally replace GPT-4o with Anthropic's new model for their daily work.
