Anthropic launches Claude 3.5 Sonnet, beating ChatGPT 4o
This isn't the biggest model in Anthropic's lab, but it beats ChatGPT 4o and Gemini 1.5 Pro, at least in some benchmarks. Claude 3.5 Sonnet is a mid-range model and is twice as fast as Claude 3 Opus, the largest model in the Claude 3 family.
Anthropic has kept the API price unchanged for Claude 3.5 Sonnet, which has a context window of 200K tokens. For general users, it is available for free on claude.ai and supports uploading both images and documents. Remember that there are rate limits for free users!
Claude 3.5 Sonnet beats GPT-4o in most benchmarks except MMLU and MATH, but the difference is very small. In the HumanEval coding test, Claude 3.5 Sonnet scored 92% while GPT-4o scored 90.2%. In GPQA Diamond, which evaluates graduate-level reasoning ability, the new Sonnet model achieved a score of 59.4% while GPT-4o scored 53.6%.
In the MMLU test, Claude 3.5 Sonnet scored 88.3% and OpenAI's GPT-4o model scored 88.7%. From the table, you can deduce that Anthropic has developed a highly capable model that outperforms both GPT-4o and Gemini 1.5 Pro.
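To make the size of these gaps concrete, here is a minimal Python sketch that computes the margin between the two models in percentage points. It uses only the three score pairs quoted in this article; the dictionary keys and the `margins` helper are names made up for illustration, not part of any benchmark tool.

```python
# Scores as quoted in this article (percent). Only benchmarks where the
# text gives both numbers are included; nothing else is filled in.
scores = {
    "HumanEval": {"claude_3_5_sonnet": 92.0, "gpt_4o": 90.2},
    "GPQA Diamond": {"claude_3_5_sonnet": 59.4, "gpt_4o": 53.6},
    "MMLU": {"claude_3_5_sonnet": 88.3, "gpt_4o": 88.7},
}


def margins(table):
    """Return Claude's lead over GPT-4o in percentage points per benchmark.

    A negative value means GPT-4o is ahead. Results are rounded to one
    decimal place to hide floating-point noise.
    """
    return {
        name: round(row["claude_3_5_sonnet"] - row["gpt_4o"], 1)
        for name, row in table.items()
    }


if __name__ == "__main__":
    for name, gap in margins(scores).items():
        leader = "Claude 3.5 Sonnet" if gap > 0 else "GPT-4o"
        print(f"{name}: {leader} leads by {abs(gap)} points")
```

On these three figures, Claude 3.5 Sonnet leads by 1.8 points on HumanEval and 5.8 points on GPQA Diamond, while GPT-4o leads by 0.4 points on MMLU.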
Claude 3.5 Sonnet is also a strong vision model and again outperforms GPT-4o in various visual reasoning tests. It is very good at reading and transcribing text from images that are difficult to read. It also excels at interpreting charts, graphs, and illustrations.
Furthermore, Anthropic announced a new Artifacts tool for Claude, which works like OpenAI's Code Interpreter tool. The Artifacts tool generates code and content using AI in a separate interface. It is not limited to Python and can also work with other programming languages.
Anthropic says Claude 3.5 Haiku and Claude 3.5 Opus will be available later this year. Overall, I was impressed with the speed and intelligence of Claude 3.5 Sonnet. It looks like users can finally replace ChatGPT 4o with Anthropic's new model for their daily work.