Microsoft Phi-3.5 launched: A more competitive AI model

Microsoft has officially announced the release of a new series of small language models called Phi-3.5, comprising three variants: Phi-3.5-vision, Phi-3.5-MoE, and Phi-3.5-mini.

These lightweight AI language models are trained on synthetic data and filtered public web data, and support a 128K-token context length. All of the new Phi-3.5 models are available on Hugging Face under the MIT license.


Phi-3.5-MoE: A groundbreaking combination

Phi-3.5-MoE stands out as the first model in Microsoft's Phi family to use Mixture of Experts (MoE) technology. This 16 x 3.8 billion parameter model activates only 6.6 billion parameters per token and was trained on 4.9T tokens using 512 H100 GPUs. On popular AI benchmarks, Phi-3.5-MoE outperforms Llama-3.1 8B, Gemma-2 9B, and Gemini 1.5 Flash, and comes close to the current leading model in its class, GPT-4o mini.
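The core MoE idea, routing each token to a small subset of expert networks so only a fraction of the total parameters are active, can be sketched with a toy example. This is an illustrative sketch only, not Microsoft's implementation; the sizes and the top-2 routing choice are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16  # Phi-3.5-MoE uses 16 experts
TOP_K = 2         # experts activated per token (assumption for illustration)
D_MODEL = 8       # toy hidden size, far smaller than a real model

# Each "expert" is a small feed-forward weight matrix; a router scores them.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ router                    # one router score per expert
    top = np.argsort(logits)[-TOP_K:]      # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts
    # Only the selected experts run, so compute (and "active" parameters)
    # scale with TOP_K rather than NUM_EXPERTS.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_forward(token)
```

Per token, only `TOP_K / NUM_EXPERTS` of the expert weights do any work, which is how a 16 x 3.8B model can run with roughly 6.6B active parameters.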

Phi-3.5-mini: Compact and Powerful

Phi-3.5-mini is a 3.8 billion parameter model that outperforms Llama-3.1 8B and Mistral 7B, and even competes with Mistral NeMo 12B. It was trained on 3.4T tokens using 512 H100 GPUs. With only 3.8B parameters, the model is competitive on multilingual tasks against LLMs with far more active parameters. Additionally, Phi-3.5-mini supports a 128K context length, while its main competitor Gemma-2 supports only 8K.

Phi-3.5-vision: Enhanced multi-frame image processing capabilities

Phi-3.5-vision is a 4.2 billion parameter model trained on 500B tokens using 256 A100 GPUs. The model now supports multi-frame image understanding and reasoning, with improved performance on MMMU (from 40.2 to 43.0), MMBench (from 80.5 to 81.9), and the TextVQA benchmark (from 70.9 to 72.0).

Microsoft plans to share more details about the Phi-3.5 line this month, primarily showcasing advances in model performance and capabilities. With their focus on lightweight design and multimodal understanding, the Phi-3.5 family of models can be applied widely across a variety of AI applications.
