What is Google's new PaLM 2 LLM?

While OpenAI is leading the way in Generative AI development, many have judged that Google is lagging behind. Not to be outdone, however, Google launched a new LLM (Large Language Model), PaLM 2, at its 2023 Google I/O conference.

Available in four different sizes to accommodate a wide range of applications, Google's new LLM already powers a number of Google services, with more to come.

What is PaLM 2?

At Google I/O 2023, held on May 10, CEO Sundar Pichai revealed Google's newest product: PaLM 2.

Short for Pathways Language Model 2, Google's upgraded LLM is the second version of PaLM, which was first announced in April 2022. PaLM made headlines at the time and drew plenty of attention thanks to its ability to hold conversations, explain basic jokes, and more. But barely six months later, OpenAI's GPT-3.5 arrived and blew everything away, including PaLM.

Since then, OpenAI has launched GPT-4, a major upgrade built on GPT-3.5, and the new model is being integrated into many tools, most notably Microsoft's Bing AI Chat. Google is taking aim at OpenAI and GPT-4, hoping its upgraded LLM can significantly close the gap after Bard's launch barely made a splash.

Pichai has announced that PaLM 2 will come in four different model sizes: Gecko, Otter, Bison and Unicorn.

Gecko is lightweight enough to run on mobile devices and fast enough for responsive interactive apps on-device, even offline. This flexibility means PaLM 2 can be fine-tuned to support a whole range of product types and help more people.

With Gecko able to process around 20 tokens per second (tokens are the numeric units that Generative AI models break real words into), this looks like a game-changer for AI tools that run directly on mobile devices.
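To make that figure concrete, here is a minimal back-of-the-envelope sketch in Python. The words-per-token ratio used below (roughly 0.75 English words per token) is a common rule of thumb, not an official PaLM 2 number, so treat the output as an illustration only.

# A rough illustration of what "20 tokens per second" means in practice.
# ASSUMPTION: ~0.75 English words per token is a common rule of thumb,
# not an official PaLM 2 figure.

TOKENS_PER_SECOND = 20    # Gecko's reported on-device throughput
WORDS_PER_TOKEN = 0.75    # assumed average ratio for English text

def words_generated(seconds: float) -> float:
    """Estimate how many English words Gecko could produce in `seconds`."""
    return TOKENS_PER_SECOND * seconds * WORDS_PER_TOKEN

for t in (1, 5, 30):
    print(f"{t:>2} s -> about {words_generated(t):.0f} words")
# Prints roughly: 1 s -> 15 words, 5 s -> 75 words, 30 s -> 450 words

In other words, at that rate an on-device model could keep up with a short chat reply in a few seconds, which is what makes the Gecko size interesting for offline use.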

PaLM 2's Training Data

Google hasn't released official details about PaLM 2's training data yet, which is understandable since the model has only just been released. But the PaLM 2 technical report says Google wants PaLM 2 to have a deeper understanding of math, logic, and science, and that much of its training corpus focuses on these topics.

When it revealed PaLM, Google confirmed that the model had 540 billion parameters, which at the time was a huge number.

OpenAI's GPT-4 is said to use more than 1 trillion parameters, with some speculation putting the number as high as 1.7 trillion. Since Google wants PaLM 2 to compete directly with OpenAI's LLMs, it's safe to assume that PaLM 2 will have a similar, or greater, number at the top end.
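For a sense of what those parameter counts mean in practice, here is a quick back-of-the-envelope calculation. It assumes 16-bit (2-byte) weights, a common serving format; this is an illustration of scale, not a description of how Google or OpenAI actually store and serve their models.

# Back-of-the-envelope memory footprint for the model weights alone,
# assuming 16-bit (2-byte) weights -- an illustration of scale, not
# a description of Google's or OpenAI's actual deployments.

BYTES_PER_PARAM = 2  # fp16 / bfloat16

def weight_memory_tb(num_params: float) -> float:
    """Approximate storage for the weights, in terabytes."""
    return num_params * BYTES_PER_PARAM / 1e12

for name, params in [
    ("PaLM (540 billion)", 540e9),
    ("Rumoured GPT-4 (1 trillion)", 1e12),
    ("Rumoured GPT-4 (1.7 trillion)", 1.7e12),
]:
    print(f"{name}: about {weight_memory_tb(params):.2f} TB of weights")
# PaLM: ~1.08 TB, 1 trillion: ~2.00 TB, 1.7 trillion: ~3.40 TB

Numbers like these explain why the largest models live in data centers while only the smallest size, Gecko, is aimed at phones.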

Another significant boost for PaLM 2 is its language training data. Google has trained PaLM 2 on more than 100 languages to give it deeper understanding and better contextualization, and to improve its translation capabilities.

But it's not just natural language. In line with Google's goal of better mathematical and scientific reasoning, the LLM has also been trained on more than 20 programming languages, making it an invaluable asset for programmers.

PaLM 2 is ready to power Google services, but still needs more tweaking

It won't be long before we can get our hands on PaLM 2 and see what it can do. With luck, the launch of PaLM 2's apps and services will go better than Bard's did.

But technically you can already use PaLM 2. Google confirmed that PaLM 2 has been deployed in 25 of its products, including Android, YouTube, Gmail, Google Docs, Google Slides, Google Sheets, and more.
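Developers can also request access to PaLM 2 through Google's PaLM API. The sketch below assumes the google.generativeai Python package and an API key from Google, and uses "models/text-bison-001" as the model name (a Bison-sized text model); exact model names, availability, and parameters may differ depending on when and where you sign up, so treat this as a rough starting point rather than official usage.

# Minimal sketch of calling the PaLM API from Python.
# ASSUMPTIONS: the google.generativeai package is installed, you have an
# API key from Google, and "models/text-bison-001" is available to your
# account -- model names and access may vary.
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

completion = palm.generate_text(
    model="models/text-bison-001",      # Bison-sized PaLM 2 text model
    prompt="Write a Python function that reverses a string.",
    temperature=0.7,
    max_output_tokens=256,
)

# The top candidate's text is exposed on the `result` attribute.
print(completion.result)

Inside Google's own products, of course, none of this plumbing is visible; the model simply runs behind the features listed above.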

But the PaLM 2 technical report also reveals that there is still work to be done, especially around toxic responses in multiple languages.

For example, when fed explicitly toxic prompts, PaLM 2 generated toxic responses more than 30% of the time. In certain languages, namely English, German, and Portuguese, PaLM 2 delivered toxic responses over 17% of the time, and prompts touching on racial or religious identity pushed that figure even higher.

No matter how hard researchers try to clean up the LLM's training data, some toxic content will inevitably slip through the filters. The next step is to continue training PaLM 2 to reduce those toxic responses.

Now is the time of the large language model boom

OpenAI was not the first company to launch a large language model, but its GPT-3, GPT-3.5, and GPT-4 models have certainly played a big role in the development of Generative AI.

Google's PaLM 2 still has some issues to work out, but it is already being used in several Google services, showing the company's confidence in its latest LLM.
