What is a Large Action Model (LAM)?

The rise of generative AI chatbots has popularized the term "large language model," the underlying technology that works behind the scenes. Large language models (LLMs) produce output by predicting likely sequences of words in response to user input, making it appear as if the AI is capable of thinking for itself.

But LLMs are not the only large models that matter; Large Action Models (LAMs) could be the next big thing in AI.

What is a Large Action Model (LAM)?

A LAM is an artificial intelligence system capable of understanding human input and taking action accordingly. This is a slightly different approach from AI systems that focus solely on generating responses. The term "Large Action Model" was first introduced by Rabbit Inc., the developer of the rabbit r1 device. In the company's rabbit r1 launch video, the LAM was presented as a new kind of model designed to take AI from speech to action.

LAMs are trained on large datasets of user actions, so they learn by imitating human behavior captured in demonstrations. From these demonstrations, a LAM can learn to understand and navigate the user interfaces of different websites and mobile applications, then perform specific actions based on instructions. According to Rabbit, a LAM can do this even if the interface changes slightly.

You can think of a LAM as an extension of an LLM's existing capabilities. While an LLM generates text or media output by predicting the next word or token (you ask a question, and the LLM responds with text or media), a LAM goes further by adding the ability to perform complex actions on your behalf.
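To make that distinction concrete, here is a minimal conceptual sketch in Python. It is not Rabbit's actual architecture: the model calls are stubbed out, and the URL and interface element names are hypothetical. The point is simply that an LLM-style call stops at text, while a LAM-style pipeline turns an instruction into executable interface actions.

```python
# Conceptual sketch only: an LLM stops at text, while a LAM emits actions
# that an executor can carry out. Model calls are stubbed; in a real LAM
# the action plan would be predicted by a model trained on demonstrations.
from dataclasses import dataclass


@dataclass
class Action:
    kind: str       # e.g. "navigate", "type", "click"
    target: str     # a URL or UI element
    value: str = ""


def llm_respond(prompt: str) -> str:
    # An LLM only predicts text; acting on the advice is left to the user.
    return "To book a ride, open the app and tap 'Request'."


def lam_plan(instruction: str) -> list[Action]:
    # A LAM goes further: it produces a sequence of executable actions.
    return [
        Action("navigate", "https://rides.example.com"),  # hypothetical URL
        Action("type", "#destination", "Airport"),        # hypothetical selector
        Action("click", "#request-ride"),
    ]


def execute(actions: list[Action]) -> None:
    # A real executor would drive a browser or mobile UI; here we just log.
    for a in actions:
        print(f"{a.kind}: {a.target} {a.value}".strip())


if __name__ == "__main__":
    print(llm_respond("Book me a ride to the airport"))
    execute(lam_plan("Book me a ride to the airport"))
```

In a genuine LAM, `lam_plan` would be a learned model rather than a hard-coded list; as discussed later in this article, the hard-coded version is exactly the shortcut the rabbit r1 was found to be taking.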

What can a LAM do?

LAMs focus on performing complex actions on your behalf, and that ability to handle complexity is the key point: it makes LAMs most useful for advanced, multi-step tasks, though it doesn't mean they can't perform simple ones.

In theory, this means you can ask a LAM to do something on your behalf, like ordering coffee from the nearest Starbucks, hailing a ride, or even booking a hotel room. That goes beyond simple tasks like asking Google Assistant, Siri, or Alexa to turn on the TV or the living room lights.

Essentially, according to the vision shared by Rabbit Inc., a LAM can access the relevant website or application and navigate its interface to take action, such as booking a ride or canceling the trip if you change your mind.

LAMs could succeed LLMs, but they aren't ready yet

The concept of a LAM is compelling, perhaps even more so than the LLM. LAMs could be the next step after generative AI, taking over tedious tasks so we can focus on more interesting activities. However, as exciting as that sounds, LAMs are not ready yet.

The first commercial product that promised to leverage a LAM, the rabbit r1, did not deliver on its marketing promise of performing actions on behalf of users. The device failed so badly at its core function that many firsthand reviews found it practically useless.

Worse still, an investigation by YouTuber Coffeezilla, working with a small group of software engineers who had access to part of the r1's code base, found that Rabbit used hard-coded Playwright scripts to perform actions instead of a LAM. So rather than running a novel AI model, the device was essentially running a series of if-then statements, far from what a LAM promises.
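For contrast, here is roughly what that kind of hard-coded browser automation looks like. This is an illustrative sketch, not code from the r1: the site URL, selectors, and menu items are hypothetical, and only the Playwright calls themselves (`sync_playwright`, `goto`, `click`) are real API. The script can only handle the exact cases its author wrote in, which is precisely the if-then behavior described above.

```python
# Illustrative only: a fixed Playwright script, not a learned action model.
# The URL and selectors are hypothetical; the branching shows why such a
# script breaks on any case (or interface change) its author didn't script.
from playwright.sync_api import sync_playwright


def order_coffee(drink: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://coffee.example.com/menu")  # hypothetical site

        # If > Then logic: every supported drink is spelled out by hand.
        if drink == "latte":
            page.click("#menu-latte")                 # hypothetical selector
        elif drink == "espresso":
            page.click("#menu-espresso")
        else:
            raise ValueError(f"No scripted path for {drink!r}")

        page.click("#checkout")
        browser.close()


if __name__ == "__main__":
    order_coffee("latte")
```

A real LAM, as Rabbit described it, would instead generalize from demonstrations and keep working even when the interface changes; a script like this breaks as soon as a selector or menu item moves.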

The one thing worth taking away from Rabbit's r1 is the vision. There is still a long way to go, so don't get too excited just yet.
