9 pros and cons of using a local LLM

Since ChatGPT emerged in November 2022, the term large language model (LLM) has quickly moved from jargon reserved for AI enthusiasts to a buzzword on everyone's lips.

The biggest appeal of a local LLM is the ability to replicate the capabilities of a chatbot like ChatGPT on your computer without the need for a cloud-hosted version.

There are arguments for and against setting up an LLM locally on your computer. So, should you use a local LLM after all?

Advantages of using an LLM locally


Why are people so excited about setting up their own large language models on their computers? Beyond the bragging rights, there are some practical benefits.

1. Less censorship

When ChatGPT and Bing AI first came online, the things both chatbots were willing to say and do were as fascinating as they were alarming. At the time, both chatbots could even help you make bombs if you used the right prompts. That may sound perverse, but it hinted at the unconstrained capabilities of the language models underneath.

These days, both chatbots are so heavily censored that they won't even help you write a fictional crime novel with violent scenes. Some AI chatbots won't even talk about religion or politics. While the LLMs you can set up locally aren't completely uncensored, many of them will happily tackle thought-provoking topics that public-facing chatbots refuse to touch. So, if you don't want a robot lecturing you about the ethics of discussing topics of personal interest, running a local LLM might be the way to go.

2. Better data security

One of the main reasons people choose a local LLM is to ensure that everything that happens on their computer stays on the device. When you use an LLM locally, it's like having a private conversation in your living room: no one outside can listen in. Whether you're entering credit card details or having sensitive personal conversations with the LLM, all of that data is stored solely on your computer. The alternative is to use a public LLM like GPT-4, which gives the companies in charge access to your chat information.
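
To make that concrete, here is a minimal sketch of local inference using the llama-cpp-python library. The model path is a placeholder (it assumes you've already downloaded a quantized GGUF model file); the point is that the prompt and the response never leave your machine.

    # Minimal local inference sketch. Assumes: pip install llama-cpp-python
    # and a GGUF model file at the (placeholder) path below.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # placeholder path
        n_ctx=2048,  # context window size in tokens
    )

    # No network calls happen here - prompt and reply stay on this machine.
    output = llm(
        "Q: Why does local inference protect privacy? A:",
        max_tokens=128,
        stop=["Q:"],
    )
    print(output["choices"][0]["text"])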

3. Use offline

With the Internet being affordable and widely accessible, going offline may seem like a trivial reason to use a local LLM. But offline access becomes especially important in remote or isolated locations where Internet service is unreliable or unavailable. In such situations, a local LLM that operates independently of an Internet connection becomes an important tool. It allows you to keep working without interruption.
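
If you use Hugging Face's transformers library, for example, you can even tell it to refuse all network access and load a model straight from the local cache. A sketch, assuming the model was downloaded during an earlier online session (gpt2 here is just a small stand-in):

    # Offline loading sketch: assumes the model is already in the local
    # Hugging Face cache from a previous online session.
    import os
    os.environ["HF_HUB_OFFLINE"] = "1"  # block all calls to the Hub

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # small stand-in; use whatever model you cached
    tokenizer = AutoTokenizer.from_pretrained(model_name, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(model_name, local_files_only=True)

    inputs = tokenizer("Local models keep working when the network does not", return_tensors="pt")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))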

4. Save costs

The average price to access an LLM with capabilities like GPT-4 or Claude 2 is $20 per month. Even at that price, you still run into some annoying limitations. For example, with GPT-4 accessed via ChatGPT, you are limited to 50 messages every 3 hours. The only way around those limits is to switch to the ChatGPT Enterprise plan, which can cost thousands of dollars. With a local LLM, once the software is set up, there is no recurring $20 subscription to pay. It's like buying a car instead of relying on ride-sharing services: expensive up front, but cheaper over time.
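
The trade-off is easy to put into numbers. A back-of-the-envelope sketch, where the hardware price is an assumed example figure rather than a real quote:

    # Back-of-the-envelope break-even math. The hardware figure is an
    # assumed example, and electricity costs are ignored for simplicity.
    subscription_per_month = 20  # e.g. ChatGPT Plus
    hardware_cost = 1200         # hypothetical GPU/RAM upgrade

    months_to_break_even = hardware_cost / subscription_per_month
    print(f"Hardware pays for itself after {months_to_break_even:.0f} months")  # -> 60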

5. Better customization

Publicly available AI chatbots offer limited customization due to security and censorship concerns. With a locally hosted AI assistant, you can fully customize the model for your specific needs. You can train it on proprietary data relevant to your use case, improving relevance and accuracy. For example, a lawyer could tune their local AI to generate more accurate legal insights. The main benefit is control: the model adapts to your unique requirements.
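
One common way to do this kind of customization is parameter-efficient fine-tuning such as LoRA. A minimal sketch using the Hugging Face peft library; gpt2 is just a stand-in base model, and the target module names vary by architecture:

    # LoRA customization sketch: wraps a base model so that only small
    # adapter matrices get trained on your proprietary data.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base model

    lora_config = LoraConfig(
        r=8,                        # rank of the adapter matrices
        lora_alpha=16,              # scaling factor for adapter updates
        target_modules=["c_attn"],  # GPT-2's attention projection; varies by model
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of all weights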

Disadvantages of using an LLM locally


Before making the switch, you should consider some of the disadvantages of using a local LLM.

1. Uses a lot of resources

To run an LLM locally and effectively, you will need high-end hardware: a powerful CPU, lots of RAM, and ideally a dedicated GPU. Don't expect a $400 laptop to deliver a good experience; responses will be painfully slow, especially with larger AI models. It's like running cutting-edge video games: you need powerful specs for optimal performance, and you may even need specialized cooling. The bottom line is that a local LLM requires an investment in top-notch hardware to match (let alone improve on) the speed and responsiveness of a web-based LLM. Your computing needs will be huge compared to using web-based services.
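
A quick way to gauge the hardware bar is to estimate how much memory a model's weights need: roughly parameter count times bytes per parameter, plus some overhead. The sketch below uses an assumed ~20% overhead factor for the KV cache and activations; real figures vary by runtime:

    # Rough memory estimate: weights = parameters x bytes per parameter,
    # plus an assumed ~20% overhead for KV cache and activations.
    def estimate_gb(params_billions: float, bits_per_param: int) -> float:
        weight_bytes = params_billions * 1e9 * bits_per_param / 8
        return weight_bytes * 1.2 / 1e9  # the 1.2 overhead is an assumption

    for bits in (16, 8, 4):
        print(f"7B model at {bits}-bit: ~{estimate_gb(7, bits):.1f} GB")
    # 16-bit needs a high-end GPU (~17 GB); 4-bit (~4 GB) fits modest hardware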

2. Slower responses and poorer performance

A common limitation of local LLMs is slower response times. Exact speeds depend on the specific AI model and hardware, but most setups lag behind online services. After receiving near-instant responses from ChatGPT, Bard, and other tools, a local LLM can feel painfully slow. For the average user, it's a serious step down from a fluid web experience. So be prepared for some "culture shock" when moving from fast online systems to their slower local equivalents.

In short, unless you're using an absolute top-of-the-line setup (think an AMD Ryzen 7 5800X3D with an Nvidia RTX 4090 and a massive amount of RAM), the overall performance of a local LLM won't compare to the online generative AI chatbots you're used to.

3. Complicated setup

Deploying an LLM locally is more complicated than simply subscribing to a web-based AI service. With an Internet connection, your ChatGPT, Bard, or Bing AI account can be ready to prompt within minutes. Setting up a full local LLM stack requires downloading frameworks, configuring infrastructure, and integrating various components. For larger models, this process can take hours, even with tools intended to simplify installation. Some of the most advanced AI systems still require extensive engineering to run locally. So, unlike plug-and-play web-based AI models, managing your own AI means a significant investment of time and technical effort.
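
Even a single step, such as fetching the model weights, has its own tooling. A sketch using the huggingface_hub library; the repository and file names follow the common GGUF naming convention but should be treated as assumptions to verify:

    # One of several setup steps: downloading quantized model weights.
    # Repo and file names are illustrative - check what actually exists.
    from huggingface_hub import hf_hub_download

    model_file = hf_hub_download(
        repo_id="TheBloke/Llama-2-7B-Chat-GGUF",
        filename="llama-2-7b-chat.Q4_K_M.gguf",  # 4-bit quantized variant
    )
    print(f"Saved to {model_file}")  # next: point your local runtime at this file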

4. Limited knowledge

A lot of local LLMs are stuck in the past, with limited knowledge of current events. Remember when ChatGPT couldn't access the Internet? Back then, it could only answer questions about events that occurred before September 2021. Like those original ChatGPT models, locally hosted language models are usually trained only on data from before a certain cutoff date. As a result, they have no awareness of anything that happened after that point.

Additionally, a local LLM cannot access Internet data directly, which limits its usefulness for real-time queries such as stock prices or the weather. To get any form of real-time data, a local LLM usually needs an additional integration layer with Internet-connected services. Internet access is one of the reasons you might consider upgrading to ChatGPT Plus instead!
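
Such an integration layer often just fetches the live data itself and pastes it into the prompt. A sketch with a hypothetical weather endpoint; the URL, the response fields, and the ask_local_llm helper are all placeholders:

    # Sketch of a real-time data layer: fetch live data over the network,
    # then hand it to the offline model as prompt context.
    import requests

    def weather_prompt(city: str) -> str:
        # Hypothetical endpoint and fields - substitute a real weather API.
        data = requests.get(f"https://api.example.com/weather?city={city}").json()
        return (
            f"Current weather in {city}: {data['temp_c']} C, {data['conditions']}.\n"
            "Using only the data above, suggest what to wear today."
        )

    # response = ask_local_llm(weather_prompt("Hanoi"))  # placeholder helper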

Should you use a local LLM?

Local large language models offer compelling benefits, but there are real downsides to consider before getting started. Less censorship, better privacy, offline access, cost savings, and customization are attractive reasons to set up a local LLM. However, those benefits come with trade-offs. With so many LLMs available for free online, jumping into a local LLM can be like swatting a fly with a sledgehammer: capable, but overkill. So there is no definitive right or wrong answer. Weighing your priorities will determine whether now is the right time to make the switch.
