Key Takeaways

You can run a ChatGPT-like AI on your own PC with Alpaca, a chatbot created by Stanford researchers. It supports Windows, macOS, and Linux. You just need at least 8GB of RAM and about 30GB of free storage space.

Chatbots are all the rage right now, and everyone wants a piece of the action. Google has Bard, Microsoft has Bing Chat, and OpenAI's ChatGPT is practically synonymous with AI at this point. But what if you don't want to rely on a cloud service for your chatbot? We've got a ChatGPT-like AI you can download --- an Alpaca.

What Is Alpaca?

Alpaca is a language model (a chatbot, basically), much like ChatGPT. It is capable of answering questions, reasoning, telling jokes, and just about every other thing we've come to expect from chatbots. Alpaca was created by Stanford researchers by fine-tuning Facebook's LLaMA.

Unlike ChatGPT, and most other chatbots available today, Alpaca runs completely on your own PC. That means that no one can snoop on your conversations or what you ask Alpaca, and your exchanges can't be accidentally leaked, either. It also means that you don't have to pay any monthly fees, you can train the model further to better suit your needs if you have the hardware, and you can integrate it into any application you want. You're only limited by your hardware and your programming chops.

However, it also works beautifully as just a regular old chatbot you can talk to, and we're going to show you how to run it on just about any PC out there.

How Does Alpaca Compare with ChatGPT?

We'll just get it out of the way up front: ChatGPT, particularly ChatGPT running GPT-4, is smarter and faster than Alpaca at the moment.

Alpaca's speed is mostly limited by the computer it is running on --- if you have a blazing fast gaming PC with a ton of cores and plenty of RAM, you'll get good performance out of it. Slower PCs with fewer cores will take longer to generate responses. Of course, it isn't exactly fair or even reasonable to compare it to ChatGPT in this regard --- we don't know what kind of computer ChatGPT is running on, but it is certainly beefier than your average desktop PC.

There are three main variants of Alpaca currently, 7B, 13B, and 30B. Generally speaking, the larger the number, the smarter the chatbot will be.

Alpaca, especially the 7B model, is noticeably "dumber" than ChatGPT is. It doesn't reason as well and will certainly not pass the Turing test. 7B is still great if you want a recipe suggestion, however.

The 13B and 30B models are quite another story. 13B is capable of providing a coherent, human-like conversation, and can answer complex questions. 30B is even more impressive, if you've got the hardware to run it, and is within striking distance of ChatGPT. It'll wax on philosophically or make a joke without missing a beat if prompted.

What Do You Need to Run Alpaca?

Alpaca has pretty flexible system requirements. The figures below are above the bare minimum, but they're a good target to aim for. We're also going to be installing this on Windows. If you're installing this on a system running Linux or macOS, just skip the Windows Subsystem for Linux section, since it isn't relevant to you.

  • 16 GB of RAM
  • 35 GB of storage on an SSD if you want all three models.
    • 4 GB for the 7B model, 8 GB for the 13B model, and 20 GB for the 30B model
    • 500 MB for default Ubuntu with WSL2
    • A few more GBs between other dependencies
  • A modern CPU is ideal
    • Any Ryzen CPU
    • 7th Generation Intel Processor or Newer
  • Windows Subsystem for Linux 2 (WSL2)
  • Git
  • Docker
  • A community project, Serge, which gives Alpaca a nice web interface
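
As a quick check on the 35 GB storage figure, the arithmetic can be sketched in Python. The sizes come straight from the list above. The variable and function names are only for illustration, and the leftover is simply whatever the 35 GB budget leaves for Docker, Git, and the other dependencies.

```python
# Storage budget for installing all three Alpaca variants, using the
# figures from the list above. Names here are illustrative only.
MODEL_SIZES_GB = {"7B": 4, "13B": 8, "30B": 20}
WSL_UBUNTU_GB = 0.5    # "500 MB for default Ubuntu with WSL2"
TOTAL_BUDGET_GB = 35   # suggested free space if you want all three models


def storage_breakdown():
    """Return (models_total, models_plus_ubuntu, leftover) in gigabytes."""
    models_total = sum(MODEL_SIZES_GB.values())
    models_plus_ubuntu = models_total + WSL_UBUNTU_GB
    leftover = TOTAL_BUDGET_GB - models_plus_ubuntu
    return models_total, models_plus_ubuntu, leftover


models_total, models_plus_ubuntu, leftover = storage_breakdown()
print(f"Models: {models_total} GB")
print(f"Models + Ubuntu: {models_plus_ubuntu} GB")
print(f"Left for other dependencies: {leftover} GB")
```

If you only plan to install the 7B model, the same sum drops to well under 10 GB.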

There is currently no reason to suspect this particular project has any major security faults or is malicious. We've been through the code and run the software ourselves and found nothing concerning. That does not mean it is or will remain safe. Always be cautious with things you find on the internet and reevaluate their safety periodically.

How to Run Alpaca Locally on Your PC

It is important that you follow these steps in the order they're given. Docker will probably break if you don't, requiring a complete reinstall of both WSL2 and Docker.

Install Windows Subsystem for Linux 2

Microsoft's Windows Subsystem for Linux 2 (WSL2) allows you to run Linux software in Windows. It has a low overhead and is really handy in a lot of cases. Docker for Windows relies on WSL2, so we need to install WSL2 first.

If you already have WSL2 installed just run wsl --update in PowerShell to make sure everything is updated.

Open up a PowerShell window as Admin, then enter the command:

        wsl --install
    

It'll take a bit to download all of the WSL2 files and Ubuntu. You must restart your PC after the installation is complete.

Installing WSL2.

Once the restart has been performed, reopen PowerShell (not necessarily as admin) and run:

        wsl -l -v 
    

You should see something like the image below if everything worked correctly. You also don't need to use Ubuntu in particular. You can install any distro you like; Ubuntu is just the default.

Checking that WSL2 installed Ubuntu.

Install Docker

Docker is a program that lets you run programs in a "container." Containers are similar to virtual machines, but they tend to have less overhead and are more performant for a lot of applications. Serge uses Docker to make installation super convenient.

First, download the Docker installer from the Docker website. If you're going to be running Docker on Linux or macOS be sure you grab the appropriate installer.

If you're running a headless Linux server, you'll want to follow the appropriate instructions for your Linux distro to get Docker running.

Install Docker Desktop from the Docker website.

Run the installer and be prepared to wait a few minutes. Docker will take a while and set up a bunch of stuff behind the scenes. Once it is done, you'll want to restart your PC.

After restarting, open PowerShell and run

         wsl -l -v
    

again. This time you should see some entries related to Docker as well.

Docker uses WSL2 to create a VM.

Install Git on Windows

The last prerequisite is Git, which we'll use to download (and update) Serge automatically from GitHub. It isn't strictly necessary since you can always download the ZIP and extract it manually, but Git is better.

Head over to the Git website and download the right version for your operating system. Windows users just need to run the executable. Make sure to at least look at the installation options instead of just clicking rapidly through all of them. One, shown in the screenshot below, is absolutely critical.

Make sure to select the option that adds Git to your system PATH.

Once Git is done installing, you're ready to install Serge and Alpaca.

Install Serge and Alpaca

First, make sure that Docker Desktop is running. Then, open a PowerShell window, either on its own or as a tab in Windows Terminal (not as admin), and run the following command:

        git clone https://github.com/nsarrazin/serge.git && cd serge
    

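This downloads the files from GitHub to a folder on your PC, then changes the active directory to the folder that was created. The && operator requires PowerShell 7 or newer. In the older Windows PowerShell 5.1, run the two commands one after the other instead.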

Download Serge from GitHub.

The next command you need to run is:

        cp .env.sample .env
    

That line creates a copy of .env.sample and names the copy ".env." The file contains arguments related to the local database that stores your conversations and the port that the local web server uses when you connect.
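
If you're curious what ended up in the copy, a .env file is just a list of KEY=VALUE lines. The sketch below is a minimal parser written for illustration, not the loader Docker itself uses, and the sample keys in it are made-up placeholders rather than the real contents of Serge's .env.sample.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical contents: these keys are placeholders, not Serge's real ones.
sample = """
# example settings
EXAMPLE_DB_HOST=localhost
EXAMPLE_WEB_PORT=8008
"""

for key, value in parse_env(sample).items():
    print(f"{key} = {value}")
```

To look at the real file instead, you could pass the text of .env in the serge folder to parse_env.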

Then run:

        docker compose up -d
    

Docker Compose ties together a number of different containers into a neat package. You can check out the docker-compose.yml file in the Serge folder if you want to see more specifically what is involved here.

Docker-Compose setting up Serge.

The final command starts a download, and here you need to make a choice before proceeding. There are three different variants you can download: 7B, 13B, and 30B. 7B is the simplest and "dumbest" model, whereas 30B is the most sophisticated and smartest. 13B is the middle ground.

Variant   Download Size   Free RAM Required   System RAM Recommended on Windows   System RAM Recommended on Linux
7B        4 GB            4 GB                16 GB                               8 GB
13B       8 GB            8 GB                16 GB                               16 GB
30B       20 GB           20 GB               64 GB (Likely)                      32 GB

Linux (and probably macOS) installations will be able to get away with less system RAM than Windows installs --- Windows is a bit of a RAM hog. You should probably start with the 7B variant first since it is the least demanding option. You can always download 13B or 30B later if you want.
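
The table can also be read as a simple lookup. The sketch below is one way to pick the largest variant whose recommended system RAM fits a given machine. The function name and dictionary layout are illustrative, and the 64 GB Windows figure for 30B is the "likely" estimate from the table rather than a measured requirement.

```python
# Recommended system RAM in GB, taken from the table above.
RECOMMENDED_RAM_GB = {
    "windows": {"7B": 16, "13B": 16, "30B": 64},  # 64 GB is marked "Likely"
    "linux": {"7B": 8, "13B": 16, "30B": 32},
}


def largest_variant(system_ram_gb, os_name="windows"):
    """Return the biggest variant the table recommends for this much RAM,
    or None if the machine is below the recommendation even for 7B."""
    table = RECOMMENDED_RAM_GB[os_name.lower()]
    best = None
    # Variants are checked from smallest to largest, so the last match wins.
    for variant in ("7B", "13B", "30B"):
        if system_ram_gb >= table[variant]:
            best = variant
    return best


print(largest_variant(16, "windows"))
print(largest_variant(32, "linux"))
print(largest_variant(8, "windows"))
```

A result of None only means the machine falls below the recommended figure. 7B may still run if enough RAM is actually free, just less comfortably.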

Run the following command to download the 7B model (or substitute 13B or 30B).

        docker compose exec api python3 /usr/src/app/utils/download.py tokenizer 7B
    

Be prepared to wait, especially if you opt for the 30B variant. The Hugging Face server seems to top out at about 20 megabytes per second, so you'll be looking at 50 seconds per gigabyte downloaded in the best-case scenario.
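
To put that in terms of the three variants, here is a small sketch of the same arithmetic. It assumes a steady 20 MB per second and decimal units (1 GB = 1,000 MB), which is what the 50-seconds-per-gigabyte figure implies. The function name is just for illustration.

```python
MB_PER_GB = 1000  # decimal units, matching the 50 s per GB figure above


def download_seconds(size_gb, speed_mb_per_s=20):
    """Best-case download time in seconds at a steady transfer rate."""
    return size_gb * MB_PER_GB / speed_mb_per_s


# Download sizes from the requirements list earlier in the article.
for variant, size_gb in {"7B": 4, "13B": 8, "30B": 20}.items():
    seconds = download_seconds(size_gb)
    print(f"{variant}: {seconds:.0f} s (about {seconds / 60:.1f} min)")
```

At best, then, the 30B download takes roughly 17 minutes, and a slower connection stretches that proportionally.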

Downloading the 7B language model.

Use Serge and Alpaca

If you've followed these instructions, Docker and all of the required containers are currently running. However, you'll have to start them again if you restart your computer. To do that, just open up Docker Desktop and click the small triangular buttons. The icons to the left of the "Name" column turn green when the containers are running.

The Serge containers displayed in Docker, currently offline. Click the arrow button to run them.

Everything is installed and ready to go at this point. Open up your browser and enter "localhost:8008" into the address bar, just like you would to visit Facebook or any other website.

If you're hosting Alpaca/Serge on another computer, you'll need to enter that device's local IP address instead of localhost.
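
If the page doesn't load, it can help to check whether anything is answering on that port at all. The sketch below uses only Python's standard library to attempt a TCP connection. The function name is illustrative, and a successful connection only shows that something is listening on the port, not that it is definitely Serge.

```python
import socket


def is_port_open(host="localhost", port=8008, timeout=2.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:8008")
    else:
        print("Nothing answered on localhost:8008; check that the containers are running")
```

When Serge is hosted on another computer, pass that device's local IP address as the host argument instead of localhost.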

The main screen has your previous conversations displayed along the left and the settings for a new chat displayed in the middle.

The Serge Web Interface.

There are a fair number of settings available, but there are five that you'll really want to pay attention to:

  • Temperature - Determines how freely the AI answers. Lower numbers result in more rigid answers, while higher numbers are more creative.
  • Maximum Generated Text Length in Tokens - How long the responses the bot writes can be.
  • Model Choice - Pick between 7B, 13B, 30B, and any other model you install.
  • n_threads - The number of threads Serge/Alpaca can use on your CPU. Allocating more will generally improve performance.
  • Pre-Prompt for Initializing a Conversation - Provides context before the conversation is started to bias the way the chatbot replies.
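
To make the temperature setting more concrete, here is a minimal sketch of temperature-scaled softmax, the general technique language models commonly use when choosing the next word. It is not Serge's or Alpaca's actual code, and the three scores are made-up numbers for illustration.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]
for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(scores, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At a low temperature nearly all of the probability lands on the top-scoring word, which is why answers feel rigid. At a higher temperature the probabilities even out, so less likely words get picked more often.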

In this case, we upped the temperature and threads, selected the 13B model, and told the chatbot it is a pirate.

Important Serge Settings.

Here is a sample of how the conversation went.

An example conversation.

You can talk about anything you'd like with Alpaca, and you don't have to worry about what is happening to your data. It remains on your device, under your control at all times.

Remember, ChatGPT, Alpaca, and other chatbots seem reliable, but they aren't at this point in time. They very much embody the sentiment: "If you can't dazzle them with brilliance, baffle them with BS." Their tendency to make things up has been dubbed "hallucinating." Do not rely on them for anything essential, especially not something critical to your job or health. They should only be used for entertainment or experimental purposes at this time.

However, the technology is only going to get better with time --- it won't be long before we see Alpaca (or other locally run AI) integrated into Discord servers, Minecraft mods, and any number of other creative applications. Further refinement will also result in faster, more accurate models that can run on weaker hardware.