Every year, I take two weeks off around Christmas and New Year. It’s my favorite time of the year because I usually don’t go anywhere, but stay at home, so that I can play my favorite video games, fix things up around the house, or start some fun projects that require a large chunk of time that doesn’t fit into a typical weekend.
This year, I decided to play with Machine Learning, or, as many people now call it, AI. There are many angles to take: running an open-weight LLM with Ollama / LM Studio; building some fun AI agent; playing with coding agents like Antigravity / Claude Code / Codex. As for me, I have always been more interested in training machine learning models.
My first experience with machine learning was Andrew Ng’s machine learning class on coursera.org. That was a few years before PyTorch and TensorFlow were introduced, so things were done in GNU Octave. I later implemented some basic training logic in Python using NumPy for a class project. It’s finally time for me to play with PyTorch, and obviously, the best way is in a Google Colab.
The free hosted Colab runtime offers only an NVIDIA T4 GPU and a small amount of RAM (10-12GB). If I want to play with a better GPU and bigger models, I have a few choices:
- Colab Paid Service: it’s super vague about which GPU is available, and it’s a subscription that I have to remember to unsubscribe from
- Run a local runtime: I don’t have a good GPU at home, because they are too expensive nowadays
- Run a local runtime on a GCE VM: let’s try this!
Google Compute Engine (GCE) G4 VM
Google Cloud Platform (GCP) GA’ed the GCE G4 VM in October 2025. It offers an RTX PRO 6000 with 96GB of VRAM, which seems like a perfect choice for LLM work. With spot pricing, it’s about $2.50 per hour – a bit pricey to run 24×7, but I am only going to use it a couple of hours here and there, and that’s still much cheaper than owning any comparable GPU.
Creating the VM is easy from the web console. We keep most things at their defaults; the only changes are the following (an equivalent gcloud command is sketched right after this list):
- Set the provisioning model to Spot – it’s about 50% cheaper this way
- Select the Debian 12 OS image with a 40GB root disk
- Add a local SSD
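For repeatability, here is roughly what the same setup looks like with the gcloud CLI. Treat it as a sketch rather than a tested incantation: the G4 machine-type name and the zone below are my assumptions – check gcloud compute machine-types list and G4 regional availability before running it.
# Sketch only – verify machine type and zone first
gcloud compute instances create llm-dev-0 \
    --zone=us-central1-b \
    --machine-type=g4-standard-48 \
    --provisioning-model=SPOT \
    --instance-termination-action=STOP \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --boot-disk-size=40GB \
    --local-ssd=interface=NVME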
My complaint here is that the UI says it’s better to use a GPU-optimized Debian OS image to avoid manually installing CUDA, but when I tried the Deep Learning on Linux images, they turned out not to be compatible with the RTX PRO 6000¹. Anyway, at least the documentation exists, but the UI could really be better.

Setting up the VM
GPU Driver Installation
It was straightforward to follow the steps at https://docs.cloud.google.com/compute/docs/gpus/install-drivers-gpu to install the GPU driver.
curl -L https://storage.googleapis.com/compute-gpu-installation-us/installer/latest/cuda_installer.pyz --output cuda_installer.pyz
sudo python3 cuda_installer.pyz install_driver --installation-mode=binary --installation-branch=prod
The machine will restart, and we should rerun the cuda_installer to finish the installation. After that, running nvidia-smi should produce some meaningful output.
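Concretely, the post-reboot steps look like this (the second run is the same command as above, as the install guide describes):
sudo python3 cuda_installer.pyz install_driver --installation-mode=binary --installation-branch=prod
# The driver and the GPU should now show up
nvidia-smi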
We also install the nvidia-container-toolkit so that we can use the GPU from inside the Colab Docker container, following https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html.
sudo apt-get install -y nvidia-container-toolkit
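One step from that guide worth calling out: once Docker is installed (next section), the toolkit still has to be registered as a Docker runtime, otherwise the --gpus=all flag we use later will fail:
# Run this after Docker is installed
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker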
Local SSD Setup
To save cost, I decided to use local SSD instead of persistent storage. The downside, however, is that I need to jump through a few more hoops to make Docker store all of its images there – the Colab Docker image alone can be as big as 55GB.
git clone https://github.com/penguingao/gce_local_ssd_setup.git
cd gce_local_ssd_setup
sudo ./install.sh
sudo init 6
The VM should restart and the SSDs should be correctly mounted in gce_local_ssd_setup/local_ssd_mnt_[0-3].
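A quick df confirms the mounts (the paths are the ones created by the script above):
df -h ~/gce_local_ssd_setup/local_ssd_mnt_*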
Install Docker
First we follow the steps at https://docs.docker.com/engine/install/debian/.
# Uninstall all conflicting packages:
sudo apt remove $(dpkg --get-selections docker.io docker-compose docker-doc podman-docker containerd runc | cut -f1)
# Add Docker's official GPG key:
sudo apt update
sudo apt install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: $(. /etc/os-release && echo "$VERSION_CODENAME")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
sudo apt update
# Install
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
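Before pointing Docker at the local SSD, a quick sanity check that the installation works:
sudo docker run --rm hello-world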
Now that Docker is installed, we need to point both Docker and containerd at the local SSD for storage. For Docker, that means setting data-root in /etc/docker/daemon.json (the unquoted heredoc below lets the shell expand ${HOME} before the file is written):
sudo tee /etc/docker/daemon.json <<EOF
{
  "data-root": "${HOME}/gce_local_ssd_setup/local_ssd_mnt_0/docker_root"
}
EOF
# Restart Docker so it picks up the new data-root
sudo systemctl daemon-reload
sudo systemctl restart docker
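docker info can print just the storage root via its Go-template output, which is an easy way to confirm the change took effect:
sudo docker info --format '{{ .DockerRootDir }}'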
This should print the new docker_root directory. Also, run sudo systemctl edit docker and include the following lines in the override:
[Unit]
After=chmod_local_ssd.service
Requires=chmod_local_ssd.service
This makes sure Docker only starts after the permissions on the SSD directories have been set up properly.
Finally, let’s make containerd use a different root as well, by editing /etc/containerd/config.toml to include the following line:
root = '<your-home-dir>/gce_local_ssd_setup/local_ssd_mnt_0/containerd_root'
And similarly, run sudo systemctl edit containerd and include the following lines:
[Unit]
After=chmod_local_ssd.service
Requires=chmod_local_ssd.service
After restarting containerd with sudo systemctl restart containerd, both daemons are now storing their data on the local SSD.
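If you want to double-check containerd as well, it can dump its merged configuration (the grep is just for convenience; the relevant key is the top-level root):
sudo containerd config dump | grep 'root ='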
Start the local Colab runtime
We should follow https://research.google.com/colaboratory/local-runtimes.html.
docker run --gpus=all -p 127.0.0.1:9000:8080 us-docker.pkg.dev/colab-images/public/runtime
This might take a little while, but once it finishes, the terminal prints a URL that includes an access token. We just need to forward that port to our local machine (-N opens no remote shell, -L sets up the local port forward) before visiting https://colab.research.google.com/ to connect to the runtime.
gcloud compute ssh llm-dev-0 -- -NL 9000:localhost:9000
After pasting the URL with the token into Colab’s “Connect to a local runtime” dialog, the runtime is connected and we can start using PyTorch!
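A minimal first cell to confirm that PyTorch actually sees the GPU (the exact device name and reported memory may vary slightly):
import torch

print(torch.__version__)
print(torch.cuda.is_available())      # expect: True
print(torch.cuda.get_device_name(0))  # expect: something like "NVIDIA RTX PRO 6000 ..."
print(torch.cuda.get_device_properties(0).total_memory / 2**30)  # VRAM in GiB, ~96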


1. I only learned about this from reading https://www.reddit.com/r/googlecloud/comments/1gw6pkx/compute_engine_deep_learning_vm_images_still/ ↩︎