Every few weeks a new open-source 3D AI model drops, and half of them ship with the same catch — they want 24 GB of VRAM or more. My card has 14 GB. But I still want to test these models properly before I write about them or add them to the Arena. So I do not wait for a quantized build or a hosted demo — I rent a cloud GPU for an hour, let an AI coding agent install the thing from scratch, and run it.
Below is the exact process I used to get Roblox's CubePart running — every command, in order. The model is just the example; this same flow works for any heavy local 3D AI repo. CubePart itself is already live on top3d.ai if you only want to see its output — this guide is about getting it running.

The problem: it wants 24 GB+ of VRAM
The example throughout is Roblox CubePart, a new open-source local model — but the model barely matters here. What matters is its requirement: Roblox recommends 24 GB of VRAM for a normal run, and my card has 14 GB. That is the wall, and it is the same wall most heavy local 3D AI models put in front of you. Push quality higher and even 48 GB runs out.
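Before renting anything, it is worth measuring how far short the local card falls. A minimal sketch, assuming an NVIDIA driver with `nvidia-smi` on the PATH. The `has_enough_vram` helper and the loose 1000 MiB per GB threshold are illustrative choices (cards tend to report slightly under their nominal size), not something taken from the CubePart repo:

```shell
# has_enough_vram TOTAL_MIB REQUIRED_GB
# Succeeds (exit 0) when the reported memory covers the requirement.
# nvidia-smi reports MiB, and cards report a bit under their nominal size,
# so this uses a loose 1000 MiB per GB threshold (an assumption; tune as needed).
has_enough_vram() {
  [ "$1" -ge $(( $2 * 1000 )) ]
}

REQUIRED_GB=24   # the figure Roblox recommends for CubePart, per the text above

if command -v nvidia-smi >/dev/null 2>&1; then
  # Total memory of the first GPU as a bare number in MiB
  total_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
  if [ -n "$total_mib" ] && has_enough_vram "$total_mib" "$REQUIRED_GB"; then
    echo "OK: ${total_mib} MiB reported, ${REQUIRED_GB} GB needed"
  else
    echo "Too small: ${total_mib:-0} MiB reported, ${REQUIRED_GB} GB needed"
  fi
else
  echo "nvidia-smi not found on this machine"
fi
```

The same check is useful again inside the rented pod, to confirm the GPU you are paying for is the one you got.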

The fix: rent a cloud GPU for an hour
Instead of buying hardware, I rent a GPU with SSH access, install everything from scratch, run my tests, and shut it down. For this I used RunPod with an A40 (48 GB): overkill, but I wanted headroom. It cost about $0.50 per hour. Other providers offer the same kind of rental:
- Vast.ai — cheapest marketplace pricing
- Lambda Cloud — clean ML-focused instances
- Paperspace (DigitalOcean) — simple hourly GPUs
- Modal / Replicate — code-first, serverless GPU
The full workflow, command by command
Spin up the pod with enough room
On RunPod I picked an A40 (48 GB), although 24 GB is enough for standard CubePart runs. The part people miss is storage: these projects pull a lot of weights, so set roughly 50 GB of container disk and up to 100 GB of persistent storage.
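Once the pod is up, a quick check confirms that the storage you asked for is really there before the agent starts downloading weights. A small sketch, assuming the persistent volume is mounted at /workspace. The `free_gb` helper and the 50 GB threshold are hypothetical:

```shell
# free_gb PATH: print the whole gigabytes available on the filesystem holding PATH
free_gb() {
  # -P gives one line per filesystem, -k reports 1024-byte blocks;
  # column 4 of the second line is the "Available" figure
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

MIN_FREE_GB=50   # hypothetical threshold; size it to the weights you expect to pull

if [ -d /workspace ]; then
  avail=$(free_gb /workspace)
  if [ "$avail" -lt "$MIN_FREE_GB" ]; then
    echo "Only ${avail} GB free on /workspace; consider a larger volume"
  else
    echo "${avail} GB free on /workspace"
  fi
fi
```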

Add an SSH key and test the connection
An SSH key is just a pair of files — a private one that stays on your machine, a public one you give the provider — so it can let you in without a password. Generate it once with ssh-keygen -t ed25519, paste the contents of id_ed25519.pub into RunPod under Settings → SSH Public Keys, then test the connection from Windows PowerShell (use the host/port your pod shows):
ssh root@<POD-IP> -p <PORT> -i ~/.ssh/id_ed25519
If you land inside the pod's shell, it works.
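The key-generation step can be wrapped so that it never overwrites an existing key. A sketch for a POSIX shell (Git Bash, WSL, macOS or Linux). The `make_key` helper is hypothetical, and the empty passphrase is an assumption made to keep it non-interactive:

```shell
# make_key KEYFILE [COMMENT]
# Creates an ed25519 key pair at KEYFILE (private) and KEYFILE.pub (public)
# unless one already exists, then prints the public half to paste into the provider.
make_key() {
  keyfile=$1
  comment=${2:-cloud-gpu}
  mkdir -p "$(dirname "$keyfile")"
  if [ ! -f "$keyfile" ]; then
    # -N "" sets an empty passphrase (an assumption, to avoid an interactive prompt)
    ssh-keygen -q -t ed25519 -N "" -C "$comment" -f "$keyfile"
  fi
  cat "$keyfile.pub"
}

# Usage: make_key "$HOME/.ssh/id_ed25519" "runpod"
```

The printed line starting with `ssh-ed25519` is what goes into the provider's SSH Public Keys field. The private file never leaves your machine.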

Connect Cursor / VS Code over Remote-SSH
In Cursor (or VS Code), press Ctrl+Shift+P → Remote-SSH: Open SSH Configuration File, and add a host block (fill in your pod's HostName and Port):
Host runpod-cubepart
HostName <POD-IP>
User root
Port <PORT>
IdentityFile ~/.ssh/id_ed25519
Then Ctrl+Shift+P → Remote-SSH: Connect to Host → runpod-cubepart → choose Linux, and open the /workspace folder. Everything you keep needs to live here: only persistent storage survives a shutdown.

Create a safe non-root user
You connect as root. Before letting an AI agent run freely, I make a normal user so it is not operating as root. In the Cursor terminal, still as root:
apt update && apt install -y curl git sudo
useradd -m -s /bin/bash stefan
mkdir -p /workspace/stefan
chmod -R a+rwX /workspace/stefan
ls -ld /workspace/stefan

Switch to that user and install Claude Code
Drop into the new user (no need to reconnect), then install the agent. From here on, the agent runs as stefan, not root:
su - stefan
cd /workspace/stefan
curl -fsSL https://claude.ai/install.sh | bash
source ~/.bashrc
claude --version
On first launch, claude asks you to log in: open the link in your browser and paste the code back.

Run the agent in bypass mode and hand it the job
Start the agent with permission prompts skipped — safe here because it is an isolated, throwaway cloud box (Anthropic only recommends this mode inside containers/VMs, which is exactly what this is):
claude --dangerously-skip-permissions
From here I do not type install commands myself. I just tell Claude, in plain English, what I want: clone the CubePart repo, install it with all dependencies, download the model weights, and run a quick test. It works out the right CUDA wheels and commands for this exact GPU and does the whole thing. The one instruction I always make explicit: install into /workspace, since anything outside persistent storage is wiped when the pod shuts down.
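The plain-English brief can also live in a file, which makes it reusable for the next model. This is only a sketch: the wording, the `write_brief` helper and the file name `brief.md` are hypothetical, and the repository URL is left as a placeholder to fill in:

```shell
# write_brief FILE REPO_URL: write a reusable install brief for the agent
write_brief() {
  cat > "$1" <<EOF
Clone $2 into /workspace/stefan.
Install it with all dependencies, picking CUDA wheels that match this GPU.
Download the model weights and run a quick test.
Keep everything under /workspace: anything outside persistent storage is wiped at shutdown.
EOF
}

# Usage (placeholder URL; fill in the real repository):
#   write_brief /workspace/stefan/brief.md "<REPO-URL>"
#   cat /workspace/stefan/brief.md   # then paste the text into the claude session
```

Keeping the /workspace rule in the brief itself means it is never forgotten on a fresh pod.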

Ask for the Gradio demo and open it in your browser
CubePart ships a Gradio app (no ComfyUI build). I simply ask Claude to launch the Gradio demo and expose it with a public share link. It starts the app and hands back an https://….gradio.live URL. I open that on my own laptop and get the full UI, while the model keeps running on the cloud GPU. From there it is the usual Gradio flow: drop in a GLB, type the part names, run.
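If a public share link is not wanted, one possible alternative is to forward the Gradio port over the SSH connection that is already set up. A sketch, assuming the app listens on Gradio's default port 7860 inside the pod. The `tunnel_cmd` helper is hypothetical and only prints the command so it can be inspected before running:

```shell
# tunnel_cmd HOST SSH_PORT [APP_PORT]
# Prints an ssh command that forwards localhost:APP_PORT on your laptop
# to the same port inside the pod. 7860 is Gradio's default port.
tunnel_cmd() {
  app_port=${3:-7860}
  # -N: no remote shell, just the tunnel; -L: local port forward
  printf 'ssh -N -L %s:localhost:%s root@%s -p %s -i ~/.ssh/id_ed25519\n' \
    "$app_port" "$app_port" "$1" "$2"
}

tunnel_cmd "<POD-IP>" "<PORT>"
```

Run the printed command on the laptop with your pod's host and port filled in, then open http://localhost:7860 in the browser. The UI stays private to the tunnel rather than sitting behind a public URL.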

That is it — and where to see the output
That is the whole method: a model I cannot run locally went from a GitHub link to a working UI in my own browser, for about a dollar. The same flow works for the next heavy model, and the one after that.
Cost, cleanup, and when this is worth it
Shut the pod down the moment you are done. Billing is hourly and keeps running until you stop the pod. If you only need a model occasionally, this beats buying a $2,000+ GPU by a mile: you pay for the exact minutes you use. If you are testing models every single day, the maths flips and local hardware starts to make sense.
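The rent-versus-buy arithmetic is simple enough to script. A rough sketch using the figures quoted in this guide ($0.50 per hour, a $2,000 card). The helper names are hypothetical, and the comparison ignores electricity, storage fees and resale value:

```shell
# session_cost HOURS RATE: dollars for one rental session
session_cost() {
  LC_ALL=C awk -v h="$1" -v r="$2" 'BEGIN { printf "%.2f\n", h * r }'
}

# break_even_hours GPU_PRICE RATE: rented hours that add up to the purchase price
break_even_hours() {
  LC_ALL=C awk -v p="$1" -v r="$2" 'BEGIN { printf "%d\n", p / r }'
}

session_cost 2 0.50          # two hours at the A40 rate -> 1.00
break_even_hours 2000 0.50   # hours of renting that equal the card -> 4000
```

At these numbers the break-even is 4,000 rented hours: roughly 500 days at eight hours a day, or about eleven years at one hour a day.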
That is how I test the heavy stuff without the hardware. New models land constantly — whenever one does, it goes straight into the Arena so you can compare it without installing anything at all.
Stefan Vaskevich