
## Overview
This devcontainer provides a self-contained, GPU-accelerated AI development environment running inside a DevPod workspace. It serves a full local LLM stack accessible publicly via a Cloudflare tunnel at https://ai.uclab.dev.
```text
Internet
   │
   ▼
Cloudflare (ai.uclab.dev)
   │  Zero Trust Tunnel
   ▼
cloudflared (container)
   │  http://localhost:8080
   ▼
Open WebUI ──────────────────► Ollama API (localhost:11434)
(port 8080)                            │
                                       ▼
                            GPU: RTX 5060 (8 GB VRAM)
                            Models: llama3.1:8b, qwen2.5:7b,
                                    deepseek-coder-v2:16b-lite-instruct-q4_K_M,
                                    mistral-nemo
```
## Host Requirements
| Requirement | Details |
|---|---|
| Container runtime | Docker with nvidia-container-toolkit |
| GPU passthrough | CDI (nvidia.com/gpu=all) |
| DevPod workspace | Workspace ID: ai |
| Host directories | ~/.ollama (model storage), ~/.cloudflared (tunnel credentials) |
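The host-side prerequisites above can be sanity-checked before creating the workspace. The script below is a hypothetical helper, not part of the repo; `nvidia-ctk` and `devpod` are the CLI entry points for nvidia-container-toolkit and DevPod respectively.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check (not part of the repo): reports whether
# each host-side tool from the table above is on PATH. Always exits 0.
for tool in docker nvidia-ctk devpod; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "OK      $tool"
  else
    echo "MISSING $tool"
  fi
done
```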
## File Structure

```text
.devcontainer/
├── Dockerfile           # Image definition
├── devcontainer.json    # DevPod/VS Code container config
├── pull-models          # Script to pull Ollama models (idempotent)
├── supervisord.conf     # Reference config (not active — see note below)
├── supervisord/
│   ├── ollama.conf      # Supervisor program: Ollama
│   ├── open-webui.conf  # Supervisor program: Open WebUI
│   └── cloudflared.conf # Supervisor program: Cloudflare tunnel
└── scripts/
    ├── setup            # Trusts and installs mise tools for workspace
    └── setup_project    # One-time project setup (commitizen, pre-commit)
```
## Docker Image

Base: `nvidia/cuda:12.8.0-runtime-ubuntu24.04`

The image is built from `.devcontainer/Dockerfile` with the repo root as context.

### Build stages (in order)
| Step | What happens |
|---|---|
| Copy mise, uv | Pulled from upstream images (`jdxcode/mise`, `astral-sh/uv`) |
| apt-get | Installs: `sudo zstd libatomic1 supervisor nvtop git curl vim zsh` |
| Ollama | Installed system-wide via https://ollama.com/install.sh |
| cloudflared | Installed via Cloudflare's official apt repo |
| open-webui | Installed system-wide via `uv pip install --system --break-system-packages` |
| Supervisor configs | `supervisord/*.conf` → `/etc/supervisor/conf.d/` |
| pull-models script | Copied to `/usr/local/bin/pull-models`, made executable |
| ubuntu sudoers | `ubuntu ALL=(ALL) NOPASSWD:ALL` — passwordless sudo |
| Switch to ubuntu | All subsequent steps run as non-root |
| mise tools | `mise.toml` baked in; Node LTS, Python 3.13, uv installed |
| Claude Code | `npm install -g @anthropic-ai/claude-code` via mise-managed Node |
| Shell activation | `mise activate` added to `.bashrc` and `.zshrc` |
### Developer tools (via mise)

Defined in `/workspaces/ai/mise.toml`:

```toml
[tools]
node = "lts"
python = "3.13"
uv = "latest"
```

Tools are installed into `/home/ubuntu/.local/share/mise/` and shimmed onto `PATH` via `/etc/profile.d/mise.sh`.
## Container Configuration (devcontainer.json)

### GPU passthrough

```json
"runArgs": [
  "--device", "nvidia.com/gpu=all",
  "--security-opt", "label=disable"
]
```

Requires nvidia-container-toolkit on the host with CDI configured. The `label=disable` flag is needed for SELinux compatibility.
### Environment variables

Set at the container level (available to all processes):

| Variable | Value | Purpose |
|---|---|---|
| `NVIDIA_VISIBLE_DEVICES` | `all` | Expose all GPUs |
| `NVIDIA_DRIVER_CAPABILITIES` | `all` | Full GPU capability set |
| `OLLAMA_KEEP_ALIVE` | `10m` | Keep model loaded in VRAM for 10 min after last use |
| `OLLAMA_MAX_LOADED_MODELS` | `1` | Max 1 model in VRAM at a time (8 GB card) |
| `OLLAMA_NUM_PARALLEL` | `1` | Single inference thread |
| `OLLAMA_BASE_URL` | `http://127.0.0.1:11434` | Used by Open WebUI to reach Ollama |
API keys passed from the host:

| Variable | Source |
|---|---|
| `ANTHROPIC_API_KEY` | `${localEnv:ANTHROPIC_API_KEY}` |
| `MISE_GITHUB_TOKEN` | `${localEnv:MISE_GITHUB_TOKEN}` |
### Persistent mounts

Both directories live on the host and are bind-mounted into the container, so they survive full container rebuilds:

| Host path | Container path | Contents |
|---|---|---|
| `~/.ollama` | `/home/ubuntu/.ollama` | Downloaded model weights (~27 GB) |
| `~/.cloudflared` | `/home/ubuntu/.cloudflared` | Tunnel credentials and config |
### Port forwarding

| Port | Service | Behaviour on attach |
|---|---|---|
| 8080 | Open WebUI | Opens in browser automatically |
| 11434 | Ollama API | Forwarded silently |

Port 8080 is also bound on the host via `-p 8080:8080` in `runArgs`.
### Startup sequence (postStartCommand)

```shell
sudo supervisord -c /etc/supervisor/supervisord.conf && \
  { until curl -sf http://localhost:11434 >/dev/null 2>&1; do sleep 2; done && pull-models; } &
```

- `supervisord` starts as a daemon and launches Ollama, Open WebUI, and cloudflared
- A background job polls Ollama until it responds on port 11434
- Once Ollama is ready, `pull-models` runs and downloads any missing models
- `postStartCommand` returns immediately so the IDE is not blocked
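The inline polling loop can be generalized into a reusable helper. This is a sketch only; the `wait_for` function and `TIMEOUT` variable are illustrative names, not repo code.

```shell
#!/usr/bin/env bash
# Sketch of the readiness poll from postStartCommand, generalized.
# wait_for and TIMEOUT are illustrative names, not part of the repo.
wait_for() {
  # Poll "$@" every 2s until it succeeds, or give up after TIMEOUT seconds.
  local deadline=$(( $(date +%s) + ${TIMEOUT:-120} ))
  until "$@" >/dev/null 2>&1; do
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep 2
  done
}

# Equivalent to the inline loop above:
# wait_for curl -sf http://localhost:11434 && pull-models
```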
## Process Management (Supervisor)

The active supervisor entry point (`/etc/supervisor/supervisord.conf`) is the stock Debian config. It includes all files from `/etc/supervisor/conf.d/*.conf`, which are baked into the image from `.devcontainer/supervisord/`.

All three programs run as user `ubuntu` with `HOME="/home/ubuntu"` explicitly set (supervisor does not inherit `HOME` when switching users).

### Priority order

```text
priority=1  ollama       ← starts first
priority=2  open-webui   ← waits 10s (startsecs) for Ollama to be ready
priority=3  cloudflared  ← tunnel up last
```
### ollama

```ini
command=/usr/local/bin/ollama serve
user=ubuntu
environment=
    HOME="/home/ubuntu",
    OLLAMA_HOST="0.0.0.0:11434",
    OLLAMA_KEEP_ALIVE="10m",
    OLLAMA_MAX_LOADED_MODELS="1",
    OLLAMA_NUM_PARALLEL="1",
    NVIDIA_VISIBLE_DEVICES="all",
    NVIDIA_DRIVER_CAPABILITIES="all"
```

`OLLAMA_HOST=0.0.0.0:11434` makes Ollama listen on all interfaces (required for DevPod port forwarding to work).
### open-webui

```ini
command=/usr/local/bin/open-webui serve --host 0.0.0.0 --port 8080
user=ubuntu
environment=
    HOME="/home/ubuntu",
    DATA_DIR="/home/ubuntu/.local/share/open-webui",
    OLLAMA_BASE_URL="http://127.0.0.1:11434",
    WEBUI_AUTH="true",
    ENABLE_MEMORY="true",
    MEMORY_RETRIEVAL_TOP_K="5",
    RAG_EMBEDDING_ENGINE="ollama",
    RAG_EMBEDDING_MODEL="nomic-embed-text",
    ENABLE_RAG_WEB_SEARCH="true",
    RAG_WEB_SEARCH_ENGINE="duckduckgo",
    DO_NOT_TRACK="true",
    ANONYMIZED_TELEMETRY="false"
```

`DATA_DIR` must point to a writable path — open-webui was installed as root, so its package directory is not writable by `ubuntu`.

RAG uses `nomic-embed-text` via Ollama for embeddings. Web search uses DuckDuckGo (no API key required).
### cloudflared

```ini
command=/usr/local/bin/cloudflared tunnel run 2571f532-7790-4275-8bb4-33e0c20c64d6
user=ubuntu
environment=HOME="/home/ubuntu"
```

The tunnel ID and credentials are static. Configuration is read from the bind-mounted `/home/ubuntu/.cloudflared/config.yml`:

```yaml
tunnel: 2571f532-7790-4275-8bb4-33e0c20c64d6
credentials-file: /home/ubuntu/.cloudflared/2571f532-7790-4275-8bb4-33e0c20c64d6.json
ingress:
  - service: http://localhost:8080
```

All traffic on ai.uclab.dev is forwarded to Open WebUI on port 8080.
## Model Management (pull-models)

Location in image: `/usr/local/bin/pull-models`

The script is idempotent — it checks `ollama list` before pulling and skips models already present. Since `~/.ollama` is a persistent host mount, models are only downloaded once, regardless of how many times the container is rebuilt.
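The check-before-pull idea can be sketched as follows. This is a simplified illustration, not the actual script: the `installed` list is stubbed here, where the real script would parse `ollama list` output.

```shell
#!/usr/bin/env bash
# Simplified sketch of the idempotency check — not the actual pull-models.
# The real script would derive "installed" from something like:
#   ollama list | awk 'NR>1 {print $1}'
wanted="llama3.1:8b
qwen2.5:7b
deepseek-coder-v2:16b-lite-instruct-q4_K_M
mistral-nemo"

installed="llama3.1:8b"   # stubbed for illustration

# Models in "wanted" but absent from "installed" are the ones to pull.
to_pull=$(comm -13 <(sort <<<"$installed") <(sort <<<"$wanted"))
echo "$to_pull"
# The real script would then loop: for m in $to_pull; do ollama pull "$m"; done
```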
### Configured models

| Model | Size | Purpose |
|---|---|---|
| `llama3.1:8b` | 4.9 GB | General-purpose chat |
| `qwen2.5:7b` | 4.7 GB | General-purpose, strong reasoning |
| `deepseek-coder-v2:16b-lite-instruct-q4_K_M` | 10 GB | Code generation |
| `mistral-nemo` | 7.1 GB | Fast general-purpose |

Total: ~27 GB. Stored on a 1.8 TB volume (currently 10% used).

Note: The RTX 5060 has 8 GB VRAM. Only one model loads at a time (`OLLAMA_MAX_LOADED_MODELS=1`). Models larger than VRAM will run partially on CPU.
## Logs

All service logs are written to `/var/log/` inside the container:

| File | Service |
|---|---|
| `/var/log/ollama.log` / `.err` | Ollama stdout / stderr |
| `/var/log/open-webui.log` / `.err` | Open WebUI stdout / stderr |
| `/var/log/cloudflared.log` / `.err` | Cloudflared stdout / stderr |
| `/var/log/supervisor/supervisord.log` | Supervisor daemon log |
Quick check:

```shell
sudo supervisorctl status
sudo supervisorctl tail -f open-webui stderr
```
## Rebuild Checklist

Everything needed to fully recreate the environment is either in the image or on the host:

| What | Where | Survives rebuild? |
|---|---|---|
| Container config | `.devcontainer/` in repo | Yes — source of truth |
| Model weights | `~/.ollama` (host mount) | Yes |
| Cloudflare credentials | `~/.cloudflared` (host mount) | Yes |
| Open WebUI data (users, chats) | `~/.local/share/open-webui` (inside container) | No — lost on rebuild |
| mise tool cache | Inside image (baked at build time) | Rebuilt from `mise.toml` |
Warning: Open WebUI user accounts and chat history live inside the container at `/home/ubuntu/.local/share/open-webui`. To persist this across rebuilds, add a host mount for that path in `devcontainer.json`.
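One way to add that mount, sketched as a `devcontainer.json` fragment. The host path `~/.open-webui` is an arbitrary example, not something the repo currently defines:

```jsonc
// Hypothetical addition to devcontainer.json — the host path is an
// arbitrary choice, not part of the current config.
"mounts": [
  "source=${localEnv:HOME}/.open-webui,target=/home/ubuntu/.local/share/open-webui,type=bind"
]
```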