AI Devcontainer — Architecture & Setup


Overview

This devcontainer provides a self-contained, GPU-accelerated AI development environment running inside a DevPod workspace. It hosts a full local LLM stack, exposed publicly through a Cloudflare Zero Trust tunnel at https://ai.uclab.dev.

Internet
   │
   ▼
Cloudflare (ai.uclab.dev)
   │  Zero Trust Tunnel
   ▼
cloudflared (container)
   │  http://localhost:8080
   ▼
Open WebUI  ──────────────────►  Ollama API (localhost:11434)
(port 8080)                           │
                                       ▼
                                 GPU: RTX 5060 (8 GB VRAM)
                                 Models: llama3.1:8b, qwen2.5:7b,
                                         deepseek-coder-v2:16b-lite-instruct-q4_K_M,
                                         mistral-nemo

Host Requirements

| Requirement | Details |
| --- | --- |
| Container runtime | Docker with nvidia-container-toolkit |
| GPU passthrough | CDI (nvidia.com/gpu=all) |
| DevPod workspace | Workspace ID: ai |
| Host directories | ~/.ollama (model storage), ~/.cloudflared (tunnel credentials) |
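These prerequisites can be sanity-checked on the host before the first build (a sketch; `need` is a hypothetical helper, and the CDI commands are shown as comments because they require the NVIDIA toolkit to be installed):

```shell
# Fail fast if a required host command is missing.
need() { command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }; }

for cmd in docker nvidia-ctk; do
  need "$cmd" || status=1
done

# Confirm CDI devices are registered (nvidia.com/gpu=all relies on this):
#   nvidia-ctk cdi list | grep nvidia.com/gpu
# If the list is empty, generate the spec:
#   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```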

File Structure

.devcontainer/
├── Dockerfile              # Image definition
├── devcontainer.json       # DevPod/VS Code container config
├── pull-models             # Script to pull Ollama models (idempotent)
├── supervisord.conf        # Reference config (not active — see note below)
├── supervisord/
│   ├── ollama.conf         # Supervisor program: Ollama
│   ├── open-webui.conf     # Supervisor program: Open WebUI
│   └── cloudflared.conf    # Supervisor program: Cloudflare tunnel
└── scripts/
    ├── setup               # Trusts and installs mise tools for workspace
    └── setup_project       # One-time project setup (commitizen, pre-commit)

Docker Image

Base: nvidia/cuda:12.8.0-runtime-ubuntu24.04

The image is built from .devcontainer/Dockerfile with the repo root as context.

Build stages (in order)

| Step | What happens |
| --- | --- |
| Copy mise, uv | Pulled from upstream images (jdxcode/mise, astral-sh/uv) |
| apt-get | Installs: sudo zstd libatomic1 supervisor nvtop git curl vim zsh |
| Ollama | Installed system-wide via https://ollama.com/install.sh |
| cloudflared | Installed via Cloudflare's official apt repo |
| open-webui | Installed system-wide via uv pip install --system --break-system-packages |
| Supervisor configs | supervisord/*.conf copied to /etc/supervisor/conf.d/ |
| pull-models script | Copied to /usr/local/bin/pull-models, made executable |
| ubuntu sudoers | ubuntu ALL=(ALL) NOPASSWD:ALL — passwordless sudo |
| Switch to ubuntu | All subsequent steps run as non-root |
| mise tools | mise.toml baked in; Node LTS, Python 3.13, uv installed |
| Claude Code | npm install -g @anthropic-ai/claude-code via mise-managed Node |
| Shell activation | mise activate added to .bashrc and .zshrc |
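For orientation, the steps above could correspond to a Dockerfile along these lines (an abridged, hypothetical sketch: upstream image names and binary paths are assumptions, not copied from the real .devcontainer/Dockerfile):

```dockerfile
# Hypothetical excerpt, not the actual .devcontainer/Dockerfile
FROM nvidia/cuda:12.8.0-runtime-ubuntu24.04

# Tool binaries pulled from upstream images (paths are assumptions)
COPY --from=jdxcode/mise /usr/local/bin/mise /usr/local/bin/mise
COPY --from=ghcr.io/astral-sh/uv /uv /usr/local/bin/uv

RUN apt-get update && apt-get install -y \
      sudo zstd libatomic1 supervisor nvtop git curl vim zsh \
    && rm -rf /var/lib/apt/lists/*

# System-wide Ollama and Open WebUI
RUN curl -fsSL https://ollama.com/install.sh | sh
RUN uv pip install --system --break-system-packages open-webui

COPY .devcontainer/supervisord/*.conf /etc/supervisor/conf.d/
COPY .devcontainer/pull-models /usr/local/bin/pull-models
RUN chmod +x /usr/local/bin/pull-models \
    && echo 'ubuntu ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/ubuntu

USER ubuntu
```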

Developer tools (via mise)

Defined in /workspaces/ai/mise.toml:

[tools]
node = "lts"
python = "3.13"
uv = "latest"

Tools are installed into /home/ubuntu/.local/share/mise/ and shimmed onto PATH via /etc/profile.d/mise.sh.


Container Configuration (devcontainer.json)

GPU passthrough

"runArgs": [
  "--device", "nvidia.com/gpu=all",
  "--security-opt", "label=disable"
]

Requires nvidia-container-toolkit on the host with CDI configured. The label=disable flag is needed for SELinux compatibility.

Environment variables

Set at the container level (available to all processes):

| Variable | Value | Purpose |
| --- | --- | --- |
| NVIDIA_VISIBLE_DEVICES | all | Expose all GPUs |
| NVIDIA_DRIVER_CAPABILITIES | all | Full GPU capability set |
| OLLAMA_KEEP_ALIVE | 10m | Keep a model loaded in VRAM for 10 min after last use |
| OLLAMA_MAX_LOADED_MODELS | 1 | Max 1 model in VRAM at a time (8 GB card) |
| OLLAMA_NUM_PARALLEL | 1 | One request processed at a time (no parallel inference) |
| OLLAMA_BASE_URL | http://127.0.0.1:11434 | Used by Open WebUI to reach Ollama |

API keys passed from the host:

| Variable | Source |
| --- | --- |
| ANTHROPIC_API_KEY | ${localEnv:ANTHROPIC_API_KEY} |
| MISE_GITHUB_TOKEN | ${localEnv:MISE_GITHUB_TOKEN} |

Persistent mounts

Both directories live on the host and are bind-mounted into the container, so they survive full container rebuilds:

| Host path | Container path | Contents |
| --- | --- | --- |
| ~/.ollama | /home/ubuntu/.ollama | Downloaded model weights (~27 GB) |
| ~/.cloudflared | /home/ubuntu/.cloudflared | Tunnel credentials and config |

Port forwarding

| Port | Service | Behaviour on attach |
| --- | --- | --- |
| 8080 | Open WebUI | Opens in browser automatically |
| 11434 | Ollama API | Forwarded silently |

Port 8080 is also bound on the host via -p 8080:8080 in runArgs.

Startup sequence (postStartCommand)

sudo supervisord -c /etc/supervisor/supervisord.conf && \
{ until curl -sf http://localhost:11434 >/dev/null 2>&1; do sleep 2; done && pull-models; } &
  1. supervisord starts as a daemon — launches Ollama, Open WebUI, and cloudflared
  2. A background job polls Ollama until it responds on port 11434
  3. Once Ollama is ready, pull-models runs and downloads any missing models
  4. postStartCommand returns immediately so the IDE is not blocked
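The polling in step 2 can be factored into a small reusable helper; `wait_for` below is a hypothetical name, not something shipped in the repo:

```shell
# wait_for: poll a command until it succeeds, or give up after N tries.
# Hypothetical helper illustrating the postStartCommand polling loop.
wait_for() {
  local tries=$1; shift
  until "$@"; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1   # out of retries
    sleep 2                          # same 2s cadence as postStartCommand
  done
}

# Usage (mirrors the real startup line):
#   wait_for 60 curl -sf http://localhost:11434 >/dev/null && pull-models
```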

Process Management (Supervisor)

The top-level supervisor config (/etc/supervisor/supervisord.conf) is the stock default shipped by the supervisor package. It includes all files from /etc/supervisor/conf.d/*.conf, which are baked into the image from .devcontainer/supervisord/.

All three programs run as user ubuntu with HOME="/home/ubuntu" explicitly set (supervisor does not inherit HOME when switching users).

Priority order

priority=1  ollama       ← starts first
priority=2  open-webui   ← waits 10s (startsecs) for Ollama to be ready
priority=3  cloudflared  ← tunnel up last

ollama

command=/usr/local/bin/ollama serve
user=ubuntu
environment=
    HOME="/home/ubuntu",
    OLLAMA_HOST="0.0.0.0:11434",
    OLLAMA_KEEP_ALIVE="10m",
    OLLAMA_MAX_LOADED_MODELS="1",
    OLLAMA_NUM_PARALLEL="1",
    NVIDIA_VISIBLE_DEVICES="all",
    NVIDIA_DRIVER_CAPABILITIES="all"

OLLAMA_HOST=0.0.0.0:11434 makes Ollama listen on all interfaces (required for DevPod port forwarding to work).

open-webui

command=/usr/local/bin/open-webui serve --host 0.0.0.0 --port 8080
user=ubuntu
environment=
    HOME="/home/ubuntu",
    DATA_DIR="/home/ubuntu/.local/share/open-webui",
    OLLAMA_BASE_URL="http://127.0.0.1:11434",
    WEBUI_AUTH="true",
    ENABLE_MEMORY="true",
    MEMORY_RETRIEVAL_TOP_K="5",
    RAG_EMBEDDING_ENGINE="ollama",
    RAG_EMBEDDING_MODEL="nomic-embed-text",
    ENABLE_RAG_WEB_SEARCH="true",
    RAG_WEB_SEARCH_ENGINE="duckduckgo",
    DO_NOT_TRACK="true",
    ANONYMIZED_TELEMETRY="false"

DATA_DIR must point to a writable path — open-webui was installed as root so its package directory is not writable by ubuntu.

RAG uses nomic-embed-text via Ollama for embeddings. Web search uses DuckDuckGo (no API key required).

cloudflared

command=/usr/local/bin/cloudflared tunnel run 2571f532-7790-4275-8bb4-33e0c20c64d6
user=ubuntu
environment=HOME="/home/ubuntu"

The tunnel ID and credentials are static. Configuration is read from the bind-mounted /home/ubuntu/.cloudflared/config.yml:

tunnel: 2571f532-7790-4275-8bb4-33e0c20c64d6
credentials-file: /home/ubuntu/.cloudflared/2571f532-7790-4275-8bb4-33e0c20c64d6.json

ingress:
  - service: http://localhost:8080

All traffic on ai.uclab.dev is forwarded to Open WebUI on port 8080.


Model Management (pull-models)

Location in image: /usr/local/bin/pull-models

The script is idempotent — it checks ollama list before pulling and skips models already present. Since ~/.ollama is a persistent host mount, models are only downloaded once regardless of how many times the container is rebuilt.
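The real script lives at .devcontainer/pull-models; a minimal sketch of the same idempotent pattern, with an illustrative `model_present` helper (not the actual implementation), might look like:

```shell
# Hypothetical sketch of an idempotent model puller (not the repo's pull-models).
set -u

MODELS="llama3.1:8b qwen2.5:7b deepseek-coder-v2:16b-lite-instruct-q4_K_M mistral-nemo"

# True if model $1 appears in `ollama list` output read from stdin
# (first column is the model name; NR > 1 skips the header row).
model_present() {
  awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

pull_missing() {
  local installed
  installed=$(ollama list)
  for m in $MODELS; do
    if printf '%s\n' "$installed" | model_present "$m"; then
      echo "skip: $m (already present)"
    else
      ollama pull "$m"
    fi
  done
}

# pull_missing   # invoked by postStartCommand once Ollama is up
```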

Configured models

| Model | Size | Purpose |
| --- | --- | --- |
| llama3.1:8b | 4.9 GB | General-purpose chat |
| qwen2.5:7b | 4.7 GB | General-purpose, strong reasoning |
| deepseek-coder-v2:16b-lite-instruct-q4_K_M | 10 GB | Code generation |
| mistral-nemo | 7.1 GB | Fast general-purpose |

Total: ~27 GB. Stored on a 1.8 TB volume (currently 10% used).

Note: The RTX 5060 has 8 GB VRAM. Only one model loads at a time (OLLAMA_MAX_LOADED_MODELS=1). Models larger than VRAM will run partially on CPU.
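The note above can be turned into a quick arithmetic check (a rough heuristic, not an Ollama guarantee; the 1 GB overhead figure is an assumption for KV cache and runtime overhead):

```shell
# Rough heuristic: a model fits fully in VRAM if size + ~1 GB overhead < VRAM.
# The 1 GB figure is an assumption, not an exact number.
fits_vram() {  # usage: fits_vram <model_size_gb> <vram_gb>
  awk -v s="$1" -v v="$2" 'BEGIN { exit !(s + 1 < v) }'
}

fits_vram 4.9 8 && echo "llama3.1:8b: fits in VRAM"
fits_vram 10 8  || echo "deepseek-coder-v2: partial CPU offload expected"
```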


Logs

All service logs are written to /var/log/ inside the container:

| File | Service |
| --- | --- |
| /var/log/ollama.log / .err | Ollama stdout / stderr |
| /var/log/open-webui.log / .err | Open WebUI stdout / stderr |
| /var/log/cloudflared.log / .err | Cloudflared stdout / stderr |
| /var/log/supervisor/supervisord.log | Supervisor daemon log |

Quick check:

sudo supervisorctl status
sudo supervisorctl tail -f open-webui stderr

Rebuild Checklist

Everything needed to fully recreate the environment is either in the image or on the host:

| What | Where | Survives rebuild? |
| --- | --- | --- |
| Container config | .devcontainer/ in repo | Yes — source of truth |
| Model weights | ~/.ollama (host mount) | Yes |
| Cloudflare credentials | ~/.cloudflared (host mount) | Yes |
| Open WebUI data (users, chats) | ~/.local/share/open-webui (inside container) | No — lost on rebuild |
| mise tool cache | Inside image (baked at build time) | Rebuilt from mise.toml |

Warning: Open WebUI user accounts and chat history live inside the container at /home/ubuntu/.local/share/open-webui. To persist this across rebuilds, add a host mount for that path in devcontainer.json.
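A sketch of such a mount (the host path ~/.open-webui is an arbitrary choice here, and the entry should be merged into the existing devcontainer.json rather than replacing it):

```jsonc
// devcontainer.json (fragment): persist Open WebUI data across rebuilds.
// Host path ~/.open-webui is illustrative; pick any directory you like.
{
  "mounts": [
    "source=${localEnv:HOME}/.open-webui,target=/home/ubuntu/.local/share/open-webui,type=bind"
  ]
}
```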


2026-04-07 · Series: lab · Categories: devops · Tags: #ai, #ollama, #open-webui, #devcontainer, #nvidia

