

🌍 Project cantaloop - Local LLM setup

Documenting the steps for setting up my local LLM server.

Steps

✅ ollama
✅ lmstudio

The solution consists of:

πŸ” 1. Hardened Ingress Layer (NGINX)

  • TLS termination using automated Let's Encrypt certificates
  • Security-hardened configuration (strict headers, isolation, no open endpoints)
  • WebSocket/SSE routing for live model streaming
  • Reverse proxy rules ensuring backend components remain private (a config sketch follows this list)
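
A minimal sketch of what that ingress layer can look like in NGINX. Illustrative only: the server name and the upstream port 3000 (where Open WebUI is published later in this post) are the only details taken from this setup; certificate paths and header choices are typical defaults, not the exact config.

server {
    listen 443 ssl http2;
    server_name llm.cantaloop.dk;

    # Let's Encrypt certificates (typical certbot paths)
    ssl_certificate     /etc/letsencrypt/live/llm.cantaloop.dk/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llm.cantaloop.dk/privkey.pem;

    # Hardening headers
    add_header Strict-Transport-Security "max-age=63072000" always;
    add_header X-Content-Type-Options nosniff always;
    add_header X-Frame-Options DENY always;

    location / {
        # Only Open WebUI is exposed; Ollama stays on the private network
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket/SSE support for streaming model output
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300s;
        proxy_buffering off;
    }
}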

🖥️ 2. Open WebUI - User Interface & API Orchestration

  • Runs in Docker with isolated storage
  • Provides chat UI, model selection, knowledge tools, logs, and workspace
  • Uses SQLite with controlled migrations for safe persistence
  • Manages authentication, authorization, and API key generation
  • Routes model requests to Ollama via secure internal networking (see the wiring sketch below)
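
How Open WebUI reaches Ollama over the internal network, sketched as a docker run. OLLAMA_BASE_URL is Open WebUI's environment variable for the Ollama endpoint, and 172.17.0.1 is the Docker bridge gateway mentioned in the next section; treat this as an illustration of the wiring, not the exact command used.

sudo docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v /opt/open-webui/data:/app/backend/data \
  -e OLLAMA_BASE_URL=http://172.17.0.1:11434 \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main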

🧠 3. Ollama Model Server - Local Model Execution

  • Serves modern LLMs such as Llama 3.x, Gemma 2B, Phi-3-mini, and embedding models
  • Runs fully offline with high performance (CPU/GPU local execution)
  • Exposes only a private API (172.17.0.1:11434) to prevent external access (binding sketch after this list)
  • Allows rapid hot-swapping and downloading of models
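
A sketch of how that private binding can be set up, assuming the standard systemd install of Ollama. OLLAMA_HOST controls the address the server listens on; binding it to the Docker bridge gateway keeps it reachable from the Open WebUI container but not from the outside world.

sudo systemctl edit ollama
# In the override file, set:
#   [Service]
#   Environment="OLLAMA_HOST=172.17.0.1:11434"
sudo systemctl restart ollama

# Verify the private API answers (lists the locally available models)
curl http://172.17.0.1:11434/api/tags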

πŸ—„οΈ 4. Local Storage & Sovereignty

  • All models, logs, vectors, chats, API keys, and knowledge data remain on-premise (see the backup sketch below)
  • No cloud calls, no telemetry, no vendor lock-in
  • Fully compliant with privacy-sensitive workflows
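
Because everything lives on local disk, backing it all up is one tar command. A minimal sketch, assuming the data path used later in this post and Ollama's default model directory for a systemd install (adjust to your layout):

sudo mkdir -p /opt/backups
sudo tar czf /opt/backups/llm-backup-$(date +%F).tar.gz \
  /opt/open-webui/data \
  /usr/share/ollama/.ollama/models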

πŸ› οΈ 5. DevOps / LLMOps Engineering

  • Automated container lifecycle (upgrades, migrations, recovery)
  • Diagnosed complex multi-layer issues across Docker, NGINX, WebSockets, Open WebUI, and DB migrations (diagnostics checklist below)
  • Enhanced reliability using isolation, volume management, and predictable network paths
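
A rough first-pass checklist for that kind of multi-layer debugging - not the exact commands used here, just the order that usually narrows things down:

sudo docker ps                                  # is the container up and healthy?
sudo docker logs --tail 50 open-webui           # backend errors, DB migrations
sudo nginx -t && sudo systemctl reload nginx    # proxy config sanity + reload
sudo journalctl -u nginx --since "10 min ago"   # proxy-side errors
curl -I https://llm.cantaloop.dk                # end-to-end check through TLS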

⭐ Goal & Result

A fully operational sovereign AI system with multiple working models, a modern UI, secured endpoints, authenticated API access, and the ability to expand into additional frontends (LM Studio, custom apps, internal tooling).

The platform is now ready for:

  • Multi-user access
  • API integrations
  • Local retrieval/embedding pipelines (example below)
  • Private automation agents
  • Enterprise-like internal AI copilots
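
As a taste of the retrieval/embedding side, a minimal embedding call against the private Ollama API, run on the server itself since Ollama only listens on the Docker bridge address. It uses mxbai-embed-large, which already shows up in the ollama list output further down; the payload follows Ollama's standard embeddings endpoint, and the prompt text is just an example.

curl http://172.17.0.1:11434/api/embeddings \
  -d '{ "model": "mxbai-embed-large", "prompt": "sovereign local AI" }'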

Local LLM

sudo apt update && sudo apt upgrade -y
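
With the base system updated, Ollama itself goes on with the official install script from ollama.com (review the script before piping it to a shell):

curl -fsSL https://ollama.com/install.sh | sh
ollama --version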

Open Source Models

CPU-oriented candidates:

  • Llama
  • Gemma 3
  • gpt-oss

The real install and setup

sudo docker stop open-webui 
 
sudo docker rm open-webui
sudo docker pull ghcr.io/open-webui/open-webui:0.5.0

sudo docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v /opt/open-webui/data:/app/backend/data \
  -v /opt/open-webui/logs:/app/backend/logs \
  -e WEBUI_URL_PREFIX=/ui \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:0.4.6

sudo docker ps

sudo docker pull ghcr.io/open-webui/open-webui:0.4.6

sudo docker exec -it open-webui ls /app/build/static

curl -I https://llm.cantaloop.dk/ui/static/loader.js


------------
sudo docker stop open-webui
sudo docker rm open-webui
sudo docker pull ghcr.io/open-webui/open-webui:latest




sudo nginx -t
sudo systemctl reload nginx

curl -I https://llm.cantaloop.dk/ui/_app/immutable/entry/start.aabf9670.js

sudo docker pull ghcr.io/open-webui/open-webui:0.5.12
sudo docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v /opt/open-webui/data:/app/backend/data \
  -v /opt/open-webui/logs:/app/backend/logs \
  -e WEBUI_URL_PREFIX=/ui \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:0.5.12



Dropped the Docker-based approach.

Now: git, python, clone into /opt/open-webui-src (with tom:tom permissions).

Stopped using /ui as a URL prefix - then everything works at llm.cantaloop.dk.
Last-minute problems with Docker access; had to adjust ufw (a sketch of the adjustment follows).
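
Roughly the kind of ufw adjustment involved - a sketch, not the exact rules (and worth remembering that ports published by Docker with -p can bypass ufw, since Docker manages its own iptables chains):

sudo ufw status verbose
sudo ufw allow 'Nginx Full'     # 80/443 for the public NGINX entry point
sudo ufw deny 3000/tcp          # keep the Open WebUI port off the internet
sudo ufw reload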

Memory - too little for the larger models. Check with:
free -h

Ah, the api/chat endpoints - now a new version is needed.


Okay, all over again:
sudo docker exec -it open-webui bash

sudo docker stop open-webui
sudo rm /opt/open-webui/data/openwebui.db
sudo docker rm open-webui
sudo docker run -d \
  -p 3000:8080 \
  -v /opt/open-webui/data:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:0.6.41

sudo docker ps

Go to https://llm.cantaloop.dk and sign in.
Still, openwebui.db is 0 bytes and there is a webui.db as well.

One more time - this time removing everything:

sudo docker stop open-webui
sudo docker rm open-webui

sudo rm -f /opt/open-webui/data/openwebui.db
sudo rm -f /opt/open-webui/data/webui.db

sudo chown -R $USER:$USER /opt/open-webui



Last hiccup - the NGINX config needed a /ws location block for the WebSocket traffic.
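
Roughly the kind of block that was missing - a sketch, not the exact config, assuming Open WebUI is published on 127.0.0.1:3000 as in the docker run commands above:

location /ws {
    proxy_pass http://127.0.0.1:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_read_timeout 300s;
}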



Open WebUI runs in a container - here is a healthy pull-and-start sequence. The tag can be replaced with a specific version, but this one takes the latest:

sudo docker stop open-webui
sudo docker rm open-webui
(no data is deleted)

sudo docker pull ghcr.io/open-webui/open-webui:main
(it says the image is up to date)

sudo docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -v /opt/open-webui/data:/app/backend/data \
  -v /opt/open-webui/logs:/app/backend/logs \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main

sudo docker ps
(it will become healthy after 30 seconds)


Check from the command line - you need to get an API key from the UI first:
curl -v https://llm.cantaloop.dk/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API KEY>>" \
  -d '{
    "model": "llama3.2:1b",
    "messages": [
      {"role": "user", "content": "Hello from Tom"}
    ],
    "stream": false
  }'


Ollama - pulling models

Since I have a very small server with only 8 GB RAM and no GPU, I looked for tiny CPU-oriented models.

ollama pull gemma:2b
ollama pull gemma:2b-instruct

ollama list
NAME                        ID              SIZE      MODIFIED           
gemma:2b-instruct           030ee63283b5    1.6 GB    About a minute ago    
gemma:2b                    b50d6c999e59    1.7 GB    About a minute ago    
llama3.2:1b                 baf6a787fdff    1.3 GB    41 hours ago          
mxbai-embed-large:latest    468836162de7    669 MB    5 days ago            
llama3.2:3b                 a80c4f17acd5    2.0 GB    5 days ago            
phi3:mini                   4f2222927938    2.2 GB    5 days ago  
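
A quick smoke test once the models are in place - first via the CLI, then via the raw API on the private address (standard Ollama generate endpoint; the prompts are just examples):

ollama run gemma:2b "Write one sentence about Copenhagen."

curl http://172.17.0.1:11434/api/generate \
  -d '{ "model": "llama3.2:1b", "prompt": "Hello from Tom", "stream": false }'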

Start, Stop, Disable ELK

Elasticsearch, Kibana and Filebeat


sudo systemctl start elasticsearch
sudo systemctl start kibana
sudo systemctl start mysql

sudo systemctl stop elasticsearch
sudo systemctl stop kibana
sudo systemctl stop mysql

sudo systemctl start elasticsearch
sudo systemctl start kibana
sudo systemctl start mysql

sudo systemctl status elasticsearch
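
The "Disable" part of the heading - keep the ELK services from starting at boot so they don't compete with the models for the 8 GB of RAM:

sudo systemctl disable elasticsearch kibana
sudo systemctl is-enabled elasticsearch kibana

# and to bring them back later
sudo systemctl enable elasticsearch kibana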

💡 Still to do

🔭 Just getting started


This post is licensed under CC BY 4.0 by the author.