Project cantaloop - Local LLM setup
Documenting the steps of setting up my local LLM server.
Steps
- ollama
- lmstudio
The solution consists of:
1. Hardened Ingress Layer (NGINX)
- TLS termination using automated Let's Encrypt certificates
- Security-hardened configuration (strict headers, isolation, no open endpoints)
- WebSocket/SSE routing for live model streaming
- Reverse proxy rules ensuring backend components remain private
2. Open WebUI - User Interface & API Orchestration
- Runs in Docker with isolated storage
- Provides chat UI, model selection, knowledge tools, logs, and workspace
- Uses SQLite with controlled migrations for safe persistence
- Manages authentication, authorization, and API key generation
- Forwards model requests to Ollama via secure internal networking
3. Ollama Model Server - Local Model Execution
- Serves modern LLMs such as Llama 3.x, Gemma 2B, Phi-3-mini, and embedding models
- Runs fully offline with high performance (CPU/GPU local execution)
- Exposes only a private API (172.17.0.1:11434) to prevent external access (see the quick check after this list)
- Allows rapid hot-swapping and downloading of models
4. Local Storage & Sovereignty
- All models, logs, vectors, chats, API keys, and knowledge data remain on-premise
- No cloud calls, no telemetry, no vendor lock-in
- Fully compliant with privacy-sensitive workflows
5. DevOps / LLMOps Engineering
- Automated container lifecycle (upgrades, migrations, recovery)
- Diagnosed complex multi-layer issues across Docker, NGINX, WebSockets, Open WebUI, and DB migrations
- Enhanced reliability using isolation, volume management, and predictable network paths
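A quick sanity check that Ollama really is only reachable on the private Docker bridge address (a sketch, using the 172.17.0.1:11434 binding mentioned above):
# From the host (or a container on the bridge) this should answer with a version string:
curl -s http://172.17.0.1:11434/api/version
# From outside the server this should be refused or time out:
curl -s --max-time 5 http://llm.cantaloop.dk:11434/api/version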
Goal
Result
A fully operational sovereign AI system with multiple working models, a modern UI, secured endpoints, authenticated API access, and the ability to expand into additional frontends (LM Studio, custom apps, internal tooling).
The platform is now ready for:
- Multi-user access
- API integrations
- Local retrieval/embedding pipelines
- Private automation agents
- Enterprise-like internal AI copilots
Local LLM
sudo apt update && sudo apt upgrade -y
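The notes don't capture the Ollama install itself; assuming the standard install script from ollama.com was used, it would be:
# Official Ollama install script (assumption - not recorded in these notes):
curl -fsSL https://ollama.com/install.sh | sh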
Open Source Models
CPU oriented:
- Llama
- Gemma 3
- gpt-oss
Real install and setup
sudo docker stop open-webui
sudo docker rm open-webui
sudo docker pull ghcr.io/open-webui/open-webui:0.5.0
sudo docker run -d \
--name open-webui \
-p 3000:8080 \
-v /opt/open-webui/data:/app/backend/data \
-v /opt/open-webui/logs:/app/backend/logs \
-e WEBUI_URL_PREFIX=/ui \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:0.4.6
sudo docker ps
sudo docker pull ghcr.io/open-webui/open-webui:0.4.6
sudo docker exec -it open-webui ls /app/build/static
curl -I https://llm.cantaloop.dk/ui/static/loader.js
------------
sudo docker stop open-webui
sudo docker rm open-webui
sudo docker pull ghcr.io/open-webui/open-webui:latest
sudo nginx -t
sudo systemctl reload nginx
curl -I https://llm.cantaloop.dk/ui/_app/immutable/entry/start.aabf9670.js
sudo docker pull ghcr.io/open-webui/open-webui:0.5.12
sudo docker run -d \
--name open-webui \
-p 3000:8080 \
-v /opt/open-webui/data:/app/backend/data \
-v /opt/open-webui/logs:/app/backend/logs \
-e WEBUI_URL_PREFIX=/ui \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:0.5.12
Dropped the Docker-based setup.
Now git + Python instead: cloned into /opt/open-webui-src (with tom:tom permissions).
Stopped using the /ui prefix for the redirect - then everything works at llm.cantaloop.dk.
Last-minute problems with Docker access; had to adjust ufw (a sketch of the kind of rule is below).
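The exact rule isn't recorded here; a minimal sketch of the kind of ufw adjustment I mean, assuming containers on the default Docker bridge need to reach Ollama on the host:
# Assumed example - allow the Docker bridge network to reach Ollama's port on the host:
sudo ufw allow from 172.17.0.0/16 to any port 11434 proto tcp
sudo ufw status numbered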
Memory: too little for the models. Checked with:
free -h
Ah, the api/chat endpoints are different now in the new version.
Okay, all over again:
sudo docker exec -it open-webui bash
sudo docker stop open-webui
sudo rm /opt/open-webui/data/openwebui.db
sudo docker rm open-webui
sudo docker run -d \
-p 3000:8080 \
-v /opt/open-webui/data:/app/backend/data \
--name open-webui \
ghcr.io/open-webui/open-webui:0.6.41
sudo docker ps
Go to https://llm.cantaloop.dk and sign in.
Still, openwebui.db is 0 bytes and there is a webui.db (a quick check of what the container actually writes is shown below).
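A quick look at the data directory to see which database file Open WebUI is actually using:
sudo ls -lh /opt/open-webui/data/*.db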
One more time - removing everything:
sudo docker stop open-webui
sudo docker rm open-webui
sudo rm -f /opt/open-webui/data/openwebui.db
sudo rm -f /opt/open-webui/data/webui.db
sudo chown -R $USER:$USER /opt/open-webui
Last hiccup: the nginx config needed a /ws location segment for WebSockets (a hedged sketch follows).
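The actual block isn't captured in these notes; a minimal sketch of the kind of /ws location I mean, assuming the container is published on port 3000 as above:
# Assumed nginx location (not the exact config from the server):
#   location /ws {
#       proxy_pass http://127.0.0.1:3000;
#       proxy_http_version 1.1;
#       proxy_set_header Upgrade $http_upgrade;
#       proxy_set_header Connection "upgrade";
#       proxy_set_header Host $host;
#   }
# Validate and reload after editing:
sudo nginx -t && sudo systemctl reload nginx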
Open WebUI runs in a container. Here is a healthy pull-and-start sequence; the tag can be replaced by a specific version, but this one takes the latest:
sudo docker stop open-webui
sudo docker rm open-webui
(no data is deleted)
sudo docker pull ghcr.io/open-webui/open-webui:main
(it says the image is up to date)
sudo docker run -d \
--name open-webui \
-p 3000:8080 \
-v /opt/open-webui/data:/app/backend/data \
-v /opt/open-webui/logs:/app/backend/logs \
--restart unless-stopped \
ghcr.io/open-webui/open-webui:main
sudo docker ps
(it will become healthy after 30 seconds)
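To watch the health status flip, a quick check (not from the original notes):
# Poll the container's health state:
sudo docker inspect --format '{{.State.Health.Status}}' open-webui
# Or follow the startup logs:
sudo docker logs -f open-webui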
Check from the command line - you need to get an API key from the UI first:
curl -v https://llm.cantaloop.dk/api/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API KEY>>" \
-d '{
"model": "llama3.2:1b",
"messages": [
{"role": "user", "content": "Hello from Tom"}
],
"stream": false
}'
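Assuming the response follows the OpenAI-style shape, the reply text can be pulled out with jq (a convenience, not part of the original notes):
curl -s https://llm.cantaloop.dk/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API KEY>" \
  -d '{"model": "llama3.2:1b", "messages": [{"role": "user", "content": "Hello from Tom"}], "stream": false}' \
  | jq -r '.choices[0].message.content'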
Ollama - pulling models
Since I have a very small server with only 8 GB RAM and no GPU, I looked for tiny, CPU-oriented models.
ollama pull gemma:2b
ollama pull gemma:2b-instruct
ollama list
NAME ID SIZE MODIFIED
gemma:2b-instruct 030ee63283b5 1.6 GB About a minute ago
gemma:2b b50d6c999e59 1.7 GB About a minute ago
llama3.2:1b baf6a787fdff 1.3 GB 41 hours ago
mxbai-embed-large:latest 468836162de7 669 MB 5 days ago
llama3.2:3b a80c4f17acd5 2.0 GB 5 days ago
phi3:mini 4f2222927938 2.2 GB 5 days ago
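To confirm the models are served on the private API address from the architecture section, a quick check against Ollama's API (using the 172.17.0.1:11434 binding described above):
# List pulled models over the private API:
curl -s http://172.17.0.1:11434/api/tags
# One-shot generation test with the smallest model:
curl -s http://172.17.0.1:11434/api/generate \
  -d '{"model": "llama3.2:1b", "prompt": "Say hello", "stream": false}'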
Start, Stop, Disable ELK
Elasticsearch, Kibana and Filebeat
sudo systemctl start elasticsearch
sudo systemctl start kibana
sudo systemctl start mysql
sudo systemctl stop elasticsearch
sudo systemctl stop kibana
sudo systemctl stop mysql
sudo systemctl start elasticsearch
sudo systemctl start kibana
sudo systemctl start mysql
sudo systemctl status elasticsearch
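The heading also mentions disable; to keep the services from starting on boot, the corresponding commands (standard systemctl usage, not recorded above) would be:
sudo systemctl disable elasticsearch
sudo systemctl disable kibana
sudo systemctl disable mysql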
Still to do
- just getting started