Nginx Reverse Proxy

Use this when you want a clean public entrypoint like https://chat.example.com while keeping the API container bound to 127.0.0.1:8000 on the host.

Why this setup

Recommended production posture:

  • keep the FastAPI gateway on 127.0.0.1:8000
  • keep vLLM on 127.0.0.1:8001
  • expose only Nginx on 80/443
  • terminate TLS at Nginx
  • keep API-key auth enabled in the app

1. App binding

Use a local-only app bind in .env:

API_HOST=127.0.0.1
API_PORT=8000

Then restart the app stack if needed:

docker compose up -d --build
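
After the restart, it can help to confirm that nothing on ports 8000 or 8001 is listening on a public interface. The following is a minimal sketch, not part of the repository: `check_private_binds` is a hypothetical helper that reads `ss -ltnH` output on stdin and flags any listener on those two ports whose address is not loopback. It assumes the iproute2 `ss` column layout, where the fourth field is the local address and port.

```shell
# Hypothetical helper (not shipped with the project): reads `ss -ltnH` style
# lines on stdin and reports listeners on 8000/8001 that are not loopback-only.
check_private_binds() {
  awk '
    {
      addr = $4                              # e.g. 127.0.0.1:8000 or [::]:8001
      n = split(addr, parts, ":")
      port = parts[n]                        # text after the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)
      if ((port == "8000" || port == "8001") &&
          host != "127.0.0.1" && host != "[::1]") {
        print "EXPOSED " addr
        bad = 1
      }
    }
    END { exit bad + 0 }                     # non-zero if anything was exposed
  '
}
```

A possible invocation on the host is `ss -ltnH | check_private_binds && echo "gateway and vLLM are local-only"`. Any `EXPOSED` line suggests the bind in .env or the compose port mapping is wider than intended.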

2. Install Nginx

sudo apt update
sudo apt install -y nginx

If you want automatic Let's Encrypt certificates through Nginx, also install:

sudo apt install -y certbot python3-certbot-nginx

3. Create the site config

Start from the included template:

sudo cp deploy/nginx/selfhosted-chat-api.conf /etc/nginx/sites-available/chat.example.com

Edit the file and replace:

  • your-domain.example.com → your real domain
  • the port in proxy_pass http://127.0.0.1:8000; → your app port, if you changed it from 8000

Then enable it:

sudo ln -s /etc/nginx/sites-available/chat.example.com /etc/nginx/sites-enabled/chat.example.com
sudo rm -f /etc/nginx/sites-enabled/default
sudo nginx -t
sudo systemctl reload nginx
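
For orientation, the kind of server block that step 3 configures can be sketched as follows. This is an illustration only: `render_site_config` is a hypothetical function, the repository's template at deploy/nginx/selfhosted-chat-api.conf may look different, and the buffering and timeout values are assumptions rather than project settings. The block listens on port 80 only, on the assumption that the certbot Nginx plugin adds the TLS listener later.

```shell
# Hypothetical generator, for illustration only; the repository's own template
# at deploy/nginx/selfhosted-chat-api.conf may differ.
render_site_config() {
  domain="$1"          # public hostname, e.g. chat.example.com
  port="${2:-8000}"    # local app port; defaults to the API_PORT from step 1
  cat <<EOF
server {
    listen 80;
    server_name ${domain};

    location / {
        proxy_pass http://127.0.0.1:${port};
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;
        proxy_buffering off;       # assumption: pass streamed tokens through promptly
        proxy_read_timeout 300s;   # assumption: allow long generations
    }
}
EOF
}
```

Running `render_site_config chat.example.com 8000` prints the block to stdout, which makes it easy to compare against the edited template before reloading Nginx.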

4. Open cloud firewall / security group

Make sure inbound rules allow:

  • TCP 80
  • TCP 443

If those ports are blocked upstream, the domain will time out even if Nginx is configured correctly.
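
One way to tell a blocked port from an Nginx problem is to probe both ports from a machine outside the cloud network. The helper below is a rough sketch: `probe_port` is a hypothetical name, and it assumes bash (for its /dev/tcp redirection) and the coreutils `timeout` command are available on the machine running the probe.

```shell
# Hypothetical helper: prints "open" if a TCP connection to host:port succeeds
# within 5 seconds, otherwise "closed" (refused, filtered, or timed out).
probe_port() {
  if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}
```

For example, `probe_port chat.example.com 80` and `probe_port chat.example.com 443` should both print `open` once the inbound rules are in place and Nginx is listening on both ports. A `closed` result that takes the full five seconds usually points at a firewall dropping packets rather than at Nginx.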

5. Issue TLS certs

sudo certbot --nginx -d chat.example.com

After cert issuance, test:

curl https://chat.example.com/health

6. Smoke tests

Health:

curl https://chat.example.com/health

Models:

curl https://chat.example.com/v1/models \
  -H "Authorization: Bearer <YOUR_API_KEY>"

Chat:

curl https://chat.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
      {"role": "user", "content": "Reply with exactly: nginx works"}
    ],
    "temperature": 0
  }'
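
The health and models checks above can be wrapped in a small script that reports pass or fail by HTTP status. This is a sketch under assumptions: `report`, `http_status`, and `smoke` are hypothetical names, `BASE_URL` and `API_KEY` are environment variables introduced here for illustration, and it assumes both endpoints answer 200 on success.

```shell
BASE_URL="${BASE_URL:-https://chat.example.com}"   # assumption: public entrypoint
API_KEY="${API_KEY:-}"                             # assumption: key supplied via env

# Print PASS/FAIL for a label, given the expected and actual status codes.
report() {
  if [ "$2" = "$3" ]; then
    echo "PASS $1 ($3)"
  else
    echo "FAIL $1 (expected $2, got $3)"
    return 1
  fi
}

# Print only the HTTP status code for a URL; extra arguments go to curl.
http_status() {
  url="$1"; shift
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$@" "$url"
}

# Run both checks and return non-zero if either one fails.
smoke() {
  fail=0
  report health 200 "$(http_status "$BASE_URL/health")" || fail=1
  report models 200 "$(http_status "$BASE_URL/v1/models" \
    -H "Authorization: Bearer $API_KEY")" || fail=1
  return "$fail"
}
```

After sourcing the file, `API_KEY=<YOUR_API_KEY> smoke` runs both checks. A status of `000` from curl means the connection itself failed, which usually points back to DNS, the firewall rules in step 4, or the certificate from step 5.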

Notes

  • Nginx handles public ingress; the API service should stay private on localhost.
  • Do not expose the raw vLLM port (8001) publicly unless you have a deliberate reason to.
  • Keep API keys enabled even behind TLS: TLS encrypts traffic but does not restrict who can call the API.