Quick Reference Handbook

Important

Follow checklists in order. Check each box. Stop at → IT Support if a step fails.

Support email: inventory-support@university.edu | Server path: /data/LUStores | Compose file: docker-compose.prod.yml


Error Code Lookup

Find your error code or message, then jump to the named procedure.

Code / Message                               | Likely cause               | Go to
ERR-001 / Connection refused                 | Service not running        | PROC-01 Service Not Responding
ERR-002 / ECONNREFUSED 5432                  | Database not running       | PROC-04 Database Down
ERR-003 / No space left on device            | Disk full                  | PROC-06 Disk Full
ERR-004 / certificate has expired            | TLS cert expired           | PROC-05 SSL Certificate Issue
ERR-005 / Invalid credentials / 401          | Auth misconfigured         | PROC-07 Login Not Working
ERR-006 / JWT_SECRET not set                 | Missing env var            | PROC-09 Missing / Wrong Environment Variables
ERR-007 / Migrations skipped                 | Bad migrations path        | PROC-10 Database Migrations Not Running
ERR-008 / Feature not available (402)        | Licence not loaded         | PROC-11 Licence / Feature Not Available
ERR-009 / Cannot have negative stock         | Float drift (fixed v1.1)   | Update to latest deploy branch
ERR-010 / Health check failing / Restarting  | Container crash loop       | PROC-12 Container Crash Loop
ERR-011 / Out of memory / OOMKilled          | Insufficient RAM           | PROC-13 Out of Memory
ERR-012 / Permission denied (pg data)        | Wrong file ownership       | sudo chown -R 999:999 /db
ERR-013 / getaddrinfo EAI_AGAIN db           | DB container not ready yet | Wait 30 s → PROC-01 Service Not Responding
ERR-014 / SAML metadata invalid              | IdP config mismatch        | PROC-08 SAML / SSO Not Working
ERR-015 / Helm release stuck                 | Rancher deploy stalled     | PROC-14 Helm / Rancher Deploy Stalled
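
The lookup above can also be scripted for use against a saved log line. The sketch below is illustrative only: lookup_proc is a hypothetical helper, not part of the deployment, and it covers only the rows that have a literal message and a named procedure.

```shell
# Hypothetical helper: map an error message to the procedure named in the table.
# Matching is case-insensitive; more specific patterns come first.
lookup_proc() {
    local msg
    msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$msg" in
        *econnrefused*5432*)          echo "PROC-04" ;;
        *"connection refused"*)       echo "PROC-01" ;;
        *"no space left on device"*)  echo "PROC-06" ;;
        *"certificate has expired"*)  echo "PROC-05" ;;
        *"invalid credentials"*)      echo "PROC-07" ;;
        *"jwt_secret not set"*)       echo "PROC-09" ;;
        *"migrations skipped"*)       echo "PROC-10" ;;
        *"feature not available"*)    echo "PROC-11" ;;
        *oomkilled*|*"out of memory"*) echo "PROC-13" ;;
        *"saml metadata invalid"*)    echo "PROC-08" ;;
        *)                            echo "unknown"; return 1 ;;
    esac
}

# Example:
# lookup_proc "$(docker compose -f docker-compose.prod.yml logs --tail=1 app)"
```

An "unknown" result means the message is not in the table; in that case, start from PROC-01.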


PROC-01 Service Not Responding

Symptoms: Site unreachable, “Connection refused”, container shows Exit or Restarting.

cd /data/LUStores
docker compose -f docker-compose.prod.yml ps      # identify which service

☐ All services Up? → check firewall:

sudo ufw status   # ports 80 and 443 must be ALLOW

nginx down → PROC-02 Nginx Down

app down → PROC-03 Application Service Down

db down → PROC-04 Database Down

☐ None of the above → restart everything:

docker compose -f docker-compose.prod.yml restart
# wait 60 s, then test

☐ Still down → IT Support


PROC-02 Nginx Down

docker compose -f docker-compose.prod.yml logs --tail=50 nginx

address already in use → another process holds port 80/443:

sudo lsof -i :80 -i :443     # find the PID
sudo kill <PID>               # stop it if safe to do so

ssl certificate not found → PROC-05 SSL Certificate Issue

☐ Other error → note exact message, then:

docker compose -f docker-compose.prod.yml restart nginx

☐ Still down → IT Support with log output


PROC-03 Application Service Down

docker compose -f docker-compose.prod.yml logs --tail=100 app

ECONNREFUSED 5432 → DB not ready; wait 30 s then:

docker compose -f docker-compose.prod.yml restart app

JWT_SECRET / SESSION_SECRET / DB_PASSWORD missing → PROC-09 Missing / Wrong Environment Variables

migrations skipped or path error → PROC-10 Database Migrations Not Running

Feature not available 402 → PROC-11 Licence / Feature Not Available

☐ Verify health endpoint:

curl http://localhost:5000/health
# expected: {"status":"healthy"}

☐ Still unhealthy after restart → IT Support
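
Instead of a fixed wait before checking the health endpoint, the check can be retried until it passes. This is a minimal sketch; retry is a hypothetical helper, and the attempt count and delay in the example are assumptions to adjust for your server.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay_seconds> <command...>
retry() {
    local attempts delay i
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Skip the pause after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example: poll the health endpoint up to 12 times, 5 s apart.
# curl -f treats HTTP error statuses as failures.
# retry 12 5 curl -fsS http://localhost:5000/health
```

The same helper could replace the fixed "wait 30 s" steps elsewhere in this handbook, for example by retrying pg_isready before restarting app.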


PROC-04 Database Down

Danger

If logs contain “corrupted” or “PANIC”, stop and go to → IT Support immediately. Do not attempt a restart.

docker compose -f docker-compose.prod.yml logs --tail=100 db

no space left → PROC-06 Disk Full

permission denied on data directory:

sudo chown -R 999:999 /db
docker compose -f docker-compose.prod.yml restart db

password authentication failed → password mismatch in .env.prod:

grep DB_PASSWORD .env.prod
grep POSTGRES_PASSWORD .env.prod
# must be identical — fix .env.prod then restart db + app

☐ Verify DB accepting connections:

docker compose -f docker-compose.prod.yml exec db \
    pg_isready -U postgres
# expected: "accepting connections"

☐ Still not accepting → IT Support
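
The password comparison in this procedure can be done without printing either secret to the terminal. The sketch below assumes .env.prod uses plain KEY=value lines without quotes; env_value and db_passwords_match are hypothetical helper names.

```shell
# Hypothetical helper: print the value of a key from an env file (last match wins).
# $1 = key, $2 = file. Assumes plain KEY=value lines without quoting.
env_value() {
    sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Succeed only when both passwords are set and identical.
# The values themselves are never printed.
db_passwords_match() {
    local file app_pw pg_pw
    file=${1:-.env.prod}
    app_pw=$(env_value DB_PASSWORD "$file")
    pg_pw=$(env_value POSTGRES_PASSWORD "$file")
    [ -n "$app_pw" ] && [ "$app_pw" = "$pg_pw" ]
}

# Example:
# db_passwords_match .env.prod && echo "passwords match" || echo "MISMATCH: fix .env.prod"
```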


PROC-05 SSL Certificate Issue

Symptoms: Browser shows “Certificate expired” or ERR_CERT_DATE_INVALID

docker compose -f docker-compose.prod.yml exec certbot \
    certbot certificates      # check expiry date

☐ Expired — renew:

docker compose -f docker-compose.prod.yml exec certbot \
    certbot renew --force-renewal
docker compose -f docker-compose.prod.yml exec nginx nginx -s reload

☐ Renewal fails (rate limited) → wait 7 days, and verify that DOMAIN in .env.prod is the correct public hostname

☐ Renewal fails — port 80 blocked:

sudo ufw allow 80/tcp

☐ Certificate missing entirely:

# DOMAIN and EMAIL must be set in the current shell before running this
docker compose -f docker-compose.prod.yml run --rm certbot \
    certonly --webroot --webroot-path=/var/www/certbot \
    -d "${DOMAIN}" --email "${EMAIL}" --agree-tos --non-interactive

☐ Still failing → IT Support


PROC-06 Disk Full

Symptoms: ERR-003, DB write errors, container exits immediately

df -h          # confirm which partition is full

/ (root) full — clean Docker artefacts:

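# Warning: removes stopped containers, unused images, and unused volumes.
# Confirm that every service you need shows Up before running it.
docker system prune -a --volumes    # type y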
find /data/LUStores/logs -name "*.log" -mtime +30 -delete

/db full — clean old DB logs, then vacuum:

docker compose -f docker-compose.prod.yml exec db \
    find /var/lib/postgresql/data/pg_log -name "*.log" -mtime +7 -delete
docker compose -f docker-compose.prod.yml exec db \
    psql -U postgres -d university_inventory -c "VACUUM FULL;"
# Note: VACUUM FULL locks tables — run during off-hours

☐ Trim old backups (keep last 10):

ls -t /data/LUStores/backups | tail -n +11 | \
    xargs -I{} rm /data/LUStores/backups/{}

☐ Check space again: df -h

☐ Still full → IT Support (disk expansion required)
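
The backup-trimming step in this procedure deletes files immediately. A more cautious variant with a dry-run mode might look like the sketch below. prune_backups and the DRY_RUN variable are hypothetical names, and the function assumes backup filenames contain no newlines.

```shell
# Hypothetical helper: delete all but the newest N regular files in a directory.
# $1 = directory, $2 = number of files to keep (default 10).
# Set DRY_RUN=1 to list what would be removed without deleting anything.
prune_backups() {
    local dir keep
    dir=$1
    keep=${2:-10}
    # ls -t sorts newest first; tail skips the first $keep entries.
    ls -t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r name; do
        [ -f "$dir/$name" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would remove $dir/$name"
        else
            rm -- "$dir/$name"
        fi
    done
}

# Example: preview first, then delete.
# DRY_RUN=1; prune_backups /data/LUStores/backups 10
# DRY_RUN=0; prune_backups /data/LUStores/backups 10
```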


PROC-07 Login Not Working

Symptoms: “Invalid credentials”, 401 errors, redirect loop

☐ Try the known admin account first (rule out single-user issue)

☐ Verify users exist:

docker compose -f docker-compose.prod.yml exec db \
    psql -U postgres -d university_inventory \
    -c "SELECT id, email, role FROM users LIMIT 5;"

☐ No users → see First Admin Account Setup

☐ SAML / SSO errors → PROC-08 SAML / SSO Not Working

JWT_SECRET missing or recently changed → PROC-09 Missing / Wrong Environment Variables (changing the secret invalidates all sessions)

☐ Restart auth stack:

docker compose -f docker-compose.prod.yml restart replit-auth app
docker compose -f docker-compose.prod.yml logs --tail=50 replit-auth

☐ Still failing → IT Support


PROC-08 SAML / SSO Not Working

Symptoms: ERR-014, redirect back to login, “SAML metadata invalid”

☐ Verify IdP metadata URL is reachable:

curl -I "${SAML_IDP_METADATA_URL}"    # must return 200

☐ Confirm SAML_SP_ENTITY_ID in .env.prod exactly matches the IdP registration

☐ Check SP certificate expiry:

openssl x509 -in saml/sp.crt -noout -enddate

☐ Re-sync IdP metadata:

docker compose -f docker-compose.prod.yml restart app

☐ Enable local auth as emergency fallback:

# .env.prod
LOCAL_AUTH_ENABLED=true
# then restart app

☐ Still failing → contact your IdP administrator with the SP entity ID and ACS URL (https://<DOMAIN>/auth/saml/callback)
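
The SP certificate check in this procedure prints the expiry date for you to read. To get a pass/fail answer instead, openssl's -checkend option can be used, as in this minimal sketch. cert_valid_for_days is a hypothetical helper, and the 30-day window is an assumption.

```shell
# Hypothetical helper: succeed if the certificate is still valid N days from now.
# $1 = certificate file, $2 = days (default 30).
cert_valid_for_days() {
    local cert days
    cert=$1
    days=${2:-30}
    # -checkend takes seconds; openssl exits 0 if the certificate
    # will NOT expire within that window.
    openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Example:
# cert_valid_for_days saml/sp.crt 30 || echo "SP certificate expires within 30 days"
```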


PROC-09 Missing / Wrong Environment Variables

Symptoms: ERR-006, app refuses to start, blank secrets

☐ Confirm .env.prod exists:

ls -la /data/LUStores/.env.prod

☐ Check required vars are non-blank:

grep -E "^(SESSION_SECRET|JWT_SECRET|DB_PASSWORD|DOMAIN)=" .env.prod

☐ Helm/Rancher — verify the Kubernetes secret:

kubectl get secret lustores-secrets -n lustores -o jsonpath='{.data}'

☐ Regenerate if compromised:

openssl rand -hex 64   # paste into SESSION_SECRET
openssl rand -hex 64   # paste into JWT_SECRET
# edit .env.prod then restart app

Warning

Changing JWT_SECRET or SESSION_SECRET logs out all active users.
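
The grep in this procedure also prints variables whose value is empty, so a blank secret is easy to miss. The sketch below lists only the variables that are missing or blank. missing_vars is a hypothetical helper, and it assumes plain KEY=value lines.

```shell
# Hypothetical helper: print each required variable that is missing or blank.
# $1 = env file, remaining arguments = variable names.
missing_vars() {
    local file name
    file=$1
    shift
    for name in "$@"; do
        # A variable counts as set only if "NAME=" is followed by at least one character.
        if ! grep -Eq "^${name}=.+" "$file"; then
            echo "$name"
        fi
    done
}

# Example: no output means all four variables are set.
# missing_vars .env.prod SESSION_SECRET JWT_SECRET DB_PASSWORD DOMAIN
```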


PROC-10 Database Migrations Not Running

Symptoms: ERR-007, “column X does not exist” errors in logs

☐ Check startup logs for migration output:

docker compose -f docker-compose.prod.yml logs app | grep -E "migration|Drizzle|✅|ℹ️"

migrations skipped or path error → fixed in deploy ≥ v1.1; update and rebuild:

git pull origin deploy
docker compose -f docker-compose.prod.yml up -d --build app

☐ Apply missed migrations manually:

docker compose -f docker-compose.prod.yml exec app \
    node -e "require('./dist/dbInit.js').initializeDatabase()"

☐ Verify the currentStock column is numeric:

docker compose -f docker-compose.prod.yml exec db \
    psql -U postgres -d university_inventory \
    -c "\d items" | grep currentStock
# expected:  currentStock  | numeric(10,2)

☐ Still failing → IT Support with column name and error message


PROC-11 Licence / Feature Not Available

Symptoms: ERR-008, 402 responses on Analytics, Notifications, or Import pages

☐ Check licence status in app: Settings → Licence

☐ Confirm LICENCE_KEY is set in .env.prod:

grep LICENCE_KEY .env.prod

☐ Key present but features locked → cache may not have warmed (fixed in deploy ≥ v1.1); restart:

docker compose -f docker-compose.prod.yml restart app

☐ Key expired → paste the renewed token in Settings → Licence → Save, or update .env.prod and restart


PROC-12 Container Crash Loop

Symptoms: ERR-010, status shows “Restarting”, never reaches “Up”

docker compose -f docker-compose.prod.yml logs --tail=50 <service>

Out of memory → PROC-13 Out of Memory

Port already in use:

sudo lsof -i :<port>     # find conflicting PID
sudo kill <PID>

☐ Dependency not ready → start in order:

docker compose -f docker-compose.prod.yml stop
docker compose -f docker-compose.prod.yml up -d db
sleep 30
docker compose -f docker-compose.prod.yml up -d

☐ Watch until stable:

watch -n 3 'docker compose -f docker-compose.prod.yml ps'    # Ctrl+C to exit

☐ Still crashing after 3 attempts → IT Support with full logs


PROC-13 Out of Memory

Symptoms: ERR-011, containers killed randomly, system very slow

free -h
docker stats --no-stream    # look for containers near their limit

☐ Restart memory-heavy services:

docker compose -f docker-compose.prod.yml restart app redis

☐ Release page cache:

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

☐ Add temporary swap if none exists:

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile

☐ If memory keeps growing → restart all services overnight:

docker compose -f docker-compose.prod.yml restart

☐ Repeated OOM → IT Support (server needs more RAM)


PROC-14 Helm / Rancher Deploy Stalled

Symptoms: ERR-015, deployment stuck in “Deploying” for > 10 minutes

kubectl get pods -n lustores
kubectl describe pod <pod> -n lustores | tail -30    # check Events section

ImagePullBackOff → Docker Hub credentials missing:

kubectl get secret regcred -n lustores    # if missing, see deployment/docker-hub-setup.md

Pending → check node resources:

kubectl describe nodes | grep -A 5 "Allocated resources"

CrashLoopBackOff → PROC-12 Container Crash Loop (same steps, but use kubectl logs instead of docker compose logs)

☐ Force re-deploy via Rancher UI:

Apps → lustores → ⋮ → Upgrade → Force Update

☐ Roll back:

helm rollback lustores -n lustores

☐ Still stuck → IT Support


Backup Procedures

Emergency Backup

Run before any major change or maintenance window:

cd /data/LUStores
docker compose -f docker-compose.prod.yml exec -T db \
    pg_dump -U postgres university_inventory \
    | gzip > "backups/emergency_$(date +%Y%m%d_%H%M%S).sql.gz"
ls -lhtr backups/ | tail -1    # sorted by time, newest last; verify: file must be > 1 MB
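
Because gzip runs even when pg_dump fails, a failed dump still leaves a small .sql.gz file behind. The size check above can be combined with a gzip integrity test, as in this sketch. verify_backup is a hypothetical helper, and the 1 MiB default mirrors the rule of thumb above.

```shell
# Hypothetical helper: succeed only if the backup exists, is at least
# MIN_BYTES in size (default 1 MiB), and is a valid gzip archive.
# $1 = backup file, $2 = minimum size in bytes.
verify_backup() {
    local file min size
    file=$1
    min=${2:-1048576}
    [ -f "$file" ] || return 1
    size=$(wc -c < "$file")
    [ "$size" -ge "$min" ] || return 1
    # gzip -t tests archive integrity without writing any output.
    gzip -t "$file" 2>/dev/null
}

# Example:
# verify_backup backups/<FILENAME>.sql.gz || echo "backup looks incomplete"
```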

Emergency Restore

Danger

This overwrites all current data. Confirm before proceeding.

ls -lh backups/                           # choose backup filename
docker compose -f docker-compose.prod.yml stop app
gunzip -c backups/<FILENAME>.sql.gz | \
    docker compose -f docker-compose.prod.yml exec -T db \
    psql -U postgres -d university_inventory
docker compose -f docker-compose.prod.yml start app
# wait 60 s → test login → verify data

Preventive Maintenance

Weekly (5 minutes):

df -h                                                       # disk space OK?
docker compose -f docker-compose.prod.yml ps               # all Up?
ls -lhtr backups/ | tail -3                                # backups recent? (newest last)
docker compose -f docker-compose.prod.yml logs --tail=50 app | grep -i error
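
The weekly disk check can be reduced to a list of partitions that need attention. The sketch below reads df -P output and prints mount points at or above a threshold. full_mounts is a hypothetical helper, the 85% default is an assumption, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: read `df -P` output on stdin and print each mount point
# whose usage is at or above the threshold in percent (default 85).
full_mounts() {
    awk -v limit="${1:-85}" 'NR > 1 {
        use = $5              # Capacity column, e.g. "95%"
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example: no output means no partition is at or above 90%.
# df -P | full_mounts 90
```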

Monthly (30 minutes):

docker compose -f docker-compose.prod.yml exec certbot certbot certificates   # SSL expiry?
docker compose -f docker-compose.prod.yml pull          # new image versions?
sudo apt update && sudo apt list --upgradable            # OS patches?

Escalation — What to Send IT Support

Collect these three files and attach them to your support email:

docker compose -f docker-compose.prod.yml logs --tail=300 > ~/lustores-logs.txt
docker compose -f docker-compose.prod.yml ps              > ~/lustores-status.txt
{ df -h; free -h; }                                       > ~/lustores-disk.txt

Include in your message:

  1. Error code (e.g. ERR-002) or exact error text

  2. Procedure attempted (e.g. PROC-04)

  3. The three output files above
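
The collection commands above save only standard output, so error text such as a Docker daemon connection failure would not reach the files. One way to keep it is sketched below. capture is a hypothetical helper, not part of the deployment.

```shell
# Hypothetical helper: run a command and save its stdout and stderr to a file,
# together with the command line and its exit status.
# $1 = output file, remaining arguments = command to run.
capture() {
    local out status
    out=$1
    shift
    status=0
    {
        printf '$ %s\n' "$*"
        "$@" 2>&1 || status=$?
        echo "[exit status: $status]"
    } > "$out"
}

# Example:
# capture ~/lustores-status.txt docker compose -f docker-compose.prod.yml ps
```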