Matt Cupp edited this page 2026-05-29 17:35:45 -04:00

Known Issues and Gotchas

Operator reference for recurring problems and non-obvious constraints. Scan this first when something is broken.


Atlas Loses Hyperion SMB Shares

Symptom: Atlas periodically drops its mapped network drives to Hyperion (192.168.1.217). Drive letters appear disconnected or throw errors.

Cause: Windows SMB client session instability. Root cause is not fully diagnosed; adding Windows Credential Manager entries for Hyperion did not eliminate the issue.

Fix — restart the SMB stack without rebooting:

# Run as Administrator
Restart-Service -Name "LanmanWorkstation" -Force
Restart-Service -Name "LanmanServer" -Force

If that doesn't work — reconnect the drive manually:

net use Z: /delete
net use Z: \\192.168.1.217\sharename /persistent:yes /savecred

Last resort: Reboot Atlas.


Docker Snap Limitations on Nexus

Constraint: Docker on Nexus is installed via snap (sudo snap install docker). The snap package does not ship docker-compose as a separate binary.

  • Always use: docker compose (space — the Docker CLI plugin)
  • Never use: docker-compose (hyphen — the standalone binary)

docker-compose may appear to work in some contexts but fails silently or resolves paths incorrectly. If a compose command seems to do nothing, check which form you used.
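To audit existing scripts for the hyphenated form, something like the following could help. It is a minimal sketch: find_hyphen_compose is a hypothetical helper name, not part of Docker or this repo. The pattern only flags docker-compose followed by whitespace, so file names such as docker-compose.yml are ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines that invoke the standalone binary.
# Matches "docker-compose" only when followed by whitespace, so references
# to files like docker-compose.yml are not reported.
find_hyphen_compose() {
  local dir="${1:-.}"
  grep -rnE --exclude-dir=.git '(^|[^[:alnum:]_.-])docker-compose[[:space:]]' "$dir"
}
```

Run it as find_hyphen_compose /home/matt/repos/homelab-docker. Like grep itself, it exits non-zero when nothing is found, which here is the good outcome.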


ConBee II Zigbee Dongle Loses Detection After Server Move

Symptom: Home Assistant loses all Zigbee devices after Nexus is moved or the ConBee II USB dongle is replugged.

Cause: RF interference from USB 3.0 ports causes the 2.4 GHz Zigbee radio to become unreliable when plugged directly into the server chassis.

Fix: Always use a USB extension cable to physically distance the ConBee II from the server. The device path is /dev/ttyACM0.

The compose file already passes the device through correctly:

devices:
  - /dev/ttyACM0:/dev/ttyACM0

If HA loses Zigbee after a move, check the cable first — not the software.
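After reseating the cable, a quick host-side check can confirm the dongle is visible before anything in Home Assistant is touched. This is a minimal sketch; check_serial_device is a hypothetical helper name, and it only tests that the path exists as a character device, not that the radio is healthy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a serial device node exists on the host.
# Defaults to the ConBee II path used in this setup.
check_serial_device() {
  local dev="${1:-/dev/ttyACM0}"
  if [ -c "$dev" ]; then
    echo "present: $dev"
  else
    echo "missing: $dev"
    return 1
  fi
}
```

Run check_serial_device on Nexus with no arguments. A "missing" result points at the cable or port rather than the compose file.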


Never Update Komodo via Its Own Pipeline

Constraint: Komodo cannot deploy updates to itself. If you add Komodo to its own deploy procedure, it will kill the running container mid-deploy and the update will never complete. The container comes back in its old state (or doesn't come back at all until manually restarted).

Always update Komodo via SSH:

ssh matt@192.168.1.226 'cd /home/matt/repos/homelab-docker/komodo && docker compose up -d'

Komodo is intentionally excluded from the Stage 2 deploy procedure. Do not add it.


renovate.json in the Repo Does Nothing

Symptom: Editing renovate.json in the repo root has no effect on Renovate behavior.

Cause: The renovate.json file is a JSON schema reference pointer only — it was created by Renovate's onboarding PR and contains no actual configuration. Renovate's real config (Forgejo endpoint, API token, PR rules, automerge settings) lives server-side and is intentionally excluded from git because it contains an API token.

Real config location on Nexus:

/mnt/server/containers/renovate/config.js

To change Renovate behavior, edit that file directly on Nexus.


Komodo Procedure Completes in ~1ms (Nothing Deployed)

Symptom: You merge a PR, Komodo's webhook fires, the procedure run shows a completion time of roughly 1ms, and none of the stacks actually redeployed.

Cause: Stage 2 of the deploy procedure is misconfigured. It has a "Batch Deploy Stack If Changed" action instead of individual "Deploy Stack" actions. "Batch Deploy If Changed" checks for a linked repo commit hash to determine if a stack needs updating — since the stacks aren't linked to the repo in that way, it always evaluates to "no change" and skips everything.

Fix: Komodo UI → Procedures → edit the deploy procedure → Stage 2. Replace any "Batch Deploy Stack If Changed" action with five explicit "Deploy Stack" actions, one per service:

  1. Deploy Stack: dashy
  2. Deploy Stack: bookstack
  3. Deploy Stack: tandoor
  4. Deploy Stack: homeassistant
  5. Deploy Stack: pinchflat

Pull Repo Fails — "not a git repository (or any parent up to mount point)"

Symptom: Stage 1 of the Komodo deploy procedure (Pull Repo) fails with a message like fatal: not a git repository (or any parent up to mount point /home/matt/repos).

Cause: Komodo's Periphery agent runs as root. The homelab-docker repo is owned by matt. Git's safe.directory check prevents root from operating on repos owned by another user. When the configured path doesn't match exactly (due to git version differences or path normalization), git traverses upward until it hits the Docker bind-mount boundary and reports the misleading "not a git repository" error.

Fix: Edit komodo/.gitconfig in the homelab-docker repo (this file is bind-mounted as /root/.gitconfig inside the Periphery container). It must contain:

[safe]
    directory = /home/matt/repos/homelab-docker
    directory = *

The directory = * wildcard disables the ownership check for every repository. That is acceptable here because Periphery runs as root and already owns the host system. The change takes effect immediately (no container restart needed). Confirm with: Komodo UI → Repos → homelab-docker → Pull.
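Before triggering a pull, the file can be checked from a shell. The sketch below uses git config --file to read the safe.directory values. safe_directory_allows is a hypothetical helper name, and it only recognises an exact path match or the bare * wildcard, which is a simplification of git's own matching.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given gitconfig file lists the repo path
# (or the bare "*" wildcard) under safe.directory.
safe_directory_allows() {
  local cfg="$1" repo="$2"
  git config --file "$cfg" --get-all safe.directory | grep -Fxq -e "$repo" -e '*'
}
```

For example: safe_directory_allows komodo/.gitconfig /home/matt/repos/homelab-docker && echo ok.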


Office Switch Must Stay on the APC UPS

Constraint: The TP-Link office switch is connected to the APC UPS (Battery Backup 1), and must remain there.

Why it matters: If the switch loses power, all office machines (PVE, PBS, Hyperion, Atlas) lose LAN connectivity simultaneously. This prevents the NUT master (PVE) from sending graceful shutdown signals to PBS, Hyperion, and Atlas via the network. The result is unclean shutdowns for all of those machines during an outage.

Do not move the switch to a different circuit or UPS.


APC UPS Beeper Is Intentionally Disabled

Symptom: No alarm sounds during a power outage on Battery Backup 1.

Cause: The APC beeper is disabled via NUT configuration. This is intentional.

If you need to re-enable the audible alarm, see the Power & UPS wiki page.


Webhook Fires but Procedure Silently Does Nothing

Symptom: Forgejo shows the webhook delivery as successful (200 OK), Komodo logs "Successfully authenticated," but no procedure runs.

Cause: New procedures are created with webhook_enabled: false by default. Komodo authenticates the incoming webhook but silently drops it if webhook triggering is disabled on the procedure.

Fix: Komodo UI → Procedures → select the procedure → Config tab → enable the "Webhook Enabled" toggle → Save. Reload the page to confirm the toggle persisted.


Container Running Wrong Image Tag After Deploy

Symptom: Komodo reports a successful deploy but docker ps shows the container is still on the old image.

Cause: Komodo does not pull images on deploy by default (pull_on_deploy is off). A container only gets a new image when the tag in its compose file changes. If the tag in git didn't change, docker compose up -d sees no diff and leaves the container alone.

Check what's actually running:

ssh matt@192.168.1.226 'docker ps --format "table {{.Image}}\t{{.Names}}"'
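To pull out the image for a single container rather than reading the whole table, the output can be filtered. This is a minimal sketch: image_for_container is a hypothetical helper name, and it expects tab-separated image and name rows, so docker ps must be run without the table prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "image<TAB>name" rows on stdin and print the image
# of the named container. Exits non-zero if the container is not listed.
image_for_container() {
  awk -F '\t' -v name="$1" '$2 == name { print $1; found = 1 } END { exit !found }'
}
```

Possible usage: ssh matt@192.168.1.226 "docker ps --format '{{.Image}}\t{{.Names}}'" | image_for_container dashy. Compare the printed tag against the one in the stack's compose file.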

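Force redeploy a specific stack. Pull first, because docker compose up -d on its own leaves a container with an unchanged tag alone:

ssh matt@192.168.1.226 'cd /home/matt/repos/homelab-docker/<service> && docker compose pull && docker compose up -d'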