Summary
A forgecode session that crashes or is killed mid-write leaves behind two 0-byte sidecar lock files in the state directory:
~/Library/Application Support/forge/.secrets.lock (macOS) / ~/.forge/.secrets.lock (Linux)
~/Library/Application Support/forge/config/config.json.lock (macOS) / ~/.forge/config/config.json.lock (Linux)
Every subsequent forgecode launch then races for these stale locks and surfaces a generic ERROR: database is locked line at the top of every chat, regardless of whether the user is doing anything lock-related. This is currently happening in every chat for me (see Repro).
The symptom is intermittent — sometimes a launch wins the race and the chat works; sometimes it loses and the error appears. With multiple parent daemon processes (see Diagnostic), the contention becomes near-constant.
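To make the failure mode concrete, here is a minimal sketch of how a sidecar lock goes stale. It assumes the lock is taken by atomically creating the file and released by deleting it in Drop. The SidecarLock type and the temp-file name are hypothetical, and I have not confirmed that this is what forgecode or rmcp actually does.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical sidecar lock: the lock counts as held while the file exists.
struct SidecarLock {
    path: PathBuf,
}

impl SidecarLock {
    /// Atomically create the lock file; fails with AlreadyExists if it is present.
    fn acquire(path: &Path) -> io::Result<SidecarLock> {
        OpenOptions::new().write(true).create_new(true).open(path)?;
        Ok(SidecarLock { path: path.to_path_buf() })
    }
}

impl Drop for SidecarLock {
    // Runs on normal scope exit and on panic unwinding, but never after SIGKILL.
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.path);
    }
}

fn main() -> io::Result<()> {
    let lock_path =
        std::env::temp_dir().join(format!("sidecar-demo-{}.lock", std::process::id()));
    let _ = fs::remove_file(&lock_path);

    // Normal path: the guard removes the file when it goes out of scope.
    {
        let _guard = SidecarLock::acquire(&lock_path)?;
        assert!(lock_path.exists());
    }
    assert!(!lock_path.exists());

    // Simulated kill: forgetting the guard skips Drop, as SIGKILL would.
    std::mem::forget(SidecarLock::acquire(&lock_path)?);
    assert_eq!(fs::metadata(&lock_path)?.len(), 0); // 0-byte file left behind

    // Every later attempt fails even though no process holds anything.
    let err = SidecarLock::acquire(&lock_path).err().expect("lock should look held");
    assert_eq!(err.kind(), io::ErrorKind::AlreadyExists);

    fs::remove_file(&lock_path)?; // manual cleanup, as in the workaround
    Ok(())
}
```

Under this assumption the lock state lives only in the directory entry, so nothing in the kernel ties it to the lifetime of the process that created it.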
Repro
- rm -rf ~/.forge && forge --version (clean state, first run creates dirs and writes config)
- Verify ~/.forge/.secrets.lock is NOT present after first run completes
- Start a chat: forge --conversation-id $(uuidgen) (or just open the CLI)
- From another terminal, kill -9 the forgecode PID while it's actively writing config (e.g. mid-update_environment in env.rs)
- Verify ~/.forge/.secrets.lock is now present, 0 bytes, with mtime = the kill time
- Relaunch forgecode → observe ERROR: database is locked on the next chat, repeatedly
On my machine the lock files have been present since 2026-06-17 02:24 (~6 days at time of filing), which means every chat in that window has been racing for them.
Diagnostic evidence (from my current machine)
$ ls -la ~/Library/Application\ Support/forge/.secrets* \
  ~/Library/Application\ Support/forge/config/config.json*
-rw-------@ 1 kooshapari staff 3681 Jun 17 19:56 .../forge/.secrets
-rw-r--r--@ 1 kooshapari staff 0 Jun 17 02:24 .../forge/.secrets.lock ← STALE
-rw-r--r--@ 1 kooshapari staff 242 Jun 17 19:56 .../forge/config/config.json ← (real config)
-rw-r--r--@ 1 kooshapari staff 0 Jun 17 02:24 .../forge/config/config.json.lock ← STALE
$ lsof .../forge/.secrets.lock .../forge/config/config.json.lock
(empty — no live process holds either lock)
State files themselves are healthy (3681-byte encrypted OAuth credentials, valid JSON config). Only the sidecar locks are stale.
Likely code locations
The lock-file pattern is consistent with rmcp's file-based credential fallback (Cargo.lock pins rmcp = 1.7.0 in my build); auth secrets in ~/Library/Application Support/forge/.secrets look like an rmcp-style encrypted store with a sidecar lock.
The .forge/.secrets and .forge/config/config.json paths also match the legacy forgecode state dir layout (see crates/forge_config/src/reader.rs:63 resolve_base_path).
Suggested fix
Pick one (or both):
Option A — Use fs2 (flock) instead of sidecar files
use fs2::FileExt;
let f = std::fs::OpenOptions::new()
    .create(true).write(true).truncate(false)
    .open(&path)?;
f.lock_exclusive()?; // auto-released by the kernel on process exit, even SIGKILL
// ... write ...
// (drop f to release)
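flock(2) is released by the kernel when the holding process exits for any reason (including SIGKILL), so a lock cannot outlive its holder and go stale. One caveat: the lock belongs to the open file description, so a forked child that inherits the descriptor keeps it held until that child also closes it or exits. fs2 = "0.4" is a small crate that wraps this (and LockFileEx on Windows).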
Option B — Startup recovery for stale sidecar locks
If keeping the sidecar pattern, add a one-time sweep on startup:
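use std::path::Path;

fn cleanup_stale_locks(state_dir: &Path) -> std::io::Result<()> {
    // read_dir is not recursive: call this for the state dir and for config/.
    for entry in std::fs::read_dir(state_dir)? {
        let path = entry?.path();
        // extension() returns Option<&OsStr>, so convert before comparing with a &str
        let is_lock = path.extension().and_then(|e| e.to_str()) == Some("lock");
        if is_lock && path.metadata()?.len() == 0 {
            // is_unlocked and sibling_is_older are helpers still to be written.
            // Only remove if no live process holds it (lsof-equivalent)
            // AND sibling file is older than the lock (i.e. no in-flight write)
            if is_unlocked(&path)? && sibling_is_older(&path)? {
                tracing::info!(path = %path.display(), "removing stale lock file");
                if let Err(e) = std::fs::remove_file(&path) {
                    tracing::warn!(path = %path.display(), error = %e,
                        "could not remove stale lock, please remove manually");
                }
            }
        }
    }
    Ok(())
}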
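The sweep above leaves sibling_is_older undefined and only looks at one directory. A possible shape for that helper, plus a recursive scan that also reaches config/config.json.lock, is sketched below using only std. The names sibling_of and find_empty_locks and the throwaway directory in main are my own assumptions, and is_unlocked is deliberately not covered because it needs a platform-specific check.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// "config.json.lock" -> "config.json", ".secrets.lock" -> ".secrets".
fn sibling_of(lock: &Path) -> PathBuf {
    lock.with_extension("")
}

/// True if the sibling exists and was last modified before the lock file.
fn sibling_is_older(lock: &Path) -> io::Result<bool> {
    let sibling = sibling_of(lock);
    if !sibling.exists() {
        return Ok(false); // no target file: leave the lock alone
    }
    let lock_mtime = fs::metadata(lock)?.modified()?;
    let sibling_mtime = fs::metadata(&sibling)?.modified()?;
    Ok(sibling_mtime < lock_mtime)
}

/// Recursively collect 0-byte `*.lock` files under `dir`.
fn find_empty_locks(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            find_empty_locks(&path, out)?;
        } else if file_type.is_file()
            && path.extension().and_then(|e| e.to_str()) == Some("lock")
            && entry.metadata()?.len() == 0
        {
            out.push(path);
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Throwaway layout mirroring the paths in this report.
    let root = std::env::temp_dir().join(format!("forge-lock-demo-{}", std::process::id()));
    fs::create_dir_all(root.join("config"))?;
    fs::write(root.join(".secrets"), b"x")?;
    fs::write(root.join(".secrets.lock"), b"")?;
    fs::write(root.join("config/config.json"), b"{}")?;
    fs::write(root.join("config/config.json.lock"), b"")?;

    let mut locks = Vec::new();
    find_empty_locks(&root, &mut locks)?;
    assert_eq!(locks.len(), 2);
    for lock in &locks {
        println!("{} -> {}", lock.display(), sibling_of(lock).display());
    }
    fs::remove_dir_all(&root)
}
```

One thing to watch: in the diagnostic listing above both siblings (19:56) are newer than their locks (02:24), so a strict sibling-is-older test would have skipped exactly these files. A minimum-age threshold on the lock itself, as in the workaround script, may be a more reliable signal.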
Either way: include the path of the cleared lock file in the INFO log so users can self-diagnose, and emit a short tracing::warn!("could not remove stale lock: {path}, please remove manually") on removal failure.
Workaround (until fixed)
A safe, idempotent bash script that clears only 0-byte *.lock files whose target file exists and has no live lsof holder, with a 1-hour minimum age (override via --force):
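#!/usr/bin/env bash
# /tmp/clear-forge-locks.sh
set -euo pipefail
age_filter="-mmin +60"                       # 1-hour minimum age
[ "${1:-}" = "--force" ] && age_filter=""    # --force skips the age check
for d in "$HOME/Library/Application Support/forge" "$HOME/.forge"; do
  [ -d "$d" ] || continue
  # $age_filter is intentionally unquoted so it splits into two find arguments
  find "$d" -type f -name '*.lock' -size 0 $age_filter | while IFS= read -r lock; do
    # skip locks whose target file does not exist
    [ -e "${lock%.lock}" ] || continue
    # lsof exits non-zero when no process has the file open
    if ! lsof "$lock" >/dev/null 2>&1; then
      rm -v "$lock"
    fi
  done
done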
I just used this to clear both locks in this report; the error immediately disappeared from the next chat.
Environment
- forgecode v2.13.14 (binary: ~/.local/bin/forge)
- macOS 15.x (Apple Silicon)
- rmcp 1.7.0 (from Cargo.lock)
- ~13 zombie forge processes accumulated from prior crashed sessions on this device (3 parent daemons + 10 conversation children, some with duplicate --conversation-id values; separate but related issue worth filing)
Severity
P1 — affects every user on the platform, every chat, indefinitely, with no in-app recovery path. Surface error is misleading ("database" implies SQLite when the actual cause is a stale sidecar file).