apps/web runtime.
Files That Define The Stack
- `apps/web/Dockerfile`
- `apps/web/docker/blue-green-watcher.Dockerfile`
- `apps/web/docker/cron-runner.Dockerfile`
- `apps/web/cron.config.json`
- `apps/backend/Dockerfile`
- `apps/meet-realtime/Dockerfile`
- `apps/supermemory/Dockerfile`
- `docker-compose.web.yml`
- `docker-compose.web.prod.yml` (Compose `include` entry that merges `docker-compose/compose.web.prod.*.yml` fragments plus shared `secrets`/`volumes`)
- `scripts/sync-web-crons.js`
- `scripts/watch-web-crons.js`
- `scripts/docker-web.js`
- `scripts/check-docker-web.js`
Supported Commands
| Command | Purpose |
|---|---|
| `bun dev:web:docker` | Run the web dev workflow inside Docker |
| `bun devx:web:docker` | Explicitly start local Supabase, then the Docker dev workflow |
| `bun devrs:web:docker` | Explicitly start and reset local Supabase, then the Docker dev workflow |
| `bun dev:web:docker:down` | Stop the Docker dev workflow |
| `bun serve:web:docker` | Build and run the production web image in-place |
| `bun serve:web:docker:bg` | Blue/green production deploy with health-checked cutover |
| `bun serve:web:docker:bg:watch` | Recreate the watcher container, then tail its live logs while it polls the tracked branch and auto-runs blue/green after a successful fast-forward pull |
| `bun serve:web:docker:down` | Stop the production Docker stack |
| `bun serve:web:docker:bg:down` | Stop the blue/green stack and clear local runtime state |
| `bun test:e2e` / `bun test:e2e:web:docker` | Start local Supabase, reset it, run the production blue/green Docker web stack, then run Playwright |
| `bun check:docker` | Validate Dockerfile and compose parity rules |
Flags And Implicit Mappings
| Flag | Meaning |
|---|---|
| `--without-redis` | Disable the bundled Redis profile and skip Docker-injected Redis env |
| `--with-cloudflared` | Enable the bundled Cloudflare Tunnel container profile |
| `--with-supabase` | Start local Supabase before the Docker web flow |
| `--reset-supabase` | Start and reset local Supabase before the Docker web flow |
| `--env-file tmp/e2e/web.env` | Use an explicit Docker web env file for build secrets and runtime env files |
| `--mode prod` | Use the production compose file instead of the dev stack |
| `--strategy blue-green` | Use blue/green production deployment instead of in-place replacement |
| `--profile redis` | Explicitly enable the Redis profile when calling the helper directly |
| `--profile cloudflared` | Explicitly enable the Cloudflare Tunnel profile when calling the helper directly |
| `--build-memory 4g` | Run builds through a capped Buildx builder with a memory ceiling |
| `--build-cpus 4` | Run builds through a capped Buildx builder with an approximate CPU limit |
| `--build-max-parallelism 2` | Limit concurrent BuildKit solve steps for lower build pressure |
| `--build-builder-name tuturuuu` | Override the throttled Buildx builder name |
| `--resume-if-running` | If another watcher PID already holds the lock, mirror its live dashboard instead of failing |
| `--replace-existing` | If another watcher PID already holds the lock, stop it and take over |
| `--if-locked <fail\|resume\|replace>` | Explicit lock-conflict policy for the watcher |

| Command | Implicit flags |
|---|---|
| `bun dev:web:docker` | none |
| `bun devx:web:docker` | `--with-supabase` |
| `bun devrs:web:docker` | `--reset-supabase` |
| `bun serve:web:docker` | `--mode prod` |
| `bun serve:web:docker:bg` | `--mode prod --strategy blue-green` |
| `bun dev:web:docker -- --without-redis` | `--without-redis` |
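
The implicit mappings above can be read as a small lookup from root script to helper flags. The sketch below is only an illustration of that table; `IMPLICIT_FLAGS` and `resolveHelperArgs` are hypothetical names, not the actual contents of `scripts/docker-web.js`.

```javascript
// Hypothetical lookup table mirroring the "Implicit flags" table above.
const IMPLICIT_FLAGS = {
  'dev:web:docker': [],
  'devx:web:docker': ['--with-supabase'],
  'devrs:web:docker': ['--reset-supabase'],
  'serve:web:docker': ['--mode', 'prod'],
  'serve:web:docker:bg': ['--mode', 'prod', '--strategy', 'blue-green'],
};

// Combine a script's implicit flags with any flags passed after `--`.
function resolveHelperArgs(scriptName, extraArgs = []) {
  const implicit = IMPLICIT_FLAGS[scriptName];
  if (!implicit) {
    throw new Error(`Unknown Docker web script: ${scriptName}`);
  }
  return [...implicit, ...extraArgs];
}

console.log(resolveHelperArgs('serve:web:docker:bg').join(' '));
console.log(resolveHelperArgs('dev:web:docker', ['--without-redis']).join(' '));
```
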
Runtime Requirements
- `.env.local` should be the primary Docker env file. The helper still falls back to `apps/web/.env.local` for older hosts that have not moved their env yet.
- When `--env-file` is provided, the Docker helper uses that file for the Dockerfile secret and for the Compose runtime `env_file` entries. This keeps special-purpose runs such as E2E from accidentally inheriting a developer’s cloud Supabase `.env.local`.
- Production Compose fragments live under `docker-compose/`, so their relative host paths must be written from that directory. Use `..` to reach the repo root for build contexts, env files, and bind mounts; otherwise Docker Compose resolves paths like `apps/...` as `docker-compose/apps/...` and the watcher image fails before the deployment loop starts.
- Docker BuildKit must be available. The helper sets `COMPOSE_DOCKER_CLI_BUILD=1`, `DOCKER_BUILDKIT=1`, and `BUILDX_NO_DEFAULT_ATTESTATIONS=1` so local blue/green image exports do not stall while resolving default provenance metadata.
- The dependency stages in `apps/web/Dockerfile`, `apps/hive/Dockerfile`, and `apps/hive-realtime/Dockerfile` must copy every `apps/*/package.json` and `packages/*/package.json` manifest before running any frozen Bun install. Adding a new workspace app or package without updating all three lists makes Docker-only installs try to rewrite `bun.lock`. `bun check:docker` validates this manifest parity. `apps/backend` is an independent Go module, and `apps/meet-realtime` is intentionally not a workspace package; their Dockerfiles do not require `bun.lock` changes when service source changes.
- The Docker web flow does not start local Supabase unless you explicitly choose `bun devx:web:docker` or `bun devrs:web:docker`.
- By default the Docker container uses the Supabase URL already configured in `.env.local`, falling back to `apps/web/.env.local`. It should stay pointed at the cloud project for normal Tuturuuu work.
- If that configured URL explicitly points at host-run local Supabase, the helper rewrites the server-side Supabase URL to `host.docker.internal` while leaving `NEXT_PUBLIC_SUPABASE_URL` alone for browsers.
- Dockerized web services set `__NEXT_PRIVATE_ORIGIN` from `DOCKER_WEB_NEXT_PRIVATE_ORIGIN`, defaulting to `http://127.0.0.1:7803`. This keeps Next.js Server Action forwarding on the in-container web listener even when nginx preserves an external `Host`. If logs show `failed to forward action response` or `UND_ERR_HEADERS_TIMEOUT`, verify the running web container has `__NEXT_PRIVATE_ORIGIN=http://127.0.0.1:7803` or an intentional internal override. Do not use `serverActions.allowedOrigins` as the primary fix for this symptom; that setting controls Server Action origin/host validation, not the forwarded-action fetch URL.
- If logs still show `Error checking if workspace is personal` with `[locale]`, verify the running image includes commit `b30d7e2b07` or newer plus the shared `@tuturuuu/utils` UUID guard.
- Dockerized web commands auto-enable the local Redis companion stack and inject `UPSTASH_REDIS_REST_URL` plus a generated `UPSTASH_REDIS_REST_TOKEN` into the web container.
- Dockerized production commands generate `BACKEND_INTERNAL_TOKEN` when one is not provided and inject `BACKEND_INTERNAL_URL=http://backend:7820` for the Go backend service. Dev Compose uses the same internal URL with a local fallback token.
- The production stack runs the first-party AI memory sidecar as an internal support service at `http://supermemory:8787`. The service name and `SUPERMEMORY_*` env names stay compatible with existing web runtime wiring, but `apps/supermemory/Dockerfile` builds Tuturuuu-owned pgvector memory code.
- Dockerized production commands auto-configure the memory sidecar unless explicitly disabled. `scripts/docker-web/env.js` generates and persists the internal `SUPERMEMORY_API_KEY`, `SUPERMEMORY_POSTGRES_PASSWORD`, and `SUPERMEMORY_DATABASE_URL`, and defaults `SUPERMEMORY_ENABLED=true`, `SUPERMEMORY_FAIL_OPEN=true`, and `SUPERMEMORY_TIMEOUT_MS=1500`.
- Operators can override generated values with `DOCKER_SUPERMEMORY_API_KEY`, `DOCKER_SUPERMEMORY_POSTGRES_PASSWORD`, `DOCKER_SUPERMEMORY_DATABASE_URL`, or `DOCKER_SUPERMEMORY_ENABLED`; standard `SUPERMEMORY_*` env still works.
- Blue/green promotion health-gates `supermemory` with the rest of the support services. Changing `apps/supermemory/`, the production Compose fragments, or the Docker bake file refreshes the support service set. Explicit `SUPERMEMORY_ENABLED=false` or `DOCKER_SUPERMEMORY_ENABLED=false` removes that support service from blue/green builds, starts, and health gates for local-only runs.
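
The local-Supabase rewrite described in the list above can be sketched as follows. This is a minimal illustration, not the helper's actual code: the function name and the set of hostnames treated as "host-run local Supabase" are assumptions.

```javascript
// Assumption: these hostnames identify a Supabase instance running on the host.
const LOCAL_HOSTNAMES = new Set(['localhost', '127.0.0.1']);

// Returns the URL pair a container would use: the browser-facing value is
// left alone, while the server-side value is pointed at host.docker.internal
// when the configured URL targets the host machine.
function resolveSupabaseUrls(configuredUrl) {
  const url = new URL(configuredUrl);
  if (!LOCAL_HOSTNAMES.has(url.hostname)) {
    // Cloud project: nothing to rewrite.
    return { publicUrl: configuredUrl, serverUrl: configuredUrl };
  }
  url.hostname = 'host.docker.internal';
  // URL serialization adds a trailing slash to bare origins; strip it.
  const serverUrl = url.toString().replace(/\/$/, '');
  return { publicUrl: configuredUrl, serverUrl };
}

console.log(resolveSupabaseUrls('http://localhost:8001'));
```

The port and scheme are preserved, so only the hostname seen from inside the container changes.
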
Dockerized E2E
bun test:e2e from the repo root and bun test:e2e in apps/web run through
scripts/run-web-e2e-docker.js instead of starting next dev. The runner:
- writes `tmp/e2e/web.env` with local-only Supabase, local app-origin variables, app-session JWT values, and a local-only E2E auth bypass for Turnstile/dev-session,
- starts and resets the Dockerized local Supabase stack,
- boots `apps/web` through the production blue/green Docker flow,
- starts Portless on unprivileged HTTPS port `1355` and registers the `https://tuturuuu.localhost:1355` route only after the direct Docker proxy is healthy,
- waits for `https://tuturuuu.localhost:1355/login`, then runs Playwright against that shared-cookie origin, and
- tears down Docker web plus local Supabase unless `E2E_KEEP_DOCKER_STACK=1`.
Teardown passes `--volumes --rmi local` to Docker Compose and then removes
custom image tags for the current ttr-e2e-* project, so per-run containers,
Compose volumes, and baked blue/green images do not accumulate. E2E also sets
DOCKER_WEB_BUILDKIT_PRUNE_AFTER_BUILD=1 by default because the per-run
BuildKit cache/state is disposable; set
E2E_DOCKER_BUILDKIT_PRUNE_AFTER_BUILD=0 only when debugging a local E2E build
and you intentionally want to keep BuildKit cache.
Local E2E also starts Supabase with edge-runtime excluded by default through
DOCKER_WEB_SUPABASE_START_EXCLUDE=edge-runtime. The platform E2E suite does not
serve local Edge Functions, and excluding that service keeps local runs from
failing when the Supabase Edge Runtime tries to resolve external JSR packages.
Set E2E_SUPABASE_START_EXCLUDE= when you intentionally need a full local
Supabase stack for debugging.
Local E2E also pins DOCKER_SUPERMEMORY_ENABLED=false and
SUPERMEMORY_ENABLED=false; the memory integration is not under test there.
E2E build caps default to auto for memory, CPU, and BuildKit max parallelism.
The runner reads Docker’s current MemTotal before booting the stack, forwards
that value as DOCKER_WEB_DOCKER_MEMORY_LIMIT, and resolves the BuildKit memory
cap just under the active Docker Desktop allocation. On allocations below 10 GB,
the E2E runner keeps the inner Next build at one CPU, static generation
concurrency one, and a 4 GB Node heap so BuildKit keeps enough container
headroom. The Next build engine remains Turbopack. Do not switch local E2E runs
to the Webpack build path; the production and local Docker build paths are
expected to exercise the same Turbopack compiler.
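
A minimal sketch of how the auto caps could be resolved from Docker's reported memory. The 10 GB threshold, single CPU, static generation concurrency of one, and 4 GB heap come from the paragraph above; the function name, the returned field names, the 1 GiB headroom, and treating "GB" as GiB are assumptions made only for illustration.

```javascript
const GIB = 1024 ** 3;

// dockerMemTotalBytes: Docker's reported MemTotal.
// headroomBytes: hypothetical margin that keeps the BuildKit cap "just under"
// the Docker allocation; the real margin is not documented here.
function resolveE2eBuildProfile(dockerMemTotalBytes, headroomBytes = 1 * GIB) {
  const constrained = dockerMemTotalBytes < 10 * GIB;
  return {
    // Forwarded as DOCKER_WEB_DOCKER_MEMORY_LIMIT by the runner.
    dockerMemoryLimitBytes: dockerMemTotalBytes,
    buildkitMemoryCapBytes: Math.max(dockerMemTotalBytes - headroomBytes, 0),
    // null means "leave the setting on auto".
    nextBuildCpus: constrained ? 1 : null,
    staticGenerationConcurrency: constrained ? 1 : null,
    nodeHeapMb: constrained ? 4096 : null,
  };
}

console.log(resolveE2eBuildProfile(8 * GIB));
```
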
For local machines where Docker Desktop still cannot allocate enough memory for
the Turbopack image build, set DOCKER_WEB_NATIVE_BUILD=1 when running E2E. The
blue/green build helper then runs bun run build:web:docker on the host with
DOCKER_WEB_STANDALONE=1, packages apps/web/.next/standalone plus static
assets into the same Node runtime image shape, and continues with Docker Compose
startup and Playwright. Native builds default their host-side build memory budget
to 12 GB instead of Docker Desktop’s memory cap; set
DOCKER_WEB_NATIVE_BUILD_MEMORY=16g or another explicit value when the host
needs a different Node heap bucket. Keep this as a local/debug escape hatch; CI
and production watchers should keep building the web image entirely inside
Docker.
On GitHub-hosted runners, the E2E workflow frees disk before restoring or
loading cached Supabase Docker images. Keep that cleanup ahead of the cache load:
running docker system prune -af --volumes after cached images are loaded would
remove the images the shard is about to use, while skipping the cleanup can leave
too little space for the web Docker image dependency layer.
When an E2E shard fails, the runner prints diagnostics before teardown while the
containers still exist. The job log includes the primary error, blue/green stage
state, the Playwright .last-run.json file when available, Docker containers
for the shard Compose project, production Compose status, recent logs for web,
Hive, proxy, and support services, the Portless route list, a probe against the
configured E2E BASE_URL, and bun sb:status. The workflow also has a
failure-only diagnostic step after Run Playwright shard as a backstop, so the
job output should show the failing service or stage even when Playwright report
artifacts are incomplete. The workflow uploads diagnostics, Playwright reports,
and apps/web/test-results for every non-cancelled shard so traces and
screenshots stay available when the job output is too short.
The Playwright global setup refuses non-local web origins and refuses Supabase
origins outside localhost, 127.0.0.1, or host.docker.internal on port
8001. CI shards E2E with --shard=x/4; each shard gets its own Compose
project name, but all shards still use ephemeral local Supabase rather than any
cloud Supabase project. Because the Docker web app runs with
NODE_ENV=production, the generated env file and Playwright process env also
pin WEB_APP_URL, NEXT_PUBLIC_WEB_APP_URL, and NEXT_PUBLIC_APP_URL to
the local shared-cookie origin; otherwise central-auth redirects can escape to
the real tuturuuu.com origin during setup. The auth bypass is guarded by the
local E2E web origin, the incoming request Host / forwarded host / Origin
headers, and both the public and server-side Supabase origins before
server-side auth code honors it, so it must not be used as a general production
configuration.
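
The Supabase origin refusal in the Playwright global setup can be pictured with a guard like the one below. It is a sketch under one stated assumption: that the port `8001` requirement applies to all three allowed hostnames. The function name is hypothetical.

```javascript
const ALLOWED_SUPABASE_HOSTS = new Set([
  'localhost',
  '127.0.0.1',
  'host.docker.internal',
]);
const ALLOWED_SUPABASE_PORT = '8001';

// Throws unless the URL points at a local Supabase origin; returns the origin.
function assertLocalSupabaseOrigin(rawUrl) {
  const url = new URL(rawUrl);
  const isLocal =
    ALLOWED_SUPABASE_HOSTS.has(url.hostname) &&
    url.port === ALLOWED_SUPABASE_PORT;
  if (!isLocal) {
    throw new Error(`Refusing non-local Supabase origin for E2E: ${url.origin}`);
  }
  return url.origin;
}

console.log(assertLocalSupabaseOrigin('http://host.docker.internal:8001/rest/v1'));
```

Failing closed here means a misconfigured env file stops the run before any test can touch a cloud project.
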
The blue/green proxy and apps/web runtime both allow 64 KB request headers.
That headroom lets the browser reach /~recover-browser-state or the normal
login flow when duplicated Supabase cookies make the default header limit too
small. If a request is still too large for the proxy to forward, nginx handles
431/494 directly with Clear-Site-Data and redirects to
/login?browserStateReset=1; this recovery must stay in the proxy because
Next.js middleware cannot run after nginx rejects the header.
The blue/green nginx proxy must forward the original Host header with its
port intact via $http_host. Local E2E auth setup posts to
http://localhost:7803/api/auth/dev-session, and the production-mode app
accepts the setup route only when the public request origin stays local. The
guard also tolerates production standalone/proxy normalization where request.url
or Host becomes an internal Docker web upstream, but only when the forwarded
public host is still the local E2E origin.
Coolify
Coolify can provide enough default deployment metadata for Tuturuuu’s Dockerfile setup to derive the app origin even when you do not manually define the usual app URL variables.
- During Dockerfile builds, `scripts/build-web-docker.js` now derives missing `WEB_APP_URL`, `NEXT_PUBLIC_WEB_APP_URL`, and `NEXT_PUBLIC_APP_URL` values from Coolify’s `COOLIFY_URL` or `COOLIFY_FQDN` defaults before running `bun run build:web`.
- During production container startup, `apps/web/docker/prod-entrypoint.js` applies the same Coolify fallback so server-side runtime code sees the same derived values.
- The runtime URL resolvers used by the web proxy, internal API client, and drive export/auto-extract flows also fall back to `COOLIFY_URL` and `COOLIFY_FQDN`.
- Still set explicit Tuturuuu env like `NEXT_PUBLIC_SUPABASE_URL`, `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY`, `SUPABASE_SECRET_KEY`, and any email or storage secrets yourself.
- You can omit `WEB_APP_URL`, `NEXT_PUBLIC_WEB_APP_URL`, and `NEXT_PUBLIC_APP_URL` if Coolify already injects `COOLIFY_URL` or `COOLIFY_FQDN` for the deployment.
- If you need one specific canonical domain while Coolify exposes multiple domains, set the Tuturuuu app URL variables explicitly instead of relying on the automatic fallback.
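
The fallback can be sketched as a pure function over the environment. This is an illustration, not the code in `scripts/build-web-docker.js`: the function names are hypothetical, and it assumes that Coolify may provide comma-separated lists, that the first entry wins, and that a bare `COOLIFY_FQDN` domain should be served over `https`.

```javascript
const APP_URL_KEYS = ['WEB_APP_URL', 'NEXT_PUBLIC_WEB_APP_URL', 'NEXT_PUBLIC_APP_URL'];

// Take the first non-empty entry from a possibly comma-separated value.
function firstEntry(value) {
  return (value ?? '')
    .split(',')
    .map((part) => part.trim())
    .filter(Boolean)[0];
}

function deriveCoolifyOrigin(env) {
  const fromUrl = firstEntry(env.COOLIFY_URL);
  if (fromUrl) return new URL(fromUrl).origin;
  const fromFqdn = firstEntry(env.COOLIFY_FQDN);
  if (!fromFqdn) return undefined;
  // Assumption: a bare domain is reachable over https.
  const withScheme = /^https?:\/\//.test(fromFqdn) ? fromFqdn : `https://${fromFqdn}`;
  return new URL(withScheme).origin;
}

// Fill only the app URL variables that are missing; explicit values win.
function applyCoolifyFallback(env) {
  const result = { ...env };
  const origin = deriveCoolifyOrigin(env);
  if (!origin) return result;
  for (const key of APP_URL_KEYS) {
    if (!result[key]) result[key] = origin;
  }
  return result;
}

console.log(applyCoolifyFallback({ COOLIFY_FQDN: 'app.example.com' }));
```

Because explicit values are never overwritten, setting the three app URL variables by hand remains the way to pin one canonical domain.
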
Development Mode
Development mode exists to preserve the normal root script contract while moving the web runtime into containers.
- Container-managed `node_modules` are isolated from the host.
- Package-local `node_modules` and `dist` directories are also isolated so host installs do not shadow container artifacts.
- The root Docker context excludes generated app artifacts such as `.next`, `.turbo`, coverage output, and Flutter build directories. Keep these excludes intact so production builds do not stream multi-gigabyte local artifacts into BuildKit.
- A host `bun install` is not required just to boot the Dockerized web stack.
Production Mode
The production compose file uses the `runner` target from `apps/web/Dockerfile`.
In-Place
Blue/Green
- Reads the last active color from `tmp/docker-web/prod/active-color`.
- Ignores that state if the corresponding container no longer exists.
- Builds the target web image through Docker Buildx Bake using Compose-derived targets, then stops/removes only the old target web lane and starts the fresh replacement.
- Starts the target web lane after its healthcheck passes and records `web-promote` as a staged target in `tmp/docker-web/prod/target-state.json`. The deploy does not reload `web-proxy` or write `tmp/docker-web/prod/active-color` yet.
- Builds and runs Hive separately. `hive-db-migrate`, the target `hive-blue`/`hive-green` service, `hive-realtime`, and the Hive proxy check must pass before web can be publicly promoted. A migration or Hive health failure marks the Hive stage failed, leaves `active-color` on the previous web lane, and keeps the staged target web lane out of public routing.
- Refreshes support services (`backend`, `meet-realtime`, `markitdown`, `storage-unzip-proxy`, and `web-cron-runner`) after web/Hive target work. A support build or health failure also blocks `web-proxy` reload and leaves the previous active web lane serving. Their build step is scoped: ordinary web commits build only `web-blue` or `web-green`, while Hive and helper images rebuild only when their source, Dockerfile, compose wiring, or shared dependency inputs changed. Image-only services such as `redis`, `serverless-redis-http`, `web-proxy`, and `cloudflared` are never passed to Bake.
- Injects Docker-internal helper URLs into `apps/web`: `BACKEND_INTERNAL_URL=http://backend:7820`, `MARKITDOWN_ENDPOINT_URL=http://markitdown:8000/markitdown`, `DISCORD_APP_DEPLOYMENT_URL=http://markitdown:8000`, `DRIVE_AUTO_EXTRACT_PROXY_URL=http://storage-unzip-proxy:8788/extract`, `VALSEA_PRONUNCIATION_ASSESSOR_URL=http://pronunciation-assessor:8010/assess`, and `INTERNAL_WEB_API_ORIGIN=http://web-proxy:7803`.
- Keeps the stable `web-proxy` container running in place during ordinary promotions instead of re-running `compose up` against the public `:7803` listener. If the running proxy is missing required host ports or its container image no longer matches the resolved Compose image, the deploy defers the forced proxy recreate until after the target web, Hive, and support gates have passed.
- Validates the generated nginx config with `nginx -t`, then reloads or recreates the proxy only after every staging gate has passed.
- Immediately verifies the proxy can serve the internal `/__platform/drain-status` endpoint through the newly routed color before writing `active-color` and marking the staged web target healthy. This avoids false deployment failures from public API middleware or rate limits.
- Polls an internal drain-status endpoint on the old color and waits until it has no in-flight HTTP work left before demoting it to standby. This keeps long-running server actions, route handlers, and other open requests from being cut off mid-flight.
- Falls back to the short fixed drain window only when the old image predates the drain-status endpoint and cannot report its active requests yet.
- Keeps the demoted color online as a warm nginx backup target instead of removing it immediately, so stale keepalive workers and Cloudflare Tunnel connections can still fail over cleanly during the post-promotion window.
- If the demoted standby color is still on the previous revision after 15 minutes, the watcher automatically rebuilds that stale standby in place so both colors converge on the latest checked-out code without flipping the active port or promoting traffic again.
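
The first two steps of the flow above (read the recorded color, ignore it when its container is gone) can be sketched as a small pure function. The function name, the injected `containerExists` callback, and the choice of `blue` as the target when no usable state exists are assumptions for illustration only.

```javascript
const COLORS = ['blue', 'green'];

// recordedActiveColor: raw contents of tmp/docker-web/prod/active-color (may be undefined).
// containerExists: callback reporting whether the web container for a color still exists.
function resolveColors(recordedActiveColor, containerExists) {
  const recorded = (recordedActiveColor ?? '').trim();
  const usable = COLORS.includes(recorded) && containerExists(recorded);
  const active = usable ? recorded : null;
  // Assumption: with no usable state the deploy targets blue.
  const target = active === 'blue' ? 'green' : 'blue';
  return { active, target };
}

console.log(resolveColors('blue\n', () => true));
console.log(resolveColors('green', () => false));
```
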
The blue/green helper also sets `PLATFORM_BUILD_*` variables for both Docker image builds and runtime
containers. It infers PLATFORM_BUILD_COMMIT_HASH,
PLATFORM_BUILD_COMMIT_SHORT_HASH, PLATFORM_BUILD_COMMIT_MESSAGE,
PLATFORM_BUILD_REF_NAME, PLATFORM_BUILD_ENVIRONMENT,
PLATFORM_BUILD_BUILT_AT, PLATFORM_BUILD_DEPLOYMENT_URL, and
PLATFORM_BUILD_DEPLOYMENT_STAMP from the current checkout plus the deployment
context. The account-gated badge reads those runtime values before falling back
to generated Vercel/GitHub metadata, so on-prem watcher deployments show the
served commit instead of local / Unknown.
If those PLATFORM_BUILD_* values are missing or blank in a self-hosted
runtime, apps/web falls back to the mounted blue/green snapshot before using
generated/local defaults. The resolver reads only lightweight snapshot files:
prod/target-state.json, prod/active-color, prod/deployment-stamp,
watch/blue-green-auto-deploy.status.json, and
watch/blue-green-auto-deploy.history.json under
PLATFORM_BLUE_GREEN_MONITORING_DIR, with local tmp/docker-web candidates for
development. Selection prefers targets.web for the active color, then an
active deployment row, then the latest successful row for the active color, then
the latest successful row overall. commitSubject becomes the badge commit
message, millisecond deployment timestamps become ISO deployment times, and the
runtime deployment stamp file supplies the displayed deployment stamp. The
resolver does not invent deployment URL, ref, or environment from color or
commit data alone.
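
The selection order above can be sketched as a chain of candidates where the first match wins. Only `targets.web` and `commitSubject` are named in the text; every other field name (`color`, `active`, `status`, `deployedAt`) and the newest-first ordering of the history rows are hypothetical stand-ins, since the snapshot schema is not shown here.

```javascript
// Pick the snapshot row that describes the currently served deployment.
function selectDeploymentSnapshot({ targetState, activeColor, history = [] }) {
  const webTarget = targetState?.targets?.web;
  const candidates = [
    // 1. targets.web, when it describes the active color.
    webTarget?.color === activeColor ? webTarget : undefined,
    // 2. A deployment row flagged as active.
    history.find((row) => row.active === true),
    // 3. The latest successful row for the active color.
    history.find((row) => row.status === 'successful' && row.color === activeColor),
    // 4. The latest successful row overall.
    history.find((row) => row.status === 'successful'),
  ];
  const row = candidates.find(Boolean);
  if (!row) return null;
  return {
    commitMessage: row.commitSubject ?? null,
    // Millisecond timestamps become ISO strings.
    deployedAt:
      typeof row.deployedAt === 'number' ? new Date(row.deployedAt).toISOString() : null,
  };
}

console.log(
  selectDeploymentSnapshot({
    targetState: {},
    activeColor: 'blue',
    history: [
      { status: 'successful', color: 'green', commitSubject: 'newer', deployedAt: 2000 },
      { status: 'successful', color: 'blue', commitSubject: 'older', deployedAt: 1000 },
    ],
  }),
);
```

Note that the sketch returns no URL, ref, or environment, matching the rule that those are not inferred from color or commit data.
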
The helper writes support-image input hashes to
tmp/docker-web/prod/build-input-hashes.json and keeps recent decisions in
tmp/docker-web/prod/build-input-hashes.history.json. Infrastructure
monitoring reads that history so deployment rows can show which helper images
were rebuilt and which ones were served from the cached build inputs.
Meet Realtime
apps/meet-realtime is the internal control-plane service for Meet calls,
webinars, and low-latency broadcast coordination. It is a Bun WebSocket service
started by production Compose as meet-realtime on container port 7816.
web-proxy exposes /realtime for tumeet.me and meet.tuturuuu.com and
forwards WebSocket upgrades to that service.
Production meeting logic stays on Tuturuuu infrastructure:
- `apps/web` owns protected meeting APIs, verifies workspace access, and mints short-lived `MEET_REALTIME_TOKEN_SECRET` join tokens.
- `apps/web` and `apps/meet-realtime` must share the same `MEET_REALTIME_TOKEN_SECRET`. Production Compose exposes `MEET_REALTIME_URL` and `NEXT_PUBLIC_MEET_REALTIME_URL` to Web, defaulting both to `wss://meet.tuturuuu.com/realtime` for browser join-token payloads.
- Browsers connect to `wss://meet.tuturuuu.com/realtime?token=...`.
- `apps/meet-realtime` validates the token, manages ephemeral room presence, chat, stage state, and reconnect resync, then calls Cloudflare Realtime SFU APIs with server-only `CLOUDFLARE_REALTIME_APP_ID` and `CLOUDFLARE_REALTIME_APP_SECRET`.
- Browser media flows to Cloudflare Realtime SFU. The control WebSocket can reconnect during watcher-managed service refreshes without creating a new meeting record.
- Broadcast streaming stays API-owned by `apps/web`: the meeting host calls `/api/v1/workspaces/:wsId/meetings/:meetingId/stream`, `apps/web` creates or resumes a Cloudflare Stream live input with server-only `CLOUDFLARE_ACCOUNT_ID` plus `CLOUDFLARE_STREAM_API_TOKEN` (or `CLOUDFLARE_API_TOKEN`), stores the live input UID and WHIP/WHEP URLs in `private.meet_stream_live_inputs`, and returns the WHIP publish URL only to the host response. Workspace viewers receive only the WHEP playback URL.
off and hidden viewer counts by default; set
CLOUDFLARE_STREAM_ALLOWED_ORIGINS to a comma-separated allowlist when Stream
playback should be origin-restricted. Do not add Cloudflare Workers or Durable
Objects for production Meet room logic; use the internal service and
blue/green watcher instead.
scripts/docker-web/env.js persists generated helper tokens under
tmp/docker-web/markitdown-token, tmp/docker-web/storage-unzip-token,
tmp/docker-web/supermemory-api-key,
and tmp/docker-web/supermemory-postgres-password. Override them with
DOCKER_MARKITDOWN_ENDPOINT_SECRET,
DOCKER_DRIVE_UNZIP_PROXY_SHARED_TOKEN, or the DOCKER_SUPERMEMORY_* env when
an operator needs fixed values.
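
A generate-once, reuse-afterwards token file can be sketched as below. This is not the code in `scripts/docker-web/env.js`; the function name, the 32-byte hex token format, and the `0o600` file mode are assumptions chosen for the illustration.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Returns the operator override when set, otherwise the persisted token,
// otherwise a freshly generated token that is written to tokenPath.
function readOrCreateToken(tokenPath, override) {
  if (override) return override;
  if (fs.existsSync(tokenPath)) {
    const existing = fs.readFileSync(tokenPath, 'utf8').trim();
    if (existing) return existing;
  }
  const token = crypto.randomBytes(32).toString('hex');
  fs.mkdirSync(path.dirname(tokenPath), { recursive: true });
  // Restrict the file to the current user where the platform supports modes.
  fs.writeFileSync(tokenPath, `${token}\n`, { mode: 0o600 });
  return token;
}
```

A call such as `readOrCreateToken('tmp/docker-web/markitdown-token', process.env.DOCKER_MARKITDOWN_ENDPOINT_SECRET)` would then return the same value across restarts unless the operator supplies a fixed one.
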
Workspace ZIP auto-extract is enabled by the workspace-level
DRIVE_AUTO_EXTRACT_ZIP secret. Workspaces with EXTERNAL_PROJECT_ENABLED=true
also opt in automatically so CMS/WebGL workspaces can reuse the unzipper without
duplicating storage automation setup. The Docker-internal URL and token are
fallbacks for workspaces that have not supplied custom proxy secrets. If a
workspace supplies a custom DRIVE_AUTO_EXTRACT_PROXY_URL, it must also supply
its own DRIVE_AUTO_EXTRACT_PROXY_TOKEN; the process-wide fallback token must
not be sent to a workspace-controlled proxy URL.
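
The token rule in the paragraph above can be expressed as a small resolver. The secret names come from the text; the function name, the shape of the `fallback` argument, and the choice to throw when a custom URL has no token are assumptions.

```javascript
// workspaceSecrets: per-workspace secret values keyed by name.
// fallback: { url, token } for the Docker-internal proxy.
function resolveUnzipProxy(workspaceSecrets, fallback) {
  const customUrl = workspaceSecrets.DRIVE_AUTO_EXTRACT_PROXY_URL;
  if (customUrl) {
    const customToken = workspaceSecrets.DRIVE_AUTO_EXTRACT_PROXY_TOKEN;
    if (!customToken) {
      // Never pair a workspace-controlled URL with the process-wide token.
      throw new Error('Custom DRIVE_AUTO_EXTRACT_PROXY_URL requires DRIVE_AUTO_EXTRACT_PROXY_TOKEN');
    }
    return { url: customUrl, token: customToken, source: 'workspace' };
  }
  return { url: fallback.url, token: fallback.token, source: 'fallback' };
}

const fallback = {
  url: 'http://storage-unzip-proxy:8788/extract',
  token: 'process-wide-token', // placeholder value
};
console.log(resolveUnzipProxy({}, fallback).source);
```
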
The pronunciation-assessor helper is local-only. It defaults to
PRONUNCIATION_ASSESSOR_DEFAULT_MODEL=local-whisper-large-v3-turbo, can switch
between local Whisper sizes and local-wav2vec2, and stores downloaded
Transformers checkpoints in the platform-pronunciation-assessor-cache volume.
The default model preloads at container startup. Set
PRONUNCIATION_ASSESSOR_PRELOAD=false only when startup speed matters more than
first-request latency. On shared hosts, keep
PRONUNCIATION_ASSESSOR_MAX_LOADED_MODELS=1 and tune
PRONUNCIATION_ASSESSOR_IDLE_TTL_SECONDS so idle models are unloaded before
they hold GPU or RAM capacity unnecessarily. The helper rejects uploads over
PRONUNCIATION_ASSESSOR_MAX_UPLOAD_BYTES (default: 10 MB) and rejects decoded
audio longer than PRONUNCIATION_ASSESSOR_MAX_AUDIO_SECONDS (default: 120
seconds) before local model inference. Leave
PRONUNCIATION_ASSESSOR_ADMIN_TOKEN unset unless operators need to call
POST /models/load or POST /models/unload; those model-control endpoints are
disabled without the token and require Authorization: Bearer <token> when it
is configured.
CMS WebGL package uploads also use the storage-unzip-proxy, but they are a
first-class CMS upload path rather than generic Drive automation. They require a
configured unzip proxy URL and token, but they do not require the
DRIVE_AUTO_EXTRACT_ZIP workspace opt-in secret. The CMS finalize route unpacks
the ZIP into workspace Drive, detects the playable index.html, and stores the
same-origin artifact map on the CMS webgl-package asset. Browser uploads go
directly to the signed storage URL returned by the self-hosted web app’s WebGL
upload-url route, so large ZIPs do not pass through the Vercel-hosted CMS app or
the web app proxy before reaching Supabase Storage or R2. The CMS client reports
per-file upload progress during the signed upload, then calls the WebGL finalize
route so the backend handles extraction and artifact-map persistence.
The unzip proxy fans out backend callbacks for extracted folders and asks the
callback route for per-file upload URLs. Before uploading extracted bytes, the
proxy verifies the callback response names a trusted provider and that the
signed upload URL belongs to hosted Supabase, Cloudflare R2, or an exact
operator-configured upload origin. It forwards only content type and generated
bearer-token headers to the upload URL. The storage auto-extract and CMS WebGL
extract callback routes still pass through the central API proxy guard before
they validate the shared unzip token, so malformed, rate-limited, or oversized
callback requests are rejected at the same cheap boundary as other API
mutations. Direct file callbacks are legacy/small-file only and enforce the
same 512 KiB body budget locally; large extracted files must use the
file-upload-url callback flow. The proxy currently buffers the downloaded
archive and each extracted file in memory, so the default caps stay
conservative: 100 MiB ZIP downloads, 50 MiB per extracted file, and 250 MiB
total extracted output.
Operators can tune those caps with
DRIVE_UNZIP_PROXY_MAX_ARCHIVE_BYTES,
DRIVE_UNZIP_PROXY_MAX_ENTRY_BYTES,
DRIVE_UNZIP_PROXY_MAX_TOTAL_EXTRACTED_BYTES, and
DRIVE_UNZIP_PROXY_MAX_ARCHIVE_ENTRIES; workspace Drive quota must still be
large enough for the uploaded archive and extracted files. Set
DRIVE_UNZIP_PROXY_ALLOWED_UPLOAD_ORIGINS for self-hosted Supabase or custom
R2/S3-compatible origins, and reserve
DRIVE_UNZIP_PROXY_ALLOW_LOCAL_UPLOAD_ORIGINS=true for local Supabase testing.
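
The size caps can be sketched as a budget check that runs before extracted bytes are uploaded. The defaults and env names come from the text above; the function names, the `reason` strings, and the assumption that each env value is a plain integer byte count are illustrative only.

```javascript
const MIB = 1024 * 1024;

// Read an integer byte cap from env, falling back to the documented default.
function readCap(env, name, fallbackBytes) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackBytes;
}

function resolveCaps(env = {}) {
  return {
    maxArchiveBytes: readCap(env, 'DRIVE_UNZIP_PROXY_MAX_ARCHIVE_BYTES', 100 * MIB),
    maxEntryBytes: readCap(env, 'DRIVE_UNZIP_PROXY_MAX_ENTRY_BYTES', 50 * MIB),
    maxTotalExtractedBytes: readCap(env, 'DRIVE_UNZIP_PROXY_MAX_TOTAL_EXTRACTED_BYTES', 250 * MIB),
  };
}

// entrySizes: uncompressed size of each extracted file, in bytes.
function checkExtractionBudget(archiveBytes, entrySizes, caps = resolveCaps()) {
  if (archiveBytes > caps.maxArchiveBytes) return { ok: false, reason: 'archive-too-large' };
  let totalBytes = 0;
  for (const size of entrySizes) {
    if (size > caps.maxEntryBytes) return { ok: false, reason: 'entry-too-large' };
    totalBytes += size;
    if (totalBytes > caps.maxTotalExtractedBytes) return { ok: false, reason: 'total-too-large' };
  }
  return { ok: true, totalBytes };
}

console.log(checkExtractionBudget(10 * MIB, [40 * MIB, 40 * MIB]));
```
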
The MarkItDown endpoint is the conversion path for uploaded workspace files.
Do not route YouTube summaries through MarkItDown or Google Search. Google
Gemini chat requests attach one public or unlisted YouTube URL directly as a
native video/mp4 file input, so the model can summarize the video through the
provider-supported video path. Playlist/query parameters are stripped before
the URL is attached so each request references only one video. Any legacy direct
URL conversion path that still reaches MarkItDown must reserve and commit the
fixed MarkItDown credit charge before the sidecar request is sent.
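
Stripping playlist and query parameters so a request references one video can be sketched as below. The function name, the accepted hostnames, and the decision to return a canonical `watch?v=` URL are assumptions; the actual normalization in the chat pipeline may differ.

```javascript
// Returns a single-video YouTube URL, or null when no video ID can be found.
function normalizeYouTubeUrl(rawUrl) {
  const url = new URL(rawUrl);
  const host = url.hostname.replace(/^www\./, '');
  let videoId = null;
  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    videoId = url.pathname.slice(1).split('/')[0] || null;
  } else if ((host === 'youtube.com' || host === 'm.youtube.com') && url.pathname === '/watch') {
    videoId = url.searchParams.get('v');
  }
  if (!videoId) return null;
  // Rebuild the URL with only the video ID, dropping list, index, and other params.
  return `https://www.youtube.com/watch?v=${encodeURIComponent(videoId)}`;
}

console.log(normalizeYouTubeUrl('https://www.youtube.com/watch?v=abc123&list=PLxyz&index=2'));
```
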
Interrupted Docker Compose recreates can leave temporary container names such
as <hex>_platform-markitdown-1. The Docker helper treats those as recoverable
only when the suffix matches one of the services in the current compose up
request, removes that stale temp container, and retries the same narrow up
operation.
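
The "recoverable only when the suffix matches a requested service" rule can be sketched with a name matcher. The function name and the exact `<hex>_<project>-<service>-<index>` pattern are assumptions inferred from the `<hex>_platform-markitdown-1` example.

```javascript
// Returns the matching service name when containerName is a stale temporary
// container for one of the requested services, otherwise null.
function matchStaleTempContainer(containerName, projectName, requestedServices) {
  const match = /^([0-9a-f]+)_(.+)-(\d+)$/.exec(containerName);
  if (!match) return null;
  const projectAndService = match[2]; // e.g. "platform-markitdown"
  for (const service of requestedServices) {
    if (projectAndService === `${projectName}-${service}`) return service;
  }
  return null;
}

console.log(matchStaleTempContainer('0a1b2c3d4e5f_platform-markitdown-1', 'platform', ['markitdown']));
```

Scoping the match to the services in the current `compose up` request keeps the cleanup from touching containers that belong to unrelated operations.
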
The production web-proxy service is pinned to the official mainline Alpine
image nginx:1.31.0-alpine, and scripts/check-docker-web.js verifies that
pin in the merged production Compose config.
The long-lived nginx proxy also raises its request-header buffer limits so larger
session/auth cookies do not fail at the proxy layer with 400 Request Header Or Cookie Too Large before the active web container sees the request. It now also
raises its upstream response-header buffers (proxy_buffer_size,
proxy_buffers, and proxy_busy_buffers_size) so larger Supabase auth
responses with multiple Set-Cookie headers do not fail with upstream sent too big header while reading response header from upstream. The proxy uses Docker
DNS re-resolution plus a shorter keepalive timeout so promotions are less
likely to produce transient 502 Host Error responses for existing Cloudflare
Tunnel connections, while the previous color remains alive as a warm standby.
The proxy keeps both blue and green in the nginx upstream group during steady
state, with the active color as the primary upstream and the standby color as a
backup. The runtime DNS resolver is defined at the nginx include/http scope,
not just inside server, so Docker service-name resolution continues to work
for the blue/green upstream block at reload time.
Both the production web image healthcheck and the web-proxy compose
healthcheck now use the internal /__platform/drain-status endpoint too, so
raw bun serve:web:docker:bg waits on the same non-rate-limited readiness path
as the blue/green promotion gate. The proxy exposes that path as an exact
loopback-only nginx location and forwards a private internal probe header to
the active web lane, because the web request tracker intentionally answers the
drain-status endpoint only for local or explicitly trusted Docker-network
requests.
Every blue/green deployment also stamps the runtime with
PLATFORM_DEPLOYMENT_STAMP and PLATFORM_BLUE_GREEN_COLOR. Those values are
surfaced through both nginx response headers and the web process itself, and
the web layout appends the deployment stamp to the service-worker URL with
updateViaCache: 'none' so new deployments push browsers toward the latest
worker instead of lingering on stale cached state.
The local runtime state lives in:
- `tmp/docker-web/prod/active-color`
- `tmp/docker-web/prod/deployment-stamp`
- `tmp/docker-web/prod/nginx.conf`
- `tmp/docker-web/prod/target-state.json`
Infrastructure monitoring reads `target-state.json`, the watcher
deployment history, and the latest deployment stage handoff together, so
operators can see staged target work such as a prepared web color while Hive or
support gates still block public promotion. active-color and the generated
proxy config remain on the previous serving web lane until the final
proxy-reload stage passes. Watcher-managed deployments persist the
web-build, web-promote, hive-migrate, hive-promote,
support-refresh, and proxy-reload stage results into deployment history.
Modern rows that were recorded without a stage array are inferred from final
deployment status and build-cache metadata; truly pre-tracking rows still show
stage chips as not applicable. Active watcher deployments that only have
pending build/deploy status are surfaced with a synthetic current stage so
operators can see the build is in progress before full stage history is written.
After each Hive migration pass, the deploy helper runs
docker compose rm --stop -f hive-db-migrate so the completed one-shot
migration service is stopped if necessary and removed. This keeps
hive-db-migrate from lingering after depends_on starts it while Hive
services come up.
Native Cron Runner
Self-hosted production cron jobs use apps/web/cron.config.json as the shared
source of truth. apps/web/vercel.json.crons should stay generated from that
file with:
--check in CI and local verification when cron definitions change. The
sync script preserves Vercel behavior by copying each enabled job’s path and
schedule from the shared config into apps/web/vercel.json.
In Docker production, the web-cron-runner service runs
scripts/watch-web-crons.js against INTERNAL_WEB_API_ORIGIN, defaulting to
http://web-proxy:7803. Requests include
Authorization: Bearer ${CRON_SECRET || VERCEL_CRON_SECRET} so the same route
auth gate can protect Vercel and native Docker executions.
When neither CRON_SECRET nor VERCEL_CRON_SECRET is set on the host, the
Docker environment generator creates a persisted internal secret at
tmp/docker-web/cron-token and injects it into both the web containers and the
web-cron-runner service as CRON_SECRET. This keeps native Docker cron auth
self-contained while preserving explicit host-provided secrets when present.
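The secret fallback described above can be sketched as follows. This is a minimal illustration, not the Docker environment generator itself: the function names, the 64-character hex token format, and the file handling are assumptions; only the precedence (CRON_SECRET, then VERCEL_CRON_SECRET, then a persisted generated token) and the Bearer header come from the text.

```python
import secrets
from pathlib import Path


def resolve_cron_secret(env: dict, token_path: Path) -> str:
    """Pick the cron bearer secret, generating a persisted one if needed."""
    # Host-provided secrets win, in the documented order.
    explicit = env.get("CRON_SECRET") or env.get("VERCEL_CRON_SECRET")
    if explicit:
        return explicit
    # Reuse a previously generated token so restarts keep the same secret.
    if token_path.exists():
        saved = token_path.read_text().strip()
        if saved:
            return saved
    # Otherwise create and persist a new random token (format is an assumption).
    token = secrets.token_hex(32)
    token_path.parent.mkdir(parents=True, exist_ok=True)
    token_path.write_text(token)
    return token


def cron_auth_header(secret: str) -> dict:
    """Build the Authorization header that the cron runner sends."""
    return {"Authorization": f"Bearer {secret}"}
```

Calling resolve_cron_secret twice with an empty environment returns the same token, which mirrors the "persisted internal secret" behavior; an explicit host secret never touches the token file.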
Runtime cron telemetry is intentionally file-based and local to the host:
- tmp/docker-web/cron/status.json for runner health, the current cycle, and the latest manual run lifecycle records. Manual runs move through queued, processing, and a final success/failed/timeout/skipped state. While a run is processing, the runner refreshes captured route console logs in this file so the monitoring UI can show near-realtime status and log updates.
- tmp/docker-web/cron/state.json for restart-safe last-run markers.
- tmp/docker-web/cron/executions/*.jsonl for per-run route response, duration, status, and captured web-container console logs.
- tmp/docker-web/watch/control/cron-control.json for the global enabled switch.
- tmp/docker-web/watch/control/cron-run-requests/*.json for queued manual runs created by the monitoring UI.
--once, which is used by script tests to verify due-run detection, queued
manual runs, restart-safe state, and log persistence without starting the
long-running loop.
Calendar cron routes should call workspace calendar APIs through
INTERNAL_WEB_API_ORIGIN when it is present. In Docker production this keeps
provider sync and smart scheduling traffic on the internal web-proxy origin
instead of accidentally depending on a public app URL from inside the container.
calendar-provider-sync intentionally calls the same
/api/v1/workspaces/:wsId/calendar/sync route that the calendar page uses, but
with cron auth and source: "cron" so dashboard runs stay manual and scheduled
runs stay auditable. The workspace sync route is responsible for provider
fan-out: Google connections are selected from active Google auth-token rows, and
Microsoft connections are selected from active Microsoft auth-token rows. Do not
reimplement provider-specific calendar fetching inside the cron wrapper.
Auto-Deploy Watcher
bun serve:web:docker:bg:watch locks the current branch/upstream at startup,
polls every second, fast-forwards when GitHub has a newer commit, and runs the
blue/green deploy flow automatically.
Run this command from a host-level process manager, not only from inside Docker.
The command starts and tails the web-blue-green-watcher container, but the
host process is the part that can recover after the Docker engine itself dies.
When Docker is unavailable, the host supervisor polls docker info; after
DOCKER_WEB_WATCHER_DOCKER_RESTART_AFTER_MS milliseconds of continuous failure
(default 30000), it attempts to restart Docker, waits for docker info to pass,
runs any configured host-level post-restart commands, then recreates the watcher
container. The recreated watcher reuses the existing cached blue/green recovery
path to bring web-proxy and the active/standby web lanes back to health.
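As a rough sketch of the timing rules, the decision to attempt a Docker restart might look like the function below. The function and parameter names are hypothetical, and treating the thresholds as inclusive is an assumption; the 30000 ms delay, the 300000 ms cooldown (DOCKER_WEB_WATCHER_DOCKER_RESTART_COOLDOWN_MS), the "0 disables attempts" rule, and the hard-disable switch are taken from this document.

```python
from typing import Optional


def should_attempt_restart(
    now_ms: int,
    failing_since_ms: int,
    last_attempt_ms: Optional[int],
    restart_after_ms: int = 30_000,
    cooldown_ms: int = 300_000,
    disabled: bool = False,
) -> bool:
    """Return True when a Docker restart attempt is due."""
    # The hard-disable switch or a 0 delay both turn restart attempts off.
    if disabled or restart_after_ms == 0:
        return False
    # Wait for the configured stretch of continuous `docker info` failure.
    if now_ms - failing_since_ms < restart_after_ms:
        return False
    # Respect the minimum time between consecutive restart attempts.
    if last_attempt_ms is not None and now_ms - last_attempt_ms < cooldown_ms:
        return False
    return True
```

With the defaults, a first attempt becomes due 30 seconds into an outage, and a second attempt is held back until five minutes after the first.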
Docker restart command defaults:
- Linux: systemctl restart docker
- macOS: open -ga Docker
- Windows: powershell.exe -NoProfile -Command Start-Process "Docker Desktop"
- DOCKER_WEB_WATCHER_DOCKER_RESTART_AFTER_MS: delay before the first Docker restart attempt while docker info is failing; set 0 to disable attempts.
- DOCKER_WEB_WATCHER_DOCKER_RESTART_COOLDOWN_MS: minimum time between restart attempts; default 300000.
- DOCKER_WEB_WATCHER_DOCKER_RESTART_COMMAND: command used to restart or open Docker. Prefer JSON array syntax for commands with quoted arguments.
- DOCKER_WEB_WATCHER_DOCKER_RESTART_DISABLED=1: hard-disable daemon restart attempts while still waiting for Docker to recover externally.
- DOCKER_WEB_WATCHER_DOCKER_RECOVERY_TIMEOUT_MS: optional maximum wait time for Docker recovery; unset or 0 means wait indefinitely.
- DOCKER_WEB_WATCHER_DOCKER_POST_RESTART_COMMAND_TIMEOUT_MS: timeout for each additional host-level recovery command; default 600000.
- DOCKER_WEB_WATCHER_DOCKER_POST_RESTART_COMMANDS: JSON array of host-level commands to run after Docker is reachable again and before Tuturuuu recreates its watcher container. Each entry is an object with command, args, and an optional cwd.
- DOCKER_WEB_WATCHER_MAX_REQUEST_LOG_BYTES: maximum durable proxy request-log ledger size before the watcher rotates and prunes older JSONL chunks before appending new entries; default 268435456 bytes.
tmp/docker-web/watch/control/blue-green-docker-recovery-settings.json, and
the host supervisor reads that file before each Docker recovery wait. Dashboard
settings override the environment defaults without restarting the supervisor.
Host-level executable commands are different: configure
DOCKER_WEB_WATCHER_DOCKER_RESTART_COMMAND and
DOCKER_WEB_WATCHER_DOCKER_POST_RESTART_COMMANDS only in the host supervisor
environment. The supervisor intentionally ignores command fields from
blue-green-docker-recovery-settings.json so dashboard viewers cannot persist
host commands for a later recovery event.
That settings file also owns Docker crash email alerts:
- emailAlertsEnabled: enables SES-backed Docker recovery alert emails from the web cron worker.
- emailAlertRecipients: explicit recipient list. If this is empty, the cron falls back to PLATFORM_DOCKER_RECOVERY_ALERT_EMAILS, then the last operator email that saved the settings.
- emailAlertCooldownMs: minimum time between alert emails; default 1800000.
docker info failures. Because the web app itself
usually runs inside Docker, SES email delivery happens after Docker is reachable
again and the web cron job /api/cron/infrastructure/docker-recovery-alerts
can run. The cron deduplicates by Docker recovery incident id using
tmp/docker-web/watch/control/blue-green-docker-recovery-alert-state.json.
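The recipient fallback and alert deduplication can be sketched like this. It is an illustration only: the function names, the comma-separated format of PLATFORM_DOCKER_RECOVERY_ALERT_EMAILS, and the lastIncidentId / lastSentAtMs state keys are assumptions, not the cron route's actual schema. The fallback order and the 1800000 ms cooldown come from the settings described above.

```python
from typing import Optional


def resolve_alert_recipients(
    settings: dict, env: dict, last_operator_email: Optional[str]
) -> list:
    """Apply the documented fallback order for alert recipients."""
    explicit = [r for r in settings.get("emailAlertRecipients", []) if r]
    if explicit:
        return explicit
    # Assumed format: a comma-separated list in the environment variable.
    raw = env.get("PLATFORM_DOCKER_RECOVERY_ALERT_EMAILS", "")
    from_env = [part.strip() for part in raw.split(",") if part.strip()]
    if from_env:
        return from_env
    return [last_operator_email] if last_operator_email else []


def should_send_alert(
    incident_id: str, now_ms: int, state: dict, cooldown_ms: int = 1_800_000
) -> bool:
    """Deduplicate by incident id, then apply the email cooldown."""
    # Hypothetical state keys; the real alert-state file may differ.
    if state.get("lastIncidentId") == incident_id:
        return False
    last_sent = state.get("lastSentAtMs")
    if last_sent is not None and now_ms - last_sent < cooldown_ms:
        return False
    return True
```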
Example post-restart commands for colocated projects:
systemd
service or run it as an operator account with permission to execute only the
configured Docker restart command and the explicit post-restart commands needed
by colocated projects. Use Restart=always so the host supervisor itself comes
back after reboots or process crashes.
Additional behavior:
- If the watcher script itself changed in the pulled revision, the current watcher process restarts first and the replacement process performs the deploy.
- If blue/green is already live and the standby color remains on an older revision for 15 minutes, the watcher rebuilds only the standby color in place. The active color remains primary for new traffic the whole time.
- If the watcher sees a degraded blue/green runtime with a proxy or runtime marker present but no active web color serving traffic, it immediately retags the latest retained successful image into the active web color and starts it with --no-build. It prefers a retained image for the current main commit, then falls back to the newest retained successful image so the runtime can recover first and reconcile to main afterward. It then retags the same cached image into the opposite color and starts that as the warm standby, creating two ready copies without waiting for a fresh build.
- Blue/green active and standby discovery uses Docker health, not just container presence. If the persisted active color is unhealthy but the opposite color is healthy, the watcher rewrites the active marker and proxy to the healthy color before building or refreshing another lane.
- Cached recoveries write a fresh nginx proxy config before the proxy is started, so recovery never boots nginx with a stale upstream that points at a missing or unhealthy color.
- That standby catch-up path also stops and removes the stale standby container before rebuilding it, so health checks target the fresh replacement container rather than an outdated standby instance.
- Standby catch-up rebuilds reuse the current deployment stamp so the warm backup matches the latest deployment state instead of serving an older build if nginx needs to fail over.
- The watcher dashboard surfaces the top 3 most relevant deployments from the recent history, prioritizing in-progress rollouts first, then the live promoted color, then the warm standby. Direct manual bun serve:web:docker:bg runs are written into that same history too.
- Cached recoveries write both the active recovery and the standby refresh into the same retained ledger, preserving the current two warm copies plus the prior successful deployment as the fastest rollback reference. If no retained image exists, the watcher falls back to the normal recovery build path.
- The infrastructure monitoring rollback controls show the latest retained cached recovery images separately from the general deployment history, so an operator can quickly select a known cache-backed commit before pinning it for rollback or smoke testing.
- Successful active and standby builds tag the service image as {compose-project}-web-cache:{commit} and prune older retained cache tags beyond the three newest successful deployments. Pruning is idempotent: already-removed cache tags are ignored instead of warning in the live watcher log.
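The cache-tag retention rule in the list above can be illustrated with a short sketch. The helper names and the in-memory set standing in for the local Docker image store are assumptions; only the {compose-project}-web-cache:{commit} tag shape, the "three newest" window, and the idempotent pruning come from the text.

```python
def cache_tag(compose_project: str, commit: str) -> str:
    """Format the retained cache tag for a successful build."""
    return f"{compose_project}-web-cache:{commit}"


def split_retained(tags_newest_first: list, keep: int = 3) -> tuple:
    """Return (tags to keep, tags to prune) for a newest-first list."""
    return tags_newest_first[:keep], tags_newest_first[keep:]


def prune(existing_tags: set, to_remove: list) -> list:
    """Remove tags that still exist; ignore tags that are already gone."""
    removed = []
    for tag in to_remove:
        if tag in existing_tags:
            existing_tags.discard(tag)
            removed.append(tag)
        # Already-removed tags are skipped silently, so reruns are harmless.
    return removed
```

Running prune a second time with the same arguments removes nothing and raises nothing, which is the idempotence the watcher relies on.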
Deployment build lock
Blue/green deploys coordinate on a JSON lock file under tmp/docker-web/watch/blue-green-deployment-build.lock (owner PID, command,
deployment kind, and a re-entrant token for nested helper calls).
- On Linux, the helper compares /proc/<pid>/cmdline to the recorded lock so a reused PID after a crash cannot masquerade as an in-flight deploy. The recorded command is the package script name (bun serve:web:docker:bg), but production deploys usually run as node scripts/docker-web.js ...; the matcher treats those as the same holder so a live node deploy is not cleared as a stale PID reuse. On macOS and Windows, the same age-based stale window (DOCKER_WEB_DEPLOYMENT_LOCK_STALE_AFTER_MS, default eight hours) still clears abandoned locks because /proc validation is unavailable. When no web-proxy/web-blue/web-green containers exist, the auto-deploy watcher also runs the same stale-lock sweep before cached recovery.
- DOCKER_WEB_DEPLOYMENT_LOCK_STALE_AFTER_MS: optional override for the default eight-hour window used when /proc is unreadable but kill(pid, 0) still reports a process (for example permission quirks). Set to 0 to disable age-based assists.
- DOCKER_WEB_CANCEL_ACTIVE_BUILD=1 or --cancel-active-build on a manual bun serve:web:docker:bg run stops the watcher/buildkit services, clears the lock, and records a canceled history row before starting fresh.
- The auto-deploy watcher treats an active deployment lock as a wait state, not a failed deploy attempt. Recovered pending handoffs, reconcile builds, standby refreshes, platform promotions, and imported Infrastructure project builds all defer behind the same lock so only one deployment build runs across the stack.
- The watcher also treats a build lock older than 30 minutes as a timed-out build. For another live deployment PID, it sends SIGTERM to the recorded owner. If the lock is owned by the watcher process itself and the watcher has already returned to the polling loop, the lock is treated as leaked cached-recovery state and cleared without signaling the watcher. In both cases, the watcher records a failed deployment history row with the timeout reason and waits until the next polling cycle before retrying. Override the window with DOCKER_WEB_WATCHER_BUILD_TIMEOUT_MS; set it to 0 only when an operator explicitly wants to disable watcher-side build termination.
- The apps/web Dockerfile deps stage retries bun install --frozen-lockfile up to three times with a Bun cache scrub between attempts. If the build still exits with bun install --frozen-lockfile exit code 1 after a git pull, regenerate bun.lock in a development checkout, commit the reviewed lockfile update, and deploy that commit. Do not let the production host rewrite bun.lock as part of the auto-deploy path. If tarball extraction still fails (for example @biomejs/cli-linux-x64), the blue/green helper prunes BuildKit exec cache mounts, restarts the Compose-owned buildkit service, and retries once with docker compose build --no-cache so a cached failed deps stage is not reused. The same one-time fresh retry is used for CACHED ERROR ... COPY --from=deps and for the build watchdog timeout.
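The stale-lock checks described in this section can be approximated by the sketch below. It is a simplified model under stated assumptions: the lock fields command and createdAtMs, the substring-based matcher, and the rule that a dead PID always means a stale lock are hypothetical. The command-line comparison, the bun/node holder equivalence, the eight-hour age window, and "0 disables the age-based assist" follow the text.

```python
from typing import Optional

EIGHT_HOURS_MS = 8 * 60 * 60 * 1000


def same_holder(recorded_command: str, live_cmdline: str) -> bool:
    """Treat the package script and its node entry point as one holder."""
    if recorded_command in live_cmdline:
        return True
    # Assumed alias: the bun package script usually runs as this node script.
    return (
        "serve:web:docker" in recorded_command
        and "scripts/docker-web.js" in live_cmdline
    )


def lock_is_stale(
    lock: dict,
    now_ms: int,
    pid_alive: bool,
    live_cmdline: Optional[str],
    stale_after_ms: int = EIGHT_HOURS_MS,
) -> bool:
    """Decide whether a deployment build lock can be cleared."""
    # Assumption: a lock whose owner PID no longer exists is always stale.
    if not pid_alive:
        return True
    if live_cmdline is not None:
        # Command line readable: a reused PID running something else is stale.
        return not same_holder(lock["command"], live_cmdline)
    # Command line unavailable: fall back to lock age; 0 disables this assist.
    if stale_after_ms == 0:
        return False
    return now_ms - lock["createdAtMs"] >= stale_after_ms
```

Passing live_cmdline=None models macOS, Windows, or an unreadable /proc entry, where only the age window can clear an abandoned lock.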
Monitoring Surfaces
The infrastructure monitoring UI in apps/web is intentionally split into
smaller pages instead of one oversized dashboard:
- /{wsId}/infrastructure/monitoring for the operator overview, runtime snapshot, cron health summary, and jump points into deeper surfaces.
- /{wsId}/infrastructure/monitoring/cron for cron job schedules, global enable/disable control, manual run requests, recent execution status, route responses, and captured web-container console logs.
- /{wsId}/infrastructure/monitoring/rollouts for rollout controls, deployment charts, event streams, and ledger history.
- /{wsId}/infrastructure/monitoring/requests for paginated proxy request history backed by the durable JSONL request store under tmp/docker-web/watch/blue-green-request-logs/.
- /{wsId}/infrastructure/monitoring/watcher-logs for paginated watcher log browsing backed by tmp/docker-web/watch/blue-green-auto-deploy.logs.json.
Build Resource Caps
When build and serve run on the same machine, use the Docker web helper’s Buildx throttling options instead of letting BuildKit consume the full host. Example:
- bun serve:web:docker defaults to --build-memory 12g --build-cpus 4 --build-max-parallelism 1
- bun serve:web:docker:bg defaults to --build-memory 12g --build-cpus 4 --build-max-parallelism 1
mem_limit to 12g and
cpus to 4 when DOCKER_WEB_BUILD_MEMORY and DOCKER_WEB_BUILD_CPUS are
unset, which keeps large monorepo builds from overcommitting shared hosts that
still run the web stack, Redis, log drain, and other sidecars alongside builds.
Raise or lower the caps with env vars or helper flags when your machine is
tighter or has spare capacity.
You can still override those defaults per run by appending your own flags after
--, for example:
- DOCKER_WEB_BUILD_MEMORY=16g
- DOCKER_WEB_BUILD_CPUS=4
- DOCKER_WEB_BUILD_MAX_PARALLELISM=2
- DOCKER_WEB_BUILD_BUILDER_NAME=tuturuuu
- DOCKER_WEB_BUILDKIT_PORT=7914
- DOCKER_WEB_BUILDKIT_ENDPOINT=tcp://127.0.0.1:7914
- DOCKER_WEB_BUILDKIT_PRUNE_AFTER_BUILD=0 for blue/green watcher handoffs
- DOCKER_WEB_BUILDKIT_STOP_AFTER_BUILD=0 to keep the buildkit container warm after a build
- DOCKER_WEB_DOCKER_MEMORY_LIMIT=<bytes from docker info>
- DOCKER_WEB_STATIC_PAGE_GENERATION_TIMEOUT=180
- DOCKER_WEB_STATIC_GENERATION_MAX_CONCURRENCY=auto
- DOCKER_WEB_NEXT_BUILD_CPUS=auto
- DOCKER_WEB_NEXT_APP_ONLY=1
- DOCKER_WEB_NODE_MAX_OLD_SPACE_SIZE=auto
- DOCKER_WEB_NEXT_BUILD_ENGINE=turbopack
- DOCKER_WEB_REACT_COMPILER=0
- The helper starts the Compose-owned buildkit service and then creates or reuses the remote Buildx builder named by DOCKER_WEB_BUILD_BUILDER_NAME. The container is named ${COMPOSE_PROJECT_NAME:-tuturuuu}-buildkit-1, so it stays visually grouped under the tuturuuu Docker Desktop stack.
- The BuildKit caps accept auto. Auto memory uses Docker’s reported memory limit minus a small host overhead buffer, rounded down to MiB precision; auto CPU uses 1 CPU below 10 GB, 2 CPUs below 16 GB, and 4 CPUs on larger Docker allocations; auto max parallelism uses 1 below 16 GB and 2 above that. The E2E runner uses these auto caps by default so local Playwright verification adapts to the current Docker Desktop setting without requiring one-off env overrides.
- DOCKER_WEB_BUILD_MEMORY caps the Compose-owned BuildKit service’s memory budget.
- DOCKER_WEB_BUILD_CPUS sets the BuildKit service CPU budget.
- DOCKER_WEB_BUILD_MAX_PARALLELISM writes a BuildKit config that limits concurrent solve steps, which is often the most effective way to reduce CPU spikes on smaller machines.
- Host-side helper runs point Buildx at DOCKER_WEB_BUILDKIT_ENDPOINT (default tcp://127.0.0.1:${DOCKER_WEB_BUILDKIT_PORT:-7914}). The watcher container uses tcp://buildkit:1234 on the Compose network.
- Blue/green watcher handoffs preserve the Compose-owned BuildKit cache volume by default (DOCKER_WEB_BUILDKIT_PRUNE_AFTER_BUILD=0), but stop and remove the buildkit container after the build/deploy phase (DOCKER_WEB_BUILDKIT_STOP_AFTER_BUILD=1). This frees idle CPU and memory while keeping layer state for the next deployment. Set DOCKER_WEB_BUILDKIT_STOP_AFTER_BUILD=0 only when an operator intentionally wants BuildKit to stay warm after a handoff.
- Dockerized E2E is the exception: its per-run BuildKit state is disposable and scripts/run-web-e2e-docker.js sets DOCKER_WEB_BUILDKIT_PRUNE_AFTER_BUILD=1 unless explicitly overridden.
- The same max-parallelism value is also forwarded as COMPOSE_PARALLEL_LIMIT when that variable is not already set. When the limit is 1, the blue/green workflow builds each Bake target group separately so image export and web compilation do not overlap on memory-constrained hosts.
- When Docker reports less than 10 GB of total memory for a blue/green run, the helper also restarts the Compose-owned buildkit service immediately before the build batch. This clears long-lived BuildKit RSS before the replacement web image builds while the active lane is still running. Set DOCKER_WEB_BUILDKIT_RESTART_BEFORE_BUILD=0 to skip that low-memory restart, or 1 to force it on a larger host.
- Docker web builds use bun run build:web:docker, which keeps the normal web build dependency graph, sets NODE_OPTIONS=--max-old-space-size to at least 4 GB on Docker allocations below 10 GB, then scales to 8 GB, 12 GB, or 16 GB based on the lower of Docker’s reported memory limit and the selected BuildKit memory cap, falling back to DOCKER_WEB_BUILD_MEMORY for environments where Docker memory cannot be detected. The helper reads Docker’s MemTotal and forwards it as DOCKER_WEB_DOCKER_MEMORY_LIMIT; auto buckets reserve 1 GB of effective Docker build memory for BuildKit, the active runtime lane, and sidecar overhead before selecting the Node heap bucket on larger allocations. Docker production builds use Turbopack under the real Node 24 runtime with App Router-only compilation and React Compiler disabled. This avoids Bun runtime crashes while loading native Next SWC modules and keeps local E2E aligned with production Docker builds. The Docker builder stage is based on node:24-bookworm-slim and copies the Bun binary in only for workspace script orchestration, so the actual next build process runs under real Node instead of Bun’s node shim. The @tuturuuu/web build:docker script delegates to scripts/run-web-docker-next-build.js, which spawns DOCKER_WEB_NODE_BINARY (pinned by the Dockerfile to /usr/local/bin/node) for the Next CLI and honors DOCKER_WEB_NODE_MAX_OLD_SPACE_SIZE=auto by default. Set a numeric heap value only when you need to override the bucket selection; values below 4096 MB are rejected. Set DOCKER_WEB_NEXT_APP_ONLY=0 or DOCKER_WEB_REACT_COMPILER=1 only when the host has enough memory headroom. Keep DOCKER_WEB_NEXT_BUILD_ENGINE on its default Turbopack value for production watcher hosts and local E2E runs.
- Docker standalone Next builds default static page generation to a 180 second timeout, and auto-scale the inner Next build CPU count plus static generation concurrency from Docker memory. Docker allocations below 10 GB use 1 Next build CPU and static generation concurrency 1; 10-16 GB allocations use 2 for both; 16 GB and larger allocations use 4 for both. The Compose-owned BuildKit service still defaults to a 4 CPU budget, while the inner Next workers stay lower on smaller hosts to avoid OOM kills when the same machine is also running the active blue/green lane and sidecars. Override those with DOCKER_WEB_STATIC_PAGE_GENERATION_TIMEOUT, DOCKER_WEB_STATIC_GENERATION_MAX_CONCURRENCY, and DOCKER_WEB_NEXT_BUILD_CPUS when a host has more or less headroom.
- Hive Docker images use a filtered workspace install and a Next standalone runner. Before apps/hive runs next build, the image must build @tuturuuu/types, @tuturuuu/internal-api, and @tuturuuu/supabase because those packages expose production dist/* subpath exports that Turbopack resolves during the standalone build. Hive realtime installs its filtered production workspace with Bun’s hoisted linker so the direct bun apps/hive-realtime/src/index.ts runtime can resolve top-level production packages such as postgres and @tuturuuu/realtime. Keep .dockerignore explicit about recursive generated directories such as **/.next/**, tmp/**, and apps/mobile/build/**; otherwise previous local builds can be copied into the next Docker context and inflate small sidecar images by several gigabytes.
- These caps affect image builds, not the runtime apps/web container after it has started.
- If no build caps are configured, the helper continues using Docker’s default builder behavior.
- Do not switch capped builds back to the Buildx docker-container driver. It creates Docker-managed containers named like buildx_buildkit_* outside the Compose project, which makes Docker Desktop grouping and health reporting confusing.
- During capped-build setup, the helper removes known legacy Buildx builders such as platform-web-capped-builder before creating or reusing the Compose-owned remote tuturuuu builder. If docker buildx ls still shows that legacy builder, run the capped web deploy helper once so it can clean up the stale Buildx record.
- A lower parallelism setting usually trades build speed for host stability.
- If BuildKit fails with ResourceExhausted or cannot allocate memory while host memory is still available, the builder cgroup is too small; raise --build-memory while keeping --build-max-parallelism 1.
- If host memory or swap is saturated, lower --build-max-parallelism first and stop unrelated containers before raising the builder memory cap.
- If Bun fails during an image install with a tarball extraction error such as Fail extracting tarball for "@biomejs/cli-linux-x64", the blue/green helper treats it as BuildKit exec-cache corruption once per deployment attempt. It prunes BuildKit exec cache mounts, restarts the Compose-owned buildkit service, and retries the build once with --no-cache. A second failure is recorded as a real deployment failure with the original command context preserved in logs.
- If BuildKit reports CACHED ERROR after a failed deps stage, or if the compose build exceeds DOCKER_WEB_BUILD_TIMEOUT_MS (default 45 minutes), the helper uses the same one-time cache recovery and fresh --no-cache retry.
- If a deployment fails during the build, the watcher captures the actionable failure lines into the retained deployment history as failureReason. The Infrastructure → Monitoring → Deployments page and rollout ledger display that reason inline so operators do not need to reconstruct failures from a terminal scrollback.
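The auto caps described in this section map Docker's memory allocation to CPU and parallelism budgets. The sketch below shows that mapping under stated assumptions: the function names are hypothetical, the documented "GB" thresholds are treated as GiB, and exactly 16 GB is placed in the larger bucket for max parallelism (the text only says "above that"). Auto memory and the Node heap buckets are left out because their exact buffers and thresholds are not spelled out here.

```python
GIB = 1024 ** 3  # Assumption: documented "GB" thresholds are treated as GiB.


def auto_build_cpus(docker_memory_bytes: int) -> int:
    """BuildKit auto CPU: 1 below 10 GB, 2 below 16 GB, otherwise 4."""
    if docker_memory_bytes < 10 * GIB:
        return 1
    if docker_memory_bytes < 16 * GIB:
        return 2
    return 4


def auto_max_parallelism(docker_memory_bytes: int) -> int:
    """BuildKit auto max parallelism: 1 below 16 GB, otherwise 2."""
    return 1 if docker_memory_bytes < 16 * GIB else 2


def auto_next_build_workers(docker_memory_bytes: int) -> int:
    """Inner Next build CPUs and static generation concurrency.

    The documented buckets (below 10 GB, 10-16 GB, 16 GB and larger)
    happen to match the BuildKit auto CPU buckets.
    """
    return auto_build_cpus(docker_memory_bytes)
```

For an 8 GB Docker Desktop allocation this yields 1 CPU, parallelism 1, and a single Next worker, which is the conservative profile the low-memory notes above aim for.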
Redis Profile
Redis is enabled by default in both dev and production-style Docker web stacks. The Redis and serverless-redis-http host ports bind to 127.0.0.1 only; do
not expose them through Cloudflare Tunnel, public firewall rules, or
all-interface Docker port mappings.
The helper persists the generated token in:
tmp/docker-web/redis-token
apps/web automatically:
- UPSTASH_REDIS_REST_URL=http://serverless-redis-http:80
- UPSTASH_REDIS_REST_TOKEN=<generated local token>
UPSTASH_REDIS_REST_TOKEN during
Compose interpolation. Use the Docker web helper, which injects the generated
token automatically, or export a strong token before running direct
docker compose --profile redis ... commands. Service env_file entries do
not satisfy Compose interpolation for the Redis HTTP bridge token.
Docker Redis mode intentionally ignores generic UPSTASH_REDIS_REST_URL and
UPSTASH_REDIS_REST_TOKEN values from the host shell. This prevents old Upstash
REST URLs from leaking into self-hosted Docker containers after the Upstash
instance is shut down. If a Docker host must override the bundled Redis sidecar,
use the Docker-specific DOCKER_UPSTASH_REDIS_REST_URL and
DOCKER_UPSTASH_REDIS_REST_TOKEN variables.
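A minimal sketch of that precedence, assuming a hypothetical resolve_docker_redis_env helper and assuming that the URL and token overrides apply independently of each other: generic UPSTASH_* host values are never consulted, the Docker-specific variables win when set, and the bundled sidecar URL plus the generated token are the fallback.

```python
BUNDLED_REDIS_REST_URL = "http://serverless-redis-http:80"


def resolve_docker_redis_env(host_env: dict, generated_token: str) -> dict:
    """Compute the Redis REST env injected into the web containers."""
    # Generic UPSTASH_* values from the host shell are deliberately ignored.
    url = host_env.get("DOCKER_UPSTASH_REDIS_REST_URL") or BUNDLED_REDIS_REST_URL
    token = host_env.get("DOCKER_UPSTASH_REDIS_REST_TOKEN") or generated_token
    return {
        "UPSTASH_REDIS_REST_URL": url,
        "UPSTASH_REDIS_REST_TOKEN": token,
    }
```

A host shell that still exports an old Upstash URL therefore has no effect on the container environment unless the operator sets the DOCKER_-prefixed variables.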
If you intentionally want the memory-only fallback, opt out:
UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN
variables. Redis-optional features such as rate limiting fall back to their
non-Redis behavior, but security-sensitive one-time state such as CLI
refresh-token replay protection fails closed until Redis is restored or the CLI
user signs in again.
Vercel-hosted satellite apps such as CMS, Calendar, Finance, Learn, Teach, and
Tasks cannot reach Docker-private Redis hosts such as serverless-redis-http.
Do not point their Vercel UPSTASH_REDIS_REST_URL at the Docker sidecar or
expose Redis through Cloudflare Tunnel. Satellite proxy guards should run
without Redis when Upstash is retired; protected product APIs continue to flow
through apps/web, where Docker Redis is available.
Cloudflare Tunnel Profile
The Docker compose files include an optional cloudflared service. Enable it
when the same host should publish the Dockerized web proxy through Cloudflare
Tunnel:
CLOUDFLARED_TOKEN or DOCKER_CLOUDFLARED_TOKEN
- Production blue/green: https://tuturuuu.com -> http://web-proxy:7803
- Dev stack: https://dev.tuturuuu.com or a temporary hostname -> http://web:7803
cms.tuturuuu.com and
other satellite app hostnames on Vercel unless those apps are explicitly moved
into this Docker stack.
Production compose binds host-published web, Hive, Meet, and Redis ports to
127.0.0.1 only. Do not remove that loopback prefix during blue/green
migration; public exposure should go through Cloudflare Tunnel or another
controlled frontend, not the staged Docker host ports.
When blue/green is deployed with --with-cloudflared, the watcher receives
DOCKER_WEB_WITH_CLOUDFLARED=1 so future auto-deploys keep the tunnel profile
active and do not remove the cloudflared container as an orphan.
Auto-Pull Blue/Green Watcher
For simple self-hosted boxes that deploy directly from a Git branch, the repo also provides a long-running auto-deploy watcher:
- Writes the forwarded watcher CLI args to tmp/docker-web/watch/blue-green-auto-deploy.args.json.
- Rebuilds and force-recreates the dedicated web-blue-green-watcher service.
- Tails that container’s live logs so the terminal still shows the watcher dashboard.

- the repo worktree at /workspace
- the same repo again at the real host checkout path via PLATFORM_HOST_WORKSPACE_DIR, so host Docker bind mounts resolve against the host filesystem when the watcher shells into docker compose ...
- /var/run/docker.sock so it can manage the blue/green compose stack itself
- the shared Bun install cache volume
- a dedicated watcher node_modules volume so the frozen dependency install stays container-local
- Reads the built-in
platformproject from the log-drain Postgres project registry. The production watcher service is wired withPLATFORM_LOG_DRAIN_DATABASE_URLso a live watcher can consume queued Infrastructure project deployments instead of falling back to the legacy single-branch loop. - The seeded branch is
production, but operators can change it from Infrastructure → Monitoring → Projects. If the selected project branch differs from the current checkout, the watcher restarts its child process, fetches, and checks out that branch only when the worktree is clean. Dirty worktrees are reported as blocked instead of being force-switched. If the watcher is already stuck on the wrong branch, the monitoring UI queues a watcher recovery request andweb-cron-runnerrecreatesweb-blue-green-watcherthrough Docker Compose out of band. - Locks the selected local branch and tracked upstream at startup.
- Writes a PID-backed lock file at
tmp/docker-web/watch/blue-green-auto-deploy.lock. - Renders a live terminal dashboard with the locked branch, tracked upstream, latest local commit, relative commit age, last check time, next poll time, current blue/green runtime state, and recent watcher events.
- Polls the tracked upstream every
1000msby default. - Auto-clears and redraws the dashboard in place on each state change when attached to a TTY.
- Runs the Git and deploy subprocesses quietly so the dashboard is not
disrupted by
git fetch,git pull, or Docker build output during normal watcher operation. - Skips pulls if the worktree is dirty.
- Uses
git pull --ff-onlyonly when the local branch is strictly behind the locked upstream. - Runs
bun install --frozen-lockfileautomatically after every successful fast-forward pull so installed dependencies match the reviewedbun.lockbefore the deploy handoff continues. The watcher does not runbun upgradeor a non-frozen install on the production host. - Treats any dirty
bun.lockas a blocking worktree change. Lockfile updates must be reviewed and committed before the watcher can continue polling or deploying. - Runs
bun serve:web:docker:bgautomatically after a successful fast-forward pull. - Polls imported Infrastructure projects from log-drain Postgres, synchronizes enabled public GitHub projects into
tmp/docker-web/projects/<projectId>/repo, deploys them through generated Next.js compose files under the sharedtuturuuuCompose project, and merges hostname routes into the central nginx proxy. The imported-project and manual deployment queue cadence is independent from the normal Git polling interval, so a watcher configured with a long Git interval such as 1000 seconds still wakes on the shorter project queue interval to advance queued Deploy actions. Platform project state is updated on both queue-only deploys and normal fast-forward deploys, so a successful pull/deploy clearsqueuedand refreshes the latest commit columns instead of relying only on deployment history. Imported project builds share the same deployment build lock as platform blue/green builds. If platform, standby, recovery, or another imported project build is already active, the project poll is deferred instead of starting a second Docker build. - If watcher runtime code such as
scripts/watch-blue-green-deploy.js,scripts/docker-web/blue-green.js, orscripts/docker-web/env.jschanged in the pulled revision, the current watcher does not deploy from the old process. It releases its lock, spawns a replacement watcher with the same CLI args, and exits first. - The replacement watcher refreshes the live
web-proxynginx config and workers in place if blue/green is already serving traffic, verifies proxy routing through/__platform/drain-status, and only then starts the new blue/green build/promotion. - If compose or helper-image wiring changed, including
docker-compose.web.prod.yml, Hive service files, MarkItDown service files,apps/storage-unzip-proxypackage/source files, orapps/web/docker/cron-runner*, the containerized watcher recreates its own compose service before the pending deploy handoff. The deploy then includes only the affected buildable helper images in the blue/green build command instead of rebuilding every service on every commit. - Retries recoverable Git command failures instead of exiting. The first retry waits 1 minute, then the watcher backs off exponentially on consecutive Git failures up to a 15 minute ceiling.
- Caps deployment attempts at 3 failures per commit. A recovered pending handoff failure is recorded, the pending request is cleared, and the watcher keeps polling; once the cap is reached, that commit reports retry-limited until a new commit is available or an operator pins a different deployment.
- Stops immediately if the checked-out branch changes while the watcher is running.
- If another watcher already owns the lock, a new invocation can fail with guidance, mirror the active watcher with --resume-if-running, or replace it with --replace-existing.
- Manual bun serve:web:docker:bg and watcher-triggered deploys share a deployment-build lock at tmp/docker-web/watch/blue-green-deployment-build.lock. This lock is separate from blue-green-auto-deploy.lock: the watcher may remain alive, but only one build/deploy phase can be active across manual deploys, auto-pulls, standby refreshes, rollback pins, cached recovery, and reconcile deploys.
- If a manual deploy sees that lock or a live watcher status of building or deploying, an interactive terminal prompts before it interrupts the active deployment. Confirming stops web-blue-green-watcher, stops/resets the Compose-owned BuildKit work, clears the active build lock/status, records the interrupted entry as canceled, then starts the requested deployment alone.
- Non-interactive manual automation fails fast on an active deployment unless --cancel-active-build or DOCKER_WEB_CANCEL_ACTIVE_BUILD=1 is provided. Use that override only when it is acceptable to interrupt all BuildKit work owned by the platform deployment stack.
- Re-running bun serve:web:docker:bg:watch intentionally recreates the watcher container so it picks up local repo changes, new CLI args, and watcher-image updates in one path.
- The host log follower treats Docker’s 143 exit from an intentionally recreated watcher container as a reconnect signal, then reattaches to the replacement service instead of leaving the terminal dark.
- If the followed watcher logs explicitly request host-supervised watcher service recreation, the host wrapper force-recreates web-blue-green-watcher before reattaching. Do not rely only on Docker’s restart policy in this path: the old container can briefly report healthy while still running the stale image/runtime.
- Git fetch/pull credentials now need to be usable inside the watcher container because the watcher no longer runs directly on the host.
- Full Docker daemon or Docker Desktop crashes cannot be recovered by a watcher that is itself running inside Docker. Keep bun serve:web:docker:bg:watch running from the host, ideally under systemd, launchd, or another host process supervisor. That host command waits for the Docker daemon to respond again, reruns the watcher compose up --build --detach --force-recreate, and then resumes tailing logs. Every hosted project with its own Docker watcher needs its own host-side watch process; container restart: policies only help after Docker is already healthy again.
- The host Docker recovery loop polls docker info every 5 seconds by default. Override with DOCKER_WEB_WATCHER_DOCKER_RECOVERY_POLL_MS. By default it waits indefinitely because a host process manager is expected to own the terminal process; set DOCKER_WEB_WATCHER_DOCKER_RECOVERY_TIMEOUT_MS to a positive value to fail after a bounded recovery window.
- The watcher image lives at apps/web/docker/blue-green-watcher.Dockerfile.
- Its entrypoint wrapper relaunches the watcher in-place when scripts/watch-blue-green-deploy.js requests a self-restart after pulling a new watcher revision.
- The entrypoint is also the watcher supervisor. It restarts the child process after crashes, after the status snapshot fails to appear during startup, or after blue-green-auto-deploy.status.json becomes stale. The compose service uses restart: unless-stopped so Docker also brings the watcher back after a daemon or container failure.
- A stale status snapshot is tolerated while the snapshot already shows an active building or deploying deployment. During a long docker compose build, the watcher child is intentionally busy inside the deploy command and may not rewrite the status file until the command exits. The wrapper keeps the child alive until DOCKER_WEB_WATCHER_BUILD_TIMEOUT_MS plus a short grace window, then treats the stale snapshot as unhealthy.
- bun serve:web:docker:bg:down also stops the watcher service because it is part of the production compose stack now.
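The Git retry schedule and the per-commit failure cap described in the bullets above can be sketched as two small pure functions. This is only an illustration: the function and constant names are hypothetical, and the doubling factor is an assumption, since the text says only that the delay starts at 1 minute, grows exponentially, and stops at a 15 minute ceiling.

```javascript
// Hypothetical names; the real watcher code may be structured differently.
const FIRST_RETRY_MS = 60_000; // first retry waits 1 minute
const MAX_RETRY_MS = 15 * 60_000; // 15 minute ceiling
const MAX_FAILED_DEPLOYS_PER_COMMIT = 3; // deployment attempt cap per commit

// Delay before the next Git retry after N consecutive failures.
// Assumption: "exponential" means doubling on each consecutive failure.
function gitRetryDelayMs(consecutiveFailures) {
  if (!Number.isInteger(consecutiveFailures) || consecutiveFailures < 1) {
    return 0; // no failure recorded, so no backoff delay
  }
  const exponent = Math.min(consecutiveFailures - 1, 30); // keep the power bounded
  return Math.min(FIRST_RETRY_MS * 2 ** exponent, MAX_RETRY_MS);
}

// Whether the watcher should try to deploy a commit again.
function deployDecision(failedAttemptsForCommit) {
  return failedAttemptsForCommit >= MAX_FAILED_DEPLOYS_PER_COMMIT
    ? "retry-limited"
    : "attempt";
}

const schedule = [1, 2, 3, 4, 5, 6].map(gitRetryDelayMs);
console.log(schedule.map((ms) => `${ms / 60_000}m`).join(" -> "));
// prints "1m -> 2m -> 4m -> 8m -> 15m -> 15m"
```

Keeping both rules as pure functions makes the next-check timer shown in the dashboard easy to derive from the same value the poll loop sleeps on.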
- Shows the current active blue/green color when web-proxy is serving live traffic.
- Docker resource rows use the running containers directly as a fallback when docker compose ps cannot inspect the prod stack because of env interpolation issues, so watcher metrics can still appear on an already-live deployment.
- Docker stats are read with an explicit field format instead of Docker’s version-dependent JSON object shape, which avoids bogus 0 CPU/memory readings when the watcher is running against a different Docker release.
- The watcher parser also normalizes locale-style decimal commas from docker stats, so hosts that emit values like 0,10% or 24,0MiB no longer collapse into zeroed metrics.
- Each watcher snapshot now includes docker ps metadata for every running container visible through the host Docker socket, plus compose service health for containers in the production project. The monitoring overview uses that persisted snapshot to show service health and a full running-container inventory without mounting the Docker socket into apps/web.
- The request archive view computes route summaries, status totals, RSC counts, and error totals across the selected timeframe instead of only the visible page. The default timeframe is seven days, the API rejects unbounded or oversized windows, and operators can query at most 30 days of retained request logs at a time. The web API keeps a short in-process aggregate cache keyed by bounded timeframe plus telemetry log file stats, but the cache stores only aggregate analytics so request rows are not retained in memory between page reads.
- Drive ZIP extraction does not stream extracted file bytes back through the web app proxy. The unzip worker requests a per-entry signed upload URL from the callback route and uploads extracted files directly to trusted storage origins only, which avoids nginx body-size limits for large WebGL artifacts while keeping folder creation and auth checks in the backend callback.
- Hive is promoted with the web blue/green color: hive-blue and hive-green are routed from hive.tuturuuu.com, and hive-realtime serves /realtime with HIVE_REALTIME_TOKEN_SECRET, HIVE_REALTIME_URL, and NEXT_PUBLIC_HIVE_REALTIME_URL configured in the same production stack. Hive product data is stored in the Docker-managed hive-postgres service via HIVE_DATABASE_URL; Supabase remains the identity/session source only. The web, web-cron-runner, hive-{color}, and hive-realtime services must all receive that URL so API routes, disabled-by-default simulation cron, the editor, and the CRDT realtime service share the same Hive product database. Optional local LLM support runs behind the hive-ollama profile and is disabled unless operators enable the profile and Hive settings enable the exact gemma4 model. Production compose publishes 127.0.0.1:7814:7814 from web-proxy, not from a direct Hive container, so host-local or Cloudflare tunnel traffic to localhost:7814 always reaches the currently promoted Hive color without exposing staged migration ports on every host interface. Deploys verify that the running web-proxy container has the required loopback host bindings (7803, 7814, and 7816) and that its running image matches the resolved Compose image before reusing it; if an older proxy was created before Hive moved behind blue/green or before the nginx image pin changed, the next deploy force-recreates the proxy so the host-level Cloudflare Tunnel route can reach Hive on the expected proxy runtime. The Hive color services use the same Supabase env source as apps/web: runtime env files are shared, and production image builds mount the web_env BuildKit secret so hidden-locale auth pages can prerender with the platform Supabase URL.
Deploy coordination: scripts/docker-web/blue-green.js still scopes prod builds by changed service group, but runtime promotion happens in staged order: first the target web-{color}, then hive-{color} and hive-realtime, then refreshed support services such as backend, MarkItDown, storage-unzip-proxy, web-cron-runner, and optional Redis-backed helpers. If web-proxy or cloudflared must be bootstrapped or recreated for host-port changes, they start only after target web, Hive, and support services are healthy, so hive.tuturuuu.com is not exposed with an empty hive_app_upstream and web is not publicly switched before dependent gates finish. Promotion waits for the final proxy route check before writing active-color.
- Every service owned by docker-compose.web.prod.yml should declare a healthcheck, either directly in compose or in the image. The resources inventory treats an Up container without Docker health metadata as healthy for cross-project runtime visibility, but first-party prod services and sidecars still need explicit probes so deploy gates can fail before promotion.
- The MarkItDown sidecar needs SUPABASE_URL set to the same Docker-internal Supabase URL used by the web container. The service validates signed Storage URLs before downloading attachments, and local Docker runs may use host.docker.internal over HTTP.
- MarkItDown source changes and storage-unzip-proxy package/source changes are part of the watcher refresh globs. Keep those globs in sync with any future sidecar entrypoints so a running watcher refreshes helper containers during the next deploy handoff, not just the web app container.
- Host dependency refreshes must not rewrite bun.lock on the production host. The watcher treats a dirty lockfile as a blocking worktree change, and its automatic dependency sync uses bun install --frozen-lockfile only.
- Runtime upgrades are an explicit operator action. The watcher does not run bun upgrade; update the host Bun runtime only after reviewing the pinned version in the repository and the watcher image.
- Recoverable Git poll failures stay visible in the dashboard as a retrying watcher state instead of terminating the process, and the next-check timer reflects the active backoff delay.
- If git fetch or git pull --ff-only fails only because a Git lock already exists, the watcher inspects the lock age and removes it automatically only when it is stale (older than 2 minutes). This covers .git/index.lock, .git/packed-refs.lock, and remote-ref locks such as .git/refs/remotes/origin/staging.lock. Fresh lock files are left in place so active Git operations are not interrupted.
- Build/deploy failures also stay inside the watcher loop. The watcher records failed attempts in deployment history, clears stale pending handoff files after recovery failures, and stops retrying the same commit after the third failed deployment attempt. Once that cap is reached, it reports the retry-limited state once for that commit instead of logging the same skip on every poll.
- Normal promotions keep the long-lived web-proxy container and bound port stable, which avoids transient listener drops for upstreams such as Cloudflare Tunnel that are connected to :7803. Proxy container recreates are reserved for required host-port or image drift and happen only after the replacement web/Hive lane is healthy.
- Persists recent deployment history, including manual bun serve:web:docker:bg runs, and renders the top 3 most operationally relevant entries as stacked terminal cards that favor vertical scannability over very wide lines.
- Each deployment card now uses a stronger header with status/color badges plus grouped metric bands, so active traffic state, rollout intent, and request-rate data are easier to scan while multiple cards are stacked.
- As soon as a new commit starts rolling out, the recent deployment section shows it immediately as DEPLOYING instead of waiting for the rollout to finish.
- Each deployment block includes:
  - deploy status (ACTIVE, ENDED, or FAILED)
  - build time
  - activation/finish time
  - deployment lifetime while it served traffic
  - total requests served during that deployment window
  - average requests per minute
  - peak requests per minute
  - day: requests served on the current day for the active deployment, or the final active day for an ended deployment
  - davg: average requests per day across that deployment’s serving lifetime
  - dpeak: busiest single-day request count across that deployment’s serving lifetime
- The live blue/green summary uses the same traffic metrics as the deployment history cards, with consistent color coding for build/lifetime/traffic/age metrics so the dashboard is easier to scan quickly.
- A dedicated Docker resources row summarizes aggregate CPU, memory, and network usage across the live blue/green containers, followed by a per-container row for proxy, green, and blue when those services are running. This is sampled from docker stats --no-stream, so it stays local to the host and is appropriate for self-hosted operator monitoring.
- The infrastructure dashboard’s Docker Runtime Inventory uses the watcher snapshot as the source of truth for every running Compose container and derives total CPU and memory from those rows when present, so the summary cards stay aligned with the detailed container inventory.
- The bundled serverless-redis-http companion uses an in-container wget health check that posts ["PING"] to / with the generated SRH_TOKEN; do not use a Node-based probe for that image because it is an Erlang release image, and do not probe /ping because SRH does not expose that route.
- Production Redis compose requires UPSTASH_REDIS_REST_TOKEN and binds Redis host ports to 127.0.0.1. Do not reintroduce the platform-local-redis-token fallback in production fragments or remove the loopback host bind; direct Compose users must export a strong token before enabling the redis profile.
- After a successful host-triggered serve:web:docker:bg rollout, the Docker helper starts or resumes the containerized web-blue-green-watcher with --resume-if-running. Deploys that are already running inside the watcher skip that handoff via PLATFORM_BLUE_GREEN_WATCHER_CONTAINER=1, which avoids recursive watcher starts while still leaving a poller alive for future Git commits.
- The watcher wrapper and child process must agree on runtime files. If PLATFORM_BLUE_GREEN_WATCH_ARGS_FILE, PLATFORM_BLUE_GREEN_WATCH_RUNTIME_DIR, or PLATFORM_BLUE_GREEN_WATCH_STATUS_FILE is set, both the wrapper and child use those paths so the wrapper does not restart a healthy child for a missing status snapshot.
- Request counters now come from a persisted local proxy-log drain under tmp/docker-web/watch/blue-green-request-telemetry.*, not from one-off docker logs scrapes in the dashboard. The watcher continuously drains structured web-proxy access logs into a local ledger, so request metrics survive watcher restarts and do not require any external analytics service.
- Internal proxy health checks for /api/health and /__platform/drain-status are excluded from the request totals so the numbers reflect real served traffic more closely.
- The proxy now emits structured JSON access logs that include the upstream deployment stamp and blue/green color. That lets the watcher link requests back to the correct deployment instead of only estimating by time window.
- For each newly drained proxy request, the watcher also reads recent web-blue and web-green container stdout/stderr and stores up to 20 route console lines that fall inside the request latency window. Those captured lines are persisted on the request-log record itself so the request explorer can show request-scoped server console output after the live Docker logs have moved on.
- The watcher retains up to 10,000 deployment history entries and up to 100,000,000 recent drained request-log records on disk, bounded by a 256 MiB durable request-log byte cap by default. When the next request record would exceed the byte cap, the watcher rotates the current JSONL chunk if needed and prunes older chunks before appending, so public request URIs cannot grow the host-backed ledger without an aggregate limit. Rolling daily/weekly/monthly/yearly metric buckets plus a recent-request excerpt still feed the monitoring dashboard.
- The watcher also persists a separate latest-log ledger under tmp/docker-web/watch/blue-green-auto-deploy.logs.json, which captures the high-level poll/pull/build/deploy watcher messages with deployment stamps and commit hashes when available. The infrastructure dashboard uses that ledger for a deployment-scoped latest-log view without needing live docker logs access.
- The watcher uses the same Docker runtime env resolution as the real deploy flow, so blue/green status probes still work when the Redis profile is part of the production compose file.
- The active watcher also persists a live status snapshot under tmp/docker-web/watch/, which is what --resume-if-running uses to mirror the dashboard without taking over the PID lock.
- The infrastructure dashboard at /{ROOT_WORKSPACE_ID}/infrastructure/monitoring reads the same watcher status snapshot and renders it as a Next.js control room with rollout, request-rate, container-resource, and event-feed views.
- The monitoring dashboard now exposes paginated request and watcher-log explorers. Route filters come from normalized request paths, the raw request URI still surfaces query signatures, and ?_rsc=* requests are called out so React Server Component traffic is inspectable separately from document hits.
- Deployment-facing dashboard surfaces deduplicate successful blue/green rows for the same commit so active and standby colors do not appear as separate rollouts. Failed attempts remain separate because the retry cap and recovery debugging depend on seeing each failed build/deploy attempt. Large deployment, rollback-candidate, Docker-service, and container lists are paginated in the UI instead of rendering every retained row at once.
- Production web, web-blue, and web-green containers mount ./tmp/docker-web read-only at /app/runtime/docker-web and use PLATFORM_BLUE_GREEN_MONITORING_DIR to find the watcher snapshot. Keep that mount/env pair in sync if the runtime path changes, or the dashboard will degrade to an empty offline state even while blue/green deployments still work.
- Production web, web-blue, and web-green also mount the narrower ./tmp/docker-web/watch/control path read-write at /app/runtime/docker-web-control via PLATFORM_BLUE_GREEN_CONTROL_DIR. Keep operator command files in that control directory so the broader watcher runtime and telemetry mount can stay read-only.
- The monitoring dashboard’s “Sync Standby Now” action writes tmp/docker-web/watch/control/blue-green-instant-rollout.request.json. The watcher consumes that file on its next poll, clears it after a success, failure, or no-op, and uses it to rebuild the standby color immediately so blue and green can converge on the same commit without waiting for the stale standby window.
- The dashboard reads that pending instant-rollout request back from the watcher control directory. While the request is queued, or while the latest standby refresh is building/deploying, the sync button stays disabled and shows a queued/building status instead of allowing duplicate control files.
- The monitoring dashboard’s rollback pin action writes tmp/docker-web/watch/control/blue-green-deployment-pin.json. The watcher treats that file as authoritative: it skips normal fetch/pull work, checks out the pinned commit in detached mode, deploys it if the latest successful deployment is different, and keeps production on that commit until the pin is removed from the dashboard. Removing the pin lets the watcher check out its locked branch again and resume normal fast-forward polling.
- In-container watcher child restarts preserve the locked branch/upstream metadata even when the child is killed while Git is detached for a rollback or parent-fallback build. The replacement child can recover production from the target-only lock instead of trying to poll detached HEAD. If an older child already removed the lock, a clean detached startup falls back to the selected platform branch (production by default) before locking and polling.
- When the watcher receives a shutdown signal while it is temporarily detached, it attempts to check out the locked branch again before exiting, as long as the worktree is clean. This keeps manual operator commands such as git pull && bun serve:web:docker:bg from inheriting a detached checkout after a stopped watcher.
- The watcher image must also include both the Docker Compose and Buildx CLI
plugins (docker-cli-compose and docker-cli-buildx on Alpine) because the rollout handoff shells into docker compose ..., and capped production builds create/use the remote tuturuuu Buildx builder from inside the watcher container. The watcher reaches the Compose-owned BuildKit daemon at tcp://buildkit:1234.
- When the watcher drives Docker Desktop through /var/run/docker.sock, it must run the deploy handoff from the mirrored host-path mount, not a container-only path like /workspace. Otherwise Docker Desktop rejects bind mounts such as ./tmp/docker-web/prod/nginx.conf with “mounts denied” because /workspace/... is not a real shared host path.
- The watcher compose environment preserves PLATFORM_HOST_WORKSPACE_DIR and pins COMPOSE_PROJECT_NAME from that host checkout path unless DOCKER_WEB_COMPOSE_PROJECT_NAME is explicitly set. A canonical checkout directory named platform maps to the tuturuuu Compose project so Docker Desktop groups the stack under the product name on clean startups. During self-refresh from an already-running legacy platform Compose project, the legacy watcher starts a staged tuturuuu watcher with non-conflicting host ports, then stops only the old watcher service. The target watcher builds or recovers the tuturuuu stack before it touches the public proxy port. When the target proxy is healthy on the staged port, it stops the legacy platform proxy, recreates the tuturuuu proxy on port 7803, and verifies the internal drain-status route within 3 seconds. If that handoff exceeds 3 seconds or the target proxy health check fails, the watcher stops the target proxy, restores the legacy platform proxy, and leaves the legacy project intact for another retry. After a successful handoff, it removes the old platform Compose project with docker compose down --remove-orphans. Once the legacy project is absent, a watcher that lacks inherited Compose project env is treated as the fully migrated tuturuuu watcher and remains in the normal Git poll/build loop instead of starting another migration handoff. Do not inherit arbitrary container-scoped Compose project names from the watcher container; doing so can create duplicate service names such as nested tuturuuu-markitdown-1 containers during self-refresh.
- If Docker reports
container name ... is already in use for a requested production service, the helper removes only the exact expected container name for the current Compose project, then retries docker compose up. This handles stale names left by interrupted rollouts without pruning unrelated containers.
- Starting bun serve:web:docker:bg:watch now clears the persisted watcher status snapshot and active PID before the watcher service is force-recreated, but preserves any complete branch/upstream target metadata. A stale lock from the previous container cannot block the replacement watcher, and a detached checkout still has enough metadata to reattach to production.
- If the watcher pulls a revision that changes its own Dockerfile or baked entrypoint, it now rebuilds and recreates the web-blue-green-watcher service automatically before handing off to the next deployment cycle.
- When that container-refresh request is emitted from inside the followed watcher logs, bun serve:web:docker:bg:watch treats the log text itself as the recreate signal, then rebuilds/recreates and resumes tailing. This keeps the service from getting stuck in a Docker-restarted but operationally offline state.
- Recovery handoffs now persist a pending-deploy request under
tmp/docker-web/watch/ and reconcile it against the latest successful deployment history entry on startup. If HEAD is newer than the last successful built/deployed commit after a watcher restart or container recreate, the watcher builds the current HEAD before settling back into normal polling.
- The same reconciliation now also runs during steady-state polling: if Git is already up to date but the latest successful deployment record still points at an older commit, the watcher rebuilds/deploys the current HEAD instead of incorrectly reporting up-to-date.
- Blue/green deploys build the replacement lane before stopping or removing any existing blue/green container. A failed docker compose build must leave the currently serving lane and any warm standby untouched; only after the build succeeds may the target lane be recreated with --no-build.
- If tmp/docker-web/prod/active-color is missing or stale, the deployer and watcher recover the serving lane from the generated nginx proxy config before deciding what to rebuild. Treat the proxy config as the runtime source of truth during drift so a failed reconciliation cannot misclassify and clear a stable deployment.
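The Docker stats bullets earlier in this list mention an explicit field format and locale-style decimal commas. The following is a minimal sketch of that kind of parser, assuming the watcher requests tab-separated fields with something like docker stats --no-stream --format "{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}". The field selection, unit table, and all function names are assumptions for illustration, not the watcher's actual code.

```javascript
// Byte multipliers for the unit suffixes that docker stats commonly prints.
const UNIT_FACTORS = {
  B: 1,
  kB: 1000,
  MB: 1000 ** 2,
  GB: 1000 ** 3,
  KiB: 1024,
  MiB: 1024 ** 2,
  GiB: 1024 ** 3,
};

// Accepts "0.10" and locale-style "0,10"; returns null for unparsable input.
function parseLocaleNumber(text) {
  const value = Number.parseFloat(text.trim().replace(",", "."));
  return Number.isFinite(value) ? value : null;
}

function parsePercent(text) {
  return parseLocaleNumber(text.replace("%", ""));
}

// Converts values such as "24,0MiB" into a byte count.
function parseBytes(text) {
  const match = /^([\d.,]+)\s*([A-Za-z]+)$/.exec(text.trim());
  if (!match) return null;
  const value = parseLocaleNumber(match[1]);
  const factor = UNIT_FACTORS[match[2]];
  return value === null || factor === undefined ? null : value * factor;
}

// Parses one tab-separated row: name, CPU percent, "used / limit" memory.
function parseStatsLine(line) {
  const [name = "", cpu = "", mem = ""] = line.split("\t");
  const [used = "", limit = ""] = mem.split("/");
  return {
    name,
    cpuPercent: parsePercent(cpu),
    memoryBytes: parseBytes(used),
    memoryLimitBytes: parseBytes(limit),
  };
}

const sample = parseStatsLine("web-blue\t0,10%\t24,0MiB / 1,9GiB");
console.log(sample.cpuPercent, sample.memoryBytes); // prints 0.1 25165824
```

Returning null for unparsable fields, instead of 0, lets a dashboard distinguish "no reading" from a genuinely idle container.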
Browser-State 502 Recovery
If some normal browsers still return Cloudflare 502 Host Error, 431, or
Chrome ERR_INVALID_RESPONSE while incognito works, treat it as stale client
state or an auth-cookie/header-size problem before assuming the tunnel itself is
broken. Normal browser-state recovery should use the app route: GET is a
no-store confirmation page only, and the destructive Clear-Site-Data response
happens only after a same-origin POST.
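The GET/POST split described above can be sketched as a pure decision function. This is a hedged illustration, not the app route's real implementation: the function name, the status codes, and the same-origin check via the Origin header are assumptions, and the Clear-Site-Data directive list and the /login?browserStateReset=1 redirect are borrowed from the proxy behavior described later in this section.

```javascript
// Directive list as quoted for the proxy-level recovery response.
const CLEAR_SITE_DATA = '"cache", "cookies", "storage", "executionContexts"';

// Hypothetical decision logic for the browser-state recovery route.
function recoverBrowserState({ method, originHeader, expectedOrigin }) {
  if (method === "GET") {
    // GET never clears anything: it only renders a no-store confirmation page.
    return {
      status: 200,
      headers: { "Cache-Control": "no-store" },
      body: "confirmation-form",
    };
  }
  if (method === "POST") {
    if (originHeader !== expectedOrigin) {
      // Cross-origin or missing Origin: refuse the destructive action.
      return { status: 403, headers: { "Cache-Control": "no-store" } };
    }
    // Same-origin POST: send the destructive header and leave the page.
    return {
      status: 303,
      headers: {
        "Cache-Control": "no-store",
        "Clear-Site-Data": CLEAR_SITE_DATA,
        Location: "/login?browserStateReset=1",
      },
    };
  }
  return { status: 405, headers: { Allow: "GET, POST" } };
}

const confirmPage = recoverBrowserState({
  method: "GET",
  originHeader: undefined,
  expectedOrigin: "https://tuturuuu.com",
});
console.log(confirmPage.status, "Clear-Site-Data" in confirmPage.headers);
// prints 200 false
```

Keeping the destructive header off the GET path means link prefetchers and crawlers cannot wipe a user's session by merely loading the URL.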
Oversized request headers are different: the request may never reach Next.js.
The production web-proxy therefore gives web, Hive, and Meet 64 KB request
header headroom, maps Nginx 431/494 oversized-header failures to a local
browser-state recovery response, and the Docker web app server starts with
--max-http-header-size=65536 so ordinary Supabase auth-cookie chunking has
matching headroom after proxying.
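To reason about whether a browser's cookies still fit inside that 64 KB headroom, a rough size estimate can help during debugging. The sketch below is an approximation under stated assumptions: the helper names are hypothetical, it counts only "Name: value" lines plus CRLF, and it ignores the request line, so the real limit is reached slightly earlier than this estimate suggests.

```javascript
// 64 KB, the same value passed as --max-http-header-size=65536.
const HEADER_BUDGET_BYTES = 64 * 1024;

// Approximates the wire size of a header map in bytes.
function estimateHeaderBytes(headers) {
  let total = 0;
  for (const [name, value] of Object.entries(headers)) {
    // Each header is serialized as "Name: value" followed by CRLF.
    total += Buffer.byteLength(`${name}: ${value}\r\n`, "utf8");
  }
  return total;
}

function fitsHeaderBudget(headers, budget = HEADER_BUDGET_BYTES) {
  return estimateHeaderBytes(headers) <= budget;
}

const bloated = { host: "tuturuuu.com", cookie: "x".repeat(70_000) };
console.log(fitsHeaderBudget({ host: "tuturuuu.com" }), fitsHeaderBudget(bloated));
// prints true false
```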
How to recognize each failure mode:
- “upstream sent too big header while reading response header from upstream” means nginx response-header buffers were too small for the auth response.
- “web-green could not be resolved” or “web-blue could not be resolved” means a stale nginx worker or keepalive connection still tried to reach a color that no longer existed. The current warm-standby model is designed to avoid that.
- A browser that fails only in regular mode but works in incognito usually has stale Supabase auth cookies, stale service-worker state, or both.
- Send affected users to the recovery route on the affected origin: https://tuturuuu.com/~recover-browser-state for the main app, or https://hive.tuturuuu.com/~recover-browser-state for Hive.
- That route is public and bypasses auth/onboarding middleware. When the request can reach the app, use the confirmation form so the cookie-clearing POST explicitly expires Supabase auth cookie variants for the current host.
- If the browser is already sending too many stale cookies, the proxy catches the 431/494 before Next.js and returns Clear-Site-Data: "cache", "cookies", "storage", "executionContexts" while redirecting the browser back to /login?browserStateReset=1.
- Inspect X-Platform-Deployment-Stamp, X-Platform-Blue-Green-Primary, and X-Platform-Blue-Green-Color response headers to confirm which rollout is currently serving a request.
- If the recovery URL fixes the issue for a user, the likely root cause was stale browser state rather than an active deploy outage.
- If recovery does not help and proxy logs still show “too big header”, focus on auth redirect size or additional cookie bloat.
- This is intended for clean deployment clones on a server, not for active developer worktrees.
- If the local branch is ahead of or diverged from the tracked upstream, the watcher logs and skips the pull instead of forcing a merge or reset.
- The self-restart path only triggers when the watcher script itself changed in the fetched revision; normal app-code deploys keep the current watcher process alive.
- During that self-restart path, nginx keeps the assigned proxy port up the whole time because the replacement watcher refreshes the existing proxy container in place before it starts the new build.
- The watcher inherits the default blue/green build caps from bun serve:web:docker:bg, so the current defaults still apply during auto-deploys.
- Deployment history is watcher-managed. Manual blue/green rollouts still show up in the live runtime status if the stack is active, but they do not backfill the watcher’s last-3 deployment list unless they were performed through the watcher itself.
- Rollback pins are intended for bad latest deployments or failed reconciliation builds. Pin only a known successful deployment from the retained ledger, then remove the pin after main contains the corrective commit you want the watcher to resume deploying.
Validation And CI
docker-setup-check.yaml now validates all of the following:
- node scripts/check-docker-web.js
- node --test scripts/check-docker-web.test.js scripts/docker-web.test.js
- docker compose -f docker-compose.web.yml config
- docker compose -f docker-compose.web.yml --profile redis config
- docker compose -f docker-compose.web.yml --profile cloudflared config
- docker compose -f docker-compose.web.prod.yml config
- docker compose -f docker-compose.web.prod.yml --profile redis config
- docker compose -f docker-compose.web.prod.yml --profile cloudflared config
- docker build -f apps/backend/Dockerfile .
- docker build --target dev -f apps/web/Dockerfile .
- docker build --target runner --secret id=web_env,src=.env.local -f apps/web/Dockerfile .
For a focused run of a single root script test, call the Node test runner
directly, for example
node --test --test-name-pattern "pullTrackedBranch" scripts/watch-blue-green-deploy.test.js.
Running bun test scripts/... from the repo root invokes the package test script
and can expand into the full Turbo test suite.
When scripts/check-docker-web.js or similar root validators need to assert a
literal Dockerfile template placeholder like ${process.env.PORT || 7803},
prefer a regex or another explicitly escaped matcher instead of a plain string
literal. Biome treats raw ${...} text inside normal strings as
lint/suspicious/noTemplateCurlyInString, which can break CI even when the
runtime behavior is unchanged.
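As a minimal sketch of that advice, a validator can match the placeholder with an escaped regular expression, so no normal string literal contains the raw placeholder text. The constant and function names here are hypothetical, and the demo input is assembled by concatenation purely to keep the example itself free of the flagged pattern.

```javascript
// Escaped regex that matches the literal Dockerfile placeholder text.
// Regex literals are not string literals, so the lint rule does not apply.
const PORT_PLACEHOLDER = /\$\{process\.env\.PORT \|\| 7803\}/;

// Returns true when the Dockerfile text still contains the placeholder.
function hasPortPlaceholder(dockerfileText) {
  return PORT_PLACEHOLDER.test(dockerfileText);
}

// Demo input built by concatenation so this file has no raw placeholder string.
const demoLine = "ENV PORT=" + "$" + "{process.env.PORT || 7803}";
console.log(hasPortPlaceholder(demoLine), hasPortPlaceholder("ENV PORT=7803"));
// prints true false
```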
Operator Notes
- Do not paste docker compose config output into chat or tickets; it expands env values.
- If you need rebuild-before-restart on a server, use bun serve:web:docker:bg.
- If the latest blue/green deployment is bad, use the infrastructure monitoring dashboard to pin a previous successful deployment before debugging forward.
- If a blue/green deploy is interrupted, rerunning the same command from the intended commit is the normal recovery path.