Overview

apps/web captures server-side API, cron, and infrastructure logs through the internal log drain. The drain preserves normal stdout/stderr output, stores structured events in a dedicated Postgres container, and exposes them in Infrastructure → Monitoring.
The monitoring navigation uses compact labels:
  • Overview
  • Deployments
  • Logs
  • Analytics
  • Observability
  • Cron
  • Requests
  • Resources
  • Projects
  • Stress Tests

Runtime

Docker web runs a dedicated log-drain-postgres service. It is separate from Supabase and is owned by the Docker web runtime. Compose pins the visible container name to ${COMPOSE_PROJECT_NAME:-tuturuuu}-log-drain-postgres-1 so it stays grouped with the rest of the Tuturuuu stack in Docker Desktop.
Default connection inside Docker:
postgres://platform_log_drain:platform_log_drain@log-drain-postgres:5432/platform_log_drain
Relevant environment variables:
  • PLATFORM_LOG_DRAIN_DATABASE_URL: Postgres connection string.
  • PLATFORM_LOG_DRAIN_ENABLED: set to false to disable persistence while preserving stdout/stderr.
  • PLATFORM_LOG_DRAIN_RAW_RETENTION_DAYS: raw log retention, default 30.
  • PLATFORM_LOG_DRAIN_SUMMARY_RETENTION_DAYS: request, cron, deployment, and usage retention, default 90.
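A minimal sketch of how these variables might be read. The helper readLogDrainConfig and its return shape are hypothetical and not part of the documented API; only the variable names and the defaults of 30 and 90 days come from the list above. Treating any value other than the literal string false as enabled is an assumption.

```typescript
// Hypothetical helper for illustration; not the actual log-drain implementation.
type Env = Record<string, string | undefined>;

interface LogDrainConfig {
  databaseUrl: string | undefined;
  enabled: boolean;
  rawRetentionDays: number;
  summaryRetentionDays: number;
}

// Parse a positive integer number of days, falling back to the documented default.
function parseDays(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value === undefined ? '' : value, 10);
  return !isNaN(parsed) && parsed > 0 ? parsed : fallback;
}

function readLogDrainConfig(env: Env): LogDrainConfig {
  return {
    databaseUrl: env['PLATFORM_LOG_DRAIN_DATABASE_URL'],
    // Assumption: only the literal string "false" disables persistence.
    enabled: env['PLATFORM_LOG_DRAIN_ENABLED'] !== 'false',
    rawRetentionDays: parseDays(env['PLATFORM_LOG_DRAIN_RAW_RETENTION_DAYS'], 30),
    summaryRetentionDays: parseDays(env['PLATFORM_LOG_DRAIN_SUMMARY_RETENTION_DAYS'], 90),
  };
}
```

Passing the environment as an argument rather than reading it globally keeps the sketch easy to test with plain objects.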

Logging Rules

Server-side code in apps/web API, cron, and infrastructure paths must not add raw console.* calls. Use:
import { serverLogger } from '@/lib/infrastructure/log-drain';

serverLogger.info('Processed job', { jobId });
serverLogger.error('Job failed', error);
serverLogger mirrors each call to stdout/stderr as one serialized line. Pass structured objects directly to the logger, but do not rely on Node’s default multi-line object printing; the drain and Docker log capture treat one logger call as one retained event.
For route or cron handlers, wrap execution with withRequestLogDrain(...) or withCronLogDrain(...) when the handler should attach logs to a request or cron run id.
The drain is fail-open: if Postgres is unavailable, requests and cron jobs continue normally and logs still go to stdout/stderr.
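To illustrate why one logger call maps to one retained event, here is a sketch of single-line serialization. The function serializeLogLine and its payload shape are hypothetical; the actual serverLogger output format is not described in this document.

```typescript
// Hypothetical one-line serializer; the real serverLogger format may differ.
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

function toPlainValue(value: unknown): unknown {
  // Error fields are non-enumerable, so JSON.stringify would emit {} without this step.
  if (value instanceof Error) {
    return { name: value.name, message: value.message, stack: value.stack };
  }
  return value;
}

function serializeLogLine(level: LogLevel, message: string, context?: unknown): string {
  const payload = { level, message, context: toPlainValue(context) };
  // JSON.stringify without an indent argument emits no raw newlines;
  // newlines inside string values are escaped as \n.
  return JSON.stringify(payload);
}
```

Because the output contains no raw newlines, a line-oriented collector such as Docker log capture sees each call as exactly one record. This sketch does not handle circular references, which would make JSON.stringify throw.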

Legacy Compatibility

Projects

Infrastructure → Monitoring → Projects is the log-drain-owned project registry for Docker web deployments. The built-in project is seeded as:
  • id: platform
  • repository: https://github.com/tutur3u/platform
  • default selected branch: production
  • app root: apps/web
  • environment: production
  • addons: nginx proxy locked on, log drain on, Redis on, cron on
Project definitions live in the log-drain Postgres tables infrastructure_projects and infrastructure_project_branches, not Supabase. Existing log, request, deployment, cron, and resource records default to project_id = 'platform' so the current monitoring views keep showing the deployed platform while new projects are introduced.
V1 GitHub import supports public GitHub repositories only. The UI syncs repository metadata and branches through unauthenticated GitHub REST calls by default; set GITHUB_TOKEN only when the deployment host needs a higher GitHub API rate limit. New projects use the Next.js preset, default app root to the repository root, require hostname routing through the central nginx proxy, and start with log drain plus Redis enabled. Cron is disabled by default for imported projects until an operator enables it.
Imported projects can be deleted from the Projects tab. Deletion removes the log-drain project registry entry and its branch cache, but it does not purge retained request, log, deployment, cron, or resource records. The built-in platform project is locked and cannot be deleted from the UI or API.
Changing the selected branch queues the project for deployment. For the built-in platform project, the blue/green watcher reads the selected branch from the project registry. If the selected branch differs from the current checkout and the worktree is clean, the watcher fetches and checks out that branch before continuing. If the worktree is dirty, the project is marked blocked in watcher status instead of crashing or forcing a checkout. Manual deploy actions also write deployment_status = 'queued'; the watcher must consume that queued status whether Git is already up to date or the watcher first fast-forwards to a newer commit, then advance it through building, deploying, and ready or failed.
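The branch-switch rule and the status progression described above can be sketched as two small pure functions. The names decideBranchAction, NEXT_STATUSES, and canAdvance are hypothetical; the status values come from the text. Two assumptions are made: a dirty worktree only blocks when a branch switch is actually needed, and any non-terminal status may move to failed.

```typescript
// Hypothetical model of the watcher rules; not the actual watcher code.
type DeploymentStatus = 'queued' | 'building' | 'deploying' | 'ready' | 'failed';
type BranchAction = 'continue' | 'checkout' | 'blocked';

// Decide what to do before a deploy when the registry's selected branch is known.
function decideBranchAction(selected: string, current: string, worktreeClean: boolean): BranchAction {
  if (selected === current) return 'continue';
  // Never force a checkout over local changes; report blocked instead.
  return worktreeClean ? 'checkout' : 'blocked';
}

// Watcher-driven transitions only. Re-queueing a finished deployment is an
// operator action and is intentionally not modelled here.
const NEXT_STATUSES: Record<DeploymentStatus, DeploymentStatus[]> = {
  queued: ['building', 'failed'],
  building: ['deploying', 'failed'],
  deploying: ['ready', 'failed'],
  ready: [],
  failed: [],
};

function canAdvance(from: DeploymentStatus, to: DeploymentStatus): boolean {
  return NEXT_STATUSES[from].indexOf(to) !== -1;
}
```

Keeping the transitions in a table makes it straightforward to reject out-of-order updates, for example a jump from queued directly to ready.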
Successful platform deploys must update the project row’s latest commit fields at the same time as deployment history so the Projects tab cannot remain queued with an older commit after the runtime has advanced. The project list also reconciles stale queued state for the built-in project from the blue/green runtime snapshot. If the active or latest successful deployment matches the queued commit, or was completed after the queue timestamp, the API clears queued, marks the project ready, and refreshes the latest commit fields from deployment history.
For imported projects, the same watcher polls enabled projects from the registry, keeps managed source checkouts under tmp/docker-web/projects/<projectId>/repo, and writes generated Next.js compose/runtime files under tmp/docker-web/projects/<projectId>/runtime. These services run with the shared tuturuuu Compose project name, attach to the central Docker network, and receive PLATFORM_PROJECT_ID, selected branch, log-drain, Redis, PORT, and HOSTNAME=0.0.0.0 environment wiring. Hostname routes are merged into the central nginx proxy config so nginx remains the mandatory entrypoint. Managed project hostnames must be plain DNS names; the API rejects ports, wildcards, whitespace, nginx syntax, and reserved Tuturuuu platform hostnames, and the watcher defensively drops unsafe persisted values before rendering nginx.
The revamped monitoring UI reads both the Postgres drain and the pre-drain blue/green files under tmp/docker-web. Requests and Logs include retained proxy traffic, watcher logs, and request console lines, while deployments are enriched from the watcher snapshot so commit subject, full hash, short hash, stamp, active color, and runtime lane remain visible even before the Postgres drain has a complete history.
The Logs tab serves grouped rows. Events with the same request id collapse into one expandable row; legacy standalone logs fall back to route, source, deployment, and minute bucket grouping.
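The Logs tab grouping rule can be sketched as a key function: events with a request id share one key, and everything else falls back to a composite of route, source, deployment, and minute bucket. The LogEvent shape, field names, and key format below are assumptions for illustration only.

```typescript
// Hypothetical event shape; the real log_events columns may be named differently.
interface LogEvent {
  requestId?: string;
  route?: string;
  source?: string;
  deploymentStamp?: string;
  timestamp: number; // epoch milliseconds
}

function groupKey(event: LogEvent): string {
  // Request-scoped events collapse into a single expandable row.
  if (event.requestId) return 'request:' + event.requestId;
  // Legacy standalone logs: bucket by route, source, deployment, and minute.
  const minuteBucket = Math.floor(event.timestamp / 60000);
  return [
    'legacy',
    event.route || '-',
    event.source || '-',
    event.deploymentStamp || '-',
    String(minuteBucket),
  ].join('|');
}

function groupEvents(events: LogEvent[]): Record<string, LogEvent[]> {
  const groups: Record<string, LogEvent[]> = {};
  for (const event of events) {
    const key = groupKey(event);
    const bucket = groups[key] || [];
    bucket.push(event);
    groups[key] = bucket;
  }
  return groups;
}
```

In this sketch two legacy lines from the same route and source land in the same row only if they fall inside the same calendar minute.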
Operators can filter by route, status family, source, level, request id, and deployment stamp, then expand a row to see the child event timeline, metadata, error stack, client context, and deployment context.
Legacy Docker console capture coalesces timestamped continuation lines from the same container write before persisting request console logs. This keeps object dumps from old route console output attached to one function/request row instead of turning every serialized property into a separate log row.
Deployment rows keep historical failed attempts searchable, but failed and successful attempts for the same commit are not merged into one current failed state. The global deployment failure banner and current blocked-target summary come only from the latest active deployment row; older failures remain visible inside deployment history rows and filters.
The Requests tab intentionally freezes its result window when opened. New traffic is counted in the background and offered as an explicit “show new” action, while older pages are appended automatically as the operator scrolls. This prevents live traffic from shifting the visible rows during investigation. The frozen since and until cursor bounds must be applied by the Postgres log-drain query or legacy archive reader before any ORDER BY ... LIMIT, archive page size, or aggregate row cap so request floods after the freeze cannot evict older rows still inside the operator’s investigation window.
Monitoring links preserve the selected project query parameter so Overview, Deployments, Logs, Analytics, Observability, Cron, Requests, and Resources stay scoped to the same project while an operator moves between tabs. The project scope card on each tab is the visible source of truth for the active project, branch, hostnames, commit, and addons.
Queued deployment rows only advance while the blue/green deployment watcher is live and connected to log-drain Postgres. If the Projects tab sees a queued project while the watcher is missing, offline, stale, or locked to a different branch than the built-in project, the UI queues a blue-green-watcher-recovery.request.json control request. The Docker cron runner reads that request from the shared control directory, clears stale watcher lock/status files, and recreates the web-blue-green-watcher service through Docker Compose so recovery does not depend on the stuck watcher process. The production watcher service must receive PLATFORM_LOG_DRAIN_DATABASE_URL, optional GITHUB_TOKEN, and any addon env such as Redis from Compose.
Each drained request stores the client IP address and user agent when those headers are available. Server logs emitted inside the request or cron AsyncLocalStorage context are scoped back to the same request id, so Requests can show related console/server lines next to the originating request without relying on terminal access.
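The managed hostname rules from this section (plain DNS names only; no ports, wildcards, whitespace, or nginx syntax; no reserved platform hostnames) can be sketched as a small validator. The function name isSafeManagedHostname is hypothetical, and RESERVED_HOSTNAMES holds a placeholder entry because the actual reserved list is not given in this document.

```typescript
// Assumption: placeholder value; the real reserved Tuturuuu hostnames are not listed here.
const RESERVED_HOSTNAMES = ['reserved.example.com'];

// One DNS label: 1-63 characters, letters or digits, hyphens only in the middle.
const DNS_LABEL = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;

function isSafeManagedHostname(input: string): boolean {
  const hostname = input.toLowerCase();
  if (hostname.length === 0 || hostname.length > 253) return false;
  if (RESERVED_HOSTNAMES.indexOf(hostname) !== -1) return false;
  // Every label must match the strict pattern, which implicitly rejects
  // ":" (ports), "*" (wildcards), whitespace, and nginx characters like ";" or "{".
  return hostname.split('.').every((label) => DNS_LABEL.test(label));
}
```

An allowlist pattern per label is used instead of a denylist of bad characters, so anything not explicitly permitted is rejected. The same check could run both in the API and again before nginx config is rendered, matching the defensive behavior described above.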

Cron Control

Infrastructure → Monitoring → Cron exposes the native Docker cron runner state, a global enable/disable switch, per-job enable/disable switches, manual run buttons, retained execution rows, and captured response/console output. Runtime overrides are stored in tmp/docker-web/watch/control/cron-control.json; they do not edit apps/web/cron.config.json, so Vercel cron config remains source-controlled while local Docker operations can pause individual jobs safely.
Cron snapshot and execution history reads are available to infrastructure viewers, but cron mutations are operator-only. Manual run requests and global or per-job runtime enablement changes must go through routes guarded by authorizeInfrastructureOperator so roles with only view_infrastructure cannot change scheduler behavior.
Cron log-drain wrappers must not persist unauthorized attempts. Keep withCronLogDrain configured so 401 and 403 cron responses return to callers without writing requests, cron_runs, or buffered log_events; attackers must not be able to fill observability storage with unauthenticated cron probes.
Request-archive console lenses must store and return redacted summaries only. Watcher ingestion and web-side archive normalization both redact common bearer tokens, JWTs, sensitive key/value fields, sensitive query parameters, and email addresses, then cap each attached console message at 500 characters. Do not persist raw app-container console payloads in blue-green-request-logs.
Cron jobs should show both the raw expression and a natural description such as “Every 15 minutes”, plus the previous and next scheduled run timestamps when the runtime snapshot provides them. Cron expressions are stored as runtime config, while visible daily schedule descriptions and run timestamps are rendered in the viewer’s browser timezone.
The infrastructure-sample-resources job runs every minute through /api/cron/infrastructure/sample-resources. It is the automated source for retained resource charts; do not rely on opening the Resources page to create samples.
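The console redaction rule from this section (redact bearer tokens, JWTs, sensitive fields, and email addresses, then cap at 500 characters) can be sketched as an ordered list of replacements. The patterns and the function name redactConsoleMessage are illustrative assumptions; the production patterns are not shown in this document and are likely broader.

```typescript
// Hypothetical redaction rules; only the 500-character cap is taken from the text.
const MAX_CONSOLE_MESSAGE_LENGTH = 500;

const REDACTIONS: Array<[RegExp, string]> = [
  // Bearer tokens in Authorization-style text.
  [/\bBearer\s+[A-Za-z0-9._~+\/-]+=*/gi, 'Bearer [redacted]'],
  // JWTs: three base64url segments, the first two starting with "eyJ".
  [/\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, '[redacted-jwt]'],
  // Sensitive key/value fields and query parameters (key list is an assumption).
  [/(password|secret|token|api[_-]?key)(["']?\s*[=:]\s*["']?)[^\s"'&,}]+/gi, '$1$2[redacted]'],
  // Email addresses.
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, '[redacted-email]'],
];

function redactConsoleMessage(message: string): string {
  let result = message;
  for (const [pattern, replacement] of REDACTIONS) {
    result = result.replace(pattern, replacement);
  }
  // Truncate after redaction so a secret cannot survive by straddling the cap.
  return result.slice(0, MAX_CONSOLE_MESSAGE_LENGTH);
}
```

Redacting before truncating matters: cutting first could leave a partial token that no longer matches any pattern.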

Resources

Infrastructure → Monitoring → Resources reads the Docker runtime snapshot and displays container health, image/service identity, uptime, CPU, memory, compact network ingress/egress, and aggregate service counts. This is the operator view for Docker Desktop resource pressure without opening Docker Desktop directly.
The same tab also separates Docker build consumption from the generic container inventory. Build resources are derived from BuildKit and builder containers captured by docker stats; while a watcher deployment is building or deploying, the web-blue-green-watcher container is also counted as builder process pressure because it owns the Docker build command. This matters because Docker Desktop can show an active Buildx record before the watcher captures a fresh stats sample, and the Resources tab should still show that a build process is active instead of claiming the build lane is idle. The sampled resource metrics are charted as docker.build.* usage events so operators can distinguish build pressure from runtime web, proxy, Redis, and sidecar pressure.
The internal resource sampler writes the current Docker snapshot into the log drain usage_events table at most once per minute. The UI charts that history across the supported resource windows: 1 hour, 6 hours, 12 hours, 24 hours, 3 days, and 7 days. If the log drain is disabled or unavailable, the tab still shows the live Docker snapshot and falls back to a single current sample.
Chart gaps on the Resources tab mean a retained sampler record is missing for that time bucket; they are not automatically downtime. The Resources tab shows sampling continuity for runtime and build metrics, including sampled buckets, gap buckets, latest retained sample age, and a live-snapshot marker for the current bucket. If the live snapshot is healthy but historical charts have gaps, inspect the Cron tab and the infrastructure-sample-resources job before treating the gap as runtime unavailability.
Resource pressure uses the same thresholds as the dashboard: memory under 200 MB is green, 200-500 MB is amber, 500-1024 MB is orange, and anything above 1024 MB is red. CPU under 5% is green, 5-20% is amber, 20-40% is orange, and anything above 40% is red.
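The pressure thresholds above translate directly into two classifier functions. The function names are hypothetical. The text does not say which band the exact boundary values belong to, so this sketch assumes 200 MB and 500 MB (and 5% and 20%) fall into the higher band, while exactly 1024 MB and exactly 40% stay orange because red is described as "above".

```typescript
type PressureLevel = 'green' | 'amber' | 'orange' | 'red';

// Memory thresholds in megabytes, per the documented bands.
function memoryPressure(megabytes: number): PressureLevel {
  if (megabytes < 200) return 'green';
  if (megabytes < 500) return 'amber';
  if (megabytes <= 1024) return 'orange'; // assumption: exactly 1024 MB is still orange
  return 'red';
}

// CPU thresholds in percent, per the documented bands.
function cpuPressure(percent: number): PressureLevel {
  if (percent < 5) return 'green';
  if (percent < 20) return 'amber';
  if (percent <= 40) return 'orange'; // assumption: exactly 40% is still orange
  return 'red';
}
```

For example, a container using 350 MB at 25% CPU would be amber for memory and orange for CPU under these assumptions.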

Stress Tests

Infrastructure → Monitoring → Stress Tests queues controlled native load runs for root workspace infrastructure operators. The web app only validates permissions, writes a control request, and reads status. It must not generate load inside a Next.js route handler.
Allowed targets come from PLATFORM_STRESS_TEST_TARGETS, a JSON array of objects with id, label, baseUrl, optional defaultPath, and optional description. If the variable is absent, the dashboard exposes only the local web target at http://127.0.0.1:7803. Do not accept arbitrary operator-entered URLs; add production, staging, or canary targets explicitly to this allowlist.
Run the native worker with:
node scripts/watch-stress-tests.js
The web container writes queued run and abort request files only under PLATFORM_STRESS_TEST_CONTROL_DIR or the default tmp/docker-web/watch/control/stress-tests, which should resolve to the writable blue/green control mount in production. The native worker consumes those control files, writes live run state under PLATFORM_STRESS_TEST_MONITORING_DIR or tmp/docker-web/stress-tests, and tags synthetic requests with X-Tuturuuu-Stress-Test-Run plus a stress-test user agent. Keep the runtime/monitoring tree read-only for web containers; run directories, samples, and result files are worker-owned.
Durable run summaries are stored in private Supabase tables and read only through /api/v1/infrastructure/monitoring/stress-tests. Completed runtime files are synced back to those private tables when the dashboard/API reads them, so rollout order stays fail-open if the migration or log-drain database is not available yet.
Per-run resource samples are also copied into log-drain usage_events with stress.* metrics and metadata.runId. This lets operators compare RPS, latency, CPU, memory, and network spikes against the existing Requests, Logs, Deployments, and Resources tabs for the same run window.
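The target allowlist described in this section can be sketched as a parser plus a lookup by id, so that operators select a known target instead of typing a URL. The function names, the id and label of the local fallback, and the choice to return an empty list on malformed JSON are assumptions; the field names and the fallback URL come from the text.

```typescript
interface StressTestTarget {
  id: string;
  label: string;
  baseUrl: string;
  defaultPath?: string;
  description?: string;
}

// Documented fallback URL; the id and label here are placeholders.
const LOCAL_TARGET: StressTestTarget = {
  id: 'local-web',
  label: 'Local web',
  baseUrl: 'http://127.0.0.1:7803',
};

function isTarget(value: unknown): value is StressTestTarget {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const optionalString = (field: unknown): boolean =>
    field === undefined || typeof field === 'string';
  return (
    typeof candidate['id'] === 'string' &&
    typeof candidate['label'] === 'string' &&
    typeof candidate['baseUrl'] === 'string' &&
    optionalString(candidate['defaultPath']) &&
    optionalString(candidate['description'])
  );
}

// raw is the value of PLATFORM_STRESS_TEST_TARGETS, or undefined when unset.
function parseStressTestTargets(raw: string | undefined): StressTestTarget[] {
  if (raw === undefined) return [LOCAL_TARGET];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return []; // assumption: malformed config exposes no targets (fail closed)
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isTarget);
}

// Resolve an operator's choice strictly by id; unknown ids yield undefined.
function resolveTarget(targets: StressTestTarget[], id: string): StressTestTarget | undefined {
  for (const target of targets) {
    if (target.id === id) return target;
  }
  return undefined;
}
```

Resolving by id means a request body never carries a URL, which is one way to enforce the rule against arbitrary operator-entered targets.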

Troubleshooting

If the UI is empty:
  1. Confirm log-drain-postgres is healthy with Docker Compose.
  2. Confirm PLATFORM_LOG_DRAIN_DATABASE_URL is present in the web container.
  3. Check that the route or cron handler is using serverLogger or a log-drain wrapper.
  4. Use Infrastructure → Monitoring → Logs to search by message, route, request id, level, or source.
If logs are missing only for cron jobs, verify the cron route is wrapped with withCronLogDrain(...) and the cron runner is calling the expected web origin.