Stop Runtime Payload Migrations in Distributed Systems
Why prodMigrations fails at scale and how to run Payload CMS schema migrations once per deployment on Kubernetes or ECS
This goes live on April 10, 2026 at 6:00 AM.

If you are deploying Payload CMS with Postgres on Kubernetes, AWS ECS, or any environment with multiple running instances, migrations should run once per deployment in a controlled step, not on every app startup. Payload supports runtime production migrations through prodMigrations, but that feature is designed for single-instance environments. On distributed infrastructure, letting every replica attempt migrations at boot creates race conditions, startup contention, and deployment fragility. This article explains the problem, the right mental model, and where prodMigrations is actually appropriate.
I ran into this question while setting up a Payload project on ECS. The prodMigrations option looked like the obvious thing to reach for, and the docs did not immediately make clear why that would be a problem at scale. After thinking through how ECS tasks start up and what happens when you have multiple of them, the issue became obvious — and the fix was simpler than I expected.
Schema Migrations Are a Deployment Step, Not a Startup Side Effect
The core confusion is treating schema migrations and application startup as the same concern. They are not.
A schema migration changes the shape of the database — adding columns, renaming fields, modifying indexes, restructuring tables. That work belongs to the deployment lifecycle. Your running application replicas should assume the database is already at the correct schema version when they start. They should not be responsible for establishing it.
Once you hold that framing, the runtime migration pattern becomes obviously wrong in distributed setups. Multiple containers or Pods starting at roughly the same time will each attempt to run migrations against the same database. Even if Payload's migration system handles some failure cases gracefully, you are now depending on startup ordering and shared database state to prevent conflicts. That is not a pattern you want in production.
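To make the race concrete, here is a minimal in-memory sketch of what a simultaneous boot allows. It is not Payload's migration code: the `Db` type, the migration names, and both boot functions are hypothetical stand-ins, and the "database" is just an object that counts how many schema changes were attempted.

```typescript
// Minimal in-memory model of the boot-time race. Nothing here is Payload API.
type Db = { applied: string[]; alterTableCalls: number };

// Step 1 of a replica's boot: read which migrations are still pending.
function readPending(db: Db, all: string[]): string[] {
  return all.filter((name) => !db.applied.includes(name));
}

// Step 2: run the pending migrations and record them.
function applyPending(db: Db, pending: string[]): void {
  for (const name of pending) {
    db.alterTableCalls += 1; // stands in for the real DDL statement
    db.applied.push(name);
  }
}

// Every replica reads before any replica writes: the interleaving a simultaneous boot allows.
function bootReplicasTogether(replicas: number, all: string[]): Db {
  const db: Db = { applied: [], alterTableCalls: 0 };
  const snapshots = Array.from({ length: replicas }, () => readPending(db, all));
  for (const pending of snapshots) applyPending(db, pending);
  return db;
}

// One runner migrates first; replicas boot afterwards and find nothing to do.
function migrateThenBoot(replicas: number, all: string[]): Db {
  const db: Db = { applied: [], alterTableCalls: 0 };
  applyPending(db, readPending(db, all));
  for (let i = 0; i < replicas; i += 1) applyPending(db, readPending(db, all));
  return db;
}

const migrations = ["001_add_slug", "002_add_index"];
const raced = bootReplicasTogether(3, migrations);
const gated = migrateThenBoot(3, migrations);
console.log(raced.alterTableCalls, gated.alterTableCalls); // 6 2
```

With three replicas and two migrations, the simultaneous boot attempts six schema changes, while the gated version attempts two. In a real database the duplicate attempts would surface as errors or lock waits rather than a counter, but the cause is the same: every replica decided what to run before any of them had finished.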
The Kubernetes and ECS Specifics
In Kubernetes, init containers run per Pod, not once globally for the cluster. If your Deployment scales to three replicas, all three get their own init sequence. A Kubernetes Job, by contrast, runs to completion — it is the right primitive for one-off work like a migration.
In ECS, the same distinction applies. An ECS service is for long-running replicated tasks. A standalone ECS task runs and stops. Migrations belong in the standalone task, not in the startup path of every service task.
The practical rule is:
One migration runner per deployment. All application replicas start only after migrations succeed.
Where that runner lives depends on your platform:
| Platform | Migration runner location |
|---|---|
| Kubernetes | Kubernetes Job before Deployment rollout |
| AWS ECS | Standalone ECS task in the deploy pipeline |
| Docker Compose | command override in a one-off service before app starts |
| CI/CD pipeline | Pre-deploy step with DB access before build or push |
| Single VM or container | prodMigrations is acceptable here |
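Whatever the platform, the rule above reduces to the same gate in the deploy pipeline. The sketch below shows that gate in isolation. `DeployPlan`, `StepResult`, and `deploy` are hypothetical names, and the callbacks are synchronous only to keep the example short; in a real pipeline they would be asynchronous calls that launch the Job or task and wait for it.

```typescript
// Hypothetical deploy gate: names and shapes are illustrative, not a real deploy API.
type StepResult = { ok: boolean; detail: string };

type DeployPlan = {
  runMigrations: () => StepResult; // e.g. launch the one-off Job or task and wait for it
  rollOutReplicas: () => StepResult; // e.g. apply the Deployment or update the service
};

// Runs the single migration step first and rolls out replicas only if it succeeded.
function deploy(plan: DeployPlan): string[] {
  const log: string[] = [];
  const migration = plan.runMigrations();
  log.push(`migrate: ${migration.detail}`);
  if (!migration.ok) {
    log.push("rollout: skipped");
    return log;
  }
  log.push(`rollout: ${plan.rollOutReplicas().detail}`);
  return log;
}

// Usage: a failing migration never reaches the rollout step.
let rollouts = 0;
const failedDeploy = deploy({
  runMigrations: () => ({ ok: false, detail: "exit code 1" }),
  rollOutReplicas: () => {
    rollouts += 1;
    return { ok: true, detail: "3 replicas updated" };
  },
});
console.log(failedDeploy); // ["migrate: exit code 1", "rollout: skipped"]
```

The point of the shape is that the rollout callback is unreachable unless the migration step reports success, so no replica can start against a half-migrated schema.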
When prodMigrations Actually Makes Sense
Payload includes prodMigrations specifically for cases where build-time database access is not available, or where a single process genuinely owns startup. It is a reasonable option when all of the following are true: you have exactly one application instance, startup-time migration is acceptable, and you are not in a multi-replica race at boot.
Environments where prodMigrations is a reasonable default:
- a single Docker container
- a single VM or long-running Node process
- a tightly controlled internal environment with one startup authority
Environments where it should not be your default:
- Kubernetes Deployments with multiple replicas
- ECS services with multiple tasks
- autoscaled containers
- blue-green or rolling deployments
- any serverless-style cold-start environment (Payload's docs explicitly flag this)
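One way to keep both modes available from a single config is to make prodMigrations opt-in per environment. The sketch below is an assumption-heavy illustration: `MIGRATE_ON_BOOT` is a hypothetical environment variable that Payload does not read, and `MigrationEntry` and `AdapterArgs` are simplified local stand-ins for the real types so the example runs on its own.

```typescript
// Simplified stand-in for a migration entry; the real type comes from Payload.
type MigrationEntry = { name: string };

// Simplified stand-in for the Postgres adapter arguments discussed in this article.
type AdapterArgs = {
  pool: { connectionString: string };
  prodMigrations?: MigrationEntry[];
};

// MIGRATE_ON_BOOT is a hypothetical opt-in flag, set only for single-instance environments.
function buildAdapterArgs(
  env: Record<string, string | undefined>,
  migrations: MigrationEntry[],
): AdapterArgs {
  const args: AdapterArgs = { pool: { connectionString: env.DATABASE_URI ?? "" } };
  if (env.MIGRATE_ON_BOOT === "true") {
    args.prodMigrations = migrations; // this process owns startup, so it may migrate
  }
  return args;
}

// Usage: the single VM opts in, the replicated service does not.
const generated: MigrationEntry[] = [{ name: "001_add_slug" }];
const singleVm = buildAdapterArgs({ DATABASE_URI: "postgres://vm", MIGRATE_ON_BOOT: "true" }, generated);
const ecsService = buildAdapterArgs({ DATABASE_URI: "postgres://ecs" }, generated);
console.log(singleVm.prodMigrations?.length, ecsService.prodMigrations);
```

In an actual Payload config, the returned object would be passed to postgresAdapter from @payloadcms/db-postgres, with process.env as the first argument and the generated migrations index as the second. Leaving the flag unset on Kubernetes and ECS keeps replicas from touching the schema at boot.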
What Payload Recommends
Payload's own guidance reflects this split cleanly. For local development, Drizzle push mode keeps your dev schema in sync while you iterate fast. Once the feature is ready, you generate migration files, commit them, and run them in shared or production environments. Payload also recommends running migrations in CI before the build when DB access is available, and notes that prodMigrations exists mainly as a fallback for when it is not.
The intended split is:
- development — fast schema iteration with push mode
- deployment — explicit migration execution in a controlled step
- runtime — serve traffic, not mutate schema
Implementing the One-Runner Pattern
The implementation varies by platform, but the principle is always the same: run payload migrate once before your replicas scale up, and gate the rollout on migration success.
For a Kubernetes Job running before a Deployment:
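# File: k8s/migration-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: payload-migrate
spec:
  template:
    spec:
      containers:
        - name: migrate
          image: your-app-image:latest
          # Adjust to however the Payload CLI is invoked in your image, e.g. npx payload migrate
          command: ["node", "dist/payload", "migrate"]
          env:
            - name: DATABASE_URI
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: uri
      # Do not restart the container on failure; let the Job fail
      restartPolicy: Never
  # No retries: a failed migration should stop the deploy
  backoffLimit: 0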
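Run this Job in your CI pipeline before applying the Deployment manifest, and have the pipeline wait for it, for example with kubectl wait --for=condition=complete job/payload-migrate and a suitable --timeout. Kubernetes does not order a Job before a Deployment on its own, so the rollout is gated only if the pipeline applies the Deployment after that wait succeeds. Because a Job's pod template is immutable, delete the previous payload-migrate Job or give each run a unique name before applying it again.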
For ECS, the equivalent is a run-task call in your deploy pipeline before updating the service:
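# File: deploy.sh
set -euo pipefail
# Start the one-off migration task and capture its ARN
TASK_ARN=$(aws ecs run-task \
  --cluster your-cluster \
  --task-definition payload-migrate \
  --launch-type FARGATE \
  --network-configuration "..." \
  --query 'tasks[0].taskArn' --output text)
# run-task returns immediately, so block until the task stops
aws ecs wait tasks-stopped --cluster your-cluster --tasks "$TASK_ARN"
# Read the container exit code and abort the deploy if the migration failed
EXIT_CODE=$(aws ecs describe-tasks --cluster your-cluster --tasks "$TASK_ARN" \
  --query 'tasks[0].containers[0].exitCode' --output text)
[ "$EXIT_CODE" = "0" ] || { echo "Migration failed (exit $EXIT_CODE)" >&2; exit 1; }
# Migration succeeded: update the service
aws ecs update-service \
  --cluster your-cluster \
  --service your-service \
  --force-new-deployment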
In both cases, the app replicas start already knowing the schema is correct. They do not race to establish it.
FAQ
Can I just use a database lock to prevent concurrent migrations?
Some migration frameworks use advisory locks to prevent concurrent runs. Payload's migration system does offer some protection, but relying on lock behavior in a startup race is operational complexity you do not need. Running migrations in a controlled one-off step is simpler and more explicit.
What if my CI environment does not have database access?
That is the exact case Payload designed prodMigrations for. On multi-replica infrastructure, though, a controlled pre-start migration step is still the better option: a Kubernetes Job, a standalone ECS task, or a one-off script that runs before replicas come online. Those run inside your own network next to the database, so the CI runner only needs permission to launch them, not a direct database connection.
Should I commit Payload migration files to the repository?
Yes. Payload migration files are ordered deploy artifacts with up and down paths. Treat them like release artifacts — generate, review, commit, and execute in a controlled step. Never skip committing them.
Does this apply to local development with Docker Compose?
In a single-developer local environment the risk is low, but the habit is still useful. You can add a migration service to your Compose file that runs and exits before the app service starts, using depends_on with condition: service_completed_successfully.
What happens if a migration fails in the pipeline?
The deploy stops. That is the right behavior. Payload migration files include down paths, so you can roll back if needed. A failed migration surfacing in the deploy pipeline is far better than a failed migration surfacing inside a running production container.
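As a rough illustration of how ordered up and down paths make a failed run recoverable, here is a generic runner sketch. It is not Payload's implementation: the `Migration` type and `migrate` function are hypothetical, the callbacks are synchronous for brevity, and the automatic revert is one possible policy rather than something Payload does for you.

```typescript
// Generic stand-in for a migration file with up and down paths.
type Migration = { name: string; up: () => void; down: () => void };

// Applies migrations in name order. On the first failure, reverts the ones
// already applied in this run, newest first, and reports which one failed.
function migrate(migrations: Migration[]): { applied: string[]; failed?: string } {
  const ordered = [...migrations].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  const done: Migration[] = [];
  for (const m of ordered) {
    try {
      m.up();
      done.push(m);
    } catch {
      for (const prev of done.reverse()) prev.down();
      return { applied: [], failed: m.name };
    }
  }
  return { applied: done.map((m) => m.name) };
}

// Usage: the third migration throws, so the first two are rolled back.
const schema: string[] = [];
const outcome = migrate([
  { name: "002_add_index", up: () => { schema.push("posts_slug_idx"); }, down: () => { schema.pop(); } },
  { name: "001_add_slug", up: () => { schema.push("posts.slug"); }, down: () => { schema.pop(); } },
  { name: "003_broken", up: () => { throw new Error("syntax error"); }, down: () => {} },
]);
console.log(outcome.failed, schema.length); // 003_broken 0
```

A pipeline step built on this idea would exit non-zero whenever `failed` is set, which is what stops the rollout.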
Conclusion
In distributed Payload deployments, migrations belong in a controlled deployment step — not in the startup path of every replica. Running payload migrate once per deployment, before replicas scale up, eliminates startup race conditions and keeps schema readiness as an explicit operational concern. Reserve prodMigrations for single-instance environments where one process genuinely owns startup. For everything else, treat migrations the same way you treat any other release artifact: controlled, sequential, and gated.
Let me know in the comments if you have questions, and subscribe for more practical Payload and Next.js guides.
Thanks, Matija