If you have ever wanted to run a Next.js or Payload-style app on your own infrastructure without leaning on Vercel for hosting, you already know the gap between "I can deploy this manually" and "another developer could reproduce this from scratch." I kept running into that gap. Manual deploys work right up until you need a second environment, a second person, or a clean rollback at 11pm. So I wrote this down as the canonical version of how I actually do it, grounded in a real working deployment rather than an idealized one.
This is the full implementation guide for running an app with separate DEV, STAGING, and PRODUCTION environments, Docker Compose for the runtime, a self-hosted GitHub Actions runner, automatic staging deploys, manual production promotion, and separate .env files, databases, and object storage per environment. It is written to be reusable, but every major decision is anchored in this repo's working model, so you can see why each choice exists rather than just taking it on faith.
Reach for this when you want a deployment system that does not depend on Vercel for app hosting, that another developer can reproduce from scratch, and that stays explicit about secrets, permissions, networking, and rollback rather than hiding them inside a platform.
A quick naming note that runs through everything below: my-app is a placeholder image and repository name, prod-app is a placeholder SSH host alias from ~/.ssh/config, example.com is a placeholder domain, and placeholder filenames like example.com.conf or setup-example-letsencrypt.sh should be swapped for your project's real filenames.
Every deployment guide carries hidden assumptions, so let me make mine explicit up front. This guide assumes Ubuntu 24.04 LTS or Debian 12 on both VPS hosts, with one VPS used for build and staging and a separate VPS used for production. GitHub Actions runs on the build/staging VPS, deploys go through Docker Compose, and staging and production share the same container image shape. Production is promoted by a specific commit SHA rather than a floating main, secrets live on the servers rather than in git, and wildcard TLS uses DNS-01 rather than HTTP-01.
Your stack will probably differ in places, and that's fine. What matters is that you preserve the same control points even if the surrounding details change: one canonical secret location, one canonical runner user, one canonical deploy directory per environment, and one canonical production promotion input. Hold those four constant and the rest can flex.
The Runner Workspace Model
Before going further, it helps to be precise about where things actually live on disk, because this is a common source of confusion later. This guide assumes GitHub Actions uses the runner's default workspace, which actions/checkout populates on each run. That means the only persistent directories you need to care about are:
/srv/my-app-staging
/srv/my-app-prod
/srv/my-app-secrets
/srv/actions-runner
Notably, do not rely on /srv/my-app unless you intentionally maintain a persistent local clone for manual debugging or ad-hoc operator commands. The CI flow does not need it, and treating it as required will lead you astray.
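To confirm that layout before the first deploy, a small check can report which of the expected directories are absent. This is a minimal sketch: `missing_dirs` is a hypothetical helper name, and the split of paths between the two hosts in the usage comments follows the architecture described in this guide.

```shell
# missing_dirs is a hypothetical helper: it prints every argument that is not
# an existing directory and returns non-zero if at least one is missing.
missing_dirs() {
  rc=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      rc=1
    fi
  done
  return "$rc"
}

# On the build/staging VPS:
#   missing_dirs /srv/my-app-staging /srv/my-app-secrets /srv/actions-runner
# On the production VPS:
#   missing_dirs /srv/my-app-prod
```

Because the function only reads the filesystem, it is safe to run repeatedly, for example as the first step of a bootstrap checklist.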
Target Architecture
With those assumptions in place, here is the shape of the whole system. It is worth picturing this before touching any commands, because every later step is just filling in one of these boxes.
Developer machine
└── pushes code to GitHub
Build/Staging VPS
├── GitHub Actions workspace for CI checkouts
├── self-hosted GitHub Actions runner
├── staging deploy directory
├── staging nginx
└── Docker daemon used for builds and staging runtime
Production VPS
├── production deploy directory
├── production nginx with TLS
├── production app + worker containers
├── production database or direct DB access
└── production object storage or direct object-store access
The delivery flow connects those two machines through a deliberate two-phase rhythm: staging happens automatically on every push, and production happens only when you choose to promote a verified commit.
push to main
├── build image on self-hosted runner
├── migrate staging
├── deploy staging
└── smoke-test staging
manual workflow_dispatch with SHA
├── rebuild or reuse image for that SHA
├── verify production backup policy
├── migrate production with direct DB connection
├── stream image to production over SSH
├── restart production
└── smoke-test production
Canonical Host State Before First Deploy
Architecture diagrams are aspirational until the hosts are actually prepared, so this section lists the minimum required state on each machine. Get this right once and the first deploy stops being a guessing game.
Build/Staging VPS
This machine carries the heaviest load because it both builds images and runs staging. It must have Docker Engine, the Docker Compose plugin, git, the Node.js version required by the app, pnpm, and postgresql-client if backups or direct DB inspection run on the runner. It also needs the self-hosted GitHub Actions runner service, an SSH private key that can reach production, a known_hosts entry for production, a staging deploy directory, and a server-side secret directory. On the network side, it needs access to the staging DB and, if production migrations run from the runner, to the production direct DB port.
The recommended directory layout keeps each concern in its own place:
Production VPS
Production is leaner by design, since it should only ever run, never build. It must have Docker Engine, the Docker Compose plugin, a production deploy directory, an nginx config directory, a cert directory, and an env file for runtime. For the database it needs either production Postgres plus PgBouncer on the same host or network access to a production DB host, along with object storage credentials. Finally it needs inbound firewall rules for 80/tcp and 443/tcp.
Its directory layout mirrors that single-purpose intent:
Now that you know the target state, the next job is bringing each server up to it. This starts with users and permissions, because every later command inherits whatever identity you set up here.
Create users
The reusable recommendation is a dedicated deploy user on both the build/staging VPS and the production VPS. For transparency, the Farmica repo this guide is based on currently runs build/staging under the VPS login user and deploys production to root, but a cleaner setup prefers a dedicated deploy user. On the build/staging VPS:
This step is only required if migrations run from source on the runner, which this repo does. Install the Node.js version your app requires, deriving it from .nvmrc, the engines field in package.json, or your CI or Dockerfile conventions. This repo specifically requires Node.js 24:
Install Postgres client tools on the build/staging VPS
Closely related, if you create backups or run direct DB checks from the runner, you need the Postgres client tools and a backup directory the runner can write to:
Finally, lock down the network surface. The build/staging VPS should allow 22/tcp, plus 80/tcp and 443/tcp if staging is public. Production should allow 22/tcp, 80/tcp, and 443/tcp. With ufw:
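The rules above could be sketched as follows. The `run` wrapper and the `APPLY` variable are assumptions added for illustration: by default the script only prints the ufw commands, so you can review them before applying anything as root.

```shell
# Dry-run by default: prints each command instead of executing it.
# Set APPLY=1 (and run as root) to apply the rules for real.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# configure_firewall ROLE [PUBLIC]
# ROLE is "staging" or "production"; PUBLIC is "yes" when staging is public.
configure_firewall() {
  role=$1
  public=${2:-yes}
  run ufw allow 22/tcp
  if [ "$role" = "production" ] || [ "$public" = "yes" ]; then
    run ufw allow 80/tcp
    run ufw allow 443/tcp
  fi
  run ufw --force enable
}

# Review first, then apply:
#   configure_firewall production
#   APPLY=1 configure_firewall production
```

Opening 22/tcp before enabling ufw matters when you are connected over SSH, since enabling the firewall without that rule can lock you out.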
Servers are useless without names pointing at them, so DNS comes next. The recommended split gives each environment its own domain space: dev on *.dev.example.com, staging on *.staging.example.com, and production on *.example.com. That translates to records like:
example.com A <PRODUCTION_VPS_IP>
*.staging.example.com A <STAGING_VPS_IP>
staging.example.com A <STAGING_VPS_IP>
*.example.com A <PRODUCTION_VPS_IP>
If you need wildcard TLS for production, use DNS-01 and keep the production records DNS only, not proxied, during initial validation. For reference, the Farmica repo uses *.farmica.online for staging and *.farmica.si for production, with the apex farmica.si intentionally split for unrelated reasons. That apex split is not something this architecture requires, so don't feel obliged to copy it.
Provision Databases and Object Storage
It is tempting to treat data stores as something you wire up later, but that is exactly how staging ends up writing into production. Treat this as a mandatory first-deploy step.
Databases
Create a separate database and credentials per environment so the three never touch each other:
my_app_dev
my_app_staging
my_app_prod
For production specifically, the recommended shape splits pooled and unpooled access: DATABASE_URL points to PgBouncer or another pooled endpoint, DATABASE_URL_UNPOOLED points to direct Postgres, the app runtime uses the pooled connection, and migrations use the unpooled one. For staging and dev, direct Postgres is usually enough, though you should still keep separate DB names and credentials.
Object storage
Apply the same isolation to buckets, with separate buckets and credentials per environment:
The non-negotiable rule here is that staging and dev must never write into the production bucket. If you are using S3-compatible storage such as Garage or MinIO, create a separate bucket, access key, and secret key for each environment.
Canonical Secret Locations
This is the single most important clarity rule in the entire guide, so it gets its own section: never copy .env files from the git checkout. Secrets belong on the server, in one canonical place. On the build/staging VPS, those canonical locations are:
The meaning behind these is straightforward: .env.staging and .env.production are full runtime env files, while .env.staging.build and .env.production.build are optional pre-trimmed build env files. If you don't maintain separate build env files, you generate them from the full env files before each build. This repo uses that second pattern:
That script extracts only what the build actually needs: PAYLOAD_SECRET, DATABASE_URL, NEXT_PUBLIC_VAPID_PUBLIC_KEY, and optional Sentry build vars. The working implementation lives at prepare-narocilnica-build-env.sh.
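To make the idea concrete, here is a minimal sketch of such an extraction. It is not the contents of prepare-narocilnica-build-env.sh; `extract_build_env` is a hypothetical name, and matching Sentry variables by a `SENTRY_` prefix is an assumption. It also assumes plain single-line `KEY=value` entries without an `export` prefix.

```shell
# extract_build_env FULL_ENV OUT_FILE
# Copies only allowlisted KEY=value lines from the full env file into a
# build env file readable by the current user only.
# Returns non-zero if no allowlisted key was found.
extract_build_env() {
  src=$1
  out=$2
  (
    umask 077
    # The trailing "=" keeps DATABASE_URL from matching DATABASE_URL_UNPOOLED.
    # SENTRY_* as the shape of the Sentry build vars is an assumption.
    grep -E '^(PAYLOAD_SECRET|DATABASE_URL|NEXT_PUBLIC_VAPID_PUBLIC_KEY|SENTRY_[A-Z0-9_]+)=' "$src" > "$out"
  )
}

# Usage:
#   extract_build_env /srv/my-app-secrets/.env.production /tmp/my-app-build.env
```

An allowlist is the safer direction here: a newly added runtime secret stays out of the build env unless someone deliberately adds it to the pattern.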
Environment Files
To make those canonical locations concrete, here is what the files themselves look like in sanitized form. Notice how staging and production are structurally identical but never share a single value.
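One way to enforce the "never share a value" rule is a small comparison between two env files. The sketch below is an illustration only: `env_value` and `assert_isolated` are hypothetical helpers, the caller decides which keys to compare, and it assumes simple single-line `KEY=value` entries.

```shell
# env_value FILE KEY: prints the value of the first KEY=... line in FILE.
env_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# assert_isolated FILE_A FILE_B KEY...
# Fails if any listed key has the same non-empty value in both files.
assert_isolated() {
  a=$1
  b=$2
  shift 2
  rc=0
  for key in "$@"; do
    va=$(env_value "$a" "$key")
    vb=$(env_value "$b" "$key")
    if [ -n "$va" ] && [ "$va" = "$vb" ]; then
      echo "shared value for $key" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage on the build/staging VPS:
#   assert_isolated /srv/my-app-secrets/.env.staging \
#     /srv/my-app-secrets/.env.production \
#     DATABASE_URL DATABASE_URL_UNPOOLED PAYLOAD_SECRET
```

The check prints only key names, never values, so its output is safe to leave in CI logs.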
With secrets sorted, the next link to forge is the one between the two machines, because production deploys happen by the runner reaching across to production over SSH. The runner user must be able to do this non-interactively.
Generate or place the key on the build/staging VPS
To make the later workflow readable, give production a short alias:
Host prod-app
  HostName <PRODUCTION_IP>
  User deploy
  IdentityFile ~/.ssh/id_ed25519
  IdentitiesOnly yes
Then confirm the whole chain works:
ssh prod-app 'docker ps'
One security note worth flagging: the current Farmica workflow uses StrictHostKeyChecking=no, but for a new setup you should prefer known_hosts plus normal host verification.
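A minimal sketch of the known_hosts approach: pin the production host key once, then let SSH verify it on every deploy. `add_known_host` is a hypothetical helper that appends only lines not already present, so re-running the bootstrap does not duplicate entries. Comparing the scanned fingerprint with the one shown on the production server's console is left to you.

```shell
# add_known_host FILE: reads host key lines on stdin and appends the ones
# that FILE does not already contain.
add_known_host() {
  file=$1
  mkdir -p "$(dirname "$file")"
  touch "$file"
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
  done
}

# Usage on the build/staging VPS, as the runner user:
#   ssh-keyscan -t ed25519 <PRODUCTION_IP> | add_known_host ~/.ssh/known_hosts
#   ssh -o StrictHostKeyChecking=yes prod-app 'docker ps'
```

The scan is run without hashing (`-H`) on purpose: hashed entries use a random salt, so the duplicate check would never match them.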
Install the Self-Hosted GitHub Runner
Now that the runner host can talk to production, give it the actual CI brain. Run this on the build/staging VPS as the runner user:
For this to be useful, the runner needs a specific set of capabilities: it must be able to run docker build and docker compose, read /srv/my-app-secrets/*, SSH to production, reach the staging DB, and reach the production direct DB port if migrations run from the runner. Verify all of that before moving on:
docker ps
docker compose version
ssh prod-app 'docker ps'
sudo systemctl status actions.runner.*
If you want a persistent manual checkout for debugging, create it separately. The workflow examples in this guide do not depend on it.
Dockerfile and Build-Time Environment Injection
A subtle but important decision is how secrets reach the build without leaking into the image. You must choose one build-time pattern and document it end to end. This repo uses BuildKit secrets for the build env file, which keeps the secret out of the final image layers.
Build env generation
First, generate the trimmed build env from the full production env:
Inside the Dockerfile, the secret is sourced only for the duration of the build step that needs it:
RUN --mount=type=secret,id=env,required=true \
    set -a && . /run/secrets/env && set +a && \
    pnpm run build
The working implementation is at Dockerfile. One practical detail about healthchecks: this repo's Dockerfile installs curl in the image precisely so that curl-based healthchecks work out of the box. If your image does not contain curl, either add it or use a Node-based healthcheck.
Compose Templates
The build produces an image, but Compose is what actually runs it. Keep these deploy templates in git and sync them on every deploy so the deploy directories stay reproducible rather than drifting into hand-edited snowflakes.
Staging compose
The defining rule for staging is that it must not expose the app on 0.0.0.0; it binds to localhost and sits behind nginx. Alongside the app, it runs dedicated workers for media and inventory queues:
Production builds on the same app-plus-workers core but adds the public-facing layer: nginx, cert mounts, and nginx config mounts, with an optional observability profile. The canonical shape:
That nginx service in the production compose needs a config and a certificate, which brings us to TLS. As with the build pattern, the rule is to choose one strategy and document it completely. This guide uses nginx inside the production Compose stack, a Let's Encrypt wildcard certificate, and DNS-01 validation.
Production nginx config
The config does two jobs: it serves the ACME challenge and redirects HTTP to HTTPS on port 80, then terminates TLS and proxies to the app on 443:
The reason for DNS-01 is simple: Let's Encrypt issues wildcard certificates such as *.example.com only through DNS-01 validation. HTTP-01 cannot validate a wildcard name.
Certbot command
Install certbot and the DNS plugin first, and drop the Cloudflare credentials in a locked-down file:
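As a minimal sketch of that step, the helper below writes the credentials file with owner-only permissions. `write_cloudflare_credentials` and the `CLOUDFLARE_API_TOKEN` variable are hypothetical names; the `dns_cloudflare_api_token` key is the setting read by the certbot dns-cloudflare plugin, and the package names in the usage comments are the ones used on Ubuntu and Debian.

```shell
# write_cloudflare_credentials FILE TOKEN
# Writes the INI file read by the certbot dns-cloudflare plugin and
# restricts it to the owning user (mode 600).
write_cloudflare_credentials() {
  file=$1
  token=$2
  (
    umask 077
    mkdir -p "$(dirname "$file")"
    printf 'dns_cloudflare_api_token = %s\n' "$token" > "$file"
  )
  chmod 600 "$file"
}

# Usage on the production VPS as root:
#   apt-get install -y certbot python3-certbot-dns-cloudflare
#   write_cloudflare_credentials /root/.secrets/cloudflare.ini "$CLOUDFLARE_API_TOKEN"
```

A token scoped to DNS edits for the single zone is enough for DNS-01, which limits the damage if the file ever leaks.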
If you use a non-root deploy user, place the credentials file under that user's home and update the command accordingly. The canonical issuance command:
Staging should also sit behind nginx, either via a separate host-level nginx or a separate proxy Compose project. The critical rule, repeated because it matters, is that the staging app itself binds only to 127.0.0.1.
nginx config sync policy
Finally, make one explicit choice about how nginx config is managed: either manage it in CI or treat it as a one-time manual bootstrap. This guide recommends managing nginx config in CI so the production deploy directory stays reproducible. In the workflow examples below, deployment-templates/nginx/example.com.conf and deployment-templates/scripts/setup-example-letsencrypt.sh are placeholders for your real files.
GitHub Actions Workflow
Everything so far has been preparation; the workflow is where it all comes together. The most important thing it must define is one canonical production promotion mechanism, so there is never ambiguity about what is going live.
Required triggers
The workflow responds to two events: an automatic push to main, and a manual dispatch that takes an exact SHA to promote:
The reasoning behind the SHA input is worth internalizing: staging verifies one exact commit, so production must promote that same commit. Checking out a floating main at click time is not deterministic, and that non-determinism is precisely what bites you during an incident.
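Since the promotion input is typed by a human, it is worth rejecting anything that is not an exact commit before the build starts. A minimal sketch, assuming a SHA-1 repository (40 hex characters) and a hypothetical helper name `is_full_sha`:

```shell
# is_full_sha VALUE: succeeds only for a 40-character lowercase hex string,
# which rules out branch names, tags, and abbreviated SHAs.
is_full_sha() {
  [ "${#1}" -eq 40 ] || return 1
  case $1 in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}

# Usage as an early workflow step, with SHA set from the dispatch input:
#   is_full_sha "$SHA" || { echo "sha must be a full commit SHA" >&2; exit 1; }
#   git cat-file -e "$SHA^{commit}"   # also confirm the commit exists
```

Refusing short SHAs and branch names keeps the promotion deterministic: the same input always resolves to the same commit.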
Canonical workflow shape
The full workflow splits into three jobs. The first two react to a push, building and migrating staging then deploying and smoke-testing it. The third reacts only to a manual dispatch and handles the more careful production path, including a backup before migration and streaming the image over SSH:
jobs:
  build:
    runs-on: self-hosted
    if: github.event_name == 'push'
    steps:
      - uses: actions/checkout@v6
      - name: Prepare build env
        run: |
          bash deployment-templates/prepare-narocilnica-build-env.sh \
            /srv/my-app-secrets/.env.staging \
            /tmp/my-app-build.env
      - name: Build image
        run: |
          docker build \
            --secret id=env,src=/tmp/my-app-build.env \
            -t my-app:${{ github.sha }} .
      - name: Install deps for migrate
        run: pnpm install --frozen-lockfile
      - name: Migrate staging
        run: |
          set -a && source /srv/my-app-secrets/.env.staging && set +a
          pnpm payload migrate

  deploy-staging:
    runs-on: self-hosted
    needs: build
    if: github.event_name == 'push'
    steps:
      - uses: actions/checkout@v6
      - name: Sync runtime env
        run: cp /srv/my-app-secrets/.env.staging /srv/my-app-staging/.env.staging
      - name: Sync compose template
        run: cp deployment-templates/docker-compose.staging.yml /srv/my-app-staging/docker-compose.yml
      - name: Deploy staging
        run: |
          cd /srv/my-app-staging
          export IMAGE=my-app:${{ github.sha }}
          docker compose up -d --remove-orphans --no-build
      - name: Smoke test staging
        run: |
          cd /srv/my-app-staging
          docker compose exec -T app curl -fsS http://localhost:65434/

  deploy-production:
    runs-on: self-hosted
    if: github.event_name == 'workflow_dispatch'
    steps:
      - uses: actions/checkout@v6
        with:
          ref: ${{ inputs.sha }}
      - name: Build env
        run: |
          bash deployment-templates/prepare-narocilnica-build-env.sh \
            /srv/my-app-secrets/.env.production \
            /tmp/my-app-build.env
      - name: Build or reuse image
        run: |
          TAG="my-app:${{ inputs.sha }}"
          if [ "${{ inputs.force_rebuild }}" = "true" ] || ! docker image inspect "$TAG" >/dev/null 2>&1; then
            docker build --secret id=env,src=/tmp/my-app-build.env -t "$TAG" .
          fi
      - name: Install deps for migrate
        run: pnpm install --frozen-lockfile
      - name: Create production backup before migrate
        run: |
          set -a && source /srv/my-app-secrets/.env.production && set +a
          mkdir -p /srv/backups
          pg_dump "$DATABASE_URL_UNPOOLED" > /srv/backups/my-app-prod-${{ inputs.sha }}.sql
      - name: Migrate production
        run: |
          set -a && source /srv/my-app-secrets/.env.production && set +a
          DATABASE_URL="$DATABASE_URL_UNPOOLED" pnpm payload migrate
      - name: Sync runtime env
        run: scp /srv/my-app-secrets/.env.production prod-app:/srv/my-app-prod/.env.production
      - name: Sync compose
        run: scp deployment-templates/docker-compose.production.yml prod-app:/srv/my-app-prod/docker-compose.yml
      - name: Sync nginx config and scripts
        run: |
          ssh prod-app 'mkdir -p /srv/my-app-prod/nginx /srv/my-app-prod/scripts /srv/my-app-prod/certbot/www /srv/my-app-prod/certs/example.com'
          scp deployment-templates/nginx/example.com.conf prod-app:/srv/my-app-prod/nginx/default.conf
          scp deployment-templates/scripts/setup-example-letsencrypt.sh prod-app:/srv/my-app-prod/scripts/setup-example-letsencrypt.sh
          ssh prod-app 'chmod +x /srv/my-app-prod/scripts/setup-example-letsencrypt.sh'
      - name: Stream image
        run: docker save my-app:${{ inputs.sha }} | gzip | ssh prod-app 'gzip -d | docker load'
      - name: Restart production
        run: |
          ssh prod-app '
            cd /srv/my-app-prod
            export IMAGE=my-app:${{ inputs.sha }}
            docker compose --env-file .env.production up -d --remove-orphans
          '
For honesty about the current state: the live Farmica workflow still dispatches against current main, but for a new canonical setup you should prefer the SHA input shown above. The working implementation reference is deploy.yml.
Migration Policy
Migrations are where a deploy stops being reversible by a simple image swap, so the policy around them has to be explicit rather than assumed.
The canonical rule
For this repo's shape, the order is: build the image first, install dependencies on the runner, run migrations from source on the runner, and fail the deploy before restart if migration fails. The reason migrations run from source rather than from the image is that the shipped image does not contain runnable TS migration files.
The production backup rule
Before any production migration, you either verify that a restorable backup already exists or you create one. A direct backup looks like:
If you do not want to take a fresh dump on every deploy, at minimum enforce a check that the scheduled backup succeeded recently. Either way, this is why the runner needs postgresql-client and write access to /srv/backups or another backup path.
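The "scheduled backup succeeded recently" check could be sketched like this. `has_recent_backup` is a hypothetical helper; it assumes backups are plain `.sql` dumps in a single directory, as in the workflow above, and it relies on the `-maxdepth` and `-mmin` options available in GNU and BSD find.

```shell
# has_recent_backup DIR MAX_AGE_MINUTES
# Succeeds if DIR holds at least one non-empty *.sql file modified within
# the last MAX_AGE_MINUTES minutes.
has_recent_backup() {
  dir=$1
  max_age=$2
  [ -d "$dir" ] || return 1
  found=$(find "$dir" -maxdepth 1 -type f -name '*.sql' -size +0 -mmin "-$max_age" | head -n 1)
  [ -n "$found" ]
}

# Usage before the migrate step (24 hours = 1440 minutes):
#   has_recent_backup /srv/backups 1440 || { echo "no recent backup" >&2; exit 1; }
```

A non-empty recent file is only a weak signal that the dump is restorable, so a periodic test restore is still worth scheduling.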
Direct DB connection
Migrations always use the unpooled connection, overriding DATABASE_URL for the duration of the command:
set -a && source /srv/my-app-secrets/.env.production && set +a
DATABASE_URL="$DATABASE_URL_UNPOOLED" pnpm payload migrate
If migration succeeds but restart fails
This is the dangerous in-between state, and it is no longer a pure app rollback. You now have a new schema running against an old or failed application process. Your options are to fix and redeploy a compatible image, or to restore the DB backup if the migration is incompatible and rollback is required.
Irreversible migrations
Some migrations cannot be undone by an app rollback at all, so document them before merge. Dropped columns, renamed tables without a compatibility layer, and destructive data transforms all fall in this category. App rollback alone does not undo them. A working helper reference is docker-run-payload.sh.
First Staging Deploy
With the policy understood, you are ready for the first real deploy. Thanks to the default runner workspace model, you do not need a manual persistent clone under /srv/my-app for CI to work. Before pushing, verify that DNS points to the staging VPS, the staging DB exists, the staging S3 bucket exists, /srv/my-app-secrets/.env.staging exists, the runner service is online, the runner can run docker ps, and staging nginx is already configured.
Then push a test commit to main and confirm the whole chain: the image built on the runner, migrations succeeded, /srv/my-app-staging/docker-compose.yml and /srv/my-app-staging/.env.staging both exist, docker compose ps shows the app and workers healthy, and https://demo.staging.example.com/ returns 200.
First Production Deploy
Once staging is proven, production follows the same spirit with more guardrails. Before the first production deploy, verify that DNS points to the production VPS, the production DB and bucket exist, /srv/my-app-secrets/.env.production exists on the runner host, /srv/my-app-prod/ exists on the production host, the runner can SSH to production, production can run docker compose, the production nginx config is synced, the TLS certificate exists or the issuance script is ready, and the production firewall allows 80 and 443.
If you are using DNS-01 wildcard TLS, issue the certificate before the first public cutover:
ssh prod-app
cd /srv/my-app-prod
set -a && source .env.production && set +a
certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /root/.secrets/cloudflare.ini \
  --dns-cloudflare-propagation-seconds 60 \
  -d "*.example.com" \
  --email ops@example.com \
  --agree-tos \
  --non-interactive \
  --keep-until-expiring
Then run the workflow manually with the exact staging-verified SHA, and afterward verify the result:
Those final checks deserve their own section, because internal localhost checks are necessary but not sufficient. A container can answer on localhost while the public route is broken, so you test both layers.
For a tenant-routed app, check at least one real tenant hostname, not only the base domain, since base-domain success can mask broken tenant routing.
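The public layer of that check can be sketched as a loop over URLs that stops at the first failure. `smoke_test` is a hypothetical helper, and `demo.example.com` in the usage comment is a placeholder tenant hostname, not one taken from the Farmica setup.

```shell
# smoke_test URL...
# Requests each URL in order and stops at the first one that fails.
# curl -f treats HTTP status codes of 400 and above as failures.
smoke_test() {
  for url in "$@"; do
    if curl -fsS --max-time 10 -o /dev/null "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url" >&2
      return 1
    fi
  done
}

# Usage after a production deploy: base domain plus one real tenant host.
#   smoke_test https://example.com/ https://demo.example.com/
```

Running it from a machine outside the production VPS exercises DNS, TLS, and nginx routing together, which the in-container localhost check cannot do.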
Rollback Policy
No matter how careful the deploy is, you eventually need to undo one, and not all rollbacks are equal. It helps to split them into three classes so you reach for the right tool under pressure.
App rollback
Use this when the image or config is bad but the schema is still compatible. It is the cheap, fast case:
ssh prod-app '
  cd /srv/my-app-prod
  export IMAGE=my-app:<previous-good-sha>
  docker compose --env-file .env.production up -d --remove-orphans
'
Schema rollback
Use this when a migration changed the schema incompatibly and the previous image cannot run against it. Here you restore a backup, or run a documented down-migration if your project supports one.
Data rollback
Use this when background jobs changed data format, object storage writes changed structure, or partial job execution created inconsistent state. The procedure is to stop workers if needed, then restore data from backup or run a manual repair plan.
The rule that ties all three together, and the one most worth remembering: image rollback is not data rollback.
Image Retention and Cleanup
Because images are built locally and streamed over SSH, both hosts accumulate tags over time, and left unchecked that quietly fills disks. A minimum policy is to keep the current production image, keep the previous known-good image, and prune unused images regularly.
The safer cleanup is more deliberate: list all my-app:* tags, confirm which are still referenced by running containers, and remove only the unreferenced ones. This repo's production workflow already applies that safer pattern after each deploy.
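A minimal sketch of that safer pattern, not a copy of the repo's workflow step: `unreferenced_tags` is a hypothetical helper that filters a list of image tags against a keep-list. The docker commands in the usage comments are standard CLI calls, and `xargs -r` assumes GNU xargs.

```shell
# unreferenced_tags KEEP_FILE: reads candidate image tags on stdin and prints
# those that are not listed (one per line) in KEEP_FILE.
unreferenced_tags() {
  keep=$1
  while IFS= read -r tag; do
    [ -n "$tag" ] || continue
    grep -Fxq -- "$tag" "$keep" || printf '%s\n' "$tag"
  done
}

# Usage on either host:
#   docker ps --format '{{.Image}}' | sort -u > /tmp/keep.txt
#   echo "my-app:<previous-good-sha>" >> /tmp/keep.txt
#   docker image ls my-app --format '{{.Repository}}:{{.Tag}}' \
#     | unreferenced_tags /tmp/keep.txt \
#     | xargs -r docker image rm
```

The previous known-good tag has to be added to the keep-list by hand, because no running container references it and it would otherwise be removed.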
Troubleshooting
Even a well-built pipeline fails sometimes, so here are the failure modes I actually hit and where to look first.
When staging is unreachable, check DNS, staging nginx, that the app bind address is 127.0.0.1:<port>, docker compose ps, and the container healthcheck.
When production restarted but serves the old version, check that the workflow promoted the intended SHA, that the image was loaded on production, that IMAGE in the shell matches the target tag, and that docker compose up -d --remove-orphans actually ran.
When a production migration fails, check DATABASE_URL_UNPOOLED, DB privileges, the network path from the runner to direct Postgres, and recent backup availability.
When a curl healthcheck fails in the container, check whether curl is installed in the image and whether the app listens on the expected internal port.
And when TLS fails in the browser, check that DNS points at the production VPS, that nginx has the expected cert paths mounted, that the wildcard cert actually covers the hostname, and that the cert was renewed and nginx reloaded.
Farmica Working Implementation Map
Since this whole guide is grounded in a real deployment rather than a hypothetical one, here is the concrete mapping between the reusable placeholders and the actual Farmica repo that proves the pattern:
For completeness, the live repo differs from the reusable recommendation in three ways: production is currently accessed as root, production dispatch currently uses current main instead of a required SHA input, and the live workflow still uses StrictHostKeyChecking=no. For a new setup, prefer the stricter canonical pattern described throughout this guide.
Conclusion
The problem this guide set out to solve was the one that quietly blocks most self-hosting efforts: it is easy to deploy an app by hand, and hard to build a deployment system that another developer can reproduce, that keeps environments truly separate, and that gives you a clean answer when something breaks at the worst possible moment. The approach here solves that by being explicit about the things platforms usually hide from you, with one canonical secret location, one runner user, one deploy directory per environment, and one deterministic production promotion driven by a verified commit SHA.
Walking through it, you have set up two VPS hosts, isolated databases and object storage per environment, wired a self-hosted runner that builds images and streams them to production over SSH, terminated wildcard TLS with DNS-01, and given yourself a layered rollback story that distinguishes app, schema, and data. The result is a system you own end to end, where every moving part is visible and reproducible rather than abstracted away.
Let me know in the comments if you have questions, and subscribe for more practical development guides.
Thanks,
Matija
I'm Matija Žiberna, a self-taught full-stack developer and co-founder passionate about building products, writing clean code, and figuring out how to turn ideas into businesses. I write about web development with Next.js, lessons from entrepreneurship, and the journey of learning by doing. My goal is to provide value through code—whether it's through tools, content, or real-world software.