PostgreSQL Replicas Explained: Patroni, Swarm & Kubernetes
A practical, concise guide to PostgreSQL replication and HA decisions using Patroni, Docker Swarm, Kubernetes, and…

PostgreSQL has built-in support for database replication. One server takes writes, others follow it and stay in sync. That mechanism is entirely native to PostgreSQL itself — it has nothing to do with Docker, Kubernetes, or any orchestration tool. Every other technology in this guide operates on top of that foundation. They decide where PostgreSQL runs, how it is supervised, and how failover is handled when a server goes down. If you have ever felt confused about why Patroni, Docker Swarm, and Kubernetes all seem to do something "similar" around PostgreSQL — this guide untangles that completely.
I was confused about this for longer than I'd like to admit
A few years ago I was setting up a PostgreSQL cluster for a production project. I kept running into these overlapping pieces — Patroni managing failover, Kubernetes running the pods, Docker Swarm mentioned in some blog posts, replication slots in the PostgreSQL docs. Everything seemed vaguely related and I kept asking myself the same question: who is actually doing the replication here?
The answer, once I found it, was clarifying. PostgreSQL is always doing the replication. Everything else is playing a different role. This guide is the explanation I wish I had found first.
Start here: what a PostgreSQL replica actually is
Before anything else, you need a clear picture of what PostgreSQL replication looks like at its most basic level.
PostgreSQL operates on a primary-standby model. You have one server — the primary — that accepts read and write queries. You have one or more standby servers that follow the primary and receive a continuous stream of data changes. Standbys can be configured as warm standbys (they follow the primary but reject all queries) or hot standbys (they follow the primary and accept read-only queries). Either way, the standby is always a few moments behind the primary, replaying the same sequence of changes.
A helpful analogy: imagine a shared Google Doc where one person is typing and a second person has the document open in read-only mode. The second person's screen updates as the first person types. The primary is the person typing. The standby is the read-only viewer, always catching up. Except in PostgreSQL, the "viewer" is also recording every single change to disk in real time.
That standby server is your replica. If the primary dies, you can promote the standby to become the new primary — and it picks up from exactly where the original left off.
How PostgreSQL keeps replicas in sync: the WAL
PostgreSQL keeps everything consistent through something called the write-ahead log, or WAL.
WAL is a fundamental concept in databases. Before PostgreSQL writes any change to the actual data files, it records that change in a sequential log first. This is why it is called "write-ahead" — you log the intention before you act on it. If the server crashes mid-write, PostgreSQL can replay the log on startup and end up in a consistent state.
Replication works by sending that WAL stream to standby servers in real time. The primary generates WAL records as writes come in, and standby servers consume those records and apply the same changes to their own copy of the data. This is called streaming replication, and it is the default mechanism you will use in any modern PostgreSQL HA setup.
Think of WAL like a detailed activity log for every database change — the kind a meticulous accountant might keep. The primary writes every entry to the log as it happens. Each standby reads the same log entries in the same order and reproduces the same result. At any given moment, the standby's data is slightly behind, but it is catching up continuously.
PostgreSQL also supports cascading replication, where a standby forwards its WAL stream to a further downstream standby. This means you can have a chain: primary sends to standby A, standby A sends to standby B. This reduces the load on the primary when you are running many replicas.
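Once a standby is connected, whether directly or through a cascade, you can watch the stream from the primary's side. A minimal check using the standard pg_stat_replication view (the column selection here is just one reasonable choice):

```sql
-- Run on the primary: one row per connected standby (or cascading sender).
-- Comparing sent_lsn and replay_lsn gives a rough sense of how far behind a standby is.
SELECT client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;
```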
Setting up replication: what it actually looks like in practice
Configuring a replica involves changes on both the primary and the standby server. On the primary, you open pg_hba.conf — PostgreSQL's client authentication config — and add an entry that allows the standby to connect using a replication role. You create a PostgreSQL user with the REPLICATION privilege. You set max_wal_senders in postgresql.conf to tell PostgreSQL how many standby connections to allow simultaneously. Optionally, you configure replication slots, which cause the primary to hold WAL data until the standby has confirmed it received everything — a safety net against the standby falling too far behind.
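As a rough sketch of the primary-side pieces (the hostnames, role name, and slot name below are illustrative, not prescriptive):

```
# postgresql.conf on the primary
wal_level = replica        # the minimum level required for streaming replication
max_wal_senders = 10       # how many standby/backup connections may stream at once

# pg_hba.conf on the primary: let the standby host connect with a replication role
host  replication  replicator  10.0.0.12/32  scram-sha-256
```

```sql
-- Run on the primary as a superuser
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change-me';

-- Optional: a physical replication slot, so WAL is retained until this standby confirms receipt
SELECT pg_create_physical_replication_slot('standby1_slot');
```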
On the standby side, you point the server at the primary through the primary_conninfo setting in postgresql.conf (or postgresql.auto.conf) and create a standby.signal file in the data directory. When the server starts, it sees the signal file, comes up in standby mode, connects to the primary, and begins streaming.
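In practice, the simplest way to seed a standby is pg_basebackup with -R, which copies the primary's data directory and writes both standby.signal and a primary_conninfo entry for you. A sketch, assuming the same illustrative host, role, and paths as above:

```bash
# Run on the standby host against an empty data directory (path is illustrative)
pg_basebackup -h 10.0.0.11 -U replicator -D /var/lib/postgresql/data -R -X stream

# Start the server; because standby.signal exists, it comes up in standby mode
pg_ctl -D /var/lib/postgresql/data start
```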
The whole mechanism is plain PostgreSQL talking to PostgreSQL over a regular network connection, using standard PostgreSQL authentication. There is no middleware involved. No container runtime. No orchestrator. Just two PostgreSQL processes exchanging WAL over TCP.
The problem replication alone does not solve
Once you have streaming replication working, you hit a practical problem that PostgreSQL intentionally leaves to you.
What happens when the primary server dies?
PostgreSQL gives you the building blocks. You can manually promote a standby to become the new primary by running pg_ctl promote. The standby stops reading WAL and starts accepting writes. Your application can reconnect to the new primary and continue.
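The manual promotion itself is a one-liner (the data directory path is illustrative), and since PostgreSQL 12 there is also a SQL-level function:

```bash
# On the standby you want to promote
pg_ctl -D /var/lib/postgresql/data promote

# Or, from a superuser session on that standby:
# psql -c "SELECT pg_promote();"
```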
The problem is the word "manually." In a production system, you do not want to wake up at 3am and SSH into a server to run a promotion command. You want the cluster to handle that automatically. You also want the cluster to prevent a situation where two nodes both think they are the primary at the same time — a condition called split-brain, which causes data divergence and is genuinely catastrophic.
PostgreSQL does not solve this coordination problem on its own. That is intentional — the PostgreSQL project focuses on the database, and the coordination problem belongs to a different layer. This is the gap that tools like Patroni exist to fill.
Option 1: VMs or bare metal with Patroni
This is the most direct mental model to understand, so it is worth covering first.
Imagine you have three servers. Each one is a virtual machine or a physical box. Each one has PostgreSQL installed. One PostgreSQL instance is your primary. The other two are replicas, streaming WAL from the primary in real time.
Now add Patroni.
Patroni is a Python-based high availability agent for PostgreSQL. You run it on each of those three servers, alongside PostgreSQL. Patroni watches the health of the local PostgreSQL instance, coordinates with the other Patroni agents running on the other servers, and manages the decision of who is currently the primary.
Patroni describes itself as a template for PostgreSQL high availability, and it supports several backends for storing shared cluster state — etcd, Consul, ZooKeeper, or Kubernetes objects. You pick one and use it as a distributed configuration store, often called a DCS. The DCS is where the cluster records the current leader and dynamic configuration settings that all nodes need to agree on.
Think of the DCS as a shared whiteboard that all three servers can read from and write to. Patroni uses that whiteboard to run a leader election. Only one node can hold the leader key at a time. The node that holds the key is the primary. When the current primary disappears, the remaining nodes race to claim the key, and the winner gets promoted. This is how Patroni prevents split-brain — there is one authoritative record of who is currently in charge.
The roles in this setup are clean and distinct. PostgreSQL handles the actual data replication through WAL streaming. Patroni handles the cluster coordination and failover decision-making. The DCS gives the cluster a single shared source of truth. Nothing about this requires containers or a cloud provider. It works on bare metal in your own data center.
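A minimal patroni.yml gives a feel for how these pieces connect. This is a sketch only: the scope, node names, addresses, and etcd endpoints are illustrative, and a production config needs bootstrap and tuning sections on top of this.

```yaml
# patroni.yml on one node; every node shares the same scope but has its own name and addresses
scope: pg-cluster                 # shared cluster name, used as the key prefix in the DCS
name: node1                       # unique per node

etcd3:
  hosts:
    - 10.0.0.21:2379
    - 10.0.0.22:2379
    - 10.0.0.23:2379

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    replication:
      username: replicator
      password: change-me
    superuser:
      username: postgres
      password: change-me
```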
Option 2: PostgreSQL in Docker Swarm
To understand PostgreSQL on Docker Swarm, you first need to understand what Docker Swarm actually is.
Docker Swarm is Docker's native clustering mode. When you run multiple servers that have Docker installed, you can join them into a Swarm. A Swarm is a group of machines where Docker manages scheduling — deciding which machine runs which container, restarting containers that die, distributing load. You define a "service" with a desired number of "replicas" and the Swarm manager ensures that many containers are running across the available machines at all times.
That word "replicas" is where confusion enters. In Docker Swarm, a replica is a container instance — a copy of a running process. If you tell Swarm you want three replicas of a web application, Swarm runs three identical containers. If one dies, Swarm starts a replacement. This is stateless horizontal scaling.
PostgreSQL replication is a completely different concept. A PostgreSQL replica is a standby server that holds a synchronized copy of the database state and receives a continuous WAL stream from the primary. PostgreSQL replicas are not interchangeable. The primary has a specific role. The standbys have a specific role. You cannot just start another container and have it automatically join the replication cluster.
When you run PostgreSQL in Docker Swarm, you get the container scheduling and restart behavior that Swarm provides. Swarm will keep your PostgreSQL container running and restart it if it crashes. Swarm will schedule containers across available nodes. What Swarm will not do is understand PostgreSQL's replication protocol, coordinate leader election, or prevent two containers from both trying to act as the primary.
If you want PostgreSQL HA on Docker Swarm, you still need to design the replication and failover layer yourself. You might run Patroni inside the containers. You might use replication slots and handle promotion manually. The point is that Swarm solves the container scheduling problem, and the PostgreSQL HA problem remains a separate responsibility.
Option 3: PostgreSQL in Kubernetes
Kubernetes is worth understanding properly before discussing how PostgreSQL fits into it, because Kubernetes is substantially more sophisticated than Docker Swarm and has a richer set of primitives.
Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized workloads. At its core, Kubernetes gives you a cluster of machines — called nodes — and a control plane that schedules containers onto those nodes as units called Pods. Kubernetes watches the desired state you declare and continuously works to make the actual state match it.
Kubernetes has built-in support for stateful applications through a resource called a StatefulSet. A StatefulSet gives each Pod a stable network identity and a stable storage volume that persists across restarts. This matters for databases, where each instance has its own data that must survive container restarts and rescheduling.
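A bare-bones StatefulSet sketch shows what Kubernetes gives you out of the box: stable names (postgres-0, postgres-1, postgres-2) and a persistent volume per Pod, and nothing PostgreSQL-specific. The names, image, and sizes are illustrative:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3                      # three Pods with stable identities, not a replication topology
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```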
However — and this is the key point — Kubernetes is still a general-purpose orchestration system. It knows that a Pod died and needs to be rescheduled. It does not know that your PostgreSQL primary just failed and standby B is a better candidate for promotion than standby A because it has a lower replication lag. That kind of application-specific knowledge requires additional tooling.
This is where the Kubernetes operator pattern comes in. A Kubernetes operator is a custom controller that extends the Kubernetes API with application-specific automation. An operator watches for custom resources you define, interprets their desired state, and takes actions that a human operator would otherwise perform manually. For PostgreSQL, this means an operator can understand WAL streaming, replication lag, switchover logic, and failover — and encode all of that knowledge into automated Kubernetes-native behavior.
Patroni on Kubernetes
Patroni supports Kubernetes as a native distributed configuration store. Instead of running a separate etcd cluster or Consul cluster to hold the leader key and cluster state, Patroni can use Kubernetes API objects directly — ConfigMaps, Endpoints, or Leases.
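In Patroni's configuration this is just a different DCS section: instead of an etcd3 block you give it a kubernetes block. A sketch (the namespace and labels are illustrative):

```yaml
# patroni.yml fragment: use the Kubernetes API itself as the DCS
kubernetes:
  namespace: databases
  labels:
    application: patroni
    cluster-name: pg-cluster
  use_endpoints: true   # store leader state in an Endpoints object rather than a ConfigMap
```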
This means you can run the same Patroni-based HA model inside Kubernetes. Kubernetes schedules the Pods and manages their lifecycle. PostgreSQL handles WAL streaming between the instances. Patroni coordinates leader election and failover using Kubernetes objects as the shared coordination layer. The division of responsibility is exactly the same as in the VM model — the only change is where Patroni stores the shared cluster state.
This is a natural fit for teams that already have Kubernetes infrastructure and want to adopt a proven PostgreSQL HA approach without switching to a new tool. Patroni's Kubernetes support is well-documented and actively maintained.
CloudNativePG: the Kubernetes-native operator
If you are building entirely within Kubernetes and want a solution that is designed for the platform from the ground up, CloudNativePG is the most prominent answer.
CloudNativePG is a PostgreSQL operator for Kubernetes. You declare a Cluster resource in YAML — specifying the number of instances, storage size, PostgreSQL version, backup configuration — and the operator handles the rest. It provisions the Pods, configures streaming replication between them, sets up TLS-encrypted replication channels, manages failover, and handles day-two operations like backups, major version upgrades, and switchovers.
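A minimal Cluster manifest looks roughly like this (the name, image, and storage size are illustrative; check the CloudNativePG documentation for current defaults):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-cluster
spec:
  instances: 3                 # one primary plus two streaming replicas, managed by the operator
  imageName: ghcr.io/cloudnative-pg/postgresql:16
  storage:
    size: 10Gi
```

Applying this with kubectl apply is enough for the operator to create the Pods, wire up streaming replication between them, and elect a primary.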
CloudNativePG does not replace PostgreSQL replication. It configures and manages PostgreSQL replication on your behalf. Underneath the operator's automation, PostgreSQL instances are still streaming WAL to each other in exactly the same way they would on bare metal. The operator is a management layer, and PostgreSQL is the database.
The advantage of CloudNativePG over running Patroni in Kubernetes is integration depth. CloudNativePG understands Kubernetes-native primitives — Services, PersistentVolumeClaims, Secrets, RBAC — and exposes PostgreSQL cluster management through custom Kubernetes resources. For teams that are already operating Kubernetes fluently, this fits naturally into existing workflows and tooling.
If you want a working reference, I put together a complete three-node CloudNativePG example on GitHub: cnpg-three-postgres. It includes the full Cluster resource definition, storage configuration, and replication setup — ready to apply to your own Kubernetes cluster.
Comparison: which approach fits which situation
| Approach | What it provides | What you still need to handle |
|---|---|---|
| VMs + Patroni | Clear HA model, proven in production, runs anywhere | Your own DCS (etcd, Consul), server provisioning |
| Docker Swarm | Container scheduling and restart across nodes | PostgreSQL HA layer — Swarm does not provide this |
| Kubernetes + Patroni | Kubernetes scheduling with Patroni-managed failover | Kubernetes cluster to operate |
| Kubernetes + CloudNativePG | Kubernetes-native PostgreSQL HA, full lifecycle management | Kubernetes cluster to operate |
FAQ
Does Kubernetes automatically handle PostgreSQL failover?
Kubernetes will restart a failed Pod and reschedule it on a healthy node, but it has no understanding of PostgreSQL replication roles. Without an operator like CloudNativePG or a tool like Patroni, Kubernetes cannot perform a PostgreSQL failover. You need a PostgreSQL-aware layer on top of Kubernetes for that.
What is the difference between a Docker Swarm replica and a PostgreSQL replica?
A Docker Swarm replica is a container instance — a copy of a running process, interchangeable with other copies of the same service. A PostgreSQL replica is a standby database server with a specific role, maintaining a synchronized copy of the primary's data through WAL streaming. They use the same word but describe completely different concepts.
Can I run Patroni inside Kubernetes without CloudNativePG?
Yes. Patroni has native Kubernetes support and can use Kubernetes API objects as its distributed configuration store. Many production setups run Patroni-managed PostgreSQL inside Kubernetes pods without using a separate operator. CloudNativePG is an alternative approach, not a requirement.
What is a replication slot and do I need one?
A replication slot is a mechanism on the primary that tracks how much WAL each connected standby has consumed. The primary holds onto WAL data until the slot's standby has confirmed it received everything, preventing the primary from deleting WAL that a standby still needs. Replication slots are useful when standbys have unreliable connectivity, but they carry a risk: if a standby disconnects for a long time, the primary accumulates WAL indefinitely and can fill its disk. Use slots deliberately and monitor them.
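A quick way to keep an eye on slots is the pg_replication_slots view; the query below is one illustrative way to see how much WAL an inactive slot is pinning:

```sql
-- Run on the primary: retained_wal grows if a slot's standby stops consuming
SELECT slot_name, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```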
Is Patroni the only option for self-hosted PostgreSQL HA on VMs?
Patroni is the most widely adopted open-source tool for this purpose, but it is not the only one. repmgr is another popular choice. Stolon is used in some environments. For most teams starting fresh with self-hosted PostgreSQL HA, Patroni's documentation and community support make it the natural first choice.
Conclusion
PostgreSQL replication is a native database feature. The primary streams WAL to standbys. Standbys apply those changes and stay in sync. Every tool discussed in this guide sits around that mechanism, not underneath it.
Patroni adds the HA coordination layer — leader election, failover logic, and a shared cluster state — that PostgreSQL intentionally leaves to external tools. Docker Swarm provides container scheduling across multiple hosts, which keeps PostgreSQL containers running but contributes nothing to PostgreSQL-aware failover. Kubernetes provides a richer orchestration platform for stateful workloads, and becomes genuinely useful for PostgreSQL when paired with Patroni or an operator like CloudNativePG.
The mental model that makes all of this click: PostgreSQL always handles the data. The platform handles where that data runs. The HA layer handles what happens when something goes wrong.
If you have questions about any specific part of this stack — running Patroni with etcd, setting up CloudNativePG in a production cluster, or designing a Swarm setup for PostgreSQL — drop them in the comments below. And if you want more guides like this one, subscribe to get them as they come out.
Thanks, Matija