Table of Contents
- Prerequisites
- Setting Up the Publisher
- Setting Up the Subscriber
- Testing Replication
- Monitoring Replication
- Troubleshooting Common Issues
- Advanced Configurations
- Conclusion
- References
Prerequisites
Before starting, ensure the following:
- PostgreSQL Version: Logical replication is available in PostgreSQL 10+. For best results, use PostgreSQL 12+ (includes improvements like parallel apply and better conflict handling).
- Access: You need
superuserorreplicationrole privileges on both publisher and subscriber databases. - Network Connectivity: The subscriber must be able to connect to the publisher over the network (typically on port 5432).
- Schema Compatibility: The subscriber database must have the same schema as the publisher (logical replication does not replicate DDL changes like
CREATE TABLE). - WAL Configuration: The publisher must be configured to generate logical WAL (Write-Ahead Log) records.
Setting Up the Publisher
The “publisher” is the source database that shares its data via a “publication.” Follow these steps to configure it:
2.1 Configure PostgreSQL for Logical Replication
First, update the publisher’s postgresql.conf (usually in /var/lib/postgresql/<version>/main/ or /etc/postgresql/<version>/main/) to enable logical replication:
# Enable logical WAL (required for logical replication)
wal_level = logical
# Maximum number of replication slots (one per subscriber)
max_replication_slots = 10 # Adjust based on expected subscribers
# Maximum number of WAL sender processes (one per subscriber)
max_wal_senders = 10 # Should be >= max_replication_slots
# Optional: Limit WAL retention (prevents disk bloat if subscribers lag)
wal_keep_size = 1024 # Retain 1GB of WAL (adjust as needed)
Restart PostgreSQL to apply changes:
sudo systemctl restart postgresql
Next, allow the subscriber to connect to the publisher by updating pg_hba.conf (in the same directory as postgresql.conf). Add a line for the subscriber’s IP address (replace 192.168.1.100 with your subscriber’s IP):
# TYPE DATABASE USER ADDRESS METHOD
host replication replicator 192.168.1.100/32 md5 # Allows replication from subscriber
Reload PostgreSQL to apply pg_hba.conf changes without restarting:
sudo -u postgres psql -c "SELECT pg_reload_conf();"
Create a dedicated replication user on the publisher (replace replicator and secure_password with your credentials):
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
2.2 Create a Publication
A “publication” defines which tables to replicate and which operations (INSERT/UPDATE/DELETE/TRUNCATE) to include.
Example 1: Publish All Tables in a Database
To replicate all tables in the mydb database (excluding system tables):
-- Connect to the publisher database
sudo -u postgres psql -d mydb
-- Create a publication for all tables
CREATE PUBLICATION my_publication FOR ALL TABLES;
Example 2: Publish Specific Tables
To replicate only users and orders tables:
CREATE PUBLICATION my_publication
FOR TABLE users, orders;
Example 3: Filter Operations or Rows
- Restrict operations: Replicate only INSERTs and UPDATEs (exclude DELETEs):
CREATE PUBLICATION my_publication FOR TABLE users, orders WITH (publish = 'insert, update'); - Row-level filtering (PostgreSQL 15+): Replicate only active users (e.g.,
status = 'active'):CREATE PUBLICATION my_publication FOR TABLE users WHERE (status = 'active'), orders WHERE (total > 100); -- Replicate orders over $100
Verify the publication with:
-- List all publications
\dp+ -- In psql, or:
SELECT * FROM pg_publication;
-- Check which tables are in a publication
SELECT * FROM pg_publication_tables WHERE pubname = 'my_publication';
Setting Up the Subscriber
The “subscriber” is the target database that receives data from the publisher.
3.1 Prepare the Subscriber Schema
Logical replication does not replicate DDL changes (e.g., CREATE TABLE, ALTER TABLE). The subscriber must have an identical schema to the publisher.
To copy the schema from the publisher to the subscriber:
- Dump the publisher’s schema (replace
mydbwith your database name):pg_dump -s -d mydb -h publisher_ip -U replicator > schema_dump.sql - Restore the schema on the subscriber:
psql -d mydb_sub -U postgres -f schema_dump.sql
3.2 Create a Subscription
A “subscription” on the subscriber connects to the publisher’s publication and starts replicating data.
Connect to the subscriber database and run:
-- Replace placeholders with your publisher details
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=publisher_ip port=5432 dbname=mydb user=replicator password=secure_password'
PUBLICATION my_publication;
Key Subscription Options:
copy_data = true(default): Copies existing data from the publisher to the subscriber during setup. Usecopy_data = falseif the subscriber already has the data (e.g., after a restore).enabled = true(default): Starts replication immediately. Set tofalseto pause initially.
Example with copy_data = false:
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=publisher_ip port=5432 dbname=mydb user=replicator password=secure_password'
PUBLICATION my_publication
WITH (copy_data = false);
Verify the subscription:
-- List subscriptions
SELECT * FROM pg_subscription;
-- Check subscription status
SELECT * FROM pg_stat_subscription;
Testing Replication
To confirm replication works, perform write operations on the publisher and verify they appear on the subscriber.
Step 1: Insert Data on the Publisher
-- On publisher (mydb)
INSERT INTO users (id, name, status) VALUES (1, 'Alice', 'active');
INSERT INTO orders (id, user_id, total) VALUES (101, 1, 250);
Step 2: Verify on the Subscriber
-- On subscriber (mydb_sub)
SELECT * FROM users; -- Should return Alice
SELECT * FROM orders; -- Should return order 101
Step 3: Test Updates and Deletes
-- On publisher: Update a row
UPDATE users SET name = 'Alice Smith' WHERE id = 1;
-- On publisher: Delete a row
DELETE FROM orders WHERE id = 101;
-- On subscriber: Verify changes
SELECT * FROM users WHERE id = 1; -- Name should be 'Alice Smith'
SELECT * FROM orders WHERE id = 101; -- Should return 0 rows
Monitoring Replication
Use these tools to monitor replication health and lag:
4.1 pg_stat_subscription (Subscriber-Side)
Shows subscription status, including last sync time and lag:
SELECT
subname AS subscription_name,
status AS replication_status,
last_sync,
sync_state AS initial_sync_state,
latest_end_lsn AS last_applied_lsn
FROM pg_stat_subscription;
4.2 pg_replication_slots (Publisher-Side)
Logical replication uses a “replication slot” on the publisher to track which WAL records the subscriber has processed. Check slot status with:
SELECT
slot_name,
active, -- Is the subscriber connected?
wal_status AS wal_retention_status,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal_size
FROM pg_replication_slots WHERE slot_type = 'logical';
4.3 Logs
Check PostgreSQL logs (e.g., /var/log/postgresql/postgresql-<version>-main.log) for errors like:
- Schema mismatches (e.g., “column does not exist”).
- Connection issues (e.g., “could not connect to publisher”).
Troubleshooting Common Issues
5.1 Schema Mismatch
Problem: Subscriber is missing a column/table from the publisher.
Fix: Replicate the schema manually (e.g., using pg_dump -s from the publisher) and restart the subscription:
-- On subscriber: Drop and recreate the subscription
DROP SUBSCRIPTION my_subscription;
CREATE SUBSCRIPTION my_subscription ... ; -- Use copy_data=false if data is already synced
5.2 Replication Slot Not Advancing
Problem: Publisher retains WAL (disk bloat) because the subscriber is offline.
Fix: If the subscriber is permanently offline, drop the slot on the publisher:
-- On publisher: Drop the slot (replace with your slot name)
SELECT pg_drop_replication_slot('my_subscription');
5.3 Permission Denied
Problem: “permission denied for relation users” during initial sync.
Fix: Ensure the replication user (replicator) has SELECT privileges on published tables:
-- On publisher: Grant SELECT to replicator
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replicator;
5.4 Conflicts (Subscriber Writes)
Problem: Subscriber has conflicting data (e.g., a row updated on both publisher and subscriber).
Fix: Logical replication is asynchronous, so avoid writing to the subscriber. If a conflict occurs, skip the problematic transaction:
-- On subscriber: Skip the conflicting transaction
ALTER SUBSCRIPTION my_subscription SKIP;
Advanced Configurations
6.1 Multiple Subscribers
Add more subscribers by creating additional subscriptions on new databases:
-- On subscriber 2:
CREATE SUBSCRIPTION my_subscription2
CONNECTION 'host=publisher_ip port=5432 dbname=mydb user=replicator password=secure_password'
PUBLICATION my_publication;
6.2 Pause/Resume Replication
Pause replication for maintenance (e.g., schema updates on the subscriber):
-- Pause
ALTER SUBSCRIPTION my_subscription DISABLE;
-- Resume
ALTER SUBSCRIPTION my_subscription ENABLE;
6.3 Cascading Replication
Replicate from a subscriber to another database (requires the intermediate subscriber to enable wal_level = logical).
6.4 Column-Level Filtering (PostgreSQL 16+)
Replicate only specific columns (e.g., exclude sensitive data like password):
-- On publisher: Replicate only id and name from users
CREATE PUBLICATION my_publication FOR TABLE users (id, name);
Conclusion
Logical replication is a powerful tool for flexible data synchronization in PostgreSQL. By following this guide, you’ve learned to:
- Configure publishers and subscribers.
- Create publications and subscriptions.
- Test, monitor, and troubleshoot replication.
For production use, ensure:
- Regular schema synchronization (e.g., via CI/CD pipelines for DDL changes).
- Monitoring of replication lag and disk usage (to avoid WAL bloat).
- Backups of both publisher and subscriber databases.