Table of Contents
- Understanding PostgreSQL Configuration Files
- postgresql.conf
- pg_hba.conf
- pg_ident.conf
- Core Configuration Parameters
- Memory Management
- Connection Settings
- Query Optimization
- Logging
- Security Configuration
- Authentication with pg_hba.conf
- SSL Encryption
- Advanced Configurations
- Write-Ahead Logging (WAL)
- Autovacuum Tuning
- Replication Basics
- Step-by-Step Configuration Workflow
- Troubleshooting Common Issues
- Conclusion
- References
1. Understanding PostgreSQL Configuration Files
PostgreSQL stores configuration settings in several key files. The location of these files varies by operating system and installation method (e.g., package manager, source, or Docker). Common paths include:
- Linux (Debian/Ubuntu):
/etc/postgresql/<version>/main/ - Linux (RHEL/CentOS):
/var/lib/pgsql/<version>/data/ - macOS (Homebrew):
/usr/local/var/postgres/ - Windows:
C:\Program Files\PostgreSQL\<version>\data\
To confirm the data directory, run SHOW data_directory; in a psql session.
Key Configuration Files
1. postgresql.conf
The primary configuration file for PostgreSQL. It controls database behavior, including memory allocation, connections, query optimization, logging, and more. Parameters are commented out by default with their default values; uncomment and modify to customize.
2. pg_hba.conf
(Host-Based Authentication) Controls who can connect to the database and how. It defines rules for user authentication, IP address restrictions, and authentication methods (e.g., password, SSL, peer).
3. pg_ident.conf
(Identity Mapping) Maps OS usernames to PostgreSQL roles for peer or ident authentication methods. Useful for local connections where you want to link OS users to database roles without passwords.
2. Core Configuration Parameters
Let’s break down critical parameters in postgresql.conf by category.
Memory Management
PostgreSQL relies heavily on memory for caching data and indexes, reducing disk I/O. Misconfiguring memory settings is a common cause of poor performance.
shared_buffers
- Purpose: The amount of memory PostgreSQL uses for caching data pages (shared across all connections).
- Default: 128MB (way too low for production).
- Recommendation: For dedicated database servers, set to 25% of available RAM (e.g., 4GB if total RAM is 16GB). For shared servers (e.g., with app and DB on the same machine), use 10-20% to avoid starving other services.
- Example:
shared_buffers = 4GB
work_mem
- Purpose: Memory allocated per operation (e.g., sorts, hashes, joins) for a single connection.
- Default: 4MB.
- Risk: Setting this too high can cause memory bloat! If 100 connections each run a query with 5 sorts, total memory could be
100 * 5 * work_mem. - Recommendation: Start with 16-64MB for moderate workloads. Use
EXPLAIN ANALYZEto check if queries are spilling to disk (look forSort Method: External Merge), which indicateswork_memis too low. - Example:
work_mem = 32MB
maintenance_work_mem
- Purpose: Memory for maintenance operations (e.g.,
VACUUM,CREATE INDEX,ALTER TABLE). These are resource-intensive but run infrequently. - Default: 64MB.
- Recommendation: Set to 10-20% of RAM (but cap at 1GB to avoid waste). For example, 1GB on a 16GB server.
- Example:
maintenance_work_mem = 1GB
effective_cache_size
- Purpose: An estimate of how much memory the OS and PostgreSQL will use to cache data and indexes. Guides the query planner to choose better plans (e.g., index scan vs. sequential scan).
- Default: 4GB.
- Recommendation: Set to 50-75% of RAM (e.g., 8GB on a 16GB server). The planner assumes this much memory is available for caching, so overestimating is safer than underestimating.
- Example:
effective_cache_size = 8GB
Connection Settings
These parameters control how clients connect to PostgreSQL.
max_connections
- Purpose: Maximum number of concurrent connections allowed.
- Default: 100.
- Considerations: Each connection uses memory (e.g.,
work_mem, session state). Too many connections can exhaust memory or slow the server. - Recommendation: Use connection pooling (e.g., PgBouncer) if you need >200 connections. For small apps, 100-200 is safe.
- Example:
max_connections = 150
listen_addresses
- Purpose: IP addresses PostgreSQL listens on for connections.
- Default:
localhost(only local connections). - Use Case: To allow remote connections, set to
'*'(all interfaces) or specific IPs (e.g.,'192.168.1.0/24'for a subnet). - Security Note: Always restrict with
pg_hba.confif opening to remote networks! - Example:
listen_addresses = '*'
port
- Purpose: Network port for PostgreSQL (default: 5432). Change if needed to avoid conflicts (e.g., with another PostgreSQL instance).
- Example:
port = 5433
Query Optimization
Tweak these to help the PostgreSQL planner generate faster queries.
random_page_cost and seq_page_cost
- Purpose: Cost estimates for reading random vs. sequential data pages. Lower values make the planner prefer index scans; higher values favor sequential scans.
- Defaults:
random_page_cost = 4.0,seq_page_cost = 1.0. - Recommendation: On SSDs, set
random_page_cost = 1.1(SSDs have low random access latency), making indexes more appealing. - Example:
random_page_cost = 1.1
default_statistics_target
- Purpose: Controls the amount of statistics collected by
ANALYZE(used by the query planner). Higher values improve plan accuracy but take longer to compute. - Default: 100.
- Recommendation: For large tables with skewed data (e.g., rare values), set to 200-500. Use
ALTER TABLE ... ALTER COLUMN ... SET STATISTICS 500for specific columns. - Example:
default_statistics_target = 200
Logging
Logging is critical for debugging, monitoring, and auditing.
log_statement
- Purpose: Which SQL statements to log.
- Options:
none,ddl,mod,all(logs all statements). - Recommendation: Use
'mod'(logsINSERT/UPDATE/DELETE) for production, or'all'temporarily to debug slow queries. - Example:
log_statement = 'mod'
log_min_duration_statement
- Purpose: Logs queries taking longer than a specified duration (in milliseconds).
- Example:
log_min_duration_statement = 1000(logs queries taking >1 second).
log_directory and log_filename
- Purpose: Where logs are stored and how they’re named.
- Example:
log_directory = 'pg_log' # Subdirectory of data_dir log_filename = 'postgresql-%Y-%m-%d.log' # Daily log rotation
3. Security Configuration
pg_hba.conf Rules
Each line in pg_hba.conf follows this format:
# TYPE DATABASE USER ADDRESS METHOD [OPTIONS]
- TYPE:
local(Unix socket),host(TCP/IP),hostssl(TCP/IP over SSL),hostnossl(TCP/IP without SSL). - DATABASE: Database(s) to allow (e.g.,
all,mydb). - USER: Role(s) to allow (e.g.,
all,johndoe). - ADDRESS: IP range (e.g.,
192.168.1.0/24,0.0.0.0/0for all). - METHOD: Authentication method (e.g.,
peer,md5,scram-sha-256,ssl).
Common Authentication Methods
-
peer: For localUnix socketconnections. Matches OS username to PostgreSQL role.
Example:local all all peer(local users connect without password if OS username = role). -
scram-sha-256: Secure password authentication (better thanmd5).
Example:host all myappuser 192.168.1.0/24 scram-sha-256 -
ssl: Requires SSL-encrypted connections.
Example:hostssl all all 0.0.0.0/0 scram-sha-256
SSL Encryption
Enable SSL to encrypt data in transit:
In postgresql.conf:
ssl = on
ssl_cert_file = 'server.crt' # Path to SSL certificate
ssl_key_file = 'server.key' # Path to private key
Generate self-signed certificates (for testing) with:
openssl req -new -x509 -nodes -out server.crt -keyout server.key
chmod 600 server.key
chown postgres:postgres server.crt server.key
4. Advanced Configurations
Write-Ahead Logging (WAL)
WAL ensures data durability by writing changes to a log before applying them to the database. Critical for write-heavy workloads.
wal_buffers
- Purpose: Buffer for WAL data before writing to disk.
- Default: 16MB (or 1/32 of
shared_buffers, whichever is larger). - Recommendation: Leave at default unless you have very high write throughput (e.g., 64MB for 10k+ writes/sec).
max_wal_size
- Purpose: Maximum size of WAL files before checkpoint (disk write). Larger values reduce checkpoints but use more disk space.
- Default: 1GB.
- Recommendation: 4-8GB for write-heavy workloads.
Autovacuum Tuning
PostgreSQL automatically runs VACUUM and ANALYZE to clean up dead tuples (bloat) and update statistics. Tune these for large or busy databases:
autovacuum = on # Enabled by default
autovacuum_vacuum_scale_factor = 0.02 # Trigger VACUUM when 2% of tuples are dead (default 0.2)
autovacuum_analyze_scale_factor = 0.01 # Trigger ANALYZE when 1% of tuples change (default 0.1)
5. Step-by-Step Configuration Workflow
Let’s walk through modifying postgresql.conf to optimize a 16GB RAM server:
1. Edit postgresql.conf
sudo nano /var/lib/postgresql/15/main/postgresql.conf # Path varies by OS/version
2. Update Key Parameters
# Memory
shared_buffers = 4GB # 25% of 16GB RAM
work_mem = 32MB # Per-operation memory
maintenance_work_mem = 1GB # For VACUUM/CREATE INDEX
effective_cache_size = 8GB # 50% of RAM
# Connections
max_connections = 150
listen_addresses = '*' # Allow remote connections
# Logging
log_statement = 'mod' # Log writes/updates
log_min_duration_statement = 1000 # Log slow queries (>1s)
# Query Optimization
random_page_cost = 1.1 # For SSDs
default_statistics_target = 200
3. Update pg_hba.conf (Restrict Access)
Allow a remote app user with SSL:
# Allow "myappuser" from 192.168.1.100 with SCRAM password over SSL
hostssl myappdb myappuser 192.168.1.100/32 scram-sha-256
4. Reload Configuration
Apply changes without restarting PostgreSQL:
sudo systemctl reload postgresql # Systemd systems
# OR
pg_ctl reload -D /var/lib/postgresql/15/main # Direct reload
5. Verify Changes
In psql, check parameters:
SHOW shared_buffers;
SHOW work_mem;
SHOW log_min_duration_statement;
6. Troubleshooting Common Issues
- Server fails to start: Check logs in
log_directoryfor errors (e.g., invalid parameter values). - Connections denied: Verify
pg_hba.confrules (order matters—earlier rules take precedence) andlisten_addresses. - Slow queries: Use
log_min_duration_statementto identify slow queries, then optimize withEXPLAIN ANALYZE. - Memory bloat: If
OOM killerterminates PostgreSQL, reducework_memormax_connections.
7. Conclusion
PostgreSQL configuration is not a “set it and forget it” task. By tailoring postgresql.conf, pg_hba.conf, and other files to your workload (memory, connections, write patterns), you can drastically improve performance and security. Always test changes in staging first, monitor metrics (e.g., with Prometheus + Grafana), and iterate based on real-world usage.
8. References
- PostgreSQL Official Documentation:
postgresql.conf - PostgreSQL Official Documentation:
pg_hba.conf - High Performance PostgreSQL (Book by Gregory Smith)
- PGTune: Automated Configuration Tool (Generates
postgresql.confrecommendations)