cyberangles guide

Understanding PostgreSQL Data Types: A Developer’s Reference

In PostgreSQL, data types are the foundation of database design. They define the kind of data a column can store, enforce constraints on input values, optimize storage efficiency, and impact query performance. Choosing the right data type ensures data integrity, reduces storage overhead, and simplifies application logic. Whether you’re designing a new schema or optimizing an existing one, a deep understanding of PostgreSQL’s data types is critical for building robust, efficient systems. This guide explores PostgreSQL’s rich ecosystem of data types, organized by category, with practical examples, use cases, and best practices. By the end, you’ll be equipped to select the optimal type for any scenario.

Table of Contents

  1. Numeric Types
    • Integer Types
    • Arbitrary Precision Types
    • Floating-Point Types
    • Serial Types
  2. Character Types
    • Fixed-Length: char(n)
    • Variable-Length: varchar(n)
    • Unlimited Length: text
  3. Binary Types
    • bytea
  4. Date/Time Types
    • date, time, timestamp, timestamptz, interval
  5. Boolean Type
  6. Enumerated Types (Enum)
  7. Geometric Types
  8. Network Address Types
    • inet, cidr, macaddr
  9. UUID Type
  10. JSON Types
    • json vs jsonb
  11. Array Types
  12. Range Types
  13. Composite Types
  14. Specialized Types
  15. Reference: Data Type Cheat Sheet

Numeric Types

PostgreSQL offers a range of numeric types to handle integers, decimals, and floating-point values, each optimized for specific precision and range requirements.

Integer Types

Designed for whole numbers with fixed ranges.

TypeStorageRangeUse Case
smallint2 bytes-32,768 to 32,767Small counts (e.g., number of items in a cart)
integer4 bytes-2,147,483,648 to 2,147,483,647General-purpose integers (e.g., user IDs)
bigint8 bytes-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807Large datasets (e.g., analytics counters)

Example:

CREATE TABLE products (  
  id integer PRIMARY KEY,  
  stock smallint NOT NULL CHECK (stock >= 0),  
  total_sales bigint DEFAULT 0  
);  

INSERT INTO products (id, stock) VALUES (1, 150);  

Arbitrary Precision Types: numeric/decimal

For numbers requiring exact precision (e.g., financial data). These types store values as strings of digits, avoiding floating-point errors.

  • numeric(precision, scale): precision = total digits; scale = digits after the decimal point.
  • decimal is a synonym for numeric.

Example:

CREATE TABLE transactions (  
  id integer PRIMARY KEY,  
  amount numeric(10, 2) NOT NULL -- e.g., $1234.56  
);  

INSERT INTO transactions (id, amount) VALUES (1, 99.99);  

Floating-Point Types: real and double precision

For approximate values (e.g., scientific measurements). Use when precision is not critical.

TypeStoragePrecision
real4 bytes~6 decimal digits
double precision8 bytes~15 decimal digits

Example:

CREATE TABLE sensor_data (  
  id integer PRIMARY KEY,  
  temperature real, -- e.g., 23.5°C (approximate)  
  humidity double precision -- e.g., 65.2345% (higher precision)  
);  

Serial Types: Auto-Incrementing Integers

serial types are shortcuts for creating auto-incrementing columns (backed by sequences).

TypeUnderlying TypeUse Case
smallserialsmallintSmall auto-increment IDs
serialintegerGeneral auto-increment IDs
bigserialbigintLarge-scale auto-increment

Example:

CREATE TABLE users (  
  id serial PRIMARY KEY, -- Auto-increments from 1  
  username text NOT NULL  
);  

INSERT INTO users (username) VALUES ('alice'); -- id auto-sets to 1  

Character Types

PostgreSQL provides three main character types for text data.

char(n): Fixed-Length

Stores exactly n characters, padding with spaces if the input is shorter. Rarely used unless fixed length is required (e.g., postal codes with strict formatting).

Example:

CREATE TABLE postal_codes (  
  code char(5) PRIMARY KEY -- e.g., '10001' (padded to 5 chars)  
);  

varchar(n): Variable-Length with Limit

Stores up to n characters, no padding. Use when a maximum length is enforced (e.g., usernames limited to 50 chars).

text: Unlimited Variable-Length

Stores arbitrary-length text with no practical limit. Preferred for most text data (e.g., descriptions, comments) due to flexibility.

Best Practice: Use text instead of varchar(n) unless a strict length constraint exists. Both have similar performance, but text avoids arbitrary limits.

Example:

CREATE TABLE blog_posts (  
  title varchar(100) NOT NULL, -- Max 100 chars  
  content text NOT NULL -- Unlimited length  
);  

Binary Types

bytea: Binary Data

Stores binary data (e.g., images, files, or serialized objects). For large binaries (e.g., >1MB), consider storing file paths instead, but bytea works well for small data.

Example:

CREATE TABLE avatars (  
  user_id integer PRIMARY KEY,  
  image_data bytea -- Stores small profile images  
);  

-- Insert binary data (e.g., from a file)  
INSERT INTO avatars (user_id, image_data)  
VALUES (1, pg_read_binary_file('/path/to/avatar.png'));  

Date/Time Types

PostgreSQL excels at handling temporal data with precise types.

date: Calendar Date

Stores year, month, day (e.g., 2024-05-20).

Example:

CREATE TABLE events (  
  id integer PRIMARY KEY,  
  event_date date NOT NULL -- e.g., '2024-12-31'  
);  

time and timetz: Time of Day

  • time: Time without time zone (e.g., 14:30:00).
  • timetz: Time with time zone (rarely used; prefer timestamptz for time zone-aware data).

timestamp and timestamptz: Date + Time

  • timestamp: Date and time without time zone (use with caution—no time zone context).
  • timestamptz (timestamp with time zone): Date, time, and time zone (preferred for most applications). Stores UTC internally and converts on input/output.

Example:

CREATE TABLE logs (  
  id integer PRIMARY KEY,  
  event_time timestamptz NOT NULL DEFAULT NOW() -- Auto-sets to current time (UTC-aware)  
);  

-- Insert with time zone  
INSERT INTO logs (event_time) VALUES ('2024-05-20 15:30:00 America/New_York');  

interval: Time Duration

Represents a span of time (e.g., 2 days, 3 hours 30 minutes). Useful for calculations like “add 1 week to a date.”

Example:

SELECT NOW() + INTERVAL '1 month' AS next_month; -- e.g., '2024-06-20 10:00:00+00'  

Boolean Type

The boolean type stores true or false values. Inputs like TRUE, true, t, 1 are treated as true; FALSE, false, f, 0 as false.

Example:

CREATE TABLE users (  
  id integer PRIMARY KEY,  
  is_active boolean DEFAULT true -- Flag for active users  
);  

INSERT INTO users (id, is_active) VALUES (1, TRUE);  

Enumerated Types (Enum)

Enums let you define custom types with a fixed set of allowed values (e.g., order statuses: pending, shipped, delivered).

Syntax:

-- Step 1: Define the enum type  
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');  

-- Step 2: Use it in a table  
CREATE TABLE orders (  
  id integer PRIMARY KEY,  
  status order_status NOT NULL DEFAULT 'pending'  
);  

-- Insert valid value  
INSERT INTO orders (id, status) VALUES (1, 'shipped');  

-- Fails (invalid value)  
INSERT INTO orders (id, status) VALUES (2, 'lost'); -- ERROR: invalid input value for enum order_status: "lost"  

Caveat: Adding/removing enum values requires altering the type (use ALTER TYPE ... ADD VALUE), which can be restrictive in production.

Geometric Types

PostgreSQL supports geometric types for spatial data (e.g., points, lines, circles). Common types include:

  • point: (x, y) coordinates (e.g., (1, 2)).
  • circle: Circle defined by center point and radius (e.g., circle '(0,0), 5').

Example:

CREATE TABLE locations (  
  id integer PRIMARY KEY,  
  coordinates point NOT NULL -- e.g., '(40.7128, -74.0060)' (New York City)  
);  

-- Query points within a radius  
SELECT id FROM locations WHERE coordinates <-> point '(40.7128, -74.0060)' < 10; -- Distance < 10 units  

Network Address Types

For storing IP addresses, CIDR blocks, and MAC addresses.

inet: IP Address (with Subnet)

Stores IPv4 or IPv6 addresses, optionally with a subnet mask (e.g., 192.168.1.1/24).

cidr: CIDR Block

Similar to inet, but enforces valid CIDR notation (e.g., 192.168.1.0/24).

macaddr: MAC Address

Stores Ethernet MAC addresses (e.g., 08:00:2b:01:02:03).

Example:

CREATE TABLE devices (  
  id integer PRIMARY KEY,  
  ip inet NOT NULL, -- e.g., '10.0.0.1' (IPv4) or '2001:db8::1' (IPv6)  
  subnet cidr NOT NULL, -- e.g., '10.0.0.0/24'  
  mac macaddr -- e.g., 'aa:bb:cc:dd:ee:ff'  
);  

-- Find devices in a subnet  
SELECT id FROM devices WHERE ip <<= '10.0.0.0/24'; -- <<= = "is contained within"  

UUID Type

The uuid type stores 128-bit Universally Unique Identifiers (UUIDs), ideal for unique IDs across distributed systems.

Setup: Enable the uuid-ossp extension to generate UUIDs:

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";  

Example:

CREATE TABLE global_ids (  
  id uuid PRIMARY KEY DEFAULT uuid_generate_v4(), -- Auto-generate UUID v4  
  data text  
);  

INSERT INTO global_ids (data) VALUES ('test'); -- id = e.g., 'a1b2c3d4-...'  

JSON Types

PostgreSQL supports two JSON types for semi-structured data: json and jsonb.

json: Text-Based JSON

Stores JSON as plain text. Use for simple storage/serialization when querying is rare.

jsonb: Binary JSON (Optimized)

Stores JSON in binary format, enabling indexing and faster queries. Preferred for most use cases.

Key Features:

  • Indexing with GIN or BTREE.
  • Operators for querying: -> (get JSON object field), ->> (get field as text), @> (contains), etc.

Example:

CREATE TABLE products (  
  id integer PRIMARY KEY,  
  metadata jsonb NOT NULL -- e.g., '{"color": "red", "sizes": ["S", "M", "L"]}'  
);  

-- Add GIN index for fast queries  
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);  

-- Query: Find red products  
SELECT id FROM products WHERE metadata ->> 'color' = 'red';  

-- Query: Find products with size "M"  
SELECT id FROM products WHERE metadata @> '{"sizes": ["M"]}';  

Array Types

PostgreSQL allows arrays of any data type (e.g., integer[], text[]). Use for ordered lists (e.g., tags, permissions).

Example:

CREATE TABLE articles (  
  id integer PRIMARY KEY,  
  tags text[] NOT NULL -- e.g., '{"sql", "postgres", "data-types"}'  
);  

-- Insert array  
INSERT INTO articles (id, tags) VALUES (1, ARRAY['sql', 'postgres']);  

-- Access elements (1-based index)  
SELECT tags[1] FROM articles WHERE id = 1; -- Returns 'sql'  

-- Query: Find articles with "postgres" tag  
SELECT id FROM articles WHERE 'postgres' = ANY(tags);  

Best Practice: Avoid overusing arrays—they can complicate joins and aggregation.

Range Types

Range types represent intervals (e.g., date ranges, numeric ranges). Common types:

  • int4range: Integer range (e.g., [1, 10)).
  • daterange: Date range (e.g., [2024-01-01, 2024-12-31]).

Example:

CREATE TABLE bookings (  
  id integer PRIMARY KEY,  
  room_id integer NOT NULL,  
  stay_dates daterange NOT NULL -- e.g., '[2024-07-01, 2024-07-05)' (half-open interval)  
);  

-- Find overlapping bookings  
SELECT * FROM bookings  
WHERE stay_dates && daterange('[2024-07-03, 2024-07-04]'); -- && = overlaps  

Composite Types

Composite types group multiple fields into a single type (like structs in programming languages).

Example:

-- Define a composite type for addresses  
CREATE TYPE address AS (  
  street text,  
  city text,  
  zip char(5)  
);  

-- Use in a table  
CREATE TABLE customers (  
  id integer PRIMARY KEY,  
  home_address address NOT NULL  
);  

-- Insert composite value  
INSERT INTO customers (id, home_address)  
VALUES (1, ('123 Main St', 'New York', '10001'));  

-- Access fields (use parentheses)  
SELECT (home_address).city FROM customers WHERE id = 1; -- Returns 'New York'  

Specialized Types

  • hstore: Key-value pairs (deprecated in favor of jsonb for most cases).
  • tsvector/tsquery: For full-text search (e.g., to_tsvector('english', 'hello world')).
  • oid: Object identifier (avoid for user tables; used internally by PostgreSQL).

Reference: Data Type Cheat Sheet

CategoryTypesUse CasesKey Notes
Numericinteger, bigint, numericIDs, counts, financial dataUse numeric for exact decimals; serial for auto-increment.
Charactertext, varchar(n)Usernames, descriptionsPrefer text unless length is constrained.
Date/Timetimestamptz, date, intervalTimestamps, dates, durationsUse timestamptz for time zone-aware data.
BooleanbooleanFlags (active/inactive)Accepts TRUE/FALSE or t/f.
JSONjsonbSemi-structured data (e.g., metadata)Use GIN indexes for fast queries.
Networkinet, cidrIP addresses, subnetsinet allows individual IPs; cidr for blocks.
UUIDuuidGlobal unique IDsUse uuid-ossp extension to generate.

Conclusion

PostgreSQL’s diverse data types empower developers to model data precisely. By choosing the right type, you ensure data integrity, optimize performance, and simplify application logic. Refer to this guide as a quick reference, and always consult the PostgreSQL Documentation for edge cases.

Happy coding! 🚀