Table of Contents#
- Understanding
NOT EXISTSin Raw SQL - SQLAlchemy ORM Basics: Setup & Models
- Converting Raw SQL
NOT EXISTSto SQLAlchemy ORM - Advanced Use Cases
- Troubleshooting Common Errors
- Best Practices
- Conclusion
- References
1. Understanding NOT EXISTS in Raw SQL#
Before diving into SQLAlchemy, let’s clarify how NOT EXISTS works in raw SQL. The NOT EXISTS clause returns TRUE if the subquery inside it returns no rows. It’s correlated, meaning the subquery references columns from the outer query, creating a link between the two.
Example Scenario#
Suppose we have two tables:
users: Stores user data (id,name).orders: Stores order data (id,user_id(foreign key tousers.id),order_date).
Goal: Find all users who have never placed an order.
Raw SQL Query#
SELECT *
FROM users
WHERE NOT EXISTS (
SELECT 1 -- Subquery checks for orders linked to the user
FROM orders
WHERE orders.user_id = users.id -- Correlation: Link subquery to outer query
); How it works:
- The subquery
SELECT 1 FROM orders WHERE orders.user_id = users.idchecks if there’s at least one order for the current user (from the outeruserstable). NOT EXISTSnegates this: If the subquery returns no rows (no orders for the user), the outer row (user) is included in the result.
2. SQLAlchemy ORM Basics: Setup & Models#
To follow along, we’ll use SQLAlchemy’s declarative base to define models and a session to execute queries.
Step 1: Install SQLAlchemy#
pip install sqlalchemy Step 2: Define Models#
We’ll reuse the users and orders tables from the raw SQL example. Here’s how to model them in SQLAlchemy:
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey, DateTime
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship
from datetime import datetime
# Initialize base class for models
Base = declarative_base()
# User model
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True)
name = Column(String(50), nullable=False)
# Relationship to orders (one-to-many)
orders = relationship("Order", back_populates="user")
# Order model
class Order(Base):
__tablename__ = "orders"
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
order_date = Column(DateTime, default=datetime.utcnow)
# Relationship back to user
user = relationship("User", back_populates="orders")
# Setup in-memory SQLite database (for demo)
engine = create_engine("sqlite:///demo.db")
Base.metadata.create_all(engine) # Create tables
# Create session to interact with the database
Session = sessionmaker(bind=engine)
session = Session() 3. Converting Raw SQL NOT EXISTS to SQLAlchemy ORM#
Now, let’s convert the raw SQL NOT EXISTS query (find users with no orders) to SQLAlchemy ORM.
Key SQLAlchemy Tools#
exists(): Constructs anEXISTSsubquery.not_(): Negates a condition (equivalent toNOTin SQL).
Step 1: Build the EXISTS Subquery#
First, create a subquery that checks for the existence of orders for a user. In SQLAlchemy, this is done with exists().
from sqlalchemy import select
# Subquery: "EXISTS (SELECT 1 FROM orders WHERE orders.user_id = users.id)"
subquery = select(1).where(Order.user_id == User.id).exists() select(1): We only care if rows exist, so we select a constant (1) instead of actual columns (faster).where(Order.user_id == User.id): Correlates the subquery to the outerUsertable (critical for correctness!)..exists(): Converts the selectable to anEXISTSclause.
Step 2: Negate with not_()#
Next, negate the EXISTS subquery using not_() to get NOT EXISTS.
from sqlalchemy import not_
# Final query: "WHERE NOT EXISTS (...)"
users_with_no_orders = session.query(User).filter(not_(subquery)).all() Full SQLAlchemy Query#
Combining it all:
from sqlalchemy import select, not_
# 1. Define the EXISTS subquery
subquery = select(1).where(Order.user_id == User.id).exists()
# 2. Filter users where NOT EXISTS (subquery)
result = session.query(User).filter(not_(subquery)).all()
# Print results
for user in result:
print(f"User {user.id}: {user.name} (no orders)") Verify Generated SQL#
To confirm SQLAlchemy generates the expected SQL, print the query:
print(session.query(User).filter(not_(subquery))) Output:
SELECT users.id AS users_id, users.name AS users_name
FROM users
WHERE NOT EXISTS (SELECT 1
FROM orders
WHERE orders.user_id = users.id) This matches our raw SQL example!
4. Advanced Use Cases#
Use Case 1: NOT EXISTS with Multiple Conditions#
Filter users with no orders in the last 30 days.
from datetime import datetime, timedelta
# Subquery: Orders for the user in the last 30 days
thirty_days_ago = datetime.utcnow() - timedelta(days=30)
subquery = select(1).where(
(Order.user_id == User.id) & # Correlate to user
(Order.order_date >= thirty_days_ago) # Additional condition
).exists()
# Users with no recent orders
users_no_recent_orders = session.query(User).filter(not_(subquery)).all() Use Case 2: Self-Referencing Tables (e.g., Employees & Managers)#
For a self-referencing table (e.g., employees with manager_id), find employees with no direct reports.
Model Definition#
class Employee(Base):
__tablename__ = "employees"
id = Column(Integer, primary_key=True)
name = Column(String(50))
manager_id = Column(Integer, ForeignKey("employees.id")) # Self-reference
# Relationships
manager = relationship("Employee", remote_side=[id], backref="reports") Query: Employees with No Reports#
# Subquery: Check if employee has direct reports (reports.manager_id = employee.id)
subquery = select(1).where(Employee.id == Employee.reports.any().manager_id).exists()
# Employees with no reports (NOT EXISTS subquery)
employees_no_reports = session.query(Employee).filter(not_(subquery)).all() Use Case 3: Combine NOT EXISTS with Other Filters#
Find active users (status=1) with no orders.
subquery = select(1).where(Order.user_id == User.id).exists()
active_users_no_orders = session.query(User).filter(
(User.status == 1) & # Active users
not_(subquery) # No orders
).all() 5. Troubleshooting Common Errors#
Error 1: Uncorrelated Subquery (No Results or All Results)#
Problem: The subquery isn’t linked to the outer query, so it checks for any rows (not per outer row).
Example Mistake:
# ❌ Uncorrelated: Checks if there are ANY orders (not orders for the user)
subquery = select(1).where(Order.user_id == 123).exists() # Hardcoded user_id Fix: Always correlate the subquery to the outer table using outer_table.column == subquery_table.column:
# ✅ Correlated: Checks orders for the current user (from outer query)
subquery = select(1).where(Order.user_id == User.id).exists() Error 2: Using not Instead of sqlalchemy.not_()#
Problem: Using Python’s built-in not instead of SQLAlchemy’s not_() function.
Example Mistake:
# ❌ Python's `not` doesn't work in SQLAlchemy filters
session.query(User).filter(not subquery).all() # Raises error! Fix: Use sqlalchemy.not_():
# ✅ Correct: SQLAlchemy's not_() negates the EXISTS subquery
from sqlalchemy import not_
session.query(User).filter(not_(subquery)).all() Error 3: Subquery Returns Columns Instead of a Scalar#
Problem: The subquery selects actual columns (e.g., select(Order.id)), which is unnecessary and slower.
Fix: Select a constant (e.g., select(1)) since EXISTS only cares about row existence:
# ✅ Efficient: Select 1 instead of columns
subquery = select(1).where(Order.user_id == User.id).exists() 6. Best Practices#
- Correlate Subqueries: Always link the subquery to the outer query (e.g.,
Order.user_id == User.id) to ensure row-level checks. - Use
select(1)forEXISTS: Selecting a constant is faster than selecting columns, asEXISTSstops searching after the first match. - Test with
print(query): Verify the generated SQL matches your expectations (critical for debugging). - Prefer
NOT EXISTSOverLEFT JOIN + IS NULL:NOT EXISTSis often more efficient thanLEFT JOIN+IS NULL(databases optimizeEXISTSto short-circuit). - Index Foreign Keys: Add indexes to columns used in subquery conditions (e.g.,
orders.user_id) to speed up correlated subqueries.
7. Conclusion#
The NOT EXISTS clause is indispensable for filtering records based on absent relationships, and SQLAlchemy ORM makes it accessible with exists() and not_(). By following the steps in this guide—correlating subqueries, negating with not_(), and avoiding common pitfalls—you can seamlessly translate raw SQL NOT EXISTS logic into maintainable Python code.
Whether you’re working with simple parent-child tables or complex self-referencing relationships, SQLAlchemy’s NOT EXISTS implementation ensures your queries are both readable and efficient.