Table of Contents#
- Introduction to MongoEngine’s Document Class
- Custom Validation with clean() and validate()
- Pre-Save Hooks: Modifying Data Before Save
- Auto-Generating Custom IDs
- Advanced Field Validation with Custom Validators
- Practical Example: Building a Self-Validating Blog Post Model
- Best Practices
- Conclusion
- References
1. Introduction to MongoEngine’s Document Class#
At the core of MongoEngine is the Document class, which serves as the base for all your data models. A Document subclass defines the structure of your MongoDB collection, including fields, validation rules, and custom methods.
Basic Document Structure#
Here’s a simple example of a User document:
from mongoengine import Document, StringField, EmailField, IntField

class User(Document):
    username = StringField(required=True, max_length=50, unique=True)
    email = EmailField(required=True, unique=True)
    age = IntField(min_value=13)  # Built-in validation

    # Optional: Custom method
    def greet(self):
        return f"Hello, {self.username}!"

This model maps to a MongoDB collection named user (lowercase by default) and includes built-in validation (e.g., required=True, max_length, min_value). However, to enforce more complex rules (e.g., "username must not contain spaces"), we need custom logic.
2. Custom Validation with clean() and validate()#
MongoEngine triggers validation automatically when you call save() or validate() on a document. For custom validation, override the clean() method or use the validate() method.
The clean() Method#
The clean() method is called during validation and is ideal for cross-field validation (e.g., ensuring two fields are consistent) or complex rules that can’t be handled by built-in field validators.
How to Use clean():#
- Override clean() in your Document subclass.
- Raise a ValidationError if validation fails.
- Use self to access the document’s fields.
Example: Username and Email Validation#
from mongoengine import Document, StringField, EmailField, IntField, ValidationError

class User(Document):
    username = StringField(required=True, max_length=50, unique=True)
    email = EmailField(required=True, unique=True)
    age = IntField(min_value=13)

    def clean(self):
        # clean() runs before the field-level checks, so required fields may still be None here
        username = self.username or ""
        email = self.email or ""
        # Ensure username has no spaces
        if ' ' in username:
            raise ValidationError("Username cannot contain spaces.")
        # Ensure email domain is allowed (e.g., no disposable emails)
        if email.split('@')[-1] in ["mailinator.com", "tempmail.com"]:
            raise ValidationError("Disposable email addresses are not allowed.")
        # Cross-field validation: If under 18, email must be parental
        if self.age and self.age < 18 and not email.endswith(".parental.com"):
            raise ValidationError("Minors must use a parental email.")

Now, when you call user.save() or user.validate(), clean() runs, and invalid data raises a ValidationError.
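The domain check above compares the raw string, so an address such as user@Mailinator.com would slip past it. One way to tighten this is to normalize the domain before the lookup. The helper below is a hypothetical sketch: is_disposable and DISPOSABLE_DOMAINS are names introduced here for illustration and are not part of MongoEngine.

```python
# Hypothetical helper; the names and the domain list are illustrative assumptions.
DISPOSABLE_DOMAINS = frozenset({"mailinator.com", "tempmail.com"})


def is_disposable(email: str) -> bool:
    """Return True if the email's domain is on the disposable list."""
    # rpartition splits on the last "@", so the domain is everything after it
    _, _, domain = email.rpartition("@")
    return domain.strip().lower() in DISPOSABLE_DOMAINS
```

Inside clean(), the second check could then call is_disposable(self.email) instead of comparing the split string directly.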
The validate() Method#
The validate() method is the entry point for validation. By default, it calls clean() and checks field-level validators. You can override validate() for full control over the validation workflow, but this is rarely needed—clean() is sufficient for most cases.
3. Pre-Save Hooks: Modifying Data Before Save#
Pre-save hooks let you modify or enrich data before it’s saved to MongoDB. Common use cases include:
- Auto-generating timestamps (created_at, updated_at).
- Logging save events.
- Sanitizing input (e.g., stripping whitespace from strings).
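MongoEngine implements these hooks as signals, exposed in mongoengine.signals (signal support requires the blinker package). The pre_save signal is sent at the start of save(), before validation runs, and a handler is attached to it with pre_save.connect(). MongoEngine does not call a method named pre_save() on its own.
Using a pre_save() Classmethod#
Define the handler as a classmethod on the document and connect it to the pre_save signal. The handler receives the document class as sender and the instance being saved as document.
Example: Auto-Setting Timestamps#
from datetime import datetime
from mongoengine import Document, StringField, DateTimeField, signals

class User(Document):
    username = StringField(required=True)
    created_at = DateTimeField()
    updated_at = DateTimeField()

    @classmethod
    def pre_save(cls, sender, document, **kwargs):
        # Set created_at on first save (if not already set)
        if not document.created_at:
            document.created_at = datetime.utcnow()
        # Update updated_at on every save
        document.updated_at = datetime.utcnow()

# The handler only runs once it is connected to the signal
signals.pre_save.connect(User.pre_save, sender=User)

Now, whenever user.save() is called:
- created_at is set to the current UTC time if it is not already set (that is, on the first save).
- updated_at is set to the current UTC time.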
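datetime.utcnow() returns a naive datetime and is deprecated as of Python 3.12. The timestamp rule itself can be written against timezone-aware values, as in the sketch below. The names utc_now and apply_timestamps are assumptions introduced here, and the function works on any object with created_at and updated_at attributes. Whether stored values come back timezone-aware depends on how the database connection is configured, so check that before adopting it.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware alternative to datetime.utcnow()."""
    return datetime.now(timezone.utc)


def apply_timestamps(document) -> None:
    """Set created_at once and refresh updated_at on every call."""
    now = utc_now()
    if not getattr(document, "created_at", None):
        document.created_at = now  # first save only
    document.updated_at = now      # every save
```

A pre-save handler could call apply_timestamps(document) in place of the inline assignments.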
Using a Standalone Signal Handler#
To keep the logic outside the model class (or to attach several handlers), write a plain function and connect it with pre_save.connect() from mongoengine.signals:
from datetime import datetime
from mongoengine import Document, StringField, DateTimeField
from mongoengine.signals import pre_save

class User(Document):
    username = StringField(required=True)
    created_at = DateTimeField()
    updated_at = DateTimeField()

def set_timestamps(sender, document, **kwargs):
    if not document.created_at:
        document.created_at = datetime.utcnow()
    document.updated_at = datetime.utcnow()

# Run set_timestamps before every User save
pre_save.connect(set_timestamps, sender=User)

This achieves the same result as the pre_save() approach above, but keeps the logic separate from the model class.
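To make the signal mechanism less opaque, here is a toy dispatcher that mimics the connect/send pattern. It is a teaching sketch only: ToySignal, FakeUser, and both handlers are hypothetical names, and this is not how blinker or MongoEngine is implemented.

```python
from collections import defaultdict


class ToySignal:
    """Toy stand-in for a signal; not blinker's or MongoEngine's implementation."""

    def __init__(self, name):
        self.name = name
        self._handlers = defaultdict(list)  # sender -> list of handlers

    def connect(self, handler, sender):
        self._handlers[sender].append(handler)

    def send(self, sender, **kwargs):
        # Call every handler registered for this sender, in connection order
        for handler in self._handlers[sender]:
            handler(sender, **kwargs)


class FakeUser:
    """Plain class standing in for a Document subclass."""

    def __init__(self, username):
        self.username = username


def strip_username(sender, document, **kwargs):
    document.username = document.username.strip()


def lowercase_username(sender, document, **kwargs):
    document.username = document.username.lower()


toy_pre_save = ToySignal("pre_save")
toy_pre_save.connect(strip_username, sender=FakeUser)
toy_pre_save.connect(lowercase_username, sender=FakeUser)

user = FakeUser("  Alice ")
toy_pre_save.send(FakeUser, document=user)  # roughly what save() does before validating
```

The toy calls handlers in the order they were connected. Do not assume the real library guarantees any particular order; if two steps depend on each other, put them in one handler.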
4. Auto-Generating Custom IDs#
MongoDB uses ObjectId as the default _id field, but you may need custom IDs (e.g., slugs, UUIDs, or sequential numbers). Use pre-save hooks to generate these IDs automatically.
Example 1: Auto-Generating Slugs#
Slugs are URL-friendly versions of a title (e.g., "my-blog-post" from "My Blog Post"). Generate a slug from a title field using pre_save():
from mongoengine import Document, StringField, signals
from slugify import slugify  # Install with: pip install python-slugify

class BlogPost(Document):
    title = StringField(required=True)
    slug = StringField(primary_key=True, unique=True)  # Use slug as _id
    content = StringField(required=True)

    @classmethod
    def pre_save(cls, sender, document, **kwargs):
        # Generate slug if not already set
        if not document.slug:
            base_slug = slugify(document.title)
            existing_slugs = cls.objects(slug__startswith=base_slug).count()
            # Add a suffix if slug exists (e.g., "my-blog-post-2")
            if existing_slugs > 0:
                document.slug = f"{base_slug}-{existing_slugs + 1}"
            else:
                document.slug = base_slug

# Connect the handler so it runs before each BlogPost save
signals.pre_save.connect(BlogPost.pre_save, sender=BlogPost)

- primary_key=True replaces the default ObjectId with slug as the unique identifier.
- slugify converts the title to lowercase and replaces spaces with hyphens.
- We check for existing slugs to avoid duplicates (e.g., "my-blog-post" becomes "my-blog-post-2" if the first exists).
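Counting slugs that share a prefix is simple, but it can pick a suffix that is already taken. For example, if "my-blog-post" and "my-blog-post-3" exist (the "-2" post was deleted), the count is 2 and the code proposes "my-blog-post-3" again. Unrelated slugs such as "my-blog-post-ideas" also inflate the count. A minimal sketch of an alternative, assuming the existing slugs have already been fetched into a set (next_free_slug is a hypothetical helper name):

```python
def next_free_slug(base_slug: str, existing: set) -> str:
    """Return base_slug, or base_slug-N for the smallest N >= 2 not in existing."""
    if base_slug not in existing:
        return base_slug
    suffix = 2
    while f"{base_slug}-{suffix}" in existing:
        suffix += 1
    return f"{base_slug}-{suffix}"
```

This is still not safe against two saves racing for the same slug; the unique primary key remains the final guard, and the caller would need to retry on a duplicate-key error.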
Example 2: Auto-Generating UUIDs#
For globally unique identifiers (UUIDs), use Python’s uuid module:
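import uuid
from mongoengine import Document, StringField, signals

class Product(Document):
    product_id = StringField(primary_key=True, unique=True)
    name = StringField(required=True)

    @classmethod
    def pre_save(cls, sender, document, **kwargs):
        if not document.product_id:
            document.product_id = str(uuid.uuid4())  # Generate UUID4

signals.pre_save.connect(Product.pre_save, sender=Product)

This gives each Product a unique product_id such as 3f2b8c1e-9d4a-4e6b-8a7c-1f0e2d3c4b5a (32 hexadecimal digits in five hyphen-separated groups).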
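As a quick check of what the generated value looks like, the sketch below produces an ID and parses it back with the standard library. new_product_id is a hypothetical helper name used only for illustration.

```python
import uuid


def new_product_id() -> str:
    """Return a random (version 4) UUID in its canonical 36-character form."""
    return str(uuid.uuid4())


product_id = new_product_id()
# uuid.UUID() raises ValueError if the string is not a well-formed UUID
parsed = uuid.UUID(product_id)
```

Parsing with uuid.UUID is also a convenient way to validate IDs that arrive from outside the application, for example in a URL.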
Example 3: Sequential IDs (Advanced)#
For sequential IDs (e.g., order-001, order-002), use a counter collection to track the last ID and increment it:
from mongoengine import Document, StringField, IntField, FloatField, signals

class Counter(Document):
    name = StringField(required=True, unique=True)
    value = IntField(default=0)

class Order(Document):
    order_id = StringField(primary_key=True)
    total = FloatField(required=True)

    @classmethod
    def pre_save(cls, sender, document, **kwargs):
        if not document.order_id:
            # Atomically increment the order counter, creating it on first use
            counter = Counter.objects(name="order_counter").modify(
                upsert=True, new=True, inc__value=1
            )
            document.order_id = f"order-{counter.value:03d}"  # Format as 001, 002, etc.

signals.pre_save.connect(Order.pre_save, sender=Order)

Note: This uses MongoDB’s atomic modify to avoid race conditions when incrementing the counter.
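The formatting step is worth a closer look: the 03d format pads to three digits but does not cap the width, so IDs past 999 grow longer and no longer sort correctly as strings. The sketch below shows this with an in-memory stand-in for the counter. InMemoryCounter and format_order_id are hypothetical names, and the lock only imitates the role of the atomic database update.

```python
import threading


def format_order_id(n: int) -> str:
    """Zero-pad to at least three digits: 1 -> order-001, 1000 -> order-1000."""
    return f"order-{n:03d}"


class InMemoryCounter:
    """Stand-in for the Counter collection; the lock imitates an atomic increment."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


counter = InMemoryCounter()
ids = [format_order_id(counter.increment()) for _ in range(3)]
```

If IDs must sort lexicographically beyond 999 orders, choose a wider padding up front (for example 08d) rather than changing it later.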
5. Advanced Field Validation with Custom Validators#
For field-specific validation (e.g., password complexity), pass a callable to the field’s validation parameter. The callable checks the field’s value and raises ValidationError if the value is invalid.
Step 1: Define a Validator Function#
A validator function takes a single argument, value (the field’s value), and raises ValidationError when the value is rejected.
from mongoengine import ValidationError

def password_complexity(value):
    if len(value) < 8:
        raise ValidationError("Password must be at least 8 characters long.")
    if not any(c.isupper() for c in value):
        raise ValidationError("Password must contain an uppercase letter.")
    if not any(c.isdigit() for c in value):
        raise ValidationError("Password must contain a number.")

Step 2: Attach the Validator to a Field#
Pass the validator to a StringField through the validation parameter, which accepts one callable:
from mongoengine import Document, StringField

class User(Document):
    username = StringField(required=True)
    password = StringField(
        required=True,
        validation=password_complexity  # Attach custom validator
    )

Now, when a User is validated (which save() does by default), password_complexity runs automatically.
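When several independent checks need to be bundled into one callable, a small combinator keeps each rule separately testable. The sketch below is an assumption-laden illustration: combine_validators, min_length_8, and has_digit are hypothetical names, and ValueError stands in for mongoengine's ValidationError so the snippet runs without the library.

```python
def combine_validators(*validators):
    """Return one callable that runs each validator in order."""
    def run_all(value):
        for validator in validators:
            validator(value)  # each validator raises on failure, which stops the chain
    return run_all


# Illustrative checks; in real code these would raise mongoengine.ValidationError
def min_length_8(value):
    if len(value) < 8:
        raise ValueError("too short")


def has_digit(value):
    if not any(c.isdigit() for c in value):
        raise ValueError("needs a digit")


check_password = combine_validators(min_length_8, has_digit)
check_password("abcdefg1")  # passes silently
```

The combined callable reports only the first failing rule, which matches the behavior of the single password_complexity function above.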
Built-in Validators#
MongoEngine ships common format checks as dedicated field types (e.g., EmailField, URLField) rather than as standalone validator classes. For example:
from mongoengine import Document, URLField

class Profile(Document):
    website = URLField()  # Ensures URL format

6. Practical Example: Building a Self-Validating Blog Post Model#
Let’s combine everything into a BlogPost model with:
- Auto-generated slug (ID).
- Timestamps (created_at, updated_at).
- Custom validation (title length, content checks).
- Field validators (author name format).
Full Example Code#
from mongoengine import Document, StringField, DateTimeField, ValidationError
from mongoengine.signals import pre_save
from datetime import datetime
from slugify import slugify
import re

# Custom validator for author name (letters only)
def validate_author(value):
    if not re.match(r'^[A-Za-z\s]+$', value):
        raise ValidationError("Author name can only contain letters and spaces.")

class BlogPost(Document):
    title = StringField(required=True, max_length=200)
    slug = StringField(primary_key=True, unique=True)  # Custom ID
    content = StringField(required=True)
    author = StringField(required=True, validation=validate_author)
    created_at = DateTimeField()
    updated_at = DateTimeField()

    def clean(self):
        # clean() runs before the required-field checks, so guard against None
        # Validate title length
        if len(self.title or "") < 5:
            raise ValidationError("Title must be at least 5 characters long.")
        # Validate content is not empty
        if not (self.content or "").strip():
            raise ValidationError("Content cannot be empty.")

# Pre-save hook to generate slug and timestamps
def blog_post_pre_save(sender, document, **kwargs):
    # Generate slug if missing
    if not document.slug:
        base_slug = slugify(document.title)
        existing = BlogPost.objects(slug__startswith=base_slug).count()
        document.slug = f"{base_slug}-{existing + 1}" if existing else base_slug
    # Set timestamps
    if not document.created_at:
        document.created_at = datetime.utcnow()
    document.updated_at = datetime.utcnow()

# Connect the handler to the pre_save signal for BlogPost
pre_save.connect(blog_post_pre_save, sender=BlogPost)

# Usage Example (assumes a connection was already opened with mongoengine.connect())
try:
    post = BlogPost(
        title="Getting Started with MongoEngine",
        content="MongoEngine is a powerful ODM for MongoDB...",
        author="Jane Doe"
    )
    post.save()
    print(f"Created post with slug: {post.slug}")  # e.g. "getting-started-with-mongoengine"
except ValidationError as e:
    print(f"Validation failed: {e}")

Key Features Explained:#
- Slug Generation: pre_save generates a URL-friendly slug from the title, appending a number if the slug already exists.
- Timestamps: created_at is set on first save, and updated_at updates on every save.
- Validation: clean() ensures the title is long enough and content isn’t empty. validate_author ensures the author’s name contains only letters and spaces.
7. Best Practices#
- Prefer Built-in Validators: Use MongoEngine’s built-in validators (e.g., max_length, EmailField) for simplicity before writing custom logic.
- Keep clean() Focused: Use clean() for cross-field validation; use field validators for single-field rules.
- Test Validation Logic: Write unit tests for custom validators and clean() methods to catch edge cases.
- Avoid Side Effects in Hooks: Pre-save hooks should modify the document or log, not perform external actions (e.g., API calls).
- Document Custom Logic: Add docstrings to clean(), pre_save(), and validators to explain their purpose.
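As an illustration of the testing advice, the sketch below exercises the author-name rule from the blog post example with the standard unittest module. It assumes a local ValidationError stand-in so the tests run without MongoEngine or a database; in a real project the validator and exception would be imported from your own module and from mongoengine.

```python
import re
import unittest


class ValidationError(Exception):
    """Local stand-in so the tests run without MongoEngine installed."""


def validate_author(value):
    if not re.match(r'^[A-Za-z\s]+$', value):
        raise ValidationError("Author name can only contain letters and spaces.")


class ValidateAuthorTests(unittest.TestCase):
    def test_accepts_letters_and_spaces(self):
        validate_author("Jane Doe")  # should not raise

    def test_rejects_digits(self):
        with self.assertRaises(ValidationError):
            validate_author("Jane Doe 2")

    def test_rejects_empty_string(self):
        with self.assertRaises(ValidationError):
            validate_author("")

    def test_rejects_accented_letters(self):
        # Edge case worth knowing: the ASCII-only pattern rejects names like "José"
        with self.assertRaises(ValidationError):
            validate_author("José")
```

Run the tests with python -m unittest. The last case shows the kind of edge case such tests surface: whether rejecting accented names is acceptable is a product decision, not a bug in the test.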
8. Conclusion#
MongoEngine’s Document class gives you the tools to build robust, self-validating data models. By overriding clean() for custom validation, connecting pre_save signal handlers for pre-save logic, and attaching validators for field-specific rules, you can ensure data integrity and enforce business requirements. Auto-generating custom IDs (like slugs or UUIDs) further tailors your models to application needs.
With these tools, you’ll create MongoDB collections that are consistent, secure, and easy to maintain.