Table of Contents#
- Understanding DynamoDB’s Case Sensitivity
- Why Case Sensitivity Matters
- Limitation of Key Conditions
- Solution 1: Maintain a Normalized (Lowercase/Uppercase) Attribute
- How It Works
- Java SDK Implementation
- Solution 2: Use Global Secondary Indexes (GSIs) with Normalized Attributes
- When to Use GSIs
- Java SDK Implementation
- Solution 3: Filter Expressions with String Functions (DynamoDB 2019 Update)
- How Filter Expressions Work
- Limitations and Trade-offs
- Java SDK Implementation
- Workarounds for Specific Scenarios
- Scans with Filter Expressions (For Small Datasets)
- Integrating with External Search Tools (e.g., Elasticsearch)
- Best Practices and Considerations
- Common Pitfalls to Avoid
- Conclusion
- References
Understanding DynamoDB’s Case Sensitivity#
Why Case Sensitivity Matters#
DynamoDB’s string comparisons follow strict binary collation, meaning "Apple" ≠ "apple" and "123" ≠ "123 " (trailing spaces matter). This behavior aligns with DynamoDB’s design for performance and predictability but conflicts with user expectations in applications like:
- User authentication (usernames are often case-insensitive).
- Search features (e.g., "search for all users named John").
- Data validation (ensuring unique values regardless of case).
Limitation of Key Conditions#
DynamoDB queries rely on key conditions (for primary keys or global secondary indexes/GSIs) to fetch data efficiently. However, key conditions only support:
- Equality (
=), - Begins-with (
begins_with), - Range comparisons (
<,>,<=,>=), - And a few others (e.g.,
BETWEEN).
Critically, key conditions do not support string normalization functions like LOWER() or UPPER(). This means you cannot directly query a primary key or GSI with a case-insensitive condition.
Solution 1: Maintain a Normalized (Lowercase/Uppercase) Attribute#
How It Works#
The most straightforward solution is to store a normalized version of the attribute (e.g., lowercase or uppercase) alongside the original value. For example, if you have a username attribute, add a username_lower attribute that stores username.toLowerCase(). You can then query username_lower with a case-insensitive input (e.g., convert the user’s search term to lowercase and query username_lower).
Java SDK Implementation#
Prerequisites#
- AWS SDK for Java 2.x (we’ll use
dynamodb-enhanced-clientfor object mapping). - A DynamoDB table (e.g.,
Userswith primary keyuserId(partition key)).
Step 1: Define the Data Model with Normalized Attribute#
Create a Java class representing your DynamoDB item, including the normalized attribute:
import software.amazon.awssdk.enhanced.dynamodb.mapper.annotations.DynamoDbAttribute;
import software.amazon.awssdk.enhanced.dynamodb.mapper.annotations.DynamoDbBean;
import software.amazon.awssdk.enhanced.dynamodb.mapper.annotations.DynamoDbPartitionKey;
@DynamoDbBean
public class User {
private String userId;
private String username;
private String usernameLower; // Normalized lowercase attribute
@DynamoDbPartitionKey
@DynamoDbAttribute("userId")
public String getUserId() { return userId; }
public void setUserId(String userId) { this.userId = userId; }
@DynamoDbAttribute("username")
public String getUsername() { return username; }
public void setUsername(String username) {
this.username = username;
// Automatically update normalized attribute when username changes
this.usernameLower = username.toLowerCase();
}
@DynamoDbAttribute("username_lower")
public String getUsernameLower() { return usernameLower; }
public void setUsernameLower(String usernameLower) { this.usernameLower = usernameLower; }
}Step 2: Insert Items with Normalized Attributes#
Use the DynamoDB Enhanced Client to insert items. The usernameLower field is auto-populated when username is set:
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbEnhancedClient;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbTable;
import software.amazon.awssdk.enhanced.dynamodb.TableSchema;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
public class UserDao {
private final DynamoDbTable<User> userTable;
public UserDao() {
DynamoDbClient ddbClient = DynamoDbClient.builder()
.region(Region.US_EAST_1)
.build();
DynamoDbEnhancedClient enhancedClient = DynamoDbEnhancedClient.builder()
.dynamoDbClient(ddbClient)
.build();
this.userTable = enhancedClient.table("Users", TableSchema.fromBean(User.class));
}
public void saveUser(User user) {
userTable.putItem(user);
}
// Example usage
public static void main(String[] args) {
UserDao dao = new UserDao();
User user = new User();
user.setUserId("123");
user.setUsername("JohnDoe"); // Automatically sets usernameLower to "johndoe"
dao.saveUser(user);
}
}Step 3: Query Using the Normalized Attribute#
To query case-insensitively, convert the input to lowercase and query username_lower:
import software.amazon.awssdk.enhanced.dynamodb.conditions.ExpressionCondition;
import java.util.List;
import static software.amazon.awssdk.enhanced.dynamodb.ExpressionCondition.equalTo;
public List<User> findByUsernameCaseInsensitive(String username) {
String normalizedInput = username.toLowerCase();
// Query for items where username_lower equals normalizedInput
return userTable.query(
r -> r.queryConditional(
ExpressionCondition.sortBetween(
equalTo("username_lower", normalizedInput)
)
)
).items().stream().toList();
}Note: For this to work, username_lower should be part of a GSI (see Solution 2) if it’s not already a sort key in the base table.
Solution 2: Use Global Secondary Indexes (GSIs) with Normalized Attributes#
When to Use GSIs#
If the normalized attribute (e.g., username_lower) is not part of the base table’s primary key, create a Global Secondary Index (GSI) with the normalized attribute as the sort key (or partition key). This allows efficient queries on the normalized value.
Java SDK Implementation#
Step 1: Create a GSI on the Normalized Attribute#
Add a GSI to your Users table via the AWS Console or CloudFormation. For example:
- GSI name:
UsernameLowerIndex - Partition key:
username_lower(string) - Projection:
ALL(or specific attributes you need).
Step 2: Query the GSI#
Use the DynamoDB Enhanced Client to query the GSI:
import software.amazon.awssdk.enhanced.dynamodb.Index;
public List<User> findByUsernameCaseInsensitiveViaGsi(String username) {
String normalizedInput = username.toLowerCase();
// Get the GSI from the table
Index<User> usernameLowerIndex = userTable.index("UsernameLowerIndex");
// Query the GSI for username_lower = normalizedInput
return usernameLowerIndex.query(
r -> r.queryConditional(
ExpressionCondition.sortBetween(
equalTo("username_lower", normalizedInput)
)
)
).items().stream().toList();
}Why This Works: GSIs allow you to query on non-primary key attributes efficiently. By indexing username_lower, you turn a case-insensitive search into a fast equality query on the GSI.
Solution 3: Filter Expressions with String Functions (DynamoDB 2019-02-12 Update)#
How Filter Expressions Work#
In 2019, DynamoDB introduced string functions like LOWER() and UPPER() in filter expressions. Filter expressions apply after a query or scan, allowing you to normalize attributes on the fly. For example:
LOWER(username) = :searchTermLower
Limitations and Trade-offs#
- Filter expressions do not reduce read capacity units (RCUs): They are applied after the query/scan retrieves data, so you pay for all data read before filtering.
- Inefficient for large datasets: Use this only for small datasets or when you cannot modify the schema.
Java SDK Implementation#
Query the base table (or GSI) with a key condition, then apply a filter expression to normalize the attribute:
public List<User> findByUsernameWithFilter(String username) {
String normalizedInput = username.toLowerCase();
// Query the base table (e.g., by userId), then filter on username
return userTable.query(
r -> r.queryConditional(
ExpressionCondition.equalTo("userId", "123") // Example key condition
).filterExpression(
ExpressionCondition.equalTo(
ExpressionFunction.lower(ExpressionAttributeValue.fromAttributeValue(AttributeValue.fromS("username"))),
normalizedInput
)
)
).items().stream().toList();
}Key Note: Replace the key condition (userId = "123") with a relevant condition for your table. Without a narrow key condition, this will read excessive data.
Workarounds for Specific Scenarios#
Scans with Filter Expressions (Last Resort)#
If you cannot modify the schema and need to search across the entire table, use a Scan with a filter expression. Avoid scans on large tables—they are slow and expensive:
public List<User> scanForUsernameCaseInsensitive(String username) {
String normalizedInput = username.toLowerCase();
return userTable.scan(
r -> r.filterExpression(
ExpressionCondition.equalTo(
ExpressionFunction.lower(ExpressionAttributeValue.fromAttributeValue(AttributeValue.fromS("username"))),
normalizedInput
)
)
).items().stream().toList();
}Integrating with External Search Tools (Elasticsearch)#
For advanced use cases (e.g., full-text search, partial matches), integrate DynamoDB with Amazon Elasticsearch Service (OpenSearch Service):
- Use DynamoDB Streams to sync data to Elasticsearch.
- Index data in Elasticsearch with case-insensitive analyzers.
- Query Elasticsearch for case-insensitive results.
This is complex but ideal for large-scale applications needing robust search.
Best Practices and Considerations#
- Choose Normalized Attributes for Efficiency: Prefer Solutions 1 or 2 (normalized attributes + GSIs) for performance. They minimize RCU usage and scale well.
- Maintain Data Consistency: Use DynamoDB Transactions to update both the original and normalized attributes atomically:
// Example: Update username and username_lower in a transaction dynamoDbClient.transactWriteItems(r -> r.transactItems( TransactWriteItem.update( u -> u.tableName("Users") .key(Map.of("userId", AttributeValue.fromS("123"))) .updateExpression("SET username = :newName, username_lower = :newNameLower") .expressionAttributeValues(Map.of( ":newName", AttributeValue.fromS("NewJohnDoe"), ":newNameLower", AttributeValue.fromS("newjohndoe") )) ) )); - Avoid Scans: Scans with filter expressions are suitable only for small datasets (e.g., <1k items).
- Monitor RCUs: Use Amazon CloudWatch to track read usage and optimize queries.
Common Pitfalls to Avoid#
- Stale Normalized Attributes: Forgetting to update
username_lowerwhenusernamechanges leads to incorrect queries. - Overusing Filter Expressions: They do not reduce RCU costs—always pair them with narrow key conditions.
- Ignoring GSI Projections: Ensure GSIs project only necessary attributes to reduce storage costs.
Conclusion#
Case-insensitive queries in DynamoDB require careful planning. For most applications:
- Use Solution 1 (normalized attributes) if the normalized attribute is part of the primary key.
- Use Solution 2 (GSIs with normalized attributes) for flexible query patterns.
- Avoid Solution 3 (filter expressions) unless you have small datasets.
By combining these strategies with best practices like transactions and GSI optimization, you can achieve efficient case-insensitive queries in DynamoDB.