AWS DynamoDB data modeling - Part 1
Introduction
It's very important to keep in mind that we should model our data before we start writing the code. Defining your access patterns helps you to model the data. PK (partition key) and SK (sort key) handle most of the access patterns.
Some Dynamodb terminology:
- PK (partition key) - the primary key that uniquely identifies each item in the table if the table has only a partition key. If the table has a sort key, then the combination of the partition key and sort key uniquely identifies each item. It's also called the hash key.
- SK (sort key) - together with the partition key, it makes up the composite primary key. In the item collection (same partition key for all items), the sort key should be unique. SK plays an important role in our access patterns. It's also called the range key.
- Attribute - Name and value pair.
- Item - a collection of attributes. For example, user item, album item, etc.
You can find the complete code in the feature/user-albums branch if you purchased the kit. The code also covers creating API endpoints, integrating lambda with API Gateway, and more.
Read more about DynamoDB core components and items/attributes.
Access patterns
For this tutorial, we will use the following access patterns:
Access Patterns | Query conditions | |
---|---|---|
1 | Get all the albums for a given user | PK: USER#<user-id> and begins_with SK: ALBUM#<album-id> |
2 | Get a specific album for a given user | PK: USER#<user-id> and SK: ALBUM#<album-id> |
3 | Get all the albums for a given user that are created after a specific date | PK: USER#<user-id> and SK: ALBUM#<album-id> > specific time |
4 | Get all the albums for a given user that are created before a specific date | PK: USER#<user-id> and SK: ALBUM#<album-id> < specific time |
5 | Get all the albums for a given user that are created between two dates | PK: USER#<user-id> and SK: ALBUM#<album-id> between specific time1 and time2 |
Let's go through each access pattern and see how we can model the data.
Access pattern 1
This is the simplest access pattern. All we need is the user id. The user item collection might include other items such as payments, addresses, etc. We need to make sure we sort the albums by using the ALBUM#
prefix. The hash sign is used to separate the prefix from the id.
const queryCommandParams: QueryCommandInput = {
TableName: tableName,
KeyConditionExpression: 'PK = :PK and begins_with(SK, :SK)',
ExpressionAttributeValues: {
':PK': `USER#${userId}`,
':SK': 'ALBUM#',
},
};
const queryItemsCommand = new QueryCommand(queryItemsCommandParams);
const res = await documentClient.send(queryItemsCommand);
const albumItems = res.Items;
Read more about DynamoDB condition expressions here.
Access pattern 2
PK is the user id and SK is the album id.
const getItemCommandParams: GetCommandInput = {
TableName: tableName,
Key: {
PK: `USER#${userId}`,
SK: `ALBUM#${albumId}`,
},
};
const getItemCommand = new GetCommand(getItemCommandParams);
const res = await documentClient.send(getItemCommand);
const item = res.Item;
Access pattern 3
If the access pattern involves a date, then then SK should be sortable. UUID is not sortable. So, we need a sortable unique id. KSUID is a good option here. It's sortable and unique. KSUID is composed of a timestamp and a random string. If we create an album item with a KSUID, then we can use the SK for access patterns 3, 4, and 5.
You can use other sortable unique ids such as ulid, ulidx, etc. KSUID is just an example.
// npm i ksuid
import KSUID from 'ksuid';
const albumId = KSUID.randomSync().string;
Now let's see how we can retrieve the albums that are created after a specific date. First, we need to convert the date to a KSUID string. Then we will use the >
operator to query the albums that are created after the date.
const timeInMilliseconds = 1700170103565;
const ksuidFromTimeInMilliseconds = (await KSUID.random(timeInMilliseconds))
.string;
// or from an ISO date
// const ksuidFromIsoDate = (await KSUID.random(new Date('2023-01-12:00:00Z')))
// .string;
const queryCommandParams: QueryCommandInput = {
TableName: tableName,
KeyConditionExpression: 'PK = :PK AND SK > :SK',
ExpressionAttributeValues: {
':PK': PK,
':SK': `ALBUM#${ksuidFromTimeInMilliseconds}`,
},
};
const queryItemsCommand = new QueryCommand(queryItemsCommandParams);
const res = await documentClient.send(queryItemsCommand);
const albumItems = res.Items;
This query will return all the albums that are created after the specific date.
Access pattern 4
This is similar to access pattern 3. The only difference is we use the <
operator instead of >
.
const timeInMilliseconds = 1700170103565;
const ksuidFromTimeInMilliseconds = (await KSUID.random(timeInMilliseconds))
.string;
const queryCommandParams: QueryCommandInput = {
TableName: tableName,
KeyConditionExpression: 'PK = :PK AND SK < :SK',
ExpressionAttributeValues: {
':PK': PK,
':SK': `ALBUM#${ksuidFromTimeInMilliseconds}`,
},
};
const queryItemsCommand = new QueryCommand(queryItemsCommandParams);
const res = await documentClient.send(queryItemsCommand);
const albumItems = res.Items;
Access pattern 5
This is also similar to access patterns 3 and 4. We use the between
operator to query the albums that are created between two dates.
const fromTimeInMilliseconds1 = 1700170103565;
const toTimeInMilliseconds2 = 1700170303565;
const fromKsuid = (await KSUID.random(fromTimeInMilliseconds1)).string;
const toKsuid = (await KSUID.random(toTimeInMilliseconds2)).string;
const queryCommandParams: QueryCommandInput = {
TableName: tableName,
KeyConditionExpression: 'PK = :PK AND SK BETWEEN :ft AND :tt',
ExpressionAttributeValues: {
':PK': PK,
':ft': `ALBUM#${fromKsuid}`, // ft = from time
':tt': `ALBUM#${toKsuid}`, // tt = to time
},
};
const queryItemsCommand = new QueryCommand(queryItemsCommandParams);
const res = await documentClient.send(queryItemsCommand);
const albumItems = res.Items;
Useful links: