DataLoader is a utility pattern used to improve the performance of queries. It takes load off the server by cutting unnecessary round trips to the data source inside an application’s data fetching layer. DataLoader batches and caches requests so that the same backend query is not performed twice (or more) within a single request.
Caching is also commonly implemented with other technologies, such as Redis. However, DataLoader's cache differs significantly from that kind of caching: a cached DataLoader result is never persistent. DataLoader caches per request, so its cache is short-lived.
The goal of DataLoader is not to share cached information among all the requests to the server. Its objective is to reduce the load that a single request puts on the backend. A DataLoader cache therefore lives only for a single server request.
So when a query request reaches the server, a new DataLoader instance is created, and with it a fresh cache for that query. This cache is only used for the duration of that request. Once the server sends the response, the instance and its cache are discarded.
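The request-scoped lifetime described above can be sketched without any library. The following is a simplified stand-in, not DataLoader itself: `RequestCache`, `fetchTitleFromBackend`, and `handleRequest` are hypothetical names invented for this illustration.

```typescript
// Hypothetical backend lookup that counts how often it is actually hit.
let backendCalls = 0;
function fetchTitleFromBackend(id: number): string {
  backendCalls += 1;
  return `Post ${id}`;
}

// A tiny cache that lives exactly as long as the object holding it.
class RequestCache {
  private readonly cache = new Map<number, string>();

  load(id: number): string {
    const cached = this.cache.get(id);
    if (cached !== undefined) {
      return cached; // served from the request-local cache
    }
    const value = fetchTitleFromBackend(id);
    this.cache.set(id, value);
    return value;
  }
}

// One cache instance per incoming request; nothing is shared between requests.
function handleRequest(ids: number[]): string[] {
  const cache = new RequestCache();
  return ids.map((id) => cache.load(id));
}

const firstResponse = handleRequest([1, 1, 2]); // backend is hit for 1 and 2 only
const callsAfterFirst = backendCalls;
const secondResponse = handleRequest([1]); // new request, new cache, backend is hit again
const callsAfterSecond = backendCalls;
```

Within the first request the repeated key is answered from the cache, while the second request starts from an empty cache. This is the behavior that separates a per-request cache from a shared one such as Redis.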
DataLoader solves a specific problem. To understand DataLoader properly, we first need to analyze that problem and then see what advantage DataLoader offers as the solution.
DataLoaders are commonly used hand in hand with GraphQL queries. When traversing a graph, each node resolves its own data. However, a GraphQL query can be nested, which means nodes depend on other nodes in the same query, and resolving each node separately can send many near-identical queries to the backend.
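To make the problem concrete, here is a small sketch that counts backend queries for a nested lookup. The data is hypothetical (the `Author` and `PostNode` shapes are not part of this guide's project) and the "backend" is just an in-memory array.

```typescript
// Hypothetical data: each post references an author by id.
interface Author { id: number; name: string; }
interface PostNode { id: number; authorId: number; }

const authors: Author[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Linus" },
];
const postNodes: PostNode[] = [
  { id: 1, authorId: 1 },
  { id: 2, authorId: 1 },
  { id: 3, authorId: 2 },
];

let queryCount = 0;

// Simulates one backend query per call.
function findAuthor(id: number): Author | undefined {
  queryCount += 1;
  return authors.find((author) => author.id === id);
}

// Simulates a single backend query that fetches many ids at once.
function findAuthors(ids: number[]): Author[] {
  queryCount += 1;
  return authors.filter((author) => ids.includes(author.id));
}

// Naive resolution: every post triggers its own author query.
queryCount = 0;
const naiveNames = postNodes.map((post) => findAuthor(post.authorId)?.name);
const naiveQueries = queryCount;

// Batched resolution: collect the unique ids first, then query once.
queryCount = 0;
const uniqueIds = Array.from(new Set(postNodes.map((post) => post.authorId)));
const authorById = new Map<number, Author>();
for (const author of findAuthors(uniqueIds)) {
  authorById.set(author.id, author);
}
const batchedNames = postNodes.map((post) => authorById.get(post.authorId)?.name);
const batchedQueries = queryCount;
```

Both approaches produce the same names, but the naive version issues one query per post while the batched version issues one query in total. DataLoader automates the second approach.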
Let’s see how to do this in a NestJS app. First, we need to create a GraphQL API and then narrow down the number of queries using the DataLoader.
Take a look at this simple nested GraphQL query.
To set up the application, ensure you have the NestJS CLI installed:
npm install -g @nestjs/cli
Then initialize a basic NestJS app:
nest new posts_dataloader_project
After the initialization is done, move into the newly created directory to set up the posts GraphQL API:
cd posts_dataloader_project
Create the posts module first by using the following command:
nest g module posts
The above command creates a posts module in the src directory. It also adds the PostsModule to the app.module.ts file. Next, we need to define a post’s structure (schema). Therefore, inside the newly created posts directory, add a posts.entity.ts file.
Define the schema as below:
// Export the entity so that other files can import it
export class Post {
  // the post id
  id: number;
  // the post title
  title: string;
  // the post content
  content: string;
  // true or false, depending on whether the post is done
  isDone: boolean;
}
Since we will be loading our data from a JSON file, next create the file and add the dummy data. Go to your src directory, and create a store directory. Inside the created store directory, create a posts.json file.
Each record in the file should follow the schema defined above. Add the following dummy contents to the posts.json file:
[
{
"id": 1,
"title": "This is my First Test post",
"content": "First post content",
"isDone": false
},
{
"id": 2,
"title": "This is Second Test post",
"content": "Second post content",
"isDone": false
},
{
"id": 3,
"title": "Third post",
"content": "Third post content",
"isDone": false
},
{
"id": 4,
"title": "Fourth post",
"content": "Fourth post content",
"isDone": false
},
{
"id": 5,
"title": "Fifth post",
"content": "Fifth post content",
"isDone": false
}
]
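Because the JSON file is edited by hand, it can help to check at load time that every record really follows the schema. The sketch below is an optional addition, not part of the original project: `PostRecord`, `isPostRecord`, and `parsePosts` are hypothetical names.

```typescript
// Mirrors the fields of the Post entity defined earlier.
interface PostRecord {
  id: number;
  title: string;
  content: string;
  isDone: boolean;
}

// Type guard: true only when the value has every field with the expected type.
function isPostRecord(value: unknown): value is PostRecord {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record.id === "number" &&
    typeof record.title === "string" &&
    typeof record.content === "string" &&
    typeof record.isDone === "boolean"
  );
}

// Parses raw JSON text and rejects it if any record breaks the schema.
function parsePosts(json: string): PostRecord[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed) || !parsed.every(isPostRecord)) {
    throw new Error("posts.json does not match the Post schema");
  }
  return parsed;
}
```

Calling `parsePosts` on the file contents fails fast when, for example, an id was typed as a string, instead of surfacing later as a confusing GraphQL type error.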
Create a posts.service.ts file inside the src/posts directory. In this file, we will define the methods for fetching the data from the JSON file.
Import the necessary modules:
import { Injectable } from '@nestjs/common';
// import the posts entity
import { Post } from './posts.entity';
// path helper for building the path to the JSON file
import { join } from 'path';
// file system helper for reading the JSON file
import { readFileSync } from 'fs';
Define the PostsService class:
@Injectable()
export class PostsService {}
Inside the PostsService class:
Read the data from the file and store it in an array:
private readonly posts: Post[] = JSON.parse(
  readFileSync(join(process.cwd(), 'src', 'store', 'posts.json'), 'utf-8'),
);
Define a method for getting all posts:
findAll(): Post[] {
  return this.posts;
}
Define a method for getting a single post:
findOne(id: number): Post {
  return this.posts.find(post => post.id === id);
}
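Note that Array.prototype.find returns undefined when no post matches, and a non-nullable GraphQL field cannot resolve to a missing value. One way to make the miss explicit is sketched below. `PostItem`, `items`, and `findOneOrThrow` are hypothetical names for this illustration, and inside a NestJS app a NotFoundException from @nestjs/common might be a better fit than a plain Error.

```typescript
interface PostItem { id: number; title: string; }

const items: PostItem[] = [
  { id: 1, title: "This is my First Test post" },
  { id: 2, title: "This is Second Test post" },
];

// Mirrors findOne: find yields undefined when nothing matches.
function findOne(id: number): PostItem | undefined {
  return items.find((post) => post.id === id);
}

// Hypothetical stricter variant that turns a miss into an explicit error.
function findOneOrThrow(id: number): PostItem {
  const post = findOne(id);
  if (post === undefined) {
    throw new Error(`Post with id ${id} not found`);
  }
  return post;
}
```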
Next, we will connect the functionality defined above to the respective queries.
To add GraphQL to our application, we need to install the relevant packages:
npm i @nestjs/graphql graphql apollo-server-express @nestjs/apollo
In the app.module.ts file, we will define the GraphQLModule:
Import the necessary packages:
// graphql from nestjs
import { GraphQLModule } from "@nestjs/graphql";
// add Apollo from nestjs
import { ApolloDriver } from "@nestjs/apollo";
// path helper for the generated schema file
import { join } from "path";
Define the GraphQLModule inside the imports array by appending the following:
GraphQLModule.forRoot({
  // generate the schema file automatically from the decorated classes
  autoSchemaFile: join(process.cwd(), 'src/schema.gql'),
  // use Apollo as the GraphQL driver
  driver: ApolloDriver,
})
With GraphQL in place, we will update the src/posts/posts.entity.ts file as follows:
import { Field, Int, ObjectType } from "@nestjs/graphql";
// mark the class as a GraphQL object type
@ObjectType()
export class Post {
  // post id
  @Field(() => Int)
  id: number;
  // post title
  @Field()
  title: string;
  // post content
  @Field()
  content: string;
  // post boolean value
  @Field(() => Boolean)
  isDone: boolean;
}
The additions will enable GraphQL to understand the type of the data clearly.
Inside the same directory, create a posts.resolver.ts file. The file will connect the queries to the functionality defined in posts.service.ts. Add the following code blocks to the posts.resolver.ts file:
Import the necessary modules:
import { Query, Resolver, Args, Int, Context } from "@nestjs/graphql";
// load the posts service
import { PostsService } from "./posts.service";
// load the posts entity
import { Post } from "./posts.entity";
Define the resolver and the return type:
@Resolver(Post)
Define the PostsResolver class:
@Resolver(Post)
// export the created post resolver
export class PostsResolver {
  constructor(private readonly postsService: PostsService) {}
}
Inside the PostsResolver class:
Define a method for getting all posts:
@Query(() => [Post])
async posts() {
  return await this.postsService.findAll();
}
Define a method for getting a single post:
@Query(() => Post)
async post(@Args('id', { type: () => Int }) id: number) {
  return await this.postsService.findOne(id);
}
In the posts.module.ts file, we have to register the PostsService and the PostsResolver. Therefore, inside posts.module.ts, make the following changes:
Import both classes:
import { PostsResolver } from './posts.resolver';
import { PostsService } from './posts.service';
Define the providers and the exports of the module:
@Module({
  providers: [PostsService, PostsResolver],
  exports: [PostsService],
})
export class PostsModule {}
The GraphQL server looks good now. However, to understand how this simple API behaves, we need to log the requests/queries that it makes to the data source. This will give us a clear picture of the repeated calls that DataLoader is meant to eliminate.
Head over to the posts.service.ts file and add a console.log() to both the findAll() and findOne() methods:
findAll(): Post[] {
  console.log(`Getting all posts`);
  return this.posts;
}
findOne(id: number): Post {
  // log which post is being requested
  console.log(`Getting post with id: ${id}`);
  // return the post whose id matches
  return this.posts.find(post => post.id === id);
}
With that, we should be able to test the functionality. Start the development server by running the following command on the terminal:
npm run start:dev
When the server is up, visit the GraphQL playground at http://localhost:3000/graphql.
Compose a query for getting all posts as follows:
query GetPosts {
  posts {
    id
    title
    content
    isDone
  }
}
When you run it, you should get the following result:
Similarly, compose a query for getting a single post as follows:
query GetPost($id: Int!) {
  post(id: $id) {
    id
    title
    content
    isDone
  }
}
Make sure you pass the id in the query variables section. Since $id is declared as Int!, the value must be an integer rather than a string, for example:
{
  "id": 1
}
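Outside the playground, the same query and variables travel as a JSON body in an HTTP POST request. The sketch below shows one way to build and send that body. `buildRequestBody` and `fetchPost` are hypothetical helpers, the global fetch is assumed to be available (as in recent Node.js versions and browsers), and the URL is the local development address used in this guide.

```typescript
// The single-post query, as a string.
const GET_POST = `
  query GetPost($id: Int!) {
    post(id: $id) {
      id
      title
      content
      isDone
    }
  }
`;

// Builds the JSON body that carries the query and its variables.
function buildRequestBody(query: string, variables: Record<string, unknown>): string {
  return JSON.stringify({ query, variables });
}

// Sends the query to the local dev server (only works while it is running).
async function fetchPost(id: number): Promise<unknown> {
  const response = await fetch("http://localhost:3000/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildRequestBody(GET_POST, { id }),
  });
  return response.json();
}

// The id is serialized as a JSON number, which is what Int! expects.
const exampleBody = buildRequestBody(GET_POST, { id: 1 });
```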
Your response should resemble the following:
To add DataLoader to the application, follow these steps:
Install the package using npm or yarn:
npm i dataloader
yarn add dataloader
Inside the src/posts directory, create a posts.loader.ts file. Inside the file:
Import the necessary modules:
// import the dataloader dependencies
import * as Dataloader from 'dataloader';
// load the Post from the posts entity
import { Post } from './posts.entity';
// load the PostsService from the posts service file
import { PostsService } from './posts.service';
Create a function to derive a map from an array of posts:
function deriveMapFromArray<T>(array: T[], mapFn: (item: T) => any) {
  const map = new Map<any, any>();
  array.forEach(item => {
    map.set(mapFn(item), item);
  });
  return map;
}
Create a function for creating a posts loader:
export function createPostsLoader(postsService: PostsService) {
  // data loader initialization
  return new Dataloader<number, Post>(async (keys: number[]) => {
    // get posts by ids
    const posts = await postsService.findByIds(keys);
    // get a map of posts keyed by id
    const postsMap = deriveMapFromArray(posts, (post: Post) => post.id);
    // return one post per key, in the same order as the keys
    return keys.map((id) => postsMap.get(id));
  });
}
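The final keys.map step matters because DataLoader expects the batch function to return an array of the same length as the keys, with each position matching its key. The standalone sketch below illustrates that alignment. `LoadedPost` and `alignToKeys` are hypothetical names, and representing a missing key as an Error value is one common convention rather than what the loader above does (it yields undefined for a miss).

```typescript
interface LoadedPost { id: number; title: string; }

// Returns one entry per key, in key order; misses become Error values.
function alignToKeys(keys: readonly number[], rows: LoadedPost[]): (LoadedPost | Error)[] {
  const byId = new Map<number, LoadedPost>();
  for (const row of rows) {
    byId.set(row.id, row);
  }
  return keys.map((key) => byId.get(key) ?? new Error(`No post with id ${key}`));
}

// The backend answered out of order and without id 7.
const backendRows: LoadedPost[] = [
  { id: 3, title: "Third post" },
  { id: 1, title: "This is my First Test post" },
];
const aligned = alignToKeys([1, 7, 3], backendRows);
```

Here `aligned` has three entries: the post with id 1, an Error for id 7, and the post with id 3, regardless of the order in which the backend returned its rows.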
The loader above relies on a findByIds method, which we need to add to the PostsService:
findByIds(ids: number[]): Post[] {
  // return only the posts whose ids were requested
  return this.posts.filter(post => ids.includes(post.id));
}
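For five posts, filter combined with includes is perfectly adequate. If the data set or the batches grow, a Set-based lookup avoids rescanning the id list for every post. This is an optional variation under that assumption, using a hypothetical in-memory `storedPosts` array and `StoredPost` type in place of the service's data.

```typescript
interface StoredPost { id: number; title: string; }

const storedPosts: StoredPost[] = [
  { id: 1, title: "This is my First Test post" },
  { id: 2, title: "This is Second Test post" },
  { id: 3, title: "Third post" },
];

// Same result as filter + includes, but each membership test is a Set lookup.
function findByIds(ids: readonly number[]): StoredPost[] {
  const wanted = new Set(ids);
  return storedPosts.filter((post) => wanted.has(post.id));
}

// Duplicate ids collapse, and results come back in storage order, not key order.
const found = findByIds([3, 1, 3]);
```

Because the result follows storage order, the loader still has to map the rows back onto the requested keys.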
In the app.module.ts file, import the createPostsLoader function and the PostsService class as follows:
import { createPostsLoader } from "./posts/posts.loader";
import { PostsService } from "./posts/posts.service";
We will define the loader on the context parameter so that we can access it from all the queries. Therefore, we will modify the GraphQLModule definition on the imports array as follows:
GraphQLModule.forRootAsync({
  // import PostsModule since it contains the definition of PostsService
  imports: [PostsModule],
  // pass in postsService
  useFactory: (postsService: PostsService) => ({
    // define the auto schema file
    autoSchemaFile: join(process.cwd(), 'src/schema.gql'),
    // define the context
    context: () => ({
      // initialize the postsLoader
      postsLoader: createPostsLoader(postsService),
    }),
  }),
  // inject PostsService
  inject: [PostsService],
  // define driver
  driver: ApolloDriver,
}),
We will use the loader in the src/posts/posts.resolver.ts file. We will therefore make the following changes:
Import the dataloader package:
import * as DataLoader from "dataloader";
On the method for getting all posts, get the context parameter that holds the postsLoader defined earlier:
async posts(@Context('postsLoader') postsLoader: DataLoader<number, Post>)
Replace loading from the postsService with loading from the postsLoader:
async posts(@Context('postsLoader') postsLoader: DataLoader<number, Post>) {
  // pass the keys of the posts that we need
  return postsLoader.loadMany([1, 2, 3, 4, 5]);
}
On the method for getting a single post, get the context parameter that holds the postsLoader:
async post(
  @Context('postsLoader') postsLoader: DataLoader<number, Post>,
  @Args('id', { type: () => Int }) id: number,
)
Replace getting the data from the postsService with loading it from the loader:
async post(
  @Context('postsLoader') postsLoader: DataLoader<number, Post>,
  @Args('id', { type: () => Int }) id: number,
) {
  return postsLoader.load(id);
}
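To see why separate load() calls can end up in one backend call, here is a heavily simplified sketch of the batching idea. It is not the dataloader package's implementation: `TinyLoader` is a hypothetical class that only collects the keys requested during the current tick and resolves them with a single batch call, without the caching and error handling the real library provides.

```typescript
// Collects keys during the current tick, then resolves them all
// with a single call to the batch function.
class TinyLoader<K, V> {
  private queue: { key: K; resolve: (value: V | undefined) => void }[] = [];
  private readonly batchFn: (keys: K[]) => (V | undefined)[];

  constructor(batchFn: (keys: K[]) => (V | undefined)[]) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V | undefined> {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      if (this.queue.length === 1) {
        // First key of this tick: schedule one flush that runs after the
        // caller's synchronous code has finished queuing its loads.
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private flush(): void {
    const pending = this.queue;
    this.queue = [];
    const values = this.batchFn(pending.map((entry) => entry.key));
    pending.forEach((entry, index) => entry.resolve(values[index]));
  }
}

let batchCalls = 0;
const tinyTitles = new Map<number, string>([[1, "First"], [2, "Second"]]);
const tinyLoader = new TinyLoader<number, string>((keys) => {
  batchCalls += 1;
  return keys.map((key) => tinyTitles.get(key));
});

// Three loads issued in the same tick end up in one batch.
const pendingTitles = Promise.all([
  tinyLoader.load(1),
  tinyLoader.load(2),
  tinyLoader.load(1),
]);
```

Once `pendingTitles` resolves, the batch function has run exactly once for all three loads. The real DataLoader builds on the same idea and adds per-instance caching, so a repeated key is not even sent to the batch function twice.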
With that, our application is ready to be tested. Make sure the development server is still running, then run the previous queries again. You should get the same results.
This guide demonstrated how to add a DataLoader to a GraphQL NestJS app and described the problem that DataLoader tries to solve. A GraphQL API will work without a DataLoader. However, depending on the shape of your graph, the number of queries sent to the backend can be unexpectedly high, and this gets worse as the graph gains more nodes.
I hope you found this guide helpful!