Implementing DataLoader with Apollo Server GraphQL
Burak Sonmez / June 12, 2023
What is DataLoader?
DataLoader is one of the most common tools for building an efficient GraphQL-powered API in Node. Simply put, DataLoaders address the "N+1 problem" in GraphQL. If you're not familiar with it, look it up quickly (I'll also write an article about it). From here on, I'll assume you know what it means, but basically: one query returns a list of N items, and then you run one extra query for each of those N items to resolve a related field, which makes N+1 queries in total. That many requests is wasteful. The idea is that combining those N queries into 1 is preferable, so that N+1 always becomes 2. This is where DataLoaders are useful.
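To make the numbers concrete, here is a small sketch that counts queries against a made-up in-memory database. The db object and its findPosts, findUserById and findUsersByIds methods are hypothetical stand-ins, not a real driver API.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
const users = new Map([
  [1, { id: 1, name: "Ada" }],
  [2, { id: 2, name: "Linus" }],
]);
const posts = [
  { id: 10, ownerId: 1 },
  { id: 11, ownerId: 2 },
  { id: 12, ownerId: 1 },
];
let queryCount = 0;

const db = {
  async findPosts() {
    queryCount += 1;
    return posts;
  },
  async findUserById(id) {
    queryCount += 1;
    return users.get(id);
  },
  async findUsersByIds(ids) {
    queryCount += 1;
    return ids.map((id) => users.get(id));
  },
};

// Naive approach: 1 query for the posts, then 1 query per post for its owner.
async function naive() {
  queryCount = 0;
  const all = await db.findPosts();
  await Promise.all(all.map((post) => db.findUserById(post.ownerId)));
  return queryCount;
}

// Batched approach: 1 query for the posts, then 1 query for all distinct owners.
async function batched() {
  queryCount = 0;
  const all = await db.findPosts();
  const ownerIds = [...new Set(all.map((post) => post.ownerId))];
  await db.findUsersByIds(ownerIds);
  return queryCount;
}
```

With three posts, naive() issues four queries (1 + N), while batched() always issues two, no matter how many posts there are.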
How does it work?
At its core, a DataLoader:
- Gathers an array of keys within a single event loop tick.
- Uses all those keys to make one hit on the database.
- Returns an array of values and resolves each caller's promise with the value for its key.
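The three steps above can be sketched in a few lines. This is a deliberately simplified illustration of the batching idea, not the dataloader library's actual implementation; createTinyLoader is a hypothetical name, and caching is left out on purpose.

```javascript
// Simplified illustration of batching; NOT how the dataloader package is written.
function createTinyLoader(batchFn) {
  let queue = []; // pending { key, resolve, reject } entries for the current tick

  function dispatch() {
    // Take everything collected so far and start a fresh queue.
    const batch = queue;
    queue = [];
    const keys = batch.map((item) => item.key);
    // One call for all keys, then hand each caller the value at its index.
    Promise.resolve()
      .then(() => batchFn(keys))
      .then((values) => batch.forEach((item, i) => item.resolve(values[i])))
      .catch((error) => batch.forEach((item) => item.reject(error)));
  }

  return {
    load(key) {
      return new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        // Schedule a single dispatch when the first key of a batch arrives.
        if (queue.length === 1) process.nextTick(dispatch);
      });
    },
  };
}
```

Calling load() three times in the same tick results in a single batchFn call that receives all three keys.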
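A batching function that accepts an array of keys and resolves to an array of values is all you need to create a DataLoader. The two arrays must have the same length and the same order: the value at index i must belong to the key at index i. If the lengths differ, DataLoader cannot map the values back to the keys and the load fails. So be careful to return exactly one entry for every key, even when nothing was found for it.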
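Since a database usually returns rows in its own order and silently skips missing ids, a small helper can realign the rows with the keys. The sketch below assumes each row carries its key in an _id field; alignToKeys is a hypothetical helper name, not part of DataLoader.

```javascript
// Hypothetical helper: reorder `rows` to match `keys`, filling gaps with null.
// Keys are compared as strings so that ObjectId-like values and plain strings match.
function alignToKeys(keys, rows, getKey = (row) => row._id) {
  const byKey = new Map(rows.map((row) => [String(getKey(row)), row]));
  return keys.map((key) => {
    const row = byKey.get(String(key));
    return row === undefined ? null : row;
  });
}
```

The returned array always has keys.length entries, so it is safe to return from a batch function. The DataLoader documentation also allows an Error instance in a slot if you would rather reject only that key instead of resolving it to null.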
Using DataLoader
DataLoader is a general utility that can be used as part of your application's data fetching layer. Through batching and caching, it provides a consistent and streamlined API over a variety of remote data sources, such as databases or web services.
Setup Apollo Server with DataLoader
First, we need to set up Apollo Server with DataLoader. We'll use the context function (assumed here to build the whole request context) to set up the DataLoader. We'll also use the DataLoaderFactory class to create a DataLoader instance for each model.
// DataLoaderFactory.js
const DataLoader = require("dataloader");

class DataLoaderFactory {
  constructor(context) {
    this.context = context;
  }

  byId(model) {
    return new DataLoader(async (ids) => {
      // If several ids were collected, fetch them all in a single query
      if (ids.length > 1) {
        const resultMap = {};
        // Assumes a MongoDB-style query API, where $in matches any of the ids
        const result = await this.context.db[model].find({ _id: { $in: ids } });
        result.forEach((each) => {
          resultMap[each._id] = each;
        });
        // Must return an array with the same length and order as ids
        return ids.map((id) => resultMap[id]);
      }
      // If there is a single id, fetch it with a single findOne
      const result = await this.context.db[model].findOne({ _id: ids[0] });
      // Must return an array with the same length as ids
      return [result];
    });
  }
}

module.exports = { DataLoaderFactory };
Once we've created the DataLoaderFactory class, we can use it to create a DataLoader instance for each model. You can add further methods to the DataLoaderFactory class to create different kinds of DataLoader instances. For example, you might add a byEmail method that creates a DataLoader instance which looks up users by email.
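As a rough sketch of what such a byEmail loader could batch on, here is a standalone batch function. The names makeByEmailBatchFn and usersCollection are hypothetical, and the code assumes find accepts a MongoDB-style $in filter and resolves to an array of documents with an email field.

```javascript
// Hypothetical helper: builds a batch function that looks users up by email.
// Assumption: usersCollection.find(filter) resolves to an array of documents.
function makeByEmailBatchFn(usersCollection) {
  return async (emails) => {
    // One query for every email collected in this batch.
    const rows = await usersCollection.find({ email: { $in: emails } });
    const byEmail = new Map(rows.map((row) => [row.email, row]));
    // Same length and order as `emails`; null where no user matched.
    return emails.map((email) => (byEmail.has(email) ? byEmail.get(email) : null));
  };
}

// Inside DataLoaderFactory this could be wired up roughly as:
//   byEmail() {
//     return new DataLoader(makeByEmailBatchFn(this.context.db.Users));
//   }
```

Keeping the batch function separate from the DataLoader constructor also makes it easy to test with a fake collection.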
// app.js
const { ApolloServer } = require("@apollo/server");
const { expressMiddleware } = require("@apollo/server/express4");
const { DataLoaderFactory } = require("./DataLoaderFactory");

// Set up ApolloServer (inside your application class; schema is built elsewhere).
this.apollo = new ApolloServer({
  schema,
});
// Note: await this.apollo.start() must have run before the middleware is attached.

// Use ApolloServer with Express.
this.app.use(
  expressMiddleware(this.apollo, {
    context: async ({ req }) => {
      // Generate core context
      const context = await this.context(req);
      // Set up fresh DataLoaders for this request.
      const loader = new DataLoaderFactory(context);
      const loaders = {
        users: {
          byId: loader.byId("Users"),
        },
        posts: {
          byId: loader.byId("Posts"),
        },
        comments: {
          byId: loader.byId("Comments"),
        },
        likes: {
          byId: loader.byId("Likes"),
        },
      };
      // Expose the loaders to the resolvers through the context.
      return { req, ...context, loaders };
    },
  }),
);
Use DataLoader in Resolvers
Now that we've set up Apollo Server with DataLoader, we can use DataLoader in our resolvers. We'll use the context argument that every resolver receives to get the DataLoader instances we created in the previous step.
// resolvers.js
/**
 * @param {Object} root - Result returned by the parent resolver
 * @param {Object} args - Arguments passed to the field
 * @param {Object} context - Object shared by all resolvers for the current request
 * @param {Object} info - Field-specific information about the query (rarely used)
 *
 * @return {Object} - The result of the resolver
 */
owner: async (root, args, context, info) => {
  const { loaders } = context;
  let owner = {};
  if (root.ownerId) {
    // Batched with every other owner lookup made in the same tick
    owner = await loaders.users.byId.load(root.ownerId);
  }
  return owner;
}
The core concept is to batch all the fetches made in one tick into a single operation and to serve repeated keys from the cache.
Cache
After calling .load(), the result is cached and reused, which prevents a second fetch for any key that .load() has already been called with.
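The caching idea can be illustrated with plain promise memoization. This is only a sketch of the concept under my own assumptions, not DataLoader's internals; createCachingLoader and fetchOne are hypothetical names.

```javascript
// Concept sketch: remember the promise for each key so repeated loads share it.
function createCachingLoader(fetchOne) {
  const cache = new Map(); // key -> Promise of the value

  return {
    load(key) {
      if (!cache.has(key)) {
        // First request for this key: start the fetch and remember its promise.
        cache.set(key, Promise.resolve(fetchOne(key)));
      }
      // Later requests get the same promise, so no second fetch happens.
      return cache.get(key);
    },
  };
}
```

Caching the promise rather than the resolved value means that two loads for the same key share one fetch even when the first has not finished yet.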
Clear Cache
We rarely need to clear the cache manually.
For instance, a cached value may need to be removed when the value behind a key is changed during the same request. In that situation, we can simply call .clear(key) or .clearAll().
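A typical place for this is a mutation. The sketch below is a hypothetical helper: renameUser is my own name, collection.updateOne is assumed to be a MongoDB-style update call, and loader is assumed to be a DataLoader instance of which only the clear() method is used.

```javascript
// Hypothetical mutation helper: update a user, then drop the stale cache entry.
// Assumptions: collection.updateOne(filter, update) is a Mongo-style call and
// loader exposes clear(key), as DataLoader instances do.
async function renameUser(collection, loader, id, name) {
  await collection.updateOne({ _id: id }, { $set: { name } });
  // Without this, a later load(id) in the same request would return the old name.
  loader.clear(id);
}
```

Clearing only the affected key keeps the rest of the request's cache intact; .clearAll() is the blunt alternative when many keys may have changed.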
Visit the GraphQL DataLoader GitHub page for additional details regarding caching.
Note: In this setup, DataLoader caches only for a single request, because the context function creates new loader instances for every request. The cache is not kept for the next request.