Implementing DataLoader with Apollo Server GraphQL

Burak Sonmez / June 12, 2023

What is DataLoader?

DataLoader is close to essential when building a GraphQL-powered API in Node. Simply put, DataLoaders address the "N+1 problem" in GraphQL. If you're not familiar with it, look it up quickly (I'll also write an article about it). From here on, I'll assume you know what it means, but basically: one database query returns N results, and you then run N additional queries, one for each result. That many requests are wasteful. It would be preferable to combine those N queries into one, so that N+1 always becomes 2. This is where DataLoaders are useful.
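
To make the N+1 pattern concrete, here is a small sketch that counts queries against an in-memory stand-in for a database. The createFakeDb helper, its findPosts, findUserById, and findUsersByIds methods, and the sample data are all hypothetical and exist only for illustration.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
function createFakeDb() {
  const users = [
    { id: 1, name: "Ada" },
    { id: 2, name: "Linus" },
    { id: 3, name: "Grace" },
  ];
  const posts = [
    { id: 10, ownerId: 1 },
    { id: 11, ownerId: 2 },
    { id: 12, ownerId: 3 },
  ];
  const db = {
    queryCount: 0,
    async findPosts() {
      db.queryCount += 1;
      return posts;
    },
    async findUserById(id) {
      db.queryCount += 1;
      return users.find((u) => u.id === id) ?? null;
    },
    async findUsersByIds(ids) {
      db.queryCount += 1;
      return users.filter((u) => ids.includes(u.id));
    },
  };
  return db;
}

// Naive approach: 1 query for the posts, then 1 query per post owner (N+1).
async function ownersNaive(db) {
  const posts = await db.findPosts();
  return Promise.all(posts.map((post) => db.findUserById(post.ownerId)));
}

// Batched approach: 1 query for the posts, 1 query for all owners (always 2).
async function ownersBatched(db) {
  const posts = await db.findPosts();
  const ownerIds = posts.map((post) => post.ownerId);
  return db.findUsersByIds(ownerIds);
}
```

With the three sample posts, ownersNaive issues 4 queries while ownersBatched issues 2, and the gap widens as N grows.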

How does it work?

At its core, a DataLoader:

  1. Gathers an array of keys within a single event loop tick.
  2. Uses all those keys to make one hit on the database.
  3. Resolves the pending promises with the returned array of values.
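
The three steps above can be sketched with a tiny hand-rolled loader. This is a simplified illustration of the idea, not the dataloader package itself: the TinyLoader name is made up, there is no caching, and it assumes Node.js so that process.nextTick is available.

```javascript
// Simplified illustration of the batching idea; not the dataloader package.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // async (keys) => values, same length and order
    this.queue = []; // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule a single dispatch when the first key of this tick arrives.
      if (this.queue.length === 1) {
        process.nextTick(() => this.dispatch());
      }
    });
  }

  async dispatch() {
    // Step 1: take every key gathered during the tick.
    const batch = this.queue;
    this.queue = [];
    try {
      // Step 2: one call to the data source for all keys.
      const values = await this.batchFn(batch.map((entry) => entry.key));
      // Step 3: resolve each pending promise with the value at its index.
      batch.forEach((entry, index) => entry.resolve(values[index]));
    } catch (error) {
      // A failed batch rejects every load that was waiting on it.
      batch.forEach((entry) => entry.reject(error));
    }
  }
}
```

Calling load three times in a row, without awaiting in between, leads to a single batchFn call that receives all three keys.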

A batching function that accepts an array of keys and resolves to an array of values is all you need to create a DataLoader. The two arrays must have the same length and the same order, because DataLoader matches each value to its key by index; if the lengths differ, the load fails. So make sure the batch function returns exactly one entry per key.
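
Databases usually return rows in their own order and silently skip ids that do not exist, so a small helper that realigns rows to keys is handy. The sketch below is one possible way to do it; the alignToKeys name and the keyOf accessor are hypothetical.

```javascript
// Reorders rows so that the output has one entry per key, in key order.
// keyOf is a hypothetical accessor that returns the key a row belongs to.
function alignToKeys(keys, rows, keyOf) {
  // Keys are compared as strings so numbers and id objects match consistently.
  const byKey = new Map(rows.map((row) => [String(keyOf(row)), row]));
  // Missing keys become null so the output length always equals keys.length.
  return keys.map((key) => byKey.get(String(key)) ?? null);
}
```

A batch function can then end with something like return alignToKeys(ids, rows, (row) => row._id). Instead of null, DataLoader also accepts an Error instance at a given index, which rejects only the load for that key.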

Using DataLoader

DataLoader is a generic utility to be used as part of your application's data fetching layer. Through batching and caching, it provides a consistent, simplified API over various remote data sources, such as databases or web services.

Setup Apollo Server with DataLoader

First, we need to set up Apollo Server with DataLoader. We'll set up the loaders in the context function (assuming it builds the rest of the context), using a DataLoaderFactory class to create a DataLoader instance for each model.

// DataLoaderFactory.js
const DataLoader = require("dataloader");

class DataLoaderFactory {
  constructor(context) {
    this.context = context;
  }

  byId(model) {
    return new DataLoader(async (ids) => {
      // ids is a flat array of every key passed to .load() during this tick
      if (ids.length > 1) {
        const resultMap = {};
        // Fetch all ids in a single query (assumes a MongoDB-style $in filter)
        const result = await this.context.db[model].find({ _id: { $in: ids } });

        result.forEach((each) => {
          resultMap[each._id] = each;
        });
        // Must return an array with the same length and order as ids;
        // ids with no match resolve to null
        return ids.map((id) => resultMap[id] ?? null);
      }
      // With a single id, a findOne query is enough
      const result = await this.context.db[model].findOne({ _id: ids[0] });
      // Must return an array with the same length as ids
      return [result];
    });
  }
}

module.exports = { DataLoaderFactory };

Once we've created the DataLoaderFactory class, we can use it to create a DataLoader instance for each model. You can add more methods to the class to create other kinds of loaders. For example, a byEmail method could create a DataLoader instance that looks up users by email.
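
As a rough sketch of that byEmail idea, the batch function could look like the following. The createByEmailBatch name, the Users model, the email field, and the MongoDB-style $in filter are all assumptions made for illustration.

```javascript
// Hypothetical batch function for a byEmail loader.
function createByEmailBatch(db) {
  return async (emails) => {
    // One query for every email collected during the tick.
    const users = await db.Users.find({ email: { $in: emails } });
    const byEmail = new Map(users.map((user) => [user.email, user]));
    // One entry per requested email, in the same order; null when not found.
    return emails.map((email) => byEmail.get(email) ?? null);
  };
}

// Inside DataLoaderFactory this could become:
//   byEmail() {
//     return new DataLoader(createByEmailBatch(this.context.db));
//   }
```

Keeping the batch function separate from the DataLoader constructor makes it easy to test against a fake db object.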

// app.js
const { ApolloServer } = require("@apollo/server");
const { expressMiddleware } = require("@apollo/server/express4");
const { DataLoaderFactory } = require("./DataLoaderFactory");

// Set up ApolloServer.
this.apollo = new ApolloServer({
  schema,
});

// Use ApolloServer with Express (call `await this.apollo.start()` first).
this.app.use(
  expressMiddleware(this.apollo, {
    context: async ({ req }) => {
      // Generate the core context
      const context = await this.context(req);

      // Set up DataLoader: new loader instances for every request
      const loader = new DataLoaderFactory(context);

      const loaders = {
        users: {
          byId: loader.byId("Users"),
        },
        posts: {
          byId: loader.byId("Posts"),
        },
        comments: {
          byId: loader.byId("Comments"),
        },
        likes: {
          byId: loader.byId("Likes"),
        },
      };
      // Expose the loaders to resolvers through the context
      return { req, ...context, loaders };
    },
  }),
);

Use DataLoader in Resolvers

Now that we've set up Apollo Server with DataLoader, we can use DataLoader in our resolvers. We'll read the DataLoader instances we created in the previous step from the context object.

// resolvers.js
/**
 * @param {Object} root - Result returned by the parent resolver
 * @param {Object} args - Arguments provided to the field
 * @param {Object} context - A mutable object that all resolvers receive
 * @param {Object} info - Field-specific information relevant to the query (rarely used)
 *
 * @return {Object|null} - The result of the resolver
 */
owner: async (root, args, context, info) => {
  const { loaders } = context;
  // No owner reference on the parent: resolve the field to null
  if (!root.ownerId) {
    return null;
  }
  // Batched and cached with other owner lookups in the same request
  return loaders.users.byId.load(root.ownerId);
}
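
When a parent object holds several ids at once, DataLoader's loadMany method can load them in one call. The sketch below is only an example: the Post.comments field and the commentIds array on the parent are hypothetical and not part of the schema shown in this article.

```javascript
const resolvers = {
  Post: {
    // Hypothetical field: assumes each post stores an array of commentIds.
    comments: async (root, args, context) => {
      const { loaders } = context;
      if (!root.commentIds || root.commentIds.length === 0) {
        return [];
      }
      // loadMany resolves to an array aligned with the ids it was given.
      return loaders.comments.byId.loadMany(root.commentIds);
    },
  },
};
```

Note that loadMany does not reject when a single key fails; the failed position holds an Error instance instead, so a stricter resolver may want to check for those before returning.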

The core idea is to serve repeated keys from the cache and to batch all pending fetches into a single operation.

Cache

After .load() is called with a key, the result is cached, and subsequent .load() calls with the same key reuse it instead of fetching the value again.
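
The caching idea can be pictured as per-key memoization of promises. The following is a simplified sketch of that idea rather than DataLoader's actual implementation; createCachedLoad and fetchOne are hypothetical names.

```javascript
// Simplified sketch of per-key memoization; fetchOne is a hypothetical fetcher.
function createCachedLoad(fetchOne) {
  const cache = new Map(); // key -> promise of the value
  return function load(key) {
    if (!cache.has(key)) {
      // First request for this key: start the fetch and remember the promise.
      cache.set(key, fetchOne(key));
    }
    // Later requests for the same key share the stored promise.
    return cache.get(key);
  };
}
```

Because the promise itself is stored, two loads for the same key share one fetch even if the first has not finished yet.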

Clear Cache

We rarely need to clear the cache manually.

For instance, a cached value may need to be removed when the value behind a key is changed during the same request, as in a mutation. In that situation we can just use .clear(key) or .clearAll().
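
A mutation that updates a user and then returns the fresh record is a typical case. The helper below is a minimal sketch: renameUser, the MongoDB-style db.Users.updateOne call, and the loaders shape (taken from the context set up earlier in this article) are assumptions.

```javascript
// Hypothetical mutation helper; db.Users.updateOne is assumed to exist.
async function renameUser(context, id, name) {
  const { db, loaders } = context;
  await db.Users.updateOne({ _id: id }, { $set: { name } });
  // Drop the stale entry so the next .load(id) in this request refetches it.
  loaders.users.byId.clear(id);
  return loaders.users.byId.load(id);
}
```

Without the clear call, a user that was already loaded earlier in the same request would come back with the old name.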

Visit the GraphQL DataLoader GitHub page for additional details regarding caching.

Note: DataLoader caches only for a single request. Because the loaders are created inside the context function, every request gets new instances, and the cache is not kept for the next request.