Efficient Database Management: Bulk Operations with Prisma

Databases

Prisma1 is a powerful ORM (Object-Relational Mapping) tool that simplifies database access in Node.js applications. It provides a type-safe API for executing CRUD operations against a database and many more features. But what if we need to execute thousands of queries at once? In this article, I’ll explain the challenges with Prisma and how to overcome them.

Abstract and modern graphic showing a database symbol in blue, black, and white, with flowing streams of binary data, 1's and 0's, illustrating dynamic data transfer and processing in a high-tech digital environment.
Futuristic Representation of High-Speed Data Processing in a Database

The Problem

I found myself in a situation where I had to import thousands of rows from an external API into a database. Then I had to loop over each row to retrieve additional information from another API. I wanted to separate the data import by API to save data during the import process and not lose all the data if the import fails. That meant that I had to execute queries in order to save this data and later update the rows with additional information.

The problem here is that each row was unique and I could not use the updateMany methods provided by Prisma2. I had to execute thousands of queries one after another. This is not only slow but also expensive. The connection pool of your database will be exhausted in no time and queries will time out or fail. All kinds of weird errors will occur and you will have a hard time figuring out what is going on.

The Solution

So clearly the problem here is that we have to reduce the amount of queries we execute. Using one query for each row is not ideal and very inefficient. I managed to think of several possible solutions to this problem. Each solution has its own pros and cons and will be suitable for different use cases. You have to decide which solution is best for your use case. I will explain each solution in detail and show you how to implement it.

1. Don’t update each row but delete and re-create in bulk

The first solution is to delete the rows and re-create them with the updated data. This is a very simple solution and works well if your data is different for each row and there is no way to reduce the amount of queries. The downside of this solution is that you will first have to read the data, store that in memory, then delete the data in the database and insert in bulk using createMany3. It may come as a surprise, but reading, deleting, and re-inserting the data in bulk is quicker than updating each individual row when you have to update thousands. You have to consider using transactions to make sure that the data is not lost if a step in the process fails. If you have to maintain availability of the data, consider using another table or database to store the data temporarily and then swap the tables or databases when the process is finished.

// Read all rows and store in memory
const data = await prisma.table.findMany()

// Delete all rows
await prisma.table.deleteMany()

// Do some data manipulations here
await prisma.table.createMany({ data })

2. Find similarities in the data, group them and update

The second solution is to find similarities in the data and group them together. This will result in fewer queries because you can use the updateMany2 feature of Prisma. This solution works well if you have similarities in your data. Data in each updateMany statement cannot be different. Under the hood, Prisma will convert the statement to an update statement with a WHERE clause containing multiple IDs. But if you don’t, you will end up with a lot of groups with only one row. This is not ideal and will not solve our problem. This solution can be complex and requires a lot of compute resources.

The following example shows a simple implementation of this solution. It groups the data by a property and then updates each group. This is a very simple example. You will have to do some additional data manipulations to make this work for your use case.

// Read all rows and store in memory
const data = await prisma.table.findMany()

// Perform any additional data manipulations here 
// for example retrieving additional data

// Group the data by similarity
const groups = data.reduce((acc, row) => {
  const key = row.someProperty
  if (!acc[key]) {
    acc[key] = []
  }
  acc[key].push(row)
  return acc
}, {})

// Update each group
for (const group of Object.values(groups)) {
  await prisma.table.updateMany({
    where: {
      id: {
        in: group.map(row => row.id)
      }
    },
    data: {
      someProperty: 'someValue'
    }
  })
}

3. Use transactions to better use the connection pool

The third solution is to use transactions to better use the connection pool 4. You can prevent the connection pool from being exhausted by using one or multiple transactions. Group queries in each transaction and execute the transaction. This will prevent any additional complexity in your software but can be very slow, as running a transaction with hundreds or thousands of queries can take a long time. This solution is not ideal but can be used if you don’t have any other options. You can also apply a transaction in any of the other solutions to better use the connection pool.

const transaction = await prisma.$transaction([
  prisma.table.update({
    where: {
      id: 1
    },
    data: {
      someProperty: 'someValue'
    }
  }),
  prisma.table.update({
    where: {
      id: 2
    },
    data: {
      someProperty: 'someValue'
    }
  })
])

4. Execute queries in bulk using executeRaw

You can execute queries in bulk using the executeRaw method. This method allows you to execute raw SQL queries against your database. This will give you complete control over the queries you execute. You can use this method to execute thousands of queries at once or use any specific database optimization logic. You will have to write the queries yourself and make sure that you don’t make any mistakes. You will also have to make sure that you don’t introduce any security vulnerabilities (SQL Injection). Don’t use queryRawUnsafe here!


await prisma.$executeRaw`
  UPDATE table
  SET someProperty = 'someValue'
  WHERE ...
`

Conclusion

In this article, I explained how to execute thousands of queries at once using Prisma. I showed you several solutions to this problem and explained the pros and cons of each solution. I hope this article helped you to solve your use case. If you have any questions or feedback, feel free to drop a comment below.

Hi there 👋

My name is Joeri Smits and I write about technical stuff on this website. If you have an exciting new project or just want to share thoughts, feel free to reach out to me on LinkedIn!