GraphQL and MySQL: Solving the Join Problem

by Nicola Marcacci Rossi

December 2019

As GraphQL adoption grows, developer teams have to implement GraphQL APIs for larger and larger enterprises and data sets. The larger the project, the more care has to be put into the database choice. While document databases such as MongoDB are the natural GraphQL companion, SQL databases are the standard for relational data.

Good news! Though trickier, it is possible to couple GraphQL and relational databases to very satisfying results. Unfortunately, documentation about how to achieve this has been lacking. This is what we want to remediate here.

This article assumes familiarity with GraphQL, Apollo Server, MongoDB and MySQL. It is part of a series on enterprise-grade GraphQL hosted by smartive.

Sample Data Structure: A Blog

All examples will build on a simple data structure: users can create posts, posts can have comments.

Our data structure

This is how you would describe it in a GraphQL schema (notice the to-one and to-many relationships, we will need to resolve them in different ways).

jsx
type Post {
id: ID
title: String
content: String
user: User! # to-one
comments: [Comment] # to-many
}
type Comment {
id: ID
content: String
post: Post! # to-one
user: User! # to-one
}
type User {
id: ID
name: String
}
query {
post(id: ID): Post
}

Simple Queries

Let’s start by implementing the resolver that allows us to answer a simple query without joins:

jsx
query {
post(id: "xyz") {
id
title
content
}
}

To build MySQL queries we will use the node library knex:

jsx
import Knex from "knex";
const client = Knex({ client: "mysql", connection: { host: "127.0.0.1" } });
const resolvers = {
Query: {
post: (_, { id }) => client.from("Posts").where({ id }).first()
}
};

Linked Queries

So far so good, what we have seen is equivalent to what one would do with a NoSQL database like MongoDB. But what if we want to query linked data?

jsx
query {
post(id: "xyz") {
title
content
user {
name
}
comments {
content
user {
name
}
}
}
}

Let’s look at a number of approaches and their tradeoffs.

Structured Data

With MongoDB one could store the whole document (the post, its author, its comments etc.) as a single entry. Like this we would be done: no more resolvers needed.

jsx
post: (_, { id }) => client.collection("posts").findOne(id)

This approach does have some downsides:

  1. It can’t be done for arbitrary data access patterns (what if I want each commenter’s posts? Suddenly more resolvers are needed).
  2. It leads to redundant/denormalized storage of data, which makes updates to it a complex issue.

And finally, this approach can’t be used with relational databases. We need a different approach.

The Naive Approach

The most natural way to solve linked queries is to fetch each entry independently, and let Apollo join everything:

jsx
Post: {
user: ({ userId }) => client.from("Users").where({ id: userId }).first(),
comments: ({ id }) => client.from("Comments").where({ postId: id })
},
Comment: {
user: ({ userId }) => client.from("Users").where({ id: userId }).first()
}

This is really what Apollo resolvers want you to do. But this approach, too, has some downsides:

  1. It leads to “N+1” queries to the database (still better than sending “N+1” queries from the client to the server)
  2. Joining data in the Node runtime is slower than in the database

Let’s look at the MySQL queries these resolvers generate when resolving the above query:

jsx
SELECT * FROM Posts WHERE id = "xyz";
SELECT * FROM Users WHERE id = <post.userId>;
SELECT * FROM Comments WHERE postId = "xyz";
/* For every comment */
SELECT * FROM Users WHERE id = <comment.userId>;

Number of queries = 3 + number of comments. We know we can do better:

jsx
SELECT
Posts.title AS title,
Posts.content AS content,
user.name AS user__name
FROM Posts
LEFT JOIN Users AS user ON Posts.userId = user.id
WHERE Posts.id === "xyz";
SELECT
Comments.content as content,
user.name AS user__name,
FROM Comments
LEFT JOIN Users AS user ON Comments.userId = user.id
WHERE Comments.postId === "xyz";

Number of queries = 2. The question is: how do we implement resolvers to generates such optimized queries?

The “Apollo Expert” Approach

If classic Apollo resolvers are like training wheels, they are definitely coming off now. We need to take the matter of resolving queries fully into our own hands. For that we need to implement an Almighty Root Resolver (ARR), i.e. a resolver that is responsible for solving the full query (recursively) without delegating things to lower resolvers.

To implement an ARR (and to do a number of other nifty things) you need to understand some key Apollo internals, namely how it represents the GraphQL schema and queries (hint: AST objects). I recommend you read our article Embrace The AST! before you proceed.

Solving To-One Queries

Let’s look at the post→ user relationship. It’s a to-one relationship, i.e. for each post there is exactly one user. Thinking in terms of MySQL results, to-one relationships are convenient because the result can always be represented in 1 single row — meaning, one can solve any amount of to-one relationships with a single query (with left joins). Here is a post resolver that builds such a query:

jsx
post: async (_, { id }, _ctx, info: GraphQLResolveInfo) => {
const query = info.fieldNodes.find(field => field.name.value === info.fieldName);
const sql = client.from("Posts").where({ id }).first();
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(f => f.name.value !== "post")
) {
sql.select(field.name.value);
}
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(f => f.name.value === "post")
) {
sql.leftJoin(`Users as user`, `Posts.userId`, `user.id`);
for (const userField of field.selectionSet.selections.filter(isFieldNode)) {
sql.select(
`user.${userField.name.value} as user__${userField.name.value}`
);
}
}
return unflatten(await sql);
}

What happens in this code snippet? We first query a post and its simple fields, then we left join the User table to get the relevant user information. unflatten is a function that transforms the flat MySQL result row into a well-structured document, we’ll leave the implementation of that as an exercise for the reader.

Obviously, this query only solves a single hard-coded relationship, but we’re slowly getting there.

By the way: if you find yourself wondering what that info object is and what you can do with it (spoiler: a lot), go back and read Embrace The AST! where we examine it in depth.

Solving To-Many Queries

To-many relationships, like the post→ comments relationship requires one to build a separate query for the related entity. This is how we do it:

jsx
post: async (_, { id }, _ctx, info: GraphQLResolveInfo) => {
const query = info.fieldNodes.find(field => field.name.value === info.fieldName);
const post = await client.from("Posts").where({ id }).first();
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(f => f.name.value === "comments")
) {
post.comments = await client.from("Comments").where({ postId: id });
}
return post;
}

The resolver first fetches the post, then the comments with a separate MySQL query, which it then appends to the post.

Generic Case With Directives

Let’s now look at how to structure an ARR to perform all needed joins (for to-one and to-many relationships) in a generic way. First, we need to annotate the schema with special directives:

jsx
type Post {
id: ID
title: String
content: String
user: User @toOne
comments: [Comment] @toMany(key: "userId")
}
type Comment {
id: ID
content: String
post: Post @toOne
user: User @toOne
}

In the resolver we can then inspect the schema AST to decide when to join, and how:

jsx
post: async (_, { id }, _ctx, info: GraphQLResolveInfo) => {
const query = info.fieldNodes.find(field => field.name.value === info.fieldName);
const type = info.schema.getType('Post').astNode as ObjectTypeDefinitionNode;
const sql = client.from("Posts").where({ id }).first();
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(field => !getDirective(type, field, 'toOne') && !getDirective(type, field, 'toMany'))
) {
sql.select(field.name.value);
}
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(field => getDirective(type, field, 'toOne'))
) {
// left join...
}
const post = unflatten(await sql);
for (const field of query.selectionSet.selections.filter(isFieldNode)
.filter(field => getDirective(type, field, 'toMany'))
) {
// sub query...
}
return post;
}

To complete the ARR, we need to extend this function to recursively compute sub joins. It’s a matter of further inspecting the query AST and deriving the queries that need to be executed based on the same core approach. But we’ll stop here as that the handling of that goes beyond the scope of such an article.

Just The Beginning

To translate GraphQL queries to efficient MySQL queries is not easy, but worth it. Once we had achieved that for our largest client, we were able to start adding more and more powerful features to our GraphQL schema in a generic way, always keeping the interface with MySQL optimized. Examples:

  • Filters and ordering (e.g. posts(where: { title: "xyz" }, orderBy: [{created: ASC}], first: 10, skip: 0) { ... })
  • Aggregations (e.g. posts { commentsAggregations { COUNT AVG { rating } } })
  • Aggregation filters (e.g. posts(where: { comments: {GTE: { COUNT: 6 } } }) { ... })
  • Ordering by aggregations (e.g. posts(orderBy: [{ comments: { COUNT: DESC }]) { ... })
  • Distinct aggregations (e.g. posts { commentsAggregations { distinctUsers { COUNT user { username } } } })
  • Polymorphism and joining sub-tables containing non-common fields (e.g. users { name ...on Student { school } })
  • Handling external (as in: not coming from the database) fields, external filters, write and access permissions, sanitization rules… all with schema directives
  • Schema generation based on a tighter (TypeScript) representation of the model

All in all, our decision to embrace the AST and create our own ARRs has allowed us to create a very powerful model-driven API (we keep being surprised ourselves about what use cases are already covered by it).

Build Enterprise-Grade GraphQL Applications

Want to become a GraphQL pro? Follow us and read our whole series on enterprise-grade GraphQL applications.

Part 1: Building Enterprise Grade APIs with GraphQL, MySQL and Node.js

Part 2: Advanced GraphQL Patterns: The Almighty Root Resolver

Part 3: Advanced GraphQL Patterns: Embrace the AST!

Part 4: GraphQL and MySQL: Solving the Join Problem

Part 5: GraphQL and Elasticsearch: A Love Letter 💌

Part 6: The most popular GraphQL Servers and Clients for Node.js

Need an expert team to implement your advanced web application? Check out our portfolio.

Previous post
Back to overview
Next post