So here we are again, deep in the trenches of development. It's fascinating how, as developers, we often think we know a lot, but in reality, sometimes what we know is just the tip of the iceberg. In fact, a good 90% of the time, what we think we know might be incorrect or incomplete ๐. I recently had one of those moments while working with Drizzle, which I think is one ORM that is very close to SQL, and encountered a frustrating bug that took me a day to figure out.
The Bug
I was working on a project involving transactions where each transaction could optionally have a studentId
. I had written an API endpoint to fetch transaction data, but noticed that transactions without a studentId
were not being returned. Only those with a studentId
were showing up. After a day of frustration and debugging, I finally realized that the issue lay in the type of join I was using in my SQL query.
Here's the API code that led me to this revelation:
javascriptCopy code.get(
"/:id",
zValidator(
"param",
z.object({
id: z.string().optional(),
})
),
clerkMiddleware(),
async (c) => {
const auth = getAuth(c);
const { id } = c.req.valid("param");
if (!id) {
return c.json({ error: "Missing Id" }, 401);
}
if (!auth?.userId) {
return c.json({ error: "Unauthorized" }, 401);
}
const [data] = await db
.select({
id: transactions.id,
date: transactions.date,
categoryId: transactions.categoryId,
payee: transactions.payee,
studentId: registrationNumbers.number,
amount: transactions.amount,
notes: transactions.notes,
accountId: transactions.accountId,
})
.from(transactions)
.innerJoin(accounts, eq(transactions.accountId, accounts.id))
.leftJoin(students, eq(transactions.studentId, students.id))
.leftJoin(
registrationNumbers,
eq(students.registrationNumberId, registrationNumbers.id)
)
.where(and(eq(transactions.id, id), eq(accounts.userId, auth.userId)));
if (!data) {
return c.json({ error: "Not Found" }, 404);
}
return c.json({ data });
}
)
The key change I made was switching from INNER JOIN
to LEFT JOIN
for the students
and registrationNumbers
tables. This adjustment ensured that transactions without a studentId
were also included in the results. To understand why this worked, let's dive into the differences. Here is the research I did.
INNER JOIN
Definition:
An INNER JOIN
returns only the rows that have matching values in both tables.
Usage:
It is often used to retrieve only the records that have corresponding matches in the joined tables.
Behavior:
If there is no match between the tables, the row is excluded from the result set.
Example: Suppose we have two tables, A
and B
:
A.id | A.value |
1 | A1 |
2 | A2 |
3 | A3 |
B.id | B.value |
1 | B1 |
2 | B2 |
An INNER JOIN
on A.id
=
B.id
would produce:
Rows with A.id
= 3
are excluded because there is no matching B.id
= 3
.
LEFT JOIN (or LEFT OUTER JOIN)
Definition:
ALEFT JOIN
returns all rows from the left table and the matched rows from the right table. If there is no match, the result is NULL
on the side of the right table.
Usage:
It is often used to retrieve all records from the left table, regardless of whether they have matching rows in the right table.
Behavior:
If there is no match, the result will still include the row from the left table, but with NULL
values for the columns from the right table.
Example: Using the same tables A
and B
:
A LEFT JOIN
on A.id
=
B.id
would produce:
Row with A.id
= 3
is included with NULL
values for columns from table B
because there is no matching B.id
= 3
.
RIGHT JOIN (or RIGHT OUTER JOIN)
Definition:
A RIGHT JOIN
returns all rows from the right table, and the matched rows from the left table. If there is no match, the result is NULL
on the side of the left table.
Usage:
It is used often used to retrieve all records from the right table, regardless of whether they have matching rows in the left table.
Behavior:
If there is no match, the result will still include the row from the right table, but with NULL
values for the columns from the left table.
Example: Using the same tables A
and B
:
A RIGHT JOIN
on A.id
=
B.id
would produce:
Row with B.id
= 3
is included with NULL
values for columns from table A
because there is no matching A.id
= 3
.
FULL OUTER JOIN
Definition:
A FULL OUTER JOIN
returns all rows when there is a match in either left or right table. This means it returns all rows from the left table and all rows from the right table. If there is no match, the result is NULL
on the side that does not have a match.
Usage:
It isoften used to retrieve all records from both tables, with NULL
in places where there is no match.
Behavior:
If there is no match, the result will include the row from both tables, but with NULL
values for the columns from the table that does not have a match.
Example: Using the same tables A
and B
:
A FULL OUTER JOIN
on A.id
=
B.id
would produce:
Rows with A.id
= 3
and B.id
= 3
are included with NULL
values for columns from the other table because there is no matching B.id
= 3
and A.id
= 3
respectively.
CROSS JOIN
Definition:
A CROSS JOIN
returns the Cartesian product of the two tables. This means that every row from the first table is paired with every row from the second table.
Usage:
It is used when you want to combine all rows from the first table with all rows from the second table.
Behavior:
The result set includes all possible combinations of rows from the two tables.
Example: Using the same tables A
and B
:
A CROSS JOIN
on A
and B
would produce:
Every row from A
is combined with every row from B
.
SELF JOIN
Definition:
A SELF JOIN
is a regular join but the table is joined with itself.
Usage:
It is used when you want to compare rows within the same table or query hierarchical data.
Behavior:
The table is effectively duplicated and then joined on the specified condition.
Example: Using a single table A
:
A.id | A.value | A.parent_id |
1 | A1 | NULL |
2 | A2 | 1 |
3 | A3 | 1 |
A SELF JOIN
on A.id
= A.parent_id
would produce:
Rows with A.parent_id
are matched with rows from the same table where A.id
equals A.parent_id
.