Database Joins

Sometimes we need information from more than one table. If we keep our users in one table, and our user demographic data in another, we need some way to associate (say) a user's name with their age:

users

id	name
8	Orinocco
12	Tomsk
3	Uncle Bulgaria

user_demographics

id	user_id	age
1	8	3
2	12	4
3	3	61

A join allows us to combine the information based on the values in a column that exists in both tables, in this case combining values when id is equal to user_id:

name	age
Orinocco	3
Tomsk	4
Uncle Bulgaria	61

A nod to normalisation

The question you're probably asking is, "Why not put all the information in one table to begin with?" Well, several reasons, centering around a concept called normalisation. This is a huge topic in itself, but can be summarised at a beginning level by saying, "Keep information in lots of small tables and link them to each other using unique IDs." This has a number of benefits including reducing redundancy, ease of adding new types of data, and reducing complexity.

Knex joins

Using Knex.js we can perform joins without having to know the SQL syntax for the query:

knex('dogs')
  .join('breeds', 'dogs.breed_id', '=', 'breeds.id')
  .select('dogs.name', 'breeds.name as breed')

In raw SQL this translates to:

SELECT dogs.name, breeds.name AS breed
FROM dogs
INNER JOIN breeds ON dogs.breed_id = breeds.id

dogs

id	name	breed_id
1	Daisy	5
2	Dexter	3
3	Clarry	2

breeds

id	name
1	Boxer
2	Spoodle
3	King Charles Cavalier
4	Bulldog
5	Mutt

joined table

name	breed
Daisy	Mutt
Dexter	King Charles Cavalier
Clarry	Spoodle

Note a few things about the above result:

Not all of the dog breeds are listed: only those that correspond to the dogs in the first table
Only the columns we asked for are returned, and they are labelled name and breed.

Name conflicts

Let's talk about that second point for a moment. Why do we use this syntax?

  .select('dogs.name', 'breeds.name as breed')

The trick is, Knex returns column names as object properties. What's wrong with this picture?

  [
    {
      name: 'Daisy',
      name: 'Mutt'
    }
  ]

Of course, we can't have an object with two properties the same, so what actually happens is the second one gets overwritten:

  [
    {
      name: 'Mutt'
    }
  ]

Obviously that's not what we want! The solution is to provide another name, an alias, for any properties that conflict in this way. We can write 'tablename.columnname as foobar' to achieve this. In the example above, our output from Knex looks like so:

  [
    {
      name: 'Daisy',
      breed: 'Mutt'
    },
    {
      name: 'Dexter',
      breed: 'King Charles Cavalier'
    }
  ]

So if we were to describe the above query in English we could say:

"Join the dogs table and the breeds table where the breed id matches. Then return the data in two columns: "name" (the name column in the dogs table) and "breed" (the name column in the breeds table)."

Resources

Database Keys and Migrations

Knex Routes