This detailed guide covers all important SQL JOIN topics, from basic concepts to advanced techniques. Bookmark this guide for future reference – it’s packed with useful resources and guides to help you work with JOINs effectively.
SQL JOINs are essential in SQL and data analysis, as they let you combine data from different tables into a unified view. In this article, I've gathered everything you need to know about SQL JOINs. My goal is to give you a clear and easy guide that helps you understand how JOINs work.
Along with the guide, I've picked the best resources and tools to help you learn more. You'll find articles, tutorials, and courses that are great for beginners and experienced SQL users. These resources will help you practice using JOINs in your projects.
Introduction to SQL JOINs
JOINs are a fundamental concept in SQL. They let you combine data from two or more tables in a database, offering a more comprehensive view of data for use in analysis and reporting. JOINs are indispensable for anyone working with SQL, whether you're a data analyst, a database administrator, a software developer, or just beginning to explore SQL.
The primary purpose of a SQL JOIN is to combine data from multiple tables so that you can work with them as a single table. This is particularly useful in relational databases, where data is often distributed across various tables.
For instance, consider a scenario where one table contains customer information and another holds order details. Using JOIN, these tables can be combined to see which customers placed which orders, simplifying the process of complex data queries.
Question: What is a SQL Join?
A SQL JOIN clause combines data from two or more tables into a single dataset. Records are matched (i.e. joined) based on a given condition. For example, you could join the book
and author
tables based on the author's ID equal in both tables.
SQL JOIN Syntax and Examples
Let’s say that you have two tables: books
with the columns book_id
, title
, and author_id
, and authors
with the columns author_id
and name:
books
book_id | title | author_id |
---|---|---|
1 | The Great Gatsby | 101 |
2 | To Kill a Mockingbird | 102 |
3 | Moby-Dick | 103 |
4 | The Catcher in the Rye | 104 |
5 | Greek Mythology | null |
authors
author_id | name |
---|---|
101 | F. Scott Fitzgerald |
102 | Harper Lee |
103 | Herman Melville |
104 | J.D. Salinger |
105 | Walt Whitman |
If you want to see a list of books along with their authors, you need to use a JOIN
to bring these two tables together. You would use an SQL query like this:
SELECT title, name FROM books JOIN authors ON books.author_id = authors.author_id;
This query combines data from the books
and authors
tables. Observe the syntax of JOIN
: After FROM
, you put the name of the first table, as you would in any other query. After the name of the first table, you put the keyword JOIN
, followed by the name of the second table. After that, you put the ON
keyword with the condition telling the database how to connect tables.
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
This query combines data from the two tables where the books.author_id
value is equal to the authors.author_id
value. For example, the author_id
for the book “The Great Gatsby” is 101. The author with ID 101 is F. Scott Fitzgerald. That’s why F. Scott Fitzgerald is listed as the author of “The Great Gatsby”. You can look at other rows in the table and similarly justify why they are in the result.
It’s important to emphasize that the ON condition in JOIN
can be anything you want. It typically is an equality (match) between column values from two different tables, but it can be any condition you want. The database takes rows from both tables and checks if the condition is true for those rows. The rows with true values are added to the result.
The animation below provides a visual explanation of how SQL JOINs work:
Types of SQL JOINs
There are different types of JOINs in SQL; each serves a different purpose depending on what you need from your data. In this section, I will discuss various types of JOINs, exploring how and when to use them to the best effect. I will show you INNER JOIN
, LEFT JOIN
, RIGHT JOIN
, FULL JOIN
, and others – each providing unique ways to combine and analyze data across multiple tables in your database.
INNER JOIN
INNER JOIN
is the most basic type of JOIN
. It is used to combine rows from two or more tables based on a specified condition, ensuring that only the rows that meet the specified criteria are included in the result set. It's particularly useful when you want to match and merge rows that have common values in these columns.
Let's consider the books
and authors
tables from the previous section. Note that in our data there is a book called “Greek Mythology” that doesn’t have an author. We also have the author Walt Whitman, who hasn’t written any books in our database.
To match each book with its author, you'd use an INNER JOIN
on these tables. The SQL query would look like this:
SELECT title, name FROM books INNER JOIN authors ON books.author_id = authors.author_id;
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
This query combines data where the books.author_id
value is equal to the authors.author_id
value. For example, the author_id
for the book “The Great Gatsby” is 101. The author with ID 101 is F. Scott Fitzgerald. That’s why F. Scott Fitzgerald is listed as the author of “The Great Gatsby”.
INNER JOIN
is powerful because it allows you to combine relevant data from different tables. However, it only returns rows where there's a match in both tables. If there's no matching row, the data won't appear in the output. The book “Greek Mythology” is not in the result because it has no author. Author Walt Whitman is not there because there is no book associated with him in the second table.
It’s important to note that the keyword INNER
is optional. If you just write JOIN
, the database understands it as INNER JOIN
. The query from the previous section will return exactly the same result. JOIN
with no additional keywords always means INNER JOIN
in SQL.
SELECT title, name FROM books JOIN authors ON books.author_id = authors.author_id;
Understanding INNER JOIN
s is crucial because they are the most basic type of JOIN
. As we explore other types of JOINs, you'll see how each serves a unique purpose. However, INNER JOIN
s are often the go-to choice for straightforward matching scenarios.
Want more on INNER JOINs? Check out these articles:
- What Is an SQL INNER JOIN?
- SQL JOINs Explained: 5 Clear SQL INNER JOIN Examples for Beginners
- SQL INNER JOIN Explained in Simple Words
- An Illustrated Guide to the SQL INNER JOIN
OUTER JOINs
An OUTER JOIN
in SQL is used when you want to combine rows from two or more tables and include those rows in one table that don't have matching rows in the other table. This is the opposite of INNER JOIN
, which includes only matching rows from both tables.
Let’s use the same books
and authors
tables from our INNER JOIN
example. Perhaps you want to include all books in the result, even those that don’t have an author. Perhaps you want to include all authors, even those who haven’t written any books. Perhaps you want to include all the authors and all the books, regardless if they have a match in the other table. In all those scenarios you’ll need an OUTER JOIN
. An OUTER JOIN
is used for finding data that doesn't have an exact match across tables.
There are three variants of OUTER JOIN
: LEFT OUTER JOIN
, RIGHT OUTER JOIN
, and FULL OUTER JOIN
. The word OUTER
is optional when referring to JOIN
s. Therefore, in describing the types, it's common to simply use LEFT JOIN
, RIGHT JOIN
, and FULL JOIN
.
You can find more about OUTER JOIN in these resources:
- What Is the OUTER JOIN in SQL?
- An Illustrated Guide to the SQL OUTER JOIN
LEFT JOIN
When using a LEFT JOIN
, the query returns all the records from the left (first) table. This is the table listed immediately after FROM
. It also shows the corresponding rows from the right (second) table. Let’s look at the example:
SELECT title, name FROM books LEFT JOIN authors ON books.author_id = authors.author_id;
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
Greek Mythology | null |
The query retrieves all rows from the left table (books
) and the matching rows from the right table (authors
). The query retrieves every book and pairs it with its author. It includes all books, even those that don’t have a matching author. “Greek Mythology” has no corresponding author in the authors
table, but it’s still included in the result. However, the name of the author is listed as null
for this row, since there is no matching ID in the authors
table.
Want to go deeper in LEFT JOIN? Check out these articles:
- What Is LEFT JOIN in SQL?
- What Is a LEFT OUTER JOIN in SQL? An Explanation With 4 Examples
RIGHT JOIN
A RIGHT JOIN
is the mirror image of the LEFT JOIN
: it returns all the records from the right table with the corresponding rows from the left table. (The right or second table is immediately after the JOIN
keyword.) Let’s look at the example:
SELECT title, name FROM books RIGHT JOIN authors ON books.author_id = authors.author_id;
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
null | Walt Whitman |
This query lists all authors, pairing them with their books where possible. For authors without a corresponding book in the books
table, the book details will be null
. You can see that there are no Walt Whitman books in our table, so a null
value is displayed in the column title
.
To find our more about RIGHT JOIN
visit this article:
- RIGHT JOIN in SQL: A Beginner's Tutorial
FULL JOIN
A FULL JOIN
combines the functionality of LEFT JOIN
and RIGHT JOIN
. It will produce a result that includes all records from both tables. Here’s an example:
SELECT title, name FROM books FULL JOIN authors ON books.author_id = authors.author_id;
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
Greek Mythology | null |
null | Walt Whitman |
All books and all authors are listed, matching where they can and showing nulls where there isn't a match in the other table. As you can see in the result, “Greek Mythology” appears without a matching author, and Walt Whitman appears without any book.
You can read more on FULL JOIN
s here:
- What Is FULL JOIN in SQL? An Explanation with 4 Examples
- What FULL JOIN Is and When to Use It
CROSS JOIN
A CROSS JOIN
in SQL combines all rows from two tables, creating every possible pair between the rows of the tables. In the case of our example tables, a CROSS JOIN
would pair every book with every author – regardless if they're actually related:
SELECT name, title FROM books CROSS JOIN authors;
The syntax of CROSS JOIN
is similar to other JOIN
types: you put the name of the first table before the CROSS JOIN
keyword and the name of the other table after CROSS JOIN
. CROSS JOIN
does not require an ON
condition.
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | F. Scott Fitzgerald |
Moby-Dick | F. Scott Fitzgerald |
The Catcher in the Rye | F. Scott Fitzgerald |
Greek Mythology | F. Scott Fitzgerald |
The Great Gatsby | Harper Lee |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Harper Lee |
The Catcher in the Rye | Harper Lee |
Greek Mythology | Harper Lee |
The Great Gatsby | Herman Melville |
To Kill a Mockingbird | Herman Melville |
… | … |
The table continues in the same manner for all authors from table authors
, creating a comprehensive list of all possible combinations of authors and books. It's easy to calculate that since we have 5 books in the books
table and 5 authors in the authors
table, we will get a result of 5*5, or 25 rows.
Would you like to know more about CROSS JOIN
? Read our Illustrated Guide to the SQL CROSS JOIN.
NATURAL JOIN
A NATURAL JOIN
in SQL is a JOIN
type that automatically combines rows from two or more tables based on columns with the same name.
Let’s take our books
and authors
tables. To perform a NATURAL JOIN
between these tables to find books and their respective authors, you would use the following SQL query:
SELECT title, name FROM books NATURAL JOIN authors;
NATURAL JOIN
identifies matching columns by their names, so it doesn’t require specifying the ON
condition. In our example, NATURAL JOIN
will automatically match rows based on the common column author_id
. The result will be a unified table like this:
RESULT:
title | name |
---|---|
The Great Gatsby | F. Scott Fitzgerald |
To Kill a Mockingbird | Harper Lee |
Moby-Dick | Herman Melville |
The Catcher in the Rye | J.D. Salinger |
In practice, NATURAL JOIN
is rarely used. Databases have different naming conventions. Some column names, like id
or name
, can be shared by many tables – even though they do not store the same information. It’s more convenient to specify the ON
condition explicitly and have full control over how the JOIN
is performed. Still, it’s worth knowing about NATURAL JOIN
for historical reasons (or in the rare event you’re asked about it in an interview).
More SQL JOIN Resources
For further insights and practical usage scenarios of all SQL JOIN
types, refer to these resources:
- SQL JOIN Types Explained
- A Short Overview of SQL JOINs
- 7 SQL JOIN Examples with Detailed Explanations
How to Practice SQL JOINs
Understanding SQL JOIN types is essential, but the real skill lies in practical application. In this section, we’ve gathered some recommended resources for practicing JOINs.
SQL Basics
The SQL Basics course at LearnSQL.com provides an excellent introduction to JOINs. The course covers the basics of SQL, dedicating a significant portion of its time to JOINs. It is an ideal choice for those who are getting started with SQL and are looking to build a solid foundation.
This course – like all of our courses – is interactive. Each new concept is reinforced by an exercise where you write an actual SQL query. Our platform automatically verifies your solution and gives you feedback on your work. Everything happens right in your web browser; you don’t need to install anything on your computer. You can focus solely on learning and practicing SQL queries, making our courses an ideal choice for beginners looking for a hassle-free learning experience.
In the SQL Basics course, you'll learn and practice all the different JOIN types. The course contains 129 exercises, which is equivalent to over 10 hours of coding. Over one-third of the course is devoted solely to SQL JOINs. In many other parts of the course, you’ll combine JOIN knowledge with other SQL features.
The SQL Basics course uses standard SQL. If you need to learn a specific SQL dialect, we offer the same course in three popular SQL dialects:
- SQL Basics in MySQL
- SQL Basics in MS SQL Server
- SQL Basics in PostgreSQL
SQL JOINs
Our SQL JOINs course is designed for learners who want to focus on practicing SQL JOINs. In the course, you will find 93 interactive exercises. You will review most of the topics beginners find challenging, including:
- Joining and selecting data from three or more tables.
- The use cases of
LEFT JOIN
,RIGHT JOIN
, andFULL JOIN
. - How to correctly filter data with different kinds of JOINs.
- How to join a given table with itself (self-joins).
- How to join tables on non-key columns.
This course is perfect for new learners as well as experienced professionals who want to refresh their SQL JOIN knowledge before a test or an interview. You can also take this course in MySQL, MS SQL Server, or PostgreSQL.
If you’re not ready to enroll in a course, check out our article SQL Joins: 12 Practice Questions with Detailed Answers. It contains 12 carefully selected SQL JOIN exercises to help you practice the most important aspects of JOINs. Each question is accompanied by an answer and a detailed explanation. This approach helps you build a strong foundational understanding of JOINs. The exercises in the article are taken from our SQL JOINs course, so you can get a taste of what the course is like.
More JOIN Practice
If you're looking to delve even deeper into SQL JOINs, our practice courses are the perfect choice. Our offering includes:
- SQL Practice Set
- Basic SQL Practice: A Store
- Basic SQL Practice: University
- Basic SQL Practice: Run Track Through Queries!
Each of these courses utilizes a realistic database, enabling you to practice on true-to-life examples rather than abstract ones. All our SQL practice courses are structured into chapters dedicated to different topics, such as single table queries, joins, aggregation, and grouping. This allows you to focus on the areas you wish to understand better.
Our SQL practice courses are also a great resource for those preparing for job interviews. If you are waiting for your interview or need to recap your knowledge, check out The Top 10 SQL JOIN Interview Questions and How to Answer Them.
Advanced JOIN Techniques
As you work more with SQL JOINs, you'll find yourself faced with more complex scenarios that demand a deeper understanding of how to manipulate and connect your database tables.
In this section, I'll show you some common real-world challenges that call for advanced JOIN
techniques. For each of these challenges, I’ll show you what the problem is and give you links to our practical guides that discuss the problem in detail.
How to JOIN Two Tables in SQL
We’ve covered the basics of joining two tables. If you’re looking for an in-depth guide on how to join two tables in SQL, with multiple examples and actionable tips, read our articles:
- How to Join Two Tables in SQL
- How to JOIN Tables in SQL
How to JOIN 3 or More Tables
In real life, you usually join more than two tables in one query. Some complex SQL reports can require joining a dozen different tables.
Let’s take our books
table, now a little bit updated, and the authors
table. Let’s add another one, publishers
.
books
book_id | title | author_id | publisher_id |
---|---|---|---|
1 | The Great Gatsby | 101 | 201 |
2 | To Kill a Mockingbird | 102 | 202 |
3 | Moby-Dick | 103 | 202 |
4 | The Catcher in the Rye | 104 | 203 |
5 | Greek Mythology | null | 204 |
publishers
publisher_id | publisher_name |
---|---|
201 | Penguin Classics |
202 | Harper & Brothers |
203 | Little, Brown |
204 | Ballantine Books |
Each row in this table represents a publisher. The publisher_id
is a unique identifier for each publisher.
To join these three tables, we can use this query:
SELECT b.title, a.name AS author_name, p.publisher_name AS publisher_nameFROM books bINNER JOIN authors a ON b.author_id = a.author_idINNER JOIN publishers p ON b.publisher_id = p.publisher_id;
Joining three tables essentially involves first performing a join between two tables and then adding a third table. The JOIN
operation between the first two tables creates a kind of 'virtual' table; the subsequent JOIN
connects this virtual table with the third table. Additional JOINS
can be added as needed.
As you can see, I used aliases in the SQL query for clarity. SQL aliases are custom names that you can give to the columns and tables you include in your queries. Aliases make the query more readable, particularly in complex joins involving multiple tables. They help distinguish between columns of the same name from different tables, ensuring accuracy in data retrieval. This approach streamlines the query, making it easier to understand and maintain, especially in larger databases with extensive table names and structures.
Our query will combine the data from all three tables based on their relationships: books
and authors
are joined on author_id
, and books
and publishers
are joined on publisher_id
. The result will display each book's title, its author's name, and the publisher's name.
RESULT:
title | author_name | publisher_name |
---|---|---|
The Great Gatsby | F. Scott Fitzgerald | Penguin Classics |
To Kill a Mockingbird | Harper Lee | Harper & Brothers |
Moby-Dick | Herman Melville | Harper & Brothers |
The Catcher in the Rye | J.D. Salinger | Little, Brown |
For detailed examples and insights on how to join multiple tables in SQL, I recommend referring to:
- How to Join 3 Tables (or More) in SQL
- An Illustrated Guide to Multiple Join
- How to Write Multiple Joins in One SQL Query
These awesome articles provide a broader range of scenarios and examples, enhancing your understanding of SQL joins in different database contexts.
LEFT JOIN
OUTER JOIN
s – in particular, the most commonly used LEFT JOIN
– are more challenging for SQL users than INNER JOIN
. In this section, we’ll discuss typical problems that occur when you work with LEFT JOIN
. We’ll include links to relevant articles if you want to learn more.
How to LEFT JOIN Multiple Tables
Using LEFT JOIN
when joining multiple tables can be surprisingly tricky. Let’s see an example.
Suppose you want to list all authors in our database with the books they authored and the name of the publisher. You want to include all authors, even those who haven’t authored any books. Here’s the query:
SELECT a.name AS author_name, b.title, p.publisher_name AS publisher_nameFROM authors aLEFT JOIN books bON b.author_id = a.author_idLEFT JOIN publishers p ON b.publisher_id = p.publisher_id;
You have to use LEFT JOIN
twice in the query. First, you join authors
and books
; you need to use LEFT JOIN
to include all authors. The second LEFT JOIN
is probably less expected: you need to LEFT JOIN
the table publishers
to keep authors with no books in the result. If you used INNER JOIN
, the database would only keep the records with books and publishers; records with no books would be removed.
RESULT:
author_name | title | publisher_name |
---|---|---|
F. Scott Fitzgerald | The Great Gatsby | Penguin Classics |
Harper Lee | To Kill a Mockingbird | Harper & Brothers |
Herman Melville | Moby-Dick | Harper & Brothers |
J.D. Salinger | The Catcher in the Rye | Little, Brown |
Walt Whitman | null | null |
Note that Walt Whitman is in our result. He has no books in our database, so obviously he had no publisher records. Thanks to the second LEFT JOIN
, his record is kept in the result even though there’s no associated publisher record.
For a detailed discussion on how to use LEFT JOIN
with multiple tables, refer to our guide:
- How to LEFT JOIN Multiple Tables in SQL
Using WHERE and ON Conditions in LEFT JOINs
Using WHERE
and ON
conditions in OUTER JOIN
s can be trickier than you think. The ON
clause is primarily used in JOIN
operations to define how tables are connected based on related columns. In contrast, the WHERE
clause filters the results after the JOIN
is executed.
Suppose you want to list all authors with all their books – but only with the books that have a capital M in the title. If the author hasn't authored the book with M in the title, you still want to list them in the result. This is the query you should write:
SELECT a.name, b.title FROM authors a LEFT JOIN books bON b.author_id = a.author_id AND title LIKE ‘%M%’;
RESULT:
name | title |
---|---|
Harper Lee | To Kill a Mockingbird |
F. Scott Fitzgerald | null |
Herman Melville | Moby-Dick |
J.D. Salinger | null |
Walt Whitman | null |
Note that the following query is not right:
SELECT b.title, a.nameFROM books bLEFT JOIN authors a ON b.author_id = a.author_id WHERE title LIKE ‘%M%’;
In this query, the condition on the title “cancels” the LEFT JOIN.
If there is no book record joined to the author, then the condition title LIKE ‘%M%’
is not true and authors without any books are not included in the result:
name | title |
---|---|
Harper Lee | To Kill a Mockingbird |
Herman Melville | Moby-Dick |
For a detailed discussion on how to use LEFT JOIN
with multiple tables, refer to our guide:
- How to LEFT JOIN Multiple Tables in SQL
For more information go to:
What Is the Difference Between WHERE and ON in SQL JOINs?
How to Include Unmatched Rows in JOINs
You might need unmatched rows in your data analysis to get a full picture of your dataset. In a sales database, for example, including rows with products that haven't been sold yet can provide insights into inventory or customer preferences. This comprehensive view is crucial for making informed decisions and understanding all aspects of your data.
To include all rows from one table in a JOIN
, you can use OUTER JOIN
s: LEFT JOIN
, RIGHT JOIN
, or FULL JOIN
. The choice depends on which table's unmatched rows you want to include.
For a detailed guide on how to keep unmatched rows when you join tables, check out this article: How to Keep Unmatched Rows When You Join Two Tables in SQL
Uncommon JOIN Methods
So far, we have discussed the most typical JOIN methods. However, you can join tables on any condition you want. In this section, we’ll focus on less common JOIN methods. Still, the scenarios presented here do happen. If you want to be an SQL expert, you should be aware that you may encounter them in your data analysis work.
Joining Tables by Multiple Columns
Usually, tables are connected with the help of a single column: one table refers to the ID column in the other table. However, sometimes the ID of a column consists of two or more columns. Any table that references that composite key must use multiple columns. If that’s the case, then the JOINs between these tables need to use multiple columns. Let’s see an example.
In an online game store, users can buy access to different games. Information about games is stored in the games
table, which includes the columns game_id
and name
. Information about users is stored in the table users
. The table game_access
stores information on which users ordered which games. Its primary key (the main identifier) is a pair of columns: user_id
and game_id
; each user can buy each game only once. The status of the payment is stored in the table payments
. It includes the columns game_id
and user_id
, which refer to the table game_access
.
To list the user ID, game ID, and payment status for game access, you’d write a query like this.
SELECT game_access.game_id, game_access.user_id, payments.status as payment_statusFROM game_access JOIN paymentsON game_access.user_id = payments.user_id AND game_access.game_id = payments.game_id;
The table payments
refers to the table game_access
by two columns: game_id
and user_id
. You have to provide both these columns to identify which game access record you’re referring to. That’s why the ON condition must use an AND operator to combine two equalities: game_access.user_id = payments.user_id and game_access.game_id = payments.game_id
.
Another example of joining tables on multiple columns can be found in t How to Join on Multiple Columns.
Joining Tables Without Common Columns
Sometimes you want to combine data from multiple tables that don’t share a common column. There are several ways to do it in SQL.
One method involves using a CROSS JOIN
, which pairs each row of one table with every row of another and results in all possible combinations. This is ideal for scenarios like matching every type of wine with every main course in a restaurant menu.
Another technique is to use UNION
, which is useful for combining similar data from different tables – e.g. merging lists of suppliers from various sources. UNION
is a set operator. In SQL, set operators combine the results of two or more queries into a single result.
For more detailed examples and explanations, please refer to Join Two Tables Without a Common Column article.
Joining a Table with Itself: Self-Joins
A self-join in SQL is a colloquial name for joining a table with itself. You may join the table with itself when you join multiple tables and the same table is joined in two different roles. You may join the table with itself when the table contains hierarchical data.
For instance, consider the employee
table with columns like employee_id
, employee_name
, and manager_id
(where manager_id
refers to the employee_id
of the manager of this employee). You’d like to list the names of employees with the names of their managers. Here's a sample query:
SELECT e.employee_name AS employee, m.employee_name AS managerFROM employees eJOIN employees m ON e.manager_id = m.employee_id;
In this query, the employees
table is joined with itself. This is what we call a self-join. The same table is put before and after the JOIN
keyword. To distinguish between the two instances of the employee
table, it is aliased twice: e
for employees and m for managers. When we refer to the columns of the table, we use the alias to tell the database which instance of the table we’re referring to.
You can read more on self-JOINs in our detailed articles here:
- What Is a Self-Join in SQL? An Explanation with Seven Examples
- An Illustrated Guide to the SQL Self-Join
Non-Equi JOINs
Typically the JOIN condition is equality. We call these types of joins equi-join. However, the ON condition in JOIN can be any condition you want. In some scenarios, the condition you’d write is not an equality. We call these types of JOINs non-equi JOINs.
Imagine you have a table person
that stores information about people and their budgets. In the table product
, you have products with the prices. You want to list people with products that they can afford. Here’s a query you would write:
SELECT person.name AS person_name, product.name AS product_nameFROM personJOIN product ON product.price <= person.budget;
The product is within a person’s budget, if its price is below the budget of the person. That’s what we put in the ON condition of the query: product.price <= person.budge
t. The query will list people with the names of the products that they can afford.
Other conditions you could use in a non-equi JOIN include inequality operators like <
, >
, <>
, or BETWEEN
. This is useful for matching rows based on a range or excluding duplicates in self-join.
To deepen your understanding of non-equi JOINs and enhance your data analysis skills, check out these insightful articles:
- An Illustrated Guide to the SQL Non-Equi Join
- Practical Examples of When to Use Non-Equi JOINs in SQL
Other JOIN Scenarios
SQL JOIN
s offer a rich tapestry of possibilities, extending far beyond basic table merges. They can tackle a wide array of data challenges, opening doors to more advanced and nuanced data analysis. Whether it's refining how data is combined or exploring alternative ways to link tables, the versatility of JOIN
in SQL is vast.
For those keen to explore these diverse scenarios, consider delving into the following resources, each presenting a unique scenario:
Joining To Only The First Row
Sometimes you may want to join tables where only the first row of the second table is considered. This is commonly used when you want to fetch the latest record or a specific piece of information from one table for each row in another table.
There are a few ways to do this, all described in How to Join Only the First Row in SQL article.
Joining Tables Without a JOIN Clause
In some specific SQL scenarios, you can combine data from multiple tables without using the traditional JOIN
keyword. You can achieve this by using subqueries, UNION
, or other set operations. For example, you can combine data from two tables using UNION
:
SELECT book_title AS combined_title FROM author_booksUNIONSELECT book_name AS combined_title FROM publisher_books;
For more information, go to How to Join Tables in SQL Without Using JOINs.
Eliminating Duplicates in SQL JOINs
Duplicate rows can occur in JOIN
results when there are multiple matching rows in the joined tables. To avoid duplicates, you can use the DISTINCT
keyword or GROUP BY
clause to aggregate data appropriately. Here's an example:
SELECT DISTINCT author.name AS author_name, publisher.name AS publisher_nameFROM author_books authorJOIN publisher_books publisher ON author.book_id = publisher.book_id;
In this example, we have two tables, author_books
and publisher_books
, with columns book_id
and name
for authors and publishers, respectively. The query retrieves distinct author and publisher names based on matching book IDs and gives them more meaningful aliases.
For more information, I recommend the following articles:
- How Do You Get Rid of Duplicates in an SQL JOIN?
- What Is the Role of DISTINCT in SQL?
More SQL JOIN Resources
There are numerous resources available online for those looking to learn more about SQL JOINs. These resources range from interactive tutorials and courses that offer a hands-on approach to learning to comprehensive articles and guides that provide in-depth explanations and examples. They cater to different learning styles, ensuring that whether you prefer structured courses or self-paced learning, there's something available to suit your needs. One of my favorites is the SQL JOIN Cheat Sheet.
This awesome learning resource provides a comprehensive guide to JOINs, including syntax and practical examples. It covers JOIN types, along with advanced concepts like self-join and non-equi join.
The cheat sheet is available in PDF and PNG formats for easy reference and includes examples of queries with multiple joins and multiple conditions. It's an essential resource for anyone looking to master SQL JOINs, from beginners to advanced users.
Another source of great content for beginners is YouTube. Having watched hundreds of SQL JOIN tutorials there, I've handpicked the best ones for you. These tutorials stand out for their clarity, comprehensive coverage, and practical examples; they are extremely useful for anyone looking to master SQL JOINs. The tutorials below are a personal favorite, combining quality content with an engaging teaching style.
While video tutorials are an excellent way to learn the concepts of SQL JOINs, they need to be complemented with real practice. Writing and executing your own SQL queries is crucial in solidifying your understanding and skills. This hands-on experience allows you to apply what you've learned in practical scenarios, helping you to grasp the nuances of JOIN operations and how they function in real-world databases.
Learn and Practice SQL JOINs
Mastering SQL JOINs is a journey that begins with understanding the basics and gradually evolves into handling more complex queries. The resources, tutorials, and practice exercises highlighted in this guide are tailored to make this learning process engaging and effective for beginners.
Remember, consistent practice is key to becoming proficient in SQL JOINs. By exploring different JOIN
types and applying them in various scenarios, you'll gain a comprehensive understanding and the ability to manipulate data effectively in your database projects.
Keep experimenting and learning, and soon SQL JOINs will become an integral part of your data management toolkit.