Monday, 24 July 2017

Querying Multiple Tables in MySQL

One of the drawbacks to normalization to the third form (3NF) is more cumbersome data extraction due to the greater number of tables. These require careful linking via JOIN clauses. Improper table joining can easily result in erroneous results or even in the dreaded Cartesian Product. In today’s article, we’ll explore how table joins are achieved in MySQL.

A Tale of Two Join Styles

SQL was adopted as a standard by the American National Standards Institute (ANSI) in 1986 as SQL-86 and the International Organization for Standardization (ISO) in 1987.

Using ANSI-89 JOIN syntax, tables were joined on common fields using the equals symbol (=):

SELECT o.OrderID, 
          od.ProductID
   FROM   Orders AS o, 
          dbo.OrderDetails AS od
   WHERE  o.OrderDate >= '20091001'
   AND    o.OrderID = od.ProductID;

That style was eventually eclipsed by a very different one that was introduced in the new SQL-92 standard. In general, developers found this style to be more readable than its predecessor by separating the joining criteria from the filter criteria. Here is the above query expressed using an SQL-92 join:

SELECT o.OrderID, 
          od.ProductID
   FROM  Orders AS o
   INNER JOIN dbo.OrderDetails AS od
           ON o.OrderID = od.ProductID
   WHERE o.OrderDate >= '20091001';

Although some of the ANSI 86 syntax, such as OUTER JOINs using "*=" and "=*" has been deprecated by some SQL vendors, some developers continue to use it in MySQL, where it is still supported. There is of course no harm in doing so, but for the purposes of this tutorial, I’ll be focussing on the SQL-92 standard, since it has been the recommended one since 1992.

With that being said, let’s go over the different types of table joins.

Inner Joins in Mysql

The type of table join depicted in the example above is referred to as an inner join. The MySQL INNER JOIN clause matches rows in one table with rows in other tables and selects rows that contain columns from both tables. Hence, the above query would only return rows where an Order has an associated record in the OrderDetails table.

SELECT o.OrderID, 
          od.ProductID
   FROM   Orders AS o
   INNER JOIN dbo.OrderDetails AS od
           ON o.OrderID = od.ProductID
   WHERE o.OrderDate >= '20091001';

Selecting data from Three Tables in Mysql

Inner Join statements may be utilized to link any number of tables to the query, so long as there are common fields between the tables. Here is the syntax for joining three tables:

SELECT * 
FROM  table1 
        INNER JOIN table2 
        ON table1.primaryKey=table2.table1Id   
        INNER JOIN table3 
        ON table1.primaryKey=table3.table1Id

Applying the above syntax to our original query, let’s say that we only wanted to see records where there was an associated customer to the order. As long as there was a column shared between the two tables, we could hook it up to our query as follows:

SELECT o.OrderID, 
        od.ProductID
 FROM   Orders AS o
 INNER JOIN dbo.OrderDetails AS od
         ON o.OrderID = od.ProductID

INNER JOIN dbo.Customers AS c
         ON o.CustID = c.Id
 WHERE o.OrderDate >= '20091001';

Outer Joins in Mysql

If an inner join only selects rows that contain non-null linked values from both tables, an outer join selects rows regardless of whether there is a linked record in the joined table. There are three types of outer joins:

  • Left: selects rows from the first (main) table regardless of whether or not there is a linked record in the joined table.
  • Right: selects rows from the joined table regardless of whether or not it links to a record in the first (main) table.
  • Full: selects rows from both the first (main) and joined table regardless of whether or not a record links the two tables.

This join type is not supported by MySQL.

Left Join Example

The following query uses a Left Join (sometimes referred to as a left outer join) to fetch all of the records from the Orders table where the OrderDate is equal or greater than 20091001, regardless of whether or not there is an associated OrderDetail:

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o
 LEFT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001';

In cases where there is no associated OrderDetail record, the ProductID will come up as NULL.

Right Join Example

Replacing the Left Join with a Right one will fetch all of the records from the OrderDetails table where the OrderDate is equal or greater than 2009-10-01. The result set will include unmatched records in the right table of the join. In this instance, that means that each OrderDetail is returned even if there is no associated Order. Note that right joins may also be referred to as a right outer join:

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o
 RIGHT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001';

In cases where there is no associated Order record, the OrderID will show a NULL value.

Two JOINs and a UNION

One method to simulate a full join is to take the UNION of two outer joins, for example:

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o

LEFT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001'

UNION

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o
 RIGHT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001';

The limitation of this solution is that it does not handle duplicate records in either of the joined tables. UNION ALL with an Exclusion Join

One way to eliminate duplicates is to use an exclusion join to exclude anything from the second result that is already included in the first:

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o

LEFT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001'

UNION ALL

SELECT o.OrderID, 
        od.ProductID
 FROM Orders AS o
 RIGHT JOIN dbo.OrderDetails AS od
        ON o.OrderID = od.ProductID
 WHERE o.OrderDate >= '20091001';

AND   o.OrderID IS NULL;

Notice the use of UNION ALL instead of plain UNION, which would eliminate both of the duplicated records.

Conclusion

To recap what we learned here today:

  • Stick with the SQL-92 syntax.
  • Inner Joins selects only rows that contain columns from both tables.
  • Outer Joins include Left, Right, and Full.
  • Outer Join result sets include unmatched records in the left, right, or both tables of a join, respectively.
  • Full Outer Joins may be simulated in MySQL using UNION or UNION ALL with an Exclusion Join.

Source: http://www.databasejournal.com/features/mysql/querying-multiple-mysql-tables.html

2 comments:

Advertisement

Advertisement