\Zend\Db\Sql – Creating Joins and Unions with Ease

Zend Framework 2

In the first part of this series on \Zend\Db\Sql\Select, we jumped the gun a bit and went straight in to looking at building Where clauses. We looked at a number of the predicates that are available to us, such as In, Between and EqualTo and saw just how easy \Zend\Db\Sql\Select makes both building and maintaining queries.

In this, the second part, we’re backtracking a bit and looking at Joins and a slightly more esoteric feature of SQLUNIONs. By the end of today’s tutorial, you’ll be building some pretty good queries that should satisfy most of your daily requirements.

Joins

Selecting information from one table’s fine. But you quickly need to select across a number of tables. Most queries I see in code that I’ve taken on in consulting projects and other roles normally involves at least one join (even if it’s a self-join at times).

If you’d like a good (visual) refresher on SQL joins, read this post from Jeff Atwood over at Coding Horror.

An interesting point I’ve heard, circumstantially, is that after a number of joins, the performance of the query begins to break down because most RDBMSs are not optimized for more than that. However I don’t have a link supporting that. But this quote from AskTom at Oracle sums it up well:

A 20 table join — no worries, well, except that I cannot imagine the data model that would make me do that!)

However the physical limits are:

With that said, let’s build on the previous tutorial in this series and go through building joins, including Inner, Left, Right and Outer (we’ll include a bonus – a self join).

Below is the schema of the additional table we’ll use to demonstrate the joins. It’s a simple department table. We’re going to construct queries that show things such as: who is the manager of department X and what department is which user in.

Department Table

+---------------------+--------------+------+-----+---------+----------------+
| Field               | Type         | Null | Key | Default | Extra          |
+---------------------+--------------+------+-----+---------+----------------+
| DepartmentID        | int(11)      | NO   | PRI | NULL    | auto_increment |
| UserID              | int(11)      | NO   | PRI | NULL    | auto_increment |
| Department          | varchar(100) | YES  |     | NULL    |                |
| Description         | text         | YES  |     | NULL    |                |
| ManagerId           | varchar(100) | YES  |     | NULL    |                |
| Phone               | varchar(50)  | YES  |     | NULL    |                |
| Fax                 | varchar(50)  | YES  |     | NULL    |                |
| Email               | varchar(100) | YES  |     | NULL    |                |
| Website             | varchar(200) | YES  |     | NULL    |                |
+---------------------+--------------+------+-----+---------+----------------+

Inner Join

Let’s start with a simple join and create a query to find out which users are in which department. We’ll create a query which starts with tbluser and joins on to tbldepartment based on the UserID column. The query will be as follows:

SELECT u.*, d.*
FROM tbluser AS u
INNER JOIN tbldepartment AS d ON (d.UserID = u.UserID)
ORDER BY d.Department

To construct this query, we’ll use the following code:

$select = new Select('tbluser');

$select->join(array("d" => "tbldepartment"), "d.UserID = tbluser.UserID")
        ->order("d.DepartmentID");

print $select->getSqlString();

Here, we’ve first created a new Select object whose base table is tbluser. Then, we’ve called the join function on it, passing in an array, because we want tbldepartment to be aliased to d. Then, we’ve specified the join condition as the second argument. Finally, we’ve used the order function to specify the sort order of the records.

Left Join

Now, what if not all users were yet assigned to a department? In that case, we could slightly change the query to perform a left join. If you’re not familiar, a left join returns all records from the table in the left side of the equation, even if they have no matching records in the one on the right.

Our query will change slightly, as follows:

SELECT u.*, d.*
FROM tbluser AS u
LEFT JOIN tbldepartment AS d ON (d.UserID = u.UserID)
ORDER BY d.Department

To construct this query, we’ll use the previous code, :

$select = new Select('tbluser');

$select->join(
    array("d" => "tbldepartment"), 
    "d.UserID = tbluser.UserID", 
    Select::SQL_STAR , 
    Select::JOIN_LEFT
)->order("d.DepartmentID");

print $select->getSqlString();

Here, we’ve passed in two extra arguments to the join function, columns and type. For columns, we’ve used the SQL_STAR constant, as we want all the columns. For type, we’ve used the JOIN_LEFT constant, which will result in a left join.

Right Join

Now, what if not all departments which were yet assigned to a user? In that case, we could slightly change the query, again, to perform a right join. Right joins are the inverse of left joins; returning all records in the right hand table of the equation, which may or may not have a matching record in the table on the left.

Our query will change slightly, as follows:

SELECT u.*, d.*
FROM tbluser AS u
LEFT JOIN tbldepartment AS d ON (d.UserID = u.UserID)
ORDER BY d.Department

To construct this query, we’ll use the previous code, :

$select = new Select('tbluser');

$select->join(
    array("d" => "tbldepartment"), 
    "d.UserID = tbluser.UserID", 
    Select::SQL_STAR , 
    Select::JOIN_RIGHT
)->order("d.DepartmentID");

print $select->getSqlString();

Here, we’ve only made one change, the type constant, to JOIN_RIGHT.

Outer Join

Now what if we want all records in both tables in the join, irrespective of whether they have a counterpart record? In that case, we’d construct an outer join. Yet again our query will change slightly, as follows:

SELECT u.*, d.*
FROM tbluser AS u
OUTER JOIN tbldepartment AS d ON (d.UserID = u.UserID)
ORDER BY d.Department

To construct this query, we’ll use the previous code, :

$select = new Select('tbluser');

$select->join(
    array("d" => "tbldepartment"), 
    "d.UserID = tbluser.UserID", 
    Select::SQL_STAR , 
    Select::JOIN_OUTER
)->order("d.DepartmentID");

print $select->getSqlString();

Here, we’ve again changed the type constant, to JOIN_OUTER.

Self Join

This is one that I find really interesting, joining a table on itself. In this case, the query doesn’t really get too complex, but just takes a bit more work – and possibly a bit more time to understand.

What we’re wanting to do is the original inner join, so we’ll have the users linked with departments. Then, we want to link the users on the departments again, but this time, we’ll be using the ManagerID in tbldepartment as the user identifying.

SELECT u.*, d.*
FROM tbluser AS u
INNER JOIN tbldepartment AS d ON (d.UserID = u.UserID)
INNER JOIN tbluser AS m ON (d.ManagerID = u.UserID)
ORDER BY d.Department

To construct this query, we’ll use the following code:

$select = new Select('tbluser');

$select->join(
    array("d" => "tbldepartment"), "d.UserID = tbluser.UserID"
)
->join(
    array("m" => "tbluser"), "m.ManagerID = tbluser.UserID"
)
->order("d.DepartmentID");

print $select->getSqlString();

To perform the self join, we’ve added another call to join, specifying the join table as tbluser, but this time with a different alias and joining on ManagerID as we’ve said previously.

Limiting Columns in the Dataset

In all of these examples, we could have limited the columns in the dataset returned. And, if this were a live application, I’d expect that this would happen. Returning .* is ok if the table has only a few columns.

But how often do tables remain small? How regularly are these types of queries forgotten, resulting in an application whose performance seems to mysteriously degrade over time.

So, to limit the columns returned in the joined table, replace Select::SQL_STAR in the examples above with an array. This can be a combination of either just the names of the columns to be returned, or an associative array where the key is the alias and the value is the column name.

Unions, Intersects and Excepts

In the comment’s from the previous tutorial on Select, X asked about Union queries and how they are made. If you’re not familiar with a Union query, the Wikipedia entry describes one as:

In SQL the UNION clause combines the results of two SQL queries into a single table of all matching rows. The two queries must result in the same number of columns and compatible data types in order to unite. Any duplicate records are automatically removed unless UNION ALL is used.

To the best of my knowledge, there is no specific UNION function in the Select classes. Nor does there need to be. When we consider that a Union is only two or more queries combined with the keyword UNION, it would be pretty pointless to write functions, when we can use string concatenation or sprintf already. Right?

NB: Though, it might be good to have a simple utility function to speed things along.

So let’s look at a union example where we want to find all users with the last names of “Baker” and “Flynn”. Below is what the query will be:

SELECT * FROM tblusers WHERE LastName = 'Baker'
UNION
SELECT * FROM tblusers WHERE LastName = 'Flynn  '

Now all we need to do is build two queries, filtering on Baker and Flynn respectively, then combine them together with UNION, execute them, and we’re done. So let’s do that.

// NB: assumes that we're in a controller action.
$serviceLocator = $sm = $this->getServiceLocator();
$dbAdapter      = $serviceLocator->get('Zend\Db\Adapter\Adapter');

$sql = new Sql($dbAdapter);

$selectLastNameIsAndrews = new Select('tbluser');
$selectLastNameIsAndrews->where(
    array("LastName" => "Andrews")
);

$selectLastNameIsCalvert = new Select('tbluser');
$selectLastNameIsCalvert->where(
    array("LastName" => "Calvert")
);

print $unionQuery = sprintf(
    '%s UNION %s',
    $selectLastNameIsAndrews->getSqlString(),
    $selectLastNameIsCalvert->getSqlString()
); exit;

$results = $dbAdapter->query(
    $unionQuery, $dbAdapter::QUERY_MODE_EXECUTE
);

We can take the same approach with INTERSECT and EXCEPT queries. We’d replace UNION with either INTERSECT or EXCEPT, giving us the query that we desired. If you’re not familiar with these, here’s two quotes from the Wikipedia link above:

The SQL INTERSECT operator takes the results of two queries and returns only rows that appear in both result sets

The SQL EXCEPT operator takes the distinct rows of one query and returns the rows that do not appear in a second result set

Well, we could, if MySQL supported them that is. It doesn’t. But I’m including this section because they’re supported by other RDBMSs, including Oracle, MSSQLServer and PostgreSQL.

NB: Most database examples I write are based around MySQL, because it’s rather synonymous with PHP. However, it’s not the only game in town and I don’t assume that it’s all you’re using.

Conclusion

By part two of this series, you have enough to build a robust range of SQL queries in your applications. In the third part, we’ll drop right back to the basics of using Select, including having, order and limit clauses. Don’t forget, check out part one on building Where clauses, if you’ve come straight here.

So tell me your opinion. Is Zend\Db\Sql\Select a really simple library to use to construct queries? Are they as hard as you thought they would be? Add a comment with your opinion.

See you next week!

Intermediate Tutorial
  • jchlu

    This UNION example is bonkers, you’d use UNION to create an amalgamation of data from different tables, never the same one.
    The result you’re after here would simply be written as:
    SELECT * FROM tblusers WHERE lastname IN (‘Baker’, ‘Flynn ‘);

    • http://www.matthewsetter.com/ maltblue

      @jchlu very very right you are. It’d be completely nuts to actually build this query this way. The simple where clause that you’ve shown here, is far more efficient. However, for the purposes of a simple example of building a UNION query, it doesn’t really hurt and gets the job done. But thanks for pointing it out though.

  • nomaanp

    Thanks for this tutorial !!!!

    • http://www.matthewsetter.com/ maltblue

      @nomaanp Anytime (always happy to be of help). Anything specifically that helped you out? Part three’s out too – http://www.maltblue.com/tutorial/zend-db-sql-the-basics

  • nomaanp

    Hi,
    I tried the above examples but having problems to get tbluser’s id on view script. It is giving the department’s id instead..

    • http://www.matthewsetter.com/ maltblue

      @nomaanp thanks for letting me know you’re having issues. Can you send me  a Gist link with the code? Keen to help you get this resolved.

      • nomaanp

        @maltblue here is the link for gist
        https://gist.github.com/nomaanp/750f328669f1b8f91957

        • http://www.matthewsetter.com/ maltblue

          @nomaanp Thanks for the Gist link. I’ll get feedback to you tomorrow.

        • http://www.matthewsetter.com/ maltblue

          @nomaanp How did the feedback on the Gist link go?

        • nomaanp

          added the defintion for addr_category table

        • nomaanp

          @maltblue added the defintion for addr_category in gist!!!