SQL
  Home arrow SQL arrow Page 4 - Complex SQL Queries
Dev Articles Forums 
ADO.NET  
Apache  
ASP  
ASP.NET  
C#  
C++  
ColdFusion  
COM/COM+  
Delphi-Kylix  
Design Usability  
Development Cycles  
DHTML  
Embedded Tools  
Flash  
Graphic Design  
HTML  
IIS  
Interviews  
Java  
JavaScript  
MySQL  
Oracle  
Photoshop  
PHP  
Reviews  
Ruby-on-Rails  
SQL  
SQL Server  
Style Sheets  
VB.Net  
Visual Basic  
Web Authoring  
Web Services  
Web Standards  
XML  
Mobile Linux 
App Generation ROI 
IBM® developerWorks 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid 
Request Media Kit
Contact Us 
Site Map 
Privacy Policy 
Support 
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
SQL

Complex SQL Queries
By: O'Reilly Media
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: 3 stars3 stars3 stars3 stars3 stars / 29
    2008-01-10

    Table of Contents:
  • Complex SQL Queries
  • Number of Other Users
  • Evaluation of Filtering Conditions
  • Buyers of Batmobiles

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      Del.ici.ous Digg
      Blink Simpy
      Google Spurl
      Y! MyWeb Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article
     
     
    ADVERTISEMENT


    Complex SQL Queries - Buyers of Batmobiles


    (Page 4 of 4 )

    Assume that we have four tables, namely customers,orders,orderdetail, and a table ofarticles, as shown in Figure 4-5. Please note that in the figure the sizes of the boxes representing each table are more or less proportional to the volume of data in each table, not simply to the number of columns. Primary key columns are underlined.


    Figure 4-5.  A classical order schema

    Let’s now suppose that our SQL problem is to find the names of all the customers living in the city named “Gotham” who have ordered the article called “Batmobile” during the last six months. We have, of course, several ways to formulate this query; the following is probably what an ANSI SQL fan would write:

      select distinct c.custname
      from customers c
           join orders o
             on o.custid = c.custid
           join orderdetail od
             on od.ordid = o.ordid
           join articles a
             on a.artid = od.artid
     
    where c.city = 'GOTHAM'
       
    and a.artname = 'BATMOBILE'
       
    and o.ordered >= somefunc

    somefunc  is supposed to be a function that returns the date six months prior to the current date. Notice too, the presence ofdistinct, which may be required if one of our customers is an especially heavy consumer of Batmobiles and has recently ordered several of them.

    Let’s forget for a while that the optimizer may rewrite the query, and look at the execution plan such a statement suggests. First, we walk thecustomerstable, keeping only rows for which the city happens to be Gotham. Then we search theorderstable, which means that thecustidcolumn there had better be indexed, because otherwise the only hope the SQL engine has of executing the query reasonably fast is to perform some sorting and merging or to scan theorderstable to build a hash table and then operate against that. We are going to apply another filter at this level, against the order date. A clever optimizer will not mind finding the filtering condition in thewhereclause and will understand that in order to minimize the amount of data to join it must filter on the date before performing the join. A not so clever optimizer might be tempted to join first, and then filter, and may therefore be grateful to you for specifying the filtering condition with the join condition, as follows:

      join orders o
        on o.custid = c.custid
        and a.ordered >= somefunc

    Even if the filtering condition really has nothing to do with the join, it is sometimes difficult for the optimizer to understand when that is the case. If the primary key oforderdetail is defined as(ordid, artid)then, becauseordid is the first attribute of the index, we can make use of that index to find the rows associated with an order as in Chapter 3. But if the primary key happens to be(artid, ordid)(and note, either version is exactly the same as far as relational theory is concerned), then tough luck. Some products may be able to make some use of the index* in that case, but it will not provide the efficient access that(ordid, artid)would have allowed. Other products will be totally unable to use the index. The only circumstance that may save us is the existence of a separate index onordid.

    Once we have linkedorderdetails toorders, we can proceed toarticles--without any problem this time since we foundartid, the primary key, inorderdetail. Finally, we can check whether the value inarticles is or is not a Batmobile. Is this the end of the story? Not quite. As instructed bydistinct, we must now sort the resulting set of customer names that have passed across all the filtering layers so as to eliminate duplicates.

    It turns out that there are several alternative ways of expressing the query that I’ve just described. One example is to use the older join syntax, as follows:

      select distinct c.custname
      from customers c,
          
    orders o,
          
    orderdetail od,
          
    articles a
     
    where c.city = 'GOTHAM'
       
    and c.custid = o.custid
       
    and o.ordid = od.ordid
       
    and od.artid = a.artid
       
    and a.artname = 'BATMOBILE'
       
    and o.ordered >= somefunc

    It may just be old habits dying hard, but I prefer this older way, if only for one simple reason: it makes it slightly more obvious that from a logical point of view the order in which we process data is arbitrary, because the same data will be returned irrespective of the order in which we inspect tables. Certainly thecustomers tableis particularly important, since that is the source from which we obtain the data that is ultimately required, while in this very specific context, all the other tables are used purely to support the remaining selection processes. One really has to understand that there is no one recipe that works for all cases. The pattern of table joins will vary for each situation you encounter. The deciding factor is the nature of the data you are dealing with.

    A given approach in SQL may solve one problem, but make another situation worse. The way queries are written is a bit like a drug that may heal one patient but kill another.

    Please check back next week for the conclusion to this article.


    DISCLAIMER: The content provided in this article is not warranted or guaranteed by Developer Shed, Inc. The content provided is intended for entertainment and/or educational purposes in order to introduce to the reader key ideas, concepts, and/or product reviews. As such it is incumbent upon the reader to employ real-world tactics for security and implementation of best practices. We are not liable for any negative consequences that may result from implementing any information covered in our articles or tutorials. If this is a hardware review, it is not recommended to open and/or modify your hardware.

       · This article is an excerpt from "The Art of SQL," published by O'Reilly. We hope you...
     

    Buy this book now. This article is excerpted from chapter four of The Art of SQL, written by Stephane Faroult and Peter Robson (O'Reilly; ISBN: 0596008945). Check it out today at your favorite bookstore. Buy this book now.

    SQL ARTICLES

    - Focusing SQL Queries
    - Complex SQL Queries
    - A Close Look at the SQL Query
    - Generating Reports with SQL Date Handling
    - Creating SQL Reports Based on Date Criteria
    - SQL Date Handling and Data Trends
    - Date Handling
    - Introduction to SQL
    - Lies, Damn Lies, Statistics, and SQL







    © 2003-2009 by Developer Shed. All rights reserved. DS Cluster 2 Hosted by Hostway
    For more Enterprise Application Development news, visit eWeek