Home arrow SQL arrow Page 5 - Lies, Damn Lies, Statistics, and SQL
SQL

Lies, Damn Lies, Statistics, and SQL


sqlHeard about the Developer Shed December Giveaway?  Peter has a method useful for all things statistical.  In this article, Peter shows us how you can make use of some quick methods to generate statistics for your sites or projects in as little time as possible.  Using MS Access and a little SQL code, Peter is able to devise an algorithm useful for determining this month's winner.

Author Info:
By: Peter Lavin
Rating: 3 stars3 stars3 stars3 stars3 stars / 23
December 23, 2003
TABLE OF CONTENTS:
  1. · Lies, Damn Lies, Statistics, and SQL
  2. · Finding the Data
  3. · Capturing the Data
  4. · Into the Database
  5. · The Contest SQL
  6. · Conclusion

print this article
SEARCH DEVARTICLES

Lies, Damn Lies, Statistics, and SQL - The Contest SQL
(Page 5 of 6 )

 The algorithm that will unlock the puzzle of December’s winner is shown below:

SELECT TOP 1 [Author], SUM([Hits])
AS SumOfHits, Count([author])
AS numarticles, Round((Sum([hits])/Count([author])))
AS avghits
FROM hits
GROUP BY [Author]
HAVING Count([author]) > 2
ORDER BY Round(Sum([hits])/Count([author])) DESC

Cut and paste this code into an SQL query window, but don’t run it just yet. Let’s first say a few brief words about it.

Most of you are probably familiar with the common SQL keyword “SELECT”, so let’s not dwell on it. The predicate “TOP 1” will ensure that only one record is returned – after all we are really only interested in the winner. Next comes the list of the fields to be retrieved. Obviously we want to include the Author field – we are going to need the winner’s name. “Sum([hits])” is an aggregate function that does exactly what you might think – namely it totals the “hits” field. Following this is the keyword “AS” which will allow us to give a meaningful alias to this aggregate function. We’ve chosen the name “SumofHits”; not too original perhaps, but descriptive of the value returned.

After this, the “Round” function will ensure that no fractional values are returned. The formula inside the “Round” function will give us the average number of hits by author. This will be one of our unprejudiced criteria for determining the winner.

Where you have a select statement there is always a “FROM” clause lurking nearby. Since we only have one table in our database the table we’re selecting from is “hits”.

Whenever there is an aggregate function in a select statement we must group by any simple fields that appear in that Select statement. Hence, we have “GROUP BY [Author]”. “HAVING” clauses are simply the replacement for “WHERE” clauses when a “GROUP BY” has been used. The expression “Count([author]) > 2” will ensure that our results are not skewed by statistical anomalies, further enhancing the objectivity of our results

Finally the “ORDER BY” clause will arrange the records in descending order starting with the author with the best average number of hits.

Query Results

I don’t know if you can imagine my surprise when I saw the results returned by our query. If I was simply to report them you might not believe me so I’m going to show you them using a screen capture.

In cases such as this, a more quarrelsome audience might question the credibility of our methods. Fortunately, this description does not apply to the readers of the Developer Shed Network especially at this time of year when they want to stay off the naughty list, and, frankly, given the thorough efforts at objectivity, I don’t think that there can be any reasonable grounds for dispute.

I don’t know if you can imagine my surprise when I saw the results returned by our query. If I was simply to report them you might not believe me so I’m going to show you them using a screen capture.

In cases such as this, a more quarrelsome audience might question the credibility of our methods. Fortunately, this description does not apply to the readers of the Developer Shed Network especially at this time of year when they want to stay off the naughty list, and, frankly, given the thorough efforts at objectivity, I don’t think that there can be any reasonable grounds for dispute.


blog comments powered by Disqus
SQL ARTICLES

- Focusing SQL Queries
- Complex SQL Queries
- A Close Look at the SQL Query
- Generating Reports with SQL Date Handling
- Creating SQL Reports Based on Date Criteria
- SQL Date Handling and Data Trends
- Date Handling
- Introduction to SQL
- Lies, Damn Lies, Statistics, and SQL

Watch our Tech Videos 
Dev Articles Forums 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Contact Us 
Site Map 
Privacy Policy 
Support 

Developer Shed Affiliates

 




© 2003-2017 by Developer Shed. All rights reserved. DS Cluster - Follow our Sitemap
Popular Web Development Topics
All Web Development Tutorials