Creating histogram data using SQL

Histograms (and bar graphs of discrete data) are really really handy when you’re trying to work out how people are using your system. Generating the data for them using SQL is a touch fiddly though. Here’sa simple query to generate some data I’ve just written. All the problems stem from the niggle that what you [...]

By oli

Oli Wood is the former CTO at Wishlist

Histograms (and bar graphs of discrete data) are really really handy when you’re trying to work out how people are using your system. Generating the data for them using SQL is a touch fiddly though. Here’sa simple query to generate some data I’ve just written.

All the problems stem from the niggle that what you are generally interested in is a count of counts eg. how many people have how many rows associated with them.

We have a simple table where a row is added everything somebody “makes friends”, so my graph wants to be a visulisation of the question “How many friends have people added”. We do it in two steps…

First – a simple query to get the raw data. This returns a big list of user id’s and the number of friends they’re added.

select id, count(id) as myCount from relationships group by id

Which runs and returns a row per user, and the number of friends that each user has added. Now to take counts of these counts, which means adding a table identifier to each table and a subquery. The table identifiers allow us to avoid the “Every derived table must have its own alias” error

select one.mycount, count(one.myCount) from (select two.id, count(two.id) as myCount from relationships two group by two.id) as one group by myCount

…and there we go – data you can drop into Google Docs and graph. The one data entry point that is missing is the Zero point eg. how many users don’t have any friends, but we get that data from another table (relationships only contains those who have added relationships).


Tags: , , , , ,
  • http://topsy.com/trackback?utm_source=pingback&utm_campaign=L2&url=http://wishli.st/blog/creating-histogram-data-using-sql/ Tweets that mention Wishlist » Creating Histogram data using sql — Topsy.com

    [...] This post was mentioned on Twitter by hkrnws, Hacker News. Hacker News said: SQL to create histograms: http://bit.ly/cGK2ga Comments: http://bit.ly/9KG8F0 [...]

  • dferrero

    Oli, you are a SQL rock star! I’ve spent the better part of the day looking for the right sql to produce this. Most examples assume one of the tables has a column with a number value that will be used. My count was derived as was yours. You are a genius!

blog comments powered by Disqus

About

Wishli.st is David Haywood Smith . I want to help you give great gifts. You can find out more about us and our products on wishli.st"

Here on our blog we try to write about the things that are important to us, giving great presents, and producing great code. If you think we can help you out in any way, get in touch.

Wishlist on Facebook

We build a Facebook application to help you give great gifts. You can find out more about it on out fan page

Subscribe

wishli.st Posts RSS feed