Technology: September 2006 Archives

Serving With Code

As I thought about practicing my faith I wondered what my place was. We make good money and give generously. But is that all I'm supposed to be doing, or should I quit it all, cash out the equity in our house, move our family to a "poor neighborhood" and serve like Shane and The Simple Way. Sometimes I would lean towards the later, but then that doesn't seem the least bit feasible and would I really be using my gifts and abilities?

I have a friend who is the pastor of Redeemer Presbyterian in Indianapolis. Our family has been helping financially with a small and wonderfully unique congregation that he and Redeemer have helped to get off the ground called The New Deal. About once a year we get to see Jason and his family in Indy. We did again in August. While visiting, he introduced us to Tim Streett who founded Jireh Sports. Jireh is an inspiring organization dedicated to "meet the spiritual, physical, mental and emotional needs of urban youth...through significant relationships with mentoring adults developed around unique sports, recreation, and educational opportunities."

The reason I mention Jireh is because it got me thinking. I'm not a guy that could start a gym, but could I start a program to mentor underprivileged children about computers. Is it possible to have an aftershool program that teaches kids how to use computers, how to take them apart and put them together again, to learn about logic and math, to learn how to program them? I suppose the curriculum would have to have healthy doses of video games also ;) Perhaps more importantly would kids be interested in that, and would it make a difference in their lives? Perhaps this will be another idea in the list of many unimplemented ideas? But let's try anyhow, or at least act like we're trying....

I always start with naming a project. In this case, because I have another idea for how to start developing a curriculum that's gonna need a website. So audience; all five of you (thanks Mom)... What should I name such an endeavor? Here are a few ideas that I'm considering:

  • codeducate (or codeducator)
  • bit professor
  • code kids (I couldn't bring myself to use a 'k' for kode ;)
  • bits for kids (as in "silly rabbit bits are for kids")

What do you think?

MySQL "group-wise limiting"

Time to really geek out. I've run into this class of problem a bunch of times over the years and I've never seen an answer that I like.

I've got a query that I'm using GROUP BY for. Is is possible to limit the rows per group.

Every SQL dialect that I've seen makes it possible to limit the rows in a results set. Usually something like:

mysql> select * from sometable limit 10;

and you'll get only the first ten rows. However, when you introduce grouping it all falls apart. Consider this schema representing (denormalized) grades for different students from two classes:

create temporary table groupwise_select(
    id int not null primary key auto_increment,
    class_id int not null,
    name varchar(255) not null,
    grade float not null);                                                                                
insert into groupwise_select(class_id, name, grade) values(1, 'Andy', 100.0);                             
insert into groupwise_select(class_id, name, grade) values(1, 'Andy', 98.0);                              
insert into groupwise_select(class_id, name, grade) values(1, 'Andy', 99.0);                              
insert into groupwise_select(class_id, name, grade) values(1, 'Andy', 97.0);                              
insert into groupwise_select(class_id, name, grade) values(1, 'Bob', 97.0);                               
insert into groupwise_select(class_id, name, grade) values(1, 'Bob', 98.0);                               
insert into groupwise_select(class_id, name, grade) values(1, 'Bob', 65.0);                               
insert into groupwise_select(class_id, name, grade) values(1, 'Henry', 32.0);                             
insert into groupwise_select(class_id, name, grade) values(1, 'Henry', 33.0);                             
insert into groupwise_select(class_id, name, grade) values(1, 'Henry', 34.0);                             
insert into groupwise_select(class_id, name, grade) values(1, 'Henry', 35.0);                             
insert into groupwise_select(class_id, name, grade) values(1, 'Henry', 100.0);                            
insert into groupwise_select(class_id, name, grade) values(1, 'Amelia', 73.0);                            
insert into groupwise_select(class_id, name, grade) values(2, 'Valerie', 100.0);                          
insert into groupwise_select(class_id, name, grade) values(2, 'Valerie', 100.0);                          
insert into groupwise_select(class_id, name, grade) values(2, 'Valerie', 100.0);                          
insert into groupwise_select(class_id, name, grade) values(2, 'Valerie', 100.0);                          
insert into groupwise_select(class_id, name, grade) values(2, 'Valerie', 100.0);                          
insert into groupwise_select(class_id, name, grade) values(2, 'Hans', 13.0);                              
insert into groupwise_select(class_id, name, grade) values(2, 'Hans', 14.0);                              
insert into groupwise_select(class_id, name, grade) values(2, 'Hans', 15.0);                              
insert into groupwise_select(class_id, name, grade) values(2, 'Hans', 16.0);                              
insert into groupwise_select(class_id, name, grade) values(2, 'Oscar', 99.0);                             
insert into groupwise_select(class_id, name, grade) values(2, 'Oscar', 99.0);                             
insert into groupwise_select(class_id, name, grade) values(2, 'Oscar', 99.0);                             
insert into groupwise_select(class_id, name, grade) values(2, 'Oscar', 104.0);                            

Your assignment is to return the two best students (highest average) from each class. OK, great. Lets start with a simple grouping query to return the average for each student



mysql> select class_id, name, avg(grade) as average from groupwise_select group by name, class_id order by class_id, average desc;
+----------+---------+-----------------+
| class_id | name | average |
+----------+---------+-----------------+
| 1 | Andy | 98.5 |
| 1 | Bob | 86.666666666667 |
| 1 | Amelia | 73 |
| 1 | Henry | 46.8 |
| 2 | Oscar | 100.25 |
| 2 | Valerie | 100 |
| 2 | Hans | 14.5 |
+----------+---------+-----------------+
7 rows in set (0.01 sec)

Good we now know that the four students we need to return as the answer are Andy (98.5), Bob(86.7), Oscar(100.25), and Valerie(100). This should be easy let's just use a LIMIT

mysql> select class_id, name, avg(grade) from groupwise_select group by name, class_id LIMIT 2;           
+----------+------+-----------------+
| class_id | name | average         |
+----------+------+-----------------+
|        1 | Andy |            98.5 |
|        1 | Bob  | 86.666666666667 |
+----------+------+-----------------+
2 rows in set (0.00 sec)

Hmmm... That only returned the first two rows from our previous query. Yep, that's because the LIMIT is applied as the last thing the query does. Now you'll probably bang your head against the wall trying all different variations of GROUP BY, ORDER BY, and LIMIT at this point and ultimately you'll end up doing the LIMITing in application code instead of SQL concluding that it's just not possible.

However, there is a solution! It uses MySQL's user defined variables and sub-selects. Let's jump to the answer:

mysql> set @idx=0; set @cur_class=0;                                                                      
mysql> SELECT class_id, name, average FROM (
    SELECT r.class_id, r.name, r.grade,
    IF(@cur_class != r.class_id, @idx:=1, @idx:=@idx+1) AS row_index,
    IF(@cur_class != r.class_id, @cur_class:=r.class_id, 0) AS discard
    FROM (SELECT class_id, name, avg(grade) AS average FROM
      groupwise_select GROUP BY name, class_id
      ORDER BY class_id, average DESC) AS r
    HAVING row_index <= 2) AS r2;                                            

+----------+---------+------------------+
| class_id | name | average |
+----------+---------+------------------+
| 1 | Andy | 98.5 |
| 1 | Bob | 86.6666666666667 |
| 2 | Oscar | 100.25 |
| 2 | Valerie | 100 |
+----------+---------+------------------+
4 rows in set (0.01 sec)


Awesome! So how does that work. There are probably a few key insights here:


  1. sub-selects must evaluate in their entirety before the outer-select (in this case our query returns the averages for ALL of the students sorted first by class_id and then by average)

  2. The true and false expression parts of the IF() function can be used to set user-defined variables

  3. IF() function expressions included in a SELECT statement are evaluated left to right

  4. You can't use a WHERE clause on a computed column (like row_index) but you can include it in the HAVING clause (I don't presume to really understand that)

So given those insights we first retrieve ALL of the GROUPed averages for the students order by class_id and then average, second we number the results for each class, and last we trim out the rows that have a row number less than or equal to our target (2 students per class). Voilla! Group-wise limiting.

Cringley Article

Robert X. Cringley wrote an article about Apple's pending announcement In which he makes some points similar to mine

"This is key, because what's been missing throughout this conversion to Internet television has been a way to incorporate our user device of choice -- the TV. People don't really want to watch movies on their computer screens. They'll do it, some of them, but most people won't, so for the Internet and downloadable video market to explode the way it is supposed to do, we need an easy way to get the movies out of our computers and onto our TV screens."

The question remains about Amazon UnBox... How will they get the content to the TV. If it's through a player then they chose the wrong set of devices.

Amazon Unbox

Over the last two or three years I've basically stopped buying CDs opting instead to buy music digitally (except for artists on really small labels). And within the last few months, since I got my video iPod, I've started to buy video content digitally too (Shiree and I are HOOKED on Prison Break).

Because I have a Mac and iPod I've been doing all of my purchasing through the iTunes Music Store, and while I have my beefs with them (DRM prevents playing on non Apple software, there's no "wishlist", navigation is poor, etc.) it's been a great experience and I'm hooked. That said, when Amazon announced Amazon Unbox I was intrigued and wanted to try it. I read a little bit and noted that buying a video gets you "DVD-quality video to watch on your PC or TV, and a video file optimized for compatible portable video players". I just assumed that would work on my Mac/iPod setup. Nope! Not only does Amazon Unbox not work for the iPod, it doesn't work on a Mac either. The iPod bit is understandable. I'm sure the media companies aren't licensing the content without DRM and Apple's DRM for the iPod is it's special sauce that it won't open up to competitors. I suppose it's even reasonable to not support the Mac since it's only 10% marketshare or somesuch.

However, I don't see a non-iPod strategy working out. As of now, people are not going to watch video content on their PC and you're right to point out that people are not watching video content on their iPod either. BUT, they will dump the video content onto a player and then plug it into their home entertainment system (for example I use a n RCA/video cable). So, as of now, Amazon is trying to compete in 13% of the player market and maybe not even that much because they support a pretty limited set of players.

Joel on Software

It's been a while since I've written anything and so it's strange that I would dip back into the water on a topic like this but within the span of a week I've decided that I'm over Joel Spolsky.

First I think he's starting to recycle content (like his latest Finding Great Developers). If it's NOT recycled it feels like it. So at a minimum he's no longer saying things that I don't already know. Then last week there was the whole Language Wars rant (and the answer from DHH). I don't really even know what to say about that except "did someone pay him to say that stuff?"

So thanks for the insights over the past 6 years, but Joel, I'm afraid that you've Jumped the Shark.

What I'm Consuming

About This Archive

This page is a archive of entries in the Technology category from September 2006.

Technology: January 2006 is the previous archive.

Technology: May 2007 is the next archive.

Find recent content on the main index or look in the archives to find all content.

Technology: September 2006: Monthly Archives

Powered by Movable Type 4.0-beta7-20070717