Complex ActiveRecord Queries Pt. II - Greatest N Per Group
comments

Last updated 12 February 2015

It is time again to get ridiculous with ActiveRecord. Last time I talked about dealing with includes and sorting. Today I'll talk about an issue I had with sorting by a one-to-many relationship.

The situation

  • Users have many Subscriptions
  • A User's end date (when they are no longer active) is based on the expiration_date of the last Subscription that user has
  • You want to order the Users by end date

Okay! Great! Should be easy. You want to write this:

User.joins(:subscriptions).order('subscriptions.expiration_date ASC')

The problem

It doesn't work. Of course.

Why? Because if a User has more than one Subscription which one should be used for ordering? That code doesn't really make sense.

After some googling around I discovered that there's a name for this problem - greatest n per group. (By the way, isn't it annoying when it would be so easy to find the answer to something if you only knew the name? It's like the old joke about not being able to look up something in the dictionary to see how it's spelled unless you know how it's spelled.) In this case we want to find the latest (instead of greatest) Subscription in each group of Subscriptions owned by each User. I found this excellent blog post showing how to solve that problem in ActiveRecord.

After spending time trying to grok the post (it is confusing!) I produced this:

User.joins(:subscriptions).joins('LEFT OUTER JOIN subscriptions sp ON (subscriptions.expiration_date < sp.expiration_date and subscriptions.user_id = sp.user_id)').where('sp.id IS NULL').order('subscriptions.expiration_date ASC')

So simple! sp is an arbitrary name representing the Subscription that is later than the one represented by subscriptions. You keep searching until sp.id IS NULL, meaning that subscriptions represents the latest Subscription for that User.

As I'm sure you predicted, this doesn't work either. Postgres immediately complains that you can't order by something you aren't selecting (why didn't it complain in the first case? I don't know, ActiveRecord).

Okay! We'll just select that too:

User.joins(:subscriptions).select('users.*, subscriptions.expiration_date').joins('LEFT OUTER JOIN subscriptions sp ON (subscriptions.expiration_date < sp.expiration_date and subscriptions.user_id = sp.user_id)').where('sp.id IS NULL').order('subscriptions.expiration_date ASC')

Another beautiful ActiveRecord query. Just as a note, I really like ActiveRecord because it makes 99% of all the queries you do beautiful and semantic. Unfortunately, that other 1% is out there to get you.

I hope this helps someone trying to fight with ActiveRecord to sort by a single instance in a has_many relationship!

comments powered by Disqus