Arel and tags
I love working out ActiveRecord and Arel queries in Rails apps. I thought I’d share some of the more fun queries I’ve designed in Arel, spread out over a few blog posts. We’ll start with one for fetching tags separated by account.
Say you have a Rails app with accounts. Different accounts have different data, and you want to keep that data separate, including the tags on that data. Say our data is stored in models Song, Book, and Video. Let’s say account #1 is Super Religious Church Group and account #2 is Sexxxy Porn Site. You can probably anticipate why users in account #1 never want to mistakenly see content or tags from account #2. It’s easy enough to keep the songs, books, and videos in your site separated by account because each song, book, and video record stores its account_id
right on it. A simple Song.where(account_id: current_account)
is sufficient, for example.
If you’re using ActsAsTaggableOn for tagging, however, you have a tags table with tag names and a taggings table that associates tags with your records. Unless you do something custom, there’s no account_id per tag or tagging, there’s just your record itself, stored as a polymorphic association called taggable
on the ActsAsTaggableOn::Tagging
model. So we need to somehow say “give me all the tags on all the kinds of content associated with the given account.”
My usual technique when I start any Arel or ActiveRecord query is to first map out exactly what I want to get back, then start at that end result and trace backward to the parameters I have available, making a path from end goal back to known information. In this case, I want to get a list of ActsAsTaggableOn::Tag
records, and what I know is an account ID. Since I want the list of tags as my result, that’s the model that I need to start my query with:
1 |
ActsAsTaggableOn::Tag.where(something) |
In an easy scenario, I could just do ActsAsTaggableOn::Tag.where(account_id: account_id)
and be done. But instead, since I know account_id
isn’t a field in the tags
table, I’m going to have to jump from tags
to another table. The taggings
table in my ActsAsTaggableOn setup is what actually ties a given tag to a piece of content, so I should join to taggings
:
1 |
ActsAsTaggableOn::Tag.joins(:taggings).where(something) |
We’re one step closer to being able to filter by account: tags -> taggings -> content -> account. We need to flesh out this amorphous something
list of conditions. Since we’re calling where
on the tag model, these are conditions from the perspective of a tag. So we could describe them as “where the tag is applied to a piece of content in the given account.” How do we know if a tag is applied to a piece of content?
Just by doing joins(:taggings)
we are filtering out tags that aren’t in use, because joins
results in an INNER JOIN
. One of the best tools for designing a complicated query is the Rails console. I specifically like calling to_sql
on queries that I’m working on to check that my SQL looks sane:
> bin/rails c
Loading development environment (Rails 4.2.2)
irb(main):001:0> puts ActsAsTaggableOn::Tag.joins(:taggings).to_sql
SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "taggings"."tag_id" = "tags"."id"
=> nil
So we see that an INNER JOIN
is produced, which means we’re making a one-to-one mapping from tags over to taggings, and this will exclude any tags that aren’t present in taggings.
To check if a tag is applied to a piece of content, we need to consult the taggings table. We won’t be able to get all the data from taggings
alone, however, since account_id
lives in songs
, books
, and videos
. So we need to hop from taggings
over to each of those tables. Now comes the time to start setting some variables so things don’t get unwieldy:
1 2 3 4 |
taggings = ActsAsTaggableOn::Tagging.arel_table songs = Song.arel_table books = Book.arel_table videos = Video.arel_table |
The arel_table
method is a method Rails adds on to your models that returns an Arel::Table
. This is how you talk about particular fields in your tables when dipping from ActiveRecord down into Arel. For example, I can say songs[:id]
to get a reference to the id
field on my songs
table. I can then use a method like eq
to make comparisons between that field to other fields or values. songs[:id].eq(1)
will produce the SQL "songs"."id" = 1
.
The ActsAsTaggableOn::Tagging
model has a polymorphic relationship called tagging
, and it uses fields taggable_id
and taggable_type
to know whether a tag is applied to a Song
, Book
, or Video
record. We need to make use of both of these fields to know which table to join:
1 2 3 4 5 6 7 8 9 10 11 |
# "songs"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Song' taggings_to_songs = songs[:id].eq(taggings[:taggable_id]). and(taggings[:taggable_type].eq(Song.name)) # "books"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Book' taggings_to_books = books[:id].eq(taggings[:taggable_id]). and(taggings[:taggable_type].eq(Book.name)) # "videos"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Video' taggings_to_videos = videos[:id].eq(taggings[:taggable_id]). and(taggings[:taggable_type].eq(Video.name)) |
Arel is nice because I can save bits of my query in variables like those above, and assemble my query piecemeal. Let’s make use of the join conditions defined above to connect taggings
to our content tables:
1 2 3 |
taggings.join(songs).on(taggings_to_songs) taggings.join(books).on(taggings_to_books) taggings.join(videos).on(taggings_to_videos) |
Note we’re using join
here instead of joins
. join
is used with Arel while joins
is for joining an ActiveRecord model to another table by giving the relationship name. For example, in our Song
model we have belongs_to :account
, so we can say Song.joins(:account)
and ActiveRecord generates SQL to do an INNER JOIN
to the accounts
table.
Why were we joining to the individual content tables again? Because they’re where the account_id
field is stored, and that’s the field by which we want to filter tags. Having joined to the content tables, we can now write conditions using their fields:
1 2 3 4 5 6 |
taggings.join(songs).on(taggings_to_songs). where(songs[:account_id].eq(account_id)) taggings.join(books).on(taggings_to_books). where(books[:account_id].eq(account_id)) taggings.join(videos).on(taggings_to_videos). where(videos[:account_id].eq(account_id)) |
This is assuming we have an account_id
variable, maybe representing the account of the user who is currently logged into our app. Note that the where
here is not the same as the where
on an ActiveRecord model. Here, we pass in Arel conditions like songs[:account_id].eq(account_id)
instead of the usual hash or SQL string you’d pass to an ActiveRecord where
call, like where(account_id: 1)
.
So we’re querying taggings
where the associated content is in the desired account, but we haven’t connected those taggings back to their tags yet. Let’s get the tag_id
off of the tagging record and stick that bit of SQL-to-be in a variable:
1 2 3 4 5 6 7 8 9 |
song_tags = taggings.join(songs).on(taggings_to_songs). where(songs[:account_id].eq(account_id)). project(taggings[:tag_id]) book_tags = taggings.join(books).on(taggings_to_books). where(books[:account_id].eq(account_id)). project(taggings[:tag_id]) video_tags = taggings.join(videos).on(taggings_to_videos). where(videos[:account_id].eq(account_id)). project(taggings[:tag_id]) |
The project
method lets you list which fields should be in the SELECT
part of the query. It’s like a projection in relational algebra.
If you call to_sql
on the song_tags
variable, assuming our account_id
is 1, you get the following:
1 2 3 4 5 |
SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "songs" ON "songs"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Song' WHERE "songs"."account_id" = 1 |
This is getting the ID of every tag that’s applied to a song in the given account. That’s a third of what we need; we also want any tag applied to a book or video in the given account. So we need to combine our song_tags
, book_tags
, and video_tags
SQL snippets somehow. Since we want to include a tag that has been applied to a song, book, or video, but not necessarily to one of each, we should use an OR
operator:
1 2 3 4 |
tags = ActsAsTaggableOn::Tag.arel_table conditions = tags[:id].in(song_tags). or(tags[:id].in(book_tags)). or(tags[:id].in(video_tags)) |
These conditions will select tags that are applied to any song, book, or video in the specified account–that was our original goal! If we look at to_sql
of conditions
, we get:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
( ( "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "songs" ON "songs"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Song' WHERE "songs"."account_id" = 1 ) OR "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "books" ON "books"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Book' WHERE "books"."account_id" = 1 ) ) OR "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "videos" ON "videos"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Video' WHERE "videos"."account_id" = 1 ) ) |
Almost there. Now we can use these conditions in our original query, back at the ActiveRecord level. The nice thing is that we can just pass our conditions
variable to ActiveRecord’s where
:
1 |
ActsAsTaggableOn::Tag.joins(:taggings).where(conditions) |
If we run this query, we definitely get back tags for the right account, but there may be duplicates. We can add a .uniq
at the end to add DISTINCT
to our SELECT
statement:
1 |
ActsAsTaggableOn::Tag.joins(:taggings).where(conditions).uniq |
While there is a uniq
method on enumerables in Ruby, ActiveRecord will use uniq
to modify our query so we leverage SQL to get unique results instead of filtering at the Ruby level. Here is the final SQL that is generated:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
SELECT DISTINCT "tags".* FROM "tags" INNER JOIN "taggings" ON "taggings"."tag_id" = "tags"."id" WHERE ( ( "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "songs" ON "songs"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Song' WHERE "songs"."account_id" = 1 ) OR "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "books" ON "books"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Book' WHERE "books"."account_id" = 1 ) ) OR "tags"."id" IN ( SELECT "taggings"."tag_id" FROM "taggings" INNER JOIN "videos" ON "videos"."id" = "taggings"."taggable_id" AND "taggings"."taggable_type" = 'Video' WHERE "videos"."account_id" = 1 ) ) |
We’re mostly hitting indexed fields here. tags.id
, songs.id
, books.id
, and videos.id
are all primary keys, while tag_id
, taggable_id
, and taggable_type
belong to other indices. Hopefully your account_id
fields in songs
, books
, and videos
are indexed as well, since they’re foreign keys by which you’ll be doing a lot of filtering.
So there you have it. That’s how I construct a query in Arel or ActiveRecord. Start with where you want to end up and work your way backward, connecting in other tables whenever you need access to their data. In my next Arel post, I’ll cover how I filter records that the given user has permission to view.