Ruby on Rails Bedtime Stories

It’s time for some bedtime stories. These are children’s stories so there is a happy ending. These stories are about making your Rails code cleaner, faster, and less brittle, and about keeping your fragment caching code simple. The third story shows how removing code from a view can make your app’s controller query the DB a lot more often.

Story #1: If You Fragment Cache a View

If you want your Rails website to run faster and scale better, you might decide fragment caching will help.

And chances are, you will put memcached on the server and write cache statements in your views.

But if you are still on Rails 2.x then you’re not using Arel’s lazy execution. So you might notice your controllers are still querying the database even for views that are cached and won’t use the data. So you will probably add read_fragment calls to the controllers to prevent that.

If you used a custom key in the view’s cache statement you will want to use the same custom key in the read_fragment statement.

But if it’s a complex key, that might not feel DRY. So you might refactor the key out to a helper function you can call from the controller and the view, passing the result in to read_fragment and cache. You might even put the helper in application_controller.rb if the view is a partial that’s called from several controllers.

Then you might feel bad that your fragment cache logic, which started out nice and clean and small in the view, is now spread out among three source files. You might get a headache. You might think there must be a better way. And you might ask for a glass of milk.

Story #2: If You Remove Data from a View

If your product manager wants you to output some data in a view, you might ask for the data in your controller. You might use ActiveRecord’s find method since you’re still on Rails 2.x and can’t use Arel yet.

And chances are, if you query data in your controller, you’re going to store the result in an instance variable.

If your product manager decides three months later that the data doesn’t need to be shown after all, she may ask your coworker to remove it. Your coworker may do just that–remove it from the view.

Your coworker may not realize the query in the controller is no longer needed. He may not even think of removing it.

Your server may worker harder than it has to, continuing to query the database on every page view to get data that’s never shown on the page.

Your server might get a headache. It might ask for a glass of milk.

Story #3: If You Combine Stories 1 and 2

If your product manager asks you to show data in a view, chances are you will put a database query in your controller to pass the data to the view.

When traffic gets high and your server gets overloaded, you may decide to use a fragment cache. You might use :expires_in => 5.minutes so your server only has to query the database once every five minutes.

You might get the same headache you got in bedtime story #1 because you have to add a read_fragment call in your controller, a cache call in your view, and a reusable key function in your application controller. But the glass of milk might help and your headache may go away.

When your product manager says you no longer need to show the data in your view, chances are you will remove the fragment cache and its contents from the view. But you might forget about the query in the controller. You might leave it there.

But no big deal, right? The app still works. Your tests still pass. You’ll only be querying the database once every five minutes, after all. Wrong. Remember: calling cache from the view is what writes to the cache. By removing the view’s cache statement you have removed all cache writes. So there’s never anything in the cache. So read_fragment will always return nil.

And if read_fragment returns nil, your query will run.

On. Every. Request.

By removing the data from the view you have gone from querying the database once every five minutes to querying it on each and every page view, even though you never need to show the data.

You’re going to need a really big glass of milk to get rid of that headache. Hopefully you’re running New Relic and will see the huge increase in DB queries immediately after deploy so you can look into it quickly.

The Promised Happy Ending

There are several possible solutions to the problems in these three bedtime stories. The third one is my favorite by a long shot:

  1. Not the solution: perform the query in the view instead of in the controller. Obviously this is wrong. I won’t discuss this further.
  2. Not the solution: use Model-View-Presenter. I won’t discuss it either because I think it needlessly complicates matters. We Rails programmers tend to believe in the KISS principle, and there is a simpler solution.
  3. The solution: move your queries out of the action methods and into memoized helper methods in your controller.
    1. Give the helper methods the same name that the instance variables used to have. (@company becomes the company method).
    2. In each helper method, assign the result to the instance variable. Make sure the assignment uses ||=, not plain old =, so the helper runs the query only once no matter how many times it’s called.
    3. In your view, never rely on instance variables having been previously set by the controller. Remove all the @ signs from the views and almost all @ signs from the controllers. (Exceptions are in new and create. See below.) What you’re left with are function calls to the helpers. Each helper will perform the query the first time it is called within a single request.

This solution is pure beauty. Here’s why:

  • It solves bedtime story #1. You can remove all those sloppy read_fragment calls from your controller. The only reason they were there was to limit database calls to only when the view needs it. But by the time your helper function is called, you know the view needs the data and wasn’t able to find a fragment in the cache. So just go get the data. This means you can also remove the reusable key method from application.rb and put the cache key inline as a parameter to the cache statement. Your code will still be DRY because you only need the key in that one place.
  • It solves bedtime story #2. If you ever remove your data output from your view, you are removing calls to the helper functions. If they’re not called, they won’t query the database. So even if you forget to remove them from the controller, they’re doing no harm.
  • It solves bedtime story #3. If you ever remove a fragment cache and its contents from the view and thus never write to the cache any more, it does not increase the number of times your controller queries the database. Since you’ve removed the view code that calls the helper functions, you actually reduce the number of database calls to zero.
  • It’s simple to understand.
  • It’s easy to implement.
  • The resulting code is clean. Your controller functions are shorter and therefore sweeter. You don’t have to add new classes. You don’t have to write any new code.
  • It does not require putting database calls in the wrong layer (the view).
  • It does not require explaining a scary new pattern name to your team and to every new person who joins your team.

Does It Work in Rails 3?

Sure, it works in Rails 3. You will get the same benefits from using it in any version of Rails. Rails 3 has Arel with its lazy evaluation of queries, but the default generated controllers don’t take advantage of this feature.

Update: previously I said it won’t really be needed any more after you update all your find calls to Arel’s implicit query that runs when you start iterating over results. It turns out I was wrong: even in Rails 3 the default index function calls ModelClass.all and the other functions call ModelClass.find. Both of these run an immediate query, not a lazy one.

Putting performance improvements aside, this technique also DRYs up your controller code a bit even when you’re using Arel. Instead of putting @company = Company.find(params[:id]) in every controller method, you just write it once.

It’s Not Following the Rails Conventions. Will Other Gems Work?

Gems will still work. Let’s say, for example, you use Ryan Bates’ CanCan gem. There’s a code sample in the CanCan README that goes like this:

def show
  @article = Article.find(params[:id])
  authorize! :read, @article
end

It looks like you have to initialize @article, right? But you could rewrite it this way:

def show
  authorize! :read, article
end

def article
  @article ||= Article.find(params[:id])
end

Notice I took the @ off the “article” parameter. It is now a function call to the your helper method that calls Article.find. In this case you’re not avoiding a database call, but you’re not doing any harm either. And your show method is shorter. And sweeter.

Update: In the comments below, in response to Ben’s note about CanCan’s load_and_authorize_resource function, I posted a better way to write your helpers when using CanCan so they don’t run the query or do any authorization until we know the data is needed.

What About Non-Read Operations? Create, Update, Delete?

The new and create methods should still set the instance variable like they did before. update can just call the helper (company.update_attributes instead of @company.update_attributes) and destroy can do the same (company.destroy instead of assigning @company and then calling @company.destroy).

But all views—whether for any of the seven RESTful methods or any non-RESTful method—should just reference the helper method, never the instance variable. Since the helper method is memoized (@company ||= Company.find), it will just work because the instance variable was already set.

I Like Testing

Your tests will need to use @controller.post instead of assigns(:post). Understand, though, that the test might do a DB query by calling post if the view never called it.

Code Sample: the Old Way

Here’s an example CompaniesController that follows the Rails convention of doing the database query in the controller, assigning the result to an instance variable, and referencing it in the view:

# GET /companies/1
# GET /companies/1.xml
def show
  @company = Company.find(params[:id])

  respond_to do |format|
    format.html
    format.xml  { render :xml => @company }
  end
end

And the corresponding show.html.erb:


  <strong>Name:</strong>
  <%=h @company.name %>

<%= link_to 'Edit', edit_company_path(@company) %> |
<%= link_to 'Back', companies_path %>

Code Sample: the Better Way

Here’s how you can rewrite it to avoid bedtime story problems:

helper_method :company

# GET /companies/1
# GET /companies/1.xml
def show
  respond_to do |format|
    format.html
    format.xml  { render :xml => company }
  end
end

def company
  @company ||= Company.find(params[:id])
end

And the corresponding show.html.erb:


  <strong>Name:</strong>
  <%=h company.name %>

<%= link_to 'Edit', edit_company_path(company) %> |
<%= link_to 'Back', companies_path %>

For completeness, here is how you would implement the index method:

helper_method :company, :companies

def index
end

def companies
  @companies ||= Company.all
  # or with will_paginate:
  # @companies ||= Company.paginate :page => params[:page] || 1
end

Thanks

Thanks and apologies to If You Give a Mouse a Cookie and other books in the series by Laura Joffe Numeroff and Felicia Bond. A wonderful series of books that I recommend reading to small children.

Thanks to Struts 2 for helping me discover this pattern. (Yeah. Whoa.)

17 thoughts on “Ruby on Rails Bedtime Stories”

    1. Thanks, Kang. Does story #3 sound like something we saw on a project we worked on? ;-)

      This code is already just as DRY as the conventional Rails pattern. More DRY if you count the removal of the duplicate key logic for read_fragment and cache. My preference is just to leave the code as is rather than try to make it a plugin. I don’t think a plugin would really help here.

    2. Come to think of it, Kang, this approach makes the app’s code more DRY than the conventional Rails approach even if you are not using caching. Because where you used to have this sprinkled throughout your controller:

      @company = Company.find(params[:id])

      you now have it in only one place.

  1. Does this technique work with partials and collections? I am thinking of how the partial local is named, in this case, company, conflicting with the controller method. It probably works just fine, but I haven’t tested it.

    1. Allan, great question. You’re right, the partial local is named the same as the helper method. I have tested it and fortunately it does work just fine, as you suspected. The locally scoped company variable in this example makes the helper function invisible.

      In the “render” statement you would use company/companies instead of @company/@companies:

      render company # used to be render @company
      render companies # used to be render @companies

    1. Thanks for the link Allan. Yeah, as I mentioned in the post, Arel greatly reduces the need to use this pattern because it does lazy execution. You’ll see the most benefit from this pattern in your Rails 2.x apps that don’t use Arel.

      I still think it’s kind of a nice pattern to use even with Arel because it’s more DRY than the typical pattern. I’ve added a brief paragraph about that to the “Does It Work in Rails 3?” section above. But with Arel, the practical benefits of using this pattern might no longer outweigh the benefit of using the patterns every Rails developer is familiar with.

    2. Another follow-up on the point about Arel and Rails 3: it turns out Rails 3′s default generated controllers don’t take advantage of Arel. They use .all and .find, neither of which does lazy execution of the query. If you’re using those queries, you get the same benefit from my approach in Rails 3 as in Rails 2.

      I’ve updated the Rails 3 section above to mention this.

  2. I wonder why helper_method is required? It would be nice if you could just control visibility with protected and private. If instance variables are made available, why not methods?

    1. Hmm, good question, James. You’re right, it’s an interesting choice that controller instance variables are accessible to the view by default but public methods are not.

      1. Good read, thanks for the article. It definitely looks like viable option.

        You can’t use public methods to be accessible to the view because this visibility is reserved to routes (:controller/:action). However, I think it would be nice to have protected methods available in the view.

        public => callable methods for routes
        protected => available methods for controller, helpers, views
        private => only available in controller, like the name suggests

  3. Since you mentioned CanCan, I’d like to also point out CanCan’s helper:
    `load_and_authorize_resource`.

    In which the instance variable (e.g.- @article) will already have been loaded (and authorized for the user) before the ‘show’ method gets called.

    In this case, one can simplify the show action even more in your example (i.e. – empty).

    I think I agree that even in this case, I’d still use your suggested helper method access in the view. Thanks for the tip.

    1. Thanks for pointing that out, Ben.

      The bad news is load_and_authorize just takes you back to the same old problem I was pointing out in the article. It proactively performs a query (in a before_filter) before knowing if the view will actually need the data. Then we’re back to square one.

      But your question brings up a great point: even without load_and_authorize_resource, calling authorize! in the controller (as I did in my article) causes a proactive query before knowing if the view will need the data.

      How do we work around this? By authorizing the resource in the helper method instead of in a before_filter. Try something like this:

        def company
          return @company if defined? @company
          @company = Company.find(params[:id])
          authorize! params[:action].to_sym, @company
          @company
        end
      
        def companies
          return @companies if defined? @companies
          @companies = Company.all
          authorize! :read, @companies
          @companies
        end
      

      Feel free to DRY up this code.

  4. First, i worked on a project for along time and in some cases used this approach (some other team members used it). so i have experience with this.

    Here is why i don’t like this approach:

    1. the action doesn’t reveal what will happen when the request arrive.
    if you look at the index action in the example you will see that it is empty. you need to read the view in order to understand what queries will be made or services will be called.

    this makes it harder to understand the flow and how to optimize or refactor it. you will need to find the view and look for names of methods mixed inside html, and you will also have to look in all the partials that are called. a messy job.

    2. the name of these helper methods are like any other variable name. in this example “company”. i would prefer “get_company()”.

    i understand company is sweeter, but since ruby doesn’t require you to use bracket, when a programmer who barely knows the code tries to solve a problem and see “company.name” his first thought is that company is an object variable, he doesn’t suspect this might be a call to a method that query the db.

    1. doro, it’s good to hear feedback from someone like you who has worked on a project for a while that used this technique. Your points are spot-on and they deserve a detailed response:

      1. You said the action doesn’t reveal what will happen when the request arrives. I agree that this is a problem. Having to trace through a bunch of html would be a pain. There are alternatives like using query_trace or New Relic but I get your point–nothing would be clearer than just seeing it right there in the controller and knowing what queries will be run for a given action.

      The problem is, that gives a false sense of security. Reading a controller action in Rails doesn’t reveal what will happen when the request arrives. Even when people try to follow the rules and not write queries in views, It’s common to show associations in a view without having joined or included the association in the controller. So the controller doesn’t make it clear that more queries will be run.

      It’s pretty easy for these queries to get out of hand. In the worst case I’ve seen, the controller did just one query but by the time the view was rendered, 300 queries were performed–and that was just to build a site’s nav bar (which was shown on every page). A programmer naively crawled the entire category tree in a helper function (in the helpers folder) without realizing each call to .parent or .children ran a db query.

      It feels like I’m saying “it’s already bad so it’s ok to make it worse.” Not exactly. If I knew a good fix that made all queries obvious in the controller, I might advocate that. But since I don’t know one, I’m trying to be pragmatic and say given the current landscape, let’s see we we *can* solve some of its problems.

      2. You said the name of these helper methods are like any other variable name. In this example “company”. You would prefer “get_company()”. I see what you’re saying but I’m not yet convinced on this one. I was tempted to name the helper methods something like “get_company()” instead of just “company,” but it seemed like it goes too much against the Ruby naming pattern that’s used for all accessors. Once someone has used Ruby for a bit they’re probably going to know that “company” is either a variable or an accessor method, and if they don’t see the variable declaration up above then it must be an accessor method.

      But you’re right that a programmer who barely knows the code won’t realize that the accessor method queries the db. As I mentioned above, I worry about that with ActiveRecord in general since it’s not obvious which method calls will run queries.

      My answer is: it’s still better than in the normal Rails pattern, where you still query the db to get the company even if you *never* reference “company.”

      But hey, if readers want to use this pattern but call the helper method get_company() to make it more obvious, or leave off the parens and call it get_company, go for it. You’ll still get the benefits I described in the post.

      Summing up:

      The Struts 2 project I work on for my day job uses the helper method approach that I describe in the post above. I waited six months before writing this post about doing it in Rails to make sure it was working out ok. So far we have not had problem #1 or #2 that you mentioned. (Yes, #2 is a potential problem even in Java because we’re using OGNL script to call the accessors. So in the view it looks like “company,” not “getCompany()”.) But admittedly it’s a team of three and we all know the code really well. We’ll have to see what happens when more people join the team.

      Thanks a lot for your insights.

Leave a Reply

Your email address will not be published. Required fields are marked *

Feel free to use <a>, <b>, <i>, <strong>, <em>, <strike>, <code>.

Code blocks:
[code language="ruby/javascript/html/css/sass/bash"]
[/code]