Category Archives: Rails

The Javafication of Ruby on Rails

As you know, one of the nice things about using Ruby on Rails is the emphasis on code simplicity and readability. A month ago in “This Week in Edge Rails,” Mike Gunderloy posted something that makes me a little concerned—it feels like one small step toward the Javafication (a.k.a. complexification) of Ruby on Rails.

The change is called “Pluggable JSON Backends” and it was implemented by Rick Olson, a.k.a. Technoweenie. (I have the utmost respect for Technoweenie. I’ve used his code before and it’s wonderful stuff—it’s clean, it’s simple, and it works.)

Instead of this:

my_model.to_json

we are now encouraged to do this:

ActiveSupport::JSON.encode(my_model)

I Like Stuff that’s Simple

It’s a little thing and I don’t want to blow it out of proportion, but this is a pattern I would like to see the community avoid. Once people see this style of coding they start to copy it.

I think I understand the motivation behind the change, but I believe the new style of coding is harder to read and write. It reminds me of all the years I wrote C++ code and wished I could just extend a class outside its original definition. Ruby can do that and I love the fact that Rails isn’t shy about taking advantage of this capability to make programmers’ lives easier and code simpler.

It may be too late to have this debate when it comes to JSON. Maybe this was the only way to meet the requirements Technoweenie had. I don’t know. And I understand sometimes it’s necessary to do something a little ugly to achieve a goal that’s more important than simplicity or readability. But I am writing today to encourage the Rails core committers, and the rest of us in the Rails community, to please avoid this pattern when possible. If you are considering writing a method that looks like this:

ClassOrModule::namespacey_function.method_name(obj)

please pause and consider whether the users of your API would find it easier if you instead made it look like this:

obj.method_name
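
The two styles don't have to be mutually exclusive. Here is a minimal sketch, in plain Ruby, of a pluggable backend hidden behind a method on the object itself. TinyEncoder and to_tiny are hypothetical names invented for this illustration; this is not how ActiveSupport implements its JSON backends.

```ruby
# Hypothetical names throughout: TinyEncoder is not part of Rails.
module TinyEncoder
  class << self
    # The backend is any object that responds to #call(obj) and returns a String.
    attr_accessor :backend

    def encode(obj)
      backend.call(obj)
    end
  end

  # Default backend: Ruby's inspect output.
  self.backend = lambda { |obj| obj.inspect }
end

# Reopen Object so callers can write obj.to_tiny
# instead of TinyEncoder.encode(obj).
class Object
  def to_tiny
    TinyEncoder.encode(self)
  end
end

puts([1, "two"].to_tiny)                          # uses the default backend
TinyEncoder.backend = lambda { |obj| "<#{obj}>" }  # plug in a different backend
puts(42.to_tiny)                                  # same call site, new backend
```

The namespaced entry point is still there for anyone who needs to swap the backend, but everyday callers only ever see obj.to_tiny.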

Thanks!

Add Optional SEO-Friendliness to link_to_remote

link_to_remote_with_seo adds optional SEO-friendly goodness to the Rails link_to_remote function.  I wrote it for cases where I would have used link_to_remote in my Rails app but I wanted GoogleBot and other search engines to be able to follow the links.  In addition to setting onclick like the normal link_to_remote, it also sets html_options[:href] to the SAME URL that you pass in to options[:url]. (It only does this if you pass :seo => true and you do not explicitly set the href.)

See the big honking warning at the bottom for an explanation of why this plugin doesn’t just override the behavior of link_to_remote.
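
The href rule described above is small enough to sketch in plain Ruby. seo_html_options is a hypothetical helper written for illustration, not the plugin's actual code, and it assumes the URL has already been turned into a string (the real helper would presumably go through url_for).

```ruby
# Hypothetical helper illustrating the rule; not the plugin's real code.
def seo_html_options(options, html_options = {})
  result = html_options.dup
  # Copy the AJAX URL into href only when :seo is on and no href was given.
  if options[:seo] && !result.key?(:href)
    result[:href] = options[:url]
  end
  result
end

# :seo => true and no explicit href: href is copied from :url.
p seo_html_options({ :seo => true, :url => "/home/next_page" })

# An explicit href always wins.
p seo_html_options({ :seo => true, :url => "/home/delete/1" },
                   { :href => "/home/confirm_delete/1" })

# Without :seo the html options are left alone.
p seo_html_options({ :url => "/home/next_page" })
```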

I Like Stuff that’s SEO-Friendly

The following example shows a “Next” link in paginated output.  Clicking the link in a browser results in an AJAX call (using the POST method) that retrieves just the “page” partial and inserts it into the “results” div on the page with a highlight visual effect.  When a search engine sees the link, however, it will send a GET request to the same URL, and the entire page (not just the partial) will be sent in the response.

Putting this in the view (home/index.html.erb):

<div id="results">
  <%= render :partial => "page" -%>
</div>
<%= link_to_seo_remote "Next",
  { :update => "results",
    :url => { :action => "next_page" },
    :complete => visual_effect(:highlight, "results") } %>

Produces (pay attention to the href attribute):

<div id="results">
  <!-- first page of results shown here -->
</div>
<a href="/home/next_page"
  onclick="new Ajax.Updater('results', '/home/next_page',
  {asynchronous:true, evalScripts:true,
  onComplete:function(request){new Effect.Highlight(&quot;results&quot;,{});}}); return false;">
  Next
</a>

In the controller (home_controller.rb), render just the partial if the action is called in an XHR (AJAX) request:

def next_page
  if request.xhr?
    render :partial => "page"
  else
    # Render the entire page, including the "results" section.
    render :action => "index"
  end
end

WARNING ABOUT INCORRECT USE OF THIS FUNCTION

Sorry but I have to yell for emphasis here.

When Google crawls your site it will follow all links on a page in advance, even before the user clicks on them.  Adding :confirm => "Are you sure?" WILL NOT HELP because it generates JavaScript that Google doesn’t execute.  So when you use link_to_seo_remote, DO NOT ALLOW destructive links to be placed in the href attribute.  Instead, override html_options[:href] to link to an intermediate page with “Are you sure?” and a BUTTON (not a link).  The crawler will not click the button, so the data will not be deleted.

See Using Rails AJAX Helpers to Create Safe State-Changing Links and search the page for “request.post?” for an explanation and some sample code.
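
The idea behind that advice can be sketched without Rails. In this simplified illustration, FakeRequest and handle_delete are hypothetical stand-ins (a real controller action would check request.post? on the actual request object): a GET, which is all a crawler sends, only ever reaches the confirmation page.

```ruby
# Hypothetical stand-in that mimics only request.post?.
FakeRequest = Struct.new(:http_method) do
  def post?
    http_method == :post
  end
end

# Delete only on POST; any other method gets the "Are you sure?" page.
def handle_delete(request, posts, id)
  if request.post?
    posts.delete(id)
    :deleted
  else
    :confirmation_page
  end
end

posts = { 1 => "first post" }

# A crawler following the href sends a GET: nothing is deleted.
puts handle_delete(FakeRequest.new(:get), posts, 1)
puts posts.size

# A person pressing the button on the confirmation page sends a POST.
puts handle_delete(FakeRequest.new(:post), posts, 1)
puts posts.size
```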

Does it Have Tests?

Why, yes. I’d like to thank the Rails Community for not tolerating code with no tests. It was soooo tempting just to release this without writing automated tests but the peer pressure got to me.

And I’d also like to thank Cake for awesome music.

To get the code

ruby script/plugin install http://github.com/BMorearty/link_to_remote_with_seo.git

My Favorite Quotes from the Yellowpages.com Ruby on Rails Talk

I just watched a video from the 2008 QCon conference of a talk by John Straw about how and why Yellowpages.com rewrote their Java site to use Ruby on Rails. It’s a pretty good talk. He starts by describing the situation they were in that led them to consider a rewrite, then goes into the architectural decisions and some of the technical details.

Here are some choice quotes from the talk, along with my own commentary.

“All programmers want to rewrite the code they’re forced to maintain. They’re almost always wrong.”

Man, is that ever true. (Note that he said almost always. After all, his talk is about a successful rewrite.)

I’ve seen it again and again. Programmers tend to believe the code they’re maintaining (that someone else wrote) sucks and they could write it much better. Often that’s because they haven’t taken the time to understand the code base. As Joel on Software says, “It’s harder to read code than to write it.” I think usually (but not always) the cost of rewriting it far outweighs any benefits. What you’d typically end up with after a rewrite is:

  • A few years have passed
  • You’ve spent a ton of money on the rewrite
  • The app still has bugs–just a different set of bugs. (Another quote from the Joel article: “The idea that new code is better than old is patently absurd.”)
  • A new generation of programmers will join the team soon. They will complain that the code base sucks and needs to be rewritten.

Having said that, I know there are times when a rewrite is the right thing to do. But that’s a discussion for another day.

Something I think his team did correctly: they made a goal of finishing the rewrite in four months, not two years. A massive two-year rewrite has an extremely low chance of succeeding.

“EJB3 is a whole big boxcar full of crazy.”

Now that’s just funny. (He said that after saying EJB3 is much better than earlier generations of EJB, by the way.)

“At this point our performance architect will maintain that Apache is unsuitable for use in any production web serving environment, in general. (And only nginx with its polling model is the right way to go.)”

I don’t agree but it’s a great quote.

“I actually kind of like the thread-unsafety of Rails. I mean it simplifies the programming model quite a bit for simple web sites. You know: I’m handling one request; I understand how to scale that.”

I totally agree with that. As someone who loves writing software, I think threading is fun and awesome and there are situations where it’s a must–I once even thought about writing a book about threading on Win32. When I was first introduced to Ruby on Rails I had a kneejerk “are you kidding me?” reaction when I heard it wasn’t thread-safe. But I’ve since formed the opinion that single-threading is really nice when you can get away with it because of its simplicity. It helps developers focus on the task at hand rather than spending a lot of time debugging threading problems. In a multi-threading environment it’s too easy for developers who understand threading to introduce code that then gets broken by other developers–and it’s too hard to write tests that will catch the breakage the moment it occurs.

By the way, the speaker’s next sentence was “Obviously our fast service-side application is multi-threaded and we have good benefits from that.” So he’s not saying multi-threading should never be used.

“Testing was a big part of the decision. You know, that was actually one of the things which drew me so strongly to the platform once I started understanding it. I had spent years myself as a Java developer trying to figure out how in the heck to use JUnit to do anything useful on my web site. And maybe that was just a failure of imagination on my part, but when we started looking at Rails we didn’t have to figure it out. It was obvious how to test each level. Both the unit tests for the models, and the functional tests and the integration tests. It was all there in the framework. And not only was the framework built to make it easy, but the community expected it. You know, I’ve never seen a development community that was so involved and oriented towards writing test code–writing test automation–than this one. And so that was a big part of our decision.”

So true. I have found that when it’s obvious how to write effective tests and where to put them, I will write tons of tests. If the framework greases the wheels of test-writing and makes it pain-free, I will write a lot more and better tests. Rails does a lot better at this than other frameworks I’ve used, although I still think it could use improvement. And I love the emphasis placed on automated testing in the Rails community.

Well, that’s it. To see the whole talk, go to http://www.infoq.com/presentations/straw-yellowpages. And enjoy the grouchy comments by Java developers below the video.

Put HTML tags and apostrophes in fixtures and tests or a meanie will hack you.

Here’s a good way to protect against cross-site scripting attacks and SQL injection attacks. This will help catch mistakes where you (well actually your teammate, since you’re perfect) forgot to call “h” in a <%= %> block, or accidentally passed a SQL statement to the database without escaping the values:

Sprinkle unclosed HTML tags and apostrophes all over your fixture data and test code.

Then use assert_select liberally, which will barf on the console if it sees unclosed HTML tags–even if you were selecting some other part of the document.

I Like Stuff that’s Safe

Here is what a posts.yml file might look like:

test_post:
  id: 1
  subject: <script> attack!
  detail: "sql injection: '; drop table posts;"

(This value contains an apostrophe and a colon followed by a space, so quote the whole string to keep YAML from misparsing it.)

So assert_select has this handy side-effect I mentioned where it tells you about your malformed HTML. Since Rails tests don’t actually run in a browser, you need some other way to know that you’ve forgotten to escape data. Unclosed HTML tags in your fixtures, yeah, that’s the ticket.

And remember, you don’t need to call assert_select on the element that contains the bad data. Just call assert_select on anything and it will parse the output to make sure it’s well-formed.

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "body"
  end

The idea is that by sprinkling XSS attacks through your fixtures and using assert_select whenever you’re testing other stuff, the XSS attacks will become apparent.

If you do need to assert that the output is correct, you can call CGI::escapeHTML:

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "span", :count => 1,
      :text => CGI::escapeHTML(post.detail)
  end
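
To see why the assertion compares against the escaped text, here is a small standalone sketch. It uses ERB::Util.html_escape from Ruby's standard library (the h helper in Rails views is an alias for it) rather than CGI::escapeHTML; for this input the two should agree. The subject string is the one from the fixture above.

```ruby
require 'erb'

subject = "<script> attack!"            # fixture value with an unclosed tag
escaped = ERB::Util.html_escape(subject)

puts escaped                  # &lt;script&gt; attack!
puts escaped.include?("<")    # false: nothing left for a browser to treat as a tag
```

If a view forgets the escaping, the raw <script> lands in the response unclosed, which is the kind of malformed markup the assert_select trick relies on to flag the mistake.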

I can’t haz SQL injection attacks

I admit that putting SQL injection attacks in the fixtures is a bit contrived and may not help. A better way to catch SQL injection attacks is to pass apostrophes into the app from your test code, so go ahead and sprinkle your test code with beauties like this:

  def test_update
    post :update, :id => posts(:test_post).id,
      :detail => "sql injection: '; drop table posts;"
  end

The secret to making this work is:

  1. apostrophe
  2. semicolon
  3. SQL statement
  4. another semicolon

You want to use a SQL statement that will cause a test to fail. It would be coolio if there were some way to make the current test succeed and subsequent tests fail, but I’m not sure I know a way to do that consistently. But at least if you use a “drop table” statement, you’re going to cause subsequent tests to fail (if there are any subsequent tests that use that table) because a schema change does not happen in a transaction. So even if you’re using transactional fixtures, the next test will fail anyway cuz the dang table is gone.
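
To see what those four ingredients do, here is a sketch that only builds SQL strings and never touches a database. The quote method is a deliberately naive stand-in written for this example (in a real app, let ActiveRecord's bind parameters or the database adapter's quoting do this work).

```ruby
detail = "sql injection: '; drop table posts;"

# Unsafe: the apostrophe ends the string literal early,
# and everything after the first semicolon becomes a second statement.
unsafe = "UPDATE posts SET detail = '#{detail}' WHERE id = 1"

# Naive illustration of quoting: double every apostrophe
# so it stays inside the string literal.
def quote(value)
  "'" + value.gsub("'", "''") + "'"
end

safe = "UPDATE posts SET detail = #{quote(detail)} WHERE id = 1"

puts unsafe
puts safe
```

In the unsafe string the database would see an UPDATE, then a DROP TABLE, then some trailing junk. In the safe string the whole attack is just text stored in the detail column.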

Generate guid ids 2100x faster for ActiveRecord models (but only if you use MySQL)

The Rails project I’m working on (the Small Business Help Forums at the Intuit Community) has some tables that use GUIDs for their primary keys instead of autoincrement integers. To implement GUIDs we used the handy usesguid plugin. All you have to do is change your “id” column to a 22-character varchar (make sure it’s a binary varchar and uses binary collation, so upper and lower case are treated differently) and put this in your model:

class MyModel < ActiveRecord::Base
  usesguid
end

Pretty nice.

Just one problem.

It’s HECKA slow.

On my Windows machine it was taking a whopping 0.4 seconds to create a GUID with this plugin. On my Linux VM it was a lot faster, but still slower than it should be (0.0322 seconds–just 31 GUIDs per second).

Download the Faster Plugin

If you use MySQL for your database and you’d like to download my modified usesguid plugin which is way faster, type this from the main directory of your Rails app:

 script/plugin install git://github.com/BMorearty/usesguid.git

Or download it here and copy it into vendor/plugins/usesguid.

Then add the “usesguid” statement (see above) to any models that you want to have guid ids, migrate the id columns to binary varchar(22), and add this to your environment.rb file:

ActiveRecord::Base.guid_generator = :mysql

Here is a sample migration for creating a new table with guids, as opposed to changing an existing one to use them:

create_table :products, :id => false, :options => 'ENGINE=InnoDB' do |t|
  # This table uses guid ids
  t.binary :id,   :limit => 22, :null => false
  t.string :name, :limit => 50, :null => false
end
# Since the t.column syntax can't specify a character set and collation...
execute "ALTER TABLE `products` MODIFY COLUMN `id` VARCHAR(22) BINARY CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;"
execute "ALTER TABLE `products` ADD PRIMARY KEY (id)"

I Like Stuff that’s Fast

Read on to find out why the old code was so slow, and how the code got 2100 times faster.

I investigated to see why it takes so long, and found that every time it creates a GUID, it calls UUID.timestamp_create. This in turn calls UUID.get_mac_address, which spawns a new process (ipconfig on Windows; ifconfig on UNIX-based systems) and parses the output. The reason: to discover the network card’s MAC address. (Hey yeah, even Windows has a MAC address.)

But the MAC address never changes. It’s hard-wired into the network card. So why bother querying it every time you create a GUID? Launching a whole new process every time we need a GUID is overkill.

My first thought was to write a plugin on top of the plugin. My plugin would cache the result of UUID.get_mac_address. I tried it, but found a problem: there’s a bug in UUID.timestamp_create. If it executes too quickly on a system whose clock resolution is not high enough, it returns the same GUID multiple times in a row. Whoops! Kind of defeats the purpose of GUIDs.

So I decided to take advantage of the fact that MySQL has a “SELECT UUID()” syntax, and I wrote a new GUID creator in the UUID class that calls MySQL to generate GUIDs. (Obviously this only works if you have MySQL.) I called this new creator “UUID.mysql_create.” The first time it is called, it calls MySQL like this:

SELECT UUID(), UUID(), UUID(), UUID(), UUID(), ... ;

It selects 50 UUIDs in a single round-trip to the database and stores the results in memory. Each time a new GUID is required, it plucks one off the list. When the list is empty and another one is required, it goes and gets another 50.
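
The batching scheme can be sketched in a few lines of plain Ruby. This is an illustration rather than the plugin's actual code: GuidPool is a hypothetical class, and SecureRandom.uuid stands in for the round-trip to MySQL so the example runs without a database. It also ignores thread safety.

```ruby
require 'securerandom'

# Hypothetical class: hands out GUIDs one at a time, fetching them in batches.
class GuidPool
  BATCH_SIZE = 50

  attr_reader :round_trips

  # fetch_batch is a block that takes a count and returns that many GUIDs.
  def initialize(&fetch_batch)
    @fetch_batch = fetch_batch
    @pool = []
    @round_trips = 0
  end

  # Pluck one GUID off the list, refilling first if the list is empty.
  def next_guid
    refill if @pool.empty?
    @pool.shift
  end

  private

  def refill
    @round_trips += 1
    @pool = @fetch_batch.call(BATCH_SIZE)
  end
end

# Stand-in for "SELECT UUID(), UUID(), ..." against MySQL.
pool = GuidPool.new { |count| Array.new(count) { SecureRandom.uuid } }

guids = Array.new(120) { pool.next_guid }
puts guids.uniq.length   # 120
puts pool.round_trips    # 3
```

Asking for 120 GUIDs costs three batch fetches (at the 1st, 51st, and 101st request) instead of 120 separate calls.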

On my Windows machine, creating a GUID with UUID.mysql_create now takes 0.0001937 seconds, which is roughly 2100 times faster than the 0.4 seconds it used to take. On my Linux VM it’s 0.0001671 seconds, or 193 times faster than the 0.0322 seconds it used to take.

All these changes were made in a new file, uuid_mysql.rb. But I also made a number of changes to the usesguid.rb file:

  1. Added a configuration option so you can specify which creator to use. The default is still timestamp_create, but to use mysql_create you just put “ActiveRecord::Base.guid_generator = :mysql” in your environment.rb file.
  2. Fixed the code so it respects the :column option, which lets you override the column that stores the primary key.
  3. Delayed the assignment of a guid until just before creation (before_create) rather than just after “new” (after_initialize). This has two benefits:
    1. It more closely mimics the default behavior of autoincrement columns, which don’t get an id until the row is created.
    2. It is faster. After_initialize gets called every time a model object is instantiated, including all objects returned by a call to find. (But don’t worry, it wasn’t generating GUIDs for all those objects; it was just being called and bailing out when it saw there was already an id.) Before_create only gets called for newly created model objects.

I thought about making it even faster by calling CoCreateGuid() on Windows and calling a UNIX C function to create a GUID when on UNIX, but it’s so fast now that it hardly seemed worth the extra effort and the extra platform-specific code.

So that’s it. Enjoy it!

Find tests more easily in your Rails test.log

Here’s a nice little trick to make it easier to search test.log for the results of a specific test that’s failing. This trick works with normal Rails unit tests and with Shoulda tests.

When a Rails test fails, I look for it in test.log to see if there are any clues there. But it’s pretty hard to find the portion of the log associated with the test that failed. In this sample section of a log, where does the processing begin for the test called test_should_require_email_on_signup?

[Screenshot: test.log without titles. Which test is which? Where does my test start?]

It’s hard to find. Now imagine running rake on all your tests and sifting through the whole test.log looking for one test whose name you know, but the test name isn’t in the log.

So the other day I wrote a bit of code in my test_helper.rb file to make the log a lot easier to sift through. Here’s what the above log looks like with this code in place:

[Screenshot: test.log with titles. Ooh, nice-n-clear]

Ahh, that’s more like it. Now it’s easy to tell where test_should_require_email_on_signup begins. If you scroll up and look at the first log again, you’ll see that there isn’t even a blank line separating that test from the previous one. (See how the test starts on the SELECT count(*) statement?)

Here’s the code. Drop it into test_helper.rb for your Rails project. To me this seems like a nice little example of how Ruby’s open classes can benefit developers (while understandably considered harmful by some). In a language without monkey patching, I would have to resort to something more painful like changing all my tests to be derived from my own subclass of TestCase, and put this code in that class.

Enjoy!

class ActiveSupport::TestCase
  # This extension prints to the log before each test.  Makes it easier to find the test you're looking for
  # when looking through a long test log.
  setup :log_test

  private

  def log_test
    if Rails::logger
      # When I run tests in rake or autotest I see the same log message multiple times per test for some reason.
      # This guard prevents that.
      unless @already_logged_this_test
        Rails::logger.info "\n\nStarting #{@method_name}\n#{'-' * (9 + @method_name.length)}\n"
      end
      @already_logged_this_test = true
    end
  end
end
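
For reference, here is a standalone sketch of the banner that callback writes, using Ruby's Logger and a StringIO in place of the Rails log. The test_banner helper is a hypothetical name for this illustration, and it assumes the string uses real newline escapes (\n).

```ruby
require 'logger'
require 'stringio'

# Builds the same banner as log_test: two blank lines, a title, and an underline.
def test_banner(name)
  # "Starting " is 9 characters, so the underline matches the title's width.
  "\n\nStarting #{name}\n#{'-' * (9 + name.length)}\n"
end

io = StringIO.new
logger = Logger.new(io)
# Print bare messages, without Logger's default severity and timestamp prefix.
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }

logger.info test_banner("test_should_require_email_on_signup")
puts io.string
```

Searching the log for "Starting test_should_require_email_on_signup" then jumps straight to the start of that test's output.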

P.S. I didn’t spend the time to figure out why my callback was being called multiple times for each test. I just inserted the guard you see in the code above to prevent the same test title from being shown multiple times.

Drop me a line in the Reply section below and let me know what you think–especially if you’ve figured out why each one is called multiple times when running from rake or autotest.

Updated 6/18: I corrected the code above because WordPress automatically inserted a “mailto:” tag when it saw the @ sign.

Updated 2/26/10: I changed Test::Unit::TestCase to ActiveSupport::TestCase.

How to Show Response Time in a Rails Page with Mongrel

You’ve seen this on Google result pages, right?

You wanna do that in your Rails app that runs with Mongrel? I show you how. Sit down. And along the way I’ll show what I learned about writing custom Mongrel HttpHandlers and why you shouldn’t store instance variables in them.

I remember seeing it there long ago when Google was new. I liked it because:

  1. Faster is better
  2. It shows Google focuses on helping me go fast
  3. It reminds the Google developers to focus on helping me go fast

So I’m developing this awesome web site now and I’m using Ruby on Rails with Mongrel. I wanted to pull a Google and show the server response time as text content in my pages, as a reminder to myself and my co-developer that it’s super-important to keep things fast.

I can haz question

How do you put response time in a page using Rails?
