Add Optional SEO-Friendliness to link_to_remote

link_to_remote_with_seo adds optional SEO-friendly goodness to the Rails link_to_remote function.  I wrote it for cases where I would have used link_to_remote in my Rails app but I wanted GoogleBot and other search engines to be able to follow the links.  In addition to setting onclick like the normal link_to_remote, it also sets html_options[:href] to the SAME URL that you pass in to options[:url]. (It only does this if you pass :seo => true and you do not explicitly set the href.)

See the big honking warning at the bottom for an explanation of why this plugin doesn’t just override the behavior of link_to_remote.
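Here's a rough plain-Ruby sketch of the href-defaulting rule described above. It is an illustration only, not the plugin's source: the method name seo_html_options is made up, and the URL is assumed to already be a string (the real helper presumably has to turn options[:url] into one first).

```ruby
# Hypothetical sketch of the href-defaulting rule; not the plugin's actual code.
# Assumes options[:url] is already a URL string.
def seo_html_options(options, html_options = {})
  result = html_options.dup
  # Only fill in href when :seo is requested and the caller did not set one.
  if options[:seo] && !result.key?(:href)
    result[:href] = options[:url]
  end
  result
end

# href defaults to the same URL the AJAX call uses.
p seo_html_options({ :url => "/home/next_page", :seo => true })

# An explicit href wins, which is what you want for destructive links.
p seo_html_options({ :url => "/posts/1/destroy", :seo => true },
                   { :href => "/posts/1/confirm_destroy" })
```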

I Like Stuff that’s SEO-Friendly

The following example shows a “Next” link in paginated output.  Clicking the link in a browser results in an AJAX call (using the POST method) that retrieves just the “page” partial and inserts it into the “results” div on the page with a highlight visual effect.  When a search engine sees the link, however, it will send a GET request to the same URL, and the entire page (not just the partial) will be sent in the response.

Putting this in the view (home/index.html.erb):

<div id="results">
  <%= render :partial => "page" -%></div>
<%= link_to_seo_remote "Next",
  { :update => "#results",
    :url => { :action => "next_page" },
    :complete => visual_effect(:highlight, "#results") } %>

Produces (pay attention to the href attribute):

<div id="results">
  <!-- first page of results shown here --></div>
<a href="/home/next_page"
  onclick="new Ajax.Updater('#results', '/home/next_page',
  {asynchronous:true, evalScripts:true,
  onComplete:function(request){new Effect.Highlight(&quot;#results&quot;,{});}}); return false;">
  Next
</a>

In the controller (home.rb), render just the partial if called in an XHR (AJAX) request:

def next_page
  if request.xhr?
    render :partial => "page"
  else
    # Render the entire page, including the "results" section.
    render :action => "index"
  end
end

WARNING ABOUT INCORRECT USE OF THIS FUNCTION

Sorry but I have to yell for emphasis here.

When Google crawls your site it will follow all links on a page in advance, even before the user clicks on them.  Adding :confirm => “Are you sure?” WILL NOT HELP because it generates JavaScript that Google doesn’t execute.  So when you use link_to_seo_remote, DO NOT ALLOW destructive links to be placed in the href attribute.  Instead, override html_options[:href] to link to an intermediate page with “Are you sure?” and a BUTTON (not a link).  The crawler will not click the button, so the data will not be deleted.

See Using Rails AJAX Helpers to Create Safe State-Changing Links and search the page for “request.post?” for an explanation and some sample code.
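To make the idea concrete, here's a tiny plain-Ruby sketch of the "only delete on POST" rule. Everything in it (FakeRequest, destroy_post, the posts hash) is made up for illustration; a real controller would check request.post? and render an actual confirmation page.

```ruby
# Hypothetical sketch: only a POST may delete. A GET (which is what a crawler
# sends when it follows an href) just gets the confirmation page.
FakeRequest = Struct.new(:verb) do
  def post?
    verb == "POST"
  end
end

def destroy_post(request, posts, id)
  if request.post?
    posts.delete(id)
    :deleted
  else
    :confirmation_page
  end
end

posts = { 1 => "hello" }
p destroy_post(FakeRequest.new("GET"), posts, 1)   # crawler: nothing is deleted
p posts.key?(1)
p destroy_post(FakeRequest.new("POST"), posts, 1)  # button click: post is removed
p posts.key?(1)
```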

Does it Have Tests?

Why, yes. I’d like to thank the Rails Community for not tolerating code with no tests. It was soooo tempting just to release this without writing automated tests but the peer pressure got to me.

And I’d also like to thank Cake for awesome music.

To get the code

ruby script/plugin install http://github.com/BMorearty/link_to_remote_with_seo.git

My Favorite Quotes from the Yellowpages.com Ruby on Rails Talk

I just watched a video from the 2008 QCon conference of a talk by John Straw about how and why Yellowpages.com rewrote their Java site to use Ruby on Rails. It’s a pretty good talk. He starts by describing the situation they were in that led them to consider a rewrite, then goes into the architectural decisions and some of the technical details.

Here are some choice quotes from the talk, along with my own commentary.

“All programmers want to rewrite the code they’re forced to maintain. They’re almost always wrong.”

Man, is that ever true. (Note that he said almost always. After all, his talk is about a successful rewrite.)

I’ve seen it again and again. Programmers tend to believe the code they’re maintaining (that someone else wrote) sucks and they could write it much better. Often that’s because they haven’t taken the time to understand the code base. As Joel on Software says, “It’s harder to read code than to write it.” I think usually (but not always) the cost of rewriting it far outweighs any benefits. What you’d typically end up with after a rewrite is:

  • A few years have passed
  • You’ve spent a ton of money on the rewrite
  • The app still has bugs–just a different set of bugs. (Another quote from the Joel article: “The idea that new code is better than old is patently absurd.”)
  • A new generation of programmers will join the team soon. They will complain that the code base sucks and needs to be rewritten.

Having said that, I know there are times when a rewrite is the right thing to do. But that’s a discussion for another day.

Something I think his team did correctly: they made a goal of finishing the rewrite in four months, not two years. A massive two-year rewrite has an extremely low chance of succeeding.

“EJB3 is a whole big boxcar full of crazy.”

Now that’s just funny. (He said that after saying EJB3 is much better than earlier generations of EJB, by the way.)

“At this point our performance architect will maintain that Apache is unsuitable for use in any production web serving environment, in general. (And only nginx with its polling model is the right way to go.)”

I don’t agree but it’s a great quote.

“I actually kind of like the thread-unsafety of Rails. I mean it simplifies the programming model quite a bit for simple web sites. You know: I’m handling one request; I understand how to scale that.”

I totally agree with that. As someone who loves writing software, I think threading is fun and awesome and there are situations where it’s a must–I once even thought about writing a book about threading on Win32. When I was first introduced to Ruby on Rails I had a kneejerk “are you kidding me?” reaction when I heard it wasn’t thread-safe. But I’ve since formed the opinion that single-threading is really nice when you can get away with it because of its simplicity. It helps developers focus on the task at hand rather than spending a lot of time debugging threading problems. In a multi-threading environment it’s too easy for developers who understand threading to introduce code that then gets broken by other developers–and it’s too hard to write tests that will catch the breakage the moment it occurs.

By the way, the speaker’s next sentence was “Obviously our fast service-side application is multi-threaded and we have good benefits from that.” So he’s not saying multi-threading should never be used.

“Testing was a big part of the decision. You know, that was actually one of the things which drew me so strongly to the platform once I started understanding it. I had spent years myself as a Java developer trying to figure out how in the heck to use JUnit to do anything useful on my web site. And maybe that was just a failure of imagination on my part, but when we started looking at Rails we didn’t have to figure it out. It was obvious how to test each level. Both the unit tests for the models, and the functional tests and the integration tests. It was all there in the framework. And not only was the framework built to make it easy, but the community expected it. You know, I’ve never seen a development community that was so involved and oriented towards writing test code–writing test automation–than this one. And so that was a big part of our decision.”

So true. I have found that when it’s obvious how to write effective tests and where to put them, I will write tons of tests. If the framework greases the wheels of test-writing and makes it pain-free, I will write a lot more and better tests. Rails does a lot better at this than other frameworks I’ve used, although I still think it could use improvement. And I love the emphasis placed on automated testing in the Rails community.

Well, that’s it. To see the whole talk, go to http://www.infoq.com/presentations/straw-yellowpages. And enjoy the grouchy comments by Java developers below the video.

Put HTML tags and apostrophes in fixtures and tests or a meanie will hack you.

Here’s a good way to protect against cross-site scripting attacks and SQL injection attacks. This will help catch mistakes where you (well actually your teammate, since you’re perfect) forgot to call “h” in a <%= %> block, or accidentally passed a SQL statement to the database without escaping the values:

Sprinkle unclosed HTML tags and apostrophes all over your fixture data and test code.

Then use assert_select liberally, which will barf on the console if it sees unclosed HTML tags–even if you were selecting some other part of the document.

I Like Stuff that’s Safe

Here is what a posts.yml file might look like:

test_post:
  id: 1
  subject: <script> attack!
  detail: "sql injection: '; drop table posts;"

(If you use an apostrophe in YAML you have to quote the whole string.)

So assert_select has this handy side-effect I mentioned where it tells you about your malformed HTML. Since Rails tests don’t actually run in a browser, you need some other way to know that you’ve forgotten to escape data. Unclosed HTML tags in your fixtures, yeah, that’s the ticket.

And remember, you don’t need to call assert_select on the element that contains the bad data. Just call assert_select on anything and it will parse the output to make sure it’s well-formed.

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "body"
  end

The idea is that by sprinkling XSS attacks through your fixtures and using assert_select whenever you’re testing other stuff, the XSS attacks will become apparent.

If you do need to assert that the output is correct, you can call CGI::escapeHTML:

  def test_show
    post = posts(:test_post)
    get :show, :id => post.id
    assert_select "span", :count => 1,
      :text => CGI::escapeHTML(post.detail)
  end
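If you're curious what the escaped fixture values look like, here's a quick standalone check you can run outside Rails. It just feeds the two fixture strings from the sample posts.yml through CGI.escapeHTML:

```ruby
require "cgi"

subject = "<script> attack!"
detail  = "sql injection: '; drop table posts;"

escaped_subject = CGI.escapeHTML(subject)
escaped_detail  = CGI.escapeHTML(detail)

puts escaped_subject  # the < and > become &lt; and &gt;
puts escaped_detail   # the apostrophe becomes &#39;
```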

I can’t haz SQL injection attacks

I admit that putting SQL injection attacks in the fixtures is a bit contrived and may not help. A better way to catch SQL injection attacks is to pass apostrophes into the app from your test code, so go ahead and sprinkle your test code with beauties like this:

  def test_update
    post :update, :id => posts(:test_post).id,
      :detail => "sql injection: '; drop table posts;"
  end

The secret to making this work is:

  1. apostrophe
  2. semicolon
  3. SQL statement
  4. another semicolon

You want to use a SQL statement that will cause a test to fail. It would be coolio if there were some way to make the current test succeed and subsequent tests fail, but I’m not sure I know a way to do that consistently. But at least if you use a “drop table” statement, you’re going to cause subsequent tests to fail (if there are any subsequent tests that use that table) because a schema change does not happen in a transaction. So even if you’re using transactional fixtures, the next test will fail anyway cuz the dang table is gone.
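Here's a little database-free illustration of why that recipe works. The table and column names come from the fixture above. The naive_quote method is a made-up stand-in for the quoting a database adapter does for you, and in a real Rails app you'd rely on ActiveRecord's placeholder conditions rather than anything hand-rolled like this.

```ruby
# Illustration only: shows why the apostrophe matters when a value is
# interpolated straight into SQL. No database is involved.
detail = "sql injection: '; drop table posts;"

# Unsafe: the apostrophe in the value ends the string literal early,
# so the rest is read as a second SQL statement.
unsafe_sql = "UPDATE posts SET detail = '#{detail}' WHERE id = 1"

# Naive stand-in for adapter quoting: double each apostrophe so it
# stays inside the string literal.
def naive_quote(value)
  "'" + value.gsub("'", "''") + "'"
end

safe_sql = "UPDATE posts SET detail = #{naive_quote(detail)} WHERE id = 1"

puts unsafe_sql
puts safe_sql
```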

Fun with Ruby’s instance_eval and class_eval

In an attempt to better understand instance_eval and class_eval, I just read Khaled’s post on Ruby reflection. It helped, and I came up with a memory crutch I can use to remember when to use each of them:

Use ClassName.instance_eval to define class methods.

Use ClassName.class_eval to define instance methods.

That’s right. Not a typo. Here are some examples, shamelessly stolen from his post:

# Defining a class method with instance_eval
Fixnum.instance_eval { def ten; 10; end }
Fixnum.ten #=> 10

# Defining an instance method with class_eval
Fixnum.class_eval { def number; self; end }
7.number #=> 7

I Like Stuff that’s Backwards

Why is it the reverse of what you might expect? Because Fixnum.instance_eval treats Fixnum as an instance (an instance of the Class class), thus any new functions you define can be called on that instance. So it’s equivalent to this:

class Fixnum
  def self.ten
    10
  end
end
Fixnum.ten #=> 10

Fixnum.class_eval treats Fixnum as a class and executes the code in the context of that class, thus any “def” statements are treated exactly as if they were in normal code without any reflection. It’s equivalent to this:

class Fixnum
  def number
    self
  end
end
7.number #=> 7
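Here's the same experiment on a throwaway class, which also runs on newer Rubies where Fixnum has been folded into Integer. Widget is a made-up name used only for this demonstration.

```ruby
# Widget is a made-up class used only for this demonstration.
class Widget; end

# instance_eval: self is the Widget class object itself, so def creates
# a singleton method on it, i.e. a class method.
Widget.instance_eval do
  def ten
    10
  end
end

# class_eval: the block runs as if inside "class Widget ... end",
# so def creates an ordinary instance method.
Widget.class_eval do
  def number
    42
  end
end

Widget.ten                    #=> 10
Widget.new.number             #=> 42
Widget.respond_to?(:number)   #=> false (not a class method)
Widget.new.respond_to?(:ten)  #=> false (not an instance method)
```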

There are still some things about Ruby reflection that mystify me but at least I think I’ve got this one nailed.

Generate guid ids 2100x faster for ActiveRecord models (but only if you use MySQL)

The Rails project I’m working on (the Small Business Help Forums at the Intuit Community) has some tables that use GUIDs for their primary keys instead of autoincrement integers. To implement GUIDs we used the handy usesguid plugin. All you have to do is change your “id” column to a 22-character varchar (make sure it’s a binary varchar and uses binary collation, so upper and lower case are treated differently) and put this in your model:

class MyModel < ActiveRecord::Base
  usesguid
end

Pretty nice.

Just one problem.

It’s HECKA slow.

On my Windows machine it was taking a whopping 0.4 seconds to create a GUID with this plugin. On my Linux VM it was a lot faster, but still slower than it should be (0.0322 seconds–just 31 GUIDs per second).

Download the Faster Plugin

If you use MySQL for your database and you’d like to download my modified usesguid plugin which is way faster, type this from the main directory of your Rails app:

 script/plugin install git://github.com/BMorearty/usesguid.git

Or download it here and copy it into vendor/plugins/usesguid.

Then add the “usesguid” statement (see above) to any models that you want to have guid ids, migrate the id columns to binary varchar(22), and add this to your environment.rb file:

ActiveRecord::Base.guid_generator = :mysql

Here is a sample migration for creating a new table with guids, as opposed to changing an existing one to use them:

create_table :products, :id => false, :options => 'ENGINE=InnoDB' do |t|
  # This table uses guid ids
  t.binary :id,   :limit => 22, :null => false
  t.string :name, :limit => 50, :null => false
end
# Since the t.column syntax can't specify a character set and collation...
execute "ALTER TABLE `products` MODIFY COLUMN `id` VARCHAR(22) BINARY CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;"
execute "ALTER TABLE `products` ADD PRIMARY KEY (id)"

I Like Stuff that’s Fast

Read on to find out why the old code was so slow, and how the code got 2100 times faster.

I investigated to see why it takes so long, and found that every time it creates a GUID, it calls UUID.timestamp_create. This in turn calls UUID.get_mac_address, which spawns a new process (ipconfig on Windows; ifconfig on UNIX-based systems) and parses the output. The reason: to discover the network card’s MAC address. (Hey yeah, even Windows has a MAC address.)

But the MAC address never changes. It’s hard-wired into the network card. So why bother querying it every time you create a GUID? Launching a whole new process every time we need a GUID is overkill.

My first thought was to write a plugin on top of the plugin. My plugin would cache the result of UUID.get_mac_address. I tried it, but found a problem: there’s a bug in UUID.timestamp_create. If it executes too quickly on a system whose clock resolution is not high enough, it returns the same GUID multiple times in a row. Whoops! Kind of defeats the purpose of GUIDs.

So I decided to take advantage of the fact that MySQL has a “SELECT UUID()” syntax, and I wrote a new GUID creator in the UUID class that calls MySQL to generate GUIDs. (Obviously this only works if you have MySQL.) I called this new creator “UUID.mysql_create.” The first time it is called, it calls MySQL like this:

SELECT UUID(), UUID(), UUID(), UUID(), UUID(), ... ;

It selects 50 UUIDs in a single round-trip to the database and stores the results in memory. Each time a new GUID is required, it plucks one off the list. When the list is empty and another one is required, it goes and gets another 50.
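The batching idea looks roughly like this in plain Ruby. This is a sketch, not the code in uuid_mysql.rb: the class name BatchedGuidSource is made up, SecureRandom.uuid stands in for the MySQL round-trip, and nothing here is thread-safe.

```ruby
require "securerandom"

# Hypothetical sketch of the batching idea. SecureRandom.uuid stands in for
# the "SELECT UUID(), UUID(), ..." round-trip; this is not the plugin's code.
class BatchedGuidSource
  BATCH_SIZE = 50

  attr_reader :round_trips

  def initialize(&fetch_batch)
    # fetch_batch must return an array of fresh GUID strings.
    @fetch_batch = fetch_batch || lambda { Array.new(BATCH_SIZE) { SecureRandom.uuid } }
    @pool = []
    @round_trips = 0
  end

  # Hand out one GUID, refilling the in-memory pool only when it runs dry.
  def next_guid
    if @pool.empty?
      @pool = @fetch_batch.call
      @round_trips += 1
    end
    @pool.shift
  end
end

source = BatchedGuidSource.new
guids = Array.new(120) { source.next_guid }
puts source.round_trips   # 120 GUIDs need 3 batches of 50
```

Passing a block to BatchedGuidSource.new swaps in a different fetcher, which is handy for tests that shouldn't touch a database.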

On my Windows machine, creating a GUID with UUID.mysql_create now takes 0.0001937 seconds, which is over 2100 times faster than the 0.4 seconds it used to take. On my Linux VM it’s 0.0001671 seconds, or 193 times faster than the 0.0322 seconds it used to take.

All these changes were made in a new file, uuid_mysql.rb. But I also made a number of changes to the usesguid.rb file:

  1. Added a configuration option so you can specify which creator to use. The default is still timestamp_create, but to use mysql_create you just put “ActiveRecord::Base.guid_generator = :mysql” in your environment.rb file.
  2. Fixed the code so it respects the :column option, which lets you override the column that stores the primary key.
  3. Delayed the assignment of a guid until just before creation (before_create) rather than just after “new” (after_initialize). This has two benefits:
    1. It more closely mimics the default behavior of autoincrement columns, which doesn’t assign an id until after creation.
    2. It is faster. After_initialize gets called every time a model object is instantiated, including all objects returned by a call to find. (But don’t worry, it wasn’t generating GUIDs for all those objects; it was just being called and bailing out when it saw there was already an id).  Before_create only gets called for newly created model objects.

I thought about making it even faster by calling CoCreateGuid() on Windows and calling a UNIX C function to create a GUID when on UNIX, but it’s so fast now that it hardly seemed worth the extra effort and the extra platform-specific code.

So that’s it. Enjoy it!

Find tests more easily in your Rails test.log

Here’s a nice little trick to make it easier to search test.log for the results of a specific test that’s failing. This trick works with normal Rails unit tests and with Shoulda tests.

When a Rails test fails, I look for it in test.log to see if there are any clues there. But it’s pretty hard to find the portion of the log associated with the test that failed. In this sample section of a log, where does the processing begin for the test called test_should_require_email_on_signup?

[Screenshot: test.log without titles. Caption: Which test is which? Where does my test start?]

It’s hard to find. Now imagine running rake on all your tests and sifting through the whole test.log looking for one test whose name you know, but the test name isn’t in the log.

So the other day I wrote a bit of code in my test_helper.rb file to make the log a lot easier to sift through. Here’s what the above log looks like with this code in place:

[Screenshot: test.log with titles. Caption: Ooh, nice-n-clear]

Ahh, that’s more like it. Now it’s easy to tell where test_should_require_email_on_signup begins. If you scroll up and look at the first log again, you’ll see that there isn’t even a blank line separating that test from the previous one. (See how the test starts on the SELECT count(*) statement?)

Here’s the code. Drop it into test_helper.rb for your Rails project. To me this seems like a nice little example of how Ruby’s open classes can benefit developers (while understandably considered harmful by some). In a language without monkey patching, I would have to resort to something more painful like changing all my tests to be derived from my own subclass of TestCase, and put this code in that class.

Enjoy!

class ActiveSupport::TestCase
  # This extension prints to the log before each test.  Makes it easier to find the test you're looking for
  # when looking through a long test log.
  setup :log_test

  private

  def log_test
    if Rails::logger
      # When I run tests in rake or autotest I see the same log message multiple times per test for some reason.
      # This guard prevents that.
      unless @already_logged_this_test
        Rails::logger.info "\n\nStarting #{@method_name}\n#{'-' * (9 + @method_name.length)}\n"
      end
      @already_logged_this_test = true
    end
  end
end
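If you want to see the banner format without firing up Rails, here's a standalone sketch using Ruby's stock Logger. The test_banner method is made up for this demo. It underlines the whole title, which is where the 9 in the code above comes from ("Starting " is nine characters long).

```ruby
require "logger"
require "stringio"

# Standalone sketch of the banner format; test_banner is a made-up helper.
def test_banner(method_name)
  title = "Starting #{method_name}"
  # Two blank lines, the title, then a row of dashes as wide as the title.
  "\n\n#{title}\n#{'-' * title.length}\n"
end

log = StringIO.new
logger = Logger.new(log)
logger.info test_banner("test_should_require_email_on_signup")
puts log.string
```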

P.S. I didn’t spend the time to figure out why my callback was being called multiple times for each test. I just inserted the guard you see in the code above to prevent the same test title from being shown multiple times.

Drop me a line in the Reply section below and let me know what you think–especially if you’ve figured out why each one is called multiple times when running from rake or autotest.

Updated 6/18: I corrected the code above because WordPress automatically inserted a “mailto:” tag when it saw the @ sign.

Updated 2/26/10: I changed Test::Unit::TestCase to ActiveSupport::TestCase.

How to Show Response Time in a Rails Page with Mongrel

You’ve seen this on Google result pages, right?

You wanna do that in your Rails app that runs with Mongrel? I show you how. Sit down. And along the way I’ll show what I learned about writing custom Mongrel HttpHandlers and why you shouldn’t store instance variables in them.

I remember seeing it there long ago when Google was new. I liked it because:

  1. Faster is better
  2. It shows Google focuses on helping me go fast
  3. It reminds the Google developers to focus on helping me go fast

So I’m developing this awesome web site now and I’m using Ruby on Rails with Mongrel. I wanted to pull a Google and show the server response time as text content in my pages, as a reminder to myself and my co-developer that it’s super-important to keep things fast.

I can haz question

How do you put response time in a page using Rails?


What is your Zombie Escape Plan?

Ok, so I was talking to my cool niece last month and she told me something that just cracked me up.

Are you ready?

Here goes:

Every teenage boy has a Zombie Escape Plan.

That’s what my niece told me. She was serious. And she thought it was just as weird as I do. (She doesn’t have one.)

Here’s how she found out about it. One day she was listening as two male friends of hers were comparing zombie escape plans. This was new to her. “Does every guy have a zombie escape plan?” she asked them.

“Well duh,” they both said, dead serious.

So she ran an unscientific survey of her teenage male friends to find out if it was true. And guess what.

It was.

Every boy she asked said yes, naturally he has a zombie escape plan.

By now I was busting up. I told her well, at least I have a fire escape ladder in my 2-story house. I could use that as my zombie escape plan too. Her dad (my brother-in-law) said no: that’s lame. A fire escape plan does not serve as a zombie escape plan. As evidence he pointed to his own zombie escape plan: he will dance a jig. Because everyone knows a zombie cannot resist dancing a jig if he sees someone else doing it. But it doesn’t make a very good fire escape plan.

My niece said her dad was right. One boy in her survey said his zombie escape plan involved climbing up on the roof, which is usually not a good idea in a fire. At this point her younger brother, who’s also a teenager, piped up and said that after all, his zombie escape plan is to use a flame thrower. And that never makes a very good fire escape plan.

When I got home I googled it and found that there is even a web site dedicated just to this (side note: what did we ever do before the web?): http://www.zombieescapeplan.com. Except it’s made by a girl. So I guess at least that’s good because when the zombies attack some of the girls will be prepared.

I need a good zombie escape plan. What’s yours? Please comment below so I can get some good ideas.

How to write case (switch) statements in Ruby

If you’re like me, when you started coding in Ruby last year you found the “case” statement intriguing. After years of writing in C++ and C# it was hard for you to remember Ruby’s case syntax because it can do so much more than switch statements in those languages.

So you wrote these notes to yourself as you discovered its capabilities. Except you’re not that much like me so you didn’t. But I did. I hope you find them useful.

switch/case syntaxes
(remember: Ruby uses "case" and "when"
where others use "switch" and "case"):

# Basically if/elsif/else (notice there's nothing
# after the word "case"):
[variable = ] case
when bool_condition
  statements
when bool_condition
  statements
else # the else clause is optional
  statements
end
# If you assigned 'variable =' before the case,
# the variable now has the value of the
# last-executed statement--or nil if there was
# no match.  variable=if/elsif/else does this too.

# It's common for the "else" to be a 1-line
# statement even when the cases are multi-line:
[variable = ] case
when bool_condition
  statements
when bool_condition
  statements
else statement
end

# Case on an expression:
[variable = ] case expression
when nil
  statements execute if the expr was nil
when Type1 [ , Type2 ] # e.g. Symbol, String
  statements execute if the expr
  resulted in Type1 or Type2 etc.
when value1 [ , value2 ]
  statements execute if the expr
  equals value1 or value2 etc.
when /regexp1/ [ , /regexp2/ ]
  statements execute if the expr
  matches regexp1 or regexp2 etc.
when min1..max1 [ , min2..max2 ]
  statements execute if the expr is in the range
  from min1 to max1 or min2 to max2 etc.
  (use 3 dots min...max to go up to max-1)
else
  statements
end

# When using case on an expression you can mix &
# match different types of expressions. E.g.,
[variable =] case expression
when nil, /regexp/, Type
  statements execute when the expression
  is nil or matches the regexp or results in Type
when min..max, /regexp2/
  statements execute when the expression is
  in the range from min to max or matches regexp2
end

# You can combine matches into an array and
# precede it with an asterisk. This is useful when
# the matches are defined at runtime, not when
# writing the code. The array can contain a
# combination of match expressions
# (strings, nil, regexp, ranges, etc.)
[variable =] case expression
when *array_1
  statements execute when the expression matches one
  of the elements of array_1
when *array_2
  statements execute when the expression matches one
  of the elements of array_2
end

# Compact syntax with 'then':
[variable =] case expression
when something then statement
when something then statement
else statement
end

# Compact syntax with semicolons:
[variable =] case expression
when something; statement
when something; statement
else statement # no semicolon required
end

# 1-line syntax:
[variable = ] case expr when {Type|value}
  statements
end

# Formatting: it's common to indent the "when"
# clauses and it's also common not to.
case
  when
  when
  else
end

# More idiomatic:
case
when
when
else
end
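And here's a small runnable example that exercises several of the forms from these notes at once: nil, a splatted array, a range, a regexp, types, the compact "then" syntax, and a case with no expression. The method names and categories are made up for the demo.

```ruby
# describe, size_label and SMALL_PRIMES are made up for this demonstration.
SMALL_PRIMES = [2, 3, 5, 7]

# Case on an expression, mixing different kinds of matches.
def describe(value)
  case value
  when nil            then "nothing"
  when *SMALL_PRIMES  then "small prime"
  when 0..9           then "single digit"
  when /\Atest_/      then "test name"
  when Symbol, String then "text"
  else "something else"
  end
end

# Case with nothing after the word "case": basically if/elsif/else.
def size_label(n)
  case
  when n < 10  then "small"
  when n < 100 then "medium"
  else "large"
  end
end

describe(nil)         #=> "nothing"
describe(7)           #=> "small prime"
describe(4)           #=> "single digit"
describe(:test_show)  #=> "test name"
describe("hello")     #=> "text"
describe(10)          #=> "something else"
size_label(50)        #=> "medium"
```

The order of the "when" clauses matters: 7 is also in 0..9, but the first clause that matches wins.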