Generate guid ids 2100x faster for ActiveRecord models (but only if you use MySQL)

The Rails project I’m working on (the Small Business Help Forums at the Intuit Community) has some tables that use GUIDs for their primary keys instead of autoincrement integers. To implement GUIDs we used the handy usesguid plugin. All you have to do is change your “id” column to a 22-character varchar (make sure it’s a binary varchar and uses binary collation, so upper and lower case are treated differently) and put this in your model:

class MyModel < ActiveRecord::Base
  usesguid
end

Pretty nice.

Just one problem.

It’s HECKA slow.

On my Windows machine it was taking a whopping 0.4 seconds to create a GUID with this plugin. On my Linux VM it was a lot faster, but still slower than it should be (0.0322 seconds–just 31 GUIDs per second).

Download the Faster Plugin

If you use MySQL for your database and you’d like to download my modified usesguid plugin which is way faster, type this from the main directory of your Rails app:

 script/plugin install git://github.com/BMorearty/usesguid.git

Or download it here and copy it into vendor/plugins/usesguid.

Then add the “usesguid” statement (see above) to any models that you want to have guid ids, migrate the id columns to binary varchar(22), and add this to your environment.rb file:

ActiveRecord::Base.guid_generator = :mysql

Here is a sample migration for creating a new table with guids, as opposed to changing an existing one to use them:

create_table :products, :id => false, :options => 'ENGINE=InnoDB' do |t|
  # This table uses guid ids
  t.binary :id,   :limit => 22, :null => false
  t.string :name, :limit => 50, :null => false
end
# Since the t.column syntax can't specify a character set and collation...
execute "ALTER TABLE `products` MODIFY COLUMN `id` VARCHAR(22) BINARY CHARACTER SET latin1 COLLATE latin1_bin NOT NULL;"
execute "ALTER TABLE `products` ADD PRIMARY KEY (id)"

I Like Stuff that’s Fast

Read on to find out why the old code was so slow, and how the code got 2100 times faster.

I investigated to see why it takes so long, and found that every time it creates a GUID, it calls UUID.timestamp_create. This in turn calls UUID.get_mac_address, which spawns a new process (ipconfig on Windows; ifconfig on UNIX-based systems) and parses the output. The reason: to discover the network card’s MAC address. (Hey yeah, even Windows has a MAC address.)

But the MAC address never changes. It’s hard-wired into the network card. So why bother querying it every time you create a GUID? Launching a whole new process every time we need a GUID is overkill.

My first thought was to write a plugin on top of the plugin. My plugin would cache result of UUID.get_mac_address. I tried it, but found a problem: there’s a bug in UUID.timestamp_create. If it executes too quickly on a system whose clock resolution is not high enough, it returns the same GUID multiple times in a row. Whoops! Kind of defeats the purpose of GUIDs.

So I decided to take advantage of the fact that MySQL has a “SELECT UUID()” syntax, and I wrote a new GUID creator in the UUID class that calls MySQL to generate GUIDs. (Obviously this only works if you have MySQL.) I called this new creator “UUID.mysql_create.” The first time it is called, it calls MySQL like this:

SELECT UUID(), UUID(), UUID(), UUID(), UUID(), ... ;

It selects 50 UUIDs in a single round-trip to the database and stores the results in memory. Each time a new GUID is required, it plucks one off the list. When the list is empty and another one is required, it goes and gets another 50.

On my Windows machine, creating a GUID with UUID.mysql_create now takes 0.0001937 seconds, which is over 2100 times faster than the 0.4 seconds it used to take. On my Linux VM it’s 0.0001671 seconds, or 193 times faster than the 0.0322 seconds it used to take.

All these changes were made in a new file, uuid_mysql.rb. But I also made a number of changes to the usesguid.rb file:

  1. Added a configuration option so you can specify which creator to use. The default is still timestamp_create, but to use mysql_create you just put “ActiveRecord::Base.guid_generator = :mysql” in your environment.rb file.
  2. Fixed the code so it respects the :column option, which lets you override the column that stores the primary key.
  3. Delayed the assignment of a guid until just before creation (before_create) rather than just after “new” (after_initialize). This has two benefits:
    1. It more closely mimics the default behavior of autoincrement columns, which doesn’t assign an id until after creation
    2. It is faster. After_initialize gets called every time a model object is instantiated, including all objects return by a call to find. (But don’t worry, it wasn’t generating GUIDs for all those objects; it was just being called and bailing out when it saw there was already an id).  Before_create only gets called for newly created model objects.

I thought about making it even faster by calling CoCreateGuid() on Windows and calling a UNIX C function to create a GUID when on UNIX, but it’s so fast now that it hardly seemed worth the extra effort and the extra platform-specific code.

So that’s it. Enjoy it!

One thought on “Generate guid ids 2100x faster for ActiveRecord models (but only if you use MySQL)”

Leave a Reply

Your email address will not be published. Required fields are marked *

Feel free to use <a>, <b>, <i>, <strong>, <em>, <strike>, <code>.

Code blocks:
[code language="ruby/javascript/html/css/sass/bash"]
[/code]