Ruby and Ruby on Rails tricks and tips from the edge

The 7 main actions we took to improve the Rails stack performance at Justin.tv

Here are the slides of the talk I gave yesterday at the San Francisco Rails meetup group about our work on improving Rails performance at Justin.tv.

Enjoy!


Rails3 ActiveSupport Notification subscription – Rails3 Tricks #03

Hello there! It has been almost 10 months without a post, due to a very busy period: I moved from Paris to San Francisco.
OK, now that this has been said, things will hopefully change! I plan to write a new series of Rails3 posts, starting with a quick one on the new way Rails handles notifications.

The new version of ActiveSupport shipped with Rails3 brings a new notification system, which Rails3 uses heavily internally. Rails no longer writes directly to the logs; instead, it publishes a notification that any subscriber can catch.

In production, Rails will by default publish deprecation warnings through this notification system. Last week I was looking for a way to play with that and didn’t find a clear example on the web, so here is a small snippet of code if you are also looking for a nice way to log your deprecation warnings:


# In config/initializers/deprecations_logger.rb
DeprecationLogger = Logger.new(Rails.root.join('log/deprecations.log'))

# The block receives: event name, start time, end time, unique id, payload
ActiveSupport::Notifications.subscribe(/deprecation/) do |name, started, finished, id, payload|
  DeprecationLogger.info("#{started} - #{payload[:message]}")
end

Avoid memory leaks in your Ruby/Rails code and protect yourself against denial of service

We often hear that Ruby is cool because we do not have to care about memory: the garbage collector does it for us. Well, that’s kind of true, but it does not mean we can write code without keeping in mind what is going on underneath our Ruby code.

Ruby symbol memory leak

We all know that using symbols instead of strings is good practice: it is faster and it saves memory. Yes, but at what price? Symbols are faster in part because they are created only once in memory, which is great. But then they stay in memory forever: on the Ruby versions used here (1.8.7 and 1.9.2), symbols are never garbage collected.

That means: do not convert everything to symbols! Be sure you know exactly what you are converting.

Example: somewhere in your app, you call to_sym on a user’s name, like this:

hash[current_user.name.to_sym] = something

With a few hundred users that may be OK, but what happens if you have one million users? Here are the numbers:

kwi:~$ irb
ruby-1.9.2-head >
# Current memory usage : 6608K
# Now, add one million randomly generated short symbols
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s).to_sym }

# Current memory usage : 153M, even after a garbage collector run.
# Surprisingly, on Ruby 1.8.7-p249,
# the VM only grows to 33M, but that's still a lot!

# Now, what if the symbols are 20x longer than that?
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s * 20).to_sym }
# Current memory usage : 501M

Furthermore, NEVER convert uncontrolled arguments to symbols without checking them first: this can easily lead to a denial of service.

Example: you have a website with a locale parameter used to localize your content, and you have something like this in your application controller:

before_filter :set_locale

def set_locale
  I18n.locale = params[:locale].to_sym
end

It’s really simple to call your website thousands of times with a long params[:locale] and make your application bloat!

By the way, it looks like the I18n gem automatically converts the locale to a symbol, so be sure to check that the locale is valid before assigning it!
Here is the link: http://github.com/svenfuchs/i18n/blob/master/lib/i18n/config.rb#LID8
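
One way to do that check, sketched here under assumptions: compare the raw parameter against a list of allowed strings first, and only convert a value that is already known. The safe_locale helper, its allowed list and its default are hypothetical names for this example, not part of Rails or the I18n gem.

```ruby
# Hypothetical helper: returns a locale symbol only for allowed values,
# otherwise the default. The comparison is done on strings, so unknown
# input is never turned into a symbol.
def safe_locale(param, allowed = %w[en fr es], default = :en)
  candidate = param.to_s
  allowed.include?(candidate) ? candidate.to_sym : default
end

safe_locale('fr')          # => :fr
safe_locale('x' * 10_000)  # => :en
safe_locale(nil)           # => :en
```

In the controller above, set_locale would then read I18n.locale = safe_locale(params[:locale]).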

If you need to keep an eye on the number of symbols allocated in your app, you can use Symbol.all_symbols.size. Add this to your logs to see whether you are leaking symbols over time! (This could be a good metric to add to New Relic; New Relic guys, are you reading? :)
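
A minimal sketch of such a log line, assuming you call it periodically (for example once per request or from a background job). The symbol_count_line name and the wording of the message are arbitrary choices for this example.

```ruby
# Builds a log line with the number of symbols currently known to the VM.
def symbol_count_line
  "symbols allocated: #{Symbol.all_symbols.size}"
end

puts symbol_count_line
```

In a Rails app you could pass the returned string to Rails.logger.info and watch whether the number keeps climbing between deploys.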

Reference to objects leak

This leak is a fake one, but it can grow rapidly in your app.
It happens when you keep a variable in your code that references objects, and these objects in turn reference other objects, and so on.

This often happens when using a $variable or an @@variable, as they stay in memory forever.
Here is a little example:

# Memory usage at irb launch: 6320K

class HelloIamLeaking
  @@an_array = []

  def initialize()
    # Put something big in the array
    @@an_array << "hello world" * (4**10)
  end
end

x = HelloIamLeaking.new
x = nil # So no more HelloIamLeaking instance in our code
GC.start # Run the garbage collector to be sure this is real !
# Memory usage after : 17M

ruby-1.9.2-head > ObjectSpace.each_object(HelloIamLeaking) {|x| p x }
 => 0
# So we have no more instance of HelloIamLeaking
# but the class variable remains in memory.

OK, this is a deliberately simple and dumb example, but it shows the principle.

And this can grow very quickly if you have objects linking to huge arrays or datasets: they will never be garbage collected as long as a single object in your code still references the source object.

This consumes not only your memory but also your CPU time: when the garbage collector runs, it looks at every single object, and the more objects you have, the more time it spends looking at them.

To sum up reference leaks: over time, they grow in memory and dramatically slow down the garbage collector.

If you want to read more about reference leaks, read the awesome post “Descent into darkness” on Joe Damato’s blog.

Update: find these leaks easily using the memprof gem and memprof.com (awesome stuff again by Joe Damato).

My app is still bloating !

After that, if you still have Ruby/Rails processes bloating, be sure to use the latest versions of any gems that contain C code: they can be an easy source of memory leaks.

And, this is obvious, but be sure not to load a huge dataset into memory all at once! (Use find_in_batches instead, for example.)

Then, if you want more control over memory allocation, here is a good link for tuning the heap easily and controlling your Ruby process growth.



Thanks for taking the time to read, and I hope this article helps you reduce your memory consumption!


I18n_routing: Translate your Rails2/3 routes with ease

Nowadays, more and more of our Rails applications have to be localized in order to support several languages and reach many more customers. Since Rails 2.2 we can do this easily through the awesome i18n gem.

That’s great, but with Rails we are stuck with basic URLs in just one main language, and that really hurts in terms of SEO.

So why not translate your Rails routes too? You will gain more customers through SEO, and by the way you will make your clients happy!

Yesterday I released a new version of i18n_routing. It’s now fully compatible with both Rails2 and Rails3.

This gem gives you the ability to easily translate all your routes through the shipped i18n gem:
There is nothing to change in your code: just add a few translations to your locale files!

Usage and examples

First of all, install the i18n_routing gem and add it to your Rails project. (Help is available in the wiki if needed.)
Then, in your routes.rb file, just set which routes you want to translate with the localized method:


Rails2 examples :

map.localized(I18n.available_locales, :verbose => true) do
  map.about 'about', :controller => 'contents', :action => 'about'

  map.resources :users
  map.resource  :contact
end

Rails3 examples :

localized(I18n.available_locales, :verbose => true) do
  match 'about' => 'contents#about', :as  => :about

  resources :users
  resource  :contact
end

(I18n.available_locales is an array of all the available locales, provided by the i18n gem (version > 0.3.5). You can pass a custom array instead if you prefer, like this: ['en', 'fr', 'es'])

So, now that our routes have been declared, just edit your locale files with the translations:

fr:
  resource:
    contact: 'contactez-nous'
  named_routes_path:
    about: 'a-propos'
  routes:
    users:
      as: utilisateurs
      path_names:
        new: 'nouvel_utilisateur'
        edit: 'edition_utilisateur'

(This works with any i18n backend, just choose your favorite one and go ahead.)


Now that it’s done, just check the result in your Rails console:

$ rails console
ruby-1.8.7-p249 > I18n.locale = :en
 => :en
ruby-1.8.7-p249 > app.users_path
 => "/users"
ruby-1.8.7-p249 > I18n.locale = :fr
 => :fr
ruby-1.8.7-p249 > app.users_path
 => "/utilisateurs"
ruby-1.8.7-p249 > app.contact_path
 => "/contactez-nous"


Here we are! You can now translate all your routes without any modification to your existing code!

All your routes are automatically generated and recognized in the correct language, depending on the current I18n.locale.

Key points and features overview

  • No translation is done at runtime: everything is precompiled when the routes are built at startup.
  • Works with the Rails 2.x series (> 2.2) and with Rails 3
  • Built on top of the I18n API => translating your routes has never been this simple
  • Works from simple resource(s) to deeply nested resource(s)
  • Can translate path names like new/edit and custom ones

If you want to know more about i18n_routing, go to the i18n_routing GitHub page and have a look at the wiki.


Introducing BrB, extremely fast interface for doing distributed ruby

BrB is a simple, transparent and extremely fast interface for doing distributed Ruby easily.
It is inspired by the original Ruby DRb library (Distributed Ruby), but it is built on top of EventMachine for performance.

The concept

BrB uses a simple concept: create an object instance and expose it to the world.
Any other Ruby process will be able to call methods on that object after creating a communication tunnel.

  • It’s as simple as a method call
  • It’s efficient: by default BrB does simple message passing (no return value)
  • You can pass any object that is dumpable through Marshal over the network

Example 1 – Simple communication

Start communicating between your different Ruby processes in two easy steps :

Start accepting connections :

class ExposedCoreObject
  def simple_api_method(parameter)
    puts " > Receive #{parameter} in the main ruby process"
  end
end

EM::run do # Start event machine
  # Start BrB Service, expose an instance of core object to the outside world
  BrB::Service.instance.start_service(:object => ExposedCoreObject.new, :host => 'localhost', :port => 5555)
end

In any other ruby process, start communicating :

# Create a communication tunnel to the core process
# nil as first parameter as we do not expose any object in exchange
core = BrB::Tunnel.create(nil, "brb://localhost:5555")

core.simple_api_method('a message')
# Results :
# On core process :  "> Receive a message in the main ruby process"

At this point, the client calls simple_api_method on our core process.
All the Ruby magic happens again, and the number of processes communicating this way is unlimited!

Example 2 – Two-way communication

Our previous example was great, but clients can receive method calls too.

Core code :

EM::run do # Start event machine
  # Start BrB Service, expose an instance of core object to the outside world
  BrB::Service.instance.start_service(:object => ExposedCoreObject.new, :host => 'localhost', :port => 5555)  do |type, tunnel|
    # Get alerted that a new connection has been made :
    if type == :register
      tunnel.say_hi_in_return('I am the core saying Hi')
    end
  end
end

Client code :

class ExposedClientObject
  def say_hi_in_return(message)
    puts " > Core says : #{message}"
  end
end

# This time, we are exposing an object.
core = BrB::Tunnel.create(ExposedClientObject.new, "brb://localhost:5555")
core.simple_api_method('a message')
# Results :
# On client process :  "> Core says : I am the core saying Hi"
# On core process :  "> Receive a message in the main ruby process"

That’s it: both our processes are now communicating with each other, and it’s completely transparent since it works just like normal Ruby method calls.

Example 3 – Waiting for a return value

By default, calling a method on a remote object is non-blocking: it does not wait for any return value. But sometimes it’s useful to get a return value; to do this, just add _block at the end of the method name.

core = BrB::Tunnel.create(nil, "brb://localhost:5555")
ret = core.simple_api_method_block('a message') # Wait for the return

What is BrB useful for?

  • Doing simple message passing between Ruby processes.
  • Connecting hundreds of Ruby processes transparently.
  • Building a real-time scalable (game) server.
  • Handling heavy load on a server easily, just by distributing the load across multiple BrB instances.
  • Taking advantage of multi-core and multi-threaded systems.

If you want to know more about BrB, go to the BrB github.


Rails3 and will_paginate, Doing easy remote links – Rails3 Tricks #02

As you know, Rails3 uses only UJS (unobtrusive JavaScript), so for every remote link Rails3 just adds the data-remote attribute to the link:

<a href="ajax_page.html" data-remote="true">A remote link !</a>

If you want Ajax pagination, there is no easy way to make remote links with will_paginate (or I did not find one), but with Rails3 UJS there is a little trick to do this easily!

Here is the Haml and jQuery code:

= will_paginate(@users)
:javascript
  $('.pagination a').attr('data-remote', 'true');

This snippet just adds the data-remote attribute to the pagination links. And that’s it! Our pagination is now ajaxed :)


Using Rspec with multiple versions of Rails – Rspec Tricks #01

With the upcoming release of Rails3, it’s important to keep gems and plugins compatible with both the Rails 2.x series and the new Rails 3.x series.

That’s why I wanted to implement specs supporting both versions for one of my plugins.

I wanted a simple rake task to run the specs against a specific Rails version.

By default, Rspec creates this kind of Rakefile:

require 'rubygems'
require 'rake'
require 'spec/rake/spectask'
 
spec_files = Rake::FileList["spec/**/*_spec.rb"]
 
desc "Run specs"
Spec::Rake::SpecTask.new do |t|
  t.spec_files = spec_files
  t.spec_opts = ["-c"]
end
 
task :default => :spec

Here we have the default Rake task, “spec”, which runs the specs for our code.

Customize our Rakefile to specify the Rails version

At this point, we need to pass an extra option to the spec script in order to specify which Rails version we want to use in our tests.

The way I found to do this is to trick spec_opts (the command line options for spec) by assigning a Proc to it, like this:

t.spec_opts = lambda do
  @rails_version ? ["-c -- rails_version=#{@rails_version}"] : ["-c"]
end

Then add two tasks to your Rakefile :

desc "Run Rails 2.x specs"
task :rails2_spec do
  @rails_version = 2
  Rake::Task['spec'].invoke
end

desc "Run Rails 3.x specs"
task :rails3_spec do
  @rails_version = 3
  Rake::Task['spec'].invoke
end

OK, now we have our three tasks, and the two new ones pass the Rails version to our specs!

kwi@ ~/Projects/i18n_routing$ rake -T
(in /Users/kwi/Projects/i18n_routing)
rake rails2_spec  # Run Rails 2.x specs
rake rails3_spec  # Run Rails 3.x specs
rake spec         # Run specs for current Rails version

Retrieve the Rails version parameter

Then in your spec_helper.rb file, where you include your dependencies, you just need to retrieve the rails version with this (dirty) piece of code :

rails_version = ARGV.find { |e| e =~ /rails_version=.*/ }
rails_version = rails_version.split('=').last.to_i rescue 2
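
If the inline rescue feels too dirty, the same lookup can be sketched as a small helper. This is only one possible variant: rails_version_from is a hypothetical name, and the default of 2 simply mirrors the fallback used above.

```ruby
# Hypothetical helper: reads "rails_version=N" from a list of arguments.
# Falls back to the default when the option is absent or malformed.
def rails_version_from(args, default = 2)
  args.each do |arg|
    match = /\Arails_version=(\d+)\z/.match(arg)
    return match[1].to_i if match
  end
  default
end

rails_version_from(['-c', '--', 'rails_version=3'])  # => 3
rails_version_from(['-c'])                            # => 2
```

In spec_helper.rb this would be called as rails_version = rails_version_from(ARGV).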

Now that we have the Rails version we want to run our tests against, we need to load our dependencies according to that version.

For this, there is a little trick with the gem method, which lets you specify which version of a gem you want to use:

gem 'actionpack', (rails_version < 3 ? '< 2.9' : '> 2.9')
require 'action_controller'

(This example loads action_controller only. I use the 2.9 version number here in order to support the Rails3 beta.)

Here we are: we can now run our tests against both Rails versions with a simple rake task:

rake rails2_spec
# Or
rake rails3_spec

Rails 3 edge routing – Rails3 Tricks #01

Here are just two little issues I encountered today while working on Rails 3 edge routes. I think it can be useful to share them.

I assume you already know changes in routes for Rails3.
(if not, watch: Ryan Bates Screencast about routing in Rails3)

Declare your root path

Matching an empty route string is no longer possible in Rails 3 edge (not in the beta yet), so you can no longer do this:

MkdBaseApp::Application.routes.draw do
  match '' => "home#show", :as => :home
end

Instead, you need to use root :

MkdBaseApp::Application.routes.draw do
  root :to => 'home#show', :as  => :home
end

Route Globbing

Route globbing is a way to catch any request matching part of a URL. It’s useful for catching errors, for example.

In Rails 2, it was done with connect:

ActionController::Routing::Routes.draw do |map|
  map.connect "*path", :controller => 'error', :action => 'handle404'
end

In Rails 3, it’s pretty much the same, but with match:

MkdBaseApp::Application.routes.draw do
  match "*path" => 'error#handle404'
end

Put this snippet at the end of your routes declaration and the ErrorController will be called if no route matches the requested URI.

Remove the map variable

Remember that you no longer need the map variable in your route block:

Rails 2:

ActionController::Routing::Routes.draw do |map|
  map.resources :users
end

Rails3, remove the |map| :

MkdBaseApp::Application.routes.draw do
  resources :users
end

Adding an external repository with git using submodule

We often need to include external git repositories in our own in order to keep links to our favorite plugins.
With git, there is an easy way to do this: submodules.

The usage is trivial:

git submodule add git://github.com/kwi/i18n_routing.git vendor/plugins/i18n_routing
git submodule init
git submodule update

And there we go: we have installed that nice i18n Rails routing plugin in order to translate our Rails routes.

Update your modules

If you want to update a particular submodule, go to its folder and run:

git pull

To pull all your submodules, run:

git submodule foreach 'git pull'

Nginx rewriting and redirection tips

Rewriting url with nginx

Last week, I needed to rename a folder in a Rails public directory served by nginx. While doing that, I wanted to keep some links already indexed by search engines working, and especially to keep the jpg images accessible. Only the jpg images, though, so symbolic links were not appropriate.

  # Ensure JPEG are still accessible after
  # renaming my folder videos to thumbs
  location ~* ^/videos/.+\.(jpg)$ {
    rewrite ^/videos/(.*)$ /thumbs/$1 last;
  }

This snippet solves my problem. For every .jpg requested in the videos folder, it rewrites videos to thumbs and then continues serving the file.

Redirection with nginx

Another thing that can be useful: here is a little piece of code that checks whether a file is present on the file system and, if not, redirects the request to another server:

location ~* ^/thumbs/.+\.(jpg)$ {
  if (-f $request_filename) {
    # The file is present, so serve the file with caching header
    expires 1y;
    add_header Cache-Control public;
    break;
  }

  # the file is not present, redirect to another server
  rewrite ^(.*) http://another.thumbs-server.com$1 permanent;
}

RVM, The ruby version manager

Thanks to our great friend Ryan, last month I discovered RVM, a Ruby version manager.

From now on, no more pain getting multiple versions of Ruby running on the same system!
With RVM, you can manage every version of Ruby you want, as you would with a package manager (including ree, jruby, ironruby, etc.!).

Follow the simple guide on the official RVM website for the trivial install.

Then simply install the versions of Ruby you want:

rvm install 1.8.7,1.9.2

rvm list
#rvm Rubies
#
#=> ruby-1.8.7-p249 [ x86_64 ]
#   ruby-1.9.2-preview1 [ x86_64 ]
#
#Default Ruby (for new shells)
#
#   ruby-1.8.7-p249 [ x86_64 ]
#
#System Ruby
#
#   system [ x86_64 i386 ppc ]

After that, just choose the Ruby version you want to use by telling RVM:

rvm 1.9.2
ruby -v
# ruby 1.9.2dev (2009-07-18 trunk 24186) [i386-darwin10.2.0]

rvm 1.8.7
ruby -v
# ruby 1.8.7 (2010-01-10 patchlevel 249) [i686-darwin10.2.0]

More info about RVM:

With RVM, each Ruby version has its own gems directory, so you can install separate gems for every version of Ruby you run.
And please note that you do not need sudo to install a gem.

By default, RVM keeps the system Ruby in each new terminal session. If you want to change this behaviour, just tell RVM:

rvm 1.9.2 --default

Start enjoying Ruby again !

So now, no more excuses: time to go ahead and make the jump. Start using Ruby 1.9, as it now passes the RubySpec tests!

And by the way, test your application and your plugins against multiple versions of Ruby with ease!


How Rails 2.3 marks your string safe

Here is a little thing I find good to know! Maybe you are curious too and want to know how Rails 2.3 knows whether your strings are HTML safe or not.

In order to prevent XSS injections, Rails 2.3 implements a way to determine whether a string is safe: it is just a little extension of the core String class provided by Active Support (output_safety.rb).

This extension adds a @_rails_html_safe instance variable to most of the strings created by Rails helpers in your application!! (Personally, I find this approach kind of brutal.)

This variable is a boolean: if it’s true, your string is HTML safe. On each concatenation, the resulting string is then marked as HTML safe or not.

If you want to mark a string as safe yourself, use the method: html_safe!
And if you want to know whether a string is safe, use: html_safe?

ruby-1.8.7-p249 > helper.tag('br').html_safe?
 => true
ruby-1.8.7-p249 > "<script>Not safe</script>".html_safe?
 => nil
ruby-1.8.7-p249 > "<script>Not safe</script>".html_safe!.html_safe?
 => true
ruby-1.8.7-p249 > Marshal::dump(helper.tag('br'))
 => "\004\bI\"\v<br />\006:\026@_rails_html_safeT"
# Here you can see the @_rails_html_safe variable with the string.
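
To make the idea of a flag carried through concatenation more concrete, here is a simplified illustration in plain Ruby. It is not the Rails source: FlaggedString is a hypothetical class standing in for the patched String, and it assumes the rule is that a concatenation is safe only when both operands are safe.

```ruby
# Simplified illustration only, NOT the Rails implementation.
class FlaggedString < String
  # Mark this string as safe and return it, like html_safe! above.
  def html_safe!
    @html_safe = true
    self
  end

  def html_safe?
    @html_safe == true
  end

  # Assumed rule: the result is safe only if both operands are safe.
  def +(other)
    result = FlaggedString.new(super)
    both_safe = html_safe? && other.respond_to?(:html_safe?) && other.html_safe?
    both_safe ? result.html_safe! : result
  end
end

safe   = FlaggedString.new('<br />').html_safe!
unsafe = FlaggedString.new('<script>Not safe</script>')

(safe + safe).html_safe?     # => true
(safe + unsafe).html_safe?   # => false
(safe + 'plain').html_safe?  # => false
```

Unlike this sketch, Rails 2.3 patches String itself, which is why the flag shows up in the Marshal dump above.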

Be careful: this code changes completely in Rails3, and from the little I have seen, it’s a lot cleaner :)


jQuery 1.4.2 is Out, it’s again blazing fast !

Yesterday, the jQuery team released a new version of its famous JavaScript library, and once again it rocks compared to its competitors :)

jQuery benchmarks vs competitors

A good start for a successful project is using good tools, and jQuery is one of them.
So do not wait, and start using today’s best JavaScript library!

For Rails 3, use jQuery UJS:

Use the official jQuery UJS plugin: http://github.com/rails/jquery-ujs. Its usage is explained on another blog: http://blog.datagraph.org/2010/02/jquery-with-rails-3

And do not forget to use the csrf_meta_tag helper in your header in order to output the meta tags for the authenticity token.

For Rails 2.x, use jRails:

Install the drop-in replacement jrails plugin: http://github.com/aaronchi/jrails


Another way to compare Class – Ruby Tricks #04

Today, just a really small trick, but I find it kind of cool. (But useless :) )

When we want to check an object class, we often use .is_a?(ClassName).

But you can do this with the === operator too :

Hash === {}
# => true

But be careful to put the class first, as the reverse is not the same:

{} === Hash
# => false
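
Where === on a class does earn its keep is case/when, which calls it for each branch. A small sketch (kind_of_value is a hypothetical name for this example):

```ruby
# Each `when` branch below is evaluated as Klass === value.
def kind_of_value(value)
  case value
  when Hash   then 'hash'
  when Array  then 'array'
  when String then 'string'
  else 'something else'
  end
end

kind_of_value({})    # => "hash"
kind_of_value([1])   # => "array"
kind_of_value('hi')  # => "string"
kind_of_value(42)    # => "something else"
```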

Hash creation on the fly – Ruby Tricks #03

So many times, we want to build a Hash from an Array in order to access some values more easily.

For example, you are in your Rails environment and you want to build, from your articles table, a Hash with hash[article_id] => Article.

Here is the trick to do this in just one line of code:

Article.all.inject({}) { |h, article| h[article.id] = article; h }

A real world example I use for caching data in an active record model with a constant :

class Locale < ActiveRecord::Base
  ...
  LocaleCached = self.find(:all).inject({}) { |h, l| h[l.short.to_sym] = l; h }
  ...
end
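
For comparison, here are two other ways to build the same kind of hash in plain Ruby, sketched with a Struct standing in for an ActiveRecord model (an assumption of this example). each_with_object needs Ruby 1.9 or Active Support; the helper names are hypothetical.

```ruby
# Same result as the inject version, without the trailing "; h".
def index_by_id(records)
  records.each_with_object({}) { |record, hash| hash[record.id] = record }
end

# Hash[] builds a hash from an array of [key, value] pairs.
def index_by_id_pairs(records)
  Hash[records.map { |record| [record.id, record] }]
end

article  = Struct.new(:id, :title)
articles = [article.new(1, 'First'), article.new(2, 'Second')]

index_by_id(articles)[2].title                          # => "Second"
index_by_id(articles) == index_by_id_pairs(articles)    # => true
```

If Active Support is loaded, its Enumerable#index_by covers this case as well, for example Article.all.index_by(&:id).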

String concatenation performance – Ruby Tricks #02

When it comes to string concatenation, which you use hundreds of times in your everyday projects, Ruby gives you the choice!

Most common cases :

"Hi #{login}"

'Hi ' + login

s = 'Hi '
s += login

s = 'Hi '
s << login

But all these ways of concatenating strings do not really behave the same:

First case, += VS << :

s = 'Hi '
s += login

The + operator creates a new string object by concatenating two strings, here ‘Hi ’ and login. So with += we have instantiated two strings just to end up with one.

s = 'Hi '
s << login

On the other hand, << appends the content of the second string directly to the first one, so no new string is instantiated. But you modify your first object, so be careful, especially when it comes from a variable.
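
The difference can be checked with object_id. A minimal sketch follows; the two method names are hypothetical, and String.new is used only to get an explicitly mutable copy of the literal.

```ruby
# Returns [same object after <<, same object after +=].
def concat_identity(login)
  s = String.new('Hi ')
  original_id = s.object_id

  s << login                                   # appends in place
  same_after_append = (s.object_id == original_id)

  s += '!'                                     # builds a new string, rebinds s
  same_after_plus = (s.object_id == original_id)

  [same_after_append, same_after_plus]
end

# The caveat: << also changes every other variable pointing at the object.
def aliasing_demo(login)
  greeting = String.new('Hi ')
  other_name_for_greeting = greeting
  other_name_for_greeting << login
  greeting
end

concat_identity('kwi')  # => [true, false]
aliasing_demo('kwi')    # => "Hi kwi"
```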

Second case, + VS #{} :

'Hello ' + 'ruby ' + 'world'

This creates the ‘Hello ruby ’ string, then creates the final string: ‘Hello ruby world’
=> So it creates unnecessary strings.

"Hello #{'ruby '}#{'world'}"

This directly creates the full string ‘Hello ruby world’, without the intermediate state seen before.

Conclusion

  • Prefer << when you can!
  • Use “#{}” interpolation when you concatenate more than 2 strings together.

Symbol#to_proc – Ruby Tricks #01

First trick today, and an easy one:

If you are using Active Support (shipped with Rails), or a Ruby version greater than or equal to 1.8.7, you can use the symbol-to-proc shortcut:

Here is the standard way of declaring a block:

>> ['a', 'b', 'c'].collect {|letter| letter.capitalize}
=> ["A", "B", "C"]

Here is the handy method :

>> ['a', 'b', 'c'].collect(&:capitalize)
=> ["A", "B", "C"]
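
To see what the & shortcut hands to collect, you can call Symbol#to_proc yourself. The symbol_to_proc method below is a hypothetical re-implementation, close in spirit to a pure-Ruby version but not copied from Ruby or Active Support.

```ruby
# What &:capitalize passes to collect: a Proc built by Symbol#to_proc.
capitalizer = :capitalize.to_proc
capitalizer.call('a')  # => "A"

# Hypothetical equivalent: send the symbol, as a method name,
# to the first argument the proc receives.
def symbol_to_proc(sym)
  proc { |receiver, *args| receiver.send(sym, *args) }
end

['a', 'b', 'c'].collect(&symbol_to_proc(:capitalize))  # => ["A", "B", "C"]
```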

But keep in mind that the shortcut is a little slower than the normal way, because it creates a new Proc on each call!

Benchmark :

require 'benchmark'

t = Benchmark.realtime do
  (['a'] * 1000000).collect(&:to_s)
end
puts "Time using to_proc: #{t}"

t = Benchmark.realtime do
  (['a'] * 1000000).collect do |e|
    e.to_s
  end
end
puts "Time using normal block: #{t}"

# Time using to_proc: 0.631899118423462
# Time using normal block: 0.246822834014893
# Results are the same if you test the normal block first