Avoid memory leaks in your ruby/rails code and protect you against denial of service

We heard a lot about that Ruby is cool cause we do not have to care about memory, the garbage collector does it for us. Well, that’s kind of true, but this does not mean we can write code without keeping in mind on what’s is going on under our ruby code.

Ruby symbol memory leak

We all know that using symbols instead of strings is a good practice to have, it’s faster and it saves your memory. Yes but at what price ? Symbols are faster in part cause they are created just one time in memory, that’s great ! But then ? they will stay forever in memory

That means, do not convert everything in symbol ! Be sure to well know what you are converting in symbol.

Example: Somewhere in your app, you apply a to_sym on an user’s name like :

hash[current_user.name.to_sym] = something

When you have hundreds of users, that’s could be ok, but what is happening if you have one million of users ? Here are the numbers :

kwi:~$ irb
ruby-1.9.2-head >
# Current memory usage : 6608K
# Now, add one million randomly generated short symbols
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s).to_sym }

# Current memory usage : 153M, even after a Garbage collector run.
# Surprisingly, on Ruby 1.8.7-p249,
# the VM only grow up to 33M, but that's still a lot !

# Now, imagine if symbols are just 20x longer than that ?
ruby-1.9.2-head > 1000000.times { (Time.now.to_f.to_s * 20).to_sym }
# Current memory usage : 501M

Furthermore, NEVER convert non controlled arguments in symbol or check arguments before, this can easily lead to a denial of service.

Example: You have a website with a locale parameter in order to localize your content and you have something like this in your application controller:

before_filter :set_locale

def set_locale
  I18n.locale = params[:locale].to_sym
end

It’s really simple to call thousand of times your website with a long params[:locale] and make your application bloat !

By the way, it looks like the I18n gem converts automatically the locale in symbol, so be sure to check if the locale is valid before assigning it !
Here is the link: http://github.com/svenfuchs/i18n/blob/master/lib/i18n/config.rb#LID8

If you need to control your number of allocted symbols in your app, you can use Symbol.all_symbols.size. Add this to your log to see if you are leaking symbols over time ! (This can be a good measure to add in Newrelic; Newrelic guys, are you reading ? :)

Reference to objects leak

This leak is a fake one but can grow rapidly in your app.
It happens when you keep a variable in your code referering objects, and these objects are also referencing objects, and etc…

This often happen when using $variable or @@variable as they stay forever in memory.
Here is a little example :

# Memory usage at irb launch: 6320K

class HelloIamLeaking
  @@an_array = []

  def initialize()
    # Put something big in the array
    @@an_array << "hello world" * (4**10)
  end
end

x = HelloIamLeaking.new
x = nil # So no more HelloIamLeaking instance in our code
GC.start # Run the garbage collector to be sure this is real !
# Memory usage after : 17M

ruby-1.9.2-head > ObjectSpace.each_object(HelloIamLeaking) {|x| p x }
 => 0
# So we have no more instance of HelloIamLeaking
# but the class variable remains in memory.

Ok, this is a completely logical and dumb example but this show you the principle.

And this can grow exponentially if you have objects linking to huge array or datasets, they will never be garbage collected if just one object in your code is still referencing the source object.

This will consume your memory, but not only, this will also consume your cpu time as when the garbage collector runs, it looks on every single object, and the more objects you have, the more it spent time looking at them …

To resume reference leak : with time, it’s grow in memory and slow down dramatically the garbage collector running time.

If you want to read more about reference leaks, read the awsome post descent into darkness on the blog of Joe Damato.

Update: Find this leaks easily using the memprof gem and by using memprof.com (awsome stuff again by Joe Damato)

My app is still bloating !

After that, if you have still ruby/rails process bloating, be sure to use the latest version of gem that are using C code, they can be an easy source of memory leak.

And, this is obvious, but be sure to not load huge dataset in memory at one time ! (use find_in_batch instead for example)

Then, If you want more control over the memory allocation, here is a good link for tune up the heap easily and control your ruby process growth.



Thanks for taking the time to read and I hope this article will help you to reduce your memory consumption !

  • Share/Bookmark