Archive | Ruby

Ruby 1.9.2-p318 faster

Due to an applied patch that optimized/fixed how Ruby requires libraries, my test suite's performance improved significantly (roughly 2.6x, based on the numbers below).

Below are the benchmarks from my test suite before and after the upgrade. The only snag was that RVM did not yet have support for the latest patch version, so one pull request later and we were good to go.

    Ruby 1.9.2-p290

        real    6m14.266s
        user    2m17.930s
        sys 0m9.587s

    Upgrade to Ruby 1.9.2-p318

        real    2m22.649s
        user    1m15.867s
        sys 0m8.126s

My continuous integration server will be really happy now as well!


Rails Session ArgumentError: key too long [SOLVED]

Problem

I am migrating an app from one version of Ruby to another. For the rollout I decided to first launch one server running the new Ruby (1.9.2) and adjust the load balancer to move only about 10% of the traffic to it. Within 10 minutes I noticed in New Relic that there were tons of exceptions like:

ArgumentError: key too long "rt:BAh7Czozg3OWY4MWRkMjAyYjlkNzljNmUwNWYwZDRhMDQ6F
GFiaW5nb19pZGVudGl0eUkiDzEyMTc2ODUyNzIGOgZFRjoKcnRfYWJJIgZiBjsHRjodY2xpY2tfdHJh
Y2tpbmdfZmlyc3RfdXJsSSJAaHR0cDovL3JlYWx0cmF2ZWwuY29tL2ItMjM1MTM4LWxvbmRvbl9ib
G9nLWxvbmRvbjpfYV90b196ZWQGOwdGOihjbGlja190cmFja2luZ19vcmlnaW5hbF9yZWZlcmV
yX3VybCJgaHR0cDovL3d3dy5nb29nbGUuY29tL3NlYXJjaD9zb3VyY2U9aWcmaGw9ZW4mc
mx6PTFHMUdHTFFfRU5VUzQwMCZxPWxvbmRvbithK3RvK3plZCZhcT1mJm9xPToMcmVm
ZXJlckAK--65814e8bacff8432ac09e15520afb13117d0e5cd"
Stack trace:
/path/shared/bundle/ruby/1.8/gems/memcache-client-1.8.5/lib/memcache.rb:703:in `get_server_for_key'
/path/shared/bundle/ruby/1.8/gems/memcache-client-1.8.5/lib/memcache.rb:920:in `request_setup'
/path/shared/bundle/ruby/1.8/gems/memcache-client-1.8.5/lib/memcache.rb:885:in `with_server'
/path/shared/bundle/ruby/1.8/gems/memcache-client-1.8.5/lib/memcache.rb:246:in `get_MemCache_read'
/path/shared/bundle/ruby/1.8/gems/actionpack-2.3.8/lib/action_controller/session/mem_cache_store.rb:31:in `get_session'

The interesting thing was that none of these errors were coming from the new server; something on the new Ruby server was breaking the others. Here is what I found the problem to be.

Solution

The new server was configured to use the cookie store while the others were using the memcached store. It sounds stupid, and it is, but it wasn’t so easy to figure out at first. When I googled for the error I noticed that none of the reported issues had solutions, which is why I decided to write this post.

In short, with the default cookie store both the session data and the session key live in the cookie itself, while the memcached/DB stores put only the session_id in the cookie. So when a request carrying a cookie-store cookie hit a memcached-store server, the entire serialized session blob was treated as the memcached key, blowing past memcached’s maximum key length.
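
The fix was simply making every server behind the load balancer agree on the store. A minimal sketch of the relevant Rails 2.3-era setting (the stack trace above shows actionpack 2.3.8; the exact file and value here are illustrative):

# config/environment.rb -- every app server must use the same session store,
# or one server’s cookie becomes another server’s cache key
config.action_controller.session_store = :mem_cache_store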

May your session keys truly be keys!

Ruby 1.9.2, Encoding UTF-8 and Rails: invalid byte sequence in US-ASCII


written by Paul on January 20th, 2011 @ 12:05 AM

While working on a migration from Ruby 1.8.7 to 1.9.2, I ran into some issues with encoding. Fortunately we are using PostgreSQL, and the database drivers have pretty good UTF-8 support and encoding tagging, but there were still some snags in a few areas of my code.

My company has pretty custom URL structures. Out of concern that multiple URLs going to the same content would look to Google like we were doing something bad, we have some code that ensures the requested URL is the same URL we would have generated; if it isn’t, we redirect.

In this code, we generate a URL and compare it to the value of request.request_uri to see if we should redirect or not. One issue that came up is that Nginx and Passenger percent-encode the Unicode characters, and Rack hands the string over as ASCII-8BIT (“binary”), which really just means that no encoding is assigned.

In the browser a url might look like this:

/h-336461-amboise_hotel-château_de_noizay

But when my code generated the URL it looked like this:

/h-336461-amboise_hotel-ch%C3%A2teau_de_noizay

The above could easily be fixed with this:

URI.unescape(canonical_url)

Then I had issues where I had a URL (request.request_uri) like:

/h-336461-amboise_hotel-ch\xC3\xA2teau_de_noizay

It was ASCII-8BIT, which is really a way of saying that it’s binary, or in other words that no encoding is set. The solution was pretty easy: I just assigned it the encoding I knew it should be:

"/h-336461-amboise_hotel-ch\xC3\xA2teau_de_noizay".force_encoding('UTF-8')
  # => "/h-336461-amboise_hotel-château_de_noizay"
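
The distinction that finally made this click, as a minimal sketch: force_encoding only relabels the bytes, while encode actually transcodes them.

s = "ch\xC3\xA2teau"            # UTF-8 bytes for "château"
s.force_encoding('ASCII-8BIT')  # same bytes, now tagged as binary
s.bytesize                      # => 8 -- nothing changed
s.force_encoding('UTF-8')       # relabel again; still no bytes change
s == "château"                  # => true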

Then I had an issue where templates/views were breaking because Haml thought some data in a view was ASCII. The text was supposed to look like “Details for Château de Noizay,” but Haml raised an exception: “ActionView::TemplateError (invalid byte sequence in US-ASCII).”

After digging around a bit I was able to configure Apache (on my Mac) by adding the following to /etc/profile:

export LANG=en_US.UTF-8

Then after restarting Apache on my Mac, I refreshed, and the text that was supposed to look like “Details for Château de Noizay” ended up looking like “Details for Ch 도teau de Noizay”.

I was about to invent my own hybrid Asian/Latin language, but instead added the following to my environment.rb, and everything seemed to come together like I had hoped it would.

Encoding.default_internal = 'UTF-8'
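
For completeness, a minimal sketch of what that setting does in Ruby 1.9+ (the file name and its on-disk encoding here are assumptions for illustration):

Encoding.default_external = 'ISO-8859-1' # what Ruby assumes bytes on disk are
Encoding.default_internal = 'UTF-8'      # what strings become once read

text = File.read('chateau.txt') # transcoded ISO-8859-1 -> UTF-8 as it is read
text.encoding                   # => #<Encoding:UTF-8>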

Now that my app was able to run without encoding errors, I said “yeah!”

Hope this scribble scrabble helps some poorly encoded soul.

Thanks to a few articles I was not only reminded of some of the basics of encoding but learned to embrace the new changes within Ruby 1.9.2. We’ll see how tomorrow goes. 😉

 

Changing File Encoding Using Ruby 1.9.2


written by Paul on January 3rd, 2011 @ 06:22 AM

Currently, I am in the process of upgrading an application from Ruby 1.8.7 to Ruby 1.9.2. One of the big differences between 1.8 and 1.9 is the multi-byte character support.

The Problem

We have thousands of static HTML files that were generated under Ruby 1.8, and when Ruby 1.9 reads them it fails. As usual, before digging in to solve the problem I did a quick search to see what other people have been doing. My search yielded a bunch of multi-line scripts and techniques… most of which were from the Ruby 1.8 days.

The Solution

In short, I wrote a simple four-line script in irb and it completed my task quickly. One thing that I am really happy about is how Ruby 1.9.2 strings have a method called encode that provides great utility when performing these kinds of tasks.

So here is the code:


`find . -name '*.html'`.split("\n").each do |filename|
  puts filename
  # read first, then rewrite -- opening with "w+" up front would
  # truncate the file before it could be read
  content = File.read(filename).encode('UTF-8')
  File.open(filename, 'w') { |handle| handle.write(content) }
end; nil

If you are interested in the options with the encode method, go check them out.
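
For example, here is a hedged sketch of the options I find most useful (the byte strings are contrived for illustration):

# Tag Latin-1 bytes correctly, then transcode:
"caf\xE9".force_encoding('ISO-8859-1').encode('UTF-8')
# => "café"

# Replace bytes that are invalid for the encoding instead of raising:
"bad\xFFbytes".encode('UTF-8', :invalid => :replace, :replace => '?')
# => "bad?bytes"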

 

Crawlable AJAX for SPEED with Rails

Recently at work we have been focusing our efforts on increasing the overall performance of our site. Many of our pages have a lot of content on them; some might say too much. Thanks to New Relic, we identified a couple of partials that were slow (they consumed ~60% of the time) but could not simply be removed from the page, and the long-term fix was going to be put into place over the coming weeks. Long story short, we thought it would be better to speed up the initial page load and then fetch the more expensive partials asynchronously using separate AJAX calls. That way the initial page would be faster and the slow parts would be split up between requests.

The Problem: Google’s Crawler doesn’t like AJAX (normally)

Googlebot still does not load JavaScript or perform AJAX calls. Because of this we don’t get credit from Google for the content that we load via AJAX, which is a bummer and a show-stopper for us. Duplicate content is bad for SEO: Google will come to our pages and see that they are similar, and even though the user sees the relevant content as the page loads, Google will “think” that our pages are mostly the same (header and footer, etc.).

The Solution: Google allows for Crawlable AJAX

On their site, Google suggests a particular approach for sites that want to make their AJAX crawlable. I won’t go into the details of how Google supports this because it’s all stated in their paper, but I did want to focus on how I implemented the solution.

Before I continue I want to say that I was hesitant to do this because at first glance I didn’t think it would be easy or effective. I was wrong, and I apologize to my friend Chris Sloan for doubting him when he proposed the idea. (He made me include this in the post and threatened my life if I didn’t.)

Google basically wants to be able to see the ajaxified page as a whole static page, so the crawler passes an argument to the page, and in turn we are supposed to render the whole page without needing AJAX calls to fill portions of the page with content.

I wanted to funnel the AJAX calls for different partials through a single action within our site so I didn’t have to build out custom routes and custom actions for each partial, which would be extremely messy to maintain.

The Code

So here is a simple example of the approach we took:

We created a single action, plus JavaScript that looked for specific containers and then made requests to the server passing a couple of key parameters: /ajaxified?container=&url=<%= request.request_uri %>

    module AJAXified # include this in a controller (or the app controller)

      # HOWTO
      # To add a new AJAX section, do the following:
      # 1) Write a test in crawlable_spec for the new container and url
      # 2) Add the new method/container to the ALLOWED_CALLS array
      # 3) Add the new method below so it sets required instance variables

      ALLOWED_CALLS=[:bunch_o_things]

      def is_crawler?
        params.key?(:_escaped_fragment_)
      end

      # Actual Instance Setting Methods Are BELOW This Line

      # Note: each method needs to return the partial/template to render

      def bunch_o_things(options = {})
        @thing ||= Thing.find((options[:params] || params)[:id])

        @things_for_view = @thing.expensive_call
        'thing/things_to_view'
      end

      # Actual Instance Setting Methods Are ABOVE This Line

      public # below is the actual main ajax action

      def ajaxified
        raise "method \"#{params[:container]}\"is not allowed for AJAX crawlable" unless ALLOWED_CALLS.include? params[:container].to_sym

        raw_route = ActionController::Routing::Routes.recognize_path(params[:url],:method=>:get)
        request.params[:crawlable_controller] = raw_route[:controller]
        request.params[:crawlable_action]     = raw_route[:action]

        render :template => self.send(
          params[:container].to_sym, :params => request.params
        ), :layout => false
      end

    end

I needed to ensure that the method :is_crawler? is available within views as controller.is_crawler?

  hide_action :is_crawler?

In the controller action where the code would normally have been executed, we need to add a check for the crawler so we don’t execute code that is not needed.

def show
  @thing = Thing.find(params[:id])

  if is_crawler?
    # sets @things_for_view
    bunch_o_things
  end
  ...
end

In the view:

<article id="things" data-crawlable="<%= controller.is_crawler? ? 'false' : 'true' %>">
  <% if controller.is_crawler? or request.xhr? %>
    <% @things_for_view.each do |thing| %>
      ... potentially expensive stuff ...
    <% end %>
  <% end %>
</article>

Because I had to water the code down a bit to show how it works in general, this code is not tested, nor has it been executed as is. I actually had to add more stuff around the project I did for work in order for it to work as we needed it to.

The general idea here is to centralize the partial render, reduce duplication within the controller and ensure that the code that slowed the partial down to begin with is not executed when the page is not being crawled.
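
Since the HOWTO comment in the module says to write a test in crawlable_spec first, here is a hedged sketch of what such a spec might look like (RSpec 1.x-era syntax; the controller name, model stubs, and matchers are assumptions based on the code above):

# spec/controllers/crawlable_spec.rb (hypothetical)
describe ThingsController do
  it "renders the expensive content for the escaped-fragment crawler" do
    thing = mock_model(Thing, :expensive_call => [])
    Thing.stub!(:find).and_return(thing)
    get :show, :id => '1', :_escaped_fragment_ => ''
    response.should be_success
  end

  it "refuses containers that are not in ALLOWED_CALLS" do
    lambda {
      get :ajaxified, :container => 'evil', :url => '/things/1'
    }.should raise_error(/not allowed/)
  end
end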

In the end, we were able to reduce the initial request time for our users by 60%, and Google is able to crawl our site as it always had.

 

Saving time with Threads the Ruby way

I have been working on some projects that require me to make multiple serial webservice calls using SOAP and HTTP clients. As you might guess, without concurrency it is such a waste waiting on the network IO, and the times end up being cumulative: the more service calls, the slower it gets (0.5s + 1s + 2s + 1s + 1s = 5.5 seconds). Originally I wasn’t worried because I knew I would come back and tweak the performance using threads, and today was the day to get it going. Before I got too crazy coding I wanted to run some basic benchmarks just to see if it would really end up making things faster. Here is what I did:

Benchmark.bm { |rep| 
  rep.report("non-threading") { 
    1.upto(100) { |count|
      amount_rest = rand(4)
      # puts "##{count}: sleeping for #{amount_rest}" 
      sleep(amount_rest)
      # puts "##{count}: woke up from a #{amount_rest} second sleep" 
    }
  }

  rep.report("threading") { 
    threads = []
    1.upto(100) { |c|
      threads << Thread.new(c) { |count| 
        amount_rest = rand(4)
        # puts "##{count}: sleeping for #{amount_rest}" 
        sleep(amount_rest)
        # puts "##{count}: woke up from a #{amount_rest} second sleep" 
      }
    }
    threads.each(&:join) # wait for every thread to finish
  }
}

benchmark          user     system      total        real
non-threading  0.100000   0.290000   0.390000 (142.005792)
threading      0.010000   0.020000   0.030000 (  3.182716)

As you can see, threading in Ruby works really well as long as each thread is not doing anything CPU-intensive. Even though Ruby 1.8.7 does not support native threads, its green threads, as the numbers above show, handle IO-bound waiting well. When all was said and done, I ended up making the calls more than twice as fast overall, and it will work a bit better if and when we have to do more requests concurrently.
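
Applied to the original problem, here is a hedged sketch of the serial service calls made concurrent (the URLs are illustrative; Thread#value joins the thread and returns the block’s result):

require 'net/http'
require 'uri'

urls = %w[http://example.com/rates http://example.com/reviews]
threads = urls.map do |url|
  Thread.new { Net::HTTP.get(URI.parse(url)) }
end
responses = threads.map(&:value) # total time ~= the slowest call, not the sum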

I do however look forward to using ruby 1.9, but this will do the trick for me now.

Real Travel Acquired by Uptake and The Back Story

Today is the day that it has become public knowledge that my company, Real Travel, has been acquired by Uptake Networks. It’s been over 5 years since I wrote the first line of PHP code for Real Travel. I wanted to share a bit about the ride so far, and I look forward to even more excitement within Real Travel as part of the Uptake Network.

Real Travel’s Conception

Over 5 years ago I was introduced to Ken Leeder by a mutual friend of ours. After a few breakfast meetings with a few of us (Michael T., Christina B., and Ken L.), we put together some wireframes and I went to work in the moonlight. Six months later we had a pilot/prototype that I built using PHP/MySQL, with some help from a contract designer who gave me some Photoshop designs. (Christina also helped with some of the design.) The application was quite simple then: we had lists of hotels in destinations and a review form for hotels.

A few more months later I quit my job [yes, the link no longer works. :)], the company was incorporated, and our initial seed round of funding was coming together. This was also the time that we hired a new CTO and Chief Architect (who are no longer with us). The decision was made to rewrite the pilot using the then-beta ASP.NET 2. I pushed back, but others were more familiar with Microsoft technologies and won the battle. Knowing now what was coming next, in retrospect I wish I had fought harder for the open source LAMP stack that supposedly couldn’t scale.

The Pain of Learning What You Already Knew (Again)

There was a “culture” clash with respect to development and design approaches. Within three months we went from my pilot’s 5-10K lines of PHP code and an uber-simple database schema (you know, the kind you can manage from the command line without a plethora of GUI tools) to over 100K lines of C# and an object-oriented DB schema in PostgreSQL (not the built-in kind of object-oriented that PostgreSQL provides, but a new home-grown schema designed for fourth normal form).

I will never forget the day our architect showed the database schema relation map; it looked like a ball of yarn, with so many relationship lines that the tables themselves were indistinguishable as rectangles. This was my first run-in with the do-it-the-“right”-way-the-first-time fallacy, at least to this extreme. As an engineer with then 5 years of experience I expressed my concern about the complexity, but it was quickly extinguished by group-think cliches like “this is how the big guys do it.”

We started to release our site weekly and launched at the Web 2.0 Conference Launch Pad. Development was slow due to the complexity of the database. Not kidding: it took about 10 SQL inserts to add a single photo to the database; tables for strings, dates, root objects, photos, photo renditions, and so on.

During the year or two following, a lot of “fun” happened. Our site’s architecture was extremely brittle, and we spent a good part of our time debugging strange bugs, slow queries (with at least 5 if not 8 joins in them), and strange IIS bugs. (A very educational experience for me, building the ivory tower inefficiently.)

Born Again

I was playing with a new toy on the side called Ruby on Rails and became really enchanted with the framework and the Ruby language itself. I finally felt free from the 90-second compile time and memory cache load that our ASP.NET app required between code changes. It reminded me of the PHP/MySQL days, and I realized just how horribly over-engineered our system was. I started to propose a rewrite, but it went over like a lead balloon; this did not deter me. I did freelance projects in Rails and PHP just to keep my sanity. I believed in Real Travel and didn’t want to leave; I was drawn in by the opportunity to be the catalyst for positive change the second I was given a chance. The time would come.

As time went on, our releases became further apart and our site became slower and slower; my challenge to rewrite the site became that much more appealing, but it was becoming a bigger task each day. Each time I had to change a line of code I had to wait (and wait) for hours to compile the app, start it, and test it, then more hours and even days to release it. I would say, “if this were Ruby on Rails it would have been done a while ago.” We knew as a company that something had to be done; as a team we were unable to develop new features and move our company forward.

As a company, we were forced by the market to change our mindset and accept that something had to be done, but a port to Rails was still out of the question. We attempted to de-normalize our tables and rewrite the code base in ASP.NET and C#, but that only proved to take even more time, and we were still on ASP.NET.

A New Chapter With Ruby on Rails

It wasn’t easy, but I continued to champion the move to Ruby on Rails, and we started to do all of our new development on it. We ended up having two applications: the main one was ASP.NET, the new one Ruby on Rails. With the use of a load balancer we were able to make much of this transparent, and we found ourselves spending most of our time where it could count: on the Rails application. In fact, there was a time when only one Windows development machine existed, on a desk, in case we had to make a change to the old system; we could make the change, test it, and then push the code to production. After many, many long discussions and debates we finally made the decision to make the port to Rails. It was also about this time that we decided that unless we started to use SCRUM we would either all quit or jump off of a bridge.

SCRUM, Agile, TDD/BDD, Quality, and Accountability

We had all learned our lesson. As a company we went to SCRUM training, and this was one of the most pivotal points in my tenure at Real Travel (or in my career). We began to form good process and better working relationships. I began to jump into RSpec and build quality into the product. It was a start, each day getting better and better.

About 11 two-week sprints later (579 story points), we had a new version of Real Travel up and running on Ruby on Rails. I had the pleasure of powering down the last Windows server; it was nice after all of the pain.

Nothing was stopping us: continuous improvement, open communication, retrospectives, developer productivity, and a backlog-directed, self-organized team. Within a year we ported the system, made major improvements to our site, and even started to make some decent revenue. The summer of 2009 was great for us: major traffic gains, increased revenue, and then the Google problems started. Although I did the Rails development, I could not have done it without Chris Sloan and Francisco Marin as team members, peers, and heroes.

Google: The Authoritarian Mime

I won’t bore you with the details in this post, but due to some spammed pages on our site, Google decided to kick us out of their index and not tell us why. Fortunately, after some time we found the problem (we found and deleted some pages that had been link-spammed with V1agra links and had 5K links pointing into them), and once it was fixed our traffic started to come back a bit.

Then, months later, we had another problem: a load balancer configuration caused our old site’s links to become 404s. This was not obvious to us right away, but like the other Google problems, it affected our traffic. We fixed the problem, but we were cut deeply by the two complications. Our traffic started coming back, along with our revenues, but it was a slower, long, rough ride. Traffic and revenues were going in the right direction, however. We knew we could get our traffic back and get things going, so we marched on.

Becoming “That” Team and Company

Over those couple of years we started to tune our SCRUM process and went from two-week sprints to one-week sprints, one-day planning meetings to 1-hour planning meetings, and once-a-week releases to releasing multiple times a day with continuous release. It started to become really exhilarating: we could run an on-the-street lo-fi paper test with the kind people of Palo Alto, come up with a hypothesis for a change, push out a split test (a.k.a. an A/B test), and then release the winner in a day. If there was a bug affecting our site, we could have the fix out in minutes.

What Now?

With traffic bouncing back, we were engaged by Uptake, and I will let TechCrunch and the many other blogs tell the rest of the story. I consider myself privileged to have been able to learn with and from my fellow team members and ride the ebbs and flows from the first line of code that I wrote on that first pilot, and I look forward to the future as Uptake and Real Travel aim to provide the best travel experiences on the web.

So, Sloan, pull the next test from the top of the test queue and let’s test it!

LibXML-Ruby and XPath with namespaces

So, have you ever wasted a half hour coding while also driving yourself absolutely insane? Was it when you were playing with libxml-ruby and xpath?

Minutes ago I was coding up an XML-RPC webservice when I realized that I was unable to get the nodes I was looking for with XPath.

As usual I searched Google looking for other people having the same issue and nothing helpful came up. I knew I had to write this post when I saw this.

So my response XML looked something like this:

response = <<-REMOTE_XML
<?xml ...?>
<rootNode xmlns="http://happythanksgiving.com/htgn">
  <list>
    <item>hey</item>
    <item>there</item>
  </list>
</rootNode>
REMOTE_XML

My ruby was something like this:

document = XML::Parser.string(response).parse
namespace = 'htgn:http://happythanksgiving.com/htgn' # prefix:uri mapping for find
turkeys = document.find('/htgn:rootNode//item', namespace)

But turkeys.size was always 0.

I then found out that I needed to add the namespace prefix to each element in the XPath find…. duhh!

document = XML::Parser.string(response).parse
namespace = 'htgn:http://happythanksgiving.com/htgn'
turkeys = document.find('/htgn:rootNode//htgn:item', namespace)

Note the XPath “/htgn:rootNode//item” changed to “/htgn:rootNode//htgn:item” (the namespace prefix was added to every element in the path).
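
As an aside, libxml-ruby can also register a prefix for the default namespace once on the document root, which saves passing the mapping to every find call. This is a sketch from memory, so double-check the API before relying on it:

document.root.namespaces.default_prefix = 'htgn' # assumed libxml-ruby API
turkeys = document.find('/htgn:rootNode//htgn:item')
turkeys.size # => 2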

Hope this helps some poor hacker or me next July when I forget and start searching google. 😉

Compiling Ruby on OSX : Readline [Resolved]

An hour before our Daily Scrum this morning I decided to recompile ruby on my Mac (OS X 10.5.8). I did this because I was trying to install passenger for development. More on that later (maybe).

I ran into the following issue with readline while I was building Ruby with the --enable-shared option.


readline.c: In function ‘filename_completion_proc_call’:
readline.c:703: error: ‘filename_completion_function’ undeclared (first use in this function)
readline.c:703: error: (Each undeclared identifier is reported only once
readline.c:703: error: for each function it appears in.)
readline.c:703: warning: assignment makes pointer from integer without a cast
readline.c: In function ‘username_completion_proc_call’:
readline.c:730: error: ‘username_completion_function’ undeclared (first use in this function)
readline.c:730: warning: assignment makes pointer from integer without a cast
{standard input}:358:non-relocatable subtraction expression, "_mReadline" minus "L00000000007$pb" 
{standard input}:358:symbol: "_mReadline" can't be undefined in a subtraction expression
{standard input}:356:non-relocatable subtraction expression, "_completion_case_fold" minus "L00000000007$pb" 
{standard input}:356:symbol: "_completion_case_fold" can't be undefined in a subtraction expression
{standard input}:342:non-relocatable subtraction expression, "_mReadline" minus "L00000000007$pb" 

After googling and coming across some similar but different solutions, I found this and realized that I was most likely not using the correct readline lib.

So, the issue was related to having two readline libraries installed: one in /usr/local/lib, which was installed by MacPorts as a dependency of postgres, and the other, located in /usr/lib, which came with OS X.

For whatever reason Ruby 1.8.6 does not pick the right readline on its own, so all I had to do to get going was specify which readline library I wanted it to use:


./configure --enable-shared --enable-pthread CFLAGS=-D_XOPEN_SOURCE=1 --with-readline-dir=/usr/local

Happy compiling!

FAIL: sudo gem install mysql (Fixed)

The other day I had an issue with Ruby, so I went to Google to find a fix…. I laughed when the second result was my own blog. 🙂

I figured it wouldn’t hurt to save myself some time the next time I run into the OS X nightmare with the mysql gem, so here is what happened and what I did to fix it.

After running “sudo gem install mysql” I got the following errors:


/usr/local/bin/ruby extconf.rb
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lm... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lz... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lsocket... no
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lnsl... no
checking for mysql_query() in -lmysqlclient... no

As usual I looked into the mkmf.log found in the gem directory and saw a bunch of these:


"gcc -o conftest -I. -I/usr/local/lib/ruby/1.8/i686-darwin9.6.2 -I. -I/usr/local/include   -D_XOPEN_SOURCE=1  -fno-common -pipe -fno-common conftest.c  -L. -L/usr/local/lib -L/usr
/local/lib -L.      -lruby-static -lmysqlclient  -lpthread -ldl -lobjc  " 
ld: library not found for -lmysqlclient
collect2: ld returned 1 exit status
checked program was:
/* begin */
1: /*top*/
2: int main() { return 0; }
3: int t() { mysql_query(); return 0; }
/* end */

So here is what I did to fix it:


sudo ln -s /usr/local/mysql/include /usr/local/include/mysql
sudo ln -s /usr/local/mysql/lib /usr/local/lib/mysql

After that, the gem installed cleanly:

[heppy /usr/local/lib/ruby/gems/1.8/gems/mysql-2.7 64]$ sudo gem install mysql
Building native extensions.  This could take a while...
Successfully installed mysql-2.7
1 gem installed
Installing ri documentation for mysql-2.7...

Yeah!