Saving time with Threads the Ruby way
I have been working on some projects that require me to make multiple serial web service calls using SOAP and HTTP clients. As you might guess, without concurrency it’s a waste to sit waiting on network IO, and the waits add up: the more service calls, the slower it gets (0.5s + 1s + 2s + 1s + 1s = 5.5 seconds). Originally I wasn’t worried because I knew I would come back and tweak the performance by using threads, and today was the day to get it going. Before I got too crazy coding, I wanted to run some basic benchmarks just to see if it would really end up making things faster. Here is what I did:
require 'benchmark'

Benchmark.bm { |rep|
rep.report("non-threading") {
1.upto(100) { |count|
amount_rest = rand(4)
# puts "##{count}: sleeping for #{amount_rest}"
sleep(amount_rest)
# puts "##{count}: woke up from a #{amount_rest} second sleep"
}
}
rep.report("threading") {
threads = []
1.upto(100) { |c|
threads << Thread.new(c) { |count|
amount_rest = rand(4)
# puts "##{count}: sleeping for #{amount_rest}"
sleep(amount_rest)
# puts "##{count}: woke up from a #{amount_rest} second sleep"
}
}
while !(threads.map(&:status).uniq.compact.size == 1 and threads.map(&:status).uniq.compact.first == false)
# puts "will check back soon"
sleep(0.3)
end
}
}
benchmark          user     system      total        real
non-threading  0.100000   0.290000   0.390000 (142.005792)
threading      0.010000   0.020000   0.030000 (  3.182716)
As you can see, threading in Ruby works really well as long as each thread is not doing anything CPU intensive. Even though Ruby 1.8.7 does not support native threads, its threads still overlap the time spent waiting. When all was said and done, the threaded run finished in about 3.2 seconds versus 142 seconds for the serial run, roughly 45 times faster, and the benefit should grow if and when we have to do more requests concurrently.
I do however look forward to using ruby 1.9, but this will do the trick for me now.
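For what it’s worth, the polling loop in the benchmark can be swapped for Thread#value (or Thread#join), which blocks until a thread finishes. Here is a minimal sketch of that idea. The fake_service_call method, the call names, and the delays are all made up for illustration and stand in for real SOAP/HTTP requests.

```ruby
require 'benchmark'

# Hypothetical stand-in for a web service call: it just sleeps to
# simulate waiting on network IO, then returns a result string.
def fake_service_call(name, delay)
  sleep(delay)
  "#{name} done"
end

# Made-up call names and delays (in seconds).
calls = [['rates', 0.1], ['availability', 0.2], ['booking', 0.1]]

results = nil
elapsed = Benchmark.realtime do
  threads = calls.map do |name, delay|
    Thread.new(name, delay) { |n, d| fake_service_call(n, d) }
  end
  # Thread#value waits for the thread to finish and returns the
  # value of its block, so no polling loop is needed.
  results = threads.map { |t| t.value }
end

puts results.inspect
puts "elapsed: %.2f seconds" % elapsed
```

Run one after another, these calls would take 0.4 seconds; run in threads, the total should land close to the slowest single call.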
LibXML-Ruby and XPath with namespaces
So, have you ever wasted a half hour coding while also driving yourself absolutely insane? Was it when you were playing with libxml-ruby and xpath?
Minutes ago I was coding up an XML-RPC web service when I realized that I was unable to get the nodes that I was looking for with XPath.
As usual I searched Google looking for other people having the same issue and nothing helpful came up. I knew I had to write this post when I saw this.
So my response XML looked something like this:
response = <<-REMOTE_XML
<?xml ...?>
<rootNode xmlns="http://happythanksgiving.com/htgn">
<list>
<item>hey</item>
<item>there</item>
</list>
</rootNode>
REMOTE_XML
My ruby was something like this:
document = XML::Parser.string(response).parse
namespace = 'htgn:http://happythanksgiving.com/htgn'
turkeys = document.find('/htgn:rootNode//item', namespace)
But turkeys.size was always 0.
I then found out that I needed to add the namespace prefix to each element in the xpath find…. duhh!
document = XML::Parser.string(response).parse
namespace = 'htgn:http://happythanksgiving.com/htgn'
turkeys = document.find('/htgn:rootNode//htgn:item', namespace)
Note the xpath ”/htgn:rootNode//item” changed to ”/htgn:rootNode//htgn:item” (added the namespace prefix)
Hope this helps some poor hacker or me next July when I forget and start searching google. ;)
Sharing shell with ytalk on Ubuntu
A good friend of mine years ago used to use a command-line app called ytalk to show me around the bash shell (thanks Sione!). After a short while I stopped needing his help and so I stopped using ytalk. At work we really wanted to shell-share with remote team members who were unable to use the iChat screenshare because of OS and bandwidth limitations.
I remembered that ytalk was such a good tool for being able to see what someone else was doing in the shell and to show off your bash skills. I thought it was going to be easy to set up on Ubuntu, but as it turns out, although it’s still an available package, it is dead on install.
So…. here is what I ended up doing and I hope that if you do the same you will be ytalk’in in no time..
On Ubuntu, install ytalk:
sudo apt-get install ytalk
Change the default/broken inetd.conf configuration:
talk dgram udp wait nobody.tty /usr/sbin/in.talkd in.talkd
ntalk dgram udp wait nobody.tty /usr/sbin/in.ntalkd in.ntalkd
to:
talk dgram udp4 wait root /usr/sbin/in.talkd in.talkd
ntalk dgram udp4 wait root /usr/sbin/in.ntalkd in.ntalkd
Note the “4” after the udp and the “nobody.tty” change to “root”
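If you would rather script that edit than make it by hand, here is a rough Ruby sketch of the same substitution. The method name fix_talk_entries is made up for this example, and the sample text is just the two lines above plus an invented extra entry. It only transforms a string; reading and writing your real inetd.conf is left to you.

```ruby
# Rewrites the talk/ntalk entries of an inetd.conf-style text:
# udp -> udp4 and nobody.tty -> root. Other lines are left alone.
def fix_talk_entries(conf_text)
  conf_text.lines.map do |line|
    if line =~ /\A(talk|ntalk)\s/
      line.sub(/\budp\b/, 'udp4').sub('nobody.tty', 'root')
    else
      line
    end
  end.join
end

# Sample input; the ftp line is invented to show untouched entries.
sample = <<-CONF
talk dgram udp wait nobody.tty /usr/sbin/in.talkd in.talkd
ntalk dgram udp wait nobody.tty /usr/sbin/in.ntalkd in.ntalkd
ftp stream tcp nowait root /usr/sbin/tcpd in.ftpd
CONF

fixed = fix_talk_entries(sample)
puts fixed
```

Because the pattern matches only a bare “udp”, running it a second time on already-fixed text changes nothing.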
In the /etc/services file, make sure the following lines are in there:
paul@box:~$ sudo grep talk /etc/services
talk 517/udp
ntalk 518/udp
I didn’t have to change anything, but it’s a good idea to confirm things.
Using YTalk
Initiating the chat:
You can do this in a couple of ways. The first and most obvious way is to coordinate with the other user, make sure each of you is logged in only once to the same box, and then type:
paul@box:~$ ytalk fred
Or, if you’re logged on more than once, you can specify the tty in the request after finding out which one it is:
paul@box:~$ who
fred pts/0 2009-11-06 10:50 (208.X.X.X)
fred pts/2 2009-11-06 10:48 (208.X.X.X)
paul pts/3 2009-11-06 14:02 (208.X.X.X)
ytalk fred#2
More on that can be found here: http://manpages.ubuntu.com/manpages/intrepid/man1/ytalk.1.html
Thanks to euphemus for the breakthroughs!
Hope you find ytalk as useful and coolific as I do.
Enjoy!
Compiling Ruby on OSX : Readline [Resolved]
An hour before our Daily Scrum this morning I decided to recompile ruby on my Mac (OS X 10.5.8). I did this because I was trying to install passenger for development. More on that later (maybe).
I ran into the following issue with readline while I was building ruby with the --enable-shared option.
readline.c: In function ‘filename_completion_proc_call’:
readline.c:703: error: ‘filename_completion_function’ undeclared (first use in this function)
readline.c:703: error: (Each undeclared identifier is reported only once
readline.c:703: error: for each function it appears in.)
readline.c:703: warning: assignment makes pointer from integer without a cast
readline.c: In function ‘username_completion_proc_call’:
readline.c:730: error: ‘username_completion_function’ undeclared (first use in this function)
readline.c:730: warning: assignment makes pointer from integer without a cast
{standard input}:358:non-relocatable subtraction expression, "_mReadline" minus "L00000000007$pb"
{standard input}:358:symbol: "_mReadline" can't be undefined in a subtraction expression
{standard input}:356:non-relocatable subtraction expression, "_completion_case_fold" minus "L00000000007$pb"
{standard input}:356:symbol: "_completion_case_fold" can't be undefined in a subtraction expression
{standard input}:342:non-relocatable subtraction expression, "_mReadline" minus "L00000000007$pb"
After google’in I came across some similar but different solutions. Then I found this and realized that I was most likely not using the correct readline lib.
So, the issue was related to having two readline libraries installed: one in /usr/local/lib, which was installed by port as a dependency of postgres, and the other in /usr/lib, which came with OS X.
For whatever reason ruby 1.8.6 does not like to use the port library, so all I had to do to get going was specify which readline library I wanted to use.
./configure --enable-shared --enable-pthread CFLAGS=-D_XOPEN_SOURCE=1 --with-readline-dir=/usr/local
Happy compiling!
FAIL: sudo gem install mysql (Fixed)
The other day I had an issue with ruby and so I went to google to find a fix…. I laughed when the second result was my own blog. :)
I figured it wouldn’t hurt to save me some time next time I run into the OS X nightmare with the mysql gem so here is what happened and what I did to fix it.
After running “sudo gem install mysql” I got the following errors:
/usr/local/bin/ruby extconf.rb
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lm... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lz... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lsocket... no
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lnsl... no
checking for mysql_query() in -lmysqlclient... no
As usual I looked into the mkmf.log found in the gem directory and saw a bunch of these:
"gcc -o conftest -I. -I/usr/local/lib/ruby/1.8/i686-darwin9.6.2 -I. -I/usr/local/include -D_XOPEN_SOURCE=1 -fno-common -pipe -fno-common conftest.c -L. -L/usr/local/lib -L/usr/local/lib -L. -lruby-static -lmysqlclient -lpthread -ldl -lobjc "
ld: library not found for -lmysqlclient
collect2: ld returned 1 exit status
checked program was:
/* begin */
1: /*top*/
2: int main() { return 0; }
3: int t() { mysql_query(); return 0; }
/* end */
So here is what I did to fix it:
sudo ln -s /usr/local/mysql/include /usr/local/include/mysql
sudo ln -s /usr/local/mysql/lib /usr/local/lib/mysql
[heppy /usr/local/lib/ruby/gems/1.8/gems/mysql-2.7 64]$ sudo gem install mysql
Building native extensions. This could take a while...
Successfully installed mysql-2.7
1 gem installed
Installing ri documentation for mysql-2.7...
Yeah!
Mongrel to Passenger with CPanel
I host this blog on slicehost and used to have a couple of slices, one for rails, and one for client sites, php, email etc. Just a few hours ago I moved my blog from my Rails slice to what I call my CPanel slice using passenger and the process was smooth sailing. In the process I decided to leverage what I learned about Cpanel and Passenger and I created a gem called cpanel-passenger which can be found on github.
The gem just installs a command called cpanel-passenger that takes a bunch of parameters to modify the Apache config in a way that will not make Cpanel upset.
There is a lot of work to do to make this script do all that one would want, but at least it makes setting up a rails app on passenger a simpler task with Cpanel. Feel free to fork the gem and add to it. It’s just a matter of time before the Cpanel folks bundle passenger as a supported module, but until then try this out on your VPS that is running Cpanel.
Enjoy!
A default route gone 404 when it should
UPDATE: This worked for Rails < 2.0, but now you should follow something like this
Rails routes are a critical piece of a rails application. One issue about the routes is that there isn’t a default route for the home page of an application. Typically, one would create a controller and create a route for a default controller and default action. Here is what one of mine looks like:
map.root :controller => 'main', :action => 'home'
There is one problem with this. The url http://domain.com/blah%20blah will go to the main controller and will throw a “no action/ no id given” exception, which will result in a 500 error. This is not what you want for SEO or otherwise.
The solution is quite simple: all you have to do is add a method_missing to the main controller that logs the request and renders a real 404 page with a 404 HTTP status.
def method_missing(method, *args)
logger.warn "action #{method} does not exist, 404"
render :file => File.join(RAILS_ROOT, 'public', '404.html'), :status => 404
end
There may be better ways to do this, but this is one way around the false 500 errors, especially if you’re likely to get old inbound links to your site.
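To see the mechanics outside of Rails, here is a tiny plain-Ruby sketch. FakeController and its dispatch method are invented for illustration and are not Rails APIs. The point is only that Ruby calls method_missing when an action method is not defined, which gives you one place to answer with a 404.

```ruby
# A made-up, Rails-free controller used only to illustrate the idea.
class FakeController
  # A real action.
  def home
    [200, 'home page']
  end

  # Ruby calls this when a method is not defined on the object.
  def method_missing(method, *args)
    [404, "action #{method} does not exist"]
  end

  # Hypothetical dispatcher: calls the action named in the request.
  def dispatch(action)
    send(action)
  end
end

controller = FakeController.new
p controller.dispatch('home')       # a defined action
p controller.dispatch('blah blah')  # falls through to method_missing
```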
Making the Rails Request Profiler and KCacheGrind Play
I have been working on optimizing my company’s site after porting over many features. I have been finding the newer rails performance tools, including the request profiler, to be very helpful in this effort. Ryan Bates put out a great screencast on request profiling that will get you started, but if your app has any complexity, you will find out quickly, like I did, that the html file gets too large and is not very helpful when it crashes your browser. ;)
Assuming that you have already installed KCacheGrind on your Mac using fink, you can do the following:
# Open up the request_profiler.rb in the actionpack gem (the code that is used by ./script/performance/request)
mate /Library/Ruby/Gems/1.8/gems/actionpack-2.2.2/lib/action_controller/request_profiler.rb
# Add the following lines of ruby to the show_profile_results method at the bottom.
File.open "#{RAILS_ROOT}/tmp/profile-call-tree.kcg", 'w' do |file|
RubyProf::CallTreePrinter.new(results).print(file)
`kcachegrind #{file.path}` if options[:open]
end
Now next time you run the request profiler you will see the KCacheGrind open up with the call tree output in it, yeah!
Changing Session Store in Rails
TIP: If you change the sessions store in rails, I would recommend also changing the session_id so your app doesn’t blow up with 500 errors on every request.
I changed the store from cookie based sessions (the default) to memcached based sessions.
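For reference, in a Rails 2.2-era app both settings live in config/environment.rb. This is only a sketch: it assumes that what needs to change is the session key (the cookie name), and the key and secret values below are placeholders, not values from a real app.

```ruby
# config/environment.rb (Rails 2.2-style; values are placeholders)
Rails::Initializer.run do |config|
  # Switch from the default cookie store to memcached-backed sessions.
  config.action_controller.session_store = :mem_cache_store

  # Use a new cookie name so browsers holding an old cookie-store
  # session start fresh instead of handing over stale data.
  config.action_controller.session = {
    :session_key => '_myapp_session_v2',
    :secret      => 'replace-with-your-own-long-random-secret'
  }
end
```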
Automatic hidden form fields and lightview
Ever needed to automatically add a hidden field in a form? Here is what I did to make it happen.
Not sure if it’s the best solution, but it worked for me… at least until the next rails release. ;)
In the original form_for code it creates a form tag which prints out the templates in the block that is passed to it. There is a method that creates the opening form tag and it already creates extra_tags. All I do is add an additional concatenated string to the fields with the result of a custom method that I created called my_custom_extra_tags. Anything the method returns will be added to each form.
module ActionView::Helpers::FormTagHelper
# form_tag_html overridden on line 454 in actionpack-2.2.2/lib/action_view/helpers/form_tag_helper.rb
# original
# def form_tag_html(html_options)
# extra_tags = extra_tags_for_form(html_options)
# tag(:form, html_options, true) + extra_tags
# end
# modified
def form_tag_html(html_options)
extra_tags = extra_tags_for_form(html_options)
tag(:form, html_options, true) + extra_tags + my_custom_extra_tags
end
def my_custom_extra_tags
(params[:lightview].blank? ? '' : hidden_field_tag(:lightview, params[:lightview]))
end
end
I used this to show the same controller action with different templates and in my application controller I determine which template to show from a passed in parameter that cannot be lost or the template will revert back to the default template. Now all I have to do is pass a parameter lightview to the iframe source and the correct template will show before and after the form inside the iframe is submitted.
Hope this was helpful.
FAIL: COMPROMISED SSH Public Key on Ubuntu
Last night I was setting up a new application on my server and while I was configuring capistrano I came across this strange problem and didn’t immediately find much help on google, so I thought I would post this to help someone else along.
I had already set up my ssh keys months ago, but when I tried to ssh into my subversion repository it would ask me for a password/passphrase, and it just about drove me crazy.
I came across this article in google and checked off each potential problem and nothing. Then I saw that my key was compromised when I ran the “ssh-vulnkey -a” command.
capistrano@allison:~/.ssh$ ssh-vulnkey -a
Unknown (no blacklist information): 2048 5a:b4:d6:94:10:14:e1:a0:35:35:ff:c6:08:e6:9f:10
Not blacklisted: 2048 5f:43:c2:f0:fb:e6:52:c4:90:59:fb:d2:e0:fe:66:d0
Unknown (no blacklist information): 2048 ab:5e:39:5c:33:f0:02:e3:cf:cd:99:84:ca:9e:f8:e1 Paul@paul-hepworths-computer.local
COMPROMISED: 2048 81:85:1d:a7:b1:c6:ff:b2:d5:3f:60:3e:2e:c0:25:5c capistrano@mislice
COMPROMISED: 1024 fa:87:13:5f:0c:01:3e:53:b9:a1:ff:4a:8a:29:b2:a1 capistrano@mislice
So I searched google some more to find out how to fix the problem. I regenerated keys multiple times on my client server and no dice.
Then after searching and searching I found this tutorial and followed it to update openssl and openssh and regenerate my private keys.
What a relief! (and a waste of time, but now I am secure I guess)
Scaling Anything
Tonight I attended a presentation at Google that was given by Jason Hoffman (the CTO at Joyent) about Ruby and scaling web architecture. Although none of the actual information was new to me, it did remind me of the basic points of scaling a web application. Here they are.
- Scalability is language, performance, and throughput independent.
- Test each piece of your architecture and find out what the maximums of each service are.
- Find the real bottlenecks and remove them efficiently. (Use DTrace) An example was how Apache’s proxy module will limit the number of requests per second to 140 per Apache instance, and by using virtualization you could have eight instances running Apache on the same server with the capacity of 1120 rps.
Overall it was worthwhile, especially when you factor in the snacks and dinner that Google provided.
Scale away!
Url Canonicalization in Rails
In one of my last posts I showed how I was able to create completely custom urls for SEO, but there is an issue that sometimes comes up when creating custom urls or when migrating urls, etc.
Here is a simple way to ensure that urls that are being requested are valid. Google and Yahoo! (and others) crawl your site’s links and can on occasion come across an incorrect link from someone else’s site that may be old or mistyped. There are some stiff penalties associated with having two different urls pointing to the same page. There may also be a need to retire certain urls or to change the way they are formatted.
Here is an example. The URL:
http://domain.com/d-123456-mountain_viewering
Should be redirected to:
http://domain.com/d-123456-mountain_view
Here is the simple solution:
I created a module that looked like the following in the lib directory and included it into the ActionController class.
include ActiveRecord
module MY
module URL
def page_code_object_map
{
'd' => Destination, 'p' => Photo
}
end
def execute_url_post_process
canonicalize if params[:canonicalize]
end
def canonicalize
whole_url = request.request_uri().split('?')[0].split('#')[0]
url_pieces = whole_url.split('-')
page_type = url_pieces[0].gsub(/\//, '')
type_id = url_pieces[1]
begin
object = page_code_object_map[page_type].find(type_id)
canonical_url = send "custom_#{page_type}_path", object, params
rescue RecordNotFound => e
render :file => File.join(RAILS_ROOT, 'public', '404.html'), :status => 404
return
end
if canonical_url and canonical_url != whole_url
headers['Status'] = '301 Moved Permanently'
redirect_to("#{http_base}#{canonical_url}", :status => 301)
end
end
end
end
ActionController::Base.send :include, MY::URL
ActionView::Base.send :include, MY::URL
In the route below, notice that I am passing a parameter named :canonicalize with the value of true. This parameter is passed through to the controller as a request parameter and can be accessed in the params hash.
map.d '/d-:destination_global_id-:name*other_params', :controller => 'destinations', :action => 'show', :canonicalize => true, :destination_global_id => /\d{1,20}/, :name => /[^-]+/
How does this all work you say? Simple. In your application controller (controllers/application.rb) you need to include something like this:
before_filter :execute_url_post_process
This will start the checking process by calling the execute_url_post_process() method defined above in my module. If the route that matches passes the :canonicalize parameter, the canonicalize() method will get the current url and certain important pieces. Then depending on the object that is mapped to the page code (d) it will reconstruct the url of the destination object that should match the existing url. If it matches then we’re golden; if it doesn’t then we redirect to the new/correct url, ensuring that we do not lose page rank or get counted as spam (duplicate content).
There are many things that you can do within this code. Some of them include managing authorization, hiding pages, etc.
I hope you enjoyed this tip. If you have any suggestions, please post them, I am sure some genius will have something to add. :)
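Stripped of the Rails plumbing, the decision at the heart of canonicalize is a string comparison. Here is a plain-Ruby sketch of it. The canonical_redirect_for method, the DESTINATIONS hash, and the sample ids are all stand-ins made up for the database lookup and the route helper.

```ruby
# Made-up lookup table standing in for Destination.find.
DESTINATIONS = { '123456' => 'mountain_view' }

# Returns [status, location]: 200 if the path is already canonical,
# 301 with the canonical path if it is not, 404 if the id is unknown.
def canonical_redirect_for(request_uri)
  # Drop the query string and fragment, as the module above does.
  path = request_uri.split('?')[0].split('#')[0]
  pieces = path.split('-')
  page_type = pieces[0].gsub(/\//, '')
  type_id = pieces[1]

  name = DESTINATIONS[type_id]
  return [404, nil] if name.nil?

  canonical = "/#{page_type}-#{type_id}-#{name}"
  canonical == path ? [200, path] : [301, canonical]
end

p canonical_redirect_for('/d-123456-mountain_viewering')
p canonical_redirect_for('/d-123456-mountain_view?page=2')
p canonical_redirect_for('/d-999-nowhere')
```

The first call is the mistyped url from the example and comes back as a 301 to the canonical path; the second is already canonical once the query string is ignored; the third has no matching record.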
Really Customized Urls for SEO in Rails
I needed to build urls that were packed with keywords for SEO. I needed to make sure that the url more fully described the contents of the page.
This default rails url does not cut it.
/destinations/12345
This does cut it.
/d-12345-mountain_view
So here is the hack that I did to get the desired effect. (Suggestions or insults on my approach are welcomed!)
First, I added this code into a plugin that I was using for our custom routes stuff. You can probably add this to the environment.rb file or, better yet, to a file within lib and just make sure that you require the file from within environment.rb. I really needed to add the ’-’ as a delimiter.
This step is important because by default rails uses slashes (/) as a delimiter for parts of the url, but by adding a dash (-) to the array things work the way they should.
module ActionController
module Routing
SEPARATORS = %w( / ; . , ? -)
end
end
Then I added a named route (config/routes.rb) that looked something like this:
map.d '/d-:destination_global_id-:name*other_params', :controller => 'destinations', :action => 'show', :canonicalize => true, :destination_global_id => /\d{1,20}/, :name => /[^-]+/
Now we can create helper methods that take all of these wonderful parameters.
def custom_d_path(destination, params={})
d_path(
destination.global_id,
string_for_url(destination.name)
) + (params.size > 0 ? create_other_parameters(params) : "")
end
The method string_for_url() just replaced spaces with underscores and removed illegal characters.
The create_other_parameters() method appended parameters in a subtle way that ensured that Google and Yahoo! wouldn’t be prejudiced against dynamic pages with parameters. (This is another topic for another time.)
In short, now we can simply call custom_d_path(destination) from any view (or controller if we included the helper in both ActionView and ActionController classes).
I realize that there may be a better way to do this to make it simpler to code, but this is a simple example of a way to solve this problem.
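The body of string_for_url() is not shown in this post, so the following is only a sketch of what such a helper might look like, not the original. It swaps spaces for underscores and drops anything that is not a letter, digit, or underscore. Lowercasing and the exact set of “illegal” characters are assumptions.

```ruby
# Sketch of a URL-name helper; the allowed character set is assumed.
def string_for_url(name)
  slug = name.strip.downcase
  slug = slug.gsub(/\s+/, '_')   # spaces become underscores
  slug.gsub(/[^a-z0-9_]/, '')    # drop everything else, dashes included
end

puts string_for_url('Mountain View')
puts string_for_url("St. John's - Newfoundland")
```

Dropping dashes matters here because the route uses ’-’ as a separator and restricts :name to characters other than a dash.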
Now for a couple of caveats:
- For those who have OCRD (obsessive compulsive REST disorder) the urls may not suit your style. I use them for the read only pages of a site.
- You may not need to go to this extreme to keyword pack your urls… there are many other approaches that may be more robust and easier to implement.
Hopefully this example helps someone. :)
Merging Branches with Subversion using CLI and FileMerge
On small projects I usually work right out of trunk to avoid the need to merge, but when working in teams to implement features that will be released separately, creating a branch or two is the way to go. The only problem with working with branches is that you have to merge your code periodically in order to avoid nightmares. Here are the steps that I use to do a simple merge between a development branch and trunk. If there are better ways or if I missed something please let me know, but this is what worked for me.
1) First svn update local working copy (both trunk and branches)
2) Change directory to the branch (branches/development)
cd /Users/Paul/Documents/test_svn/repo/
3) Run a merge command similar to the one below as a dry-run to see if everything looks OK:
svn merge --dry-run -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/trunk
4) Then if you are satisfied with what you see, you can run the real command which will actually update your working copy with the merged files from trunk.
svn merge -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/trunk
5) If you have conflicts (lines that start with “C”), then it’s time to merge the changes. I use FileMerge to merge the right version with the working version and then I save the merged file.
6) Check in all of the merged files by doing a svn commit.
7) Now change directories to the trunk working copy and run the following as a dry run.
svn merge --dry-run -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/branches/development
8) Then if everything checks out, you do the real merge:
svn merge -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/branches/development
NOTE: when merging the branch back into trunk, you must use the same revision number as you did when you merged trunk into the branch, or the revision number of the commit made after the last merge from trunk to your branch.
If you have not done any regular merges, which you should do BTW to avoid really hairy merges, then your revision numbers for both merges will be the same.
9) Resolve any conflicts.
10) Check in all of the merged files by doing a svn commit.
Now your two branches are synced up! Yeah! Happy merging!
The key is making sure you keep track of revision numbers and merges. One way to do that is to create a tag with a date or sequence number. Also, you can look into the history by using svn log.

