Changing File Encoding Using Ruby 1.9.2

written by Paul on January 3rd, 2011 @ 06:22 AM

Currently, I am upgrading an application from Ruby 1.8.7 to Ruby 1.9.2. One of the big differences between 1.8 and 1.9 is multi-byte character support: in 1.9, every string carries an encoding.

The Problem

We have thousands of static HTML files that were generated under Ruby 1.8, and Ruby 1.9 fails when it reads them. As usual, before digging in to solve the problem, I did a quick search to see what other people have been doing. My search yielded a bunch of multi-line scripts and techniques, most of which were from the Ruby 1.8 days.
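The exact error is not shown here. As a minimal sketch of one likely cause, assume the files were saved in a legacy single-byte encoding such as ISO-8859-1 (an assumption; the source encoding is never named). When Ruby 1.9 tags such bytes as UTF-8 without converting them, the string is not valid UTF-8, and String#valid_encoding? reports that:

```ruby
# Bytes for "café" in ISO-8859-1 (an assumed legacy encoding, for illustration only).
legacy_bytes = [0x63, 0x61, 0x66, 0xE9].pack('C*')

# Tag the bytes as UTF-8 without converting them.
as_utf8 = legacy_bytes.dup.force_encoding('UTF-8')
puts as_utf8.valid_encoding?    # false: 0xE9 alone is not a valid UTF-8 sequence

# Tag the same bytes with the assumed source encoding instead.
as_latin1 = legacy_bytes.dup.force_encoding('ISO-8859-1')
puts as_latin1.valid_encoding?  # true
```

A check like this can be run on a sample file before converting the whole batch.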

The Solution

In short, I wrote a short script in irb and it completed my task quickly. One thing that I am really happy about is that Ruby 1.9.2 strings have a method called encode, which provides great utility when performing these kinds of tasks.

So here is the code:


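`find . -name '*.html'`.split("\n").each do |filename|
  puts filename
  # Read the file before opening it for writing: "w+" truncates on open,
  # so reading from the same handle would return an empty string.
  content = File.read(filename)
  # encode converts from the encoding Ruby assigned to the string when reading it.
  File.open(filename, "w") { |handle| handle.write(content.encode('UTF-8')) }
end; nil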

If you are interested in the options that the encode method accepts, check out the String#encode documentation.
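As a rough sketch of those options: encode accepts an explicit source encoding as its second argument, plus options such as invalid, undef and replace that control what happens to bytes that cannot be converted. The helper name to_utf8 and the ISO-8859-1 default are hypothetical and chosen only for illustration; the real source encoding of the files is not stated here.

```ruby
# Hypothetical helper; the ISO-8859-1 default is an assumption, not taken from the post.
def to_utf8(text, source_encoding = 'ISO-8859-1')
  text.encode(
    'UTF-8', source_encoding,
    :invalid => :replace,  # replace byte sequences that are invalid in the source encoding
    :undef   => :replace,  # replace characters that have no equivalent in UTF-8
    :replace => '?'        # the replacement string to use in both cases
  )
end

legacy = [0x63, 0x61, 0x66, 0xE9].pack('C*')  # "café" as ISO-8859-1 bytes
converted = to_utf8(legacy)
puts converted.encoding         # UTF-8
puts converted.valid_encoding?  # true
```

Passing the source encoding explicitly avoids depending on whatever default encoding Ruby assigns when the file is read.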

 

3 responses to “Changing File Encoding Using Ruby 1.9.2”

  1. Jan Friedrich says :

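    Dir['**/*.html'].each do |filename|
      puts filename
      # Read first; opening with 'w+' would truncate the file before it is read.
      content = File.read(filename)
      File.open(filename, 'w') do |handle|
        handle.write(content.encode('UTF-8'))
      end
    end; nil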

  2. Jan Friedrich says :

    Formatting lost in the last comment. Textile doesn’t like me. 😦

    Have a look at the posted code in this gist: https://gist.github.com/764563

  3. Paul says :

    You are right, Jan, Dir['**/*.html'] is a better approach. Thanks for sharing.
