2

I have a Ruby/Rails app.

I have an artists table in my PostgreSQL database which I want to query by name. Some artists have Portuguese (and other non-ASCII) characters in their names, and I am having trouble querying them.

For example one band is called Legião Urbana. If I query with the string "legiã" from my app I get the following params:

{"action"=>"search_artist", "q"=>"legi\343", "controller"=>"home"}

However I get an error from the query

Artist.all(:conditions => "name LIKE '%#{params[:q]}%'")

PGError: ERROR:  invalid byte sequence for encoding "UTF8": 0xe32527

What should I be doing to convert the parameter into UTF-8, or otherwise prevent this error?

2 Answers

5

You need to know the encoding of that parameter in the query string.

Ruby 1.9 includes support for strings tagged with their encodings. In Ruby 1.9, you could:

params[:q].encoding # Rails 3 on 1.9 generally presents strings in UTF-8
params[:q].encode('utf-8') # ask Ruby to re-encode it to UTF-8

Then you need to convert the parameter from that encoding into UTF-8 before doing string-interpolation (#{...} syntax).
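
As a rough sketch of that conversion on Ruby 1.9 or later: the byte PostgreSQL rejected, 0xE3, is "ã" in ISO-8859-1 (Latin-1), so the example assumes that is what the browser sent. The helper name to_utf8 and the Latin-1 fallback are assumptions made for illustration; neither is part of Rails or of the original code.

```ruby
# Hypothetical helper (not a Rails API): return a valid UTF-8 copy of raw input.
# Assumption: anything that is not valid UTF-8 is really ISO-8859-1 (Latin-1).
def to_utf8(raw, fallback = Encoding::ISO_8859_1)
  candidate = raw.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  # Re-tag the bytes with the assumed source encoding, then transcode them.
  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

# The bytes from the question: "legi" followed by 0xE3.
latin1_bytes = [0x6C, 0x65, 0x67, 0x69, 0xE3].pack('C*')
query = to_utf8(latin1_bytes)
puts query.valid_encoding?  # prints "true"
```

Checking valid_encoding? first means input that is already UTF-8 passes through unchanged, and only invalid byte sequences are reinterpreted with the fallback encoding.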

Or you need to pass the parameter as a SQL parameter, not using string-interpolation.

Of course, this brings up a security consideration: unless you know how to properly encode text for use in SQL, you should never use string interpolation to build SQL fragments. Because SQL fragments with parameters are quick and easy to write in Rails, you should use them.

# Rails 2
Artist.all(:conditions => ['name like ?', "%#{params[:q]}%"])
Artist.all(:conditions => ['name like :q', { :q => "%#{params[:q]}%" }])

# Rails 3 (note the % on both sides, matching the original substring search)
Artist.where('name like ?', "%#{params[:q]}%")
Artist.where('name like :q', :q => "%#{params[:q]}%")

SQL injection is the security problem that occurs when string interpolation produces correct SQL fragments for some input strings but not for others. In languages or frameworks where parameters are harder to work with, string interpolation or string building can be acceptable, but only if you research exhaustively how the interpolated strings must be encoded so that the resulting SQL fragment is correct regardless of the input. Because SQL injection is so easy to avoid in Rails with ordered or named parameters (see the four samples above), you should have no trouble keeping all of your SQL fragments safe.
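
To make the difference concrete, here is a small plain-Ruby sketch with no database involved. The input value is made up for illustration; any string containing a single quote behaves the same way.

```ruby
# Illustrative input only: an artist name that contains a single quote.
input = "Guns N' Roses"

# Interpolation: the quote inside the input ends the SQL string literal early,
# so the fragment is no longer the SQL that was intended.
interpolated = "name LIKE '%#{input}%'"
puts interpolated

# Bound parameter: the SQL text stays fixed and the value travels separately,
# leaving the quoting to the database adapter.
conditions = ['name LIKE ?', "%#{input}%"]
puts conditions.first
```

The interpolated fragment ends up with an odd number of quote characters, which is exactly the kind of input-dependent breakage described above, while the parameterized fragment is the same string for every input.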


3 Comments

OK, it's more the encoding/converting that I was interested in. How do you find the encoding of the parameter? I'm aware of the SQL injection issue, but I just wanted to give an example of the query.
+1 for "you need to pass the parameter as a SQL parameter, not using string-interpolation".
Unfortunately I'm not too familiar with the 1.8 string-encoding situation. I do know that it can be rather difficult to deal with encoding issues in 1.8, and that if you are facing encoding problems, the best bet is to move to 1.9.
4

I think this might do it, assuming the parameter arrives as Latin-1 (the rejected byte 0xE3 is "ã" in that encoding):

require 'iconv'
Iconv.conv("UTF8", "LATIN1", params[:q])
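
The iconv library was deprecated in Ruby 1.9.3 and dropped from the standard library in 2.0, so on newer Rubies the same conversion is usually written with String#encode. A minimal sketch, assuming the incoming bytes really are ISO-8859-1; the variable names are illustrative.

```ruby
# Assumption: the incoming bytes are ISO-8859-1, as the 0xE3 byte in the error suggests.
raw = [0x6C, 0x65, 0x67, 0x69, 0xE3].pack('C*')  # "legi" + 0xE3, tagged as binary

# encode(destination, source) interprets the bytes as `source` and transcodes them.
utf8 = raw.encode('UTF-8', 'ISO-8859-1')
puts utf8.bytes.length  # 6: four ASCII bytes plus two bytes for the accented letter
```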
