Skip to content
mokolabs edited this page May 17, 2011 · 5 revisions

How to convert legacy text from Latin1 to UTF8 in MYSQL

Here's a rake task which will connect to your legacy database and convert all your data into UTF8.

desc 'Convert the legacy db to UTF8 before the import'
  task :convert_to_utf8 => :environment do
    LegacyBase.connection.tables.each do |table|
      LegacyBase.connection.execute("ALTER TABLE #{table} CONVERT TO CHARACTER SET utf8;")
    end
  end
end

To check the encoding of your table, you can use this MYSQL query:

SELECT TABLE_SCHEMA, TABLE_NAME, CCSA.CHARACTER_SET_NAME AS DEFAULT_CHAR_SET,
COLUMN_NAME, COLUMN_TYPE, C.CHARACTER_SET_NAME FROM information_schema.TABLES
AS T JOIN information_schema.COLUMNS AS C USING (TABLE_SCHEMA, TABLE_NAME) JOIN
information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA ON
(T.TABLE_COLLATION = CCSA.COLLATION_NAME) WHERE TABLE_SCHEMA=SCHEMA() AND
C.DATA_TYPE IN ('enum', 'varchar', 'char', 'text', 'mediumtext', 'longtext')
ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;