Fixing encoding issues on WordPress (comments display as question marks)

One of my dear friends in Russia subscribes to and reads my blog :) How cool is that?

Well, we also discovered that the comments she posts in Russian (using Cyrillic characters) don’t get saved properly to the database, and are displayed as a bunch of question marks: ‘??????’

As I suspected, the character encoding was not set right. Older versions of WordPress ended up with the latin1_swedish_ci collation by default (weirdly enough). The latest version has this corrected: the default character set is utf8, and you can also set the collation explicitly in the wp-config.php file (as described in this Codex article):

define('DB_COLLATE', 'utf8_general_ci');
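
If you want to see what an existing installation is actually using before changing anything, queries along these lines should show it. This is only a sketch: my_wp_database is the same placeholder database name used in the steps below, and wp_comments assumes the default wp_ table prefix.

```sql
-- Placeholder database name; replace with your own.
USE my_wp_database;

-- Server, database and connection character set settings.
SHOW VARIABLES LIKE 'character_set%';

-- Default character set and collation of the WordPress database.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'my_wp_database';

-- Default collation of the comments table (see the Collation column).
SHOW TABLE STATUS LIKE 'wp_comments';
```

If any of these report latin1 or latin1_swedish_ci, the installation is most likely affected.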

To fix this issue on existing installations, open your MySQL console (command line, MySQL Admin or phpMyAdmin), back up your database (just in case), and follow these steps:

  1. Starting from the top, the database level, let’s make sure that the database encoding is correct:

    ALTER DATABASE my_wp_database CHARACTER SET utf8;
    
  2. While this applies to all new tables created in this database, it does not change the encoding of existing tables. That’s why we’ll need to go a bit deeper, to the table level.

  3. Table level: make sure your table’s default character set is correct:

    ALTER TABLE wp_comments CHARACTER SET utf8;

  4. Again – this change will be effective for all new columns within this table. For existing columns, we’ll need to go down one more level.

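  5. Column level: set the character set on individual columns within your table.

    -- CHANGE restates the whole column definition, so the type given here
    -- (LONGTEXT) replaces whatever type the column had before.
    ALTER TABLE wp_comments CHANGE comment_content comment_content LONGTEXT CHARACTER SET utf8;

    ALTER TABLE wp_comments CHANGE comment_author comment_author LONGTEXT CHARACTER SET utf8;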

  6. This will not repair comments that were already saved as question marks (the original characters are lost), but going forward, your comments will be saved and displayed with the correct encoding.
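
To confirm the column-level change took effect, a query like the following could be run afterwards. It is a minimal sketch that again assumes the placeholder database name my_wp_database and the default wp_ table prefix.

```sql
-- List the character set and collation of every text column in wp_comments.
-- 'my_wp_database' and 'wp_comments' are placeholders; adjust to your install.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'my_wp_database'
  AND TABLE_NAME = 'wp_comments'
  AND CHARACTER_SET_NAME IS NOT NULL;
```

comment_content and comment_author should now report utf8 (some newer MySQL versions display it as utf8mb3). Any other text column still showing latin1 can be altered the same way as in step 5.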

Source: Codex article on converting database character sets
