Character Set and Collation Housekeeping: Pre-vB5 upgrade prep.

    No one wants to wade into this one?

    I've been putting this off, but it's time to address the character set and collation issues on our vBulletin v4.2.0 PL2 database. I'd appreciate some folks helping me square this away...

    Forum is in the US, and the whole latin1 vs. UTF-8 thing is driving me crazy. I'd really like to take my database all the way latin1 OR all the way UTF-8.

    Here's what I have right now:

    vBulletin Language-->General: HTML Character Set = ISO-8859-1

    SHOW VARIABLES LIKE '%character%'
    character_set_client      utf8
    character_set_connection  utf8
    character_set_database    latin1
    character_set_filesystem  binary
    character_set_results     utf8
    character_set_server      latin1
    character_set_system      utf8
    character_sets_dir        /usr/local/mysql-5.0.96/share/mysql/charsets/

    SHOW VARIABLES LIKE '%collation%'
    collation_connection  utf8_general_ci
    collation_database    latin1_general_ci
    collation_server      latin1_german2_ci

    And to clarify, here are the database settings as read from phpMyAdmin:
    Database Character Set: utf8 <--- does not agree with character_set_database (latin1) in the SHOW VARIABLES results above
    Database Collation: latin1_general_ci
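
    To double-check that mismatch, I could ask the server directly for the database-level defaults instead of trusting the phpMyAdmin screen. A minimal sketch, assuming the forum database is the currently selected one (USE it first) and that information_schema is available, which it should be on MySQL 5.0.x:

```sql
-- Default character set and collation recorded for the current database
SELECT SCHEMA_NAME,
       DEFAULT_CHARACTER_SET_NAME,
       DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = DATABASE();
```

    Whatever this returns should line up with character_set_database and collation_database above.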

    Tables Character Sets: "Character Set of the file" shows "utf8"
    Tables Collations: latin1_swedish_ci for the VAST majority of them. Have two TapaTalk tables that are latin1_general_ci
    Field Collations: MANY unassigned, but of those that have explicit assignment, they are all latin1_swedish_ci, except for the assigned TapaTalk table fields which are latin1_general_ci
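
    Rather than eyeballing every table in phpMyAdmin, the same inventory can probably be pulled with two queries. This is only a sketch, assuming the forum database is the currently selected one; the column names come from the standard information_schema tables:

```sql
-- How many tables use each collation
SELECT TABLE_COLLATION, COUNT(*) AS table_count
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
GROUP BY TABLE_COLLATION;

-- Text columns whose collation is something other than latin1_swedish_ci
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND COLLATION_NAME IS NOT NULL
  AND COLLATION_NAME <> 'latin1_swedish_ci'
ORDER BY TABLE_NAME, COLUMN_NAME;
```

    I suspect the "unassigned" fields are simply non-text columns (integers, timestamps and so on), which would show a NULL collation here, but I haven't confirmed that.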

    MOSTLY, I want to rid myself of the collation errors, BUT I'd also like to settle on a single character set and collation.

    It would appear the EASIEST fixes would involve moving toward the latin1_swedish_ci collation on the latin1 character set... no need to change the HTML Character setting in admincp.
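
    If I'm reading the MySQL manual right, the latin1 route would come down to statements like the ones below. Treat this as a sketch to run on the test system only: forum_db and tapatalk_example are placeholder names, not my real database or table names.

```sql
-- Change the database default, used for tables created from now on
-- (forum_db is a placeholder database name)
ALTER DATABASE forum_db CHARACTER SET latin1 COLLATE latin1_swedish_ci;

-- Convert one existing table, including its text columns
-- (tapatalk_example is a placeholder table name)
ALTER TABLE tapatalk_example
  CONVERT TO CHARACTER SET latin1 COLLATE latin1_swedish_ci;
```

    The ALTER TABLE would have to be repeated for each table that isn't already latin1_swedish_ci. Neither statement touches collation_server, so that would keep reporting latin1_german2_ci until the server configuration itself is changed.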

    Would there be any benefit to going the hard way and setting everything to UTF-8 (to include a LOT of ALTERing of the database)?

    I'm studying, but as you can see, I have a ways to go before I understand how all this fits together and the implications.

    PS: Won't do ANYTHING without multiple backups, and will test any setup change in my test system.

