Fix capitalized words in a MySQL table

Jacob Allred
#my-sites#web-dev

I got a support request today for the Fake Name Generator. A visitor ordered US addresses, and some of the cities came back in all caps and some normal (e.g., FOLSOM and Folsom).

After digging into my code, I found that I was checking if a city name was in uppercase, and if so I was forcing it to lowercase then running PHP’s ucwords() on it (so FOLSOM -> folsom -> Folsom). This was happening on the web but not in bulk orders, so I never noticed it was happening until someone told me.

Having recently started a crusade to make the Fake Name Generator blazing fast, I decided to fix the database rather than fix the PHP. But databases tend to be case insensitive, so how to accomplish this without having to pull everything into PHP, then save it back? Well, I found this handy MySQL query that lets you do case sensitive queries:

select distinct city from ats__ng_areazipus where upper(city) = city collate latin1_bin;

That collate latin1_bin somehow magically makes things case sensitive. So my query is showing me all the distinct city names where the uppercase version of the city matches what is actually in the table. Great! But how to go from FOLSOM to Folsom?

Searching on Google, I found this nifty MySQL user function that lets you do something similar to PHP’s ucwords():

DROP FUNCTION IF EXISTS proper;
SET GLOBAL  log_bin_trust_function_creators=TRUE;
DELIMITER |
CREATE FUNCTION proper( str VARCHAR(128) )
RETURNS VARCHAR(128)
BEGIN
  DECLARE c CHAR(1);
  DECLARE s VARCHAR(128);
  DECLARE i INT DEFAULT 1;
  DECLARE bool INT DEFAULT 1;
  DECLARE punct CHAR(17) DEFAULT ' ()[]{},.-_!@;:?/';
  SET s = LCASE( str );
  WHILE i < LENGTH( str ) DO
    BEGIN
      SET c = SUBSTRING( s, i, 1 );
      IF LOCATE( c, punct ) > 0 THEN
        SET bool = 1;
      ELSEIF bool=1 THEN
        BEGIN
          IF c >= 'a' AND c <= 'z' THEN
            BEGIN
              SET s = CONCAT(LEFT(s,i-1),UCASE(c),SUBSTRING(s,i+1));
              SET bool = 0;
            END;
          ELSEIF c >= '0' AND c <= '9' THEN
            SET bool = 0;
          END IF;
        END;
      END IF;
      SET i = i+1;
    END;
  END WHILE;
  RETURN s;
END;
|
DELIMITER ;

So after adding that user function to my MySQL database, I came up with the following query to fix my data:

update ats__ng_areazipus set city = proper(city) where upper(city) = city collate latin1_bin;

Is it super efficient? Probably not, but I’m only running it once, so efficiency isn’t an issue. And now I’ve been able to strip some conditionals and slow string manipulation code out of my name generator class! Yay!