Fix capitalized words in a MySQL table

I got a sup­port request today for the Fake Name Gen­er­a­tor. A vis­i­tor ordered US addresses, and some of the cities came back in all caps and some nor­mal (e.g., FOLSOM and Folsom).

After dig­ging into my code, I found that I was check­ing if a city name was in upper­case, and if so I was forc­ing it to low­er­case then run­ning PHP’s ucwords() on it (so FOLSOM -> fol­som -> Fol­som). This was hap­pen­ing on the web but not in bulk orders, so I never noticed it was hap­pen­ing until some­one told me.

Hav­ing recently started a cru­sade to make the Fake Name Gen­er­a­tor blaz­ing fast, I decided to fix the data­base rather than fix the PHP. But data­bases tend to be case insen­si­tive, so how to accom­plish this with­out hav­ing to pull every­thing into PHP, then save it back? Well, I found this handy MySQL query that lets you do case sen­si­tive queries:

select distinct city from ats__ng_areazipus where upper(city) = city collate latin1_bin;

That col­late latin1_bin some­how mag­i­cally makes things case sen­si­tive. So my query is show­ing me all the dis­tinct city names where the upper­case ver­sion of the city matches what is actu­ally in the table. Great! But how to go from FOLSOM to Folsom?

Search­ing on Google, I found this nifty MySQL user func­tion that lets you do some­thing sim­i­lar to PHP’s ucwords():

DROP FUNCTION IF EXISTS proper;
SET GLOBAL  log_bin_trust_function_creators=TRUE;
DELIMITER |
CREATE FUNCTION proper( str VARCHAR(128) )
RETURNS VARCHAR(128)
BEGIN
  DECLARE c CHAR(1);
  DECLARE s VARCHAR(128);
  DECLARE i INT DEFAULT 1;
  DECLARE bool INT DEFAULT 1;
  DECLARE punct CHAR(17) DEFAULT ' ()[]{},.-_!@;:?/';
  SET s = LCASE( str );
  WHILE i < LENGTH( str ) DO
    BEGIN
      SET c = SUBSTRING( s, i, 1 );
      IF LOCATE( c, punct ) > 0 THEN
        SET bool = 1;
      ELSEIF bool=1 THEN
        BEGIN
          IF c >= 'a' AND c <= 'z' THEN
            BEGIN
              SET s = CONCAT(LEFT(s,i-1),UCASE(c),SUBSTRING(s,i+1));
              SET bool = 0;
            END;
          ELSEIF c >= '0' AND c <= '9' THEN
            SET bool = 0;
          END IF;
        END;
      END IF;
      SET i = i+1;
    END;
  END WHILE;
  RETURN s;
END;
|
DELIMITER ;

So after adding that user func­tion to my MySQL data­base, I came up with the fol­low­ing query to fix my data:

update ats__ng_areazipus set city = proper(city) where upper(city) = city collate latin1_bin;

Is it super effi­cient? Prob­a­bly not, but I’m only run­ning it once, so effi­ciency isn’t an issue. And now I’ve been able to strip some con­di­tion­als and slow string manip­u­la­tion code out of my name gen­er­a­tor class! Yay!

This entry was posted in My Sites, Web Dev. Bookmark the permalink. Post a comment or leave a trackback: Trackback URL.

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>