[SOLVED] Mangled accents in the 'link_title' field of the 'links' table

#1
07-18-2007, 07:19 PM
liotier
Casual Pligger
Pligg Version: Slightly modified 9.8.2
Pligg Template: Slightly modified Yget from 9.8.2
 
Join Date: Jul 2007
Location: Paris La Défense, France
Posts: 70
Hello, I have been using Beta 9.7 (07.09.2007) since yesterday with the default template. I am quite amazed at how fast one can get a smooth Pligg site up and running - it seems like much work has already gone into this code. I still have to understand how to get "UrlMethod 2" to work, but I'm sure I'm the thousandth newbie who will soon figure it out after digging through the forums a bit...

What I currently can't figure out is why my accented characters are mangled when a link title is stored in the 'link_title' field of the 'links' table. For example, "é" and "ç" are stored incorrectly. I set up a first Pligg test site at #b^2 links / Published News to play with it. You can see examples of my problem at #b^2 links - Détournement de mobilier urbain pour un sondage (notice the mangled "é") and at #b^2 links - Chez moi ça marche (notice the mangled "ç").

In both cases the page renders fine but the URL is wrong: http://infotain.ruwenzori.net/out.ph...our_un_sondage and http://infotain.ruwenzori.net/out.ph...%BF%BDa_marche respectively.

I created my database with the utf8_unicode_ci collation and then tried utf8_general_ci, but that did not change anything.

Since the database record itself is mangled, I guessed after reading a few threads that the answer may have something to do with libs/utils.php. I found that "é" and "ç" were both missing from the list of "$output = str_replace" character replacements around line 260 of libs/utils.php. Can that be the cause of my problem? Am I the only French user who has encountered this specific problem? Has anybody figured out which "$output = str_replace" lines should be added to support titles in French?

#2
07-24-2007, 10:51 AM
liotier
I would have thought that, given the history of i18n issues I found in the forums, someone would have been able to point me in the right direction...

#3
07-25-2007, 02:01 PM
liotier
Quote:
Originally Posted by liotier
Since the database record itself is mangled, I guessed after reading a few threads that the answer may have something to do with libs/utils.php. I found that "é" and "ç" were both missing from the list of "$output = str_replace" character replacements around line 260 of libs/utils.php. Can that be the cause of my problem?

Reflecting on that solution, I thought that it would not work: the function I am talking about is named makeUrlFriendly(), whereas my problem is sanitizing the title, not the URL.

But then more grepping led me to discover that this function is used in templates/yget/submit_step_3.tpl at line 58 to sanitize the title so that it can be used in the story's URL:

PHP Code:
$linkres->title_url = makeUrlFriendly($linkres->title);

So I modified libs/utils.php so that makeUrlFriendly() supports my problematic accents:

PHP Code:
276a277,281
>    $output = str_replace("é", "e", $output);
>    $output = str_replace("è", "e", $output);
>    $output = str_replace("ç", "c", $output);
>    $output = str_replace("ù", "u", $output);
>    $output = str_replace("à", "a", $output);

A few tests later, I confirmed that this solved my problem.

Now, can anyone add that to the dev tree?

Last edited by liotier; 07-25-2007 at 02:02 PM. Reason: Typo

#4
08-01-2007, 07:36 AM
liotier
I forgot one... You may add the following to the list:

PHP Code:
$output = str_replace("ê", "e", $output);

#5
08-02-2007, 05:41 AM
beatniak
Pligg Donor
 
Join Date: Apr 2006
Location: NL - 52.100863;5.108356
Posts: 197
Great job liotier!

Here are the characters needed for German / Norwegian / Dutch / etc. (note that the Turkish letters ş, ğ and ı are not covered):

Code:
$output = str_replace("ì", "i", $output);
$output = str_replace("í", "i", $output);
$output = str_replace("î", "i", $output);
$output = str_replace("ï", "i", $output);
$output = str_replace("Ì", "I", $output);
$output = str_replace("Í", "I", $output);
$output = str_replace("Î", "I", $output);
$output = str_replace("Ï", "I", $output);
$output = str_replace("ò", "o", $output);
$output = str_replace("ó", "o", $output);
$output = str_replace("ô", "o", $output);
$output = str_replace("õ", "o", $output);
$output = str_replace("ö", "o", $output);
$output = str_replace("ø", "o", $output);
$output = str_replace("Ò", "O", $output);
$output = str_replace("Ó", "O", $output);
$output = str_replace("Ô", "O", $output);
$output = str_replace("Õ", "O", $output);
$output = str_replace("Ö", "O", $output);
$output = str_replace("Ø", "O", $output);
$output = str_replace("ù", "u", $output);
$output = str_replace("ú", "u", $output);
$output = str_replace("û", "u", $output);
$output = str_replace("ü", "u", $output);
$output = str_replace("Ù", "U", $output);
$output = str_replace("Ú", "U", $output);
$output = str_replace("Û", "U", $output);
$output = str_replace("Ü", "U", $output);
$output = str_replace("é", "e", $output);
$output = str_replace("è", "e", $output);
$output = str_replace("ê", "e", $output);
$output = str_replace("ë", "e", $output);
$output = str_replace("È", "E", $output);
$output = str_replace("É", "E", $output);
$output = str_replace("Ê", "E", $output);
$output = str_replace("Ë", "E", $output);
$output = str_replace("à", "a", $output);
$output = str_replace("á", "a", $output);
$output = str_replace("â", "a", $output);
$output = str_replace("ã", "a", $output);
$output = str_replace("ä", "a", $output);
$output = str_replace("å", "a", $output);
$output = str_replace("À", "A", $output);
$output = str_replace("Á", "A", $output);
$output = str_replace("Â", "A", $output);
$output = str_replace("Ã", "A", $output);
$output = str_replace("Ä", "A", $output);
$output = str_replace("Å", "A", $output);
$output = str_replace("ñ", "n", $output);
$output = str_replace("Ñ", "N", $output);
$output = str_replace("æ", "ae", $output);
$output = str_replace("Æ", "AE", $output);
$output = str_replace("ß", "ss", $output);
$output = str_replace("ç", "e", $output); // wrong: the replacement should be "c" (see post #7)
$output = str_replace("Ç", "C", $output);
$output = str_replace("ý", "y", $output);
$output = str_replace("ÿ", "y", $output);
$output = str_replace("Ý", "Y", $output);

Last edited by beatniak; 08-02-2007 at 05:50 AM. Reason: added the capitals as well
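
For anyone copying the list above, the same replacements can also be kept in a single lookup table, which makes a slip like the "ç" line easier to spot. This is only a minimal sketch, not Pligg code: foldAccents() is a hypothetical name, and it assumes that both the source file and the incoming title are UTF-8 encoded. It uses "c" for "ç", as corrected in post #7.

```php
<?php
// Hypothetical helper, NOT part of Pligg: folds accented Latin letters to ASCII.
// Assumption: this file and the input string are both UTF-8 encoded.
function foldAccents($text)
{
    // strtr() with an array replaces whole substrings in a single pass,
    // so multi-byte UTF-8 keys are matched as complete sequences.
    static $map = array(
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a', 'å' => 'a',
        'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'A', 'Å' => 'A',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'Ì' => 'I', 'Í' => 'I', 'Î' => 'I', 'Ï' => 'I',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o',
        'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 'Ö' => 'O', 'Ø' => 'O',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'U',
        'ñ' => 'n', 'Ñ' => 'N',
        'ç' => 'c', 'Ç' => 'C',
        'ý' => 'y', 'ÿ' => 'y', 'Ý' => 'Y',
        'æ' => 'ae', 'Æ' => 'AE', 'ß' => 'ss',
    );

    return strtr($text, $map);
}

echo foldAccents('Chez moi ça marche'), "\n"; // Chez moi ca marche
```

Inside makeUrlFriendly() this would stand in for the long run of str_replace() calls; the rest of that function is not shown in this thread, so how it fits in is left open.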

#6
08-20-2007, 10:54 PM
Casual Pligger
 
Join Date: Aug 2007
Posts: 65
Okay, but how would one go about it with Chinese characters?
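
A per-character replacement table does not scale to Chinese. One possible approach, sketched here as an assumption rather than anything Pligg does, is to leave the characters in the slug and percent-encode them. encodeSlug() is a hypothetical name, the input is assumed to be valid UTF-8, and the underscore separator is only inferred from the URLs quoted in post #1.

```php
<?php
// Hypothetical helper, NOT part of Pligg: keeps non-Latin characters in the
// slug and percent-encodes them instead of mapping each one to ASCII.
// Assumption: $title is valid UTF-8.
function encodeSlug($title)
{
    // Collapse runs of whitespace into single underscores.
    $slug = preg_replace('/\s+/u', '_', trim($title));

    // rawurlencode() percent-encodes every byte except letters, digits,
    // and the characters "-", "_", "." and "~".
    return rawurlencode($slug);
}

echo encodeSlug('中文'), "\n"; // %E4%B8%AD%E6%96%87
```

Most browsers display such URLs with the original characters, but they are long and unreadable in plain text, so whether this is acceptable depends on the site.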

#7
10-15-2007, 11:41 AM
liotier
I spotted an error in beatniak's list:

$output = str_replace("ç", "e", $output);

is wrong and should be

$output = str_replace("ç", "c", $output);

#8
10-15-2007, 02:07 PM
dollars5
Pligg Donor
 
Join Date: Dec 2006
Location: India
Posts: 1,961
lol - I never looked at it that closely - I even copied it and used it on one of my sites - fixing it now.

#9
10-25-2007, 06:48 PM
WebAbitibi
New Pligger
Pligg Version: 9.8
Pligg Template: French
 
Join Date: Oct 2007
Posts: 6
From what I can see, before storing a title in your db, you replace the accented letters with unaccented ones... that's not really what you want to do, is it?

What str_replace actually does is search for é and replace it with e, so in a title you get:

ORIGINAL : Un avion est écrasé

AFTER STR_REPLACE : Un avion est ecrase

What I'm looking for is a way to store the title exactly as it is, unchanged, with all its accents.

#10
10-26-2007, 02:58 AM
liotier
Quote:
Originally Posted by WebAbitibi
From what I can see, before storing a title in your db, you replace the accented letters with unaccented ones... that's not really what you want to do, is it?

[..]

What I'm looking for is a way to store the title exactly as it is, unchanged, with all its accents.

I certainly agree with you, but just sanitizing the strings is a cheap way to get around the problem, whereas properly internationalizing (i18n) Pligg is a considerable endeavour that may not be a priority at the moment.
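
Going by the line quoted in post #3, makeUrlFriendly() is applied to a copy of the title that feeds title_url, so the replacements should only affect the URL slug and not the displayed title. A minimal sketch of that separation follows. slugFromTitle() is a hypothetical stand-in reduced to the five replacements from post #3, the array keys merely mirror the $linkres fields, and the underscore separator is an assumption based on the URLs in post #1.

```php
<?php
// Hypothetical stand-in for the accent handling in makeUrlFriendly(),
// NOT the real Pligg function. Assumption: input and this file are UTF-8.
function slugFromTitle($title)
{
    // Fold the five accented letters from post #3 to plain ASCII.
    $slug = str_replace(
        array('é', 'è', 'ç', 'ù', 'à'),
        array('e', 'e', 'c', 'u', 'a'),
        $title
    );

    // Assumed separator: spaces become underscores.
    return str_replace(' ', '_', $slug);
}

$title = 'Un avion est écrasé';

// The title is kept as typed; only the derived slug loses its accents.
$story = array(
    'title'     => $title,
    'title_url' => slugFromTitle($title),
);

echo $story['title'], "\n";     // Un avion est écrasé
echo $story['title_url'], "\n"; // Un_avion_est_ecrase
```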