Proper link title conversion when the source URL has no codepage specified

#1
03-15-2009, 04:18 PM
kossmoss
New Pligger
Pligg Version: 1.0Rus
Pligg Template: wistie mod
 
Join Date: May 2008
Location: Russia
Posts: 19
The Link class can convert a link title fetched from a URL only if the source HTML contains a '<meta http-equiv="Content-Type" ....' tag. Many pages do not specify an encoding in their HTML code but do specify it in their HTTP headers. When I add a link to such a page, the title comes out in the wrong encoding.

To fix this, add some code to libs/link.php. In class PliggHTTPRequest, add:
Code:
//new variable for content-type header
var $content_type;	//content-type
Then add these lines to the DownloadToString() function to save the header (I inserted them before the redirection section):
Code:
			if(isset($headers['content-type']))
				$this->content_type = $headers['content-type'];
After that, add the following code to the get() function of class Link:
Code:
		if(preg_match('/charset=([a-zA-Z0-9-_]+)/i', $this->html, $matches)) {
			//no need to change code here - kossmoss
			.....
			.....
		}
		else if(preg_match('/charset=([a-zA-Z0-9-_]+)/i', $r->content_type, $matches)) {
			$this->encoding=trim($matches[1]);
			//you need iconv to encode to utf-8
			if(function_exists("iconv"))
			{
				if(strcasecmp($this->encoding, 'utf-8') != 0) {
					//convert the html code into utf-8 whatever encoding it is using
					$this->html=iconv($this->encoding, 'UTF-8//IGNORE', $this->html);
				}
			}
		}
This way the title is converted properly whether the charset is declared in the HTML or only in the HTTP headers. One thing probably still needs to be added: a fallback to mb_convert_encoding() for PHP installations where iconv() is not available.
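The fallback mentioned above could look roughly like the sketch below. It is not part of Pligg: the helper names charset_from_content_type() and to_utf8() are hypothetical, and the sample header and bytes are made up for illustration. Only preg_match(), iconv() and mb_convert_encoding() are real PHP functions. The idea is to read the charset from the Content-Type header value, try iconv() first, and use mbstring only when iconv is missing or fails.

```php
<?php
// Hypothetical helper (not part of Pligg): extract the charset name from a
// Content-Type header value such as "text/html; charset=windows-1251".
function charset_from_content_type($content_type)
{
    // Case-insensitive; tolerates spaces around "=" and an opening quote.
    if (preg_match('/charset\s*=\s*"?([a-zA-Z0-9_-]+)/i', $content_type, $matches)) {
        return trim($matches[1]);
    }
    return null; // no charset parameter in the header
}

// Hypothetical helper (not part of Pligg): convert $html to UTF-8,
// preferring iconv and falling back to mbstring.
function to_utf8($html, $encoding)
{
    if ($encoding === null || strcasecmp($encoding, 'utf-8') == 0) {
        return $html; // unknown or already UTF-8: leave the bytes untouched
    }
    if (function_exists('iconv')) {
        $converted = @iconv($encoding, 'UTF-8//IGNORE', $html);
        if ($converted !== false) {
            return $converted;
        }
    }
    if (function_exists('mb_convert_encoding')) {
        // Argument order: string, target encoding, source encoding.
        // mbstring may not recognise every charset name that iconv accepts.
        return mb_convert_encoding($html, 'UTF-8', $encoding);
    }
    return $html; // no converter available
}

// Usage with made-up sample data: six bytes of windows-1251 text.
$header = 'text/html; charset=windows-1251';
$bytes  = "\xCF\xF0\xE8\xE2\xE5\xF2";
echo to_utf8($bytes, charset_from_content_type($header)), "\n";
```

In the get() function above, the else-if branch would then only need to call something like to_utf8($this->html, $this->encoding) instead of checking for iconv inline.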
#2
07-24-2009, 08:14 PM
New Pligger
Pligg Version: 1.0.0
 
Join Date: Jul 2009
Posts: 21
Kossmoss,
Thank you for your mod, but I still can't get it to work.
I have the same problem, but when I add the code to class Link, function get(), I receive:
Parse error: syntax error, unexpected T_IF, expecting '&' or T_VARIABLE in /home/bubahuba/public_html/izberi-kupi.com/libs/link.php on line 62
I hope for your reply!
#3
07-24-2009, 08:45 PM
New Pligger
Pligg Version: 1.0.0
 
Join Date: Jul 2009
Posts: 21
Quote:
else if(preg_match('/charset=([a-zA-Z0-9-_]+)/i', $r->content_type, $matches)) {
$this->encoding=trim($matches[1]);
//you need iconv to encode to utf-8
if(function_exists("iconv"))
{
if(strcasecmp($this->encoding, 'utf-8') != 0) {
//convert the html code into utf-8 whatever encoding it is using
$this->html=iconv($this->encoding, 'UTF-8//IGNORE', $this->html);
}
It does not accept the "else"; I think this is the problem.
#4
07-25-2009, 06:24 AM
kossmoss
New Pligger
Pligg Version: 1.0Rus
Pligg Template: wistie mod
 
Join Date: May 2008
Location: Russia
Posts: 19
Your problem seems to be caused by incorrect PHP syntax (I wrote my code for an older Pligg version, so it probably needs some modification for Pligg v1).
Which part of this code do you have on line 62?

Last edited by kossmoss; 07-25-2009 at 06:31 AM.
#5
07-29-2009, 12:26 PM
New Pligger
Pligg Version: 1.0.0
 
Join Date: Jul 2009
Posts: 21
I managed it this way: I erased the following lines at the very end:
Quote:
//print_r($headers);
if (extension_loaded('iconv') && preg_match('/charset=(.+)$/',$headers['content-type'],$m))
$body = iconv($m[1],"UTF-8",$body);
and it is okay now. Thank you.

I am using Pligg 1.0.0, which may be the reason. On line 62 I have:
Quote:
function get($url) {
$url=trim($url);
Thanks for your attention.
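For reference, the pattern in the removed snippet, '/charset=(.+)$/', behaves differently from the pattern used in post #1: it is case-sensitive and captures everything up to the end of the header value. The sketch below only demonstrates that regex behaviour. The two header strings and the stricter pattern are illustrative assumptions, not taken from Pligg, and the sketch does not claim to explain why removing the snippet helped.

```php
<?php
// Two made-up header values (assumptions for illustration only).
$quoted = 'text/html; charset="windows-1251"';
$upper  = 'text/html; Charset=windows-1251';

// Pattern from the quoted snippet: case-sensitive, captures to end of string.
$loose = '/charset=(.+)$/';
preg_match($loose, $quoted, $m);
$loose_quoted = $m[1];                         // the quotes are part of the capture
$loose_upper  = preg_match($loose, $upper, $m); // 0: "Charset" is not matched

// A stricter variant (an assumption): case-insensitive, skips an opening
// quote, and captures only characters that can appear in a charset name.
$strict = '/charset\s*=\s*"?([a-zA-Z0-9_-]+)/i';
preg_match($strict, $quoted, $a);
preg_match($strict, $upper, $b);
$strict_quoted = $a[1];
$strict_upper  = $b[1];

var_dump($loose_quoted, $loose_upper, $strict_quoted, $strict_upper);
```

A charset name that still carries its quotes is not one that iconv() would be expected to recognise, which is one reason to prefer the stricter form when reading the header.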
Tags
encoding, fix, link, title
