I've discovered that using news.google.com with an advanced search gives me extremely good results for the type of news I'm trying to put on my site. The problem is that the Google RSS feed includes some extra information that I'd like to get rid of. Take a look at an example:
Code:
<item>
  <title>Indianapolis Colts lead ESPY nominations - Times Picayune</title>
  <link>http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0</link>
  <guid isPermaLink="false">tag:news.google.com,2005:cluster=429cd57d</guid>
  <pubDate>Mon, 25 Jun 2007 13:14:06 GMT</pubDate>
  <description><br><table border=0 width= valign=top cellpadding=2 cellspacing=7><tr><td valign=top class=j><a href="http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0">Indianapolis Colts lead ESPY nominations</a><br><font size=-1><font color=#6f6f6f>Times Picayune, LA -</font> <nobr>6 hours ago</nobr></font><br><font size=-1>The Arthur Ashe Courage Award is presented to individuals whose contributions transcend <b>sports</b>. The new Jimmy V Award for Perseverance will be presented to <b>...</b></font><br></table> </description>
</item>

I am using title=link_title, link=link_url, and summary=link_content. Unfortunately, each one of these is somewhat problematic.
For link_title, I would like to remove the '- Times Picayune' portion. Is there a way to strip this with PHP before the item is submitted? The title always ends with a '-' followed by the source name.
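Something along these lines might do it. This is only a rough sketch: `strip_source` is a made-up helper name, and it assumes the source name always follows the last ' - ' in the title, so a headline that itself contains ' - ' is left intact up to that final separator.

```php
<?php
// Hypothetical helper: drop the trailing " - Source" from a feed title.
// Assumption: the source name always follows the LAST " - " in the title.
function strip_source(string $title): string
{
    $pos = strrpos($title, ' - ');
    if ($pos === false) {
        // No separator found, so return the title unchanged.
        return $title;
    }
    return rtrim(substr($title, 0, $pos));
}

echo strip_source('Indianapolis Colts lead ESPY nominations - Times Picayune');
```

Searching from the right with strrpos rather than from the left with strpos is a deliberate choice here, since a dash inside the headline would otherwise cut the title short.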
For link_url, I'd really like to use only the portion between 'url=' and '&cid'.
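For the link, a possible approach is to read the `url` query parameter instead of cutting the string between the two markers. This is a sketch under assumptions: `extract_target_url` is a hypothetical helper name, and the link is assumed to reach PHP with plain `&` separators (if the feed parser leaves them as `&amp;`, they would need decoding first). Note that parse_str also URL-decodes the value, so `%3F` becomes `?` and `%26` becomes `&`.

```php
<?php
// Hypothetical helper: pull the real article address out of a Google News
// redirect link by reading its "url" query parameter.
// Assumption: the link uses plain "&" separators, not "&amp;".
function extract_target_url(string $link): ?string
{
    $query = parse_url($link, PHP_URL_QUERY);
    if (!is_string($query)) {
        // Malformed link or no query string at all.
        return null;
    }

    // parse_str splits the query into an array and URL-decodes each value.
    parse_str($query, $params);

    if (isset($params['url']) && is_string($params['url'])) {
        return $params['url'];
    }
    return null;
}

$link = 'http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R'
      . '&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D'
      . '&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0';

echo extract_target_url($link);
```

Reading the parameter by name should keep working even if the order of the other parameters changes, which a fixed 'url=' to '&cid' cut would not.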
Lastly, for link_content, I would like to keep only what's between '<font size=-1>' and '</font>' (the second such block in the example, which holds the actual summary).
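For the summary, one option is plain string searching. Again this is just a sketch: `extract_summary` is a hypothetical helper name, and it assumes the summary always sits in the last '<font size=-1>' block and contains no nested font tags, as in the example item. It also assumes the description arrives as raw HTML; if it is entity-escaped, it would need html_entity_decode first.

```php
<?php
// Hypothetical helper: return the text inside the LAST <font size=-1> block.
// Assumptions: the summary is the last such block, it has no nested <font>
// tags, and the description is raw (not entity-escaped) HTML.
function extract_summary(string $description): ?string
{
    $open = '<font size=-1>';

    $start = strrpos($description, $open);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);

    // Find the first closing tag after the opening tag.
    $end = strpos($description, '</font>', $start);
    if ($end === false) {
        return null;
    }
    return trim(substr($description, $start, $end - $start));
}

$description = '<br><font size=-1><font color=#6f6f6f>Times Picayune, LA -</font> '
             . '<nobr>6 hours ago</nobr></font><br><font size=-1>The Arthur Ashe Courage Award '
             . 'is presented to individuals whose contributions transcend <b>sports</b>.</font><br>';

$summary = extract_summary($description);
echo $summary, "\n";

// Optionally remove the remaining <b> markup as well.
echo strip_tags($summary), "\n";
```

A regular expression or an HTML parser would also work; string functions are used here only because the markup in the feed looks very regular.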
Is there any hope of hacking this together?


