Ticket #66 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

language, guid in feeds are wrong

Reported by: datapharmer Owned by: ofer
Priority: major Milestone: 0.5 - Beta wordpress plugin
Component: WordPress Plugin Version: 0.3.6
Keywords: language, feed, guid Cc:

Description

The language feeds do not generate a unique guid for each page; they reuse the guid of the main-language version. This makes it impossible for search engines to distinguish the main-language page from its translations, so the translated pages are not indexed as separate content.

A solution would be to replace the_guid with the_permalink_rss and set isPermaLink="true". The only side effect is that the guid will change if the link structure changes. A redirection plugin such as Redirection would keep this from causing 404s, however, so I don't see it being a huge issue.
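As a rough illustration of the suggested change, the sketch below builds a guid element from an item's permalink rather than from its stored guid. The function name build_guid_element and the example URL are hypothetical and are not part of WordPress or Transposh; in a real feed template the permalink would come from WordPress itself.

```php
<?php
// Hypothetical helper (not a WordPress or Transposh function):
// builds an RSS <guid> element from a permalink.
function build_guid_element(string $permalink, bool $is_permalink = true): string
{
    // RSS 2.0 spells the attribute "isPermaLink".
    $flag = $is_permalink ? 'true' : 'false';
    // Escape the URL so characters such as & stay valid inside XML.
    $url = htmlspecialchars($permalink, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return '<guid isPermaLink="' . $flag . '">' . $url . '</guid>';
}

// Example with an assumed translated-page URL.
echo build_guid_element('http://www2.domain.com/fr/?page_id=42'), "\n";
```

Because the guid is derived from the permalink here, it changes whenever the permalink structure changes, which is the side effect mentioned above.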

Also, the language declaration is not modified on line 18 of the feed output. For example, on the Spanish feed the following is declared on line 18:

<language>en</language>

Expected output is:

<language>es</language>

Change History

comment:1 Changed 2 years ago by ofer

  • Status changed from new to accepted

I believe the language issue was solved in [304].

comment:2 Changed 2 years ago by datapharmer

I see the feed language has already been fixed in the parser in revision 304. I tracked the guid issue down to pages created before the permalink structure was set up. This can be corrected by adding a filter in transposh.php (or possibly elsewhere). I'm not sure whether it has been fixed anywhere yet, but the code I came up with to correct it is as follows:

add_filter('get_the_guid','guid_changer');

function guid_changer($guid) {

$new_guid = the_permalink_rss();
return $new_guid;

comment:3 Changed 2 years ago by datapharmer

Let's repost that so it doesn't break anything!
Here we go:

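add_filter('get_the_guid', 'guid_changer');

// the_permalink_rss() echoes its output and returns nothing,
// so fetch the permalink with get_permalink() and return it.
function guid_changer($guid) {
    $new_guid = get_permalink();
    return $new_guid;
}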

comment:4 Changed 2 years ago by ofer

Two questions that I'd appreciate your point of view on:

  1. Does the guid have to be a real link on the site?
  2. Your code works nicely, but isPermaLink is still false (and might have been true...). How important is it to change it?

Here is what I found on this:
isPermaLink Optional. If set to true, the reader may assume that it is a permalink to the item (a url that points to the full item described by the <item> element). The default value is true. If set to false, the guid may not be assumed to be a url.

So from my understanding, even the current code is OK, as it creates unique ids (although they are not working links), for example <guid ispermalink="false"> http://transposh.org/he/?p=143</guid>,
which is a non-page, but quite unique.

I'd rather avoid this code if it's not needed.

comment:5 Changed 2 years ago by datapharmer

According to the standards, no, the GUID does not need to be a link; it can be any unique string. To fix the isPermaLink issue, the feed section of the parser can be modified as follows:

        // fix urls on feed
        if ($this->feed_fix) {
            foreach (array('link', 'wfw:commentrss', 'comments', 'guid') as $tag) {
                foreach ($this->html->find($tag) as $e) {
                    $e->innertext = call_user_func_array($this->url_rewrite_func, array($e->innertext));
                }
            }

            foreach ($this->html->find('language') as $e) {
                $e->innertext = $this->lang;
            }

            foreach ($this->html->find('guid') as $e) {
                $e->ispermalink = "true";
            }
        }

The relevant portion is this:

            foreach ($this->html->find('guid') as $e) {
                $e->ispermalink = "true";
            }

As for your example of it generating a bad URL but a unique ID, the problem is that I am not getting unique ids on my end. I don't know what the difference is between your install and mine, but my URLs on the French page look like this:
<guid ispermalink="false"> http://www2.domain.com/?page_id=42</guid>
instead of
<guid ispermalink="false"> http://www2.domain.com/fr/?page_id=42</guid>

If it can be corrected some other way, that is fine as well.
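To make the expected guid above concrete, here is a minimal sketch of inserting a language code as the first path segment of an absolute URL. The function add_language_prefix is hypothetical and written only for illustration; it is not the parser's url_rewrite_func, and it makes no attempt to detect a prefix that is already present.

```php
<?php
// Hypothetical helper: inserts $lang as the first path segment of an absolute URL.
function add_language_prefix(string $url, string $lang): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // leave anything that is not an absolute URL untouched
    }
    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    // Fall back to "/" when the URL has no path component.
    $result .= '/' . $lang . ($parts['path'] ?? '/');
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

// Turns the guid URL I am seeing into the one I expected.
echo add_language_prefix('http://www2.domain.com/?page_id=42', 'fr'), "\n";
```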

comment:6 Changed 2 years ago by ofer

  • Status changed from accepted to closed
  • Resolution set to fixed

Thanks for the code and ideas. I have decided to fix this a little differently, by adding the language to the end of the guid, in [307].
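The change in [307] is not reproduced in this ticket. Purely as a sketch of the general idea, appending a language marker to the stored guid could look like the following; the function name guid_with_language, the "#" separator, and the choice to leave the default language untouched are all assumptions made for illustration, not the committed code.

```php
<?php
// Hypothetical sketch only: the separator and default-language handling are assumptions.
function guid_with_language(string $guid, string $lang, string $default_lang = 'en'): string
{
    // Keep the original guid for the main language so existing readers see no change.
    if ($lang === '' || $lang === $default_lang) {
        return $guid;
    }
    // Append the language code so each translation gets its own unique identifier.
    return $guid . '#' . $lang;
}

echo guid_with_language('http://transposh.org/?p=143', 'he'), "\n";
```

A guid built this way is unique per language but is not a working link, which is consistent with isPermaLink staying false.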
