Ticket #69 (closed defect: fixed)

Opened 2 years ago

Last modified 16 months ago

fix link tag and meta tag modification

Reported by: datapharmer
Owned by: ofer
Priority: minor
Milestone: 0.5 - Beta wordpress plugin
Component: Parser
Version: 0.3.7
Keywords: meta, link, rss, canonical, url, description, keywords, google
Cc:

Description

I have several tags that are not being modified properly by the parser.

The most noticeable issue for end users is the alternate link that adds an rss feed icon to the browser location bar.

The template code for the RSS link is as follows:

<link rel="alternate" type="application/rss+xml" title="RSS 2.0" href="<?php bloginfo('rss2_url'); ?>" />

Three possible approaches for this:

  1. Loop through these links the same way as other links, and hope we don't modify another link tag that shouldn't be touched.
  2. Modify rss2_url and hope that doesn't break anything else.
  3. Look for /feed/ and check whether the lang parameter comes before it; if not, add it.
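
For illustration only, approach 3 could look roughly like the Python sketch below. It is not the plugin's PHP code. It assumes the language is carried as a path segment (e.g. /fr/feed/) and that a missing language goes directly before /feed/; both are assumptions, and `add_lang_before_feed` is a made-up name.

```python
from urllib.parse import urlsplit, urlunsplit


def add_lang_before_feed(url: str, lang: str) -> str:
    """Insert /<lang> before the /feed/ segment unless lang already precedes it.

    Assumption: the language is a path segment, not a query parameter.
    """
    parts = urlsplit(url)
    path = parts.path
    idx = path.find("/feed/")
    if idx == -1:
        return url  # not a feed URL, leave it untouched
    prefix = path[:idx]
    if lang in prefix.split("/"):
        return url  # language segment is already somewhere before /feed/
    new_path = f"{prefix}/{lang}{path[idx:]}"
    return urlunsplit(parts._replace(path=new_path))


print(add_lang_before_feed("http://domain.com/feed/", "fr"))
```

URLs without /feed/ are returned unchanged, which avoids the risk in approach 1 of touching link tags that should be left alone.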

For metadata, the code not being updated correctly is as follows:

Most important is the canonical link. It should be unique for each page translation, or Google is unlikely to index the translated pages.

<link rel="canonical" href="http://domain.com" />
Expected behavior: add the language param to the URL.
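
A minimal sketch of the expected canonical behavior, in Python rather than the plugin's PHP. It assumes the language is passed as a query parameter named `lang`; that name and the helper `with_lang_param` are assumptions for illustration, and a permalink-style /fr/ prefix would need different handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_lang_param(url: str, lang: str, param: str = "lang") -> str:
    """Return url with the language query parameter set to lang.

    Any existing value of the parameter is replaced, so each translation
    ends up with exactly one, distinct canonical URL.
    """
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_lang_param("http://domain.com/page/?p=1&lang=en", "fr"))
```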

The next biggest issue is meta information not being translated, or having the wrong lang parameter specified.

<meta name="description" content="..." />
Expected behavior: content should be translated. This may be broken because of All in One SEO Pack.

<meta name="DC.subject" lang="en" content="..." />
Expected behavior: content should be translated and lang should be set to the language param. This may be broken because of All in One SEO Pack.

<meta name="keywords" content="..." />
Expected behavior: content should be translated. This may be broken because of All in One SEO Pack.

<meta name="DC.title" lang="en" content="..." />
Expected behavior: content is translated properly, but lang should be set to the language param.

<meta name="DC.description" lang="en" content="..." />
Expected behavior: content should be translated and lang should be set to the language param. This may be broken because of All in One SEO Pack.
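
To make the expected lang handling on the DC.* tags concrete, here is a rough Python sketch (again, not the plugin's PHP). It only rewrites an existing lang attribute on meta tags whose name starts with "DC."; translating the content attribute is out of scope. The function name `set_dc_lang` is hypothetical, and the regexes are a simplification that assumes double-quoted attributes, so a real parser would be more robust.

```python
import re

# Matches <meta ...> tags whose name attribute starts with "DC."
_DC_META = re.compile(r'<meta\b[^>]*\bname="DC\.[^"]*"[^>]*>', re.IGNORECASE)
# Matches a standalone lang="..." attribute (not hreflang, thanks to \b)
_LANG_ATTR = re.compile(r'\blang="[^"]*"')


def set_dc_lang(html: str, lang: str) -> str:
    """Set lang="<lang>" on every Dublin Core meta tag that has a lang attribute."""

    def fix_tag(match: re.Match) -> str:
        # Replace only the first lang attribute inside the matched tag.
        return _LANG_ATTR.sub(lambda _: f'lang="{lang}"', match.group(0), count=1)

    return _DC_META.sub(fix_tag, html)


print(set_dc_lang('<meta name="DC.title" lang="en" content="..." />', "fr"))
```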

Change History

comment:1 Changed 2 years ago by ofer

  • Status changed from new to accepted

Ok,

First part, the RSS thingy is fixed in [314], although the solution is quite different (and probably simpler).

Second part, rel="canonical": where? I didn't find it on my sites or yours, so I'm probably missing something.

Third part, meta keyword and description: the content part of the meta has been translated for a long time, although since it's in the hidden parts, the translation is less straightforward. I could have put a list of hidden phrases somewhere at the top of the page, but this would make little sense to most users, apart from the really advanced translators. Need to think of a better solution.

Fourth part, the whole Dublin Core stuff with the lang and language: I can put the code in, but I don't think it would do much good. I am not even sure Google and friends pay much attention to it; see the following:
http://googlewebmastercentral.blogspot.com/2009/09/google-does-not-use-keywords-meta-tag.html
http://www.google.com/support/webmasters/bin/answer.py?answer=79812&hl=en

comment:2 Changed 2 years ago by datapharmer

Hi,

Thanks for the quick fix for the RSS links in [314]. It makes sense to look for all alternates, since that should cover other types of feeds as well.
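
As a rough illustration of "look for all alternates" (commit [314] itself is not shown in this ticket, so this is not its code), a Python sketch that collects the href of every link tag whose rel includes "alternate" might look like this. The class name `AlternateLinkCollector` is made up for the example.

```python
from html.parser import HTMLParser


class AlternateLinkCollector(HTMLParser):
    """Collect href values from <link rel="alternate" ...> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also end up here via handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "alternate" in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


collector = AlternateLinkCollector()
collector.feed(
    '<link rel="alternate" type="application/rss+xml" href="http://domain.com/feed/" />'
    '<link rel="alternate" type="application/atom+xml" href="http://domain.com/feed/atom/" />'
    '<link rel="stylesheet" href="http://domain.com/style.css" />'
)
print(collector.hrefs)
```

Matching on rel rather than on the feed type is what lets one pass cover RSS, Atom, and any other alternate links, while leaving stylesheet and other link tags alone.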

canonical: Canonical URLs are an option in All in One SEO Pack (a checkbox, enabled by default). I disabled it when I realized Google was ignoring the translated pages, but if you want I can re-enable it temporarily for testing purposes; just let me know. Some people also include this in their templates, using something like the_permalink to generate the URL, if they have only one domain and a www include/exclude preference set.

meta, keyword, description: If I understand you correctly, the meta keywords and description should be translated, but they aren't being reliably translated. Description is a big one, because Google uses it directly.

As for the Dublin Core stuff, I understand fewer people and bots use this. While Google doesn't officially use it (and may ignore it totally), other bots do look for it, so I include it to give those crawlers a chance. Since it is much less common, I do understand if you don't want to bog down the code for such a small number of users, but I did think it should at least be a known issue.

comment:3 Changed 2 years ago by ofer

I will check (and probably fix) the canonical issue soon. I'd appreciate testing after it's checked in.
I agree regarding the description, and would love to hear your ideas (I can put all those missing phrases near the first translated span that is visible, hmm, the code will be ugly and the interface lacks elegance...)

I guess the rest will remain as known issues, at least until something changes.

comment:4 Changed 2 years ago by ofer

Commit [320] should handle the rel=canonical issue. I haven't tested it, but it's simple enough and should work. Please test and let me know.

comment:5 Changed 2 years ago by datapharmer

Canonical fix appears to work, but something else is causing problems (perhaps language handling?).

The first time I load the website I just get a white page. On reload it comes up, but if I clear cookies and set my browser to an alternate language, the website's default language still comes up instead of the detected language.

comment:6 Changed 16 months ago by ofer

  • Status changed from accepted to closed
  • Resolution set to fixed

I believe this is fixed; reopen if not.