In this case study we used noindex, nofollow meta tags to completely remove expired but still indexed URLs from Google's index. The Joomla website had been upgraded, and unfortunately the former Joomla developer paid no attention to SEO at all. After the migration the site lost traffic, and we were called in as Joomla SEO experts to regain the site's original rankings in Google.
Problems with Google indexing
In Google Search Console (SC) you see that your Joomla website has 6000+ indexed URLs, although the site has only about 2000 pages. You may see the following reports in Search Console:
Discovered - currently not indexed
Crawled - currently not indexed
Duplicate title tags
Or you simply notice under Pages in Search Console that Google has indexed and is serving more URLs than you expected.
In this case the sitemap had already been (re)configured correctly, so it now contains all of the relevant pages under the domain. (If you need your sitemap checked and optimized for the necessary links, contact us; we offer Joomla SEO services.)
As those Search Console reports indicate, Googlebot has crawled many URLs and has trouble with them because there are too many. In most cases these pages have almost identical content, but the same page is reachable under several URLs, which leads to these reports in SC (formerly Google Webmaster Tools).
Usually these URLs come from the Google index (previous versions of the site that were indexed earlier) or from existing (mostly internal) links on the website.
These duplicate URLs hurt your Joomla SEO, and in many cases they cause indexing and coverage issues for Googlebot. In the long run they can lead to crawl budget issues.
Let's say you have an example URL (for better understanding):
/computer-parts/keyboards.html
And you see many discovered URLs for this same page like these:
/computer-parts/keyboards.html?phocaslideshow=0&tmpl=component
/computer-parts/keyboards.html?do_pdf=1&id=111
/computer-parts/keyboards.feed.html
/computer-parts/keyboards.html
/computer-parts/keyboards/17-computer-parts.html
/68-uncategorised/computer-parts/keyboards/17-computer-parts.html
Long story short, the problem is that a single page was present in the Google index under many different URLs.
Solution
From a technical SEO perspective it is recommended to flag those duplicate-page URLs properly for Googlebot in Joomla CMS. Certain URLs could be marked with canonicals (link rel="canonical"), but the whole site may have been canonicalized poorly earlier, so in this case using noindex was an easier method for cleaning up the site for Googlebot. (We also added the proper canonical tags.)
With our free Meta data Joomla SEO extension you can add SEO signals for these URLs, and with the Rel link plugin you can inform Google right on the link with link attribute(s). (For example, you can mark these links with a nofollow attribute.)
So, sticking to the example above, you want all the link juice to go to the following URL:
/computer-parts/keyboards.html
As the other URLs are unnecessary and sometimes show a different frontend view, you can get rid of them by marking them with a nofollow and/or noindex tag. As these URLs contain both URL query parameters and URL parts, you should create one rule for the parameters and another for the parts. To do this, follow these steps:
In case of URL parameters:
Click the Add rule button. Enable the rule. At URL inspection select URL query parameters, then copy and paste the parameter names into the URL query parameter(s) field, separated by this character: | .
type|phocaslideshow|do_pdf
In the Meta name field choose the "robots" meta name, and in the Meta content field that pops up next select the noindex,nofollow content.
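To make the effect of such a rule easier to picture, here is a minimal Python sketch. It is not the extension's code: the function name matches_query_rule is hypothetical, and we assume that a rule matches when any of the listed names appears among the URL's query parameter names.

```python
from urllib.parse import urlsplit, parse_qs

# The pipe-separated rule from the example above.
RULE = "type|phocaslideshow|do_pdf"


def matches_query_rule(url, rule):
    """Return True if the URL carries any query parameter named in the rule.

    Assumption: matching is done on exact parameter names, not on values.
    """
    wanted = {name.strip() for name in rule.split("|") if name.strip()}
    # parse_qs returns a dict keyed by parameter name.
    present = set(parse_qs(urlsplit(url).query, keep_blank_values=True))
    return bool(wanted & present)


for url in (
    "/computer-parts/keyboards.html?do_pdf=1&id=111",
    "/computer-parts/keyboards.html?phocaslideshow=0&tmpl=component",
    "/computer-parts/keyboards.html",
):
    verdict = "noindex,nofollow" if matches_query_rule(url, RULE) else "indexable"
    print(url, "->", verdict)
```

Under this assumption the two parameterised URLs are flagged, while the clean URL you want to rank stays indexable.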
In case of URL parts:
You need to collect the URL parts as well, so create another rule.
Click the Add rule button. Enable the rule. At URL inspection select Part(s) of the URL, then copy and paste the URL parts into the URL part(s) field, separated by this character: | .
feed|68-uncategorised|17-computer-parts
In the Meta name field choose the "robots" meta name, and in the Meta content field that pops up next select the noindex,nofollow content.
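The URL part rule can be pictured in the same way. The Python sketch below is only an illustration: matches_part_rule is a hypothetical name, and we assume that a rule matches when any listed fragment occurs as a substring of the URL path. The extension's real matching logic may differ.

```python
from urllib.parse import urlsplit

# The pipe-separated rule from the example above.
PART_RULE = "feed|68-uncategorised|17-computer-parts"


def matches_part_rule(url, rule):
    """Return True if any fragment listed in the rule occurs in the URL path.

    Assumption: plain substring matching on the path only (query ignored).
    """
    fragments = [part.strip() for part in rule.split("|") if part.strip()]
    path = urlsplit(url).path
    return any(fragment in path for fragment in fragments)


for url in (
    "/computer-parts/keyboards.feed.html",
    "/computer-parts/keyboards/17-computer-parts.html",
    "/68-uncategorised/computer-parts/keyboards/17-computer-parts.html",
    "/computer-parts/keyboards.html",
):
    verdict = "noindex,nofollow" if matches_part_rule(url, PART_RULE) else "indexable"
    print(url, "->", verdict)
```

If the matching really is substring based, keep the fragments specific: a short fragment such as feed would also catch an unrelated path like /feedback.html.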
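Final step: test your configuration
After saving your configuration, open one of the affected URLs on your frontend and check it in the inspector view. For URLs that match a query parameter or URL part rule, the page head should now contain the robots meta tag with the noindex,nofollow content.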
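If you have many URLs to check, the manual inspection can be scripted. The following Python sketch uses only the standard library and reads the robots meta tag from a page's HTML. The names RobotsMetaParser, robots_directives and fetch_robots_directives are hypothetical helpers for this illustration, not part of any Joomla extension.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html):
    """Return the robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def fetch_robots_directives(url):
    """Download a page and return its robots directives (needs network access)."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return robots_directives(response.read().decode(charset, errors="replace"))


sample = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
print(robots_directives(sample))  # ['noindex', 'nofollow']
```

To check a live page, call fetch_robots_directives with the full address of one of your duplicate URLs. A flagged URL should return both noindex and nofollow, and the clean URL should return neither.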