Understanding Canonical URLs and Search Engines
- By Subodh Gupta
- Published 02/12/2009
First, let’s understand what “canonical” and “canonical URL” mean.
Canonical: the standard, accepted or authoritative form of something.
Canonical URL: the primary URL that you want search engines to use for a page. Search engines discourage duplicate content. When several domains or URLs point to the same content on your site, the engine has to choose one version to show, ranking signals such as inbound links can be split among the copies, and duplication that looks deliberately manipulative may be penalized.
Duplicate content is most common on ecommerce websites. An ecommerce site might allow various parameters in a page’s URL, for example to sort by lowest price or by highest rating. You may have 50 pages but 10 URLs for each page, and now search engines have to sort through 500 URLs.
This can lead to another problem: less of your site may get crawled, because search engine crawlers spend a limited amount of bandwidth on each website (based on many factors). So if the crawler is able to fetch only 50 URLs from your site in a single visit, you want them to be 50 unique pages, not 5 pages 10 times each.
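The arithmetic above can be checked with a small sketch. The crawl log here is made up for illustration: it assumes a crawler fetched 50 URLs from the hypothetical site www.xyz.com, and that a hypothetical `sort` parameter only changes the ordering of a page, not its content.

```python
from urllib.parse import urlsplit

# Hypothetical crawl log: 50 URLs fetched in one visit.
# Five product pages, each reached through ten different sort orders.
crawled = [
    f"http://www.xyz.com/products/{page}?sort={order}"
    for page in range(1, 6)
    for order in range(10)
]

def unique_pages(urls):
    """Count distinct pages, treating URLs that differ only in their query string as one page."""
    seen = set()
    for url in urls:
        parts = urlsplit(url)
        seen.add((parts.netloc, parts.path))  # ignore the query string entirely
    return len(seen)

print(len(crawled))           # number of URLs fetched
print(unique_pages(crawled))  # number of distinct pages among them
```

In this made-up log the crawler spends its whole visit on 5 distinct pages, which is the wasted effort the paragraph above describes.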
More information on canonicalization is available from Matt Cutts.
Now the latest news: Google, Yahoo and Microsoft have agreed on a major drive to clean up duplicate content. The top search engines are implementing support for a new use of the HTML <link> tag.
The <link> tag tells search engine crawlers which URL form they should treat as canonical when indexing a page and returning it in search results.
The <link> tag also puts the canonical URL at the forefront of the page’s content as the address to use for that page, regardless of any session ID, link parameter, sort parameter or parameter order in the URL that was actually requested.
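To make the idea of one canonical form per page concrete, here is a minimal sketch of how a site might compute the canonical URL it advertises. The parameter names in `IGNORED_PARAMS` are assumptions for a hypothetical site, and the function name `canonicalize` is illustrative; a real site would choose its own list of parameters that do not change the content.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: on this hypothetical site these parameters never change page content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}

def canonicalize(url):
    """Return one stable URL form: drop ignored parameters and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # parameter order should not produce a different URL
    # Lowercase the host name and drop any #fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))

a = canonicalize("http://www.xyz.com/products?sort=price&color=red&sessionid=abc123&size=m")
b = canonicalize("http://www.xyz.com/products?size=m&color=red")
print(a)
print(a == b)
```

Both requests resolve to the same string, so every variant of the page can carry the same address in its canonical link.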
Implementing the canonical URL on a website is quite easy: site owners just need to add the following link tag to the head section of each page’s HTML.
<link rel="canonical" href="http://www.xyz.com/products" />
The <link> tag has been around for a long time and is commonly used to link stylesheets, but until now it has not been used for the canonical URL. So the new aspect is the “canonical” value in the “rel” attribute. If you have taken a CSS course, you will probably find the <link> tag easy to understand.
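Since the “rel” attribute is what separates a stylesheet link from a canonical link, it may help to see how a program could tell them apart. The following is only a sketch using Python’s standard `html.parser` module; it is not how any particular search engine reads the tag, and the names `CanonicalFinder` and `find_canonical` are made up for this example.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link> whose rel attribute contains 'canonical'."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before comparing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical

page = """<html><head>
<title>Products</title>
<link rel="stylesheet" href="style.css" />
<link rel="canonical" href="http://www.xyz.com/products" />
</head><body></body></html>"""

print(find_canonical(page))
```

The stylesheet link is skipped because its “rel” value is not “canonical”, which is the only difference between the two tags.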