Google Ranking Factor – Sitemap
Indexing problems can arise both from submitting a sitemap and from not submitting one at all.
Every URL listed in the sitemap should resolve to a valid, live page.
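One way to check this is to read the sitemap and request each listed URL. The following is a minimal sketch, not a definitive tool: the function names (`sitemap_urls`, `http_status`, `invalid_urls`) are illustrative, it assumes a plain `urlset` sitemap rather than a sitemap index, and it treats anything other than a 200 response as needing attention.

```python
import xml.etree.ElementTree as ET
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values from a urlset sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable.

    urlopen follows redirects, so a redirected URL reports the status
    of its final destination.
    """
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
    except URLError:
        return None


def invalid_urls(xml_text, status_of=http_status):
    """Map each sitemap URL that does not return 200 to its status."""
    problems = {}
    for url in sitemap_urls(xml_text):
        status = status_of(url)
        if status != 200:
            problems[url] = status
    return problems
```

Passing a different `status_of` function makes it possible to try the logic without touching the network, for example against a dictionary of known statuses.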
A common problem I have encountered is a sitemap that lists URLs which do not need to be indexed, or should not be indexed. In these cases the sitemap provides the crucial link (often the only link) to unwanted pages that are not linked from anywhere else.
A practical example of this is when a web developer has built a site for a client but neglected to remove test pages or template example pages from the CMS. Such pages may not appear on the menu, or be linked to from anywhere in the front end of the website, but their URLs may end up in the sitemap and may therefore be crawled by Google.
The result is a dilution of the site’s semantic profile, which may negatively affect the site’s overall capacity to rank.
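Leftover pages of this kind can be spotted by comparing the sitemap against the links actually present on the site. The sketch below assumes you already have the HTML of each page to hand (the `pages` dictionary is a hypothetical input, as are the names `LinkCollector`, `linked_urls` and `sitemap_only_urls`), and it only looks at plain `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the <a href> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.add(absolute)


def linked_urls(pages):
    """Return every URL linked from the given {page_url: html} mapping."""
    found = set()
    for url, html in pages.items():
        collector = LinkCollector(url)
        collector.feed(html)
        found |= collector.links
    return found


def sitemap_only_urls(sitemap, pages):
    """Return sitemap URLs that no supplied page links to."""
    return sorted(set(sitemap) - linked_urls(pages))
```

Any URL this returns is reachable only through the sitemap, so it is worth checking whether it is a real page that deserves internal links or a leftover that should be removed.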
Another issue arises when a website is large and its crawl budget is insufficient for Google to discover every page. A sitemap helps resolve this by giving Google a direct link to the deep content pages. Those pages might contribute to the site’s rank, or may need to rank themselves, but they can do neither unless they are crawled.
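For illustration, a sitemap listing deep pages can be produced from a simple list of URLs. This is a minimal sketch only: `build_sitemap` is a hypothetical helper, the example URLs are made up, and it emits just the `<loc>` element for each entry.

```python
import xml.etree.ElementTree as ET


def build_sitemap(urls):
    """Return a urlset sitemap, as a string, with one <loc> per URL."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


deep_pages = [
    "https://example.com/guides/2019/archive/page-40",
    "https://example.com/products/category/sub-category/item-123",
]
sitemap_xml = build_sitemap(deep_pages)
```

Only pages that are meant to be indexed should go into the list, otherwise the sitemap reintroduces the unwanted-page problem described earlier.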