Understand SEO crawling errors

Last updated: September 30, 2021

Applies to:

All products and plans

If an SEO crawler can't index a page, it will return a crawling error. This can happen with the crawlers in HubSpot's SEO tools, as well as external crawlers like Semrush. The steps for resolving a crawling error depend on the error and where the page is hosted. 

HubSpot's SEO tools crawling a HubSpot page

You can view SEO recommendations on the Optimization tab of a page or post's performance details. If there are issues crawling the page, you may see one of the following error messages: 

  • Status 301: Moved Permanently - a 301 redirect is preventing the crawler from accessing the content. 
  • Status 302: Object moved - a 302 (temporary) redirect is preventing the crawler from accessing the content.
  • Status 403: Forbidden - the server can be reached, but access to content is denied.
  • Status 404: Not Found - the crawler is unable to find a live version of the content because it was deleted or moved.
  • Crawl of [site] blocked by robots.txt - a robots.txt file is blocking the content from being indexed. 
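
To see which of these status codes a page returns, you can request it yourself without following redirects. The following is a minimal Python sketch, not part of HubSpot's tools: the names CRAWL_STATUS_NOTES, describe_status, and fetch_status are illustrative, and the explanations are paraphrased from the list above. It assumes the server answers HEAD requests, which not every server does.

```python
import http.client
from urllib.parse import urlsplit

# Explanations paraphrased from the error list above (illustrative names).
CRAWL_STATUS_NOTES = {
    301: "Moved Permanently: a 301 redirect is preventing access to the content.",
    302: "Object moved: a 302 (temporary) redirect is preventing access to the content.",
    403: "Forbidden: the server can be reached, but access to content is denied.",
    404: "Not Found: no live version of the content was found (deleted or moved).",
}


def describe_status(code: int) -> str:
    """Return the note for a listed status code, or a generic line otherwise."""
    note = CRAWL_STATUS_NOTES.get(code)
    if note is None:
        return f"Status {code}: not one of the crawl errors listed above."
    return f"Status {code}: {note}"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a single HEAD request (redirects are NOT followed) and return the status."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, print(describe_status(fetch_status("https://www.example.com/old-page"))) would report a redirect as 301 or 302 instead of silently following it; the URL here is a placeholder.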

HubSpot's SEO tools crawling an external page

If you have attempted to crawl external pages using HubSpot's SEO tools, you may encounter one of these errors: 

  • Scan blocked by robots.txt file: if your robots.txt file excludes your external page from indexing, add our crawler's user agent, "HubSpot Crawler," as an exemption. Learn more about working with a robots.txt file.
  • Robots.txt file couldn't be retrieved: if HubSpot's crawlers can't access your site's robots.txt file, verify that the file is accessible and located in the top-level directory of your site. Learn more about working with a robots.txt file.
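
As a rough illustration of a robots.txt exemption, the sketch below uses Python's standard urllib.robotparser to test a hypothetical robots.txt file. The file contents, the /private/ path, the example URLs, and the is_allowed helper are assumptions made for the example; only the user agent name "HubSpot Crawler" comes from this article. Python's parser may also match user agent names differently from a given crawler, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /private/ is blocked for every crawler
# except the one identifying itself as "HubSpot Crawler".
ROBOTS_TXT = """\
User-agent: HubSpot Crawler
Allow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.com/private/page"
    print(is_allowed(ROBOTS_TXT, "HubSpot Crawler", page))
    print(is_allowed(ROBOTS_TXT, "OtherBot", page))
```

To test a live file instead of inline text, RobotFileParser also offers set_url() and read(), which fetch robots.txt from the top-level directory of the site you point it at.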

If HubSpot's SEO tools encounter a general crawling error, follow these steps to resolve it: 

  • Verify that the URL has been entered correctly. 
  • Verify that the page being crawled is currently live.
  • Verify that DNS can resolve the URL. Learn more about resolving DNS errors in Google's documentation.
  • Reach out to your site administrator and request that they add our crawler's user agent, "HubSpot Crawler," to the allow list as an exemption. 
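
The first three checks above can be approximated with a short script. This is a sketch using only Python's standard library; the function names and messages are illustrative, not part of HubSpot's tools. It cannot confirm the allow list step, and because urllib sends its own user agent, a server that filters by user agent may respond to it differently than it does to HubSpot's crawler.

```python
import socket
import urllib.request
from urllib.parse import urlsplit


def url_is_well_formed(url: str) -> bool:
    """Check that the URL has an http(s) scheme and a hostname."""
    try:
        parts = urlsplit(url)
    except ValueError:  # raised for some malformed URLs, e.g. bad IPv6 brackets
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


def dns_resolves(hostname: str) -> bool:
    """Check that DNS returns at least one address for the hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def page_is_live(url: str, timeout: float = 10.0) -> bool:
    """Check that the page answers with a 2xx status (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


def run_checks(url: str) -> str:
    """Run the checks in order and report the first one that fails."""
    if not url_is_well_formed(url):
        return "URL is malformed or missing http(s)://"
    if not dns_resolves(urlsplit(url).hostname):
        return "DNS could not resolve the hostname"
    if not page_is_live(url):
        return "Page did not return a successful response"
    return "URL, DNS, and page status look fine"
```

Running the checks in this order mirrors the list: a mistyped URL is ruled out before any network request is made, and a DNS failure is reported separately from a page that is unpublished or unreachable.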

An external SEO tool crawling a HubSpot page

If you have attempted to crawl your HubSpot pages using an external SEO tool such as Moz or Semrush, you may find that you are unable to crawl your pages successfully.

Common causes for this issue include: 

  • Your pages are listed in the robots.txt file, which prevents them from being indexed or crawled. 
  • A noindex meta tag in the head HTML of your pages is preventing them from being indexed or crawled. 
  • Auditing a root domain, rather than the subdomain connected to HubSpot, is causing a timeout error.
  • Links for RSS feeds and blog listing pages expire when new blog posts are published, which can generate blocked resources errors.
  • Non-essential resources, such as the scripts that load the HubSpot sprocket menu, may prompt blocked resources errors. This does not prevent the rest of the page from being crawled.
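
To check for the noindex cause above, you can scan a page's HTML for a robots meta tag. The sketch below uses Python's standard html.parser; the class and function names are illustrative. It only looks at meta tags named "robots", so it would miss crawler-specific meta tags and any X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.append(directive)


def has_noindex(html: str) -> bool:
    """Return True if a robots meta tag in the HTML contains 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(sample))
```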