Understand SEO crawling errors
Last updated: October 31, 2023
Available with any of the following subscriptions, except where noted:
All products and plans
If an SEO crawler can't index a page, it will return a crawling error. This can happen with the crawlers in HubSpot's SEO and import tools, as well as external crawlers like Semrush. The steps for resolving a crawling error depend on the error and where the page is hosted.
HubSpot's SEO tools crawling a HubSpot page
You can view SEO recommendations on the Optimization tab of a page or post's performance details. If there are issues crawling the page, you may see one of the following error messages:
- Status 301: Moved Permanently: a 301 redirect is preventing the crawler from accessing the content.
- Status 302: Object moved: a 302 (temporary) redirect is preventing the crawler from accessing the content.
- Status 403: Forbidden: the server can be reached, but access to content is denied.
- Status 404: Not Found: the crawler is unable to find a live version of the content because it was deleted or moved.
- Crawl of [site] blocked by robots.txt: a robots.txt file is blocking the content from being indexed.
HubSpot's SEO tools crawling an external page
If you're crawling external pages with HubSpot's SEO tools or importing external content into HubSpot, you may encounter one of these errors:
- Scan blocked by robots.txt file: if your external page is excluded from indexing by your robots.txt file, add the HubSpot crawler’s user agent “HubSpot Crawler” as an exemption (see the example robots.txt sketch after this list). Learn more about working with a robots.txt file.
- Robots.txt file couldn't be retrieved: if HubSpot's crawlers can't access your site's robots.txt file, verify that the robots.txt file is accessible and in the top-level directory of your site. Learn more about working with a robots.txt file.
- The crawler isn't able to scan this URL: if HubSpot's crawlers can't crawl a specific URL, try the following troubleshooting steps:
- Verify that the URL has been entered correctly.
- Verify that the page being crawled is currently live.
- Verify that DNS can resolve the URL. Learn more about resolving DNS errors in Google's documentation.
- Reach out to your site administrator and request that they add our crawler's user agent, "HubSpot Crawler," to the allow list as an exemption.
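For reference, a robots.txt exemption for HubSpot's crawler could look like the minimal sketch below. It assumes the user agent token “HubSpot Crawler” mentioned above; the `Disallow` rule and `/private/` path are placeholders for whatever rules your file already contains, and the file itself should sit at the top level of your site (for example, https://www.example.com/robots.txt).

```
# Minimal sketch: allow HubSpot's crawler while keeping existing rules for other bots.
# The file lives at the root of the site, e.g. https://www.example.com/robots.txt

User-agent: HubSpot Crawler
Allow: /

# Placeholder for an existing rule that applies to all other crawlers
User-agent: *
Disallow: /private/
```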
An external SEO tool crawling a HubSpot page
If you crawl your HubSpot pages with an external SEO tool such as Moz or Semrush, the crawl may not complete successfully.
Common causes for this issue include:
- Your pages are disallowed in your robots.txt file, which prevents them from being crawled or indexed.
- A "noindex" meta tag in the head HTML of your pages is preventing them from being indexed or crawled.
- Auditing a root domain, rather than the subdomain connected to HubSpot, is causing a timeout error.
- Links for RSS feeds and blog listing pages expire when new blog posts are published, which can generate blocked resources errors.
- Non-essential resources, such as the scripts that load the HubSpot sprocket menu, may prompt blocked resources errors. This does not prevent the rest of the page from being crawled.
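For reference, a noindex directive usually appears as a standard robots meta tag in the page's head HTML, as in the sketch below. If an external tool reports that a page can't be indexed, check whether this tag is being added to the page's head.

```
<!-- In the page's <head>: tells crawlers not to index this page -->
<meta name="robots" content="noindex">
```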