Robots.txt files are referenced by search engines to determine which parts of your website they should crawl. These can be useful to keep certain content, such as a content offer hidden behind a form, from being returned in search engine results.
Please note: Google and other search engines can't retroactively remove pages from search results after you implement the robots.txt file method. While this tells bots not to crawl a page, search engines can still index your content if, for example, there are inbound links to your page from other websites. If your page has already been indexed and you'd like it removed from search engines, you'll likely want to use the "No Index" meta tag method instead.
How robots.txt files work
Your robots.txt file tells search engines how to crawl pages hosted on your website. The two main components of your robots.txt file are:
User-agent: Defines the search engine or web bot that a rule applies to. An asterisk (*) can be used as a wildcard with User-agent to include all search engines.
Disallow: Advises a search engine not to crawl and index a file, page, or directory.
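For example, a robots.txt file combining these two components might look like the following sketch. The directory and file paths here are hypothetical placeholders, not paths from your site:

```
# Rule for all search engines and bots (wildcard user-agent):
# advise them not to crawl the /offers/ directory
User-agent: *
Disallow: /offers/

# Rule for a single bot: advise Googlebot not to crawl one page
User-agent: Googlebot
Disallow: /private-page.html
```

Each User-agent line starts a new group of rules, and each Disallow line under it applies only to the bots that group names.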
In your HubSpot account, click the settings icon in the main navigation bar.
In the left sidebar menu, navigate to Website > Pages.
Use the Modifying dropdown menu to select a domain to update.
Click the SEO & Crawlers tab.
Scroll down to the Robots.txt section and make your changes to your robots.txt file in the text field.
Please note: if you're using HubSpot's site search module on your website, a wildcard asterisk (*) in the user-agent field will also block the site search feature from crawling your site. You'll need to include HubSpotContentSearchBot as a user-agent in your robots.txt file to allow the search feature to crawl your pages.
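One way to handle this is to keep your wildcard rules while adding a separate group for HubSpotContentSearchBot. In this sketch, the blocked directory is a hypothetical example; an empty Disallow value means nothing is blocked for that bot:

```
# Wildcard rule: blocks all bots from the /offers/ directory
User-agent: *
Disallow: /offers/

# Separate group allowing HubSpot's site search bot to crawl your pages
User-agent: HubSpotContentSearchBot
Disallow:
```

Because the named group applies to HubSpotContentSearchBot instead of the wildcard group, the site search feature can still crawl your pages.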