A robots.txt file tells search engine crawlers which parts of your website they may access. It can be useful for keeping certain content, such as a content offer hidden behind a form, out of search engine results.
Please note: implementing the robots.txt file method doesn't retroactively remove pages that Google and other search engines have already indexed. While this tells bots not to crawl a page, search engines can still index your content if, for example, there are inbound links to your page from other websites. If your page has already been indexed and you'd like it removed from search engines, you'll likely want to use the "No Index" meta tag method instead.
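For reference, the "No Index" meta tag mentioned above is a snippet placed in the head section of a page's HTML. A minimal example might look like this:

```
<head>
  <!-- Tells search engines not to include this page in their index -->
  <meta name="robots" content="noindex">
</head>
```

Unlike a robots.txt rule, this tag is read when a bot crawls the page, so the page must remain crawlable for the tag to take effect.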
How robots.txt files work
Your robots.txt file tells search engines how to crawl pages hosted on your website. The two main components of your robots.txt file are:
User-agent: Defines the search engine or web bot that a rule applies to. An asterisk (*) can be used as a wildcard with User-agent to include all search engines.
Disallow: Advises a search engine not to crawl a file, page, or directory.
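Putting the two components together, a simple robots.txt file might look like the sketch below. The /offers/ directory and the specific bot name are hypothetical examples, not values from your site:

```
# Apply this rule to all bots
User-agent: *
# Ask bots not to crawl anything under the /offers/ directory
Disallow: /offers/

# Apply a separate rule to one specific crawler
User-agent: Googlebot-Image
# Ask this bot not to crawl any page on the site
Disallow: /
```

Rules are grouped under the User-agent line they follow, and an empty Disallow value (or no matching rule) means the bot may crawl everything.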