What is robots.txt?
The robots.txt file is a plain text file placed in your website's root directory that tells search engine crawlers which pages or sections they may and may not access. It follows the Robots Exclusion Protocol (standardised as RFC 9309) and is the first file crawlers request when visiting your site. A well-configured robots.txt helps you manage crawl budget, keep low-value or duplicate sections from being crawled, and point crawlers to your sitemap. Note that blocking a URL does not guarantee it stays out of search results; a blocked page can still be indexed if other sites link to it.
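For example, a minimal robots.txt might look like this (the blocked path and domain below are illustrative):

```text
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```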
Common directives
User-agent specifies which crawler the rules apply to (use * for all). Disallow blocks a path, while Allow overrides a Disallow for a more specific path. Sitemap tells crawlers where to find your XML sitemap, and Crawl-delay sets a pause between requests to reduce server load; it is honoured by some bots, such as Bingbot, but ignored by Googlebot.
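As a sketch of how these directives are interpreted, Python's standard-library urllib.robotparser can parse a rule set and answer allow/deny queries. The paths and domain below are made up. One caveat: precedence between Allow and Disallow varies by parser; Google uses longest-path precedence, while urllib.robotparser applies the first matching rule, so the Allow line is listed first here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: Allow appears before Disallow so that a
# first-match parser (like urllib.robotparser) honours the override.
rules = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Blocked by Disallow: /private/
print(rp.can_fetch("*", "https://example.com/private/secret.html"))       # False
# Permitted by the more specific Allow rule
print(rp.can_fetch("*", "https://example.com/private/public-page.html"))  # True
# No rule matches, so crawling is allowed by default
print(rp.can_fetch("*", "https://example.com/index.html"))                # True
```

This also shows why rule order and specificity matter: an Allow exception is only useful if the parser you care about actually gives it precedence over the broader Disallow.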
Important notes
Robots.txt is advisory: well-behaved crawlers respect it, but malicious bots may ignore it. Never rely on robots.txt for security; use authentication or server-level access controls instead. Upload the generated file to your domain root (e.g. https://example.com/robots.txt). This tool runs entirely in your browser.