
Robots.txt Generator


User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /

About Robots.txt Generator

The Robots.txt Generator is a visual, form-based tool that lets you build a valid robots.txt file for your website without having to memorize the syntax. You define one or more User-agent rules, specify which URL paths to Allow or Disallow for each bot, and optionally add a Crawl-delay, Sitemap URL, and Host directive. The tool assembles the correct robots.txt format in real time and lets you copy it to the clipboard with a single click.

Web developers, SEO specialists, and site administrators use robots.txt to communicate crawling instructions to search engine bots like Googlebot, Bingbot, and others. Common use cases include blocking crawlers from admin panels (/admin/), API endpoints (/api/), staging directories, or duplicate-content pages, while ensuring that important pages remain indexable. The tool includes three ready-made presets — Allow All, Block All, and Block Admin Only — to cover the most common scenarios instantly.

Technically, each rule block in a robots.txt file starts with a User-agent line followed by Disallow and Allow directives. For major crawlers, precedence is not determined by line order: Google and Bing apply the most specific (longest) matching rule, with Google breaking ties in favor of Allow, so an Allow for a more specific path can carve an exception out of a broader Disallow. The Crawl-delay directive asks bots to pause a given number of seconds between requests, which can reduce server load. The Sitemap directive points crawlers to your XML sitemap for efficient discovery of all pages.
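You can check how a generated file behaves before deploying it. As a sketch, Python's standard-library urllib.robotparser can evaluate the example file shown at the top of this page (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# The example file from the top of this page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Allow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# /admin/ and /api/ are blocked; everything else is crawlable.
print(rp.can_fetch("*", "https://example.com/admin/users"))  # False
print(rp.can_fetch("*", "https://example.com/blog/post-1"))  # True
```

This is only a convenient local check: each search engine applies its own matching rules, so the authoritative test is the crawler's own tooling (for Google, the robots.txt report in Search Console).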

Key Features

  • Visual rule builder — add multiple User-agent blocks, each with independent Allow and Disallow path lists
  • Three quick presets: Allow All, Block All, and Block Admin Only (blocks /admin/, /api/, /private/)
  • Crawl-delay field to throttle bot crawling and reduce server load
  • Sitemap URL field that appends a Sitemap directive pointing crawlers to your sitemap.xml
  • Host directive field for sites with multiple mirror domains to specify the canonical host (a non-standard directive historically honored by Yandex; most other crawlers ignore it)
  • Real-time live preview of the generated robots.txt output in a monospaced code block
  • One-click copy to clipboard for pasting directly into your site root or deployment pipeline
  • 100% client-side generation — the output is built in your browser, nothing is sent to a server
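The assembly step itself is straightforward string formatting. As an illustrative sketch only (not the tool's actual browser-side code; the function name and rule-dict shape are invented for this example), a generator over the same fields might look like:

```python
def build_robots_txt(rules, crawl_delay=None, sitemap=None, host=None):
    """Assemble robots.txt text from a list of rule blocks.

    Each rule is a dict like:
      {"user_agent": "*", "disallow": ["/admin/"], "allow": ["/"]}
    """
    lines = []
    for rule in rules:
        lines.append(f"User-agent: {rule['user_agent']}")
        for path in rule.get("disallow", []):
            lines.append(f"Disallow: {path}")
        for path in rule.get("allow", []):
            lines.append(f"Allow: {path}")
        if crawl_delay is not None:
            # A single Crawl-delay value is repeated in every block.
            lines.append(f"Crawl-delay: {crawl_delay}")
        lines.append("")  # blank line separates rule blocks
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    if host:
        lines.append(f"Host: {host}")
    return "\n".join(lines).rstrip("\n") + "\n"

print(build_robots_txt(
    [{"user_agent": "*", "disallow": ["/admin/", "/api/"], "allow": ["/"]}],
    sitemap="https://example.com/sitemap.xml",
))
```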

Frequently Asked Questions

What is robots.txt and where does it go?

robots.txt is a plain-text file placed at the root of your website (e.g., https://example.com/robots.txt) that tells web crawlers which pages or directories they are allowed or not allowed to access. It follows the Robots Exclusion Protocol. Search engines like Google check this file before crawling your site.

Does blocking a page in robots.txt remove it from Google search results?

No. Disallowing a page in robots.txt prevents Googlebot from crawling it, but if other sites link to that page, Google may still index it and show it in search results without seeing its content. To prevent indexing entirely, use a noindex meta tag (or an X-Robots-Tag HTTP header) on the page itself, and leave the page crawlable so Googlebot can actually see the tag: robots.txt only controls crawling, not indexing.

What is the difference between Disallow and Allow directives?

Disallow tells a bot not to access a given path. Allow overrides a Disallow for a more specific sub-path. For example, Disallow: /private/ combined with Allow: /private/public-page.html lets the bot access that one specific page while blocking the rest of the directory.
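That directory-with-one-exception pattern can be verified locally. One caveat when writing the file: Google picks the longest matching rule regardless of order, but simpler parsers (including Python's urllib.robotparser, used here as a stand-in) take the first rule that matches, so listing the more specific Allow before the broader Disallow keeps the file unambiguous across both styles of parser:

```python
from urllib.robotparser import RobotFileParser

# The more specific Allow is listed first so that
# first-match-wins parsers agree with longest-match parsers.
ROBOTS_TXT = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("*", "https://example.com/private/public-page.html"))  # True
print(rp.can_fetch("*", "https://example.com/private/secret.html"))       # False
```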

How do I block only Googlebot and allow all other crawlers?

Create two rule blocks. In the first block, set User-agent to Googlebot with the desired Disallow paths. In the second block, set User-agent to * (all bots) with Allow: / to permit full access. Use the "+ Add Rule" button to create the second block.
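The resulting file has two blocks, and its effect can be sketched with Python's standard-library parser (URLs and the Bingbot name are just illustrative):

```python
from urllib.robotparser import RobotFileParser

# Block 1: Googlebot is barred from the whole site.
# Block 2: every other crawler gets full access.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/page"))  # False
print(rp.can_fetch("Bingbot", "https://example.com/page"))    # True
```

Note that a bot matching a specific User-agent block ignores the * block entirely, which is why the Googlebot block must list every restriction it needs.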

What is the Crawl-delay directive and when should I use it?

Crawl-delay tells a bot to wait a specified number of seconds between successive requests. It is useful for low-traffic sites or servers with limited resources that cannot handle aggressive crawling. Note that Googlebot ignores Crawl-delay and uses its own crawl rate settings in Google Search Console instead.
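Parsers that honor the directive expose the delay per user agent; for example, Python's standard-library parser (shown here purely to illustrate the semantics) falls back to the * block for bots without their own rules:

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.crawl_delay("*"))        # 10
print(rp.crawl_delay("Bingbot"))  # 10 (no Bingbot block, falls back to *)
```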

How do I include my sitemap in robots.txt?

Enter the full URL of your sitemap in the Sitemap URL field (e.g., https://example.com/sitemap.xml). The generator will append a Sitemap: directive to the output. This helps search engines discover your sitemap without needing to know its exact URL in advance.
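The Sitemap directive is independent of any User-agent block, so crawlers pick it up wherever it appears in the file. As a quick local check (sitemap URL is a placeholder), Python's standard-library parser collects the declared sitemaps:

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.site_maps())  # ['https://example.com/sitemap.xml']
```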

Can I have different rules for different bots?

Yes. Add multiple rule blocks using the "+ Add Rule" button and set a different User-agent value for each — for example, Googlebot, Bingbot, or GPTBot. Each block can have its own Allow and Disallow paths, letting you give different access levels to different crawlers.

Is robots.txt a security mechanism to protect sensitive files?

No. robots.txt is a polite request, not a security control. Any human or malicious bot can ignore it and access the listed URLs directly. Never rely on robots.txt to protect sensitive data. Use proper authentication, server-side access control, or firewall rules to protect confidential pages.