
1. Global Settings (All Robots)

Note: Google ignores the Crawl-delay directive, but Bing and Yandex respect it.
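
For instance, a global group that asks compliant crawlers to wait five seconds between requests might look like this (the delay value is purely illustrative):

```
User-agent: *
Crawl-delay: 5
```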

2. CMS Presets (Quick Setup)

3. Custom Rules

4. Sitemap Location


About

The robots.txt file acts as the primary gatekeeper for your website's interaction with search engine crawlers. It is a simple text file placed in the root directory of your site that instructs bots (like Googlebot or Bingbot) on which pages they may access and which they should ignore. While it does not strictly enforce security, it is critical for Search Engine Optimization (SEO) and for managing server load.

Using a generator is highly recommended because syntax errors in this file can have catastrophic consequences, such as accidentally preventing search engines from indexing your entire website. This tool provides a structured interface to define User-agent directives, Disallow paths, and Allow exceptions without needing to manually type complex syntax. It includes safeguards and presets for popular Content Management Systems (CMS) to ensure standard administrative directories are protected from public crawling.
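
For illustration, a minimal file of the kind this generator produces might look like the following (the blocked paths are examples, not recommendations for any specific site):

```
User-agent: *
Disallow: /admin/
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml
```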


Formulas

The robots.txt protocol follows a simple hierarchical logic: the file is processed top to bottom, with rules grouped by User-agent. A combined example appears after the list below.

  • Step 1. User-agent Definition. Identifies the specific robot (e.g., User-agent: Googlebot) or applies to all robots using a wildcard (User-agent: *).
  • Step 2. Blocking Access (Disallow). Specifies directories or files the bot must avoid.
    Example: Disallow: /admin/ prevents access to the admin folder.
  • Step 3. Granting Access (Allow). Overrides a parent Disallow rule for a specific sub-path.
    Example: Disallow: /public/ followed by Allow: /public/images/.
  • Step 4. Sitemap Declaration. An optional but recommended directive pointing crawlers to the XML sitemap.
    Format: Sitemap: https://example.com/sitemap.xml.
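
Putting the four steps together, a sketch of a complete file (all paths and the domain are illustrative) might read:

```
# Step 1: apply the group to all robots
User-agent: *

# Step 2: block the admin area and the public folder
Disallow: /admin/
Disallow: /public/

# Step 3: re-open one sub-path inside the blocked folder
Allow: /public/images/

# Step 4: point crawlers at the XML sitemap
Sitemap: https://example.com/sitemap.xml
```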

Reference Data

User-Agent (Bot)      Owner        Primary Function
Googlebot             Google       Main crawler for the Google Search index.
Bingbot               Microsoft    Crawler for the Bing search engine.
Slurp                 Yahoo        Crawler for Yahoo Search.
DuckDuckBot           DuckDuckGo   Privacy-focused search engine crawler.
Baiduspider           Baidu        Leading Chinese search engine crawler.
YandexBot             Yandex       Leading Russian search engine crawler.
FacebookExternalHit   Meta         Crawls pages to generate previews for shared links.
Applebot              Apple        Used for Siri and Spotlight Suggestions.
AhrefsBot             Ahrefs       SEO analysis and backlink checking.
MJ12bot               Majestic     Link intelligence and SEO mapping.

Frequently Asked Questions

What happens if I make a mistake in my robots.txt file?
A syntax error or an incorrect 'Disallow: /' rule can completely de-index your website from Google and other search engines, making it invisible to organic traffic. This is why testing the file and using a generator are crucial.
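
For example, this two-line file (a deliberately broken sketch, not output this generator would produce by default) tells every compliant crawler to stay away from the entire site:

```
User-agent: *
Disallow: /
```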

Does robots.txt protect private content?
No. Robots.txt is a polite request to crawlers, not a firewall. Malicious bots will simply ignore it, and because the file is public, it effectively advertises your private directories. Use server-side password protection (e.g., via .htaccess) for real security.
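
As a minimal sketch of real access control on Apache (both file paths below are placeholders, and an .htpasswd file is assumed to already exist):

```
# .htaccess -- enforced by the server, unlike robots.txt
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /var/www/.htpasswd
Require valid-user
```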

Should I block CSS and JavaScript files?
Generally, no. Modern search engines like Google render pages the way a browser does. Blocking CSS/JS prevents them from seeing the page correctly, which can hurt your mobile-friendliness evaluation and your SEO rankings.
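
If an older rule already blocks an asset directory, one hedged fix is to re-allow the specific file types; Google honors the longest-matching rule, so the more specific Allow lines below win (the /assets/ path is an example):

```
User-agent: Googlebot
Disallow: /assets/
Allow: /assets/*.css
Allow: /assets/*.js
```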

What is the difference between Disallow and Noindex?
Disallow tells the bot 'do not crawl this URL'. Noindex (a meta tag) tells the bot 'crawl this page, but do not show it in search results'. If you Disallow a page, the bot can never see the Noindex tag inside it, so the page might still appear in search results as a bare URL.
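
To illustrate the Disallow side: the rule below stops crawling, yet the URL can still surface in results. For Noindex to work instead, the page must stay crawlable so the bot can read <meta name="robots" content="noindex"> in its HTML (the file name here is hypothetical):

```
# robots.txt -- blocks crawling, but does not guarantee de-indexing
User-agent: *
Disallow: /secret-report.html
```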