When your HubSpot site’s rankings aren’t improving—despite great content and solid backlinks—your crawl settings might be the hidden culprit. Search engines can only optimize what they’re allowed to access. If your robots.txt file isn’t tuned correctly, crawlers might waste time on test pages, ignore high-value URLs, or get lost in parameter-heavy links. That’s not just inefficient—it’s an easy way to sabotage your SEO from behind the scenes.
Navigating this file inside HubSpot often puzzles even experienced marketers. It lives deep in the settings menu, and most teams forget it exists until something breaks. Layer in multiple domains, auto-generated landing pages, or content tools that constantly churn out new variations, and managing crawl behavior gets messy fast. Without a precise robots.txt configuration, your technical SEO foundation starts to crack—no matter how strong your content.
This guide walks you through how the HubSpot robots.txt file works, where to find it, and how to tweak it safely. You’ll see real-world examples, common mistakes and how to fix them, step-by-step changes, and how to measure whether your updates are actually improving SEO visibility.
Control Search Visibility via HubSpot’s Robots.txt Settings
The robots.txt file is a gatekeeper. It lives at the root of your HubSpot domain and tells search engine crawlers which parts of your site they’re allowed to access. Inside HubSpot, you can edit this file directly—no server access or dev tickets needed. It applies to HubSpot-hosted content, such as landing pages, blog posts, and full-site pages.
You use it to set ground rules. Want to block bots from preview pages or test folders? Allow access to your core product pages? List your sitemap? The robots.txt file handles those directives.
If your team uses multiple HubSpot-hosted domains, each one has its own robots.txt file. The customization tool is available in CMS Hub and in Marketing Hub (Professional and Enterprise tiers).
To find it:
Go to Settings > Website > Pages > SEO & Crawlers > Robots.txt.
This area pulls together all related crawl and indexing tools. While you’re here, check performance tips, indexing flags, and recommendation reports to support your changes.
How It Works Under the Hood
Every time you publish updates to your robots.txt file in HubSpot, the system generates a clean version and places it at the domain root (like www.yoursite.com/robots.txt). When a bot visits that URL, it sees the latest rules.
Here’s the flow broken down:
Inputs:
- User-agent directives target specific crawlers (such as Googlebot or Bingbot).
- Allow or Disallow paths that define folders or files crawlers can or cannot access.
- You can optionally add an XML sitemap URL to help bots discover a clean map of your pages.
Output:
- A plain text instruction file that tells crawlers what’s off-limits and what’s fair game.
HubSpot handles the delivery and hosting automatically. Once you save, the changes go live within minutes—no delay, no manual deployment.
Tip: Every rule goes on its own line. Keep formatting clean, lowercase, and consistent. Avoid using session strings or long parameters like ?utm_source in directives—they can break more than they help.
You can further tailor control by:
- Creating rules for individual crawlers via multiple user-agent sections.
- Adding a Crawl-delay directive (Bing and some other crawlers honor it; Google ignores it).
- Including multiple sitemap references per domain if necessary.
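Putting those options together, a single file can mix a catch-all section, a crawler-specific section, and multiple sitemap references. Here’s an illustrative sketch (the domains, paths, and delay value are examples, not recommendations):

User-agent: *
Disallow: /preview/

User-agent: Bingbot
Crawl-delay: 10
Disallow: /search?

Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/blog-sitemap.xml

Standards-compliant bots follow only the most specific matching user-agent section, so Bingbot would use its own block here rather than the catch-all rules.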
Check everything by visiting yourdomain.com/robots.txt from a browser. Any mismatch between what you saved and what you see should be fixed before bots return to crawl.
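Before you even save, you can sanity-check a draft rule set locally. This sketch uses Python’s standard-library robots.txt parser; the paths and URLs are placeholders for your own:

```python
# Sanity-check draft robots.txt rules before publishing them in HubSpot.
# The blocked folders and test URLs below are illustrative examples.
from urllib.robotparser import RobotFileParser

draft_rules = """\
User-agent: *
Disallow: /test/
Disallow: /preview/
"""

parser = RobotFileParser()
parser.parse(draft_rules.splitlines())

# Paths you intend to block should come back False; public pages True.
checks = {
    "https://www.example.com/test/landing-a/": False,
    "https://www.example.com/preview/draft-post": False,
    "https://www.example.com/blog/launch-recap": True,
}
for url, expected in checks.items():
    allowed = parser.can_fetch("*", url)
    print(f"{url} -> {'crawlable' if allowed else 'blocked'}")
    assert allowed == expected
```

Note that `urllib.robotparser` implements the basic exclusion standard; Google-specific extensions like `$` wildcards need a dedicated tester such as the one in Search Console.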
Main Uses Inside HubSpot
Controlling crawler access to staging or test content
HubSpot frequently generates test URLs for previews or experimental pages. You don’t want these floating around search results. A simple rule like:
User-agent: *
Disallow: /test/
Disallow: /preview/
blocks search engines from crawling that content.
Let’s say you’re running a locked Q2 campaign under /test/q2promo/. Because that path sits inside the blocked /test/ folder, the rule above keeps crawlers out, ensures clean reporting, and prevents unfinished work from leaking into search results. Keep in mind that robots.txt controls crawling, not indexing: a blocked URL that’s linked from elsewhere can still show up in results without a snippet. For pages that must never appear, use a noindex meta tag and leave the page crawlable so bots can see it.
Preserving crawl budget by excluding dynamic or filtered pages
Filtered pages, like search results or user-generated views, can multiply rapidly. If bots spend all their time crawling versions of /search?query=abc, they may never get to your priority URLs.
Use targeted disallow lines like:
User-agent: *
Disallow: /search?
Disallow: /app/
For example, if you’re running a SaaS platform with dynamic knowledge base views, this keeps crawlers out of noise and focused on pages with SEO value.
Declaring your XML sitemap to support faster indexing
Adding a sitemap reference ensures search engines know where to find your structured page listings.
Sitemap: https://www.example.com/sitemap.xml
Even if you’ve submitted this separately in tools like Google Search Console, adding it to your robots.txt gives consistent reinforcement across your crawl ecosystem. It’s especially helpful if you run multiple blog instances or campaign microsites within HubSpot.
Managing different domains or subdomains consistently
Many orgs host different HubSpot domains for blogs, documentation, events, or regional content. Consistency is everything.
Let’s say www.brand.com should allow all bots, but docs.brand.com needs to restrict internal PDFs. HubSpot lets you manage robots.txt files per domain. To avoid logic gaps, document shared rules in one place—such as a team wiki or an SEO playbook.
Aligning these rules prevents search fragmentation and protects brand integrity across all digital properties.
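One lightweight way to enforce that shared documentation: a script that checks each domain’s file for a baseline set of rules. The domains, baseline rules, and inline sample files below are hypothetical; in practice you’d fetch each live file over HTTP:

```python
# Check that every domain's robots.txt contains the shared baseline rules.
# BASELINE and the sample files are illustrative placeholders.
BASELINE = {"Disallow: /preview/", "Disallow: /test/"}

robots_by_domain = {
    "www.brand.com": "User-agent: *\nDisallow: /preview/\nDisallow: /test/\n",
    "docs.brand.com": "User-agent: *\nDisallow: /preview/\nDisallow: /internal-pdfs/\n",
}

for domain, robots_text in robots_by_domain.items():
    # Compare against the stripped set of lines in each file.
    lines = {line.strip() for line in robots_text.splitlines()}
    missing = BASELINE - lines
    if missing:
        print(f"{domain} is missing: {sorted(missing)}")
    else:
        print(f"{domain} has all baseline rules")
```

Run on a schedule, a check like this catches a domain that silently drifted away from the playbook before crawlers do.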
Common Setup Errors and Wrong Assumptions
Blocking all site access unintentionally
Adding Disallow: / for User-agent: * shuts everything down. Crawlers will skip the entire domain.
Fix: Allow critical routes and only disallow folders you truly want hidden.
Assuming the setting applies outside HubSpot
Your HubSpot robots.txt rules only affect HubSpot-hosted pages. If you run e-commerce on Shopify or a blog on WordPress, those need their own files.
Fix: Audit each hosting environment separately.
Not verifying your changes after saving
HubSpot doesn’t push live changes until you click Save. And even then, it’s smart to cross-check that they’ve published.
Fix: Visit domain.com/robots.txt within minutes of each update to confirm.
Formatting errors
Robots.txt paths are case-sensitive, and trailing spaces can cause silent mismatches. A directive like “Disallow: /Private/ ” (capital P, trailing space) won’t block your lowercase /private/ URLs.
Fix: Always stick to lowercase and clean spacing. One rule per line. No extra punctuation.
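To see why case matters, run the miscapitalized directive through Python’s standard-library parser (a minimal sketch; the paths are placeholders):

```python
# Demonstrate that robots.txt path matching is case-sensitive:
# a rule written as /Private/ does not cover lowercase /private/ URLs.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: *
Disallow: /Private/
""".splitlines())

# The capitalized rule only matches capitalized paths:
print(parser.can_fetch("*", "/Private/report.pdf"))  # False: blocked
print(parser.can_fetch("*", "/private/report.pdf"))  # True: still crawlable
```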
Step-by-Step Setup or Use Guide
Before editing, check that you have the right permissions inside your HubSpot portal. Only admins can update domain-level settings.
Here’s how to update your robots.txt file safely:
- Go to Settings
After logging into HubSpot, click the gear icon in the main nav.
- Navigate to Website > Pages
Open the SEO & Crawlers tab under the Pages section.
- Locate the Robots.txt section
Each connected domain appears here. Choose the one you want to update.
- Review the current content
Do a quick audit—what’s already blocked, and what may need to change? Don’t delete anything without understanding its purpose.
- Add new directives
Use standard syntax like:
User-agent: *
Disallow: /private-resources/
- Add your sitemap
Include a sitemap entry to boost discovery timelines:
Sitemap: https://yourdomain.com/sitemap.xml
- Save the file
Click Save to publish instantly to the domain root.
- Test live output
Go to yourdomain.com/robots.txt in a browser. The file should match exactly what you entered.
- Monitor post-setup behavior
Wait a few days. Then use Google Search Console’s URL Inspector or Coverage report to confirm the right pages are being crawled—or not.
Track all changes in a team-shared doc with timestamps and editor initials. It prevents accidental overrides and builds a reference history for future audits.
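To make the live check less error-prone, you can diff what you saved against what the domain actually serves. A minimal sketch, assuming you paste in both versions (in practice the live copy would come from an HTTP fetch of yourdomain.com/robots.txt):

```python
# Compare saved robots.txt rules against the live file. normalize() strips
# trailing whitespace and blank lines so cosmetic differences don't alarm.
# The saved/live strings below are sample placeholders.

def normalize(robots_text: str) -> list[str]:
    lines = (line.rstrip() for line in robots_text.splitlines())
    return [line for line in lines if line]

saved = "User-agent: *\nDisallow: /test/\n\nSitemap: https://www.example.com/sitemap.xml\n"
live = "User-agent: *\nDisallow: /test/   \nSitemap: https://www.example.com/sitemap.xml"

if normalize(saved) == normalize(live):
    print("robots.txt matches what you saved")
else:
    print("mismatch: re-check the file in HubSpot")
```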
Measuring Results in HubSpot
Impact here is indirect but critical. The robots.txt file influences how efficiently search engines crawl and index your site. Well-scoped crawl rules lead to better indexing, cleaner traffic, and more reliable SEO performance.
Inside HubSpot, monitor progress with:
- Traffic Analytics: Check if organic sessions increase—or at least stop dropping—after cleanup.
- Page Performance: Ensure core URLs remain indexed and appear in search results.
- SEO Recommendations Tool: Detects new accessibility issues caused by over-blocking.
- Campaign Dashboards: Validate that critical landing or product pages still convert.
Here’s a quick post-change checklist:
- Core content is still visible in Google.
- Thin or duplicate URLs stop appearing in index reports.
- Crawl stats in Search Console improve (fewer errors, better coverage).
- Organic leads or conversions stay steady—or increase.
For bonus points, build a custom report in HubSpot to track traffic to the folders you just disallowed. Traffic at or near zero means the block is working.
Short Example That Ties It Together
Let’s say your HubSpot setup uses two domains: your main brand at www.brand.com and a gated knowledge base at resources.brand.com.
You want Google to index your blogs and product pages, but ignore test pages and internal how-to content. Here’s how you handle it:
Inside HubSpot, open Settings > Website > Pages > SEO & Crawlers > Robots.txt
For www.brand.com:
User-agent: *
Disallow: /test-pages/
Disallow: /beta/
Sitemap: https://www.brand.com/sitemap.xml
For resources.brand.com:
User-agent: *
Disallow: /internal-guides/
Sitemap: https://resources.brand.com/sitemap.xml
Two weeks later, Google Search Console confirms no test or internal URLs are being crawled. Your priority pages get crawled more often, and HubSpot reports show steady organic traffic without the noise from test folders.
How INSIDEA Helps
Managing technical SEO inside HubSpot requires precision. INSIDEA helps you get it right—starting with robots.txt configuration and scaling through full SEO lifecycle support.
Here’s how we support your SEO operations:
- HubSpot onboarding: Set up permissions, content domains, and crawl settings the right way from day one.
- HubSpot management: Ongoing cleanup and maintenance of SEO-critical structures.
- Automation design: Streamline page creation without introducing index bloat.
- Reporting and CRM sync: Tie SEO KPIs to business outcomes across teams.
- Technical setup: Handle robots.txt, sitemaps, canonical tags, and structured data with clarity and confidence.
If your team’s uncertain about SEO settings in HubSpot—or needs a second set of hands during a migration or campaign—INSIDEA is ready to help. Connect with a certified HubSpot expert or check out INSIDEA’s HubSpot consulting services.