robots.txt Guide + Checker

Bytespider & robots.txt

Block or Allow ByteDance/TikTok's Crawler

Bytespider is one of the most aggressive AI crawlers on the web. Here's how to control it in your robots.txt.


Free scan — no account required. Takes 30 seconds.

What is Bytespider?

Bytespider is the web crawler operated by ByteDance, the Chinese technology company behind TikTok. It collects web content for AI model training and other ByteDance products.

Among AI crawlers, Bytespider stands out for its crawl volume. Website operators frequently report it making far more requests per day than GPTBot or ClaudeBot. If you're seeing unexplained server load, Bytespider is worth checking.
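One way to check is to count requests per crawler in your access log. The sketch below is a minimal example, assuming a combined-format log where the user agent is the last quoted field. The sample log lines, IP addresses, and shortened user-agent strings are made up for illustration; real crawler user-agent strings are longer.

```python
import re
from collections import Counter

# Made-up sample lines in combined log format (user agent is the last quoted field).
SAMPLE_LOG = """\
203.0.113.5 - - [10/Mar/2025:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Bytespider)"
203.0.113.5 - - [10/Mar/2025:10:00:02 +0000] "GET /a HTTP/1.1" 200 734 "-" "Mozilla/5.0 (compatible; Bytespider)"
203.0.113.6 - - [10/Mar/2025:10:00:03 +0000] "GET /b HTTP/1.1" 200 901 "-" "Mozilla/5.0 (compatible; Bytespider)"
198.51.100.7 - - [10/Mar/2025:10:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"
192.0.2.9 - - [10/Mar/2025:10:00:05 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
"""

# Crawler names to look for inside the user-agent string.
BOT_TOKENS = ("Bytespider", "GPTBot", "ClaudeBot")

# Capture the last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_bot_requests(lines):
    """Return a Counter of requests per known crawler token."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts


counts = count_bot_requests(SAMPLE_LOG.splitlines())
for bot, total in counts.most_common():
    print(bot, total)
```

To run it against a real log, pass an open file object to count_bot_requests instead of the sample lines.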

Server Impact Warning

Bytespider's high crawl rate can noticeably increase server load on smaller sites. If you don't need ByteDance's AI products to access your content, blocking Bytespider can reduce unnecessary server resource usage.

robots.txt Syntax for Bytespider

Copy and paste whichever example fits your needs into the robots.txt file at the root of your site. Use only one Bytespider group per file.

Block Bytespider (Most Common)

# Block ByteDance/TikTok crawler
User-agent: Bytespider
Disallow: /

Tells Bytespider not to crawl any page on your site. robots.txt is advisory, so this works only when the crawler honors it.
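Before deploying a rule, you can check how a standards-following parser reads it. The sketch below uses Python's built-in urllib.robotparser on the block rule; the helper name is_allowed, the example.com URL, and the SomeOtherBot agent are placeholders for illustration. It shows how the rule parses, not how Bytespider itself behaves.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
# Block ByteDance/TikTok crawler
User-agent: Bytespider
Disallow: /
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# example.com and SomeOtherBot are placeholders.
bytespider_ok = is_allowed(RULES, "Bytespider", "https://example.com/blog/post")
other_ok = is_allowed(RULES, "SomeOtherBot", "https://example.com/blog/post")
print(bytespider_ok, other_ok)  # False True
```

The rule targets only Bytespider; crawlers with no matching group remain allowed by default.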

Allow Bytespider

# Allow ByteDance crawler
User-agent: Bytespider
Allow: /

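Crawling is allowed by default, so you only need this group when a broader rule (such as a blanket Disallow under User-agent: *) would otherwise block Bytespider and you specifically want ByteDance AI access.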
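An explicit Allow group matters mainly when a wildcard group blocks everything else. The following sketch, again using Python's urllib.robotparser, assumes a hypothetical robots.txt that blocks all crawlers and then carves out an exception for Bytespider. The example.com URL and SomeOtherBot agent are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: block everyone, then make an exception for Bytespider.
RULES = """\
User-agent: *
Disallow: /

User-agent: Bytespider
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A crawler uses the group that names it; the * group applies only to the rest.
bytespider_allowed = parser.can_fetch("Bytespider", "https://example.com/")
generic_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/")
print(bytespider_allowed, generic_allowed)  # True False
```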

Rate Limiting Alternative

# Slow down Bytespider instead of blocking
User-agent: Bytespider
Crawl-delay: 10
Allow: /

Crawl-delay asks Bytespider to wait 10 seconds between requests. The directive is a non-standard extension, and not all crawlers respect it (Googlebot, for example, ignores it), but it's worth trying if you want to allow access without the server impact.
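To see what a 10-second delay means in practice, the sketch below reads the value back with urllib.robotparser (its crawl_delay method requires Python 3.6 or later) and works out the daily ceiling for a crawler that honors it with one request at a time. Whether Bytespider actually obeys the delay is not something this code can show.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
# Slow down Bytespider instead of blocking
User-agent: Bytespider
Crawl-delay: 10
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# crawl_delay() returns None when no delay applies to the given agent.
delay_seconds = parser.crawl_delay("Bytespider")

# Upper bound for a compliant, single-connection crawler: seconds per day / delay.
max_requests_per_day = 24 * 60 * 60 // delay_seconds
print(delay_seconds, max_requests_per_day)  # 10 8640
```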

Should You Block Bytespider?

Block If...

  • You don't target Chinese markets
  • Server load is a concern
  • You see no benefit from ByteDance AI products
  • You want to limit AI training broadly

Allow If...

  • You have a Chinese/Asian market audience
  • You want maximum AI platform coverage
  • Server resources aren't a constraint

Frequently Asked Questions

What is Bytespider?

Bytespider is ByteDance's web crawler. ByteDance is the parent company of TikTok. Bytespider crawls websites to collect data for ByteDance's AI models and products. It's one of the most aggressive AI crawlers in terms of crawl volume.

Why is Bytespider so aggressive?

Bytespider is known for high crawl rates that can impact server performance. Multiple website operators have reported Bytespider making significantly more requests than other AI crawlers. If you're seeing server load from Bytespider, blocking it in robots.txt is a common response.

Does Bytespider respect robots.txt?

ByteDance states that Bytespider respects robots.txt. However, some website operators have reported that Bytespider continues crawling after being blocked. This may be a caching delay, since crawlers typically cache robots.txt and can take up to a day to pick up changes. If you block it and still see traffic after that, consider also blocking at the firewall level.
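If you cannot change firewall rules, a similar effect is possible inside the application. The sketch below is an application-layer stand-in rather than a firewall rule: a small WSGI middleware that returns 403 to any request whose User-Agent header contains "bytespider". The names block_bytespider, demo_app, and status_for are hypothetical, and matching on the header only stops requests that identify themselves honestly.

```python
def block_bytespider(app):
    """Wrap a WSGI app so requests identifying as Bytespider get a 403."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "bytespider" in user_agent.lower():
            body = b"Forbidden"
            headers = [("Content-Type", "text/plain"),
                       ("Content-Length", str(len(body)))]
            start_response("403 Forbidden", headers)
            return [body]
        # Anything else passes through to the wrapped application.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Hypothetical stand-in for your real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


def status_for(user_agent):
    """Send a fake request through the wrapped app and return the status line."""
    seen = []
    app = block_bytespider(demo_app)
    app({"HTTP_USER_AGENT": user_agent},
        lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("Mozilla/5.0 (compatible; Bytespider)"))  # 403 Forbidden
print(status_for("Mozilla/5.0 (X11; Linux x86_64)"))       # 200 OK
```

Rejecting the request at the web server or firewall is cheaper than doing it in application code, so treat this as a fallback.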

What AI products does Bytespider feed?

Bytespider collects training data for ByteDance's AI products, which have included chatbots and language models. The specific products vary by region. In some markets, ByteDance operates AI assistants that compete with ChatGPT.

Should I block Bytespider?

Many Western website operators choose to block Bytespider. Unlike GPTBot or ClaudeBot, Bytespider offers no clear benefit to most English-language sites: it doesn't power a widely used AI assistant in Western markets, and it's known for aggressive crawling.


Check Your Bytespider Configuration

See which AI crawlers can access your site. Instant report on all 14 bots including Bytespider.
