Crawl Budget

Crawl budget refers to the number of pages a search engine crawler such as Googlebot will crawl on your website within a given timeframe. For PrestaShop stores with thousands of product listings, understanding and optimising your crawl budget is essential to ensure all your important pages are discovered and indexed.

What is crawl budget?

Google defines crawl budget through two components: crawl rate limit (called the crawl capacity limit in Google's current documentation) and crawl demand. The crawl rate limit is the maximum frequency at which Googlebot can crawl your site without overloading it, and it depends on your server response speed and site health signals. Crawl demand reflects Google's interest in your pages, based mainly on their popularity and how often their content changes.

Why crawl budget matters for e-commerce

A mid-sized e-commerce site can easily generate tens of thousands of URLs: product pages, variants, category pages, filter results, pagination pages, and more. Googlebot has limited resources. If your crawl budget is wasted on low-value URLs, your important product pages risk being crawled less frequently, so new products and price changes take longer to appear in search results.

PrestaShop and faceted navigation

Filter navigation (size, colour, price) in PrestaShop can generate thousands of combination URLs such as `/shoes?colour=red&size=42`. These URLs are often duplicates or thin content that consume a significant portion of your crawl budget with little or no SEO benefit.
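To see how quickly filter combinations add up, here is a minimal Python sketch. The facet names and value counts are hypothetical, and the sketch assumes each facet is either unset or set to a single value, with parameters always in the same order.

```python
from math import prod


def count_filter_urls(facet_sizes):
    """Count the distinct filtered URLs one category page can produce.

    Assumes each facet is either unset or set to exactly one value,
    and that parameters always appear in the same order in the URL.
    """
    # Each facet offers (number of values + 1) choices: one per value, plus "not set".
    combinations = prod(n + 1 for n in facet_sizes.values())
    # Remove the single combination where no filter is applied at all.
    return combinations - 1


# Hypothetical facet sizes for a single shoe category.
facets = {"colour": 10, "size": 15, "price": 5}
print(count_filter_urls(facets))  # 11 * 16 * 6 - 1 = 1055
```

Under these assumptions a single category already yields over a thousand filter URLs. Multi-select filters or free parameter ordering would push the figure much higher.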

How Google prioritises pages to crawl

Internal PageRank

Pages with the most internal links receive more Googlebot attention. A strong internal linking structure directs crawl budget towards your priority pages.
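As a rough way to check this on your own store, the sketch below counts how many distinct pages link to each URL. It is a plain inbound-link count rather than a real PageRank computation, and the `site` dictionary is a hypothetical crawl export (page to list of internal links) used only for illustration.

```python
from collections import Counter


def inbound_link_counts(link_graph):
    """Count how many distinct pages link to each internal URL."""
    counts = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):  # count each source page once per target
            if target != source:     # ignore links from a page to itself
                counts[target] += 1
    return counts


# Hypothetical crawl export: page -> internal links found on that page.
site = {
    "/": ["/shoes", "/sale", "/shoes"],
    "/shoes": ["/shoes/red-sneaker", "/"],
    "/sale": ["/shoes/red-sneaker", "/shoes"],
    "/shoes/red-sneaker": ["/shoes"],
}

counts = inbound_link_counts(site)
for url, n in counts.most_common():
    print(n, url)
```

Priority pages that land near the bottom of this list are candidates for extra links from category pages or the home page.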

Content freshness

Frequently updated pages (new products, price changes) are revisited more often. Static and older pages receive fewer crawl visits.

Popularity & authority

URLs that attract external backlinks or high organic traffic are considered more important and are crawled with higher priority.

How to optimise your crawl budget

Block low-value URLs via robots.txt (sort parameters, deep pagination, facet filter URLs)
Fix internal links that point to 404 pages, and collapse redirect chains into a single redirect, since both waste crawl budget
Submit an up-to-date XML sitemap in Google Search Console to guide Googlebot
Use canonical tags to point to the primary version of duplicate pages
Improve server response speed to increase the allowed crawl rate limit
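Remove or noindex low-value pages (filter pages, pagination duplicates), keeping noindexed URLs crawlable, since Googlebot cannot read a noindex directive on a URL that is blocked in robots.txt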
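The robots.txt step in the list above could be sketched as follows. This is an illustration, not PrestaShop's own robots.txt generator: the parameter names (`order`, `colour`, `size`) and the sitemap URL are assumptions to be replaced with the parameters your store really uses. The `*` wildcard in paths is supported by Googlebot.

```python
def build_robots_txt(blocked_params, sitemap_url):
    """Return robots.txt text that disallows URLs carrying the given query parameters."""
    lines = ["User-agent: *"]
    for param in blocked_params:
        # Parameter as the first query argument, e.g. /shoes?order=price
        lines.append(f"Disallow: /*?{param}=")
        # Parameter after another argument, e.g. /shoes?page=2&order=price
        lines.append(f"Disallow: /*&{param}=")
    lines.append("")
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


# Hypothetical parameter names and sitemap location.
robots_txt = build_robots_txt(
    ["order", "colour", "size"],
    "https://example.com/sitemap.xml",
)
print(robots_txt)
```

Keep in mind that a URL blocked this way is never fetched, so Googlebot will not see any canonical tag or noindex directive on it. Choose one mechanism per URL pattern.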

Tip: monitor the Page indexing report

In Google Search Console, the Page indexing report (formerly Coverage) shows which pages have been indexed, which have been excluded, and why. Together with the Crawl Stats report, it is your go-to tool for diagnosing crawl budget issues.
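Search Console data can be complemented with your own server access logs. The sketch below is a minimal example, assuming logs in the common "combined" format. It counts Googlebot requests to URLs with and without a query string, to estimate how much crawling goes to parameterised URLs. The sample lines are invented, and matching on the user-agent string alone is an approximation, since that string can be spoofed.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Extracts the requested URL and the user agent from a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<url>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_by_type(log_lines):
    """Count Googlebot requests, split into parameterised and clean URLs."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and other user agents
        query = urlsplit(match.group("url")).query
        hits["parameterised" if query else "clean"] += 1
    return hits


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0"

# Invented sample lines for illustration only.
sample = [
    f'66.249.66.1 - - [10/May/2024:10:00:00 +0000] "GET /shoes?colour=red&size=42 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/May/2024:10:00:05 +0000] "GET /shoes?order=price HTTP/1.1" 200 5090 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/May/2024:10:00:09 +0000] "GET /shoes/red-sneaker HTTP/1.1" 200 7340 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [10/May/2024:10:00:12 +0000] "GET /shoes?colour=blue HTTP/1.1" 200 5100 "-" "{BROWSER}"',
]

hits = googlebot_hits_by_type(sample)
print(dict(hits))
```

If parameterised URLs account for most Googlebot requests, that is a strong hint that filter and sort URLs are absorbing crawl budget.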
