Back to Blog
    SEO Basics

    The Complete Guide to Google's site: Operator (And Every Trick SEOs Need to Know)

    Daniel Foley Carter Daniel Foley Carter
    ·
    March 18, 2026
    ·

    Daniel has 25+ years SEO experience & loves everything SEO including testing, ranking & getting clients results

    The Complete Guide to Google's site: Operator (And Every Trick SEOs Need to Know)

    Most SEOs know the site: operator exists. Far fewer use it to anything close to its full potential.

    At its simplest, site:example.com tells you what Google has indexed from a domain. But layered with other operators, filtered by subfolder, scoped to a subdomain, or combined with boolean logic - it becomes one of the most powerful diagnostic tools in technical SEO. No paid tools required.

    This guide covers everything: the basics, the advanced operators you can layer on top, real-world use cases, and the limitations you need to understand before you trust the numbers.


    The Basics: What site: Actually Does

    site:seo-stack.io


    This query returns Google's index of all pages it recognises as belonging to seo-stack.io. The number shown in the results bar ("About X results") gives you a rough indexed page count.

    A few things to understand immediately:

    • The number is an estimate. Google does not return precise index counts. The figure fluctuates between searches and is notoriously unreliable at scale. Don't build reporting on it.

    • It shows what Google has indexed, not what you've published. Pages blocked by robots.txt, noindexed, or simply not yet crawled will not appear.

    • It includes subdomains by default - unless you specify otherwise (more on this below).

    • You don't need the https:// prefix. site:seo-stack.io and site:https://www.seo-stack.io both work, but the former is cleaner.


    Scoping site: to Subfolders and Subdomains

    This is where site: starts to earn its keep in a real audit.

    Subfolder-level indexation

    site:example.com/blog/

    site:example.com/products/

    site:example.com/en-gb/


    This scopes the query to a specific section of the site. Useful for:

    • Checking how many blog posts are indexed vs. published

    • Diagnosing indexation gaps in an e-commerce category

    • Auditing international subdirectories individually

    Practical example: If site:example.com/blog/ returns 47 results but you've published 200 posts, you have an indexation problem - canonicalisation, noindex tags, crawl budget, or thin content are your likely culprits.

    Subdomain-level indexation

    site:blog.example.com

    site:app.example.com

    site:support.example.com


    This restricts results to a specific subdomain only. Useful when a site uses subdomains for different environments (blog, app, support docs, staging) and you want to audit each independently.

    Excluding subdomains

    site:example.com -site:blog.example.com


    Shows all indexed content on example.com except the blog subdomain. Good for isolating the core site when a blog or docs subdomain inflates the count.


    Layering Operators: Where It Gets Powerful

    The site: operator becomes genuinely diagnostic when combined with others. Here are the most useful combinations.


    site: + inurl:

    site:example.com inurl:category

    site:example.com inurl:?

    site:example.com inurl:page


    What it does: Filters indexed pages where the URL contains a specific string.

    Use cases:

    • site:example.com inurl:? - surfaces faceted or parameterised URLs that have been indexed. If you're seeing thousands of results here, you likely have a crawl budget or duplicate content issue.

    • site:example.com inurl:/tag/ - finds indexed tag pages on WordPress sites, which are often thin and should be noindexed.

    • site:example.com inurl:utm_ - reveals whether UTM-tagged URLs have been indexed (they shouldn't be).

    • site:example.com inurl:print - uncovers print-version pages being indexed.


    site: + intitle:

    site:example.com intitle:"buy"

    site:example.com intitle:"login"

    site:example.com intitle:"404"


    What it does: Filters results where the page title contains the specified word or phrase.

    Use cases:

    • site:example.com intitle:"404" - finds 404 error pages that Google has indexed with "404" in the title. This is a common technical SEO failure.

    • site:example.com intitle:"index of" - detects open directory listings that have been indexed. A security and SEO issue.

    • site:example.com intitle:"duplicate title" - use with your known duplicate title text to spot mass duplication.

    • site:example.com intitle:"coming soon" - catches placeholder pages that have leaked into the index.

    • site:competitor.com intitle:"best [keyword]" - competitive research into a rival's optimised content.


    site: + intext:

    site:example.com intext:"noindex"

    site:example.com intext:"lorem ipsum"

    site:example.com intext:"test page"


    What it does: Filters results where the page body contains the specified text.

    Use cases:

    • site:example.com intext:"lorem ipsum" - finds placeholder content that's been published live. Embarrassing if a client finds this before you do.

    • site:example.com intext:"add to basket" - quickly confirm e-commerce product pages are indexed.

    • site:example.com intext:"affiliate disclosure" - audit where disclosure text appears across a site.


    site: + filetype:

    site:example.com filetype:pdf

    site:example.com filetype:xml

    site:example.com filetype:doc


    What it does: Filters by file type.

    Use cases:

    • site:example.com filetype:pdf - reveals PDFs that Google has indexed. Often overlooked in audits; PDFs can rank independently and dilute link equity.

    • site:example.com filetype:xml - check if sitemap files are indexed as pages (they shouldn't be).

    • site:competitor.com filetype:pdf - competitive research. Their indexed PDFs often contain pricing, case studies, or guides you'd otherwise miss.


    site: + related:

    related:seo-stack.io


    Technically a separate operator, but pairs well conceptually. Shows sites Google considers topically related or similar. Use it for competitor discovery or to understand how Google categorises your domain.


    site: + cache:

    cache:example.com/specific-page/


    Shows Google's cached version of a specific page, including the date it was last crawled. Useful for checking:

    • Whether recent content changes have been picked up

    • What Google is actually rendering (JavaScript rendering issues become obvious here)

    • Historical snapshots of a page during a penalty investigation


    Advanced Layering: Multi-Operator Combinations

    This is where site: becomes a genuine audit instrument.

    Finding duplicate content patterns

    site:example.com intitle:"widget reviews" -inurl:"/widget-reviews/"


    Surfaces pages with a specific title that don't live at the canonical URL. Useful for catching syndicated or near-duplicate content.

    Indexation audit by content type

    site:example.com/blog/ -inurl:"/author/" -inurl:"/tag/" -inurl:"/category/" -inurl:"/page/"


    Returns blog posts while excluding archive, tag, author, and pagination pages - giving you a cleaner count of actual article content.

    Competitor content gap analysis

    site:competitor.com intitle:"guide" OR intitle:"tutorial" OR intitle:"how to"


    Quickly reveals the volume and nature of a competitor's long-form content strategy.

    Detecting staging site leakage

    site:staging.example.com OR site:dev.example.com OR site:test.example.com


    Checks whether development subdomains have leaked into Google's index. A finding here means a robots.txt misconfiguration or missing noindex headers.

    Sitelinks investigation

    site:example.com inurl:"/careers/"

    site:example.com inurl:"/legal/"


    When investigating which secondary pages are indexed and could appear as sitelinks, scoping by section is faster than crawling.


    Boolean Logic with site:

    Google supports AND, OR, and - (NOT) within search queries. Combined with site:, this opens up multi-domain and exclusion queries.

    Multi-domain comparison

    site:example.com OR site:competitor.com intitle:"product reviews"


    Returns matching indexed content from both domains simultaneously. Good for rapid competitive benchmarking.

    Excluding a section from results

    site:example.com -site:example.com/blog/


    Returns the whole site minus the blog directory. Useful when the blog vastly outnumbers commercial pages and you want to assess the core site independently.

    Checking for cross-domain content syndication issues

    site:syndicatedomain.com "original article title text"


    Checks whether your content has been scraped and indexed on another domain, which can create duplicate content risk.


    Practical SEO Use Cases, End-to-End

    1. Pre-audit indexation health check

    Before starting a full technical audit, run:

    site:example.com


    Note the count. Then run:

    site:example.com inurl:?

    site:example.com inurl:/page/

    site:example.com inurl:/tag/


    If parameterised, paginated, or taxonomic URLs are inflating the total, you know immediately where to focus.

    2. Spotting content cannibalisation

    site:example.com intitle:"keyword phrase"


    If multiple pages return with the same keyword in the title, you have cannibalisation risk. Google is being asked to choose between competing pages for the same intent.

    3. Post-migration indexation check

    After a site migration, run subfolder-level site: queries against both old and new domains at intervals. Compare the indexed page counts to confirm:

    4. Auditing indexation against your sitemap

    Export your sitemap URLs. Run a site: query against each subfolder. The gap between your sitemap count and Google's indexed count is your indexation coverage gap - and it needs explaining.

    5. Detecting thin or near-duplicate content at scale

    site:example.com intitle:"[boilerplate title string]"


    Many CMS platforms generate pages with templated titles (e.g., "Products – Brand Name"). If hundreds of these appear, you have a thin content indexation problem.


    site: for Competitive Intelligence

    Don't limit site: to your own properties. Competitive use cases include:

    • Total indexed pages: site:competitor.com - how large is their indexable footprint?

    • Content velocity: Monitor their site: count monthly. Rapid growth suggests an active content strategy.

    • Resource pages: site:competitor.com filetype:pdf - what assets are they publishing?

    • Blog depth: site:competitor.com/blog/ - how much editorial content do they have indexed?

    • Structured targeting: site:competitor.com intitle:"[your target keyword]" - what content are they optimising for terms you care about?


    Limitations You Must Understand

    The site: operator is a diagnostic tool, not a reporting instrument. Its limitations are significant:

    1. The count is unreliable. The number shown ("About X,XXX results") is a rough estimate and fluctuates. Don't use it as a KPI. Use Google Search Console's Index Coverage report for accurate indexation data.

    2. It doesn't show everything Google has indexed. Some indexed pages won't appear in site: results - particularly thin pages, heavily duplicated content, or pages Google has crawled but assigned low value to.

    3. Results are personalised. Signed-in searches may return results influenced by your search history. Use incognito mode for more neutral results.

    4. It doesn't show crawled-but-not-indexed pages. GSC distinguishes between crawled (but not indexed), discovered (but not crawled), and indexed. site: only shows the indexed tier.

    5. Not suitable for large-scale URL-level audits. Google caps results at around 1,000 pages in the SERP. For full indexation analysis at scale, use GSC, Screaming Frog, or a log file analysis tool.


    Quick Reference: site: Operator Cheatsheet

    Query

    Use Case

    site:example.com

    Total indexed pages

    site:example.com/subfolder/

    Subfolder indexation

    site:subdomain.example.com

    Subdomain indexation

    site:example.com -site:blog.example.com

    Site minus subdomain

    site:example.com inurl:?

    Parameterised URLs in index

    site:example.com inurl:/tag/

    Tag pages in index

    site:example.com intitle:"404"

    Indexed error pages

    site:example.com intitle:"lorem ipsum"

    Placeholder content live

    site:example.com filetype:pdf

    Indexed PDFs

    site:example.com intext:"lorem ipsum"

    Placeholder text in body

    site:example.com OR site:competitor.com

    Multi-domain comparison

    site:competitor.com intitle:"[keyword]"

    Competitor content audit

    site:staging.example.com

    Staging site leakage

    site:example.com -inurl:/blog/ -inurl:/tag/

    Core pages minus archive types


    Final Word

    The site: operator has been part of Google's search syntax for over two decades, yet most SEOs treat it as a one-liner. The real value comes from pairing it with inurl:, intitle:, filetype:, and boolean logic to ask precise diagnostic questions.

    Run it before an audit starts. Run it after a migration. Run it against competitors. And always interpret the results with an understanding of what the operator can - and can't - tell you.



    Daniel Foley Carter

    Daniel Foley Carter

    SEO Consultant & Founder of SEO Stack

    Daniel has 25+ years SEO experience & loves everything SEO including testing, ranking & getting clients results

    Ready to transform your SEO?

    Join thousands of SEO professionals using SEO Stack to get better results.

    Start Free 30 Day Trial

    Take SEO Stack for a FREE 30 Day Trial — Unlock the Full Power of Google Search Console Now!

    Store & Access 10+ Years of GSC Data Talk to Search Console Data Bypass All Limits AI Smart Annotations Get Better SEO Results for Clients
    Start Your Free Trial