SiteWorthIt aggregates data from multiple APIs and public datasets to estimate website traffic, revenue, authority, and technical performance. This page documents every data source, formula, and confidence level behind each number you see on the platform.
Traffic estimation is the hardest problem in third-party web analytics. No external service can observe another site's actual server logs — every number is an estimate derived from indirect signals. SiteWorthIt is deliberately transparent about this: each traffic figure carries a source and a confidence level, and the only way to see truly measured traffic is to connect your own Google Analytics.
Tier 1 — Google Analytics 4 (confidence: HIGH, measured). If you own the site and connect your GA4 property (read-only), we replace every estimate with your real, first-party numbers, clearly labelled as measured. This is the only HIGH-confidence, measured source on the platform.
Tier 2 — DataForSEO Bulk Traffic Estimation (confidence: MEDIUM, estimated). For any domain you don't own, our primary estimate comes from DataForSEO's Labs bulk_traffic_estimation endpoint, which models a domain's search traffic (organic + paid) from its ranked-keyword footprint and expected click-through rates. Important: this is search traffic only — it does not include direct, referral, social, or app traffic, so for very large brands it will read below total-visit figures from panel tools like SimilarWeb. We label this metric "Estimated Search Traffic" rather than "monthly visitors" precisely so the number is not mistaken for total traffic. When the endpoint returns no search traffic for a domain, we fall back to rank-based estimation.
Tier 3 — Rank-based fallback (confidence: LOW, inferred). When no search-traffic estimate is available, we infer a rough figure from the domain's global popularity rank — sourced from Cloudflare Radar (DNS-based, primary) with the research-grade Tranco top-1M list as a secondary source — using a power-law regression. This is the least reliable tier and is shown at LOW confidence with a visible warning, because rank is only a loose proxy for traffic:
This formula is derived from the well-established empirical observation that web traffic follows a Zipf-like power-law distribution across ranked websites. The exponent 0.85 (rather than 1.0) means traffic falls off more slowly than a pure 1/rank curve: doubling the rank cuts the estimate by roughly 45% instead of 50%. As with any power law, the absolute gap between rank 100 and rank 200 is much smaller than between rank 1 and rank 2. We calibrated the coefficient against publicly disclosed traffic figures from sites that appear both on Tranco and in DataForSEO's data.
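As a minimal sketch of the power-law shape described above, the estimate can be written as `coefficient * rank ** -0.85`. The function name and the coefficient value below are placeholders for illustration only; the calibrated coefficient is not stated on this page.

```python
def estimate_visits_from_rank(rank: int, coefficient: float, exponent: float = 0.85) -> float:
    """Power-law estimate: visits ~ coefficient * rank ** -exponent."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return coefficient * rank ** -exponent


# Placeholder coefficient (an assumption, not SiteWorthIt's calibrated value).
PLACEHOLDER_COEFFICIENT = 1_000_000_000.0

for r in (1, 2, 100, 200):
    print(r, round(estimate_visits_from_rank(r, PLACEHOLDER_COEFFICIENT)))
```

Whatever coefficient is used, the ratio between two ranks depends only on the exponent, which is why the result is best read as an order-of-magnitude indicator.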
About the rank sources. Cloudflare operates one of the world's largest DNS resolvers (1.1.1.1) and passively observes DNS query volume across billions of daily lookups; its Radar ranking reflects how frequently a domain is queried relative to others — a reasonable popularity proxy, even though one DNS query may map to many page views or none (email clients, API calls). Tranco is a research-grade composite list (Cisco Umbrella, Majestic, Cloudflare and others) that is more stable and harder to manipulate than any single ranking. Because rank is only loosely related to actual visits, any traffic figure derived this way is shown at LOW confidence — an order-of-magnitude indicator, not a precise number.
| Confidence Level | Data Source | Typical Accuracy | Site Size Applicability |
|---|---|---|---|
| HIGH · measured | Google Analytics 4 (owner-connected) | Exact (first-party) | Any site you own |
| MEDIUM · estimated | DataForSEO Bulk Traffic Estimation (search traffic) | ±20–30% of search traffic | Domains with ranked keywords |
| LOW · inferred | Cloudflare Radar / Tranco rank → power-law | ±50–70% (order-of-magnitude) | Ranked domains without search data |
| — | No data available | N/A | Very small or new sites — shown as "Insufficient data" |
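The tier order summarised in the table can be sketched as a simple fallback chain. This is an illustrative sketch, not the platform's actual code: the function, field, and parameter names are hypothetical, and `rank_model` stands in for whatever rank-to-traffic model is used.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TrafficEstimate:
    value: Optional[float]
    source: str
    confidence: str


def pick_traffic_estimate(
    ga4_sessions: Optional[float],
    search_traffic: Optional[float],
    global_rank: Optional[int],
    rank_model: Callable[[int], float],
) -> TrafficEstimate:
    # Tier 1: owner-connected GA4 data is measured, so it always wins.
    if ga4_sessions is not None:
        return TrafficEstimate(ga4_sessions, "Google Analytics 4", "HIGH")
    # Tier 2: search-traffic estimate; None or 0 counts as "no search traffic".
    if search_traffic:
        return TrafficEstimate(search_traffic, "DataForSEO search traffic", "MEDIUM")
    # Tier 3: infer a rough figure from the global rank.
    if global_rank is not None:
        return TrafficEstimate(rank_model(global_rank), "Rank-based fallback", "LOW")
    # No usable signal at all.
    return TrafficEstimate(None, "Insufficient data", "NONE")


# Usage with a placeholder rank model (the coefficient is an assumption).
example = pick_traffic_estimate(None, 0, 5000, lambda r: 1e9 * r ** -0.85)
print(example.source, example.confidence)
```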
Website revenue estimation is an inherently imprecise exercise. Actual earnings depend on factors no third party can observe: whether the site runs ads at all, which ad network it uses, its negotiated CPM rates, subscription revenue, affiliate commissions, product sales, and dozens of other monetisation methods. Our revenue estimate specifically models display advertising revenue, the most common monetisation for general-interest websites, and should be treated as a ballpark indicator, not a valuation.
The RPM (Revenue Per Mille) Framework. RPM — revenue earned per 1,000 page views — is the key variable. It varies enormously by content niche. A cooking blog might earn $2–5 RPM because the advertisers competing for that audience (kitchen appliance brands, recipe app subscriptions) bid modestly. A personal finance site writing about mortgages or insurance might earn $15–40 RPM because financial service advertisers pay some of the highest CPCs in the Google Ads ecosystem. We infer the niche RPM tier from the average cost-per-click (CPC) data returned by DataForSEO's keyword analysis for the domain's top-ranking keywords.
The multiplier range of 0.30 to 0.80 deserves explanation. A "visit" in traffic analytics typically means a session — one user arriving at the site. That session might involve one page view or ten. The ratio of page views to visits (pages-per-session) varies widely: news sites often see 1.2–1.5 pages/session; Wikipedia-style reference sites can see 2–4. We use 0.30 as a conservative lower bound (low pages/session, low ad fill rate) and 0.80 as an optimistic upper bound (higher engagement sites with good ad partnerships). The result is presented as a range — e.g., "$3,200 – $11,000/month" — rather than a single number, which would imply false precision.
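To make the range concrete, here is a hedged sketch of how a low and high monthly figure could be derived. It assumes the estimate is `visits / 1000 * RPM * multiplier`, which is an inference from the description above rather than a formula confirmed on this page; the function name and the example inputs are hypothetical.

```python
LOW_MULTIPLIER = 0.30   # conservative bound quoted in the text
HIGH_MULTIPLIER = 0.80  # optimistic bound quoted in the text


def monthly_revenue_range(monthly_visits: float, rpm_usd: float) -> tuple[float, float]:
    """Return (low, high) monthly display-ad revenue in USD.

    Assumed form: visits / 1000 * RPM * multiplier.
    """
    if monthly_visits < 0 or rpm_usd < 0:
        raise ValueError("inputs must be non-negative")
    base = monthly_visits / 1000 * rpm_usd
    return base * LOW_MULTIPLIER, base * HIGH_MULTIPLIER


# Example inputs are made up: 1,000,000 visits at a $10 RPM.
low, high = monthly_revenue_range(1_000_000, 10.0)
print(f"${low:,.0f} to ${high:,.0f}/month")
```

Returning a pair rather than a midpoint keeps the output in the same form the page describes: a range, not a single number.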
The "Domain Authority" label is one of the most misunderstood concepts in SEO. Moz created the term in 2010 and trademarked the metric as a predictor of Google ranking potential based on their proprietary backlink index. Ahrefs built an equivalent called "Domain Rating." Neither company shares their methodology in full, and neither metric is used by Google in its ranking algorithms — a fact Google has confirmed repeatedly. Despite this, DA/DR numbers are widely cited as if they were official signals.
SiteWorthIt uses a different, open-data approach. Our Domain Authority display is powered by the OpenPageRank API, which is built on link graph data extracted from the Common Crawl — a free, open repository of petabyte-scale web crawl data collected by a non-profit organisation. Common Crawl scans billions of pages and records which URLs link to which other URLs. OpenPageRank runs a PageRank-style computation on this link graph to assign each domain a score on a 0–10 decimal scale.
We multiply that raw score by 10 to present it on a 0–100 scale for intuitive reading. So a domain with an OpenPageRank score of 6.4 would display as "64" on SiteWorthIt. This is mathematically equivalent but easier to read alongside other percentage-based metrics on the same page.
The practical implication: a high score on SiteWorthIt's Domain Authority reflects that a domain receives links from many other well-linked domains, based on what Common Crawl has observed in its most recent crawl. A low score may mean the domain is genuinely low-authority, or that Common Crawl's crawler hasn't yet indexed its backlink profile fully — which is more likely for newer or smaller sites.
Raw backlink counts are a notoriously noisy metric. A single spammy website might have 50,000 pages that all link to the same target — this inflates the backlink count dramatically without adding meaningful authority. The metric that actually matters is referring domains: the count of unique root domains that link to a website at least once. This is the metric SiteWorthIt prioritises in its display.
We retrieve backlink and referring domain data from the DataForSEO Backlinks API, which maintains its own crawl-based link index updated on a rolling basis. DataForSEO reports both the total number of backlinks (all individual links pointing to the domain) and the number of unique referring domains. The distinction matters: 10,000 links from 10 domains is far less valuable, both to search engines and as a signal of genuine authority, than 500 links from 500 distinct domains.
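The difference between the two counts is easy to see in code. The sketch below is illustrative only: it takes a plain list of linking page URLs (not a DataForSEO response) and groups by hostname with a leading `www.` stripped, which is a simplification. Grouping by true root domain would need the Public Suffix List.

```python
from urllib.parse import urlsplit


def summarize_backlinks(source_urls: list[str]) -> dict[str, int]:
    """Count total backlinks and unique referring hosts."""
    hosts = set()
    for url in source_urls:
        host = urlsplit(url).hostname  # lower-cased by urlsplit, None if missing
        if host:
            hosts.add(host[4:] if host.startswith("www.") else host)
    return {"backlinks": len(source_urls), "referring_hosts": len(hosts)}


links = [
    "https://spam.example/page1",
    "https://spam.example/page2",
    "https://www.spam.example/page3",
    "https://blog.example.org/review",
]
print(summarize_backlinks(links))
```

Three of the four links come from the same host, so the backlink count is double the referring-host count.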
DataForSEO's backlink index is not as large as Ahrefs' or Majestic's — no freely accessible API has comparable scale to the major paid link databases. This means the referring domain count we display may undercount the true figure for very large sites with extensive backlink profiles. For most sites in the mid-range (hundreds to tens of thousands of referring domains), DataForSEO's coverage is sufficient to give an accurate order-of-magnitude reading.
Global rank attempts to answer the question: "Where does this website sit in the overall pecking order of internet traffic?" Like traffic estimates, rank is derived from third-party data rather than the site's own analytics. SiteWorthIt uses a three-source priority chain for this metric.
Primary: DataForSEO Global Rank. DataForSEO's panel assigns a global popularity rank derived from the same aggregated clickstream data used for traffic estimates. This rank is available for a subset of domains that have enough panel representation to be ranked reliably. When available, this is the most reliable rank signal because it's based on actual user behaviour.
Secondary: Tranco List. The Tranco list ranks the top one million domains globally and is regenerated daily. It was originally compiled from Alexa, Cisco Umbrella, Majestic, and Quantcast rankings, and now combines sources such as Cisco Umbrella, Majestic, and Cloudflare Radar. For domains in the top million that lack DataForSEO panel coverage, we use their Tranco rank position. The list is published openly and the methodology is documented in peer-reviewed research, making it one of the most transparent ranking datasets available.
Tertiary: Cloudflare Radar. For domains outside the top million but with measurable DNS query volume through Cloudflare's 1.1.1.1 resolver infrastructure, Cloudflare Radar provides a rank. This is the broadest-reaching source but also the least correlated with actual user traffic — DNS queries reflect many types of internet activity beyond web browsing — so it serves as a last resort when the other sources have no data.
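The priority chain amounts to taking the first source that has a value. A minimal sketch, with a hypothetical function name and plain optional integers standing in for the three lookups:

```python
from typing import Optional


def pick_global_rank(
    dataforseo_rank: Optional[int],
    tranco_rank: Optional[int],
    cloudflare_rank: Optional[int],
) -> Optional[tuple[str, int]]:
    """Return (source, rank) from the first source with data, else None."""
    candidates = (
        ("DataForSEO", dataforseo_rank),
        ("Tranco", tranco_rank),
        ("Cloudflare Radar", cloudflare_rank),
    )
    for source, rank in candidates:
        if rank is not None:
            return source, rank
    return None


print(pick_global_rank(None, 48213, 51000))
```

Returning the source alongside the rank lets the display show where the number came from.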
The indexed page count answers a fundamental SEO question: how many of a website's pages has Google chosen to include in its search index? A page not in Google's index cannot rank for any query, so the indexed count is a rough indicator of a site's total search-visible footprint.
SiteWorthIt retrieves this number by running a site:domain.com query through the Serper.dev Google SERP API, which is a real-time interface to Google's search results. Google displays an estimated total result count when a site: query is submitted — for example, "About 14,200 results" — and we parse and display that number.
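Parsing that count can be sketched with a small regular expression. This assumes the text has the English shape quoted above ("About 14,200 results"); the exact field Serper.dev returns is not documented here, and other locales or number formats are not handled.

```python
import re
from typing import Optional

# One or more digits with optional thousands separators, followed by "result(s)".
_RESULT_COUNT = re.compile(r"(\d[\d,]*)\s+results?", re.IGNORECASE)


def parse_result_count(text: str) -> Optional[int]:
    """Extract the integer from strings like 'About 14,200 results'."""
    match = _RESULT_COUNT.search(text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


print(parse_result_count("About 14,200 results"))
```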
site: result counts are estimates and can vary by ±30% or more depending on how the query is processed. For large sites with millions of pages, the count can fluctuate significantly between queries. It is also possible for Google to index more pages than the site: operator reveals, as not all indexed content surfaces through this query type. Treat the displayed count as an indicator of scale rather than a precise inventory.
Additionally, a high indexed page count is not always desirable. Sites with large amounts of thin content, duplicate pages, or low-quality automatically generated pages may have a high index count but poor search performance. Google's systems will often surface only a fraction of technically-indexed pages for any given query, prioritising the highest-quality content. This is why SEO practitioners often focus on index quality over index quantity.
Unlike the traffic and revenue metrics — which are estimates based on third-party panel data — PageSpeed scores are direct measurements obtained by actually loading the website. SiteWorthIt submits each domain to the Google PageSpeed Insights API v5, which triggers a real Lighthouse analysis. Google's Lighthouse engine loads the page in a controlled, simulated environment and measures a standardised set of performance and quality signals.
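The PageSpeed Insights API returns four composite scores, each on a 0–100 scale: Performance, Accessibility, Best Practices, and SEO.
Within the Performance score, Lighthouse measures a set of lab metrics. Two of them, LCP and CLS, are Core Web Vitals, Google's official user experience metrics that feed into search ranking; FCP and TBT are supporting lab metrics: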
| Metric | What It Measures | Good | Needs Work | Poor |
|---|---|---|---|---|
| FCP First Contentful Paint | Time until first content appears on screen | < 1.8s | 1.8–3s | > 3s |
| LCP Largest Contentful Paint | Time until the main content element loads | < 2.5s | 2.5–4s | > 4s |
| TBT Total Blocking Time | Total time main thread was blocked by JavaScript | < 200ms | 200–600ms | > 600ms |
| CLS Cumulative Layout Shift | How much the page layout jumps during load | < 0.1 | 0.1–0.25 | > 0.25 |
Score thresholds follow Google's official classification: 0–49 is Poor (red), 50–89 is Needs Improvement (orange), and 90–100 is Good (green). These thresholds apply to all four composite category scores.
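The three bands map directly onto a small classifier. This is a sketch with a hypothetical function name; the boundaries are the ones quoted above.

```python
def classify_score(score: int) -> str:
    """Map a 0-100 Lighthouse category score to its band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Good"
    if score >= 50:
        return "Needs Improvement"
    return "Poor"


for s in (49, 50, 89, 90):
    print(s, classify_score(s))
```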
The API is free to use up to 25,000 requests per day per API key. Because the analysis involves actually loading the target website, it takes 10–30 seconds to complete per domain. Scores reflect the mobile experience by default, since the mobile version of a page is what Google's mobile-first indexing evaluates.
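For readers who want to reproduce a lookup, the sketch below builds a request URL for the PageSpeed Insights v5 `runPagespeed` endpoint and converts the category scores in a response (reported on a 0 to 1 scale under `lighthouseResult.categories`) to 0–100. The helper names and the sample response are illustrative, and no network call is made here.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
CATEGORIES = ("PERFORMANCE", "ACCESSIBILITY", "BEST_PRACTICES", "SEO")


def build_psi_url(page_url: str, api_key: str, strategy: str = "mobile") -> str:
    """Build a runPagespeed URL requesting all four categories."""
    params = [("url", page_url), ("strategy", strategy), ("key", api_key)]
    params += [("category", name) for name in CATEGORIES]
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def extract_scores(response: dict) -> dict[str, int]:
    """Convert 0-1 category scores from a parsed JSON response to 0-100."""
    categories = response.get("lighthouseResult", {}).get("categories", {})
    return {
        name: round(data["score"] * 100)
        for name, data in categories.items()
        if data.get("score") is not None
    }


# Hand-written sample shaped like a response, for illustration only.
sample = {"lighthouseResult": {"categories": {
    "performance": {"score": 0.87},
    "seo": {"score": 1},
}}}
print(build_psi_url("https://example.com", "YOUR_API_KEY"))
print(extract_scores(sample))
```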
Domain age is sourced from the official registrar record for the domain. The registration date is a matter of public record maintained by domain registrars and made accessible through ICANN-mandated protocols.
SiteWorthIt uses RDAP (Registration Data Access Protocol) — the modern, structured replacement for the legacy WHOIS system. RDAP was standardised by the IETF (RFC 7482, 7483, 7484) and became mandatory for all ICANN-accredited registrars in 2019. Unlike WHOIS, which returns freeform text that requires fragile string parsing, RDAP returns structured JSON responses with standardised field names. This makes it dramatically more reliable: the registration date is always in the same machine-readable field regardless of which registrar manages the domain.
The specific value we read is the eventDate of the entry in the RDAP response's events array whose eventAction is registration. This date represents when the domain was first registered, which we use to calculate the domain's age in years and months. A domain may have changed ownership since its original registration: a domain acquired from a previous owner retains its original registration date, so "domain age" in this context means "time since first registration," not "time under current ownership."
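In the RDAP JSON format (RFC 9083), dates are carried in an `events` array, and the registration date is the `eventDate` of the entry whose `eventAction` is `registration`. The sketch below reads that entry and computes age in whole years and months. The function names and the sample response are hypothetical, and real responses may use timestamp variants this simple parser does not accept.

```python
from datetime import datetime, timezone
from typing import Optional


def registration_date(rdap: dict) -> Optional[datetime]:
    """Return the registration event date from a parsed RDAP response."""
    for event in rdap.get("events", []):
        if event.get("eventAction") == "registration":
            # Normalise a trailing 'Z' so older Python versions can parse it.
            return datetime.fromisoformat(event["eventDate"].replace("Z", "+00:00"))
    return None


def age_years_months(registered: datetime, today: datetime) -> tuple[int, int]:
    """Whole years and months elapsed between two dates."""
    months = (today.year - registered.year) * 12 + (today.month - registered.month)
    if today.day < registered.day:
        months -= 1  # the current month has not completed yet
    return divmod(max(months, 0), 12)


sample = {"events": [
    {"eventAction": "last changed", "eventDate": "2020-01-01T00:00:00Z"},
    {"eventAction": "registration", "eventDate": "1997-09-15T04:00:00Z"},
]}
registered = registration_date(sample)
print(age_years_months(registered, datetime(2024, 3, 10, tzinfo=timezone.utc)))
```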
RDAP has advantages over WHOIS beyond structured data. WHOIS servers are rate-limited aggressively, have inconsistent availability, and many registrars have restricted WHOIS data in response to GDPR requirements. RDAP is GDPR-compliant by design and has more consistent uptime. For the minority of domains where RDAP does not return a creation date (some country-code TLDs maintain their own protocols), we fall back to a WHOIS query as a last resort.
Website valuation is the most speculative metric on the platform. The actual sale price of any website depends on a complex negotiation involving traffic trends, revenue consistency, traffic source diversification, owner dependency, niche competition, intellectual property, technical debt, team requirements, and buyer-specific strategic value. None of these factors are available to a third-party analytics tool.
What we can do is apply the industry-standard "income multiple" methodology used by digital asset brokers like Empire Flippers, Flippa, and Motion Invest. In this framework, a content website's value is typically expressed as a multiple of its annual net profit (or gross revenue, for advertising-dependent sites). Based on historical brokered sales data from the digital assets market, content and information websites typically trade at between 2.5× (low end) and 4× (high end) annual revenue.
The 2.5× lower bound reflects sites with lower-quality traffic, heavy dependence on a single traffic source (e.g., 90%+ organic SEO with algorithm risk), or thin monetisation. The 4× upper bound reflects more stable sites with diversified traffic, consistent revenue history, and lower operational complexity. SaaS businesses and membership sites can trade at 5–10× or higher, but we use content-site multiples as our baseline since that's the most common site type in our dataset.
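Applied to a single annual revenue figure, the multiples give a range like the one sketched below. The function name and the example revenue are hypothetical, and how the platform combines its low and high revenue estimates with these multiples is not specified here.

```python
LOW_MULTIPLE = 2.5   # lower bound quoted in the text
HIGH_MULTIPLE = 4.0  # upper bound quoted in the text


def valuation_range(annual_revenue_usd: float) -> tuple[float, float]:
    """Return (low, high) valuation as multiples of annual revenue."""
    if annual_revenue_usd < 0:
        raise ValueError("annual revenue must be non-negative")
    return annual_revenue_usd * LOW_MULTIPLE, annual_revenue_usd * HIGH_MULTIPLE


# Made-up example: a site estimated at $40,000 per year.
low, high = valuation_range(40_000)
print(f"${low:,.0f} to ${high:,.0f}")
```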
Even under ideal conditions, treat our valuation range as a rough frame of reference — useful for understanding scale (is this a $10K site or a $1M site?) but not as a figure you would take to a transaction negotiation without independent due diligence.
Every time you look up a domain on SiteWorthIt, the platform checks multiple caching layers before making any external API calls. This is essential for performance (API calls take seconds), cost control (many of our data sources bill per request), and reliability (if an upstream API is temporarily unavailable, cached data keeps the service functional).
Layer 1 — Redis in-memory cache (TTL: 24 hours). The first lookup result for any domain is stored in Redis with a 24-hour time-to-live. Redis operates entirely in memory, so cache hits return in milliseconds. If you look up the same domain twice within 24 hours, the second request will return the cached result without touching any external APIs. This means the data you see reflects the state of the web at the time of the first lookup in that 24-hour window, not the current moment.
Layer 2 — PostgreSQL persistent database (staleness threshold: 30 days). When Redis has no cached entry for a domain, the platform checks the PostgreSQL database for a previously stored result. If a database record exists and is less than 30 days old, we serve that data and simultaneously refresh the Redis cache with it. If the database record is older than 30 days, or doesn't exist, the platform makes fresh API calls to all configured upstream sources, processes the results, stores them in both PostgreSQL and Redis, and returns the fresh data.
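The two layers described above can be sketched as follows. Plain dictionaries stand in for Redis and PostgreSQL, and the class, method, and parameter names are hypothetical; only the 24-hour TTL and 30-day staleness threshold come from the text.

```python
import time
from typing import Callable

REDIS_TTL_SECONDS = 24 * 3600
DB_MAX_AGE_SECONDS = 30 * 24 * 3600


class TwoTierCache:
    def __init__(self, fetch_fresh: Callable[[str], dict], clock: Callable[[], float] = time.time):
        self.redis: dict[str, tuple[float, dict]] = {}  # domain -> (expires_at, data)
        self.db: dict[str, tuple[float, dict]] = {}     # domain -> (stored_at, data)
        self.fetch_fresh = fetch_fresh
        self.clock = clock

    def lookup(self, domain: str) -> tuple[dict, str]:
        """Return (data, layer), where layer is 'redis', 'postgres', or 'fresh'."""
        now = self.clock()
        # Layer 1: in-memory entry that has not yet expired.
        hit = self.redis.get(domain)
        if hit is not None and hit[0] > now:
            return hit[1], "redis"
        # Layer 2: stored record younger than 30 days; re-warm layer 1 with it.
        row = self.db.get(domain)
        if row is not None and now - row[0] < DB_MAX_AGE_SECONDS:
            self.redis[domain] = (now + REDIS_TTL_SECONDS, row[1])
            return row[1], "postgres"
        # Miss or stale: call upstream, then write to both layers.
        data = self.fetch_fresh(domain)
        self.db[domain] = (now, data)
        self.redis[domain] = (now + REDIS_TTL_SECONDS, data)
        return data, "fresh"
```

Injecting the clock keeps the sketch easy to test; a real deployment would rely on Redis key expiry and a timestamp column instead.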
The practical implication: if a website underwent a major redesign yesterday and you look it up today, you may see data reflecting the pre-redesign state if another user looked it up recently. PageSpeed scores, in particular, can change significantly after a redesign. For fresh data on a specific domain, you can bypass the cache by appending ?refresh=true to the analysis URL — this forces a new API fetch regardless of cache state, subject to rate limits.
Different metrics have different inherent refresh rates from their upstream sources. Traffic data in DataForSEO's panel is updated monthly. The Tranco list is regenerated daily. Cloudflare Radar ranks are updated daily. OpenPageRank refreshes several times per year. RDAP registration data is real-time. Our 30-day database threshold is calibrated to balance freshness against API cost: monthly refreshes align with the natural update cadence of most of our upstream data providers.
We built SiteWorthIt because we wanted a free, accessible way to get a ballpark sense of a website's scale — not a replacement for first-party analytics. If you have Google Analytics or another analytics platform installed on your own site, that data will always be more accurate than anything we can provide. We are estimating from the outside looking in.
Traffic estimates carry significant uncertainty. Even DataForSEO's data, our most accurate third-party source, has a margin of error of roughly ±20–30% for sites in the 50K–1M monthly visit range. For sites below 50K visits/month, coverage becomes thin and errors can be ±40–60% or more. Rank-based estimates (Tranco/Cloudflare) can be off by a factor of two or more in either direction: the actual traffic might be double or half what we show. We display confidence badges (HIGH/MEDIUM/LOW) precisely to communicate this uncertainty.
Small sites are harder to estimate than large ones. A site with 1 million monthly visitors is in almost every panel dataset and ranking list. A site with 8,000 monthly visitors might appear in none of them. This inverse relationship — where larger sites are easier to estimate, and small sites that might benefit most from traffic context are hardest to measure — is a fundamental limitation of panel-based analytics. We make no apologies for showing "insufficient data" for sites below the threshold; an honest "we don't know" is more useful than a fabricated number.
Revenue and valuation figures should not be used for financial decisions. The revenue estimate models display advertising and applies industry-average RPMs based on keyword CPC data. A site might use no advertising at all, or might monetise through premium subscriptions at ten times the implied RPM, or might have advertising rates negotiated directly with brands at prices very different from programmatic averages. Valuation multiples are drawn from publicly discussed transaction data but vary enormously in practice. These numbers answer "what ballpark are we in?" not "what should I pay for this site?"
Domain Authority comparisons across tools will differ. If you check a site on SiteWorthIt and then check it on Moz or Ahrefs, you will get different scores. All three tools are measuring link-based authority, but with different crawl data, different graph algorithms, and different scoring curves. Our score is not wrong and theirs are not wrong — they are different estimates from different datasets. The most useful thing you can do with any authority score is track it over time on a single platform, not compare the raw number across platforms.
PageSpeed scores change frequently and vary between requests. Google's infrastructure routes PageSpeed Insights requests to different geographic locations and Lighthouse versions may update between our cached fetch and your current visit. It's common to see a 3–5 point variance on repeated tests. Our 24-hour cache means the score you see is from the most recent test within that window. If a site is actively being worked on (JavaScript optimisation, image compression, server upgrades), the score could change substantially in a short period.
Every tool on SiteWorthIt is free, no sign-up required, and built on the data sources documented above.
Enter any domain to get traffic estimates, revenue ranges, global rank, domain authority, and backlink counts — all powered by the methodology described above.
Get a direct Lighthouse analysis via Google PageSpeed Insights API. Performance, SEO, Accessibility, and Best Practices scores for any URL.
See a domain's OpenPageRank-based authority score alongside trust signals, content quality indicators, and technical health checks.
Find where any website sits in global and country-level rankings, sourced from DataForSEO, Tranco, and Cloudflare Radar.
Subscriber counts, monthly view estimates, and revenue calculations for any YouTube channel — using the same RPM methodology described here.
Common questions about data accuracy, how to interpret results, and what the platform can and cannot tell you about a website.