How To Measure Entity Authority When The Old Metrics Can’t

Your domain rating tells you about your links. It tells you almost nothing about whether the web knows your name. A field guide to measuring the authority that actually decides who AI recommends — noise, caveats and all.

THE SHORT VERSION

Domain rating measures your link graph. Entity authority (whether search engines and AI models recognise you, place you in the right category, and name you when asked) is a different object, and your existing metrics do not touch it.

There is no single number for it, and anyone selling you one is selling a vanity metric. Entity authority is diffuse by nature; you measure it with a small panel of proxies, not a score out of 100.

The four things worth measuring: Recognition (does the web know you exist as an entity), Association (are you tied to your category), Corroboration (how widely and credibly are you mentioned), and Recall (do AI models name you when asked).

The biggest trap is noise. AI answers are non-deterministic: the same prompt rarely returns the same brand list twice. So you measure frequency across many runs, never position in a single answer, and you track the trend, not the snapshot.

Use the Entity Authority Scorecard (below) to baseline all four dimensions, benchmark against competitors, and re-run it quarterly. It is the instrument that tells you whether the rest of this cluster’s work is actually landing.

This cluster has spent several articles arguing that authority is fragmenting — that the backlink is sharing its job with the citation, the mention and the entity. Co-citation builds your place in a category; unlinked mentions feed the systems that decide who gets named. All of which raises an awkward, unavoidable question that the earlier pieces deferred and this one has to answer: how do you actually measure any of it?

It is an uncomfortable question because the honest first answer is “not with the dashboard you currently open.” Domain rating, referring domains, total backlinks — the metrics that have governed link building for a decade — measure your link graph. Entity authority is a different object entirely: it is whether search engines and language models recognise you as a real thing, associate you with the right topics, and reach for your name when someone asks the category’s questions. None of the classic numbers capture that, which is why a brand can have a healthy domain rating and be completely invisible in the surfaces that increasingly decide who gets chosen.

So this article is a field guide to measuring the thing your existing tools miss. It will not give you a single tidy number, because no honest framework can. What it will give you is a structured, four-part way to baseline entity authority, a clear-eyed account of the noise that wrecks naive measurement, and a quarterly instrument you can actually run. Think of it as the meter for everything the rest of this cluster builds — the way you find out whether the co-citation and mention work is landing, rather than merely hoping it is. It assumes you already understand what link building is; this is about measuring what link building, in 2026, is increasingly for.

Why Your Existing Metrics Don’t Measure This

Start by being precise about what the familiar metrics actually do, because the problem is not that they are wrong — it is that they answer a different question than the one you now need answered.

Domain rating and its cousins are models of your backlink profile. They estimate, on a relative scale, how strong your link graph looks compared with other sites. That is a genuine and useful thing to know, and for classic organic ranking it remains relevant. But notice what it does not include: whether Google holds a clean entity for your brand in its Knowledge Graph; whether models describe you as belonging to your category; whether anyone is searching for you by name; whether ChatGPT reaches for you when a buyer asks for options. A site can score well on all the link metrics and fail every one of those tests.

The disconnect is not theoretical, and the evidence is mounting. Analyses of AI visibility repeatedly find that the signals predicting whether a brand gets recommended are not the on-page or pure link metrics SEOs are used to. One study of visibility in a major answer engine found the strongest correlations were with referring-domain counts and community presence on places like Reddit — broad corroboration signals — while content optimisation done purely for AI showed essentially no correlation with discovery. The blunt conclusion that keeps recurring is that AI visibility is not a content problem; it is an authority problem, and specifically an entity-authority problem that the standard metrics were never built to see.

There is a deeper reason the old numbers fall short. They measure something you possess — links pointing at your domain. Entity authority measures something the systems believe about you — a representation that lives in their models, assembled from everything they have read. You cannot read it off your own backlink profile because it does not live there. It lives in the Knowledge Graph, in training data, in the retrieved corpus, in the collective pattern of how the web discusses you. Measuring it means probing those systems from the outside, which is a fundamentally different and messier exercise than counting your own links.

This is why so many teams feel a vague unease that their reporting has stopped describing reality, without being able to name the cause. The dashboards still fill with green — rankings holding, referring domains climbing — while something they cannot point to feels off, usually a competitor showing up in places they themselves never appear. The unease is accurate. It is the gap between what the old metrics measure and what now decides visibility, registering as a feeling before it is captured as a number. The entire purpose of an entity-authority scorecard is to convert that uneasy hunch into something legible, so it can be acted on rather than merely felt. A measured gap is a problem you can work; an unmeasured one is just a slow, unexplained decline.

The Four Things Entity Authority Actually Is

You cannot measure a thing you have not defined, and “entity authority” is loose enough to mean almost anything. So before any measurement, here is the object broken into four distinct components — each of which can be probed, and each of which the rest of this cluster’s tactics feed.

Recognition is the foundation: does the web know you exist as a distinct entity at all? This is about whether Google holds a clean, unambiguous node for your brand — reflected in a Knowledge Panel, an entity the systems can resolve without confusing you for something else — and whether your name is used consistently enough across the web for those signals to consolidate rather than fragment. Without recognition, every other signal scatters, because the systems do not know which thing to attach them to.

Association is placement: once you are recognised, are you tied to the right topics and the right category? A brand can be perfectly recognised and still sit in the wrong neighbourhood of the entity graph — known, but not known as a player in the space it competes in. This is the dimension co-citation builds, and it is what determines whether you are even a candidate when the category’s questions are asked.

Corroboration is breadth and credibility: how widely, how authoritatively, and how positively is your entity discussed across independent sources? This is the dimension unlinked mentions and earned media build. It is the difference between a brand the systems have seen mentioned once and one they have seen described as credible by many voices — which is, repeatedly, the strongest predictor of whether you get recommended.

Recall is the payoff: when someone actually asks an AI model the questions your buyers ask, does it name you? This is the most direct read on entity authority because it is the systems revealing, in their own output, whether your entity has accumulated enough recognition, association and corroboration to be reached for. The first three dimensions are inputs; recall is the result. Measuring all four tells you not just whether you are visible, but — when you are not — which input is missing.

The Deliverable: The Entity Authority Scorecard

Here is the instrument the rest of the article supports. The Entity Authority Scorecard measures the four dimensions above with concrete proxies, scored against your closest competitors and re-run on a fixed cadence. It deliberately resists collapsing into a single number — a composite score would hide exactly the diagnostic detail that makes it useful — and instead gives you a four-part read on where your entity is strong and where it is starving.

Dimension	Proxies you can actually measure	The question it answers
RECOGNITION	Knowledge Panel present? Entity resolvable? Name used consistently (schema, sameAs)? Branded search volume and trend in Search Console.	Does the web know I exist as a distinct, unambiguous entity?
ASSOCIATION	Do models describe you in the right category? Co-citation density beside category peers. Topical mention share.	Am I tied to the topics and the competitive set I actually play in?
CORROBORATION	Mention volume and velocity. Diversity and authority of sources. Sentiment. Earned-media share.	How widely and how credibly do independent voices describe me?
RECALL	Entity-based AI share of voice (named as a recommendation) across engines, measured as frequency over many runs. Citation-based share as an influence proxy.	When buyers ask AI the category’s questions, does it name me — and how often versus rivals?

Three rules make the scorecard trustworthy rather than theatrical. Measure each dimension the same way every quarter, so the trend is real and not an artefact of changed method. Always score your competitors alongside yourself, because every one of these proxies is meaningful only in relation to the others in your category. And read the four dimensions separately: a brand failing on Recall but strong on Corroboration has a different problem — and a different fix — than one failing on Recognition. The scorecard’s job is not to produce a grade; it is to point at the starving dimension.
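To make the "read the four dimensions separately" rule concrete, here is a minimal sketch of how a scorecard row might be held and compared. The 0-to-1 scale for each proxy reading, the `Scorecard` class and both helper functions are assumptions made for illustration; the article does not prescribe a scale, and you would substitute whatever normalisation you apply consistently each quarter.

```python
from dataclasses import dataclass

DIMENSIONS = ("recognition", "association", "corroboration", "recall")


@dataclass
class Scorecard:
    """One brand's four readings for one quarter (hypothetical 0-1 scale)."""
    brand: str
    quarter: str
    recognition: float
    association: float
    corroboration: float
    recall: float


def gaps_to_field(own: Scorecard, rivals: list[Scorecard]) -> dict[str, float]:
    """Per-dimension gap between your reading and the best rival's reading."""
    gaps = {}
    for dim in DIMENSIONS:
        best_rival = max(getattr(r, dim) for r in rivals)
        gaps[dim] = round(getattr(own, dim) - best_rival, 3)
    return gaps


def starving_dimension(own: Scorecard, rivals: list[Scorecard]) -> str:
    """Name the dimension where you trail the field by the most."""
    gaps = gaps_to_field(own, rivals)
    return min(gaps, key=gaps.get)
```

Note that nothing here averages the four readings into one figure. The output is a gap per dimension and a pointer to the weakest one, which is the diagnostic the scorecard exists to provide.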

A word on cadence and ownership, because a scorecard nobody runs is worse than none at all. Quarterly is the right rhythm for most brands: frequent enough to catch real movement, infrequent enough that the noise averages out and the work between measurements has time to land. Assign one owner who runs the same library the same way each quarter, rather than rotating the task, because consistency of method is most of what makes the trend believable. And keep the raw runs, not just the summary — when a number moves, you will want to look back at the actual answers the models gave to understand why, and a summary score throws away exactly the evidence that explains the change.
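One low-effort way to keep the raw runs is an append-only log with one JSON record per answer. The sketch below assumes a JSON Lines layout and hypothetical field names (`quarter`, `model`, `prompt`, `answer_text`, `brands_named`); any format works as long as the full answer text survives alongside the summary.

```python
import json


def run_record(quarter: str, model: str, prompt: str,
               answer_text: str, brands_named: list[str]) -> dict:
    """Bundle one model answer with enough context to revisit it later."""
    return {
        "quarter": quarter,
        "model": model,
        "prompt": prompt,
        "answer_text": answer_text,    # the raw answer, kept verbatim
        "brands_named": brands_named,  # what you extracted from it
    }


def append_run(handle, record: dict) -> None:
    """Write one record as a single line of JSON."""
    handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_runs(handle) -> list[dict]:
    """Read every logged record back, skipping blank lines."""
    return [json.loads(line) for line in handle if line.strip()]
```

In practice `handle` would be a file opened in append mode, for example `open("runs.jsonl", "a", encoding="utf-8")`, where the file name is a placeholder.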

The Measurement Trap: Noise, Non-Determinism And Vanity

The Recall dimension is where most entity-authority measurement goes wrong, because AI outputs are probabilistic in a way that breaks naive tracking. If you do not understand the noise, you will mistake randomness for signal and make confident decisions on data that means nothing.

The central fact is non-determinism. Ask the same model the same category question repeatedly and you get different brand sets and different orderings each time. One 2026 study found that the probability of two responses producing the same ordered brand list was under one in a thousand across nearly three thousand runs. The practical consequence is severe and widely misunderstood: any metric based on where your brand appears in a single answer is measuring noise. Position within one response is essentially random; what is stable, and therefore meaningful, is frequency across many runs. Measure how often your brand appears in a large pool of responses, not where it landed in any one of them.
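As a rough illustration of frequency over position, the sketch below counts the share of runs in which a brand is named at all and attaches an uncertainty range to it. The Wilson score interval is my choice for the range, not something the studies above specify, and it assumes the runs are independent, which is only approximately true. The brand names are placeholders.

```python
import math


def mention_frequency(runs: list[list[str]], brand: str) -> float:
    """Share of runs that name the brand anywhere, ignoring its position."""
    if not runs:
        raise ValueError("need at least one run")
    hits = sum(1 for brands in runs if brand in brands)
    return hits / len(runs)


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% range for a proportion (Wilson score interval)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Four runs of one prompt; each inner list is the brands one answer named.
runs = [["Acme", "Bolt"], ["Bolt", "Cask"], ["Cask", "Acme", "Bolt"], ["Bolt"]]
acme_rate = mention_frequency(runs, "Acme")   # named in 2 of 4 runs
acme_range = wilson_interval(2, 4)            # very wide at this sample size
```

The width of the range is the useful part: with only a handful of runs it spans most of the scale, which is a practical reminder of why a single answer, or even a few, tells you very little.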

The second trap is single-model myopia. The major models disagree with each other constantly — one analysis found that eight leading systems agreed on their top recommendation under 44% of the time, and reached full consensus barely 4% of the time. They also draw on different sources: one leans on encyclopaedic and elite news sources, another on community platforms and forums, another on long-form editorial, another on its own search ecosystem. Measuring one model and generalising is therefore meaningless; you have to probe several and treat them as separate instruments reading different aspects of your entity.
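Treating each model as its own instrument can be as simple as never pooling their runs. The sketch below, with hypothetical function names and placeholder model labels, reports a brand's frequency per model and checks whether the models even agree on which brand they name most often. It assumes every model has at least one run that names at least one brand.

```python
from collections import Counter


def frequency_by_model(runs_by_model: dict[str, list[list[str]]],
                       brand: str) -> dict[str, float]:
    """Mention frequency for one brand, kept separate for each model."""
    return {
        model: sum(brand in run for run in runs) / len(runs)
        for model, runs in runs_by_model.items()
    }


def most_named(runs: list[list[str]]) -> str:
    """Brand named in the most runs; ties are broken alphabetically."""
    counts = Counter(b for run in runs for b in set(run))
    return min(counts, key=lambda b: (-counts[b], b))


def models_agree(runs_by_model: dict[str, list[list[str]]]) -> bool:
    """True only if every model's most-named brand is the same brand."""
    return len({most_named(runs) for runs in runs_by_model.values()}) == 1
```

A brand can post the same frequency on two models while those models favour entirely different leaders, which is why the per-model view is worth reporting alongside any blended figure.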

The third trap is the self-defined competitor pool, and it is the one tool vendors quietly build in. If a tool asks you to nominate your competitors before it computes your share of voice, it is calculating your share inside a set you invented, not the set the AI actually produces. Remove a strong competitor from your list and your share ‘improves’ instantly — a flattering illusion with no basis in reality. Honest measurement lets the models reveal the competitive set, then measures your share of what they actually say, not your share of a pool you curated to look good.

There is a final, subtler distinction worth carrying: being named is not the same as being absorbed. A model can list your brand without that mention actually shaping its reasoning, and it can lean heavily on your evidence while crediting it lightly. Researchers separate citation selection — making the reference list — from citation absorption — actually influencing the generated answer. You will rarely measure absorption precisely, but keeping the distinction in mind stops you over-valuing a shallow name-drop and under-valuing the cases where your content is genuinely doing the work behind an answer.

Hold all of this together and a clear measurement posture emerges: humble, repeated and comparative. Humble, because every individual number is noisy and partial, and false precision here is worse than honest approximation. Repeated, because frequency over many runs is the only thing that converts noise into signal. Comparative, because an entity-authority figure means nothing in isolation and everything against the field. A team that internalises those three principles will extract real intelligence from messy data; a team that wants a clean, single, absolute score will either be misled by one or give up in frustration when the numbers refuse to behave. The measurement is genuinely harder than counting links — but it is tractable, and the teams treating it as tractable are quietly building a picture their competitors do not have.

Building A Prompt Library That Means Something

Recall measurement is only as good as the questions you ask, and this is where most teams quietly sabotage themselves. The fix is a deliberately constructed prompt library rather than a handful of queries you thought of on the day.

Aim for something like twenty to fifty prompts that genuinely represent how buyers in your category interrogate an AI. Spread them across four types: brand prompts (“what does [brand] do”), which test Recognition; category prompts (“best [category] tools in 2026”), which test Recall against the field; comparison prompts (“[brand] versus [competitor]”), which test Association and positioning; and problem prompts (“how do I solve [the problem you address]”), which test whether you surface at the moment of need rather than only when named. A library skewed entirely to one type gives you a distorted picture — strong brand-prompt performance can coexist with total absence from the category and problem prompts that actually drive decisions.

Categorise the library by buyer-journey stage, too — awareness, consideration, decision — so you can see whether your entity surfaces early, late or only when the buyer already knows your name. Many brands discover they are named freely in decision-stage, brand-aware prompts and entirely absent from the awareness-stage, problem-framed prompts where new buyers actually begin. That gap is one of the most actionable findings the whole exercise produces, and it is invisible unless the library is built to expose it. Once set, freeze the library: changing the prompts between quarters destroys comparability, so resist the urge to keep tinkering and let the fixed set reveal the trend.
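A small data structure can enforce both ideas in this section: every prompt carries a type and a journey stage, and the library gets a fingerprint so that an accidental edit between quarters is detectable. The class, function names and the choice of a SHA-256 fingerprint are all assumptions for illustration.

```python
import hashlib
import json
from dataclasses import dataclass

PROMPT_TYPES = ("brand", "category", "comparison", "problem")
STAGES = ("awareness", "consideration", "decision")


@dataclass(frozen=True)
class Prompt:
    text: str
    kind: str   # one of PROMPT_TYPES
    stage: str  # one of STAGES

    def __post_init__(self):
        if self.kind not in PROMPT_TYPES:
            raise ValueError(f"unknown prompt type: {self.kind}")
        if self.stage not in STAGES:
            raise ValueError(f"unknown stage: {self.stage}")


def missing_kinds(library: tuple[Prompt, ...]) -> list[str]:
    """Prompt types the library does not cover at all."""
    present = {p.kind for p in library}
    return [k for k in PROMPT_TYPES if k not in present]


def fingerprint(library: tuple[Prompt, ...]) -> str:
    """Stable hash of the library; it changes if any prompt is edited."""
    payload = json.dumps([[p.text, p.kind, p.stage] for p in library],
                         ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Recording the fingerprint next to each quarter's results gives you a quick check that two quarters were measured with the same questions before you compare them.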

One caution on prompt design that separates useful libraries from misleading ones: write the prompts in the buyer’s language, not your keyword list. A library built from the terms you wish people searched — your product category as you brand it, your feature names — will flatter you, because it primes the question towards your framing. The prompts that matter are the ones a buyer with no knowledge of you would type: the problem in their words, the category in its common name, the comparison they would actually run. If your entity only surfaces when the prompt already contains your vocabulary, you have measured your own echo, not your visibility. Build the library from how outsiders ask, and it will tell you the truth rather than the answer you hoped for.

The Proxies Worth Your Time

Across the four dimensions, a handful of proxies do most of the work. None is perfect; together they triangulate something real.

Branded search is the most underrated and one of the most honest. A rising trend of people searching your name directly — visible in Search Console — reflects genuine human interest that no SEO can fabricate, and it correlates with the entity strength AI systems reward. It is a lagging, organic vote on your Recognition, and it is sitting in a tool you already own. Watch the trend, not the absolute number, and watch it against your category’s seasonality.
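One simple way to watch the trend against seasonality is to compare each month with the same month a year earlier rather than with the month before. The sketch below assumes you have exported monthly branded-query totals by hand into a dictionary keyed by `YYYY-MM`; it does not call any Search Console API, and the figures are invented.

```python
def year_over_year(monthly: dict[str, int]) -> dict[str, float]:
    """Fractional change for each month versus the same month one year earlier."""
    changes = {}
    for month, value in monthly.items():
        year, mm = month.split("-")
        prior = monthly.get(f"{int(year) - 1}-{mm}")
        if prior:  # skip months with no usable baseline (missing or zero)
            changes[month] = (value - prior) / prior
    return changes


# Hypothetical branded-search clicks per month.
branded = {"2025-01": 100, "2025-02": 80, "2026-01": 120, "2026-02": 100}
trend = year_over_year(branded)   # January up 20%, February up 25%
```

Here February looks like a drop if you compare it with January, but it is growth once you compare like with like.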

AI share of voice is the headline Recall proxy, and it splits usefully in two. Entity-based share of voice counts how often you are named as a recommendation — a direct read on recall and the more actionable number for most brands. Citation-based share counts how often your content is cited as a source — a leading indicator of your influence on the answer even when the user never sees the reference. Both matter; report them separately, because a brand strong on one and weak on the other has a specific, diagnosable gap. For context on what ‘good’ looks like, the bar is sobering: one 2026 industry report put the average brand mention rate across AI answers at around 17%, with leaders far higher — most brands, in other words, are simply absent most of the time.
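Reporting the two readings separately only requires that each logged answer records both what was named and what was cited. In this sketch the record fields (`brands_named`, `domains_cited`) and the function name are hypothetical, and the output is a pair of simple rates for one brand rather than a share relative to rivals.

```python
def split_rates(responses: list[dict], brand: str, domain: str) -> dict[str, float]:
    """How often a brand is named, and how often its domain is cited as a source."""
    if not responses:
        raise ValueError("need at least one response")
    n = len(responses)
    named = sum(brand in r["brands_named"] for r in responses)
    cited = sum(domain in r["domains_cited"] for r in responses)
    return {"mention_rate": named / n, "citation_rate": cited / n}


# Placeholder data: cited in three answers out of four, named in only one.
responses = [
    {"brands_named": ["Acme"], "domains_cited": ["acme.example"]},
    {"brands_named": ["Bolt"], "domains_cited": ["acme.example"]},
    {"brands_named": [], "domains_cited": ["acme.example"]},
    {"brands_named": ["Bolt"], "domains_cited": []},
]
rates = split_rates(responses, "Acme", "acme.example")
```

A pattern like this one, cited often but rarely named, is exactly the kind of specific gap that a blended figure would hide.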

Mention metrics carry the Corroboration dimension — volume, velocity, source authority and sentiment — and the practical detail of tracking them belongs to the mention-measurement discipline covered elsewhere in this cluster. For the scorecard, the key is to roll them into a single Corroboration read and benchmark it against rivals rather than chasing the raw count. And for Association, the most direct proxy is qualitative but revealing: read how the models actually describe you in their answers. Do they place you in the right category, with the right peers, doing the right thing? A model that describes you accurately has a strong entity for you; one that hedges, mislabels or confuses you for another brand is telling you your Association and Recognition need work. Pair these reads with the current link building statistics and the AI-visibility features now appearing in the major link building tools to assemble the picture without building everything by hand.

The Tooling Landscape — And Where To Distrust It

You can run a credible baseline by hand, and for a first pass you should: a spreadsheet, a frozen prompt library, and an afternoon of running those prompts across two or three models will tell you more than most teams currently know. Manual measurement also builds an instinct for the noise that no dashboard conveys. But it does not scale, and once you want consistent quarterly tracking across dozens of prompts and several engines, tooling earns its place.

The market splits into two families. The first is dedicated AI-visibility trackers, which run prompt libraries across the major models and compute share of voice for you — some offer free entry-level grading, others are built for enterprise competitor tracking at scale. The second is the entity-and-mention family: tools focused on Knowledge Panel and entity health, alongside the brand-monitoring and listening tools that carry the Corroboration dimension. A complete measurement stack usually draws from both, because no single tool covers all four scorecard dimensions well.

Whichever you choose, hold the outputs at arm’s length, because the same brand can score very differently across tools that calculate share of voice differently. Three disciplines keep you honest. Pick one tool per dimension and stay with it, so your trend is internally consistent even if its absolute numbers are arguable. Be alert to the self-defined-competitor-pool trap baked into some platforms, and prefer tools that let the models reveal the competitive set. And treat every number as directional and relative — useful for tracking your own movement and your gap to rivals, not as an objective truth about your standing. A tool that gives you a confident single ‘AI authority score’ is usually hiding more than it reveals; the four-dimension read is harder to sell but far more honest.

When Recognition Is The Weak Link

Of the four dimensions, Recognition is the one most worth fixing first when it scores poorly, because it is the gate. If the systems do not hold a clean, single entity for your brand, every mention and link you earn scatters across fragments instead of compounding into one strong node — and no amount of Recall work will pay off on a foundation that keeps splitting the signal.

The most common and most fixable failure is naming inconsistency. When your brand appears across the web in different forms — with and without a suffix, with varying capitalisation, under slightly different product or company names — the entity graph can fragment, and mentions that should reinforce one node get spread across several weak ones. The remedy is unglamorous plumbing: use one canonical name everywhere, attribute your experts consistently, and implement the structured signals that let systems resolve you to a single entity — organisation schema, canonical identifiers, and the explicit ‘same as’ links that tie your various profiles to one identity. This is the foundational technical step behind AI visibility, and it is invisible until you measure Recognition and find it leaking.
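For reference, the structured signals mentioned above usually take the form of a schema.org `Organization` block in JSON-LD, with `sameAs` listing your official profiles. The sketch below generates one and adds a crude check for naming drift across profiles. The brand, the URLs and both function names are placeholders, and real markup would normally carry more properties than shown here.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str],
                        alternate_names: tuple[str, ...] = ()) -> str:
    """Build a minimal schema.org Organization block as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,              # the one canonical name
        "url": url,
        "sameAs": list(same_as),   # official profiles for the same entity
    }
    if alternate_names:
        data["alternateName"] = list(alternate_names)
    return json.dumps(data, indent=2)


def name_variants(profiles: dict[str, str]) -> set[str]:
    """Distinct spellings of the brand name found across your own profiles."""
    return {name.strip() for name in profiles.values()}


markup = organization_jsonld(
    "Acme Analytics",
    "https://acme.example",
    ["https://social.example/acme-analytics"],
)
variants = name_variants({
    "website": "Acme Analytics",
    "directory": "ACME Analytics Ltd",
    "social": "Acme Analytics",
})
```

More than one entry in `variants` is a prompt to pick the canonical form and correct the rest; the JSON-LD string would be placed in a `script` element of type `application/ld+json` on the site.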

A note on patience, because Recognition rewards it and punishes the impatient. Shifting how a model perceives a brand within a category is not a quick win; one body of research suggests it can take on the order of hundreds of substantial documents — genuine, expert-led, structured content and coverage, not thin filler — to move the needle meaningfully. That is a high bar, and it is precisely why entity authority compounds for those who start early and frustrates those who expect it to respond like a ranking. The brands that will own their categories in the answer layer are the ones building recognised, corroborated entities now, while the measurement is still crude and the field is still open. Measuring early is itself an advantage, because it tells you where you stand while there is still time to act on it.

A Composite Case: The DR-Rich, Recall-Poor Brand

Consider an anonymised composite that captures the gap this whole article is about. An established software brand had spent years building a strong link profile and was, by every classic metric, doing well: a healthy domain rating, thousands of referring domains, solid rankings for its core terms. Leadership considered its authority a settled matter. Then someone ran an entity-authority scorecard for the first time, and the picture inverted.

Recognition scored well — a clean Knowledge Panel, consistent naming, healthy branded search. But Recall was near zero: across a fifty-prompt library run repeatedly over the major models, the brand was named in a small single-digit percentage of category and problem prompts, while two younger competitors with weaker link profiles dominated. The diagnosis sat in the middle two dimensions. Association was thin — the models rarely placed the brand in its own category — and Corroboration was thinner still: the brand had links, but few recent, credible, independent mentions describing what it did. Its authority was real in the link graph and almost absent in the entity graph. The scorecard did not just reveal the gap; by separating the four dimensions, it pointed straight at the cause — not a missing tool or a ranking problem, but a starving Corroboration layer that no amount of additional links would feed.

What makes the case instructive is not the gap itself but how invisible it had been. The brand had a full analytics stack and a competent team, and not one of its existing dashboards could have surfaced the problem, because every one of them measured the link graph the brand was already winning. The scorecard’s value was simply that it asked a different question — not ‘how strong are our links’ but ‘does the web know us as a player in our category’ — and the answer reorganised the team’s entire roadmap away from more link acquisition and towards the corroboration and association work the entity layer was starving for. The measurement did not just describe the problem; it redirected the budget.

Your Monday-Morning Move

Build a minimum-viable scorecard in one sitting. Write fifteen category and problem prompts a real buyer would ask. Run each one three times across two or three different AI models, and tally how often your brand is named versus your two closest competitors — frequency across the runs, never position within an answer. In parallel, note three quick Recognition checks: do you have a Knowledge Panel, is your name used consistently across your key profiles, and is branded search trending up in Search Console? You will end the hour with a crude but honest baseline across Recognition and Recall, and almost certainly one uncomfortable finding — usually that your link authority and your entity authority are nowhere near as aligned as you assumed. Diarise the same fifteen prompts for ninety days’ time. The single most valuable thing about this measurement is the trend, and the trend only exists once you have run it twice.

Measuring The Thing That Now Decides

For a decade, the comforting thing about link metrics was that they measured something you owned and could count. Entity authority offers no such comfort: it lives inside systems you cannot see into directly, it is noisy, probabilistic and partial, and it refuses to resolve into a single satisfying number. That is genuinely harder to work with. It is also where the visibility that increasingly matters is now decided, which means the difficulty is not optional — it is the job.

The discipline this article asks for is modest in effort and large in payoff: define the four dimensions, probe them with honest proxies, respect the noise by measuring frequency over many runs against a real competitive set, and track the trend rather than the snapshot. Done quarterly, that turns entity authority from an article of faith into something you can actually see move — and, crucially, lets you tell which of your inputs is working. A brand that can see its Corroboration rising and its Recall following has converted the vaguest concept in modern SEO into a managed metric.

There is also a competitive timing argument for doing this now rather than later. AI visibility is, in most categories, still up for grabs: the average brand is absent from the answer far more often than it is present, so the gap between the brands that measure and act and the brands that do not is already wide and can still be widened. Measurement is the first-mover’s instrument: it tells you where the open space is in your category while the space is still open. The brands that wait until entity authority is easy to measure will be measuring a race that has already been run. The ones that tolerate the current messiness, and act on what it shows, are the ones writing the result.

It also closes the loop on everything this cluster has argued. The co-citation work, the mention-building, the shift from connectivity to corroboration — all of it is a bet that authority is moving from the link graph to the entity graph. The scorecard is how you check whether the bet is paying off in your specific category, on your specific timeline, rather than trusting the trend in the abstract. The next article turns from measurement to construction in earnest, asking what a link campaign looks like when its real product is no longer the link at all, but the entity it leaves behind.
