Most B2B teams build lead scoring models that grade the lead's score, not the lead. We have handled over 95,000 positive replies across 50 plus B2B campaigns this year, and the single biggest gap between teams that book meetings and teams that argue about scoring is whether the score actually changes what someone does next. Below you will find the plain English definition of lead scoring, the 100 point model that works in practice, the 5 mistakes that quietly kill the model, and how to wire scoring into a real sales motion that books meetings.
What Is Lead Scoring: The Short Definition
- Lead Scoring: A system that assigns each lead a numerical value (commonly on a 0 to 100 scale) reflecting how likely they are to become a paying customer. The score is built from a weighted combination of who the lead is (firmographic and demographic fit) and what the lead has done (behavioral engagement). The goal is to give sales a triage signal so reps work the leads most likely to convert first, instead of going down the list in the order they arrived.
- Fit Score (the demographic and firmographic half): The portion of the lead score that measures who the lead is on paper. Inputs are static attributes that do not change with behavior: industry, employee count, revenue band, job title, seniority, geography, and tech stack. A perfect fit score means the lead matches your ideal customer profile exactly. A weak fit score signals the lead is in the wrong segment regardless of how engaged they look.
- Behavior Score (the engagement half): The portion of the lead score that measures what the lead has done. Inputs are dynamic actions: opening an email, visiting a pricing page, downloading a guide, attending a webinar, replying to outbound, requesting a demo, or using a product trial. Behavior signals decay over time so a recent action carries more weight than the same action from 6 weeks ago. A strong behavior score signals real intent. A weak one signals the lead is browsing, not buying.
The mental model that helps: fit score answers "should I sell to this person at all." Behavior score answers "is this person ready to buy now." Both halves matter. A perfect fit lead with zero behavior is a future opportunity for marketing nurture. A high behavior lead with terrible fit is usually a competitor, a job seeker, or someone wasting your sales team's time. The composite score routes the lead to the right next action.
The Two Halves of a Lead Score and Why You Need Both
Models built on fit alone miss the difference between a CMO who just landed on your pricing page after an industry report cited you and a CMO who downloaded a guide 18 months ago and has not opened an email since. Models built on behavior alone treat the unemployed marketing manager binge reading your blog as more valuable than the VP of Sales at your exact ICP who has not clicked anything yet. Neither half tells the full story.
The standard weighting across mature B2B teams is either roughly 40 percent fit and 60 percent behavior or the inverse, depending on the sales motion. Product-led SaaS leans heavier on behavior because product usage is the strongest buying signal there is. Enterprise sales with long cycles and tight ICPs lean heavier on fit because behavior in a 9 month sales cycle is noisier. Pick the weighting that matches how your deals actually close, not the one the CRM template ships with.
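One way to picture the weighting choice is a blended composite. The sketch below assumes both halves have already been normalized to 0 to 100 sub-scores; the function name `composite_score` and the `fit_weight_pct` parameter are illustrative, not something a particular CRM provides.

```python
def composite_score(fit: float, behavior: float, fit_weight_pct: int = 40) -> float:
    """Blend a 0-100 fit score and a 0-100 behavior score into one 0-100 composite.

    fit_weight_pct is the share of the composite given to fit; behavior gets the rest.
    The default of 40 mirrors the 40/60 split described above (an assumption, tune it).
    """
    if not 0 <= fit_weight_pct <= 100:
        raise ValueError("fit_weight_pct must be between 0 and 100")
    behavior_weight_pct = 100 - fit_weight_pct
    return (fit * fit_weight_pct + behavior * behavior_weight_pct) / 100


# A perfect-fit lead with zero engagement under each weighting.
behavior_heavy = composite_score(fit=100, behavior=0, fit_weight_pct=40)
fit_heavy = composite_score(fit=100, behavior=0, fit_weight_pct=60)
```

Under the behavior-heavy split that silent perfect-fit lead lands at 40; under the fit-heavy split it lands at 60, which is why the weighting should follow how deals actually close.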
Per Salesforce's lead management guidance, the most accurate models also factor in negative signals that subtract points: competitor email domains, generic free email addresses on enterprise targeting, unsubscribes, repeated bounces, or roles that do not have buying authority. Without negative scoring, the top of your scored list fills up with people who will never buy, and sales loses faith in the system inside the first quarter.
How to Build a Lead Scoring Model: The 100 Point Framework
The cleanest starting model is a 100 point scale split into 4 buckets. The exact percentages flex per company, but the structure is the same across every mature B2B team we have seen.
| Bucket | Point Range | What It Measures |
|---|---|---|
| Firmographic Fit | 25 points | Company size, industry, revenue band, geography, tech stack alignment |
| Demographic Fit | 25 points | Job title, seniority, decision-making authority, department |
| Behavior and Intent | 40 points | Email engagement, page visits, demo requests, content downloads, product usage |
| Negative Signals | Up to minus 30 points | Competitor domains, free email addresses, unsubscribes, role mismatches, bounces |
Inside each bucket, individual signals get their own point values. For firmographic fit, an exact industry match might be worth 15 points, a revenue band match 5 points, and a tech stack match 5 points. For behavior, opening 3 emails in a week is worth 2 points, visiting the pricing page is worth 10 points, requesting a demo is worth 25 points. The specific numbers come from looking backward at the last 90 days of closed deals and asking which signals correlated with wins.
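A minimal sketch of the bucket structure, assuming a simple additive model. The bucket caps come from the table above and the firmographic and behavior point values come from the examples in the previous paragraph. The demographic and negative point values, the signal names, and the `score_lead` function are placeholders invented for illustration; replace them with numbers from your own closed-deal data.

```python
# Per-bucket caps taken from the table above; the negative bucket has a floor instead.
BUCKET_CAPS = {"firmographic": 25, "demographic": 25, "behavior": 40}
NEGATIVE_FLOOR = -30

# signal name -> (bucket, points)
SIGNALS = {
    "industry_match": ("firmographic", 15),
    "revenue_band_match": ("firmographic", 5),
    "tech_stack_match": ("firmographic", 5),
    "senior_title": ("demographic", 15),        # assumed value
    "buying_authority": ("demographic", 10),    # assumed value
    "opened_3_emails_in_week": ("behavior", 2),
    "pricing_page_visit": ("behavior", 10),
    "demo_request": ("behavior", 25),
    "competitor_domain": ("negative", -30),     # assumed value
    "free_email_domain": ("negative", -10),     # assumed value
    "unsubscribed": ("negative", -15),          # assumed value
}


def score_lead(signals):
    """Sum points per bucket, apply each bucket's cap, then clamp to 0-100."""
    totals = {"firmographic": 0, "demographic": 0, "behavior": 0, "negative": 0}
    for name in set(signals):  # each signal counts once
        bucket, points = SIGNALS[name]
        totals[bucket] += points
    positive = sum(min(totals[b], cap) for b, cap in BUCKET_CAPS.items())
    negative = max(totals["negative"], NEGATIVE_FLOOR)
    return max(0, min(100, positive + negative))


good_fit = ["industry_match", "revenue_band_match", "tech_stack_match",
            "pricing_page_visit", "demo_request"]
print(score_lead(good_fit))                          # full firmographic fit plus strong intent
print(score_lead(good_fit + ["free_email_domain"]))  # same lead on a free email address
```

Capping each bucket keeps one noisy category, such as a lead who opens every email, from drowning out the others.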
Recency weighting is the part most operators skip. A pricing page visit 48 hours ago should be worth meaningfully more than a pricing page visit 6 weeks ago. The standard approach is a decay function that halves the value of a behavior signal every 30 days. Without decay, leads accumulate stale points forever and the model starts handing sales 180 day old browsers as if they were today's hot prospects.
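The 30 day halving rule is an exponential decay with a 30 day half-life. A minimal sketch, assuming the age of each signal is available in days; the function name `decayed_points` is illustrative.

```python
def decayed_points(points: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Return the current value of a behavior signal that halves every half_life_days."""
    if age_days < 0:
        raise ValueError("age_days cannot be negative")
    return points * 0.5 ** (age_days / half_life_days)


# A 10 point pricing page visit at different ages.
fresh = decayed_points(10, age_days=2)     # visited 48 hours ago
stale = decayed_points(10, age_days=42)    # visited 6 weeks ago
ancient = decayed_points(10, age_days=180)
```

With these settings the 48 hour old visit keeps most of its 10 points, the 6 week old visit keeps under 4, and the 180 day old visit is worth a small fraction of a point.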
MQL, SQL, and Where the Thresholds Should Land
The scored model only matters if the score triggers a different action. The standard B2B convention defines two thresholds: MQL (Marketing Qualified Lead) and SQL (Sales Qualified Lead). MQL is the score at which marketing hands the lead to sales. SQL is the score at which sales treats the lead as a real opportunity worth booking a meeting.
On a 100 point scale, the typical landing zones look like this:
- 0 to 49 points: Cold or unqualified. Stays in marketing nurture. No sales touch yet.
- 50 to 74 points (MQL): Worth a light sales touch. SDR sends a personal email or LinkedIn message. No demo push yet.
- 75 to 89 points (SQL): Worth a real conversation. SDR books a discovery call. AE prepares for the meeting.
- 90 to 100 points (Sales Ready): Immediate outreach within hours. AE is alerted directly. Often skips the SDR stage entirely.
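The landing zones above map directly to a lookup. This is a sketch using the thresholds from the list; the tier labels and the `tier_for` function name are illustrative.

```python
# (minimum score, tier label), highest tier first. Thresholds from the list above.
TIERS = [
    (90, "sales_ready"),
    (75, "sql"),
    (50, "mql"),
    (0, "nurture"),
]


def tier_for(score: float) -> str:
    """Return the tier whose minimum score the lead has reached."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for minimum, label in TIERS:
        if score >= minimum:
            return label
    return "nurture"  # unreachable for valid scores, kept as a safe default
```

Keeping the thresholds in one table makes it easy to move a line after a backtest without touching the routing logic.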
The right thresholds for your team depend on volume and capacity. If your sales team can work 200 leads a week and your scoring model produces 800 leads above 50 points each week, the MQL threshold is too low. Raise it until the volume matches what sales can actually touch. The point of scoring is not to flag every lead as qualified. The point is to give sales the 200 leads most likely to close and let the rest keep maturing.
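"Raise it until the volume matches" can be done mechanically. The sketch below assumes you can export one week of lead scores as a list; `threshold_for_capacity` is a hypothetical helper, and the sample data is made up to mirror the 800 leads versus 200 capacity example.

```python
from typing import Optional, Sequence


def threshold_for_capacity(scores: Sequence[float], capacity: int,
                           start: int = 50) -> Optional[int]:
    """Return the lowest threshold (from start up to 100) at which the number of
    leads scoring at or above it fits within sales capacity, or None if none fits."""
    for threshold in range(start, 101):
        qualified = sum(1 for s in scores if s >= threshold)
        if qualified <= capacity:
            return threshold
    return None


# Made-up week: 800 leads at or above 50, but only 200 of them above 65.
week_scores = [55] * 400 + [65] * 200 + [80] * 150 + [95] * 50
mql_threshold = threshold_for_capacity(week_scores, capacity=200)
```

In this made-up week the MQL line moves from 50 up to 66, the first score at which the qualified volume drops to what the team can actually touch.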
Backtest both thresholds against the last 90 days of closed deals before locking them in. Pull every closed-won deal and look at the lead score on the day the opportunity was created. If 80 percent of your wins came from leads scored 70 or higher, the MQL line at 50 is wasting sales time on leads that historically have not converted. Move the threshold to match the data.
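The backtest reduces to two questions: what share of wins sat at or above a given score, and what is the highest threshold that still captures a target share of wins. A sketch, assuming you have the lead score recorded on the day each closed-won opportunity was created; the function names and the sample scores are illustrative.

```python
from typing import Sequence


def share_of_wins_at_or_above(win_scores: Sequence[float], threshold: float) -> float:
    """Fraction of closed-won deals whose lead score was at or above the threshold."""
    if not win_scores:
        raise ValueError("need at least one closed-won deal to backtest")
    return sum(1 for s in win_scores if s >= threshold) / len(win_scores)


def highest_threshold_capturing(win_scores: Sequence[float], target_share: float) -> int:
    """Highest integer threshold (0-100) that still captures target_share of wins."""
    best = 0
    for threshold in range(0, 101):
        if share_of_wins_at_or_above(win_scores, threshold) >= target_share:
            best = threshold
    return best


# Made-up scores for 10 closed-won deals.
wins = [72, 85, 91, 70, 64, 78, 88, 95, 55, 74]
captured = share_of_wins_at_or_above(wins, 70)
suggested = highest_threshold_capturing(wins, 0.8)
```

Here 80 percent of the sample wins scored 70 or higher, so 70 is the highest line that keeps that share of historical wins above it.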
The 5 Mistakes That Make Lead Scoring Useless
Most B2B lead scoring models do not fail because the math is wrong. They fail because of operational gaps that look minor in isolation and quietly compound. The same 5 patterns show up over and over across the teams we work with.
| Mistake | How It Shows Up | How to Fix It |
|---|---|---|
| No negative scoring | Top of the scored list fills with competitors, job seekers, and internal employees | Subtract points when the lead uses a generic email domain, works at a competitor, or holds a non-buying title |
| No recency decay | Stale behavior from 6 months ago counts the same as a pricing page visit today | Halve the weight of every behavior signal every 30 days; cap behavior contribution at 12 months |
| Marketing built it alone | Sales ignores the score, calls leads in their own order, the model becomes a dashboard ornament | Co-build with sales from day one; the AE who works the leads sets the SQL threshold |
| No backtest against closed deals | Thresholds are guesses, the model surfaces leads that historically do not convert | Pull the last 90 days of closed-won deals, set thresholds at the score the wins actually had |
| Never refreshed | ICP shifts, product changes, channel mix changes, but the scoring weights stay frozen for years | Quarterly audit; reweight every bucket against the most recent 90 days of pipeline data |
The pattern that breaks scoring the fastest is the marketing-built-alone version. Marketing ops spends 4 weeks building a beautiful model in HubSpot or Marketo. Sales gets a 30 minute walkthrough. The model goes live. Sales takes one look at the top 20 scored leads, recognizes 3 of them as bad fits the model thinks are perfect, loses trust on day one, and starts working the list in their own order. Six months later the score is a number in the CRM nobody references. The fix is to co-build the model with the AE who will actually work the leads. The sales-built half is what makes the marketing-built half load bearing.
How to Wire Lead Scoring Into Your Sales Motion
A scored lead in the CRM does nothing on its own. The score has to trigger a routing decision and a different next action for the system to produce real lift. The wiring matters as much as the model.
- Set the routing rule. When a lead crosses the MQL threshold, the CRM creates a task for the SDR within 5 minutes. When the SQL threshold trips, the AE gets a direct alert (Slack, email, or in-app) and the lead is added to the AE's priority queue.
- Match the cadence to the score. MQL leads get a 5 touch outbound cadence over 14 days. SQL leads get a 3 touch cadence over 5 days with one of those touches being a personal video or LinkedIn message. Sales-ready leads get a same-day phone call.
- Track score-to-meeting conversion by bucket. Report weekly on meetings booked per 100 leads in each score bucket. If leads scoring 90 or above book at 35 percent and leads scoring 75 to 89 book at 12 percent, the SQL threshold is calibrated. If both buckets book at the same rate, the model is not actually segmenting the leads.
- Feed booking data back into the model. Leads that booked meetings and closed get a positive feedback loop into the scoring weights. Leads that booked and did not close get analyzed for which signals were misleading. The model gets sharper every quarter only if the closed-deal data flows back in.
- Set a kill rule. Leads that score below 30 for 90 consecutive days go to a sunset list. They get one last reactivation email then leave the active database. Without a kill rule, dead leads accumulate, the database bloats, and the average score drifts down over time as stale records dilute the active ones.
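Two of the steps above, the weekly conversion report and the kill rule, are simple enough to sketch. This assumes leads can be exported as (score, booked) pairs and that a daily score history exists per lead; the function names and sample data are hypothetical.

```python
from collections import defaultdict

# (minimum score, bucket label), highest first.
BUCKETS = [(90, "90-100"), (75, "75-89"), (50, "50-74"), (0, "0-49")]


def bucket_for(score):
    """Return the label of the score bucket a lead falls into."""
    for minimum, label in BUCKETS:
        if score >= minimum:
            return label
    raise ValueError("score must be 0 or higher")


def booking_rate_by_bucket(leads):
    """leads: iterable of (score, booked_meeting). Returns bucket -> booking rate."""
    totals, booked = defaultdict(int), defaultdict(int)
    for score, did_book in leads:
        label = bucket_for(score)
        totals[label] += 1
        booked[label] += int(did_book)
    return {label: booked[label] / totals[label] for label in totals}


def should_sunset(daily_scores, threshold=30, window_days=90):
    """True if the lead scored below threshold on each of the last window_days days."""
    if len(daily_scores) < window_days:
        return False
    return all(s < threshold for s in daily_scores[-window_days:])


sample = [(95, True), (92, False), (80, False), (78, False), (76, True), (77, False)]
rates = booking_rate_by_bucket(sample)
```

If the rates for adjacent buckets come out roughly equal over several weeks, that is the signal that the threshold between them is not separating anything.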
Travis replaced his in-house SDR with a scored outbound system and hit $106K in his first full month. The triage piece is what made it work.
When Lead Scoring Is Not Worth Building
Lead scoring is a triage tool. Triage only matters when you have more leads than capacity. For small teams under 100 inbound leads per month, a scored model adds setup and maintenance overhead with no real triage benefit because sales can touch every lead the same day regardless of score. A simple 3 column spreadsheet ranked by deal size beats a 100 point model at that volume.
The other case where scoring does not earn its keep is high-velocity outbound. If every positive reply gets a same-day SDR touch regardless of which prospect it came from, the scoring overhead adds nothing. We run AI outbound at 8 million emails a year across 50 plus campaigns and the highest leverage lift is not scoring the reply, it is responding within 60 seconds with a personalized lead magnet. Speed beats segmentation when the volume is on the outbound side, not the inbound side.
Scoring also breaks down when the lead source is too narrow to support statistical confidence. A team running 12 deals a quarter does not have enough closed-won data to backtest threshold calibration. Wait until you have 60 to 90 deals worth of historical pipeline before locking thresholds. Until then, the score is theater. Per Gartner's account based marketing research, scoring is one input into a broader account-level prioritization system, not the system itself.
The Practitioner Take on Lead Scoring in 2026
Lead scoring is a tool, not a strategy. The tool only matters if it changes what the team does next, and changing what the team does next requires sales to trust the score. Trust comes from co-building the model with the AEs who will work the leads, backtesting thresholds against real closed deals, and updating the weights every quarter as the business evolves. Skip any of those steps and the model becomes a CRM field nobody references.
The other shift worth naming: scoring is moving from rules-only to a hybrid where machine learning sets the behavior weights and humans set the fit weights. The hybrid model accepts that humans know their ICP better than an algorithm does, and that an algorithm reads engagement patterns better than a human can. Most mid-market B2B teams in 2026 do not need a full ML model. A well-built 100 point rules-based model with quarterly weight refreshes outperforms a misconfigured ML model every time. Build the rules version first, validate it for 2 quarters, then layer ML in if the volume justifies the complexity.
If you are starting from zero, the order is: define your ICP precisely, score 50 historical closed-won deals manually to identify the patterns, build a simple 100 point model in your CRM, set conservative MQL and SQL thresholds, route to sales with a 5 minute SLA, and review the score-to-close conversion weekly for the first 90 days. The model gets sharper the moment it is in contact with real pipeline data. The teams that wait for the perfect model never ship one. The teams that ship the rough version and refine it weekly end up with the model that books meetings.