Screening A[i]gent Blueprint: Hybrid Rubric & Candidate Evaluation
1. Problem & Mission
1.1 Structural Constraint
High-volume recruiting environments operate under structural constraints. A single open role can receive 800–1,000 applications before closing. Even if only 10% meet baseline qualification criteria, that still produces 80–100 viable candidates requiring thoughtful evaluation.
Recruiter and hiring manager interview capacity rarely scales at the same rate as application volume. Recruiters may be able to screen 25–50 candidates per role; hiring managers may have bandwidth for 10–15 interviews.
When evaluation capacity is constrained, screening decisions compress into seconds per resume. Under these conditions, the bottleneck is not applicant supply — it is structured evaluation capacity.
In a typical funnel:
- ~1,000 applications received
- ~100 viable candidates
- ~25–50 recruiter screens
- ~10–15 hiring manager interviews
This means a significant portion of qualified candidates may never reach structured, consistent review. The system quietly optimizes for speed over judgment.
1.2 The Risk
When review time is compressed, evaluation becomes discretionary and inconsistent. Reasons for rejection are rarely structured or auditable. Bias risk increases under time pressure. Funnel data becomes too noisy to diagnose missed talent or fairness drift later.
The problem is not recruiter intent. It is infrastructure.
1.3 Mission
Screening A[i]gent exists to introduce structured, rubric-driven evaluation at the earliest screening stage — before interview bandwidth becomes the bottleneck.
The objective is not to replace recruiter judgment. It is to standardize and document it.
- Ensure every inbound application receives structured evaluation.
- Preserve human decision authority.
- Create auditable, explainable screening outputs.
- Improve calibration, fairness, and downstream analytics.
1.4 Outcomes We’re Aiming For
- Every inbound application receives structured, rubric-driven evaluation.
- Clear, explainable recommendations for recruiters and hiring managers.
- Reduced time spent on “obvious no” reviews.
- Structured data that feeds downstream workflow and analytics.
- Improved calibration and fairness monitoring over time.
2. Scope & Design Principles
2.1 In-Scope
- Inbound application / resume evaluation for defined roles.
- Transformation of JDs and HM requirements into role-specific rubrics.
- Hybrid scoring across experience, tenure, employer fit, skills, education, and soft indicators.
- Risk flagging (e.g., job hopping, gaps, inconsistent titles).
- Recommendations: Advance / HM Review / Do Not Advance.
- Ask-next prompts for recruiter screens.
- Structured outputs into ATS fields and Metrics A[i]gent.
2.2 Out-of-Scope (v1)
- Live interview workflows (owned by Workflow A[i]gent).
- Offer, onboarding, or HRIS provisioning.
- Sourcing, campaign management, or nurture flows.
- Full ML-based prediction of performance (future extension, grounded + validated).
2.3 Design Principles
- Rubric-first, model-second. The rubric is the contract; AI is the assistant.
- Hybrid scoring. Every category has a human-readable band and a numeric score under the hood.
- Explainable. Every recommendation can be traced to specific signals and weights.
- Human-in-the-loop. Recruiters must approve or override the agent’s recommendation.
- Bias-aware. Flags, not vetoes; calibration over time, not hard-coded stereotypes.
2.4 Global Rules
- No fully automated rejection: a human must confirm “Do Not Advance.”
- Every override requires a short note, creating an audit trail.
- Rubrics and weights are documented and discoverable by TA & HMs.
- The agent does not infer or store protected characteristic data.
3. Roles & Responsibilities
| Role | Responsibilities |
|---|---|
| Recruiter | Reviews output, approves/overrides recommendations, adds notes, and owns candidate communication. |
| Hiring Manager | Aligns on rubric and thresholds, reviews structured summaries, and provides feedback on edge cases. |
| TA Ops / ATS Admin | Configures fields + mappings; maintains rubric definitions and scoring weights. |
| People Analytics / Metrics A[i]gent Owner | Builds dashboards; monitors fairness, overrides, and funnel quality. |
| Owner | Diane Wilkinson – design, implementation, and continuous improvement of Screening A[i]gent. |
4. System Overview
At a high level, Screening A[i]gent runs the following loop:
1. Inputs: JD + HM priorities + resume + metadata (location, source, etc.).
2. Rubric assembly: Convert requirements into structured rubric components.
3. Evidence extraction: Parse for roles, companies, tenure, skills, education, signals.
4. Scoring: Apply hybrid scoring across categories, with risk penalties.
5. Recommendation: Generate disposition and ask-next prompts.
6. Output: Write scores, bands, tags, and recommendation back to ATS fields.
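A minimal sketch of this loop in Python, with stubbed helpers standing in for real parsing, scoring, and ATS writes (function and field names are illustrative, not a fixed interface):
```python
def assemble_rubric(jd: dict, hm_priorities: dict) -> dict:
    # Step 2 placeholder: JD + HM priorities -> weighted categories (see section 5).
    return {"experience_depth": 30, "tenure_stability": 15, "competitor_fit": 15,
            "skills_match": 25, "education": 5, "soft_indicators": 10}

def extract_evidence(resume_text: str, metadata: dict) -> dict:
    # Step 3 placeholder: real parsing would pull roles, companies, tenure, skills, education.
    return {"years_relevant": 4, "avg_tenure_years": 2.5, "skills": ["salesforce", "meddic"]}

def score_candidate(rubric: dict, evidence: dict) -> tuple[int, list[str]]:
    # Step 4 placeholder: hybrid scoring across categories with risk penalties (see section 6).
    return 82, []

def run_screening(jd: dict, hm_priorities: dict, resume_text: str, metadata: dict) -> dict:
    rubric = assemble_rubric(jd, hm_priorities)            # 2. rubric assembly
    evidence = extract_evidence(resume_text, metadata)     # 3. evidence extraction
    score, risk_flags = score_candidate(rubric, evidence)  # 4. hybrid scoring
    recommendation = "Advance" if score >= 80 and not risk_flags else "HM Review"  # 5. simplified
    return {                                               # 6. structured output -> ATS fields
        "screening_score_overall": score,
        "screening_risk_flags": risk_flags,
        "screening_recommendation": recommendation,
    }

print(run_screening({}, {}, "resume text...", {"candidate_id": "c-123"}))
```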
4.1 Technical Architecture
Under the hood, Screening A[i]gent runs on a modular stack: a reasoning layer for rubric scoring, a light orchestration layer to sequence steps, and ATS-native integrations (e.g., Greenhouse custom fields) to write structured outputs back into the system of record.
5. Rubric & Signal Library
The rubric is broken out into categories with both band-level interpretation and numeric scores. Risk flags act as negative adjustments to the overall score.
5.1 Rubric Categories
| Category | Max Points | Summary |
|---|---|---|
| Experience Depth | 30 | Relevancy of prior roles to this job’s scope and level. |
| Tenure Stability | 15 | Consistency and average time-in-role, tuned for tech. |
| Competitor / Industry Fit | 15 | Employer pedigree across competitors, adjacencies, and relevant tech. |
| Skills & Tools Match | 25 | Coverage of must-have and nice-to-have skills; transferability. |
| Education & Credentials | 5 | Baseline requirements and relevant advanced credentials. |
| Soft Indicators | 10 | Signals like clarity, detail, follow-through, and polish. |
| Risk Flags (penalty) | -20 | Moderate negative weighting for meaningful risk patterns. |
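These weights can live as a small, versioned configuration object so changes stay auditable; one possible sketch (keys and shape are illustrative, not a fixed schema):
```python
# Illustrative rubric configuration mirroring the table above.
RUBRIC_V1 = {
    "experience_depth": {"max": 30, "summary": "Relevance of prior roles to scope and level"},
    "tenure_stability": {"max": 15, "summary": "Consistency and average time-in-role, tech-tuned"},
    "competitor_fit":   {"max": 15, "summary": "Employer pedigree across competitors and adjacencies"},
    "skills_match":     {"max": 25, "summary": "Must-have / nice-to-have coverage and transferability"},
    "education":        {"max": 5,  "summary": "Baseline requirements and relevant credentials"},
    "soft_indicators":  {"max": 10, "summary": "Clarity, evidence of results, consistency"},
}
MAX_RISK_PENALTY = -20  # risk flags subtract at most 20 points from the overall score

assert sum(c["max"] for c in RUBRIC_V1.values()) == 100
```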
5.2 Experience Depth – Signals
- High band: 3+ years in highly relevant roles, progressive scope.
- Medium band: 1–3 years relevant, plus adjacent/transferable work.
- Low band: mostly unrelated roles, unclear match to level/domain.
5.3 Tenure Stability – Tech-Tuned
- High band: average 2–3 years per role, reasonable moves.
- Medium band: average ~1.5–2 years, coherent trajectory.
- Low band: repeated sub-12-month roles, unexplained gaps, frequent laterals.
5.4 Competitor / Industry Fit – Generous Model
- High band: direct competitors, adjacencies, strong tech brands.
- Medium band: general tech or relevant adjacent industries.
- Low band: low-signal employers for this role/domain.
5.5 Skills & Tools Match
- High band: all must-haves demonstrated; several nice-to-haves.
- Medium band: most must-haves; transferable skills; some gaps.
- Low band: missing critical skills or only adjacent experience.
5.6 Education & Credentials
- High band: baseline + directly relevant advanced credentials.
- Medium band: baseline/equivalent; relevant coursework.
- Low band: missing true business-critical requirements.
5.7 Soft Indicators
- Clarity and structure of resume content.
- Evidence of results (metrics, impact, ownership).
- Consistency between roles, responsibilities, and claimed achievements.
5.8 Risk Flags (Moderate Penalty)
Risk flags do not automatically disqualify candidates; they trigger ask-next questions and moderate penalties.
- Repeated short tenures (< 12 months) without context.
- Unexplained multi-year gaps.
- Title inflation vs scope (e.g., “VP” for an IC-level role).
- Buzzword-heavy content with little evidence of outcomes.
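A sketch of how flags might be detected and converted into the moderate penalty, assuming roles are represented as simple dicts with months in seat; the thresholds shown are illustrative defaults, not fixed rules:
```python
def detect_risk_flags(roles: list[dict]) -> list[str]:
    """Return risk flags; flags drive ask-next questions, never automatic rejection."""
    flags = []
    short_stints = [r for r in roles if r["months"] < 12 and not r.get("context")]
    if len(short_stints) >= 2:
        flags.append("repeated_short_tenures")
    if any(r.get("gap_months_after", 0) >= 24 for r in roles):
        flags.append("unexplained_multi_year_gap")
    return flags

def risk_penalty(flags: list[str]) -> int:
    """0 for no flags, -5 for a single mild flag, -10 to -20 as flags stack up."""
    if not flags:
        return 0
    if len(flags) == 1:
        return -5
    return max(-20, -10 * (len(flags) - 1))

roles = [{"title": "AE", "months": 8},
         {"title": "SDR", "months": 10, "gap_months_after": 30}]
flags = detect_risk_flags(roles)
print(flags, risk_penalty(flags))  # ['repeated_short_tenures', 'unexplained_multi_year_gap'] -10
```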
6. Hybrid Scoring & Recommendations
The hybrid model combines banded scores (explainability) with numeric ranges (analytics + tuning).
6.1 Category Bands & Points
| Category | Band | Points (example) |
|---|---|---|
| Experience Depth | High / Med / Low | 24–30 / 16–23 / 0–15 |
| Tenure Stability | High / Med / Low | 12–15 / 8–11 / 0–7 |
| Competitor Fit | High / Med / Low | 12–15 / 7–11 / 0–6 |
| Skills Match | High / Med / Low | 20–25 / 12–19 / 0–11 |
| Education | High / Med / Low | 4–5 / 2–3 / 0–1 |
| Soft Indicators | High / Med / Low | 8–10 / 4–7 / 0–3 |
| Risk Flags | None / Mild / Significant | 0 / -5 / -10 to -20 |
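One way to derive a category band from its points is to reuse the Appendix A percentage cut-offs (≈ 80% for High, ≈ 50% for Medium); a sketch, with thresholds tunable per category:
```python
def category_band(points: int, max_points: int) -> str:
    """Map category points to a band using the Appendix A percentage cut-offs."""
    ratio = points / max_points
    if ratio >= 0.80:
        return "High"
    if ratio >= 0.50:
        return "Medium"
    return "Low"

print(category_band(24, 30))  # High   (Experience Depth, 24-30 range)
print(category_band(9, 15))   # Medium (Tenure Stability, 8-11 range)
```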
6.2 Overall Score Bands
- 90–100: Strong Match
- 75–89: Solid Match
- 60–74: Partial Match
- < 60: Weak Match
6.3 Recommendation Logic
- Advance: score ≥ 80, must-haves met, no significant risk flags.
- HM Review: score ~65–79, or mixed signals, or unusual-but-promising paths.
- Do Not Advance: missing non-negotiable + below threshold, or very low score with significant risk.
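A sketch of the disposition logic combining the 6.2 band cut-offs with the rules above; the "must-haves met" input and the exact thresholds are illustrative placeholders:
```python
def overall_band(score: int) -> str:
    """Overall band cut-offs from section 6.2."""
    if score >= 90:
        return "Strong Match"
    if score >= 75:
        return "Solid Match"
    if score >= 60:
        return "Partial Match"
    return "Weak Match"

def recommend(score: int, must_haves_met: bool, significant_risk: bool) -> str:
    """Disposition only; a recruiter still confirms any 'Do Not Advance' (section 2.4)."""
    if score >= 80 and must_haves_met and not significant_risk:
        return "Advance"
    if (not must_haves_met and score < 65) or (score < 60 and significant_risk):
        return "Do Not Advance"
    return "HM Review"

print(overall_band(83), "->", recommend(83, True, False))  # Solid Match -> Advance
print(overall_band(55), "->", recommend(55, False, True))  # Weak Match -> Do Not Advance
```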
7. Ask-Next Prompts & HM Summaries
7.1 Ask-Next for Recruiter Screens
- Experience depth: “Walk me through your work on X; what were you directly responsible for?”
- Tenure: “I noticed a few shorter roles in [years]. Can you share the context?”
- Skills: “Tell me about a recent project where you used [tool/skill] end-to-end.”
- Risk flags: “I see a gap between [year] and [year]. What were you focused on then?”
7.2 HM Preview Summary
- 1–2 sentence overview of level + scope.
- Top 3 strengths relative to the rubric.
- Top 1–2 concerns / open questions.
- Overall band and score.
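To keep previews consistent across candidates, the summary can be emitted as a small structured object rather than free text; one possible shape (field names are illustrative):
```python
from dataclasses import dataclass, field

@dataclass
class HMPreview:
    overview: str                                        # 1-2 sentences on level + scope
    strengths: list[str] = field(default_factory=list)   # top 3 strengths vs the rubric
    concerns: list[str] = field(default_factory=list)    # top 1-2 concerns / open questions
    band: str = "Partial Match"                          # overall band
    score: int = 0                                       # 0-100 hybrid score
```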
8. Calibration & Governance
8.1 Shadow Mode
- Run silently alongside manual decisions.
- Compare agent recommendations vs actual outcomes.
- Collect examples where humans disagree.
8.2 Override Tracking
- Every override requires a short note.
- Track override rate by role, recruiter, and segment.
- Clusters of overrides signal rubric misalignment or training gaps.
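A sketch of how override rates might be computed from screening records, assuming each record carries the recruiter and the override flag:
```python
from collections import defaultdict

def override_rates(records: list[dict]) -> dict[str, float]:
    """Override rate per recruiter; clusters well above the norm suggest rubric misalignment."""
    totals, overridden = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec["recruiter"]] += 1
        overridden[rec["recruiter"]] += int(rec["screening_override_flag"])
    return {name: overridden[name] / totals[name] for name in totals}

records = [
    {"recruiter": "alex", "screening_override_flag": True},
    {"recruiter": "alex", "screening_override_flag": False},
    {"recruiter": "sam",  "screening_override_flag": False},
]
print(override_rates(records))  # {'alex': 0.5, 'sam': 0.0}
```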
8.3 Rubric Reviews
- Quarterly reviews with TA Ops + HMs to tune thresholds and weights.
- Use real candidate samples where the rubric under/over-scored.
- Update documentation and communicate changes.
8.4 Fairness & Bias Monitoring
- Monitor pass-through and override patterns across segments.
- Use findings to refine rubrics and ask-next prompts — not hard-code stereotypes.
9. ATS Integration & Outputs
Screening A[i]gent is ATS-agnostic; the core requirement is a handful of stable fields.
9.1 Example ATS Fields
| Field Name | Type | Description |
|---|---|---|
| screening_score_overall | Number | 0–100 hybrid score. |
| screening_band_overall | Picklist | Strong / Solid / Partial / Weak. |
| screening_band_experience | Picklist | High / Medium / Low. |
| screening_band_tenure | Picklist | High / Medium / Low. |
| screening_band_competitor_fit | Picklist | High / Medium / Low. |
| screening_band_skills | Picklist | High / Medium / Low. |
| screening_risk_flags | Multi-select | Short tenures, gaps, title mismatch, etc. |
| screening_recommendation | Picklist | Advance / HM Review / Do Not Advance. |
| screening_override_flag | Boolean | Yes if recruiter overrode recommendation. |
| screening_decision_at | Datetime | Timestamp of final decision. |
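An illustrative write-back payload using these fields; the transport depends on the ATS API, and the values shown are placeholders:
```python
# Written only after recruiter confirmation (see 9.2), e.g. via ATS custom fields.
screening_payload = {
    "screening_score_overall": 83,
    "screening_band_overall": "Solid",
    "screening_band_experience": "High",
    "screening_band_tenure": "Medium",
    "screening_band_competitor_fit": "High",
    "screening_band_skills": "High",
    "screening_risk_flags": ["repeated_short_tenures"],
    "screening_recommendation": "Advance",
    "screening_override_flag": False,
    "screening_decision_at": "2025-06-02T17:40:00Z",
}
```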
9.2 Implementation Notes
- Start minimal; add fields after adoption.
- Hide internal-only fields from HMs if they add noise.
- Write fields only after recruiter confirmation.
10. Metrics & Metrics A[i]gent Integration
10.1 Key Screening Metrics
- Application → Screen pass-through (by source, role, recruiter).
- Distribution of screening bands.
- Override rate and direction (Advance vs Do Not Advance overrides).
- Time from application to screening decision.
- Downstream performance: do Strong/Solid candidates reach offers?
10.2 Event Mapping
| Screening Event | Metrics Dictionary Field | Used For |
|---|---|---|
| screening_started_at | screening_start_time | Turnaround time. |
| screening_decision_at | screening_decision_time | Lead time to decision. |
| screening_recommendation | screening_recommendation | Quality/fairness analysis. |
| screening_override_flag | screening_override_flag | Override patterns. |
| screening_score_overall | screening_score_overall | Score distribution + correlation. |
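This mapping can be kept as a small lookup used when emitting events to Metrics A[i]gent; a sketch mirroring the table:
```python
# Screening event -> Metrics Dictionary field.
EVENT_TO_METRICS_FIELD = {
    "screening_started_at":     "screening_start_time",
    "screening_decision_at":    "screening_decision_time",
    "screening_recommendation": "screening_recommendation",
    "screening_override_flag":  "screening_override_flag",
    "screening_score_overall":  "screening_score_overall",
}

def to_metrics_event(screening_record: dict) -> dict:
    """Rename screening fields to their Metrics Dictionary names; drop anything unmapped."""
    return {EVENT_TO_METRICS_FIELD[key]: value
            for key, value in screening_record.items() if key in EVENT_TO_METRICS_FIELD}
```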
Appendix A – Bands & Point Ranges
| Band | Description | Typical Points (per category) |
|---|---|---|
| High | Clear, strong alignment with rubric expectations. | ≈ 80–100% of category max. |
| Medium | Good but not perfect; some gaps or trade-offs. | ≈ 50–79% of category max. |
| Low | Limited alignment or missing ingredients. | ≈ 0–49% of category max. |
| No Risk | No meaningful risk flags detected. | 0 penalty. |
| Mild Risk | One or two flags with plausible explanations. | ≈ -5 penalty. |
| Significant Risk | Multiple/severe flags warrant caution. | ≈ -10 to -20 penalty. |
Exact ranges can be tuned per company and role. The key is stable, documented bands and transparent weights.
Appendix B – Example Role Profiles
B.1 AE / Account Executive
- Experience: quota-carrying SaaS sales, similar ACV and cycle.
- Tenure: 2–3 year stints; some startup volatility acceptable.
- Skills: MEDDIC/BANT, CRM expertise, full-funnel sales motion.
B.2 SDR / BDR
- Experience: outbound prospecting, high-volume outreach.
- Signals: activity metrics + conversion to meetings/pipeline.
- Fit: similar ICP and sales environment preferred.
B.3 Recruiter / Talent Partner
- Experience: end-to-end recruiting for similar roles and stakeholders.
- Skills: sourcing, stakeholder management, ATS hygiene, calibration.
B.4 Software Engineer
- Experience: languages/frameworks/systems aligned to role.
- Signals: shipped features, ownership, depth vs breadth.
- Evidence: projects, OSS, measurable impact.
Let's Connect
Open to roles in People Analytics, Talent Intelligence, People Ops, and Recruiting Operations — especially teams building internal AI capabilities.