Pre-Employment Assessment: The Only Guide That Doesn't Suck
Which hiring tests actually predict performance, which are legal nightmares, and which are pure theater — backed by 100 years of research.
Most Pre-Employment Assessment Guides Are Vendor Brochures. This One Isn't.
Search "pre-employment assessments" and you'll find 50 articles that all say the same thing: assessments are great, here are 12 types, now buy our platform. They cite vague statistics. They treat every assessment type as equally valid. They never tell you that most personality tests are about as useful as a horoscope for predicting job performance.
This guide is different. We're going to rank every major assessment type by what the research actually says — specifically the Schmidt & Hunter (1998) meta-analysis of 85 years of selection research, and the Sackett et al. (2022) update that shook up the entire field. We'll tell you which assessments work for which roles, which ones will get you sued, and which ones are a complete waste of everyone's time.
Fair warning: if you sell a pre-employment testing platform built entirely around personality quizzes, you're not going to like this article.
What 100 Years of Research Actually Says About Predicting Job Performance
In 1998, Frank Schmidt and John Hunter published one of the most cited papers in personnel selection. They analyzed 85 years of research across 19 selection methods and ranked them by validity coefficient: the correlation between scores on a method and later on-the-job performance, where 0 means no relationship at all and 1 means perfect prediction.
Their findings upended conventional hiring wisdom. Years of education? Nearly useless (.10). Years of experience? Barely better (.18). Unstructured interviews, the kind where a hiring manager "goes with their gut"? Well behind structured ones (.38 versus .51), and the later re-analysis would cut that figure in half. Graphology, which some employers really did use, came in at .02.
Then in 2022, Sackett, Zhang, Berry, and Lievens dropped a bomb. They revisited Schmidt & Hunter's methodology and found that previous studies had systematically overcorrected for range restriction, inflating some validity estimates. The updated numbers reshuffled the rankings. Cognitive ability tests dropped from their throne. Structured interviews moved to #1.
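To make "overcorrected for range restriction" concrete: validity studies usually have performance data only for people who were hired, so researchers adjust the observed correlation upward to estimate what it would be across all applicants. Below is a minimal sketch of one standard adjustment, Thorndike's Case II formula for direct range restriction. The function name and the input numbers are hypothetical and chosen for illustration; they are not figures from either paper.

```python
import math


def correct_for_range_restriction(r_observed: float, u: float) -> float:
    """Thorndike Case II correction for direct range restriction.

    r_observed: correlation measured in the restricted (hired) sample.
    u: ratio of the predictor's standard deviation among all applicants
       to its standard deviation among those hired (u >= 1).
    """
    if u < 1:
        raise ValueError("u should be >= 1 (applicant SD / hired SD)")
    return (r_observed * u) / math.sqrt(1 + r_observed**2 * (u**2 - 1))


# Hypothetical observed validity; u values are illustrative assumptions.
r_obs = 0.25
for u in (1.0, 1.2, 1.5):
    corrected = correct_for_range_restriction(r_obs, u)
    print(f"assumed u = {u:.1f} -> corrected validity = {corrected:.2f}")
```

The more restriction you assume (a larger `u`), the further the estimate is pushed up. Assume more restriction than actually occurred and the corrected validity comes out inflated, which is the kind of error the 2022 paper argued had crept into earlier estimates.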
Here's the updated hierarchy of what actually works, based on both landmark studies.
| Assessment Method | Validity (Schmidt & Hunter 1998) | Validity (Sackett et al. 2022) | Verdict |
|---|---|---|---|
| Structured Interviews | 0.51 | 0.42 (highest) | Gold standard. Use them. |
| Work Sample Tests | 0.54 | ~0.33 | Excellent when feasible. Hard to scale. |
| Cognitive Ability Tests (GMA) | 0.51 | 0.31 | Strong but adverse impact concerns. |
| Job Knowledge Tests | 0.48 | ~0.40 | Good for technical roles. |
| Integrity Tests | 0.41 | ~0.31 | Surprisingly decent. Low adverse impact. |
| Conscientiousness (Big 5) | 0.31 | ~0.22 | Modest. Better as a supplement. |
| Situational Judgment Tests | Not studied | ~0.26 | Decent mid-tier option. |
| Unstructured Interviews | 0.38 | ~0.19 | Stop doing these. |
| Reference Checks | 0.26 | ~0.18 | Marginally useful at best. |
| Years of Experience | 0.18 | ~0.11 | Almost meaningless after 5 years. |
| Years of Education | 0.10 | ~0.07 | Even more meaningless. |
| Graphology | 0.02 | N/A | Yes, people actually used this. No, really. |
Every Pre-Employment Assessment Type, Ranked and Roasted
Let's break down each assessment type — what it measures, when to use it, and when it's a waste of time. No diplomatic hedging.
Tier 1: Actually Predicts Performance
These are the methods with real research backing them. If you're building a hiring process from scratch, start here.
- Structured Interviews — Standardized questions, standardized scoring rubric, multiple interviewers. Sackett et al. (2022) placed these at the top with validity of .42. The key word is "structured." The moment you let the interviewer freestyle, validity drops by half. Every role, every level, every department should use these.
- Work Sample Tests — Give candidates a task that mirrors actual job work. A marketer writes copy. A salesperson does a mock discovery call. A designer redesigns a screen. Validity was .54 in Schmidt & Hunter. The catch: they're expensive to design and hard to standardize. But for roles where you can pull it off, nothing beats watching someone do the actual work.
- Cognitive Ability Tests — General mental ability (GMA) tests measure reasoning, pattern recognition, and problem-solving speed. Schmidt & Hunter had this at .51, Sackett dropped it to .31 — still solid, but no longer the undisputed champion. The elephant in the room: cognitive ability tests produce larger group differences across racial and ethnic lines than any other assessment type. The EEOC knows this. Your legal team should too.
- Job Knowledge Tests — Domain-specific knowledge assessments. Does the accountant know GAAP? Does the developer understand data structures? Straightforward, job-relevant, legally defensible. Best for roles where specific knowledge is genuinely required on day one.
Tier 2: Useful but Limited
These add value when combined with Tier 1 methods. On their own, they're not enough.
- Integrity Tests — These measure honesty, dependability, and rule-following. Surprisingly valid (.31-.41 depending on the study) and — here's the kicker — they show minimal adverse impact across demographic groups. If you need a pre-screen that won't land you in legal trouble, these are underrated.
- Situational Judgment Tests (SJTs) — Present candidates with realistic work scenarios and ask them to choose or rank responses. Not included in Schmidt & Hunter's original study, but Sackett et al. found validity around .26. Good middle-ground option. Easy to administer at scale, decent signal, low adverse impact.
- Conscientiousness Measures — The only Big Five personality trait that consistently predicts performance across all job types, with validity around .22-.31. It's real, but it's modest. Useful as one data point in a battery, not as your primary filter.
Tier 3: Marginal to Useless
Here's where most companies waste their assessment budget.
- Unstructured Interviews — The most popular hiring method in the world, and one of the worst predictors. Validity of .19 in updated estimates. Essentially, the hiring manager is evaluating how much they personally like the candidate. This is how you build a team of people who went to the same schools and laugh at the same jokes. It's not hiring. It's friend-dating.
- Resume Screening — Education and experience are among the weakest predictors in the entire research literature. A degree from a good school tells you someone got into a good school 4-8 years ago. Ten years of experience tells you someone showed up for 10 years. Neither tells you much about how they'll perform.
- Reference Checks — Validity around .18. Everyone gives the same three references who say the same positive things. The only useful reference check is a backdoor reference — and even those are hit-or-miss.
- Myers-Briggs (MBTI) — No credible evidence that it predicts job performance. Even its publisher says it should not be used for hiring or selection. Yet companies still pay to sort candidates into 16 personality boxes as if that means anything. If you're using MBTI in hiring, please stop.
- DISC, Enneagram, StrengthsFinder — Same problem as MBTI. Fun for team-building workshops. Useless for hiring decisions. These tools were never designed or validated for personnel selection.
Which Assessments Work for Which Roles
One of the biggest mistakes companies make with pre-employment assessments is treating every role the same. A work sample test for an engineer looks nothing like one for a sales rep. Here's what actually works for each function.
| Role | Best Assessment Combo | Skip This |
|---|---|---|
| Software Engineers | Work sample (take-home or live coding) + structured technical interview + system design discussion | Whiteboard algorithm puzzles with no IDE, timed HackerRank with obscure puzzles |
| Sales / BDR | Role-play (mock cold call or discovery) + structured behavioral interview + SJT | Personality tests, cognitive ability tests |
| Marketing | Portfolio review + work sample (write a brief, build a campaign outline) + structured interview | Generic aptitude tests |
| Design | Portfolio deep-dive + design challenge (realistic scope, 2-4 hours) + structured crit session | Cognitive ability tests, personality profiles |
| Operations / Ops | SJT + structured interview + job knowledge test (process, tools, metrics) | Abstract reasoning tests |
| Customer Success / Support | SJT with real ticket scenarios + role-play (angry customer) + structured interview | Personality quizzes |
| Product Management | Case study (real problem, real constraints) + structured interview + work sample (write a PRD) | Coding tests, unless it's a technical PM role |
| Executive / Leadership | Structured interview + multi-rater assessment + work sample (strategy presentation) | Personality profiles, cognitive tests |
| Finance / Accounting | Job knowledge test + work sample (financial model) + structured interview | DISC, MBTI |
The Legal Minefield: What Can Get You Sued
Pre-employment assessments are perfectly legal. Using them badly is where companies get destroyed. The EEOC's Uniform Guidelines on Employee Selection Procedures (UGESP) are clear: you can test candidates, but if your tests disproportionately screen out a protected group, you need to prove the test is job-related and consistent with business necessity.
This is called adverse impact, and the usual first check is the four-fifths rule: if the pass rate for any group is less than 80% of the pass rate for the group with the highest pass rate, you have a potential problem. For example, if 60% of one group passes and 40% of another does, the ratio is 0.67, which falls below the 0.80 line. It doesn't mean you'll automatically lose a lawsuit, but it means you'd better have solid validation data.
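The four-fifths check is simple enough to run yourself on every assessment stage. Here is a minimal sketch, assuming you can count applicants and passers per group; the function names, group labels, and counts are all hypothetical.

```python
def adverse_impact_ratios(passed: dict, applicants: dict) -> dict:
    """Return each group's pass rate divided by the highest group's pass rate."""
    rates = {group: passed[group] / applicants[group] for group in applicants}
    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("No group has any passers; ratios are undefined")
    return {group: rate / top_rate for group, rate in rates.items()}


def flag_four_fifths(ratios: dict, threshold: float = 0.8) -> list:
    """List the groups whose impact ratio falls below the four-fifths line."""
    return [group for group, ratio in ratios.items() if ratio < threshold]


# Hypothetical counts for one assessment stage.
applicants = {"group_a": 200, "group_b": 150}
passed = {"group_a": 120, "group_b": 60}

ratios = adverse_impact_ratios(passed, applicants)
flagged = flag_four_fifths(ratios)

for group, ratio in ratios.items():
    print(f"{group}: impact ratio {ratio:.2f}")
print("Below four-fifths:", flagged)
```

Here group_a passes at 60% and group_b at 40%, so group_b's ratio is 0.67 and it gets flagged. Treat a flag as a prompt to review your validation evidence, not as a legal conclusion. With small samples the ratio is noisy, so the counts matter as much as the percentage.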
In practice, here's what this means for each assessment type:
- Cognitive ability tests — Highest adverse impact risk. Average score gaps between Black and Hispanic candidates and white candidates are among the most robust findings in the assessment literature. If you use GMA tests, you need strong evidence of job-relatedness and should explore whether less impactful alternatives exist. Regulators do act on test-driven adverse impact: Best Buy resolved an EEOC charge in 2018 over pre-employment assessments alleged to screen out applicants by race and national origin.
- Personality tests — Lower adverse impact, but watch out for ADA issues. If your personality test screens for traits that could be proxies for mental health conditions (depression, anxiety, social phobia), you might violate the Americans with Disabilities Act. The classic case is employers using the MMPI, a clinical diagnostic tool, as a pre-employment screen: in Karraker v. Rent-A-Center (2005), the Seventh Circuit held that the MMPI was a medical examination under the ADA.
- Work samples and structured interviews — Generally the safest option. They're face-valid (candidates can see why the test is relevant), job-related by design, and tend to show smaller group differences than cognitive tests. If you want strong predictive power with low legal risk, this is your sweet spot.
- AI-scored assessments — This is the new frontier. The EEOC issued guidance in 2023 specifically about AI-driven hiring tools and adverse impact. If your vendor uses machine learning to score candidates, you're still on the hook for adverse impact — even if you didn't build the algorithm. Illinois, New York City, and Maryland already have specific laws governing AI in hiring. This area is moving fast.
How to Build an Assessment Process That Actually Works
Knowing which assessments are valid is only half the battle. Implementation matters just as much. Here's how to put it together without making the mistakes that plague most companies.
- Start with the job, not the tool — Run a proper job analysis. What does someone in this role actually do day-to-day? What separates your top performers from average ones? Build your assessment around those specific behaviors and skills — not around whatever your vendor happens to sell.
- Use at least two methods — The research is clear: combining two valid assessment methods produces better predictions than any single method alone. Schmidt & Hunter showed that structured interviews + cognitive tests hit composite validity above .60. Even if you skip cognitive tests, pairing structured interviews with work samples or SJTs beats any standalone method.
- Standardize everything — Same questions, same order, same scoring rubric, same evaluators (or at least calibrated evaluators). The moment you let each interviewer wing it, you've traded a structured interview (validity .42) for an unstructured one (validity .19). You just cut your predictive power in half.
- Keep it short and respect candidates' time — Assessment processes longer than 60-90 minutes tend to see heavy drop-off, especially from senior candidates who have other options. The best candidates aren't going to spend 8 hours on your hiring gauntlet. Design something tight: a 30-minute skills test plus a 45-minute structured interview captures most of the signal.
- Measure and iterate — Track which assessment scores actually correlate with on-the-job performance after 6-12 months. Most companies never do this. They build an assessment process once and never validate it. If your interview scores don't correlate with performance reviews, you're scoring the wrong things.
- Document your rationale — If someone challenges your hiring process, you want a paper trail showing: job analysis, why you chose each assessment, validation data, adverse impact analysis. This isn't paranoia — it's what the EEOC will ask for.
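Two of the steps above come down to arithmetic you can check yourself: how much a second method adds, and whether your own scores track later performance. The sketch below uses the standard two-predictor multiple correlation formula and a plain Pearson correlation. The .30 correlation between the two methods is an assumption for illustration, not a figure from the research cited here, and the score lists are made up.

```python
import math


def composite_validity(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of performance with two optimally weighted methods.

    r1, r2: validity of each method on its own.
    r12: correlation between the two methods' scores.
    """
    if abs(r12) >= 1:
        raise ValueError("r12 must be strictly between -1 and 1")
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)


def pearson_r(xs: list, ys: list) -> float:
    """Plain Pearson correlation, used here for local validation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Scores must vary to compute a correlation")
    return cov / math.sqrt(var_x * var_y)


# Two methods at .51 each, ASSUMING their scores correlate at .30.
combined = composite_validity(0.51, 0.51, 0.30)
print(f"Composite validity: {combined:.2f}")

# Hypothetical interview scores and 12-month performance ratings for six hires.
interview_scores = [3.2, 4.1, 2.8, 4.5, 3.9, 3.0]
performance_ratings = [3.0, 4.0, 3.1, 4.4, 3.5, 2.9]
local_validity = pearson_r(interview_scores, performance_ratings)
print(f"Local validity estimate: {local_validity:.2f}")
```

Under that assumed overlap the composite comes out around .63. The more two methods overlap, the less the second one adds, so pair methods that measure different things. A local validity figure based on a handful of hires is only a rough signal; it becomes meaningful once you have dozens of hires per role.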
AI-Powered Assessments: The Good, the Bad, and the Overhyped
Every assessment vendor now claims to use "AI." Most of them are doing something trivially simple — keyword matching on open-ended responses, or running a basic NLP classifier — and calling it artificial intelligence. That said, there are genuine advances worth paying attention to.
The good: AI can evaluate work samples at scale. Instead of having a human grade 500 coding submissions or 200 written responses, well-designed AI scoring can handle volume while maintaining consistency. AI can also reduce some forms of human bias — it won't penalize a candidate for having an accent or being nervous, which human interviewers absolutely do.
The bad: AI can also encode and amplify existing biases. Amazon famously scrapped an AI recruiting tool in 2018 because it systematically downgraded women's resumes — the model had been trained on 10 years of hiring data from a male-dominated company. The tool learned that "women's" was a negative signal. Garbage in, garbage out.
The overhyped: Emotion detection, video analysis of facial expressions, voice tone analysis. These are pure pseudoscience dressed up in machine learning. There is no peer-reviewed evidence that analyzing a candidate's micro-expressions predicts job performance. HireVue quietly dropped facial analysis from their platform in 2021 after years of criticism. If a vendor pitches you emotion AI, run.
7 Pre-Employment Assessment Mistakes That Cost You Good Hires
After looking at thousands of hiring processes, these are the patterns that consistently destroy candidate pipelines.
- Testing for the wrong things — Your assessment measures what's easy to test, not what matters for the role. Testing a product manager on SQL syntax when they'll never write a query. Testing a salesperson on personality type when you should be evaluating their discovery call skills.
- Too many rounds, too much friction — Six interview rounds, a case study, two reference checks, a personality assessment, and a cognitive test. By round four, your best candidates have accepted offers elsewhere. The companies winning the talent war have three-step processes that take under a week.
- Using unstructured interviews and calling them 'culture fit' — 'Culture fit' interviews without standardized criteria are just bias laundering. What they actually measure: does this person remind me of myself? This is how you build homogeneous teams and call it culture.
- Buying a platform without understanding validity — Your vendor says their assessment is 'scientifically validated.' Ask: validated against what criterion? What's the sample size? Was it an independent study or one they funded? What's the adverse impact ratio? If they can't answer these questions clearly, they're selling you a product, not a validated assessment.
- Ignoring the candidate experience — Your assessment is also a sales pitch. Every candidate who completes your hiring process — whether they get hired or not — forms an opinion about your company. Clunky interfaces, irrelevant questions, and no feedback all damage your employer brand.
- Not calibrating evaluators — If five interviewers use five different mental models for what 'good' looks like, your structured interview isn't structured anymore. Calibration sessions — where evaluators score the same candidate responses and align on criteria — are non-negotiable.
- Treating all roles the same — Using the same generic cognitive ability test for every role from intern to VP, from engineer to account executive. Different roles require different competencies. Your assessment process should reflect that.
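For the calibration point above, one simple way to run a session is to have every evaluator score the same recorded answers, then look at where the scores diverge. This is a rough sketch under assumed conventions: a 1-5 rubric, a one-point tolerance, and made-up evaluator names and scores.

```python
from statistics import mean


def calibration_report(scores_by_answer: dict, max_spread: float = 1.0) -> dict:
    """Summarize evaluator agreement for each scored answer.

    scores_by_answer maps an answer id to {evaluator name: rubric score}.
    An answer needs discussion when the gap between its highest and
    lowest score exceeds max_spread.
    """
    report = {}
    for answer_id, scores in scores_by_answer.items():
        values = list(scores.values())
        spread = max(values) - min(values)
        report[answer_id] = {
            "mean": mean(values),
            "spread": spread,
            "needs_discussion": spread > max_spread,
        }
    return report


# Hypothetical scores from three evaluators on two recorded answers.
scores = {
    "answer_1": {"alice": 4, "bob": 4, "chen": 5},
    "answer_2": {"alice": 2, "bob": 5, "chen": 3},
}

report = calibration_report(scores)
for answer_id, summary in report.items():
    status = "discuss" if summary["needs_discussion"] else "aligned"
    print(f"{answer_id}: spread {summary['spread']} -> {status}")
```

In this example the evaluators agree on answer_1 and split widely on answer_2, so answer_2 is the one to talk through until everyone reads the rubric the same way.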
The Bottom Line
Pre-employment assessments work — when you use the right ones. The research spanning a century is remarkably consistent on a few key points:
Structured interviews and work samples are the best tools we have. Cognitive ability tests are strong predictors but carry legal risk. Personality tests (the Big Five, not MBTI) add modest value as a supplement. Unstructured interviews, resume screening, and years of experience are mostly noise.
The companies that get hiring right aren't the ones using the most assessments. They're the ones using the right assessments — matched to the role, standardized in administration, and validated against actual performance data.
Stop wasting time on hiring theater. Test what matters. Measure what predicts. Hire people who can actually do the job.