How to hire the right way: An engineer’s perspective on tech recruiting

I’m an engineer, but I also have an MBA and for this post I’ll be wearing that hat. I was seriously considering buying an actual hat with “MBA” printed on it, but some good people talked me out of it, so unfortunately it’s a metaphorical hat instead of a real one.

I often see engineers complain about recruitment processes: They are long, they seem totally irrelevant to the job, everything is stupid. We just don’t get it. Unfortunately, we’re probably right — some employers don’t have any idea what they’re doing, but does it really have to be that way?

One of the very best courses I took during my MBA was a course about recruitment processes and the research behind them. It left a lasting impression on me and affected my perspective on recruitment processes, both when I was hiring and as a candidate. When I hear these rants, I often try to explain the rationale behind a good hiring process, and I think this insight can be useful for engineers and recruiters, which is why I’m writing this post.

Hiring is hard. Mistakes are expensive. We have no idea what we’re doing.

Of course we are excellent at it. It’s always someone else who has no idea what they’re doing.

So, What Can We Do?

First, let’s look at some common tools you have at your disposal, describe them and check how good they actually are, based on the latest academic research in the field:

The validity numbers given here are “predictive validity”: the correlation between a candidate’s result on a specific test and their later success on the job. A validity of 0 means there is no correlation and the test is worthless (you might as well toss a coin); 1 would mean the test perfectly predicts success on the job.
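To make “correlation with success on the job” a bit more concrete, here is a minimal sketch that computes a Pearson correlation between test scores and later performance ratings. The function name and the numbers are made up for illustration; the studies behind the figures in this post use far larger samples and apply statistical corrections that this sketch ignores.

```python
from math import sqrt


def predictive_validity(test_scores, performance):
    """Pearson correlation between test scores and later job performance."""
    if len(test_scores) != len(performance) or len(test_scores) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(performance) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, performance))
    var_x = sum((x - mean_x) ** 2 for x in test_scores)
    var_y = sum((y - mean_y) ** 2 for y in performance)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per hire (not taken from any real study).
scores = [55, 62, 70, 71, 80, 85, 90]          # result on the selection test
ratings = [3.1, 2.8, 3.5, 3.9, 3.6, 4.4, 4.2]  # performance review a year later

print(f"predictive validity: {predictive_validity(scores, ratings):.2f}")
```

A value near 0 would mean the test tells you nothing about who will do well; the closer it gets to 1, the more the test score and the later performance move together.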

Intelligence Tests: IQ tests, SATs etc.
Validity: 0.51
Even if you don’t ask for the scores on these tests directly, you usually try to get them by proxy by checking which school the candidate went to and their GPA. I know this seems unfair to many candidates who are excellent engineers even though they didn’t do very well in higher education (or didn’t go at all, e.g. bootcamp graduates), but it is a quick and cheap way of getting through a lot of CVs.

Work Sample Tests: The candidate completes a task directly related to the job.
Validity: 0.54
When done well and at home, this saves time for everyone. It also allows the candidate to display their best work with a minimal amount of pressure. However, the time it takes is often underestimated, and if it’s done at home candidates may “miss” something and get stuck without answers, while potential employers miss valuable signals.

I’ve seen many engineers who seem to think that work samples are THE BEST AND ONLY way to really show their worth. For the life of me, I don’t get it. Many work sample tests are just a way for the employer to get you to put in time with close to no cost for them. As a candidate I really don’t see how they’re better than a coding interview — the task is either just as synthetic and unrepresentative of the actual work, or too complicated and time consuming.

Employment Interview: I’m sure I don’t have to explain what an interview is.
Validity (Structured): 0.51
Validity (Unstructured): 0.36
Ah, we’re finally getting to something here. What’s the difference between a “structured” interview and an “unstructured” interview?
Well, a structured interview has repeatable questions and clear and objective criteria for answers. An unstructured interview, ummm, doesn’t. Ultimately, an unstructured interview comes down to a “gut feeling” about the candidate. I’ve often felt thankful for having clear and objective criteria specified in advance, because it’s so easy to skew the interview result according to your first (entirely biased) impression of someone.

Job Knowledge Tests: Direct questions on the subject matter.
Validity: 0.48
Usually this is carried out by giving the candidate a questionnaire with domain-specific knowledge questions. Things like “How does web routing work in technology stack A”, “What would happen if you tried to assign this value to a string in technology stack B” etc.

Not bad so far. Now I’m getting to the “good stuff”:

Assessment Center: When you pay a whole lot of money to send candidates to spend their day doing group dynamics with a bunch of strangers.
Validity: 0.37
Can you tell I think these are stupid? It’s not only me: their validity is relatively low and their repeatability (i.e. getting the same score on repeat tests) is also extremely low. If you’re recruiting, please don’t do this. If you’re a candidate and you have the privilege of refusing, just say no.

Reference Checks: Asking past employers about the candidate.
Validity: 0.26
I know many employers insist on reference checks. But their validity is not so great and they should be used with caution.

Graphology: Sending a writing sample to a handwriting “specialist” who can tell if you’re a good match for the job. Usually used for integrity testing.
Validity: 0.02
Graphology is NOT a good recruitment tool. DO NOT use it. Have I stressed that enough?

The coding interview

Even though I already knew these statistics, it still hit me hard while writing this post: even the best tests have only 0.51–0.54 validity, and that is far from amazing. Is this the best we can do? What can we do with the tools we have to get better results?

Enter the dreaded “coding interview”. Whether done on a whiteboard or an online editor, it allows employers to pack a work sample, structured interview, and job knowledge test into a single session. By giving many such coding interviews, conducted by several different people, you can achieve better results, improve reliability and reduce bias (“gut feeling”). That’s why recruitment processes are so long and exhausting (for everyone involved).
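Why would several interviews be more reliable than one? One common way to reason about it (my addition, not something taken from the sources for this post) is the Spearman-Brown formula from psychometrics, which predicts the reliability of an averaged score when you combine k comparable measurements. The sketch below assumes every interview is an equally good, independent measurement of the same thing, which real interviews only approximate, and the 0.5 starting value is purely hypothetical.

```python
def spearman_brown(single_reliability: float, k: int) -> float:
    """Predicted reliability of the average of k comparable measurements.

    Assumes the k interviews are "parallel": same quality, independent errors.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * single_reliability / (1 + (k - 1) * single_reliability)


# Hypothetical starting point: one interview on its own has reliability 0.5.
for interviews in (1, 2, 4):
    print(interviews, f"{spearman_brown(0.5, interviews):.2f}")
# prints:
# 1 0.50
# 2 0.67
# 4 0.80
```

Note the diminishing returns: going from one interview to two helps a lot more than going from four to five, which is part of the trade-off between a thorough process and an exhausting one.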

I know I don’t have you convinced yet, but I’ll talk a bit more about how to assess the quality of this process later on.

Ok then, let’s see how it’s done.

Is this a good question?

You are given three boxes.

One box contains all white balls, one all black balls, and one a mix of black and white balls.
Each box is labeled, but all the labels are wrong.

How many balls would you need to pull out to determine which box is which?

Don’t skip ahead, answer: Is this a good question?

What about this one?

The count-and-say sequence is the sequence of integers beginning as follows:

1, 11, 21, 1211, 111221, ...

1 is read off as one 1 or 11.

11 is read off as two 1s or 21.

21 is read off as one 2, then one 1 or 1211.

Given an integer n, generate the nth sequence.

Think about it, is this a good question?
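If you haven’t seen the count-and-say puzzle before, here is one possible solution, just so you can judge what the question actually tests. This is a sketch of my own, not an “official” answer; the function name is arbitrary and n is assumed to be 1-based.

```python
from itertools import groupby


def count_and_say(n: int) -> str:
    """Return the nth term (1-based) of the count-and-say sequence."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    term = "1"
    for _ in range(n - 1):
        # Read the current term aloud: for each run of equal digits,
        # write the run length followed by the digit itself.
        term = "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))
    return term


print([count_and_say(i) for i in range(1, 6)])
# prints ['1', '11', '21', '1211', '111221']
```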

Or this question?

Are you pregnant or are you planning to become pregnant in the next year?

I can’t recommend asking this one, and it might even be illegal, but think about it anyway.

You’re wrong.

It doesn’t matter if you answered “yes” or “no”.

Questions are neither good nor bad.

Now, I’m not recommending asking someone if she is pregnant, because that’s illegal and usually quite irrelevant. But if the job is in a factory with dangerous chemicals and hard physical labor, that question might actually be an important question to ask!

Questions should be the last thing you do when you build your process.

First, ask the right questions

What is the actual job definition?
Which technical skills does one need in order to perform the job well?
Which soft skills and personality traits do you value as an organization?
How much time, effort and money are you willing to spend in order to find the right person?
How much time and effort will the candidate be willing to spend in order to pass your process?

Only after you have the answers to these questions can you begin to create a concrete process. You can’t copy the answers to these questions from another organization. You have to figure them out for yourselves.

Create the right process

Decide what your screening parameters are and screen ruthlessly. Do not waste your or your candidates’ time.
Use structured interviews for technical and for soft skills interviews.
If you feel the candidate will be willing and it saves you a lot of time and effort, you can also use a work sample.
Call references and perform background checks only as needed.

Measure, refine, repeat

Every time I mention the validity of some process or other I’m asked how the validity was measured. I’ve actually read the articles and I can tell you the methods they used, but to be honest — it doesn’t really matter. What matters is what works for your organization. You may find that the “coding interview” doesn’t match your values or doesn’t give you the signals you need to decide who to hire. You may find your candidates love doing long work samples in the office and it allows you to get to know them better.

The point is that once you have your process in place, you must make sure it is working!

How many candidates who passed your initial screening made it through the interview process?
How many candidates who passed the entire process accepted the offer?
How long do employees stay with you?
How are their performance reviews?

Gather your data and refine your process accordingly. This is a continuous challenge!
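The first two questions above boil down to simple funnel ratios. Here is a minimal sketch, assuming you can count candidates at three points in your process; the class, the field names and the numbers are hypothetical, and your own pipeline will probably have more stages.

```python
from dataclasses import dataclass


@dataclass
class FunnelCounts:
    passed_screening: int   # candidates who passed the initial screening
    passed_interviews: int  # of those, how many passed the whole process
    accepted_offer: int     # of those, how many accepted the offer


def rate(part: int, whole: int) -> float:
    """Share of `whole` that made it into `part`; 0.0 if nobody got that far."""
    return part / whole if whole else 0.0


def funnel_rates(counts: FunnelCounts) -> dict[str, float]:
    return {
        "interview_pass_rate": rate(counts.passed_interviews, counts.passed_screening),
        "offer_acceptance_rate": rate(counts.accepted_offer, counts.passed_interviews),
    }


# Hypothetical numbers for one year of hiring.
print(funnel_rates(FunnelCounts(passed_screening=200, passed_interviews=20, accepted_offer=12)))
```

Tracking these per quarter (together with retention and performance reviews) is what lets you see whether a change to the process actually made anything better.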

Avoiding bias

Now you’ve built a process and have statistics on the candidates hired. How can you make sure you’re not testing for “rich white male” instead of “good programmer”? How do you verify you’re not hiring “people like us” instead of checking actual “culture fit”?

My personal pet peeve is open source contribution or other “after hours” projects as a requirement. If you’re an excellent programmer but have other hobbies, or if you, god forbid, have a family and actually want to spend time with them, that’s it, you’re out. That, to me, is an example of screening on irrelevant traits.

Unfortunately, some common quick fixes don’t seem to work. Implicit bias training doesn’t (usually) improve outcomes, and it often makes things worse. Diverse interviewers do not necessarily make less biased decisions either, as they often show the same biases as anyone else.

You should remember that this is an entirely subconscious effect — you don’t have to be overtly racist or a bad person to be biased. It’s a natural part of how our brain works, so it’s up to the process to help us fight our biases actively.

The best way to tackle this issue is to add diversity measures and use the “measure, refine, repeat” cycle to check how you’re doing.

If you measure diversity and refine your process accordingly — you’re going to find where the problems are and take appropriate actions. If it’s a real pipeline issue — read Moran Weber’s excellent article. If it’s not really a pipeline issue, look at how you treat diverse CVs, how well they pass your process, how often they accept your offers, etc.
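One simple way to look at “how well they pass your process” is to compute the pass rate of a given stage separately for each group you track. The sketch below assumes you have, for each candidate who entered a stage, a group label and whether they passed; the labels “A” and “B” and the data are placeholders, not real figures.

```python
from collections import Counter


def pass_rates(records):
    """records: iterable of (group, passed) pairs for one stage of the process."""
    entered = Counter()
    passed = Counter()
    for group, did_pass in records:
        entered[group] += 1
        if did_pass:
            passed[group] += 1
    return {group: passed[group] / entered[group] for group in entered}


# Hypothetical records for a single interview stage.
stage_records = [("A", True), ("A", False), ("A", True), ("A", True),
                 ("B", True), ("B", False), ("B", False), ("B", False)]
print(pass_rates(stage_records))
# prints {'A': 0.75, 'B': 0.25}
```

A large gap at one specific stage doesn’t prove bias on its own, especially with small numbers, but it tells you where to look first.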

Aside: Integrity

This is a bit off topic, but it’s just too good to skip. You could skip it anyway, I won’t hold it against you.

I mentioned graphology as a bit of a joke: its validity is so low that it’s an absolute waste of money. So why do some employers still use it? You’d be surprised: to some of them it’s a way for someone else to decide what they feel they can’t! But some of them actually believe it’s accurate. How can that be?

Let me tell you a story: When I was taking this course, during one of the classes we were asked to give a writing sample. When the next class started, we each got a graphological assessment of our personality and were asked to rate how accurate it was. Most of the class graded it 4 or 5 for accuracy. Turns out we all got the same analysis… How could we be fooled like that?

This is called the Barnum effect, and basically what it means is that when we read a generic text like astrology or graphology, we will believe the parts that are true and dismiss/forget the parts that are wrong! The final impression we’re left with is that the text was very accurate.

Turns out the best way to measure integrity is by asking directly! There are pen & paper tests with a series of questions about integrity. The irony is that dishonest people believe they are normal people in a dishonest world, so they have no problem telling the truth! They will say “sure, I take boxes of pens home with me” or “why shouldn’t I lie about being sick?”, and expose themselves as less than worthy of trust.

If you are recruiting, I hope this gave you some insight into how to build a good process. If you’re a candidate, I hope this helped you understand what’s broken in recruiting and what actually works. For me — I’ll finally have a ready made answer for all the frustrated candidates out there. Good luck!

Thank you Yael Brender-Ilan for providing sources for this post and for teaching Personnel Selection at Hebrew University which left such a lasting impression in the first place.

Rina Artstain
