Data Science: The Complete Interviewer Guide

The purpose of a data science interview is to assess a candidate’s ability to translate a business problem into a mathematical one that data science can solve and to solve it using math, statistics, database, and/or programming skills.


Thus, the ideal candidate has skills in three primary buckets: (1) math and statistics, (2) databases and programming, and (3) business and product. A typical data science hiring process probes a candidate’s abilities in each area and culminates in a data science case study that draws on all three buckets.

Although modeling and programming are major aspects of data science, companies also want data scientists who understand business and product fundamentals, allowing them to create meaningful insights. For that reason, as an interviewer, you’ll want to validate that the candidate has a clear understanding of each major function and the ability to explain data science methods and processes, their tradeoffs, and use cases to a non-technical audience.

In broad strokes, you’ll be testing your peer’s domain knowledge in addition to their personability, communication, and problem-solving skills. Here’s a brief summary of why each skill is important:

Communication Skills

  • As you know, data science is about more than just statistics and coding. Solving business problems requires getting others to buy into what you’re doing to influence their actions. In particular, it requires communicating with both technical and non-technical audiences about complex data science topics.
  • Data scientists also need to be able to communicate effectively what they’ve done. If something goes wrong, such as shipping buggy code, the hiring manager will want to know that the candidate can walk them through their process so that they can diagnose where the problem occurred.

Personability

  • As mentioned in the communication section, data scientists need to establish trust with their teammates so that those teammates act on the insights the data scientist produces by preparing and analyzing the data.
  • Data scientists need to work with and carefully listen to product, engineering, business, and marketing teams to identify the underlying nature of the problem at hand, and to communicate their results to those teams effectively.
  • You’ll also want to look for candidates who are open-minded, enjoy working cross-functionally, and possess user empathy. It’s important that data science candidates are not only technically skilled but also people whom a diverse range of team members will enjoy interacting with.

Problem-Solving Skills

  • There are several types of data and models you could apply, so it’s important that data science candidates understand the tradeoffs of each in order to justify which to use in a given situation.
  • Try to get a sense of what types of problems the candidate has encountered and what they believe their impact was on the business.

With these skills in mind, your primary objective as the interviewer is to assume the role of a hiring manager and simulate a real interview for your peer.

Consider the following key points to allow for a successful interview:

  • Consider the attributes you’re looking for in a potential data science teammate using the examples above.
  • Make sure you have a good enough understanding of the questions you’re about to ask.
  • Encourage your peer to verbally communicate their thought process with a structured approach.
  • Encourage your peer to take a minute to think before answering a challenging question. A candidate who has had time to think through an answer will be able to articulate it better, making it easier for you to understand their thought process.
  • Prompt your peer for details. Feel free to ask follow-up questions at each step to get at their domain expertise.
  • Remember that the worst outcome from an interview is to decide that the person needs an additional interview before a decision can be made.

Interview Structure

Begin by introducing yourself and asking your peer whether they’d like you to briefly explain the interview process to them. You’ll be given content to interview your peer and will spend ~30 minutes interviewing them over video and a collaborative text editor. Emphasize that you’re there to help them practice and that you’ll provide constructive feedback following the interview.

We suggest covering the following to provide a good experience:

Establish Clear Goals

It’s important to determine what qualities you want your new team member to possess ahead of the interview. For example, you want to be aware of the following during the interview:

  • Technical Expertise — Is your peer knowledgeable about a variety of analytical tools?
  • Statistics — Does your peer suggest algorithms/models blindly, or do they understand what they do and their tradeoffs?
  • Business Logic — Does your peer understand the underlying metrics of a business problem?
  • Problem-Solving — Can your peer provide solutions to the business problem at hand?
  • Communication — Can your peer communicate their thoughts clearly and concisely?
  • Personability — Do you believe your peer will be able to influence their teammates?
  • Structured Approach — Was your peer able to take an ambiguous business problem and approach it with a concrete framework to solve the problem?

Your goal is to elicit examples of these skills or competencies by analyzing their thought process as they work through a data science business/product case study.

Be Open-Minded

It’s important to remain open-minded and challenge your personal biases about candidates during any type of interview. Your goal is to build a diverse team with each member bringing a different perspective and skill set to tackle your company’s challenges.

Each candidate will have a different background and unique professional experiences. A candidate who answers a question differently than you expected isn’t necessarily wrong. In fact, there aren’t necessarily right or wrong answers in data science case study interviews.

Set Expectations

Candidates are often nervous in interviews, so it’s important to put them at ease by setting clear expectations for the interview. If a candidate is nervous, or you made them nervous, you may miss your chance to see their true self and end up evaluating them wrongly. Your goal is to create an environment that allows your peer to showcase their best self.

Briefly introduce yourself and allow your peer the chance to do the same. Inevitably, there will be a power dynamic between you and your peer, but starting the interview with a brief back-and-forth conversation will help put your peer at ease.

Define the Goal

After introducing the question (e.g. “How would you improve engagement on Facebook?”) to your peer, give them the opportunity to think and ask questions. They should be expected to understand the problem and lead the conversation, but it’s also important that you two agree on basic assumptions (the company’s overall goal, this particular team’s goal, the major use cases for the product/feature, etc.).

Challenge the candidate to briefly describe their basic assumptions and the basic features of the product/feature they’re tasked with improving. This should be driven by the candidate.

As the interviewer, it’s important to go into the session with a strong sense about the company you’re asking the candidate about, so do a bit of online research before the interview. A good way to do this is to experiment with the product as a user and read about the product/features on the company’s blog. You should expect your peer to ask you some questions about the product/features you’re asking about, so it’s important to be prepared.

In case study interviews, it’s particularly important to know when to keep quiet and when to chime in. You want to allow the candidate plenty of time to think, but you also need to be aware of whether the candidate is quiet because they are mentally spinning in circles. If in doubt, opt for keeping quiet a bit longer. We’re used to helping each other in real life, but the goal here is assessment, not a successful outcome to the interview.

If something important wasn’t covered, consider it for the feedback.

Choose the Metric

In case study interviews, you’ll tell candidates the business goal to improve, such as retention, growth, or engagement, but not the metric they need to optimize for. Being able to understand a business goal and identify the appropriate metric(s) is a central responsibility of a data scientist.

When evaluating whether they’ve identified the metric that makes sense, you should make sure that the metric:

  • Highlights an aspect of the product. It’s important to remember that while many metrics exist, no single metric can represent a complete picture of your whole product. Therefore, metrics need to drill down into a particular part of the product.
  • Is a value we can collect. We’re not able to improve something that we haven’t collected data on through the product.
  • Tracks performance. Metrics must measure parts of the product that are relevant to its success.
  • Captures change over time. The goal of tracking metrics is to see trends and create projections. To do so, metrics need to capture a change in the product over a period of time.
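
As a rough illustration of the criteria above, the sketch below computes weekly active users from an event log and the week-over-week change. The event log, the user IDs, and the choice of weekly active users as the metric are all assumptions made for this example; a candidate may reasonably pick a different metric.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, day on which the user was active).
events = [
    ("u1", date(2024, 1, 1)),
    ("u2", date(2024, 1, 2)),
    ("u1", date(2024, 1, 3)),
    ("u1", date(2024, 1, 8)),
    ("u3", date(2024, 1, 9)),
    ("u2", date(2024, 1, 10)),
    ("u3", date(2024, 1, 16)),
]


def weekly_active_users(events):
    """Count distinct active users per (ISO year, ISO week), in time order."""
    users_by_week = defaultdict(set)
    for user_id, day in events:
        iso = day.isocalendar()
        users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


def week_over_week_change(series):
    """Relative change between each pair of consecutive weeks."""
    values = list(series.values())
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


wau = weekly_active_users(events)     # value we can collect from the product
changes = week_over_week_change(wau)  # captures change over time
print(wau)
print(changes)
```

The metric here looks at one aspect of the product (activity), is built only from data the product already logs, and yields a series that can be tracked from week to week.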

When evaluating your peer’s metric, make sure that it aligns with the underlying business objective, which in most cases will be related to growth. If you have trouble seeing how your peer’s metric links back to the company goal they identified, that’s a sign that their metric will not be useful in helping to achieve the business goal.

If that’s the case, push your peer for additional details about why they’ve chosen that metric; and if you’re still not convinced, suggest that they identify a second metric as a backup.

Select Variables

After you and your peer have settled on a concrete metric, ask them to consider which variables they think matter most to that metric. Typically, this will be a combination of user characteristics (age, gender, location, etc.) and behavioral ones (online behavior, device used, session time, etc.).

This is your chance to see how comfortable the candidate feels about finding the data they need to solve the question at hand. Data is often missing, so use your pre-interview research to form a reasonable guess about what data the company collects during the sign-up process and through use of the product.

Encourage your peer to briefly walk you through their data science workflow to identify appropriate variables and ask them if they have any questions about the collection of data.
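
One way a candidate might reason about variables and missing data is sketched below. The records, the variable names, and the 25% missing-data cutoff are all hypothetical and chosen only for illustration.

```python
# Hypothetical user records; None marks a value the product never collected.
users = [
    {"user_id": "u1", "age": 24, "country": "JP", "device": "ios", "session_minutes": 12.5},
    {"user_id": "u2", "age": None, "country": "RU", "device": "android", "session_minutes": 40.0},
    {"user_id": "u3", "age": 31, "country": None, "device": "web", "session_minutes": None},
    {"user_id": "u4", "age": None, "country": "RU", "device": None, "session_minutes": 33.0},
]

USER_VARIABLES = ["age", "country"]                   # who the user is
BEHAVIORAL_VARIABLES = ["device", "session_minutes"]  # what the user does

MAX_MISSING_SHARE = 0.25  # arbitrary cutoff, assumed for this example


def missing_share(records, variables):
    """Fraction of records that have no value for each variable."""
    total = len(records)
    return {
        name: sum(1 for r in records if r.get(name) is None) / total
        for name in variables
    }


missing = missing_share(users, USER_VARIABLES + BEHAVIORAL_VARIABLES)
usable = [name for name, share in missing.items() if share <= MAX_MISSING_SHARE]
print(missing)
print(usable)
```

A strong candidate would also say what they would do with a variable that fails the cutoff, for example dropping it, imputing it, or asking whether it can be collected going forward.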

Pick a Model

Equipped with the understanding of what data exists, ask your peer to pick what model/algorithm they’d use to solve the business problem. From their answer, you should be able to understand their logic and justification for choosing a particular model.

As a follow-up, you could ask them about the pros and cons of the model they chose, or ask them to justify why they’d use that model over another model you may have in mind.
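
To make the "why this model over another" follow-up concrete, here is a minimal sketch that compares two very simple candidates on held-out data: a majority-class baseline and a one-split decision stump. The data, the feature (session minutes), and the label (whether the user returned the next week) are invented for this example, and a real candidate would likely propose richer models.

```python
# Hypothetical labeled data: (session_minutes, returned_next_week).
train = [(2, 0), (5, 0), (8, 0), (12, 1), (15, 0), (20, 1), (30, 1), (45, 1)]
holdout = [(3, 0), (10, 0), (18, 1), (25, 1), (40, 1), (6, 1)]


def accuracy(predict, rows):
    """Share of rows whose label the predictor gets right."""
    return sum(1 for x, y in rows if predict(x) == y) / len(rows)


def fit_majority(rows):
    """Baseline: always predict the most common training label (ties go to 1)."""
    positives = sum(y for _, y in rows)
    label = 1 if positives * 2 >= len(rows) else 0
    return lambda x: label


def fit_stump(rows):
    """Decision stump: predict 1 when x >= threshold.

    The threshold is the observed value with the best training accuracy;
    ties go to the smallest such value.
    """
    candidates = sorted({x for x, _ in rows})
    best = max(candidates, key=lambda t: accuracy(lambda x: int(x >= t), rows))
    return best, (lambda x: int(x >= best))


threshold, stump = fit_stump(train)
baseline = fit_majority(train)
stump_acc = accuracy(stump, holdout)
baseline_acc = accuracy(baseline, holdout)
print(threshold, stump_acc, baseline_acc)
```

The tradeoff to listen for is the same one the sketch shows in miniature: the stump is easy to explain to a non-technical audience, and it only earns its place if it beats the baseline on data it has not seen.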

Consider Possible Outputs

Once you’re confident with your peer’s choice of model, they should move on to considering possible output scenarios.

In other words, ask them to give an example of what information said model might tell them if they actually implemented it. For example:

Candidate: “After running my model, I might find that users from Japan aren’t very engaged on Facebook. On the other hand, I might find that Russians aged between 20 and 30 are very engaged, but proportionally, we don’t have many of those users.”

You should be seeking to evaluate whether the scenario is reasonable, given what you know about the company.
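
The kind of output the candidate describes can be sketched as a per-segment summary of engagement rate and share of the user base. The users, the "engaged" flag, and the segment definitions below are made up for illustration and do not reflect real data about any company.

```python
from collections import defaultdict

# Hypothetical users with an "engaged" flag produced by some model or rule.
users = [
    {"country": "JP", "age": 25, "engaged": False},
    {"country": "JP", "age": 34, "engaged": False},
    {"country": "JP", "age": 41, "engaged": True},
    {"country": "JP", "age": 29, "engaged": False},
    {"country": "US", "age": 22, "engaged": True},
    {"country": "US", "age": 37, "engaged": False},
    {"country": "US", "age": 45, "engaged": True},
    {"country": "US", "age": 52, "engaged": False},
    {"country": "RU", "age": 24, "engaged": True},
    {"country": "RU", "age": 28, "engaged": True},
]


def summarize(users, key):
    """Per segment: share of all users and share of the segment that is engaged."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [users, engaged users]
    for user in users:
        segment = key(user)
        counts[segment][0] += 1
        counts[segment][1] += int(user["engaged"])
    total = len(users)
    return {
        segment: {"share": n / total, "engagement_rate": engaged / n}
        for segment, (n, engaged) in counts.items()
    }


def age_band(user):
    return "20-30" if 20 <= user["age"] <= 30 else "other"


by_country = summarize(users, key=lambda u: u["country"])
by_segment = summarize(users, key=lambda u: (u["country"], age_band(u)))
print(by_country)
print(by_segment[("RU", "20-30")])
```

Reporting share alongside engagement rate is what lets the candidate say both "this segment is highly engaged" and "but it is a small part of our user base" in the same breath.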

Determine Next Steps

The final step of the interview is for the candidate to consider what actions the team should take based on the possible output scenarios they suggested may result from their model. For example:

Candidate: “Because my model showed that users from Japan aren’t very engaged on Facebook, I would have an area expert check the Japanese translation that locals see. Perhaps, it’s inaccurate or disregards Japanese cultural expectations of social networks. On the other hand, given that Facebook is performing particularly well with Russian millennials, I would suggest that the marketing team work on targeting more Russian millennials via advertisements or specific marketing campaigns.”

This is when the candidate actually answers the original question. In a real-life scenario, the candidate would likely be conveying their results and next steps to a non-technical team member, or at least to a non-data scientist. For that reason, it’s important that the candidate is able to communicate this information without using any overly complex data science terms or concepts.

Your evaluation of their answer in this section should take into account (1) whether they’re able to present their findings clearly and concisely, (2) whether their insights and next steps make sense within the context of the problem and the company’s goals, and (3) whether you feel that their logic and reasoning would be compelling enough to be acted upon by a fellow team member.

Interviewer Self-Checklist

  • Were you aware of any biases you had that might impact your assessment?
  • Were there any behaviors from your peer that made you uncomfortable?
  • Was the candidate overly nervous despite your best efforts to keep them calm? What parts of your assessment were affected most by the candidate’s nervousness?
  • Did you do anything that you feel may have adversely affected the dynamic of the interview or the candidate’s performance?
  • Do you have a good sense of the candidate’s communication and problem-solving skills based on their answers to your questions during the interview?
  • Did they provide various solutions to the subproblems encountered throughout the problem? Did they need a lot of help or hand-holding during the interview?
  • Were they able to justify their decisions, or did you feel that they were just guessing the whole time?
  • What would you have wanted to dig deeper into if you had more time with the candidate?
