Data Science Interview Q&A

Explain Logistic Regression.

Explain Linear Regression.

How do you split your data between training and validation?

Describe Binary Classification.

Describe decision trees.

What are some classification metrics?

What is a cost function?

What’s the difference between convex and non-convex cost functions?

Why is it important to understand the bias-variance trade-off?

What is regularization? What are the differences between L1 and L2 regularization?

What are exploding gradients?

Is it necessary to use activation functions?

How is a box plot different from a histogram?

What is cross validation?

What are false positives and false negatives?

Explain how SVM works.

What techniques can be used to evaluate a Machine Learning model?

Why is overfitting a problem? What steps can you take to avoid it?

Describe how to detect anomalies.

What are the Naive Bayes fundamentals?

What is a ROC curve, sensitivity, specificity, and confusion matrix?

What is an AUC-ROC Curve?

What is K-means?

How does Gradient Boosting work?

What is the difference between bagging and boosting?

Why do we need feature selection?

What is unbalanced binary classification?

Why is dimensionality reduction important?

Describe hyperparameters.

How will you decide if a customer will buy a product today given: income, location, profession, and gender?

Design a heatmap for Uber drivers to recommend where to wait for passengers.

What are time series forecasting techniques?

How does a logistic regression model know what the coefficients are?

Explain Principal Component Analysis (PCA).

Explain Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA).

Why is gradient checking important?

Is random weight assignment better than assigning the same weights to the units in a hidden layer?

Describe an F1 score.

Describe some common topic modeling techniques.

How does a neural network with a single layer (input and output only) compare to logistic regression?

Why is the Rectified Linear Unit (ReLU) a good activation function?

How do you use Gaussian mixture models (GMMs)?

How would you decide whether to double the number of ads in Facebook’s Newsfeed?

What is Long short-term memory (LSTM)?

Explain the difference between generative and discriminative algorithms.

What is MapReduce?

If the model isn’t perfect, how do you select the threshold for a binary model?

Are boosting algorithms better than decision trees? If yes, why?

How does speech synthesis work?

How do you detect if a new observation is an outlier?

What are anomaly detection methods?

How do you address multicollinearity?

How does caching work?

How would you define a representative sample of search queries from 5 million queries?

Discuss how to randomly select a sample from a product user population.

What is the importance of Markov Chains (MCs)?

What is the difference between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)?

What does P-Value mean?

Define the Central Limit Theorem (CLT).

There are 6 marbles in a bag, 1 of which is white. You reach into the bag 100 times, and after each draw the marble is placed back in the bag. What is the probability of drawing the white marble at least once?

The probability of drawing the white marble at least once is the complement of the probability of never drawing it. Because the marble is replaced each time, the draws are independent, so we calculate the probability of drawing a non-white marble on all 100 draws and subtract it from 1:

P(White at least once) = 1 - [P(Non-white) ^ 100] = 1 - [(5/6) ^ 100]
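The complement calculation above can be checked with a short Python sketch. The function name `p_at_least_one_white` and its parameters are illustrative assumptions, not part of the original question; the sketch assumes independent draws with replacement.

```python
from fractions import Fraction


def p_at_least_one_white(n_draws: int, n_marbles: int = 6, n_white: int = 1) -> Fraction:
    """Exact probability of seeing a white marble at least once.

    Assumes independent draws with replacement (hypothetical helper).
    """
    # Probability that a single draw is NOT white, e.g. 5/6.
    p_non_white = Fraction(n_marbles - n_white, n_marbles)
    # Complement of "every one of the n_draws draws is non-white".
    return 1 - p_non_white ** n_draws


exact = p_at_least_one_white(100)
print(float(exact))      # very close to 1
print(float(1 - exact))  # chance of never drawing white, on the order of 1e-8
```

Using `Fraction` keeps the result exact until the final conversion to `float`, which avoids rounding the tiny complement away before it is inspected.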

Explain Euclidean distance.

Define variance.

What is the law of large numbers?

How do you weigh 9 marbles three times on a balance scale to select the heaviest one?

You call 3 random friends and ask each if it’s raining. Each friend has a 2/3 chance of telling you the truth and a 1/3 chance of lying. All three say “yes”. What’s the probability it’s actually raining?

What is a Poisson distribution?

What is the difference between a stack and a queue?

What is the difference between linked lists and arrays?

How should you handle NULLs when querying a data set?

What is the JOIN function in SQL?

Select all customers who purchased at least two items on two separate days from Amazon.

SELECT Customer_ID, COUNT(DISTINCT Item_ID) AS item_count, COUNT(DISTINCT Purchase_Date) AS date_count
FROM Purchase_List
GROUP BY Customer_ID
HAVING COUNT(DISTINCT Purchase_Date) >= 2 AND COUNT(DISTINCT Item_ID) >= 2;
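A minimal sketch for checking this query's logic against sample data, using Python's built-in `sqlite3` module. The column types, the sample rows, and the helper `qualifying_customers` are assumptions for illustration. The `HAVING` clause repeats the aggregates rather than referencing quoted aliases, because a quoted alias can be read as a string literal in some SQL dialects. The sketch follows the query's reading of the question: at least two distinct items and at least two distinct purchase dates per customer.

```python
import sqlite3

QUERY = """
SELECT Customer_ID,
       COUNT(DISTINCT Item_ID)       AS item_count,
       COUNT(DISTINCT Purchase_Date) AS date_count
FROM Purchase_List
GROUP BY Customer_ID
HAVING COUNT(DISTINCT Item_ID) >= 2
   AND COUNT(DISTINCT Purchase_Date) >= 2
ORDER BY Customer_ID
"""


def qualifying_customers(rows):
    """Load rows into an in-memory table and return matching customer IDs."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: the original only names the table and three columns.
        conn.execute(
            "CREATE TABLE Purchase_List "
            "(Customer_ID INTEGER, Item_ID TEXT, Purchase_Date TEXT)"
        )
        conn.executemany("INSERT INTO Purchase_List VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


# Made-up sample data covering the three interesting cases.
sample_rows = [
    (1, "book", "2024-01-01"),
    (1, "lamp", "2024-01-02"),  # two items on two days: qualifies
    (2, "book", "2024-01-01"),
    (2, "lamp", "2024-01-01"),  # two items on the same day: excluded
    (3, "book", "2024-01-01"),
    (3, "book", "2024-01-03"),  # same item on two days: excluded
]

result = qualifying_customers(sample_rows)
print(result)  # [1]
```

In an interview it is worth stating which reading of "at least two items on two separate days" the query implements, since a stricter reading (two or more items on each of two days) would need a per-day subquery.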

What is the difference between DDL, DML, and DCL?

Why is database normalization important?

What is the difference between a clustered and non-clustered index?

How do you avoid selection bias?