Dunnhumby interview questions for a full-time role, shared by Shubham Bhatt (Master’s Degree in Data Science and Business Analytics, Narsee Monjee Institute of Management Studies (NMIMS), Mumbai)
Profile – Senior Applied Data Scientist
Process – Round 1 – HackerEarth Technical Test, Round 2 – Technical, Round 3 – Case Studies and Technical
The test had 30 MCQs (Python and data science) and 2 programming questions (language used – Python), with a duration of 40 minutes. Questions covered syntax, loops and statements, classes and objects, predicting output, and data manipulation.
- K-means clustering. How do you decide K? What is on each axis in the elbow method?
- Outlier detection methods: DBSCAN, Z-score.
- Which is more important: K-means or DBSCAN? (It depends on context. Both are unsupervised clustering methods: DBSCAN is density-based and labels points in sparse regions as noise, which makes it useful for outlier detection, whereas K-means partitions the data into a preset number of clusters.)
- RFM analysis. Which of the metrics (recency, frequency, or monetary value) is the most important, and why? How do you cluster people on the basis of RFM score?
- Survival analysis. Hazard score, Kaplan-Meier.
- Which model did you use for classification? (I answered Random Forest.) What is a random forest? How does it work? Hyperparameter tuning.
- What is CLV? How do you find it? Is it possible to cluster customers on the basis of CLV scores? If yes, how? If not, what other details do you need along with the CLV score to cluster people?
- How would you introduce a product at a higher price? Would customers still prefer to buy it or not?
- What is association rule mining? Terminology related to it, and how it is used in real-world case scenarios.
- Statistical models – Beta-Geometric (BetaGeoFitter) and Gamma-Gamma models and their assumptions. Which distribution is used in which model?
- How would you label a customer as churned or not churned in the training data for a grocery store? Which visualization would help you with this, and how?
- Which variables would you use when building a customer churn model for the grocery store? Step-by-step process – how would you implement it (EDA, outlier treatment, feature selection, feature engineering, model building)?
- Disadvantages of logistic regression. How can you overcome overfitting in logistic regression? (Ridge or lasso regularization.)
- Metrics used to evaluate a classification model.
- Types of word embeddings in NLP – count vectors, TF-IDF, word2vec. Which model did I use for sentiment analysis? (VADER.) Can we use word2vec on retail data? If yes, how? If no, why not?
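For the association rule mining question above, the terminology usually discussed is support, confidence, and lift. Here is a minimal sketch of how these could be computed, assuming a handful of made-up grocery baskets. The data and the helper names `support` and `rule_metrics` are hypothetical illustrations, not something asked in the interview.

```python
# Hypothetical toy baskets, invented purely for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Assumes both sides appear in at least one transaction,
    so the divisions below are well defined.
    """
    a, c = set(antecedent), set(consequent)
    count_a = sum(a <= t for t in transactions)
    count_ac = sum((a | c) <= t for t in transactions)

    rule_support = count_ac / len(transactions)   # P(A and C)
    confidence = count_ac / count_a               # P(C | A)
    lift = confidence / support(c, transactions)  # P(C | A) / P(C)
    return {"support": rule_support, "confidence": confidence, "lift": lift}


# Metrics for the rule {bread} -> {milk}
metrics = rule_metrics({"bread"}, {"milk"}, baskets)
print(metrics)
```

A lift above 1 suggests the two items are bought together more often than chance would predict. In this toy data the lift for bread -> milk comes out slightly below 1, so the rule would not be very interesting.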
Know the concepts and their real-world applications and scenarios very well. Be thorough with everything on your CV.
Cheers and Best!