American Express Data Analyst Questions
Company: American Express (Amex)
Designation: Data Analyst
Years of Experience Required: 0 to 4 years
Technical Expertise: SQL, Python/R, Statistics, Machine Learning, Case Studies
Salary Range: Competitive, based on experience
American Express, founded in 1850, is a global leader in financial services, known for its credit cards, charge cards, and traveler’s cheques. Headquartered in New York City, Amex is one of the 30 components of the Dow Jones Industrial Average. If you’re preparing for a Data Analyst role at American Express, here’s a detailed breakdown of their interview process and the types of questions you can expect.
Interview Process
The American Express Data Analyst interview process typically consists of 5 rounds, each designed to evaluate different aspects of your technical and analytical skills:
Round 1 – Telephonic Screening
Focus: Basic understanding of Data Science concepts, SQL, and Python/R.
Format: You’ll be asked to explain your projects and solve a few coding or SQL problems.
Round 2 – Walk-in/Face-to-Face Technical Round
Focus: Advanced SQL, coding, and problem-solving.
Format: You’ll solve problems on a whiteboard or shared document.
Round 3 – Project Analysis
Focus: Deep dive into your past projects.
Format: You’ll be asked to explain your approach, tools used, and the impact of your work.
Round 4 – Case Studies
Focus: Business problem-solving and data-driven decision-making.
Format: You’ll be given a real-world scenario and asked to propose solutions.
Round 5 – Hiring Manager Round
Focus: Cultural fit, communication skills, and long-term career goals.
Format: Behavioral questions and high-level discussions about your experience.
Difficulty of Questions
SQL – 8/10
1) How can you find the top 3 products that have been sold the most?
SELECT product_id, SUM(quantity) AS total_sold
FROM orders
GROUP BY product_id
ORDER BY total_sold DESC
LIMIT 3;
2) How do you find employees who have the same salary as another employee?
SELECT e1.name AS employee1, e2.name AS employee2, e1.salary
FROM employees e1
JOIN employees e2 ON e1.salary = e2.salary AND e1.id < e2.id; -- e1.id < e2.id returns each matching pair once instead of twice
3) How can you find all orders placed on a Saturday or Sunday?
-- MySQL: DAYOFWEEK returns 1 for Sunday and 7 for Saturday
SELECT *
FROM orders
WHERE DAYOFWEEK(order_date) IN (1, 7);
4) How can you find the top 5 customers who have spent the most?
SELECT customer_id, SUM(price * quantity) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC
LIMIT 5;
5) How do you find products that have not been sold in the past six months?
SELECT p.product_id, p.name
FROM products p
LEFT JOIN orders o ON p.product_id = o.product_id AND o.order_date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
WHERE o.product_id IS NULL;
R/Python – 7/10
1) Create a custom iterator class ReverseRange that iterates over a given range in reverse order.
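One possible answer, as a minimal sketch: the class implements the iterator protocol (`__iter__` and `__next__`) and counts down from `stop - 1` to `start`. The `(start, stop)` constructor signature is an assumption chosen to mirror the built-in `range`.

```python
class ReverseRange:
    """Iterate over the integers of range(start, stop) from last to first."""

    def __init__(self, start, stop):
        self.start = start
        self.current = stop  # one past the next value to return

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= self.start:
            raise StopIteration
        self.current -= 1
        return self.current


print(list(ReverseRange(0, 5)))  # [4, 3, 2, 1, 0]
```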

2) Write a Python function that implements the binary search algorithm to find a target element in a sorted list.
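A minimal iterative sketch is shown here. Returning -1 when the target is absent is an assumed convention, since the question does not specify one.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

Each step halves the search interval, so the running time is O(log n) on a list of n elements.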

3) Use NumPy to create a 2D array, calculate the mean of each row, and find the maximum value in the entire array.
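A short sketch, assuming NumPy is installed; the 3x3 sample array is made up for illustration.

```python
import numpy as np

# Sample 2D array (values are illustrative)
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

row_means = arr.mean(axis=1)  # axis=1 averages across columns, one value per row
overall_max = arr.max()       # no axis argument: maximum over the whole array

print(row_means)    # [2. 5. 8.]
print(overall_max)  # 9
```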

4) Create a metaclass that automatically adds an attribute created_at with the current time to any class created with it.
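One way to approach this, as a sketch: override `__new__` on a metaclass and set the attribute on the freshly created class. The names `TimestampMeta` and `Report` are hypothetical.

```python
from datetime import datetime


class TimestampMeta(type):
    """Metaclass that stamps each new class with its creation time."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.created_at = datetime.now()  # set once, when the class is defined
        return cls


class Report(metaclass=TimestampMeta):
    pass


print(Report.created_at)
```

Note that `created_at` records when the class itself was defined, not when its instances are created.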

5) Implement a simple singly linked list with insert and display methods.
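A minimal sketch follows. Inserting at the tail and having `display` return the printed string are assumptions, since the question leaves both open.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None

    def insert(self, data):
        """Append a new node at the tail of the list."""
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = node

    def display(self):
        """Print the list as 'a -> b -> None' and return that string."""
        parts = []
        current = self.head
        while current is not None:
            parts.append(str(current.data))
            current = current.next
        text = " -> ".join(parts + ["None"])
        print(text)
        return text


numbers = LinkedList()
for value in (1, 2, 3):
    numbers.insert(value)
numbers.display()  # 1 -> 2 -> 3 -> None
```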

Statistics/ML
1) How will you make sure that your model is not overfitting? State some ways to avoid it.
Overfitting happens when a model learns the training data too well, including noise and irrelevant details, making it perform poorly on new data.
Ways to prevent overfitting:
- Train-Test Split & Cross-Validation: Use a train-validation-test approach and K-Fold Cross-Validation to check model performance on unseen data.
- Regularization (L1 & L2): Helps in reducing the impact of unnecessary features.
- Pruning (for Decision Trees): Removes branches or limits tree depth so the tree does not memorize the training data.
- Dropout (for Neural Networks): Randomly deactivates neurons during training so the network does not rely too heavily on any single unit.
- Feature Selection: Remove unnecessary or highly correlated features.
- Early Stopping: Stop training when validation error starts increasing.
These techniques ensure that the model generalizes well to new data.
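To make the K-Fold idea concrete, here is a standard-library sketch of how the folds are formed. The helper name `k_fold_indices` is hypothetical, and it assumes the data has already been shuffled; in practice a library implementation would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


for train_idx, val_idx in k_fold_indices(10, 3):
    print(len(train_idx), val_idx)
```

Every sample lands in the validation set exactly once, so the average validation score across folds estimates performance on unseen data. A large gap between training and validation scores is a sign of overfitting.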
2) Would treating a categorical variable as a continuous variable result in a better predictive model? Why or why not?
No, treating a categorical variable as continuous can lead to incorrect relationships in a predictive model.
For example, if we assign numbers to categories like (Red = 1, Blue = 2, Green = 3), the model might assume a linear relationship, which does not exist.
Correct approach:
- One-Hot Encoding (for non-ordinal categories)
- Label Encoding (for ordinal categories)
- Embedding Layers (for deep learning models)
Using the right encoding improves model accuracy while keeping data meaningful.
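As an illustration of one-hot encoding, here is a small standard-library sketch. The helper name `one_hot_encode` and the sample colors are assumptions; columns are ordered alphabetically by category.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 vector containing a single 1."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot_encode(["Red", "Blue", "Green", "Blue"])
print(categories)  # ['Blue', 'Green', 'Red']
print(encoded)     # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
```

No color is treated as "larger" than another in this representation, which avoids the false ordering that Red = 1, Blue = 2, Green = 3 would imply.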
3) Explain why data cleansing is essential. Which methods do you use to maintain clean data?
Data cleansing removes errors, inconsistencies, and missing values, ensuring the dataset is accurate, complete, and reliable.
Methods to maintain clean data:
- Handling Missing Values: Fill with mean, median, mode, or use predictive imputation.
- Removing Duplicates: Ensures no redundant records.
- Fixing Inconsistencies: Standardizing formats, spellings, and date formats.
- Removing or Capping Outliers: Prevents a few extreme values from distorting trends and model fits.
- Data Type Conversion: Ensures numerical and categorical values are correctly formatted.
Clean data improves model accuracy and reliability.
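Two of these steps, removing duplicates and filling missing values with the median, can be sketched with the standard library alone. The function name `clean_records` and the sample rows are hypothetical, and the sketch assumes at least one non-missing value exists for the field.

```python
from statistics import median


def clean_records(records, field):
    """Drop exact duplicate records, then fill missing `field` values with the median."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))      # copy so the input is not mutated
    known = [r[field] for r in unique if r[field] is not None]
    fill = median(known)                  # assumes `known` is not empty
    for r in unique:
        if r[field] is None:
            r[field] = fill
    return unique


rows = [
    {"id": 1, "income": 50000},
    {"id": 2, "income": None},
    {"id": 1, "income": 50000},  # exact duplicate of the first row
    {"id": 3, "income": 70000},
]
print(clean_records(rows, "income"))
```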
4) Is it necessary to perform resampling in your dataset? How would you initiate this process?
Resampling is necessary when the dataset is imbalanced (e.g., fraud detection, medical diagnoses).
Methods to initiate resampling:
- Oversampling (e.g., SMOTE): Adds minority-class samples, either by duplicating existing ones or, with SMOTE, by synthesizing new ones from neighboring points.
- Undersampling: Reduces the majority class size to balance the dataset.
- Stratified Sampling: Maintains proportional representation in train-test split.
Proper resampling helps in better generalization and balanced model predictions.
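The simplest form of oversampling can be sketched as follows. This is plain random oversampling (duplicating minority rows), not SMOTE, and the helper name `random_oversample` and the `Default_Flag` sample data are assumptions for illustration.

```python
import random


def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to make up the shortfall (k may be 0)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


data = [{"Default_Flag": 0}] * 8 + [{"Default_Flag": 1}] * 2
balanced = random_oversample(data, "Default_Flag")
print(len(balanced))  # 16
```

Resampling should be applied to the training split only, so that the test set still reflects the real class distribution.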
5) You have to choose between creating a large decision tree or 100 small decision trees for the same problem. Which one will you prefer and why?
I would prefer 100 small decision trees (Random Forest) instead of one large decision tree because:
- Better Accuracy: Combining multiple trees reduces errors.
- Less Overfitting: Random Forest generalizes better.
- Handles Missing Data: Some implementations can work with incomplete data, although many still require imputation first.
- More Stable Predictions: Reduces variance in results.
A large decision tree may fit the training data perfectly but is likely to overfit and perform poorly on new data.
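The way an ensemble of trees combines its outputs for classification can be illustrated with a majority vote. This sketch shows only the aggregation step, not tree training, and the sample predictions are made up.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of five small trees for one customer
tree_predictions = [1, 0, 1, 1, 0]
print(majority_vote(tree_predictions))  # 1
```

Because each tree makes somewhat different mistakes, a single wrong tree is outvoted, which is why the ensemble's predictions vary less than those of one large tree.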
Case Study
Problem Statement:
American Express wants to improve its credit risk assessment model to minimize loan defaults while maximizing customer retention. Your task as a Data Analyst is to analyze credit card transaction data, customer demographics, and repayment patterns to identify high-risk customers and suggest strategies for better risk management.
Dataset Overview:
You have access to a dataset containing credit card transactions and repayment history for American Express customers. The dataset includes:
- Customer_ID – Unique identifier for each customer
- Credit_Score – Customer’s credit score (e.g., FICO score)
- Annual_Income – Customer’s reported yearly income
- Credit_Limit – Maximum credit available to the customer
- Current_Balance – Outstanding balance on the credit card
- Monthly_Repayment – Average monthly repayment amount
- Late_Payment_Count – Number of times the customer has missed payments
- Transaction_History – Summary of past purchases (e.g., categories, amounts)
- Spending_Pattern – Analysis of customer spending behavior (e.g., high spenders vs. low spenders)
- Default_Flag – 1 if the customer defaulted on payments, 0 if not
Key Questions to Answer:
1. What factors contribute to credit risk for American Express?
- How do credit scores impact the likelihood of defaults?
- Do customers with high credit limits and low incomes pose a higher risk?
- Does the frequency of late payments predict future defaults?
2. How can American Express improve its credit risk model?
- Should Amex adjust credit limits dynamically based on spending behavior?
- Can transaction history be used to predict early warning signs of default?
- How can external data (e.g., economic trends) improve risk assessment?
3. What strategies can Amex implement to balance risk and customer retention?
- Should Amex offer lower interest rates to financially stable customers?
- How can personalized repayment plans reduce customer churn?
- Can targeted financial education improve customer credit behavior?
Key Insights & Business Recommendations
1. Identifying High-Risk Customers
- Low Credit Score & High Credit Utilization: Customers with low credit scores and high outstanding balances are at a higher risk of default. Amex should monitor such accounts closely.
- Frequent Late Payments: A high late payment count is a strong predictor of financial distress.
- Income vs. Credit Limit Mismatch: Customers with high credit limits but low reported incomes may struggle to repay, increasing default risk.
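The three signals above can be turned into a simple rule-based screen, sketched here using the dataset's column names. The function name `risk_flags` and every threshold are hypothetical placeholders; in a real analysis they would be derived from the data, for example from default rates per bucket.

```python
def risk_flags(customer, min_score=600, max_utilization=0.8,
               max_late=3, max_limit_to_income=0.5):
    """Return the list of risk signals triggered by one customer record.

    All thresholds are illustrative assumptions, not Amex policy.
    Assumes Credit_Limit and Annual_Income are positive.
    """
    flags = []
    utilization = customer["Current_Balance"] / customer["Credit_Limit"]
    if customer["Credit_Score"] < min_score and utilization > max_utilization:
        flags.append("low_score_high_utilization")
    if customer["Late_Payment_Count"] >= max_late:
        flags.append("frequent_late_payments")
    if customer["Credit_Limit"] / customer["Annual_Income"] > max_limit_to_income:
        flags.append("limit_income_mismatch")
    return flags


risky = {"Credit_Score": 580, "Annual_Income": 30000, "Credit_Limit": 20000,
         "Current_Balance": 18000, "Late_Payment_Count": 4}
print(risk_flags(risky))
```

A rule-based screen like this is easy to explain to stakeholders and can serve as a baseline before a predictive model is trained on Default_Flag.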
2. Enhancing Credit Risk Assessment
- Dynamic Credit Limit Adjustments: Amex can automatically adjust credit limits based on spending and repayment patterns to reduce risk.
- Predictive Default Models: Using historical transaction data and machine learning, Amex can develop early warning systems to flag risky customers.
- Alternative Credit Scoring: Amex can incorporate non-traditional data (e.g., spending behavior, employment history) to assess creditworthiness more accurately.
3. Customer Retention & Risk Mitigation Strategies
- Personalized Repayment Plans: Offering flexible installment-based repayment options can help customers avoid defaults.
- Loyalty-Based Risk Reduction: Customers with a long history of responsible spending can receive interest rate reductions to encourage timely payments.
- Financial Education Initiatives: Providing AI-powered spending insights and financial literacy programs can help customers make better financial decisions.
You can practice more case studies and other statistics topics here –
https://thedatamonk.com/data-science-resources/
For any information related to courses or e-books, please send an email to [email protected]