SQL Theoretical Questions

Following are the top 100 SQL Questions to clear your theoretical concepts. Few questions will be repeated from Day 10, 11, and 12.

1. What is a database?
-A database is a collection of information in an organized form for faster and better access, storage and manipulation. It can also be defined as a collection of tables, schema, views, and other database objects.

2. What is a Data warehouse?
-Data warehouse refers to a central repository of data from multiple sources of information. Those data are consolidated, transformed and made available for the mining as well as online processing.

3. What is SQL?
-SQL stands for Structured Query Language , and it is used to communicate with the Database. This is a standard language used to perform tasks such as retrieval, updation, insertion and deletion of data from a database.

4. What is a Table in a Database?
A table is a database object used to store records in a field in the form of columns and rows that holds data.

5. What is a primary key?
-A primary key is a combination of fields which uniquely specify a row. This is a special kind of unique key, and it has implicit NOT NULL constraint. It means, Primary key values cannot be NULL.

6. What is a unique key?
-A Unique key constraint uniquely identified each record in the database. This provides uniqueness for the column or set of columns.
A Primary key constraint has automatic unique constraint defined on it. But not, in the case of Unique Key.
There can be many unique constraint defined per table, but only one Primary key constraint defined per table.

7. What is a foreign key?
-A foreign key is one table which can be related to the primary key of another table. Relationship needs to be created between two tables by referencing foreign key with the primary key of another table

8. What are the popular Database Management Systems in the IT Industry?
Oracle, MySQL, Microsoft SQL Server, PostgreSQL, Sybase, MongoDB, DB2, and Microsoft Access etc.

9. What is a join?
-This is a keyword used to query data from more tables based on the relationship between the fields of the tables. Keys play a major role when JOINs are used.


10. What are the types of join and explain each?

There are various types of join which can be used to retrieve data and it depends on the relationship between tables.

  • Inner Join.
  • Inner join return rows when there is at least one match of rows between the tables.
  • Right Join.Right join return rows which are common between the tables and all rows of Right hand side table. Simply, it returns all the rows from the right hand side table even though there are no matches in the left hand side table.
  • Left Join.Left join return rows which are common between the tables and all rows of Left hand side table. Simply, it returns all the rows from Left hand side table even though there are no matches in the Right hand side table.
  • Full Join.Full join return rows when there are matching rows in any one of the tables. This means, it returns all the rows from the left hand side table and all the rows from the right hand side table.

11. What are the different types of SQL commands?
SQL commands are segregated into the following types:

  • DDL – Data Definition Language
  • DML – Data Manipulation Language
  • DQL – Data Query Language
  • DCL – Data Control Language
  • TCL – Transaction Control Language

12. DDL Commands:

  • CREATE: To create databases and database objects
  • ALTER: To alter existing database objects
  • DROP: To drop databases and databases objects
  • TRUNCATE: To remove all records from a table but not its database structure
  • RENAME: To rename database objects

13. DML Commands

  • SELECT: To select specific data from a database
  • INSERT: To insert new records into a table
  • UPDATE: To update existing records
  • DELETE: To delete existing records from a table

14. DCL Commands

  • GRANT: To provide user access
  • DENY: To deny permissions to users
  • REVOKE: To remove user access

15. What is normalization?
-Normalization is the process of minimizing redundancy and dependency by organizing fields and table of a database. The main aim of Normalization is to add, delete or modify field that can be made in a single table.

16. What is Denormalization?
-DeNormalization is a technique used to access the data from higher to lower normal forms of database. It is also process of introducing redundancy into a table by incorporating data from the related tables.

17. What is an Index?
-An index is used to speed up the performance of queries. It makes faster retrieval of data from the table. The index can be created on one column or a group of columns.

18. What are all the different types of indexes?
-There are three types of indexes
1. Unique Index: Unique Indexes helps maintain data integrity by ensuring that no two rows of data in a table have identical key values. A unique index can be applied automatically when a primary key is defined. It ensures that the values in the index key columns are unique.
2. Clustered Index: Clustered Index reorders the physical order of the table and search based on the key values. There will be only one clustered index per table.
3. Non-Clustered Index: Non-Clustered Index doesn’t alter the physical order of the table and maintains a logical order of the data. Each table can have many non-clustered indexes.

19. What is a relationship and what are they?
-Database Relationship is defined as the connection between the tables in a database. There are various data basing relationships, and they are as follows:-
One to One Relationship.
One to Many Relationship.
Many to One Relationship.
Self-Referencing Relationship.

20. What are the advantages of Views?
Some of the advantages of Views are

  1. Views occupy no space
  2. Views are used to simply retrieve the results of complicated queries that need to be executed often.
  3. Views are used to restrict access to the database or to hide data complexity.

21. What is a stored procedure?
-Stored Procedure is a function consists of many SQL statement to access the database system. Several SQL statements are consolidated into a stored procedure and execute them whenever and wherever required.

22. What is a relationship and what are they?
-Database Relationship is defined as the connection between the tables in a database. There are various database relationships namely
1. One to One Relationship
2. One to Many Relationship
3. Many to One Relationship
4. Self-Referencing Relationship

23. What is the difference between Local Variables and Global Variables?
Local Variables: Local variables can be used or exist only inside the function. These variables are not used or referred by any other functions. These are not known to other functions. Variables can be created whenever that function is called.
Global Variables: Global variables can be used or exist throughout the program. Same variable declared in global cannot be used in functions. Global variables cannot be created whenever that function is called.

24. What is the difference between DELETE and TRUNCATE commands?
-DELETE command is used to remove rows from the table, and WHERE clause can be used for conditional set of parameters. Commit and Rollback can be performed after delete statement.
TRUNCATE removes all rows from the table. Truncate operation cannot be rolled back.

25. What is a constraint?
Constraint can be used to specify the limit on the data type of table. Constraint can be specified while creating or altering the table statement. Sample of constraint are:
NOT NULL.
CHECK.
DEFAULT.
UNIQUE.
PRIMARY KEY.
FOREIGN KEY.

26. What is the difference between UNIQUE and PRIMARY KEY constraints?
-There should be only one PRIMARY KEY in a table whereas there can be any number of UNIQUE Keys.
PRIMARY KEY doesn’t allow NULL values whereas Unique key allows NULL values.

27. What is the difference between NULL value, Zero, and Blank space?
-As I mentioned earlier, Null value is field with no value which is different from zero value and blank space.
Null value is a field with no value.
Zero is a number
Blank space is the value we provide. The ASCII value of space is CHAR(32).

28. What is CLAUSE?
SQL clause is defined to limit the result set by providing condition to the query. This usually filters some rows from the whole set of records.
Example –
Query that has WHERE condition
Query that has HAVING condition.

29. What is Union, minus and Interact commands?
-UNION operator is used to combine the results of two tables, and it eliminates duplicate rows from the tables.
MINUS operator is used to return rows from the first query but not from the second query. Matching records of first and second query and other rows from the first query will be displayed as a result set.
INTERSECT operator is used to return rows returned by both the queries.

30. List out the ACID properties and explain? 
Following are the four properties of ACID. These guarantees that the database transactions are processed reliably.

  • Atomicity
  • Consistency
  • Isolation
  • Durability

We will keep updating this page to collate all the theoretical questions at one place for your ease.

Keep learning πŸ™‚

SQL one table 30 questions

SQL is your bread and butter. Let’s take a table and try to answer few questions ranging from very basic to intermediate level.

Emp Table – Employees at Olivers’ Restro Pub

EmpNoENameJobMGRHireDateSalCommDeptNo
1234AmitWaiter838219-Oct-185000050050
5678AshishAnalyst86352-Nov-186000020051

Dept Table

DeptNoDnameLoc
50ServiceDelhi
51AccountMumbai

a. Show all the data from emp table
SELECT * FROM emp;

b. Show all the data from Dept table
SELECT * FROM dept;

c. Display distinct jobs from Dept table
SELECT DISTINCT(job) FROM dept;

d. Number of employees
SELECT COUNT(*) FROM emp;

e. List the employee in the ascending order of salary
SELECT * from emp ORDER BY Sal;

f. Show the employee information of the Managers
SELECT * from emp WHERE EmpNo in (SELECT MGR FROM emp);

g. List of employees who were hired before 2018.
SELECT * FROM emp WHERE HireDate < ’01-Jan-2018′;

h. List the detail of employees along with the annual salary, order it on the annual salary
SELECT *, sal*12 as Annual_Income
FROM emp
ORDER BY Annual_Income;

i. Display number of months of experience of all the Managers
SELECT *, months_between(sysdate,HireDate) as Exp
FROM emp
WHERE EmpNo IN (SELECT MGR FROM emp);

j. Display the name of the employees with Commission(Comm) less than Salary (Sal)
SELECT EName FROM emp WHERE Comm < Sal;

k. Display the name of the employee with Daily income more than 200
SELECT EName FROM emp WHERE (sal/30)>200;

l. Show information of all the Waiters
SELECT * FROM emp WHERE Job = ‘Waiter’;

m. Show all the employee who joined on 01-Aug-2018, 4-Aug-2018, 29-Oct-2018 in descending order of Hire Date
SELECT * FROM emp WHERE HireDate IN (’01-Aug-2018′,’04-Aug-2018′,’29-OCt-2018′)
ORDER BY HireDate DESC;

n. List the employees who joined in 2018
SELECT * FROM emp WHERE HireDate BETWEEN (’01-Jan-2018′) AND (’31-Dec-2018′);

o. Employees with Annual Salary between 600000 and 1000000
SELECT * FROM emp WHERE Sal*12 BETWEEN 600000 AND 1000000;

p. List the employees with name starting with N and containing 5 alphabets
SELECT * FROM emp WHERE EName LIKE ‘N____’;

Or

SELECT * FROM emp WHERE EName LIKE ‘N%’ AND len(EName) = 5;

q. List the employee with the third alphabet in their name as K
SELECT * FROM emp WHERE Upper(EName) LIKE ‘__K%’;

r. Show the name of the employees who joined in August month of any year.
SELECT * FROM emp WHERE to_char(HireDate,’mon’)=’Aug’;

s. Show the employee details of those who were hired in the 90s
SELECT * FROM emp WHERE to_char(HireDate, ‘yy’) LIKE ‘9_’;

t. Show the employee who were not hired in the month of October.
SELECT * FROM emp WHERE to_char(HireDate,’MON’) NOT IN ‘(‘Oct’);

u. List the total information of the employees along with DName and Location of people working under ‘Accounts’
SELECT *
FROM emp e
INNER JOIN dept d ON (e.DeptNo = d.DeptNo)
WHERE d.DName = ‘Account’

v. List all the employees with more than 10 years of experience as of now
SELECT *
FROM emp
WHERE TIMESTAMPDIFF(MONTH, HireDate, sysdate)

w. List the detail of all the employees whose salary is less than that of Aman

SELECT *
FROM emp
WHERE sal > (SELECT sal FROM emp WHERE EName = ‘Aman’);

x. Show the name of those employees who are senior to their own Manager.
SELECT * FROM emp w, emp m
WHERE w.MGR = m.EmpNo and w.HireDate < m.HireDate

Or

SELECT * FROM emp w, emp m
WHERE w.EmpNo = m.MGR and w.HireDate < m.HireDate

y. Show the employees who are senior to Aman
SELECT * FROM emp WHERE HireDate < (SELECT HireDate FROM emp WHERE EName = ‘Aman’)

z. Show the employees who are senior to Aman and are working in Delhi or Bangalore

SELECT * FROM emp e, dept d WHERE
UPPER(d.loc) IN (‘DELHI,’BANGALORE’) AND e.DeptNo = d.DeptNo
AND e.HireDate < (SELECT e.HireDate FROM emp e WHERE EName = ‘Aman’);

za. Show the employees with the same job as Aman or Amit.
SELECT * FROM emp
WHERE job in (SELECT job from emp WHERE EName IN (‘Aman’,’Amit’);

zb. Find the highest salary of any employee
SELECT MAX(Sal)
FROM emp;

zc. Find the detail of the employee with the minimum pay
SELECT * FROM emp WHERE Salary = (SELECT MIN(Salary) FROM emp);

zd. Show the detail of the recently hired employee working in Delhi

I am tired, comment the answer below

Keep Practicing.

XtraMous



SQL Tricky Questions

You are already done with the basics of SQL on Day 10. No matter how good you are at it, there are some tricky questions which might confuse you in its trap. Below are some not so difficult but confusing questions. Try to solve it on your own before hoping on the solution.

For the next 7-8 questions, we will be referring to the below table

Emp. IDNameSalary
123Amit50000
453Sumit50000
232Rakshit30000
124Aman40000
543Rahul30000

1. What is ROW_NUMBER() function in SQL?
– The ROW_NUMBER() ranking returns a unique and sequential number to
each row of the table without repeating or skipping any number. If there are two rows which are partitioned and holds the same value as Amit and Sumit, then the row number will be given randomly.

2. Write the syntax to create a new column using Row Number over the Salary column

SELECT *, ROW_NUMBER() OVER (Order By Salary) as Row_Num
FROM Employee

Output

Emp. IDNameSalaryRow_Num
232Rakshit300001
543Rahul300002
124Aman400003
123Amit500004
453Sumit500005

3. What is PARTITION BY clause?
PARTITION BY clause is used to create a partition of ranking in a table. If you partition by Salary in the above table, then it will provide a ranking based on each unique salary. Example below:-

SELECT *, ROW_NUMBER() OVER (PARTITION BY Salary ORDER BY Salary) as Row_Num

Emp. IDNameSalaryRow_Num
232Rakshit300001
543Rahul300002
124Aman400001
123Amit500001
453Sumit500002

4. What is a RANK() function? How is it different from ROW_NUMBER()?
RANK() function gives ranking to a row based on the value on which you want to base your ranking. If there are equal values, then the rank will be repeated and the row following the repeated values will skip as many ranks as there are repeated values row. Confused?? Try out the example below:-

SELECT *, RANK() OVER (ORDER BY Salary) as Row_Num
FROM Employee

Output

Emp. IDNameSalaryRow_Num
232Rakshit300001
543Rahul300001
124Aman400003
123Amit500004
453Sumit500004

As you can see, the rank 2 has been skipped because there were two employees with the same Salary and the result is ordered in ascending order by default.

5. How to use PARTITION BY clause in RANK() function?
– There is no point using PARTITION BY clause with RANK() function. The ranking result will have no meaning, as the rank will be done according to Salary values per each partition, and the data will be partitioned according to the Salary values. And due to the fact that each partition will have rows with the same Salary values, the rows with the same Salary values in the same partition will be ranked with the value equal to 1.

SELECT *, RANK() OVER (PARTITION BY Salary ORDER BY Salary) as Row_Num

OUTPUT

Emp. IDNameSalaryRow_Num
232Rakshit300001
543Rahul300001
124Aman400001
123Amit500001
453Sumit500001

6. What is Dense Ranking?
– DENSE_RANK() is similar to the RANK() function but it does not skip any rank, so if there are two equal values then both will be termed as 1, the third value will be termed as 3 and not 2.

Syntax:-
SELECT *, DENSE_RANK() OVER (PARTITION BY Salary ORDER BY Salary) as Row_Num
FROM Employee

Output:-

Emp. IDNameSalaryRow_Num
232Rakshit300001
543Rahul300001
124Aman400003
123Amit500004
453Sumit500004
432Nihar600006

7. What is NTILE() function?
-NTILE() is similar to percentile NTILE(3) will divide the data in 3 parts.

SELECT *, NTILE() OVER (ORDER BY Salary) as Ntile
FROM Employee

The number of rows should be 6/3 = 2, therefore we need to divide the 2 rows for each percentile

Emp. IDNameSalaryNtile
232Rakshit300001
543Rahul300001
124Aman400002
123Amit500002
453Sumit500003
432Nihar600003

8. How to get the second highest salary from a table?
Select MAX(Salary)
from Employee
Where Salary NOT IN (SELECT MAX(Salary) from Employee)

9. Find the 3rd Maximum salary in the employee table
-Select distinct sal
from emp e1
where 3 = ((select count(distinct sal) from emp e2 where e1.sal <= e2.sal);

10. Get all employee detail from EmployeeDetail table whose “FirstName” not start with any single character between ‘a-p’
– SELECT *
FROM EmployeeDetail
WHERE FirstName like ‘[^a-p]%’

11. How to fetch only even rows from a table?
-The best way to do it is by adding a row number using ROW_NUMBER() and then pulling the alternate row number using row_num%2 = 0

Suppose, there are 3 columns in a table i.e. student_ID, student_Name, student_Grade. Pull the even rows

SELECT *
FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY student_ID) as row_num FROM student) x
WHERE x.row_num%2=0

12. How to fetch only odd rows from the same table?
-Simply apply the x.row_num%2 <> 0 to get the odd rows

SELECT *
FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY student_ID) as row_num FROM student) x
WHERE x.row_num%2 <> 0

13. How to find the minimum salary using subquery?
-SELECT *
FROM employee
WHERE salary = (select MIN(salary) from employee);

14. How to find the second minimum salary?
– SELECT *
FROM employee
WHERE salary = (SELECT MIN(salary) FROM employee > SELECT MIN(salary) FROM employee)

Similarly, find the third minimum salary

– SELECT *
FROM employee
WHERE salary = (SELECT MIN(salary) FROM employee > SELECT MIN(salary) FROM employee > SELECT MIN(salary) FROM employee)

15. The above query is too lengthy, write a query to get the third minimum salary with some other method.

– SELECT DISTINCT (salary)
FROM emp e1 where 3 = (SELECT COUNT(DISTINCT salary) FROM emp e2 WHERE e1.sal >= e2.sal);

16. How to get 3 Min salaries?
-SELECT DISTINCT salary FROM emp a WHERE 3 >= (SELECT COUNT(DISTINCT salary) FROM emp b WHERE a.salary >= b.salary);

17. Some basic SQL Select questions
– SELECT 125
125
-SELECT ‘Ankit’+’1’
Ankit1
-SELECT ‘Ankit’+1
Error
– SELECT ‘2’+2
4
-SELECT SUM(‘1’)
1

18.
SELECT CURDATE(); Γ  It returns the current date(MySQL)
SELECT NOW(); Γ  It returns the current date and time(MySQL)
SELECT getdate(); Γ  It returns the current date and time(SQL Server Query)
SELECT SYSDATE FROM DUAL Γ  This Oracle query returns the current date and time

19. Explain Scalar functions.
Scalar functions are those functions which are used to return a single value based on the input. Following are the commonly used scalar functions in SQL:-
1. UCASE() – Convert to Upper Case
2. LCASE() – Convert to Lower Case
3. MID() – It extracts a substring from a string. SELECT MID(β€œABCDEFGH”,3,6) AS ExtractedString
It will extract 6 characters starting from 3rd character i.e. C,D,E,F,G,H
4. FORMAT() – It specifies the display format
5. LEN() – Gives the length of a text field
6. ROUND() – Rounds up the decimal field in a number

20. Write a generic method to fetch the nth highest salary without TOP or Limit

SELECT Salary
FROM Worker W1
WHERE n-1 = (
 SELECT COUNT( DISTINCT ( W2.Salary ) )
 FROM Worker W2
 WHERE W2.Salary >= W1.Salary
 );

Basically clear your concept about Ranking, sub-queries, joins and functions.

We will also be publishing “100 Tough SQL questions to practice before Data Science Interview”. Keep checking for the update.

Keep practicing.

XtraMous





SQL Basics to get you started

SQL stands for Structured Query Language and it is used to pull relevant data from the database. We assume that you already know a bit about SQL, so we will try to cover the base and then will quickly jump to some questions.

1. What all can we do with SQL?

We can perform the following task:-
a. Retrieval of data
b. Updation of data
c. Insertion of data
d. Deletion of data

2. A table is basically a set of data which is organized in the form of rows and columns.

3. What are the different subsets of SQL?
We can further divide SQL into 3 parts or subsets:-
a. Data Definition Language – It helps you to CREATE, ALTER, and DELETE objects
b. Data Manipulation Language –It helps us to insert, update, delete, and retrieve data from the database
c. Data Control Language – DCL allows you to control access to the database. Grant and Revoke access permission

4. What is a Primary Key?
– A key which uniquely defines each row in a table and it cannot be NULL. Example – Roll number or Employee ID in a student or Employee table

5. What are constraints?
– Constraints are used to apply a limit on the data type of the table. Following are a few examples of constraints:-
a. Primary Key
b. Foreign Key
c. Unique
d. Default
f. Check
g. Not Null

6. What is a unique key?
– A key which uniquely defines each row in a table. It is mostly a group of keys which can or can not be NULL

7. What are the different types of Joins?
– There are various types of Joins, the most important out of these are:-
a. Inner Join – Intersection of two tables
b. Right Join – All the values from the right table and their corresponding matched value from the left table
c. Left Join – All the values from the left table and their corresponding matched values from the right table
d. Outer Join – All the values from both the table

8. We will not be talking about ACID properties and Normalization. These are very easily available on the internet. It would be very good if you watch a couple of videos on these topics on Youtube. The links of the videos are given below:-

Normalization
ACID Properties

9. What is a NULL value?
A NULL value is neither zero nor a blank space. It represents a value that it unavailable, unknown, Not applicable, etc.
Remember zero is a number and blank space is a character

10. What is the order of SQL query execution?

OrderClauseFunction
1FROM and JOINSelect the table to extract from
2WHEREFilter the base data
3GROUP BYAgreegate the base data
4HAVINGFilter the aggregate data
5SELECTReturn the final data
6DISTINCTApply distinct or unique condition
7ORDER BYSort the final data
8LIMITApply limit on the number of rows

11. What is the difference between DROP and TRUNCATE commands?
-DROP command removes a table and it cannot be rolled back from the database whereas TRUNCATE command removes all the rows from the table.

12. What is a view in SQL?
– Views in SQL are kind of virtual tables. A view also has rows and columns as they are
in a real table in the database. We can create a view by selecting fields from one or
more tables present in the database. A View can either have all the rows of a table or
specific rows based on certain condition

13. What is Union, minus and Interact commands?
UNION operator is used to combine the results of two tables, and it eliminates duplicate rows from the tables.
MINUS operator is used to return rows from the first query but not from the second query. Matching records of first and second query and other rows from the first query will be displayed as a result set.
INTERSECT operator is used to return rows returned by both the queries.

14. Important syntax of SQL below:-

a. BETWEEN
SELECT *
FROM Table Name
WHERE Date BETWEEN ‘date1’ AND ‘date2’

b. ORDER BY
SELECT *
FROM Table Name
WHERE Condition
ORDER BY {ASC/DESC}

c. HAVING
SELECT SUM(column_name)
FROM table_name
WHERE CONDITION
GROUP BY column_name
HAVING (arithematic function condition);

d. TRUNCATE Table
TRUNCATE TABLE Table_Name

e. ALTER Table
ALTER TABLE Table_Name {ADD/DROP/MODIFY} Column_Name {Data Type}

f. INSERT INTO
INSERT INTO table_name( column1, column2….columnN)
VALUES ( val1, val2….valN);

g. UPDATE
UPDATE table_name
SET column1 = val1, column2 = val2….columnN=valN
[ WHERE CONDITION ];

15. Explain SQL clause
The SQL LIKE clause is used to compare a value to similar values using wildcard operators.
There are two wildcards used in conjunction with the LIKE operator.
The percent sign (%)
The underscore (_)
The percent sign represents zero, one or multiple characters. The underscore represents a single number or character.
These symbols can be used in combinations.

16. What is the difference between β€˜HAVING’ CLAUSE and a β€˜WHERE’ CLAUSE?
HAVING clause can be used only with SELECT statement.
It is usually used in a GROUP BY clause and whenever GROUP BY is not used, HAVING behaves like a WHERE clause.
Having Clause is only used with the GROUP BY function in a query
whereas WHERE Clause is applied to each row before they are a part of the GROUP BY function in a query.

17. What is the difference between JOIN and UNION?
-UNION
combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. By using JOINs, you can retrieve data from two or more tables based on logical relationships between the tables
JOIN:-
a.) Joins the columns
b.)Duplicates are allowed
c.) Combines the column based on condition

UNION:-
a.) Merge the row
b.) Duplicates are not allowed
c.) Combine the result of two select statements.

18. What is self join?
The SQL SELF JOIN is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.

SELECT a.column_name, b.column_name…
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;

19.
SELECT – select records from a table
INSERT – insert new records
UPDATE – update/Modify existing records
DELETE – delete existing records
CREATE – create a new Table, database, schema
ALTER – alter existing table, column description
DROP – delete existing objects from database
GRANT – allows users to read/write on certain database objects
REVOKE – keeps users from read/write permission on database objects

20. What is the difference between inner and outer join? Explain with example.
Inner join is the most common type of Join which is used to combine the rows from two tables and create a result set containing only such records that are present in both the tables based on the joining condition (predicate).Inner join returns rows when there is at least one match in both tables.
If none of the record matches between two tables, then INNER JOIN will return a NULL set. Below is an example of INNER JOIN and the resulting set.

Outer Join- Outer Join, on the other hand, will return matching rows from both tables as well as any unmatched rows from one or both the tables (based on whether it is single outer or full outer join respectively).Outer Join can be full outer or single outer.

These were the very basic questions and topics. Apart from it there are some interesting questions waiting for you in the next exercise where we will deal with partition by, ranking and some tricky SQL questions.

You can easily find the explanation of the above questions on the internet, else you can also look into our book on SQL

112 Questions to crack Data Science Interview using SQL

There are few more questions on TheDataMonk which you can find here

Keep Practicing !!

XtraMous









Introduction to Data Science, How to start?

We will keep posting Guesstimates, Case Studies, Puzzles, etc. But it’s high time we should start with Data Science.

On an overall level we can safely divide a Data Science job in the following buckets:-

1. Statistics – This job has a lot to do with Maths, so we will put it as the top priority
2. SQL – You need to fetch data and you need to be really good in it
3. R/Python – I will keep Python above all, but if you are good at R, then it should do the job
4. Visualization tool – Tableau/PowerBI. Both are almost the same, but Tableau is preferred.
5. MS Excel and Power Point – Bread and butter

Approx. amount of time you should dedicate to cover the above topics:

1. Statistics – Are you kidding me ?? Life time πŸ™‚ But ideally an effort of 100 hours should move you from rookie to sort of intermediate

2. SQL – ~40 hours

3. R/Python – 80 hours

4. Visualization – 40 hours

5. MS Excel and Power Point – ~30 hours. You will keep learning

How to start ??

I don’t really want to promote the website, you can choose any good resource from the below list:-

1. Statistics – Don’t go for any online course. Try to complete “Introduction to Statistical Learning”. You know how and where to get it πŸ˜›

2. SQL – Practice on www.sqlzoo.net and do try all the questions posted on the website. Try the following:-

a. 112 Questions to crack Data Science Interview using SQL
b. Write better SQL Queries + SQL interview Questions

There are many other books on Kindle, do practice πŸ™‚

3. R/Python – Any good source should do. You can enroll in some online courses on Coursera, Datacamp, etc. But do practice a lot. We have a book on Python and R, see if you like it:-

a. 100 Python Questions to crack Data Science/Analyst Interview
b. 100 Questions to Learn R in 6 Hours

We will keep publishing books on these topics.

4. Visualization Tools – We strongly recommend any online or paid course on Tableau or PowerBI

5. MS Excel and PowerPoint – Ask your seniors or team leads for resources. The internet is full of cheat sheets which are actually helpful

To start with, start devoting 1-2 hours per day, follow this website or any website of your choice. We want to create a resourceful blog which can actually help people like you.

Try Hackathons on AnalyticsVidhya and Kaggle to chisel your skill. We strongly and very strongly believe that Hackathons can actually uplift your level of understanding and knowledge about different algorithms.

Keep practicing.

XtraMous

Puzzles and aptitude problem

Why add puzzles and logical problem in a Data Science course?

Well, because you are supposed to solve problems of the same intensity on the floor.
Well, because you need to chisel your logic every now and then.
Well, because these are asked in the hiring interviews πŸ˜›
Well, because you can’t run away from Mathematics in a domain which is closer to it that it appears.

1. When one of my friends was asked the following question in the last round of WNS Global Services

“Tell me different ways in which you can tell me if the light inside glows after you shut the door”

2. A classic beaker and bucket challenge is always on the cards. There are two beakers, 4 liters and 5 liters, and a bucket. Get me 7 liters of water in the bucket.

Don’t directly jump to the answer, yeahh..close your eyes and think about a solution. In an interview, you will not get more than 2 minutes for this problem.

3. Make 120 using 5 zeroes. You can use any mathematical operator

4. Now make it using 4 zeroes.

5. There are two kids, and you have a cup of coffee. How will you divide it into two cups so that both the kids are satisfied?

6. Using 3 cuts, divide a cake into 8 parts

7. Suggest me one more way. Equal means equal, you need to give cream and base equally to all the 8 parts.

8. Use only 4 weights to make a device which can measure the following weight:-
a. up to 15 kgs
b. up to 40 kgs
c. up to 80 kgs

9. You are given two candles of equal size, which can burn 1 hour each. You have to measure 45 minutes with these candles.

10. One last question in this interview, What will be the name of the third person who you will meet today?

For the aptitude part, all we can suggest is that you solve problems on the following topics from any online resource:
1. Time, Speed, and Distance
2. Probability
3. Permutation and Combination
4. Try solving Sudoku
5. Work and Time
6. Guesstimate

Puzzle answers:-
1. a. Keep a video recorder inside the fridge
b. Keep a radium inside the fridge for a couple of hours. If it glows after two hours, then the light was on, else off

2. Take the 5-liter beaker and fill it completely, pour the water in the 4-liter beaker. You are left with 1 liter water, pour it in the bucket. Repeat it again to get 2 liter water in the bucket.

Now fill the 5 liter beaker and put it in the bucket.

7 done !!

3.
– (0!+0!+0!+0!+0!)!
-(Cos(0)+Cos(0)+Cos(0)+Cos(0)+Cos(0))!

4.
-((0!+0!+0!)-0!)!

5.
-A logical question with a tinge of pun in it. Ask one of the kid to divide and the second one to choose πŸ˜›

6.
– Two cuts at the right angle and the third one directly between the top and the base will get you 8 parts. The lucky 4 will get the cream πŸ˜›

7.
-Two cuts at the right angle and then put the parts on each other and strike the third cut right on the middle. You get 8 pieces, all equal

8.
a. up to 15 kgs – 1 kg, 2 kg, 4 kg, 8kg
b. up to 40 kgs – 1 kg, 3 kg, 9 kg, 27 kg
c. up to 80 kgs – 2 kg, 6 kg, 18 kg, 54 kg

If the weight is less than 2 kgs, it is 1 kg.
If it is more, put 6kg on one side, your item and 2kg on the other. If item and 2kg are less then 6kg, the item is 3kg. 
If it is more, put 6kg on one side, your item on the other. If the item is less then 6kg, it’s 5kg….so on

9.
Keep both the candles side by side. First light just one candle, when you see that the first candle is exactly half the height of the second, light the second candle (which is when 30 minutes are up). The second candle will then burn for one hour and you’ll be able to clock 90 minutes

10.
Pappu Yadav or any random name, The interviewer can’t ask you to defend the answer as it was the last question. Don’t fall for any stupid question if the interviewer specifies that it is the last question


Candidates with average aptitude are subject to rejection, so read this part carefully before applying to jobs.

If and only if you want to practice more interview puzzles, then do try this book out 100 Puzzles and Case Studies to Crack Data Science interview

Keep practicing πŸ™‚

XtraMous

Practice Guesstimate

You already know that there are some 100 Million iPad being sold in India each year. But can you guess the area of an airport? Let’s say I need to know the total area of the Bangalore airport.

There are four important areas in an airport:-
1. Parking space
2. Sitting space
3. Hangar space
4. Runway

Let’s assume the following percentage:-
1. Parking space – 10%
2. Sitting space – 10%
3. Hangar space – 10%
4. Runway – 70%

Now, suppose that on an average 168 flights fly from Bangalore. That means per hour 7 flight will be on the runway, so we need 7 runways.

A flight runs for half a minute on the runway at a speed of 120 km/hr. Thus each runway will be 1 km long. There are 7 runways, thus 7 kilometers.

The parking space, hangar space, and sitting space is 10% each, thus it will have a total space of 3 kilometers.

The total area of Bangalore airport is 10 kilometers.

This is obviously just an estimation. You can choose other ways to do a similar calculation like below:-
1. Number of flight
2. Number of passengers to estimate the sitting space
3. The dimension of an airplane to get the length
4. Number of travelers to guess the number of cars and cabs to guess the area of the parking

We will keep updating the puzzle page with more guesstimate examples. Practice at least 10-15 problems to understand how to maintain consistency with the problem.

Few problems worth giving a shot are:-
1. Number of red cars in a city, say Mumbai
2. Number of people wearing yellow cloth on a Monday in Delhi
3. Number of people eating pizza right now
4. Revenue of Facebook/Amazon/Youtube
5. Number of beer bottles in India
6. Number of weddings in India/Spain/Timbuktur

Keep practicing πŸ™‚

XtraMous


Guesstimate

Guesstimate is a methodological process of guess and estimate. Guesstimates have now become an important part of several Business Analyst/Data Scientist and MBA interview process.

What does a person asking you questions as absurd as “How many people are wearing red today?” is looking for ?
You know that he has no idea about the same but he has the power to select or reject your candidature. You will be judged entirely on the basis of your approach and the number of points which you can think of.

One and only one thing to keep in mind when you are solving a guesstimate question – You should start with a number and you have to be very very consistent with the same in the entire solution. And be very particular about your assumptions.

Let’s take an example:

What are the number of iPad sold in India in 2018?

Assumption 1 – Population of India is 1000 Million (though it’s 130 but you can safely assume a round of number of simple calculation and this is very much acceptable)

Assumption 2 – Population percentage below poverty line – 60%
and Population percentage of upper middle class 20%, whereas the rest 20% is the middle class. So there are 200 Million people who are financially eligible to buy an iPad.

Assumption 3 – Now take these 200 Millions as your base number. Dividing this 200 Million into Rural and Urban population:

Rural – 20% = 40 Million
Urban – 80% = 160 Million

Let’s divide the Rural and Urban population on the basis of age:

Rural Total population = 40 Million

AgePopulation
Percentage
Population
(Number)
Percentage buying
iPad
iPad Sold
0-12 YO10%4M0%0M
12- 20 YO30%12M30%3.6M
20-40 YO30%12M40%4.8M
40-60 YO10%4M10%.4M
60 + YO20%8M0%0M

Total iPad sold in Rural area = 8.8 Million

AgePopulation
Percentage
Population
(Number)
Percentage buying
iPad
iPad Sold
0-12 YO10%16M0%0.0M
12- 20 YO30%48M40%19.2M
20-40 YO30%48M80%38.4M
40-60 YO20%32M40%12.8M
60 + YO10%16M20%3.2M


Total iPad sold in the Urban area = 73.6 Million

Total iPad sold in India = 82.4 Million

This number might be very close or very off, but you took a couple of factors and successfully derived a number.

You can also take more factors like:-
1. Occupation – People working in Technical field are more inclined towards buying gadgets(Assumption)
2. High or low number of service center in a particular city – A city with more number of Apple service centers will sell more devices (Assumption)

With every new factor, your calculation will increase in complexity. So, add factors only if you think you can manage the calculation.

It’s always better to arrive at a number after 15 minutes of discussion with limited factors than arriving nowhere after 1 hour of intense discussion with a lot of factors.

If you want to practice more, you can visit this link

XtraMous






Case Study III

Company Name – Sapient
Round – 2 (Case Study)
Topic – Recommendation of food items to new customer in a restaurant



Punjabi By Nature, a restaurant in Bangalore, delivers and serves food to its customer. It has been in the business for the last 10 years. Recently the owner heard about Data Science and they want to leverage the opportunity in order to boost its revenue. The restaurant has been collecting the following data since the last 10 years:-
1. Name of Customer
2. Sex of Customer
3. Age of Customer
4. Food item code

Your job is to recommend 2 food items to a customer new to the restaurant. Answer the following questions:-
a. Looking at the data, provide 3 findings for the restaurant to boost their performance
Proposed Solution.
i. Food item combination can help you recommend a particular item. For example, if people prefer curd with Parathas, then you can recommend it
ii. A list of the most popular food item
iii. Looking at the customer’s age and sex you can decide what to offer to a new customer. If the data suggests that a girl in the age group 20-25 likes chocolate ice cream then you can recommend this ice cream to the new girl customer 

b. Think of 4 more data points which might help you with the analysis

Proposed Solution
i. Pincode of delivery
ii. Time of order
iii. Phone Number
iv. Date and Day of service 

c. Now, what else can you find out from this data?

Proposed Solution
i. Day of service can get you the popular food item on each day and weekend
ii. Pincode can help you identify if there is a demand for some specific food items in a particular area
iii. The phone number to inform about new offers
iv. Time of order can get you the time at which the restaurant should shoot a particular offer for a specific food item

Company Name – Xiaomi
Round 3 – Case Study
Topic – Profit of a company selling mobile back cover is declining. List out all the possible reasons



Following is the way in which discussion proceeded with the interviewer:-
1. The demand itself has declined i.e. customers are not using cover that much. Asked to think more by the interviewer
2. Maybe the competitor is also facing loss which again means that the demand is low. Competitors are making a decent profit
3. Bad Marketing – The company is not putting stalls or shops in a crowded place. The interviewer told that the company was making a decent profit 6 months back
4. Maybe the footfall of the mall or place decreased. Could be(first positive response)
5. Maybe a popular mobile phone shop has shifted somewhere else. Could be(again a so-so response)
6. Maybe the other companies have reduced the price of their product which is why customers are drifting to these companies. The interviewer seemed pleased
7. New technology in the cover market to make covers more durable and the company we are talking about is using the same old technology. Seemed good enough point
8. Since we are talking about back covers, there could be new or trending designs which are not produced by the company
9. The company has not registered on different e-commerce websites and the website they are present on is not doing good business. He looked satisfied with the point

There were discussions on each point. The key here is to understand the point if you think it is not making much sense in a discussion. If you think you are unable to convey the point, you should try to justify the point.

If it’s correct, hang on to the point.
If you think it was a miss, admit it and move to the next point.

Keep practicing and keep looking for new problem statements.

XtraMous

Case Study II

Company Name – Myntra 
Round – 2 (Case Study)

Topic:- Suppose you have a restaurant and there is a crunch of sitting space, apart from that you have one kitchen door which is generally crowded with the waiters and food delivery guys which leads to delay in serving the food. Would you like to create room for one more small window to expand the kitchen and make the service quicker?
What will be the impact on the already small sitting space?
How to optimize the above scenario?

P.S. – The approach shared below is not the best which you can think of. You may add your points and think through the problem


Approach:-Take a stand and always talk about your solution with the mindset of a data scientist. Throw numbers wherever possible 
If I were the owner of the restaurant, I would have added a small room for some selected food items. Looking at the data, we can conclude on a few food items which are served most often to the dining customers as well as to the delivery customers.

Suppose it’s a Punjabi restaurant and the best sellers are Naan, Butter Chicken, and Punjabi Biryani. Once we have the data, we can have these items prepared in the main kitchen, but the delivery/serving should be done on the small room/window which is newly created. What about the space crunch? The waiters will have a hard time going back and forth?
The space crunch needs to be compromised and in return, we will be providing quicker service which should compensate for space. The serving time and sitting time of the customer will decrease which might counter the space crunch as more number of customers can be entertained. 

There is one more problem with the restaurant and that is the wastage of raw materials as the shelf life of a few ingredients are very low. How to counter it?

Looking at the data we can come up with a few nodes of this problem. We can ask these questions to the data:-
1. Which all items are sold the most?
2. What is the key ingredient of these items?
3. Day-Food item pair, with this we can get which all items are sold on which day

We can predict the amount of demand in the coming week and can act accordingly. But prediction might not work the best in every case, right?
I agree, to counter this we can have a variable price menu, so if the shelf life of a particular item is low, we can give a discount on these items or can give a combo offer to clear the stock at the very end of the day. There were questions on the formula which you will use to determine the new price in the above situation. Basically, you need to come up with parameters and you have to decide the importance of the parameters by either giving them multiplicative or additive importance in the formula or you can club your answer with any other offer. Following were the discussion points:-

1. We can take those items which will get wasted by the end of the day(looking at the stock near the closing time) and can set a variable price. Suppose, you realize at 9 pm that at least 100 kgs of Biryani will be wasted tonight(by 12 am), then you can set a variable price on Biryani
Cost Price of 1 kg of Biryani – 100
Selling Price of 1 kg of Biryani – 200
Stock left – 100 Kgs
Time left – 3 hours
Aim – To minimize the loss Revised Price = ((180 – x)/180)*Selling Price
where x is the number of minutes after 9
So, at 9:30 pm, the price of the Biryani will be = ((180-30)/180)*200 = Rs. 166

This is one way where we can help the restaurant in clearing the stock.

We can also give the customers a Mega Offer, where if they buy a Biryani today, then they will get some y% discount tomorrow. This will also help in retaining the customers.
Apart from that, we can also give them more Biryani at a lesser price. Suppose Rs. 166 is the price at 9:30 pm, then we can offer them 1.5 kgs of Biryani at Rs.210. This will clear the stock easily.

Keep thinking about more points.

XtraMous