SQL Basics to get you started

SQL stands for Structured Query Language and it is used to pull relevant data from the database. We assume that you already know a bit about SQL, so we will try to cover the base and then will quickly jump to some questions.

1. What all can we do with SQL?

We can perform the following task:-
a. Retrieval of data
b. Updation of data
c. Insertion of data
d. Deletion of data

2. A table is basically a set of data which is organized in the form of rows and columns.

3. What are the different subsets of SQL?
We can further divide SQL into 3 parts or subsets:-
a. Data Definition Language – It helps you to CREATE, ALTER, and DELETE objects
b. Data Manipulation Language –It helps us to insert, update, delete, and retrieve data from the database
c. Data Control Language – DCL allows you to control access to the database. Grant and Revoke access permission

4. What is a Primary Key?
– A key which uniquely defines each row in a table and it cannot be NULL. Example – Roll number or Employee ID in a student or Employee table

5. What are constraints?
– Constraints are used to apply a limit on the data type of the table. Following are a few examples of constraints:-
a. Primary Key
b. Foreign Key
c. Unique
d. Default
f. Check
g. Not Null

6. What is a unique key?
– A key which uniquely defines each row in a table. It is mostly a group of keys which can or can not be NULL

7. What are the different types of Joins?
– There are various types of Joins, the most important out of these are:-
a. Inner Join – Intersection of two tables
b. Right Join – All the values from the right table and their corresponding matched value from the left table
c. Left Join – All the values from the left table and their corresponding matched values from the right table
d. Outer Join – All the values from both the table

8. We will not be talking about ACID properties and Normalization. These are very easily available on the internet. It would be very good if you watch a couple of videos on these topics on Youtube. The links of the videos are given below:-

Normalization
ACID Properties

9. What is a NULL value?
A NULL value is neither zero nor a blank space. It represents a value that it unavailable, unknown, Not applicable, etc.
Remember zero is a number and blank space is a character

10. What is the order of SQL query execution?

OrderClauseFunction
1FROM and JOINSelect the table to extract from
2WHEREFilter the base data
3GROUP BYAgreegate the base data
4HAVINGFilter the aggregate data
5SELECTReturn the final data
6DISTINCTApply distinct or unique condition
7ORDER BYSort the final data
8LIMITApply limit on the number of rows

11. What is the difference between DROP and TRUNCATE commands?
-DROP command removes a table and it cannot be rolled back from the database whereas TRUNCATE command removes all the rows from the table.

12. What is a view in SQL?
– Views in SQL are kind of virtual tables. A view also has rows and columns as they are
in a real table in the database. We can create a view by selecting fields from one or
more tables present in the database. A View can either have all the rows of a table or
specific rows based on certain condition

13. What is Union, minus and Interact commands?
UNION operator is used to combine the results of two tables, and it eliminates duplicate rows from the tables.
MINUS operator is used to return rows from the first query but not from the second query. Matching records of first and second query and other rows from the first query will be displayed as a result set.
INTERSECT operator is used to return rows returned by both the queries.

14. Important syntax of SQL below:-

a. BETWEEN
SELECT *
FROM Table Name
WHERE Date BETWEEN ‘date1’ AND ‘date2’

b. ORDER BY
SELECT *
FROM Table Name
WHERE Condition
ORDER BY {ASC/DESC}

c. HAVING
SELECT SUM(column_name)
FROM table_name
WHERE CONDITION
GROUP BY column_name
HAVING (arithematic function condition);

d. TRUNCATE Table
TRUNCATE TABLE Table_Name

e. ALTER Table
ALTER TABLE Table_Name {ADD/DROP/MODIFY} Column_Name {Data Type}

f. INSERT INTO
INSERT INTO table_name( column1, column2….columnN)
VALUES ( val1, val2….valN);

g. UPDATE
UPDATE table_name
SET column1 = val1, column2 = val2….columnN=valN
[ WHERE CONDITION ];

15. Explain SQL clause
The SQL LIKE clause is used to compare a value to similar values using wildcard operators.
There are two wildcards used in conjunction with the LIKE operator.
The percent sign (%)
The underscore (_)
The percent sign represents zero, one or multiple characters. The underscore represents a single number or character.
These symbols can be used in combinations.

16. What is the difference between ‘HAVING’ CLAUSE and a ‘WHERE’ CLAUSE?
HAVING clause can be used only with SELECT statement.
It is usually used in a GROUP BY clause and whenever GROUP BY is not used, HAVING behaves like a WHERE clause.
Having Clause is only used with the GROUP BY function in a query
whereas WHERE Clause is applied to each row before they are a part of the GROUP BY function in a query.

17. What is the difference between JOIN and UNION?
-UNION
combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. By using JOINs, you can retrieve data from two or more tables based on logical relationships between the tables
JOIN:-
a.) Joins the columns
b.)Duplicates are allowed
c.) Combines the column based on condition

UNION:-
a.) Merge the row
b.) Duplicates are not allowed
c.) Combine the result of two select statements.

18. What is self join?
The SQL SELF JOIN is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.

SELECT a.column_name, b.column_name…
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;

19.
SELECT – select records from a table
INSERT – insert new records
UPDATE – update/Modify existing records
DELETE – delete existing records
CREATE – create a new Table, database, schema
ALTER – alter existing table, column description
DROP – delete existing objects from database
GRANT – allows users to read/write on certain database objects
REVOKE – keeps users from read/write permission on database objects

20. What is the difference between inner and outer join? Explain with example.
Inner join is the most common type of Join which is used to combine the rows from two tables and create a result set containing only such records that are present in both the tables based on the joining condition (predicate).Inner join returns rows when there is at least one match in both tables.
If none of the record matches between two tables, then INNER JOIN will return a NULL set. Below is an example of INNER JOIN and the resulting set.

Outer Join- Outer Join, on the other hand, will return matching rows from both tables as well as any unmatched rows from one or both the tables (based on whether it is single outer or full outer join respectively).Outer Join can be full outer or single outer.

These were the very basic questions and topics. Apart from it there are some interesting questions waiting for you in the next exercise where we will deal with partition by, ranking and some tricky SQL questions.

You can easily find the explanation of the above questions on the internet, else you can also look into our book on SQL

112 Questions to crack Data Science Interview using SQL

There are few more questions on TheDataMonk which you can find here

Keep Practicing !!

XtraMous









SQL interview questions for Data Science and Business Analyst role

Before you start with the questions, make sure you have your basics clear. Look for explanation if you don’t have enough context about some syntax or function. We have provided the table schema wherever required, in case the schema is not provided, please assume the table and column names.

1. How to find the third highest salary in an employee table with employee number and employee salary?

SELECT * FROM Employee_Table t1
WHERE 3 =
(SELECT Count(distinct Salary) from Employee_Table t2 WHERE t1.Salary <= t2.Salary);

2. Write a query to find maximum salary of each department in an organisation.

SELECT Department_Name, Max(Salary)
FROM Department_Table
GROUP BY Department_Name

3.What is wrong with the following query?
SELECT Id, Year(PaymentDate) as PaymentYear
FROM Bill_Table
WHERE PaymentYear > 2018;

Though the variable PaymentYear has already been defined in the first line of the query, but this is not the correct logical process order. The correct query will be
SELECT Id, Year(PaymentDate) as PaymentYear
FROM Bill_Table
WHERE Year(PaymentDate) > 2018;

4. What is the order of execution in a query?

The order of query goes like this:-
FROM – Choose and join tables to get the raw data
WHERE – First filtering condition
GROUP BY – Aggregates the base data
HAVING – Apply condition on the base data
SELECT – Return the final data
ORDER BY – Sort the final data
LIMIT – Apply limit to the returned data

5.What is ROW_NUMBER() function?

It assigns a unique id to each row returned from the query ,even if the ids are the same. Sample query:-

SELECT emp.*,
row_number() over (order by salary DESC) Row_Number
from Employee emp;

Employee NameSalaryRow_Number
Amit70001
Bhargav60002
Chirag60003
Dinesh50004
Esha30005
Farhan30006

Even when the salary is the same for Bhargav and Chirag, they have a different Row_Number, this means that the function row_number just gives a number to every row

6. What is RANK() function?

RANK() function is used to give a rank and not a row number to the data set. The basic difference between RANK() and ROW_NUMBER is that Rank will give equal number/rank to the data points with same value. In the above case, RANK() will give a value of 2 to both Bhargav and Chirag and thus will rank Dinesh as 4. Similarly, it will give rank 5 to both Esha and Farhan.

SELECT emp.*,
RANK() over (order by salary DESC) Ranking
from Employee emp;

7. What is NTILE() function?

NTILE() function distributes the rows in an ordered partition into a specific number of groups. These groups are numbered. For example, NTILE(5) will divide a result set of 10 records into 5 groups with 2 record per group. If the number of records is not divided equally in the given group, the function will set more record to the starting groups and less to the following groups.

SELECT emp.*,
NTILE(3) over (order by salary DESC) as GeneratedRank
from Employee emp

This will divide the complete data set in 3 groups from top. So the GeneratedRank will be 1 for Amit and Bhargav, 2 for Chirag and Dinesh: 3 for Esha and Farhan

8. What is DENSE_RANK() ?

This gives the rank of each row within a result set partition, with no gaps in the ranking values. Basically there is no gap, so if the top 2 employees have the same salary then they will get the same rank i.e. 1 , much like the RANK() function. But, the third person will get a rank of 2 in DENSE_RANK as there is no gap in ranking where as the third person will get a rank of 3 when we use RANK() function. Syntax below:-

SELECT emp.*,
DENSE_RANK() OVER (order by salary DESC) DenseRank
from Employee emp;

9. Write a query to get employees name starting with vowels.

SELECT EmpID,EmpName
FROM Employee
where EmpName like ‘[aeiou]%’

10. Write a query to get employee name starting and ending with vowels.

SELECT EmpID,EmpName
FROM Employee
where EmpName like ‘[aeiou]%[aeiou]’

11. What are the different types of statements supported in SQL?

There are three types of statements in SQL:-
a. DDL – Data Definition Language
b. DML – Data Manipulation Language
c. DCL – Data Control Language

12. What is DDL?

It is used to define the database structure such as tables. It includes 3 commands:-
a. Create – Create is for creating tables
CREATE TABLE table_name (
    column1 datatype,
    column2 datatype,
    column3 datatype,
   ….
);
b. Alter – Alter table is used to modifying the existing table object in the database.
ALTER TABLE table_name
ADD column_name datatype
c. Drop – If you drop a table, all the rows in the table is deleted and the table structure is removed from the database. Once the table is dropped, you can’t get it back

13. What is DML?

Data Manipulation Language is used to manipulate the data in records. Commonly used DML commands are Update, Insert and Delete. Sometimes SELECT command is also referred as a Data Manipulation Language.

14. What is DCL?

Data Control Language is used to control the privileges by granting and revoking database access permission

15. What is the difference between DELETE and TRUNCATE?

16. What are the important SQL aggregate functions?
a. AVG()
b. COUNT()
c. MAX()
d. MIN()
e. SUM()

17. What is Normalization? How many Normalization forms are there?

Normalization is used to organize the data in such manner that data redundancy will never occur in the database and avoid insert, update and delete anomalies.
There are 5 forms of Normalization
First Normal Form (1NF): It removes all duplicate columns from the table. Creates a table for related data and identifies unique column values
Second Normal Form (2NF): Follows 1NF and creates and places data subsets in an individual table and defines the relationship between tables using a primary key
Third Normal Form (3NF): Follows 2NF and removes those columns which are not related through primary key
Fourth Normal Form (4NF): Follows 3NF and do not define multi-valued dependencies. 4NF also known as BCNF

18. How to fetch only common records between two tables?

SELECT * FROM Employee
INTERSECT
SELECT * FROM Employee1

19. Full Outer Join is a combination of which of the following:-
a. Left Outer and Left Inner Join
b. Left Outer and Right Inner Join
c. Left Outer and Right Outer Join
d. Left Outer and Right Outer Join

A. Full Outer Join is a combination of Left Outer and Right Outer Join in SQL

20. Right Outer Join is similar to:-
a. Right Inner Join
b. Left Inner Join
c. Left Outer Join
d. Right Outer Join

A. Right Outer Join is similar to Left Outer Join in SQL

If you want to practice SQL questions before an interview or to hone up your SQL skills, you can either practice at www.sqlzoo.net or you can buy 112 Questions to Crack Business Analyst Interview from Amazon.