Zomato Interview Questions | Fuzzy Merging
Question
Answers ( 2 )
Suppose you want to search a list of names to see whether the name DataMonk is in the list,
and you want to find either an exact match or names that are similar. This is called fuzzy matching.
Merging data sets on names with approximately the same spelling, or merging on times that are
within two minutes of each other, are examples of these kinds of merges.
SAS has a built-in function called SPEDIS, which stands for Spelling Distance.
The function can be used for fuzzy matching.
The SPEDIS function returns a 0 if the two arguments match exactly. The function
assigns penalty points for each type of spelling error. For example, getting the first letter
wrong is assigned more points than misspelling other letters. Interchanging two letters is
a relatively small error, as is adding an extra letter to a word.
To identify any name that is similar to DataMonk, you could extract all names where the
value returned by the SPEDIS function is less than some predetermined value.
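
The thresholding idea above can be sketched outside SAS as well. SPEDIS itself is only available in SAS, so this Python example swaps in plain Levenshtein edit distance, which counts insertions, deletions and substitutions equally and does not apply SPEDIS's heavier first-letter penalty. The name list, the helper names and the cutoff of 2 are all made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similar_names(names, target, max_distance=2):
    """Keep names whose case-insensitive distance to target is within the cutoff."""
    target_lower = target.lower()
    return [n for n in names if levenshtein(n.lower(), target_lower) <= max_distance]


# Hypothetical data: an exact match, three near misses, and an unrelated name.
names = ["DataMonk", "DataMonks", "DtaaMonk", "Datamonk", "Zomato"]
print(similar_names(names, "DataMonk"))
```

A swapped pair of letters such as "DtaaMonk" costs 2 here because plain Levenshtein treats it as two substitutions, whereas SPEDIS treats interchanged letters as a relatively small error. The cutoff would need tuning for whichever distance measure is actually used.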
Frequently, SAS programmers must merge files where the values of the key variables are only
approximately the same. Merging on names with approximately the same spelling, or merging on
times that are within three minutes of each other, are examples of these kinds of merges.
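
A time-window merge of this kind can be sketched roughly as follows. This is a Python illustration rather than SAS, and the record layout, the function name `merge_within` and the sample orders and payments are all assumptions made for the example. Each left record is paired with the closest right record, and the pair is kept only if the gap is within the tolerance.

```python
from datetime import datetime, timedelta


def merge_within(left, right, tolerance):
    """Pair each left record with the nearest right record inside the tolerance.

    left and right are lists of (timestamp, value) tuples. Left records with no
    right record inside the window are dropped, as in an inner join.
    """
    merged = []
    for l_time, l_value in left:
        # Gap between this left record and every right record.
        candidates = [(abs(r_time - l_time), r_time, r_value)
                      for r_time, r_value in right]
        if not candidates:
            continue
        gap, r_time, r_value = min(candidates, key=lambda c: c[0])
        if gap <= tolerance:
            merged.append((l_time, l_value, r_time, r_value))
    return merged


# Hypothetical data: order A has a payment 90 seconds later, order B does not.
orders = [(datetime(2024, 1, 1, 12, 0), "order A"),
          (datetime(2024, 1, 1, 12, 10), "order B")]
payments = [(datetime(2024, 1, 1, 12, 1, 30), "payment 1"),
            (datetime(2024, 1, 1, 12, 15), "payment 2")]

for row in merge_within(orders, payments, timedelta(minutes=3)):
    print(row)
```

The sketch compares every pair, which is fine for small lists. For large files, sorting both sides by time and scanning them together would avoid the quadratic cost.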