McKinsey Interview Questions | 10 Million Data Points

Question

How would you perform clustering on a million unique keywords, assuming you have 10 million data points, each consisting of two keywords and a metric measuring how similar the two keywords are? How would you build this table of 10 million data points in the first place?

Asked by Dhruv2301 (Great Grand Master) · 4 years ago · 1 Answer · 1043 views

Answer ( 1 )

  1. Apply the standard K-means clustering algorithm, implemented on Hadoop MapReduce so the computation scales to datasets of this size, and choose a similarity metric such that keywords within a cluster are highly similar (high intra-cluster similarity) while keywords in different clusters are not (low inter-cluster similarity). To build the 10-million-row table, compute a similarity score for each pair of keywords that co-occur at least once, for example Jaccard similarity over the sets of contexts (search sessions, documents) in which each keyword appears.
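The two pieces of the answer can be sketched in plain Python. This is a minimal single-machine stand-in, not a real Hadoop job: the `contexts` dict (keyword → set of context IDs), the Jaccard choice of similarity metric, and the assumption that keywords have already been embedded as numeric vectors before K-means are all illustrative assumptions, and each `map_assign`/`reduce_update` round corresponds to what would be one MapReduce job at scale.

```python
import random
from itertools import combinations

# --- Step 1: building the pairwise-similarity table ------------------------
# Hypothetical input: a dict mapping each keyword to the set of contexts it
# appears in (search sessions, documents, ad groups, ...). Jaccard similarity
# of the context sets is one simple choice of similarity metric.

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_pair_table(contexts):
    """Emit (kw1, kw2, similarity) rows for keyword pairs sharing a context."""
    rows = []
    for k1, k2 in combinations(sorted(contexts), 2):
        sim = jaccard(contexts[k1], contexts[k2])
        if sim > 0:  # only co-occurring pairs make it into the table
            rows.append((k1, k2, sim))
    return rows

# --- Step 2: K-means expressed as map and reduce phases --------------------
# Assumes the keywords have already been embedded as numeric vectors;
# points are lists of floats.

def map_assign(points, centroids):
    """Map phase: emit (nearest-centroid-index, point) pairs."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [(min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i])), p)
            for p in points]

def reduce_update(pairs, k, dim):
    """Reduce phase: average the points grouped under each centroid key."""
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for idx, p in pairs:
        counts[idx] += 1
        for d in range(dim):
            sums[idx][d] += p[d]
    # Keep an empty cluster's centroid at the origin sums (unchanged count 0).
    return [[s / counts[i] for s in sums[i]] if counts[i] else sums[i]
            for i in range(k)]

def kmeans(points, k, iters=10, seed=0):
    """Iterate map/reduce rounds; at scale each round is one MapReduce job."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        centroids = reduce_update(map_assign(points, centroids), k,
                                  len(points[0]))
    return centroids
```

Splitting the iteration into an assign (map) and an average (reduce) step is what makes K-means fit MapReduce naturally: mappers only need the current centroids broadcast to them, and reducers only need the points grouped by centroid key.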

